Multibyte encodings: why PHP needs functions with the mb_ prefix
Developing a web application or site almost always involves working with text. Every text is stored in some encoding, so it is important to use functions that handle that encoding correctly. Today the most popular encoding is UTF-8, which is a multibyte encoding.
What does multibyte encoding mean? It means that a single character can occupy more than one byte. Every character is stored as bytes, and a single byte can distinguish only 256 values, which is not enough to cover the letters and symbols of all languages. In UTF-8, ASCII characters take one byte, while accented letters, non-Latin scripts and other symbols take two to four bytes. This is why multibyte encodings are needed, and PHP supports them.
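The difference between bytes and characters can be seen with a minimal sketch like the one below. It assumes the mbstring extension is enabled and PHP 7.0 or later (for the \u{...} escapes); the sample characters and array names are chosen only for illustration.

```php
<?php
// \u{...} escapes (PHP 7.0+) always produce UTF-8 bytes,
// so the result does not depend on how this file is saved.
$samples = [
    'A'       => "A",          // U+0041, ASCII letter
    'e-acute' => "\u{00E9}",   // U+00E9, Latin small letter e with acute
    'euro'    => "\u{20AC}",   // U+20AC, euro sign
    'emoji'   => "\u{1F600}",  // U+1F600, grinning face
];

$byteLengths = [];
$charLengths = [];

foreach ($samples as $name => $char) {
    // strlen() counts bytes, mb_strlen() counts characters.
    $byteLengths[$name] = strlen($char);
    $charLengths[$name] = mb_strlen($char, 'UTF-8');
    printf("%-8s bytes: %d, characters: %d\n",
        $name, $byteLengths[$name], $charLengths[$name]);
}
```

Each sample is a single character, yet strlen() reports 1, 2, 3 and 4 bytes respectively, while mb_strlen() reports 1 for all of them.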
Some functions can detect the encoding of a text on their own, and most of them also accept the desired encoding as an explicit argument when necessary. In PHP these functions begin with the mb_ prefix, where mb stands for multibyte. They are designed specifically for working with multibyte text.
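As a rough illustration of detecting an encoding and passing one explicitly, consider the sketch below. It assumes the mbstring extension is enabled; the input string and the candidate list of encodings are made up for the example, and detection in general is a heuristic that can guess wrong when several encodings fit the same bytes.

```php
<?php
// Hypothetical input: the word "café" stored as ISO-8859-1,
// where é is the single byte 0xE9.
$latin1 = "caf\xE9";

// Ask mbstring which of the candidate encodings fits (strict mode).
// A lone 0xE9 byte is not valid UTF-8, so UTF-8 is ruled out here.
$detected = mb_detect_encoding($latin1, ['UTF-8', 'ISO-8859-1'], true);

// Convert to UTF-8, naming the source encoding explicitly.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

// Set the script-wide default so later mb_ calls can omit the encoding.
mb_internal_encoding('UTF-8');

echo $detected, "\n";           // encoding name chosen from the list
echo mb_strlen($utf8), "\n";    // length in characters
echo strlen($utf8), "\n";       // length in bytes
```

After the conversion the string has 4 characters but 5 bytes, because é takes two bytes in UTF-8.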
Let's look at the main mb_ functions in PHP. Only the most commonly used ones are listed below:
- mb_convert_case - changes the case of characters in a string,
- mb_convert_encoding - converts a string from one character encoding to another,
- mb_detect_encoding - detects the character encoding of a string,
- mb_internal_encoding - sets or gets the internal encoding of the script,
- mb_ord - gets the code point of a character,
- mb_split - splits a multibyte string using a regular expression,
- mb_strcut - gets part of a string, with the offset and length given in bytes,
- mb_stripos - finds the position of the first occurrence of one string in another, case-insensitive,
- mb_strlen - gets the length of a string in characters,
- mb_strpos - finds the position of the first occurrence of one string in another,
- mb_strripos - finds the position of the last occurrence of one string in another, case-insensitive,
- mb_strrpos - finds the position of the last occurrence of one string in another,
- mb_strstr - finds the first occurrence of a substring in a string,
- mb_strtolower - converts a string to lower case,
- mb_strtoupper - converts a string to upper case,
- mb_substr - gets part of a string, with the offset and length given in characters.
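A few of the functions above can be compared with their byte-oriented counterparts in a short sketch. It assumes the mbstring extension is enabled and PHP 7.2 or later (mb_ord was added in 7.2); the sample phrase and variable names are only for illustration.

```php
<?php
// "Crème brûlée" written with \u{...} escapes so the bytes are
// guaranteed to be UTF-8 regardless of how this file is saved.
$text = "Cr\u{00E8}me br\u{00FB}l\u{00E9}e";
$word = "br\u{00FB}l\u{00E9}e";

// Length: characters vs bytes.
$chars = mb_strlen($text, 'UTF-8');   // 12
$bytes = strlen($text);               // 15

// Position of a substring: character offset vs byte offset.
$mbPos   = mb_strpos($text, $word, 0, 'UTF-8');  // 6
$bytePos = strpos($text, $word);                 // 7

// First five characters vs first five bytes.
$mbPart   = mb_substr($text, 0, 5, 'UTF-8');  // "Crème"
$bytePart = substr($text, 0, 5);              // "Crèm"

// mb_strcut takes bytes but never splits a character:
// 3 bytes would end in the middle of è, so only "Cr" is returned.
$cut = mb_strcut($text, 0, 3, 'UTF-8');

// Case conversion and code point lookup.
$upper = mb_strtoupper($text, 'UTF-8');
$code  = mb_ord("\u{00E8}", 'UTF-8');  // 232

echo $upper, "\n";
```

The byte-oriented functions are not wrong as such, they simply answer a different question, which is why their results diverge as soon as the text contains non-ASCII characters.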
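Thus, when working with text in a multibyte encoding, it is best to use the mb_ functions. They operate on characters rather than on individual bytes, so operations such as measuring length, searching, cutting and changing case give correct results.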