Multibyte encodings: why we need functions with the mb_ prefix in PHP
When developing a web application or site, you often have to work with text. Every text is stored in some encoding, so it is important to use functions that handle that encoding correctly. Today the most popular encoding is UTF-8, and it is a multibyte encoding.
What does multibyte encoding mean? It means that a single character may occupy more than one byte. Every character is stored as a sequence of bytes, and one byte can distinguish only 256 values, which is not enough for the letters and symbols of all languages. In UTF-8, ASCII characters (Latin letters, digits, basic punctuation) take one byte, while other characters, for example Cyrillic letters, Chinese characters or emoji, take from two to four bytes. That is why multibyte encodings are needed, and PHP supports them through the mbstring extension.
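The difference between bytes and characters is easy to see in a short sketch. It assumes that the mbstring extension is enabled and that the file is saved as UTF-8; the variable names are chosen only for illustration.

```php
<?php
// Assumption: mbstring is enabled and this file is saved as UTF-8.
$word = 'Привет'; // 6 Cyrillic letters, 2 bytes each in UTF-8

$bytes = strlen($word);             // byte-oriented: counts bytes
$chars = mb_strlen($word, 'UTF-8'); // multibyte-aware: counts characters

echo $bytes, PHP_EOL; // 12
echo $chars, PHP_EOL; // 6

// A byte-oriented function can cut a character in half.
$broken = substr($word, 0, 3);             // "П" plus the first byte of "р"
$intact = mb_substr($word, 0, 3, 'UTF-8'); // "При"

var_dump(mb_check_encoding($broken, 'UTF-8')); // bool(false)
var_dump(mb_check_encoding($intact, 'UTF-8')); // bool(true)
```

The ordinary strlen and substr work with bytes, so for non-ASCII text they return a wrong length and can produce a string that is no longer valid UTF-8.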
PHP has a set of functions that begin with the mb_ prefix, where mb stands for multibyte. They are designed specifically for working with multibyte text. Some of them can try to detect the encoding of a text on their own. Most of them also accept the desired encoding as an optional parameter; if it is omitted, the internal encoding is used.
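Below is a small sketch of detection, conversion and the optional encoding parameter. It assumes mbstring is enabled and the file is saved as UTF-8; the list of candidate encodings and the choice of Windows-1251 as the target are only examples.

```php
<?php
// Assumption: mbstring is enabled and this file is saved as UTF-8.
$text = 'Привет';

// Strict detection against an explicit list of candidates.
// The text is not valid ASCII, so UTF-8 is reported.
$detected = mb_detect_encoding($text, ['ASCII', 'UTF-8'], true);
echo $detected, PHP_EOL; // UTF-8

// Option 1: pass the encoding explicitly on every call.
$upper = mb_strtoupper($text, 'UTF-8');
echo $upper, PHP_EOL; // ПРИВЕТ

// Option 2: set the internal encoding once and omit the parameter.
mb_internal_encoding('UTF-8');
$upperDefault = mb_strtoupper($text);

// Convert to a single-byte encoding and back.
$cp1251 = mb_convert_encoding($text, 'Windows-1251', 'UTF-8');
echo strlen($cp1251), PHP_EOL; // 6 (one byte per letter in Windows-1251)
$back = mb_convert_encoding($cp1251, 'UTF-8', 'Windows-1251');
```

Detection is a guess based on the bytes of the string, so when the encoding is known in advance it is more reliable to specify it explicitly.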
Let's look at the main mb_ functions in PHP. Only the most commonly used ones are listed below:
- mb_convert_case - changes the case of characters in a string,
- mb_convert_encoding - converts a string from one character encoding to another,
- mb_detect_encoding - detects the character encoding of a string,
- mb_internal_encoding - sets or gets the internal encoding of the script,
- mb_ord - gets the code point of a character,
- mb_split - splits a multibyte string using a regular expression,
- mb_strcut - gets part of a string, with the position and length given in bytes,
- mb_stripos - finds the position of the first occurrence of one string in another, case-insensitive,
- mb_strlen - gets the length of a string in characters,
- mb_strpos - finds the position of the first occurrence of one string in another,
- mb_strripos - finds the position of the last occurrence of one string in another, case-insensitive,
- mb_strrpos - finds the position of the last occurrence of one string in another,
- mb_strstr - finds the first occurrence of a substring and returns part of the string,
- mb_strtolower - converts a string to lower case,
- mb_strtoupper - converts a string to upper case,
- mb_substr - gets part of a string, with the position and length given in characters.
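Several of the functions listed above can be tried on one string. This is only a sketch: it assumes PHP 7.2 or newer (for mb_ord), an enabled mbstring extension and a file saved as UTF-8, and the sample string is arbitrary.

```php
<?php
// Assumption: PHP 7.2+, mbstring enabled, file saved as UTF-8.
mb_internal_encoding('UTF-8');

$s = 'Привет, мир';

echo mb_strlen($s), PHP_EOL;          // 11 characters
echo strlen($s), PHP_EOL;             // 20 bytes

echo mb_strpos($s, 'мир'), PHP_EOL;   // 8 (position in characters)
echo strpos($s, 'мир'), PHP_EOL;      // 14 (position in bytes)
echo mb_stripos($s, 'МИР'), PHP_EOL;  // 8 (case-insensitive search)

echo mb_substr($s, 0, 6), PHP_EOL;    // Привет (6 characters)
echo mb_strcut($s, 0, 5), PHP_EOL;    // Пр (at most 5 bytes, whole characters only)

echo mb_strtoupper($s), PHP_EOL;      // ПРИВЕТ, МИР
echo mb_strtolower('ПРИВЕТ'), PHP_EOL; // привет
echo mb_convert_case($s, MB_CASE_TITLE), PHP_EOL; // Привет, Мир

echo mb_strstr($s, 'мир'), PHP_EOL;     // мир
echo mb_strstr($s, ',', true), PHP_EOL; // Привет (part before the comma)

echo mb_ord('П'), PHP_EOL;            // 1055, that is U+041F
```

Note the difference between mb_substr and mb_strcut: the first counts in characters, the second in bytes, but neither of them splits a character in the middle.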
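Thus, when text is stored in a multibyte encoding such as UTF-8, it is best to use the mb_ functions rather than their byte-oriented counterparts (strlen, substr, strtoupper and so on). They operate on characters, not on bytes, so operations on the text give correct results.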