

On thePHP.cc today they have a quick post that looks ahead to PHP 8 and one planned change: the deprecation of some multi-byte character handling.

Since the attempt to create a Unicode-based PHP implementation has failed, PHP 7 – just like PHP 5 – does not handle Unicode strings natively. The commonly used UTF-8 encoding, for example, is a multibyte encoding, as opposed to ASCII, where each character is represented by one single byte.

[…] UTF-8 is a variable-length encoding and each character (code point, to be exact) is represented by one to four bytes. For ASCII characters, everything works smoothly, because UTF-8 is a superset of ASCII. The problems start with non-ASCII characters.
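
To make the byte-versus-character distinction concrete, here is a minimal sketch (not taken from the original post). It assumes the mbstring extension is loaded and that `mbstring.func_overload` is not enabled. The sample string is our own choice, written with escaped bytes so the result does not depend on how the file is saved.

```php
<?php
// "café" in UTF-8: "é" (U+00E9) takes two bytes, 0xC3 0xA9.
// Illustrative sample string, not from the original post.
$word = "caf\xC3\xA9";

// strlen() counts bytes, not characters.
$byteLength = strlen($word);

// mb_strlen() counts code points when told the encoding (needs ext/mbstring).
$charLength = mb_strlen($word, 'UTF-8');

echo $byteLength, "\n"; // 5
echo $charLength, "\n"; // 4
```

For a pure ASCII string both calls return the same number, which is why the problem often goes unnoticed until non-ASCII input shows up.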

The post covers some of the common issues with multi-byte Unicode characters in PHP and the role that the iconv and mbstring functions play in their handling. It shows how the mbstring handling allows developers to "cheat a little" and where, when PHP 8 comes around, the main issue will lie: the deprecation of the mbstring.func_overload setting in php.ini.
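
When enabled, `mbstring.func_overload` silently swaps byte-oriented functions such as `strlen()` and `substr()` for their `mb_*` counterparts. Code that relied on that switch would need to call the multibyte functions explicitly instead. The following is a rough sketch of that idea, assuming ext/mbstring is available. The helper name `truncateChars` and the sample string are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: cut a string to a number of characters, not bytes,
// by calling mb_substr() explicitly rather than relying on func_overload.
function truncateChars(string $text, int $maxChars): string
{
    return mb_substr($text, 0, $maxChars, 'UTF-8');
}

// "Jürgen" in UTF-8: "ü" (U+00FC) is the two bytes 0xC3 0xBC.
$name = "J\xC3\xBCrgen";

// Byte-based cut: "J" plus the first half of "ü", which is invalid UTF-8.
$byteCut = substr($name, 0, 2);

// Character-based cut: "Jü", still valid UTF-8.
$charCut = truncateChars($name, 2);

var_dump(mb_check_encoding($byteCut, 'UTF-8')); // bool(false)
var_dump($charCut === "J\xC3\xBC");             // bool(true)
```

Passing the encoding on every call is a deliberate choice here: it keeps the behavior independent of `mb_internal_encoding()` and other ini settings.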


