Delan Azabani

UTF-8 Everywhere: Manifesto


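A comprehensive rundown of Unicode encodings, why UTF-16 is the worst of all worlds, and why Windows and Java got it wrong by committing to 16-bit characters too early in the game. That was before Unicode 2.0, where UCS-2 gave way to UTF-16 and the encoding was no longer fixed-width: the code units are still 16 bits each, but a code point outside the Basic Multilingual Plane now takes two of them. The manifesto also effectively attacks the notion that UTF-16 is more efficient for CJK text, since compressing the text saves more space than choosing either encoding over the other.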
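To make the two claims concrete, here is a minimal Python sketch, not taken from the manifesto. The helper names `utf16_code_units` and `encoded_sizes` and the sample string are my own assumptions for illustration, and the sample is artificially repetitive, so the compression ratio it produces says nothing about real-world text. It only shows how one might measure the sizes.

```python
import zlib


def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to encode text as UTF-16.

    Hypothetical helper: encodes without a byte order mark, then
    divides the byte length by two.
    """
    return len(text.encode("utf-16-le")) // 2


def encoded_sizes(text: str) -> dict:
    """Return raw and zlib-compressed byte sizes for UTF-8 and UTF-16.

    Hypothetical helper for comparing the two encodings on one string.
    """
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")
    return {
        "utf8_raw": len(utf8),
        "utf16_raw": len(utf16),
        "utf8_zlib": len(zlib.compress(utf8)),
        "utf16_zlib": len(zlib.compress(utf16)),
    }


# A code point inside the Basic Multilingual Plane fits in one code unit.
bmp_char = "\u8a9e"          # U+8A9E
# A code point outside it needs a surrogate pair, that is, two code units.
astral_char = "\U0002000B"   # U+2000B

print(utf16_code_units(bmp_char))     # 1
print(utf16_code_units(astral_char))  # 2

# Assumed sample: a short CJK sentence repeated many times.
sample = "統一碼是一種字元編碼標準。" * 200
print(encoded_sizes(sample))
```

On uncompressed, pure CJK input like this, UTF-16 is the smaller of the two (two bytes per character against three). The point of the comparison is that once either form is compressed, that gap is small next to what compression itself saves.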