i18n fails

UTF-16 truncation?

People say that emoji is the technology that finally gets Latin-script users to really care about text encoding.

Date Published

Vendor

Google

Product

YouTube
From

English (United States)

To

English (United States)

GitHub 
9 hours ago 

Even the best design systems can miss critical accessibility details. But what if you could catch them earlier? �… Read more
GitHub 
9 hours ago 

Even the best design systems can miss critical accessibility details. But what if you could catch them earlier? 🤔

This guide shows how to use design system annotations to proactively build accessibility in by: 
🧩 Leveraging reusable a11y annotations with your components. 
📂 Documenting details that don't show up in Figma assets or code. 
🚀 Empowering your teams to build accessibly by default. 

Start bridging the accessibility gap and creating better experiences for everyone. ✅

Speculation: YouTube stored the content of posts in UTF-16 and truncated the text by counting UTF-16 code units. Emoji, like all other characters outside the Basic Multilingual Plane, are encoded as a surrogate pair, and each half of the pair is invalid on its own. If the text is truncated in the middle of an emoji, a lone surrogate is left at the end, the text becomes invalid, and the result is rendered as �.
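
The speculated failure can be reproduced with a minimal Python sketch. This is an illustration only, not YouTube's actual code: the function names and the code-unit limit are hypothetical. The first function cuts the text after a fixed number of UTF-16 code units without looking at surrogate pairs; the second uses the same budget but backs off by one unit when the cut would leave a lone high surrogate.

```python
def truncate_utf16_naive(text: str, max_units: int) -> str:
    """Cut after max_units UTF-16 code units, ignoring surrogate pairs."""
    raw = text.encode("utf-16-le")            # 2 bytes per code unit
    cut = raw[: max_units * 2]
    # A dangling high surrogate cannot be decoded; "replace" turns it into U+FFFD.
    return cut.decode("utf-16-le", errors="replace")


def truncate_utf16_safe(text: str, max_units: int) -> str:
    """Same budget, but never leave half of a surrogate pair at the end."""
    raw = text.encode("utf-16-le")
    cut = raw[: max_units * 2]
    if len(cut) >= 2:
        last_unit = int.from_bytes(cut[-2:], "little")
        if 0xD800 <= last_unit <= 0xDBFF:     # high (leading) surrogate
            cut = cut[:-2]                    # drop it instead of splitting the pair
    return cut.decode("utf-16-le")


if __name__ == "__main__":
    post = "earlier? \U0001F914"              # 9 BMP characters plus one emoji (2 code units)
    print(truncate_utf16_naive(post, 10))     # ends in the replacement character
    print(truncate_utf16_safe(post, 10))      # ends cleanly before the emoji
```

With a limit of 10 code units, the naive cut lands between the two halves of the emoji and the output ends in �, matching the truncated preview above. The safe variant returns the text without the emoji instead.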

External links