Hide secret messages within emoji using Unicode variation selectors - based on Paul Butler's research
This method uses invisible Unicode variation selectors (U+FE00 to U+FE0F) to encode data after emoji characters. Each variation selector encodes 4 bits of data, making this technique both compact and invisible to human readers.
This technique is based on Paul Butler's research into hiding data in Unicode text, published in his article "Smuggling arbitrary data through an emoji" (2025). It exploits Unicode variation selectors - invisible characters originally designed to modify the presentation of other characters - to embed arbitrary data within emoji. These variation selectors are preserved by most platforms when text is copied and pasted, and they normally render as nothing at all.
Variation selectors are a set of Unicode characters (U+FE00 through U+FE0F) originally intended to specify different visual presentations of the same character. For example, some characters can be displayed in text style or emoji style, and variation selectors control this. However, when attached to emoji, these selectors typically have no visible effect while still being preserved in the underlying data.
This tool uses all 16 variation selectors (U+FE00 to U+FE0F), mapping each to a unique 4-bit binary value (0000 to 1111). This allows efficient data encoding.
Step 1: UTF-8 Encoding
Your message is first converted to UTF-8 bytes. Each character becomes one to four bytes depending on its Unicode code point. Example: "hello" becomes 5 bytes, shown here in hexadecimal: [0x68, 0x65, 0x6C, 0x6C, 0x6F].
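As a quick check of this step, the sketch below uses Python's built-in str.encode to list the UTF-8 bytes of "hello" in hexadecimal. The variable names are illustrative only and are not taken from the tool.

```python
# Convert a message to UTF-8 bytes and show each byte in hexadecimal.
message = "hello"
utf8_bytes = message.encode("utf-8")
hex_bytes = [f"{b:02X}" for b in utf8_bytes]
print(hex_bytes)  # ['68', '65', '6C', '6C', '6F']

# Characters outside ASCII need more than one byte.
print(len("é".encode("utf-8")))   # 2
print(len("😊".encode("utf-8")))  # 4
```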
Step 2: Byte to Nibble Conversion
Each byte is split into two 4-bit chunks called "nibbles". A byte like 0x68 (104 in decimal, 01101000 in binary) becomes two nibbles: 0110 (high) and 1000 (low). This doubling of data units is necessary because each variation selector encodes 4 bits.
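The split can be sketched with a shift and a mask. The helper name split_byte is hypothetical, chosen for this illustration.

```python
def split_byte(byte: int) -> tuple[int, int]:
    """Return (high nibble, low nibble) for a byte value in 0..255."""
    high = (byte >> 4) & 0x0F  # top four bits
    low = byte & 0x0F          # bottom four bits
    return high, low

high, low = split_byte(0x68)
print(f"{high:04b} {low:04b}")  # 0110 1000
```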
Step 3: Nibble to Selector Mapping
Each 4-bit nibble maps to one of the 16 variation selectors:
- 0000 → U+FE00
- 0001 → U+FE01
- ...
- 1111 → U+FE0F
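Because the 16 selectors are contiguous, the mapping reduces to adding the nibble to U+FE00. The following is a minimal sketch; the names VS_BASE, nibble_to_selector, and selector_to_nibble are assumptions made for illustration.

```python
VS_BASE = 0xFE00  # U+FE00, the first of the 16 variation selectors

def nibble_to_selector(nibble: int) -> str:
    """Map a 4-bit value (0..15) to the selector U+FE00 + nibble."""
    if not 0 <= nibble <= 0xF:
        raise ValueError("nibble must be in the range 0..15")
    return chr(VS_BASE + nibble)

def selector_to_nibble(selector: str) -> int:
    """Inverse mapping: recover the 4-bit value from a selector."""
    code = ord(selector)
    if not VS_BASE <= code <= VS_BASE + 0xF:
        raise ValueError("not a variation selector in U+FE00..U+FE0F")
    return code - VS_BASE

print(f"U+{ord(nibble_to_selector(0b1111)):04X}")  # U+FE0F
```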
Step 4: Selector Appending
All variation selectors are appended to your chosen base emoji in sequence. The result looks like a single emoji but contains your entire hidden message.
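Putting the four steps together, an encoder might look like the sketch below. It assumes the high nibble of each byte is emitted before the low nibble, which matches the order in which Step 2 lists them but is not confirmed by the tool itself. The function name encode is hypothetical.

```python
VS_BASE = 0xFE00  # U+FE00, the first variation selector

def encode(base: str, message: str) -> str:
    """Append two variation selectors per UTF-8 byte of message to base."""
    selectors = []
    for byte in message.encode("utf-8"):
        selectors.append(chr(VS_BASE + (byte >> 4)))    # high nibble (assumed first)
        selectors.append(chr(VS_BASE + (byte & 0x0F)))  # low nibble
    return base + "".join(selectors)

hidden = encode("😊", "hello")
print(len(hidden))  # 11 code points: 1 base emoji + 10 selectors
```

Printing hidden in a terminal or pasting it into a text field should normally show only the base emoji.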
Complete Example:
"hello" (5 chars, 5 UTF-8 bytes, 10 nibbles) → 10 variation selectors appended to base emoji
Visual result: 😊 (appears as a single emoji)
Actual data: 😊[U+FE06][U+FE08][U+FE06][U+FE05][U+FE06][U+FE0C][U+FE06][U+FE0C][U+FE06][U+FE0F] (base + 10 invisible selectors)
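To see the hidden selectors for yourself, you can list the code points of a string. The sketch below hard-codes the encoded form of "hello" using escape sequences so that nothing invisible is hidden in the source code.

```python
# Base emoji U+1F60A followed by the ten selectors for "hello".
hidden = "\U0001F60A" + "\uFE06\uFE08\uFE06\uFE05\uFE06\uFE0C\uFE06\uFE0C\uFE06\uFE0F"

# Format every code point in the usual U+XXXX notation.
code_points = [f"U+{ord(ch):04X}" for ch in hidden]
print(code_points[0])    # U+1F60A
print(code_points[1:3])  # ['U+FE06', 'U+FE08']
print(len(code_points))  # 11
```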
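Decoding reverses the encoding: the variation selectors that follow the base emoji are mapped back to 4-bit nibbles, consecutive nibbles are paired into bytes (high nibble first), and the bytes are decoded as UTF-8.
If there's an odd number of nibbles (malformed data), the trailing nibble is discarded.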
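A decoder can be sketched as follows. Two assumptions are made here that the text does not specify: the function scans the whole string for selectors in U+FE00..U+FE0F rather than only those directly after the base emoji, and invalid UTF-8 is replaced rather than raising an error. The name decode is hypothetical.

```python
VS_BASE = 0xFE00  # U+FE00, the first variation selector

def decode(text: str) -> str:
    """Recover a hidden message from the variation selectors in text."""
    # Each selector in U+FE00..U+FE0F contributes one 4-bit nibble.
    nibbles = [ord(ch) - VS_BASE for ch in text
               if VS_BASE <= ord(ch) <= VS_BASE + 0xF]
    if len(nibbles) % 2:
        nibbles.pop()  # odd count: discard the trailing nibble
    # Pair nibbles into bytes, high nibble first (assumed order).
    data = bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))
    return data.decode("utf-8", errors="replace")

hidden = "😊" + "".join(chr(VS_BASE + n) for n in [6, 8, 6, 5, 6, 12, 6, 12, 6, 15])
print(decode(hidden))  # hello
```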
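The encoding has a fixed, predictable cost: each message byte becomes two variation selectors, and each selector takes 3 bytes when the text is stored as UTF-8, so the hidden payload occupies six times as many bytes as the original message.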
There's no hard limit on message length - you can embed hundreds of characters in a single emoji, though extremely long messages may cause display issues on some platforms.
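To estimate how large an encoded emoji becomes, the sketch below counts selectors and UTF-8 bytes for a given message. It relies on the fact that code points U+FE00 to U+FE0F each take 3 bytes in UTF-8. The function name encoded_size and the default base emoji are assumptions for illustration.

```python
def encoded_size(message: str, base: str = "😊") -> dict:
    """Estimate the size of the encoded result for message."""
    payload = message.encode("utf-8")
    selectors = 2 * len(payload)  # two nibbles, hence two selectors, per byte
    return {
        "message_bytes": len(payload),
        "selectors": selectors,
        # Each selector in U+FE00..U+FE0F is 3 bytes in UTF-8.
        "utf8_bytes": len(base.encode("utf-8")) + 3 * selectors,
    }

print(encoded_size("hello"))
# {'message_bytes': 5, 'selectors': 10, 'utf8_bytes': 34}
```

A 200-byte message would therefore add 400 selectors, or 1200 bytes of UTF-8, after the base emoji.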