Hide messages using Unicode space characters and zero-width characters
This technique uses Unicode space characters of different widths to encode secret data, based on Microsoft Word document space classification research. It employs two groups of spaces:
Inter-word spaces: combinations of Thin, Six-Per-Em, and Hair spaces with zero-width characters (ZWC). Each space encodes 4 bits (2 for the Unicode space type + 2 for the ZWC position).
Line-end spaces: Hair, Six-Per-Em, Punctuation, and Thin spaces placed at the end of lines and paragraphs. Each space encodes 2 bits.
This implementation is based on the research paper "Text Steganography Using Word Document Spacing", which categorizes Unicode spaces into two groups for data encoding:
Inter-word spaces are used between words in the middle of sentences. This group combines 3 Unicode space types with 4 zero-width character positions.
Each inter-word space can encode 4 bits (2 bits for space type + 2 bits for ZWC position).
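The 4-bit inter-word encoding can be sketched as follows. This is a minimal illustration, not the tool's actual mapping. Several details are assumptions. Three space types cannot cover all four 2-bit values on their own, so the sketch treats the ordinary space (U+0020) as a fourth symbol. It uses U+200B (zero-width space) as the ZWC. It interprets "ZWC position" as none, before, after, or both sides of the space. The names `encode_gap` and `decode_gap` are hypothetical.

```python
ZWSP = "\u200B"  # zero-width space (assumed choice of ZWC)

# Assumed order; the ordinary space is an assumed fourth symbol.
SPACES = [
    " ",        # 00: ordinary space (assumption)
    "\u2009",   # 01: Thin space
    "\u2006",   # 10: Six-per-em space
    "\u200A",   # 11: Hair space
]


def encode_gap(nibble: str) -> str:
    """Turn 4 bits (e.g. '0110') into one inter-word gap."""
    space = SPACES[int(nibble[:2], 2)]   # first 2 bits pick the space type
    pos = int(nibble[2:], 2)             # last 2 bits pick the ZWC position
    before = ZWSP if pos & 1 else ""     # bit 0: ZWC before the space
    after = ZWSP if pos & 2 else ""      # bit 1: ZWC after the space
    return before + space + after


def decode_gap(gap: str) -> str:
    """Recover the 4 bits from a gap produced by encode_gap."""
    pos = (1 if gap.startswith(ZWSP) else 0) | (2 if gap.endswith(ZWSP) else 0)
    space = gap.strip(ZWSP)              # drop the zero-width markers
    return format(SPACES.index(space), "02b") + format(pos, "02b")
```

Because an unmodified ordinary space also decodes to a valid value under this assumed mapping, a decoder built this way would need to know the message length in advance.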
Line-end spaces are used at the end of lines or paragraphs. This group uses 4 Unicode space types.
Each line-end space can encode 2 bits.
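As a rough sketch of the 2-bit line-end encoding, the snippet below appends one of the four spaces to each line. The assignment of bit pairs to spaces is an assumption made for illustration, and the function names are hypothetical.

```python
# Assumed bit-pair assignment; the order is illustrative only.
LINE_END_SPACES = {
    "00": "\u200A",  # Hair space
    "01": "\u2006",  # Six-per-em space
    "10": "\u2008",  # Punctuation space
    "11": "\u2009",  # Thin space
}
LINE_END_BITS = {ch: bits for bits, ch in LINE_END_SPACES.items()}


def hide_in_line_ends(lines, bits):
    """Append one encoded space per line while at least 2 bits remain.

    Returns the modified lines and any bits that did not fit.
    """
    out, pos = [], 0
    for line in lines:
        if pos + 2 <= len(bits):
            line += LINE_END_SPACES[bits[pos:pos + 2]]
            pos += 2
        out.append(line)
    return out, bits[pos:]


def reveal_from_line_ends(lines):
    """Read back 2 bits from every line that ends in an encoded space."""
    return "".join(
        LINE_END_BITS[line[-1]]
        for line in lines
        if line and line[-1] in LINE_END_BITS
    )
```

For example, `hide_in_line_ends(["first line", "second line"], "0111")` would append a Six-per-em space to the first line and a Thin space to the second under this assumed mapping.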
When the cover text doesn't have enough spaces to hide all of the secret data, the remaining bits are appended as special whitespace characters at the end of the text.
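One way the overflow step might work is sketched below. It assumes the leftover bits are written with the same four 2-bit spaces used at line ends and that an odd trailing bit is padded with a zero. Neither detail is confirmed by the description above, and the function names are hypothetical.

```python
# Assumed: overflow reuses the four line-end spaces, 2 bits each.
TRAILING = ["\u200A", "\u2006", "\u2008", "\u2009"]  # Hair, Six-per-em, Punctuation, Thin


def append_overflow(stego_text: str, remaining_bits: str) -> str:
    """Append leftover bits to the end of the text as whitespace."""
    if len(remaining_bits) % 2:
        remaining_bits += "0"  # pad to a whole 2-bit symbol (assumption)
    tail = "".join(
        TRAILING[int(remaining_bits[i:i + 2], 2)]
        for i in range(0, len(remaining_bits), 2)
    )
    return stego_text + tail


def read_overflow(text: str) -> str:
    """Read the run of encoded spaces at the very end of the text."""
    start = len(text)
    while start > 0 and text[start - 1] in TRAILING:
        start -= 1
    return "".join(format(TRAILING.index(ch), "02b") for ch in text[start:])
```

Because of the padding, a decoder in this sketch would need the original bit length to discard the extra zero.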
This method provides concealment but not cryptographic security. The hidden data can be detected by:
For sensitive data, encrypt your message before hiding it.