Font Steganography

Hide secret messages by alternating between two visually similar fonts

📄 Font-Based PDF Steganography

This technique encodes secret data by alternating between two visually similar fonts (LiberationSans and Arial) on a per-character basis. Each character's font encodes one bit: LiberationSans = 0, Arial = 1. The output is a PDF document that looks normal but contains hidden information.

🔓 Extract Hidden Message from PDF

Upload a PDF file that was encoded using font steganography to extract the hidden message. The decoder analyzes the font used for each character to reconstruct the binary data.


About Font Steganography

How It Works

Font steganography hides secret messages by using two visually similar fonts to encode binary data. Each character in the cover text uses one of two fonts:

  • LiberationSans represents binary 0
  • Arial represents binary 1

The fonts are so similar that casual readers cannot detect which font is used for each character, but the pattern encodes the hidden message.

Encoding Process

  1. Header Creation: A 16-bit header encodes the length of the secret message (supporting up to 65,535 characters).
  2. Text to Binary: Your secret message is converted to binary at 8 bits per character, so each character must fit in a single byte (for example, ASCII).
  3. Font Selection: For each bit, the corresponding cover text character is rendered in LiberationSans (for 0) or Arial (for 1).
  4. PDF Generation: The encoded text is saved as a PDF document that preserves the font information.

Example: The letter "A" (ASCII 65, binary 01000001) requires 8 characters in the cover text, with fonts: LiberationSans, Arial, LiberationSans, LiberationSans, LiberationSans, LiberationSans, LiberationSans, Arial.
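The encoding steps can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual code: the function names are hypothetical, the header is assumed to be written most significant bit first, the message is assumed to be single-byte (Latin-1) text, and cover characters beyond the payload are assumed to fall back to the "0" font.

```python
# Bit-to-font mapping taken from the description above.
FONT_FOR_BIT = {"0": "LiberationSans", "1": "Arial"}


def message_to_bits(message):
    """Return the 16-bit length header followed by 8 bits per character."""
    # Assumption: one byte per character; other characters raise an error.
    data = message.encode("latin-1")
    if len(data) > 0xFFFF:
        raise ValueError("message too long for a 16-bit length header")
    header = format(len(data), "016b")  # assumption: MSB-first header
    body = "".join(format(byte, "08b") for byte in data)
    return header + body


def assign_fonts(cover_text, message):
    """Pair each cover character with the font that encodes its bit."""
    bits = message_to_bits(message)
    if len(cover_text) < len(bits):
        raise ValueError(
            "cover text needs at least %d characters" % len(bits)
        )
    pairs = [(ch, FONT_FOR_BIT[bit]) for ch, bit in zip(cover_text, bits)]
    # Assumption: leftover cover characters use the "0" font.
    pairs += [(ch, FONT_FOR_BIT["0"]) for ch in cover_text[len(bits):]]
    return pairs
```

Calling assign_fonts("x" * 24, "A") yields 16 header characters followed by the eight fonts listed in the example above. Each (character, font) pair would then be drawn into the PDF one character at a time.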

Decoding Process

Decoding reverses the encoding:

  1. Font Extraction: The decoder opens the PDF and examines the font used for each character.
  2. Binary Reconstruction: Each font is converted back to its bit value (LiberationSans=0, Arial=1).
  3. Length Reading: The first 16 bits are decoded to determine the message length.
  4. Message Recovery: The next (message length × 8) bits are split into 8-bit groups and converted back to text; any bits after that are ignored.
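As a rough sketch of steps 2 to 4, the function below turns a per-character list of font names back into text. It assumes the font names have already been extracted and normalized to exactly "LiberationSans" or "Arial", that the header is stored most significant bit first, and that the message is single-byte (Latin-1) text. The function name is hypothetical.

```python
# Font-to-bit mapping taken from the description above.
BIT_FOR_FONT = {"LiberationSans": "0", "Arial": "1"}


def fonts_to_message(fonts):
    """Rebuild the hidden message from one font name per cover character."""
    bits = "".join(BIT_FOR_FONT[name] for name in fonts)
    if len(bits) < 16:
        raise ValueError("not enough characters for the 16-bit header")

    length = int(bits[:16], 2)           # message length in characters
    body = bits[16:16 + length * 8]      # trailing bits are ignored
    if len(body) < length * 8:
        raise ValueError("document is shorter than the header claims")

    data = bytes(int(body[i:i + 8], 2) for i in range(0, len(body), 8))
    return data.decode("latin-1")        # assumption: one byte per character
```

A font name outside the two expected values raises a KeyError here; a real decoder would need to decide whether to skip or reject such characters.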

Capacity and Requirements

  • Each secret character requires 8 cover characters (one per bit).
  • The 16-bit header requires 16 cover characters.
  • Total requirement: (16 + message_length × 8) characters.
  • For example, a 10-character secret message needs at least 16 + 10 × 8 = 96 cover characters.
  • Cover text can be any printable characters, including spaces and punctuation.
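The capacity formula is simple enough to check directly. The helper names below are illustrative only and are not part of the tool.

```python
HEADER_BITS = 16      # length header, one cover character per bit
BITS_PER_CHAR = 8     # one cover character per message bit


def required_cover_chars(message_length):
    """Cover characters needed for a message of the given length."""
    return HEADER_BITS + message_length * BITS_PER_CHAR


def max_message_length(cover_length):
    """Longest message that fits in a cover text of the given length."""
    fit = (cover_length - HEADER_BITS) // BITS_PER_CHAR
    # The 16-bit header caps the length at 65,535 characters.
    return max(0, min(fit, 0xFFFF))
```

For instance, required_cover_chars(10) gives 96, matching the example above, and a 100-character cover text still holds only 10 message characters because partial bytes cannot be used.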

Advantages

  • Visually Imperceptible: LiberationSans and Arial are extremely similar; most readers cannot tell which is used.
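  • May Survive Printing: The two fonts differ slightly in glyph shape, so a very high-quality print-and-scan could in principle preserve the pattern. The decoder described here reads font information from the PDF, however, and cannot process a scanned image.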
  • Format Preservation: PDF format maintains font metadata across platforms.
  • No File Size Anomaly: The PDF appears normal with standard fonts.
  • Text Remains Searchable: The cover text is fully readable and searchable.

Security Considerations

  • Not Encryption: This provides concealment, not cryptographic security. Encrypt sensitive data before hiding it.
  • Font Analysis Detection: Analyzing font usage patterns can reveal hidden data. Security tools may flag documents with unusual font alternation.
  • Requires Both Fonts: The decoder needs access to the same fonts used during encoding for accurate extraction.
  • PDF Editing May Disrupt: Editing the PDF (re-flowing text, changing fonts) will corrupt the hidden message.
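  • Statistical Analysis: If the two fonts are mixed at close to a 50/50 ratio, or switch from character to character within words, it may indicate steganography; ordinary documents rarely change fonts mid-word.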
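To make the detection risk concrete, here is a small sketch of the kind of check an analyst might run on a per-character list of font names. The function name and the two statistics are illustrative choices, not an established detection method, and no threshold is implied.

```python
def font_mix_stats(fonts):
    """Summarize how two fonts are mixed across consecutive characters."""
    total = len(fonts)
    if total == 0:
        return {"arial_share": 0.0, "switch_rate": 0.0}

    arial = sum(1 for name in fonts if name == "Arial")
    # Count positions where the font differs from the previous character.
    switches = sum(1 for a, b in zip(fonts, fonts[1:]) if a != b)

    return {
        "arial_share": arial / total,
        "switch_rate": switches / (total - 1) if total > 1 else 0.0,
    }
```

A document set in a single font has a switch rate of 0, while one carrying a hidden payload will usually switch fonts many times within a single line of text.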

Legitimate Applications

  • Document Watermarking: Embed authorship or tracking information invisibly.
  • Copyright Protection: Mark documents with hidden ownership data.
  • Data Integrity: Embed checksums or verification codes.
  • Covert Communication: Send hidden messages in innocuous documents (combine with encryption).
  • Research and Education: Study steganography techniques and detection methods.

Technical Requirements

  • Encoding: Requires the reportlab library and the font files (Liberation Sans, Arial).
  • Decoding: Requires the PyMuPDF (fitz) library for PDF analysis.
  • Font Files: Both TTF font files must be available in the fonts/ directory.
  • Browser Compatibility: Works in all modern browsers; PDF generation happens server-side.
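For the decoding side, the sketch below shows one way the per-character fonts might be read with PyMuPDF. It relies on page.get_text("dict"), which returns blocks containing lines containing spans, each span carrying a "font" name and its "text". The helper names are hypothetical, and the name matching is an assumption: the font names stored in a PDF depend on the font files used (for example "ArialMT"), so the rules may need adjusting.

```python
def classify_font(font_name):
    """Map a PDF font name to one of the two expected fonts, or None."""
    # Assumption: substring matching is enough to tell the fonts apart.
    name = font_name.lower().replace(" ", "").replace("-", "")
    if "liberationsans" in name:
        return "LiberationSans"
    if "arial" in name:
        return "Arial"
    return None


def fonts_per_character(page_dict):
    """Expand the spans of a get_text("dict") result to one font per char."""
    fonts = []
    for block in page_dict.get("blocks", []):
        for line in block.get("lines", []):  # image blocks have no lines
            for span in line.get("spans", []):
                label = classify_font(span.get("font", ""))
                if label is None:
                    continue  # assumption: skip text in other fonts
                fonts.extend([label] * len(span.get("text", "")))
    return fonts


def fonts_from_pdf(path):
    """Collect per-character fonts from every page of a PDF."""
    import fitz  # PyMuPDF, imported here so the helpers above run without it

    fonts = []
    with fitz.open(path) as doc:
        for page in doc:
            fonts.extend(fonts_per_character(page.get_text("dict")))
    return fonts
```

Whether the extracted character count lines up exactly with what the encoder wrote depends on how the PDF was generated, particularly for spaces, so treat this as a starting point. The resulting list can then be converted to bits as described under Decoding Process.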