Character Frequency Analyzer
Guide
Paste any text and instantly see how often every character appears. The Character Frequency Analyzer counts every letter, digit, or symbol, ranks them, shows percentages, and renders a visual bar chart so you can spot patterns at a glance. It is a go-to companion for cryptanalysis, linguistics homework, password audits, content audits, and any time you need a precise count of what is in a body of text.
Unlike a word counter, this tool works at the character level. That makes it especially useful for breaking simple substitution ciphers, where the trick is to compare the observed letter distribution to the expected distribution of the source language. Toggle “Compare to English baseline” and the tool will show, for each letter, the standard English frequency and how much your text deviates from it.
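As a rough illustration of the baseline comparison, the sketch below computes each letter's observed percentage and its signed deviation from a commonly cited English frequency table. The table values and the function name `deviation_from_english` are assumptions made for this example; the tool's own baseline figures may differ slightly.

```python
from collections import Counter

# Approximate English letter frequencies in percent (commonly cited values;
# an assumption for this sketch, not necessarily the tool's exact baseline).
ENGLISH_PCT = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "R": 5.99, "D": 4.25, "L": 4.03, "C": 2.78,
    "U": 2.76, "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97,
    "P": 1.93, "B": 1.49, "V": 0.98, "K": 0.77, "J": 0.15, "X": 0.15,
    "Q": 0.10, "Z": 0.07,
}

def deviation_from_english(text):
    """Return {letter: (observed_pct, expected_pct, signed_deviation)}."""
    # Only ASCII letters are compared; case is folded to upper case.
    letters = [ch.upper() for ch in text if ch.isascii() and ch.isalpha()]
    total = len(letters)
    counts = Counter(letters)
    result = {}
    for letter, expected in ENGLISH_PCT.items():
        observed = 100.0 * counts[letter] / total if total else 0.0
        # Positive deviation: over-represented; negative: under-represented.
        result[letter] = (observed, expected, observed - expected)
    return result
```

A letter with a large positive deviation in a ciphertext is a natural first candidate for E or T.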
How to use
- Paste your text into the input box, or click “Try an example” to load a sample.
- Pick what you want to count: Letters only, Letters and digits, Printable (no whitespace), or All characters.
- Toggle Case sensitive if A and a should be counted separately.
- For substitution-cipher work, leave Compare to English baseline on to see deviations from the standard ETAOIN distribution.
- Sort any column by clicking its header. Use Copy as CSV, Download CSV, or Copy as JSON to export the table.
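The counting options in the steps above can be sketched in a few lines. This is a minimal illustration, not the tool's implementation: the mode names, the `count_characters` function, and the exact definition of "printable" (Python's `str.isprintable` minus whitespace) are all assumptions.

```python
from collections import Counter

# Hypothetical mode names mirroring the four counting options.
MODES = {
    "letters": str.isalpha,
    "letters_digits": str.isalnum,
    "printable": lambda ch: ch.isprintable() and not ch.isspace(),
    "all": lambda ch: True,
}

def count_characters(text, mode="letters", case_sensitive=False):
    """Count the characters selected by `mode`, optionally folding case."""
    if not case_sensitive:
        # Simplification: lower() can expand a few non-ASCII characters
        # into more than one code point.
        text = text.lower()
    keep = MODES[mode]
    return Counter(ch for ch in text if keep(ch))
```

For example, `count_characters("Aa b1!")` counts `a` twice and `b` once, while passing `case_sensitive=True` keeps `A` and `a` apart.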
Features
- Frequency table – Rank, character, count, percentage, and a proportional bar for every unique character.
- Four counting modes – Letters only, letters and digits, printable characters (no whitespace), or every code point including spaces and punctuation.
- Case sensitivity toggle – Treat A and a as the same character or count them separately.
- English baseline comparison – When counting letters, see the expected English percentage and the signed deviation in the same row, color-coded to show whether a letter is over- or under-represented.
- Shannon entropy – See the bits-per-character entropy of your text alongside the theoretical maximum for its alphabet size, useful for password strength and randomness checks.
- Sortable columns – Click any header to sort by rank, character, count, percentage, or deviation.
- Unicode-aware – Handles any code point, with friendly labels for whitespace and control characters.
- CSV and JSON export – Copy or download the table in either format, including character codepoints, for further analysis.
- Real-time updates – Results refresh automatically as you type or change options.
- Private by design – Everything runs in your browser. Your text is never uploaded.
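To make the table and export features concrete, here is a small sketch that builds ranked rows (rank, character, code point, count, percentage) and serializes them as CSV or JSON. The column names and the helper functions `frequency_rows`, `to_csv`, and `to_json` are hypothetical; the tool's actual export layout is not documented here.

```python
import csv
import io
import json
from collections import Counter

FIELDS = ["rank", "char", "codepoint", "count", "percent"]

def frequency_rows(text):
    """Build ranked rows for every unique character in `text`."""
    counts = Counter(text)
    total = sum(counts.values())
    rows = []
    # most_common() sorts by descending count; ties keep first-seen order.
    for rank, (ch, n) in enumerate(counts.most_common(), start=1):
        rows.append({
            "rank": rank,
            "char": ch,
            "codepoint": f"U+{ord(ch):04X}",
            "count": n,
            "percent": round(100.0 * n / total, 2),
        })
    return rows

def to_csv(rows):
    """Serialize rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def to_json(rows):
    """Serialize rows as JSON, keeping non-ASCII characters readable."""
    return json.dumps(rows, ensure_ascii=False, indent=2)
```

Including the code point in each row keeps whitespace and control characters unambiguous once the data leaves the page.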
FAQ
- What is character frequency analysis?
Character frequency analysis is the practice of counting how often each character appears in a body of text and comparing that distribution against an expected baseline. It dates back to 9th-century Arab mathematician Al-Kindi, who used it to break substitution ciphers, and it remains the foundation of classical cryptanalysis, statistical linguistics, and many modern compression and language-detection algorithms.
- What are the most common letters in English?
In standard English text the order is roughly E, T, A, O, I, N, S, H, R, D, L, U — often memorised as ETAOIN SHRDLU. E is by far the most common at about 12.7 percent, followed by T at 9.1 percent and A at 8.2 percent. The least common letters are J, Q, X, and Z, each under 0.2 percent. Real-world frequencies vary slightly with the source corpus, but the overall ranking is remarkably stable across modern English texts.
- How do you break a Caesar or substitution cipher with frequency analysis?
For monoalphabetic ciphers, count the letters in the ciphertext and rank them. A Caesar cipher is the easiest case: every letter is shifted by the same amount, so identifying a single letter (usually E) reveals the whole key. For a general substitution cipher, map the most frequent ciphertext letter to E, the next to T, and so on, then refine the mapping using common digrams (TH, HE, IN), trigrams (THE, AND, ING), and short words. With enough text the underlying language shines through. Polyalphabetic ciphers like Vigenère blunt this attack by smearing the distribution across multiple alphabets, but periodic structure can still be detected with the index of coincidence and Kasiski examination.
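For the Caesar case, the comparison against expected English frequencies can be automated by trying all 26 shifts and keeping the one whose letter counts fit English best. The sketch below scores each candidate with a chi-squared statistic against commonly cited approximate frequencies. Both the scoring choice and the helper names (`shift_text`, `chi_squared`, `crack_caesar`) are assumptions for illustration, and short ciphertexts may still be misjudged.

```python
import string
from collections import Counter

# Approximate English letter frequencies in percent, A through Z
# (commonly cited values; an assumption for this sketch).
ENGLISH_PCT = dict(zip(string.ascii_uppercase, [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77,
    4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98,
    2.36, 0.15, 1.97, 0.07,
]))

def shift_text(text, shift):
    """Shift ASCII letters by `shift` positions; leave everything else alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def chi_squared(text):
    """Lower scores mean the letter counts look more like English."""
    letters = [ch.upper() for ch in text if ch.isascii() and ch.isalpha()]
    total = len(letters)
    if total == 0:
        return 0.0
    counts = Counter(letters)
    score = 0.0
    for letter, pct in ENGLISH_PCT.items():
        expected = total * pct / 100.0
        score += (counts[letter] - expected) ** 2 / expected
    return score

def crack_caesar(ciphertext):
    """Return (shift, plaintext) for the best-scoring of the 26 shifts."""
    best = min(range(26), key=lambda s: chi_squared(shift_text(ciphertext, -s)))
    return best, shift_text(ciphertext, -best)
```

A general substitution cipher has far too many keys to enumerate this way, which is why the manual refinement with digrams and trigrams is still needed there.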
- What is Shannon entropy and why is it shown here?
Shannon entropy measures the average information content of a symbol from a given source, expressed in bits per character. A source that uses N symbols with equal probability reaches the maximum, log2(N). Real text is far less random. Counting single letters alone, English comes out at roughly 4.1 to 4.2 bits per letter against a maximum of about 4.7 for 26 letters, and estimates drop to about 1.0 to 1.5 bits per letter once context is considered. Comparing observed entropy to the maximum tells you how predictable a string is, which is useful for sanity-checking randomness, evaluating password strength, and detecting unusual content.
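A minimal sketch of the calculation, assuming the entropy is taken over single-character frequencies and that the "alphabet size" is the number of distinct characters observed (the tool may define the alphabet differently). With p(c) the share of character c in the text, the entropy is the sum over all characters of p(c) * log2(1 / p(c)).

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Entropy in bits per character, from observed character frequencies."""
    total = len(text)
    if total == 0:
        return 0.0
    counts = Counter(text)
    # Sum of p * log2(1 / p) over every distinct character.
    return sum((n / total) * math.log2(total / n) for n in counts.values())

def max_entropy(text):
    """Theoretical maximum, log2(N), for the N distinct characters seen."""
    distinct = len(set(text))
    return math.log2(distinct) if distinct else 0.0
```

A string such as `"aabb"` reaches its maximum of 1 bit per character because both symbols are equally likely, while `"aaaa"` scores 0 because nothing about the next character is uncertain.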
- Why does my text deviate from the English baseline?
Several reasons: short samples are noisy and naturally drift from population averages; technical writing over-represents the letters that dominate its jargon and abbreviations; non-English words, names, or code introduce letters with atypical frequencies; and intentional stylistic choices such as lipograms can suppress specific letters entirely. Large deviations in long, ordinary prose can be a fingerprint of obfuscation, encryption, or a language other than English.
