2 unstable releases
0.2.0 | Feb 28, 2025 |
---|---|
0.1.0 | Feb 28, 2025 |
cipher_identifier
A Rust library and CLI tool for identifying classical ciphers based on statistical analysis of ciphertext. It calculates various statistical metrics and compares them against known patterns for different cipher types to determine the most likely cipher used.
Features
- Analyzes ciphertext using multiple statistical tests
- Identifies the most likely cipher from 58 different classical cipher types
- Provides detailed statistical information about the ciphertext
- Command-line interface for easy use
- Can be used as a library in other Rust projects
Supported Ciphers
The tool can identify the following 58 classical cipher types:
Cipher Types | | | |
---|---|---|---|
6x6bifid | 6x6playfair | Autokey | Bazeries |
Beaufort | CONDI | Grandpre | Grandpre10x10 |
Gromark | NihilistSub6x6 | Patristocrat | Quagmire I |
Quagmire II | Quagmire III | Quagmire IV | Slidefair |
Swagman | Variant | Vigenere | amsco |
bifid | cadenus | checkerboard | cmBifid |
columnar | compressocrat | digrafid | foursquare |
fractionatedMorse | grille | homophonic | keyphrase |
monomeDinome | morbit | myszkowski | nicodemus |
nihilistSub | nihilistTramp | numberedKey | periodicGromark |
phillips | playfair | pollux | porta |
portax | progressiveKey | ragbaby | redefence |
routeTramp | runningKey | sequenceTramp | seriatedPlayfair |
simplesubstitution | syllabary | tridigital | trifid |
trisquare | twosquare |
Installation
Prerequisites
- Rust and Cargo (install from rustup.rs)
Building from Source
# Clone the repository
git clone https://github.com/yourusername/cipher_identifier.git
cd cipher_identifier
# Build the project
cargo build --release
# The binary will be available at target/release/cipher_identifier
Usage
Command Line Interface
# Analyze ciphertext provided directly
cipher_identifier --text "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG" --number 5
# Analyze ciphertext from a file
cipher_identifier --file path/to/ciphertext.txt --number 10
# Highlight a specific cipher in the results
cipher_identifier --text "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG" --cipher playfair
Command Line Options
- --text, -t: The ciphertext to analyze
- --file, -f: Input file containing ciphertext
- --number, -n: The top n most likely ciphers to display (default: 5)
- --cipher, -c: Highlight a specific cipher in the list
- --verbose, -v: Increase verbosity level
- --help, -h: Display help information
API Documentation
Library Overview
The cipher_identifier library provides a comprehensive API for analyzing ciphertext and identifying classical ciphers. It can be integrated into other Rust projects to add cipher identification capabilities.
Key Components
- Cipher Identification: Core functionality to identify the most likely cipher type
- Statistical Tests: Various algorithms to analyze text patterns
- Cipher Type Definitions: Data structures with cipher metadata
API Output Format
Cipher Identification
The main function identify_cipher returns a vector of CipherScore pairs, which are tuples of (String, f64) representing the cipher name and its score:
// Type definition
pub type CipherScore = (String, f64);
// Example output
[
    ("playfair", 1.234), // Lower scores indicate better matches
    ("bifid", 2.345),
    ("columnar", 3.456),
    ("vigenere", 4.567),
    ("simplesubstitution", 5.678)
]
The scores represent the "distance" between the statistical properties of the input text and the expected properties of each cipher type. Lower scores indicate better matches.
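One way to read the "distance" described above is as a sum of squared, spread-normalized differences between the observed statistics and a per-cipher reference profile. The sketch below assumes that form. The README does not state the library's actual metric, so `Profile`, `distance`, `rank`, and every number here are hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical reference profile for one cipher type:
/// (statistic name, expected mean, expected standard deviation).
struct Profile {
    name: &'static str,
    expected: Vec<(&'static str, f64, f64)>,
}

/// Sum of squared z-scores over the statistics listed in the profile.
/// Statistics missing from `observed` are skipped.
fn distance(observed: &HashMap<&str, f64>, profile: &Profile) -> f64 {
    let mut total = 0.0;
    for &(stat, mean, std_dev) in &profile.expected {
        if let Some(&value) = observed.get(stat) {
            let z = (value - mean) / std_dev;
            total += z * z;
        }
    }
    total
}

/// Scores every profile, sorts ascending (best match first), keeps the top `n`.
fn rank(observed: &HashMap<&str, f64>, profiles: &[Profile], n: usize) -> Vec<(String, f64)> {
    let mut scores: Vec<(String, f64)> = profiles
        .iter()
        .map(|p| (p.name.to_string(), distance(observed, p)))
        .collect();
    scores.sort_by(|a, b| a.1.total_cmp(&b.1));
    scores.truncate(n);
    scores
}

fn main() {
    // Made-up reference values, for illustration only.
    let profiles = vec![
        Profile { name: "cipher_a", expected: vec![("IoC", 1.7, 0.1), ("Shannon", 4.1, 0.2)] },
        Profile { name: "cipher_b", expected: vec![("IoC", 1.0, 0.1), ("Shannon", 4.6, 0.2)] },
    ];
    let observed = HashMap::from([("IoC", 1.65), ("Shannon", 4.2)]);
    for (name, score) in rank(&observed, &profiles, 5) {
        println!("{}: {:.3}", name, score);
    }
}
```

Dividing by the standard deviation keeps statistics with large numeric ranges from dominating the total, which is why this sketch uses z-scores rather than raw differences.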
Statistical Tests
The get_all_stats function returns a HashMap<String, f64> containing the results of all statistical tests:
// Example output
{
    "IoC": 1.78,        // Index of Coincidence
    "MIC": 48.9,        // Mutual Index of Coincidence
    "MKA": 64.9,        // Mean Kappa Test
    "DIC": 16.5,        // Digraphic Index of Coincidence
    "EDI": 17.4,        // Even Distribution Index
    "LR": 4.9,          // Length Ratio
    "ROD": 48.1,        // Repeat Order Distribution
    "LDI": 279.1,       // Letter Distribution Index
    "SDD": 68.2,        // Standard Deviation Distribution
    "Shannon": 4.2,     // Shannon Entropy
    "BinaryRandom": 1.0 // Binary Random Test (1.0 = passed, 0.0 = failed)
}
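Because the result is a plain HashMap, two standard-library behaviors are worth keeping in mind: indexing with a missing key panics, and iteration order is unspecified. The sketch below works on a hand-built map with the same shape. The helpers `stat` and `sorted_entries` are hypothetical and not part of the library.

```rust
use std::collections::HashMap;

/// Looks up one statistic by name, returning `None` when the key is absent
/// (indexing with `stats["IoC"]` would panic instead).
fn stat(stats: &HashMap<String, f64>, name: &str) -> Option<f64> {
    stats.get(name).copied()
}

/// Returns the entries sorted by key so repeated runs print in the same order.
fn sorted_entries(stats: &HashMap<String, f64>) -> Vec<(&str, f64)> {
    let mut entries: Vec<(&str, f64)> =
        stats.iter().map(|(k, v)| (k.as_str(), *v)).collect();
    entries.sort_by(|a, b| a.0.cmp(b.0));
    entries
}

fn main() {
    // Hand-built stand-in for the map returned by the library.
    let stats: HashMap<String, f64> = HashMap::from([
        ("Shannon".to_string(), 4.2),
        ("IoC".to_string(), 1.78),
    ]);

    match stat(&stats, "IoC") {
        Some(value) => println!("IoC = {:.3}", value),
        None => println!("IoC not available"),
    }
    for (name, value) in sorted_entries(&stats) {
        println!("{}: {:.3}", name, value);
    }
}
```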
Cipher Type Information
The CipherType struct provides metadata about each cipher type:
// Example output for a single cipher type
CipherType {
    types: ["substitution"],                // Primary classification
    subtypes: ["polygraphic"],              // Secondary classification
    subtypes2: ["fractionation"],           // Tertiary classification
    table: ["polybius square"],             // Table or grid used
    size: "6x6",                            // Size of the cipher
    notes: "Uses a 6x6 grid instead of 5x5" // Additional notes
}
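For orientation, a struct with this shape could look like the sketch below. The field names come from the example output; the field types, the map keyed by cipher name, and the `primary_type` helper are assumptions for illustration and may differ from the library's real definitions.

```rust
use std::collections::HashMap;

/// Field names follow the example output; the types are assumed.
#[allow(dead_code)] // not every field is read in this small sketch
#[derive(Debug, Clone, Default)]
struct CipherType {
    types: Vec<String>,
    subtypes: Vec<String>,
    subtypes2: Vec<String>,
    table: Vec<String>,
    size: String,
    notes: String,
}

/// Hypothetical helper: the first entry of `types`, or "unknown" when the
/// cipher is not in the map or has no classification.
fn primary_type(cipher_types: &HashMap<String, CipherType>, cipher: &str) -> String {
    cipher_types
        .get(cipher)
        .and_then(|ct| ct.types.first())
        .cloned()
        .unwrap_or_else(|| "unknown".to_string())
}

fn main() {
    let mut cipher_types = HashMap::new();
    cipher_types.insert(
        "example_cipher".to_string(), // illustrative key, not a real entry
        CipherType {
            types: vec!["substitution".to_string()],
            size: "6x6".to_string(),
            ..Default::default()
        },
    );
    println!("{}", primary_type(&cipher_types, "example_cipher"));
    println!("{}", primary_type(&cipher_types, "not_listed"));
}
```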
Using the API
Identifying Ciphers
use cipher_identifier::identify_cipher;
fn main() {
    let text = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG";
    // Get the top 5 most likely ciphers
    let results = identify_cipher(text, 5, None);
    for (cipher, score) in results {
        println!("{}: {:.3}", cipher, score);
    }
}
Getting Statistical Information
use cipher_identifier::statistical_tests::all_stats::get_all_stats;
fn main() {
    let text = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG";
    // Get all statistical test results
    let stats = get_all_stats(text);
    for (test, value) in stats {
        println!("{}: {:.3}", test, value);
    }
}
Using the CipherAnalyzer
use cipher_identifier::CipherAnalyzer;
fn main() {
    let text = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG";
    let analyzer = CipherAnalyzer::new();
    // Display basic statistics
    analyzer.display_basic_stats(text);
    // Identify the cipher
    analyzer.identify_cipher(text, 5, None).unwrap();
}
Loading Cipher Type Definitions
use cipher_identifier::models::cipher_type::{load_cipher_types, get_cipher_primary_type};
fn main() {
    // Load cipher type definitions
    let cipher_types = load_cipher_types("resources/cipher_types.json").unwrap();
    // Get the primary type of a specific cipher
    let primary_type = get_cipher_primary_type(&cipher_types, "playfair");
    println!("Primary type of playfair: {}", primary_type);
}
Integration Example
Here's a complete example of how to integrate the cipher_identifier library into another project:
use cipher_identifier::{
    identify_cipher,
    statistical_tests::all_stats::get_all_stats,
    models::cipher_type::{load_cipher_types, get_cipher_primary_type},
};

fn analyze_ciphertext(text: &str) {
    // Step 1: Get statistical information
    let stats = get_all_stats(text);
    println!("Statistical Analysis:");
    println!("IoC: {:.6}", stats["IoC"]);
    println!("Shannon Entropy: {:.6}", stats["Shannon"]);
    // Step 2: Identify the cipher
    let results = identify_cipher(text, 5, None);
    // Step 3: Load cipher type definitions for additional information
    let cipher_types = load_cipher_types("resources/cipher_types.json").unwrap_or_default();
    // Step 4: Display results with additional information
    println!("\nMost Likely Ciphers:");
    for (i, (cipher, score)) in results.iter().enumerate() {
        let primary_type = get_cipher_primary_type(&cipher_types, cipher);
        println!("{}. {} (Score: {:.3}, Type: {})",
            i + 1, cipher, score, primary_type);
    }
}

fn main() {
    let text = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG";
    analyze_ciphertext(text);
}
Statistical Tests
The tool uses the following statistical tests to analyze ciphertext:
- IoC (Index of Coincidence): Measures the probability of two randomly selected letters being the same
- MIC (Mutual Index of Coincidence): Measures the maximum periodic index of coincidence
- MKA (Mean Kappa Test): Measures the average kappa value for the text
- DIC (Digraphic Index of Coincidence): Measures the frequency of digraphs (pairs of letters)
- EDI (Even Distribution Index): Measures how evenly distributed the digraphs are
- LR (Length Ratio): Measures the ratio of unique n-grams to total possible n-grams
- ROD (Repeat Order Distribution): Measures the distribution of repeated patterns
- LDI (Letter Distribution Index): Measures how closely the letter distribution matches expected frequencies
- SDD (Standard Deviation Distribution): Measures the standard deviation of letter frequencies
- Shannon Entropy: Measures the information content or randomness of the text
- Binary Random Test: Tests whether the text appears random when converted to binary
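Two of the tests listed above have standard textbook definitions that fit in a few lines. The sketch below computes the Index of Coincidence as the raw probability that two letters drawn without replacement are equal, and Shannon entropy in bits per letter. It is a standalone illustration, not the library's code; the library may normalize or scale its values differently, and the function names here are hypothetical.

```rust
/// Counts occurrences of each letter A-Z, ignoring case and non-letters.
fn letter_counts(text: &str) -> [usize; 26] {
    let mut counts = [0usize; 26];
    for b in text.bytes().filter(u8::is_ascii_alphabetic) {
        counts[(b.to_ascii_uppercase() - b'A') as usize] += 1;
    }
    counts
}

/// Probability that two letters drawn without replacement are the same.
/// Returns 0.0 for texts with fewer than two letters.
fn index_of_coincidence(text: &str) -> f64 {
    let counts = letter_counts(text);
    let n: usize = counts.iter().sum();
    if n < 2 {
        return 0.0;
    }
    let same_pairs: usize = counts.iter().map(|&c| c * c.saturating_sub(1)).sum();
    same_pairs as f64 / (n * (n - 1)) as f64
}

/// Shannon entropy of the letter distribution, in bits per letter.
/// Returns 0.0 for texts with no letters.
fn shannon_entropy(text: &str) -> f64 {
    let counts = letter_counts(text);
    let n: usize = counts.iter().sum();
    if n == 0 {
        return 0.0;
    }
    counts
        .iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / n as f64;
            -p * p.log2()
        })
        .sum()
}

fn main() {
    let text = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG";
    println!("IoC: {:.4}", index_of_coincidence(text));
    println!("Shannon: {:.4}", shannon_entropy(text));
}
```

A text made of one repeated letter gives an Index of Coincidence of 1 and an entropy of 0, while a text in which every letter is distinct gives an Index of Coincidence of 0.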
Benchmarking
The library includes a benchmarking module to test the accuracy of the cipher identification algorithm:
use cipher_identifier::benchmark::run_benchmark;
fn main() {
    run_benchmark("path/to/test_data.json");
}
The test data should be a file with one JSON object per line, each describing a single test case in the format:
{"ciphertype": "playfair", "ciphertext": "HELLOWORLD"}
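The README does not say what run_benchmark reports. A common way to measure identification accuracy is the fraction of test cases whose true cipher type appears among the top n candidates. The sketch below assumes that measure. `TestCase` mirrors the two JSON fields, while `top_n_accuracy` and the stub identifier are hypothetical and stand in for the real library call.

```rust
/// One benchmark case, mirroring the JSON fields shown above.
struct TestCase {
    ciphertype: String,
    ciphertext: String,
}

/// Fraction of cases whose true type appears among the first `n` names
/// returned by `identify`. Returns 0.0 for an empty case list.
fn top_n_accuracy<F>(cases: &[TestCase], n: usize, identify: F) -> f64
where
    F: Fn(&str) -> Vec<String>,
{
    if cases.is_empty() {
        return 0.0;
    }
    let hits = cases
        .iter()
        .filter(|case| {
            identify(&case.ciphertext)
                .iter()
                .take(n)
                .any(|name| name == &case.ciphertype)
        })
        .count();
    hits as f64 / cases.len() as f64
}

fn main() {
    let cases = vec![
        TestCase { ciphertype: "playfair".into(), ciphertext: "HELLOWORLD".into() },
        TestCase { ciphertype: "bifid".into(), ciphertext: "HELLOWORLD".into() },
    ];
    // Stub identifier that always returns the same ranking; a real run would
    // call the library here instead.
    let stub = |_text: &str| vec!["playfair".to_string(), "bifid".to_string()];
    println!("top-1 accuracy: {:.2}", top_n_accuracy(&cases, 1, stub));
    println!("top-2 accuracy: {:.2}", top_n_accuracy(&cases, 2, stub));
}
```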
License