Cryptography Cheatsheet
AKA I've only got a day until my exam what do I really need to know?
Note: Difficult (todo if time)
- AES key schedule
- Galois field
Classic Ciphers
- Before sending date, compress, then encrypt, then add error detection
- Either substitution (swapping letters for other letters) or transposition (moving letters around)
- Combination of both is good for encryption (want to maximise confusion and diffusion), ideally use product ciphers to do both (transposition and substitution)
- Can use techniques such as frequency analysis, index of coincidence and the Kasisky Test to figure out classic ciphers
- Steganography is the practice of hiding a message in plain sight
- Alter the image or use a null cipher or use something like microdots embedded in the image
Ciphers
- Caesar: shift of \(N\) letters of the alphabet for all.
- Vignere: multiple alphabets with different shifts and a keyword
- Enigma: combination of confusion and diffusion
- Scytale: wrap around a dowel of the same diameter, then write across the ribbon
- Railfence: \(N\) rails, write in a zig-zag, then read off row-by-row
- Columnar: Use a keyword, write the message in \(N\) columns same as length of keyword, then sort the keyword alphabetically
Tests
- Kasisky: Decrypt ciphertest with some key of length \(N\), then search for identical segments length \(N\). Key length is GCD of the distances between the segments.
- Index of Coincidence: Probability of two random letters from the text being identical. \(IC = \sum_{i=a}^z P_i^2\). Will be similar to English if monoalphabetic or transposition, so will differ for Vignere or other text not decrypted properly
- Message Length: can sometimes deduce that the key length divides the total length of the message
- Fitness Metrics: Can look at \(n\)-grams from the text and compare to \(n\)-gram frequency for English text
Cracking the Key
- Can brute force up to about 10 key length then gets difficult
- Can use fitness metrics to order results
- Hill Climbing: generate random starting keyword, then measure fitness of resulting text. Generate child keys, then measure their fitness. Choose higher fitness child keys as new seed keys
Terminology
- Private key, or symmetric cryptography: shared key between parties. Typically used for session keys after setup with asymmetric.
- E.g., DES, 3DES, Twofish, Rijndael
- Public key, or asymmetric: use two different keys, one private, one public
- Relies on trapdoor functions
- For sending data between parties where we cannot safely send a symmetric key
- E.g., RSA, ED25519
- One way function: \(f(x)\) trivial, given \(f,x\). Evaluation of \(f^{-1}\) should not be trivial.
- Trapdoor One Way Function: \(f^{-1}\) hard to get unless we have function \(g\) which we define at the same time as \(f\).
Shannon and Kirchoff
- Entropy: \(H(X) = \sum_i p_i \log_2(1/p_i)\). Similar messages have lower entropy
- Compression: theoretical min number of bits to store the data. This is \(B(X) = nH(X)\), where \(n\) is length of data
- Shannon Perfect secrecy: For all messages in the message space of the same length, the probability of each ciphertext appearing should be the same
- Computational perfect secrecy: similar idea but theoretically different just needs a lot of computing power
- Cipher must not depend on the secrecy of the mechanism. Inner workings should be publicly available
- Permutation causes confusion
- Transposition causes diffusion
- Want to have a cipher cause an avalanche effect, altering lots of bits in resultant text if we change the cipher.
Attack Types
- Known-plaintext: access to both the plaintext and the ciphertext for a specific cipher and can then gain marginally better performance solving the cipher compared to brute force or general sieve
- Chosen-plaintext: Can choose our own plaintext messages and get the ciphertext for them.
- Ciphertext-only: Nothing other than the ciphertext. Anything vulnerable to this is completely insecure
- Side channel and quantum
- Ideally, a cipher should be resilient to these
- Unsalted text can be broken by exhaustively storing it and using things such as rainbow tables
Symmetric Ciphers
- Either stream or block. Stream is one bit or byte at a time, block is a block of bits or bytes with a fixed size
- E.g., AES is 128, or 192 bits in a block, and XOR is a stream cipher
- Stream ciphers use some bitstream generation algorithm to get a bitstream to XOR with the data
- Once key compromised, then can decrypt all data used. Rekey often
- Cipher defined over \((K,M,C)\) as algorithms \((E,D)\), where encryption followed by decryption with the same key yields the same plaintext
- Pseudorandom functions map some key in \(K\) and message in \(X\) to an output \(Y\)
- Pseudorandom permutations are a subset where \(X=Y\) and thus \(PRP: X \times K \to X\).
- PRPs must be invertible, PRFs don't have to be
- Security of PRPs is defined in terms of efficient algorithms:
- \(Adv[A,e] = |Exp_A[E(k,m_1)]-Exp_A[E(k,m_2)]|\)
- Ideal block cipher would have truly random permutation of each of the bits in the block (e.g., \(01001010 \to 00100101\) or anything else in \(\{0,1\}^8\))
- This would mean storing huge amounts of mapping data
DES
- 1971, IBM invented
- Adopted 1976
- Key length 56 bits, block length of 64 bits
- Broken 1997
- Replaced with AES 2000
- Uses feistel structure
- Split plaintext into LHS and RHS
- XOR LHS with some \(F\) applied to RHS
- Then switch result from XOR to the RHS and use previous RHS as LHS
- At end do XOR and \(F\) but don't swap
- Can choose many different parameters
- DES uses 16 rounds Feistel with block size 64, key length 56
- \(F\) takes 32 bits, expands to 48 bits, combined with round key of 48 and XORed, then reduced back to 32 bits
- All these operations invertible
- Has some weak keys
- Can also be brute forced as key space is small
- Replaced with double-DES but susceptible to a meet-in-the-middle attack
- Ultimately, 3DES works with two keys, and two keys as EDE means that we can use legacy systems with new hardware for 3DES, using the same key \(k1=k2\) for DES hardware
AES
- New standard started consulting in 1997
- 128- 192- and 256-bit key sized
- Substitution-permutation based network, not Feistel
- Each 128 bits into 16 bytes, stored in 4x4 matrix
- This is the state
- Consider the bytes as a Galois field
- Flow:
- Add round key (simple XOR)
- Then loop on this:
- byte substitute using LUT based on GF
- Shift row (left shift by 0,1,2,3,4)
- Mix column (left multiply by specific values)
- Add round key (again simple XOR)
- Then loop unless last round
- If last round do byte substitution, shift row, then add round key
- Key schedule for deriving round keys, quite complicated
Block Cipher Modes
- If using cipher with same key would yield the same result
- Need to consider padding, and how to break plaintext into blocks
- Want to ensure that encryption of the same block yields different ciphertext, e.g., same 128 bits would not give same ciphertext
- This is where modes come in handy
- Naïve implementation is ECB, storing \(C_i = E_K(P_i)\).
- Using a mode will allow us to fix this problem
- CBC: takes initial vector then encrypts with previous ciphertext XORed with the current block
- Can't run in parallel
- Transmission errors compound
- Need to rekey to prevent chosen plaintext every \(2^{12}\) blocks
- CTR: initialize a counter to some IV then increment by 1 each block. XOR the encrypted CTR with the plaintext to get the ciphertext
- Can parallelize
- But is more prone to CPA and IV needs to be communicated
- CFB: Use output of the previous block and encrypt with the key and this previous block, then XOR with the next plaintext block
- OFB: Similar to CFB but only send some bits from the output
Number Theory
- Mainly focused around modular arithmetic
- Mod congruence is where we wrap round at the modulus
- Addition, multiplication do as normal then take mod congruence
- Modular inversion or multiplicative inverse is where we have \(xy=1 (\mod n)\) for a chosen \(x\) with \(y\) as the inverse
- We denote \(y\) as \(x^{-1}\)
- Division is a multiplication of the multiplicative inverse, or modular inversion of 1/denominator
- So \(2/3\) in \(\mathbb{Z}_7\) would be \(2 \times \frac{1}{3}\), and multiplicative inverse of \(\frac{1}{3}\) which is \(3y=1\), so choose 5.
- Then have \(2 * 5 = 10 = 3\)
- Not all elements have an inverse
- This is only if \(\gcd(x,N)= 1\)
- Euclid algorithm \(\gcd(r_0, r_1) = \gcd(r_0 - r_1, r_1)\)
- This is the classical algorithm
- We can apply it to modular arithmetic: \(\gcd(r_0, r_1) = \gcd(r_1, r_0 \mod r_1)\)
- Inverses can be calculated with extended Euclid
- First check \(\gcd(x,N) = 1\), then find \(a,b | ax + bN = 1\)
- \(a\) is then the inverse of \(x\) in \(\mathbb{Z}_n\).
- Extended Euclid can be done by expressing \(\gcd(x,y)\) as \(x = y(k) + n\), where \(k\) is some integer we don't really care about, and \(n\) is the remainder.
- In the LHS, for \(\gcd(8,11) = 1\), we have \(11 = 8(1) + 3\)
- On the RHS, we then rearrange for \(3\)
- Do this down the chain, then rearrange and back substitute
- Division can be rearranged as the quotient and then 1/denominator
Modular Inversion
- So far have seen this naïvely
- This is where we get \(xy=1\) for a chosen \(x\) with \(y\) inverse
- Can use Fermat's little theorem instead
- Only if modulus is prime \(p\)
- \(\forall x \in (\mathbb{Z}_p)^*: x^{p-1} = 1 (\mod p)\)
- \((\mathbb{Z}_N)^*\) is the set of invertible elements
- So for \(p=5\), we know that \(x^{5-1} = 1\), and is thus invertible
- So \(3^4 = 81 = 1\) in \(\mathbb{Z}_5\)
- Modular inverse can be found by taking \(xx^{p-2} = 1 = x^{-1} = x^{p-2}\) in \(\mathbb{Z}_p\)
- Less efficient than Euclid
Roots
- If \(p\) prime, \(c \in \mathbb{Z}_p\), then \(x \in \mathbb{Z}_p\) such that \(x^e=c\). \(x\) is then an \(e\)th root of \(c\)
- If \(\gcd(e,p-1) = 1\), then \(d\) is the inverse of \(e\) in \(\mathbb{Z}_{p-1}\) (\(d=e^{-1}\) in \(\mathbb{Z}_{p-1}\))
- Two cases with different solutions \(\gcd(e,p-1) = 1\) and \(\ne 1\)
- If \(\ne 1\), then have to use quadratic residue
- QR is an element \(x\) in \(\mathbb{Z}_p\) if it has a square root in \(\mathbb{Z}_p\)
- Euler's Theorem: let \(p\) an odd prime, if \(x\) in \((\mathbb{Z}_p)^*\) is QR, then \(x^{(p-1)/2} = 1\) in \(\mathbb{Z}_p\)
- To get QR, take every element and raise to the power \((p-1)/2\)
- If this equals 1, then element is QR
- Square root modular odd prime
- Either \(p=3 (\mod 4)\)
- or \(p = 1 (\mod 4)\)
- First allows us to use theorem if \(c \in (\mathbb{Z}_p)^*\) is QR, then \(\sqrt{c} = c^{\frac{p+1}{4}}\) in \(\mathbb{Z}_p\)
- E.g. \(\sqrt{6}\) in \(\mathbb{Z}_{43}\) would be \(p = 43 = 3 \mod 4\), which gives \(\sqrt{6} = 6^{\frac{43+1}{4}} = 6^{11} = 36\)
Groups
- Hold 4 properties over a binary operation \(*\) on \(G\)
- These 4 properties are closure, associativity, identity and invertability
- Order is size or number of elements in the group
- Denote groups as set of numbers, and the binary operation
- E.g., \((\mathbb{Z}_{12}, +)\)
- Abelian groups are where commutativity holds
- Totient function \(\phi\) is Euler function
- This is \(\phi(N) = | (\mathbb{Z}_N)^*|\), with two cases:
- For \(N = p\) where \(p\) prime, \(\phi(N) = N-1\)
- For \(N = pq\), where \(p,q\) prime, \(\phi(N) = (p-1)(q-1)\)
- This function used as a generalization of Fermat's little theorem:
- Fermat: \(\forall x \in (\mathbb{Z}_p)^*: x^{p-1} = 1\) in \(\mathbb{Z}_p\)
- Euler: \(\forall x \in (\mathbb{Z}_N)^*: x^{\phi(N)} \mod N = 1\)
- This is used for RSA
Intractable Problems
- Discrete log: given \(y=2^x\), try and compute \(x = D\log_2(y)\):
- Generalise to \(y \mapsto g^x\)
- Can compute \(y\) given \(g,x\)
- Given \(g,y\), can't easily compute \(x\)
- Fastest is \(e^{O(\sqrt[3]{N})}\), general number field sieve
- Factoring problem: factoring large numbers:
- Best currently general number field sieve
- Forms basis for RSA
Asymmetric
- Key pair with private and public part
Trapdoor Functions
- Makes use of trapdoor functions \(f,f^{-1}\), where we are given \(f^{-1}\) as the private key, but hard to get otherwise
- Technically a triple of \((G,F,F^{-1})\), comprising:
- \(G\) keygen
- \(F\) encryption function with public key
- \(F^{-1}\) decryption function with private key
- Secure if \(F\) is one way
- Must assume that the adversary can see \(C\), the public key and \(F\)
- Don't use by applying directly to plaintext
RSA
- Choose random \(p,q\), then set \(N = pq\)
- Choose \(e,d\) such that \(ed = 1 \mod \phi(N)\)
- Public key is \(N,e\)
- Private key is \((p,q,d)\)
- To encrypt, take \(x^e\), to decrypt take \(y^d\)