Innovation and Cryptoventures Cryptology Campbell R. Harvey Duke University, NBER and Investment Strategy Advisor, Man Group, plc January 20, 2017
Overview
Cryptology
  Cryptography: science of making things secret
    Symmetric ciphers: share a secret key
    Asymmetric ciphers: share a public key, but each party has a secret private key
    Protocols: applications of cryptographic algorithms, like TLS
  Cryptanalysis: science of breaking cryptosystems
Overview Process of concealing messages Greek κρυπτω meaning secret or hidden Used for 4,000 years Early techniques involved concealed writing/symbols Parchments that had to be wrapped around a rod of a specific size to figure out the message Material drawn liberally from M. Cozzens and S. J. Miller, The Mathematics of Encryption, 2013.
Overview We will not talk about steganography This is the practice of concealing a message In contrast to cryptography, steganography does not attract any attention In cryptography, you encrypt the content of the message In steganography, you focus on hiding the fact that a secret message is even being sent
Polybius square 300 400 BCE Polybius advocated a square (originally using the Greek alphabet) Note that i/j are ambiguous Read off row, column. CAM = 13, 11, 32
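The row/column lookup can be sketched in a few lines of Python. This is a minimal illustration that assumes the common 5x5 Latin-alphabet layout with J folded into I; the function names are made up for this example.

```python
import string

# 25-letter alphabet: J is folded into I, so i/j share one cell
ALPHABET = string.ascii_uppercase.replace("J", "")

def polybius_encode(text):
    """Return one 'row column' code per letter, both counted from 1."""
    codes = []
    for ch in text.upper().replace("J", "I"):
        if ch in ALPHABET:
            row, col = divmod(ALPHABET.index(ch), 5)
            codes.append(f"{row + 1}{col + 1}")
    return codes

def polybius_decode(codes):
    """Invert polybius_encode (a decoded I may have been a J)."""
    return "".join(ALPHABET[(int(c[0]) - 1) * 5 + int(c[1]) - 1] for c in codes)

print(polybius_encode("CAM"))  # ['13', '11', '32']
```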
Cipher From Arabic, sifr, meaning nothing Method of concealment where letters are replaced by other letters, numbers or symbols or the order of the letters is shifted Code is related but different. Code is a method of concealment that uses words, numbers or syllables to replace original words or phrases (does not appear until modern times). Texting short forms, e.g. ttyl, would not qualify because everyone knows them. Ciphers traditionally have been broken by frequency analysis. For example, e and t are the two most common English letters.
Caesar Cipher: shift letters by a fixed number of places (originally 3). The shift, here 3, is called the key; it could be arbitrary. With +3, CAM = FDP. Not very secure.
Caesar Cipher is an early example of using modulo arithmetic. If we shifted +26 (or −26), we end up with the regular alphabet. If we shifted +27, it is the same as +1. If we shifted +54, it is the same as +2. A clock is modulo 12. Note: modulo arithmetic is very important for advanced encryption.
Caesar Cipher is an early example of using modulo arithmetic. Let A=0, B=1, …, Z=25. Then: Encrypted(x) = (x + k) mod 26. Here k is the shift or key, and mod is the modulo operation (denoted by % in the Python code on the earlier slide). The Caesar cipher is a special case of an affine cipher; more generally, Encrypted(x) = (ax + k) mod 26, with a=1 for Caesar.
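The formula can be tried directly in Python. This is a minimal sketch with illustrative function names; it leaves non-letters unchanged. For the general affine case, a must share no factor with 26, otherwise two plaintext letters map to the same ciphertext letter and the message cannot be decrypted.

```python
def affine_encrypt(plaintext, a, k):
    """Apply E(x) = (a*x + k) mod 26 to A-Z; other characters pass through."""
    out = []
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            x = ord(ch) - ord("A")          # A=0, B=1, ..., Z=25
            out.append(chr((a * x + k) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

def caesar_encrypt(plaintext, k=3):
    """Caesar is the affine cipher with a = 1."""
    return affine_encrypt(plaintext, a=1, k=k)

def caesar_decrypt(ciphertext, k=3):
    """Shifting back by k undoes the encryption."""
    return caesar_encrypt(ciphertext, -k)

print(caesar_encrypt("CAM"))      # FDP
print(caesar_encrypt("CAM", 27))  # DBN, the same as a shift of +1
```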
Definition Plaintext is the message you want to encrypt (e.g. CAM) Ciphertext is the encrypted message (e.g. FDP)
Caesar Cipher is a monoalphabetic cipher. Each plaintext letter always maps to the same ciphertext letter. Easy to crack: brute force requires only 25 different tries.
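To make the "25 tries" concrete, the sketch below lists every candidate decryption of a ciphertext. The helper names are illustrative; a person (or a dictionary check, not shown) still has to pick the candidate that reads as English.

```python
def shift(text, k):
    """Shift A-Z letters by k places, wrapping around modulo 26."""
    return "".join(
        chr((ord(c) - ord("A") + k) % 26 + ord("A")) if "A" <= c <= "Z" else c
        for c in text.upper()
    )

def brute_force(ciphertext):
    """Map each possible key 1..25 to the plaintext it would imply."""
    return {k: shift(ciphertext, -k) for k in range(1, 26)}

candidates = brute_force("FDP")
print(len(candidates))  # 25
print(candidates[3])    # CAM
```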
It is also possible to use a keyword (with no repeating letters). Suppose keyword = CIPHER; then CAM = PCY.
Normal alphabet: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Cipher alphabet: C I P H E R S T U V W X Y Z A B D F G J K L M N O Q
But this is just one of many possible alternative reorderings.
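One way to build such a cipher alphabet in Python is sketched below. It assumes the rule implied by the table: after the keyword, the remaining letters continue from the letter following the keyword's last letter and wrap around. Other keyword ciphers restart from A instead. Function names are illustrative.

```python
import string

def keyword_alphabet(keyword):
    """Keyword first, then the unused letters starting after its last letter."""
    keyword = keyword.upper()
    start = string.ascii_uppercase.index(keyword[-1]) + 1
    rotated = string.ascii_uppercase[start:] + string.ascii_uppercase[:start]
    return keyword + "".join(c for c in rotated if c not in keyword)

def keyword_encrypt(plaintext, keyword):
    """Substitute each normal-alphabet letter with its cipher-alphabet letter."""
    table = str.maketrans(string.ascii_uppercase, keyword_alphabet(keyword))
    return plaintext.upper().translate(table)

print(keyword_alphabet("cipher"))        # CIPHERSTUVWXYZABDFGJKLMNOQ
print(keyword_encrypt("CAM", "cipher"))  # PCY
```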
Many other monoalphabetic ciphers. There are 26! (factorial, i.e. 26 × 25 × 24 × … × 1) ways to reorder the alphabet. This is a large number (about 4.032914611 × 10^26) of distinct ciphers. Brute force: if you could try 1 trillion combinations a second, it would take roughly 12.8 million years to try all combinations.
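The brute-force estimate is easy to check. The sketch below assumes 1 trillion tries per second and a 365.25-day year; under those assumptions the exhaustive search comes out on the order of ten million years.

```python
import math

orderings = math.factorial(26)           # number of distinct cipher alphabets
rate = 10**12                            # assumed tries per second (1 trillion)
seconds_per_year = 365.25 * 24 * 3600    # assumed year length
years = orderings / rate / seconds_per_year

print(f"{orderings:.9e}")                # 4.032914611e+26
print(f"about {years / 1e6:.1f} million years")
```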
Breaking monoalphabetic ciphers However, you do not need brute force These ciphers are vulnerable to frequency analysis https://en.wikipedia.org/wiki/letter_frequency
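A minimal sketch of frequency analysis against a Caesar-shifted message follows. It assumes the most common ciphertext letter stands for E, which is only reliable for reasonably long English text. The sample ciphertext is "THE SECRET MESSAGE IS HERE" shifted by 3, made up for this example.

```python
from collections import Counter

def letter_frequencies(text):
    """Return (letter, relative frequency) pairs, most common first."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    counts = Counter(letters)
    return [(ch, n / len(letters)) for ch, n in counts.most_common()]

def guess_caesar_shift(ciphertext):
    """Guess the key by assuming the top ciphertext letter is an encrypted E."""
    top_letter = letter_frequencies(ciphertext)[0][0]
    return (ord(top_letter) - ord("E")) % 26

sample = "WKH VHFUHW PHVVDJH LV KHUH"
print(letter_frequencies(sample)[0][0])  # H
print(guess_caesar_shift(sample))        # 3
```

For a general monoalphabetic cipher the same counts are matched letter by letter against a reference table of English frequencies rather than reduced to a single shift.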
Properties of Valid Ciphers Properties of a valid encryption scheme Easy to encrypt Easy to transmit Easy to decode If intercepted, should be hard to decode Ideally, source of message should be validated
Even more advanced: polyalphabetic substitution. Use of the Vigenère square (just like Caesar but with all possible starting points). Define a keyword (called the keystream) and repeat it to make it as long as your message. Suppose my key is BTC:
Message:   C A M H A R V E Y G U I L T Y
Keystream: B T C B T C B T C B T C B T C
Encryption
CAM HARVEY
BTC BTCBTC
For C, go to the row beginning with B (first letter of BTC) and read off the letter in the column headed by C in the first row (which is D). For A, go to the row beginning with T and read off the letter in the first column (the A column), which is T. For M, go to the row beginning with C and read off the letter under M, which is O, etc.
There are 25 reorderings with the Vigenère square. But the square is just a visual way of doing modulo arithmetic. Let A=0, B=1, …, Z=25.
Message:      C  A  M  H  A  R  V  E  Y  G  U  I  L  T  Y
Keystream:    B  T  C  B  T  C  B  T  C  B  T  C  B  T  C
Message #:    2  0 12  7  0 17 21  4 24  6 20  8 11 19 24
Keystream #:  1 19  2  1 19  2  1 19  2  1 19  2  1 19  2
Sum mod 26:   3 19 14  8 19 19 22 23  0  7 13 10 12 12  0
Ciphertext:   D  T  O  I  T  T  W  X  A  H  N  K  M  M  A
Example: 19 + 19 = 38, and 38 mod 26 = 12 (divide 38 by 26 and the remainder is 12). Excel: =MOD((row1 + row2), 26)
There are 25 reorderings with the Vigenère square. Easy to decipher: write down the ciphertext with the keystream underneath and subtract.
Ciphertext:       D  T  O  I  T  T  W  X  A  H  N  K  M  M  A
Keystream:        B  T  C  B  T  C  B  T  C  B  T  C  B  T  C
Ciphertext #:     3 19 14  8 19 19 22 23  0  7 13 10 12 12  0
Keystream #:      1 19  2  1 19  2  1 19  2  1 19  2  1 19  2
Difference mod 26: 2  0 12  7  0 17 21  4 24  6 20  8 11 19 24
Message:          C  A  M  H  A  R  V  E  Y  G  U  I  L  T  Y
Excel: =MOD((row1 - row2), 26)
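The add-then-subtract arithmetic can be sketched in Python as a single function with a decrypt flag. The function name and flag are illustrative. Python's % operator returns a non-negative result for a positive modulus, so the subtraction wraps correctly, much like Excel's MOD.

```python
from itertools import cycle

def vigenere(text, key, decrypt=False):
    """Add (or subtract, when decrypting) the repeating keystream modulo 26."""
    sign = -1 if decrypt else 1
    keystream = cycle(key.upper())
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            k = ord(next(keystream)) - ord("A")
            out.append(chr((ord(ch) - ord("A") + sign * k) % 26 + ord("A")))
        else:
            out.append(ch)  # non-letters pass through and do not consume keystream
    return "".join(out)

print(vigenere("CAMHARVEYGUILTY", "BTC"))                # DTOITTWXAHNKMMA
print(vigenere("DTOITTWXAHNKMMA", "BTC", decrypt=True))  # CAMHARVEYGUILTY
```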
SEAN WIEUIIUZH DTG CNP LBHXGK OZ BJQB FEQT XZBW JJOY TK FHR TPZWK PVU RYSQ VOUPZXGG OEPH CK UASFKIPW PLVO JIZ HMN NVAEUD XYF DURJ BOVPA SF MLV FYYRDE LVPL MFYSIN XY FQEO NPK M OBPC FYXJFHOHT AS ETOV B OCAJDSVQU M ZTZV TPHY DAW FQTI UTTJ J DOGOAIA FLWHTXTI QMTR SEA LVLFLXFO
Transposition Cipher
Letters remain the same but the order is scrambled. Start with a keyword, say BTC. Write down the alphabetical order of the letters in the keyword. Fill out a rectangle with the message. Read off the columns in order.
Keyword: B T C
Order:   1 3 2
         Y O U
         R P H
         O N E
         I S N
         O T S
         E C U
         R E A   (the final A fills the left over space)
Ciphertext: YROIOER (Col #1) UHENSUA (Col #3) OPNSTCE (Col #2) = YROIOERUHENSUAOPNSTCE
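The fill-then-read procedure can be sketched as follows. The function names and the choice of "A" as padding letter are assumptions for illustration; the padding matches the filler in the rectangle on this slide.

```python
def column_order(keyword):
    """Column indices sorted by the alphabetical rank of the keyword letters."""
    return sorted(range(len(keyword)), key=lambda i: keyword[i])

def columnar_encrypt(message, keyword, pad="A"):
    """Write the message in rows under the keyword, then read columns in key order."""
    letters = [c for c in message.upper() if "A" <= c <= "Z"]
    width = len(keyword)
    while len(letters) % width:
        letters.append(pad)                  # fill left over cells
    rows = [letters[i:i + width] for i in range(0, len(letters), width)]
    return "".join(row[col] for col in column_order(keyword) for row in rows)

def columnar_decrypt(ciphertext, keyword):
    """Cut the ciphertext back into columns and read the rectangle row by row."""
    width = len(keyword)
    height = len(ciphertext) // width
    columns = {}
    for n, col in enumerate(column_order(keyword)):
        columns[col] = ciphertext[n * height:(n + 1) * height]
    return "".join(columns[c][r] for r in range(height) for c in range(width))

print(columnar_encrypt("Your phone is not secure", "BTC"))  # YROIOERUHENSUAOPNSTCE
```

Decryption returns the padded text (YOURPHONEISNOTSECUREA), so the receiver has to recognize and drop the filler.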
Transposition Cipher. Letters remain the same but the order is scrambled. This type of cipher is immune to an attack based on frequency analysis because the exact same letters are used; only the order is subject to permutation.
Transposition Cipher: Chinese cipher
Fill the rectangle with the message, going down the far right column and up the next column (alternating direction). Read off the rows.
E S S I Y
D I C E O
A M O N U
B O M O R
C R P H P
ESSIY DICEO AMONU BOMOR CRPHP = Your phone is compromised (abc)
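A minimal sketch of this fill pattern is given below. It assumes a 5x5 rectangle, a fill that alternates down and up starting at the far right column, and filler letters A, B, C, … for unused cells, as in the example on this slide. The function name and parameters are illustrative.

```python
def chinese_encrypt(message, rows=5, cols=5, filler="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Fill columns right to left, alternating down and up, then read the rows."""
    letters = [c for c in message.upper() if "A" <= c <= "Z"]
    if len(letters) > rows * cols:
        raise ValueError("message does not fit in the rectangle")
    letters += list(filler[:rows * cols - len(letters)])   # pad unused cells
    grid = [[""] * cols for _ in range(rows)]
    stream = iter(letters)
    for n, col in enumerate(range(cols - 1, -1, -1)):
        # even-numbered passes go down the column, odd-numbered passes go up
        row_order = range(rows) if n % 2 == 0 else range(rows - 1, -1, -1)
        for r in row_order:
            grid[r][col] = next(stream)
    return " ".join("".join(row) for row in grid)

print(chinese_encrypt("Your phone is compromised"))
# ESSIY DICEO AMONU BOMOR CRPHP
```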
Permutation Cipher
Mixes up the letters. Example: (1, 2, 3) -> (3, 1, 2), so the word THE would be ETH.
Plaintext:   C A M   H A R   V E Y   I S S   A T O   S H I
Permutation: 3 1 2   3 1 2   3 1 2   3 1 2   3 1 2   3 1 2
Ciphertext:  M C A   R H A   Y V E   S I S   O A T   I S H
To decrypt, we use the inverse permutation.
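The block permutation and its inverse can be sketched as below. The convention assumed here is that (3, 1, 2) means "output the 3rd letter of each block, then the 1st, then the 2nd", which reproduces the THE -> ETH example. Function names are illustrative.

```python
def permute_blocks(text, perm):
    """Reorder each block of len(perm) letters; perm uses 1-based positions."""
    n = len(perm)
    if len(text) % n:
        raise ValueError("text length must be a multiple of the block size")
    return "".join(text[i + p - 1] for i in range(0, len(text), n) for p in perm)

def inverse_permutation(perm):
    """Return the permutation that undoes perm."""
    inv = [0] * len(perm)
    for out_pos, in_pos in enumerate(perm, start=1):
        inv[in_pos - 1] = out_pos
    return tuple(inv)

cipher = permute_blocks("CAMHARVEYISSATOSHI", (3, 1, 2))
print(cipher)                                                    # MCARHAYVESISOATISH
print(inverse_permutation((3, 1, 2)))                            # (2, 3, 1)
print(permute_blocks(cipher, inverse_permutation((3, 1, 2))))    # CAMHARVEYISSATOSHI
```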
Hill Cipher. Uses matrix operations. Choose the length of blocks, say 3. Form 3x1 (3 rows, 1 column) matrices and use numbers for letters, i.e. A=0, B=1, …, Z=25. Matrix A (3x3) is the key; it must be invertible modulo 26. Multiply each block by A (the result will be 3x1), then take each element modulo 26. This produces the Hill cipher. To decipher, multiply each cipher block by the inverse of A, modulo 26.
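A minimal pure-Python sketch for block length 3 follows. The key matrix is a commonly used textbook example, not one taken from these slides, and the helper names are illustrative. The modular inverse uses pow(x, -1, 26), which needs Python 3.8 or later and raises ValueError when the key is not invertible modulo 26. Input is assumed to be uppercase A-Z with a length that is a multiple of 3.

```python
KEY = [[6, 24, 1], [13, 16, 10], [20, 17, 15]]  # illustrative key matrix A

def apply_matrix(matrix, text):
    """Multiply each 3-letter block (as a column vector) by matrix, modulo 26."""
    nums = [ord(c) - ord("A") for c in text]
    out = []
    for i in range(0, len(nums), 3):
        block = nums[i:i + 3]
        out.extend(sum(matrix[r][c] * block[c] for c in range(3)) % 26
                   for r in range(3))
    return "".join(chr(n + ord("A")) for n in out)

def minor_det(m, i, j):
    """Determinant of the 2x2 matrix left after deleting row i and column j."""
    rows = [r for k, r in enumerate(m) if k != i]
    a, b = ([v for c, v in enumerate(r) if c != j] for r in rows)
    return a[0] * b[1] - a[1] * b[0]

def inverse_mod26(m):
    """Inverse of a 3x3 matrix modulo 26 via the adjugate."""
    det = sum((-1) ** j * m[0][j] * minor_det(m, 0, j) for j in range(3))
    det_inv = pow(det % 26, -1, 26)
    return [[(det_inv * (-1) ** (i + j) * minor_det(m, j, i)) % 26
             for j in range(3)] for i in range(3)]

cipher = apply_matrix(KEY, "ACT")
print(cipher)                                    # POH
print(apply_matrix(inverse_mod26(KEY), cipher))  # ACT
```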
Advanced Ciphers. Modern ciphers use both substitution and transposition. Mixing is called a product cipher. The mix includes substitution, transformation and modulo operations. Foundational work by Claude Shannon. Modern standards are DES* (Data Encryption Standard, from the early 1970s and no longer considered secure) and AES** (Advanced Encryption Standard, adopted in 2001). *Derived from Lucifer, based on the work of Horst Feistel. **Also known as Rijndael, after its designers Vincent Rijmen and Joan Daemen.