IB Computer Science Extended Essay: Encryption Standards, High school final essays of Computer science

An example of an extended essay done with computer science subject.

Typology: High school final essays

2018/2019

Uploaded on 04/03/2019


To what extent is the Advanced Encryption Standard

information-theoretically secure?

ABSTRACT

This paper evaluates the Advanced Encryption Standard under varying key sizes and input sizes, in order to determine how each of them affects the security of the encryption standard and what implications follow from the results obtained.

In order to understand this relation and its implications, the investigation uses perfect secrecy as a measurement of security. The research gives a brief insight into the topics of cryptography and information theory: it first covers the history and development of encryption standards, then describes the high-level design of the current encryption standard, and finally introduces perfect secrecy as a definition of security. A test with varied input sizes was conducted in a controlled manner; the results are recorded and presented in tabular form in order to show the relationship between input sizes and secrecy values, along with the relationship between key sizes and secrecy values.

It was concluded that the Advanced Encryption Standard is reasonably secure when measured against perfect secrecy, but that it does not achieve perfect secrecy. The results showed consistency within the algorithm, and they indicated that changing the input size does not affect the security of the algorithm to a great extent. On the other hand, the test did not take into account other factors that are typically present in real-life usage. Therefore the results obtained do not represent practical scenarios but rather give a theoretical overview of how secure the algorithm is.

1. Introduction

“It was thanks to ULTRA that we won the war.” (Brown, 1987, p. 671). During World War II, British intelligence recruited Allied code breakers to crack the secret code of the Enigma, a machine that encoded all of Germany’s wartime communication. With $10^{16}$ possible settings, the Enigma was thought to be unbreakable. However, a flaw in the Enigma’s mechanism was discovered, which subsequently led to the designation of ULTRA, a secret project dedicated to decoding all secret communication of the Axis powers (Lendl, 2012). The breaking of Enigma was considered by Supreme Allied Commander Dwight D. Eisenhower to have made a significant contribution to the Allied victory.

Alan Turing, the leading British cryptanalyst who devised techniques for cracking coded messages, also developed an electromechanical machine that could derive the settings for Enigma, work that was estimated to have shortened the war by two to four years. During that time, Turing was assigned to share his methods of breaking coded messages in Washington, where he met an American mathematician and electronic engineer, Claude Shannon (Hodges, 1999). Over a cup of tea, Turing introduced Shannon to the idea of the “Universal Turing Machine”, an abstract theory of a computational machine (Hodges, 1999). As the war came to an end, Shannon and his colleagues published a paper on signal processing and data smoothing which commenced the information age and digital revolution (Mindell, 2002). Shortly after the war, “A Mathematical Theory of Communication” appeared in the July and October 1948 issues of the Bell System Technical Journal. In it, Shannon defines the concept of information and shows how it can be quantified using probability theory; information entropy is introduced as a measure of uncertainty in a message. His concepts essentially formed the field of information theory (Shannon & Weaver, 1949).

Inspired by wartime research, Shannon published “Communication Theory of Secrecy Systems”, in which he addresses cryptography as an interesting application of his theory of communication. One of the most important concepts in Shannon’s theory of secrecy systems is the measure of secrecy, which can serve as a metric for evaluating the security of a cryptosystem. Shannon defines two notions of security: information-theoretical security and computational security, which can also be identified as theoretical and practical secrecy respectively. The notions of security devised by Shannon greatly influenced the development of modern cryptography, notably public-key cryptography, in which it was discovered that a shared key between communicating parties is not necessary for secrecy (Golomb et al., 2002).

The rapidly increasing number of communication systems and the introduction of the internet have brought high demand for security services and measures to protect digital information; over the past decade, the number of internet users worldwide has grown by 806.0% (Internet World Stats, 2015). The Advanced Encryption Standard, also known as the AES, is an accepted encryption scheme used by the US government to protect classified information (National Institute of Standards and Technology, 2001). Although it became a federal government standard, it was also made publicly available for use in protecting non-classified information. The AES adopts an algorithm called Rijndael, chosen through a selection process that was intended to find the best possible encryption scheme for the standard. It provides fast encryption (Schneier & Whiting, 2000), making it suitable for software applications and especially for implementation in firmware or hardware such as routers or firewalls. Modern applications also employ security protocols such as SSL or TLS, which rely on AES for their encryption functionality to ensure secure transmission of information over the World Wide Web (McGrew & Bailey, 2015).

From winning the war to protecting your credentials, the field of cryptography has had a significant impact on communication. The Enigma was once thought to be impossible to break, but eventually this was proven wrong, and the consequences that followed were devastating. I ask myself, “Would it ever happen again, even with today’s modern cryptography?” This question has led me to investigate the extent to which the AES is information-theoretically secure. Using Shannon’s definition of information-theoretical security to investigate a modern cryptosystem like the AES will provide an insight into how secure it really is; the outcome of this extended essay should show how important encryption is to us.

2. History and Early Development of the Encryption Standard

In the early 1970s, non-military research on cryptographic algorithms was nearly nonexistent and few people understood the field of cryptography. In 1972, the former US “National Bureau of Standards” (NBS) initiated a program with the goal of protecting sensitive and unclassified government data (Tuchman, 1998). One of its aims was the development of a single standardized cryptographic algorithm, so that it could be tested and certified, and so that different equipment using the algorithm could interoperate easily.¹

¹ Interoperability refers to the ability of different information systems to communicate and exchange data.

A year later, the National Bureau of Standards issued a public request for proposals for the standardized algorithm. The request raised public interest, and thus exposed the idea

[...] the Rijndael cipher has been considered an unbreakable algorithm even in modern-day computing (N. Penchalaiah et al., 2010).

3. Design and Implementation

Rijndael is a family of cryptographic algorithms with different key and block sizes. For the Advanced Encryption Standard, the National Institute of Standards and Technology selected three members of the Rijndael family, each with the same fixed block size of 128 bits but with a different key size: 128, 192 or 256 bits. By contrast, the Rijndael specification allows block and key sizes that may be any multiple of 32 bits, both with a minimum of 128 and a maximum of 256 bits (Daemen & Rijmen, 2002).

The Rijndael cipher used in the Advanced Encryption Standard is a symmetric-key algorithm, meaning that the same key is used for both encrypting and decrypting the plain text data. Rijndael is also a block cipher: a cryptosystem consisting of functions that map n-bit plain text blocks to n-bit cipher text blocks, whose main objective is to provide confidentiality.

The design of the Rijndael algorithm is based on taking 128 bits of plain text data and mapping them onto a 4 × 4 matrix of bytes, ordered as shown in Figure 3.1.

Figure 3.1. Diagram illustrating a four by four matrix of bytes
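The byte ordering can be sketched in Python as follows. This is only an illustration and assumes the column-by-column layout defined in the AES specification (FIPS 197), in which input byte number i is placed in row i mod 4 and column i div 4. The function names `block_to_state` and `state_to_block` are hypothetical and are not part of the test code used in this essay.

```python
def block_to_state(block):
    """Arrange a 16-byte block as a 4 x 4 matrix, filling column by column."""
    if len(block) != 16:
        raise ValueError("AES operates on 128-bit (16-byte) blocks")
    # state[row][col] holds input byte number row + 4 * col
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def state_to_block(state):
    """Inverse of block_to_state: read the matrix back column by column."""
    return bytes(state[row][col] for col in range(4) for row in range(4))


state = block_to_state(bytes(range(16)))
for row in state:
    print(row)
```

Printing the rows shows that consecutive input bytes run down each column rather than along each row.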

Although some members of the Rijndael family have a larger block size and therefore additional columns in the matrix, most of the calculations are done in a special finite field.² The first step in the encryption process is called key expansion, in which the round keys are derived from the cipher key by Rijndael’s key scheduling algorithm. The initial round of the encryption process XORs each byte of the plain text input with the corresponding byte of the first round key. Then a fixed number of rounds based on a substitution-permutation network is applied.³

² See Daemen and Rijmen pages 10 to 15 for a more detailed description of finite fields.

³ See Daemen and Rijmen pages 19 to 22 for a more detailed description of substitution-permutation operations.

The number of rounds of processing depends on the key size: 10 rounds for 128-bit keys, 12 rounds for 192-bit keys and 14 rounds for 256-bit keys. The rounds are identical, except for the last round of each encryption process. The key size of the algorithm thus specifies the number of repetitions of transformation rounds.
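The relationship between key size and round count can be written as a one-line rule. The sketch below assumes the formula given in the AES specification (FIPS 197), where the number of rounds equals the key length in 32-bit words plus 6; the function name `rounds_for_key` is hypothetical.

```python
def rounds_for_key(key_bits):
    """Return the number of AES rounds for a key of the given size in bits."""
    if key_bits not in (128, 192, 256):
        raise ValueError("AES keys are 128, 192 or 256 bits long")
    words = key_bits // 32  # key length in 32-bit words
    return words + 6


for bits in (128, 192, 256):
    print(bits, rounds_for_key(bits))
```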

Each round consists of a sequence of four transformations called steps: a byte-by-byte substitution step called SubBytes, a row-wise circular shift permutation step called ShiftRows applied to all rows in the matrix, a column-wise linear mixing step called MixColumns applied to each column, and an XOR with the round key called AddRoundKey.

The decryption process consists of reversing the round steps while using the same key as the one used in the encryption process; these inverse operations are applied to a cipher text to retrieve the plain text.
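To illustrate how a step and its inverse cancel out, the sketch below implements only the two simplest steps, ShiftRows and AddRoundKey, on a 4 × 4 state held as a list of rows. SubBytes and MixColumns are left out because they need the S-box and finite-field arithmetic. The function names and the example round key are illustrative assumptions, not taken from the essay's test code.

```python
def shift_rows(state):
    """Rotate row r of the state left by r positions."""
    return [row[r:] + row[:r] for r, row in enumerate(state)]


def inv_shift_rows(state):
    """Rotate row r of the state right by r positions (undoes shift_rows)."""
    return [row[4 - r:] + row[:4 - r] for r, row in enumerate(state)]


def add_round_key(state, round_key):
    """XOR every state byte with the matching round key byte."""
    return [[s ^ k for s, k in zip(s_row, k_row)]
            for s_row, k_row in zip(state, round_key)]


state = [[0, 4, 8, 12], [1, 5, 9, 13], [2, 6, 10, 14], [3, 7, 11, 15]]
round_key = [[0xA5] * 4 for _ in range(4)]  # arbitrary example key bytes

# Encrypt-side steps followed by their inverses in reverse order.
mixed = add_round_key(shift_rows(state), round_key)
restored = inv_shift_rows(add_round_key(mixed, round_key))
print(restored == state)
```

AddRoundKey is its own inverse because XOR with the same value twice returns the original byte, which is why the same function appears on both sides.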

4. Communication Theory of Secrecy System

As the main objective of block ciphers such as Rijndael is to provide confidentiality, the corresponding objective of an adversary is to recover plain text data from cipher text without knowing the key used in the encryption process. The block cipher is said to be totally broken if the adversary can obtain information about the key, and partially broken if they can obtain parts of the plain text from the cipher text. When evaluating the security of a cryptosystem, certain assumptions have to be made. The first assumption that should always be made is that an adversary has access to all data transmitted over a communication channel. We also use Kerckhoffs’s principle, which assumes that an adversary has knowledge of the cryptosystem and its encryption function. The principle was reformulated by Claude Shannon as “the enemy knows the system” (Shannon, 1949).

The metrics used in the security evaluation of cryptosystems were devised by Shannon. The first model is called computational security, a security model that concerns the computational effort required to break a cryptosystem. A cryptosystem is computationally secure if the best algorithm for breaking it requires at least $N$ operations, where $N$ is some specified, very large number. However, this definition of security is more

Figure 4.1 An illustration of the encryption scheme described by Shannon

Suppose there is a probability distribution on the message space $P = \{P_1, P_2, \ldots, P_n\}$, and that the keys in the key space $K = \{K_1, K_2, \ldots, K_n\}$ are also chosen with known probabilities. By Shannon’s original definition, a cryptosystem is perfectly secure if $\Pr[x \mid y] = \Pr[x]$ for all $x \in P$, $y \in C$, where $C$ is the cipher text space. What the formula means is that the a-posteriori probability that the plain text is $x$, given that the cipher text $y$ is intercepted, is identical to the a-priori probability that the plain text is $x$. In simpler terms, observing an intercepted cipher text does not give the adversary any information about the plain text which the adversary does not already know from the a-priori distribution of the plain text space. For an adversary who only has access to a message from the cipher text space, the only knowledge they have is that all messages in the message space are equally likely. Even with unlimited computational power, an adversary is only able to guess, which renders high computational power useless. A possible flaw we can identify in the scheme is the generation of the shared key: the adversary may be able to obtain information about the key through randomness analysis, where the pattern of key generation can be determined if the key generator is not truly random.
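The condition Pr[x|y] = Pr[x] can be checked exhaustively for a very small cryptosystem. The sketch below is a toy example and not AES: it assumes a 2-bit message space with an arbitrary, non-uniform message distribution, a uniformly random 2-bit key used once, and XOR as the encryption function. All names and probabilities are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# A-priori message distribution (deliberately non-uniform).
messages = {0: Fraction(1, 2), 1: Fraction(1, 4),
            2: Fraction(1, 8), 3: Fraction(1, 8)}
# Each of the four keys is equally likely.
keys = {k: Fraction(1, 4) for k in range(4)}


def encrypt(x, k):
    return x ^ k


def posterior(x, y):
    """Pr[x | y], computed with Bayes' rule over all message/key pairs."""
    p_xy = sum(messages[x] * pk for k, pk in keys.items()
               if encrypt(x, k) == y)
    p_y = sum(px * pk
              for (m, px), (k, pk) in product(messages.items(), keys.items())
              if encrypt(m, k) == y)
    return p_xy / p_y


perfect = all(posterior(x, y) == messages[x]
              for x in messages for y in range(4))
print(perfect)
```

Exact fractions are used so that the equality test is not disturbed by floating-point rounding.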

The above definition of perfect secrecy is restricted to a scenario where a key is used for only one encryption of the plain text. To measure the security of a cryptosystem realistically, the idea of entropy derived by Shannon is used in a situation where more plain texts are encrypted using the same key.⁵

Assume that $X$ is a discrete random variable taking values from a finite set $X$. The entropy of the message $X$ is defined to be $H(X)$ as:

$$H(X) = -\sum_{x \in X} \Pr[x] \log_2 \Pr[x]$$

The entropy $H(X)$ is the least number of bits required to encode all possible meanings of the message $X$, assuming that the probabilities $\Pr[x]$ of all messages are equal.

⁵ A cryptanalyst can carry out a ciphertext-only attack when the same key is used to encrypt more plain texts.
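As a small illustration of the entropy formula, the sketch below estimates H(X) for a sequence of symbols by using the relative frequency of each symbol as its probability. The function name `shannon_entropy` is hypothetical, and estimating probabilities from observed frequencies is an assumption made for the example.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Entropy in bits per symbol, using observed frequencies as Pr[x]."""
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(shannon_entropy(b"aabb"))            # two equally likely symbols
print(shannon_entropy(bytes(range(256))))  # every byte value exactly once
```

A sequence in which all 256 byte values are equally frequent reaches the maximum of 8 bits per byte, while a sequence made of one repeated symbol has an entropy of zero.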

The entropy of a message can also tell us its uncertainty, where the uncertainty of a message is the number of plain text bits that an adversary must recover from the given cipher text to obtain the plain text. Lastly, the secrecy value of a cryptosystem is calculated in terms of the key equivocation, denoted $H_c(K)$, where the key equivocation of a key $K$ given cipher text $C$ is the conditional entropy of the key $K$ given cipher text $C$; i.e. key equivocation is the uncertainty about the key that remains once the additional information of the cipher text is provided. Finally, we get our secrecy value formula:

$$H_c(K) = -\sum_{C} P(C) \sum_{K} P_c(K) \log_2 [P_c(K)]$$
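For a cryptosystem small enough to enumerate, the key equivocation can be computed directly from its definition as a conditional entropy. The sketch below does this for a toy 1-bit XOR cipher, not for AES, whose key and message spaces are far too large to enumerate in this way. The function name `key_equivocation` and the example distributions are illustrative assumptions.

```python
import math
from collections import defaultdict


def key_equivocation(messages, keys, encrypt):
    """Conditional entropy of the key given the cipher text, in bits."""
    # joint[(c, k)] = probability that key k is used and cipher text c results
    joint = defaultdict(float)
    for x, px in messages.items():
        for k, pk in keys.items():
            joint[(encrypt(x, k), k)] += px * pk
    # p_c[c] = probability of observing cipher text c
    p_c = defaultdict(float)
    for (c, _), p in joint.items():
        p_c[c] += p
    h = 0.0
    for (c, _), p in joint.items():
        if p > 0:
            # p equals P(c) * P_c(k), and p / p_c[c] equals P_c(k)
            h -= p * math.log2(p / p_c[c])
    return h


def xor(x, k):
    return x ^ k


uniform_key = {0: 0.5, 1: 0.5}
print(key_equivocation({0: 0.5, 1: 0.5}, uniform_key, xor))  # unknown message
print(key_equivocation({0: 1.0}, uniform_key, xor))          # known message
```

When the message is uniformly random, the cipher text leaves the full one bit of uncertainty about the key; when the message is known in advance, the cipher text reveals the key and the equivocation falls to zero.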

6. Testing

This investigation aims to evaluate the Advanced Encryption Standard using perfect secrecy as a security model. The testing factors are key sizes varying from 128 bits to 256 bits, plain text sizes varying from 10 kilobytes to 100 kilobytes, and a single mode of operation used to encrypt the data. A cryptographic module called “PyCrypto”, written by Dwayne Litzenberger, was used to implement the AES cipher, and a scientific computing module called “NumPy” was used to manipulate data and perform mathematical operations.⁶

The method of assessing secrecy begins with a calculation of the byte distribution within the key. A function called countByteDist(data) takes an array of byte values as a parameter. The algorithm works by iterating through the byte array and recording how often each byte value appears, building an array called countedData with 256 cells, one for each possible value of a byte (i.e. an 8-bit value). Given a cipher text $C$, the algorithm computes the probability $P_c(K)$ of each byte in the cipher text that appears in the key $K$; the total sum of $P_c(K) \log_2 P_c(K)$ then gives the entropy of the key $K$ given cipher text $C$.

The second part of the algorithm attempts to find $P(C)$ by using the same function countByteDist(data) to calculate how often each cipher byte appears in cipher text $C$. The probability of each byte appearing in the key $K$ is then computed and summed over all possible cipher byte values. This cipher text is obtained by encrypting the plain text with the key; i.e. this cipher text is correlated with the above key.

⁶ Visit https://github.com/dlitz/pycrypto for the source code repository.

7. Results and Evaluation

7.1 Presenting results

Average Secrecy Values

Input Size (kB)    128-bit key    192-bit key    256-bit key
10                 0.1320         0.2243         0.
20                 0.1327         0.2238         0.
30                 0.1323         0.2241         0.
40                 0.1325         0.2238         0.
50                 0.1330         0.2130         0.
60                 0.1326         0.2155         0.
70                 0.1327         0.2242         0.
80                 0.1327         0.2062         0.
90                 0.1327         0.2080         0.
100                0.1327         0.2000         0.

Table 6.1. Tabular results of the significant variation test of data sizes

Table 6.1 shows no continuous correlation between input sizes and secrecy values. The difference in secrecy values between key sizes indicates that the larger the key, the higher the secrecy value, meaning that larger keys are more secure. Across input sizes, the secrecy values for each key size show no correlation; they appear somewhat random but stay within the order of magnitude associated with the corresponding key size. The 128-bit key gives secrecy values of the order of 0.1, the 192-bit key of the order of 0.2, and the 256-bit key close to the order of 0.3.

7.2 Evaluation

The results show that the Advanced Encryption Standard is very consistent in the encryption process, with a small range of secrecy values for each key size. Changes in input size do not seem to make a significant impact on the secrecy of the cipher: the secrecy values are roughly the same regardless of plain text size. This means that the size of the plain text data does not affect the amount of information that can be obtained about the key, making the Advanced Encryption Standard a cryptosystem that is less prone to attacks that require computational effort.

The highest secrecy value obtained was 0.3141, using a 256-bit key. This value shows that the Advanced Encryption Standard does not conform to the definition of perfect secrecy, meaning that some information about the key can be extracted from the plain text data. However, this does not imply that the Advanced Encryption Standard is not highly secure. The results obtained from this investigation should be compared with those of other known encryption standards, especially the Data Encryption Standard. This investigation left out many factors such as the operation modes of the block cipher, the type of data to be encrypted, and the speed of the encryption/decryption algorithm. Therefore the results obtained do not represent realistic usage. Repeating the calculation and computing the average secrecy value gives the results greater reliability and accuracy, as a single calculation of the secrecy value can vary by up to approximately 1%.

8. Conclusion

The results obtained during the test show a slight variation in secrecy values between input sizes; the difference in secrecy between key sizes is highly noticeable, as expected. The consistency in secrecy values indicates a secure cryptographic algorithm.

Because the only testing factors were input sizes and key sizes, the scope of this investigation is too limited to assess the whole algorithm effectively. However, the research instead provides a general overview of the security offered by the Advanced Encryption Standard.

Overall, the consistency of secrecy values across input sizes and the apparent positive correlation between secrecy and key size make the Advanced Encryption Standard somewhat secure in terms of perfect secrecy, despite seemingly negative results. The unrealistic scenario of this investigation means that the theoretical results do not equate to practical security; the results from this investigation suggest that Shannon’s definitions of security are likely to be too strong to be realistic in practice.

Weerasinghe, T., 2014. A Tool to Analyse Symmetric Key Algorithms. International Journal of

Information & Network Security (IJINS), pp. 26-32.

Appendix A: Source code for secrecy value calculation

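from Crypto.Cipher import AES
from Crypto import Random
import numpy as np


def countByte(data):
    # Count how often each of the 256 possible byte values occurs in data.
    countedData = [0] * 256
    for k in data:
        countedData[k] += 1
    return countedData


def calculateSecrecy(key, cipher):
    countedKey = countByte(key)
    countedCipher = countByte(cipher)
    entropy = 0
    secrecy = 0
    for j in range(0, 256):
        # Relative frequency of byte value j in the key and in the cipher text.
        p_k = 1.0 * countedKey[j] / len(key)
        p_c = 1.0 * countedCipher[j] / len(cipher)
        if p_k > 0:
            # Running sum of p_k * log2(p_k) over the byte values seen so far.
            entropy += p_k * np.log2(p_k)
            # Indentation was lost in the source; this line is assumed to
            # belong inside the if-block.
            secrecy += -p_c * entropy
    return secrecy


def encryptAES(key, plaintext):
    # ECB is stated explicitly; PyCrypto uses it when no mode is given.
    cipher = AES.new(key, AES.MODE_ECB)
    ciphertxt = cipher.encrypt(plaintext)
    return ciphertxt


def getResults(keysize, plaintextsize):
    # Both sizes are in bytes; plaintextsize must be a multiple of 16.
    key = Random.new().read(keysize)
    trials = 100  # number of loop iterations in the source
    totalValue = 0
    for i in range(0, trials):
        plaintext = Random.new().read(plaintextsize)
        ciphertxt = encryptAES(key, plaintext)
        cipherbyte = np.frombuffer(ciphertxt, dtype=np.uint8)
        keybyte = np.frombuffer(key, dtype=np.uint8)
        totalValue += calculateSecrecy(keybyte, cipherbyte)
    # The divisor is cut off in the source; dividing by the number of
    # trials is assumed, since the result is named an average.
    avgValue = totalValue / trials
    return avgValue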