Symmetric Encryption Schemes: Definition and Security Properties - Prof. Alexandra Boldyre, Study notes of Cryptography and System Security

These notes introduce symmetric encryption schemes, giving definitions and security notions such as IND-CPA and PR-CPA. They cover modes of operation including ECB, CBC, and CTR (with its counter-based variant CTRC), and discuss issues in privacy.

Typology: Study notes

Pre 2010

Uploaded on 08/05/2009



Chapter 4

Symmetric Encryption

The symmetric setting considers two parties who share a key and will use this key to imbue communicated data with various security attributes. The main security goals are privacy and authenticity of the communicated data. The present chapter looks at privacy, Chapter ?? looks at authenticity, and Chapter ?? looks at providing both together. Chapters ?? and ?? describe tools we shall use here.

4.1 Symmetric encryption schemes

The primitive we will consider is called an encryption scheme. Such a scheme specifies an encryption algorithm, which tells the sender how to process the plaintext using the key, thereby producing the ciphertext that is actually transmitted. An encryption scheme also specifies a decryption algorithm, which tells the receiver how to retrieve the original plaintext from the transmission while possibly performing some verification, too. Finally, there is a key-generation algorithm, which produces a key that the parties need to share. The formal description follows.

Definition 4.1 A symmetric encryption scheme $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ consists of three algorithms, as follows:

  • The randomized key generation algorithm $\mathcal{K}$ returns a string $K$. We let $\mathrm{Keys}(\mathcal{SE})$ denote the set of all strings that have non-zero probability of being output by $\mathcal{K}$. The members of this set are called keys. We write $K \stackrel{\$}{\leftarrow} \mathcal{K}$ for the operation of executing $\mathcal{K}$ and letting $K$ denote the key returned.
  • The encryption algorithm $\mathcal{E}$ takes a key $K \in \mathrm{Keys}(\mathcal{SE})$ and a plaintext $M \in \{0,1\}^*$ to return a ciphertext $C \in \{0,1\}^* \cup \{\bot\}$. This algorithm might be randomized or stateful. We write $C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M)$.
  • The deterministic decryption algorithm $\mathcal{D}$ takes a key $K \in \mathrm{Keys}(\mathcal{SE})$ and a ciphertext $C \in \{0,1\}^*$ to return some $M \in \{0,1\}^* \cup \{\bot\}$. We write $M \leftarrow \mathcal{D}_K(C)$.

The scheme is said to provide correct decryption if for any key $K \in \mathrm{Keys}(\mathcal{SE})$ and any message $M \in \{0,1\}^*$,

$$\Pr\left[\, C = \bot \ \text{OR}\ \mathcal{D}_K(C) = M \;:\; C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M) \,\right] = 1.$$
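The syntax of Definition 4.1 can be sketched in Python as follows. This is only an illustration and not part of the original notes: the toy scheme, the names `keygen`, `encrypt`, `decrypt` and `correct_on`, the use of `None` to stand for ⊥, and the SHA-256-derived pad are all assumptions chosen to exercise the correct-decryption condition. No security claim is made for this toy scheme.

```python
import hashlib
import secrets

KEY_LEN = 16   # key length in bytes (illustrative choice)
MAX_LEN = 32   # longest plaintext this toy scheme accepts

def keygen() -> bytes:
    """Randomized key generation: takes no input, returns a random key."""
    return secrets.token_bytes(KEY_LEN)

def _pad(key: bytes, r: bytes, n: int) -> bytes:
    # Hypothetical pad derivation; SHA-256 yields 32 bytes, hence MAX_LEN.
    return hashlib.sha256(key + r).digest()[:n]

def encrypt(key: bytes, msg: bytes):
    """Randomized encryption: returns a ciphertext, or None (standing for ⊥)."""
    if len(msg) > MAX_LEN:
        return None                      # message outside the plaintext space
    r = secrets.token_bytes(16)          # fresh coins on every call
    body = bytes(a ^ b for a, b in zip(msg, _pad(key, r, len(msg))))
    return r + body

def decrypt(key: bytes, ct: bytes):
    """Deterministic, stateless decryption."""
    if len(ct) < 16 or len(ct) > 16 + MAX_LEN:
        return None
    r, body = ct[:16], ct[16:]
    return bytes(a ^ b for a, b in zip(body, _pad(key, r, len(body))))

def correct_on(key: bytes, msg: bytes) -> bool:
    """Checks the event 'C = ⊥ OR D_K(C) = M' for one run of E_K(M)."""
    ct = encrypt(key, msg)
    return ct is None or decrypt(key, ct) == msg
```

Note that a message the scheme refuses to encrypt still satisfies the condition, because the event allows C = ⊥.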

The key-generation algorithm, as the definition indicates, is randomized. It takes no inputs. When it is run, it flips coins internally and uses these to select a key $K$. Typically, the key is just a random string of some length, in which case this length is called the key length of the scheme. When two parties want to use the scheme, it is assumed they are in possession of $K$ generated via $\mathcal{K}$.

How they came into joint possession of this key $K$ in such a way that the adversary did not get to know $K$ is not our concern here, and will be addressed later. For now we assume the key has been shared.

Once in possession of a shared key, the sender can run the encryption algorithm with key $K$ and input message $M$ to get back a string we call the ciphertext. The latter can then be transmitted to the receiver.

The encryption algorithm may be either randomized or stateful. If randomized, it flips coins and uses those to compute its output on a given input $K, M$. Each time the algorithm is invoked, it flips coins anew. In particular, invoking the encryption algorithm twice on the same inputs may not yield the same response both times.

We say the encryption algorithm is stateful if its operation depends on a quantity called the state that is initialized in some pre-specified way. When the encryption algorithm is invoked on inputs $K, M$, it computes a ciphertext based on $K$, $M$ and the current state. It then updates the state, and the new state value is stored. (The receiver does not maintain matching state and, in particular, decryption does not require access to any global variable or call for any synchronization between parties.) Usually, when there is state to be maintained, the state is just a counter. If there is no state maintained by the encryption algorithm the encryption scheme is said to be stateless. The encryption algorithm might be both randomized and stateful, but in practice this is rare: it is usually one or the other but not both.

When we talk of a randomized symmetric encryption scheme we mean that the encryption algorithm is randomized. When we talk of a stateful symmetric encryption scheme we mean that the encryption algorithm is stateful.

The receiver, upon receiving a ciphertext $C$, will run the decryption algorithm with the same key used to create the ciphertext, namely compute $\mathcal{D}_K(C)$. The decryption algorithm is neither randomized nor stateful.

Many encryption schemes restrict the set of strings that they are willing to encrypt. (For example, perhaps the algorithm can only encrypt plaintexts of length a positive multiple of some block length $n$, and can only encrypt plaintexts of length up to some maximum length.) These kinds of restrictions are captured by having the


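algorithm $\mathcal{E}_K(M)$
    Let static $\mathrm{ctr} \leftarrow 0$
    Let $m \leftarrow |M|$
    if $\mathrm{ctr} + m > k$ then return $\bot$
    $C \leftarrow M \oplus K[\mathrm{ctr} + 1 \,..\, \mathrm{ctr} + m]$
    $\mathrm{ctr} \leftarrow \mathrm{ctr} + m$
    return $\langle \mathrm{ctr} - m, C \rangle$

algorithm $\mathcal{D}_K(\langle \mathrm{ctr}, C \rangle)$
    Let $m \leftarrow |C|$
    if $\mathrm{ctr} + m > k$ then return $\bot$
    $M \leftarrow C \oplus K[\mathrm{ctr} + 1 \,..\, \mathrm{ctr} + m]$
    return $M$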

Here X[i .. j] denotes the i-th through j-th bit of the binary string X. By 〈ctr, C〉 we mean a string that encodes the number ctr and the string C. The most natural encoding is to encode ctr using some fixed number of bits, at least lg k, and to prepend this to C. Conventions are established so that every string Y is regarded as encoding some ctr, C for some ctr, C. The encryption algorithm XORs the message bits with key bits, starting with the key bit indicated by one plus the current counter value. The counter is then incremented by the length of the message. Key bits are not reused, and thus if not enough key bits are available to encrypt a message, the encryption algorithm returns ⊥. Note that the ciphertext returned includes the value of the counter. This is to enable decryption. (Recall that the decryption algorithm, as per Definition 4.1, must be stateless and deterministic, so we do not want it to have to maintain a counter as well.)
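The stateful one-time pad just described can be sketched as follows. This is a hedged illustration rather than the notes' own code: it works on bytes instead of bits, represents the encoding ⟨ctr, C⟩ as a Python tuple, and uses `None` for ⊥. The names `OneTimePadSender` and `otp_decrypt` are hypothetical.

```python
import secrets

class OneTimePadSender:
    """Stateful encryptor: the state is a counter of key bytes already used."""

    def __init__(self, key: bytes):
        self.key = key
        self.ctr = 0                      # static ctr <- 0

    def encrypt(self, msg: bytes):
        m = len(msg)
        if self.ctr + m > len(self.key):  # not enough unused key material
            return None                   # stands for ⊥
        # K[ctr+1 .. ctr+m] in the notes' 1-based indexing
        pad = self.key[self.ctr:self.ctr + m]
        body = bytes(a ^ b for a, b in zip(msg, pad))
        self.ctr += m                     # key bytes are never reused
        return (self.ctr - m, body)       # ⟨ctr - m, C⟩

def otp_decrypt(key: bytes, ciphertext):
    """Stateless, deterministic decryption: the counter travels in the ciphertext."""
    ctr, body = ciphertext
    m = len(body)
    if ctr + m > len(key):
        return None
    pad = key[ctr:ctr + m]
    return bytes(a ^ b for a, b in zip(body, pad))
```

With an 8-byte key, encrypting a 3-byte and then a 5-byte message uses up the key, and a further non-empty message yields ⊥.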

4.2.2 Some modes of operation

The following schemes rely either on a family of permutations (i.e., a block cipher) or a family of functions. Effectively, the mechanisms spell out how to use the block cipher to encrypt. We call such a mechanism a mode of operation of the block cipher. For some of the schemes it is convenient to assume that the length of the message to be encrypted is a positive multiple of a block length associated to the family. Accordingly, we will let the encryption algorithm return $\bot$ if this is not the case. In practice, one could pad the message appropriately so that the padded message always has length a positive multiple of the block length, and apply the encryption algorithm to the padded message. The padding function should be injective and easily invertible. In this way you would create a new encryption scheme.

The first scheme we consider is ECB (Electronic Codebook Mode), whose security is considered in Section 4.5.1.

Scheme 4.3 [ECB mode] Let $E \colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a block cipher. Operating it in ECB (Electronic Code Book) mode yields a stateless symmetric encryption scheme $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$. The key-generation algorithm simply returns a random key for the block cipher, meaning it picks a random string $K \stackrel{\$}{\leftarrow} \mathcal{K}$ and returns it. The encryption and decryption algorithms are depicted in Fig. 4.1. “Break $M$ into $n$-bit blocks $M[1] \cdots M[m]$” means to set $m = |M|/n$ and, for $i \in \{1, \ldots, m\}$, set $M[i]$ to the $i$-th $n$-bit block in $M$, that is, bits $(i-1)n + 1$ through $in$ of $M$. Similarly for breaking $C$ into $C[1] \cdots C[m]$. Notice that this time the encryption algorithm

Bellare and Rogaway 5

algorithm $\mathcal{E}_K(M)$
    if ($|M| \bmod n \neq 0$ or $|M| = 0$) then return $\bot$
    Break $M$ into $n$-bit blocks $M[1] \cdots M[m]$
    for $i \leftarrow 1$ to $m$ do $C[i] \leftarrow E_K(M[i])$
    $C \leftarrow C[1] \cdots C[m]$
    return $C$

algorithm $\mathcal{D}_K(C)$
    if ($|C| \bmod n \neq 0$ or $|C| = 0$) then return $\bot$
    Break $C$ into $n$-bit blocks $C[1] \cdots C[m]$
    for $i \leftarrow 1$ to $m$ do $M[i] \leftarrow E_K^{-1}(C[i])$
    $M \leftarrow M[1] \cdots M[m]$
    return $M$

Figure 4.1: ECB mode.

did not make any random choices. (That does not mean it is not, technically, a randomized algorithm; it is simply a randomized algorithm that happened not to make any random choices.)
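A minimal Python sketch of ECB mode as in Fig. 4.1 follows. Python's standard library has no block cipher, so the sketch assumes a toy 4-round Feistel permutation built from SHA-256 (`block_enc`, `block_dec`) as a stand-in for $E_K$ and $E_K^{-1}$; that stand-in, the 16-byte block length `N`, and all function names are hypothetical and carry no security claim.

```python
import hashlib
import secrets

N = 16  # block length in bytes (the notes use n bits)

def _f(key: bytes, i: int, half: bytes) -> bytes:
    # Round function of the toy Feistel network (illustrative only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:N // 2]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def block_enc(key: bytes, x: bytes) -> bytes:
    """Toy keyed permutation standing in for the block cipher E_K."""
    l, r = x[:N // 2], x[N // 2:]
    for i in range(4):
        l, r = r, _xor(l, _f(key, i, r))
    return l + r

def block_dec(key: bytes, y: bytes) -> bytes:
    """Inverse permutation, standing in for E_K^{-1}."""
    l, r = y[:N // 2], y[N // 2:]
    for i in reversed(range(4)):
        l, r = _xor(r, _f(key, i, l)), l
    return l + r

def ecb_encrypt(key: bytes, msg: bytes):
    if len(msg) % N != 0 or len(msg) == 0:
        return None                                   # ⊥
    blocks = (msg[i:i + N] for i in range(0, len(msg), N))
    return b"".join(block_enc(key, b) for b in blocks)

def ecb_decrypt(key: bytes, ct: bytes):
    if len(ct) % N != 0 or len(ct) == 0:
        return None                                   # ⊥
    blocks = (ct[i:i + N] for i in range(0, len(ct), N))
    return b"".join(block_dec(key, b) for b in blocks)
```

Because each block is processed independently and without randomness, two equal plaintext blocks always produce two equal ciphertext blocks.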

The next scheme, cipher-block chaining (CBC) with random initial vector, is the most popular block-cipher mode of operation, used pervasively in practice.

Scheme 4.4 [CBC$ mode] Let $E \colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a block cipher. Operating it in CBC mode with random IV yields a stateless symmetric encryption scheme, $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$. The key generation algorithm simply returns a random key for the block cipher, $K \stackrel{\$}{\leftarrow} \mathcal{K}$. The encryption and decryption algorithms are depicted in Fig. 4.2. The IV (“initialization vector”) is $C[0]$, which is chosen at random by the encryption algorithm. This choice is made independently each time the algorithm is invoked.

For the following schemes it is useful to introduce some notation. If $n \geq 1$ and $i \geq 0$ are integers then we let $[i]_n$ denote the $n$-bit string that is the binary representation of integer $i \bmod 2^n$. If we use a number $i \geq 0$ in a context for which a string $I \in \{0,1\}^n$ is required, it is understood that we mean to replace $i$ by $I = [i]_n$. The following is a counter-based version of CBC mode, whose security is considered in Section 4.5.3.

Scheme 4.5 [CBCC mode] Let $E \colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ be a block cipher. Operating it in CBC mode with counter IV yields a stateful symmetric encryption


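algorithm $\mathcal{E}_K(M)$
    static $\mathrm{ctr} \leftarrow 0$
    if ($|M| \bmod n \neq 0$ or $|M| = 0$) then return $\bot$
    Break $M$ into $n$-bit blocks $M[1] \cdots M[m]$
    if $\mathrm{ctr} \geq 2^n$ then return $\bot$
    $C[0] \leftarrow \mathrm{IV} \leftarrow [\mathrm{ctr}]_n$
    for $i \leftarrow 1$ to $m$ do $C[i] \leftarrow E_K(C[i-1] \oplus M[i])$
    $C \leftarrow C[1] \cdots C[m]$
    $\mathrm{ctr} \leftarrow \mathrm{ctr} + 1$
    return $\langle \mathrm{IV}, C \rangle$

algorithm $\mathcal{D}_K(\langle \mathrm{IV}, C \rangle)$
    if ($|C| \bmod n \neq 0$ or $|C| = 0$) then return $\bot$
    Break $C$ into $n$-bit blocks $C[1] \cdots C[m]$
    if $\mathrm{IV} + m > 2^n$ then return $\bot$
    $C[0] \leftarrow \mathrm{IV}$
    for $i \leftarrow 1$ to $m$ do $M[i] \leftarrow E_K^{-1}(C[i]) \oplus C[i-1]$
    $M \leftarrow M[1] \cdots M[m]$
    return $M$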

Figure 4.3: CBCC mode.
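The CBCC algorithms of Fig. 4.3 can be sketched in Python as below. As an assumption made only for this illustration, a toy SHA-256-based Feistel permutation (`block_enc`, `block_dec`) stands in for the block cipher, blocks are 16 bytes, ⟨IV, C⟩ is a tuple, and `None` stands for ⊥. The class name `CBCCSender` and function `cbcc_decrypt` are hypothetical.

```python
import hashlib
import secrets

N = 16  # block length in bytes; [ctr]_n becomes an N-byte big-endian string

def _f(key, i, half):
    return hashlib.sha256(key + bytes([i]) + half).digest()[:N // 2]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def block_enc(key, x):
    """Toy Feistel permutation standing in for E_K (illustrative only)."""
    l, r = x[:N // 2], x[N // 2:]
    for i in range(4):
        l, r = r, _xor(l, _f(key, i, r))
    return l + r

def block_dec(key, y):
    l, r = y[:N // 2], y[N // 2:]
    for i in reversed(range(4)):
        l, r = _xor(r, _f(key, i, l)), l
    return l + r

class CBCCSender:
    """Stateful CBC encryption whose IV is the current counter value."""

    def __init__(self, key):
        self.key = key
        self.ctr = 0

    def encrypt(self, msg):
        if len(msg) % N != 0 or len(msg) == 0:
            return None
        if self.ctr >= 2 ** (8 * N):
            return None
        iv = self.ctr.to_bytes(N, "big")          # C[0] = IV = [ctr]_n
        prev, out = iv, []
        for i in range(0, len(msg), N):
            prev = block_enc(self.key, _xor(prev, msg[i:i + N]))
            out.append(prev)
        self.ctr += 1
        return iv, b"".join(out)

def cbcc_decrypt(key, ciphertext):
    iv, c = ciphertext
    if len(c) % N != 0 or len(c) == 0:
        return None
    if int.from_bytes(iv, "big") + len(c) // N > 2 ** (8 * N):
        return None                               # range test from Fig. 4.3
    prev, out = iv, []
    for i in range(0, len(c), N):
        out.append(_xor(block_dec(key, c[i:i + N]), prev))
        prev = c[i:i + N]
    return b"".join(out)
```

Successive calls use IVs 0, 1, 2, and so on, which is exactly what makes the IV predictable to an observer.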

point $R$ is used to define a sequence of values on which $F_K$ is applied to produce a “pseudo one-time pad” to which the plaintext is XORed. The starting point $R$ chosen by the encryption algorithm is a random $\ell$-bit string. To add an $\ell$-bit string $R$ to an integer $i$ (as when we write $F_K(R + i)$), convert the $\ell$-bit string $R$ into an integer in the range $[0 \,..\, 2^\ell - 1]$ in the usual way, add this number to $i$, take the result modulo $2^\ell$, and then convert this back into an $\ell$-bit string. Note that the starting point $R$ is included in the ciphertext, to enable decryption. On encryption, the pad $\mathrm{Pad}$ is understood to be the empty string when $m = 0$.

We now give the counter-based version of CTR mode.

Scheme 4.7 [CTRC mode] Let $F \colon \mathcal{K} \times \{0,1\}^\ell \to \{0,1\}^L$ be a family of functions. (Possibly a block cipher, but not necessarily.) Operating it in CTR mode with a counter starting point is a stateful symmetric encryption scheme, $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$, which we call CTRC. The key-generation algorithm simply returns a random key for $F$. The encryptor maintains a counter $\mathrm{ctr}$ which is initially zero. The encryption and decryption algorithms are depicted in Fig. 4.5. Position index $\mathrm{ctr}$ is not allowed to wrap around: the encryption algorithm returns $\bot$ if this would happen. The


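algorithm $\mathcal{E}_K(M)$
    $m \leftarrow \lceil |M|/L \rceil$
    $R \stackrel{\$}{\leftarrow} \{0,1\}^\ell$
    $\mathrm{Pad} \leftarrow F_K(R+1) \,\|\, F_K(R+2) \,\|\, \cdots \,\|\, F_K(R+m)$
    $\mathrm{Pad} \leftarrow$ the first $|M|$ bits of $\mathrm{Pad}$
    $C' \leftarrow M \oplus \mathrm{Pad}$
    $C \leftarrow R \,\|\, C'$
    return $C$

algorithm $\mathcal{D}_K(C)$
    if $|C| < \ell$ then return $\bot$
    Parse $C$ into $R \,\|\, C'$ where $|R| = \ell$
    $m \leftarrow \lceil |C'|/L \rceil$
    $\mathrm{Pad} \leftarrow F_K(R+1) \,\|\, F_K(R+2) \,\|\, \cdots \,\|\, F_K(R+m)$
    $\mathrm{Pad} \leftarrow$ the first $|C'|$ bits of $\mathrm{Pad}$
    $M \leftarrow C' \oplus \mathrm{Pad}$
    return $M$

Figure 4.4: CTR$ mode using a family of functions $F \colon \mathcal{K} \times \{0,1\}^\ell \to \{0,1\}^L$. This version of counter mode is randomized and stateless.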
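The CTR$ algorithms of Fig. 4.4 translate to Python roughly as follows. The notes leave the function family $F$ abstract; this sketch assumes HMAC-SHA256 as a hypothetical instantiation with $\ell$ = 128 bits and $L$ = 256 bits, works on bytes rather than bits, and uses `None` for ⊥. The names `F`, `ctr_encrypt` and `ctr_decrypt` are chosen for illustration.

```python
import hashlib
import hmac
import secrets

ELL = 16      # input length of F in bytes (l = 128 bits, an assumption)
BIG_L = 32    # output length of F in bytes (L = 256 bits, an assumption)

def F(key: bytes, x: int) -> bytes:
    """Stand-in function family: HMAC-SHA256 on the l-bit encoding of x mod 2^l."""
    encoded = (x % 2 ** (8 * ELL)).to_bytes(ELL, "big")
    return hmac.new(key, encoded, hashlib.sha256).digest()

def _pad(key: bytes, r: int, nbytes: int) -> bytes:
    m = -(-nbytes // BIG_L)                       # ceil(nbytes / L)
    stream = b"".join(F(key, r + i) for i in range(1, m + 1))
    return stream[:nbytes]                        # empty when m = 0

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ctr_encrypt(key: bytes, msg: bytes) -> bytes:
    r_bytes = secrets.token_bytes(ELL)            # random starting point R
    r = int.from_bytes(r_bytes, "big")
    return r_bytes + _xor(msg, _pad(key, r, len(msg)))   # C = R || C'

def ctr_decrypt(key: bytes, ct: bytes):
    if len(ct) < ELL:
        return None                               # ⊥
    r = int.from_bytes(ct[:ELL], "big")
    body = ct[ELL:]
    return _xor(body, _pad(key, r, len(body)))
```

Decryption only ever evaluates $F_K$ in the forward direction, which is why $F$ need not be a permutation.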

position index is included in the ciphertext in order to enable decryption. The encryption algorithm updates the position index upon each invocation, and begins with this updated value the next time it is invoked.

We will return to the security of these schemes after we have developed the appropriate notions.

4.3 Issues in privacy

Let us fix a symmetric encryption scheme $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$. Two parties share a key $K$ for this scheme, this key having been generated as $K \stackrel{\$}{\leftarrow} \mathcal{K}$. The adversary does not a priori know $K$. We now want to explore the issue of what the privacy of the scheme might mean. For this chapter, security is privacy, and we are trying to get to the heart of what security is about.

The adversary is assumed able to capture any ciphertext that flows on the channel between the two parties. It can thus collect ciphertexts, and try to glean something from them. Our first question is: what exactly does “glean” mean? What tasks, were the adversary to accomplish them, would make us declare the scheme insecure? And, correspondingly, what tasks, were the adversary unable to accomplish them, would make us declare the scheme secure?


bit leaks, the adversary knows whether I want to buy or sell stock 1, which may be something I don’t want to reveal. If the sum of the bits leaks, the adversary knows how many stocks I am buying.

Granted, this might not be a problem at all if the data were in a different format. However, making assumptions, or requirements, on how users format data, or how they use it, is a bad and dangerous approach to secure protocol design. An important principle of good cryptographic design is that the encryption scheme should provide security regardless of the format of the plaintext. Users should not have to worry about how they format their data: they format it as they like, and encryption should provide privacy nonetheless.

Put another way, as designers of security protocols, we should not make assumptions about data content or formats. Our protocols must protect any data, no matter how formatted. We view it as the job of the protocol designer to ensure this is true.

At this point it should start becoming obvious that there is an infinite list of insecurity properties, and we can hardly attempt to characterize security as their absence. We need to think about security in a different and more direct way and arrive at some definition of it.

This important task is surprisingly neglected in many treatments of cryptography, which will provide you with many schemes and attacks, but never actually define the goal by saying what an encryption scheme is actually trying to achieve and when it should be considered secure rather than merely not known to be insecure. This is the task that we want to address.

One might want to say something like: the encryption scheme is secure if given $C$, the adversary has no idea what $M$ is. This however cannot be true, because of what is called a priori information. Often, something about the message is known. For example, it might be a packet with known headers. Or, it might be an English word. So the adversary, and everyone else, has some information about the message even before it is encrypted.

We want schemes that are secure in the strongest possible natural sense. What is the best we could hope for? It is useful to make a thought experiment. What would an “ideal” encryption be like? Well, it would be as though some angel took the message $M$ from the sender and delivered it to the receiver, in some magic way. The adversary would see nothing at all. Intuitively, our goal is to approximate this as best as possible. We would like encryption to have the properties of ideal encryption. In particular, no partial information would leak.

We do deviate from the ideal in one way, though. Encryption is not asked to hide the length of the plaintext string. This information not only can leak but is usually supposed to be known to the adversary a priori.

As an example, consider the ECB encryption scheme of Scheme 4.3. Given the ciphertext, can an eavesdropping adversary figure out the message? It is hard to see how, since it does not know $K$, and if $E$ is a “good” block cipher, then it ought to have a hard time inverting $E_K$ without knowledge of the underlying key. Nonetheless this is not a good scheme. Consider just the case $m = 1$ of a single block message. Suppose a missile command center has just two messages, $1^n$ for fire and $0^n$ for don’t fire. It keeps sending data, but always one of these two. What happens? When the first ciphertext $C_1$ goes by, the adversary may not know what the plaintext is. But then, let us say it sees a missile taking off. Now, it knows the message $M_1$ underlying $C_1$ was $1^n$. But then it can easily decrypt all subsequent messages, for if it sees a ciphertext $C$, the message is $1^n$ if $C = C_1$ and $0^n$ if $C \neq C_1$.

In a secure encryption scheme, it should not be possible to relate ciphertexts of different messages of the same length in such a way that information is leaked.

Not allowing message-equalities to be leaked has a dramatic implication. Namely, encryption must be probabilistic or depend on state information. If not, you can always tell if the same message was sent twice. Each encryption must use fresh coin tosses, or, say, a counter, and an encryption of a particular message may be different each time. In terms of our setup it means E is a probabilistic or stateful algorithm. That’s why we defined symmetric encryption schemes, above, to allow these types of algorithms.
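The point about message equalities can be made concrete with a small experiment. The sketch below is an illustration under stated assumptions, not a scheme from the notes: `det_encrypt` is a hypothetical deterministic, stateless toy, `rand_encrypt` is a hypothetical randomized toy (both limited to 32-byte messages and built from HMAC-SHA256), and `repeats` models what an eavesdropper can compute from ciphertexts alone.

```python
import hashlib
import hmac
import secrets

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def det_encrypt(key: bytes, msg: bytes) -> bytes:
    """Deterministic and stateless: the same (key, msg) always gives the same output."""
    pad = hmac.new(key, b"fixed", hashlib.sha256).digest()
    return _xor(msg, pad)

def rand_encrypt(key: bytes, msg: bytes) -> bytes:
    """Randomized: fresh coins r are drawn on every call and sent along."""
    r = secrets.token_bytes(16)
    pad = hmac.new(key, r, hashlib.sha256).digest()
    return r + _xor(msg, pad)

def repeats(ciphertexts):
    """Which ciphertexts equal the first one; needs no key at all."""
    return [c == ciphertexts[0] for c in ciphertexts]

key = secrets.token_bytes(16)
FIRE, HOLD = b"\xff" * 16, b"\x00" * 16       # the two command messages
traffic = [FIRE, HOLD, FIRE]

leaky = repeats([det_encrypt(key, m) for m in traffic])
hidden = repeats([rand_encrypt(key, m) for m in traffic])
```

Under the deterministic toy the pattern of repeated messages is visible in `leaky`; under the randomized toy a repeated message gives a different ciphertext except with negligible probability.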

The reason this is dramatic is that it goes in many ways against the historical or popular notion of encryption. Encryption was once thought of as a code, a fixed mapping of plaintexts to ciphertexts. But this is not the contemporary viewpoint. A single plaintext should have many possible ciphertexts (depending on the random choices or the state of the encryption algorithm). Yet it must be possible to decrypt. How is this possible? We have seen several examples above.

One formalization of privacy is what is called perfect security, an information-theoretic notion introduced by Shannon and shown by him to be met by the one-time pad scheme, and covered in Chapter ??. Perfect security asks that regardless of the computing power available to the adversary, the ciphertext provides it no information about the plaintext beyond the a priori information it had prior to seeing the ciphertext. Perfect security is a very strong attribute, but achieving it requires a key as long as the total amount of data encrypted, and this is not usually practical. So here we look at a notion of computational security. The security will only hold with respect to adversaries of limited computing power. If the adversary works harder, she can figure out more, but a “feasible” amount of effort yields no noticeable information. This is the important notion for us and will be used to analyze the security of schemes such as those presented above.

4.4 Indistinguishability under chosen-plaintext attack

Having discussed the issues in Section 4.3 above, we will now distill a formal definition of security.


$C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M_0)$, and returns $C$ as the answer.

World 1: The oracle provided to the adversary is $\mathcal{E}_K(\mathrm{LR}(\cdot, \cdot, 1))$. So, whenever the adversary makes a query $(M_0, M_1)$ with $|M_0| = |M_1|$ to its oracle, the oracle computes $C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M_1)$, and returns $C$ as the answer.

We also call the first world (or oracle) the “left” world (or oracle), and the second world (or oracle) the “right” world (or oracle). The problem for the adversary is, after talking to its oracle for some time, to tell which of the two oracles it was given. Before we pin this down, let us further clarify exactly how the oracle operates.

Think of the oracle as a subroutine to which $A$ has access. Adversary $A$ can make an oracle query $(M_0, M_1)$ by calling the subroutine with arguments $(M_0, M_1)$. In one step, the answer is then returned. Adversary $A$ has no control over how the answer is computed, nor can $A$ see the inner workings of the subroutine, which will typically depend on secret information that $A$ is not provided. Adversary $A$ has only an interface to the subroutine: the ability to call it as a black box and get back an answer.

First assume the given symmetric encryption scheme $\mathcal{SE}$ is stateless. The oracle, in either world, is probabilistic, because it calls the encryption algorithm. Recall that this algorithm is probabilistic. Above, when we say $C \stackrel{\$}{\leftarrow} \mathcal{E}_K(M_b)$, it is implicit that the oracle picks its own random coins and uses them to compute ciphertext $C$. The random choices of the encryption function are somewhat “under the rug” here, not being explicitly represented in the notation. But these random bits should not be forgotten. They are central to the meaningfulness of the notion and the security of the schemes.

If the given symmetric encryption scheme $\mathcal{SE}$ is stateful, the oracles, in either world, become stateful, too. (Think of a subroutine that maintains a “static” variable across successive calls.) An oracle begins with a state value initialized to a value specified by the encryption scheme. For example, in CTRC mode, the state is an integer $\mathrm{ctr}$ that is initialized to 0. Now, each time the oracle is invoked, it computes $\mathcal{E}_K(M_b)$ according to the specification of algorithm $\mathcal{E}$. The algorithm may, as a side-effect, update the state, and upon the next invocation of the oracle, the new state value will be used.

The following definition associates to a symmetric encryption scheme $\mathcal{SE}$ and an adversary $A$ a pair of experiments, one capturing each of the worlds described above. The adversary’s advantage, which measures its success in breaking the scheme, is the difference in probabilities of the two experiments returning the bit one.
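The left-or-right oracle described above can be sketched as a closure that hides the key and the bit from its caller. This is a hedged illustration: `make_lr_oracle` is a hypothetical name, `toy_encrypt` and `toy_decrypt` form an assumed randomized toy scheme (HMAC-SHA256 pad, messages of at most 32 bytes) used only to exercise the oracle, and `None` stands for ⊥ on queries with messages of unequal length.

```python
import hashlib
import hmac
import secrets

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(key: bytes, msg: bytes) -> bytes:
    """Hypothetical randomized scheme; the oracle draws these coins itself."""
    r = secrets.token_bytes(16)
    return r + _xor(msg, hmac.new(key, r, hashlib.sha256).digest())

def toy_decrypt(key: bytes, ct: bytes) -> bytes:
    r, body = ct[:16], ct[16:]
    return _xor(body, hmac.new(key, r, hashlib.sha256).digest())

def make_lr_oracle(encrypt, key: bytes, b: int):
    """Returns the subroutine E_K(LR(., ., b)); key and b stay hidden inside."""
    def oracle(m0: bytes, m1: bytes):
        if len(m0) != len(m1):
            return None                      # ⊥ on unequal lengths
        return encrypt(key, m1 if b == 1 else m0)
    return oracle
```

The adversary receives only the returned `oracle` callable, which matches the black-box interface in the text. For a stateful scheme, `encrypt` could be a function that closes over a counter, so that the state persists across oracle calls.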

Definition 4.8 Let SE = (K, E, D) be a symmetric encryption scheme, and let A be an algorithm that has access to an oracle. We consider the following experiments:

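Experiment $\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}1}}_{\mathcal{SE}}(A)$
    $K \stackrel{\$}{\leftarrow} \mathcal{K}$
    $d \stackrel{\$}{\leftarrow} A^{\mathcal{E}_K(\mathrm{LR}(\cdot,\cdot,1))}$
    Return $d$

Experiment $\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}0}}_{\mathcal{SE}}(A)$
    $K \stackrel{\$}{\leftarrow} \mathcal{K}$
    $d \stackrel{\$}{\leftarrow} A^{\mathcal{E}_K(\mathrm{LR}(\cdot,\cdot,0))}$
    Return $d$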


The oracle used above is specified in Fig. 4.6. The IND-CPA advantage of A is defined as

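$$\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A) = \Pr\left[\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}1}}_{\mathcal{SE}}(A) = 1\right] - \Pr\left[\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}0}}_{\mathcal{SE}}(A) = 1\right].$$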
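The two experiments and the advantage can be mimicked by a small Monte Carlo harness. This sketch is an assumption-laden illustration: the names `experiment` and `estimate_advantage` are hypothetical, the scheme is taken to be stateless, and the result is a sampled estimate of the advantage rather than its exact value. The two example adversaries and the `identity_encrypt` scheme are deliberately extreme so that the estimate is exact for them.

```python
import secrets

def experiment(keygen, encrypt, adversary, b: int) -> int:
    """One run of the world-b experiment: fresh key, oracle fixed to bit b."""
    key = keygen()
    def oracle(m0: bytes, m1: bytes):
        if len(m0) != len(m1):
            return None
        return encrypt(key, m1 if b == 1 else m0)
    return adversary(oracle)               # the adversary's output bit d

def estimate_advantage(keygen, encrypt, adversary, trials: int = 200) -> float:
    """Empirical Pr[world 1 outputs 1] minus Pr[world 0 outputs 1]."""
    ones1 = sum(experiment(keygen, encrypt, adversary, 1) for _ in range(trials))
    ones0 = sum(experiment(keygen, encrypt, adversary, 0) for _ in range(trials))
    return (ones1 - ones0) / trials

def keygen() -> bytes:
    return secrets.token_bytes(16)

def identity_encrypt(key: bytes, msg: bytes) -> bytes:
    return msg                             # hypothetical scheme with no privacy

def reads_first_byte(oracle) -> int:
    return oracle(b"\x00", b"\x01")[0]     # 1 exactly when the right message was used

def always_one(oracle) -> int:
    return 1                               # ignores the oracle entirely
```

An adversary that ignores its oracle has advantage 0, while against a scheme that hides nothing a single query suffices for advantage 1.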

As the above indicates, the choice of which world we are in is made just once, at the beginning, before the adversary starts to interact with the oracle. In world 0, all message pairs sent to the oracle are answered by the oracle encrypting the left message in the pair, while in world 1, all message pairs are answered by the oracle encrypting the right message in the pair. The choice of which does not flip-flop from oracle query to oracle query.

If $\mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A)$ is small (meaning close to zero), it means that $A$ is outputting 1 about as often in world 0 as in world 1, meaning it is not doing a good job of telling which world it is in. If this quantity is large (meaning close to one, or at least far from zero) then the adversary $A$ is doing well, meaning our scheme $\mathcal{SE}$ is not secure, at least to the extent that we regard $A$ as “reasonable.”

Informally, for symmetric encryption scheme $\mathcal{SE}$ to be secure against chosen plaintext attack, the IND-CPA advantage of an adversary must be small, no matter what strategy the adversary tries. However, we have to be realistic in our expectations, understanding that the advantage may grow as the adversary invests more effort in its attack. Security is a measure of how large the advantage of the adversary might be when compared against the adversary’s resources.

We consider an encryption scheme to be “secure against chosen-plaintext attack” if an adversary restricted to using a “practical” amount of resources (computing time, number of queries) cannot obtain “significant” advantage. The technical notion is called left-or-right indistinguishability under chosen-plaintext attack, denoted IND-CPA.

We discuss some important conventions regarding the resources of adversary $A$. The running time of an adversary $A$ is the worst case execution time of $A$ over all possible coins of $A$ and all conceivable oracle return values (including return values that could never arise in the experiments used to define the advantage). Oracle queries are understood to return a value in unit time, but it takes the adversary one unit of time to read any bit that it chooses to read. By convention, the running time of $A$ also includes the size of the code of the adversary $A$, in some fixed RAM model of computation. This convention for measuring time complexity is the same as used in other parts of these notes, for all kinds of adversaries.

Other resource conventions are specific to the IND-CPA notion. When the adversary asks its left-or-right encryption oracle a query $(M_0, M_1)$ we say that the length of this query is $\max(|M_0|, |M_1|)$. (This will equal $|M_0|$ for any reasonable adversary since an oracle query with messages of different lengths results in the adversary being returned $\bot$, so we can assume no reasonable adversary makes such a query.) The total length of queries is the sum of the length of each query. We can measure query lengths in bits or in blocks, with a block having some understood number of bits $n$.


The Proposition says that this rescaled advantage is exactly the same measure as before.

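Proof of Proposition 4.9: We let $\Pr[\cdot]$ be the probability of event “$\cdot$” in the experiment $\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}cg}}_{\mathcal{SE}}(A)$, and refer below to quantities in this experiment. The claim of the Proposition follows by a straightforward calculation:

$$
\begin{aligned}
\Pr\left[\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}cg}}_{\mathcal{SE}}(A) = 1\right]
&= \Pr\left[b = b'\right] \\
&= \Pr\left[b = b' \mid b = 1\right] \cdot \Pr\left[b = 1\right] + \Pr\left[b = b' \mid b = 0\right] \cdot \Pr\left[b = 0\right] \\
&= \Pr\left[b = b' \mid b = 1\right] \cdot \frac{1}{2} + \Pr\left[b = b' \mid b = 0\right] \cdot \frac{1}{2} \\
&= \Pr\left[b' = 1 \mid b = 1\right] \cdot \frac{1}{2} + \Pr\left[b' = 0 \mid b = 0\right] \cdot \frac{1}{2} \\
&= \Pr\left[b' = 1 \mid b = 1\right] \cdot \frac{1}{2} + \left(1 - \Pr\left[b' = 1 \mid b = 0\right]\right) \cdot \frac{1}{2} \\
&= \frac{1}{2} + \frac{1}{2} \cdot \left(\Pr\left[b' = 1 \mid b = 1\right] - \Pr\left[b' = 1 \mid b = 0\right]\right) \\
&= \frac{1}{2} + \frac{1}{2} \cdot \left(\Pr\left[\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}1}}_{\mathcal{SE}}(A) = 1\right] - \Pr\left[\mathbf{Exp}^{\mathrm{ind\text{-}cpa\text{-}0}}_{\mathcal{SE}}(A) = 1\right]\right) \\
&= \frac{1}{2} + \frac{1}{2} \cdot \mathbf{Adv}^{\mathrm{ind\text{-}cpa}}_{\mathcal{SE}}(A).
\end{aligned}
$$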

We began by expanding the quantity of interest via standard conditioning. The term of 1/2 in the third line emerged because the choice of $b$ is made at random. In the fourth line we noted that if we are asking whether $b = b'$ given that we know $b = 1$, it is the same as asking whether $b' = 1$ given $b = 1$, and analogously for $b = 0$. In the fifth and sixth lines we just manipulated the probabilities and simplified. The next line is important; here we observed that the conditional probabilities in question are exactly the probabilities that $A$ returns 1 in the experiments of Definition 4.8.
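The arithmetic in this calculation can be checked exactly with rational numbers. In the sketch below, which is an illustration rather than part of the proof, `p1` and `p0` are hypothetical names for $\Pr[b' = 1 \mid b = 1]$ and $\Pr[b' = 1 \mid b = 0]$, and the bit $b$ is assumed uniform as in the text.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def guess_probability(p1: Fraction, p0: Fraction) -> Fraction:
    """Pr[b = b'] for uniform b, by conditioning on b (third to fifth lines)."""
    return p1 * HALF + (1 - p0) * HALF

def advantage(p1: Fraction, p0: Fraction) -> Fraction:
    """Difference of the probabilities of outputting 1 in world 1 and world 0."""
    return p1 - p0

# Worked instance: an adversary that outputs 1 with probability 3/4 in
# world 1 and 1/4 in world 0.
p1, p0 = Fraction(3, 4), Fraction(1, 4)
guess = guess_probability(p1, p0)          # 3/8 + 3/8
rescaled = HALF + HALF * advantage(p1, p0) # 1/2 + 1/2 * 1/2
```

Both `guess` and `rescaled` come out to 3/4 here, and the same equality holds for any choice of `p1` and `p0`.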

4.4.3 Why is this a good definition?

Our thesis is that we should consider an encryption scheme to be “secure” if and only if it is IND-CPA secure, meaning that the above formalization captures our intuitive sense of privacy, and the security requirements that one might put on an encryption scheme can be boiled down to this one.

But why? Why does IND-CPA capture “privacy”? This is an important question to address and answer.

In particular, here is one concern. In Section 4.3 we noted a number of security properties that are necessary but not sufficient for security. For example, it should be computationally infeasible for an adversary to recover the key from a few plaintext-ciphertext pairs, or to recover a plaintext from a ciphertext.

A test of our definition is that it implies the necessary properties that we have discussed, and others. For example, a scheme that is secure in the IND-CPA sense of our definition should also be, automatically, secure against key-recovery or plaintext-recovery. Later, we will prove such things, and even stronger things. For now, let us continue to get a better sense of how to work with the definition by using it to show that certain schemes are insecure.

4.5 Example chosen-plaintext attacks

We illustrate the use of our IND-CPA definition in finding attacks by providing an attack on ECB mode, and also a general attack on deterministic, stateless schemes.

4.5.1 Attack on ECB

Let us fix a block cipher $E \colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$. The ECB symmetric encryption scheme $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ was described as Scheme 4.3. Suppose an adversary sees a ciphertext $C = \mathcal{E}_K(M)$ corresponding to some random plaintext $M$, encrypted under the key $K$ also unknown to the adversary. Can the adversary recover $M$? Not easily, if $E$ is a “good” block cipher. For example if $E$ is AES, it seems quite infeasible. Yet, we have already discussed how infeasibility of recovering plaintext from ciphertext is not an indication of security.

ECB has other weaknesses. Notice that if two plaintexts $M$ and $M'$ agree in the first block, then so do the corresponding ciphertexts. So an adversary, given the ciphertexts, can tell whether or not the first blocks of the corresponding plaintexts are the same. This is loss of partial information about the plaintexts, and is not permissible in a secure encryption scheme.

It is a test of our definition to see that it captures these weaknesses and also finds the scheme insecure. It does. To show this, we want to show that there is an adversary that has a high IND-CPA advantage while using a small amount of resources. We now construct such an adversary $A$. Remember that $A$ is given an lr-encryption oracle $\mathcal{E}_K(\mathrm{LR}(\cdot, \cdot, b))$ that takes as input a pair of messages and that returns an encryption of either the left or the right message in the pair, depending on the value of the bit $b$. The goal of $A$ is to determine the value of $b$. Our adversary works like this:

Adversary $A^{\mathcal{E}_K(\mathrm{LR}(\cdot,\cdot,b))}$
    $M_1 \leftarrow 0^{2n}$ ; $M_0 \leftarrow 0^n \,\|\, 1^n$
    $C[1]C[2] \leftarrow \mathcal{E}_K(\mathrm{LR}(M_0, M_1, b))$
    If $C[1] = C[2]$ then return 1 else return 0
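The adversary above can be tried out in a small simulation. The following Python sketch is only an illustration under stated assumptions: the block cipher is replaced by a toy keyed permutation on 8-bit blocks (so n = 8), and the names `make_block_cipher`, `ecb_encrypt`, `make_lr_oracle` and `experiment` are hypothetical helpers, not part of the scheme's definition.

```python
import random

def make_block_cipher(key):
    """Toy keyed permutation on 8-bit blocks (assumed stand-in for E_K; not secure)."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return lambda block: table[block]

def ecb_encrypt(E, message):
    """ECB: apply the block cipher independently to each block (here, each byte)."""
    return bytes(E(block) for block in message)

def make_lr_oracle(E, b):
    """lr-encryption oracle: encrypts the left message if b = 0, the right if b = 1."""
    def oracle(m0, m1):
        assert len(m0) == len(m1)
        return ecb_encrypt(E, m1 if b == 1 else m0)
    return oracle

def adversary(lr_oracle):
    m1 = bytes([0x00, 0x00])   # M_1 = 0^{2n}
    m0 = bytes([0x00, 0xFF])   # M_0 = 0^n || 1^n
    c = lr_oracle(m0, m1)      # single oracle query
    return 1 if c[0] == c[1] else 0

def experiment(b, key):
    """Run the adversary in world b under the given key."""
    return adversary(make_lr_oracle(make_block_cipher(key), b))

trials = 100
wins_world1 = sum(experiment(1, key) for key in range(trials))
wins_world0 = sum(experiment(0, key) for key in range(trials))
print(wins_world1 / trials - wins_world0 / trials)  # empirical advantage
```

In world 1 both ciphertext blocks encrypt the same plaintext block and therefore coincide; in world 0 they encrypt distinct blocks under a permutation and therefore differ, so the empirical advantage in this toy setting is 1.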

Above, $X[i]$ denotes the $i$-th block of a string $X$, a block being a sequence of $n$ bits. The adversary's single oracle query is the pair of messages $M_0, M_1$. Since each


The requirement being made on the message space is minimal; typical schemes have message spaces containing all strings of lengths between some minimum and maximum length, possibly restricted to lengths that are multiples of some given value. Note that this Proposition applies to ECB and is enough to show that the latter is insecure.

Proof of Proposition 4.10: We must describe the adversary $A$. Remember that $A$ is given an lr-encryption oracle $f = \mathcal{E}_K(\mathrm{LR}(\cdot,\cdot,b))$ that takes as input a pair of messages and returns an encryption of either the left or the right message in the pair, depending on the value of $b$. The goal of $A$ is to determine the value of $b$. Our adversary works like this:

Adversary $A^f$
    Let $X, Y$ be distinct, $m$-bit strings in the plaintext space
    $C_1 \leftarrow \mathcal{E}_K(\mathrm{LR}(X, Y, b))$
    $C_2 \leftarrow \mathcal{E}_K(\mathrm{LR}(Y, Y, b))$
    If $C_1 = C_2$ then return 1 else return 0

Now, we claim that

$$\Pr\left[\mathbf{Exp}^{\text{ind-cpa-1}}_{\mathcal{SE}}(A) = 1\right] = 1 \quad\text{and}\quad \Pr\left[\mathbf{Exp}^{\text{ind-cpa-0}}_{\mathcal{SE}}(A) = 1\right] = 0.$$

Why? In world 1, meaning $b = 1$, the oracle returns $C_1 = \mathcal{E}_K(Y)$ and $C_2 = \mathcal{E}_K(Y)$, and since the encryption function is deterministic and stateless, $C_1 = C_2$, so $A$ returns 1. In world 0, meaning $b = 0$, the oracle returns $C_1 = \mathcal{E}_K(X)$ and $C_2 = \mathcal{E}_K(Y)$, and since it is required that decryption be able to recover the message, it must be that $C_1 \neq C_2$. So $A$ returns 0.

Subtracting, we get $\mathbf{Adv}^{\text{ind-cpa}}_{\mathcal{SE}}(A) = 1 - 0 = 1$. And $A$ achieved this advantage by making two oracle queries, each of length $m$ bits (as per our conventions, the length of a query is just the length of its first message).
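To see that the argument does not depend on the particular scheme, the following Python sketch runs the same two-query adversary against two toy deterministic, stateless schemes. Everything here is an illustrative assumption: `ecb_scheme` (ECB over a toy 8-bit keyed permutation), `fixed_pad_scheme` (XOR with a key-derived pad reused for every message), and the choice of `X` and `Y` are hypothetical and serve only as stand-ins.

```python
import random

M_BYTES = 4  # message length m, in bytes (hypothetical choice)

def ecb_scheme(key):
    """Toy ECB over an 8-bit keyed permutation: deterministic and stateless."""
    table = list(range(256))
    random.Random(key).shuffle(table)
    return lambda msg: bytes(table[x] for x in msg)

def fixed_pad_scheme(key):
    """XOR with a key-derived pad that is reused: deterministic and stateless."""
    rng = random.Random(key)
    pad = bytes(rng.randrange(256) for _ in range(M_BYTES))
    return lambda msg: bytes(a ^ p for a, p in zip(msg, pad))

def adversary(lr_oracle, x, y):
    c1 = lr_oracle(x, y)   # encrypts X in world 0, Y in world 1
    c2 = lr_oracle(y, y)   # encrypts Y in both worlds
    return 1 if c1 == c2 else 0

def advantage(make_scheme, trials=100):
    """Empirical IND-CPA advantage of the adversary over several keys."""
    x, y = b"AAAA", b"BBBB"   # distinct m-bit strings X and Y
    wins = {0: 0, 1: 0}
    for key in range(trials):
        encrypt = make_scheme(key)
        for b in (0, 1):
            lr_oracle = lambda m0, m1, b=b: encrypt(m1 if b == 1 else m0)
            wins[b] += adversary(lr_oracle, x, y)
    return wins[1] / trials - wins[0] / trials

print(advantage(ecb_scheme), advantage(fixed_pad_scheme))
```

Both toy schemes are injective for a fixed key, which plays the role of the requirement that decryption recover the message; that is what forces the two ciphertexts to differ in world 0.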

4.5.3 Attack on CBC encryption with counter IV

Let us fix a block cipher $E\colon \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$. Let $\mathcal{SE} = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ be the corresponding counter-based version of the CBC encryption mode described in Scheme 4.5. We show that this scheme is insecure. The reason is that the adversary can predict the counter value. To justify our claim of insecurity, we present an adversary $A$. As usual it is given an lr-encryption oracle $\mathcal{E}_K(\mathrm{LR}(\cdot,\cdot,b))$ and wants to determine $b$. Our adversary works like this:

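Adversary $A^{\mathcal{E}_K(\mathrm{LR}(\cdot,\cdot,b))}$
    $M_{0,1} \leftarrow 0^n$ ; $M_{1,1} \leftarrow 0^n$
    $M_{0,2} \leftarrow 0^n$ ; $M_{1,2} \leftarrow 0^{n-1}1$
    $\langle IV_1, C_1 \rangle \stackrel{\$}{\leftarrow} \mathcal{E}_K(\mathrm{LR}(M_{0,1}, M_{1,1}, b))$
    $\langle IV_2, C_2 \rangle \stackrel{\$}{\leftarrow} \mathcal{E}_K(\mathrm{LR}(M_{0,2}, M_{1,2}, b))$
    If $C_1 = C_2$ then return 1 else return 0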

We claim that

$$\Pr\left[\mathbf{Exp}^{\text{ind-cpa-1}}_{\mathcal{SE}}(A) = 1\right] = 1 \quad\text{and}\quad \Pr\left[\mathbf{Exp}^{\text{ind-cpa-0}}_{\mathcal{SE}}(A) = 1\right] = 0.$$

Why? In what follows, 0 and 1 denote the $n$-bit strings $0^n$ and $0^{n-1}1$. First consider the case $b = 0$, meaning we are in world 0. In that case $IV_1 = 0$ and $IV_2 = 1$, so $C_1 = E_K(0 \oplus 0) = E_K(0)$ and $C_2 = E_K(1 \oplus 0) = E_K(1)$. Since $E_K$ is a permutation, $C_1 \neq C_2$, and the defined experiment returns 0. On the other hand, if $b = 1$, meaning we are in world 1, then $IV_1 = 0$ and $IV_2 = 1$, so $C_1 = E_K(0 \oplus 0) = E_K(0)$ and $C_2 = E_K(1 \oplus 1) = E_K(0)$, and the defined experiment returns 1. Subtracting, we get $\mathbf{Adv}^{\text{ind-cpa}}_{\mathcal{SE}}(A) = 1 - 0 = 1$, showing that $A$ has a very high advantage. Moreover, $A$ is practical, using very few resources. So the scheme is insecure.
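The attack can also be checked by simulation. The Python sketch below makes several assumptions that are not taken from Scheme 4.5 itself: the block cipher is a toy keyed permutation on 8-bit blocks (n = 8), the counter starts at 0 and increases by one per encrypted message (consistent with $IV_1 = 0$ and $IV_2 = 1$ in the analysis above), and the class `CounterCBC` and function `experiment` are hypothetical names.

```python
import random

class CounterCBC:
    """Toy CBC encryption with a counter IV over 8-bit blocks (illustration only)."""

    def __init__(self, key):
        table = list(range(256))
        random.Random(key).shuffle(table)   # toy keyed permutation standing in for E_K
        self.table = table
        self.ctr = 0                        # assumed initial counter value

    def encrypt(self, message):
        iv = self.ctr % 256                 # IV is the current counter as one block
        prev, out = iv, []
        for block in message:
            prev = self.table[prev ^ block] # C[i] = E_K(C[i-1] xor M[i])
            out.append(prev)
        self.ctr += 1                       # predictable: next IV is ctr + 1
        return iv, bytes(out)

def adversary(lr_oracle):
    # First query: M_{0,1} = M_{1,1} = 0^n
    iv1, c1 = lr_oracle(bytes([0]), bytes([0]))
    # Second query: M_{0,2} = 0^n, M_{1,2} = 0^{n-1}1
    iv2, c2 = lr_oracle(bytes([0]), bytes([1]))
    return 1 if c1 == c2 else 0

def experiment(b, key):
    """Run the adversary in world b against a fresh stateful encryptor."""
    scheme = CounterCBC(key)
    lr_oracle = lambda m0, m1: scheme.encrypt(m1 if b == 1 else m0)
    return adversary(lr_oracle)

trials = 100
wins_world1 = sum(experiment(1, key) for key in range(trials))
wins_world0 = sum(experiment(0, key) for key in range(trials))
print(wins_world1 / trials - wins_world0 / trials)  # empirical advantage
```

The second right-hand message is chosen to cancel the predicted IV, so in world 1 both queries feed the same input to the block cipher, while in world 0 the inputs differ.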

4.6 IND-CPA implies PR-CPA

In Section 4.3 we noted a number of security properties that are necessary but not sufficient for security. For example, it should be computationally infeasible for an adversary to recover the key from a few plaintext-ciphertext pairs, or to recover a plaintext from a ciphertext. A test of our definition is that it implies these properties, in the sense that a scheme that is secure in the sense of our definition is also secure against key recovery or plaintext recovery.

The situation is analogous to what we saw in the case of PRFs. There we showed that a secure PRF is secure against key recovery. In order to have some variation, this time we choose a different property, namely plaintext recovery. We formalize this, and then show that if there were an adversary B capable of recovering the plaintext from a given ciphertext, then this would enable us to construct an adversary A that breaks the scheme in the IND-CPA sense (meaning that A can identify which of the two worlds it is in). If the scheme is secure in the IND-CPA sense, the latter adversary cannot exist, and hence neither can the former.

The idea of this argument illustrates one way to provide evidence that a definition, say the definition of left-or-right indistinguishability, is good. Take some property that you feel a secure scheme should have, like infeasibility of key recovery from a few plaintext-ciphertext pairs, or infeasibility of predicting the XOR of the plaintext bits. Imagine there were an adversary B that was successful at this task. We should show that this would enable us to construct an adversary A that breaks the scheme in the original sense (left-or-right indistinguishability). Thus the adversary B does not exist if the scheme is secure in the left-or-right sense. More precisely, we use the advantage function of the scheme to bound the probability that adversary B succeeds.
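The shape of such a reduction can be sketched in Python. This is a minimal illustration under assumptions, not the formal construction: the scheme `fixed_pad_scheme` is a deliberately broken toy, `plaintext_recovery_B` is a hypothetical plaintext-recovery adversary tailored to it, and `ind_cpa_A` shows one common way to build A from B (pick two random messages, obtain a challenge from the lr-oracle, answer B's encryption queries with the pair (X, X), and guess world 1 exactly when B outputs the right message).

```python
import random

M_BYTES = 4  # message length in bytes (hypothetical choice)

def fixed_pad_scheme(key):
    """Deliberately broken toy scheme: XOR with a fixed key-derived pad."""
    rng = random.Random(key)
    pad = bytes(rng.randrange(256) for _ in range(M_BYTES))
    return lambda msg: bytes(a ^ p for a, p in zip(msg, pad))

def plaintext_recovery_B(enc_oracle, challenge):
    """Hypothetical adversary B: recovers the plaintext of the challenge."""
    pad = enc_oracle(bytes(M_BYTES))            # encrypting zeros reveals the pad
    return bytes(c ^ p for c, p in zip(challenge, pad))

def ind_cpa_A(lr_oracle, rng):
    """Adversary A built from B: uses B to decide which world it is in."""
    m0 = bytes(rng.randrange(256) for _ in range(M_BYTES))
    m1 = bytes(rng.randrange(256) for _ in range(M_BYTES))
    challenge = lr_oracle(m0, m1)
    # B's encryption queries X are answered with the pair (X, X),
    # which yields an encryption of X in either world.
    guess = plaintext_recovery_B(lambda x: lr_oracle(x, x), challenge)
    return 1 if guess == m1 else 0

def experiment(b, seed):
    rng = random.Random(seed)
    encrypt = fixed_pad_scheme(rng.getrandbits(64))
    lr_oracle = lambda left, right: encrypt(right if b == 1 else left)
    return ind_cpa_A(lr_oracle, rng)

trials = 200
p1 = sum(experiment(1, s) for s in range(trials)) / trials
p0 = sum(experiment(0, s) for s in range(trials)) / trials
print(p1 - p0)  # empirical IND-CPA advantage of A
```

In world 1, A outputs 1 whenever B succeeds; in world 0, A outputs 1 only if the two random messages happen to coincide, which is unlikely for long messages. The advantage of A is therefore close to the success probability of B, which is the contrapositive form of the argument described above.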