Multi-Instance Security and its Application to
Password-Based Cryptography
Mihir Bellare, Stefano Tessaro, Thomas Ristenpart
November 2011
Abstract
This paper develops a theory of multi-instance (mi) security and applies it to provide the first
proof-based support for the classical practice of salting in password-based cryptography. Mi-security
comes into play in settings (like password-based cryptography) where it is computationally feasible to
compromise a single instance, and provides a second line of defense, aiming to ensure (in the case of
passwords, via salting) that the effort to compromise all of some large number m of instances grows
linearly with m. The first challenge is definitions, where we suggest LORX-security as a good metric
for mi security of encryption and support this claim by showing it implies other natural metrics,
illustrating in the process that even lifting simple results from the si setting to the mi one calls for
new techniques. Next we provide a composition-based framework to transfer standard single-instance
(si) security to mi-security with the aid of a key-derivation function. Analyzing password-based KDFs
from the PKCS#5 standard to show that they meet our indifferentiability-style mi-security definition
for KDFs, we are able to conclude with the first proof that per password salts amplify mi-security
as hoped in practice. We believe that mi-security is of interest in other domains and that this work
provides the foundation for its further theoretical development and practical application.
Keywords: Passwords, security amplification, indifferentiability, random oracles.
Mihir Bellare: Department of Computer Science & Engineering, University of California San Diego, USA.
Stefano Tessaro: MIT CSAIL, USA.
Thomas Ristenpart: Department of Computer Science, University of Wisconsin–Madison, USA.


Contents

  • 1 Introduction
  • 2 The Multi-Instance Terrain
    • 2.1 Metrics of mi security
    • 2.2 Relations
  • 3 Password-based Encryption via KDFs
    • 3.1 Simulation-based Security for KDFs
    • 3.2 Security of PBE
  • A Brute-force password-recovery attacks
  • B section title
  • C Proofs for Section
  • D Multi-user Encryption Security
  • E More KDFs
  • F Proof of Theorem 3.2
  • G Security Analysis of KD1
  • H Proof of Theorem 3.4

[Figure 1 diagram: nodes LORX, RORX, AND, and UKU joined by implication arrows; each arrow is labeled with the theorem that proves it, and the arrows leaving RORX also carry the factor $2^m$.]

Figure 1: Notions of multi-instance security for encryption and their relations. LORX (left-or-right xor indistinguishability) emerges as the strongest, tightly implying RORX (real-or-random xor indistinguishability) and UKU (universal key-unrecoverability). The dashed line indicates that under some (mild, usually met) conditions LORX also implies AND. RORX implies LORX and UKU but with a $2^m$ loss in advantage, where $m$ is the number of instances, making LORX a better choice.

adversary wins if it breaks all $m$ instances of the encryption but does not win if it breaks strictly fewer. If “breaking” is interpreted as recovery of the key then such a metric is easily given: it is the probability that the adversary recovers all $m$ target keys. We refer to this as the UKU (Universal Key Unrecoverability) metric. But we know very well that key-recovery is a weak metric of encryption security. We want instead a mi analog of ind-cpa. The first thing that might come to mind is multi-user security [4, 3]. But in the latter the adversary wins (gets an advantage of one) even if it breaks just one instance, so the mu-advantage of an adversary can never be less than its si (ind-cpa) advantage. We, in contrast, cannot “give up” once a single instance is broken. Something radically different is needed. Our answer is LORX (left-or-right xor indistinguishability). Our game picks $m$ independent challenge bits $b_1, \ldots, b_m$ and gives the adversary an oracle $\mathrm{Enc}(\cdot, \cdot, \cdot)$ which, given $i, M_0, M_1$, returns an encryption of $M_{b_i}$ under $K_i$. The adversary outputs a bit $b'$ and its advantage is $2\Pr[b' = b_1 \oplus \cdots \oplus b_m] - 1$ (footnote 2). Why xor? Its well-known “sensitivity” means that even if the adversary figures out $m - 1$ of the challenge bits, it will have low advantage unless it also figures out the last. This intuitive and historical support is strengthened by the relations, discussed below, that show that LORX implies security under other natural metrics.

Relations. The novelty of multi-instance security prompts us to step back and consider a broad choice of definitions. Besides UKU and LORX, we define RORX (real-or-random xor indistinguishability, a mi-adaptation of the si ROR notion of [5]) and a natural AND metric where the challenge bits $b_1, \ldots, b_m$ and oracle $\mathrm{Enc}(\cdot, \cdot, \cdot)$ are as in the LORX game but the adversary output is a vector $(b'_1, \ldots, b'_m)$ and its advantage is $\Pr[(b'_1, \ldots, b'_m) = (b_1, \ldots, b_m)] - 2^{-m}$. The relations we provide, summarized in Figure 1, show that LORX emerges as the best choice because it implies all the others with tight reductions. Beyond that, they illustrate that the mi terrain differs from the si one in perhaps surprising ways, both in terms of relations and the techniques needed to establish them. Thus, in the si setting, LOR and ROR are easily shown equivalent up to a factor 2 in the advantages [5]. It continues to be true that LORX easily implies RORX, but the hybrid argument used to prove that ROR implies LOR [5] does not easily extend to the mi setting, and the proof that RORX implies LORX is not only more involved but incurs a factor $2^m$ loss (footnote 3). In the si setting, both LOR and ROR are easily shown to imply KU (key unrecoverability). Showing LORX implies UKU is more involved, needing a boosting argument to ensure preservation of exponentially-vanishing advantages. This reduction is tight

Footnote 2: This is a simplification of our actual definition, which allows the adversary to adaptively corrupt instances to reveal the underlying keys and challenge bits. This capability means that LORX-security implies threshold security where the adversary wins if it predicts the xor of the challenge bits of some subset of the instances of its choice. See Section 2 for further justification for this feature of the model.

Footnote 3: This (exponential) $2^m$ factor loss is a natural consequence of the factor of 2 loss in the si case, our bound is tight, and the loss in applications is usually small because advantages are already exponentially vanishing in $m$. Nonetheless it is not always negligible and makes LORX preferable to RORX.

but, interestingly, the reduction showing RORX implies UKU is not, incurring a $2^m$-factor loss, again indicating that LORX is a better choice. We show that LORX usually implies AND by exploiting a direct product theorem by Unger [41], evidencing the connections with this area. Another natural metric of mi-security is a threshold one, but our incorporation of corruptions means that LORX implies security under this metric.

Mi-security of PBE. Under the LORX metric, we prove that the advantage $\epsilon'$ obtained by a time $t$ adversary against $m$ instances of the above PBE scheme $\mathsf{E}'$ is at most $\epsilon + (q/(mcN))^m$ (we are dropping negligible terms), where $q$ is the number of adversary queries to the RO $H$ and $\epsilon$ is the advantage of a time $t$ ind-cpa (si) adversary against $\mathsf{E}$. This is the desired result saying that salting works to provide a second line of defense under a strong mi security metric, amplifying security linearly in the number of instances.
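To get a feel for the size of the guessing term, the following sketch evaluates $(q/(mcN))^m$ exactly and compares it with a single-instance style term $q/(cN)$. It assumes that $N$ is the number of equally likely passwords, which this excerpt does not define; the function names and sample parameters are illustrative only.

```python
from fractions import Fraction


def mi_guessing_term(q: int, m: int, c: int, N: int) -> Fraction:
    """The term (q / (m*c*N))**m from the bound, capped at 1."""
    return min(Fraction(1), Fraction(q, m * c * N) ** m)


def si_guessing_term(q: int, c: int, N: int) -> Fraction:
    """A single-instance style term q / (c*N), capped at 1."""
    return min(Fraction(1), Fraction(q, c * N))


if __name__ == "__main__":
    # Hypothetical parameters: 10 instances, 1000 iterations, 2**20 passwords.
    m, c, N = 10, 1000, 2**20
    for q in (c * N // 2, c * N, m * c * N // 2, m * c * N):
        print(q, float(si_guessing_term(q, c, N)),
              float(mi_guessing_term(q, m, c, N)))
```

With these numbers, a budget of $q = mcN/2$ queries already saturates the single-instance term, while the multi-instance term is still $2^{-m}$.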

Framework. This result for PBE is established in a modular (rather than ad hoc) way, via a framework that yields corresponding results for any password-based primitive. This means not only ones like password-based message authentication (also covered in PKCS#5) or password-based authenticated encryption (WinZip) but public-key primitives like password-based digital signatures, where the signing key is derived from a password. We view a password-based scheme for a goal as derived by composing a key-derivation function (KDF) with a standard (si) scheme for the same goal. The framework then has the following components. (1) We provide a definition of mi-security for KDFs. (2) We provide composition theorems, showing that composing a mi-secure KDF with a si-secure scheme for a goal results in a mi-secure scheme for that goal. (We will illustrate this for the case of encryption but similar results may be shown for other primitives.) (3) We analyze the iterated hash KDF of PKCS#5 and establish its mi security. The statements above are qualitative. The quantitative security aspect is crucial. The definition of mi-security of KDFs must permit showing mi-security much higher than si-security. The reductions in the composition theorems must preserve exponentially vanishing mi-advantages. And the analysis of the PKCS#5 KDF must prove that the adversary advantage in $q$ queries to the RO $H$ grows as $(q/(cmN))^m$, not merely $q/(cN)$. These quantitative constraints represent important technical challenges.

Mi-security of KDFs. We expand on item (1) above. The definition of mi-security we provide for KDFs is a simulation-based one inspired by the indifferentiability framework [29, 13]. The attacker must distinguish between the real world and an ideal counterpart. In both, target passwords $pw_1, \ldots, pw_m$ and salts $sa_1, \ldots, sa_m$ are randomly chosen. In the real world, the adversary gets input $(pw_1, sa_1, \mathsf{KD}(pw_1 \| sa_1)), \ldots, (pw_m, sa_m, \mathsf{KD}(pw_m \| sa_m))$ and also gets an oracle for the RO hash function $H$ used by $\mathsf{KD}$. In the ideal world, the input is $(pw_1, sa_1, L_1), \ldots, (pw_m, sa_m, L_m)$ where the keys $L_1, \ldots, L_m$ are randomly chosen, and the oracle is a simulator. The simulator itself has access to a Test oracle that will take a guess for a password and tell the simulator whether or not it matches one of the target passwords. Crucially, we require that when the number of queries made by the adversary to the simulator is $q$, the number of queries made by the simulator to its Test oracle is only $q/c$. This restriction is critical to our proof of security amplification and a source of challenges in the proof.

Related work. Previous work which aimed at providing proof-based assurances for password-based key-derivation has focused on the single-instance case and the role of iteration as represented by the iteration count $c$. Our work focuses on the multi-instance case and the roles of both salting and iteration. The UNIX password hashing algorithm maps a password $pw$ to $E^c_{pw}(0)$ where $E$ is a blockcipher and 0 is a constant. Luby and Rackoff [27] show this is a one-way function when $c = 1$ and $pw$ is a random blockcipher key. (So their result does not really cover passwords.) Wagner and Goldberg [42] treat the more general case of arbitrary $c$ and keys that are passwords, but the goal continues to be to establish one-wayness and no security amplification (meaning increase in security with $c$) is shown. Boyen [10, 11] suggests various ways to enhance security, including letting users pick their own iteration counts. Yao and Yin [44] give a natural pseudorandomness definition of a KDF in which the attacker gets $(K, sa)$ where $K$ is either $H^c(pw \| sa)$ or a random string of the same length and must determine which. Modeling $H$ as a random oracle (RO) [7] to which the adversary makes $q$ queries, they claim to prove that

main UKU^A_{SE,m}
    K[1], ..., K[m] ←$ K
    K′ ←$ A^Enc
    Ret (K′ = K)

proc. Enc(i, M)
    Ret E(K[i], M)

proc. Cor(i)
    Ret K[i]

main LORX^A_{SE,m}
    K[1], ..., K[m] ←$ K
    b ←$ {0,1}^m
    b′ ←$ A^Enc
    Ret (b′ = ⊕_i b[i])

main AND^A_{SE,m}
    K[1], ..., K[m] ←$ K
    b ←$ {0,1}^m
    b′ ←$ A^Enc
    Ret (b′ = b)

proc. Enc(i, M_0, M_1)
    If |M_0| ≠ |M_1| then Ret ⊥
    C ←$ E(K[i], M_{b[i]})
    Ret C

proc. Cor(i)
    Ret (K[i], b[i])

main RORX^A_{SE,m}
    K[1], ..., K[m] ←$ ({0,1}^k)^m
    b ←$ {0,1}^m
    b′ ←$ A^Enc
    Ret (b′ = ⊕_i b[i])

proc. Enc(i, M)
    C_1 ←$ E(K[i], M)
    M_0 ←$ {0,1}^{|M|}
    C_0 ←$ E(K[i], M_0)
    Ret C_{b[i]}

proc. Cor(i)
    Ret (K[i], b[i])

Figure 2: Multi-instance security notions for encryption.

$\mathrm{Adv}^{\mathrm{uku}}_{\mathsf{SE},m}(A) = \Pr[\mathrm{UKU}^{A}_{\mathsf{SE},m} \Rightarrow \mathsf{true}]$. Naturally, this advantage depends on the adversary’s resources. (It could be 1 if the adversary corrupts all instances.) We say that $A$ is a $(t, \mathbf{q}, q_c)$-adversary if it runs in time $t$, makes at most $\mathbf{q}[i]$ encryption queries of the form $\mathrm{Enc}(i, \cdot)$, and makes at most $q_c$ corruption queries. Then we let $\mathrm{Adv}^{\mathrm{uku}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c) = \max_A \mathrm{Adv}^{\mathrm{uku}}_{\mathsf{SE},m}(A)$, where the maximum is over all $(t, \mathbf{q}, q_c)$-adversaries.

AND. Single-instance indistinguishability for symmetric encryption is usually formalized via left-or-right security [5]. A random bit $b$ and key $K \xleftarrow{\$} \mathsf{K}$ are chosen, and an adversary $A$ is given access to an oracle Enc that, given equal-length messages $M_0, M_1$, returns $\mathsf{E}(K, M_b)$. The adversary outputs a bit $b'$ and its advantage is $2\Pr[b = b'] - 1$. There are several ways one might consider creating an mi analog. Let us first consider a natural AND-based metric based on game $\mathrm{AND}_{\mathsf{SE},m}$ of Figure 2. It picks at random a vector $\mathbf{b} \xleftarrow{\$} \{0,1\}^m$ of challenge bits as well as a vector $\mathbf{K}[1], \ldots, \mathbf{K}[m]$ of keys, and the adversary is given access to oracle Enc that on input $i, M_0, M_1$, where $|M_0| = |M_1|$, returns $\mathsf{E}(\mathbf{K}[i], M_{\mathbf{b}[i]})$. Additionally, the corruption oracle Cor takes $i$ and returns the pair $(\mathbf{K}[i], \mathbf{b}[i])$. The adversary finally outputs a bit vector $\mathbf{b}'$, and wins if and only if $\mathbf{b} = \mathbf{b}'$. (It is equivalent to test that $\mathbf{b}[i] = \mathbf{b}'[i]$ for all uncorrupted $i$.) The advantage of adversary $A$ is $\mathrm{Adv}^{\mathrm{and}}_{\mathsf{SE},m}(A) = \Pr[\mathrm{AND}^{A}_{\mathsf{SE},m} \Rightarrow \mathsf{true}] - 2^{-m}$. We say that $A$ is a $(t, \mathbf{q}, q_c)$-adversary if it runs in time $t$, makes at most $\mathbf{q}[i]$ encryption queries of the form $\mathrm{Enc}(i, \cdot, \cdot)$, and makes at most $q_c$ corruption queries. Then we let $\mathrm{Adv}^{\mathrm{and}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c) = \max_A \mathrm{Adv}^{\mathrm{and}}_{\mathsf{SE},m}(A)$, where the maximum is over all $(t, \mathbf{q}, q_c)$-adversaries. This metric has many points in its favor. By (later) showing that security under it is implied by security under our preferred LORX metric, we automatically garner whatever value it offers. But the AND metric also has weaknesses that in our view make it inadequate as the primary choice. Namely, it does not capture the hardness of breaking all the uncorrupted instances. For example, an adversary that corrupts instances $1, \ldots, m - 1$ to get $\mathbf{b}[1], \ldots, \mathbf{b}[m - 1]$, makes a random guess $g$ for $\mathbf{b}[m]$, and returns $(\mathbf{b}[1], \ldots, \mathbf{b}[m - 1], g)$ has the high advantage $0.5 - 2^{-m}$ without breaking all instances. We prefer a metric where this adversary’s advantage is close to 0.

LORX. To overcome the above issue with the AND advantage, we introduce the XOR advantage measure and use it to define LORX. Game $\mathrm{LORX}_{\mathsf{SE},m}$ of Figure 2 makes its initial choices the same way as game $\mathrm{AND}_{\mathsf{SE},m}$ and provides the adversary with the same oracles. However, rather than a vector, the adversary must output a bit $b'$, and wins if this equals $\mathbf{b}[1] \oplus \cdots \oplus \mathbf{b}[m]$. (It is equivalent to test that $b' = \bigoplus_{i \in S} \mathbf{b}[i]$ where $S$ is the uncorrupted set.) The advantage of adversary $A$ is $\mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(A) = 2\Pr[\mathrm{LORX}^{A}_{\mathsf{SE},m} \Rightarrow \mathsf{true}] - 1$. We say that $A$ is a $(t, \mathbf{q}, q_c)$-adversary if it runs in time $t$, makes at most $\mathbf{q}[i]$ encryption queries of the form $\mathrm{Enc}(i, \cdot, \cdot)$, and makes at most $q_c$ corruption queries. Then we let $\mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c) = \max_A \mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(A)$, where the maximum is over all $(t, \mathbf{q}, q_c)$-adversaries. Returning to the example we gave for the AND case, if an adversary corrupts the first $m - 1$ instances to get back $\mathbf{b}[1], \ldots, \mathbf{b}[m - 1]$, makes a random guess $g$ for $\mathbf{b}[m]$, and outputs $b' = \mathbf{b}[1] \oplus \cdots \oplus \mathbf{b}[m - 1] \oplus g$, it will have advantage 0.
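The corrupt-and-guess example can be checked by exhaustive enumeration. The sketch below computes the exact AND and LORX advantages of the adversary that learns the first $m - 1$ challenge bits and guesses the last one with a fair coin. The function name is illustrative, and corruption is modeled simply by handing the adversary those bits.

```python
from fractions import Fraction
from itertools import product


def corrupt_and_guess_advantages(m: int):
    """Exact (AND advantage, LORX advantage) of the adversary that knows
    b[1..m-1] via corruption and guesses b[m] uniformly at random."""
    wins_and = wins_xor = total = 0
    for b in product((0, 1), repeat=m):   # every challenge vector
        for g in (0, 1):                  # the adversary's coin for b[m]
            total += 1
            guess_vec = b[:-1] + (g,)
            # AND game: the whole vector must match.
            wins_and += guess_vec == b
            # LORX game: only the xor of the bits must match.
            wins_xor += (sum(guess_vec) % 2) == (sum(b) % 2)
    adv_and = Fraction(wins_and, total) - Fraction(1, 2**m)
    adv_lorx = 2 * Fraction(wins_xor, total) - 1
    return adv_and, adv_lorx


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, corrupt_and_guess_advantages(m))
```

For every $m$ the first component is $1/2 - 2^{-m}$ and the second is exactly 0, matching the two examples in the text.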

RORX. A variant of the si LOR notion, ROR, was given in [5]. Here the adversary must distinguish between an encryption of a message $M$ it provides and the encryption of a random message of length $|M|$. This was shown equivalent to LOR up to a factor 2 in the advantages [5]. This leads us to define the mi analog RORX and ask how it relates to LORX. Game $\mathrm{RORX}_{\mathsf{SE},m}$ of Figure 2 makes its initial choices the same way as game $\mathrm{LORX}_{\mathsf{SE},m}$. The adversary is given access to oracle Enc that on input $i, M$ returns $\mathsf{E}(\mathbf{K}[i], M)$ if $\mathbf{b}[i] = 1$ and otherwise returns $\mathsf{E}(\mathbf{K}[i], M_1)$ where $M_1 \xleftarrow{\$} \{0,1\}^{|M|}$. It also gets the usual Cor oracle. It outputs a bit $b'$ and wins if this equals $\mathbf{b}[1] \oplus \cdots \oplus \mathbf{b}[m]$. The advantage of adversary $A$ is $\mathrm{Adv}^{\mathrm{rorx}}_{\mathsf{SE},m}(A) = 2\Pr[\mathrm{RORX}^{A}_{\mathsf{SE},m} \Rightarrow \mathsf{true}] - 1$. We say that $A$ is a $(t, \mathbf{q}, q_c)$-adversary if it runs in time $t$, makes at most $\mathbf{q}[i]$ encryption queries of the form $\mathrm{Enc}(i, \cdot)$, and makes at most $q_c$ corruption queries. Then we let $\mathrm{Adv}^{\mathrm{rorx}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c) = \max_A \mathrm{Adv}^{\mathrm{rorx}}_{\mathsf{SE},m}(A)$, where the maximum is over all $(t, \mathbf{q}, q_c)$-adversaries.

Discussion. The multi-user security goal from [4] gives rise to a version of the above games without corruptions and where all instances share the same challenge bit b, which the adversary tries to guess. But it is easy to see that this does not measure mi security, since recovering a single key, for example, suffices to learn b. The above approach extends naturally to providing a mi counterpart to any security definition based on a decisional game, where the adversary needs to guess a bit b. For example we may similarly create mi metrics of CCA security. Why does the model include corruptions? The following example may help illustrate. Suppose SE is entirely insecure when the key has first bit 0 and highly secure otherwise. (From the si perspective, it is insecure.) In the LORX game, an adversary will be able to figure out around half the challenge bits. If we disallow corruptions, it would still have very low advantage. From the application point of view, this seems to send the wrong message. We want LORX-security to mean that the probability of “large scale” damage is low. But breaking half the instances is pretty large scale. Allowing corruptions removes this defect because the adversary could corrupt the instances it could not break and then, having corrupted only around half the instances, get a very high advantage, breaking LORX-security. In this way, we may conceptually keep the focus on an adversary goal of breaking all instances, yet cover the case of breaking some threshold number via the corruption capability. An alternative way to address the above issue without corruptions is to define threshold metrics where the adversary wins by outputting a dynamically chosen set S and predicting the xor of the challenge bits for the indexes in S. This, again, has much going for it as a metric. But LORX with corruptions, as we define it, will imply security under this metric.
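The role of corruptions in this example can be simulated directly. The sketch below is a toy model, not the paper's formal game: `toy_encrypt` is a hypothetical scheme that returns the message in the clear when the first key bit is 0 and otherwise returns fresh random bytes (an idealized stand-in for a highly secure ciphertext), and the adversary corrupts exactly the instances it cannot break.

```python
import random

MSG_LEN = 16  # bytes; random ciphertexts will not look like M0 or M1 in practice


def toy_encrypt(key: bytes, msg: bytes, rng: random.Random) -> bytes:
    """Hypothetical scheme: insecure when the first key bit is 0."""
    if key[0] & 0x80 == 0:
        return msg                                   # weak key: leaks msg
    return bytes(rng.getrandbits(8) for _ in range(len(msg)))


def play_lorx(m: int, allow_corruptions: bool, rng: random.Random):
    """Play one LORX-style game. Returns (adversary_won, corruptions_used)."""
    keys = [bytes(rng.getrandbits(8) for _ in range(16)) for _ in range(m)]
    bits = [rng.getrandbits(1) for _ in range(m)]
    m0, m1 = b"\x00" * MSG_LEN, b"\xff" * MSG_LEN
    guess_xor = true_xor = corruptions = 0
    for i in range(m):
        true_xor ^= bits[i]
        ct = toy_encrypt(keys[i], (m0, m1)[bits[i]], rng)   # Enc(i, M0, M1)
        if ct == m0:
            guess = 0                                # instance broken
        elif ct == m1:
            guess = 1                                # instance broken
        elif allow_corruptions:
            guess = bits[i]                          # Cor(i) reveals b[i]
            corruptions += 1
        else:
            guess = rng.getrandbits(1)               # blind guess
        guess_xor ^= guess
    return guess_xor == true_xor, corruptions


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [play_lorx(16, True, rng) for _ in range(1000)]
    print("win rate with corruptions:", sum(w for w, _ in runs) / len(runs))
    print("mean corruptions:", sum(k for _, k in runs) / len(runs))
    runs = [play_lorx(16, False, rng) for _ in range(1000)]
    print("win rate without corruptions:", sum(w for w, _ in runs) / len(runs))
```

With corruptions the adversary always wins while corrupting about half of the instances. Without them its output is a fair coin unless every key happens to be weak.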

2.2 Relations

We provide formal result statements and proofs in support of the implications claimed in Figure 1.

LORX implies UKU. In the si setting, it is easy to see that LOR security implies KU security. The LOR adversary simply runs the KU adversary. When the latter makes oracle query $M$, the LOR adversary queries its own oracle with $M, M$ and returns the outcome to the KU adversary. When the latter returns a key $K'$, the LOR adversary submits a last oracle query consisting of a pair $M_0, M_1$ of random messages to get back a challenge ciphertext $C$, returning 1 if $\mathsf{D}(K', C) = M_1$ and 0 otherwise. A similar but slightly more involved proof shows that ROR implies KU. It is important to establish analogs of these basic results in the mi setting, for they function as “tests” for the validity of our mi notions. The following shows that LORX security implies UKU. Interestingly, it is not as simple to establish in the mi case as in the si case. Also, as we will see later, the proof that RORX implies UKU is not only even more involved but incurs a factor $2^m$ loss, making LORX a better choice as the metric to target in designs.

Theorem 2.1 [LORX ⇒ UKU] Let SE = (K, E, D) be a symmetric encryption scheme with message

and $B$ choosing $\mathbf{c} = \mathbf{c}'$. Similarly, let $q(\mathbf{b}')$ be the probability that $A$ wins $\mathrm{LORX}_{\mathsf{SE},m}$ conditioned on $\mathbf{b} = \mathbf{b}'$ in the game. It is easy to see that $q(\mathbf{b}') = p(1^m, \mathbf{b}')$ for every $\mathbf{b}' \in \{0,1\}^m$, where $1^m$ is the all-one vector. This is because when $\mathbf{b} = 1^m$ in $\mathrm{RORX}_{\mathsf{SE},m}$, a query $\mathrm{Enc}(i, M_0, M_1)$ by the simulated $A$ is answered with $\mathrm{Enc}(i, M_{\mathbf{c}[i]}) = \mathrm{Enc}(i, M_{\mathbf{b}'[i]})$ by $B$. Moreover, again in the case where $\mathbf{b} = 1^m$, the adversary $B$ wins $\mathrm{RORX}_{\mathsf{SE},m}$ if $A$ outputs a bit $b'$ such that $b' \oplus \bigoplus_i \mathbf{b}'[i] \oplus (m \bmod 2) = \bigoplus_i \mathbf{b}[i] = m \bmod 2$, which is equivalent to $b' = \bigoplus_i \mathbf{b}'[i]$.

We also note that for all vectors $\mathbf{b}', \mathbf{c}', \mathbf{c}'' \in \{0,1\}^m$, if there exists $i \in [1..m]$ such that $\mathbf{b}'[i] = 0$, $\mathbf{c}'[i] = 0$, and $\mathbf{c}''[i] = 1$, then $p(\mathbf{b}', \mathbf{c}') + p(\mathbf{b}', \mathbf{c}'') = 1$. This is due to the fact that the probability that $A$, when run by $B$ using $\mathbf{c} = \mathbf{c}'$ in $\mathrm{RORX}_{\mathsf{SE},m}$ with $\mathbf{b} = \mathbf{b}'$, outputs a certain bit $b'$ is the same as when $B$ runs with $\mathbf{c} = \mathbf{c}''$, because $A$’s queries $\mathrm{Enc}(i, M_0, M_1)$ are answered with the encryption of a random plaintext in both cases. However, the bits output by $B$ when $\mathbf{c} = \mathbf{c}'$ and when $\mathbf{c} = \mathbf{c}''$ are exactly the complement of each other, and hence $p(\mathbf{b}', \mathbf{c}'') = 1 - p(\mathbf{b}', \mathbf{c}')$. Further note that, as we can always pair the strings $\mathbf{c}' \in \{0,1\}^m$ so that they differ in exactly one component (just take any perfect matching of the $m$-dimensional hypercube), $\sum_{\mathbf{c}' \in \{0,1\}^m} p(\mathbf{b}', \mathbf{c}') = 2^{m-1}$ holds for all $\mathbf{b}' \neq 1^m$.

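Putting pieces together, and using $p(\cdot, \cdot)$ and $q(\cdot)$ to express winning probabilities,

$$
\begin{aligned}
\Pr\left[\mathrm{RORX}^{B}_{\mathsf{SE},m} \Rightarrow \mathsf{true}\right]
&= \frac{1}{2^{2m}} \sum_{\mathbf{b}', \mathbf{c}' \in \{0,1\}^m} p(\mathbf{b}', \mathbf{c}') \\
&= \frac{1}{2^{2m}} \left[ \sum_{\mathbf{c}' \in \{0,1\}^m} p(1^m, \mathbf{c}') + \sum_{\mathbf{b}' \neq 1^m} \sum_{\mathbf{c}' \in \{0,1\}^m} p(\mathbf{b}', \mathbf{c}') \right] \\
&= \frac{1}{2^{2m}} \left[ \sum_{\mathbf{b}' \in \{0,1\}^m} q(\mathbf{b}') + (2^m - 1) \cdot 2^{m-1} \right] \\
&= \frac{1}{2^m} \Pr\left[\mathrm{LORX}^{A}_{\mathsf{SE},m} \Rightarrow \mathsf{true}\right] + \frac{1}{2} - \frac{1}{2^{m+1}} .
\end{aligned}
$$

Rearranging terms yields $2^m \cdot \mathrm{Adv}^{\mathrm{rorx}}_{\mathsf{SE},m}(B) = \mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(A)$. It is easy to see that $B$ is a $(t', \mathbf{q}, q_c)$-distinguisher, and the theorem follows by maximizing over all $(t, \mathbf{q}, q_c)$-distinguishers $A$.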
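The final rearrangement is plain arithmetic and can be checked with exact rationals. The sketch below reads the last line of the derivation as $\Pr[\mathrm{RORX}^B] = 2^{-m}\Pr[\mathrm{LORX}^A] + 1/2 - 2^{-(m+1)}$ and confirms that the two advantages then differ by exactly the factor $2^m$. The function names are illustrative.

```python
from fractions import Fraction


def rorx_win_probability(p_lorx: Fraction, m: int) -> Fraction:
    """Pr[RORX^B => true] expressed through Pr[LORX^A => true]."""
    return p_lorx / 2**m + Fraction(1, 2) - Fraction(1, 2**(m + 1))


def advantage(p_win: Fraction) -> Fraction:
    """Advantage 2*Pr[win] - 1 used by both LORX and RORX."""
    return 2 * p_win - 1


def constant_term_matches(m: int) -> bool:
    """Check (2^m - 1) * 2^(m-1) / 2^(2m) == 1/2 - 1/2^(m+1)."""
    lhs = Fraction((2**m - 1) * 2**(m - 1), 2**(2 * m))
    return lhs == Fraction(1, 2) - Fraction(1, 2**(m + 1))


def loss_is_two_to_the_m(p_lorx: Fraction, m: int) -> bool:
    """Check 2^m * Adv_rorx(B) == Adv_lorx(A)."""
    adv_rorx = advantage(rorx_win_probability(p_lorx, m))
    return 2**m * adv_rorx == advantage(p_lorx)


if __name__ == "__main__":
    for m in (1, 3, 10):
        print(m, constant_term_matches(m),
              loss_is_two_to_the_m(Fraction(3, 4), m))
```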

We omit the much simpler proof of the converse.

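Theorem 2.3 [LORX ⇒ RORX] Let $\mathsf{SE} = (\mathsf{K}, \mathsf{E}, \mathsf{D})$ be a symmetric encryption scheme. For all $m, t, q_c > 0$ and all vectors $\mathbf{q}$ we have $\mathrm{Adv}^{\mathrm{rorx}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c) \le \mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(t', \mathbf{q}, q_c)$, where $t' = t + O(1)$.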

LORX implies AND. Intuitively, one might expect AND security to be a stronger requirement than LORX security, as the former seems easier to break than the latter. However we show that under a fairly minimal requirement, LORX implies AND. This brings another argument in support of LORX: Even if an application requires AND security, it turns out that proving LORX security is generally sufficient. The proof relies on the following probabilistic lemma, due to Unger [41].

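Lemma 2.4 [41] Let $Y_1, \ldots, Y_m \in \{0,1\}$ be random variables such that there exist $\beta_1, \ldots, \beta_m \in [-1, 1]$ and $C, \gamma > 0$ with

$$\Pr\Big[\bigoplus_{i \in S} Y_i = 0\Big] \le \frac{1 + C \cdot \prod_{i \in S} \beta_i + \gamma}{2}$$

for all $S \subseteq \{1, \ldots, m\}$. Then

$$\Pr\Big[\sum_{i=1}^{m} Y_i = 0\Big] \le \gamma + C \cdot \prod_{i=1}^{m} \frac{1 + \beta_i}{2} .$$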

The following theorem is to be interpreted as follows: In general, if we only know that $\mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c)$ is small, we do not know how to prove that $\mathrm{Adv}^{\mathrm{and}}_{\mathsf{SE},m}(t', \mathbf{q}, q_c)$ is also small (for $t' \approx t$), or whether this is true at all. As we sketched above, the reason is that we do not know how to use an adversary $A$ for which the $\mathrm{AND}_{\mathsf{SE},m}$ advantage is large to construct an adversary for which the $\mathrm{LORX}_{\mathsf{SE},m}$ advantage is large. Still, one would expect that such an adversary might more easily yield one for which the $\mathrm{LORX}_{\mathsf{SE},k}$ advantage is sufficiently large, for some $k \le m$. The following theorem uses the above probabilistic lemma to confirm this intuition.

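Theorem 2.5 Let $\mathsf{SE} = (\mathsf{K}, \mathsf{E}, \mathsf{D})$ be a symmetric encryption scheme. Further, let $m$, $t$, $\mathbf{q}$, and $q_c$ be given, and assume that there exist $C$, $\epsilon$, and $\gamma$ such that for all $1 \le i \le m$,

$$\max_{S \subseteq \{1, \ldots, m\},\ |S| = i} \mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},i}(t^*_S, \mathbf{q}[S], q_c) \le C \cdot \epsilon^{i} + \gamma ,$$

where $\mathbf{q}[S]$ is the projection of $\mathbf{q}$ on the components in $S$, and $t^*_S = t + O\big(t_{\mathsf{E}} \cdot \sum_{i \notin S} \mathbf{q}[i]\big)$, with $t_{\mathsf{E}}$ denoting the running time needed for one encryption with $\mathsf{E}$. Then

$$\mathrm{Adv}^{\mathrm{and}}_{\mathsf{SE},m}(t, \mathbf{q}, q_c) \le \gamma + C \cdot \prod_{i=1}^{m} \frac{1 + \epsilon}{2} .$$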

Does the converse also hold true? It is worth mentioning that in general we are not able to prove that AND implies LORX. Still, we note that in the corruption-free case, one can upper bound $\mathrm{Adv}^{\mathrm{lorx}}_{\mathsf{SE},m}(t, \mathbf{q}, 0)$ in terms of $\mathrm{Adv}^{\mathrm{and}}_{\mathsf{SE},m'}(t', \mathbf{q}', 0)$ for $m' \approx 2m$ and with $t'$ and $\mathbf{q}'$ much larger than $t$ and $\mathbf{q}$. The proof, which we omit, follows the lines of the proof of the XOR Lemma from the Direct Product Theorem given by Goldreich, Nisan, and Wigderson [19], and relies on the Goldreich-Levin theorem [18]. As the loss in concrete security in this reduction is very large, and it only holds in the corruption-free setting, we find this an additional argument to support the usage of the LORX metric.

3 Password-based Encryption via KDFs

We now turn to our main motivating application, that of password based encryption (PBE) as specified in PKCS#5 [38]. The schemes specified there combine a conventional mode of operation (e.g., CBC mode) with a password-based key derivation function (KDF). We start with formalizing the latter.

Password-based KDFs. Formally, a $(k, s, c)$-KDF is a deterministic map $\mathsf{KD}: \{0,1\}^* \times \{0,1\}^s \to \{0,1\}^k$ that may make use of an underlying ideal primitive. Here $c$ is the iteration count, which specifies the multiplicative increase in work that should slow down brute-force attacks. PKCS#5 describes two KDFs [38]. We treat the first in detail and discuss the second in Appendix E. Let $\mathsf{KD1}^H(pw, sa) = H^c(pw \,\|\, sa)$, where $H^c$ is the function that composes $H$ with itself $c$ times. To generalize beyond concatenation, we can define a function $\mathrm{Encode}(pw, sa)$ that describes how to encode its inputs onto $\{0,1\}^*$ with efficiently computable inverse $\mathrm{Decode}(W)$.
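As a concrete illustration of $H^c(pw \,\|\, sa)$, here is a minimal sketch that iterates a hash function $c$ times over the concatenation of password and salt. SHA-256 is used purely as a stand-in for the random oracle $H$; that choice, the function name `kd1`, and the byte-string interface are assumptions of this example rather than anything fixed by the text.

```python
import hashlib


def kd1(pw: bytes, sa: bytes, c: int, hash_name: str = "sha256") -> bytes:
    """Compute H^c(pw || sa): apply the hash c times to pw || sa.

    hash_name selects the stand-in for H; any hashlib algorithm works.
    """
    if c < 1:
        raise ValueError("iteration count c must be at least 1")
    w = pw + sa                      # Encode(pw, sa) as plain concatenation
    for _ in range(c):
        w = hashlib.new(hash_name, w).digest()
    return w


if __name__ == "__main__":
    key = kd1(b"correct horse", b"\x01" * 16, c=1000)
    print(key.hex())
```

Because the salt has fixed length $s$, plain concatenation with the salt last can be decoded unambiguously, which is the property the Encode/Decode generalization asks for.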

PBE schemes. A PBE scheme is just a symmetric encryption scheme where we view the keys as passwords and key generation as a password sampling algorithm. To highlight when we are thinking of key generation as password sampling we will use $\mathsf{P}$ to denote key generation (instead of $\mathsf{K}$). We will also write $pw$ for a key that we think of as a password. Let $\mathsf{KD}$ be a $(k, s, c)$-KDF and let $\mathsf{SE} = (\mathsf{K}, \mathsf{E}, \mathsf{D})$ be an encryption scheme with $\mathsf{K}$ outputting uniformly selected $k$-bit keys. Then we define the PBE scheme $\overline{\mathsf{SE}}[\mathsf{KD}, \mathsf{SE}] = (\mathsf{P}, \overline{\mathsf{E}}, \overline{\mathsf{D}})$ as follows. Encryption $\overline{\mathsf{E}}(pw, M)$ is done via $sa \xleftarrow{\$} \{0,1\}^s$; $K \leftarrow \mathsf{KD}(pw, sa)$; $C \xleftarrow{\$} \mathsf{E}(K, M)$, returning $(sa, C)$ as the ciphertext. Decryption $\overline{\mathsf{D}}$ recomputes the key $K$ by reapplying the KDF to $pw$ and the salt $sa$ carried in the ciphertext, and then applies $\mathsf{D}$. If the KDF is $\mathsf{KD1}$ and the encryption scheme is CBC mode, then one obtains the first PBE scheme from PKCS#5 [38].
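The composition can be sketched in a few lines. This is an illustration under loud assumptions: the KDF is an iterated SHA-256 standing in for $H^c$, and the underlying scheme is a toy hash-based stream cipher invented for this example, not CBC mode and not suitable for real use. All function names are hypothetical.

```python
import hashlib
import os


def kd(pw: bytes, sa: bytes, c: int) -> bytes:
    """Iterated hash H^c(pw || sa), with SHA-256 standing in for H."""
    w = pw + sa
    for _ in range(c):
        w = hashlib.sha256(w).digest()
    return w


def _keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        block = key + nonce + counter.to_bytes(8, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return bytes(out[:n])


def toy_encrypt(key: bytes, msg: bytes) -> bytes:
    """Toy randomized encryption E(K, M); a placeholder for a real mode."""
    nonce = os.urandom(16)
    ks = _keystream(key, nonce, len(msg))
    return nonce + bytes(a ^ b for a, b in zip(msg, ks))


def toy_decrypt(key: bytes, ct: bytes) -> bytes:
    """Inverse of toy_encrypt."""
    nonce, body = ct[:16], ct[16:]
    ks = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, ks))


def pbe_encrypt(pw: bytes, msg: bytes, salt_len: int = 16, c: int = 1000):
    """Pick a fresh salt, derive K = KD(pw, sa), return (sa, E(K, msg))."""
    sa = os.urandom(salt_len)
    return sa, toy_encrypt(kd(pw, sa, c), msg)


def pbe_decrypt(pw: bytes, ciphertext, c: int = 1000) -> bytes:
    """Re-derive K from pw and the stored salt, then decrypt."""
    sa, body = ciphertext
    return toy_decrypt(kd(pw, sa, c), body)


if __name__ == "__main__":
    ct = pbe_encrypt(b"hunter2", b"attack at dawn")
    print(pbe_decrypt(b"hunter2", ct))
```

The salt travels in the clear with the ciphertext, so decryption needs only the password and the iteration count.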

Password guessing. We aim to show that security of the above constructions holds up to the amount of work required to brute-force the passwords output by $\mathsf{P}$. This raises the question of how we measure the strength of a password sampler. We will formalize the hardness of guessing passwords output by some sampler $\mathsf{P}$ via an adaptive guessing game: it challenges an adversary with guessing passwords adaptively in a setting where the attacker may also adaptively learn some passwords via a corruption oracle. Concretely, let $\mathrm{GUESS}_{\mathsf{P},m}$ be the game defined in Figure 3. A $(q_t, q_c)$-guessing adversary is one that makes at most $q_t$ queries to Test and $q_c$ queries to Cor. An adversary $B$’s guessing advantage is $\mathrm{Adv}^{\mathrm{guess}}_{\mathsf{P},m}(B) = \Pr[\mathrm{GUESS}^{B}_{\mathsf{P},m} \Rightarrow \mathsf{true}]$. We assume without loss of generality that $B$ does not make any pointless queries: (1) repeated queries to Cor on the same value; (2) a query $\mathrm{Test}(i, \cdot)$ following a query of $\mathrm{Cor}(i)$; and (3) a query $\mathrm{Cor}(i)$ after a query $\mathrm{Test}(i, pw)$ that returned true. We also define a variant of the above guessing game that includes salts and allows an attacker to test password-salt pairs against all $m$ instances simultaneously. This will be useful as an intermediate step when reducing to guessing advantage. The game $\mathrm{saGUESS}_{\mathsf{P},m,\rho}$ is shown in Figure 3 and we define advantage via $\mathrm{Adv}^{\mathrm{sa\text{-}guess}}_{\mathsf{P},m,\rho}(B) = \Pr[\mathrm{saGUESS}^{B}_{\mathsf{P},m,\rho} \Rightarrow \mathsf{true}]$. An easy argument proves the following lemma.

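Lemma 3.1 Let $m, \rho > 0$ and let $\mathsf{P}$ be a password sampler. Let $A$ be a $(q_t, q_c)$-guessing $\mathrm{saGUESS}_{\mathsf{P},m,\rho}$ adversary. Then there exists a $(q_t, q_c)$-guessing $\mathrm{GUESS}_{\mathsf{P},m}$ adversary $B$ such that $\mathrm{Adv}^{\mathrm{sa\text{-}guess}}_{\mathsf{P},m,\rho}(A) \le \mathrm{Adv}^{\mathrm{guess}}_{\mathsf{P},m}(B) + m^2 \rho^2 / 2^s$.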
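To see how quickly the additive term in Lemma 3.1 vanishes with the salt length, the sketch below evaluates it exactly, reading the term as $m^2 \rho^2 / 2^s$. The function name and the sample parameters are illustrative, and $\rho$ is treated simply as the integer parameter of the saGUESS game, since this excerpt does not spell out its role.

```python
from fractions import Fraction


def salt_collision_term(m: int, rho: int, s: int) -> Fraction:
    """The additive term m^2 * rho^2 / 2^s for s-bit salts."""
    return Fraction(m * m * rho * rho, 2**s)


if __name__ == "__main__":
    # Hypothetical parameters: 2**20 instances, rho = 2**10.
    for s in (64, 96, 128):
        print(s, float(salt_collision_term(2**20, 2**10, s)))
```

For example, with $m = 2^{20}$, $\rho = 2^{10}$, and 128-bit salts the term is $2^{-68}$.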

main Real_{KD,M,r}
    (pw, sa) ←$ M(r)
    For i = 1 to r do K[i] ←$ KD^H(pw[i], sa[i])
    b′ ←$ D^Prim(pw, sa, K)
    Ret b′

proc. Prim(X)
    Ret H(X)

main Ideal_{S,M,r}
    (pw, sa) ←$ M(r)
    For i = 1 to r do K[i] ←$ {0,1}^k
    b′ ←$ D^Prim(pw, sa, K)
    Ret b′

proc. Prim(X)
    Ret S^Test(X)

sub. Test(pw, sa)
    For i = 1 to r do
        If (pw[i], sa[i]) = (pw, sa) then Ret K[i]
    Ret ⊥

Figure 4: Games for the simulation-based security notion for KDFs.

having $r$ elements and with $|\mathbf{sa}[i]| = s$ for $1 \le i \le r$. A simulator $S$ is a randomized, stateful procedure. It expects oracle access to a procedure Test to which it can query a message. Game $\mathrm{Real}_{\mathsf{KD},\mathsf{M},r}$ gives a distinguisher $D$ the messages and associated derived keys. Also, $D$ can adaptively query the ideal primitive $H$ underlying $\mathsf{KD}$. Game $\mathrm{Ideal}_{S,\mathsf{M},r}$ gives $D$ the messages and keys chosen uniformly at random. Now $D$ can adaptively query a primitive oracle implemented by a simulator $S$ that, itself, has access to a Test oracle. Then we define KDF advantage by

$$\mathrm{Adv}^{\mathrm{kdf}}_{\mathsf{KD},\mathsf{M},r}(D, S) = \Pr\left[\mathrm{Real}^{D}_{\mathsf{KD},\mathsf{M},r} \Rightarrow 1\right] - \Pr\left[\mathrm{Ideal}^{D}_{S,\mathsf{M},r} \Rightarrow 1\right] .$$

To be useful, we will require proving that there exists a simulator $S$ such that for any $D$, $\mathsf{M}$ pair the KDF advantage is “small”. This notion is equivalent to applying the indifferentiability framework [29] to a particular ideal KDF functionality. That functionality chooses messages according to an algorithm $\mathsf{M}$ and outputs on its honest interface the messages and uniform keys associated to them. On the adversarial interface is the test routine which allows the simulator to learn keys associated to messages. This raises the question of why not just use indifferentiability from a RO as our target security notion. The reasons are two-fold. First, it is not clear that $H^c$ is indifferentiable from a random oracle. Second, even if it were, a proof would seem to require a simulator that makes at least the same number of queries to the RO as it receives from the distinguisher. This rules out showing security amplification due to the iteration count $c$. Our approach solves both issues, since we will show KDF security for simulators that make one call to Test for every $c$ queries made to them. For example, our simulator for $\mathsf{KD1}$ will only query Test if a chain of $c$ hashes leads to the being-queried point $X$ and this chain is not a continuation of some longer chain. We formally capture this property of simulators next.

c-amplifying simulators. Let $\tau = (X_1, Y_1), \ldots, (X_q, Y_q)$ be a (possibly partial) transcript of Prim queries and responses. We restrict attention to (k, s, c)-KDFs for which we can define a predicate $\mathrm{final}_{\mathrm{KD}}(X_i, \tau)$ which evaluates to true if there exists exactly one sequence of c indices $j_1 < \cdots < j_c$ such that (1) $j_c = i$, (2) there exist unique (pw, sa) such that evaluating $\mathrm{KD}^H(pw, sa)$ when H is such that $Y_j = H(X_j)$ for $1 \le j \le i$ results exactly in the queries $X_{j_1}, \ldots, X_{j_c}$ in any order where $X_i$ is the last query, and (3) $\mathrm{final}_{\mathrm{KD}}(X_{j_r}, \tau) = \mathrm{false}$ for all $r < c$. Our simulators only query Test on queries $X_i$ for which $\mathrm{final}_{\mathrm{KD}}(X_i, \tau) = \mathrm{true}$; we call such queries KD-completion queries, and simulators satisfying this are called c-amplifying. Note that (3) implies that there are at most q/c total KD-completion queries in a q-query transcript.

Hash-dependent passwords. We do not allow M access to the random oracle H. This removes from consideration hash-dependent passwords. Our results should extend to cover hash-dependent passwords if one has explicit domain separation between use of H during password selection and during key derivation. Otherwise, an indifferentiability-style approach as we use here will not work due to limitations pointed out in [39]. A full analysis of the hash-dependent password setting would therefore appear to require direct analysis of PBE schemes without taking advantage of the modularity provided by simulation-based approaches.

Security of KD1. For a message sampler M, let $\gamma(M, r) := \Pr[\exists\, i \ne j : (pw[i], sa[i]) = (pw[j], sa[j])]$ where $(pw, sa) \xleftarrow{\$} M(r)$. We prove the following theorem in Appendix G.

Theorem 3.3 Fix r > 0. Let KD1 be as above. There exists a simulator S such that for all adversaries D making q RO queries, of which $q_c$ are chain completion queries, and all message samplers M,

$$\mathbf{Adv}^{\mathrm{kdf}}_{\mathrm{KD1},M,r}(D, S) \le 4\,\gamma(M, r) + \frac{2r^2 + 7(2q + rc)^2}{2^n}.$$

The simulator S makes at most $q_c$ Test queries, and answers each query in time O(c).
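To get a feel for the magnitude of the bound in Theorem 3.3, the following Python sketch evaluates its right-hand side exactly. The function names and sample parameters are illustrative and not from the paper. The helper that bounds γ by r²/2^s assumes uniformly random s-bit salts, mirroring the bound on γ stated with Theorem 3.4.

```python
from fractions import Fraction
import math


def kd1_advantage_bound(gamma, q, r, c, n):
    """Right-hand side of Theorem 3.3: 4*gamma + (2r^2 + 7(2q + rc)^2) / 2^n."""
    return 4 * Fraction(gamma) + Fraction(2 * r**2 + 7 * (2 * q + r * c) ** 2, 2**n)


def salt_collision_bound(r, s):
    """Assumed bound gamma <= r^2 / 2^s for uniformly random s-bit salts (capped at 1)."""
    return min(Fraction(1), Fraction(r * r, 2**s))


if __name__ == "__main__":
    # Hypothetical parameters: SHA-256 (n = 256), 128-bit salts,
    # r = 2^20 derived keys, c = 10,000 iterations, q = 2^80 hash queries.
    n, s, r, c, q = 256, 128, 2**20, 10_000, 2**80
    bound = kd1_advantage_bound(salt_collision_bound(r, s), q, r, c, n)
    print(f"log2(bound) = {math.log2(bound):.1f}")
```

Exact rational arithmetic is used so that very small terms are not lost to floating-point rounding before the final logarithm is taken.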

3.2 Security of PBE

We are now, finally, in a position to analyze the security of password-based encryption as used in PKCS#5. The following theorem uses the multi-user left-or-right security notion from [4], which measures security when the adversary is given access to multiple left-or-right oracles, each using the same bit b. It is formalized in Appendix D. The proof of the theorem appears in Appendix H.

Theorem 3.4 Let m ≥ 1, let $\overline{SE}[\mathrm{KD}, SE] = (P, \overline{E}, \overline{D})$ be the encryption scheme built from a (k, s, c)-KDF KD and an encryption scheme SE = (K, E, D) with k-bit keys. Let A be an adversary making ρ queries to Enc(i, ·, ·) for each i ∈ {1, ..., m} and making at most $q_c < m$ corruption queries. Let S be a c-amplifying simulator. Then there exist a message sampler M and adversaries D, C, and B such that

$$\mathbf{Adv}^{\mathrm{lorx}}_{\overline{SE},m}(A) \le m \cdot \mathbf{Adv}^{\mathrm{mu\text{-}lor}}_{SE,\rho}(C) + 2 \cdot \mathbf{Adv}^{\mathrm{guess}}_{P,m,\rho}(B) + 2 \cdot \mathbf{Adv}^{\mathrm{kdf}}_{\mathrm{KD},M,m\rho}(D, S).$$

If A makes q queries to H, then: D makes at most q queries to its H oracle; B makes at most ⌈q/c⌉ queries to Test and at most $q_c$ corruption queries; and C makes a single query Enc(i, ·, ·) for each 1 ≤ i ≤ ρ. Moreover, C’s running time equals $t_A + q \cdot t_S$ plus a small absolute constant, where $t_A$ is the running time of A and $t_S$ is the time needed by S to answer a query. Finally, $\gamma(M, m\rho) \le m^2\rho^2 / 2^s$.

Note that the theorem holds even when SE is only one-time secure (meaning it can be deterministic), which implies that the analysis covers tools such as WinZip (cf. [25]). In terms of the bound we achieve, Theorem 3.3 for KD1 shows that an adversary that makes $\mathbf{Adv}^{\mathrm{kdf}}_{\mathrm{KD},P^*,m\rho}(D, S)$ large requires $q \approx 2^{n/2}$ queries to H, provided salts are large. If H is SHA-256 then this is about $2^{128}$ work. Likewise, a good choice of SE will ensure that $\mathbf{Adv}^{\mathrm{mu\text{-}lor}}_{SE,K,\rho}(C)$ is very small. Thus the dominating term ends up being the guessing advantage of B against P, which measures its ability to guess $m - q_c$ passwords.

References

[1] M. Abadi and B. Warinschi. Password-based encryption analyzed. In L. Caires, G. F. Italiano, L. Monteiro, C. Palamidessi, and M. Yung, editors, ICALP 2005: 32nd International Colloquium on Automata, Languages and Programming, volume 3580 of Lecture Notes in Computer Science, pages 664–676. Springer, July 2005.

[2] M. Abdalla, D. Catalano, C. Chevalier, and D. Pointcheval. Efficient two-party password-based key exchange protocols in the UC framework. In T. Malkin, editor, Topics in Cryptology – CT-RSA 2008, volume 4964 of Lecture Notes in Computer Science, pages 335–351. Springer, Apr. 2008.

[3] O. Baudron, D. Pointcheval, and J. Stern. Extended notions of security for multicast public key cryptosystems. In U. Montanari, J. D. P. Rolim, and E. Welzl, editors, ICALP 2000: 27th International Colloquium on Automata, Languages and Programming, volume 1853 of Lecture Notes in Computer Science, pages 499–511. Springer, July 2000.

[4] M. Bellare, A. Boldyreva, and S. Micali. Public-key encryption in a multi-user setting: Security proofs and improvements. In B. Preneel, editor, Advances in Cryptology – EUROCRYPT 2000, volume 1807 of Lecture Notes in Computer Science, pages 259–274. Springer, May 2000.

[5] M. Bellare, A. Desai, E. Jokipii, and P. Rogaway. A concrete security treatment of symmetric encryption. In 38th Annual Symposium on Foundations of Computer Science, pages 394–403. IEEE Computer Society Press, Oct. 1997.

[6] M. Bellare, D. Pointcheval, and P. Rogaway. Authenticated key exchange secure against dictionary attacks. In B. Preneel, editor, Advances in Cryptology – EUROCRYPT 2000, volume 1807 of Lecture Notes in Computer Science, pages 139–155. Springer, May 2000.

[24] J. Katz, R. Ostrovsky, and M. Yung. Efficient password-authenticated key exchange using human-memorable passwords. In B. Pfitzmann, editor, Advances in Cryptology – EUROCRYPT 2001, volume 2045 of Lecture Notes in Computer Science, pages 475–494. Springer, May 2001.

[25] T. Kohno. Attacking and repairing the WinZip encryption scheme. In V. Atluri, B. Pfitzmann, and P. McDaniel, editors, ACM CCS 04: 11th Conference on Computer and Communications Security, pages 72–81. ACM Press, Oct. 2004.

[26] H. Krawczyk. Cryptographic extraction and key derivation: The HKDF scheme. In T. Rabin, editor, Advances in Cryptology – CRYPTO 2010, volume 6223 of Lecture Notes in Computer Science, pages 631–648. Springer, Aug. 2010.

[27] M. Luby and C. Rackoff. A study of password security. In C. Pomerance, editor, Advances in Cryptology – CRYPTO’87, volume 293 of Lecture Notes in Computer Science, pages 392–397. Springer, Aug. 1988.

[28] U. M. Maurer, K. Pietrzak, and R. Renner. Indistinguishability amplification. In A. Menezes, editor, Advances in Cryptology – CRYPTO 2007, volume 4622 of Lecture Notes in Computer Science, pages 130–149. Springer, Aug. 2007.

[29] U. M. Maurer, R. Renner, and C. Holenstein. Indifferentiability, impossibility results on reductions, and applications to the random oracle methodology. In M. Naor, editor, TCC 2004: 1st Theory of Cryptography Conference, volume 2951 of Lecture Notes in Computer Science, pages 21–39. Springer, Feb. 2004.

[30] U. M. Maurer and S. Tessaro. Computational indistinguishability amplification: Tight product theorems for system composition. In S. Halevi, editor, Advances in Cryptology – CRYPTO 2009, volume 5677 of Lecture Notes in Computer Science, pages 355–373. Springer, Aug. 2009.

[31] U. M. Maurer and S. Tessaro. A hardcore lemma for computational indistinguishability: Security amplification for arbitrarily weak PRGs with optimal stretch. In D. Micciancio, editor, TCC 2010: 7th Theory of Cryptography Conference, volume 5978 of Lecture Notes in Computer Science, pages 237–254. Springer, Feb. 2010.

[32] R. Morris and K. Thompson. Password security: a case history. Commun. ACM, 22:594–597, November 1979.

[33] S. Myers. Efficient amplification of the security of weak pseudo-random function generators. Journal of Cryptology, 16(1):1–24, Jan. 2003.

[34] A. Narayanan and V. Shmatikov. Fast dictionary attacks on passwords using time-space tradeoff. In V. Atluri, C. Meadows, and A. Juels, editors, ACM CCS 05: 12th Conference on Computer and Communications Security, pages 364–372. ACM Press, Nov. 2005.

[35] P. Oechslin. Making a faster cryptanalytic time-memory trade-off. In D. Boneh, editor, Advances in Cryptology – CRYPTO 2003, volume 2729 of Lecture Notes in Computer Science, pages 617–630. Springer, Aug. 2003.

[36] A. Panconesi and A. Srinivasan. Randomized distributed edge coloring via an extension of the Chernoff-Hoeffding bounds. SIAM J. Comput., 26(2):350–368, 1997.

[37] K. G. Paterson and D. Stebila. One-time-password-authenticated key exchange. In R. Steinfeld and P. Hawkes, editors, ACISP 10: 15th Australasian Conference on Information Security and Privacy, volume 6168 of Lecture Notes in Computer Science, pages 264–281. Springer, July 2010.

[38] PKCS #5: Password-based cryptography standard (RFC 2898). RSA Data Security, Inc., Sept. 2000. Version 2.0.

[39] T. Ristenpart, H. Shacham, and T. Shrimpton. Careful with composition: Limitations of the indifferentiability framework. In K. G. Paterson, editor, Advances in Cryptology – EUROCRYPT 2011, volume 6632 of Lecture Notes in Computer Science, pages 487–506. Springer, May 2011.

[40] S. Tessaro. Security amplification for the cascade of arbitrarily weak PRPs: Tight bounds via the interactive hardcore lemma. In Y. Ishai, editor, TCC 2011: 8th Theory of Cryptography Conference, volume 6597 of Lecture Notes in Computer Science, pages 37–54. Springer, Mar. 2011.

[41] F. Unger. A probabilistic inequality with applications to threshold direct-product theorems. In 50th Annual Symposium on Foundations of Computer Science, pages 221–229. IEEE Computer Society Press, Oct. 2009.

[42] D. Wagner and I. Goldberg. Proofs of security for the Unix password hashing algorithm. In T. Okamoto, editor, Advances in Cryptology – ASIACRYPT 2000, volume 1976 of Lecture Notes in Computer Science, pages 560–572. Springer, Dec. 2000.

[43] A. C. Yao. Theory and applications of trapdoor functions. In 23rd Annual Symposium on Foundations of Computer Science, pages 80–91. IEEE Computer Society Press, Nov. 1982.

[44] F. F. Yao and Y. L. Yin. Design and analysis of password-based key derivation functions. In A. Menezes, editor, Topics in Cryptology – CT-RSA 2005, volume 3376 of Lecture Notes in Computer Science, pages 245–261. Springer, Feb. 2005.

A Brute-force password-recovery attacks

Here we provide some more details on the (well-known) attacks on PBE mentioned in Section 1.

The si case. Recall that we encrypt a message M under a password pw by picking a random s-bit salt sa, deriving a key $L \leftarrow \mathrm{KD}(pw \| sa)$ and returning $C' \leftarrow C \| sa$ where $C \xleftarrow{\$} E(L, M)$. Here E is a symmetric encryption scheme, typically an IND-CPA AES mode of operation, and the key-derivation function (KDF) $\mathrm{KD}: \{0,1\}^* \to \{0,1\}^n$ is the c-fold iteration $\mathrm{KD} = H^c$ of a cryptographic hash function $H: \{0,1\}^* \to \{0,1\}^n$. We assume the attacker has access to an oracle TestKey which on input a candidate key L′ returns (L = L′), where L is the key derived under the target password pw and target salt sa via $L \leftarrow \mathrm{KD}(pw \| sa)$. (For example, the attacker may know a message M and corresponding ciphertext $C \xleftarrow{\$} E(L, M)$ and could test whether D(L′, C) = M, where D is the decryption algorithm corresponding to E. We abstract this capability via the oracle.) The attacker is given sa. The brute-force attack now creates a table T with $T[pw'] = H^c(pw' \| sa)$ for all pw′ ∈ D and then returns pw′ such that TestKey(T[pw′]) = true. The attack takes cN computations of H, where N = |D| is the size of the dictionary, as well as N tests.
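The single-instance attack can be sketched in a few lines of Python. This is only an illustration under stated assumptions: H is instantiated with SHA-256, the names `kd` and `brute_force_si` are hypothetical, and `test_key` stands in for the abstract TestKey oracle.

```python
import hashlib


def kd(data: bytes, c: int) -> bytes:
    """c-fold iteration H^c, with H assumed to be SHA-256."""
    for _ in range(c):
        data = hashlib.sha256(data).digest()
    return data


def brute_force_si(dictionary, sa: bytes, c: int, test_key):
    """Build T[pw'] = H^c(pw' || sa) for every dictionary word, then test each key."""
    table = {pw: kd(pw + sa, c) for pw in dictionary}  # c * N hash computations
    for pw, key in table.items():                      # at most N oracle tests
        if test_key(key):
            return pw
    return None  # the target password is not in the dictionary
```

A caller would supply `test_key` as a closure that compares a candidate key against the target key, or that attempts a trial decryption of a known ciphertext.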

The mi case. Ask now how hard it is to recover a large number m of target passwords $pw_1, \ldots, pw_m$, with the associated salts denoted $sa_1, \ldots, sa_m$ respectively. If s = 0, the answer is: not much harder than recovering one target password. The adversary’s test oracle TestKey now takes i, L′ and returns $(L_i = L')$ where $L_i = \mathrm{KD}(pw_i \| sa_i)$. The attacker creates a table T with $T[pw'] = H^c(pw')$ for all pw′ ∈ D (remember the salt has length 0, meaning it is absent). Then for each pw′ ∈ D and each i = 1, ..., m it calls TestKey(i, T[pw′]), returning pw′ if any of these calls returns true. It recovers all the target passwords using cN hashes and mN tests. Rainbow tables and other time-memory trade-offs can be used here to save on space [35, 34]. But with s large (enough to make $sa_1, \ldots, sa_m$ usually distinct), the brute-force attack takes cmN hash computations and mN tests, for it needs an N by m array T where $T[pw', sa_i] = H^c(pw' \| sa_i)$. This is why we salt. That the increase in effort from cN hashes to cmN hashes to mount the brute-force attack is considered valuable in practice is evidenced by the pervasive use of salting, starting with the 1976 UNIX Password Hash and continuing into PKCS#5 [38], today’s ubiquitously-employed standard for KDFs and PBE. Yet, this practice has not, until our work, received any proof-based support.
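The gap between cN and cmN hash computations can be checked with a small Python experiment. It is a sketch under assumptions: H is taken to be SHA-256, `CountingKD` and `recover_all` are hypothetical names, and the TestKey oracle is simplified to a direct lookup of the target key in the precomputed table.

```python
import hashlib


class CountingKD:
    """H^c with H assumed to be SHA-256; counts every call to H."""

    def __init__(self, c: int):
        self.c = c
        self.hash_calls = 0

    def __call__(self, data: bytes) -> bytes:
        for _ in range(self.c):
            data = hashlib.sha256(data).digest()
            self.hash_calls += 1
        return data


def recover_all(dictionary, salts, targets, kd):
    """Recover each password from (salt, derived key); one table per distinct salt."""
    tables = {}
    recovered = []
    for sa, key in zip(salts, targets):
        if sa not in tables:  # a table can be reused only when the salt repeats
            tables[sa] = {kd(pw + sa): pw for pw in dictionary}
        recovered.append(tables[sa].get(key))
    return recovered
```

With empty salts all m instances share one table, so the counter ends at cN. With m distinct salts a fresh table is built for every instance, and the counter reaches cmN.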

B Problems in the proofs of Yao and Yin

The earlier work of Yao and Yin [44] claimed a security proof for both functions KD1 and KD2 from the PKCS#5 standard for the case m = 1, a much more restricted scenario than the one considered in this paper (e.g., their result cannot be used to infer PBE security). Even more importantly, however, the presented proof is incorrect. In attempting to prove [44, Theorem 1], game K is not equivalent to the real world as claimed in [44, Lemma 2.2]. Namely, their proof incorrectly assumes that the (c + 1) intermediate values $u_0, u_1, \ldots, u_c$ are distinct in the evaluation of KD1 on input the randomly selected password pw and random salt sa, where $u_0 = pw \| sa$ and $u_c$ is the derived key. The bound claimed by [44, Theorem 1] is in fact wrong since it is claimed to hold for arbitrary c: if c is very large (e.g., $2^n$) then an attacker will beat the bound they claim.

main mu-LOR^A_{SE,m}
  K[1], ..., K[m] ←$ K
  b ←$ {0, 1}
  b′ ←$ A^Enc
  Ret (b′ = b)

proc. Enc(i, M_0, M_1)
  If |M_0| ≠ |M_1| then Ret ⊥
  C ←$ E(K[i], M_b)
  Ret C

Figure 5: Multi-user security notion for encryption.

We now use this to conclude

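$$
\begin{aligned}
\Pr\left[\mathrm{LORX}^{A'}_{SE,m} \Rightarrow \mathrm{true}\right]
&= \Pr[\mathrm{bad}_{\emptyset}] + \sum_{S \ne \emptyset} \Pr[\mathrm{bad}_S] \cdot \Pr\left[\mathrm{LORX}^{A'}_{SE,m} \Rightarrow \mathrm{true} \mid \mathrm{bad}_S\right] \\
&\ge \Pr[\mathrm{bad}_{\emptyset}] + \sum_{S \ne \emptyset} \Pr[\mathrm{bad}_S] \cdot \Pr\left[\mathrm{LORX}^{A'}_{SE,m} \Rightarrow \mathrm{true} \wedge \mathrm{bad}'_S \mid \mathrm{bad}_S\right] \\
&\ge \Pr[\mathrm{bad}_{\emptyset}] + \sum_{S \ne \emptyset} \Pr[\mathrm{bad}_S] \cdot \frac{1}{2}\left(1 - m \cdot \left(\frac{1}{2^{\ell} - 1}\right)^{k}\right) \\
&\ge \frac{1}{2} + \frac{1}{2} \cdot \Pr[\mathrm{bad}_{\emptyset}] - m \cdot \left(\frac{1}{2^{\ell} - 1}\right)^{k} \\
&= \frac{1 + \mathbf{Adv}^{\mathrm{uku}}_{SE,m}(A)}{2} - m \cdot \left(\frac{1}{2^{\ell} - 1}\right)^{k}.
\end{aligned}
$$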

The theorem statement follows by rearranging terms.

Proof of Theorem 2.5: Let A be an AND adversary. Consider the vector b′ output by A. For all subsets S ⊆ {1, ..., m}, and with b being the vector chosen in AND, it is not hard to verify that

$$\Pr\left[\bigoplus_{i \in S} b'[i] = \bigoplus_{i \in S} b[i]\right] \le \frac{1 + \mathbf{Adv}^{\mathrm{lorx}}_{SE,|S|}(t^*_S, q', q_c)}{2} \le \frac{1 + C\epsilon^{|S|} + \gamma}{2}$$

where q′ is the |S|-dimensional vector obtained as the projection of q on components in S. Indeed, for every adversary A and S ⊆ {1, ..., m}, where $S = \{i_1, \ldots, i_k\}$, we can build an adversary B for $\mathrm{LORX}_{SE,|S|}$ as follows: It first chooses random bits $b[i] \xleftarrow{\$} \{0,1\}$ and keys $K[i] \xleftarrow{\$} K$ for i ∉ S, and then runs A. Whenever A queries $\mathrm{Enc}(i_j, M_0, M_1)$ for j = 1, ..., k, B uses the oracle $\mathrm{Enc}(j, M_0, M_1)$ from the underlying $\mathrm{LORX}_{SE,|S|}$ game to answer the query, whereas queries $\mathrm{Enc}(i, M_0, M_1)$ for i ∉ S are answered with $E(K[i], M_{b[i]})$. Similarly, corruption queries for i ∈ S are answered using the corresponding Cor oracle for the LORX game, whereas queries for i ∉ S are answered directly by returning b[i] and K[i]. At the end of the game, if A replies with b′, B outputs $\bigoplus_{i} b'[i]$. It is clear by inspection that B has running time at most $t^*_S$. To conclude, we apply Lemma 2.4 with $Y_i = b[i] \oplus b'[i]$ to obtain an upper bound on $\Pr\left[\sum_{i=1}^{m} Y_i = m\right] = \Pr[b = b']$.

D Multi-user Encryption Security

We recall the multi-user security notion from [4]. Game $\text{mu-LOR}_{SE,m}$ of Figure 5 defines the security experiment. The advantage of adversary A is $\mathbf{Adv}^{\mathrm{mu\text{-}lor}}_{SE,m}(A) = 2 \Pr[\text{mu-LOR}^{A}_{SE,m} \Rightarrow \mathrm{true}] - 1$. We say that A is a (t, q)-adversary if it runs in time t and makes at most q[i] encryption queries of the form Enc(i, ·, ·). Then we let $\mathbf{Adv}^{\mathrm{mu\text{-}lor}}_{SE,m}(t, q) = \max_A \mathbf{Adv}^{\mathrm{mu\text{-}lor}}_{SE,m}(A)$, where the maximum is over all (t, q)-adversaries.

E More KDFs

The second KDF from PKCS#5 uses a function $F : \{0,1\}^* \times \{0,1\}^* \to \{0,1\}^n$ with two designated inputs. Note that in the standard F is referred to as a PRF, but since what is needed is not a PRF in the traditional sense, we refer to it as just a (keyed) function. Then define

$$\mathrm{KD2}^{F}(pw, sa) = U_1 \oplus U_2 \oplus \cdots \oplus U_c$$

where $U_i = F(pw, U_{i-1})$ for all i = 1, ..., c, and $U_0 = sa$. We sketch an analysis of this KDF in Appendix G. For both KDFs we have for simplicity deviated from the standard in that we assume the output length of the hash is equal to the desired key length. Achieving shorter key lengths with KD1 and KD2 just requires truncation, while for KD2 one can also request longer derived keys. This is accomplished by repeated applications of KD2 using domain separation.
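The definition of KD2 translates directly into code. The Python sketch below instantiates F with HMAC-SHA256 keyed by the password, which is an assumption made for illustration (the text leaves F abstract); the names `F` and `kd2` are likewise hypothetical.

```python
import hashlib
import hmac


def F(pw: bytes, x: bytes) -> bytes:
    """Keyed function F(pw, x); assumed here to be HMAC-SHA256 with key pw."""
    return hmac.new(pw, x, hashlib.sha256).digest()


def kd2(pw: bytes, sa: bytes, c: int) -> bytes:
    """KD2(pw, sa) = U_1 xor ... xor U_c, with U_i = F(pw, U_{i-1}) and U_0 = sa."""
    if c < 1:
        raise ValueError("iteration count c must be at least 1")
    u = F(pw, sa)  # U_1
    acc = u
    for _ in range(c - 1):
        u = F(pw, u)  # U_i = F(pw, U_{i-1})
        acc = bytes(a ^ b for a, b in zip(acc, u))
    return acc
```

As a sanity check on this reading of the definition: PBKDF2 as specified in RFC 2898 appends a 4-byte big-endian block index to the salt before the first application of the PRF. Under that reading, `kd2(pw, sa + b"\x00\x00\x00\x01", c)` should coincide with a single 32-byte output block of PBKDF2-HMAC-SHA256 on `(pw, sa, c)`.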

F Proof of Theorem 3.

The proof follows directly by combining two lemmas that we now state and prove. The first shows that a threshold variant of the guessing game implies security of the version given in the body that uses corruptions. The second bounds the advantage against this threshold guessing game using a generalized Chernoff bound due to Panconesi and Srinivasan [36] that reduces threshold direct product theorems to (non-threshold) direct product theorems. Finally, we use an amplification lemma due to Maurer, Pietrzak, and Renner [28] that yields a direct product theorem for the password guessing game (without corruptions).

Corruptions and threshold are equivalent. We define a threshold security variant of the guessing game. Let $\tau\text{-GUESS}_{P,m}$ be the same as $\mathrm{GUESS}_{P,m}$ except: (1) the corruption oracle is removed; (2) the winning condition is now $\bigvee_{S : |S| \ge \tau} \bigwedge_{i \in S} (pw'[i] = pw[i])$, i.e., the adversary wins if a subset of size at least τ of the passwords in the output is correct. The following lemma shows that threshold security for $\tau = m - q_c$ implies security when given $q_c$ corruption queries.

Lemma F.1 Let m ≥ 0, $q_t$, $q_c$ be numbers, and $\tau = m - q_c$. Let A be a $(q_t, q_c)$-guessing $\mathrm{GUESS}_{P,m}$ adversary. Then there exists a $\tau\text{-GUESS}_{P,m}$ adversary B making $q_t$ Test queries such that

$$\mathbf{Adv}^{\mathrm{guess}}_{P,m}(A) \le \mathbf{Adv}^{\tau\text{-}\mathrm{guess}}_{P,m}(B).$$

Proof: (Sketch.) The adversary B works as follows on input sa. It runs $A^{\mathrm{TestSim},\mathrm{CorSim}}(sa)$, simulating queries as follows. For a TestSim(i, pw) query it queries its own test oracle Test(i, pw). If the response is ⊥, it adds pw to a set $T_i$ and returns ⊥ to A. Otherwise it returns true to A. For a CorSim(i) query, it samples a fresh password pw*[i] according to the distribution of P conditioned on not outputting a password in the set $T_i$. Finally pw*[i] is returned to A.

Note that to be able to sample from the above distribution, B keeps, for each 1 ≤ i ≤ m, an array $p_i[\cdot]$ where $p_i[pw]$ initially indicates the probability that P samples pw, for each pw ∈ P, where P also denotes the set of possible passwords output by the sampler P. (Recall that B is not computationally bounded, and in fact we can even allow B to store arbitrary precision numbers.) When pw is added to $T_i$, B sets $p_i[pw]$ to 0, computes $p^* = \sum_{pw' \in P} p_i[pw']$, and sets $p_i[pw'] \leftarrow p_i[pw']/p^*$ for all pw′ ∈ P. Assume now we order the elements of P arbitrarily as $pw_1, pw_2, \ldots, pw_{|P|}$. Then, to sample pw*[i], B picks a random number $x \xleftarrow{\$} [0, 1]$, and outputs $pw_k$ for the smallest k such that $\sum_{j \le k-1} p_i[pw_j] \le x$ and $\sum_{j \le k} p_i[pw_j] > x$. By construction, the probability that any pw is returned by B in response to a corruption query equals the probability that any pw would be returned in response to such a query in $\mathrm{GUESS}_{P,m}$. (This also uses that A makes no pointless queries.)
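The bookkeeping performed by B for one index i (zeroing out ruled-out passwords, renormalizing, and sampling by cumulative sums) can be sketched as follows. The class name `ConditionalSampler` and its methods are hypothetical, exact fractions stand in for the arbitrary-precision numbers mentioned above, and x is drawn from [0, 1) rather than [0, 1] so that the cumulative sum always exceeds it.

```python
import random
from fractions import Fraction


class ConditionalSampler:
    """Tracks p_i[.] for one index i; dict insertion order is the arbitrary ordering of P."""

    def __init__(self, dist):
        # dist maps each password to its probability under the sampler P.
        self.p = {pw: Fraction(pr) for pw, pr in dist.items()}

    def exclude(self, pw):
        """Set p_i[pw] to 0, compute p* and renormalize every entry by p*."""
        self.p[pw] = Fraction(0)
        p_star = sum(self.p.values())
        if p_star == 0:
            raise ValueError("every password has been ruled out")
        for key in self.p:
            self.p[key] /= p_star

    def sample(self, rng=random):
        """Return pw_k for the smallest k whose cumulative probability exceeds x."""
        x = rng.random()  # uniform in [0, 1)
        acc = Fraction(0)
        for pw, pr in self.p.items():
            acc += pr
            if acc > x:
                return pw
        raise RuntimeError("probabilities do not sum to 1")
```

Passwords that have been excluded contribute nothing to the cumulative sum, so they can never be the first index at which the sum exceeds x.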