Breaking a Transposition Cipher - Codes and Cryptography - Notes | MCS 425, Study notes of Cryptography and System Security

Material Type: Notes; Class: Codes and Cryptography; Subject: Mathematical Computer Science; University: University of Illinois - Chicago; Term: Unknown 1989;


Uploaded on 07/23/2009


Breaking a Transposition Cipher

Say we have some ciphertext that we know was encrypted with a transposition cipher. At first, we assume we know the degree of the permutation. Say the degree is 13. We arrange our ciphertext into 13 columns (perhaps disregarding an incomplete last row).

        1  2  3  4  5  6  7  8  9 10 11 12 13
row  1  t  i  f  a  t  p  o  k  g  r  i  a  n
row  2  e  b  s  t  m  n  e  l  r  t  i  a  e
row  3  t  c  t  n  i  s  h  s  e  s  i  n  i
row  4  d  f  e  n  a  h  m  e  u  s  t  v  o
row  5  o  e  e  m  a  t  r  l  t  a  s  l  s
row  6  l  i  e  p  t  n  y  l  e  t  a  h  k
row  7  s  s  r  g  e  a  t  u  n  s  t  t  i
row  8  h  o  o  t  o  p  e  o  t  w  e  b  s
row  9  r  t  l  i  r  e  s  f  e  e  t  o  h
row 10  e  e  w  h  c  e  l  p  t  p  n  o  s
row 11  b  t  l  a  a  l  f  e  s  o  a  n  i
row 12  t  k  i  a  r  d  f  c  o  n  e  o  c
row 13  e  e  s  c  r  n  o  z  o  r  a  o  n
row 14  v  l  i  i  u  t  e  h  f  m  e  g  y
row 15  e  e  e  z  s  f  r  h  i  a  o  t  s
row 16  o  t  t  m  t  e  a  i  s  h  r  h  i
row 17  n  o  i  a  n  l  h  h  s  g  d  t  u
row 18  w  a  i  a  t  y  p  e  s  m  a  y  r
row 19  o  i  o  y  o  t  r  s  l  f  u  s  b
row 20  f  h  e  r  e  r  e  c  o  s  o  a  f
row 21  m  l  t  u  e  n  a  b  h  r  a  a  e
row 22  h  b  a  t  v  m  n  l  t  n  e  e  u
row 23  d  f  e  s  e  t  t  o  t  c  e  r  i
row 24  e  s  s  g  i  t  h  n  g  t  s  o  u
row 25  s  i  o  r  e  c  l  n  e  u  e  u  v
row 26  o  f  i  l  d  m  s  e  l  d  f  b  u
row 27  r  d  t  r  i  s  i  e  e  r  e  z  t
row 28  t  l  o  e  a  n  t  p  n  t  s  l  a
row 29  r  i  t  e  o  n  r  d  f  f  e  o  f
row 30  n  u  t  w  o  e  i  o  o  h  w  m  r
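The arrangement into 13 columns can be sketched in a few lines of Python. This is only an illustration: the helper names `to_rows` and `column` are made up for this example, and the two trailing letters `xy` in the sample are invented to show how an incomplete last row gets dropped.

```python
def to_rows(ciphertext, degree):
    """Split the ciphertext into rows of `degree` letters, dropping an incomplete last row."""
    letters = [ch for ch in ciphertext if ch.isalpha()]
    full = len(letters) - len(letters) % degree
    return ["".join(letters[k:k + degree]) for k in range(0, full, degree)]


def column(rows, c):
    """Return column c (numbered from 1, as in the notes) as a string."""
    return "".join(row[c - 1] for row in rows)


# The first three rows of the ciphertext, plus two invented stray letters.
sample = "tifatpokgrian ebstmnelrtiae tctnishsesini xy"
rows = to_rows(sample, 13)
print(rows)             # three complete rows; "xy" is discarded
print(column(rows, 8))  # kls
```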

The frequencies of individual characters, by themselves, don't help us. They are the same as in the plaintext, and don't depend on the key (the permutation used to rearrange the columns).
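A quick check of this claim, as a sketch: the first row of the ciphertext and the first 13 letters of the plaintext given at the end of these notes contain exactly the same letters, only in a different order, so their single-letter counts agree.

```python
from collections import Counter

cipher_row = "tifatpokgrian"  # row 1 of the ciphertext
plain_row = "takingatipfro"   # first 13 letters of the recovered plaintext

# A permutation of columns moves letters around but never changes them,
# so the letter counts of the two rows are identical.
same_counts = Counter(cipher_row) == Counter(plain_row)
print(same_counts)  # True
```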

But the frequencies of digrams can be very helpful.

Consider just 4 characters from the ciphertext above.

        col 2   col 8   col 11
row  4            e       t
row 12    k       c

Based solely on the information above, which column, 2 or 11, is more likely to come immediately after column 8 in the plaintext? In other words, which is more likely in the plaintext?

        columns 8,2     columns 8,11
row  4  ... e   ...     ... e t ...
row 12  ... c k ...     ... c   ...

i) In the first case (columns 8,2), the plaintext has a digram ck. We estimated prob(ck) = 10/10000.

ii) In the second case (columns 8,11), the plaintext has a digram et. We estimated prob(et) = 83/10000.

Since et occurs about eight times as often as ck in English text, we might conclude the second case (8,11) is more likely to occur, based on our limited information.

But consider more carefully. Suppose columns 8 and 11 are “far” apart in the plaintext. Then the letters in columns 8 and 11 of some row are nearly independent. This means that

prob(et) = prob(e) prob(t) = (1237/10000) (921/10000) ≈ 114/10000.

Thus the digram et is more likely to occur in plaintext columns that are well separated than in adjacent columns (114/83 ≈ 1.37 times as likely).
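The arithmetic can be reproduced with the three estimates quoted in these notes. The sketch below uses exact fractions; the variable names are chosen for this example only.

```python
from fractions import Fraction

# Estimates quoted in the notes (counts per 10000 letters or digrams).
p_et = Fraction(83, 10000)
p_e = Fraction(1237, 10000)
p_t = Fraction(921, 10000)

# Probability of seeing e followed by t if the two columns were independent.
p_independent = p_e * p_t

# A ratio below 1 means "et" is mild evidence AGAINST the columns being adjacent.
ratio = p_et / p_independent

print(round(float(p_independent) * 10000))  # 114
print(round(float(ratio), 2))               # 0.73
print(round(float(1 / ratio), 2))           # 1.37
```

The corresponding ratio for ck cannot be computed here, because prob(c) and prob(k) are not quoted in this part of the notes.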

[...] plaintext (and the nature of the dependency is different). For simplicity, we shall treat these columns as independent, recognizing this will introduce some error.

For each digram λμ, we compute prob(λμ) / (prob(λ) prob(μ)).

i) If prob(λμ) / (prob(λ) prob(μ)) > 1, λμ in columns i, j makes it more likely that column j is the column that follows column i in the plaintext. (The larger prob(λμ) / (prob(λ) prob(μ)) is, the stronger the effect.)

ii) If prob(λμ) / (prob(λ) prob(μ)) < 1, λμ in columns i, j makes it less likely that column j is the column that follows column i in the plaintext.

In the table on the following page, I have given each of the 26^2 = 676 digrams λμ a score between -8 and 8, with higher scores indicating higher values of prob(λμ) / (prob(λ) prob(μ)).

A score of 0 indicates prob(λμ) / (prob(λ) prob(μ)) ≈ 1 (specifically, between 1/1.189 and 1.189).

A score of 8 indicates prob(λμ) / (prob(λ) prob(μ)) > 13.45, and -8 indicates prob(λμ) / (prob(λ) prob(μ)) < 1/13.45.

I may explain in class more about how the scores were assigned.
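The notes do not say how the scores were assigned. One rule that happens to reproduce the quoted thresholds is to round 2·log2 of the ratio and clamp the result to the range -8 to 8, since 2^(1/4) ≈ 1.189 and 2^(15/4) ≈ 13.45. The sketch below implements that guess; it is an assumption made for illustration, not the author's stated method.

```python
import math


def score(ratio):
    """Hypothetical scoring rule: round 2*log2(ratio), clamped to [-8, 8].

    This is a guess that matches the thresholds quoted in the notes
    (1/1.189, 13.45 and 1/13.45); the real rule is not given.
    """
    return max(-8, min(8, round(2 * math.log2(ratio))))


print(score(1.0))       # 0: the digram is no more or less likely than chance
print(score(13.5))      # 8: just above the 13.45 threshold
print(score(1 / 13.5))  # -8
print(score(0.73))      # -1: roughly the ratio worked out for "et"
```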

For a pair i, j of columns, we can easily compute the score of each row in columns i, j and then compute the average score (over all rows) for columns i and j.
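A minimal sketch of this averaging step follows. The function name `average_score` and the tiny score table are hypothetical: the real scores come from the 26 by 26 table, which is not reproduced in this copy of the notes, so the numbers below are placeholders chosen only to make the example run. Digrams missing from the toy table are scored 0.

```python
def average_score(rows, i, j, score_table):
    """Average digram score over all rows, pairing column i with column j.

    Columns are numbered from 1, as in the notes. Digrams absent from
    score_table count as 0 (a simplification for this toy example).
    """
    total = sum(score_table.get(row[i - 1] + row[j - 1], 0) for row in rows)
    return total / len(rows)


# Placeholder scores, NOT values from the notes' table.
toy_scores = {"ck": 6, "ef": 0, "et": -1, "ce": 0}

# Rows 4 and 12 of the ciphertext.
rows = ["dfenahmeustvo", "tkiardfconeoc"]

print(average_score(rows, 8, 2, toy_scores))   # digrams ef, ck -> 3.0
print(average_score(rows, 8, 11, toy_scores))  # digrams et, ce -> -0.5
```

With a full score table, the same function would be called for every ordered pair of distinct columns to fill in a 13 by 13 matrix of averages.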

i) If column j immediately follows column i in the plaintext, we expect this average score to be roughly

\[ \sum_{\text{all digrams } \lambda\mu} \mathrm{prob}(\lambda\mu)\,\mathrm{score}(\lambda\mu) \approx 1.15. \]

[Table: scores for the 676 digrams, with rows and columns labeled a through z.]

The entry in row λ, column μ is the score for digram λμ.
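The expected average for adjacent columns is a probability-weighted sum of scores. The sketch below computes such a sum for an invented two-letter alphabet, since the real digram probabilities and scores are not available in this copy. It also computes the analogous sum with prob(λ) prob(μ) as the weights, which is presumably how the -1.41 quoted later in these notes for non-adjacent columns was obtained; that reading is an assumption, as are all names and numbers in the code.

```python
def expected_score(weights, score_table):
    """Sum of weight * score over all digrams."""
    return sum(w * score_table[d] for d, w in weights.items())


# Invented two-letter "language": a and b tend to alternate.
digram_prob = {"aa": 0.1, "ab": 0.4, "ba": 0.4, "bb": 0.1}
letter_prob = {"a": 0.5, "b": 0.5}

# Invented scores: positive for digrams commoner than chance, negative otherwise.
toy_scores = {"aa": -3, "ab": 1, "ba": 1, "bb": -3}

# Weights for columns that are far apart: letters treated as independent.
independent_prob = {
    x + y: letter_prob[x] * letter_prob[y]
    for x in letter_prob
    for y in letter_prob
}

adjacent = expected_score(digram_prob, toy_scores)
separated = expected_score(independent_prob, toy_scores)
print(adjacent > 0 > separated)  # True: the two cases fall on opposite sides of 0
```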

Each row except row 7 has a unique entry that is substantially larger than the others in the row (larger by at least 0.6), and that is much closer to 1.15 than to -1.41. (In fact, these entries average 1.07, very close to the predicted 1.15.) If this entry occurs in row i, column j, it signals that in all likelihood column j comes immediately after column i in the plaintext.

All the other entries are considerably smaller than any of the entries described above, and average -1.33, close to the predicted -1.41. If one of these entries occurs in row i, column j, it strongly suggests that column j does not come immediately after column i in the plaintext.

Note row 7 has no entry indicating another column following column 7 in the plaintext. Presumably this indicates column 7 is the last column in the plaintext.

Likewise, column 5 in our matrix above has no entry indicating it comes after some other column. Presumably it comes first in the plaintext.

We can read off the presumed order of the columns, starting with 5: 5, 12, 8, 2, 13, 9, 4, 1, 11, 6, 3, 10, 7.
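Reading off the order amounts to following "is followed by" links from the one column that follows no other. The sketch below assumes a `successor` map holding, for each row of the matrix, the column with the standout entry. The matrix itself is not reproduced in this copy of the notes, so the map here was reconstructed by comparing the ciphertext rows with the plaintext; the function name `read_order` is made up for this example.

```python
def read_order(successor, n):
    """Follow successor links, starting from the column that follows no other."""
    starts = set(range(1, n + 1)) - set(successor.values())
    (start,) = starts  # raises ValueError unless exactly one start exists
    order = [start]
    while order[-1] in successor:
        order.append(successor[order[-1]])
    return order


# successor[i] = j means column j comes immediately after column i.
# Column 7 has no successor, and no column has 5 as its successor.
successor = {5: 12, 12: 8, 8: 2, 2: 13, 13: 9, 9: 4,
             4: 1, 1: 11, 11: 6, 6: 3, 3: 10, 10: 7}

order = read_order(successor, 13)
print(order)  # [5, 12, 8, 2, 13, 9, 4, 1, 11, 6, 3, 10, 7]
```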

Rearranging the ciphertext columns in this order, we get the plaintext.

t a k i n g a t i p f r o m a l b e r t e i n s t e i n s c i e n t i s t s h a v e f o u n d t h e s m a l l e s t m o s t e a r t h l i k e p l a n e t y e t u s i n g s t a r s t o b o o s t t h e p o w e r o f t h e i r t e l e s c o p e s t h e n e w p l a n e t i s a b a l l o f r o c k c o a t e d i n f r o z e n o c e a n s r o u g h l y f i v e t i m e s t h e s i z e o f e a r t h i t i s m o r e t h a n t h o u s a n d l i g h t y e a r s a w a y i m p o s s i b l y o u t o f r e a c h f o r f o r e s e e a b l e h u m a n t r a v e l b u t t h e m a n n e r o f i t s d e t e c t i o n s u g g e s t s t h e u n i v e r s e c o u l d b e f u l l o f m i d s i z e d t e r r e s t r i a l p l a n e t s n o t t o o d i f f e r e n t f r o m o u r o w n w e t h i
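The rearrangement can be checked on the first three rows of the ciphertext. This sketch assumes the column order 5, 12, 8, 2, 13, 9, 4, 1, 11, 6, 3, 10, 7, obtained by comparing the ciphertext rows with the plaintext above; `reorder_columns` is a name chosen for this example.

```python
# Column order in the plaintext (ciphertext column numbers, counted from 1).
ORDER = [5, 12, 8, 2, 13, 9, 4, 1, 11, 6, 3, 10, 7]


def reorder_columns(rows, order):
    """Rebuild the plaintext by reading each row's columns in the given order."""
    return "".join("".join(row[c - 1] for c in order) for row in rows)


# Rows 1 to 3 of the ciphertext.
rows = ["tifatpokgrian", "ebstmnelrtiae", "tctnishsesini"]

plaintext = reorder_columns(rows, ORDER)
print(plaintext)  # takingatipfromalberteinsteinscientistsh
```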