ECE 646 Final Project
Fall
Implementation and Performance Analysis of
Skipjack and RC6
Symmetric-Key Algorithms
Shannon Martin
G. Drew Floyd
TABLE OF CONTENTS
- Introduction
- Algorithm Description
  - A. Skipjack
    - Encryption and Decryption
    - Key Scheduling
  - B. RC6
    - Encryption and Decryption
    - Key Scheduling
- Implementation
  - A. Software Architecture
  - B. Methodology
  - C. I/O Specification
- Performance Analysis
- Conclusion
- References
- Appendix – Source Code
1) Encryption and Decryption
The Skipjack encryption algorithm is built from two stepping rules, Rule A and
Rule B. Each rule operates on four 16-bit words (w1...w4) that together make up
the 4 x 16 = 64-bit block. The words are shifted, exclusive ORed, and G-
transformed according to the permutation presented schematically in Figure 2-1. Each
rule is applied 8 times in a row and the two rules alternate, so the encryption
stepping sequence is 8x Rule A, 8x Rule B, 8x Rule A, and 8x Rule B, for a total
of 4 x 8 = 32 rounds.
Figure 2-1. Skipjack Encryption Stepping Rules
The mathematical relationships for the two stepping rules, derived directly from
the block diagram, are given below in Figure 2-2. The counter is initialized to 1 at the
beginning of the encryption sequence. K is the round number.
Figure 2-2. Equations of Rule A and Rule B
Likewise, the Skipjack decryption algorithm is composed of two stepping rules.
In this case, the rules are the inverses of the encryption rules, entitled Rule A Inverse
and Rule B Inverse. The inverse rules are also stepped 8 times each in an alternating
fashion, yielding 32 rounds composed of 8x Rule B Inverse, 8x Rule A Inverse, 8x Rule B
Inverse, and 8x Rule A Inverse.
The mathematical relationships for the two inverse stepping rules can again be
derived directly from the block diagram, taken in reverse order. The counter is initialized
to 32 at the beginning of the decryption sequence. K represents the round number. The
mathematical relationships needed for decryption are given below in Figure 2-3.
Figure 2-3. Equations of Rule A Inverse and Rule B Inverse
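Since the figures are not reproduced here, the stepping structure described above can be sketched in C. This is a minimal sketch under stated assumptions, not the project's implementation: the rule equations follow the published Skipjack specification, `g_stub` and `g_stub_inv` are hypothetical placeholders for the keyed G-transform (any invertible 16-bit map shows the structure), and the step index `k` is 0-based so that the counter equals `k + 1`.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder for the keyed G-permutation (NOT the real Skipjack G):
   any invertible 16-bit map is enough to show the stepping structure. */
static uint16_t g_stub(uint16_t w, unsigned k)
{
    uint16_t x = (uint16_t)(w ^ (0x9E37u + k));
    return (uint16_t)((x << 5) | (x >> 11));        /* rotate left by 5 */
}

static uint16_t g_stub_inv(uint16_t w, unsigned k)
{
    uint16_t x = (uint16_t)((w >> 5) | (w << 11)); /* rotate right by 5 */
    return (uint16_t)(x ^ (0x9E37u + k));
}

/* w[0..3] hold w1..w4; k is the 0-based step, so the counter is k + 1. */
static void rule_a(uint16_t w[4], unsigned k)
{
    uint16_t g = g_stub(w[0], k);
    uint16_t new_w1 = (uint16_t)(g ^ w[3] ^ (k + 1));
    w[3] = w[2]; w[2] = w[1]; w[1] = g; w[0] = new_w1;
}

static void rule_b(uint16_t w[4], unsigned k)
{
    uint16_t g = g_stub(w[0], k);
    uint16_t new_w3 = (uint16_t)(w[0] ^ w[1] ^ (k + 1));
    uint16_t new_w1 = w[3];
    w[3] = w[2]; w[2] = new_w3; w[1] = g; w[0] = new_w1;
}

static void rule_a_inv(uint16_t w[4], unsigned k)
{
    uint16_t old_w4 = (uint16_t)(w[0] ^ w[1] ^ (k + 1));
    uint16_t old_w1 = g_stub_inv(w[1], k);
    w[1] = w[2]; w[2] = w[3]; w[3] = old_w4; w[0] = old_w1;
}

static void rule_b_inv(uint16_t w[4], unsigned k)
{
    uint16_t old_w1 = g_stub_inv(w[1], k);
    uint16_t old_w2 = (uint16_t)(old_w1 ^ w[2] ^ (k + 1));
    uint16_t old_w3 = w[3], old_w4 = w[0];
    w[0] = old_w1; w[1] = old_w2; w[2] = old_w3; w[3] = old_w4;
}

/* 8x Rule A, 8x Rule B, 8x Rule A, 8x Rule B. */
void step_encrypt(uint16_t w[4])
{
    for (unsigned k = 0; k < 32; k++) {
        if ((k / 8) % 2 == 0) rule_a(w, k); else rule_b(w, k);
    }
}

/* Undo the steps in reverse order: 8x B inverse, 8x A inverse, and so on. */
void step_decrypt(uint16_t w[4])
{
    for (unsigned k = 32; k-- > 0; ) {
        if ((k / 8) % 2 == 0) rule_a_inv(w, k); else rule_b_inv(w, k);
    }
}

/* Round-trip check on an arbitrary block. */
void stepping_self_check(void)
{
    uint16_t w[4] = {0x3322, 0x1100, 0xddcc, 0xbbaa};
    step_encrypt(w);
    step_decrypt(w);
    assert(w[0] == 0x3322 && w[1] == 0x1100 && w[2] == 0xddcc && w[3] == 0xbbaa);
}
```

Calling `stepping_self_check()` from a `main` function exercises all 32 steps in both directions. Because the G placeholder is not the Skipjack permutation, the intermediate values will not match the published test vectors.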
Each stepping rule applies the G-transform once in each of its 8 rounds. The G-
transform, seen as the heart of the algorithm, is a key-dependent 4-round Feistel structure.
The permutation is recursive and composed of the mathematical relationships given
below in Figure 2-4. Each g value (g1 … g6) is 1 byte. The input bytes g1 and g2 are
the high and low bytes of the 16-bit input word, respectively. The F function is a simple
permutation lookup table. The exact specification for the permutation table can be found
in the implementation details in the Appendix.
Figure 2-4. Equation for G-Transform
The schematic representation of the G-transform for both encryption and
decryption is given below in Figure 2-5. In the diagram, CV stands for crypto-variable,
which is the key. The subscript of CV selects the byte position within the 10-byte key
for the particular transform.
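The four-round Feistel structure of the G-transform can be sketched as follows. This is an illustrative sketch, not the code from the Appendix: the recurrences and the key-byte indexing (bytes 4k to 4k+3, taken modulo 10) follow the published Skipjack specification, but `f_stub` is a hypothetical stand-in for the real 256-entry F-table, so the outputs will not match real Skipjack values.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder byte substitution (NOT the Skipjack F-table). A Feistel
   structure stays invertible for any byte-to-byte function used here. */
static uint8_t f_stub(uint8_t x)
{
    return (uint8_t)(x * 167u + 13u);
}

/* G for 0-based step k: g1 is the high byte and g2 the low byte of w. */
uint16_t g_transform(uint16_t w, unsigned k, const uint8_t cv[10])
{
    uint8_t g1 = (uint8_t)(w >> 8);
    uint8_t g2 = (uint8_t)(w & 0xFF);
    uint8_t g3 = (uint8_t)(f_stub((uint8_t)(g2 ^ cv[(4 * k) % 10])) ^ g1);
    uint8_t g4 = (uint8_t)(f_stub((uint8_t)(g3 ^ cv[(4 * k + 1) % 10])) ^ g2);
    uint8_t g5 = (uint8_t)(f_stub((uint8_t)(g4 ^ cv[(4 * k + 2) % 10])) ^ g3);
    uint8_t g6 = (uint8_t)(f_stub((uint8_t)(g5 ^ cv[(4 * k + 3) % 10])) ^ g4);
    return (uint16_t)((g5 << 8) | g6);
}

/* Inverse of G: the same four rounds, run backwards from (g5, g6). */
uint16_t g_transform_inv(uint16_t w, unsigned k, const uint8_t cv[10])
{
    uint8_t g5 = (uint8_t)(w >> 8);
    uint8_t g6 = (uint8_t)(w & 0xFF);
    uint8_t g4 = (uint8_t)(f_stub((uint8_t)(g5 ^ cv[(4 * k + 3) % 10])) ^ g6);
    uint8_t g3 = (uint8_t)(f_stub((uint8_t)(g4 ^ cv[(4 * k + 2) % 10])) ^ g5);
    uint8_t g2 = (uint8_t)(f_stub((uint8_t)(g3 ^ cv[(4 * k + 1) % 10])) ^ g4);
    uint8_t g1 = (uint8_t)(f_stub((uint8_t)(g2 ^ cv[(4 * k) % 10])) ^ g3);
    return (uint16_t)((g1 << 8) | g2);
}

/* Check that the inverse undoes G for every input word and every step. */
void g_self_check(void)
{
    /* Arbitrary 10-byte crypto-variable chosen for the demonstration. */
    const uint8_t cv[10] = {0x00, 0x99, 0x88, 0x77, 0x66,
                            0x55, 0x44, 0x33, 0x22, 0x11};
    for (unsigned k = 0; k < 32; k++) {
        for (uint32_t w = 0; w <= 0xFFFF; w++) {
            uint16_t y = g_transform((uint16_t)w, k, cv);
            assert(g_transform_inv(y, k, cv) == (uint16_t)w);
        }
    }
}
```

Swapping `f_stub` for the specification's lookup table would be the only change needed to turn this sketch into the real G-transform, assuming the byte order described above.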
Table 2. RC6 Parameters
Parameter of RC6(W, R, B)   Definition                          AES Values
W                           Word size in bits                   32
R                           Number of rounds                    20
B                           Number of 8-bit bytes in key K      16, 24, 32
The set of primitive operations used by RC6 comprises integer addition,
subtraction, and multiplication modulo 2^w, bitwise exclusive OR, and bitwise rotate
left and rotate right. A summary of the operations is given below in Table 3.
Table 3. RC6 Primitive Operations
Primitive Operation   Definition
a + b                 Integer addition modulo 2^w
a – b                 Integer subtraction modulo 2^w
a ⊕ b                 Bitwise exclusive-or of w-bit words
a × b                 Integer multiplication modulo 2^w
a <<< b               Rotate the w-bit word a to the left by the amount given by
                      the least significant lg w bits of b.
a >>> b               Rotate the w-bit word a to the right by the amount given by
                      the least significant lg w bits of b.
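For w = 32, the two rotations in Table 3 can be sketched in C as below. This is a minimal sketch with hypothetical function names (`rotl32`, `rotr32`). The rotation count is masked to its low lg 32 = 5 bits, as the table requires, which also avoids an undefined 32-bit shift when the count is 0. Addition, subtraction, and multiplication need no helper because unsigned 32-bit arithmetic in C already wraps modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* a <<< b for w = 32: only the 5 least significant bits of b are used. */
uint32_t rotl32(uint32_t a, uint32_t b)
{
    unsigned n = b & 31u;
    return (a << n) | (a >> ((32u - n) & 31u));
}

/* a >>> b for w = 32. */
uint32_t rotr32(uint32_t a, uint32_t b)
{
    unsigned n = b & 31u;
    return (a >> n) | (a << ((32u - n) & 31u));
}

/* A few hand-checked cases. */
void rot_self_check(void)
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);
    assert(rotr32(0x00000003u, 1) == 0x80000001u);
    assert(rotl32(0xDEADBEEFu, 0) == 0xDEADBEEFu);  /* count 0 is safe      */
    assert(rotl32(1u, 37) == 32u);                  /* 37 mod 32 = 5        */
    assert(rotr32(rotl32(0xDEADBEEFu, 13), 13) == 0xDEADBEEFu);
}
```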
3) Encryption and Decryption
RC6 uses the primitive operations presented above to perform encryption and
decryption. The encryption and decryption transforms use four w-bit registers labeled A,
B, C, and D as input and output to the function block. The first byte of the plaintext or
ciphertext is placed in the lowest-order byte of A and the last byte is placed in the
highest-order byte of D. The notation (A, B, C, D) = (B, C, D, A) represents the parallel
assignment A=B, B=C, C=D, and D=A.
RC6 Encryption
B = B + S[ 0 ]
D = D + S[ 1 ]
for i = 1 to r do
    t = ( B × ( 2B + 1 ) ) <<< lg( w )
    u = ( D × ( 2D + 1 ) ) <<< lg( w )
    A = ( ( A ⊕ t ) <<< u ) + S[ 2i ]
    C = ( ( C ⊕ u ) <<< t ) + S[ 2i + 1 ]
    (A, B, C, D) = (B, C, D, A)
A = A + S[ 2r + 2 ]
C = C + S[ 2r + 3 ]
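The pseudocode above translates almost line for line into C for the AES parameters w = 32 and r = 20. The following is a sketch, not the RC6.c from the Appendix: the function names are hypothetical, the decryption routine is obtained by running the encryption steps backwards, and the self-check fills the subkey array S with arbitrary values instead of running the key schedule, so it verifies only that decryption inverts encryption.

```c
#include <assert.h>
#include <stdint.h>

#define RC6_R    20                  /* rounds                     */
#define RC6_LGW  5                   /* lg(w) for w = 32           */
#define RC6_KEYS (2 * RC6_R + 4)     /* subkeys S[0 .. 2r + 3]     */

static uint32_t rotl(uint32_t a, uint32_t b)
{
    unsigned n = b & 31u;
    return (a << n) | (a >> ((32u - n) & 31u));
}

static uint32_t rotr(uint32_t a, uint32_t b)
{
    unsigned n = b & 31u;
    return (a >> n) | (a << ((32u - n) & 31u));
}

/* blk[0..3] hold the registers A, B, C, D. */
void rc6_encrypt_block(const uint32_t S[RC6_KEYS], uint32_t blk[4])
{
    uint32_t A = blk[0], B = blk[1], C = blk[2], D = blk[3], t, u, tmp;
    B += S[0];
    D += S[1];
    for (int i = 1; i <= RC6_R; i++) {
        t = rotl(B * (2 * B + 1), RC6_LGW);
        u = rotl(D * (2 * D + 1), RC6_LGW);
        A = rotl(A ^ t, u) + S[2 * i];
        C = rotl(C ^ u, t) + S[2 * i + 1];
        tmp = A; A = B; B = C; C = D; D = tmp;   /* (A,B,C,D) = (B,C,D,A) */
    }
    A += S[2 * RC6_R + 2];
    C += S[2 * RC6_R + 3];
    blk[0] = A; blk[1] = B; blk[2] = C; blk[3] = D;
}

void rc6_decrypt_block(const uint32_t S[RC6_KEYS], uint32_t blk[4])
{
    uint32_t A = blk[0], B = blk[1], C = blk[2], D = blk[3], t, u, tmp;
    C -= S[2 * RC6_R + 3];
    A -= S[2 * RC6_R + 2];
    for (int i = RC6_R; i >= 1; i--) {
        tmp = D; D = C; C = B; B = A; A = tmp;   /* (A,B,C,D) = (D,A,B,C) */
        u = rotl(D * (2 * D + 1), RC6_LGW);
        t = rotl(B * (2 * B + 1), RC6_LGW);
        C = rotr(C - S[2 * i + 1], t) ^ u;
        A = rotr(A - S[2 * i], u) ^ t;
    }
    D -= S[1];
    B -= S[0];
    blk[0] = A; blk[1] = B; blk[2] = C; blk[3] = D;
}

/* Round trip with arbitrary subkeys (NOT the output of the key schedule). */
void rc6_self_check(void)
{
    uint32_t S[RC6_KEYS];
    uint32_t blk[4] = {0x35241302u, 0x79685746u, 0xbdac9b8au, 0xf1e0dfceu};
    for (int i = 0; i < RC6_KEYS; i++)
        S[i] = 0xB7E15163u + (uint32_t)i * 0x9E3779B9u;
    rc6_encrypt_block(S, blk);
    rc6_decrypt_block(S, blk);
    assert(blk[0] == 0x35241302u && blk[1] == 0x79685746u);
    assert(blk[2] == 0xbdac9b8au && blk[3] == 0xf1e0dfceu);
}
```

To compare against the published RC6 test vectors, S would have to come from the real key schedule, which this sketch leaves out.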
By examining a single round of the RC6 algorithm, it can be observed that the strength
of the algorithm against typical cryptanalysis is achieved by the data-dependent rotations.
This approach results in a mechanism that modifies all the input data in every round. By
comparison, a classic Feistel cipher structure modifies only half the input data in a single
round.
[Figure: One Round of RC6]
4) Key Scheduling
RC6 uses a complex set of operations to create sub-keys for encryption and
decryption. The key generation creates two keys for every round plus two more for a
The project is composed of a project header file entitled Project.h and a main
implementation for the timing analysis entitled timetest.c. Each algorithm is
implemented in a header file and a source file pair including Skipjack.h and Skipjack.c ,
and RC6.h and RC6.c. The block modes for each symmetric-key algorithm are
implemented in modes.c. Finally, the general input/output functionality is implemented
in io.c. The source code for each of these files is included in the Appendix.
B. Methodology
Each algorithm was interpreted from its published specification, then implemented
with original code in ANSI C. Next, the implementations were verified against the
published test vector set for each algorithm. Following the first validation, the source
code was optimized by hand using empirical results to illustrate performance gains. After
the implementations were optimized for speed, they were once again verified against the
test vectors and established as the experimentation baseline. Finally, a flexible software
framework for the algorithms was designed and implemented and the algorithms were
placed in the framework.
The core component of the software implementation that facilitated performance
experimentation was the Timing Harness. A design decision was made to collect only
the most accurate timing measurements possible, which resulted in a timing harness that
measures only the number of clock cycles used during the execution of a particular set
of functions. While this yielded a very accurate measure of performance, independent
of clock speed for a particular microprocessor architecture, it required the use of
non-ANSI, assembly-level source code to gather the measurements. This measurement
facility on the Intel Pentium-class platform has no counterpart in the UltraSPARC
architecture, so no measurements could be taken on a Solaris platform.
The lack of an accurate clock cycle measurement tool on the UltraSPARC is noted in
the publication “Efficiency Testing of ANSI C Implementations of Round 2 Candidate
Algorithms for the Advanced Encryption Standard”, where the author states: “Cycle
counting could only be performed on the Intel processor based systems. This is the only
processor used by NIST during Round 2 testing that provides access to a true cycle
counting mechanism.” The UltraSPARC architecture was among the test platforms
considered by the AES performance evaluation team. Thus, clock cycle measurements
were taken only for the Intel platform.
The ANSI C source code for Skipjack and RC6 was compiled using the Microsoft
Visual C++ 6.0 compiler on Windows NT/2000. In general, standard ANSI C libraries
provide platform independence by allowing the same source code to be compiled for
multiple platforms. However, the use of architecture-specific assembly code for the
measurements used in this study precluded the compilation and execution of the
algorithms on the UltraSPARC platform. Thus, all measured performance results were
taken only for the Intel Pentium-class platform.
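The cycle-counting approach described in this section can be sketched with compiler intrinsics in place of inline assembly. This is only an illustrative sketch, not the project's timing harness: it assumes the `__rdtsc()` intrinsic from `<intrin.h>` (MSVC) or `<x86intrin.h>` (GCC/Clang) on x86 targets and falls back to the standard `clock()` elsewhere. It omits the serializing CPUID instruction, and the names `average_ticks` and `count_call` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Pick a tick source: the time-stamp counter on x86, clock() elsewhere. */
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>
#define READ_TICKS() ((uint64_t)__rdtsc())
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <x86intrin.h>
#define READ_TICKS() ((uint64_t)__rdtsc())
#else
#define READ_TICKS() ((uint64_t)clock())
#endif

/* Run fn(arg) `loops` times and return the mean tick count per call. */
double average_ticks(void (*fn)(void *), void *arg, unsigned loops)
{
    uint64_t total = 0;
    assert(fn != 0 && loops > 0);
    for (unsigned i = 0; i < loops; i++) {
        uint64_t start = READ_TICKS();
        fn(arg);
        total += READ_TICKS() - start;
    }
    return (double)total / (double)loops;
}

/* Hypothetical workload standing in for a key expansion or block encryption:
   it only counts how many times it has been called. */
void count_call(void *arg)
{
    (*(unsigned *)arg)++;
}
```

A call such as `average_ticks(count_call, &calls, 800)` mirrors an experiment averaged over 800 iterations. On the fallback path the result is in `clock()` ticks, not processor cycles, which is the same portability limit noted above for the UltraSPARC.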
C. I/O Specification
The input for each algorithm executable can be driven from a Command Line User
Interface (CLUI). The arguments for the respective algorithms are provided below.
The Skipjack encryption/decryption algorithm can be executed from the command
line with the following parameter list:
Skipjack [Input File][Output File][Key]
Input File : Name of the input file to be encrypted/decrypted
Output File : Name of the output file that receives the encrypted/decrypted data
Key : Crypto-variable used for encryption/decryption
The RC6 encryption/decryption algorithm can be executed from the command line
with the following parameter list:
RC6 [Mode][Key][input file] [output file][W][R][B]
Mode : Mode of operation “Encryption” or “Decryption”
Key : Bitstream of length (8*N) used as encryption/decryption key
Input File : Name of the input file to be encrypted/decrypted
Output File : Name of the output file that receives the encrypted/decrypted data
W : Encryption/decryption processing block size in bits
R : Number of rounds in encryption/decryption algorithm
B : Length of key for encryption/decryption in Bytes (N)
4. Performance Analysis
The execution performance of Skipjack and RC6 was compared based on the number
of clock cycles needed to perform a particular cryptographic transform. The transforms
included Key Expansion, Encryption, and Decryption. The measurements were taken for
each of the available block cipher modes: ECB, CBC, CFB, OFB, CTR, and
Hash mode. The results for each transform and mode were taken for three input sizes:
a single block of input data (8 bytes for Skipjack or 16 bytes for RC6), a
small text file (1387 bytes), and a large text file (139,021 bytes). The large file was
chosen to be approximately 100x larger than the small file, and the small file is
approximately 100x the block size (1387 bytes is about 87 RC6 blocks or 173 Skipjack
blocks).
Each experiment in the set was executed for 800 iterations to provide Monte Carlo
style statistical results. The average over the iterations was reported as the final result
in each category. The Key Expansion times for encryption and decryption for both
algorithms are provided below in Table 5.
Table 9. RC6 Decryption Times
RC6 Decryption (clocks)   ECB           CBC           CFB           OFB           CTR           Hash
Block                     664.3         676.2         710.6         700.3         697.1         655.
Small                     50962.2       51810.5       52582.1       51500.2       52738.3       51671.
Large                     5,247,298.9   5,760,451.8   5,942,835.2   6,153,336.5   5,951,985.3   6,334,899.
Below are graphs showing a comparison of encryption and decryption times (in clock
cycles) for Skipjack and RC6 for a single block of data (8 bytes or 16 bytes), a small file
(1387 bytes), and a large file (139,021 bytes).
[Figures: bar charts titled Block Encryption, Small File Encryption, and Large File
Encryption. Each chart plots Time (Clock Cycles) against Mode (ECB, CBC, CFB, OFB,
CTR, Hash) for RC6 and Skipjack.]
5. Conclusion
In terms of the clock cycles needed to encrypt and decrypt a typical file, including
key expansion, RC6 was found to be superior to Skipjack. Even though RC6 uses a
larger key (128 bits minimum for RC6 as opposed to 80 bits for Skipjack), it
outperforms Skipjack because it relies on operations that are simple and efficient in
software, such as data-dependent bitwise rotations. This technique provides good
obfuscation and executes very quickly in software compared to the permutation
performed in Skipjack.
6. References
1) “SKIPJACK and KEA Algorithm Specifications”. NSA. Version 2.0; May
29, 1998. NIST; http://csrc.nist.gov/encryption/skipjack/skipjack.pdf.
2) “The RC6 Block Cipher”. Ron Rivest. RSA;
http://www.rsasecurity.com/rsalabs/rc6/index.html.
3) “Efficiency Testing of ANSI C Implementations of Round 2 Candidate
Algorithms for the Advanced Encryption Standard”. Lawrence E.
Bassham III. April 27, 2000. Computer Security Division – NIST;
http://csrc.nist.gov/encryption/aes/round2/conf3/aes3agenda.html.
4) “Using the RDTSC Instruction for Performance Monitoring”. Intel. 1997.
http://developer.intel.com/drg/pentiumII/appnotes/RDTSCPM1.HTM.
5) “Timing Harness in ANSI C”. NIST;
http://csrc.nist.gov/encryption/aes/round2/round2.htm.
7. Appendix – Source Code
Project.h
#ifndef PROJECT_H
#define PROJECT_H

#include <stdio.h>
#include <stdlib.h>
#include <malloc.h>
#include <math.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

//RC6 parameters
#define RC
#define ALG "RC6"
#define KEY_SIZE 128 //bits

//Skipjack parameters
//#define SKIPJACK
//#define ALG "SJ"
//#define KEY_SIZE 80 //bits

//Number of iterations
#define MAX_LOOPS 1000
#define NUM_LOOPS 800

//Type definitions
typedef unsigned int word32;   //32 bits
typedef unsigned short word16; //16 bits
typedef unsigned char byte;    //8 bits

//Prototypes
void error(char *str);
void printff(char string[],word32 text[], int size);
word32 readFile(char in_fname[],word32 **text);
void writeFile(char out_fname[],word32 *text,unsigned num_word);

void RC6EncryptECB(word32 sub_key[],word32 text[], word32 num_word);
void RC6DecryptECB(word32 sub_key[],word32 text[], word32 num_word);
void RC6EncryptCBC(word32 sub_key[],word32 text[], word32 num_word);
void RC6DecryptCBC(word32 sub_key[],word32 text[], word32 num_word);
void RC6EncryptCTR(word32 sub_key[],word32 text[], word32 num_word);
void RC6DecryptCTR(word32 sub_key[],word32 text[], word32 num_word);
#define ROUNDS 20 /* rounds of algorithm */
#define FACTOR 40
#define FACTOR1 41
#define FACTOR2 42
#define FACTOR3 43
#define FACTOR4 44

/* Rotation counts are masked to 5 bits so that y = 0 and y > 31 are well defined */
#define ROTL(x,y) ( ((x)<<((y)&31)) | ((x)>>((32-((y)&31))&31)) )
#define ROTR(x,y) ( ((x)>>((y)&31)) | ((x)<<((32-((y)&31))&31)) )

//Prototypes
void encryptBlock(word32 sub_key[],word32 text[]);
void decryptBlock(word32 sub_key[],word32 text[]);
void KeyExpansion(char key[],word32 sub_key[], int size);

#endif /* RC6_H */
timetest.c
#include "project.h"
#include "RC6.h"
#include "Skipjack.h"

//Define the number of iteration for the timing test
#define NUM 15

//Define data structures
int data[MAX_LOOPS][NUM];
int num_loops;
int num_trials;

//Define File pointers
FILE *raw;
FILE *time_fp;
//RC6 time to create an encryption key (key expansion)
#ifdef RC
word32 TimeEKey()
{
    word32 cycles;
    byte key[]={0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
                0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
                0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00};
    word32 sub_key[FACTOR4];

    //Read the time-stamp counter before the call (CPUID serializes execution)
    __asm {
        pushad
        CPUID
        RDTSC
        mov cycles, eax
        popad
    }

    KeyExpansion(key, sub_key, KEY_SIZE/8);

    //Read the counter again and keep the difference in clock cycles
    __asm {
        pushad
        CPUID
        RDTSC
        sub eax, cycles
        mov cycles, eax
        popad
    }
    return cycles;
}
//Skipjack time to create an encryption key (key expansion)
#else
word32 TimeEKey()
{
    word32 cycles;

    __asm {
        pushad
        CPUID
        RDTSC
        mov cycles, eax
        popad
    }

    //Do nothing: No key expansion is needed for Skipjack

    __asm {
        pushad
        CPUID
        RDTSC
        sub eax, cycles
        mov cycles, eax
        popad
    }
    return cycles;
}
#endif
//RC6 time to create a decryption key (key expansion)
#ifdef RC
word32 TimeDKey()
{
    word32 cycles;
    byte key[]={0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
                0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00,
                0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00};
    word32 sub_key[FACTOR4];

    __asm {
        pushad
        CPUID
        RDTSC