PROJECT BY: JAMES TOWNSEND
CSE704 SPRING 2011
COMPLETED UNDER DR. RUSS MILLER
Using MPI to Break Data Encryption
Data Security
Cryptography has been used as far back as Julius Caesar
Important data cannot be sent in plaintext
In 1976 a standard was created by the NBS (National Bureau of Standards), now NIST
IBM was internally using the Lucifer cipher
Lucifer was adapted as the Data Encryption Standard (DES)
About DES
DES is a symmetric block cipher, meaning the two communicating parties share a single key, i.e. one key both encrypts and decrypts blocks
Messages are encrypted by breaking them into individual 64-bit blocks (8 characters)
Each block is encrypted with a 56-bit key, stored with 8 parity bits
This encrypted message can then be transmitted without worry
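The split between 56 key bits and 8 parity bits can be illustrated with a short sketch. This is not the presenter's code: the function names `expand_key` and `has_odd_parity` are hypothetical, and the sketch assumes the usual DES convention that each of the 8 key bytes holds 7 key bits plus one odd-parity bit in the least significant position.

```python
def expand_key(key56: int) -> bytes:
    """Spread a 56-bit integer over 8 bytes (7 key bits each) and
    set the low bit of every byte so the byte has odd parity."""
    if not 0 <= key56 < 2**56:
        raise ValueError("key must fit in 56 bits")
    out = bytearray()
    for i in range(8):
        # Take the next group of 7 bits, most significant group first.
        seven = (key56 >> (7 * (7 - i))) & 0x7F
        byte = seven << 1
        if bin(seven).count("1") % 2 == 0:
            byte |= 1  # add a 1 so the total number of 1 bits is odd
        out.append(byte)
    return bytes(out)


def has_odd_parity(key: bytes) -> bool:
    """Check that every byte of an 8-byte DES key has odd parity."""
    return all(bin(b).count("1") % 2 == 1 for b in key)


print(expand_key(0x123456789ABCDE).hex())
```

Because the parity bits carry no key material, a brute force search only has to walk the 2^56 values passed to `expand_key`, not all 2^64 possible 8-byte strings.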
Controversy Around DES
Original submitted cipher used 128-bit keys
NSA reduced the key size to 56 bits and hid the internal design of the substitution boxes
Some believed they did this so they could somehow decode all of the encryptions
Controversy was not calmed until the release of the internal design of the algorithm
Many still believed it was not secure
Cryptanalysis of DES
Cryptanalysis is the process of mathematically attacking the algorithm to find weaknesses
Goal is to discover a connection between plaintext and ciphertext that would allow an attack faster than brute force
Over 30 years of dedicated work has been put into cryptanalyzing DES with no significant practical results
Differential cryptanalysis was publicly discovered in 1990
NSA and IBM had known of it since the 1970s and designed DES to be resistant to this attack
Brute Force Attacks
Process of searching the entire key-space to find the correct key
In 1976 it was inconceivable to attack
Even in 1990, estimates were almost 2500 years for a single computer to brute force DES
Proposals were made as early as 1977 that a $20 million machine could brute force DES in a day
By the early 1990s, the numbers were down to a $1 million machine that could break it in 7 hours
None of these machines were publicly built
Distributed.net
Non-profit organization dedicated to solving large-scale problems
Created a version of grid computing where people could volunteer their idle computer cycles to help search the key-space for a reward
In 1997, the efforts of distributed.net cracked a DES encryption in 96 days
In 2001, had an estimated throughput of over 30 teraflops
How They Succeeded
They used the combined efforts of 78,000 computers
Users could log on and let their idle cycles be used trying different keys
The person whose computer found the correct key would get $4000 in prize money
This crowd-sourced style of cracking made great strides in raising public outcry over the weakness of DES
EFF
The Electronic Frontier Foundation is a cyberspace civil rights group
Leading crusaders for the need of a new algorithm
First to publicly implement a custom DES breaker
Used this machine to break a cipher in just 56 hours in 1998
How They Succeeded
They built Deep Crack for just under $250,000
1500 “Deep Crack” chips would each search different keys and filter out keys that failed the plaintext check
A head node would periodically retrieve the possibilities from the chips and run the full decryption on them
Over 37,000 search units were involved in the first decryption in 1998
24 search units per Deep Crack chip
In collaboration with distributed.net, the third DES challenge took just 22.5 hours
How It Worked
For each key, the nodes would decrypt the first block and check whether it came out as plaintext
The result is returned as a 64-bit block
If plaintext, it would correspond to 8 characters of ASCII code
Normal English text falls into only 69 ASCII values
Odds of a random key returning 8 bytes of plausible ASCII are just 1/65536 (roughly a 1-in-4 chance per byte)
If this succeeded, try the same with the second block
Odds of all 16 bytes returning as ASCII are just 1 in 4.3 billion (65536 squared)
Any keys that are still possibilities are returned to a central processor that attempts to decrypt the full text
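The plaintext check and its false-positive odds can be sketched as follows. This is an illustration under stated assumptions, not Deep Crack's actual logic: the `PLAUSIBLE` byte set (printable ASCII plus tab, line feed, and carriage return) and the function names are hypothetical stand-ins, and the odds use the 1-in-4 chance per byte implied by the 1/65536 figure.

```python
from fractions import Fraction

# Hypothetical "plausible text" set: printable ASCII plus tab, LF, CR.
PLAUSIBLE = frozenset(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}


def looks_like_text(block: bytes) -> bool:
    """Accept a decrypted block only if every byte is a plausible text byte."""
    return all(b in PLAUSIBLE for b in block)


def false_positive_odds(per_byte: Fraction, n_bytes: int) -> Fraction:
    """Chance that a wrong key yields n_bytes that all pass the check,
    assuming each byte passes independently with probability per_byte."""
    return per_byte ** n_bytes


one_block = false_positive_odds(Fraction(1, 4), 8)    # first 64-bit block
two_blocks = false_positive_odds(Fraction(1, 4), 16)  # first two blocks

print(one_block, two_blocks)
print(looks_like_text(b"Hello, w"), looks_like_text(bytes(8)))
```

Checking a second block squares the false-positive rate, which is why only a handful of candidate keys ever need the full decryption on the central processor.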
Implementing on the Edge Cluster
This approach is perfect for MPI
The Edge computer has the OpenSSL library
Contains many standard encryption techniques
Can expand the DES Cracker to test the survivability of many other algorithms
The key-space is divided among all of the nodes, which report back the possible keys
For theoretical purposes, I kept track of the keys searched to estimate the total time needed
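The division of the key-space among nodes can be sketched as below. This is a minimal illustration, not the code run on Edge: the ranks are simulated in a plain loop rather than with MPI, `toy_decrypt` is a deliberately trivial XOR stand-in for OpenSSL's DES, the key-space is shrunk to 2^16 keys, and all function names are hypothetical.

```python
PRINTABLE = frozenset(range(0x20, 0x7F))


def looks_like_text(block: bytes) -> bool:
    return all(b in PRINTABLE for b in block)


def key_range(rank: int, size: int, total_keys: int) -> range:
    """Contiguous slice of the key-space owned by one rank.
    Any remainder is spread over the lowest-numbered ranks."""
    base, extra = divmod(total_keys, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return range(start, stop)


def toy_decrypt(block: bytes, key: int) -> bytes:
    """Stand-in for DES: XOR with a 2-byte key. NOT a real cipher."""
    kb = key.to_bytes(2, "big")
    return bytes(b ^ kb[i % 2] for i, b in enumerate(block))


def search(rank: int, size: int, total_keys: int, cipher_block: bytes) -> list:
    """Work done by one rank: test every key in its slice, keep survivors."""
    return [k for k in key_range(rank, size, total_keys)
            if looks_like_text(toy_decrypt(cipher_block, k))]


TOTAL_KEYS = 2**16
SECRET_KEY = 0xBEEF
NUM_RANKS = 8
cipher_block = toy_decrypt(b"ATTACK A", SECRET_KEY)  # XOR is its own inverse

# In an MPI program each rank would run search() once, followed by a reduction.
per_rank = [search(r, NUM_RANKS, TOTAL_KEYS, cipher_block) for r in range(NUM_RANKS)]
keys_searched = sum(len(key_range(r, NUM_RANKS, TOTAL_KEYS)) for r in range(NUM_RANKS))
all_candidates = [k for found in per_rank for k in found]
print(keys_searched, len(all_candidates), SECRET_KEY in all_candidates)
```

The toy cipher lets many wrong keys through the filter, far more than DES would, but the structure is the same: independent slices, a local filter, and one final step that gathers counts and candidates.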
Results
Searched 34.3 billion keys
About one two-millionth of the key-space
Almost perfect speedup is achieved
Only communication step is to sum up the counted possibilities and make sure all nodes reported results
Differences in speedup factors are likely due to load balancing issues as the regions are divided
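For reference, speedup and parallel efficiency for a fixed amount of work are usually computed as in the sketch below. The timings are made-up placeholders chosen only to show the arithmetic; they are not measurements from Edge, and the function names are hypothetical.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """How many times faster the parallel run is than the one-PE run."""
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, pes: int) -> float:
    """Speedup divided by the number of PEs; 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / pes


# Placeholder timings (seconds) for the same key count on 1 and 8 PEs.
T1, T8 = 240.0, 32.0
print(speedup(T1, T8), efficiency(T1, T8, 8))
```

A per-PE efficiency close to 1 is what "almost perfect speedup" means here, since the only communication is the final reduction.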
DES Results
[Figure: Time to Run 268 Million Keys. x-axis: Processors (PEs), 8 to 64; y-axis: Seconds]
[Figure: Total Keys per Second. x-axis: PEs, 1 to 64; y-axis: Keys/Second (in millions)]
[Figure: Speedup. x-axis: Processors; y-axis: Speedup]
Implications
A single node is capable of searching roughly 1.1 million keys per second
In comparison, each Deep Crack search unit searched 2.5 million keys per second
Shows the large difference between running DES in specialized hardware vs. software
However, using 64 PEs, over 80 million keys per second are possible
Using an entire 1024 PE Edge partition, roughly 1.2 billion keys could be tested every second
Implications
Using a completely general purpose parallel computer, it is possible to approach the key search speeds Deep Crack was able to achieve
Utilizing an entire Edge partition could crack DES on average in just 9 months
The Edge partition has a total of just 1024 PEs, compared to the 37,000 search units on the original Deep Crack machine
This is still just using non-optimized software versions of the algorithm
OpenSSL focuses on usability, not efficiency
Hardware encryption would still take less than half the time of even optimized software encryption
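The time estimates follow directly from the key rate. The sketch below redoes the arithmetic from the rounded figures on the slides (1.2 billion keys/s for a full Edge partition, 37,000 Deep Crack search units at 2.5 million keys/s each); the function name is hypothetical, and the average case assumes the key is found after searching half the key-space.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY


def search_seconds(key_bits: int, keys_per_second: float, fraction: float = 1.0) -> float:
    """Seconds needed to test the given fraction of a 2**key_bits key-space."""
    return fraction * 2**key_bits / keys_per_second


EDGE_RATE = 1.2e9                 # keys/s, full 1024 PE partition (rounded)
DEEP_CRACK_RATE = 37_000 * 2.5e6  # keys/s, all search units combined

edge_average_years = search_seconds(56, EDGE_RATE, 0.5) / SECONDS_PER_YEAR
edge_worst_years = search_seconds(56, EDGE_RATE) / SECONDS_PER_YEAR
deep_crack_average_days = search_seconds(56, DEEP_CRACK_RATE, 0.5) / SECONDS_PER_DAY

print(f"Edge average: {edge_average_years:.2f} years")
print(f"Edge worst case: {edge_worst_years:.2f} years")
print(f"Deep Crack average: {deep_crack_average_days:.1f} days")
```

With the rounded 1.2 billion keys/s figure the average case comes out slightly under a year, the same order of magnitude as the nine months quoted on the slide, while the hardware machine needs only days.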
Introduction of AES
The Advanced Encryption Standard was brought about largely by the efforts of Deep Crack and Distributed.net
A new algorithm using a much larger key size (128, 192, or 256 bits) was selected through a publicly submitted contest
Much of the controversy that surrounded DES was mitigated by this open process
Implementation on Edge
Algorithm followed the same concept as the DES Cracker
Blocks are twice as many bits, so using 2 blocks is even less likely to be all ASCII by chance than with DES
Returned only 41 possibilities out of 4 trillion keys just by checking the first two blocks
Keys were harder to determine because only a portion of the key is used per round, but that was accounted for in the process
AES Results on the Edge
[Figure: Time to Run 4.3 Billion Keys. x-axis: PEs, 8 to 64; y-axis: Seconds]
[Figure: Total Keys/Second. x-axis: PEs, 1 to 64; y-axis: Keys/Second (in millions)]
[Figure: Speedup. x-axis: PEs; y-axis: Speedup Factor]
AES Expansion to GPGPUs
GPGPUs (General Purpose Graphics Processing Units) offer an exciting opportunity for parallel computing
Consist of many processing cores, each extremely limited in processing power but streamlined for very fast, simple computations
Perfect for simple parallel tasks, such as encrypting files with AES
AES Expansion to GPGPUs
NVIDIA is a leader in scientific computing on GPGPUs
Opened the CUDA language to developers to run on their video cards
Dr. Russ Miller has headed a project to create a supercomputer at the University at Buffalo using NVIDIA cards as the processing power
Named the MAGIC computer
AES Cracker Implementation
For simplicity, I used my personal computer with a CUDA-enabled NVIDIA card as a test subject
Performed on an NVIDIA 9500GT
I was able to find an open-source AES implementation that was suited for a similarly styled AES Cracker
Many optimizations still had to be made to decrypt with many keys per one block, as opposed to many blocks with one key
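The change from "many blocks, one key" to "many keys, one block" can be illustrated with a small Python simulation of CUDA-style thread indexing. This is only a sketch: it is not the CUDA kernel used in the project and not the API of the open-source implementation, `toy_cipher` is a one-byte XOR stand-in for AES, and every function name is hypothetical.

```python
def toy_cipher(block: bytes, key: int) -> bytes:
    """Stand-in for one AES block operation (XOR with one key byte). NOT AES."""
    return bytes(b ^ (key & 0xFF) for b in block)


def global_thread_id(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Same arithmetic as CUDA's blockIdx.x * blockDim.x + threadIdx.x."""
    return block_idx * block_dim + thread_idx


def many_blocks_one_key(blocks: list, key: int) -> list:
    """Conventional GPU use: thread i processes data block i with a shared key."""
    return [toy_cipher(blocks[i], key) for i in range(len(blocks))]


def many_keys_one_block(block: bytes, base_key: int, grid_dim: int, block_dim: int) -> dict:
    """Cracker use: thread i tries key base_key + i against one shared block."""
    results = {}
    for block_idx in range(grid_dim):          # simulated thread blocks
        for thread_idx in range(block_dim):    # simulated threads in a block
            key = base_key + global_thread_id(block_idx, block_dim, thread_idx)
            results[key] = toy_cipher(block, key)
    return results


trial = many_keys_one_block(b"\x00", base_key=16, grid_dim=1, block_dim=4)
print(sorted(trial))
```

In the cracker layout the shared data is the ciphertext block and the per-thread data is the key, so each thread has to derive its own key material instead of reusing one precomputed schedule.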
AES Results on GPGPU
36 Million keys per second on a single GPGPU
By comparison, it took 40 Edge nodes to reach 37 Million keys per second
Extrapolating the numbers, it would take 2.85x10^23 years for a single card to search the entire keyspace
The Edge machine would take 1.1x10^22 years, a savings of just one order of magnitude
GPGPU Supercomputers: MAGIC
The University at Buffalo Cyberinfrastructure Laboratory has an NVIDIA Tesla/Intel Xeon cluster
Hybrid GPGPU/Central Processor
Hierarchy of Dell PE1950s controlling 15 NVIDIA Tesla S1070s
Approximately 57.5 TFLOPS
Total cost of the system was under $100,000
Extension to MAGIC Computer
NVIDIA GeForce 9500 GT – 134.4 GFLOPS
NVIDIA Tesla S1070 – 4147.2 GFLOPS
Each of the 15 nodes is about 30 times faster
Instead of just 36 million keys/second, MAGIC is capable of roughly 16.7 billion keys/second
As GPGPUs become more widespread, speeds will continue to skyrocket as prices begin to plummet
Tesla S2050 cards already reach 5152 GFLOPS
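The extrapolation to MAGIC can be reproduced from the quoted figures. The sketch assumes, optimistically, that the key rate scales linearly with peak GFLOPS and with the number of Tesla nodes; the variable names are illustrative only.

```python
GEFORCE_9500GT_GFLOPS = 134.4
TESLA_S1070_GFLOPS = 4147.2
MEASURED_KEYS_PER_SEC = 36e6   # measured on the single 9500 GT
MAGIC_NODES = 15

# Peak-throughput ratio between one Tesla S1070 and the test card.
per_node_ratio = TESLA_S1070_GFLOPS / GEFORCE_9500GT_GFLOPS

# Assumed linear scaling: per-node ratio times node count.
magic_keys_per_sec = MEASURED_KEYS_PER_SEC * per_node_ratio * MAGIC_NODES

print(f"ratio per node: {per_node_ratio:.2f}")
print(f"estimated MAGIC rate: {magic_keys_per_sec:.3e} keys/s")
```

Peak GFLOPS is only a rough proxy for integer-heavy cipher work, so the result is best read as an upper-bound estimate rather than a measurement.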
Comparison Between Computers
The partition of the Edge computer used consists of 128 dual quad-core nodes
Each node cost upwards of $3500
Total machine cost over $400,000
More expensive partitions also exist
A theoretical limit of just over 9000 GFLOPS for $400,000, compared to MAGIC's 57,500 GFLOPS for just under $100,000
This shows the real potential for GPGPU supercomputing opportunities
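The cost comparison reduces to GFLOPS per dollar. The sketch below uses the rounded figures from the slide (costs are stated there only as "over" and "just under", so the result is approximate), and the function name is hypothetical.

```python
def gflops_per_dollar(gflops: float, cost_dollars: float) -> float:
    """Peak throughput bought per dollar of hardware."""
    return gflops / cost_dollars


edge = gflops_per_dollar(9_000, 400_000)     # Edge partition (rounded)
magic = gflops_per_dollar(57_500, 100_000)   # MAGIC cluster (rounded)
advantage = magic / edge

print(f"Edge:  {edge:.4f} GFLOPS per dollar")
print(f"MAGIC: {magic:.4f} GFLOPS per dollar")
print(f"MAGIC delivers about {advantage:.1f}x more peak GFLOPS per dollar")
```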
Conclusion
DES is certainly no longer secure, as shown by the efforts of Deep Crack and Distributed.net, and the dramatic role GPGPUs will continue to play in the supercomputer market only weakens it further
AES is still a very strong algorithm that is completely infeasible to crack by current measures
Even the MAGIC system would take 6x10^20 years to search the entire keyspace
Continuing advances in GPGPU supercomputing will make attempts at building a successful AES cracker more realistic, but they will not be successful anytime soon
Currently it would take 10^21 GPGPUs to reduce the time to crack to within a single year
Even if the 128-bit key size becomes obsolete, 192- and 256-bit key versions are already in use and can be easily adopted universally
Each added key bit doubles the search space, so these key sizes make a brute force attack exponentially harder
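The effect of key size on exhaustive search can be checked with a few lines of arithmetic. The sketch takes the 16.7 billion keys/s MAGIC estimate from the slides as given; the function names are hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
MAGIC_KEYS_PER_SEC = 16.7e9  # estimate quoted on the slides


def exhaustive_search_years(key_bits: int, keys_per_second: float) -> float:
    """Years needed to try every key of the given size at a fixed rate."""
    return 2**key_bits / keys_per_second / SECONDS_PER_YEAR


def keyspace_growth(from_bits: int, to_bits: int) -> int:
    """How many times larger the key-space becomes when the key grows."""
    return 2 ** (to_bits - from_bits)


for bits in (128, 192, 256):
    years = exhaustive_search_years(bits, MAGIC_KEYS_PER_SEC)
    print(f"AES-{bits}: about {years:.2e} years on MAGIC")

print(f"128 -> 192 bits multiplies the work by {keyspace_growth(128, 192):.2e}")
```

Moving from 128 to 192 bits multiplies the work by 2^64, so no realistic increase in hardware speed closes the gap.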
Thanks to Dr. Russ Miller, Kevin Cleary, and Matt Jones for specifications and costs of the CCR systems