CAPTCHA Processing CPRE 583 Fall 2010 Project

Upload: zaynah

Post on 23-Jan-2016

Page 1: CAPTCHA Processing

CAPTCHA Processing

CPRE 583 Fall 2010 Project

Page 2: CAPTCHA Processing

CAPTCHA Processing: Responsibilities

• Brian Washburn – Loading Image into RAM and Preprocessing and related portion of writeup/presentation

• Nicholas Rundle – Text Detection, related portion of writeup/presentation, and writeup/presentation Assembly

• Daniel Uhrman – Text Recognition and related portion of writeup/presentation

Page 3: CAPTCHA Processing

CAPTCHA Processing: Motivation

The ever-increasing volume of spam e-mail has led to the development of CAPTCHAs to distinguish between humans and computers. Making that distinction is becoming more difficult as computer systems improve, so new CAPTCHA systems that are harder to break with a computer are necessary in order to maintain security. This project aims to break current CAPTCHA systems as a means of showing their inherent weaknesses and motivating improvements to the current designs.

Page 4: CAPTCHA Processing

CAPTCHA Processing: Design

[Diagram: Image → FPGA → Text; example CAPTCHA text: "overlooks"]

Page 5: CAPTCHA Processing

Interface Design

[Diagram: Terminal ↔ (Ethernet) ↔ PowerPC 440 ↔ (Load / Store) ↔ Auxiliary Processor Unit]

There are two main interfaces into the system:

1.) Ethernet to/from the PPC

2.) Loads and Stores between the PPC and the APU

Page 6: CAPTCHA Processing

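File Transfer Protocol

• Client → Server: TCP Connect

• Server → Client: “220”

• Client → Server: “AUTH”

• Server → Client: “234”

• Client → Server: “USER Captcha Group”

• Server → Client: “230”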
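The login exchange above can be sketched as a small command-to-reply table. This is only an illustration: `reply_for` and `REPLIES` are hypothetical names, and the sketch accepts every AUTH and USER command without checking anything, which a real server would not do.

```python
# Reply codes are the ones shown in the exchange above.
GREETING = "220"          # service ready, sent right after the TCP connect

REPLIES = {
    "AUTH": "234",        # security mechanism accepted
    "USER": "230",        # user logged in
}

def reply_for(line):
    """Return the reply code for one client command line (hypothetical helper)."""
    parts = line.split()
    verb = parts[0].upper() if parts else ""
    # 502 (command not implemented) is used for anything outside the table.
    return REPLIES.get(verb, "502")
```

For example, `reply_for("USER Captcha Group")` yields `"230"`, matching the last step of the exchange.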

Page 7: CAPTCHA Processing

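Passive FTP

Participants: Client, Server Control Port, Server Data Port

• Client → Server Control Port: PASV

• Server Control Port → Client: 227 IP, IP, IP, IP, Port, Port

• Client → Server Data Port: Connect to Addr

• Server Data Port → Client: ACK

• Client ↔ Server Data Port: DATA, then Terminate

• Server Control Port → Client: “226 Success”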
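The six numbers in the 227 reply encode the data-port address: the first four are the IP address octets and the last two are the high and low bytes of the port. A minimal sketch of decoding them follows; `parse_pasv_reply` is a hypothetical helper name, and the sample reply text in the usage note is made up.

```python
import re

def parse_pasv_reply(reply):
    """Extract (ip, port) from a '227 ... h1,h2,h3,h4,p1,p2' reply."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError("not a valid 227 reply: %r" % reply)
    h1, h2, h3, h4, p1, p2 = (int(group) for group in match.groups())
    # The port is sent as two bytes: high byte first, then low byte.
    return "%d.%d.%d.%d" % (h1, h2, h3, h4), p1 * 256 + p2
```

With the illustrative reply `227 Entering Passive Mode (192,168,1,10,19,137)`, the client would connect to 192.168.1.10 on port 19 * 256 + 137 = 5001.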

Page 8: CAPTCHA Processing

Features of the Xilinx llwip4 library (Lightweight IP)

• Standard Berkeley model for sockets

– lwip_listen()

– lwip_write()

– lwip_socket() (SOCK_STREAM for TCP)

– lwip_bind()

– lwip_accept()

– read()

– close()
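The calls listed above follow the usual Berkeley order: socket, bind, listen, accept, read, write, close. The sketch below uses Python's standard `socket` module rather than lwIP itself, purely to show that order on a loopback connection. The names `make_listener`, `serve_once`, and `demo` are illustrative and not part of the project.

```python
import socket
import threading

def make_listener(host="127.0.0.1", port=0):
    """socket() + bind() + listen(); port 0 lets the OS pick a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  # SOCK_STREAM = TCP
    listener.bind((host, port))
    listener.listen(1)
    return listener

def serve_once(listener):
    """accept() one client, read a request, write it back, then close()."""
    conn, _addr = listener.accept()
    try:
        data = conn.recv(1024)   # read
        conn.sendall(data)       # write
    finally:
        conn.close()             # close

def demo(message=b"hello"):
    """Run the server in a thread and talk to it from a client socket."""
    listener = make_listener()
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)
    server.join(timeout=5)
    listener.close()
    return reply
```

Calling `demo()` sends a short message through the full accept/read/write/close cycle and returns what the server echoed.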

Page 9: CAPTCHA Processing

lxilKernel library

• Features an easy threading model

• Pthread-like mutexes

[Diagram: FTP Server Thread; Control Port Listen Thread; Process Control Port; Listen Data Port; Process Data Port]
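The thread-plus-mutex pattern can be sketched with Python's `threading` module standing in for the pthread-style API. `SharedImageBuffer`, `receive_chunks`, and `run_two_receivers` are hypothetical names; the slides do not say which data the project's threads actually share.

```python
import threading

class SharedImageBuffer:
    """Byte buffer shared between threads and guarded by a mutex."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = bytearray()

    def append(self, chunk):
        with self._lock:              # lock on entry, unlock on exit
            self._data.extend(chunk)

    def size(self):
        with self._lock:
            return len(self._data)

def receive_chunks(buffer, chunks):
    """Stand-in for a data-port thread storing received bytes."""
    for chunk in chunks:
        buffer.append(chunk)

def run_two_receivers():
    """Start two writer threads and return the final buffer size."""
    buffer = SharedImageBuffer()
    threads = [
        threading.Thread(target=receive_chunks, args=(buffer, [b"ab"] * 100)),
        threading.Thread(target=receive_chunks, args=(buffer, [b"cd"] * 100)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return buffer.size()
```

Each thread appends 100 two-byte chunks, so the buffer ends up holding 400 bytes regardless of how the threads interleave.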

Page 10: CAPTCHA Processing

Captcha Controller

• Our Controller coordinates dataflow between all of our different subsystems

[Diagram: Auxiliary Processor Unit; BRAM; Segmenter; BRAM; BRAM; Classifier]

Page 11: CAPTCHA Processing

Future PPC Work

• The PowerPC can be used for pre-processing

– Noise reduction

– Edge detection

– Color correction

• Also, it could be used to parse the headers of image files and pass this data along coherently

Page 12: CAPTCHA Processing

Segmenter Unit

• Searches columns of the input image for the edges of letters and copies these columns into BRAM.

• For uniformity, each output letter has a fixed size of 32x32 pixels and is right-filled with white pixels.
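The column search described above can be sketched in software as follows. This is an illustration under stated assumptions, not the project's hardware design: pixels are assumed to be binary (0 = white, 1 = black), a letter is assumed to be a run of adjacent columns that contain ink, and the names `segment_columns` and `_copy_tile` are hypothetical.

```python
WHITE, BLACK = 0, 1
TILE = 32  # output letters are 32x32

def _copy_tile(image, x0, x1):
    """Copy columns x0..x1-1 into a white 32x32 tile, left-aligned."""
    tile = [[WHITE] * TILE for _ in range(TILE)]
    for y in range(min(len(image), TILE)):
        for x in range(x0, min(x1, x0 + TILE)):
            tile[y][x - x0] = image[y][x]
    return tile  # untouched cells on the right stay white

def segment_columns(image):
    """Split a binary image (a list of rows) into 32x32 letter tiles."""
    height = len(image)
    width = len(image[0]) if height else 0
    tiles, start = [], None
    for x in range(width + 1):  # one extra step closes a letter at the right border
        has_ink = x < width and any(image[y][x] == BLACK for y in range(height))
        if has_ink and start is None:
            start = x                                   # left edge of a letter
        elif not has_ink and start is not None:
            tiles.append(_copy_tile(image, start, x))   # right edge reached
            start = None
    return tiles
```

Letters wider or taller than 32 pixels are simply cropped in this sketch; how the real Segmenter handles that case is not stated in the slides.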

Page 13: CAPTCHA Processing

Segmenter Unit

[Diagram: input BRAM starting at address 0; output BRAM with segmented letters at address 0 and address 32]

Page 14: CAPTCHA Processing

Segmentation

• Histogram thresholding

• Edge detection

• Region-based

Page 15: CAPTCHA Processing

Classifier Unit

• Receives indication of successful segmentation of up to 8 characters from the Segmenter.

• Reads segmented characters from BRAM.

• Compares each input character to 36 template characters (A-Z and 0-9).

• Outputs an array of up to 8 ASCII values.

Page 16: CAPTCHA Processing

Horizontal Projection

• The segmented characters and template characters are analyzed using HP (horizontal projection).

• The HP is determined by calculating the sum of each horizontal row of pixel values for an image.

• For our 32x32 pixel images, the HP values will be arrays of size 32 containing sums of up to 32 in each position.
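A minimal sketch of the horizontal projection, assuming each pixel is stored as 0 (white) or 1 (black) so that a row sum can be at most 32. The function name `horizontal_projection` is illustrative.

```python
def horizontal_projection(tile):
    """Return one value per row: the sum of that row's pixel values."""
    return [sum(row) for row in tile]
```

For a 32x32 tile this returns a list of 32 integers, each between 0 and 32.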

Page 17: CAPTCHA Processing

Classifier Template BRAM

• The expected HP values are pre-calculated for each template character.

• These values are stored in a ROM made in a BRAM IP core that is preconfigured with a .COE file.

• The input images from the segmenter are read from BRAM and compared to each of the template characters to find the best match.

Page 18: CAPTCHA Processing

Correlation Algorithm

• The HP values are compared using the correlation function from statistics, where X and Y are the HP values for an input image and a given template, and N is the length of the HP array.
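The equation itself is not present in this transcript. Assuming the slide showed the standard Pearson correlation coefficient, which matches the symbols X, Y, and N defined here, it can be written as:

```latex
r = \frac{N \sum_{i=1}^{N} X_i Y_i \;-\; \sum_{i=1}^{N} X_i \sum_{i=1}^{N} Y_i}
         {\sqrt{\left( N \sum_{i=1}^{N} X_i^2 - \Bigl( \sum_{i=1}^{N} X_i \Bigr)^{2} \right)
                \left( N \sum_{i=1}^{N} Y_i^2 - \Bigl( \sum_{i=1}^{N} Y_i \Bigr)^{2} \right)}}
```

Here r lies between -1 and 1, and a value near 1 means the input projection has nearly the same shape as the template projection.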

Page 19: CAPTCHA Processing

Correlation Algorithm Cont’d

• Due to the following constraints, we went with a modified form of the correlation equation:

– No IP Core for floating point conversion in version 10.1 of the tools.

– No IP Core for an integer-based square root function.

– Potential overflows as a result of large summations and multiplications.

• Implemented as 16 dedicated multipliers, 1 larger-width multiplier, and 1 dedicated divider.
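The modified equation is not shown in this transcript. One way to satisfy the listed constraints, offered here only as an assumption, is to compare the squared correlation: squaring removes the square root, and scaling the numerator before an integer division removes the need for floating point. The names `scaled_squared_correlation`, `best_match`, and the `scale` parameter are hypothetical.

```python
def scaled_squared_correlation(x, y, scale=1024):
    """Integer-only score: sign(num) * scale * num^2 // (dx * dy)."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = n * sxy - sx * sy          # numerator of the correlation
    dx = n * sxx - sx * sx           # the two factors under the square root
    dy = n * syy - sy * sy
    if dx == 0 or dy == 0:
        return 0                     # a constant projection has no shape to match
    score = (scale * num * num) // (dx * dy)   # single integer division
    return score if num >= 0 else -score       # keep the sign lost by squaring

def best_match(hp, templates):
    """Return the key of the template HP that scores highest against hp."""
    return max(templates, key=lambda ch: scaled_squared_correlation(hp, templates[ch]))
```

Python integers do not overflow, so this sketch sidesteps the third constraint; in hardware the intermediate products would need wide enough registers. With a template table keyed by character, `ord(best_match(hp, templates))` gives the ASCII value for one segmented letter.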

Page 20: CAPTCHA Processing

Potential Future Work

• Implement “learning” functionality in classifier so that the template ROM is actually a RAM and can be updated based upon CAPTCHA techniques it observes.

• Utilize CAPTCHA Detection Unit for name recognition from security badges, or license plate identification on speed cameras.

Page 21: CAPTCHA Processing

Integration

• In its current form, the project works fully in ModelSim with various test inputs.

• In HW, the project works all the way up to the classifier. The classifier unit has many multipliers and uses a pipelined divider, which is a potential source of timing problems. We are adding pipeline stages to account for these timing issues.

Page 23: CAPTCHA Processing

CAPTCHA Processing: Papers

• Algorithm to Break Visual CAPTCHA (ICETET 2009)

• Bio-inspired unified model of visual segmentation system for CAPTCHA character recognition (SiPS 2008)

• CAPTCHA Security: A Case Study (Security & Privacy July 2009)

• Recognizing objects in adversarial clutter: breaking a visual CAPTCHA (Computer Vision and Pattern Recognition 2003)

• Reverse Engineering CAPTCHAs (WCRE 2008)