
NON-STANDARD CAPTCHAS FOR THE WEB:

A MOTION BASED CHARACTER RECOGNITION HIP

A dissertation submitted to the University of Manchester

for the degree of Master of Science

in the Faculty of Engineering and Physical Sciences

2011

By Ivo Kund

School of Computer Science

Contents

Abstract

Declaration

Copyright

Acknowledgements

1 Introduction
  1.1 Definition of Captchas
  1.2 The Problem
  1.3 Motivation and Significance
  1.4 Structure of the Document

2 Background
  2.1 Evolution of Captchas
    2.1.1 Conception
    2.1.2 Text Based Captchas
    2.1.3 Image Based Captchas
    2.1.4 Other Types
    2.1.5 Recent Research and Motion-based Captchas
  2.2 Usability vs Security
    2.2.1 Captcha Usability
    2.2.2 Breaking Captchas
  2.3 Image and Motion Analysis
    2.3.1 Image Segmentation
    2.3.2 Moving Pattern Recognition
  2.4 Chapter Summary

3 Research Methodology
  3.1 Objective
  3.2 Scope of Investigation and Limitations
  3.3 Deliverables
  3.4 Research Plan
    3.4.1 Literature Review
    3.4.2 Developing a New Captcha Concept
    3.4.3 Engineering a Prototype
    3.4.4 Tuning Parameters
    3.4.5 Reflection
  3.5 Chapter Summary

4 Captcha Design
  4.1 Introduction
  4.2 Design Considerations
    4.2.1 Defence Against Known Methods
  4.3 The Captcha Programme
    4.3.1 Choice of Languages and Formats
    4.3.2 Input Parameters
    4.3.3 Control Flow
    4.3.4 Description of Main Classes
    4.3.5 Characters
  4.4 Chapter Summary

5 Test Design
  5.1 Introduction
  5.2 Methodology
    5.2.1 Individuals
    5.2.2 Evaluation, Reproduction and Termination
  5.3 Implementation of the Testing Programme
    5.3.1 The Testing Environment
    5.3.2 The Preview Environment
    5.3.3 Other Environments
    5.3.4 Database Schema
  5.4 Chapter Summary

6 Test Results
  6.1 Overview
  6.2 Acquired Figures
  6.3 Islands and Individuals
  6.4 Chapter Summary

7 Discussion
  7.1 Critical Analysis of the Captcha Design
  7.2 Critical Analysis of the Test Design
  7.3 Security vs Usability – Analysis
  7.4 Chapter Summary

8 Summary
  8.1 Future Work

References

A Contents of the Attached CD

B Software Requirements

C Finding Materials On-line

D List of Used 3rd Party Libraries

E Group Rotation Procedure

F Database Schema (User Tests)

G Parameters of Fittest Individuals

Final word count: 25,306 (all pages)
Original programme code: 162 KB (equiv. to ≈ 10,000 words)

List of Tables

2.1 Commercial captcha-breaking services

3.1 List of effects implemented in the captcha prototype

4.1 Input parameters used to generate particles and effects
4.2 Characters used by the captcha

5.1 Parameters included in the genetic algorithm
5.2 Parameters expected from the HTTP POST request

6.1 Overview of the generated islands and individuals

7.1 Replies from commercial captcha-breaking services

G.1 Best parameter values from test results

List of Figures

2.1 An AltaVista captcha
2.2 A Gimpy captcha
2.3 PessimalPrint, BaffleText and ScatterType captchas
2.4 A reCAPTCHA captcha
2.5 A SemCAPTCHA captcha
2.6 A captcha from QRBGS
2.7 An ARTiFACIAL captcha
2.8 A captcha implementing ZKPFP
2.9 Snapshot of a video demonstrating the emerging images principle

3.1 Illustration of the research work flow
3.2 An early draft of the captcha
3.3 Illustration of the problem of usability vs security

4.1 A captcha showing characters “C”, “A” and “F”
4.2 Flowchart of initial actions performed by the captcha
4.3 Flowchart of the process that generates captchas
4.4 A letter “B” with no rotation and rotation of π/8
4.5 Class diagram of the Particle Container interface
4.6 Class diagram of the Particle class
4.7 Diagram of the abstract Character class
4.8 Characters “A” and “B” without any effects applied
4.9 Snapshot of the character drawing tool

5.1 Principal steps common to most genetic algorithms
5.2 Illustration of the problem of local maxima
5.3 Components of the testing software
5.4 Snapshot of the testing software user interface
5.5 Diagram of the individual selection process
5.6 Diagram of the user response handling process 1
5.7 Diagram of the user response handling process 2
5.8 Control flow of the maintenance script

6.1 Participation during the testing period
6.2 Statistical analysis of the fitness function
6.3 Distribution of ratings and test durations
6.4 Distribution of the accuracy of tests
6.5 Fitness scores of all individuals on each island
6.6 Snapshots of the fittest individuals on each island
6.7 Parameters of the fittest individuals on each island
6.8 Average fitness of islands over 9 generations
6.9 Average accuracy of results for last two generations

7.1 Time requirements for different captcha generation processes
7.2 Combined snapshot of all frames in the captcha animation

Abstract

CAPTCHAs have become increasingly valuable security measures on the Web during the last decade. They are mainly used to restrict automated access to certain areas of web applications, usually to prevent automated posting of spam.

As spamming robots get smarter, CAPTCHAs get more complicated and also harder for humans to solve. This has resulted in today’s situation, where text CAPTCHAs are often illegible to humans and yet can be broken very efficiently (many popular CAPTCHAs with 99% accuracy).

The purpose of this project was to develop a novel motion based character recognition CAPTCHA that would differ greatly from current alternatives and provide a practically usable level of security and usability.

The developed CAPTCHA uses modern Web technology to generate a smooth animated character recognition challenge. The CAPTCHA is inspired by the Zero-Knowledge Per Frame Principle, by which no information can be extracted from a single frame of an animation.

To find optimal usability settings for the CAPTCHA, a Multi-Modal Interactive Genetic Algorithm was created. It used subjective human ratings to find out which set of parameters generates the most easily solvable challenges.

The testing took place on-line; 2716 results were submitted and 924 different CAPTCHAs were generated. The resulting CAPTCHAs were tested again to find a plausible usability rate, which turned out to be 82%.

To assess the security of this CAPTCHA, 6 commercial CAPTCHA-breaking services were contacted. Two did not reply; the other four responded that they are unable to provide services for this kind of CAPTCHA.

An evaluation suggests that these results are enough to compete with current alternatives.

Declaration

No portion of the work referred to in this dissertation has been submitted in support of an application for another degree or qualification of this or any other university or other institute of learning.

Copyright

i. The author of this thesis (including any appendices and/or schedules to this thesis) owns certain copyright or related rights in it (the “Copyright”) and s/he has given The University of Manchester certain rights to use such Copyright, including for administrative purposes.

ii. Copies of this thesis, either in full or in extracts and whether in hard or electronic copy, may be made only in accordance with the Copyright, Designs and Patents Act 1988 (as amended) and regulations issued under it or, where appropriate, in accordance with licensing agreements which the University has from time to time. This page must form part of any such copies made.

iii. The ownership of certain Copyright, patents, designs, trade marks and other intellectual property (the “Intellectual Property”) and any reproductions of copyright works in the thesis, for example graphs and tables (“Reproductions”), which may be described in this thesis, may not be owned by the author and may be owned by third parties. Such Intellectual Property and Reproductions cannot and must not be made available for use without the prior written permission of the owner(s) of the relevant Intellectual Property and/or Reproductions.

iv. Further information on the conditions under which disclosure, publication and commercialisation of this thesis, the Copyright and any Intellectual Property and/or Reproductions described in it may take place is available in the University IP Policy (see http://www.campus.manchester.ac.uk/medialibrary/policies/intellectual-property.pdf), in any relevant Thesis restriction declarations deposited in the University Library, The University Library’s regulations (see http://www.manchester.ac.uk/library/aboutus/regulations) and in The University’s policy on presentation of Theses.

Acknowledgements

Firstly, I would like to thank my good friend Kristjan Korjus for being an invaluable source of inspiration and ideas, and Oliver Kund for his timely assistance.

This project would not have been possible without the good people who helped by solving captchas, despite the headaches and potential seizures caused by staring at a blinking screen for hours. Thank you.

I would like to thank my supervisor Ke Chen for guidance and all my good friends for putting up with the boring me during this summer. I apologise.

I would also like to thank my parents, whose support and love has got me where I am now. It is hard to imagine where I would be without you.

And finally, I would like to take the opportunity and thank the Ethiopian shepherd Kaldi who, as legend goes, accidentally discovered coffee beans in around 800 AD.

This project is dedicated to my mother Külli, who despite everything managed to raise a reasonably okay son. Thank you.

Chapter 1

Introduction

When Alan Turing first proposed the problem of distinguishing humans from machines in his paper from 1950 (Turing, 1950), the problem seemed to be rather hypothetical and far from any practical application. However, a lot has changed since then, as we are today faced with the challenge of beating ever-evolving computers in the game of being human.

As technology and artificial intelligence advance, there are fewer and fewer things that separate us humans from machines and that can be tested with simple Turing tests. In the beginning it was our ability to understand text. Then it was our ability to reason and answer questions. Then our ability to recognise distorted characters. If current CAPTCHAs fail, what will come next? There is a tremendous architecture on-line that relies on such tests.

However inevitable this may sound, we need to keep trying to find answers to this question. Who knows, perhaps we will discover something new about ourselves on the way.

1.1 Definition of Captchas

CAPTCHAs, also called Human Interaction Proofs (HIPs), are tests that are usually presented to users of the Internet during registration, commenting or similar procedures as challenges that are meant to be very difficult for computers, but relatively easy for humans to accomplish. This is usually done to prevent spammers (typically automated computer programs) from creating bogus user accounts, posting advertisements or otherwise exploiting the system.

The acronym CAPTCHA stands for Completely Automated Public Turing test to tell Computers and Humans Apart. There are two noteworthy aspects of this definition. First, being completely automated means that the algorithm generating the tests must not use any kind of fixed set of source information. Secondly, being public means that the source code of the algorithm and any data used as input (apart from a small amount of randomness) should be publicly available to everyone (Von Ahn et al., 2004).

Von Ahn et al. (2004) argue that being public and thus open source is a simple way of distinguishing real CAPTCHAs (based on hard Artificial Intelligence (AI) problems) from those whose security relies on a secret that might be obtained via reverse engineering or hacker attacks.

Nevertheless, many commercial CAPTCHA products, including reCAPTCHA (originally created by Luis von Ahn), choose not to open their source code for competitive reasons, arguing that “If we released our code and data, computers would not do significantly better” (reCAPTCHA Team, 2011).

There are currently four main types of CAPTCHA designs: the most widely adopted text-based[1], the also popular image-based, the relatively new video-based and the audio-based scheme (Jeng et al., 2010). Please note that this categorisation only concerns what is being presented to the user, not what medium is being used; e.g. text-based CAPTCHAs are usually presented using images.

For the sake of legibility and clarity, the term CAPTCHA will from now on be written as captcha (using lowercase spelling).

1.2 The Problem

There are two major issues involved in the design and use of captchas – usability and security, which are usually inversely related. Good usability typically means that a captcha is easily solvable by both humans and computers. On the other hand, more secure implementations tend to pose problems for computers and human solvers alike.

Numerous papers have been published both on how to make captchas more secure and on how to break existing ones (see the next chapter). Numerous commercial services also exist that provide different captcha-related services. These include captcha Application Programming Interfaces (APIs) and also captcha-breaking services.

[1] Also called Image-Based Text-Recognition CAPTCHA


Text-based captcha, being the most popular of current captcha schemes, has received most of this interest and is consequently reaching its limits in maintaining usability while remaining secure. In effect, this means that the AI problem behind text-based captchas is not hard enough any more. Research from recent years has shown breaking accuracy of up to 99% on many widely used real-world captchas (Hindle et al., 2008) and 23% on the popular reCAPTCHA service, which provides 100 million challenges daily (Google Inc., 2011; Wilkins, 2009).

While one might think of such a tendency as troubling, it is also important to notice the enormous contributions of this research to the AI community, specifically to fields like Optical Character Recognition (OCR). Von Ahn et al. (2004) argue: “we want our CAPTCHAs to be broken! . . . this is also why we want several different kinds of CAPTCHAs: the more CAPTCHAs that get broken, the more AI problems that get solved”.

All this calls for alternative captcha methods to be devised. Schemes based on image recognition and videos have already shown some success. A new captcha type that has appeared in recent years, combining text recognition and motion, is the main interest of this research.

1.3 Motivation and Significance

Captchas have become the main defence mechanism against spam during the last decade. Von Ahn et al. (2003) pointed out services, such as online polls and e-mail, for which protection from spammers is most essential. It would be damaging to any service if malicious users could create thousands of user accounts per second just to send out junk e-mail or post spam. Preventing Denial of Service (DoS) attacks has also been suggested as an application for captcha services (Morein et al., 2003).

Yan (2009) pointed out that different types of computer game platforms also have problems distinguishing real players from cheating robots. For example, aiming robots can be used in First Person Shooter (FPS) games to give a player much better performance than other, human players. These, however, will not be part of this research, as it focuses on visual captchas for the Web that can be conveniently presented over the Internet.

It must be noted, however, that a permanent and perfect solution to the captcha problem most likely does not exist. Von Ahn et al. (2004) have stated: “to say that a computer program can never pass a captcha is to say that there exists something humans can do, but computers cannot”. They believe that “. . . some day computers will be as good as humans, and possibly better, in every respect”.

Considering this, the best course of action at the moment is to come up with new and innovative types of captchas that cannot be broken using current technology and that, when eventually broken, will hopefully advance research in artificial intelligence.

1.4 Structure of the Document

This document is divided into three parts, each equally important and complementary to one another.

The first, introductory part contains the first three chapters of this dissertation. We have started with an introduction to the project, which is followed by a fairly in-depth background investigation of the captcha phenomenon. This is followed by a chapter describing some formal approaches and methodology used in the project.

The second part contains two chapters and is about the development of a new captcha. Chapter 4 describes how the captcha programme was developed, and Chapter 5 how the Genetic Algorithm and the testing environment were designed.

The final, third part is about results and reflection. It contains three chapters: the first presents test results and the second analyses the achievements made and problems encountered. The last chapter concludes the project and describes work that could not be done in the time available and should be done in the future.

Chapter 2

Background

An overview of different aspects of the captcha phenomenon is presented in this chapter. We start with an overview of existing captcha types and then look at some usability and security concerns inherent in most popular captchas. Finally, before concluding, an overview of current image and motion analysis techniques is given.

2.1 Evolution of Captchas

This section gives an overview of the different kinds of captchas that have been developed so far. We start with a brief historical overview and then look at text based and image based captchas in more detail. We also cover some alternative methods before turning to more recent motion based captcha research.

2.1.1 Conception

Naor (1997) was the first to describe the theoretical aspects of a mechanism that could be considered a modern Human Interaction Proof (HIP). He pointed out some fundamental features of such HIPs that have nowadays become essential and self-evident elements in captcha engineering. He also described some early ideas for such tests, such as gender recognition from images, speech recognition, handwriting recognition and more. Naor also pointed out that such tests could be used to encourage research in the AI areas that the HIP problems are based on.


The first practical captcha was created by the AltaVista search engine in 1998 (Von Ahn et al., 2004). This was a text-recognition captcha with slightly rotated letters and an obfuscated background (see Figure 2.1), protecting the website’s URL submission functionality, which had become a target for spammers. The submitted patent also describes methods of generating audible HIPs (Lillibridge, 2001).

Figure 2.1: The original AltaVista captcha. Enhanced image from Baird and Popat (2002)

Yahoo! was having a similar problem with their online chat room – automated computer programs joined chat sessions and invited real users to advertising sites. In 2000, Udi Manber from Yahoo! approached some researchers (including the captcha pioneers von Ahn, Blum and Langford) at Carnegie Mellon University with this problem (Coates et al., 2001). This eventually led to the publication of the original papers on captchas (Von Ahn et al., 2003, 2004), the coining of the term “captcha” and the creation of Gimpy – a HIP generator that used more complicated image warping techniques.

Since then, a great many different captcha implementations have appeared. Character-recognition text captchas have to this date remained the most popular type, even though they have gone through an intense cycle of breaking and improvement. Developments in text and image based captchas are discussed in the following two sections.

2.1.2 Text Based Captchas

This section gives an overview of noteworthy text-recognition captchas that have been developed or envisioned so far (mostly in the last decade). In addition to more traditional captchas, we also look at some interesting variants that apply more innovative approaches. Almost all text captchas use images to display text.

Early Pioneers

Gimpy (mentioned before), developed by Von Ahn et al., proved to be too difficult for human users (Ferzli et al., 2006). It asks the user to identify some of many words, several of which are overlapping (see Figure 2.2). EZ-Gimpy, which only requires one word, was developed as an easier alternative and was employed by Yahoo! until 2003. Although this approach proved to be quite successful at the start, both EZ-Gimpy and Gimpy were broken by Mori and Malik (2003), with accuracy of 92% and 33% respectively.

Figure 2.2: An image produced by Gimpy, considered too difficult for real-world use. Enhanced image from Von Ahn et al. (2004)

PessimalPrint was among the earliest text captchas (Coates et al., 2001). It used many image-degradation methods to simulate physical defects caused by scanning or copying of printed text (see Figure 2.3). However, it also used a fixed lexicon of only 70 words, which made it extremely vulnerable to brute force attacks.

BaffleText (Chew and Baird, 2003) uses a random typeface and randomly generated English-pronounceable (but not actual) words to generate the initial image. Then a second image, consisting of randomly appearing filled shapes, is created and added[1] to the text image. An experiment conducted by the authors found the legibility of BaffleText to be 79%.
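The pixel-wise combination of a text image with a shape mask can be illustrated with a minimal sketch. The bitmap representation (nested lists of 0 and 1), the name `combine` and the reading of NOT-AND as "text AND NOT mask" are assumptions made for illustration only; they are not taken from Chew and Baird (2003).

```python
from typing import Callable, Dict, List

Bitmap = List[List[int]]  # 1 = ink, 0 = background (monochrome image)

# Pixel-wise boolean operations on (text pixel, mask pixel).
# Assumption: NOT-AND is interpreted as "text AND (NOT mask)".
OPERATIONS: Dict[str, Callable[[int, int], int]] = {
    "OR": lambda t, m: t | m,
    "NOT-AND": lambda t, m: t & (1 - m),
    "XOR": lambda t, m: t ^ m,
}


def combine(text: Bitmap, mask: Bitmap, op: str) -> Bitmap:
    """Apply the chosen boolean operation to every pair of pixels."""
    pixel_op = OPERATIONS[op]
    return [
        [pixel_op(t, m) for t, m in zip(text_row, mask_row)]
        for text_row, mask_row in zip(text, mask)
    ]


if __name__ == "__main__":
    text = [[1, 1, 0, 0]]
    mask = [[1, 0, 1, 0]]
    for name in OPERATIONS:
        print(name, combine(text, mask, name))
```

Under this reading, OR adds the shapes to the text, NOT-AND erases the text wherever a shape lies, and XOR inverts the text under each shape.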

ScatterType, described by Baird and Riopka (2005), is a captcha that focuses on making the character segmentation part of captcha-breaking very difficult. It uses randomly selected characters and typefaces to draw letters, which are then fragmented using horizontal and vertical cuts; the fragments are then drifted apart so that they no longer resemble real characters. Unfortunately, this approach also causes great difficulties for humans, as the overall average human legibility was determined to be only 53%.

[1] Here adding can actually mean three types of boolean operations (since all images are monochrome): OR, NOT-AND and XOR

Figure 2.3: Three examples of PessimalPrint (left), BaffleText (middle) and ScatterType (right) captchas. Images from Coates et al. (2001); Chew and Baird (2003); Baird and Riopka (2005) respectively.

reCAPTCHA

Certainly one of the most popular captcha services is reCAPTCHA, proposed and created by Von Ahn et al. It is a good example of how the resources (time and human brain power) put into solving captchas can actually be used to benefit the world. reCAPTCHA uses texts scanned from old books and newspapers as input and presents two words to the user – one control word (whose transcription is known) and one word that two different OCR programs have failed to recognise (whose transcription is not known). Of course, the words do not appear as they were scanned from the book – complex distortions are applied before presenting them to the user (see Figure 2.4).

Figure 2.4: A reCAPTCHA challenge, as of July 2011. Image from http://google.com/recaptcha

There are algorithms that allow unknown words to become known words if enough people transcribe them in the same way. This results in approximately 4 million (previously unknown) words per day being transcribed (Von Ahn et al., 2008). The authors suggest that the key success factors in reCAPTCHA are distortions (natural fading) caused by the age of the books and visual noise introduced in the scanning process. reCAPTCHA also comes with an audio captcha variant.

Although reCAPTCHA is currently the captcha recommended by Von Ahn et al. and serves around 100 million challenges daily (Google Inc., 2011), a breaking rate of 23% has been demonstrated in attacks against it (Wilkins, 2009). reCAPTCHA was acquired by Google Inc. in 2009 and is now part of services like Google Books and Google News Archive Search (Google Inc., 2009).
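The idea of promoting unknown words to known words once enough people agree on them can be sketched roughly as follows. This is only an illustration of the principle: the class name `WordPool`, the identifiers and the threshold of three matching votes are hypothetical, and the actual reCAPTCHA algorithm is not described here and is likely to differ.

```python
from collections import Counter, defaultdict
from typing import Dict


def normalise(word: str) -> str:
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return word.strip().lower()


class WordPool:
    """Tracks known words and collects votes for unknown ones."""

    def __init__(self, known: Dict[str, str], votes_needed: int = 3) -> None:
        self.known = {word_id: normalise(text) for word_id, text in known.items()}
        self.votes = defaultdict(Counter)  # unknown word id -> answer counts
        self.votes_needed = votes_needed   # hypothetical agreement threshold

    def submit(self, control_id: str, control_answer: str,
               unknown_id: str, unknown_answer: str) -> bool:
        """Return True if the user passes; record a vote only in that case."""
        if self.known.get(control_id) != normalise(control_answer):
            return False  # control word wrong: reject and discard the vote
        guess = normalise(unknown_answer)
        self.votes[unknown_id][guess] += 1
        if self.votes[unknown_id][guess] >= self.votes_needed:
            # Enough users agree: the unknown word becomes a known word.
            self.known[unknown_id] = guess
            del self.votes[unknown_id]
        return True


if __name__ == "__main__":
    pool = WordPool({"c1": "morning"}, votes_needed=2)
    pool.submit("c1", "morning", "u1", "upon")
    pool.submit("c1", "Morning", "u1", "Upon")
    print(pool.known)
```

The user is judged only on the control word, so the answer for the unknown word costs nothing extra to verify; it simply accumulates as a vote.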

Semantic Approaches

SemCAPTCHA (Lupkowski and Urbanski, 2008) is an interesting captcha implementation that adds a semantic element to the otherwise primitive task of recognising and typing letters. It presents the user with a distorted image in which three words can (ideally) be recognised (see Figure 2.5). All three are names of animals, and the user is asked to click on the name that does not fit in, i.e. differs from the rest. The difference can, for instance, be that one animal is a mammal whereas the others are reptiles. To make the task easier for humans, each challenge is preceded by a brief (70 ms) exposure of a fourth word that is semantically associated with the correct answer. This word can, for instance, be milk if the correct animal to pick is cow.

Figure 2.5: A SemCAPTCHA challenge showing 3 Polish words (“duck”, “cuckoo” and “cow”). Image from Lupkowski and Urbanski (2008)

Unfortunately, no real experiments have yet been conducted on the security of SemCAPTCHA. One could argue that if a mouse click is the only input required from the user, then a relatively simple brute force attack could be used to break the captcha. What is more, the authors show that modern OCR software was able to identify the distorted words in ≈12% of cases[2]. This, combined with the possibility of dictionary attacks (all animal names are known), poses even more problems for the security of the captcha.
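The brute force argument can be made concrete with a short calculation. The sketch below assumes that an attacker picks uniformly at random among equally likely options and that attempts are independent and not rate-limited; the function name is hypothetical.

```python
def blind_guess_success(options: int, attempts: int = 1) -> float:
    """Probability that at least one of `attempts` uniform random guesses is right."""
    if options < 1 or attempts < 0:
        raise ValueError("options must be >= 1 and attempts must be >= 0")
    # Each guess fails with probability (1 - 1/options); all must fail to lose.
    return 1.0 - (1.0 - 1.0 / options) ** attempts


if __name__ == "__main__":
    # Three clickable words, as in the challenge described above.
    print(f"3 options, 1 attempt:   {blind_guess_success(3):.3f}")
    print(f"3 options, 5 attempts:  {blind_guess_success(3, 5):.3f}")
    # A fixed lexicon of 70 words, as reported for PessimalPrint.
    print(f"70 options, 1 attempt:  {blind_guess_success(70):.3f}")
```

With only three clickable words, a single blind click already succeeds one time in three, which is why a click-only answer space is so much weaker than a typed word.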

[2] Unfortunately it is not known whether one or three words were tested at a time

Another semantic approach to captchas is SS-CAPTCHA, proposed by Yamamoto et al. (2010). SS stands for Strangeness in Sentence, and the captcha tries to exploit the human ability to distinguish natural-sounding sentences from those created by automatic translation software. The biggest concern raised by the authors, among others, was insufficient usability, which would render the captcha impractical.

Similar work was done by Bergmair and Katzenbeisser (2004), who proposed exploiting the Word Sense Disambiguation problem to distinguish humans from computers. The captcha would pick a sentence and then generate many versions of it, each time substituting the same word with one of its synonyms. The user would then be asked to select “sentences that are meaningful replacements of each other”. Although it is an interesting approach, it would probably pose some usability problems, especially for non-native speakers.

Other Work on Text-Captchas

Gupta et al. (2009) proposed a method of embedding a number in each of the generated letters in the text captcha. The user would then have to enter the characters in the order specified by the numbers. The biggest concerns regarding this captcha seem to be very poor usability and a low recognition rate of the numbers by human users (Raj et al., 2009).
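How the expected answer follows from the embedded numbers can be shown with a small sketch. The representation of a challenge as (character, number) pairs and both function names are assumptions for illustration; they are not part of the scheme as published by Gupta et al. (2009).

```python
from typing import List, Tuple

TaggedChar = Tuple[str, int]  # (displayed character, embedded sequence number)


def expected_answer(tagged_chars: List[TaggedChar]) -> str:
    """Return the characters ordered by their embedded sequence numbers."""
    ordered = sorted(tagged_chars, key=lambda pair: pair[1])
    return "".join(char for char, _ in ordered)


def verify(tagged_chars: List[TaggedChar], response: str) -> bool:
    """Accept the response if it matches the expected order, ignoring case."""
    return response.strip().upper() == expected_answer(tagged_chars).upper()


if __name__ == "__main__":
    # Characters as displayed left to right, each with its embedded number.
    challenge = [("K", 3), ("A", 1), ("T", 2)]
    print(expected_answer(challenge))
    print(verify(challenge, "atk"))
```

A solver therefore has to recognise both the letters and the small digits inside them, which is consistent with the usability concerns mentioned above.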

An interesting mixture of text and image captcha was proposed by Shirali-Shahreza and Shirali-Shahreza (2007). It presents a question to a user, in whichsome words have been replaced with images. For instance: “There are 3 numbersof a, 3 numbers of b and 2 numbers of c on a table. How many fruits are thereon the table in total?” In this sentence, a, b and c would be presented as images.This approach is unfortunately subject to very simple brute force attacks (sinceonly one number is required as an answer).

Quantum Random Bit Generator Service (Stevanović, 2011) uses a very in-teresting approach of asking users to solve fairly complex maths equations asa captcha test to be able to register with the website (see Figure 2.6). Thiswas shown to be a relatively ineffective against attacks by Hernandez-Castro andRibagorda (2010).

Other approaches include Teabag 3D (OCR Research Team, 2006) – a captchathat renders text challenges in 3D space (the project seems to be abandoned now).Chow et al. (2008) proposed the idea of making captchas clickable for usability


Figure 2.6: A captcha used at Quantom Random Bit Generator Service. Imagefrom http://random.irb.hr

reasons. Their captcha would present many text challenges to the user and askthe user to click on ones that represent English words. BotDetect (Lanapsoft,Inc, 2011) is a commercial captcha that comes in many image and sound styles.JCAPTCHA (Veret, 2009) is a similar open source alternative written in Java.Others include Securimage3, WebSpamProtect4, Cryptographp5 and others.

There are many more text captcha services available, but to my knowledge the most interesting and noteworthy have been covered in this chapter.

2.1.3 Image Based Captchas

Image based captchas, in contrast to text captchas, usually do not require the user to read any text. Instead, the user is expected to perform an image recognition task, i.e. recognise an object or idea from a picture. This is usually combined with the task of grouping objects with similar properties together. The key element in image captchas – the ability of humans to understand more from a picture than can be extracted automatically – has been termed the semantic gap (Smeulders et al., 2000).

Chew and Tygar (2004) were the first to conduct any serious research into image captchas. They proposed some early ideas of tasks that could be used with image HIPs, for example the naming captcha, which asks the user to write a word that describes all the pictures presented. They also identified issues such as misspelling, synonymy and polysemy that could render such approaches problematic. They used PDictionary and Google Image Search to obtain the dictionary and source images.

Von Ahn et al. (2004) also proposed two image captchas. Bongo is a captcha involving pattern recognition tasks with simple shapes. PIX presented six images of the same object to the user and asked what is being depicted in the pictures. Von Ahn and Dabbish (2004) proposed an image labelling game called ESP, which later resulted in a newer version of PIX, ESP-PIX, being developed.

³ http://www.phpcaptcha.org/
⁴ http://webspamprotect.com/
⁵ http://www.captcha.fr/

KittenAuth⁶ is an image captcha that asks the user to select pictures of cats out of all nine pictures presented. HumanAuth⁷ uses a similar approach – the user is asked to distinguish between artificial and natural objects. HotCaptcha was an infamous captcha (the project seems to be abandoned now) that made use of the database at hotornot.com. It presented pictures of nine people to the user and asked them to pick the three that look hot.

Asirra (Elson et al., 2007), developed by Microsoft, is a service analogous to KittenAuth; it uses pictures from Petfinder.com as a source database. Hernandez-Castro et al. (2009) investigated the security aspects of Asirra and HumanAuth, concluding that these approaches are vulnerable to side-channel attacks and stating that “their security estimates are too optimistic”. Fritsch et al. (2010) reached similar conclusions when evaluating HumanAuth.

The Imagination captcha (Datta et al., 2005) shows the user a composition of 8 pictures and asks them to click near the geometric centre of any one of them. The user is then presented with another picture and is asked to choose, from a set, a word describing this picture. Thus, solving the captcha requires only two mouse clicks.

Figure 2.7: An image generated by ARTiFACIAL captcha. Image from Rui and Liu (2004)

Rui and Liu (2004) proposed ARTiFACIAL – a rather interesting captcha approach that exploits the human ability to recognise faces. It presents to the user an image that has a distorted face embedded in a cluttered background (see Figure 2.7). The user is then asked to make six mouse clicks – four on the eye corners and two on the mouth corners.

⁶ http://thepcspy.com/kittenauth
⁷ http://sourceforge.net/projects/humanauth/

Hoque et al. (2006) proposed a method of generating image captchas from 3D models and adding rotations, distortions, lighting and warping effects etc. The idea behind it is that an infinite number of very different images can be generated using a relatively small set of source models. 3D Captcha⁸ is another 3D implementation that asks users to identify elements in automatically generated three-dimensional scenes.

Image rotation has also been employed as a main method for generating captchas (Gossweiler et al., 2009; Kim et al., 2010).

There are many more image based captchas available, but as with text captchas, only the most interesting and noteworthy were covered in this chapter.

2.1.4 Other Types

Sound captchas are another common type of HIP. The first sound captchas (called Byan and Eco) were described by Nancy Chan and Von Ahn et al. (2004) respectively. It should be noted that sound captchas are especially valuable for people who are visually impaired and have no way of solving the captchas described in the previous sections. Nevertheless, it has been shown that sound captchas often have serious usability issues and can be extremely difficult for humans to solve (Bigham and Cavender, 2009; Bursztein et al., 2010). Currently, most popular captcha services (including reCAPTCHA) also offer audio challenges.

Captcha2⁹ is an interesting visual approach that requires the user to identify two single letters in a row and click on them. Although the usability of this HIP is definitely very high (it only takes two clicks), its security is questionable. Since the letters do not overlap, no segmentation is required, and it has been shown that computers are already better than humans at single letter recognition (Chellapilla et al., 2005b).

One innovative, although not very effective, captcha was proposed by Ximenes et al. (2006). It required the user to decide which automatically generated knock-knock jokes are funny and which are not. Although it is an original approach, they only achieved a ≈20% advantage over random guessing.

⁸ http://spamfizzle.com/CAPTCHA.aspx
⁹ http://www.captcha2.com/


Baird and Bentley (2004) described ways of embedding captchas into web pages to reduce user annoyance. The user would be asked, for example, to “click on a mountain top on the picture” to submit a form. They argue that since the picture would serve as an illustration anyway, the user would not find this task so irritating.

We will now look at some recent captcha research that is especially important for this project.

2.1.5 Recent Research and Motion-based Captchas

There have been a number of suggested captchas that involve video and motion. Kluever and Zanibbi (2009) took the idea of image recognition captchas one dimension further. Their captcha would play a (quite regular) video to the user and ask for “3 words which best describe this video”. They used YouTube as a source for videos and the combination of its related videos and tags functionality to obtain the set of correct answers.

A similar approach (Shirali-Shahreza and Shirali-Shahreza, 2008) uses recordings of over 4500 words in American Sign Language (ASL). The user is asked to watch a video of an actor signing a word and then pick one of four sentences which best describes the movements (for example “The hand moves back and forth rhythmically”). Although this is undoubtedly an original approach and the authors argue that it is very secure because “. . . understanding the motion is a difficult task for a computer”, they have completely overlooked the fact that random guessing produces a 25% breaking rate with this captcha.
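The effect of random guessing on multiple-choice captchas of this kind can be illustrated with a short calculation. The following Python sketch is not part of the cited work; the function names are hypothetical, and it assumes that every round is answered independently and uniformly at random. It computes the guessing success rate and the number of consecutive rounds that would be needed to push that rate down to a chosen target, such as the 0.01% bound cited in Section 2.2.2.

```python
from fractions import Fraction


def guess_success_rate(num_options: int, rounds: int = 1) -> Fraction:
    """Probability that uniform random guessing passes `rounds` independent
    multiple-choice challenges, each offering `num_options` answers."""
    if num_options < 1 or rounds < 1:
        raise ValueError("num_options and rounds must be positive")
    return Fraction(1, num_options) ** rounds


def rounds_needed(num_options: int, target: Fraction) -> int:
    """Smallest number of rounds for which the guessing success rate
    is at or below `target` (illustrative helper, not from the source)."""
    if num_options < 2:
        raise ValueError("at least two options are needed")
    if target <= 0:
        raise ValueError("target must be positive")
    rounds = 1
    while guess_success_rate(num_options, rounds) > target:
        rounds += 1
    return rounds


if __name__ == "__main__":
    print(guess_success_rate(4))                    # one round, four options
    print(rounds_needed(4, Fraction(1, 10_000)))    # rounds to reach 0.01%
```

Under these assumptions a four-option challenge is guessed correctly a quarter of the time, and seven consecutive rounds would be needed to bring the rate below 0.01%, which would in turn be likely to hurt usability.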

Fischer and Herfet (2006) proposed one of the earliest methods of using video and animation to obfuscate otherwise human readable text. This approach was designed to secure the transfer of human readable information from a source (that also generates the video) to a human so that the information could not be tampered with during the transfer process (since it is not machine-understandable).

Athanasopoulos and Antonatos (2006) suggested an animated captcha method in the hope of reducing the possibility of laundry attacks (more on those in the following section).

Substantial work on motion based text captchas has been done by a research group at Wuhan University, China. They have proposed the Zero Knowledge per Frame Principle (ZKPFP), based on the moving object recognition problem. It means that when analysing the captcha video, no information can be extracted from a single frame (see Figure 2.8). They have proposed captchas of this kind using 2D animation (Cui et al., 2010a), 3D animation (Cui et al., 2009) and a 3-layered monochrome approach similar to BaffleText (Cui et al., 2010b).

Figure 2.8: An image extracted from the captcha described in Cui et al. (2010a). The fact that the characters (“M”, “8” and “R”) are not recognisable is the key principle of ZKPFP (although this particular sample also suffers from poor usability, as the characters are not completely recognisable from the animation either)

NuCaptcha¹⁰ is a commercial captcha service which embeds text challenges into videos. Videos are presented in Flash format and the user is asked to “type in the red moving letters”. Although praised by Craig Mori as “the most secure commercially available Captcha system available today”, it does seem to have many critical vulnerabilities. All the required information is always presented in the same typeface, in the same colour and in the same position relative to other text. Little background cluttering exists and letters are not obfuscated. As this service is relatively new, one can probably expect attacks against it soon.

Figure 2.9: An image of a running man, extracted from a video describing the emerging images technology. The figure of a man can be difficult to detect from this image, but becomes apparent when the video starts playing. Source: http://www.youtube.com/watch?v=z7bbU5VOfHY

Mitra et al. (2009) proposed a very interesting captcha based on moving emerging images, inspired by the picture of a Dalmatian dog by R.C. James (see Figure 2.9). Objects in these images are quite hard for humans to notice (and even harder for computers), but they found a way to generate smooth animations where a moving object becomes apparent instantaneously, while frame-by-frame analysis of the video remains very difficult.

¹⁰ http://www.nucaptcha.com

The motion based captcha methods discussed here, and especially the ZKPFP, are the inspiration for most of the research done in this project.

2.2 Usability vs Security

As more and more effort is being put into making captchas secure and keeping them up to par with advancing captcha breaking technology, issues with usability are often overlooked or neglected. Nonetheless, the problem inherent in most HIPs tends to be the positive correlation between usability and breaking efficiency.

2.2.1 Captcha Usability

The term usability is defined by the International Organization for Standardization as “the extent to which a product can be used by specific users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use” (Jokela et al., 2003). In the context of captchas, usability usually refers to the quality of being easily (enough) solvable by humans.

It is widely accepted that captchas pose more and more usability challenges (Bursztein et al., 2010) and many studies have been conducted on how to make them more usable while not losing any degree of security. Yan and El Ahmad (2008) identified three criteria for assessing captcha usability issues – content (what is being presented), distortion (how the content has been modified) and presentation (how the content is presented).

Chellapilla et al. (2005a) found that distortions like translation, rotation, scale variations and low to moderate global warp are easier for humans to process. They consider that a good captcha should have a human success rate of at least 90%.

2.2.2 Breaking Captchas

There are generally four methods of breaking captcha security:

1. Solving the AI problem behind the captcha


2. Breaking the captcha while not solving the actual AI problem

3. Bypassing the captcha completely using other implementation flaws

4. Using human workforce for solving captchas

The first method is usually the one preferred by captcha designers, as it usually means that a previously unsolved AI problem behind the captcha design is solved (Von Ahn et al., 2004) and the efforts put into designing a clever captcha have been fully appreciated by the attacker.

The second method implies that the captcha would be broken using other methods than the one intended by the designer. One example of this is the “Pixel-Count Attack” described by Yan and El Ahmad (2009). The third method does not relate to the actual design of captchas and is therefore considered outside the scope of the current investigation; also, there is not much that can be done to prevent the fourth (but more on this later). However, it should be noted that all four aspects are of the same importance, as it only takes the weakest link in the chain to break captcha security. It is widely accepted that a strong captcha should not be automatically breakable in more than 0.01% of cases (Chellapilla et al., 2005a).

Solving an image based text captcha usually involves the following steps: image clean up (removing noise), text pixel identification (telling which pixels are part of characters and which are not), segmentation (separating the characters) and character matching (Hindle et al., 2008). It has been shown that computers already surpass humans in single character recognition (Chellapilla et al., 2005b) and that the most difficult task remains segmentation, because of its computational expense (Chellapilla et al., 2005a; Huang et al., 2010).
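To illustrate why segmentation is the critical step, the following Python sketch shows the simplest possible case: characters that do not touch each other in an already cleaned, binary image. It is an illustrative assumption rather than the method of any of the cited authors; the function name and the nested-list image representation are hypothetical. Each 4-connected group of foreground pixels is treated as one character candidate, which would then be passed on to character matching.

```python
from collections import deque


def segment_components(image):
    """Split a binary image (list of rows of 0/1) into 4-connected
    components, returned left to right as sets of (row, col) pixels."""
    if not image or not image[0]:
        return []
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not image[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited text pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = set()
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    # Order candidates by their leftmost column, i.e. in reading order.
    components.sort(key=lambda comp: min(x for _, x in comp))
    return components


if __name__ == "__main__":
    demo = [
        [1, 1, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 0, 0, 1, 1],
    ]
    print(len(segment_components(demo)))  # number of character candidates
```

A sketch like this fails as soon as characters overlap or are broken into disconnected pieces, since one component no longer corresponds to one character.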

Numerous studies have been conducted on text-captcha security. Some guidelines for strong captchas include (Chellapilla et al., 2005a; Hernandez-Castro and Ribagorda, 2009; Hindle et al., 2008; Wilkins, 2009; Yan and El Ahmad, 2007):

1. Using rotation and warping of individual characters (rather than the whole image)

2. Using different font faces and not using dictionary (real) words

3. Using overlapping and non-continuous characters

4. Limiting the number of attempts a user can make


5. Making foreground-background difference as minute as possible

6. Using large character sets

7. Making it impossible to separate characters by counting pixels

8. Alternative methods (e.g. audio) should not be easier than the default method

9. Randomness should be used in as many parameters as possible

10. It should be difficult for the attacker program to detect whether it was correct or not

One major issue with captcha security remains the so-called outsourcing or laundry attack, in which captchas are redirected to humans who, knowingly or not, help spammers (Yan, 2009). Bursztein et al. (2010) concluded that using systems like Amazon Mechanical Turk¹¹ is a much cheaper and more efficient way of solving captchas than creating custom software and solving AI problems.

Service               Type              Price
bypasscaptcha.com     Outsourcing       $14 for 2,000, $130 for 20,000
beatcaptchas.com      Outsourcing       $8 for 1,000, $35 for 5,000
captchacracking.com   Custom software   From $100 to $2,000
captchatrader.com     Outsourcing       $1 for 1,000
deathbycaptcha.com    Outsourcing       $1.39 for 1,000
decaptcher.com        Outsourcing       $2 for 1,000

Table 2.1: Commercial captcha-breaking services

A different type of outsourcing attack, described by Lopresti (2005) as a pornographer-in-the-middle attack, exploits unsuspecting humans who are trying to access another website under the attacker’s control and to whom captcha challenges are redirected.

Table 2.1 lists the commercial captcha breaking services that were identified.

2.3 Image and Motion Analysis

In this section a very brief overview of image segmentation and moving pattern recognition methods is given. This is by no means intended as a comprehensive description of such complex algorithms. One may wish to see Section 4.2.1 for specifics on how the captcha design was influenced by the methods described here.

¹¹ https://www.mturk.com

2.3.1 Image Segmentation

Image segmentation is a process that tries to simplify an image and make it easier to analyse by dividing it into regions that have strong correlation with objects in the real world (Shapiro and Stockman, 2001; Sonka et al., 2008). Many image segmentation algorithms exist and all have their pros and cons depending on the type of the specific task at hand.

Sonka et al. (2008) divide segmentation algorithms into three groups: thresholding, edge-based and region based segmentation.

Thresholding is one of the earliest and simplest segmentation algorithms. This inexpensive and fast method produces a binary image containing pixels below or above a given brightness constant (or threshold). Edge-based detection algorithms try to find discontinuities in an image (e.g. in grey-level, colour or texture) that are then combined into edge chains that have a greater resemblance to borders in the image. Region based segmentation is not concerned with edges and borders, but rather with homogeneous regions in an image (this can be based on grey-level, colour, texture, etc).
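Global thresholding, the simplest of the three groups, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the image is a nested list of grey levels in the range 0 to 255, the function name is hypothetical, and pixels strictly brighter than the threshold are mapped to 1.

```python
def threshold(image, level):
    """Return a binary image: 1 where the grey level is strictly above
    `level`, 0 elsewhere. `image` is a list of rows of integers."""
    return [[1 if pixel > level else 0 for pixel in row] for row in image]


if __name__ == "__main__":
    grey = [
        [10, 200],
        [128, 90],
    ]
    for row in threshold(grey, 128):
        print(row)
```

Whether the boundary value itself counts as foreground is a design choice; here a pixel equal to the threshold is treated as background.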

2.3.2 Moving Pattern Recognition

Moving objects are usually presented and perceived by computers as a discrete sequence of still images. Therefore identifying moving patterns also boils down to recognising the same object in two (or more) subsequent images (which are basically sets of pixels). It can be a very difficult task, considering that the object might change rapidly in position, shape or colour.

It is impossible to identify the same single pixel in two different images (Schreier et al., 2009). Consequently only larger collections of pixels, making up lines, curves or blobs, can be identified and only then matched against similar objects in different images.

These collections, or interest points, define how clear an image is and how easily it can be processed by computers. Similarly, it is important to note that a lack of such collections makes it hard or impossible for a computer to process an image. Reducing the number of interest points is a main goal of introducing noise and various effects to the captcha described in this project.

Many moving pattern recognition algorithms exist, some closely based on image segmentation algorithms. The most popular ones include the background subtraction method, the frame-difference method, optical flow segmentation methods and moving edge detection methods (Chaohui et al., 2007; Cui et al., 2010a).
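Of the methods listed, the frame-difference method is the easiest to illustrate. The Python sketch below is a simplified assumption of how such a method operates, not an implementation taken from the cited works; the function name and the `min_change` parameter are hypothetical. It marks every pixel whose value differs between two consecutive frames by more than a tolerance.

```python
def frame_difference(previous, current, min_change=0):
    """Return a binary motion mask: 1 where the absolute difference between
    two equally sized frames exceeds `min_change`, 0 elsewhere."""
    if len(previous) != len(current):
        raise ValueError("frames must have the same number of rows")
    mask = []
    for row_prev, row_curr in zip(previous, current):
        if len(row_prev) != len(row_curr):
            raise ValueError("frames must have the same number of columns")
        mask.append([1 if abs(a - b) > min_change else 0
                     for a, b in zip(row_prev, row_curr)])
    return mask


if __name__ == "__main__":
    # A single bright pixel moves one step to the right on a static background.
    frame_a = [[0, 0, 0, 0], [0, 1, 0, 0]]
    frame_b = [[0, 0, 0, 0], [0, 0, 1, 0]]
    for row in frame_difference(frame_a, frame_b):
        print(row)
```

With a static background only the moving object appears in the mask. If every part of the scene moves, almost every pixel changes and the mask carries little useful information.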

2.4 Chapter Summary

We have seen that, considering the short history of captchas, quite substantial research has already been conducted. Although image based text-captchas have been the most popular scheme for a long time, it seems that the gap between computer and human understanding is not sufficient any more, rendering these captchas either insecure against attacks or unusable for humans.

This increases the need for new types of captcha schemes, based on new AI problems. Visual motion captchas employing the Zero Knowledge Per Frame Principle seem like an interesting potential substitute for still image-based text captchas.

Chapter 3

Research Methodology

This chapter describes different formal aspects of this project. We will start with the definition of objectives, scope and deliverables. This is followed by a description of the research plan, which explains the different stages of the project. The final section will conclude the chapter.

3.1 Objective

The general goal of this project is to help make the Web more secure and spam-free. The more specific objective is to devise a novel motion based captcha with good enough security and usability to compete with current popular captchas.

3.2 Scope of Investigation and Limitations

As mentioned in previous chapters and also suggested by the project title, the research focused on visual, motion based captchas for the Web that involve completing a character recognition HIP (Human Interactive Proof). This deserves some more elaboration since most of these terms are fundamental to what was done in the project.

Being usable on the Web means here that the captcha challenge can be presented to the user using modern technology supported by modern Web browsers. This can involve a combination of Hypertext Mark-up Language (HTML), JavaScript and graphical images.

Being visual means that the user is presumed to be able to accomplish tasks requiring normal visual and cognitive performance; therefore issues like accessibility for the blind are knowingly disregarded in this context. It is certainly very important that captchas for disabled people exist, but it is not the focus of this research.

The definition of motion based is rather straightforward – the majority of current captchas are based on still images, whereas here we focus on captchas that are presented to the user as animation, comprising a series of still images. Animated Graphics Interchange Format (GIF), Java applets or a combination of JavaScript and the canvas element included in HTML version 5 could be used to present animation on the web. Some existing approaches also use proprietary software like Adobe Flash.

What this project does not involve is a more thorough security analysis of the new captcha – this would require much greater resources than are available during an MSc project (and is perhaps a good topic for another).

Also, the aim of this project was not to produce a production-ready working captcha service, but rather a captcha engine that can be used for multiple purposes (including such captcha services). Real-world implementation is a comparatively trivial issue that depends on specific needs and requirements and is not part of this project.

3.3 Deliverables

Since this project contains a significant portion of both research and software engineering, it will also have a mixture of both types of deliverables. The deliverables are:

1. A new concept and theoretical description of a novel captcha

2. Software capable of producing such a captcha, based on (1)

3. Software for tuning parameters for (2) (based on an evolutionary algorithm)

4. A recommended set of parameters for use with (2), gathered from testing sessions with (3)

Theoretical aspects (items 1 and 4) are presented in this paper. Software deliverables (items 2 and 3) are presented on an attached CD¹, are available on-line² and are also described in this paper.

¹ Please refer to Appendix A for more information

3.4 Research Plan

The research plan for this project consists of five consecutive stages, each being a prerequisite for the next (please see Figure 3.1).

[Figure 3.1 diagram: Literature review, Developing a new captcha concept, Engineering a prototype, Tuning parameters, Reflection]

Figure 3.1: Illustration of the research work flow

Even though each stage has a dedicated chapter or section in this dissertation, a brief overview of them is given below.

3.4.1 Literature Review

The purpose of a literature review is quite straightforward – to find and analyse as much relevant literature as possible. Most of the research focused on finding existing (preferably motion-based) captchas. This also included text based, image based and other types of captchas. Although this cannot be said with absolute certainty, the author feels confident that the most relevant captchas devised to date were found.

Other areas of investigation included captcha security (how captchas are broken), usability and moving pattern recognition by computers. This yielded many useful ideas on how to enhance both usability and security of motion based captchas.

The literature review produced many interesting findings and resulted in a good overview and understanding of what has been done in captcha research so far. One of the most important findings was certainly the concept of ZKPFP (please see Section 2.1.5), which served as inspiration for the next stage in the project – developing a concept for a new captcha. The literature review is given extensive coverage in Chapters 1 and 2.

² Please refer to Appendix C for more information


3.4.2 Developing a New Captcha Concept

Developing a new captcha started with experimenting with some initial ideas, some of which tried to employ the ZKPFP. The literature review was very useful at this point, as many existing captchas gave ideas on how best to implement motion captchas.

The idea that was taken further turned out to be a monochrome captcha that used character recognition with characters made out of small lines and additional background clutter added. Some basic effects could be added to the particles (such as movement and blinking). It was intended that numerous (more complex) effects could be added later.

Figure 3.2: An early draft of the captcha showing letters “A”, “B” and “C”

A more detailed explanation of the idea behind the captcha is presented in Sections 4.1 and 4.2 of this dissertation.

3.4.3 Engineering a Prototype

Some more research followed in areas such as motion perception and pattern recognition to get more ideas on how to make the captcha more secure while remaining usable for humans. Table 3.1 lists the effects that were identified and implemented for this purpose (all use randomness as much as possible).

The captcha was implemented in the PHP scripting language. JavaScript and the HTML5 canvas element were used to draw the graphic interface.

A special application was designed that made it convenient to change captcha parameters and see the resulting animation. Another auxiliary tool was designed for creating character files (explained in Section 4.3.5).


Effect              Description
Local motion        Movement of individual particles
Group motion        Movement of groups of particles (e.g. characters)
Visibility          A binary setting that either displays or hides individual particles
Particle rotation   Smooth rotation of individual particles
Group rotation      Smooth rotation of individual characters and background layers

Table 3.1: Effects implemented in the captcha prototype. Please see Chapter 4.1 for the description of all design decisions.

The source code of the captcha prototype was properly documented so that PHPDocumentor³ could be used to generate an HTML API documentation.

A thorough and detailed description of the captcha design is presented in Chapter 4.3 of this dissertation.

3.4.4 Tuning Parameters

As the captcha programme was designed to be as dynamic as possible, it resulted in a software component capable of accepting quite a number of parameters, each of which affected the captcha animation in some way. As it is very difficult to say which combination of more than 20 continuous parameter values is the best choice for an optimal captcha, software was developed that tried to find it using a genetic algorithm (please see Figure 3.3 for an illustration).

[Figure 3.3 graph: axes “Usability” and “Security”; label “An optimal captcha”]

Figure 3.3: A graph illustrating the problem of finding the best captcha in terms of usability and security

The genetic algorithm used input from testers, who rated the usability level of generated captchas. Please see Chapter 5 for a complete and thorough description of the methodology and implementation of this testing programme.

³ http://www.phpdoc.org/
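The general shape of such a tuning loop can be sketched as follows. This is a generic genetic algorithm written in Python for illustration only and should not be read as the implementation described in Chapter 5. All names are hypothetical, parameters are assumed to be normalised to the range 0 to 1, and the `fitness` callable stands in for the usability ratings that were collected from human testers in the project.

```python
import random


def evolve(fitness, num_params, population_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Search for a parameter vector in [0, 1]^num_params that maximises
    `fitness`, using truncation selection, one-point crossover and
    Gaussian mutation. The best half of each generation is kept unchanged."""
    if num_params < 2 or population_size < 4:
        raise ValueError("need at least 2 parameters and 4 individuals")
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(num_params)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Rank individuals; with human testers this is the expensive step.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, num_params)      # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [min(1.0, max(0.0, gene + rng.gauss(0, 0.1)))
                     if rng.random() < mutation_rate else gene
                     for gene in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    # Stand-in rating: prefers every parameter to be close to 0.5.
    def rating(genes):
        return -sum((g - 0.5) ** 2 for g in genes)

    best = evolve(rating, num_params=20)
    print(round(rating(best), 4))
```

Because the best individuals are carried over unchanged, the best rating never decreases between generations; the price is a higher risk of converging on a local optimum.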

3.4.5 Reflection

The last part of the research plan is reflection and discussion over the outcomes and achievements. The results gathered from user testing were reviewed and analysed, and conclusions were drawn from the captchas that were automatically generated. This part corresponds to Chapter 6 (Results) and Chapter 7 (Discussion) of this dissertation.

3.5 Chapter Summary

In this chapter we have defined the objective and deliverables of this project. We have also seen what belongs to the scope of this project and what does not. The research plan was explained, describing each stage of the process with references to the corresponding main parts of the dissertation.

The next chapters will show how the last four stages of this research plan were executed and what outcomes they had.

Chapter 4

Captcha Design

Figure 4.1: A captcha showing characters “C”, “A” and “F”

This chapter contains a thorough description of the captcha algorithm and the program implementing it. We will begin with a brief overview of the captcha and then carry on to the main design considerations and decisions behind the captcha.

After this, a technical description of the programme is given, where matters such as input parameters, mathematical algorithms, character design etc. are discussed.

Finally, an overview is given of how the character objects were created, before concluding in the last section of the chapter.

4.1 Introduction

The general principle of this captcha is to make character recognition difficult for computers by introducing motion and separating characters into smaller particles. Additionally, background clutter is used to make segmentation more difficult.

At the same time, this task should remain relatively easy for humans, as the movements are designed to be smooth for the human eye and easy for the brain to understand. The outcome of such a test could be used to decide whether to permit further access to a system or not.

Effects like blinking and rotation are also added to particles to make moving pattern recognition more difficult. The captcha is inspired by the Zero Knowledge Per Frame Principle (ZKPFP), introduced by Cui et al. (2010a). This means that frame-by-frame analysis of the motion captcha is made difficult by making it hard to distinguish objects from a single frame of an animation alone. Please see Figure 4.1 for a snapshot of the animation.

4.2 Design Considerations

There are two main arguments for motion based captchas (when compared to still image captchas). Firstly, solving a moving pattern recognition problem on a complex background is far more difficult for current computers than solving an OCR problem (Cui et al., 2010b). Secondly, for humans (and other animals) it is easier to detect objects that move than those that do not¹ (Bruce et al., 1996).

Similar experiments have been conducted by Regan (1989) and Regan and Hamstra (1991), from which Bruce et al. (1996) infer that “people can see, with great accuracy, oriented contours, shapes and region boundaries that are defined solely by relative motion within fields of random dots”.

The captcha tries to take advantage of the more complex visual areas of the brain dealing with second-order (also called non-Fourier) motion perception, which is presumably harder to mimic for computers. Changes in depth, contrast or relative motion define second-order motion, which is opposed to first-order motion, defined by changes in luminance or colour (Ledgeway and Smith, 1994).

¹ Bruce et al. (1996) illustrate this with the following example: “A camouflaged animal will remain well-hidden only if it remains stationary. As soon as it moves, it is easier to see.”

Characters were chosen as recognition objects because literate humans are very familiar with character-shaped objects, occasionally with distorted and irregular shapes (e.g. letters in different typefaces or in logos). Characters have only borders and have been left empty inside, because such shapes are a lot easier for the human brain to detect and process. Bruce et al. (1996) write: “Of several geometrically possible perceptual organisations, that one will be seen which produces a “closed” rather than an “open” figure”. Section 4.3.5 looks more closely at the design of characters.

The dimensions for presenting the captcha (size of the animation) were chosen to be 500x300 pixels, which is large enough for presenting reasonably complex animations, while being small enough to be practical on the web (and on mobile devices). The length of the animation was chosen to be 100 frames, which is sufficient for presenting smooth animations. When the last frame is reached, the animation is reversed and repeated backwards. This is to save bandwidth and processing power. Both of these parameters (dimensions and length) are configurable.
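The forwards-and-backwards playback can be expressed as a simple mapping from a running tick counter to a frame index. The Python sketch below is only an illustration of this idea, not the prototype's code; the function name is hypothetical, and it assumes that the first and last frames are not shown twice when the direction changes, a detail the text does not specify.

```python
def ping_pong_frame(tick, num_frames=100):
    """Map a non-negative tick counter to a frame index that runs
    0, 1, ..., num_frames - 1 and then back down to 0, repeatedly."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    if num_frames == 1:
        return 0
    period = 2 * (num_frames - 1)       # one forward and one backward pass
    position = tick % period
    return position if position < num_frames else period - position


if __name__ == "__main__":
    print([ping_pong_frame(t, num_frames=4) for t in range(10)])
```

With this scheme only the 100 generated frames need to be transferred and stored, while the visible animation is nearly twice as long before it repeats.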

Great effort has been made to reduce the number of interest points (see Section 2.3) to make image segmentation and moving pattern recognition more difficult. In addition to the particles that make up characters, numerous layers of background clutter are added to distribute the visual scene more evenly. During the animation, particles are rotated, hidden, made visible again, and moved in various ways to make interest point detection more difficult (please see the next section for specifics).

4.2.1 Defence Against Known Methods

Many image segmentation and moving pattern recognition algorithms exist. A good captcha design involves consideration for at least the most popular ones.

As introduced in Section 2.3, the main methods for image segmentation are thresholding (including local and global), edge-based methods, and region-based methods (including region growing, merging, splitting) (Sonka et al., 2008).

Thresholding is of no use in breaking this captcha, since the captcha is already a binary image. Further thresholding either leaves the image unchanged or, with extreme thresholds, produces a completely black or white image.
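The point about thresholding can be checked on a toy example. The sketch below is written in Python for illustration (the captcha itself is implemented in PHP), and the `threshold` function and the small `frame` array are hypothetical; they are not part of the captcha programme.

```python
def threshold(image, t):
    """Global thresholding: pixels strictly above t become 1, all others 0."""
    return [[1 if pixel > t else 0 for pixel in row] for row in image]


# A tiny made-up 1-bit frame: 1 marks a particle pixel, 0 an empty pixel.
frame = [
    [0, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

unchanged = threshold(frame, 0.5)  # any threshold between the two intensities
all_zero = threshold(frame, 1)     # threshold at or above the maximum intensity
all_one = threshold(frame, -1)     # threshold below the minimum intensity
```

Any threshold between the two intensity values returns the frame as it was, so no additional structure is exposed to an attacker.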

Edge detection also cannot be very effective, as only changes in black and white can be used to detect discontinuities in an image (since we are dealing with binary images consisting of identical shapes). As the characters in the captcha are segmented into small particles, a very large number of discontinuities is purposely created. In addition, localised rotation and movement are applied to the particles to make edge detection and edge chaining more difficult.

Region-based segmentation cannot be very successful, since it is based on detecting homogeneous regions based on grey-scale, colour, texture or shape (Sonka et al., 2008). Remember that the captcha image is 1-bit monochrome (only two intensity values), has no colour and consists only of identical particles (thus no texture). Shape detection (also called matching or template matching) is made difficult by introducing motion, rotation and disappearance effects to the particles, and rotation to characters.

There are also numerous moving pattern recognition methods that can be used to detect characters from the captcha. Currently, the main algorithms include the background subtraction method, the frame difference method, optical flow segmentation methods and moving edge detection methods (Chaohui et al., 2007; Cui et al., 2010a).

The background subtraction method does not apply to this captcha, as there is no static background to be subtracted. Background and characters cannot be separated because both are in constant movement and consist of identical particles.

For the same reasons, the frame-difference method is not applicable. To work efficiently, it requires minimal noise and a completely stationary background (Cui et al., 2010a), which is not the case with the captcha.

Optical flow segmentation algorithms should be impractical to use with this captcha as well. This is because the optical flow of characters is quite difficult to determine, due to numerous background layers that each move at different speeds and in different directions. Also, random disappearance and reappearance of particles makes the detection harder.

The moving edge method is impractical for the same reason edge-based segmentation methods are.

We can see that the captcha has good potential to withstand many known attack methods.


4.3 The Captcha Programme

This section presents a technical description of the captcha programme. We will start with a brief justification of the choice of programming languages and then look at the input parameters of the captcha. After this we will delve into the technical workings of the captcha algorithm and see how exactly the animations are generated. The last two sections will describe the class hierarchy and how the characters used by the captcha were designed.

4.3.1 Choice of Languages and Formats

PHP was chosen as the implementation language for this software. The main reason was that PHP is “the most widely used programming language on the Web, with over 40 % of all web applications written in PHP” (Lerdorf et al., 2006), thus installing the captcha system would most likely not require any special software or external libraries. Additionally, PHP supports (since version 5) all the language constructs required to design well-structured software with moderate architectural complexity (interfaces, abstractions, inheritance).

JavaScript was chosen as the client-side scripting language, also for its wide support across different browsers. JavaScript Object Notation (JSON) was chosen as the format for coordinate data, since it is highly compatible with JavaScript, well supported by PHP, and a very efficient format for transmitting large amounts of structured integer data (when compared to XML, for example).

The HTML 5 canvas element (with 2d drawing context) was chosen as the graphic drawing tool for captchas. This is because of its compatibility with JavaScript and the JSON-encoded coordinate data (when compared to XML-based SVG, for example).

4.3.2 Input Parameters

Each captcha is generated from a parameter vector $\vec{a} = (a_1, \ldots, a_n)$ such that each $a_i$ is taken from a predefined set $A_i$, i.e.

\[
a_i \in A_i = \{a_i^{(1)}, a_i^{(2)}, \ldots, a_i^{(k_i)}\} \quad \text{for } i = 1, \ldots, n \tag{4.1}
\]

For a list of the parameters used to generate character and background clutter particles, and also the effects for these particles, please see Table 4.1. How these parameters are used by the captcha system will be explained in the next section.

Symbol     Parameter

General parameters
$a_1$      Animation width
$a_2$      Animation height
$a_3$      Animation speed
$a_4$      Animation length
$a_5$      Screen offset
$a_6$      Particle length

Background clutter
$a_7$      Number of particles
$a_8$      Number of layers
$a_9$      Screen offset

Characters
$a_{10}$   Characters
$a_{11}$   Font size
$a_{12}$   Horizontal scaling

Local motion effect
$a_{13}$   Number of variants
$a_{14}$   Frequency
$a_{15}$   Maximum range

Group motion effect
$a_{16}$   Frequency
$a_{17}$   Maximum range
$a_{18}$   Skip ratio (also applies to local motion)

Visibility effect
$a_{19}$   Number of variants
$a_{20}$   Appearance probability
$a_{21}$   Disappearance probability

Local rotation effect
$a_{22}$   Number of variants
$a_{23}$   Standard deviation
$a_{24}$   Rotation span

Group rotation effect
$a_{25}$   Standard deviation
$a_{26}$   Rotation span

Table 4.1: Input parameters used to generate particles and effects

4.3.3 Control Flow

Two HTTP requests are used to generate and show the captcha to the user. The first is used to download the HTML page with all required JavaScript code and is a fairly trivial aspect of the programme.

The second request (Captcha generation in Figure 4.2) is used to invoke the captcha generation process and to download the coordinate data from the web server. The control flow of this procedure, with explanations of all relevant steps, is presented below. Input parameters are referred to as $a_i$ and are listed in Table 4.1.


Figure 4.2: Flowchart of initial actions performed by the captcha

Create Particles

Particles are line structures that each frame of each captcha is made out of. They are the building blocks for characters and they also form the background clutter used in the captcha. Each particle has four attributes: start coordinates $(x, y)$, rotation ($\varphi$) and length ($l$).

A pseudo-random number generator, built into PHP, is used to generate uniformly distributed random variables. In probability theory, the uniform probability density function $U(a, b)$ is defined as:

\[
f(x) =
\begin{cases}
\dfrac{1}{b - a} & \text{for } a < x \le b, \\
0 & \text{for } x \le a \text{ or } x > b
\end{cases}
\tag{4.2}
\]

Generating background clutter is quite straightforward. A defined number ($a_7$) of particles is created and divided into $a_8$ layers. Having particles in layers (groups) reduces the necessary computing power, as one effect variant can be used with many particles. Each particle is given a random position $(x, y)$ such that

\[
\begin{aligned}
x &= U(-a_1 \times a_5,\; a_1 \times (1 + a_5)) \\
y &= U(-a_2 \times a_5,\; a_2 \times (1 + a_5))
\end{aligned}
\tag{4.3}
\]


Figure 4.3: Flowchart of the process that generates captchas

An initial rotation for each particle is calculated as $\varphi = U(0, 2 \times \pi)$. All particles share the same length $l$, which is calculated from the initial particle length ($a_6$), font size ($a_{11}$), and the ratio of screen width to the number of characters being used. This can be expressed as:

\[
l = a_6 \times a_{11} \times \frac{a_1}{\operatorname{length}(a_{10})} \tag{4.4}
\]

where $\operatorname{length}(s)$ is a function that returns the number of characters in a text string.

Generating character objects is a bit more complicated. Each character can be represented by a matrix of initial coordinates $C_j$, where $j = 1, \ldots, \operatorname{length}(a_{10})$. Each of the $m$ rows in $C_j$ represents a single particle in the character.


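\[
C_j =
\begin{pmatrix}
x_1 & y_1 & \varphi_1 & l_1 \\
x_2 & y_2 & \varphi_2 & l_2 \\
\vdots & \vdots & \vdots & \vdots \\
x_m & y_m & \varphi_m & l_m
\end{pmatrix}
\tag{4.5}
\]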

These initial coordinates now need to be transformed into actual screen coordinates, i.e. for each row $(x, y, \varphi, l)$ in each character matrix $C_j$, a new particle $(x', y', \varphi', l')$ will be generated.

To achieve this, the magnification ratio ($k$) for the current captcha is calculated first. This ratio depends on the screen width ($a_1$), font size ($a_{11}$), and the number of characters being drawn ($\operatorname{length}(a_{10})$), and determines how large the characters will appear.

\[
k = \frac{a_1 \times a_{11} \times 100}{\operatorname{length}(a_{10})} \tag{4.6}
\]

Secondly, a transformation is made that produces coordinates for each character. This process uses the magnification ratio $k$, the horizontal scaling parameter ($a_{12}$), and the position of the character being drawn ($j$) to place characters so that they are evenly spaced and fit on the screen.

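\[
\begin{aligned}
x' &= x \times k + \left((j - 1) \times (100 \times k) \times a_{12}\right), \\
y' &= y \times k, \\
\varphi' &= \varphi, \\
l' &= l \times k
\end{aligned}
\tag{4.7}
\]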

Additionally, margins are added to the $x$ and $y$ coordinates to ensure that the text is always centred (initially), regardless of how many characters are displayed.
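The transformation in Equation 4.7 can be sketched in a few lines. This is an illustrative Python sketch rather than the programme's PHP code: `place_character` is a hypothetical name, the magnification ratio `k` is taken as a given input, and the centring margins mentioned above are left out.

```python
def place_character(rows, j, k, a12):
    """Apply Equation 4.7 to every row (x, y, phi, l) of character number j.

    j is 1-based, k is the magnification ratio and a12 the horizontal scaling.
    Centring margins are not applied in this sketch.
    """
    # Horizontal shift of the j-th character: (j - 1) character widths of
    # 100 * k pixels each, compressed by the horizontal scaling a12.
    x_shift = (j - 1) * (100 * k) * a12
    return [(x * k + x_shift, y * k, phi, l * k) for (x, y, phi, l) in rows]


# One particle of a character drawn in the 100-unit design space.
rows = [(54, 52, 0, 5)]
first = place_character(rows, j=1, k=2.0, a12=0.5)
second = place_character(rows, j=2, k=2.0, a12=0.5)
```

With `k = 2.0` and `a12 = 0.5`, the same particle lands 100 pixels further right when it belongs to the second character, while its rotation is unchanged.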

All the generated particles (regardless of whether they belong to background or character elements) are added to the same canvas object.

Generate effects

After particles have been created, three types of effect variants are also generated: motion (local and global), rotation (local and global) and disappearance. An effect variant can be defined as a series of changes that can be applied to a particle parameter. For example, an effect variant can cause a 3° clockwise rotation in every second frame, or movement to the left by 4 pixels in every frame.

Motion. Certainly the most important effect in a motion captcha is motion. Two types of motion effects can be added to the captcha. Local motion defines movement of a single particle, whereas group motion defines movement of a group of particles (such as a character or a background layer).

The same algorithm is used for generating both types of motion. The only difference is that $a_{13}$ defines the number of local motion variants, whereas the number of group motion variants depends on the number of characters and background layers.

Each motion variant can be expressed as a series of movements $(\Delta x_i, \Delta y_i)$ for each of the $a_4$ frames in the animation. Every particle is assigned one variant, so, for example, having only one variant would mean that all particles move in the same way. To achieve smooth motion, a sine function is used for calculating movement vectors.

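\[
\begin{aligned}
\Delta x_i &= \sin\left(c_x \times \left(\frac{i \times (1 - a_{18})}{a_4} + a_{18}\right)\right) \times \min(a_1, a_2) \\
\Delta y_i &= \sin\left(c_y \times \left(\frac{i \times (1 - a_{18})}{a_4} + a_{18}\right)\right) \times \min(a_1, a_2)
\end{aligned}
\quad \text{for } i = 1, \ldots, a_4
\tag{4.8}
\]

As we can see, each vector is a sine value from −1 to 1 that is scaled by the smaller of the two dimensions (width $a_1$ and height $a_2$). The input of the sine function changes gradually as the movement progresses (as $i$ changes) and can cause the function to go through a number of full periods.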

The purpose of the skip ratio parameter ($a_{18}$) is to move particles away from their initial positions before the animation starts. This ensures that characters do not appear in the same position at the beginning of every animation.

$c_x$ and $c_y$ are random variables that determine the frequency of movements. They are generated using the frequency parameter $a_{14}$, which sets the maximum number of full sine periods.

\[
\begin{aligned}
c_x &= U(-\pi \times a_{14},\; \pi \times a_{14}) \\
c_y &= U(-\pi \times a_{14},\; \pi \times a_{14})
\end{aligned}
\tag{4.9}
\]

For group motion, one should substitute $a_{14}$ and $a_{15}$ with $a_{16}$ and $a_{17}$ respectively in the above explanation.
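Equations 4.8 and 4.9 can be combined into a short generator for one motion variant. The following is an illustrative Python sketch, not the PHP implementation: the name `motion_variant` is hypothetical, and the maximum range parameter ($a_{15}$ or $a_{17}$) is not used because it does not appear in Equation 4.8.

```python
import math
import random


def motion_variant(a1, a2, a4, frequency, skip_ratio, rng=random):
    """Return a4 movement vectors (dx, dy) following Equations 4.8 and 4.9.

    frequency is a14 (local motion) or a16 (group motion); skip_ratio is a18.
    """
    # Equation 4.9: random frequencies for the horizontal and vertical axes.
    c_x = rng.uniform(-math.pi * frequency, math.pi * frequency)
    c_y = rng.uniform(-math.pi * frequency, math.pi * frequency)
    scale = min(a1, a2)
    moves = []
    for i in range(1, a4 + 1):
        # The sine input grows from roughly skip_ratio to exactly 1.
        progress = i * (1 - skip_ratio) / a4 + skip_ratio
        moves.append((math.sin(c_x * progress) * scale,
                      math.sin(c_y * progress) * scale))
    return moves
```

Because the sine input starts near `skip_ratio` instead of zero, the first frame already carries a non-zero displacement whenever the frequency is non-zero.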

Visibility. For the visibility effect, a number of variants is also generated. Each variant is a sequence of boolean values, determining whether a particle is visible or not. Whether a visible particle becomes invisible in the next frame is determined by a random value and the particle disappearance probability parameter $a_{21}$. Similarly, whether an invisible particle becomes visible is determined by a random value and the particle appearance probability parameter $a_{20}$.

So, boolean values ($v$) for the effect variant are generated for each frame $1 < i \le a_4$. The first frame of every particle is invisible ($v_1 = 0$).

\[
v_i =
\begin{cases}
1 & \text{if } U(0, 1) \ge a_{21} \wedge v_{i-1} = 1 \\
1 & \text{if } U(0, 1) < a_{20} \wedge v_{i-1} = 0 \\
0 & \text{otherwise}
\end{cases}
\quad \text{for } i = 2, \ldots, a_4
\tag{4.10}
\]
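Equation 4.10 describes a simple two-state process, which can be sketched as below. This is an illustrative Python sketch under the assumption that a single random value is drawn per frame; the function name `visibility_variant` is hypothetical and the code is not taken from the captcha programme.

```python
import random


def visibility_variant(a4, a20, a21, rng=random):
    """Return a4 booleans following Equation 4.10.

    a20 is the appearance probability, a21 the disappearance probability.
    """
    visible = [False]  # v_1 = 0: every particle starts invisible
    for _ in range(2, a4 + 1):
        u = rng.random()  # one U(0, 1) draw per frame (assumption)
        if visible[-1]:
            visible.append(u >= a21)  # stays visible unless it disappears
        else:
            visible.append(u < a20)   # appears with probability a20
    return visible
```

With `a20 = 1` and `a21 = 0`, a particle appears in the second frame and never disappears again; with both set to 1 it alternates between the two states.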

Local rotation. The local rotation effect, similarly to motion, generates a random value and gradually rotates the particle until it has reached the desired rotation. Then, another random value is generated and the process is repeated. The number of frames it takes to rotate a particle is defined by the rotation span parameter ($a_{24}$). Also, similarly to other effects, the number of different rotation variants is determined by a parameter ($a_{22}$).

The random value expressing the change in target rotation is distributed normally with mean 0 and standard deviation $a_{23}$. In probability theory, the normal probability density function $\mathcal{N}(\mu, \sigma^2)$ is defined as:

\[
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x - \mu)^2 / 2\sigma^2} \tag{4.11}
\]

So, the target rotation change ($g$) for each movement will be:

\[
g = \mathcal{N}(0, a_{23}^2) \tag{4.12}
\]


Then, a series of smooth movements is generated, each spanning $a_{24}$ frames. This process is repeated until the final frame has been reached.

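\[
\varphi_i = \operatorname{prev}() + \left(\frac{g - \operatorname{prev}()}{a_{24}}\right) \times i \quad \text{for } i = 1, \ldots, a_{24} \tag{4.13}
\]

where $\operatorname{prev}()$ is a function that returns the rotation of the particle before the movement began.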
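Equations 4.12 and 4.13 can be sketched as a loop that draws a value and interpolates towards it. This is an illustrative Python sketch, not the PHP implementation: the name `rotation_variant` and the `start` argument are hypothetical, and cutting the last movement short at the final frame is an assumption about how the end of the animation is handled.

```python
import random


def rotation_variant(a4, std_dev, span, rng=random, start=0.0):
    """Return a4 rotation values built from movements of `span` frames each.

    std_dev is a23 (or a25 for groups); span is a24 (or a26 for groups).
    """
    angles = []
    prev = start  # rotation before the current movement began
    while len(angles) < a4:
        g = rng.gauss(0.0, std_dev)  # Equation 4.12
        for i in range(1, span + 1):
            if len(angles) == a4:
                break  # assumption: the last movement is cut at the final frame
            angles.append(prev + (g - prev) / span * i)  # Equation 4.13
        prev = angles[-1]
    return angles
```

Each movement ends at the drawn value `g`, which then becomes the starting rotation of the next movement.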

Group rotation. The group rotation effect uses the same algorithm as local rotation. As with motion, the only difference between local and group variants is that for group rotation, the number of variants cannot be specified and depends on the number of background layers and characters. Also, the standard deviation and rotation span parameters for group rotation are defined by $a_{25}$ and $a_{26}$ respectively.

Note that particle rotation and character rotation are done separately. This results in completely new shapes with every step of rotation (please see Figure 4.4).

Figure 4.4: A letter “B” with no rotation and with a rotation of $\pi/8$. Note that individual particles remain at zero rotation on the rotated character

Finalise

After generating particles and effects, new coordinates are calculated for each particle in every frame, considering all the effects that have been added to the effect queue of each particle.


Visibility. Applying the visibility effect is quite simple. When an effect variant defines the particle to be invisible, it is simply removed from the canvas object. This saves memory and also reduces the required processing power.

Motion. When a motion effect variant is applied to a particle, $\Delta x$ and $\Delta y$ of the effect are added to the $x$ and $y$ parameters of the particle respectively.

Local Rotation. When local rotation is applied to a particle, it is not simply added to the existing rotation of the particle, as this would result in particles rotating around one endpoint. Instead, the desired effect is rotation around the central point of the particle.

To achieve this, a simple transformation is made that calculates new X and Y coordinates and keeps the centre point in one place, using the trigonometric ratios of sine and cosine.

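\[
\begin{aligned}
x'_i &= x_i + \frac{l_i \times \cos(\varphi_i)}{2} - \frac{l_i \times \cos(\varphi_i + \Delta\varphi)}{2}, \\
y'_i &= y_i + \frac{l_i \times \sin(\varphi_i)}{2} - \frac{l_i \times \sin(\varphi_i + \Delta\varphi)}{2}, \\
\varphi'_i &= \varphi_i + \Delta\varphi, \\
l'_i &= l_i
\end{aligned}
\tag{4.14}
\]

where $(x_i, y_i, \varphi_i, l_i)$ is the particle being rotated and $\Delta\varphi$ is the rotation to be applied.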
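The effect of Equation 4.14 is easiest to see by checking that the centre point does not move. The following is an illustrative Python sketch (not the programme's PHP code); `rotate_about_centre` and `centre` are hypothetical helper names.

```python
import math


def rotate_about_centre(x, y, phi, l, d_phi):
    """Apply Equation 4.14: rotate a particle by d_phi around its midpoint."""
    new_x = x + l * math.cos(phi) / 2 - l * math.cos(phi + d_phi) / 2
    new_y = y + l * math.sin(phi) / 2 - l * math.sin(phi + d_phi) / 2
    return new_x, new_y, phi + d_phi, l


def centre(x, y, phi, l):
    """Midpoint of a particle that starts at (x, y) and has rotation phi."""
    return x + l * math.cos(phi) / 2, y + l * math.sin(phi) / 2


before = centre(10, 20, 0.3, 8)
after = centre(*rotate_about_centre(10, 20, 0.3, 8, 1.1))
```

The start point of the particle moves, but `before` and `after` agree up to floating-point rounding, which is exactly the property the transformation is meant to preserve.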

Group Rotation. The group rotation algorithm is slightly more complicated. Firstly, the centre points $(c_x, c_y)$ of each particle group (character or background layer) are calculated.

\[
\begin{aligned}
c_x &= \frac{x_{max} - x_{min}}{2} + x_{min}, \\
c_y &= \frac{y_{max} - y_{min}}{2} + y_{min}
\end{aligned}
\tag{4.15}
\]

where $x_{max}$ and $y_{max}$ are the highest X and Y coordinate values in the current group, and $x_{min}$ and $y_{min}$ are the lowest.

Then, for each particle in a particle group, new coordinates are calculated in the following steps:

1. Calculate the current rotation of the particle $(x, y)$ in relation to the centre point $(c_x, c_y)$ using arctangent.

2. Add the target rotation $\Delta\varphi$ to this rotation.

3. Calculate an imaginary hypotenuse between $(c_x, c_y)$ and the particle.

4. Calculate new X and Y coordinates using ratios of sine and cosine, and the hypotenuse and rotation calculated in the previous steps.

5. Add the local centre points $c_x$ and $c_y$ to the coordinates.

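This results in rotation where every character and layer, regardless of its size and position, always rotates around its centre point. This can also be expressed with a somewhat complex equation:

\[
\begin{aligned}
x_i &= \cos\left(\arctan\frac{y - c_y}{x - c_x} + \Delta\varphi\right) \times \frac{y - c_y}{\sin\left(\arctan\frac{y - c_y}{x - c_x}\right)} + c_x, \\
y_i &= \sin\left(\arctan\frac{y - c_y}{x - c_x} + \Delta\varphi\right) \times \frac{y - c_y}{\sin\left(\arctan\frac{y - c_y}{x - c_x}\right)} + c_y
\end{aligned}
\tag{4.16}
\]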

As this algorithm can be slightly difficult to comprehend at first glance, the programme code of the rotation procedure is included in Appendix E.
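The five steps can also be sketched compactly. The following Python sketch is illustrative only and is not the code from Appendix E: the function names are hypothetical, and it uses `math.atan2` and `math.hypot` in place of the plain arctangent and the sine-based hypotenuse of Equation 4.16, a substitution made here to avoid division by zero when a particle lies directly above, below or beside the centre.

```python
import math


def group_centre(points):
    """Equation 4.15: centre of the bounding box of a particle group."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return ((max(xs) - min(xs)) / 2 + min(xs),
            (max(ys) - min(ys)) / 2 + min(ys))


def rotate_group(points, d_phi):
    """Rotate the start points of all particles in a group around its centre."""
    cx, cy = group_centre(points)
    rotated = []
    for x, y in points:
        angle = math.atan2(y - cy, x - cx) + d_phi  # steps 1 and 2
        radius = math.hypot(x - cx, y - cy)         # step 3
        rotated.append((cx + radius * math.cos(angle),   # steps 4 and 5
                        cy + radius * math.sin(angle)))
    return rotated


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
quarter_turn = rotate_group(square, math.pi / 2)
```

Only the start points are moved here; the rotation of each individual particle is left untouched, in line with particle and character rotation being handled separately.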

At this stage, coordinate vectors consist of four parameters: X position ($x$), Y position ($y$), rotation ($\varphi$) and length ($l$). This format is convenient for doing calculations and applying effects, but is less useful for drawing particles using JavaScript. For this reason, each coordinate vector is translated from the form $(x, y, \varphi, l)$ to $(x_1, y_1, x_2, y_2)$:

\[
\begin{aligned}
x_1 &= x \\
x_2 &= x + \cos(\varphi) \times l \\
y_1 &= y \\
y_2 &= y + \sin(\varphi) \times l
\end{aligned}
\tag{4.17}
\]

where $x_1$ and $y_1$ define the coordinates of one endpoint of the particle, and $x_2$ and $y_2$ define the coordinates of the other.
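Equation 4.17 translates directly into code. The sketch below is in Python for illustration (the programme performs this step in PHP), and `to_endpoints` is a hypothetical name.

```python
import math


def to_endpoints(x, y, phi, l):
    """Equation 4.17: convert (x, y, phi, l) into line endpoints (x1, y1, x2, y2)."""
    return (x, y, x + math.cos(phi) * l, y + math.sin(phi) * l)


horizontal = to_endpoints(150, 6, 0, 18)
vertical = to_endpoints(0, 0, math.pi / 2, 10)
```

A particle with zero rotation becomes a horizontal line, so `(150, 6, 0, 18)` maps to the endpoints `(150, 6)` and `(168, 6)`.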


Finally, the set of frames, each containing a set of particles, is encoded as a JSON object and output. After this, script execution stops.

{"frameset":[[[150,6,168,6],[154,83,160,67],[161,55,167,39]]]}

Listing 4.1: Example output in JSON format, containing one frame with 3 particles
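The structure of Listing 4.1 (a list of frames, each a list of four-element particles) can be reproduced with any JSON encoder. The following Python sketch is illustrative: `encode_frameset` is a hypothetical name, and rounding coordinates to integers is an assumption based on the integer values shown in the listing.

```python
import json


def encode_frameset(frames):
    """Encode frames of (x1, y1, x2, y2) particles in the shape of Listing 4.1."""
    # Assumption: coordinates are rounded to whole pixels before encoding.
    rounded = [[[round(v) for v in particle] for particle in frame]
               for frame in frames]
    return json.dumps({"frameset": rounded}, separators=(",", ":"))


frames = [[(150, 6, 168.0, 6.0), (154, 83, 160, 67), (161, 55, 167, 39)]]
payload = encode_frameset(frames)
decoded = json.loads(payload)
```

On the client side, each decoded particle provides the two endpoints of one line segment to be drawn for that frame.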

4.3.4 Description of Main Classes

In this section, the purpose and behaviour of the most important classes are described. Classes are named after the directory structure in which they are located. For example, class Captcha_Canvas can be found at path /Captcha/Canvas.php relative to the root directory of the application.

Class Captcha acts as the main entry point to the captcha system. It handles (and delegates) all necessary actions, including generating captcha coordinates.

Captcha_Generator deals with the generation of a new captcha. generate() is the main method invoking the generation of particles and effects.

Figure 4.5: Class diagram showing classes implementing the Captcha_ParticleContainerInterface

Captcha_Canvas is the central data structure that holds all the particle objects used by the captcha. Effects (e.g. motion) and elements (e.g. characters) can be added to the canvas. The structure of the class is relatively simple (please see Figure 4.5).

Captcha_ParticleContainerAbstract defines some common methods and class members that are required by all instances of classes that can contain groups of particles. Please see Figure 4.5 for examples.

Captcha_Elements_Abstract contains common functionality shared by classes Captcha_Elements_Clutter and Captcha_Elements_Characters. As suggested by the class names, the former deals with generating background clutter and the latter with the generation of characters. As all these classes can contain particles, they also implement the interface Captcha_ParticleContainerInterface. Please see Figure 4.5 for an illustration.

Captcha_Elements_Particle is one of the most important data structures, as it represents a particle in any particle container. Each particle object has a considerable number of parameters and methods; please see Figure 4.6 for an illustration.

Figure 4.6: Diagram showing members and methods of the Captcha_Elements_Particle class

Captcha_Elements_Characters_Abstract is a relatively simple data structure that is shared by all classes containing character data. It contains one property, defining which characters to accept as correct answers for this character object, and one method that returns the particle coordinates used for drawing this character (please see Figure 4.7 for an illustration).

Figure 4.7: Diagram illustrating abstract class Captcha_Elements_Characters_Abstract

Abstract classes Captcha_Effects_Abstract and Captcha_Effects_Variant_Abstract are extended by all 3 effect modules. The former deals with some general tasks involved in applying and generating effects, and the latter with the more specific generation of effect variants. Most of the effect logic described in the previous section can be found in these classes.

4.3.5 Characters

The design of the character set used by the captcha was one of the most important issues in its development. Characters are made out of line-shaped particles that form the contour while leaving the body transparent (please see Figure 4.8).

Figure 4.8: Characters “A” and “B” without any effects applied


Since all characters can move in different directions, it is quite possible that the character order changes during the animation. Thus, it makes sense not to require the user to enter the characters in the correct order (i.e. any order of correct characters will be sufficient).

Not all possible characters are included in the character drawing module. Only upper-case letters in the ASCII character encoding scheme are included in the captcha, to decrease the chance of ambiguous challenges. Some characters accept several letters as correct answers due to their resemblance to other characters (please see Table 4.2).

Character  Accepted values    Character  Accepted values
A          A, a               N          N, n
B          B, b, 8            O          O, o, 0
C          C, c, G, g         P          P, p
D          D, d, O, o, 0      Q          Q, q, 0, O, o
E          E, e               R          R, r
F          F, f               S          S, s, 5
G          G, g, C, c         T          T, t
H          H, h               U          U, u
I          I, i, 1, l         V          V, v
J          J, j               W          W, w
K          K, k               X          X, x
L          L, l               Y          Y, y
M          M, m               Z          Z, z, 2

Table 4.2: Characters used by the captcha
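An order-independent check against the accepted values of Table 4.2 could look like the sketch below. This is an illustrative Python sketch and not the programme's own validation code: `ACCEPTED` and `is_correct` are hypothetical names, and the sketch assumes that every displayed character must be matched by a different typed character. Because some typed characters are accepted by several displayed characters (for example "O" by D, O and Q), all orderings of the answer are tried, which is cheap for at most four characters.

```python
from itertools import permutations

# Accepted answers per displayed character, transcribed from Table 4.2.
ACCEPTED = {
    "A": "Aa", "B": "Bb8", "C": "CcGg", "D": "DdOo0", "E": "Ee", "F": "Ff",
    "G": "GgCc", "H": "Hh", "I": "Ii1l", "J": "Jj", "K": "Kk", "L": "Ll",
    "M": "Mm", "N": "Nn", "O": "Oo0", "P": "Pp", "Q": "Qq0Oo", "R": "Rr",
    "S": "Ss5", "T": "Tt", "U": "Uu", "V": "Vv", "W": "Ww", "X": "Xx",
    "Y": "Yy", "Z": "Zz2",
}


def is_correct(shown, answer):
    """True if the typed answer matches the shown characters in any order."""
    if len(shown) != len(answer):
        return False
    return any(
        all(typed in ACCEPTED[ch] for ch, typed in zip(shown, perm))
        for perm in permutations(answer)
    )
```

For example, `is_correct("AB", "8a")` holds because "8" is accepted for B and "a" for A, regardless of the order in which they were typed.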

The sans-serif typeface Arial was used as a reference when creating character files, for its simplicity and popularity. The number of characters displayed is a random value from 2 to 4 (more than 4 become hard to distinguish). Character size always adjusts to the number of characters displayed, so that they do not overlap too much and fit reasonably well in the frame.

The chance of correctly guessing a combination of characters is then roughly:

\[
\frac{1.15}{(26 \times 25 \times 24 \times 23) + (26 \times 25 \times 24) + (26 \times 25)} \approx 3.1 \times 10^{-6} \tag{4.18}
\]


where 1.15 is the average number of capital letters accepted and 26 the total number of characters. Digits and lower-case letters are omitted because we can assume that the attacker is aware that only capital letters are used by the captcha.
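The arithmetic behind Equation 4.18 can be verified in a few lines. The Python sketch below is illustrative; the variable names are hypothetical, and the only input taken from the text is Table 4.2, in which four characters (C, D, G and Q) accept one additional capital letter each.

```python
# Characters that accept a second capital letter, according to Table 4.2.
extra_capitals = {"C": "G", "D": "O", "G": "C", "Q": "O"}

# 26 characters accept themselves; four of them accept one more capital.
average_accepted = (26 + len(extra_capitals)) / 26

# Denominator of Equation 4.18 for challenges of 4, 3 and 2 characters.
combinations = (26 * 25 * 24 * 23) + (26 * 25 * 24) + (26 * 25)

chance = 1.15 / combinations
```

The average evaluates to 30/26, which rounds to the 1.15 used in the numerator of Equation 4.18, and the denominator is 375050.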

A special tool was designed for creating character files. It displays a reference character and allows the mouse and keyboard to be used to draw a similar character on top of it, using a particle-shaped cursor. Particle coordinates appear after each stroke, to be copied later into the character file. Please see Figure 4.9 for an illustration.

Figure 4.9: The character drawing tool with half of the letter “A” completed and particle coordinates on the left

Character files are ordinary PHP files that contain a standard class of type Captcha_Elements_Characters_Abstract. Character particle data is stored using a multi-dimensional array construct in the PHP language (please see Listing 4.2 for an illustration). The multi-dimensional array, or matrix, is identical to the one described in Equation 4.5 (page 47).

The length of a particle used to draw characters is the same value that is defined by $a_6$, and represents a fraction of the total character size. As the drawing tool produces characters that have a maximum height and width of 100 pixels (which are later scaled up or down, depending on the number of characters, font size, etc.), this value can also be thought of as a percentage of the total character size.


// Each row is one particle: (x, y, rotation, length), as in Equation 4.5
array(
    array(54, 52, 0, 5),
    array(46, 52, 0, 5),
    array(39, 52, 0, 5),
    array(56, 12, 1.2, 5),
    array(59, 19, 1.3, 5),
);

Listing 4.2: Example character data, showing 5 particles of the letter “A”

4.4 Chapter Summary

This chapter has given a thorough description of the captcha programme: how it works, why it was designed this way, what it looks like, and the methodology and algorithms behind it.

The next chapter will show how the right input parameters for this captcha engine were found.

Chapter 5

Test Design

This chapter gives a thorough overview of the user tests that were carried out. We will start with an introduction, describing the purpose of these tests and the general principles of genetic algorithms.

The next section will give a detailed description of the methodology used by the genetic algorithm, namely how the processes of evaluation, reproduction and termination work.

The third section describes how this algorithm was implemented in the testing environment, and how exactly this environment works. Lastly, a chapter summary will follow, concluding the issues discussed.

5.1 Introduction

The captcha programme, described in the previous chapter, has quite a number of input parameters (as listed in Table 4.1, page 44).

It is worth repeating that each captcha is generated from a parameter vector $\vec{a} = (a_1, \ldots, a_n)$ such that each $a_i$ is taken from a predefined set of values $A_i$.

\[
a_i \in A_i = \{a_i^{(1)}, a_i^{(2)}, \ldots, a_i^{(k_i)}\} \quad \text{for } i = 1, \ldots, n \tag{5.1}
\]

As explained in Section 1.2, there are two issues to be concerned with regarding captchas: security and usability. Both vary depending on the particular $\vec{a}$ being used.

The number of all possible captchas can be expressed as:


\[
P = \prod_{i=1}^{n} k_i = \prod_{i=1}^{n} |A_i| = k_1 \times k_2 \times \ldots \times k_n \tag{5.2}
\]
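Equation 5.2 is a plain product of set sizes, which the following Python sketch illustrates. The function name `count_captchas` and the three small value sets are hypothetical; the real sets $A_i$ are far larger.

```python
import math


def count_captchas(value_sets):
    """Equation 5.2: the number of parameter vectors is the product of |A_i|."""
    return math.prod(len(values) for values in value_sets)


# Three made-up parameter value sets with 3, 2 and 4 members.
example_sets = [[1, 2, 3], [10, 20], [0.1, 0.2, 0.3, 0.4]]
total = count_captchas(example_sets)
```

Even three tiny sets already give 24 combinations, which shows how quickly $P$ grows with 26 parameters.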

Out of this large $P$, a captcha $\vec{a}$ with optimal security and usability needs to be chosen. This means that all intended security features should remain in proper use and the usability level of such a captcha should be near the maximum of what can be achieved with such security.

To assure minimum security and usability levels, boundaries for each parameter value set ($A_i$) were introduced. This ensures that the effects do not become too easy for computers, nor too difficult for humans to understand.

Within these boundaries, captchas with the best usability could then be chosen. Genetic Algorithms (GA) are a good tool for solving optimisation problems such as this, as they are very efficient at finding good solutions to problems with a large number of possible solutions.

A Genetic Algorithm (GA) is a mathematical search technique based on the principles of natural selection and genetic recombination (Holland and Holland, 1975). It uses biology-inspired operations (such as mutation and crossover) to produce new instances (or individuals) of a system. The way most genetic algorithms work is illustrated in Figure 5.1.

Figure 5.1: Principal steps common to most genetic algorithms

An individual is one possible solution to a problem, and a fitness function is used to determine how good such solutions are. The best individuals survive and produce new generations while others do not. A chromosome is the computational data structure representing an individual; a niche is a sub-domain of the search space (Davis and Mitchell, 1991). Niches promote diversity and allow different species to be developed. A genetic algorithm that uses niching is also called a multi-modal genetic algorithm (Mahfoud, 1995).

A multi-modal Interactive Genetic Algorithm (IGA) was devised to determine the best parameter set $\vec{a}$ to be used with the captcha. Being interactive means that the subjective evaluation of a human is involved in the fitness function (Takagi et al., 1998).

Tests involving human testers were carried out to find the $\vec{a}$ with the highest usability index (i.e. the individual with the highest fitness factor). The design of this testing programme is described in the following sections.

5.2 Methodology

This section describes the specifics of the genetic algorithm. We will start with an overview of the individuals: what parameters were used in the chromosomes, how niches were used and how the initial population was created.

After this we will look at the main processes in the genetic algorithm: evaluation, reproduction and termination. The last section will list the actual parameters used with the genetic algorithm.

5.2.1 Individuals

As this genetic algorithm is relatively simple, each individual is defined by one chromosome. This means that the fitness of one individual is defined entirely by this chromosome and by nothing else.

Chromosomes, corresponding to the set of parameters being tested, are represented by simple vector-like data structures that have keys (parameter identifiers) and values. They are represented with the array language construct in PHP and with separate table entries in the MySQL Relational Database Management System (RDBMS).

Not all parameters in $\vec{a}$ were included in the algorithm; some (such as animation width) were left out for consistency. Each chromosome is represented by a parameter vector $\vec{c} = (p_1, \ldots, p_n)$ such that each $p_i$ is taken from a predefined set of parameter values $P_i$. It is important to note that, as boundaries have been set on the possible parameter values, each $P_i$ is a subset of the corresponding $A_i$ ($P_i \subseteq A_i$).


Parameters

The following table lists all parameters $a_i$ that have a corresponding parameter $p_i$ that was included in the test and is part of every chromosome $\vec{c}$. The upper and lower bounds ($max_i$ and $min_i$) are also indicated. The purpose and effect of these parameters were described in Section 4.3.3.

Symbol     Parameter                          min_i    max_i
$a_3$      Animation speed                    30       50
$a_7$      Number of background particles     550      750
$a_8$      Number of background layers        5        20
$a_{11}$   Character size                     0.6      1
$a_{12}$   Character horizontal scaling       0.3      1
$a_{13}$   Local motion variants              1        100
$a_{14}$   Local motion frequency             0        100
$a_{15}$   Local motion range                 0.001    0.05
$a_{16}$   Group motion frequency             1        6
$a_{17}$   Group motion range                 0.3      0.5
$a_{18}$   Motion effect skip ratio           0.1      0.3
$a_{19}$   Visibility variants                5        100
$a_{20}$   Appearance probability             0.2      0.7
$a_{21}$   Disappearance probability          0.3      0.8
$a_{22}$   Local rotation variants            5        30
$a_{23}$   Local rot. std. deviation          0.01     0.8
$a_{24}$   Local rotation span                1        100
$a_{25}$   Group rot. std. deviation          0.2      0.8
$a_{26}$   Group rotation span                30       100

Table 5.1: Parameters included in each chromosome $\vec{c}$

Niches

A multi-modal search technique was chosen to promote diversity amongst individuals. This means that getting stuck in a local maximum does not have to mean that the algorithm cannot produce any better individuals (please see Figure 5.2 for an illustration).

[Figure 5.2: plot of fitness(x) against $p_i$, with individual B below a local maximum and individual A below the global maximum]

Figure 5.2: Simplified two-dimensional graph illustrating the problem of local maxima. Individual B can only progress to the level of the local maximum. However, another individual A can achieve the global maximum.

To promote niche development, predefined spaces for a predefined number of individuals were allocated. Each such space, or island (a parallel from ecology), has a predetermined number of individuals, and any individuals spawned from the initial population are strictly bound to the same island. This increases the number of local maxima and improves the chances of finding a better global maximum.

The number of islands was not predefined: new islands were automatically created when there was a need for them. This happened when a user entered the testing environment and there had been exactly the required number of regenerations for all existing islands, and all individuals on existing islands had been rated the required number of times. This also happened when there were no more individuals that could be rated by this particular user (users cannot rate individuals more than once).

Initial Population

Testing starts with generating a new island with the initial population. This is done using random values from $P_i$ for each parameter $p_i$ in a chromosome $\vec{c}$. Values are created using the uniform probability distribution (please see Equation 4.2) to maximise diversity:

\[
p_i = U(min_i, max_i) \quad \text{for } i = 1, \ldots, n \tag{5.3}
\]

where $n$ is the number of parameters in every $\vec{c}$. This creates a set of chromosomes with maximum diversity within the allowed boundaries.
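Equation 5.3 together with the bounds of Table 5.1 can be sketched as follows. This Python sketch is illustrative and not the testing programme's PHP code: `BOUNDS`, `random_chromosome` and `initial_population` are hypothetical names, the island size is passed in because it is not stated here, and count-like parameters are left as real numbers (rounding them would be an additional assumption).

```python
import random

# (min_i, max_i) per parameter, transcribed from Table 5.1.
BOUNDS = {
    "a3": (30, 50), "a7": (550, 750), "a8": (5, 20), "a11": (0.6, 1),
    "a12": (0.3, 1), "a13": (1, 100), "a14": (0, 100), "a15": (0.001, 0.05),
    "a16": (1, 6), "a17": (0.3, 0.5), "a18": (0.1, 0.3), "a19": (5, 100),
    "a20": (0.2, 0.7), "a21": (0.3, 0.8), "a22": (5, 30), "a23": (0.01, 0.8),
    "a24": (1, 100), "a25": (0.2, 0.8), "a26": (30, 100),
}


def random_chromosome(bounds, rng=random):
    """Equation 5.3: draw every parameter uniformly between its bounds."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}


def initial_population(size, bounds, rng=random):
    """Create the initial population of one island."""
    return [random_chromosome(bounds, rng) for _ in range(size)]
```

A call such as `initial_population(8, BOUNDS, random.Random(3))` yields eight chromosomes, each a dictionary keyed by parameter symbol.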


5.2.2 Evaluation, Reproduction and Termination

The next logical steps after the initial population has been created are evaluation, reproduction and termination.

Evaluation

The best individual needs to be found from each generation. Since the usability of captchas is not easily determinable automatically, humans carried out tests where individuals were subjectively assessed.

The best individual from each generation was selected using the following information obtained from each test result:

• Time – how long it took for the user to submit an answer (number of seconds from 0 to 60)

• Score – how the user rated the usability of a captcha (score from 1 to 7)

• Accuracy – how accurate the user response was. 0.5 means that half of the characters were correct.

These results were used to calculate the fitness score for each test. This is done using multiplication, to penalise low results in any of the parameters. In addition, the significance of time was reduced 10 times, because tests were not carried out in a controlled environment and there was no way to ensure that testers were not engaged in other (time-consuming) activities while submitting tests.

\[ \mathit{result} = \mathit{accuracy} \times \mathit{score} \times \left( 1 - \frac{\mathit{time}}{\mathit{time}_{max} \times 10} \right) \tag{5.4} \]

Every individual is tested a number of times to minimise the effect of user errors and to get more objective assessments. The sum of these test results is the fitness value for a single individual.
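The fitness calculation can be sketched in a few lines of Python. This is only an illustration of Equation 5.4 and the summation described above; the function names and the sample results are hypothetical, and a maximum time of 60 seconds is assumed from the range given for the Time measure.

```python
TIME_MAX = 60  # seconds; assumed upper bound of the recorded duration


def single_result(accuracy, score, time, time_max=TIME_MAX):
    """Result of one test (Equation 5.4); time is damped by a factor of 10."""
    return accuracy * score * (1 - time / (time_max * 10))


def fitness(results):
    """Fitness of one individual: the sum of its test results."""
    return sum(single_result(a, s, t) for a, s, t in results)


# Three hypothetical tests as (accuracy, score, time in seconds).
example = [(1.0, 7, 6), (1.0, 6, 12), (0.5, 4, 30)]
example_fitness = fitness(example)  # 6.93 + 5.88 + 1.90, approximately 14.71
```

With three tests per individual and a maximum score of 7, the fitness under these assumptions cannot exceed 21.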

Reproduction

The individual with the highest fitness value is used as a source for generating new slightly mutated individuals (offspring).

Mutation is applied to every parameter $p_i$ in $\vec{c}$ by finding a new normally distributed random value with mean $p_i$. Standard deviation is calculated using a predefined mutation constant $\mathit{const}_{mut}$:

\[ p'_i = N\left(p_i,\ \mathit{const}_{mut} \times (\mathit{max}_i - \mathit{min}_i)\right), \tag{5.5} \]

where $p'_i$ is the new mutated value for the old parameter value $p_i$. Please see Equation 4.11 for the definition of $N(\mu, \sigma^2)$.
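A minimal Python sketch of the mutation step follows. It assumes that the term $\mathit{const}_{mut} \times (\mathit{max}_i - \mathit{min}_i)$ is used directly as the standard deviation, as the text suggests; the function names are hypothetical, and no clamping to the parameter bounds is applied because the text does not describe any.

```python
import random


def mutate(chromosome, bounds, const_mut=0.05, rng=random):
    """Return a mutated copy of a chromosome (Equation 5.5).

    Assumption: const_mut * (max_i - min_i) is the standard deviation.
    """
    return [
        rng.gauss(p, const_mut * (hi - lo))
        for p, (lo, hi) in zip(chromosome, bounds)
    ]


def spawn_offspring(parent, bounds, count=8, const_mut=0.05, rng=random):
    """Generate `count` slightly mutated individuals from the fittest parent."""
    return [mutate(parent, bounds, const_mut, rng) for _ in range(count)]


# Hypothetical parent and bounds, for illustration only.
bounds = [(0.0, 1.0), (10.0, 50.0)]
parent = [0.5, 20.0]
offspring = spawn_offspring(parent, bounds, count=8, rng=random.Random(1))
```

The defaults of 8 offspring and a 5% mutation constant mirror the values listed under Test Setup.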

Traditionally, crossover operations may also be applied to chromosomes in this step of a genetic algorithm. In crossover, some parts of the chromosomes of selected individuals are exchanged to form new chromosomes (Wang et al., 1997). However, this operation was not considered helpful for this particular algorithm and was not used.

Termination

After every cycle of regeneration, evaluation and selection, every island is tested against termination criteria. If these criteria are met, no more individuals will be generated on this island and the individual from the last selection process is considered to be the local optimum and the winning individual within its niche.

There are two criteria that need to be satisfied for the algorithm to terminate an island:

1. The required number of regenerations for each individual has been reached on this island

2. Every individual has been rated the required number of times on this island

After this, the island will be locked and users will be directed to test individuals on other islands.
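The two termination criteria can be expressed as a single predicate. The sketch below is only an illustration: the `Island` and `Individual` data classes are hypothetical stand-ins for the database records, and the default thresholds are taken from the Test Setup parameters.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    ratings: int = 0  # number of test results recorded so far


@dataclass
class Island:
    regenerations: int = 0
    individuals: list = field(default_factory=list)


def is_terminated(island, required_regenerations=8, required_tests=3):
    """True when both termination criteria hold for the island."""
    enough_regenerations = island.regenerations >= required_regenerations
    all_rated = all(ind.ratings >= required_tests for ind in island.individuals)
    return enough_regenerations and all_rated
```

An island for which this predicate holds would be locked, and its fittest individual treated as the winner of its niche.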

Test Setup

People involved in testing were found mostly using social networking sites. The testing environment was hosted on-line, so it was very convenient for people to enter the testing environment and participate (it usually took only a mouse-click).

The following parameters were used in the genetic algorithm:

• Initial population size: 25

• Number of regenerations: 8

• Number of new individuals spawned in each regeneration: 8


• Required tests per individual (from different IP addresses): 3

• Mutation constant (constmut): 5%

• Scale for rating individuals: from 1 to 7 (inclusive)

Therefore, 267 results ((25 + 8 × 8) × 3) were required for each island. Please see Chapter 6 for test results and specifics on the generated islands.
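The arithmetic behind this figure can be checked with a short Python calculation using the parameters listed above; the constant names are chosen here for illustration only.

```python
INITIAL_POPULATION = 25
REGENERATIONS = 8
OFFSPRING_PER_REGENERATION = 8
TESTS_PER_INDIVIDUAL = 3

# Every island holds the initial population plus all spawned offspring.
individuals_per_island = INITIAL_POPULATION + REGENERATIONS * OFFSPRING_PER_REGENERATION

# Every individual must be rated the required number of times.
results_per_island = individuals_per_island * TESTS_PER_INDIVIDUAL

print(individuals_per_island, results_per_island)  # prints: 89 267
```

The 89 individuals per island obtained here match the island capacity reported in Chapter 6.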

5.3 Implementation of the Testing Programme

The testing programme is the software produced for conducting user tests with the algorithm and methodology explained in the previous section. It has four main components, which will be briefly discussed in this section (please see Figure 5.3).

Figure 5.3: Components of the testing software

Obviously the first component (the Testing Environment) is the most important, and therefore receives most of the attention in this section.

As with the captcha programme, PHP was used as the primary scripting language for the testing application. Also as with the captcha programme, all client-side scripting was done using JavaScript.

The MySQL Relational Database Management System (RDBMS) was used to store any necessary information in the database (please see Section 5.3.4 for more on the database schema). All data in the database is stored as plain text¹, except for captcha coordinates, which are stored in a compressed binary format.

¹Obviously, what is meant here is that in the RDBMS interface, plain text is used in queries to retrieve and store information. This does not concern the low-level format in which the data is actually stored on the disk or in operating memory.


5.3.1 The Testing Environment

This section of the software is displayed when a user arrives at the testing environment². Its purpose is to find an available individual for testing and display it, and also to accept the returning answer from the user.

User Interface

As the test captchas can be confusing enough, the user interface was designed to be as minimalistic and simple as possible. Considering the fact that testers were not paid for their time, the test was designed to be as quick and clear as possible (users probably would not bother reading more than two sentences in a row).

There is a welcome text at the top with instructions, which can be hidden (please see Figure 5.4). To make testing more convenient for more dedicated testers, keyboard shortcuts were introduced and listed on the left side of the page.

To make the testing a bit more interactive, users are also notified of the correctness of their last answer and of the total number of tests they have attempted. This information box lies on the right side of the page.

Figure 5.4: Snapshot of the testing software user interface

In the centre of the page lies the captcha challenge, and underneath it the input controls. The user is instructed to enter “any letter you can recognise” into the text area and then rate the readability of the captcha on a scale of 1 to 7 by clicking on a radio button below it.

²The testing environment can be accessed at <documentroot>/Tests

The form will be submitted immediately after the user has clicked on a rating. Users are not required to enter anything in the text area, as many captchas might be completely unrecognisable.

To make the meaning of the radio buttons clearer, a hovering tool-tip is displayed when the user moves his or her mouse cursor over the first, last or middle radio button. This explains the meaning of the button (e.g. when moving the mouse cursor over the “7” button, the text “It’s very easy to understand this captcha” will appear).

After the form has been submitted, the input controls will be dynamically hidden (to avoid multiple submits) and the page will be reloaded with a new captcha challenge and an updated information box.

A blank page with an information message will be displayed instead of a captcha if the browser does not meet minimum requirements, if there are no more tests available, or if a time-out has occurred.

Minimum requirements

Introducing minimum requirements for web browsers helps to ensure that content and captchas are always displayed similarly to all users. If minimum requirements are not met, then instead of the test interface, the user is shown a message along with download links to more recent versions of browsers.

Instead of restricting browser access by the User-Agent request header sent by the web browser, a more dynamic feature-based approach was used.

Requirements are considered met if the browser is able to create XMLHttpRequest objects (use AJAX), renders content according to the W3C CSS Box Model³, and supports the HTMLCanvasElement object (HTML5 canvas elements). This set of requirements is met by all recent versions of all major browsers.

Control Flow (Request a Test)

An island is first selected and if there are none available, then a new one will be created (if necessary). This action can also invoke the regeneration process to create new individuals (please see Figure 5.5).

³http://www.w3.org/TR/CSS2/box.html

Figure 5.5: Diagram showing how an individual is selected if a test is requested by a user

If there are no islands or tests available, the user is shown an appropriate message.

On the first request, an HTML page is output with links to CSS and JavaScript files. JavaScript will initiate the second (AJAX) request that loads the coordinate data. Both requests go through the same process; the only difference is the answer to the “What to output?” decision point (please see Figure 5.5).

Control Flow (Submit a Test)

This action uses an HTTP POST method to submit the test score (along with other information) to the database. As it is never known what data a user might submit through an HTTP request, care is taken not to rely on its trustworthiness. Table 5.2 lists the parameters expected from each POST request.

For the request to be considered valid (please see Figure 5.6), it must contain all the expected parameters and the score and duration must be within allowed boundaries (1 to 7 and 1 to 60 respectively).

The token value is used to make sure that one could not custom-build requests to insert invalid data into the system. The security token (a randomised SHA1 hash) is generated at the same time as the test and must match the string in the database.

| Parameter | Example | Description |
|---|---|---|
| answer | tra | Value in the text field |
| duration | 3 | Duration in seconds (measured using JS) |
| score | 4 | Value of the clicked radio button |
| test_id | 1256 | Identifier of the individual |
| token | 52c57. . . 27eab | Security token for validation (40 characters) |

Table 5.2: Parameters expected from the HTTP POST request

Figure 5.6: Diagram showing how the user response to the test is being handled by the system

If the posted data is valid, it is still checked whether the individual needs more results at all. This is to avoid the situation where two or more testers might be given the same test captcha, even though only one more answer is required. Without this extra check, more than the required number of results might end up in the database for one individual.
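The validation steps described above can be sketched as follows. This is a Python illustration under stated assumptions, not the project's PHP implementation: the function names are hypothetical, the stored token and the current result count are assumed to have been fetched from the database beforehand, and the token is generated here by hashing random bytes with SHA-1.

```python
import hashlib
import hmac
import secrets

EXPECTED = ("answer", "duration", "score", "test_id", "token")


def new_token():
    """Randomised SHA-1 hash: 40 hexadecimal characters."""
    return hashlib.sha1(secrets.token_bytes(32)).hexdigest()


def is_valid_submission(post, stored_token, results_so_far, required_tests=3):
    """Check a submitted test against the rules described in the text."""
    # 1. All expected parameters must be present.
    if any(name not in post for name in EXPECTED):
        return False
    # 2. Score and duration must be integers within the allowed boundaries.
    try:
        score = int(post["score"])
        duration = int(post["duration"])
    except (TypeError, ValueError):
        return False
    if not (1 <= score <= 7 and 1 <= duration <= 60):
        return False
    # 3. The token must match the one stored with the individual.
    if not hmac.compare_digest(str(post["token"]).encode(), stored_token.encode()):
        return False
    # 4. The individual must still need more results.
    return results_so_far < required_tests
```

The final check corresponds to the situation where two testers receive the same captcha although only one more answer is required.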

5.3.2 The Preview Environment

The preview environment⁴ is similar to the Testing environment with the following differences: users cannot rate tests; answers are not recorded; and captchas are generated from one of the n fittest individuals by applying a mutation rate of m (the same way as explained in Equation 5.5).

⁴The preview environment can be accessed at <documentroot>/Preview


Figure 5.7: Diagram showing how the user response to the test is being handled by the system

The purpose of this environment is to preview the best captchas (with the highest fitness factor) to see what the algorithm has produced. As with the Testing environment, this was also publicly accessible to anyone during the testing period. A 20% mutation rate and the 10 fittest individuals were used (n = 10, m = 0.2).

As seen from Figure 5.7, this action is very simple. This is because, for performance reasons, all preview captchas are pre-generated (see the next section for more on that).

Similarly to the testing environment, on the first request, only an HTML page (along with JavaScript and CSS) is output. After that, JavaScript will trigger the second request that loads the coordinate data.

5.3.3 Other Environments

There are two other environments that are not directly and publicly accessible. The maintenance interface is designed to be run in the background and the Administration environment is used to display and control the flow of user tests.

Maintenance Interface

The maintenance script is mainly concerned with generating necessary cached coordinate values and deleting unnecessary ones. Since it takes around 4 to 8 seconds to generate coordinates for one captcha, this cannot be done in real-time. Also, as the average (compressed) coordinate cache takes around 250 KB of storage, it makes sense to delete the ones that are not needed any more.

The maintenance script was set up to be executed by the cron job scheduler every minute on the testing server. It generated a maximum of 10 sets of coordinates in a row and preferred individuals that are most likely to be displayed next.

Figure 5.8: Control flow of the maintenance script

The individuals that have not yet been rated the required number of times and do not yet have a cached coordinate set get priority over others. They are referred to as individuals IND1 (please see Figure 5.8).

The other type, IND2 individuals, get second priority. These are the individuals used to generate cached captchas for the Preview interface. When IND2 individuals exist, the required number of them will be fetched and cloned with mutations. New preview captchas need to be generated because old ones are deleted immediately after displaying.

Administration Environment

The purpose of the Administration environment⁵ is to provide an overview of the individuals and islands that have been generated and rated by users.

This environment also provides some tools for generating islands and individuals or deleting existing ones. As this is not immediately necessary for the purpose of producing a better captcha (which is the goal of this project), the administration environment will not be discussed in any further detail. However, the programme code of this section is still fully documented and included on the CD (please see Appendix A).

⁵The administration environment can be accessed at <documentroot>/Admin


5.3.4 Database Schema

A database was required to store test results, test parameters and cached captchas. A brief overview of the database tables will be given in this section. Please see Appendix F for the database schema diagram.

Individuals is the table that holds all generated captchas along with the challenge text (the right answer) and the security token used to validate test submissions. The result count is also cached in this table, as are the captcha coordinates (as gzipped binary blobs). All individuals belong to an island; these are stored in the island table, which has no parameters other than the island identifier and name.

The parameters included in chromosomes are held in the parameters table. Parameter values for generated captchas are stored in the parameter_values table.

The 3 parameters collected from user tests (duration, score, accuracy) are stored in the results table, along with information like date and time of the test, client IP etc.

The purpose of the table preview_cache is only to hold pre-generated coordinates for individuals in the Preview environment (see Section 5.3.2).

5.4 Chapter Summary

In this chapter we have seen what genetic algorithms are and why and how they were used in the context of this project. We have seen how and why the testing environment was implemented and what it looks like.

In the next chapter we will see how this testing environment was used and what results the genetic algorithm generated.

Chapter 6

Test Results

This chapter will present the results of the tests described in the previous chapters, starting with an overview of some general statistics. The next section will describe some noteworthy aspects of the generic data gathered from tests. Analysis of more complex data will follow, describing the details of the islands, individuals and parameter values that were generated. The last section will conclude the chapter.

6.1 Overview

The testing environment was accessible for a period of 20 days, but most results were acquired during the second week (please see Figure 6.1). A total of 2716 results were submitted and 924 individuals generated. As one island could hold 89 individuals, 12 islands were generated to accommodate all individuals.

[Figure 6.1 chart: number of results per day for days 1 to 20, vertical axis 0 to 800, with one value label, 703.]

Figure 6.1: Participation during the testing period peaked on the 9th day

Results were submitted from 107 different Internet Protocol (IP) addresses, i.e. at least 107 different people took part in the testing. On average, 25 submissions were made by each user; the maximum number of submissions made from a single IP was 401. 13 different countries were identified by the IP addresses acquired.

43% of the traffic originated from the social networking site Facebook.com, 46% was direct traffic¹.

6.2 Acquired Figures

As previously explained, 3 pieces of information were obtained from each submission: score, test duration and accuracy of the answer. After that, the fitness of each individual was calculated (see Equation 5.4).

Figure 6.2 displays how these 3 parameters affected the actual fitness values that were calculated for each individual.

[Figure 6.2 chart: accuracy, score and duration (0% to 100%) plotted on a horizontal axis from 0 to 21, each series with a linear trend line.]

Figure 6.2: This graph illustrates how accuracy, score and duration (scaled to ratios of maximum values) influenced the final fitness score for individuals. Values for duration are flipped, i.e. 80% duration means that 20% of time was used and 80% was not.

As we can see, test duration did not greatly affect the test results, although, as designed, a slight penalty was given for slow answers. The relationship between accuracy and fitness shows a strong correlation, and maximum accuracy was required for achieving higher fitness values. Also, as planned, a strong correlation is present between score and fitness. We can also see that individuals with inaccurate submissions or low scores could only reach lower fitness levels.

¹Direct traffic means that the referring site was not specified in the request. This can be the case when typing the URL in the web browser, using bookmarks, opening links from e-mails etc.

Figure 6.3 shows the distribution of test scores and durations. We can see that ratings were generally negative, as the most popular scores are 1 and 3, and the least popular 6 and 7. 30% of tests were submitted within the first 5 seconds, 64% during the first 10, and 89% during the first 20 seconds.

[Figure 6.3 charts: left, number of tests per rating (1 to 7), vertical axis 100 to 500; right, number of tests per duration (0 to 60 seconds), vertical axis 0 to 1600.]

Figure 6.3: Distribution of ratings (left) and test durations (right)

The distribution of accuracy measures (Figure 6.4) shows that a third of the results were completely wrong and more than half completely correct. This would indicate that tests were either clear or they were not, and there was very little guessing.

[Figure 6.4 chart: number of tests (0 to 1200) per accuracy level: 0%, 25%, 33%, 50%, 67%, 75%, 100%.]

Figure 6.4: Distribution of the accuracy of tests


6.3 Islands and Individuals

8 islands out of 12 reached the required number of regenerations. The remaining 4 were discarded and were not included in any further analysis of the results. Please see Table 6.1 for an overview of islands and the individuals they contain.

| Island | Avg fitness | Avg duration | Avg accuracy | Max fitness |
|---|---|---|---|---|
| 1 | 10.2 | 10 | 73% | 20.6 |
| 2 | 6.2 | 14 | 56% | 16.2 |
| 3 | 9.5 | 10 | 70% | 19.7 |
| 4 | 2.6 | 13 | 29% | 12.3 |
| 5 | 7.3 | 12 | 61% | 19.7 |
| 6 | 8.2 | 10 | 61% | 20.8 |
| 7 | 8.7 | 9 | 72% | 19.8 |
| 8 | 7.2 | 9 | 68% | 17.8 |

Table 6.1: Overview of the generated islands and individuals

As every island contains 89 individuals and 3 results were acquired for each individual, a total of 2136 results qualified for further analysis.

[Figure 6.5 chart: fitness (0 to 21) of individuals (0 to 90), one series per island; the extracted legend lists Island 1 to Island 6.]

Figure 6.5: Fitness scores of all individuals on each island

Figure 6.5 visualises the fitness values in the table above, showing all individuals on each island with their fitness scores. It can be seen that individuals on islands 1 and 3 were the most successful, whereas individuals on island 4 were the least.

Figure 6.6 shows snapshots of the best individuals from each island. It can be seen that fragments of some characters are recognisable on some captchas, whereas others give very little information on what might be displayed in the animation.

Some of these captchas, such as those on islands 2, 3 and 5, can be characterised by quite fast movements, where characters often move outside the animation area and might not all be visible at the same time. Others (islands 6 and 7) can be characterised by slower motion and greater rotation.

[Figure 6.6 images: one snapshot panel per island, Island 1 to Island 8.]

Figure 6.6: Snapshots of the fittest individuals on each of the 8 islands. On some captchas, fragments of characters can also be recognised from these still images.

The fact that a very diverse set of individuals was produced can also be illustrated by Figure 6.7. This graph shows the parameters included in the chromosomes of the fittest individuals on each of the 8 islands.

The values are scaled as ratios of the maximum values allowed by the set boundaries (see Table 5.1 from the previous chapter). Green colour indicates a low value and dark red a higher value. It can be noticed that for nearly all parameters, both very high and very low values are present in these individuals. Please see Appendix G for a numerical representation of this data.


[Figure 6.7 heat map: rows Island 1 to Island 8; columns are chromosome parameters labelled a03 to a26; colour bands 0%-20%, 20%-40%, 40%-60%, 60%-80%, 80%-100%.]

Figure 6.7: Parameters $a_i$ of the fittest individuals $\vec{a}$ on each island. Both high and low values are present for nearly all parameters.

Figure 6.8 shows the increase in average fitness of individuals as new generations were spawned. As we can see, the change in average fitness was generally positive and seemed to stabilise in the 8th generation (7th regeneration). An exception was island 4, which suffered from exceptionally unsuccessful individuals and low scores.

[Figure 6.8 chart: average fitness (0 to 12) per generation (1 to 9), one series per island, Island 1 to Island 8.]

Figure 6.8: Average fitness of islands over 9 generations

Figure 6.9 shows the average response accuracy for the last two generations of individuals on each island. It can be noticed that some islands reached quite high accuracy (> 90%), whereas some did not.

[Figure 6.9 chart: average accuracy (0% to 100%) per island, Island 1 to Island 8; value labels as extracted: 80%, 48%, 83%, 40%, 85%, 64%, 94%, 95%.]

Figure 6.9: Average accuracy of results for the last two generations (8 and 9) on each island

The fittest overall individual was a captcha with a combined fitness of 20.82. This individual was tested again (using the Preview environment) to learn the final usability rate for the captcha. A maximum of 20% mutation was applied to its parameters to generate variations of the captcha. 178 tests were submitted with an average accuracy of 82%. Contrary to the method used in the fitness function, binary accuracy values were used here (i.e. two correct characters out of four resulted in 0%, not 50%, accuracy).
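The difference between the two accuracy measures can be illustrated with a short Python sketch. How the project compares an answer with the challenge text is not stated here, so the position-by-position, case-sensitive comparison below is an assumption, as are the function names and the sample strings.

```python
def partial_accuracy(answer, challenge):
    """Share of challenge characters matched at the same position (assumed rule)."""
    correct = sum(a == c for a, c in zip(answer, challenge))
    return correct / len(challenge)


def binary_accuracy(answer, challenge):
    """1.0 only if the whole challenge was answered correctly, else 0.0."""
    return 1.0 if answer == challenge else 0.0


# Two correct characters out of four: 50% partial accuracy, 0% binary accuracy.
half_right = partial_accuracy("trxx", "trap")
all_or_nothing = binary_accuracy("trxx", "trap")
```

Under the binary measure, the reported 82% is the share of challenges that were solved completely.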

6.4 Chapter Summary

We have seen many interesting figures and correlations in this chapter. The genetic algorithm produced 8 qualified islands with 712 individuals and a total of 2136 submitted results.

The next chapter will discuss what conclusions can be drawn from this data and how the results match initial expectations.

Chapter 7

Discussion

This chapter provides a discussion about the results achieved and accomplishments made in this project. As there were two major parts in this project, the discussion will also analyse both of these elements separately. We will start with the evaluation of the captcha programme, followed by an assessment of the testing algorithm. After this, an analysis of the issues of usability and security is given. The final section concludes the chapter.

7.1 Critical Analysis of the Captcha Design

The captcha engine developed during this project behaved generally as expected and no problems were encountered during the testing period. However, some issues should be mentioned.

Performance. The fact that an average captcha takes ≈ 6 seconds to generate might become a problem in production environments. In the testing environment used in this project, all captchas were pre-generated and cached prior to presentation.

As seen from Figure 7.1, the culprit is the process that calculates new particle coordinates for generated effect variants (described in Section 4.3.3). Note that the scale is logarithmic and this process takes nearly 96% of the total generation time. The process that transforms coordinates to $(x_1, y_1, x_2, y_2)$ format (described in the same section) is second with 3%.
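The quoted shares can be reproduced from the timings shown in Figure 7.1. In the Python sketch below, the pairing of each value with a procedure is an assumption read off the figure's labels in their listed order; it is consistent with the total of roughly 6 seconds and with the two percentages given above.

```python
# Time per procedure in milliseconds (pairing with labels assumed from Figure 7.1).
timings_ms = {
    "Create characters": 0.02,
    "Create background": 0.15,
    "Create motion variants": 70,
    "Create rotation variants": 19,
    "Create visibility variants": 6,
    "Calculate coordinates": 5764,
    "Transform coordinates": 151,
}

total_ms = sum(timings_ms.values())  # roughly 6010 ms, i.e. about 6 seconds
shares = {name: ms / total_ms for name, ms in timings_ms.items()}

for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.1%}")
```

Under this reading, the five remaining procedures together account for less than 2% of the generation time.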

Although the described processes contain somewhat complex calculations, it is reasonable to assume that the problem lies also in PHP. It is common knowledge that in PHP, object-oriented code is slower than plainly procedural code. Also, a language that supports strong typing and built-in operations with vectors and matrices should yield better performance.

[Figure 7.1 chart, values in milliseconds as extracted, horizontal axis logarithmic from 0.01 to 10000: Create characters 0.02; Create background 0.15; Create motion variants 70; Create rotation variants 19; Create visibility variants 6; Calculate coordinates 5,764; Transform coordinates 151.]

Figure 7.1: Graph illustrating time requirements (milliseconds) for different procedures in the captcha generation process. Note the logarithmic scale.

However, there are many arguments for PHP (see Section 4.3.1) and there is no direct reason to suggest that captchas should not be pre-generated. It is also possible that re-writing some code in a more procedural fashion might decrease the time requirements (however, this was not the purpose of this project).

Presentation. One might argue that using HTML5, JavaScript and canvas for presenting the captcha is a bad idea, since not all browsers might yet support these features, and the implementation of the captcha might get too messy and complicated. Alternative methods include using Adobe Flash, GIF animations, Java or similar.

However, there is a great advantage in using JavaScript that one should not overlook. Namely, most of the infrastructure in place for breaking captchas¹ is designed for still images, or at least media that can be transmitted in a single file. It is very difficult to redirect this kind of animation to another site for real-time breaking. This greatly increases the security provided by such captchas (see Section 7.3 for the security analysis).

Background clutter. Although background clutter seems to serve its purpose fairly well, it could be enhanced with a more complex particle distribution algorithm.

¹This includes services offering human-based laundry attacks such as deathbycaptcha.com and decaptcher.com


Figure 7.2: Snapshot of all frames in the captcha animation. On some occasions character regions can be identified.

Figure 7.2 shows an animation window that has not been cleared after each frame (i.e. all frames have been drawn onto each other). With some captchas (such as this one) one could easily see the fingerprints of characters: darker areas that indicate the presence of more particles.

A more complex algorithm could try to position particles less randomly outside the characters, to distribute the visual scene more evenly.

User Satisfaction. Feedback indicates that this kind of captcha is generally not very much appreciated by test users. During test sessions, it was often considered annoying, if not frustrating.

However, most of this negative feedback came from the early stages of testing, where captchas had reached only the first or second generation and were often completely unrecognisable. It must also be noted that the average test user solved around 25 captchas, whereas real-life usage would probably involve far fewer occasions.

7.2 Critical Analysis of the Test Design

As with the captcha, the testing programme served its purpose and produced captchas with the desired characteristics. Nonetheless, in hindsight, some problems could have been solved better and there is some room for improvement.


Parameters in Chromosomes. One parameter not included in the chromosomes was the number of characters used in the animation. This is because it has a very small range (fewer than 2 becomes too insecure, more than 4 becomes unreadable) and because variable challenge length increases security.

In the sense of optimisation, it might have been better to use the same number of characters in all tests. This is because the positioning algorithm decreases text size with every added character, and smaller characters might get lower marks even if all other parameters are the same.

Unfortunately there are no good ways to ensure that text is always the same size. One option would have been to use a fixed challenge length, which would make the captcha considerably less secure. The other would have been to use the same text size for every challenge, which would have had a negative impact on usability (having all characters too small is worse than having some characters too small).

Parameter Bounds. The parameter bounds (see Table 5.1) were set quite subjectively and did not consider the possibility that usability (or security) might depend on more complex relationships between these parameters. For example, a lower bound for parameter X might be enough to achieve sufficient security in all cases except when parameter Y is very low.

These more complex relationships between two or more parameters were neither identified nor included in the algorithm. This resulted in captchas where the Zero-Knowledge Per Frame Principle (ZKPFP) is not completely ensured and fragments of characters can be identified from individual frames as well (for example, see Island 3 in Figure 6.6).

However, the goal in this project was not to implement ZKPFP, but to develop a captcha inspired by it.

Figure 6.7 shows parameter values for the fittest individuals on each island. It can be noticed that both a very high and a very low value are present for most of the parameters. This can indicate either that the parameter values did not have enough time to stabilise and reach the actual local maximum, or that the local maxima are very different and the captchas turned out to be very diverse. Running the same tests again using significantly more than 8 regenerations could show which is the case.


User Interface. At least two people had initially misunderstood the instructions and thought they were supposed to click on the rating buttons (1 to 7) to indicate how many characters they saw. This was not the case, as a subjective usability rating was supposed to be given using these controls.

Although they admitted not reading the instructions very carefully, this could have been foreseen. For example, using text labels instead of digits could have prevented this issue.

Fitness Function. The fitness function (see Section 5.2.2) performed well, but on some occasions, a near-maximum value was achieved too fast.

The best solution would have been to require more than 3 tests for each individual. However, as resources were limited, this was not possible.

The other option would have been to make the fitness function more complex by using more variables. Then again, the testing interface was already quite difficult to use, requiring users to enter two pieces of information (answer and score).

7.3 Security vs Usability – Analysis

We saw that the random guess, or brute force, efficiency for this captcha is roughly $3.1 \times 10^{-6}$, which is quite low and within acceptable levels. However, random guessing is rarely used to break captchas.

As mentioned before, 6 captcha-breaking services were identified, most of which offer outsourcing attacks using custom APIs at a price starting from $0.001 per captcha (see Table 2.1). All 6 services were sent queries about the possible price for breaking the captcha described in this project.

In the communication, a link was provided to the Preview environment (see Section 5.3.2) that displayed a random selection of the fittest individuals with a 20% mutation rate for parameters. Not all services responded, but the ones that did confirmed that it is not possible for them to break this kind of captcha. One reply added that “The accuracy rate would be from very little to none.” Table 7.1 lists all the services that were contacted and the replies they provided.

It is reasonable to assume that a commercial service that is still active and might be capable of breaking a captcha would respond to any price queries about it. Therefore, no commercial service could be identified that would be capable of, or willing to provide, a captcha-solving service (outsourcing or otherwise) for the captcha described in this project. This is a very good result in terms of the objectives set for this project.

Service               Answer
bypasscaptcha.com     Not possible
beatcaptchas.com      No reply
captchacracking.com   No reply
captchatrader.com     Not possible
deathbycaptcha.com    Not possible
decaptcher.com        Not possible

Table 7.1: Replies from commercial captcha-breaking services

The other important goal in this project was good enough usability to be useful in real-life situations. Chellapilla et al. (2005a) have stated that “For good usability the human success rate should approach 90%” and tests have demonstrated that this captcha can reach usability levels up to 82%. This can be considered a good result, as minor manual tweaking of the parameters would probably increase the usability rate even more.

7.4 Chapter Summary

In this chapter we have discussed the problems that were encountered during the development of both the captcha programme and the testing algorithm. We have also seen that both the usability and the security prospects are good for this captcha and that the final outcome is satisfactory. The next chapter will conclude the dissertation and list some potential future improvements.

Chapter 8

Summary

The objective of this project was to “devise a novel motion based captcha with good enough security and usability to compete with current popular captchas”. The results are promising and indicate that this objective has been met.

Background research showed that substantial work has already been done in captcha research, and some interesting ideas, such as the Zero Knowledge Per Frame Principle (ZKPFP) in motion captchas, were identified.

Following the background research, a novel motion based captcha was designed. The captcha was capable of drawing segmented moving characters, mixed with background clutter, to form an animation. The animation was designed to be relatively easy for humans and very difficult for computers to understand. The captcha performed generally well and as expected; however, generating each captcha takes more time than would probably be acceptable in production environments. Still, this issue is not critical, as it can be eliminated by pre-generating and caching captchas.
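The pre-generation idea can be sketched in a few lines of PHP. This is a minimal illustration under stated assumptions, not the project's code: the class CaptchaPool, its methods and the stand-in generator fakeGenerateCaptcha are all hypothetical names, and the generator simply returns a dummy record in place of the slow animation rendering.

```php
<?php
// Hypothetical sketch of a pre-generated captcha pool (not project code).

/** Stand-in for the slow generation procedure; returns a dummy captcha record. */
function fakeGenerateCaptcha()
{
    static $iCounter = 0;
    return array('id' => ++$iCounter, 'answer' => 'ABCD');
}

class CaptchaPool
{
    private $aPool = array();
    private $cGenerator;
    private $iTargetSize;

    public function __construct($cGenerator, $iTargetSize)
    {
        $this->cGenerator = $cGenerator;
        $this->iTargetSize = $iTargetSize;
    }

    /** Top the pool up to its target size; meant to run outside user requests. */
    public function refill()
    {
        while (count($this->aPool) < $this->iTargetSize) {
            $this->aPool[] = call_user_func($this->cGenerator);
        }
    }

    /** Hand out one captcha exactly once; generate on demand if the pool is empty. */
    public function take()
    {
        if (count($this->aPool) === 0) {
            return call_user_func($this->cGenerator);
        }
        return array_shift($this->aPool);
    }

    public function size()
    {
        return count($this->aPool);
    }
}

$oPool = new CaptchaPool('fakeGenerateCaptcha', 3);
$oPool->refill();
$aCaptcha = $oPool->take();
printf("served captcha #%d, %d left in pool\n", $aCaptcha['id'], $oPool->size());
// prints "served captcha #1, 2 left in pool"
```

Each pooled captcha is removed when served, so a solved or failed challenge is never shown twice. In this sketch the pool lives in memory for a single script run; a production setup would presumably keep it in a database table or on disk and refill it from a background job.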

The captcha accepted a number of input parameters, most of which had quite a large range. To determine the set of input parameters that achieves the best usability levels, an interactive genetic algorithm was designed, together with a testing environment for testers. The algorithm used a subjective user rating, amongst other parameters, in its fitness function to produce the fittest individuals (captchas).

The captchas produced by the genetic algorithm showed great diversity and good usability levels. While the ZKPFP was not completely achieved with all captchas, they are still very hard to decode with frame-by-frame analysis. This was confirmed by feedback from captcha-breaking services, four of which stated that they do not provide attacks against this captcha. The other two did not respond and are presumably not operational any more.

The captcha with the highest fitness score produced by the genetic algorithm was tested again to obtain a more reliable usability rate. This turned out to be 82%, which is a good result, given that the human success rate for a “good captcha” has been stated to approach 90% (Chellapilla et al., 2005a).

One of the biggest problems remains user satisfaction, as this captcha was considered generally quite difficult and not very user-friendly. However, the two major goals in this project were high usability and security, not user satisfaction. It is probably safe to say that most people do not enjoy solving any type of captcha, moving or not. Although user satisfaction is a desirable quality, the current compromise between the three looks reasonable and practical.

8.1 Future Work

Responses from the four captcha-breaking services are an encouraging sign that the captcha has good potential for being more secure than current alternatives. However, a more thorough security analysis (a breaking attempt) of the captcha should still be performed. This was unfortunately out of the scope of this project.

The captcha programme has served well as a proof-of-concept and PHP has justified itself as the programming language of choice. However, if this captcha were to be used in a production environment, it would probably be best to re-write the generation procedure in a programming language that performs faster calculations and operations with objects and vectors.

The current version of the captcha uses only one set of characters. Although the actual shape of the characters is different in every captcha (due to added effects), it would enhance security to use more than one typeface. This can be done quite easily with the character drawing tool (see Section 4.3.5) and slight modifications to the character generation algorithm.

Another security enhancement would be the more complex background clutter placement algorithm described in the previous chapter. Although not crucial, it would probably make automated segmentation attacks more difficult.

If a breaking algorithm is developed for this captcha, it will probably be based on template matching. Introducing an effect that simulates 3-dimensional movement would make this more difficult.

This section concludes the theoretical part of this project. The project also includes the developed software; please see Appendix A and the attached CD for more. To see working examples of the captcha, it is easier to check the on-line resources provided in Appendix C.

References

Athanasopoulos, E. and Antonatos, S. (2006). Enhanced captchas: Using animation to tell humans and computers apart. In Communications and Multimedia Security, pages 97–108. Springer.

Baird, H. and Bentley, J. (2004). Implicit CAPTCHAs. In Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, volume 5676, pages 191–196.

Baird, H. and Popat, K. (2002). Human interactive proofs and document image analysis. In Proceedings of the 5th International Workshop on Document Analysis Systems V, pages 507–518. Springer-Verlag.

Baird, H. and Riopka, T. (2005). ScatterType: a reading CAPTCHA resistant to segmentation attack. In Proc. SPIE, volume 5676, pages 197–201. Citeseer.

Bergmair, R. and Katzenbeisser, S. (2004). Towards human interactive proofs in the text-domain. Information Security, pages 257–267.

Bigham, J. and Cavender, A. (2009). Evaluating existing audio CAPTCHAs and an interface optimized for non-visual use. In Proceedings of the 27th international conference on Human factors in computing systems, pages 1829–1838. ACM.

Bruce, V., Green, P., and Georgeson, M. (1996). Visual perception: physiology, psychology, and ecology. Psychology Press.

Bursztein, E., Bethard, S., Fabry, C., Mitchell, J., and Jurafsky, D. (2010). How good are humans at solving CAPTCHAs? a large scale evaluation. In 2010 IEEE Symposium on Security and Privacy, pages 399–413. IEEE.


Chaohui, Z., Xiaohui, D., Shuoyu, X., Zheng, S., and Min, L. (2007). An improved moving object detection algorithm based on frame difference and edge detection. In Image and Graphics, 2007. ICIG 2007. Fourth International Conference on, pages 519–523. IEEE.

Chellapilla, K., Larson, K., Simard, P., and Czerwinski, M. (2005a). Building segmentation based human-friendly human interaction proofs (HIPs). Human Interactive Proofs, pages 1–26.

Chellapilla, K., Larson, K., Simard, P., and Czerwinski, M. (2005b). Computers beat humans at single character recognition in reading based human interaction proofs (HIPs). In Proceedings of the Second Conference on Email and Anti-Spam, pages 21–22. Citeseer.

Chew, M. and Baird, H. (2003). BaffleText: a human interactive proof. volume 5010, pages 305–316. Conference on Document Recognition and Retrieval X, Santa Clara, CA, JAN 22-24, 2003.

Chew, M. and Tygar, J. (2004). Image recognition captchas. Information Security, pages 268–279.

Chow, R., Golle, P., Jakobsson, M., Wang, L., and Wang, X. (2008). Making captchas clickable. In Proceedings of the 9th workshop on Mobile computing systems and applications, pages 91–94. ACM.

Coates, A., Baird, H., and Fateman, R. (2001). Pessimal print: a reverse Turing test. In Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on, pages 1154–1158. IEEE.

Cui, J., Mei, J., Wang, X., Zhang, D., and Zhang, W. (2009). A CAPTCHA Implementation Based on 3D Animation. In 2009 International Conference on Multimedia Information Networking and Security, pages 179–182. IEEE.

Cui, J., Wang, L., Mei, J., Zhang, D., Wang, X., Peng, Y., and Zhang, W. (2010a). CAPTCHA design based on moving object recognition problem. In Information Sciences and Interaction Sciences (ICIS), 2010 3rd International Conference on, pages 158–162.

Cui, J., Zhang, W., Peng, Y., Liang, Y., Xiao, B., Mei, J., Zhang, D., and Wang, X. (2010b). A 3-layer Dynamic CAPTCHA Implementation. In 2010 Second International Workshop on Education Technology and Computer Science, pages 23–26. IEEE.

Datta, R., Li, J., and Wang, J. (2005). IMAGINATION: a robust image-based CAPTCHA generation system. In Proceedings of the 13th annual ACM international conference on Multimedia, pages 331–334. ACM.

Davis, L. and Mitchell, M. (1991). Handbook of genetic algorithms. Van Nostrand Reinhold.

Elson, J., Douceur, J., Howell, J., and Saul, J. (2007). Asirra: A CAPTCHA that Exploits Interest-Aligned Manual Image Categorization.

Ferzli, R., Bazzi, R., and Karam, L. (2006). A captcha based on the Human Visual Systems masking characteristics. In 2006 IEEE International Conference on Multimedia and Expo, pages 517–520. IEEE.

Fischer, I. and Herfet, T. (2006). Visual CAPTCHAs for document authentication. In Multimedia Signal Processing, 2006 IEEE 8th Workshop on, pages 471–474. IEEE.

Fritsch, C., Netter, M., Reisser, A., and Pernul, G. (2010). Attacking image recognition captchas: a naive but effective approach. In Proceedings of the 7th international conference on Trust, privacy and security in digital business, TrustBus’10, pages 13–25, Berlin, Heidelberg. Springer-Verlag.

Google Inc. (2009). Official Google Blog. http://googleblog.blogspot.com/2009/09/teaching-computers-to-read-google.html.

Google Inc. (2011). reCAPTCHA FAQ. http://www.google.com/recaptcha/faq.

Gossweiler, R., Kamvar, M., and Baluja, S. (2009). What’s up CAPTCHA?: a CAPTCHA based on image orientation. In Proceedings of the 18th international conference on World wide web, pages 841–850. ACM.

Gupta, A., Jain, A., Raj, A., and Jain, A. (2009). sequenced tagged Captcha: generation and its analysis. In Advance Computing Conference, 2009. IACC 2009. IEEE International, pages 1286–1291. IEEE.


Hernandez-Castro, C. and Ribagorda, A. (2009). Remotely telling humans and computers apart: An unsolved problem. iNetSec 2009–Open Research Problems in Network Security, pages 9–26.

Hernandez-Castro, C. and Ribagorda, A. (2010). Pitfalls in CAPTCHA design and implementation: The Math CAPTCHA, a case study. Computers & Security, 29(1):141–157.

Hernandez-Castro, C., Ribagorda, A., and Saez, Y. (2009). Side-channel attack on labeling captchas. Arxiv preprint arXiv:0908.1185.

Hindle, A., Godfrey, M., and Holt, R. (2008). Reverse Engineering CAPTCHAs. In 2008 15th Working Conference on Reverse Engineering, pages 59–68. IEEE.

Holland, J. and Holland, J. (1975). Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. University of Michigan Press.

Hoque, M., Russomanno, D., and Yeasin, M. (2006). 2d captchas from 3d models. In SoutheastCon, 2006. Proceedings of the IEEE, pages 165–170. IEEE.

Huang, S., Lee, Y., Bell, G., and Ou, Z. (2010). An efficient segmentation algorithm for CAPTCHAs with line cluttering and character warping. Multimedia Tools and Applications, 48(2):267–289.

Jeng, A., Tseng, C., Tseng, D., and Wang, J. (2010). A study of CAPTCHA and its application to user authentication. Computational Collective Intelligence. Technologies and Applications, pages 433–440.

Jokela, T., Iivari, N., Matero, J., and Karukka, M. (2003). The standard of user-centered design and the standard definition of usability: analyzing iso 13407 against iso 9241-11. In Proceedings of the Latin American conference on Human-computer interaction, CLIHC ’03, pages 53–60, New York, NY, USA. ACM.

Kim, J., Chung, W., and Cho, H. (2010). A new image-based CAPTCHA using the orientation of the polygonally cropped sub-images. The Visual Computer, 26(6):1135–1143.


Kluever, K. and Zanibbi, R. (2009). Balancing usability and security in a video CAPTCHA. In Proceedings of the 5th Symposium on Usable Privacy and Security, pages 1–11. ACM.

Lanapsoft, Inc (2011). BotDetect. http://captcha.biz/.

Ledgeway, T. and Smith, A. T. (1994). Evidence for separate motion-detecting mechanisms for first- and second-order motion in human vision. Vision Research, 34(20):2727–2740.

Lerdorf, R., Tatroe, K., and MacIntyre, P. (2006). Programming PHP. O’Reilly Series. O’Reilly.

Lillibridge, Mark D., A. M. B. K. B. A. Z. (2001). Method for selectively restricting access to computer systems. http://www.freepatentsonline.com/6195698.html.

Lopresti, D. (2005). Leveraging the CAPTCHA problem. Human Interactive Proofs, pages 97–110.

Lupkowski, P. and Urbanski, M. (2008). Semcaptcha - user-friendly alternative for ocr-based captcha systems. In Computer Science and Information Technology, 2008. IMCSIT 2008. International Multiconference on, pages 325–329.

Mahfoud, S. (1995). Niching methods for genetic algorithms. Urbana, 51(95001).

Mitra, N., Chu, H., Lee, T., Wolf, L., Yeshurun, H., and Cohen-Or, D. (2009). Emerging images. ACM Transactions on Graphics (TOG), 28(5):1–8.

Morein, W., Stavrou, A., Cook, D., Keromytis, A., Misra, V., and Rubenstein, D. (2003). Using graphic turing tests to counter automated ddos attacks against web servers. In Proceedings of the 10th ACM conference on Computer and communications security, pages 8–19. ACM.

Mori, G. and Malik, J. (2003). Recognizing objects in adversarial clutter: Breaking a visual CAPTCHA.

Naor, M. (1997). Verification of a human in the loop or Identification via the Turing Test. Unpublished manuscript.

OCR Research Team (2006). http://ocr-research.org.ua/teabag.html.


Raj, A., Jain, A., Pahwa, T., and Jain, A. (2009). Analysis of tagging variants of Sequenced Tagged Captcha (STC). In Science and Technology for Humanity (TIC-STH), 2009 IEEE Toronto International Conference, pages 427–432. IEEE.

reCAPTCHA Team (2011). personal communication.

Regan, D. (1989). Orientation discrimination for objects defined by relative motion and objects defined by luminance contrast. Vision Research, 29(10):1389–1400.

Regan, D. and Hamstra, S. (1991). Shape discrimination for motion-defined and contrast-defined form: squareness is special. Perception, 20:315–336.

Rui, Y. and Liu, Z. (2004). ARTiFACIAL: Automated reverse Turing test using FACIAL features. Multimedia Systems, 9(6):493–502.

Schreier, H., Orteu, J.-J., Sutton, M. A., Michael A., M. A., Orteu, J.-J., and Schreier, H. W. (2009). Digital image correlation (dic). In Image Correlation for Shape, Motion and Deformation Measurements, pages 1–37. Springer US. 10.1007/978-0-387-78747-3_5.

Shapiro, L. and Stockman, G. (2001). Computer Vision. 2001. Prentice Hall.

Shirali-Shahreza, M. and Shirali-Shahreza, S. (2007). Question-Based CAPTCHA. In iccima, pages 54–58. IEEE Computer Society.

Shirali-Shahreza, M. and Shirali-Shahreza, S. (2008). Motion CAPTCHA. In Human System Interactions, 2008 Conference on, pages 1042–1044. IEEE.

Smeulders, A., Worring, M., Santini, S., Gupta, A., and Jain, R. (2000). Content-Based Image Retrieval at the End of the Early Years. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12):1349.

Sonka, M., Hlavac, V., and Boyle, R. (2008). Image processing, analysis, and machine vision (third edition). International Thomson.

Stevanović, R. (2011). Quantum Random Bit Generator Service. http://random.irb.hr.


Takagi, H., Unemi, T., and Terano, T. (1998). Interactive evolutionary computation. In Proc. Int. Conf. Soft Comput. Inf./Intell. Syst, pages 41–50.

Turing, A. (1950). Computing machinery and intelligence. Mind, 59:433–460.

Veret, A. (2009). JCaptcha. http://jcaptcha.sourceforge.net.

Von Ahn, L., Blum, M., Hopper, N., and Langford, J. (2003). CAPTCHA: Using hard AI problems for security. Advances in Cryptology - EUROCRYPT 2003, pages 646–646.

Von Ahn, L., Blum, M., and Langford, J. (2004). Telling humans and computers apart automatically. Communications of the ACM, 47(2):56–60.

Von Ahn, L. and Dabbish, L. (2004). Labeling images with a computer game. In Proceedings of the SIGCHI conference on Human factors in computing systems, pages 319–326. ACM.

Von Ahn, L., Maurer, B., McMillen, C., Abraham, D., and Blum, M. (2008). recaptcha: Human-based character recognition via web security measures. Science, 321(5895):1465.

Wang, L., Siegel, H., Roychowdhury, V., and Maciejewski, A. (1997). Task matching and scheduling in heterogeneous computing environments using a genetic-algorithm-based approach. Journal of Parallel and Distributed Computing, 47(1):8–22.

Wilkins, J. (2009). Strong captcha guidelines v1.2.

Ximenes, P., dos Santos, A., Fernandez, M., and Celestino, J. (2006). A CAPTCHA in the Text Domain. In On the Move to Meaningful Internet Systems 2006: OTM 2006 Workshops, pages 605–615. Springer.

Yamamoto, T., Tygar, J., and Nishigaki, M. (2010). Captcha using strangeness in machine translation. Advanced Information Networking and Applications, International Conference on, 0:430–437.

Yan, J. (2009). Bot, Cyborg and Automated Turing Test. In Security Protocols,pages 190–197. Springer.


Yan, J. and El Ahmad, A. (2007). Breaking visual captchas with naive pattern recognition algorithms. In acsac, pages 279–291. IEEE Computer Society.

Yan, J. and El Ahmad, A. (2009). CAPTCHA security: a case study. Security & Privacy, IEEE, 7(4):22–28.

Yan, J. and El Ahmad, A. S. (2008). Usability of captchas or usability issues in captcha design. In Proceedings of the 4th symposium on Usable privacy and security, SOUPS ’08, pages 44–52, New York, NY, USA. ACM.

Appendix A

Contents of the Attached CD

All software produced in the course of this project is available on the attached compact disk. Please note that running the PHP scripts requires a working Apache server with the required software installed. Please refer to Appendix B for more information.

Notable directories and files are explained below.

/readme.txt

Read-me file containing detailed instructions for installing and running the software.

/sqldump.sql.gz

MySQL dump required to run the testing, preview and administration environments. Also contains all test results.

/source/

Contains the source code of the developed software.

/source/conf.php

Main configuration file, containing among other things the DSN (Database Source Name).

/source/lib/

Contains third party libraries used by the software (please refer to Appendix D for the definitive list).

/doc/index.html

Contains API documentation for the software (in HTML format, generated automatically with PhpDocumentor).

Appendix B

Software Requirements

The following software needs to be installed on a computer to run the code on the attached CD. Alternatively, the software can also be found on-line on a preconfigured server. Please see Appendix C for specifics.

Apache HTTP Server
More information: http://httpd.apache.org

PHP Runtime (5.2.17+)
More information: http://www.php.net

MySQL Client Library for PHP (5.0+)
Not required for PHP 5.3+. More information: http://dev.mysql.com/

Appendix C

Finding Materials On-line

Since the software can be difficult to run from the attached compact disk, everything on the CD can also be found on-line.

Public SVN repository with source code (hosted at SourceForge)
https://svn.code.sf.net/p/kundi-msc/code

Server with working code set up
http://captcha.ivokund.eu/Preview - Preview environment
http://captcha.ivokund.eu/Tests - Testing environment
http://captcha.ivokund.eu/Admin - Administration environment

On-line API documentation
http://captcha.ivokund.eu/doc/

Downloadable source code
http://captcha.ivokund.eu/source/

CD image file of the attached disk
http://captcha.ivokund.eu/cd.iso

Appendix D

List of Used 3rd Party Libraries

The following 3rd party libraries were used and are included in the programme code (CD and on-line resources).

Zend Framework for PHP
http://framework.zend.com/
Copyright © 2006 - 2011 by Zend Technologies Ltd.
Licensed under New BSD License

jQuery JavaScript Library
http://jquery.com/
Copyright © 2011, John Resig
Dual licensed under the MIT or GPL Version 2 licenses.

jQuery Cookie plugin
https://github.com/carhartl/jquery-cookie/
Copyright © 2010 Klaus Hartl
Dual licensed under the MIT and GPL licenses

jQuery qTip2 plugin
https://github.com/craga89/qtip2/
Copyright © 2009-2010 Craig Michael Thompson
Dual licensed under MIT or GPLv2 licenses

Appendix E

Group Rotation Procedure

This is the most complex of all particle operations. Code listing taken from /Captcha/Effect/Variants/Rotation.php. Please see Section 4.3.3 with Equation 4.16 for more information.

/**
 * Apply group rotation to a particle
 *
 * @param Captcha_Elements_Particle $oParticle
 */
private function applyGroupRotation(Captcha_Elements_Particle $oParticle)
{
    // contains the maximum and minimum coordinates of particles in this layer
    $aLayerMaxMinCoords = $this->aParams['particle_coords'];

    // calculate the local centre positions for layers
    $iLocalCentreX = ($aLayerMaxMinCoords['x_max'] - $aLayerMaxMinCoords['x_min'])
        / 2 + $aLayerMaxMinCoords['x_min'];
    $iLocalCentreY = ($aLayerMaxMinCoords['y_max'] - $aLayerMaxMinCoords['y_min'])
        / 2 + $aLayerMaxMinCoords['y_min'];

    // make sure we won't divide by 0 later
    $oParticle->getX() == $iLocalCentreX && $iLocalCentreX++;
    $oParticle->getY() == $iLocalCentreY && $iLocalCentreY++;

    // adjust position to centre
    $iCentreX = $oParticle->getX() - $iLocalCentreX;
    $iCentreY = $oParticle->getY() - $iLocalCentreY;

    // we'll add this rotation to the layer
    $dRotationToAdd = $this->aRotationByFrames[$this->iCurrentFrame];

    // get current rotation
    $dRotationCurrent = atan($iCentreY / $iCentreX);
    $dRotationTarget = $dRotationCurrent + $dRotationToAdd;

    // calculate the hypotenuse
    $dLengthFromZero = $iCentreY / sin($dRotationCurrent);

    // calculate new positions for the start coordinates
    $iNewX = cos($dRotationTarget) * $dLengthFromZero + $iLocalCentreX;
    $iNewY = sin($dRotationTarget) * $dLengthFromZero + $iLocalCentreY;

    $oParticle->setCoords($iNewX, $iNewY);
}

Listing E.1: Group rotation procedure, applied to a single particle
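For comparison, the same rotation can be written without the angle and hypotenuse round-trip. By the angle addition identities, rotating the centred point (x, y) by an angle d gives (x cos d − y sin d, x sin d + y cos d), so no division is needed and a particle lying exactly on the centre simply stays in place. The following stand-alone PHP function is a minimal sketch of that alternative; the name rotateAroundCentre and its parameters are hypothetical and do not appear in the project code.

```php
<?php
// Hypothetical alternative to the listing above (not project code):
// rotate a point around a centre using the 2D rotation matrix.

/**
 * Rotate point ($dX, $dY) by $dAngle radians around ($dCentreX, $dCentreY).
 *
 * @return float[] array($dNewX, $dNewY)
 */
function rotateAroundCentre($dX, $dY, $dCentreX, $dCentreY, $dAngle)
{
    // position relative to the rotation centre
    $dRelX = $dX - $dCentreX;
    $dRelY = $dY - $dCentreY;

    $dCos = cos($dAngle);
    $dSin = sin($dAngle);

    // apply the rotation matrix, then move back to absolute coordinates
    return array(
        $dRelX * $dCos - $dRelY * $dSin + $dCentreX,
        $dRelX * $dSin + $dRelY * $dCos + $dCentreY,
    );
}

// Rotating (2, 1) by 90 degrees around (1, 1) gives (1, 2).
list($dNewX, $dNewY) = rotateAroundCentre(2.0, 1.0, 1.0, 1.0, M_PI / 2);
printf("%.3f, %.3f\n", $dNewX, $dNewY); // prints "1.000, 2.000"
```

Because nothing is divided by the centred coordinates, this form does not need the step that shifts the local centre by one pixel when a particle coincides with it.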

Appendix F

Database Schema (User Tests)


Appendix G

Parameters of Fittest Individuals

Symbol   Island 1   Island 3   Island 5   Island 6   Island 7
a03      41.504     34.68      43.117     39.082     47.87
a07      565.118    581.558    730.193    629.973    672.891
a08      8.148      17.587     9.872      20         7.271
a11      0.685      0.64       0.702      0.885      0.828
a13      7.376      14.893     50.413     44.795     66.953
a14      75.079     40.844     0.044      3.593      47.769
a15      0.004      0.002      0.006      0.002      0.002
a16      2.127      5.256      3.831      1.754      1
a17      0.317      0.452      0.457      0.437      0.41
a18      0.247      0.1        0.285      0.286      0.194
a19      75.849     21.998     70.391     59.515     50.69
a20      0.593      0.611      0.455      0.419      0.518
a21      0.396      0.422      0.516      0.327      0.568
a22      20.141     15.653     24.954     5          17.093
a23      0.184      0.457      0.204      0.406      0.01
a24      56.119     100        11.946     97.837     45.436
a25      0.293      0.243      0.718      0.373      0.333
a26      33.384     50.077     53.274     60.815     46.715

Table G.1: Parameter values for the fittest individuals on the 5 fittest islands.
