

Page 1: iTrustPage: A User-Assisted Anti-Phishing Tool · 2018-01-04 · iTrustPage is a Web browser extension that prevents users from entering any information into suspicious Web forms

iTrustPage: A User-Assisted Anti-Phishing Tool

Troy Ronda
Dept. of Computer Science

University of Toronto

Stefan Saroiu
Dept. of Computer Science

University of Toronto

Alec Wolman
Microsoft Research

ABSTRACT
Despite the many solutions proposed by industry and the research community to address phishing attacks, this problem continues to cause enormous damage. Because of our inability to deter phishing attacks, the research community needs to develop new approaches to anti-phishing solutions. Most of today’s anti-phishing technologies focus on automatically detecting and preventing phishing attacks. While automation makes anti-phishing tools user-friendly, automation also makes them suffer from false positives, false negatives, and various practical hurdles. As a result, attackers often find simple ways to escape automatic detection.

This paper presents iTrustPage – an anti-phishing tool that does not rely completely on automation to detect phishing. Instead, iTrustPage relies on user input and external repositories of information to prevent users from filling out phishing Web forms. With iTrustPage, users help to decide whether or not a Web page is legitimate. Because iTrustPage is user-assisted, iTrustPage avoids the false positives and the false negatives associated with automatic phishing detection. We implemented iTrustPage as a downloadable extension to FireFox. After being featured on the Mozilla website for FireFox extensions, iTrustPage was downloaded by more than 5,000 users in a two week period. We present an analysis of our tool’s effectiveness and ease of use based on our examination of usage logs collected from the 2,050 users who used iTrustPage for more than two weeks. Based on these logs, we find that iTrustPage disrupts users on fewer than 2% of the pages they visit, and the number of disruptions decreases over time.

Categories and Subject Descriptors
C.2.0 [Computer-Communication Networks]: General—Security and protection

General Terms
Security, Experimentation, Measurement

Keywords
phishing, anti-phishing

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
EuroSys’08, April 1–4, 2008, Glasgow, Scotland, UK.
Copyright 2008 ACM 978-1-60558-013-5/08/04 ...$5.00.

1. INTRODUCTION
Today’s anti-phishing tools have done little to stop the proliferation of phishing attacks. Relying on automatic ways to identify phishing attacks makes these tools suffer from both false positives and false negatives. For example, attackers can easily adapt to automatic spam and phishing email filters and find new ways to bypass them. Other approaches to defend against phishing, such as automatic password managers, face significant practical hurdles: they must allow users to temporarily deactivate them to handle backward compatibility issues or password resets. Many banks use visual cues such as customized login pages to prevent phishing, yet inattentive users can still be phished. While automation makes these anti-phishing tools more user-friendly, automation often introduces simple alternate ways for phishing attacks to continue to be viable in practice.

Despite the many automatic anti-phishing solutions, the phishing threat has continued to gain momentum. The following alarming trends indicate that our reliance on automatic tools to detect and prevent phishing attacks has had little effect. Phishing toolkits have started to become available: RSA recently uncovered a toolkit which displays the current version of a targeted Web page, yet copies any data entered to the phisher [6]. This “Universal Man-in-the-Middle Phishing Kit” sells for approximately $1,000, and it allows phishers to get started by just entering a few parameters. Phishing messages continue to be sent in great volume: Symantec observed 166,248 unique phishing messages during the last six months of 2006; on average, this equates to 904 unique phishing messages per day [26]. Symantec’s anti-phishing filters blocked an average of 8.48 million phishing messages per day over this time period. The number of phishing Web sites continues to grow: There were 27,221 new unique phishing Web sites reported in January, 2007 [1]; this is nearly three times the number of new unique phishing Web sites reported in January, 2006. Some in the banking industry claim that phishing is quickly becoming the single most important source of fraud they deal with [2]. Phishing e-mails are becoming sophisticated and hard to filter: Phishing messages have started to include personalized information, specific to the targeted user [17]. We believe that the research community must investigate new approaches for anti-phishing solutions to reverse the current trends.

In this paper, we present iTrustPage – a novel tool that does not completely rely on automation to detect phishing. Instead, iTrustPage relies on user input to decide on the legitimacy of a Web form. iTrustPage also uses external information repositories on the Internet to help the user with decision-making. A variety of sources of information on the Internet (e.g., a search engine such as Google) can be used to help establish the legitimacy of a particular Web page and/or Web form.


iTrustPage is a Web browser extension that prevents users from entering any information into suspicious Web forms. iTrustPage intercepts the user input and prompts the user for search terms that describe the Web page the user intended to visit. With these search terms, iTrustPage performs a Google search and validates the Web form only if it appears among the top search results1. If the suspicious Web form does not appear in the top search results, iTrustPage presents visual previews of those pages that do appear in the top search results, and asks the user whether any of them visually match the current form. If a match is found, the current form is likely to be a phishing attack. This step exemplifies one of the key benefits of user assistance. The task of deciding whether the visual appearance of two Web pages is similar can be very hard for programs to get right, yet is a relatively easy task for a user. When the user indicates a visual match, iTrustPage redirects the user to the form selected from the search results. In this way, iTrustPage not only prevents users from filling out phishing forms, it also helps them locate the corresponding legitimate Web form.

In some cases, the search terms might not lead to any search results where the legitimate Web forms are visually similar to the current suspicious form. We believe that this type of limitation is fundamental to all anti-phishing tools – they all encounter ambiguity in certain cases when attempting to determine whether a site is legitimate. Unfortunately, there is no easy way to deal with this case. One conservative approach is to always block users from filling out such forms. This option might appeal to corporate system administrators. Another approach is to raise a warning before allowing the user to fill out the form. However, warnings have been shown to be ineffective: most users ignore them when they do not understand the implications [5, 27, 8]. While iTrustPage supports both options, the default option in our current implementation is to raise a warning.

iTrustPage uses two simple mechanisms to reduce how often users need to be involved in determining the legitimacy of a Web form. First, iTrustPage uses a whitelist of 467 domain names selected from the top 500 most popular Web sites as reported by Alexa.com (some of these Web sites belong to the same domain name). Second, iTrustPage uses a cache recording all the previously visited Web forms. In this way, users are never interrupted when filling out a form on a Web page whose domain appears in the whitelist or upon subsequent re-visits. Because these mechanisms introduce automation in our tool, we deliberately kept them simple and unsophisticated to minimize (and hopefully eliminate) the possibility of introducing false negatives when detecting whether a form is phishing. As our evaluation will show, we found that both the whitelist and the cache work surprisingly well in practice.

We implemented iTrustPage as a downloadable extension for the FireFox Web browser. iTrustPage is available for download from the Mozilla website for add-ons; once posted on this site, iTrustPage was downloaded by over 5,000 people in less than two weeks. We present an analysis of our tool’s effectiveness based on reported usability statistics collected from our deployed software. We show that iTrustPage disrupts users on less than 2% of the pages they visit and the number of disruptions decreases over time. We also find that iTrustPage resorts to its user-assisted validation scheme on 32.6% of the pages that users type into; the remaining pages are validated by iTrustPage’s whitelist and cache. Our current implementation collects these statistics and anonymizes them to protect our users’ identities; it also allows users to disable reporting entirely.

1Currently, iTrustPage uses the top 10 results returned by Google.

2. COMPARING AUTOMATION TO USER-ASSISTANCE FOR ANTI-PHISHING TOOLS

In this section, we illustrate the problems associated with automated anti-phishing tools with a case study: we investigate the effectiveness of the SpamAssassin e-mail filter. SpamAssassin is a very popular open-source spam and phishing e-mail filter. In our experiments, we use four different corpora of e-mail (including messages that are legitimate, phishing, and spam) collected from a public repository of e-mail suitable for testing anti-spam and anti-phishing e-mail filters [21]. While e-mail filters are just one of the many tools in our anti-phishing arsenal, we believe that the problems they exhibit are representative of a wider range of issues associated with automated anti-phishing techniques. We then discuss why user-assisted tools have some inherent advantages over fully-automated tools. We delay our in-depth discussion of the previously proposed anti-phishing techniques and their benefits and drawbacks until Section 5.

2.1 Automation Introduces False Negatives and False Positives

To illustrate the presence of false negatives in automated phishing e-mail filters, we performed the following experiment. We ran a very recent version of SpamAssassin (version 3.1.8) over a relatively old corpus of 414 phishing e-mails collected between November 2004 and June 2005 [21]. We use the default configuration of SpamAssassin to identify phishing e-mails. Despite this corpus being more than two years old, SpamAssassin is still not very effective in classifying many of them as phishing. We find that 67 of the 414 phishing e-mail messages (16%) are labeled as legitimate by SpamAssassin.

We ran the same configuration of SpamAssassin over a corpus of 250 legitimate e-mails collected in 2004 from the same source [21]. Although each of these e-mail messages is legitimate, these messages are known to be hard to differentiate from spam and phishing e-mails because they contain unusual HTML markup, colored text, and phrases often found in illegitimate e-mail. Despite these e-mails being collected more than three years ago, SpamAssassin still labeled 8 of them (3.2%) as illegitimate.
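The reported rates are simple proportions over labeled corpora. As a sketch (our own helper, not part of SpamAssassin's tooling), the two error rates can be computed like this:

```python
def error_rates(predictions, truths):
    """Compute (false-negative rate, false-positive rate) for a binary
    phishing classifier. `predictions` and `truths` are parallel lists
    of booleans, where True means 'phishing'."""
    phishing = sum(1 for t in truths if t)
    legit = len(truths) - phishing
    # False negative: a phishing message labeled legitimate.
    fn = sum(1 for p, t in zip(predictions, truths) if t and not p)
    # False positive: a legitimate message labeled phishing.
    fp = sum(1 for p, t in zip(predictions, truths) if not t and p)
    return (fn / phishing if phishing else 0.0,
            fp / legit if legit else 0.0)
```

On the corpora above, 67 missed messages out of 414 phishing e-mails gives a false-negative rate of about 16%, and 8 flagged messages out of 250 legitimate e-mails gives a false-positive rate of 3.2%.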

2.2 Coping With False Positives and False Negatives Simultaneously

When anti-phishing tools suffer from false negatives, they may not be effective in protecting their users. In such cases, they just decrease the likelihood that their users will become victims of phishing; however, users must remain vigilant and find other means to protect themselves from phishing attacks. On the other hand, false positives introduce serious usability concerns. For example, users are reluctant to continue using anti-phishing e-mail filters that classify even a small quantity of legitimate e-mail as phishing.

Unfortunately, it is very hard for an automated tool to eliminate both its false negatives and false positives. To reduce the rate of false negatives, the only option is to make the tool more aggressive. Unfortunately, becoming more aggressive usually leads to an increase in the rate of false positives. We illustrate this tension with the following experiment.

We ran SpamAssassin over two e-mail data sets: (1) a newer corpus of phishing e-mails from the same source as the one examined earlier [21] containing 1,423 e-mails dated between November 2005 and August 2006; and (2) a corpus of 478 legitimate e-mail messages sent by one of the authors over the past few months. We ran SpamAssassin with three different configurations: the default;


[Figure 1: two bar charts over the Default, More Aggressive, and Most Aggressive SpamAssassin configurations. False negatives: Default 29.7%, More Aggressive 21.1%, Most Aggressive 8.7%. False positives: Default 0.0%, More Aggressive 0.4%, Most Aggressive 15.5%.]

Figure 1: The trade-off between false positives and false negatives for SpamAssassin. We ran SpamAssassin with three configurations: the default one; a more aggressive configuration; and the most aggressive configuration. We increased SpamAssassin’s aggressiveness by decreasing its threshold for labeling an e-mail as phishing, from 5 to 3 to 1, respectively. On the left, we present the rate of false negatives in a corpus of 1,423 phishing e-mails collected between November 2005 and August 2006. On the right, we present the rate of false positives in a corpus of 478 legitimate e-mails sent by one of the authors over the past few months.

[Figure 2: bar chart of false-negative rates. Older dataset: SpamAssassin as of 2004 45.4%, as of 2007 30.7%. Newer dataset: SpamAssassin as of 2004 51.2%, as of 2007 45.6%.]

Figure 2: Phishing adapts to SpamAssassin. We ran SpamAssassin over two phishing data sets: an older data set collected between November 2004 and June 2005, and a newer data set collected between June 2005 and November 2005. We ran two versions of SpamAssassin – a version released in 2004 (version 3.0.1) and a very recent version released in 2007 (version 3.1.8). The rate of false negatives is higher in the newer phishing data set for both versions of SpamAssassin.

a more aggressive configuration; and the most aggressive configu-ration. We increased SpamAssassin’s aggressiveness by decreasingits threshold for labeling an e-mail as phishing. Figure 1 presentsour results. Making SpamAssassin more aggressive has the ex-pected effects: it reduces the rate of false negatives (as seen on theleft) while simultaneously increasing the rate of false positives (asseen on the right).

2.3 Phishers Adapt to Automated Phishing Filters

Another problem with automation is that phishing attacks tend to evolve and adapt to the automated filters. We illustrate this problem with the following experiment. We ran SpamAssassin over the two phishing data sets used in the experiments above [21]: an older data set collected between November 2004 and June 2005, and a newer data set collected between June 2005 and November 2005. We ran two versions of SpamAssassin – an older version released in 2004 (version 3.0.1) and a recent version released in 2007 (version 3.1.8). We present the rate of false negatives in Figure 2.

For each of the two data sets, the most recent SpamAssassin has fewer false negatives. As expected, SpamAssassin is trying to adapt to incoming phishing e-mail – its recent version is better than the older one. On the other hand, we also see that the rate of false negatives increases in the newer data sets for both versions of SpamAssassin. This illustrates that phishing attacks are adapting to bypass SpamAssassin’s automated filters.

2.4 The Benefits of User-Assistance for Anti-Phishing Tools

We believe that tools (such as iTrustPage) that do not completely rely on automatic techniques for phishing detection have three key advantages over automated tools. First, we believe that false positives are rare and, if they do occur, they are less likely to irritate users. When a user is involved in the validation process, a false positive is a legitimate e-mail or Web page that the user helped to label as phishing. If they made a mistake, we believe that users are less likely to complain and stop using the tool. Because false positives are less of a concern, user-assisted anti-phishing tools can afford to make their automated detection very aggressive. If an e-mail (or a Web page) is deemed suspicious, such tools can then involve the users in a manual validation process.

Second, we believe it is more challenging for phishing attacks to adapt to user-assisted validation than to automated validation. Automatic phishing detection must examine human-readable content and classify it as legitimate or suspicious. This is a very hard problem; as an example, there is debate in the machine learning community whether machine learning algorithms can be made secure against adversaries [4]. On the other hand, it is easier for people to interpret and understand human-readable content. While still imperfect, people are much better at understanding the meaning of e-mails and Web pages than programs.

Third, user-assisted tools naturally support content in many languages whereas many automated anti-phishing tools have difficulty handling such content. For example, automated e-mail filters written to handle English e-mails cannot be directly ported to other languages. Instead, they must incorporate a whole new set of automatic rules and heuristics that correspond to the characteristics of phishing e-mails in those other languages.

3. THE DESIGN OF iTrustPage
In this section, we present the design and implementation of iTrustPage, our tool that prevents users from filling out phishing Web forms and then assists them in finding the corresponding legitimate Web form. When encountering a suspicious Web page,


iTrustPage does not simply warn the user – instead it offers them corrective action. The design of iTrustPage is based on three observations: (1) we can rely on users to assist with the process of deciding whether or not a Web page is legitimate, as there are certain tasks that are easy for people to do, yet are very difficult to automate reliably; (2) we can assist users by locating legitimate pages for them; and (3) we can use external information repositories, such as Internet search engine results, to assist with the process of deciding whether or not a given Web page is legitimate. User input is needed for two tasks: (1) describing search terms for the Web form they are filling in to see whether or not it is a well-known and established Web page; and (2) performing visual comparisons between the Web form they are filling in and the top Web forms related to their supplied search terms. The external information repositories used by iTrustPage are simply Google’s search index and the whitelist of known trustworthy web pages.

The remainder of this section presents a step-by-step description of how iTrustPage works, along with the intuition behind each step as well as presenting possible alternatives.

3.1 Automatic Classification
For first-time visits to a particular Web form, iTrustPage uses a built-in whitelist of popular legitimate sites. For subsequent re-visits, iTrustPage maintains a local cache of all previously validated Web forms visited by the user, as well as those forms manually approved by the user. In this way, iTrustPage never disrupts users when they visit a page on a popular, well-established domain or when they re-visit a page subsequently. Automatic validation makes iTrustPage easier to use: the tool remains transparent when these two heuristics determine that a form is legitimate.
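A minimal sketch of these two automatic checks follows; the class and method names are ours, not taken from the iTrustPage source:

```python
class AutoValidator:
    """Sketch of iTrustPage's two automatic checks: a fixed whitelist
    of popular domains and a cache of previously validated forms."""

    def __init__(self, whitelist):
        self.whitelist = set(whitelist)  # e.g. domains drawn from Alexa's top 500
        self.cache = set()               # URLs of forms already validated

    def is_validated(self, domain, form_url):
        # Never interrupt the user on a whitelisted domain,
        # or on a form they have already validated before.
        return domain in self.whitelist or form_url in self.cache

    def remember(self, form_url):
        # Record a form once it is validated (automatically or with the
        # user's help), so subsequent re-visits are not disrupted.
        self.cache.add(form_url)
```

Only when both checks fail does the tool fall back to the interactive validation described next.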

3.2 Interactive Page Validation
When the automatic classification step fails to validate a particular Web form, iTrustPage involves the user in the validation process. Whenever users attempt to fill out an unvalidated Web form2, iTrustPage presents an overlay on the browser window that asks them to describe the form they intend to fill in, as if they are entering search terms for a search engine. iTrustPage uses the search terms to issue a search query using Google. If the top 10 search results contain the form’s Web site (i.e., its domain name), then iTrustPage infers that the Web site that serves this form is legitimate, and therefore the user is allowed to proceed and fill out the form. In some cases, the user reaches a suspicious Web form by navigating to it from the Google search page. For these forms, the user has already assisted iTrustPage, albeit unknowingly, to validate this page. Thus, iTrustPage does not ask the user to perform an additional Google search.

The fact that the site appears in the top 10 search results means that the Google crawler indexed the site, that many other sites link to it, and that the site is most likely not short-lived. Since the user selected the search terms that led to the search results, this means that the form presumably matches the user’s intent. Once a form is deemed legitimate, iTrustPage remembers that decision and will not intervene again. As future work, we could improve this mechanism by including the query results from other search engines, looking at other information repositories, such as the Whois domain registration database, and investigating whether “top 10” is the best number of search results to check (we used this number because it is the first page of results).

If the Web site that hosts the form does not appear in the top search results, iTrustPage then fetches the page contents of the top three search results. iTrustPage presents the user with a visual preview of those pages and asks the user if any of the search result pages are visually similar to the Web form they intended to access. If the user detects a visual similarity, then the original form is probably a phishing form. Therefore, iTrustPage immediately redirects the user away from the original form to the legitimate form found in the search results. Figure 3 illustrates iTrustPage’s search interface overlay when visiting a questionable Web form (in this case http://www.heinket.de, a current phishing page spoofing PayPal). In fact, this Web form was phishing PayPal; once the user entered the search term “paypal”, iTrustPage previews the legitimate Web page, http://www.paypal.com.

2iTrustPage monitors keyboard and mouse input to detect when a user begins filling in a Web form.
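The core validation test — whether the site hosting the form appears among the top search results — reduces to a domain comparison. A sketch with simplified hostname handling (no public-suffix list; the helper names are ours):

```python
from urllib.parse import urlparse

def domain_of(url):
    """Extract the host from a URL (simplified: a leading 'www.' is
    stripped naively instead of consulting a public-suffix list)."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def site_in_top_results(form_url, result_urls, top_n=10):
    """Validate the form if its domain appears among the top N search
    results, as iTrustPage does with Google's first page of results."""
    target = domain_of(form_url)
    return any(domain_of(u) == target for u in result_urls[:top_n])
```

A search for "paypal" surfaces paypal.com in the top results, so the legitimate login form validates, while a spoof hosted on an unrelated domain does not.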

3.3 Revising Search Terms
In some cases, the user will not find the intended Web site among the top 10 search results. When this happens, the user is asked to refine the search and the previous step is repeated. Nevertheless, it is possible (although not very common) that users may never find the desired site among the top search results. In this case, iTrustPage is unable to determine whether or not the form is legitimate.

While we make every effort to ensure that it only occurs rarely, the problem that iTrustPage faces, of being unable to determine whether or not a particular Web form is legitimate, is common to all detection tools, whether they try to detect phishing, spam e-mail, or malware. Sometimes there is simply not enough information to reliably make the correct decision. To address such situations, iTrustPage does allow the user to bypass the search mechanism when the user is unable to find a legitimate site using visual comparison.
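Putting the preview step and the fallback together, the decision logic can be sketched as follows; the `ask_user_for_match` callback stands in for the real preview overlay, and all names are illustrative:

```python
def resolve_suspicious_form(form_url, preview_urls, ask_user_for_match):
    """If the user says one of the previewed result pages looks like the
    form they are on, treat the current form as phishing and redirect to
    the matching legitimate page; otherwise fall back to a warning."""
    match = ask_user_for_match(preview_urls)  # returns a URL or None
    if match is not None:
        # Likely phishing: send the user to the legitimate page instead.
        return ("redirect", match)
    # Ambiguous case: the default in the deployed tool is to warn, not block.
    return ("warn", form_url)
```

A conservative deployment could replace the `("warn", ...)` branch with a hard block, matching the option the paper describes for corporate administrators.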

3.4 Implementation Details
In this section, we present additional lower-level issues related to our tool’s implementation. iTrustPage is implemented as an extension to the FireFox Web browser. We have tested it on several commodity OSes, including Windows, Linux, and Mac OS X. Its source code is approximately 5,200 lines of code and it is freely available to download [3].

iTrustPage is configured to not block a user’s interaction with a Web form until they use the keyboard or bookmark the page. iTrustPage ignores common Firefox keyboard control characters such as the spacebar, the “Mac” key, and the control keys. iTrustPage can optionally be configured to block user interaction on page open; to only check pages with input boxes; and to only check when the user clicks on a form element. iTrustPage does not currently deal with embedded objects (e.g., Flash and ActiveX) because it does not receive their keyboard events. This is an implementation issue rather than a fundamental flaw in our design, and we intend to address this in a future version of iTrustPage.
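The trigger logic amounts to filtering input events before deciding whether to interrupt. A sketch of that filter (the set of ignored keys is illustrative, not taken from the extension's source):

```python
# Keys that control the browser rather than enter data into the form,
# so they should not trigger validation (illustrative set).
CONTROL_KEYS = {"Tab", "Shift", "Control", "Alt", "Escape", " "}

def should_intercept(key, form_validated):
    """Interrupt the user only when a data-entry key is pressed on a
    form that has not yet been validated by the whitelist or cache."""
    return not form_validated and key not in CONTROL_KEYS
```

Filtering control keys this way keeps ordinary navigation (tabbing between fields, shortcuts) from popping up the validation overlay.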

3.5 Deployment
Over the past year, we have released several versions of iTrustPage. In the early releases, we incorporated several automatic heuristics to validate Web forms. The first heuristic was based on Google’s PageRank. Google provides a Web service that takes as input a URL and returns the URL’s PageRank. The PageRank is a number between 0 (low) and 10 (high). Most sites have a PageRank of 0; only a few very popular sites have a PageRank higher than 8 (e.g., Google’s search page has a PageRank of 9). iTrustPage required a PageRank of at least 5 to automatically label a form as “validated”.

In addition to checking the PageRank of the actual form URL, iTrustPage also looked at the sequence of URLs accessed immediately preceding the form. If any previous URLs were served by the same site as the form, then the PageRank was calculated for each of


Figure 3: iTrustPage displaying its interface overlay. On the left, the user is visiting http://www.heinket.de, a well-known page phishing PayPal. Oncethe user enters their search terms (in this case “paypal”), iTrustPage previews the legitimate page, http://www.paypal.com (shown in the middle). On theright, iTrustPage informs the user about avoiding a suspicious page and being redirected to a legitimate one.

those previous URLs, and the maximum PageRank value was usedto make the decision. The intuition behind this process was thatoften the URL of a form at a popular site has a very low PageRank,yet the site homepage that contains a link to that form has a veryhigh PageRank.To improve iTrustPage’s performance, once a user visited a page

iTrustPage immediately prefetched the PageRank information,even if the user did not attempt to type anything on that page.In this way, iTrustPage could check whether the form had a highPageRank before the user started to fill it in. Without this prefetch,iTrustPage would have had to block the user from filling in theform briefly while retrieving its PageRank. We implemented thisprefetching step asynchronously to avoid interfering with the userexperience while loading the Web page.However, after we examined the logs of our early releases, we

decided to eliminate these automatic validation heuristics fromiTrustPage for three reasons. First, we found that the automaticheuristics were not effective; Web forms with high page ranks wereoften found in iTrustPage’s whitelist or in its cache. Thereforethese heuristics added little to iTrustPage’s overall effectiveness.Second, the automatic prefetch of the PageRank of every pageaccessed introduced serious privacy concerns. A small numberof users pointed out that using iTrustPage meant transmitting toGoogle the URL of every single page browsed. Third, we worriedabout a “Google bomb” attack – influencing the ranking of a pageby creating a large number of sites linking to it. By relying on anautomatic test, iTrustPage could be subject to false negatives andthereby fail to protect its users against phishing attacks.The heuristics based on PageRank would automatically validate

any visited page with a high PageRank. To be successful, attack-ers would only need to setup their phishing pages on Web pageswith high PageRank values. Instead, with the current version ofiTrustPage a “Google bomb” attack can only be successful whenthe phishing page appears in the top 10 search results returned byGoogle based on the user’s search terms. For example, to phishPayPal attackers must make their phishing pages be one of the top10 answers returned by Google for the search term PayPal. Whilenot impossible, we believe this attack will be very difficult to mountin practice. In summary, by removing these automatic heuristics wedid not significantly affect iTrustPage’s effectiveness, we improvediTrustPage’s users privacy, and we made “Google bomb” attacksmuch harder to mount in practice.
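The retired heuristic can be summarized in a short sketch. The helper names are ours, and the PageRank lookup (a Google Toolbar-style Web service in the paper) is stubbed out with a lookup table for illustration:

```python
# Sketch of iTrustPage's retired PageRank heuristic (since removed).
# `fetch_pagerank` stands in for the Google PageRank Web service the
# paper mentions; here it is just a table lookup for illustration.
from urllib.parse import urlparse

PAGERANK_THRESHOLD = 5  # forms at PageRank >= 5 were auto-validated

def fetch_pagerank(url, table):
    return table.get(url, 0)  # most URLs have a PageRank of 0

def auto_validate(form_url, history, table):
    """Return True if the retired heuristic would have validated the form.

    Besides the form URL itself, consider earlier URLs in the browsing
    history served by the same site, and use the maximum PageRank.
    """
    site = urlparse(form_url).netloc
    candidates = [form_url] + [u for u in history if urlparse(u).netloc == site]
    best = max(fetch_pagerank(u, table) for u in candidates)
    return best >= PAGERANK_THRESHOLD

# Example: the form URL itself is obscure (PageRank 0), but the site
# homepage that linked to it is popular, so the form is validated.
ranks = {"http://bank.example/": 7}
history = ["http://news.example/", "http://bank.example/"]
print(auto_validate("http://bank.example/login/form", history, ranks))  # True
```

This also makes the "Google bomb" weakness concrete: any page that acquires a high enough PageRank, by whatever means, passes the threshold test.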

Figure 4: The number of daily installations of iTrustPage. While iTrustPage was released in early June, the FireFox website did not post a link to iTrustPage until June 27th. The number of fresh installs grew by two orders of magnitude the next day.

The most recent version of iTrustPage was released in early June 2007 and appeared on the FireFox website for third-party extensions on June 27th, 2007. All the results in this paper are based on logs received before August 9th, 2007. During this period, we received logs from 5,184 users. Figure 4 shows the number of installations of iTrustPage since July 1st, 2007.

3.6 Circumventing iTrustPage

In this section, we describe how phishers might try to circumvent the detection algorithm implemented by iTrustPage. We anticipate three types of attacks.

One way to circumvent iTrustPage is to set up a phishing site in a domain listed in iTrustPage's whitelist or in its cache. For example, an attacker can break into a popular Web site and replace one of its pages with a phishing form. While this attack is possible, most popular sites are typically well monitored, and such an attack would likely not go undetected for long. Another way to circumvent iTrustPage is using the "Google bomb" attack that we described earlier.

iTrustPage also relies on an implicit assumption: the user's browser has not been compromised. If the browser is compromised, then an attacker can easily disable iTrustPage's functionality. From the beginning, we decided not to engineer against such attacks: if the browser is compromised, the user is subject to more harmful attacks, such as malware, viruses, spyware, or Trojan horses.

Table 1: High-Level Statistics of Long-Lived Users. Our evaluation is based on users who used iTrustPage for at least two weeks.

# of Users with 2+ Weeks of Activity        2,050
Total # of Pages Browsed                    796,038
# of Hours Browsed in Total by All Users    42,928 (almost 5 years)
Total # of Pages Typed Into                 27,936

4. EVALUATION

The goal of this section is to evaluate how well iTrustPage works. For this, we attempt to answer the following six questions:

1. While most phishing attacks target passwords, divulging any kind of information to a malicious third party can raise serious privacy concerns. In addition to passwords, phishing attacks have been known to target social security numbers, addresses, birth dates, and other types of personal information. Thus, users risk becoming the victim of a phishing attack whenever they enter personal information into a Web page. How often do users visit pages where they can input information?

2. When filling out suspicious Web pages, iTrustPage interrupts users, asking them to get involved in the validation process. If these disruptions occur too frequently, users might stop using iTrustPage. How disruptive is iTrustPage to the users?

3. iTrustPage uses a whitelist and a cache to minimize the number of disruptions; if a Web page appears in the whitelist or the cache, iTrustPage will not stop users from entering information into this Web page. How effective are iTrustPage's whitelist and cache?

4. When a user starts to enter information on a Web page that does not appear in the whitelist or in the cache, iTrustPage deems the page suspicious. For suspicious Web pages, iTrustPage requires user assistance. Users must either validate this page by providing search terms to iTrustPage, which then performs a Google search, or users choose to bypass iTrustPage's validation process. There is one exception: if the way that the user arrived at the suspicious Web page in the first place was by performing a Google search, then the user has already provided assistance to iTrustPage and therefore iTrustPage does not ask the user to validate. How often does iTrustPage validate pages?

5. When validating the form with iTrustPage, users enter search terms describing the Web page they intended to type into. If the intended page is not found based on the search terms entered, users can continue to refine their searches until either the intended page is found or the users choose to bypass the validation process. How many searches do users perform until they validate their pages?

6. Ultimately, iTrustPage's effectiveness is determined by whether it stops users from becoming victims of phishing attacks. Did iTrustPage prevent any users from becoming phishing victims?

We answer these questions by examining the logs returned by the users of iTrustPage. These users installed our tool by downloading it either from the FireFox website for third-party extensions or directly from our project Web site at the University of Toronto. After presenting a high-level characterization of these logs, we answer each of the six questions above in detail.

4.1 High-Level Characterization of iTrustPage's Logs

Some of the installations of iTrustPage were short-lived – some users ended up uninstalling our tool after only a few hours or a few days. We removed these short-lived installations from our evaluation because we want to capture the tool's effectiveness in steady-state, over the course of several weeks of browsing activity. In the remainder of this evaluation, all the results we present are based on the logs sent by users who used iTrustPage for at least two weeks after installation. Table 1 presents some high-level statistics about these long-lived installations.

4.2 How often do users visit Web pages which accept input?

Users enter information into Web pages by typing in HTML forms or in embedded scripts that display forms. While not all forms and embedded scripts are used for entering text in a Web page, measuring their presence on the Web can give us an upper bound on how often users visit Web pages which accept input. If few browsed pages include forms or embedded scripts, the risk of being phished is low. To determine the prevalence of forms and scripts on the Web, we examined how widespread the use of forms and embedded scripts is in the set of pages browsed by our users. We determine whether pages contain a form or embedded script by searching for the HTML "<form>" tag and the HTML "<script>" tag, respectively. In addition, we also record the presence of HTML input boxes of type "password". Figure 5 presents our findings.

On the left, Figure 5 shows the fraction of all pages that include forms, password fields, and embedded scripts. While 7% of pages have password input boxes, almost two-thirds of all pages browsed contain forms. This suggests that users can enter information in many of the Web pages they visit. While many Web forms do not ask for personal information, forms have become highly prevalent on the Web. Embedded scripts are even more prevalent than forms, being found in over 80% of visited pages. These results suggest that being able to enter information in Web pages has now become the common case when users browse the Web.

On the right, Figure 5 presents the cumulative distribution of the number of embedded scripts, forms, and passwords present in Web pages. For each distribution, we exclude the Web pages that do not contain any embedded scripts, forms, and passwords, respectively. Figure 5 shows that while most pages with password fields have only one such field (92.6%), 40% of all pages with embedded scripts include at least 10 such scripts. These results suggest that not only are forms and embedded scripts prevalent on the Web, but it is also common to have multiple forms and scripts present on a single Web page.
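The per-page measurement described above can be reproduced with a simple tag counter. This sketch uses Python's standard html.parser for illustration; it is not the extension's own FireFox-side implementation:

```python
# Sketch: count <form>, <script>, and password-input tags in a page,
# mirroring the measurement described in the text (not iTrustPage's code).
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.forms = self.scripts = self.passwords = 0

    def handle_starttag(self, tag, attrs):
        if tag == "form":
            self.forms += 1
        elif tag == "script":
            self.scripts += 1
        elif tag == "input" and dict(attrs).get("type") == "password":
            self.passwords += 1

def count_tags(html):
    counter = TagCounter()
    counter.feed(html)
    return counter.forms, counter.scripts, counter.passwords

page = """<html><body>
<script src="a.js"></script>
<form action="/login"><input type="text" name="user">
<input type="password" name="pw"></form>
</body></html>"""
print(count_tags(page))  # (1, 1, 1)
```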

Figure 5: The distribution of forms, passwords, and scripts in Web pages. (a) Fraction of pages browsed by iTrustPage's users with form HTML tags (63.3%), password HTML tags (7.7%), and script HTML tags (80.3%); (b) CDFs of the number of forms, passwords, and script tags in the browsed Web pages containing such tags. Note that in this graph we exclude the Web pages not containing any forms, passwords, or script tags, respectively.

4.3 How disruptive is iTrustPage to the average user?

To validate suspicious Web forms, iTrustPage interrupts users and asks them for assistance in validating these Web pages. If these disruptions are too frequent, users might stop using iTrustPage. We believe that a high degree of user transparency is a key requirement of any Web security tool. iTrustPage must minimize the number of interruptions to its users.

Figure 6 illustrates how intrusive iTrustPage is. On the left, Figure 6 shows the fraction of Web pages that iTrustPage asked to be validated over time. On the right, Figure 6 presents the same data measured in the number of interruptions per day rather than the fraction of the total number of pages browsed. Our results indicate that iTrustPage is not excessively intrusive for its users. After one week of use, iTrustPage disrupts its users on less than 2% of all pages browsed. In fact, we found that fewer than half of iTrustPage's users are disrupted daily (except during the day when users install their copy of iTrustPage).

Figure 6: How disruptive iTrustPage is. On the left, we present the ratio of the number of interruptions due to iTrustPage to the total number of pages visited each day. This graph presents the average of these ratios over time, relative to the number of days elapsed since the installation day. While iTrustPage stops users about 2% of the time in the first couple of days, after one week, the rate decreases to 1 to 1.5% of the time. On the right, we present the number of interruptions due to iTrustPage. The median number of interruptions per user is 0 every day, except on the first day after the installation.

4.4 How effective are iTrustPage's whitelist and cache?

iTrustPage uses a whitelist and a cache to minimize the number of disruptions. The whitelist contains 467 domain names selected from the top 500 most popular Web sites as reported by Alexa.com (some of these Web sites belong to the same domain name). Also, whenever a new Web page is validated or bypassed by a user, iTrustPage records this decision along with the page in the user cache. In this way, users are not interrupted upon re-visiting a Web page.

If these two techniques are effective, many of the pages users type into will appear either in the whitelist or in the cache. Figure 7 illustrates how effective the whitelist and the cache are by presenting their hit rates over time. On the left, we present the average hit rate of the caches of all users relative to the number of days since iTrustPage's installation. While iTrustPage's caches have an average 40% hit rate on their first day, the hit rate grows to 65% after only one week. After only one week, two-thirds of the pages users type into appear in their caches (i.e., they were re-visited).

On the right, Figure 7 shows the hit rate of the whitelist for all users over time. The whitelist's hit rate is analyzed independently of the cache's hit rate. As expected, the hit rate of iTrustPage's whitelist does not rise over time; instead, it remains flat at about 55%. Like the cache, the whitelist is also very effective – more than half the pages users type into appear in iTrustPage's whitelist.
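Whitelist membership is checked against the page's domain. A minimal sketch follows; the matching rule used here (exact host or subdomain-suffix match) is our assumption for illustration, not a detail given in the paper:

```python
# Sketch: domain-based whitelist check, assuming exact-host or
# subdomain-suffix matching (our assumption, for illustration only).
from urllib.parse import urlparse

# Hypothetical subset of an Alexa-style domain whitelist.
WHITELIST = {"google.com", "paypal.com", "yahoo.com"}

def in_whitelist(url, whitelist=WHITELIST):
    """True if the URL's host is a whitelisted domain or a subdomain of one."""
    host = urlparse(url).netloc.lower()
    return any(host == d or host.endswith("." + d) for d in whitelist)

print(in_whitelist("https://www.paypal.com/signin"))       # True
print(in_whitelist("http://www.paypal.com.evil.example"))  # False
```

Matching on the domain suffix rather than the raw URL string is what keeps look-alike hosts such as `paypal.com.evil.example` out of the whitelist.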

Figure 7: The hit rates of iTrustPage's cache and whitelist. The hit rates of each mechanism were analyzed in isolation from each other.

4.5 How often does iTrustPage validate pages?

When a user types into a page, iTrustPage first checks whether the page's domain appears in the whitelist. If not, iTrustPage next checks whether the page appears in iTrustPage's cache. If the page appears neither in the whitelist nor in the cache, iTrustPage resorts to its user-assisted validation scheme. On the left, Figure 8 presents the breakdown of validating pages with iTrustPage. More than half of the pages that users type into (55.7%) appear in iTrustPage's whitelist. Combining the whitelist and the cache, iTrustPage can validate 67.4% of pages typed into. Intuitively, this corresponds to the hit rate of a single cache servicing all of iTrustPage's users; this hit rate is different than the average hit rate of users' caches and whitelists presented in Figure 7.

If a Web page does not appear either in the whitelist or in the cache, iTrustPage labels the page as suspicious. In this case, iTrustPage validates the page if the user can search for and find the page on Google. If the page is not found, the user is presented with a set of alternate pages (i.e., the top three pages returned by Google) and the user is asked to choose a visually similar page.

On the right, Figure 8 presents the breakdown of how iTrustPage handles suspicious pages. Over 60% of suspicious pages are reached by navigating to them directly from a Google search; in these cases, iTrustPage validates the page without asking the user to perform an additional Google search. In almost 5% of cases, users validate their pages by searching for them and finding the answer in the top 10 results returned by Google. In a very small number of cases (0.2%), users choose to navigate to an alternate page instead. Finally, users bypass iTrustPage's validation process one-third of the time (33.6%).

While these results indicate that iTrustPage is successful at validating many of the suspicious Web forms that users type into, we also found that it is common for users to bypass iTrustPage's validation despite our efforts to make the tool unintrusive. One third of the time users avoid validating their Web pages, thereby running the risk of becoming the victim of a phishing attack.

Figure 8: Breakdown of validating the pages users type into. On the left, we present how often pages are validated by the whitelist (55.7%), the cache but not the whitelist (11.7%), and finally, by iTrustPage's user-assisted technique (32.6%). On the right, we focus on the user-assisted validation only: pages validated by navigating to them from Google Search (61.6%), pages validated by searching and found in the top 10 answers (4.6%), visually similar pages chosen instead (0.2%), and users choosing to bypass iTrustPage's validation (33.6%). With user assistance, iTrustPage is successful at validating 66.4% of all suspicious pages. In the remaining cases, users choose to bypass the validation process.
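The overall decision sequence just described can be sketched as follows. The function and field names are ours, and the Google query is stubbed out:

```python
# Sketch of iTrustPage's validation flow: whitelist, then cache, then
# user-assisted search. Names and the stubbed search are illustrative.
def validate_page(url, whitelist, cache, came_from_google_search,
                  google_top10, user_search_terms=None):
    """Return one of: 'whitelist', 'cache', 'auto-validated',
    'user-validated', or 'needs-assistance'."""
    domain = url.split("/")[2]
    if domain in whitelist:
        return "whitelist"
    if url in cache:                      # previously validated or bypassed
        return "cache"
    # Page is suspicious. If the user already reached it via a Google
    # search, that search already served as validation assistance.
    if came_from_google_search:
        cache.add(url)
        return "auto-validated"
    # Otherwise ask the user for search terms and check whether the page
    # appears in the top 10 results for those terms.
    if user_search_terms and url in google_top10(user_search_terms):
        cache.add(url)
        return "user-validated"
    return "needs-assistance"             # user may refine terms or bypass

cache = set()
top10 = lambda terms: ["https://www.paypal.com/"] if terms == "paypal" else []
r1 = validate_page("https://www.paypal.com/login", {"www.paypal.com"}, cache, False, top10)
r2 = validate_page("http://phish.example/pp", set(), cache, False, top10, "paypal")
print(r1, r2)  # whitelist needs-assistance
```

In the second call, the phishing page never appears in the top results for the user's own description of the site, so it stays unvalidated and the user is steered toward the legitimate result instead.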

4.6 How many searches do users perform until they validate their page?

To validate their forms with iTrustPage, users must enter search terms describing the Web page they intended to type into. We examine how many times users revise their search terms to validate their Web forms. Figure 9 illustrates the number of attempts users needed to revise their searches, for those searches that were eventually found in the top 10 search results. In 81.1% of the cases, the first search was successful, whereas 6.8% of the cases required at least three searches. These findings suggest that users can find their intended page with one search the vast majority of the time. Nevertheless, on occasion it is necessary to refine the search a few times in order to finish the validation process for some Web forms.

Figure 9: Number of times users must refine their search to validate a page (one search: 81.1%; two searches: 12.2%; three or more searches: 6.8%). Users can manually validate their page with only one search in 81.1% of the cases.


4.7 Did iTrustPage prevent any users from becoming phishing victims?

While measuring iTrustPage's non-intrusiveness and the effectiveness of assisting users with validating their forms is important, ultimately iTrustPage is useful only if it stops users from becoming victims of phishing attacks. Unfortunately, we cannot determine whether any browsed page is a phishing page without violating the privacy of iTrustPage's users. Determining whether a page is phishing requires knowing what the page displays.

However, we can compute an upper bound on the number of prevented phishing attacks by investigating the cases when users chose to navigate to a page visually similar to the one they typed into. While we do not know whether all these cases correspond to prevented phishing attacks, in any phishing attack stopped by iTrustPage a user must choose to navigate to an alternate page.

We found 291 cases in our logs when a user chose a visually similar page. While 81 of these cases occurred within the first three days of installing iTrustPage (e.g., these might correspond to cases when users tested and explored iTrustPage's functionality), 92 cases occurred after the user had run iTrustPage for at least two weeks. We found HTML forms present in 240 of these pages.

4.8 Summary

Our evaluation shows that iTrustPage is effective and easy to use. Specifically, we found that:

• iTrustPage disrupts users on less than 2% of the pages they visit, and the number of disruptions decreases over time.

• iTrustPage's cache is effective – over two-thirds of the forms users type into appear in iTrustPage's cache after one week of use.

• iTrustPage's whitelist is effective – more than half of the forms users type into appear in iTrustPage's whitelist.

• iTrustPage resorts to its user-assisted validation scheme on 32.6% of the pages users type into.

• iTrustPage requires user assistance to validate suspicious Web forms two-thirds of the time; one-third of the time users choose to bypass iTrustPage's validation.

• iTrustPage is easy to use: when searching to validate their Web forms, users can find their desired form in the top 10 search results 81.1% of the time.

• We cannot determine whether iTrustPage prevented any phishing attacks without violating the privacy of iTrustPage's users. Nevertheless, we found 291 cases when users chose to navigate to a Web page other than the one they originally intended to visit. Of these cases, 92 occurred after the users were already familiar with running iTrustPage (i.e., at least two weeks after their installation date).

These findings show that iTrustPage's approach of relying on user input and on external repositories of information shows promise. In the future, we intend to examine additional strategies to incorporate these insights into other Internet security tools.

5. RELATED WORK

In this section, we start by classifying current anti-phishing tools and techniques into four broad categories. After this categorization, we summarize the results of six previous studies on the behavior of Internet users when facing phishing Web pages, when using their Web passwords, and when using different anti-phishing tools. We believe the results of these studies portray the seriousness of the "phishing threat".

5.1 Spam Filters and Blacklists

One approach relies on spam filters and blacklists to automatically prevent users from visiting a phishing Web page. Already there are many phishing-specific filters for popular email software (e.g., Exchange Server 2003 SP2 [19], Outlook [20], and SpamAssassin [25, 9]). Many Web browsers also incorporate blacklists to prevent users from visiting specific Web pages based on the domain or the IP addresses they are served from. Microsoft's new IE7 browser, Mozilla's FireFox 2, and Opera from version 9.1 all include lists of known phishing pages. If such a page is visited, the browser will either warn the user or block the page outright [12]. Very recently, a company has started to offer a special DNS service that filters out known phishing domains [16].

Some advantages of this general approach are its simplicity, transparency, and ease of deployment. Web browsers and spam filters are already highly popular. Filters and blacklists are easy to deploy automatically, and much of their functionality remains transparent to most users.

While effective, this class of solutions alone will not eliminate phishing attacks. Spam filters are not perfect. Phishing pages must be quickly discovered and added to blacklists, especially since the average uptime of a phishing page was only 4.5 days in August 2006 [1]. Also, the "freshness" of phishing pages can impact the effectiveness of some anti-phishing tools [30]. Studies have shown that many users ignore browser warnings when they do not understand their implications [27, 8, 5]. All these reasons lead us to believe that spam filters and blacklists will have only a marginal and temporary effect on the prevalence of phishing attacks.

5.2 New Web Authentication Tools

Another approach is to invent new Web authentication schemes that replace the current approach of users entering passwords directly into forms. Different techniques have been proposed to replace current authentication protocols between users and Web sites.

One such technique is out-of-band authentication, where users are asked to login through a different channel, more secure than the Web, such as a cell-phone [22] or a virtual machine [15]. Unfortunately, some of these techniques are subject to man-in-the-middle attacks [24]. Furthermore, out-of-band techniques can be logistically difficult to deploy.

Several research projects have proposed using password managers to protect users' credentials [14, 23, 29]. Almost all these tools rely on a technique known as "password hashing" to ensure that one user never re-uses a password on more than one Web site. The idea is simple: these techniques use some site-specific information (for example, its SSL certificate or its domain name) combined with the user's password as input to a secure hash function, whose output is forwarded to the server. In this way, the password manager ensures that even when phished, users do not divulge their passwords; instead, they divulge their passwords hashed with information specific to the phishing page. Phishers cannot reuse these passwords on the legitimate site.

The password hashing approach is both elegant and very effective. Unfortunately, password managers have not been quickly adopted by either users or Web sites. We believe several reasons are responsible for the moderate success of these tools. First, some tools have deployment issues. For example, when resetting a password, many Web sites today send an e-mail message containing a new, perhaps temporary, password. The user must enter this new password to login, which requires disabling the password manager. The user then logs in with the temporary password, visits a form that allows users to change their password, and then re-enables the password manager to generate the permanent password. To handle this issue, many password managers use a special sequence of keys (e.g., typing the character '@' twice) or a toolbar button to activate or de-activate them [14, 23, 29]. De-activating a password manager is dangerous because the user becomes exposed to phishing attacks. Also, a recent study [5] found that many participants forget to activate (or re-activate) their password managers.

The same study [5] revealed a more subtle issue with password managers. The concept of password hashing is not easy to explain to many people, especially novice users. Some people's mental process of how online authentication works differs substantially from what password hashing does. As a result, users become frustrated, and they misunderstand when the tool is protecting them and when it is not. Ultimately, this leads to users still being exposed to phishing attacks. We believe that this is an important lesson: to be effective, a tool must be simple and intuitive, fitting most users' mental model of how Web browsing works.
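Concretely, the password-hashing idea described above can be illustrated in a few lines. This is a simplified sketch of the general scheme, not any particular tool's exact algorithm; real tools also handle per-site password policies and encoding constraints:

```python
# Simplified sketch of password hashing: the value sent to a site is a
# hash of the user's password keyed by site-specific information (here,
# the domain name). Real tools differ in encoding and policy handling.
import base64
import hashlib
import hmac

def site_password(master_password, domain):
    digest = hmac.new(master_password.encode(), domain.encode(),
                      hashlib.sha256).digest()
    return base64.b64encode(digest)[:16].decode()  # truncate for form fields

legit = site_password("hunter2", "paypal.com")
phish = site_password("hunter2", "paypal-secure.example")
# The phishing site only ever sees a hash bound to its own domain,
# which is useless on the legitimate site.
print(legit != phish)  # True
```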

5.3 New Web InterfacesAnother approach is to design Web interfaces that are less vul-

nerable to phishing. One such example is to require users to ac-cess important Web pages only through user-created labels [29]. Inthese cases, users have to go through an initial setup phase wherethey assign special labels to each of their important Web pages.From then on, as long as users visit their pages only by clicking onthe appropriate label, they cannot be phished.Another example allows users to create personalized visual clues

and associate them with important Web pages. Many popular Websites (e.g., Yahoo) have started to adopt this technique. Because aphishing page cannot know the correct visual clue, users can im-mediately detect when they are accessing a phishing page becausetheir personalized clue is missing or incorrect.The main disadvantage of both these approaches is that they

place the burden on the user to notice the absence of personalizedclues or to never forget to access the important page through theirpreset labels. These tools protect only the most diligent Web users,the ones who will always carefully check the authenticity of theWeb forms they are about to fill in with their sensitive information.Unfortunately, studies [7, 27] have shown that many Internet usersare not very careful when filling forms online.Another approach relies on the content of a Web page to de-

termine its legitimacy. CANTINA [31], an anti-phishing system,extracts keywords from a Web page, searches Google for thesekeywords, and determines whether the top Google search resultscontain the page’s domain. The authors also evaluate additionalheuristics combined with their keyword extraction technique. Thepure keyword extraction correctly labels 97% of their phishing dataset with 6% false positives; when combined with heuristics the sys-tem correctly labels 90% of their phishing dataset with 1% falsepositives. Unfortunately, this work doesn’t address how to notifyor block users from filling forms on suspicious Web pages. When-ever the false positive rate is not zero, it becomes problematic tosimply notify users or block them from using a Web form. We be-lieve this is a significant shortcoming in anti-phishing research andis an important reason for developing iTrustPage. Another prob-lem with CANTINA is that an attacker can create a Web page thatappears visually similar to an established page, while still mislead-ing the tool into extracting incorrect search terms. For example,the phisher could hide the correct keywords as an image to protectthem from being extracted. If the automated system extracts incor-

rect keywords, then the phishing web page may be automaticallyvalidated. Adding manual validation to an anti-phishing system ismore robust – attackers cannot pick the keywords that users give toiTrustPage’s manual validation task.A different approach is to use automatic tools to fill in forms.

Such tools remove the need for Internet users to type their credentials into online forms. These tools can then perform extra security checks to make sure that the form is legitimate. Web Wallet [28] is a tool that automatically fills in previously saved passwords. When the user is phished, Web Wallet detects that the current form has not been visited before. In this case, it presents the user with a list of previously filled-in forms; the user must review this list and either continue or choose one of the previous forms. If the user continues, Web Wallet issues a warning if the page has not been verified by TrustWatch [13], a site serving a list of verified Web forms.

On the surface, Web Wallet is similar to our tool: when filling out a new password field, the user is presented with a list of previously filled-in forms (which are presumably legitimate). However, we believe iTrustPage is based on different insights than Web Wallet. Web Wallet relies on three mechanisms to prevent phishing: (1) automatic detection of password fields, (2) previously filled-in password fields, and (3) relying on users to notice and understand the description of a Web form (even when that description might not be available). We believe that systems with automatic detection of password fields can be "fooled" by clever Web forms (e.g., using ActiveX controls or Flash scripts). Also, we think that a good security principle is to reduce the amount of confidential information stored and managed automatically by tools. Unfortunately, Web Wallet has to maintain a list of previously filled-in passwords.

Unlike the anti-phishing tools discussed above, iTrustPage is centered around three different observations: (1) users can describe the Web forms they are about to fill in; (2) these descriptions can be used to find more "established" forms on the Web, and when such a form exists, the current form is most likely phishing; and (3) unlike many other solutions, iTrustPage does not simply warn the user when encountering a suspicious Web page; it offers them corrective action. iTrustPage's goal is to assist the user by taking them away from the phishing Web page and helping them find the corresponding legitimate page.
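The search-and-compare validation idea that recurs in this section (extract keywords from a page, as CANTINA does, or ask the user for them, as iTrustPage does; search the Web; and check whether the candidate page's domain appears among the top results) can be sketched in a few lines. This is a minimal illustration only: the function names and the `search` callback are hypothetical and are not part of either tool's actual implementation.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Hostname of a URL (a real tool would compare registrable domains)."""
    return (urlparse(url).hostname or "").lower()


def looks_established(candidate_url, keywords, search, top_n=10):
    """Return True if the candidate page's domain appears among the top
    search results for its keywords, i.e., the page looks "established".

    `search` stands in for any search API: it takes a query string and
    returns a ranked list of result URLs.
    """
    results = search(" ".join(keywords))[:top_n]
    candidate = domain_of(candidate_url)
    return any(domain_of(url) == candidate for url in results)


# Usage with a canned search function standing in for a real engine:
fake_search = lambda q: ["https://www.example-bank.com/login",
                         "https://en.wikipedia.org/wiki/Example_Bank"]
print(looks_established("https://www.example-bank.com/signin",
                        ["example", "bank", "login"], fake_search))   # True
print(looks_established("http://examp1e-bank.evil.net/login",
                        ["example", "bank", "login"], fake_search))   # False
```

As the CANTINA results above illustrate, any non-zero false positive rate makes it unsafe to act on such a check fully automatically, which is why iTrustPage puts the user in the loop instead of acting on the result alone.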

5.4 Centralized Approaches

A different approach uses a centralized server that tracks when users provide the same password to different sites [11]. The main observation behind the password-tracking approach is that when two different sites appear to have users with the same password, it is likely that one site is phishing the other. Many issues have been raised regarding centralized approaches, including protecting users' privacy, filtering information introduced by phishers to poison the server's data, and whether the detection of phishing pages can be done sufficiently early to rescue any potential victims.

Another approach uses the notion of visual similarity (i.e., page layouts and style) of Web pages [18]. Site owners submit their legitimate URLs and keywords to a central server. When the user receives an e-mail message with similar keywords, the system compares the visual similarity of the linked page in the e-mail message with the registered legitimate pages. Although this is an interesting approach, there are three key problems. First, it relies on site owners submitting their keywords and URLs, and phishers may attack this mechanism. Second, we believe it is possible for this type of heuristic to be misled by clever Web forms, which is why our tool relies on people to perform the visual comparisons. Third, the phisher may attempt to hide the relevant keywords in the phishing e-mail message.
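The password-tracking observation behind [11] can be made concrete with a toy monitor. This is a hedged sketch, not the system from the paper: the class and method names are invented, and a real deployment would use salted, slow hashes and would have to address the privacy and data-poisoning issues raised above.

```python
import hashlib
from collections import defaultdict


class PasswordReuseMonitor:
    """Toy centralized password-reuse tracker (all names invented)."""

    def __init__(self):
        # (user, password digest) -> set of domains where it was typed
        self._seen = defaultdict(set)

    def _digest(self, user, password):
        # Illustrative only: a real system would use a salted, slow hash
        # and would never receive cleartext passwords at the server.
        return hashlib.sha256(f"{user}:{password}".encode()).hexdigest()

    def observe(self, user, password, domain):
        """Record that `user` typed `password` at `domain`; return the set
        of *other* domains that previously saw the same credential.
        A non-empty result is a possible phishing signal."""
        key = (user, self._digest(user, password))
        others = set(self._seen[key]) - {domain}
        self._seen[key].add(domain)
        return others


monitor = PasswordReuseMonitor()
print(monitor.observe("alice", "hunter2", "bank.example"))   # set()
print(monitor.observe("alice", "hunter2", "lure.example"))   # {'bank.example'}
```

The second call returning a prior domain is exactly the "same password at two different sites" event: one of the two sites is likely phishing the other.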


5.5 User Studies Relevant to Phishing

In this section, we briefly summarize the results of six studies of how users behave when facing phishing pages, when using their Web passwords, and when using different anti-phishing tools. Overall, these studies paint a pessimistic picture of our progress against the "phishing threat": it is very easy to mislead people into divulging their credentials with common phishing attacks.

Florêncio et al. [10] monitored the Web password habits of over half a million users over a period of three months. Based on these measurements, the authors extrapolate users' Web password behaviors. We describe four of their findings relevant to the phishing problem. First, during a three-week period, they observed passwords being typed into verified phishing sites 101 times. These are passwords which were previously used on other Web sites. Second, users type their passwords often. The authors estimate that users type passwords an average of eight times per day. Third, users have many Web accounts. On average, each user has 25 accounts requiring passwords. In addition, an average user has 6.5 passwords, and each password is shared between 3.9 different sites. Finally, users forget their passwords. The authors estimate that 4.3% of Yahoo users forgot their passwords during the three-month study. They also found that Yahoo users changed their password 15 times out of every 100 logins.

Dhamija et al. [7] asked 22 participants to decide whether or not 20 Web pages were legitimate. They found that the participants made mistakes 40% of the time. Even worse, the best phishing Web page fooled 90% of the participants. Warnings are ineffective: 68% of the participants "proceeded without hesitation when presented with [certificate] warnings".

Wu et al. [27] explored how well anti-phishing toolbars stop people from using a fraudulent Web page. They found that active warnings (e.g., pop-up warnings) are more effective than passive security cues. Even active warnings failed to prevent some of the participants from falling victim to the phishing pages. Moreover, some participants thought the warning given by the toolbar was invalid.

Downs et al. [8] performed a preliminary interview study including 20 participants with no computer security experience. They found that many users cannot distinguish legitimate e-mail from phishing e-mail. Many participants miss cues in the address bar, and they do not interpret pop-up messages in meaningful ways. They also found that security tools need to recommend a course of action instead of merely giving warnings.

Jagatic et al. [17] performed an actual phishing attack against 581 students at Indiana University in April 2005. The experiment used publicly available information to personalize the phishing e-mails. 72% of targeted students gave away their username and password when the e-mail message appeared to be from a friend. When the e-mail was sent from a fictitious person, the successful phishing rate dropped to 16%. In the first 12 hours of their experiment, 70% of the total phishing responses occurred. These findings illustrate that more sophisticated phishing attacks, such as sending personalized phishing e-mails, have very high rates of success.

Chiasson et al. [5] explored the usability of two password managers with 26 participants, as mentioned earlier. Their participants had difficulty building a mental model of the software: they did not have an understanding, even at a high level, of what the software is doing. This led to frustration and to misconceptions that created dangerous security exposures. They also found that participants tend to dismiss alerts when the warning message is unclear.

5.6 Summary: Lessons Learned

In this section, we summarize the lessons learned from the above related work. In spite of all these anti-phishing efforts, the phishing threat has been gaining momentum, as we discussed in Section 1. We believe that these lessons help us better identify different strategies for dealing with phishing.

• To be effective, an anti-phishing tool must be intuitive and simple to use. It can rely on users only to perform very simple tasks they normally perform when browsing the Web.

• Relying on users to be diligent and check for signs of suspect e-mails or Web pages can be only marginally successful.

• More sophisticated phishing attacks, such as the ones sending personalized e-mails, have high rates of success. It is difficult for spam filters to identify and eliminate personalized phishing e-mails.

• Many phishing pages are short-lived. To be effective, techniques based on detecting these pages and blocking users from visiting them must act quickly.

6. CONCLUSIONS

This paper presents iTrustPage, a tool for preventing users from filling out phishing Web forms. iTrustPage relies on two key observations: (1) user input can be used to disambiguate between legitimate and phishing sites, as long as the interaction with the user is simple and intuitive; and (2) Internet repositories of information can be used to assist the user with the decision-making process.

Our results show that user-assisted anti-phishing strategies show promise while avoiding the problems associated with automatic tools. We believe that iTrustPage is just one example of a different approach to Internet security: having the user assist the security tool and help it provide better protection. In the future, we intend to examine additional strategies to incorporate this insight into other Internet security tools.

7. ACKNOWLEDGMENTS

We would like to thank John Aycock, Fabian Monrose, Ratul Mahajan, Niels Provos, Anil Somayaji, Paul Van Oorschot, and the anonymous reviewers for their helpful comments on earlier drafts of this paper.

8. REFERENCES

[1] Anti-Phishing Working Group Website. http://www.antiphishing.org/.

[2] Personal Communication, 2006. Confidential Source, Canadian Banking Sector. Toronto.

[3] iTrustPage Tool, 2007. http://www.cs.toronto.edu/~ronda/itrustpage/.

[4] M. Barreno, B. Nelson, R. Sears, A. D. Joseph, and J. D. Tygar. Can Machine Learning Be Secure? In Proceedings of the ACM Symposium on Information, Computer, and Communication Security (ASIACCS), Taipei, Taiwan, March 2006.

[5] S. Chiasson and P. van Oorschot. A Usability Study and Critique of Two Password Managers. In Proceedings of the USENIX Security Symposium, August 2006.

[6] CNET News.com. New tool enables sophisticated phishing scams. http://news.com.com/New+tool+enables+sophisticated+phishing+scams/2100-1029_3-6149090.html.


[7] R. Dhamija, J. D. Tygar, and M. Hearst. Why Phishing Works. In Proceedings of the Conference on Human Factors in Computing Systems (CHI), April 2006.

[8] J. S. Downs, M. B. Holbrook, and L. F. Cranor. Decision Strategies and Susceptibility to Phishing. In Proceedings of the Symposium on Usable Privacy and Security, July 2006.

[9] I. Fette, N. Sadeh, and A. Tomasic. Learning to Detect Phishing Emails. In Proceedings of the International World Wide Web Conference (WWW), Banff, Alberta, Canada, May 2007.

[10] D. Florêncio and C. Herley. A Large-Scale Study of Web Password Habits. In Proceedings of the International World Wide Web Conference (WWW), Banff, Alberta, Canada, May 2007.

[11] D. Florêncio and C. Herley. Password Rescue: A New Approach to Phishing Prevention. In Proceedings of the USENIX Workshop on Hot Topics in Security, July 2006.

[12] R. Franco. Better Website Identification and Extended Validation Certificates in IE7 and Other Browsers, 2005. http://blogs.msdn.com/ie/archive/2005/11/21/495507.aspx.

[13] GeoTrust. TrustWatch Search, 2006. http://www.trustwatch.com/.

[14] J. Halderman, B. Waters, and E. Felten. A Convenient Method for Securely Managing Passwords. In Proceedings of the International Conference on World Wide Web, May 2005.

[15] C. Jackson, D. Boneh, and J. C. Mitchell. Stronger Password Authentication Using Virtual Machines. 2006. http://crypto.stanford.edu/SpyBlock/spyblock.pdf.

[16] K. Jackson. DNS Gets Anti-Phishing Hook, 2006. http://www.darkreading.com/document.asp?doc_id=99089&WT.svl=news1_1.

[17] T. Jagatic, N. Johnson, M. Jakobsson, and F. Menczer. Social Phishing. Communications of the ACM, Vol. 50, No. 10, October 2007.

[18] W. Liu, X. Deng, G. Huang, and A. Fu. An Antiphishing Strategy Based on Visual Similarity Assessment. IEEE Internet Computing, Vol. 10, No. 2, pp. 58-65, March/April 2006.

[19] Microsoft. Exchange Server, 2006. http://www.microsoft.com/exchange/default.mspx.

[20] Microsoft.com. Get Anti-Phishing and Spam Filters with Outlook SP2, 2005. http://www.microsoft.com/athome/security/email/outlook_sp2_filters.mspx.

[21] J. Nazario. Phishingcorpus: phishing2. http://monkey.org/~jose/phishing/phishing2.mbox.

[22] B. Parno, C. Kuo, and A. Perrig. Phoolproof Phishing Prevention. In Proceedings of Financial Cryptography and Data Security (FC), 2006.

[23] B. Ross, C. Jackson, N. Miyake, D. Boneh, and J. Mitchell. Stronger Password Authentication Using Browser Extensions. In Proceedings of the USENIX Security Symposium, April 2005.

[24] B. Schneier. Two-Factor Authentication: Too Little, Too Late. Communications of the ACM, Vol. 48, No. 4, April 2005.

[25] SURBL. SURBL Lists, 2006. http://www.surbl.org/lists.html.

[26] Symantec. Symantec Internet Security Threat Report: Trends for July-December 06. http://eval.symantec.com/mktginfo/enterprise/white_papers/ent-whitepaper_internet_security_threat_report_xi_03_2007.en-us.pdf.

[27] M. Wu, R. Miller, and S. Garfinkel. Do Security Toolbars Actually Prevent Phishing Attacks? In Proceedings of the Conference on Human Factors in Computing Systems (CHI), April 2006.

[28] M. Wu, R. Miller, and G. Little. Web Wallet: Preventing Phishing Attacks by Revealing User Intentions. In Proceedings of the Symposium on Usable Privacy and Security, July 2006.

[29] K. Yee and K. Sitaker. Passpet: Convenient Password Management and Phishing Protection. In Proceedings of the Symposium on Usable Privacy and Security, July 2006.

[30] Y. Zhang, S. Egelman, L. Cranor, and J. Hong. Phinding Phish: An Evaluation of Anti-Phishing Toolbars. In Proceedings of the Network and Distributed System Security Symposium (NDSS), San Diego, California, USA, February 2007.

[31] Y. Zhang, J. Hong, and L. Cranor. CANTINA: A Content-Based Approach to Detecting Phishing Web Sites. In Proceedings of the International World Wide Web Conference (WWW), Banff, Alberta, Canada, May 2007.