

Proceedings on Privacy Enhancing Technologies; 2018 (1):127–144

Lucy Simko*, Luke Zettlemoyer, and Tadayoshi Kohno

Recognizing and Imitating Programmer Style: Adversaries in Program Authorship Attribution

Abstract: Source code attribution classifiers have recently become powerful. We consider the possibility that an adversary could craft code with the intention of causing a misclassification, i.e., creating a forgery of another author’s programming style in order to hide the forger’s own identity or blame the other author. We find that it is possible for a non-expert adversary to defeat such a system. In order to inform the design of adversarially resistant source code attribution classifiers, we conduct two studies with C/C++ programmers to explore the potential tactics and capabilities both of such adversaries and, conversely, of human analysts doing source code authorship attribution. Through the quantitative and qualitative analysis of these studies, we (1) evaluate a state-of-the-art machine classifier against forgeries, (2) evaluate programmers as human analysts/forgery detectors, and (3) compile a set of modifications made to create forgeries. Based on our analyses, we then suggest features that future source code attribution systems might incorporate in order to be adversarially resistant.

Keywords: Authorship Attribution, Source Code Attribution, Machine Learning, Adversarial Stylometry, Privacy, Computer Security

DOI 10.1515/popets-2018-0007
Received 2017-05-31; revised 2017-09-15; accepted 2017-09-16.

1 Introduction

With recent publicity surrounding cyber attacks and corresponding investigations into attribution of the attacks, the computer security research community has renewed its attention on authorship attribution of programs.

*Corresponding Author: Lucy Simko: Paul G. Allen School of Computer Science & Engineering, University of Washington, E-mail: [email protected]
Luke Zettlemoyer: Paul G. Allen School of Computer Science & Engineering, University of Washington, E-mail: [email protected]
Tadayoshi Kohno: Paul G. Allen School of Computer Science & Engineering, University of Washington, E-mail: [email protected]

This line of research seeks to answer the following question: given source code (or a binary) of unknown authorship, as well as examples of source code (or binaries) from a set of authors, is it possible to identify the original author of that program? Prior research on this question has shown significant promise: it is possible to attribute the authorship of programs with a high degree of accuracy under realistic assumptions. For example, Caliskan-Islam et al. [5] propose a classifier that achieves 96% accuracy over 250 programmers.

Computer security involves adversaries trying to defeat security mechanisms, and the designers of those security mechanisms seeking to defend against adversaries. We argue that the next phase in the evolution of program authorship attribution is therefore considering potential adversaries: parties seeking to fool authorship attribution systems. Specifically, because strong program authorship attribution systems now exist, albeit not designed for adversarial contexts, we can experimentally explore how adversaries might seek to defeat those systems and whether those adversaries might be successful. This paper explores both the methods that an adversary might use to intentionally subvert an attribution system, as well as a conservative estimate of the adversary’s capabilities. By understanding the adversaries’ tactics and capabilities, as well as their effect on a (public) state-of-the-art attribution system, our results can inform the design of adversarially resistant authorship attribution systems.

Adversarial Goals. We consider two related goals for an adversary: to create a forgery of another author, and to mask the original author. A successful masking means that an adversary has taken a program by some original author and modified it to conceal the true authorship. A successful forgery means that an adversary has fooled an attribution system into believing that some program was written by a targeted, victim author who did not write the program. A forgery attack is a targeted attack, while a masking attack is untargeted. Security under the latter goal (masking) implies security under the former (forgery), because an adversary able to accomplish forgery will by definition be able to accomplish masking.

For example, consider an attribution system trained on five authors (A-E), with program P_a written by


author A, and program P_a′ written by author A but modified by an adversary. An adversary launching a forgery attack with P_a′ would have the goal of P_a′ being labeled as written by a specific author, say, C. An adversary launching a masking attack with P_a′ has the goal of P_a′

being labeled as written by any of B, C, D, or E.

Our Forgery Studies. To study the capabilities and methods of an adversary with these goals, we conducted two studies with programmers; specifically, we study forgery attempts, as a successful forgery implies a successful masking. In our first study, the forgery creation study, programmers play the adversary, modifying code to fool a machine classifier. In our second study, the forgery detection study, we present a different set of programmers, acting as human analysts, with a series of attribution tasks over a set of code samples that include forgeries.

With the results of these studies, we then conduct quantitative analyses to explore the potential capabilities of an adversary against both existing classifiers and human analysts; we also conduct qualitative analyses to understand the potential methods of both an adversary creating attacks and a human analyst doing attribution. Finally, we analyze the modifications from the point of view of the machine classifier, to better understand the success of the forgeries and suggest features that future, robust machine classifiers might add.

Both these studies, along with our analysis of the machine classifier, deepen our understanding of how people might attempt to create forgeries (and give a baseline of their success), and provide us with insights on how one might design adversarially robust program authorship attribution systems, including systems designed to work with human analysts.

Contributions. We conduct the first public analysis of the robustness of a state-of-the-art source code authorship attribution system to adversarial attempts at creating forgeries. We also conduct the first, to our knowledge, public study of the efficacy of human analysts at identifying forgeries. The following contributions arise from these experiments and our resulting analyses.

– We experimentally evaluate the robustness of a public state-of-the-art attribution system to adversarial attempts at creating forgeries. In doing so, we find that both expert and non-expert programmers can successfully fool the attribution system.

– We provide a set of modifications that forgers made to create their forgeries. We also identify the strategies that human analysts (in our study) use to make attribution decisions.

– We experimentally evaluate programmers as analysts for source code authorship attribution and forgery detection. We find that forgery detection is difficult for humans as well as for machines, but that humans can adapt their strategies after being warned about adversarial input, without being specifically trained on it. From this analysis, we draw lessons for machines.

– We analyze strategies from both the forgers and the analysts, and provide recommendations to the designers of future program authorship attribution systems, in an effort to make them robust against the type of adversaries we consider.

– We investigate the effect of the forgeries on the machine classifier and, through this analysis, suggest lessons for future, more robust source code attribution classifiers.

2 Related Work

Natural Language Processing: adversarial stylometry and authorship obfuscation. Authorship attribution is a well-studied topic within the Natural Language Processing (NLP) community. See [25] for a survey. In addition to attributing authorship of written text, the NLP community has recently turned to authorship obfuscation and forgery.

Kacmarcik and Gamon [15] explored the idea of purposeful anonymization in NLP, with a focus on quantifying the effort necessary to anonymize against attribution systems. Later, Brennan et al. [4] introduced adversarial stylometry: the notion that authors of natural language text may intentionally obfuscate (mask) their identity, or imitate (forge) another’s identity. They asked a group of writers to obfuscate their own writing and then to imitate a famous author, and found that classifiers that typically had high accuracy did poorly on the obfuscated and imitated samples.

Our experiments and goals are similar to those of Brennan et al. [4], except our domain is source code instead of English. Just as code attribution research borrows ideas from NLP, our research on subverting source code attribution classifiers is both analogous to Brennan et al. [4] and novel within the security community.

Afroz et al. [1] analyzed the data from [4] to identify linguistic features that change when writers are being deceptive about their identities. This work introduced a


set of features with which they train a classifier to recognize when an author is hiding their identity. Concurrent work by Juola et al. [14] found similar results.

More recently, Potthast et al. [20] published a survey of authorship obfuscation and imitation, as well as a large-scale evaluation of obfuscation techniques and detection algorithms.

Source code attribution. Compared to natural languages, authorship attribution for source code is a much younger field. Early works include Spafford and Weeber [23], Pellen [19], Kothari et al. [16], and Frantzeskou et al. [10]. Recently, Caliskan-Islam et al. [5] improved the state of the art by using a combination of syntactic, lexical, and layout features and achieved higher accuracy over a much larger group of programmers than previous works. We use this classifier when testing the strength of forgeries.

Hayes and Offutt [13] suggest a number of high-level features to distinguish programmers. They find that the most distinguishing features concern the occurrence of operators, constructs, and lint warnings. They consider high-level algorithmic and engineering features such as testability and code coverage, but do not find that these features distinguish amongst the five programmers in their study. Our results suggest revisiting these high-level features in the context of forgeries.

Additionally, Dauber et al. [8] develop an ensemble classifier to do highly accurate attribution on short (partial) code samples.

Plagiarism detection for source code. Plagiarism detection is a sub-problem of authorship attribution. MOSS (“Measure Of Software Similarity”) [22], a well-known publicly available tool, measures the similarity between two sets of source code and outputs a percent match. Forgery and plagiarism are related but also have a clear difference: in plagiarism, multiple instances of the plagiarized program are often available. In forgery and masking, the adversary creates a new program, intending to cause the attribution system to misattribute it.

Classifying binaries. There is also work on attribution of binaries. Rosenblum et al. [21] apply machine learning to style features extracted from binaries; Caliskan-Islam et al. [6] build on this work. Muir and Wikström [17] find that changing compiler settings and linking statically can decrease attribution accuracy and obfuscate authorship against binary attribution classifiers; theirs is, to our knowledge, the only other work focused on authorship obfuscation for programs.

3 Threat Model and Goals

We now elaborate on our goals and underlying threat model, beginning with several motivating scenarios.

3.1 Motivating Scenarios

Consider the following scenarios. In each case, the programmer or adversary knows about the existence of program authorship attribution systems but not their details. The adversary is not necessarily malicious, but is always acting adversarially toward the attribution system.

1. Forgery. Suppose an adversary wants to insert malicious code into an open-source code repository. One avenue the adversary might take is to obtain the credentials of an established contributor and commit code that introduces a vulnerability. The adversary’s goal is to commit code that appears to have come from the legitimate programmer so as not to raise the suspicion of the other contributors to the repository, or trigger any automated alarms; i.e., the adversary seeks to create a forgery.

2. Masking. Suppose that a programmer wants to write and release code for a new digital currency but does not want anyone to know who wrote the program. The history of the Bitcoin digital currency and its anonymous inventor Satoshi inspires this scenario [3]. The programmer, desiring privacy, may seek to intentionally modify their code to make it unidentifiable, i.e., to mask their code.

3. Masking. Consider a programmer developing software in a country where censorship is rampant. The programmer wants to develop software that is illegal in that country, such as anti-censorship messaging apps, and may use Tor and other network tools to obfuscate any network indicators. However, the programmer also needs to ensure that the software does not look like it was written by them; i.e., this programmer seeks to mask their code, to protect their anonymity. This example is inspired by Saeed Malekpour [26], an Iranian web developer sentenced to death for writing photo sharing software used to distribute pornography.

4. Forgery. Suppose now that an adversary wants to make it appear that a programmer developed anti-censorship software. The adversary, perhaps a work colleague or someone with access to software that the intended victim had written, e.g., via Git,


modifies their own code to look like the victim’s, and the victim is blamed. This represents a forgery attack.

In any of these scenarios, the adversaries may either write their own code or modify code written by others, depending on their skill levels and access to code that already fulfills their purpose.

3.2 Threat Model

As captured in the preceding scenarios, we consider two key (related) adversarial goals.

1. Forgery. An adversary creates a successful forgery if they create code such that the attribution system attributes that code to a specific target author.

2. Masking. An adversary successfully masks the authorship of a program if the adversary modifies a program written by some original author such that the attribution system does not identify the original author when given the modified program as input.
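To make these two success conditions concrete, the following minimal sketch (ours, not part of either study; all names are hypothetical) expresses them as predicates over the label an attribution system assigns to a modified program:

// Sketch only: the two adversarial success conditions defined above.
#include <string>

// Masking succeeds if the system does not point back to the original author.
bool masking_succeeds(const std::string& predicted,
                      const std::string& original_author) {
    return predicted != original_author;
}

// Forgery additionally targets a specific victim author: it succeeds only if
// the system attributes the modified code to that target (who, per our threat
// model, is not the original author).
bool forgery_succeeds(const std::string& predicted,
                      const std::string& original_author,
                      const std::string& target_author) {
    return predicted == target_author && target_author != original_author;
}

Note that forgery_succeeds implies masking_succeeds, which is why a successful forgery is also a successful mask: in the five-author example from Section 1, a forgery targeting author C succeeds only on the label C, while a mask succeeds on any label other than A.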

Actors. To fully understand the threat model, we define several additional parties.

– Adversary: the party seeking to create a forgery or mask and thwart the attribution system.

– Original Author: the person who first wrote the code modified in an attack. The original author may or may not be the adversary.

– Target Author: the victim of a forgery attack. The adversary modifies code written by the original author to make the attribution system identify the target author. We envision adversaries who are not the target authors.

– Attribution System: one or more classifiers that determine the authorship of the code. Inspired by prior works, we typically use this term to refer to a machine classifier that does authorship attribution. However, as we discuss, human analysts may augment these systems.

We next elaborate on our assumptions, drawn from our goal of developing a conservative estimate of an adversary’s capabilities.

Training set. First, we assume that both the adversary and the attribution system have labeled samples of code written by the original author and by the target author (in the case of a forgery attack). Both parties may have samples from other authors, i.e., the classifier may be trained on multiple authors. We assume the training set contains no forgeries.

Classifier. We assume that the only information available to the attribution system is the programs in question: network traces, memory dumps of machines, etc., could help detect authorship but are orthogonal to our study of source code attribution.

Adversary skill. We assume adversaries with a solid, but not necessarily expert, knowledge of the language used by the authors (C/C++ in our case), and no specific training in forgery, masking, or attribution. We think it advantageous to begin with the simple case (here, experienced but not expert programmers): if even these adversaries can fool a state-of-the-art attribution system, then the attribution system will likely be vulnerable to more sophisticated adversaries, and a less sophisticated attribution system will probably be more vulnerable to the same adversaries.

Adversary’s knowledge of the classifier. We assume that the adversary lacks specific knowledge about the classifier in question, e.g., its classification algorithm, features, or the size or contents of its training set. This assumption gives the classifier the greatest power and likelihood of success even in the face of an adversary.

3.3 Goals

Informed by these motivating scenarios and qualified by the bounds of our threat model, we present our goals as a series of guiding questions:

1. Deceptibility of classifiers under adversarial input. What is a realistic, conservative estimate of the success of forgery attacks in the wild? Section 6 addresses this question using our experiments with both a state-of-the-art classifier and human analysts.

2. Observed methods of forgery creation. What modifications might adversaries make to code to create forgeries? Section 7.1 explores the forgery creation process to inform the design of future attribution systems with increased resiliency to forgery.

3. Observed methods of forgery detection and authorship attribution. What features might human analysts look for when making attribution decisions? Are these features sufficient for attribution when forgeries are not present? When forgeries are present? Could these features potentially be adopted by machine classifiers? Section 7.2 explores the forgery detection insights gleaned from human analysts.


4. Effect of forgery on machine classifier features. How do the code modifications made by forgers affect the features that a machine classifier might examine? Section 8 addresses this by comparing the machine classifier’s view of original files and their corresponding forgeries.

4 Classifier and Data

Before discussing our methodology, we give background information on the classifier we use and our data set.

State-of-the-art classifier. To facilitate the exploration of our goals, we needed a classifier against which to test forgeries. Industry attribution systems, to the extent that they exist, are proprietary, with little public knowledge of their capabilities or methods. We therefore focused on a publicly available attribution system.

A clear choice for a state-of-the-art code stylometry classifier was the classifier introduced by Caliskan-Islam et al. [5], since it achieved remarkable increases in accuracy over prior work on source code attribution. This model performs attribution on C/C++ code by extracting lexical, layout, and syntactic features, the syntactic features being derived from an abstract syntax tree of the code.
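As a rough illustration only of how such a per-file feature vector might be organized (the field names below are hypothetical examples and do not correspond one-to-one to the actual Code Stylometry Feature Set of [5]):

// Illustrative sketch of a per-file style feature vector grouped into the
// three feature families named above. The specific fields are examples only.
struct LayoutFeatures {            // formatting and whitespace habits
    double tab_indent_ratio;       // e.g., fraction of indented lines using tabs
    double blank_line_ratio;       // e.g., blank lines relative to file length
    bool   brace_on_own_line;      // e.g., dominant brace placement
};
struct LexicalFeatures {           // token-level habits
    double keyword_density;        // e.g., C/C++ keywords per token
    double avg_identifier_length;  // e.g., mean length of identifiers
};
struct SyntacticFeatures {         // derived from the abstract syntax tree
    int    max_ast_depth;          // e.g., deepest nesting observed
    double node_type_frequency[8]; // e.g., frequencies of selected AST node types
};
struct StyleFeatureVector {
    LayoutFeatures    layout;
    LexicalFeatures   lexical;
    SyntacticFeatures syntactic;
};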

We replicated Caliskan-Islam et al.’s setup with the open-source version of the classifier available at [7] and followed their directions to set it up using the Code Stylometry Feature Set. As they did, we used Weka’s implementation of a Random Forest with 300 trees.

Dataset. For consistency, we used the same dataset as Caliskan-Islam et al. [5]: Google Code Jam (GCJ) [11], a programming competition. GCJ invites participants to write code to solve a series of problems. GCJ data is well-suited for code stylometry and authorship attribution work because each program is guaranteed to be single-author and the dataset provides a body of functionally equivalent programs as solved by different authors. However, the programs differ from much code in the wild because there are no style guidelines and the code is not meant to be maintained or used over long periods, unlike much professionally developed software. We used GCJ despite this difference because it is consistent with previous work and provided our participants with functionally equivalent programs from which to learn authors’ styles.

We used C code because the classifier is built to attribute C code. The dataset consisted of 8661 programs from 3940 authors (an average of 2.2 programs each), but we primarily used code from the five authors with the most files: 214 files in total, or an average of 42.8 files each.

            C5     C20     C50
Precision   100%   87.6%   82.3%
Recall      100%   88.2%   84.5%

Table 1. Precision and recall for our classifiers, calculated using 10-fold cross-validation.

Data Modification. The forgery creation study aimed to understand the methods and capabilities of potential adversaries. A secondary goal was respecting participants’ time. Pilot experiments showed that participants created forgeries by making many tedious typographical modifications and then losing interest and motivation.

To encourage participants to focus on other modifications, we normalized typographical style to a large extent by running all code through a code linter, astyle [2], a simple static analyzer and code beautifier. By doing this, we support our goal of understanding sophisticated methods of forging and attribution rather than methods that would be obliterated by a linter.

Our Classifiers. We created a suite of three training sets on which this classifier achieved high accuracy in order to (1) give the classifier an advantage where possible and realistic and (2) allow us to discuss general trends over training sets.

The three training sets differ by the number of authors in the sets: 5, 20, and 50. We name them accordingly as C5, C20, and C50. All classifiers include data from the five authors with the most files; additional authors are chosen randomly from those with at least five files (to ensure sufficient training data). We chose the five most prolific authors because we wanted to give the classifier as much training data as possible in order to give it an advantage over the forgers. Table 1 gives the precision and recall of each classifier.

Although we mirrored Caliskan-Islam et al.’s methodology until this point, here we differ. Caliskan-Islam et al. constructed training sets with equal amounts of code from each author in order to investigate how the amount of training data affects attribution success; for our goals, we wanted a classifier with as high accuracy as possible in order to obtain a conservative estimate of adversarial success. We achieve this by not withholding training data from any author and by creating forgeries of the five most prolific authors. We did not control the number of training files per author, as downsampling would not improve classification accuracy or fairness.

Additionally, we remove all programs that we gaveparticipants as a possible starting point for forgeries,


a small subset of all code by each author. We remove these files from the training sets because these original files are similar to the forgeries created.

5 Methodology

This section explains our design decisions for the studies and presents a summary of participants’ demographics.

We refer to the author or authors that we asked a participant to focus on as ‘X’, ‘Y’, and ‘notX’. Recall that there are five possible authors, A-E, so there are five possible values for each of X and Y. ‘notX’ takes on the value of the other four authors; for example, if ‘X’ is author B, then ‘notX’ is the combination of A, C, D, and E. Additionally, participants in the forgery creation study compared different author pairs: some compared authors A and E while others compared B and C (and so on).

Running studies that explored all possible combinations of authors and participant skill with the number of choices of code to forge and attribute would have proven prohibitively large. Given the lack of prior work on code style forgery, our intentions were two-fold: (1) develop a conservative estimate of adversarial capabilities by measuring forgeries against a state-of-the-art machine classifier, and (2) qualitatively analyze the methods used to forge and attribute, as well as the effects they had on the features extracted by the machine classifier.

We designed and conducted both studies with the ethical considerations required when working with human participants. Our institution’s human subjects institutional review board approved both studies. Neither study asked participants to incur any risk greater than normal while programming or reading code.

5.1 Forgery Creation Study Design

In our forgery creation study, we conducted sessions with 28 programmers, who produced 29 forgery attempts (one participant misunderstood the instructions and created two forgeries). We excluded two forgeries because they differed dramatically from the original code in functionality, which was explicitly against the instructions. Thus, we have 27 total forgeries.

The tasks in this study were:

1. Train. Given code from two authors, X and Y (chosen from authors A-E), compare functionally equivalent code by both authors to learn the authors’ styles. Participants were not told how to think about style, but they did know that subsequent tasks would ask them to attribute and forge.

2. Attribute. Attribute 10 new samples of code to either X or Y. We asked participants to (1) give an attribution decision, (2) indicate their confidence on a scale of 1-10, and (3) briefly state their reasoning.

3. Forge. Modify code written by one author to look like it was written by the other author. We asked participants to imagine the best possible classifier: human, machine, or some combination thereof. Participants chose whether to forge X’s style or Y’s. In this task, we learn the level of success after participants think they have made enough transformations to fool a classifier, i.e., the capabilities of an adversary with no information about the classifier.

4. Forge with Oracle. Test the forgeries from the previous step against the classifiers. Participants then continued to modify their forgeries until all versions of the classifier produced the target misclassification, or until they chose to stop. The purpose of this task was to encourage participants to continue making transformations if their first forgeries did not fool all the classifiers.

5.2 Forgery Detection Study Design

Our forgery detection study used the forgeries created in our forgery creation study (Section 5.1) to explore both the success of programmers as attribution analysts and the methods used to successfully detect forgeries.

This study had 21 participants. We gave the participants no specific instructions concerning attribution or forgery detection, since we did not want to bias their attribution methods with our preexisting knowledge of the forgeries. At first, participants were not aware that this study was about forgeries.

Participants completed the following tasks, in which they acted as an analyst looking to detect forgeries of a certain author, X; X varied between authors A-D (we did not receive enough forgeries of E):

1. Train. Given code from the same five authors used in the forgery creation study, with four of the authors combined to be ‘notX’ and the remaining author as ‘X’, learn the style of author ‘X’. As in the forgery creation study, there were two examples of each problem, and there were no forgeries in the training set. Participants did know that there were multiple authors in the ‘notX’ category. We trained participants on one author versus all the others, a departure from the training phase in the forgery


creation study, because we thought it would be easier for participants completing these tasks to focus on only one author’s style, i.e., the one being forged.

2. Simple attribution. Given 12 new samples of code, answer the question “who wrote this code?” With possible answers ‘X’ and ‘notX’, random guessing would be expected to achieve 50% accuracy. Four of the 12 samples were written by X, and eight were written by one of the notX authors. Of those eight, four were forgeries of X, and four were not. We count a ‘notX’ label on a forgery of X as correct, and an ‘X’ label as incorrect, since the task was to identify the person who wrote the code.

3. Attribution with knowledge of forgery. Attribute the same set of 12 samples again, with the knowledge that some may be forgeries. Possible answers were ‘X’, ‘notX’, and ‘forgery of X’. In this task, random guessing would be expected to achieve 44% accuracy, as there are two correct labels for a forgery of X: both ‘notX’ (the original author) and ‘forgery of X’ (a forgery detection) are correct, since either would be a win for a classifier. (With uniform guesses over the three labels, the expected accuracy is (4·1/3 + 4·1/3 + 4·2/3)/12 ≈ 44%.)

4. Find the forgery. Given 3 or 4 pairs of functionally equivalent programs side by side, one written by author X and the other a forgery of author X, decide which is the forgery.

5.3 Recruitment and Demographics

Recruitment. For both studies, we required participants to self-report a good working knowledge of C/C++. As we elaborate below, participants in the forgery creation study were a mix of undergraduate Computer Science students and current or former professional software developers. Participants in our forgery detection study were primarily upper-level undergraduates and graduate students in Computer Science. No one participated in both studies.

Demographics. Of the 28 forgery creation participants, 10 were students, 16 were professional computer scientists, and two were unemployed or had an occupation unrelated to computer science. Of the 21 forgery detection participants, 14 were students and three were professional computer scientists.

We also asked participants to report experience as a professional C/C++ programmer. Although the forgery creation participants were, in general, more experienced with programming, we can compare the relative skill levels of the two groups at attribution specifically by comparing their performance on the two attribution tasks: the attribution task in the forgery creation study (Step 2 in Section 5.1) was nearly identical to the simple attribution task in the forgery detection study (Step 2 in Section 5.2), if we count attribution decisions on only non-forgeries. Despite their demographic differences, both groups of participants attained high accuracy on these tasks, suggesting that they were similarly skilled at thinking about programming style.

6 Quantitative Results

We now turn to our results regarding the capabilities of both the participants in the forgery creation study (the forgers) and the participants in the forgery detection study (the analysts). Broadly, we find that the forgeries created by our participants, who had no specific training in forgery creation, are largely successful against machine classifiers and human analysts. However, when analysts are told to look for forgeries, forgery success decreases rapidly, suggesting that future, robust attribution systems should learn from the strategies that the human analysts developed. In this section, we refer to success as the success of the forgery attacks, not the success of the classifier.

6.1 Deceptibility of machine classifier

We find that the current state-of-the-art machine classifier [5], though it achieves high accuracy on large numbers of programmers, is not robust to forgery attempts. The ability of adversaries to defeat this classifier is perhaps unsurprising given that the classifier was not designed to be robust against adversaries. However, we find it informative to study whether it actually fails with adversarial input in practice and (later) to evaluate both how people created those forgeries and how the forgeries affected the machine classifier features, in order to inform the design of future, adversarially robust classifiers.

Attack type   Initial attack success   Final attack success
Forgery       61.1%                     70.0%
Masking       73.3%                     80.0%

Table 2. The percentage of attempts that were a successful attack, for the forgers’ initial and final attempts.


Attack type   C5      C20     C50     Average
Forgery       66.6%   70.0%   73.3%   70.0%
Masking       76.6%   76.6%   86.6%   80.0%

Table 3. Breakdown of attack success of final attempts against the machine classifiers.

Initial forgery attempts. Table 2 shows the percentage of forgery attempts that yielded successful forgeries or masks against the machine classifiers. On the forgers’ first try, 61.1% of forgery attempts were successful forgery attacks, and 73.3% of forgery attempts resulted in successful masks. (Recall that forgery is a subset of masking, so successful mask attempts include all successful forgery attempts.)

Final forgery attempts. Table 3 shows that 70.0% of final forgery attempts were successful, and 80.0% of final attempts were successful as masking attacks. That is, for 80.0% of final forgery attempts, the classifier did not identify the original author, meaning that the average accuracy of the classifier on the forgeries was 20.0%.

Training data variation: number of authors. The percentage of successful attacks increases with the number of authors in the classifier’s training set; in our data, the percentage of successful forgery attacks increases from 66.6% on C5 to 73.3% on C50. Similarly, masking attacks increase from 76.6% to 86.6% on the same classifiers. We expect that as the number of authors in the classifier’s training set rises, the ease of masking would continue to rise, while the ease of forging would depend on the strength of the classifier’s model of the target author.

Training data variation: presence of the unmodified originals. Recall, from Section 4, that the training sets did not contain the original files from which the forgeries were created. We also tested the forgeries against three classifiers trained on the same authors, but with those original files added, so that the classifiers have access to each author’s entire body of work. Intuitively, forgeries have a lower level of success when attacking these versions of the classifier: forgery success is 65.6% overall, and masking success is 73.3% (down from 70.0% and 80.0%, respectively). This implies that the classifier has a better chance of recognizing the original author if the actual original file is in its training set.

6.2 Success against human analysts

Task                                      Forgery attack success
Simple attribution                        56.6%
Attribution with knowledge of forgery     23.7%
Find the forgery                          53.1%

Table 4. The percentage of forgeries that were a successful attack in each phase of the forgery detection study.

The results of our forgery creation study, detailed in Section 6.1, reveal that an adversary can deceive the state-of-the-art machine classifier. Our forgery detection study finds that while our participants do not spot forgeries when given no information at all, they can develop successful forgery detection strategies without examples of forgeries or instructions about forgery creation. As above, success refers to the forgery attack success, not the success of the participants.

Recall, from Section 5.2, that in the forgery detection study, participants had no knowledge of the forgeries in the simple attribution task, and later knew that there might be forgeries present in the next task, the attribution-with-knowledge-of-forgery task. The final task, find-the-forgery, presented participants with two functionally equivalent programs and instructed them to determine which one was the forgery. Table 4 summarizes the rate of success of forgery attacks in each of the attribution tasks.

We find that attack success decreases dramatically in the second task, after participants are told about the possibility of forgeries. We also find that humans and machines are not always fooled by the same forgeries.

Simple attribution task results. In the simple attribution task, the forgeries fooled the humans 56.6% of the time, meaning that the participants’ accuracy on forgeries was statistically equivalent to random guessing. This indicates that in this task, when participants had no reason to expect that some of the programs were forgeries, they were often fooled by the forgeries (as was the machine classifier). The attribution accuracy on non-forgeries attained by the participants, 92.7%, is high.

Attribution-with-knowledge-of-forgery task results. In the attribution-with-knowledge-of-forgery task (Step 3 in Section 5.2), forgery attack success decreases drastically, to 23.7%. That is, participants were able to correctly detect 76.3% of forgeries. This dramatic drop in forgery success shows that when analysts are made aware of the possibility of forgery, they are more likely to accurately detect or attribute forgeries. However, although many forgeries were correctly detected, a number of code samples attributed as forgeries were not


actually forgeries: where previously the analysts correctly attributed non-forgeries with 92.7% accuracy, they only achieved 76.2% accuracy on non-forgeries in this task. This can be attributed to increased suspicion about inconsistencies in every program, as participants noted.

Find-the-forgery task results. The third and final task of the study presented participants with two functionally equivalent programs, one written by X and the other a forgery attempt of X solving the same problem. Success on this task was 53.1% overall, statistically equivalent to random guessing (50%). However, the results were not random: some forgeries consistently fooled the humans while others tended to be detected. Furthermore, although some forgeries in this task fooled the participants, they were not strictly the same forgeries that fooled the machine classifiers, or that fooled the participants in the previous task.

6.3 Lessons

Our results enable us to draw the following lessons regarding the capabilities of forgers and analysts:

1. The current state-of-the-art attribution classifier [5] is not robust to adversarial stylometry attacks, even when those attacks are created by non-experts. This suggests that other source code attribution classifiers may be vulnerable to the same type of attacks.

2. Training sets matter: classifiers trained on fewer authors are more robust, and classifiers that have the original, unmodified files in their training sets are more robust. This suggests having a variety of training sets on which to test new code.

3. Augmenting attribution systems with human analysts, though expensive, may increase the success of the overall system’s forgery detection and attribution capabilities. Attribution analysts are subject to the same attacks that machine classifiers are, but they may be sensitive to different attack strategies.

4. Telling the analysts about the potential for adversarial input was enough to significantly increase forgery detection, though the increased suspicion caused some non-forgeries to be marked as forgeries. Attribution systems employing human analysts should consider how to train analysts to increase the rate of forgery detection without decreasing attribution accuracy.

5. Some forgeries fooled all the machine classifiers and many or all of the humans. Others fooled all the machine classifiers and none of the humans, and others still fooled many or all of the humans but none of the machine classifiers. Forgeries that fool all the classifiers, human and machine, are the best forgeries. However, the small number of forgeries that fool either humans or machines but not both suggests two things: (1) that as the quality of a forgery declines, it may fool only some of the classifiers, but we cannot say which ones; and (2) that some forgeries are easier for different classifiers or people to recognize.

7 Qualitative Results

To understand how to improve forgery detection, we first analyze forgeries to understand the changes forgers made. Second, we analyze the participants’ notes from the attribution phases in both studies to understand how they thought about author style, attribution, and forgery detection.

7.1 Forgery Creation

Implementation involves three types of decisions: typographic, control structure, and information structure [12, 18]. We further divide modifications into local, or implementation-level, changes and higher-level algorithmic changes. These categories exist on a continuum, but we define them roughly as follows: if one were implementing a program from pseudocode, a local modification would not be visible in the pseudocode, whereas an algorithmic change would be a departure from the pseudocode. For example, imagine pseudocode for a nested loop, along with the accompanying source code. In the source code, modifying the type of loop, e.g., for to while, would be a local control flow modification, since it does not affect the pseudocode even though control flow keywords are modified. Abstracting the inner loop into a helper function would be an algorithmic control flow modification, since it would affect the underlying pseudocode.
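As a concrete illustration of this distinction, consider the following fragments (ours, not drawn from the study data; they assume m, rows, cols, and row_sum are declared elsewhere):

/* Original: sum each row of a matrix using nested for-loops. */
for (int i = 0; i < rows; i++) {
    row_sum[i] = 0;
    for (int j = 0; j < cols; j++) {
        row_sum[i] += m[i][j];
    }
}

/* Local control flow modification: the outer for-loop becomes a while-loop.
   The pseudocode ("for each row, add up its entries") is unchanged, but
   control flow keywords are modified. */
int i = 0;
while (i < rows) {
    row_sum[i] = 0;
    for (int j = 0; j < cols; j++) {
        row_sum[i] += m[i][j];
    }
    i++;
}

/* Algorithmic control flow modification: the inner loop is abstracted into a
   helper function, which changes the underlying pseudocode. */
int sum_row(const int *row, int cols) {
    int s = 0;
    for (int j = 0; j < cols; j++) s += row[j];
    return s;
}

for (int i = 0; i < rows; i++) {
    row_sum[i] = sum_row(m[i], cols);
}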

Type of change          Total modifications   Forgers who made this change
Var. names               168                   25
Lines of code copied     113                   23
Libraries imported        53                   22
Indent scheme           1024                   19
Var. decl. location       85                   19

Table 5. Common modifications made by forgers


We find that the modifications made to create successful forgeries are overwhelmingly local control flow changes, local information structure changes, and typographical changes, and that the vast majority of participants did not make algorithmic modifications at all.

The current state-of-the-art classifier [5] achieves high accuracy on non-adversarial input. However, our participants, without knowing what features the classification algorithm was using, modified code in such a way that they were able to fool the classifiers using only these local modifications, suggesting that local code features are not sufficient for an attribution system that may receive adversarial input. Section 8 explores how these local code modifications affect the machine classifier features.

Next, we detail the types of changes made to forgeries, citing examples from the studies. First, we present an overview of the most common types of modifications; then we explore specific changes using the examples shown in Figures 1, 2, 3, and 4. Appendix A contains the entire list of modifications.

7.1.1 Most Common Changes

Table 5 shows the most popular modifications, ranked by the number of participants who made at least one such modification. Changing variable names, variable declaration location, and libraries imported are local information structure modifications (Section 7.1.3), and modifying the indentation scheme is a typographical change (Section 7.1.2).

Copying entire lines of code written by the target author in the training set may be either a control flow or an information structure change: often, participants copied the logic of the main loop (a control flow modification) and variable declarations for variables used globally or at a high level in the main function (an information structure modification), suggesting that participants considered file- or function-level variables and loop logic and structure an essential marker of style.

7.1.2 Typographical Changes

Recall from Section 4 that the code given to participants had been normalized typographically to a large extent. Many participants still made some typographical changes, though some typographical modifications may not have been intentional if participants were working in an IDE that enforced a certain style. There is no way to determine which modifications were not deliberate, so we include all modifications in our analysis.

[Original code]

for(i=0;i<a1-1;i++) printf("%d",r1[i]);
printf("%d\n",r1[i]);
for(i=0;i<a2-1;i++) printf("%d",r2[i]);
printf("%d\n",r2[i]);

[Modified code (forgery)]

for (i = 0; i < cmx - 1; i++)
{
    printf("%d ", curmx[i]);
}
printf("%d\n", curmx[i]);

for (i = 0; i < cmy - 1; i++)
{
    printf("%d ", curmy[i]);
}
printf("%d\n", curmy[i]);

Fig. 1. Original and modified code, showing typographical and information structure modifications

The most common typographical modifications were the usage or location of brackets and the indentation scheme, while less common modifications were comments and spacing between operators. Figure 1 illustrates some of these modifications. In this original-forgery pair, the forger added brackets, newlines, and spaces between operators to transform the code to look more like the target author’s code. These modifications make the code look significantly different at first glance, but in reality these modifications could be easily automated with a code linter.

7.1.3 Information Structure Changes

Information structure changes cover variables, syntax, and data flow. Local modifications include changing the variable name or declaration location, ternary operator usage, and macro addition and removal. Algorithmic changes, which no participants made, would include modifications to the variable type, data structure usage, and static and dynamic memory usage.

Nearly all participants modified variable names and the location of variable declarations, typically either from or to a global variable, or inside a for-loop instantiation. Participants also frequently added or removed macros and included libraries (more often adding than removing), and swapped equivalent API calls, typically for I/O. Less common modifications were syntactic, such as ternary operator usage, the increment/decrement operator, and array indexing.


[Original code]

43 int main()
44 {
45     int i,j,k;
46     int cc,ca;
47     cin >> ca;
48     for(cc=1;cc<=ca;cc++)
49     {
50         cin >> D >> I >> M >> N;
51         for(i=0; i<N; i++)
52             cin >> original[i];

[Modified code (forgery)]

 1 #define REP(i,a,b) for(i=a;i<b;i++)
 2 #define rep(i,n) REP(i,0,n)

...
41 int main()
42 {
43     int i,j,k;
44     int size, count = 0;
45     scanf("%d", &size);
46     while (size--)
47     {
48         scanf("%d%d%d%d", &D, &I, &M, &N);
49         rep (i,N) scanf("%d", original+i);

Fig. 2. Information structure and control flow modifications

The transformation in Figure 2 illustrates changes to variable names and API calls (specifically, this occurred with I/O functions), and the addition of macros used by the target author. Figures 1 and 2 show modified variable names.

7.1.4 Control Flow Changes

Control flow modifications relate to the path of execution in a program. Local modifications include modifying the loop type, modifying loop logic (e.g., changing an increment to a decrement), putting multiple assignments on one line, breaking up a complex if-statement, and adding or removing control flow keywords. Algorithmic modifications include refactoring functions and any major addition or removal of control structures like loops and conditionals.

The majority of the control flow changes to forgeries are implemented as local modifications. Most common are changes to loop type or logic, breaking up an if-statement, and the addition or removal of control flow keywords (with minimal modifications to the surrounding algorithm). We do find some algorithmic changes, though they are much less common than the local modifications: some participants added or removed inlined versions of memset or common math functions. One participant undertook a major refactoring of a helper function; we highlight this unusual but impressive series of modifications after discussing local modifications.

 8 int judge(int m) {
 9     int j,k;
10     for(j=0;j<m;j++) for(k=0;k<m;k++) if (b[j][k]!=-1) {
11         if (b[k][j]==-1) b[k][j]=b[j][k]; else if (b[j][k]!=b[k][j]) return 0;
12         if (b[m-1-j][m-1-k]==-1) b[m-1-j][m-1-k]=b[j][k]; else if (b[m-1-j][m-1-k]!=b[j][k]) return 0;
13         if (b[m-1-k][m-1-j]==-1) b[m-1-k][m-1-j]=b[j][k]; else if (b[m-1-k][m-1-j]!=b[j][k]) return 0;
14     }
15     return 1;
16 }
17
18 int cal(int m) {
...
21     memset(b,0xff,sizeof(b));

Fig. 3. Original code: algorithmic changes. The forger later refactored the function judge

Control flow: local changes. Figure 2 illustrates local control flow modifications. Most prevalent is the addition or removal of control flow keywords, like break, continue, and assert. This may be because it is often trivial to add or remove them by locally modifying the logic.

The modified code snippet in Figure 2 begins by adding macros for for-loops that the imitated author frequently used. Then, the loop type of the inner for-loop (original line 51, modified line 49) is changed to use the macro. The modified version also removes a newline so that both the loop initialization and body are on the same line, without brackets. The loop type of the main loop (original line 48, modified line 46) is changed from a for-loop to a while-loop. The logic of the main loop is also modified: in the modified version, the counter decreases; in the original version, the counter increases. The input functions at original lines 47/50 and modified lines 45/48 have been changed to use scanf instead of cin. Finally, some entire lines (44-46) in the modified version have been copied from programs written by the target author.

Control flow: algorithmic changes. Control flow changes that modify the underlying algorithm reflect a deeper understanding of both the original and target programmers’ thought processes and goals. Only a few participants made these types of changes, for example, refactoring a function, replacing a goto with a non-trivial series of conditionals, or adding or removing inlined versions of common math functions. Figures 3 and 4 show an example of inlining: the participant removes a call to memset (original line 21) and replaces it with an inlined version (modified line 41), a double loop over the 256x256 array that sets every element, replacing the memset in the original version. This type of modification occurs in a few other forgeries and also signifies a more significant change to the path of execution, though it requires only a local understanding of the code.


 2 #define REP(i,a,b) for(i=a;i<b;i++)
 3 #define rep(i,n) REP(i,0,n)
   ...
10 int copy_maybe(int x, int y, int c, int d) {
11   if (b[x][y]==-1) {
12     b[x][y] = b[c][d];
13   }
14   else
15   {
16     if (b[x][y] != b[c][d]) return 0;
17   }
18   return 1;
19 }
20
21 int judge(int m) {
22   int j,k,id,ie,r;
23   rep(j,m) rep(k,m) if (b[j][k]!=-1) {
24     r = copy_maybe(k,j,j,k);
25     if (r == 0) return 0;
26     id = m - 1 - j;
27     ie = m - 1 - k;
28     r = copy_maybe(id,ie,j,k);
29     if (r == 0) return 0;
30     r = copy_maybe(ie,id,j,k);
31     if (r == 0) return 0;
32   }
33   return 1;
34 }
35
36 int calculate(int m) {
   ...
41   rep(o,256) rep(p,256) b[o][p] = MAX;

Fig. 4. Algorithmic changes to a forgery: the forger refactored some logic into a new function, copy_maybe.

significant change to the path of execution, though it requires only a local understanding of the code.

A more unusual, higher-level algorithmic modification in Figures 3 and 4 is the addition of a helper function, showing deep analysis of the algorithm. To create this forgery, the participant made significant changes to the function judge, primarily by adding temporary variables and abstracting away much of the logic in the complex if-statements in the original code (lines 11-13) to a new helper function, called copy_maybe. These changes would require a forger to understand the function as a whole. This is the only forgery that shows a forger delving slightly deeper into the design choices made by both the original and target authors. The participant who created this forgery wrote that "it required understanding the code more deeply than just at a stylistic level and trying to rewrite it from scratch."

7.2 Features Used for Attribution

Through analysis of our study results, we find that analysts focus primarily on local implementation details instead of algorithmic decisions when making attribution decisions, just as the forgers focused on these when creating forgeries. That is, participants overwhelmingly use the same set of features when trying to detect forgeries, but augment this set with higher-level reasoning about stylistic inconsistencies and the forgeability of features.

These can be roughly divided into inconsistencies regarding the presence of a feature and inconsistencies regarding the use of a feature. For example, imagine an author typically uses the variable name aa in their main loops, as seen in the training set. In a forgery, a presence inconsistency would be the lack of a variable named aa. A usage inconsistency would be a variable named aa used in a helper function and not in the main loop. Noticing a usage inconsistency requires the analyst to have a greater understanding of the author's style.
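Returning to the hypothetical author who names their main-loop counter aa, the sketch below illustrates the difference between the two kinds of inconsistency; the program and function names are invented for illustration and are not taken from our study data.

#include <cstdio>

// Target author's habit (as seen in the training set): the counter in the
// main loop is named 'aa'.
int sum_main_loop(const int *v, int n) {
    int total = 0;
    for (int aa = 0; aa < n; aa++) total += v[aa];   // 'aa' used in the main loop
    return total;
}

// Forgery with a *presence* inconsistency: no variable named 'aa' at all.
int sum_presence_inconsistent(const int *v, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) total += v[i];
    return total;
}

// Forgery with a *usage* inconsistency: 'aa' appears, but only as a helper
// function parameter rather than as the main-loop counter, which is not how
// the target author uses it.
static int add_helper(int total, int aa) { return total + aa; }
int sum_usage_inconsistent(const int *v, int n) {
    int total = 0;
    for (int i = 0; i < n; i++) total = add_helper(total, v[i]);
    return total;
}

int main() {
    int v[] = {1, 2, 3};
    printf("%d %d %d\n", sum_main_loop(v, 3),
           sum_presence_inconsistent(v, 3), sum_usage_inconsistent(v, 3));
    return 0;
}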

Presence of features. The inconsistent inclusion or exclusion of a feature indicative of the target author led participants to label code a forgery. Although noting these inconsistencies sometimes resulted in a correct label, it was not a reliable technique, as we discuss below.

A common reason for a correct forgery label was inconsistencies in spacing and bracket use. The success of these low-level typographic inconsistencies (in combination with the other features) may be attributed, to some extent, to the fact that style in the training set was largely normalized, but we made no normalizations to the forgeries because they reflect the forgers' idea of what will fool a classifier. So, if the forger did not attempt to imitate typographic style at all, the participant may have discerned typographical differences.
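The following sketch (again invented, not a study artifact) shows the kind of purely typographic divergence participants noticed: the two functions are identical in logic but differ in bracket placement, indentation, and spacing around operators.

// Hypothetical illustration of purely typographic differences. Both functions
// compute the same result; only bracket placement, indentation, and spacing
// differ -- the low-level inconsistencies that participants keyed on.
#include <cstdio>

int max3_original(int a, int b, int c)
{
    int m = a;
    if (b > m) { m = b; }
    if (c > m) { m = c; }
    return m;
}

int max3_forged(int a,int b,int c){
  int m=a;
  if(b>m) m=b;
  if(c>m) m=c;
  return m;
}

int main(){ printf("%d %d\n", max3_original(1,5,3), max3_forged(1,5,3)); return 0; }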

Sometimes, however, typographical inconsistencies led to an incorrect label: participants wrote, about non-forgeries, that “everything matches except there is a for loop missing its braces”; and “the discrepancies, particularly the multiple for-loop statements on the same line, make me think it’s a forgery.”

Other participants justified the forgery label with anomalies in the code, or features that match or might have been copied from the target author’s training set: “The use of unsigned long is something B never did in the training data however the overall structure of the code looks like it could be from person B. Maybe this person didn’t understand the difference between unsigned long and long and blindly copied over from code that used unsigned long’s?”

Conversely, participants labeled as non-forgeries code that contained features consistently present in the training set. One participant wrote that the code “uses _ and __, asserts, has no inconsistencies with D’s style”, and assigned an ‘X’ label to a forgery of X. In both this example and the previous one, participants who based attribution decisions on the inconsistent presence of features assigned incorrect labels.


Forgeability of features. Some participants reasoned about the fact that typographical and local features would be easy to forge but still falsely decided the code was not a forgery: “For me this looked like D because of the asserts and curly braces, but it would be pretty easy to add those things in an attempted forgery....” Similarly, another participant noted that simple macros would be trivial to add but assigned a non-forgery label because of the lack of unused variables. This shows that participants thought about the process of forgery creation (that is, that the addition of local and typographical features is simple) but sometimes still came to the wrong decision.

Usage of features. Some participants assigned correct forgery labels because they noticed features not used as the target author had used them. Higher-level features that participants used to detect forgeries include differences in the modularity of the code, the use of helper functions, and lines that appear in the training set, all of which are inconsistencies in the usage of features.

A participant wrote: “the use of underscores as variable names suggests D, but normally D would only do that if the function is very short and delegates all the work to another helper function. Here, underscores are used in a very long loop, and somewhere in the middle an underscore variable is used. This seems unlike D’s style. Also, assert.h is included but there are no asserts used, which seems like a sloppy imitation of D.”

This participant noted three simple features (variable names, function length, and libraries) and combined information about higher-level indicators (like typical variable names) with variable usage and information flow to discover inconsistencies in the usage of certain elements.

7.3 Lessons

Stepping back, we are now positioned to draw some key implications and lessons regarding the potential tactics and methods of forgers and analysts:

1. When creating forgeries, forgers may copy code from other programs written by the target of the forgery. This code, while visible in the source, might not actually be executed when the program is run. Therefore, an adversarially resilient attribution system might use a software analysis tool to determine which portions of code are reachable or unreachable, and then treat these regions of code differently (a hypothetical example of such copied but unreachable code appears in the sketch following this list).

2. The vast majority of modifications made to create forgeries required only a local understanding of the code, and were made by forgers who had no knowledge of the features that the classifier was using for attribution. This suggests that forgeries in the wild, to the extent that they exist, might contain the same types of local modifications.

3. When told about forgeries, but without being shown examples, some participants developed reliable methods of forgery detection. These arose from the participants first considering how forgeries might be created and then developing an understanding of the target author's style at a higher level than the modifications that the forger made. This suggests that attribution system designers should aim to develop a high-level understanding of authorship style.

4. Taken together, the previous two lessons suggest that adversarially resistant attribution systems might incorporate algorithmic features which require a deeper understanding of the target author's engineering process and programmatic goals.
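The sketch below, referenced in lesson 1, is a hypothetical example of copied but unreachable code; the program and function are invented for illustration and are not drawn from our study data.

// A forger copies a distinctive helper function from the target author's
// programs, but nothing in the forged program ever calls it. A reachability
// analysis (e.g., building a call graph and marking functions reachable from
// main) could flag 'gcd_copied' as dead code and weight its stylistic
// features differently.
#include <cstdio>

// Copied verbatim from the target author's code; never called below.
static int gcd_copied(int a, int b) { return b ? gcd_copied(b, a % b) : a; }

int main() {
    int x, y;
    if (scanf("%d %d", &x, &y) != 2) return 1;
    printf("%d\n", x + y);   // the forged program's real (reachable) logic
    return 0;
}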

8 Discussion

To extend the results in Sections 6 and 7, we analyze the features modified by the forgers, and then summarize our recommendations for a more robust classifier.

8.1 Machine Classifier Feature Analysis

To better understand the high rate of forgery success against the machine classifier, we explore the machine classifier's features. Each forgery has an associated original file that the forger modified to create the forgery; we compare the features and analyze how the modifications made by the forgers affected the features that the machine classifier extracted. In this section, we are specifically discussing machine classifier features, not 'features' used by the forgers or analysts.

Overview of classifier features. Caliskan-Islam et al.'s [5] classifier extracts features from an abstract syntax tree (AST) representation of the code. Leaf nodes of the AST are unigrams of code: variable names, API calls, strings, and literals (as well as C keywords). AST nodes are snippets of the syntactic structure of code, such as 'AdditiveExpression' or 'ForInit'. The classifier


Fig. 5. A proportional view of the average number of features changed between the original and the forgery. The large outer circles represent the original (left) and forgery (right) files. The size of each circle is proportional to the number of features that the machine classifier extracts from that file. The overlap in the middle represents features that the machine classifier extracted from both the original and the forgery. The non-overlapped section of the left circle is 49.4% of its area, and the non-overlapped section of the right circle is 40.2% of its area. The small circle in the overlapped section represents the small proportion of features that the machine classifier extracted from both files and whose values did not change in either file: it is 43.0% of the overlapped portion.

traverses the AST to extract AST unigrams and bigrams. For an in-depth explanation, see [5].
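As a rough illustration of this feature space (our own simplified rendering, not the classifier's exact output), the comments in the following snippet list the kinds of leaf unigrams, AST unigrams, and AST bigrams such an extractor might derive from a small loop; the node names are illustrative only.

// A small C++ snippet and, in comments, a simplified sketch of the kinds of
// features an AST-based extractor in the style of [5] might produce for it.
#include <cstdio>

int sum_to(int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {   // leaf unigrams: 'total', 'i', 'n', '0', 'int', 'for'
        total += i;                 // AST unigrams: ForInit, Condition, IncDecOp,
    }                               //               AssignmentExpr, RelationalExpression
    return total;                   // AST bigrams:  ForInit IdentifierDeclStatement,
}                                   //               ExpressionStatement AssignmentExpr

int main() { printf("%d\n", sum_to(5)); return 0; }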

On average, for classifier C5, 55.0% of features include a variable name, API call, macro, or literal. By making small, local changes to only variable names, macros, literals, or API calls, forgers had access to over half of the features. Recall from Section 7.1 that swapping variable names was the most common modification that forgers made, and that they also commonly modified macros and API calls: these modifications had the potential to alter half the features extracted from the file.

Overview of changes to features. Figure 5 shows an overview of the proportion of features shared between the original and the forgery. On average, 49.4% of features extracted by the machine classifier from the original file are absent in the corresponding forgery, meaning that the forgers' modifications completely removed any instances of nearly half of the features. Similarly, 40.2% of the features in a forgery do not exist in the original file. The types of features that exist in only one file are proportional to the overall feature makeup, meaning that forgers affected both a large percentage and a large variety of features: although forgers primarily made modifications to originals that did not require a high-level understanding of the authors' decisions, they were still able to affect a vast majority and variety of features, in line with the high rate of forgery success against the classifier.

The small circle contained in the overlapping middle portion of Figure 5 shows the proportion of features whose values remained the same throughout the change: 43.0% of the features that exist in both files. Combined with the features that were added to the forgery, this means that of the features in the forgery, only 23.5% have the same value as they did in the original. So, by making primarily local changes, forgers created forgeries in which 76.5% of the features have values that differ from the original.

Next, we take a deeper look at the features whose values change most dramatically. Adding or removing many instances of a feature changes its value more. For example, a forger who changes all while-loops to for-loops will increase the value of certain features (such as the AST node 'ForInit') more than a forger who changes only one while-loop.

Features that exist in only one of the files. We examine both features removed from original files and features added to forgeries. Table 6 shows the most commonly added and removed features for authors A-D (there were not enough forgeries of author E). The majority of these features are variable names or macros, reflecting the modifications that forgers made to variable names and macros.

There are also AST unigrams such as 'ShiftExpression', 'UnaryOp', 'ArrayIndexing', and 'AssignmentExpr', which are typically close to the leaf nodes and thus more susceptible to local code changes. Moreover, these AST nodes indicate low-level programming decisions, such as the use of a bitshift operator, or the choice to use array indexing rather than pointer addition to access array elements. There are only two AST bigrams that appear in these highly-changed features. 'ExpressionStatement AssignmentExpression' indicates assigning a value to a variable, perhaps because the forger broke up multiline statements. 'ForInit IdentifierDeclStatement' indicates that the forger added variable declarations immediately after for-loops, another common modification. The AST bigrams sometimes capture programmatic elements across more than one line of code, but even when they do (as with 'ForInit IdentifierDeclStatement') they do not capture the higher-level observations that participants made about the use of features rather than the presence of features.

Features that exist in both files. We now examine the features that exist in both files but have a high cumulative change in value between original and forgery files. Table 7 sorts these features by type and author.


| Feature description | Features | Authors |
| --- | --- | --- |
| Var. name, macro, & literal unigrams | ca (avg depth), cc, w, ca, pos, color, plate, best, N, N, k, t, T, cc, rr, best, j, r, D, T (avg depth), t, c, N (avg depth), rep, size, count, tar, ll, mode, in, rep (avg depth), tm, size (avg depth), r, res, lim, __, _, n, s, R, y, 2, tt, cmy, curmy, tn, MAXN, res | A, B, C, D |
| Var. name bigrams | t T, i n | B |
| API call unigrams | cin, cin (avg depth), scanf_s, scanf, assert | A, B, C, D |
| AST unigrams | ShiftExpression, MultiplicativeExpression, cc (avg depth), ShiftExpression (avg depth), AndExpression, UnaryOp, AssignmentExpr, ArrayIndexing | A, C, D |
| AST-AST bigrams | ExpressionStatement AssignmentExpr, ForInit IdentifierDeclStatement | A, B |

Table 6. Features that most often occur in either an author's original file or the forgery of that author, but not both.

| Feature description | Features | Authors |
| --- | --- | --- |
| Var. name, macro, & literal unigrams | i, j, 0, 1, 1 | all, C, D |
| AST unigrams | Argument, AssignmentExpr, ExpressionStatement, Condition, AdditiveExpression, RelationalExpression, CallExpression, IncDecOp, Callee, ArgumentList, ArrayIndexing, CallExpression, Callee, ArgumentList, IncDecOp, ArrayIndexing, CallExpression, IdentifierDecl, Callee, ArrayIndexing, IdentifierDecl, IncDecOp, ForInit | all, A, B, C, D |
| C keyword unigrams | int | all |

Table 7. Features that often occur in both a forgery and its corresponding original, with the highest cumulative change in value.

In contrast to features that exist in only one file, these contain no leaf nodes other than common variable names and literals used by most programmers. The majority of the features are AST unigrams.

Two of the AST nodes are typically relatively high in the AST: 'ExpressionStatement' and 'Condition'. A change in the value of 'ExpressionStatement', a relatively generic AST node, could indicate that the forger has expanded or contracted lines of code, or simply changed the length of the program, in part by adding or removing lines of code that are expression statements. Forgeries were commonly of different lengths than the corresponding original files. A change in the value of 'Condition' may reflect that some forgers trivially added or removed if-statements.

One of the most popular AST unigrams, 'Argument', occurs once for each argument in a function call, so its value will change if the forger uses equivalent API calls that have a different number of arguments, another common modification we observed. The rest of the AST node names are self-explanatory and reflect code modifications that occur on one line only.
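For instance, the following hypothetical sketch (not taken from a study forgery) shows how swapping stream I/O for scanf changes the argument structure at a call site, and with it the counts of nodes such as 'Argument', 'Callee', and 'ArgumentList'.

// Hypothetical sketch of one common modification: replacing stream I/O with
// scanf. The two reads are equivalent, but the scanf call contributes a
// format-string argument and a pointer argument where the stream extraction
// is represented differently in the AST, shifting feature counts.
#include <cstdio>
#include <iostream>

int main() {
    int n = 0;
    // Original style: stream extraction.
    // std::cin >> n;
    // Forged style: scanf with two arguments.
    if (scanf("%d", &n) != 1) return 1;
    std::cout << n << "\n";
    return 0;
}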

Combined with the features that occur in only one file, we observe that the features highly affected by forgers' changes are variable names and AST unigrams, typically nodes close to the leaf nodes. As shown in Figure 5, these are 76.5% of the features in the forgery. We now briefly turn our attention to the 23.5% of features that typically do not change in value between the original and the forgery.

There are only two features that occur in all original-forgery pairs and never change value: the AST-AST bigram 'FunctionDefinition CompoundStatement', and 'Max depth of AST leaf'. There is also a long tail of features that stay constant in the one file in which they appear, primarily AST-AST and leaf-AST bigrams. Along with the high rate of feature modification, this long tail indicates that the machine classifier features are not unforgeable, and that a future, more robust machine classifier should extract features that are more difficult for forgers to modify.

8.2 Lessons

Based on our analysis of how the forgers affected the machine classifier's features, we draw the following lessons.

1. Over half of the machine classifier features included variable names, macros, literals, or API calls. Because these are typically simple to change by hand, and even to automate, we suggest that future classifiers consider fewer of these features, or that these features be contextualized with their usage in the program, as successful human analysts did in Section 7.2.


2. AST bigrams can capture some information about higher-level programming decisions. Although these are not unforgeable, they may be a valuable part of a more robust classifier.

3. Combined with our finding from Section 7.3 that thinking about the forgeability of code properties helped analysts correctly label forgeries, we reiterate that the unchanged machine classifier features are not unforgeable and that the same common strategies that changed other features could modify them. This suggests that future, more robust classifiers may benefit from extracting new features that are not affected by the local modification strategies that the forgers employed, for example by including more complex AST features or by including forgeries (correctly labeled) in the training set.

9 Conclusion

We conduct two studies with C/C++ programmers in order to better understand the potential capabilities and methods of (1) adversaries creating forgery attacks, and (2) machine classifiers and human analysts trying to attribute source code which may have been modified by an adversary.

Through our forgery creation and detection studies, we find that programmers not specifically trained in attribution or forgery can, in many cases, fool a state-of-the-art source code attribution system. This work therefore provides a conservative estimate of the success of more skilled or trained adversaries.

We find that successful forgeries primarily contain modifications that required the forger to have only a local understanding of the code, and that human analysts who successfully detected forgeries understood style at a higher level than the modifications, suggesting that future adversarially robust attribution systems should additionally include features that capture high-level style.

This work additionally paves the way for follow-on research, such as research in which participants modify their own code to look like a target's style, or research in which forgeries are included in the training set.

This work, an exploration of the capabilities and methods of adversaries seeking to forge another programmer's style, is a starting point for the design of future adversarially resistant attribution systems. The designers of such systems should consider the lessons and implications detailed here, and build systems resilient to even more sophisticated forgery and masking attacks.

10 Acknowledgements

We are very grateful to Camille Cobb, Karl Koscher, Kiron Lebeck, Anna Kornfeld Simpson, Paul Vines, Karl Weintraub, and Eric Zeng for helping us pilot these studies and for their feedback on our paper drafts, and to Sandy Kaplan, Franziska Roesner, Brian Rogers, and Alison Simko for their feedback on drafts of this paper. We are also extremely thankful for Yejin Choi's early discussions on this project, and her feedback on our final paper.

We would also like to thank Aylin Caliskan and Rachel Greenstadt for their personal help setting up their classifier, and for some early conversations about research directions. Finally, a sincere thank you to our anonymous reviewers for their constructive feedback.

This work was supported in part by an NSF Graduate Research Fellowship and the Short-Dooley Professorship.

References

[1] Afroz, Sadia, Michael Brennan, and Rachel Greenstadt. "Detecting hoaxes, frauds, and deception in writing style online." IEEE Symposium on Security and Privacy. IEEE, 2012.
[2] astyle.sourceforge.net
[3] S, L. "Who Is Satoshi Nakamoto?" The Economist. The Economist, 02 Nov. 2015. Web. 28 Feb. 2017.
[4] Brennan, Michael, Sadia Afroz, and Rachel Greenstadt. "Adversarial stylometry: Circumventing authorship recognition to preserve privacy and anonymity." ACM Transactions on Information and System Security (TISSEC) 15.3 (2012): 12.
[5] Caliskan-Islam, Aylin, Richard Harang, Andrew Liu, Arvind Narayanan, Clare Voss, Fabian Yamaguchi, and Rachel Greenstadt. "De-anonymizing programmers via code stylometry." 24th USENIX Security Symposium (USENIX Security 15). 2015.
[6] Caliskan-Islam, Aylin, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt, and Arvind Narayanan. "When coding style survives compilation: De-anonymizing programmers from executable binaries." arXiv preprint arXiv:1512.08546 (2015).
[7] https://github.com/calaylin/CodeStylometry
[8] Dauber, Edwin, Aylin Caliskan-Islam, Richard Harang, and Rachel Greenstadt. "Git blame who?: Stylistic authorship attribution of small, incomplete source code fragments." arXiv preprint arXiv:1701.05681 (2017).
[9] DeMillo, Richard A., Richard J. Lipton, and Frederick G. Sayward. "Hints on test data selection: Help for the practicing programmer." Computer 11.4 (1978): 34-41.
[10] Frantzeskou, Georgia, Stephen MacDonell, Efstathios Stamatatos, and Stefanos Gritzalis. "Examining the significance of high-level programming features in source code author classification." Journal of Systems and Software 81.3 (2008): 447-460.
[11] https://code.google.com/codejam/
[12] Gray, Andrew, Stephen MacDonell, and Philip Sallis. "Software forensics: Extending authorship analysis techniques to computer programs." (1997).
[13] Hayes, Jane Huffman, and Jeff Offutt. "Recognizing authors: an examination of the consistent programmer hypothesis." Software Testing, Verification and Reliability 20.4 (2010): 329-356.
[14] Juola, Patrick. "Detecting stylistic deception." Proceedings of the Workshop on Computational Approaches to Deception Detection. Association for Computational Linguistics, 2012.
[15] Kacmarcik, Gary, and Michael Gamon. "Obfuscating document stylometry to preserve author anonymity." Association for Computational Linguistics, 2006.
[16] Kothari, Jay, Maxim Shevertalov, Edward Stehle, and Spiros Mancoridis. "A probabilistic approach to source code authorship identification." Information Technology, 2007. ITNG'07. Fourth International Conference on. IEEE, 2007.
[17] Muir, Macaully, and Johan Wikström. "Anti-analysis techniques to weaken author classification accuracy in compiled executables." (2016).
[18] Oman, Paul W., and Curtis R. Cook. "A taxonomy for programming style." Proceedings of the 1990 ACM Annual Conference on Cooperation. ACM, 1990.
[19] Pellin, Brian N. "Using classification techniques to determine source code authorship." White Paper: Department of Computer Science, University of Wisconsin (2000).
[20] Potthast, Martin, Matthias Hagen, and Benno Stein. "Author obfuscation: attacking the state of the art in authorship verification." Working Notes Papers of the CLEF (2016).
[21] Rosenblum, Nathan, Xiaojin Zhu, and Barton P. Miller. "Who wrote this code? Identifying the authors of program binaries." European Symposium on Research in Computer Security. Springer Berlin Heidelberg, 2011.
[22] Schleimer, Saul, Daniel S. Wilkerson, and Alex Aiken. "Winnowing: local algorithms for document fingerprinting." Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data. ACM, 2003.
[23] Spafford, Eugene H., and Stephen A. Weeber. "Software forensics: Can we track code to its authors?" Computers & Security 12.6 (1993): 585-595.
[24] Laskov, Pavel, and Nedim Srndic. "Practical evasion of a learning-based classifier: A case study." 2014 IEEE Symposium on Security and Privacy. IEEE, 2014.
[25] Stamatatos, Efstathios. "A survey of modern authorship attribution methods." Journal of the American Society for Information Science and Technology 60.3 (2009): 538-556.
[26] "8 Years Later: Saeed Malekpour Is Still In An Iranian Prison Simply For Writing Open Source Software." Techdirt. Accessed February 20, 2017. https://www.techdirt.com/articles/20161005/10584235719/8-years-later-saeed-malekpour-is-still-iranian-prison-simply-writing-open-source-software.shtml.

A All Modifications

In Section 7.1 we discuss the set of modifications that participants made in our forgery creation study. In this appendix (Table 8), we provide the complete list of the different classes of modifications. The table splits modifications into three categories: control flow, information structure, and typographical. See Section 7.1 for additional discussion.


| Type of change | Total number of modifications | Number of participants who made this type of change | Avg. modifications per participant who made at least one |
| --- | --- | --- | --- |
| Control Flow modifications | | | |
| Control flow keywords | 32 | 14 | 2.3 |
| Loop type | 48 | 12 | 4 |
| Conditional statements | 21 | 10 | 2.1 |
| Loop logic | 12 | 11 | 1.1 |
| Functions added/removed | 11 | 10 | 1.1 |
| Multiple assignments per line | 10 | 5 | 2 |
| API calls - usage vs. inlining | 4 | 4 | 1 |
| Information Structure modifications | | | |
| Variable names | 168 | 25 | 6.7 |
| Libraries included | 53 | 22 | 2.4 |
| Variable declaration locations | 85 | 19 | 4.5 |
| Macros | 36 | 17 | 2.1 |
| API calls - which one is used | 67 | 17 | 3.9 |
| Usage/placement of unary inc/dec operator | 10 | 8 | 1.3 |
| Array indexing vs. pointer addition | 6 | 4 | 1.5 |
| Ternary operator usage | 5 | 3 | 1.7 |
| Typographical modifications | | | |
| Indent scheme | 1024 | 19 | 53.9 |
| Brackets - location | 69 | 17 | 4.1 |
| Brackets - use | 78 | 13 | 6 |
| Spaces between operators | 628 | 12 | 52.3 |
| Commented-out code | 17 | 11 | 1.5 |
| Comments (not code) | 13 | 4 | 3.25 |
| Other | | | |
| Entire lines of code copied | 113 | 23 | 4.9 |

Table 8. All modifications made to forgeries.