Page | 1 © 2013 Fort Knox Networks, Frank Angiolelli. All Rights Reserved. Online Counterfeit Enterprise Pioneering Criminal Online Sociometry Author: Frank Angiolelli Contributions by: Eric Feinberg 12/15/2013


Page | 1 © 2013 Fort Knox Networks, Frank Angiolelli. All Rights Reserved.

Online Counterfeit Enterprise Pioneering Criminal Online Sociometry

Author: Frank Angiolelli Contributions by: Eric Feinberg

12/15/2013


Contents

Abstract
Prolific Counterfeit Enterprises
Chinese Actors
Russian Actors
Free OSP Content
Free Email Providers
Finding Victims
Socially Engineered Content
Attribution Through Forensics and Sociometrics
Offensive Social Engineering Points to Source
How Much Money Are They Making?
Current Weaknesses in Takedowns
Resiliency Insulates the Counterfeit Enterprise
The Response to Takedowns
Identity Theft & Financial Fraud
Cost to Effect Ratio
Addressing Criminal Counterfeit Enterprise
Takeaways
Appendix A: Prior Work in This Area
Appendix B: Methodology & Theory


Abstract

This paper presents the results of our study of the online Counterfeit Enterprise (CE) and of our efforts to define it through Online Criminal Sociometry. The analysis, conducted using the HIIT System from July 2013 through December 2013, shows that a few small groups engaged in criminal activity are responsible for the vast majority of counterfeit e-commerce websites selling apparel, accessories and pharmaceuticals.

Prolific Counterfeit Enterprises

Counterfeit websites are not a new problem, but they are gaining in speed and intensity. Based on our

research, three primary groups are responsible for the vast majority of counterfeit websites in the high

fashion, shoes, sports apparel, watches (HSSW) and pharmaceuticals space.

These groups break down into Chinese HSSW, Chinese Pharmaceuticals and Russian affiliate-based Pharmaceuticals. The most prolific are the Chinese actors, who operate a very sophisticated criminal Counterfeit Enterprise that is resilient, widespread, indiscriminate of brand and effective.

The Chinese operation itself appears to be divided into two separate, highly siloed “units”: HSSW and pharmaceuticals. Both units employ very similar MOs, which include linkfarms, compromised websites, compromised hosting accounts and the creation of counterfeit and trademark-infringing websites en masse. These methods appear to differ in frequency, construction, distribution and complexity from those of other pharmaceutical counterfeit operations and of small-scale HSSW counterfeiters, which are more brand specific.

The Russian-based operations have entirely different, discernible MOs.

Chinese Actors

Chinese actors run a sophisticated network of operations that is large in size and scope. It includes a sophisticated distribution network, paid sponsored advertisements and Blackhat search engine optimization.

Chinese Distribution Network

The current (link and website) distribution network of the CE is a hierarchical botnet that provides obfuscation and frustrates detection efforts. The term botnet is not used lightly: the CE controls the websites, the content and the links programmatically and can change them at will and at speed, including intra-day changes.


Figure 1: Visualization of CE Distribution Hierarchy

Bottom Tier – The Link Farm

At the bottom of the distribution network stand three distinct types of websites that we have identified so far. The majority consist of content created through Markovian [1] generators, as discussed in the previous work of Thomas Lavergne, Tanguy Urvoy and François Yvon [2] and demonstrated by Jason Bury [3]. Blackhat SEO techniques, such as keyword stuffing [4] and linkfarming [5], are used in conjunction with the Markovian generators to boost the search engine ranking of the second tier sites. These concepts have also been discussed in previous works by Bharat and Henzinger [6] and by Wu and Davison [7].
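To make the Markovian generator concept concrete, the following is a minimal sketch of a first-order, word-level Markov chain text generator. It is an illustration only and not the CE's actual tooling; the function names `build_chain` and `generate` and the seed corpus are hypothetical.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words observed directly after it."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain, start, length, seed=None):
    """Walk the chain from `start`, choosing each next word at random."""
    rng = random.Random(seed)
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Hypothetical seed corpus standing in for keyword-laden source text.
corpus = "cheap bags outlet sale cheap shoes outlet online cheap bags online store"
chain = build_chain(corpus)
sample = generate(chain, "cheap", 8, seed=1)
```

Because every word pair in the output was observed in the source text, the result looks locally plausible while carrying no overall meaning, which is the property the relative-entropy work cited above sets out to detect.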

The first type is a botnet [8] of tens of thousands of compromised websites and compromised web hosting accounts, which makes up a large percentage of the bottom tier. These compromised sites host specific campaigns, and a single compromised website can host multiple campaigns simultaneously. The content at this level is highly specific and reflects replicated (and detectable) methods of compromise and upload.

[1] https://en.wikipedia.org/wiki/Markov_chain
[2] http://www.uni-weimar.de/medien/webis/research/events/pan-08/pan08-papers-final/lavergne08-detecting-fake-content-with-relative-entropy-scoring.pdf
[3] http://www.soliantconsulting.com/blog/2013/02/draft-title-generator-using-markov-chains
[4] https://support.google.com/webmasters/answer/66358?hl=en
[5] http://www.webopedia.com/TERM/L/link_farming.html
[6] K. Bharat and M. R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. In Proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 104–111, Melbourne, AU, Aug. 1998.
[7] http://www.cse.lehigh.edu/~brian/pubs/2005/www/link-farm-spam.pdf
[8] http://www.microsoft.com/security/resources/botnet-whatis.aspx


In an interview, one webmaster whose websites were compromised described how the CE had gained access to his email, reset his GoDaddy account password and used FTP scripts to mass upload content to hundreds of sites under his control. The webmaster's email credentials were compromised and used by the same enterprise uncovered in this paper. We suspect, though we have not proven, that the credentials were harvested through malware or social engineering.

Figure 2: Bottom Tier Compromised Website Hosting Unauthorized Blog Indexed By Google

The second type in the bottom tier consists of hundreds of blog websites created by the CE, which provide backlinks to the second tier using the text generators [See Figure 3: Markovian Generated Text]. The CE uses these blogs to dump content and links at will, gaining better search engine rankings from the link farm.



Figure 3: Markovian Generated Text

Figure 4: The Links on the Page Show Numerous Brands

The third type in the bottom tier consists of linkfarms built by posting backlinks through existing forms that have weak or no authentication [See Figure 5: Comment/Forum Spam With Backlinks to Second Tier]. In some cases, these run to millions of links for an individual search term.


Figure 5: Comment/Forum Spam With Backlinks to Second Tier

By parsing through this data, we have observed definitively that the HSSW CE operations are interconnected and brand indiscriminate. These websites provide both Blackhat SEO and resiliency to the CE. The Pharmaceutical CE, or division, is brand indiscriminate as well.

Forum Spam Sources

The forum spam has numerous sources and email addresses with patterns. For example, very active

hosts include:

• .com.cn dynamic IP addresses

• Pegtech IP Address Blocks

• Wholesale Internet IP Addresses

• Chinese ISP Dynamic IPs

The speed at which these posts occur has been observed to reach 200-300 per hour per website, placing the volume of this activity into the billions and possibly trillions per year. The sources are spread throughout the world, but patterns do emerge: an individual IP address is associated with multiple email addresses, individual email addresses are associated with multiple brands being spammed, and individual email addresses are associated with multiple IPs.
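The three cross-sourcing patterns described above can be surfaced with simple grouping. The sketch below is a hypothetical illustration, not the tooling used for this study: the `posts` records, the function name `cross_source` and the output keys are invented for the example, and the addresses use reserved documentation ranges.

```python
from collections import defaultdict

# Hypothetical observations: (source IP, poster email, brand being spammed).
posts = [
    ("192.0.2.10", "a@example.com", "brand-x"),
    ("192.0.2.10", "b@example.com", "brand-y"),
    ("192.0.2.11", "a@example.com", "brand-y"),
    ("192.0.2.12", "c@example.com", "brand-z"),
]


def cross_source(records):
    """Return the IPs and emails that appear in more than one pairing."""
    emails_by_ip = defaultdict(set)
    ips_by_email = defaultdict(set)
    brands_by_email = defaultdict(set)
    for ip, email, brand in records:
        emails_by_ip[ip].add(email)
        ips_by_email[email].add(ip)
        brands_by_email[email].add(brand)

    def multi(mapping):
        # Keep only keys linked to two or more distinct values.
        return sorted(key for key, values in mapping.items() if len(values) > 1)

    return {
        "ips_with_multiple_emails": multi(emails_by_ip),
        "emails_with_multiple_ips": multi(ips_by_email),
        "emails_with_multiple_brands": multi(brands_by_email),
    }


patterns = cross_source(posts)
```

Blocking any single IP or email address leaves the other pairings intact, which is why the cross-sourcing frustrates blocking efforts.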


Redirection & Cloak Methods:

• 302 Redirect

• Standard HTML (i.e. Base HREF Methods)

• Obfuscated Javascript (Visual Presentation & on Mouse Click)

• JS files called by HTML5 asynchronous processing (Visual Presentation and on Mouse Click)

Figure 6: Example of Forum Spam

Figure 7: Forum Spam is Cross Sourced Frustrating Blocking Efforts

Second Tier – Redirection & Cloaking

In the second tier of the distribution network are cloaked [9] and compromised websites. While the compromised sites are currently used as a “Second Tier”, the CE can convert them to host Bottom Tier content. Standard 302 redirectors are employed, but the cloaking often occurs through replicated obfuscated JavaScript or through links to JS files loaded using the HTML5 asynchronous method. The JavaScript itself is not difficult to de-obfuscate; however, when the redirect logic sits in asynchronous JavaScript files linked from the HTML, detection can be quite challenging, though not impossible. Both the content of these cloaking techniques and the sites hosting them are changed regularly.

A second tier website may be used for both cloaking and 302 redirection over time, indicating the level

of control the owner has over these websites.
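The redirection and cloaking methods listed for this tier leave markers that can be checked mechanically. The sketch below is an illustration rather than the detection logic used in this study. It flags a 302 status, a `<base href>` element, an external script loaded with the `async` attribute, and inline script text containing calls often seen in obfuscated code. The class `IndicatorParser`, the function `redirect_indicators`, the label strings, the sample page and the obfuscation heuristic are all assumptions made for the example.

```python
from html.parser import HTMLParser


class IndicatorParser(HTMLParser):
    """Collect markup features matching the redirection methods listed above."""

    def __init__(self):
        super().__init__()
        self.indicators = set()
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "base" and attr_map.get("href"):
            self.indicators.add("base-href")
        if tag == "script":
            self._in_script = True
            if "async" in attr_map and attr_map.get("src"):
                self.indicators.add("async-external-js")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        # Heuristic only: these calls are common in obfuscated scripts but
        # also appear in legitimate code, so treat a hit as a lead, not proof.
        tokens = ("eval(", "unescape(", "fromCharCode")
        if self._in_script and any(token in data for token in tokens):
            self.indicators.add("possible-obfuscated-js")


def redirect_indicators(status_code, html):
    """Return a sorted list of indicator labels for one fetched response."""
    indicators = set()
    if status_code == 302:
        indicators.add("302-redirect")
    parser = IndicatorParser()
    parser.feed(html)
    parser.close()
    return sorted(indicators | parser.indicators)


# Hypothetical page combining three of the listed methods.
sample_page = (
    '<html><head><base href="http://shop.example/">'
    '<script async src="http://cdn.example/a.js"></script></head>'
    '<body><script>document.write(unescape("%3Cb%3E"))</script></body></html>'
)
found = redirect_indicators(200, sample_page)
```

A static check like this only sees the markup it is handed. Cloaked sites may serve different content to crawlers than to users, and the logic inside an asynchronously loaded file has to be fetched and examined separately.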

[9] http://webdesign.about.com/od/seo/i/aa092704.htm


These websites are currently common in search engine results and in counterfeit paid advertisements because they are difficult to detect. The search engine and the advertiser believe they are presenting content that may be legitimate; however, once a user actually arrives at the site, they are redirected, or their browser overlays another website, which is a counterfeit. The second tier websites funnel users to the top tier, but the destination, or “payload”, changes over time.

Figure 8: Visualization of Distribution Network

In the case where the website is overlaid on top of the original content, if the user clicks on a link, they

are redirected.

These websites serve multiple purposes:

• Frustrating detection methods – Detecting the actual counterfeit website through programmatic means is difficult due to rotating obfuscation methods.

• Providing a common link of distributed command and control – The top tier can change at will while the second tier maintains direction and control over which top tier websites are presented to users.

• Resiliency – Because the second tier is generally not addressed, seizing a specific domain has no impact on the CE's ability to operate. In response, the CE simply replicates the code onto a new domain and hosting account and drops that domain into the second tier, which is constantly linked to through the botnet and linkfarms. As a result, the new website is placed at the top of the search engine results very quickly. Usually, this happens well before any individual website is “seized”.


Figure 9: Example of a Second Tier Compromised Website

Top Tier – The Counterfeit Website

At the top tier are counterfeit websites that rotate on a seemingly irregular basis, ranging from days to weeks. New or existing counterfeit sites are entered into the distribution network at varying frequencies based on the CE's objectives at the time, which appear to include social engineering in addition to standard counterfeit content. Once a site is entered into the network, either the links in the second tier are changed out or the second tier redirection is updated to point to the new website.


Figure 10: Bottom and Second Tier Websites (A-E) For Cyclical References As Well as Outputting to the Payload Website (F)

Additionally, the paths from the bottom tier to the top tier vary (as demonstrated in the figure above).

The links in the second tier of the botnet change to point to different websites whose objective is to

funnel the user to the same top tier website using a different path. We have reason to believe this is

based on a sophisticated system of command and control.
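The varying paths and cyclical references can be modeled as a directed graph in which several routes lead to one payload. The sketch below enumerates every cycle-free path with a depth-first search. The edges in `links` are hypothetical and are not taken from the figure; only the node labels A to F follow its caption, and the function name `simple_paths` is invented for the example.

```python
def simple_paths(graph, start, target):
    """Yield every cycle-free path from `start` to `target` (depth-first)."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        for neighbor in graph.get(node, []):
            if neighbor not in path:  # skip sites already visited on this path
                stack.append((neighbor, path + [neighbor]))


# Hypothetical link structure: A-E are bottom and second tier sites, F is the payload.
links = {
    "A": ["B", "C"],
    "B": ["C", "D"],
    "C": ["A", "E"],  # C links back to A: a cyclical reference
    "D": ["F"],
    "E": ["D", "F"],
}
paths = sorted(simple_paths(links, "A", "F"))
```

With these example edges there are five distinct routes from A to F, so removing any single intermediate site still leaves the payload reachable.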

Russian Actors

Russian actors operate with a different MO. Their operation is smaller in scale than that of the Chinese actors, much less obfuscated and unsiloed. For example, the Russian operators do compromise websites, but they do not use those compromises to create an active botnet of compromised sites with Markovian text; they use them only for passive, static backlinks.

The Russian actors also have an unsiloed property where their operation can be tied through

Sociometrics to other activities including:

- PayDay Loan Scams

- Pornographic Websites

- Law Firm Referral Services

- Illegal Movie Downloads

The Russian actors are more centered in the Pharmaceutical space where they apply an “Affiliate

Based” methodology for distribution. Additionally, they are much less active in the Paid

Sponsored Advertisement space than the Chinese actors.


Figure 11: Affiliate Based Distribution of Websites


Figure 12: Notice Replication in the Phone Numbers

One operation is running TDS redirectors (in.cgi) known to be used in some malware deployments and clickfraud. This activity was traced back to tds.moncarlo.net, registered to [email protected], apparently a reference to Markov chain Monte Carlo as a randomized redirection system. That user also owns a number of pharmaceutical sites, and when you attempt to connect to the site without a proper URL, you are presented with only the following.


Domains Directly Owned by This “Name”

- 247-health-online.com

- canada-express-shop.com

- moncarlo.net

- my-health-24.com

Compromised Websites

Free OSP Content

Free OSPs inside the United States are also being used to deliver content to victims. These OSPs include Facebook, Tumblr, Pinterest, Twitter, Blogspot, Webs.com and others. The sites are used to deliver Second and Top Tier content, in addition to advertising email addresses for sales contacts.

Examples of Counterfeit Pharmaceuticals on Free Website Providers

- online-mexico-pharmacy-prednisone.webs.com

- drugsatchemistcoltedpharma.webs.com

- nolvadex-web-pharmacy45.webs.com

- pharmacy-2616195.webs.com

- nolvadex-canada-pharmacy.webs.com

- pharmacy-that-sell-synthroid.webs.com

- pharmacyzyvoxlinezolidinintern60.tumblr.com

Among our sites, the top “free website” providers in descending order are:

1. Tumblr.com

2. Webs.com

3. Blogspot.com

4. Wordpress.com

5. Tripod.com

6. Weebly.com

Free Email Providers

Of the 2,706 unique free email addresses we have identified, the distribution is weighted heavily toward @gmail.com and @hotmail.com. Reuse of an address on other websites varies from a single site to dozens of sites.

Finding Victims

This content is delivered to victims through a few different methods.

• Search Engines

• Sponsored Advertisements [10]

• Spam

• Social Networks & OSPs (e.g. Facebook, Twitter, Tumblr, Pinterest, etc.)

Socially Engineered Content

In addition to the normal counterfeit content that is created weekly, CEs create websites socially engineered to match the seasons and holidays throughout the year, and these can be country specific. In summer, fashion sites, sunglasses, Major League Baseball and other summer-specific content is created and distributed. As the year progresses and football season approaches, National Football League jerseys become prevalent.

The content development is specific to holidays as well. This includes “Black Friday”, “Cyber Monday”

and even country specific holidays like “Boxing Day” (UK).

[10] http://fortknoxnetworks.blogspot.com/2013/08/cybercriminals-using-facebook-paid.html


Figure 13: Examples of Socially Engineered, Holiday Specific Counterfeit Site

The expectation is that the user will enter these search terms during

the holiday seasons, or be more prone to search for Sunglasses

during the summer time.

After the development period, the code is deployed to websites

which are pushed through the botnet for SEO results relevant to the

socially engineered term and potentially advertised and spammed.

This process occurs for numerous brands and trademarks and can

easily result in hundreds or even thousands of unique users on the

first day.

In one case, a website was created and deployed, and then reached an Alexa ranking of 12 million within 10 days. The ranking was climbing quickly enough to show that the distribution network was achieving success.

Attribution Through Forensics and Sociometrics

We employ a number of methods to uncover entire operations, or to perform entity disambiguation. The Criminal Online Sociogram below represents an operation of 125 websites that we have definitively tied together, valued at about $100M/year in sales. The MO and forensics tie this operation to the larger Chinese HSSW operation, which likely exceeds several billion dollars in counterfeit sales.


Figure 14: Criminal Sociogram of 125 Counterfeit Websites
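The paper does not spell out how the websites in a sociogram are tied together. One generic way to perform this kind of entity disambiguation is to cluster sites that share an identifying attribute, directly or through a chain of other sites. The sketch below does this with a union-find structure. The site names, the attribute values and the function name `cluster_sites` are all hypothetical.

```python
from collections import defaultdict


def cluster_sites(site_attributes):
    """Group sites that share at least one attribute, directly or transitively."""
    parent = {site: site for site in site_attributes}

    def find(site):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[site] != site:
            parent[site] = parent[parent[site]]
            site = parent[site]
        return site

    first_seen = {}  # attribute value -> first site observed with it
    for site, attributes in site_attributes.items():
        for attribute in attributes:
            if attribute in first_seen:
                parent[find(site)] = find(first_seen[attribute])
            else:
                first_seen[attribute] = site

    clusters = defaultdict(set)
    for site in site_attributes:
        clusters[find(site)].add(site)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical forensic attributes collected per site.
sites = {
    "shop-a.example": {"email:sales1@example.com", "phone:555-0100"},
    "shop-b.example": {"phone:555-0100", "template:t-17"},
    "shop-c.example": {"template:t-17"},
    "shop-d.example": {"email:other@example.com"},
}
clusters = cluster_sites(sites)
```

Here shop-a and shop-c share nothing directly but land in the same cluster through shop-b, which is the transitive linking a sociogram makes visible.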

Counterfeit Pharmaceuticals

• Group 1 - Chinese Actors – Their MO and distribution model are similar enough to those of the HSSW counterfeiters to identify the methods as the same; however, their semantics and programming differ enough to suggest that this is a different unit or subset of the same operation. This group employs the same Markovian generator techniques, compromised websites, compromised social media accounts and other blackhat techniques as the HSSW counterfeiters.

• Group 2 - Russian/Ukrainian Actors – The Russian and Ukrainian actors tend to rely more on the “affiliate” model for replication and distribution. Their methods, distribution and resiliency model differ from the Chinese actors' MO and semantic commonalities.

High Fashion/Sports Apparel/Sunglasses/Watches (HSSW)

• Group 1 - Chinese Actors – Our data and intelligence show that this operation is leviathan in scale. A well organized operation, this group includes hackers, spammers, advertisers, developers, “sales” support (term used loosely), order fulfillment and the financial administration needed to maintain numerous bank, e-commerce and credit card processing accounts. For example, this group seems to manage hundreds, if not thousands, of email addresses and “customer support chat” accounts effectively. Where we have been able to make contact with specific people in this organization, we have mostly seen Fuzhou and Changsha, China as their locations; however, on-the-ground intelligence has additionally notified us of physical recruitment operations in Guangdong Province. This group's MO, distribution and semantics are highly replicated, and the group is responsible for activity that is orders of magnitude higher than that of any other group on the web for HSSW. Their activities include:


▪ HSSW Content

▪ Social Media & Digital Media Advertisements

▪ Website Compromise

▪ Hosting Account Compromise

▪ Forum and Social Media Spam

▪ Email Spam

▪ Counterfeit Social Media Groups

▪ Fake Social Media Accounts

▪ Twitter and Facebook Postings in Groups

• Group 2 - Other unidentified groups operate at much smaller volumes, with MOs and resiliency models that differ greatly. For example, one website operated by a small-time counterfeiter was seized in a legal action. That counterfeiter took to his Facebook account to announce that he had moved his content to another web page. This individual is interested in customer loyalty, which is of no concern to the large-scale Chinese HSSW operation.

Various Groups

• Outside of the HSSW and pharmaceutical groups, additional actors appear to be engaged in various activities on a small scale worldwide. For example, a number of driver's license, passport, Social Security and diploma counterfeiters appear to be operating. We have dedicated only a few cycles to this content but have nonetheless identified some individuals utilizing replicated content for high resiliency. A few fake ID operations appear to center around British Columbia, Canada and Southeast Asia, though more resources or time would prove useful.

Offensive Social Engineering Points to Source

We performed controlled requests for takedowns on a few targeted sites from the distribution botnet

using a method where the domain owner would see the domain we were sending the email from. We

picked three different domains distributed inside the HSSW botnet.

On three separate occasions, as the takedown notices were sent out, a computer in China with the same

User agent string connected to the website. This computer was running Windows XP with Internet

Explorer 6 and using a free version of Chinese enterprise chat software, indicating a possible usage

relationship with qq.com.

The first connection was made from Changsha, China, followed by two from Fuzhou, China a few days later. Interestingly enough, when we contacted the prospective sellers of this merchandise and performed forensics on the email headers, the users were located in Fuzhou, China as well.


How Much Money Are They Making?

We estimate $250 to $1,600 per day per site, based on our limited intelligence in this area at this time. On that basis, the sites we have identified account for between $1.3 billion and $8 billion annually, likely somewhere in the middle. Given that we have not identified all of the sites they operate, the Chinese actors involved in these operations are likely the top grossing CE on the web; however, more information is required to refine that determination.
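The relationship between the per-site estimate and the annual totals is simple multiplication, sketched below. The paper does not state the number of sites behind the totals; the site counts here are back-calculated from the stated dollar figures and should be read as an inference, not a reported number. The function names are invented for the example.

```python
DAYS_PER_YEAR = 365


def annual_revenue(sites, usd_per_site_per_day):
    """Annual gross for a given number of sites at a flat daily rate."""
    return sites * usd_per_site_per_day * DAYS_PER_YEAR


def implied_sites(annual_usd, usd_per_site_per_day):
    """Number of sites needed to reach an annual total at a flat daily rate."""
    return annual_usd / (usd_per_site_per_day * DAYS_PER_YEAR)


# Back-calculated from the stated range; both ends imply roughly 14,000 sites.
low_end_sites = implied_sites(1.3e9, 250)    # about 14,247
high_end_sites = implied_sites(8e9, 1600)    # about 13,699

# Forward check with a round 14,000 sites (an assumed figure).
low_total = annual_revenue(14_000, 250)      # 1,277,500,000
high_total = annual_revenue(14_000, 1600)    # 8,176,000,000
```

Both ends of the stated range are consistent with a population of roughly 14,000 sites, which suggests the range comes from applying the low and high daily rates to one site count.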


The graph below was created through studying the log files obtained for the cybersquatted domain loulsvuitton.com. This was hosted under an

account which was made public and contained four different brand counterfeiting websites. The log files for loulsvuitton.com were intact and

clearly demonstrated the website was copied from louisvuitt0n.com, which we had identified in July in a Facebook Advertisement.
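Both domains named above differ by a single character from the brand string they imitate. A minimal sketch of flagging such lookalikes with Levenshtein edit distance follows. It assumes `louisvuitton` is the label being imitated, which the text implies but does not state; the function names and the threshold of one edit are choices made for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def lookalikes(brand_label, domains, max_distance=1):
    """Return domains whose first label is close to, but not equal to, the brand."""
    hits = []
    for domain in domains:
        label = domain.split(".")[0]
        distance = edit_distance(brand_label, label)
        if 0 < distance <= max_distance:
            hits.append(domain)
    return hits


candidates = ["loulsvuitton.com", "louisvuitt0n.com", "louisvuitton.com", "example.com"]
flagged = lookalikes("louisvuitton", candidates)
```

An exact match is excluded because it would be the brand's own label, and unrelated names fall well outside the threshold.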

On April 22nd, the site louisvuitt0n.com was in testing for e-commerce code for a few days. Once published and advertised, the site grossed

$1,476 in sales on the first day it was advertised.

Meanwhile, the same hosting account held multiple brand-counterfeiting websites, one of which was an NFL jersey counterfeit site whose code and metadata definitively tied it to more than 100 sites. Based on the estimated sales volume, this specific HSSW operation is worth >$100M/year.

The site louisvuitt0n.com was seized sometime in December 2013, by which point it had grossed about $376,000; the sister site, loulsvuitton.com, was still online.


Current Weaknesses in Takedowns:

This is a counterfeit enterprise that exploits weaknesses in the protocol, legislative and regulatory frameworks. Our studies of this operation reveal some relevant information on the response.

1. 3% of counterfeit sites are being seized. We discover 97 counterfeit websites for every 3 that

we find are seized. While there are some minor variations, we have evidence that the vast majority of

sites go undetected.

2. They operate over a year before they are found. Across our data set, the average length of time

a counterfeit domain has been operating when addressed or identified is 553 days, or 1.5 years (this is a

moving number). During that time frame, using the logs which we have obtained, we estimate each

domain is billing between $138,000 and $553,000.

3. The math does not add up to impacting the overall operation. Seizing 3% of sites which have

been operating for 1.5 years is not enough to impact the CE. Even when trademark owners are

aggressively pursuing these websites and approach a 50% seizure rate through our observations, that

impact is 50% of only one brand, which leads to a much smaller impact on the overall organization. For

example, if the group is counterfeiting 10 brands, and one brand’s enforcement approaches 50%

seizure, the impact to the organization is 5%. They are undeterred.

4. Alexa rankings are a poor metric. Some in the industry are addressing only websites which rank

high in Alexa rankings, believing they will be addressing the lowest hanging fruit. The issue with this

approach is that the enterprise can collect tens of thousands of dollars from a counterfeit website prior

to it achieving a rank on Alexa and hundreds of thousands of dollars before the ranking warrants any

attention. Because the site is part of an interwoven CE comprising thousands of websites, the money

continues to flow.
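The arithmetic in point 3 can be sketched as a weighted sum. This is a minimal illustration, assuming (as the 10-brand example implies) that each brand contributes an equal share of the enterprise's activity and that the other nine brands see no seizures; the function name and inputs are hypothetical.

```python
def overall_impact(brand_shares, seizure_rates):
    """Fraction of enterprise activity removed, given each brand's
    share of the enterprise and the seizure rate achieved for it."""
    if len(brand_shares) != len(seizure_rates):
        raise ValueError("one seizure rate is required per brand")
    return sum(share * rate for share, rate in zip(brand_shares, seizure_rates))

# Assumption: 10 brands of equal weight; one brand reaches 50% seizure,
# the remaining nine are treated as untouched, as in the example in the text.
shares = [0.1] * 10
rates = [0.5] + [0.0] * 9

impact = overall_impact(shares, rates)
print(f"{impact:.1%}")  # prints 5.0%
```

Under these assumptions, aggressive enforcement by a single brand owner removes only one twentieth of the enterprise's activity.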

Resiliency Insulates the Counterfeit Enterprise

Since we are proving this activity is rooted in a limited number of organizations, the strategy of

resiliency shows itself as effective. This protects their profits through diversification, demonstrated in

the graph below. If the overall enterprise deploys a large scale and distributed model, the enterprise

itself can funnel money at high volume despite individual domain seizures. Additionally, successful

website designs are replicated before they are seized and then placed into the distribution network.

Socially engineered domains containing specific phrases like “blackfriday” or “cybermonday” draw large volumes of traffic before they appear to be discovered.


Figure 15: Demonstration of Resiliency in the Profit Model

While it is true that seizure of bank accounts and funding may be occurring, the billing processors and

accounts are distributed across a number of providers and accounts, again achieving resiliency.

The Response to Takedowns

In controlled experiments, we have effected takedowns of

cherry picked domains owned by the Chinese Pharmaceutical

and HSSW. The domains were not seized in a legal action, only

suspended. After a takedown is completed, the website

becomes re-hosted or re-registered in an average of 3 days,

sometimes with the same hoster or registrar.

In one case, we completed a takedown of a website, which was

hosted at a specific web hoster in the United States. In response to our takedown, the CE created

another account at the same hoster and brought the content back online.

“In response to our takedown, the CE (Counterfeit Enterprise) simply created another account at the same hoster and re-published the same website.”


The CE also moved their content to numerous other hosters, mostly reacting by moving their websites

outside the United States. Some reacted by moving their hosting to dedicated server environments,

others moved to Moldova or Switzerland. We theorize that if we scale up, the Sociometry of their

reactions would further expose operations.

Identity Theft & Financial Fraud

In our studies of the operations of these websites, large numbers contain “account creation” and “payment” mechanisms that are not utilizing standard HTTPS encryption. In addition to the plain text

transmission of PII across the large scale internet framework, the information is being collected by a

Criminal Enterprise based in China or Russia.

In events where data has been collected, it is not subject to any standards of storage and PII protections

generally accepted by the internet at large. In cases where the log files or databases are inadvertently

left in public facing positions, the data is complete and readable by anyone with little recourse or

consequences to the CE.

Additionally, payment methods like Moneygram and Western Union increase the likelihood of simple

financial fraud.

Cost to Effect Ratio

The cost for the counterfeit enterprise to create and publish a

website is nominal. For the purposes of example, we will say it

costs $25 to put up a counterfeit website and $500 to seize it.

The ratio of interdiction to crime is 20/1, meaning it costs $20

to interdict every $1 of criminal CE investment. Unless we bend this cost curve down significantly, the CE

wins every day. Their expansion in recent years seems to support that the ROI for them is quite

effective.

During 2013, Google received 235 million DMCA requests11 12; meanwhile, the Counterfeit Enterprise created more than 400 billion links in its linkfarm for Chinese HSSW alone. This lends credence to the feeling expressed by the RIAA: “We are using a bucket to deal with an ocean”13. If every single one of those resources were tasked with interdicting the HSSW botnet, it would address less than 1% of the problem.
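As a rough check of the figures in this section, the two ratios can be computed directly. This is a minimal sketch: the $25 and $500 costs are the illustrative values used above, and the coverage figure assumes, purely for illustration, that one takedown request addresses exactly one linkfarm link.

```python
# Illustrative costs from the text (USD); not measured values.
COST_TO_PUBLISH = 25
COST_TO_SEIZE = 500

# Volumes quoted in the text for 2013.
DMCA_REQUESTS_2013 = 235_000_000
HSSW_LINKFARM_LINKS = 400_000_000_000

# Dollars spent on interdiction per dollar of criminal investment.
ratio = COST_TO_SEIZE / COST_TO_PUBLISH

# Assumption: each request removes one link, so this is the share of the
# linkfarm that the full year's request volume could cover.
coverage = DMCA_REQUESTS_2013 / HSSW_LINKFARM_LINKS

print(f"{ratio:.0f} to 1")   # prints 20 to 1
print(f"{coverage:.2%}")     # prints 0.06%
```

Even under this generous one-request-per-link assumption, the entire request volume would cover well under 1% of the linkfarm.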

11 http://torrentfreak.com/google-discarded-21000000-takedown-requests-in-2013-131227/?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+Torrentfreak+(Torrentfreak)

12 http://www.google.com/transparencyreport/removals/copyright/?hl=en

13 http://www.riaa.com/blog.php?content_selector=riaa-news-blog&blog_selector=One-Year-&news_month_filter=5&news_year_filter=2013

“We are using a bucket to deal with an ocean” -Brad Buckles, RIAA


In the current DMCA & OCILLA landscape, most registrars, hosters and OSPs opt for a reporting system that requires filling out CAPTCHAs and uses other methods to force reporting on a one-at-a-time basis. Additionally, some organizations require specific information for each website, further increasing the cost to respond.

While some organizations are cooperative in addressing bulk complaints, the counterfeiters gravitate to organizations that are not.

If, according to the DMCA, hosters and registrars are not required to monitor their own networks for the publishing of counterfeit and trademark-infringing materials, and their processes slow down bulk reporting by those employed in detection and identification, then our system currently supports a cost to effect ratio that favors counterfeiting.

Addressing Criminal Counterfeit Enterprise

The methods of forensics and Sociometrics can be used to expose the enterprise as a whole, as

demonstrated in this paper. This requires specific technology and skill sets employed in unison with law

enforcement to interdict the CE itself. Intellectual Property owners are ill-equipped to address this problem because of the malicious nature and volume of activity, including compromising websites, compromising credentials and violating Trademark and Intellectual Property laws en masse.

The motivations are clear. The criminal operation negatively impacts:

• Intellectual property owners

• Government tax revenues

• Online advertisers

• Auction websites

• Consumers

• Resale marketplaces

• Search engines

• Banks and Financial Organizations

• Payment Processors

• Social networks

• Jobs in America, further impacting the economy, tax revenues and business.

• The economy as a whole

Most importantly it is negatively impacting consumer confidence in the internet.

Takeaways

This is a criminal problem being run by highly sophisticated enterprises. Those enterprises can be

exposed and that information can be used to further diminish their influence and revenues. We are


interested in a cooperative framework to address this problem using a system that can vet and correlate

their activity.

Our research shows that the CEs are highly resistant to the effects of a single or group website takedown and that cloaking is effective enough to prevent the identification of a sufficient number of websites. More

sophisticated intelligence gathering which employs successful disambiguation methods can expose the

organization as a whole.

The CE itself can reap hundreds of thousands of dollars from a website before it is ranked on Alexa.

Suspended websites can be put back online in 24 to 72 hours, and seized websites can be recreated and

promoted in days.

The concept that we have applied here, identifying the enterprise operation as a whole, could have a greater impact on online criminal operations by exposing whole portions of their Sociometrically mapped networks. With the cooperation of the industry, it may further secure the internet from nefarious parties.

The existence of counterfeit factories, order systems and sales depends on the counterfeiter placing

their online content in front of an audience. Currently, they are very successful at achieving this goal.

Appendix A: Prior Work in This Area

• Foundations of Sociometry (1941): J. L. Moreno
http://www.psicologia1.uniroma1.it/repository/387/Moreno_1941.pdf

• An Introduction to Sociogram Construction: Hollander
http://asgpp.org/pdf/carl%20hollander%20sociogram.pdf

• Sociometric Applications in Criminology: Brag and Rounds
http://www.educ.ttu.edu/uploadedFiles/personnel-folder/lee-duemer/epsy-6304/documents/Sociometric%20application%20in%20criminology%20and%20other%20settings.pdf

• Research Methods for Criminology and Criminal Justice (Criminal Sociometry): Dantzker and Hunter
http://books.google.com/books?id=pvXzFpKGQUgC&pg=PA63&lpg=PA63&dq=sociometry+criminal&source=bl&ots=qWEtHYySZt&sig=uR9VsqvJZgEzGpfxL2TV6CUFc4Q&hl=en&sa=X&ei=9ZS8UuPHCc-GrAeUqIGADw&ved=0CCsQ6AEwAA

• Detecting Fake Content with Relative Entropy Scoring: Lavergne, Urvoy and Yvon
http://www.uni-weimar.de/medien/webis/research/events/pan-08/pan08-papers-final/lavergne08-detecting-fake-content-with-relative-entropy-scoring.pdf

• K. Bharat and M. R. Henzinger. Improved algorithms for topic distillation in a hyperlinked environment. In Proceedings of the 21st International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 104-111, Melbourne, AU, Aug. 1998.

• Identifying Link Farm Spam: Wu and Davison
http://www.cse.lehigh.edu/~brian/pubs/2005/www/link-farm-spam.pdf

• Entity Identification in the Semantic Web: Morris, Velegrakis and Bouquet


http://disi.unitn.it/~velgias/docs/MorrisVB08.pdf

• Finding User Semantics on the Web Using Word Co-occurrence: Mori, Matsuo and Ishizuka
http://www.win.tue.nl/persweb/Camera-ready/5-Mori-short.pdf

• Entity Disambiguation for Knowledge Base Population: Dredze, McNamee, Rao, Gerber and Finin
http://www.cs.jhu.edu/~delip/entity_linking_coling10.pdf

• Entropy Compression and Information Content: Fossum
http://www.isi.edu/~vfossum/entropy.pdf

• Shannon Entropy and Kolmogorov Complexity: Grunwald and Vitanyi
http://homepages.cwi.nl/~paulv/papers/info.pdf

• Defining Habits: Dickens and the Psychology of Repetition
http://www.case.edu/artsci/engl/Library/Vrettos--Defining%20Habits.pdf

Appendix B: Methodology & Theory

In an effort to better study Counterfeit Enterprise (referred to as CE) operations worldwide, we required a system that collected, analyzed and scored websites as well as performing correlation of data. This

information could be used to map associations. The design resulted in the HIIT System, which is the

brainchild of this research project, and was based on the premise that weaknesses must be present in

CE which will uncover relationships between seemingly unrelated sites and methods. Some of the

weaknesses are:

Modus operandi (MO), defined as “a method of operating or functioning.”14 The premise is that

complexity has limitations when limited resources are applied to creating a widespread system and MOs

are detectable, even if well obfuscated. This takes the form of repetition in technology, in text

communication (AKA ngrams15) as well as distribution methods. Our methods primarily center on

applied sciences in data mining, data correlation, ontological semantics, observational behavior and

forensics in semantics and programming which are deobfuscated into relationships.

Prior work in these areas is well documented in the semantic science areas including the following

examples [See Appendix A for a Complete list]:

• Mori, Matsuo and Ishizuka, Finding User Semantics on the Web16

• Morris, Velegrakis, Bouquet – Entity Identification on the Semantic Web17

• Saferstein, R. 2004. Criminalistics: An introduction to forensic science.18

14 http://www.merriam-webster.com/dictionary/modus%20operandi

15 https://en.wikipedia.org/wiki/N-gram

16 http://www.win.tue.nl/persweb/Camera-ready/5-Mori-short.pdf

17 http://disi.unitn.it/~velgias/docs/MorrisVB08.pdf

18 http://www.amazon.com/Criminalistics-Introduction-Forensic-Science-Edition/dp/0135045207


Criminal Sociometry. We build on the concept of Sociograms & Sociometry introduced by Jacob L. Moreno19 in the 1930s. Sociometry is defined as “the inquiry into the evolution and organization of groups and the position of individuals within them” and is further discussed by Carl Hollander20 as “a map of…interpersonal lines of communication”. The application here is dual-homed and based more on the Criminology Sociometry presented by Berg and Rounds21, described as “assessing group relational structures such as hierarchies [and] friendship networks”, where the subject of that study is a set of online activities.

The nodes in such a map can be correlated using a series of data points including HTML structure, software, text content (otherwise known as ngrams), registrar information, email information and a number of other data points. This concept was discussed in “Research Methods for Criminology and Criminal Justice” (Dantzker and Hunter22), though the application here is based on the behavioral relationships of an online enterprise.
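To illustrate how text content might link two nodes, the sketch below compares sites by the word n-grams they share. This is not the HIIT System's scoring method, which the paper does not specify; Jaccard similarity over word trigrams is a stand-in chosen for simplicity, and the site texts and function names are hypothetical.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams (as tuples) found in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: shared items over all items."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical page text, invented for illustration only.
site_a = "cheap designer handbags free shipping worldwide order now"
site_b = "cheap designer handbags free shipping worldwide buy today"
site_c = "local bakery fresh bread baked every morning downtown"

grams_a, grams_b, grams_c = (word_ngrams(t) for t in (site_a, site_b, site_c))

score_ab = jaccard(grams_a, grams_b)  # sites reusing the same phrasing
score_ac = jaccard(grams_a, grams_c)  # unrelated sites

print(score_ab, score_ac)  # prints 0.5 0.0
```

A score like this would be only one of several data points; in practice it would be combined with the structural and registration signals listed above before two sites are treated as related.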

The Law of Large Numbers. In probability theory, the law of large numbers (LLN) is a theorem that

describes the result of performing the same experiment a large number of times.23 The concept is that

discrepancies or variations are less pronounced when you conduct an experiment thousands of times, or

in this case, observe behavior across thousands of sites. The application of this theory is that if we track

very specific pieces of information that can identify commonalities among actors, over time the statistics

will flatten out and we can identify a relationship (sociometry) in the enterprise as well as behavioral patterns in the Sociometrical relationship.

To demonstrate, when one flips a coin 10 times, it is possible that the result is 5 heads and 5 tails; however, this is not predictable and variations can be extreme. On the other hand, if one flips a coin 10,000 times, the variations are much less pronounced and the proportion of heads settles close to 50% overall.
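The coin-flip illustration can be reproduced with a short simulation. This is a minimal sketch using Python's standard `random` module; the seed value and the sample sizes are arbitrary choices for illustration.

```python
import random

def heads_fraction(flips, rng):
    """Simulate fair coin flips and return the fraction that land heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips

# A fixed seed (arbitrary) makes the run repeatable.
rng = random.Random(2013)

for flips in (10, 100, 10_000):
    fraction = heads_fraction(flips, rng)
    # The gap from 0.5 tends to shrink as the number of flips grows.
    print(f"{flips:>6} flips: {fraction:.3f} heads, off by {abs(fraction - 0.5):.3f}")
```

Small samples can land far from one half, while the 10,000-flip run typically sits within a percentage point or two of it, which is the flattening effect the methodology relies on.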

19 http://www.psicologia1.uniroma1.it/repository/387/Moreno_1941.pdf

20 http://asgpp.org/pdf/carl%20hollander%20sociogram.pdf

21 http://www.educ.ttu.edu/uploadedFiles/personnel-folder/lee-duemer/epsy-6304/documents/Sociometric%20application%20in%20criminology%20and%20other%20settings.pdf

22 http://books.google.com/books?id=pvXzFpKGQUgC&pg=PA63&lpg=PA63&dq=sociometry+criminal&source=bl&ots=qWEtHYySZt&sig=uR9VsqvJZgEzGpfxL2TV6CUFc4Q&hl=en&sa=X&ei=9ZS8UuPHCc-GrAeUqIGADw&ved=0CCsQ6AEwAA

23 http://en.wikipedia.org/wiki/Law_of_large_numbers


Figure 16: Graph of Coin Flips Demonstrates How Anomalies are Smoothed Over A Large Data Set24

By identifying the MO, using advanced correlation, exploiting the limited number of ngrams and tracking that information to build a data set large enough (>10,000) to flatten out inconsistencies (Law of Large Numbers), we expose the enterprise.

24 http://upload.wikimedia.org/wikipedia/commons/thumb/f/f9/Largenumbers.svg/400px-Largenumbers.svg.png