Kurt Bugbee, Daniel Seaman

Project 3

CS 3043

Twitter Posts vs. Stock Prices

1. Introduction and Hypothesis

What is Twitter?

Twitter is a free social networking microblogging service that allows registered members

to broadcast short posts called tweets. Twitter members can broadcast tweets and follow other

users' tweets by using multiple platforms and devices. Tweets and replies to tweets can be sent

by cell phone text message, desktop client or by posting at the Twitter.com website. [1]

The default settings for Twitter are public. Unlike Facebook or LinkedIn, where members

need to approve social connections, anyone can follow anyone on public Twitter. To weave

tweets into a conversation thread or connect them to a general topic, members can add

hashtags to a keyword in their post. The hashtag, which acts like a meta tag, is expressed as

#keyword. The public aspect of Twitter enables big data collection. [1]

What is a stock market?

A stock market, or equity market, is the aggregation of buyers and sellers (a loose

network of economic transactions, not a physical facility or discrete entity) of stocks (also called

shares); these may include securities listed on a stock exchange as well as those only traded

privately. [2]

When you buy stock, you become a shareholder, which means you now own a "part" of

the company. If the company's profits go up, you "share" in those profits. If the company's

profits fall, so does the price of your stock. If you sold your stock on a day when the price of that

stock falls below the price you paid for it, you would lose money. [2]

How can Twitter and a stock market be related?

Over half of the adults in the U.S. have savings invested in the stock market, and that

number is rising rapidly with the new flood of micro-trading applications like Acorns and

no-commission platforms like Robinhood. As these numbers rise, it is beneficial to predict the future

movements of stocks using certain indicators. Our project analyzes the effect social media

(Twitter) posts have on stock prices. Many investors use popular numeric indicators and

historical trend data to predict stock market movements. In this new age of instant media, it is

worth examining the relationship between a stock’s price and how often the stock’s ticker

appears in social media and search results. We hypothesize that the number of Twitter posts about a company or the company’s stock ticker can be an indicator of a large price movement in the near future.

2. Methodology

To complete this project we used three primary data sources:

● Wall Street Journal database of NASDAQ Biggest Percentage Gainers

● Yahoo! Finance stock price information

● Twitter Social Analytics by Topsy

The WSJ NASDAQ database of the Biggest Percentage Gainers lists the top 100 stocks

in order of percentage gain per trading day. Yahoo! Finance has a massive amount of stock

information, but for the purposes of our project we used Yahoo! Finance’s history of closing

prices. Topsy is a website that holds a database of all Twitter posts since 2006. Their Social

Analytics tool allows for search by keywords or phrases. Non-premium access to the website

afforded data from only the last 30 days.

Most companies listed on the NASDAQ are not very well known to the general public, so

we started by going through the last thirty days of the Biggest Percentage Gainers and

selected stocks that had enough social recognition by three criteria:

1. Familiar to us, as two individuals with trading experience.

2. Familiar to our peers, so as to hold some relevance when presenting our findings.

3. Public recognition, since many corporations can have powerful market presence but no

social media presence.

After identifying five stocks to use, we used Yahoo! Finance to analyze the closing prices

for the last 30 days, noting particularly high-value gains or drops. In most cases, these gains or

drops only occurred on the specific day for which the stock in question was recognized as being

one of the biggest percentage gainers, but in some cases more than one spike or drop in prices

was worth keeping record of.

We then searched the trend data provided by Topsy for the last thirty days. We searched

for both the company name and the company’s stock ticker. The data we received was in the

form of number of tweets per day for the last thirty days that contained the search phrase. We

noted any spikes in Twitter appearances, and then compared those to the large spikes or drops

in the stock prices.
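
The spike-spotting step described above was done by eye on Topsy's charts. As a rough illustration only, it could be approximated in code along the following lines. The function name, the threshold factor, the minimum count, and the sample numbers are all assumptions made for this sketch and are not part of the original project.

```python
from statistics import median


def find_spikes(daily_counts, factor=3.0, min_count=50):
    """Return the days whose tweet count looks like a spike.

    daily_counts maps an ISO date string to the number of tweets that day.
    A day counts as a spike when its count is at least `factor` times the
    median daily count and at least `min_count` tweets. Both thresholds
    are hypothetical; the paper judged spikes visually.
    """
    baseline = median(daily_counts.values())
    threshold = max(factor * baseline, min_count)
    return sorted(day for day, count in daily_counts.items() if count >= threshold)


# Made-up counts for illustration only; not data from this project.
example_counts = {
    "2015-12-01": 40,
    "2015-12-02": 35,
    "2015-12-03": 45,
    "2015-12-04": 38,
    "2015-12-05": 210,
}
print(find_spikes(example_counts))  # ['2015-12-05']
```

Using the median as the baseline keeps a single very large day from inflating the "normal" level it is being compared against.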

3. Assumptions

A number of assumptions had to be made to complete this project. The first is

that the Twitter posts we identified were related to the company or the company’s stock. For this

reason, we avoided companies like Apple: Apple is a popular noun, and many “Apple” hits could

be entirely unrelated to the company. This is also the reason we chose to include a search for

the stock ticker as well as the company name. The company could be mentioned for other

reasons, but a mention of the stock ticker in a tweet is very likely related to the company in

reference to the stock market.

Despite this, some discrepancies may remain that we will mention here. The stock ticker

inclusion allows us to filter “false results” from company name searches if needed, but that

would be a very subjective process on our part and we chose not to do that. For some of our

companies, like KaloBios for example, it likely doesn’t matter. KaloBios is not a word that would

pop up often in social media without it being in direct reference to the company, and as a

biopharmaceutical company any social media mention is likely due to news that will also affect

stock prices. However, Chipotle is mentioned quite often in posts like “CRAZY day today at the

airport, just now about to get a flight home.... I need chipotle” - @thetidedrew. Therefore, without

more time to determine how to filter results, we had to assume that sudden spikes in Twitter

mentions were due to company news and not, for example, a sudden influx of people wanting

Chipotle for dinner.

It is also important to note how we identified correlation. It would be a false assumption

to say that an increase in Twitter posts about a company causes the stock prices to rise or fall.

We hypothesized that the number of Twitter references could be used as a volatility indicator,

and that increases in relevant tweets can correlate with large movements in price. Therefore, we

identified positive correlation if an increase in Twitter posts happened on the same day as, or

on the day immediately before, a price change. In other words, with no other news outlet

or financial information, someone could look only at how often a company and its stock ticker

showed up on Twitter and accurately anticipate a large price movement on the next trading day.
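
The same-day or next-trading-day rule can be written down as a small check. The sketch below is one possible reading of that rule, not the procedure actually used in the project, which was manual. The function name and the 10% cutoff for a "large" move are assumptions; the two closing prices are the GMCR closes reported in the Analysis section, with the Friday before Sunday, December 6th, 2015 taken to be December 4th.

```python
from datetime import date


def large_moves_near_spike(spike_day, closes, min_move_pct=10.0):
    """Find large close-to-close moves on or right after a Twitter spike.

    closes maps a trading date to that day's closing price. The days
    checked are the spike day itself (if the market was open) and the
    first trading day after it. min_move_pct is a hypothetical cutoff.
    """
    days = sorted(closes)
    candidates = [d for d in days if d == spike_day]
    later = [d for d in days if d > spike_day]
    if later:
        candidates.append(later[0])

    moves = []
    for day in candidates:
        i = days.index(day)
        if i == 0:
            continue  # no earlier close to compare against
        prev_close = closes[days[i - 1]]
        change = (closes[day] - prev_close) / prev_close * 100.0
        if abs(change) >= min_move_pct:
            moves.append((day, round(change, 1)))
    return moves


# GMCR closes from the paper: Friday close and the following Monday close.
gmcr_closes = {date(2015, 12, 4): 51.70, date(2015, 12, 7): 88.89}
print(large_moves_near_spike(date(2015, 12, 6), gmcr_closes))
# [(datetime.date(2015, 12, 7), 71.9)]
```

Because the lookup uses the next date present in the price data, a spike on a weekend or holiday is matched to the next day the market was open.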

4. Analysis

We collected data on the following five companies:

● Chipotle Mexican Grill (CMG): A fast-food chain offering Mexican fare, including

design-your-own burritos, tacos & bowls.

● Finisar (FNSR): A manufacturer of optical communication components and subsystems.

● KaloBios Pharmaceuticals (KBIO): A biopharmaceutical company dedicated to

improving the lives of patients with innovative therapies.

● Keurig Green Mountain (GMCR): A publicly traded specialty coffee and coffeemaker

company.

● Vera Bradley (VRA): An American design company best known for its patterned bags.

As we review the results from each company, we will refer to them by their stock ticker.

Beginning with CMG, we collected the following data:

We can see from the Topsy data that from November 13th, 2015 through the 18th, the

number of mentions of the keyword “chipotle” hovered between five and ten thousand per day.

On the 19th, that number increased to just under fifteen thousand and then again to twenty

thousand on the 20th. CMG’s stock price closed at $611.51 on the 19th, and closed at $536.19

on the 20th, a drop of 12.3%.
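
The percentage figures quoted in this section are close-to-close changes. As a small worked check, the calculation can be sketched as follows; the helper name is ours, and the two prices are the CMG closes quoted above.

```python
def percent_change(old_close, new_close):
    """Percentage change from one closing price to the next."""
    return (new_close - old_close) / old_close * 100.0


# CMG closes on November 19th and 20th, 2015, as reported in the text.
cmg_change = percent_change(611.51, 536.19)
print(f"{cmg_change:.1f}%")  # -12.3%
```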

We see again that on December 7th, tweets with the keyword “chipotle” jumped to over

twenty-five thousand from the previous day’s value of about ten thousand. The stock ticker

itself, which is normally mentioned under one thousand times per day, also jumped to just under

three thousand Twitter hits on the 7th. CMG closed at $551.75 on the 7th, and then closed at

$542.24 on the 8th, a drop of 1.7% and the lowest closing price in December as of the writing of

this paper.

We gathered the following data for FNSR:

As a manufacturing company, FNSR is not mentioned often in social media in any

capacity. Both the company name “finisar” and the ticker rarely breach fifty mentions on Twitter

in a day. However, on December 10th 2015, “finisar” jumped to just under two hundred

mentions while the ticker itself increased to over two hundred. On that day, FNSR closed at

$11.64 with a trading volume of 3.51 million shares. On December 11th, FNSR increased to a

trading volume of 13.08 million shares and closed at $14.23, an increase of 22.3%.

We collected the following data on KBIO:

Like most pharmaceutical companies, KBIO is very volatile and experiences massive

fluctuations in stock prices. On November 17th and 18th of 2015, “kalobios” Twitter mentions

increased to over five hundred from a normal value of near zero, and hits for the stock ticker

jumped from around two hundred to over thirty-five hundred. On November 18th KBIO closed at

$2.07 a share. On the 20th KBIO closed at $18.25, a massive 781.6% increase in two days.

Twitter mentions for KBIO spiked to near-identical values again on November 22nd. When

trading reopened on the 23rd, KBIO closed at $39.50, another 116.4% increase over its

previous bump. KBIO then fell for a couple days, closing at $26.63 on the 25th. Following

another good Twitter day on Thanksgiving (the 26th), KBIO increased 30.8% to close at $34.83

on Black Friday, and has since leveled off while also returning to a more discreet presence on

Twitter.

The following data was collected for GMCR:

“Keurig”, being the name of a very popular brand of coffee machine that is used in

countless offices and homes around the U.S. alone, experiences occasional spikes in Twitter

mentions that, as we explained in the Assumptions section, may not be related to actual

company news.

However, on December 6th, 2015 the keyword “keurig” jumped from under two thousand

mentions to over ten thousand, and “green mountain” hits as well as mentions of the stock ticker

jumped from the normal values of near zero to around four thousand and around two thousand,

respectively. That was a Sunday, and GMCR had closed at $51.70 the preceding Friday. On

December 7th, GMCR closed at $88.89 a share, an increase of 71.9%.

The data for our final company, VRA, is as follows:

VRA was a unique case amongst the companies we analyzed. Because Vera Bradley is

a very popular brand, mentions of the brand name itself are largely independent of company

financial movements, since most of the interest in the company is directed towards its new

lineups rather than its share holdings.

When looking at the Twitter mentions of the stock ticker though, we see that it goes from

its normal hit value of zero to around 350 on December 8th, 2015. VRA closed at $11.69 that

day, and then increased 46.1% to close at $17.08 per share on December 9th.

5. Conclusions

After collecting our data, we returned to our hypothesis statement: the number of

Twitter posts about a company or the company’s stock ticker can be an indicator of a large price movement in the near future.

After analyzing our results, we can conclude that the number of relevant Twitter posts, in

the way that we measured them, can in fact be used as a volatility indicator. For the five

companies we researched, every spike above the normal number of Twitter mentions that we

examined was accompanied or followed by a significant price move.

Due to the time constraints on this project, certain changes that could have made the

results more valuable weren’t implemented. Better filtering of Twitter results could help to focus

the indicator, ignoring unrelated references to the companies in question. Expanding the list of

tested companies would further authenticate the results, and with more time we would have had

a broader range of both percentage gainers and losers. Finally, Twitter has open APIs for

collecting data; Topsy limited us to thirty days of trend data without purchasing a $12,000

license, but with time we could have written our own data collection algorithm using the API in

conjunction with MySQL or MongoDB.
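
As a rough idea of what the storage side of such a collector might look like, the sketch below saves tweets to a database and counts daily mentions of a keyword. It is only an illustration under several assumptions: SQLite from the Python standard library stands in for MySQL or MongoDB, the table layout and function names are hypothetical, the step that would fetch tweets from the Twitter API is left out entirely, and the sample tweets are made up.

```python
import sqlite3


def store_tweets(conn, tweets):
    """Save (created_at, text) pairs; created_at is an ISO 8601 string."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tweets "
        "(created_at TEXT NOT NULL, text TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO tweets (created_at, text) VALUES (?, ?)", tweets)
    conn.commit()


def daily_mentions(conn, keyword):
    """Return {day: count} for tweets whose text contains the keyword.

    The match is a case-insensitive substring search, so it has the same
    false-positive problem discussed in the Assumptions section.
    """
    rows = conn.execute(
        "SELECT substr(created_at, 1, 10) AS day, COUNT(*) "
        "FROM tweets WHERE lower(text) LIKE ? "
        "GROUP BY day ORDER BY day",
        (f"%{keyword.lower()}%",),
    )
    return dict(rows.fetchall())


# Made-up sample tweets for illustration only.
sample = [
    ("2015-12-07T09:15:00", "Watching $CMG today"),
    ("2015-12-07T13:40:00", "lunch at chipotle"),
    ("2015-12-08T10:05:00", "CMG closes lower"),
]
conn = sqlite3.connect(":memory:")
store_tweets(conn, sample)
print(daily_mentions(conn, "cmg"))  # {'2015-12-07': 1, '2015-12-08': 1}
```

The output has the same shape as the Topsy trend data used in this project: one count per day for a given search phrase.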

However, even without these additions, our results are still quite interesting. Without

watching the news and without following a company’s financial statements, anyone with a

computer could open Topsy for free and search the company’s name and stock ticker. If they

observed a sudden jump in Twitter mentions on that particular day, they could, with some

accuracy, predict a large stock price move on the following trading day, using just that one

indicator. This shows why hedge funds and investment banks employ code that does just that.

After all, a company’s value and that of its stock is determined by social view of its value, and

what better way to gauge social views than by analyzing data from social media?

Sources:

[1] http://whatis.techtarget.com/definition/Twitter

[2] http://www.themint.org/kids/what-is-the-stock-market.html

[3] http://topsy.com/analytics

[4] http://finance.yahoo.com/;_ylt=An9q5flkhZowpBJnwp.Wmjd.FJF4#morequotes

[5] http://www.wsj.com/mdc/public/page/2_3021-gainnnm-gainer-20151211.html?mod=mdc_pastcalendar