Facebook Post Analytics Using Python and Advance Open Source Tools



Volume 2 Issue 1 January 2013

International Manuscript ID : ISSN23194618-V2I1M20-012013



Amit Sharma

Assistant Professor

Apeejay Institute of Management Technical Campus (APJIMTC)

Jalandhar, Punjab, India


Abstract

This paper describes the use of tools for data collection and extraction that allow researchers to export data in standard file formats from different sections of the Facebook social networking service. Friendship networks, groups, and pages can thus be analyzed quantitatively and qualitatively with regard to demographical, post-demographical, and social characteristics. The paper gives an overview of the analytical directions opened up by the data made available, discusses platform-specific aspects of data extraction through the official Application Programming Interface, and briefly engages with the difficult ethical considerations attached to this kind of research.

Keywords - Facebook Post Analytics, Python and Advance Open Source Tools, Network Security



INTRODUCTION

In October 2012, Facebook announced that it had reached the milestone of one billion monthly active users. [4] This arguably makes it one of the largest media organizations ever, challenged only by Google's collection of services in terms of daily global audience size and engagement. Traditional organizations dwarf these massive Internet companies when it comes to the size of their workforce (Facebook employed a mere 4,500 people at the end of 2012), but the sheer number of "people who use Facebook to stay connected with friends and family, to discover what's going on in the world, and to share and express what matters to them" [4] is simply gigantic.

It is no surprise, then, that researchers from many areas of the human and social sciences have moved quickly to study the platform: a recent review article [1] identified 412 peer-reviewed research papers that follow empirical approaches, not counting the many publications using conceptual or critical approaches. While traditional empirical methods such as interviews, experiments, and observations are widely used, a growing number of studies rely on what the authors call "data crawling", i.e. "gleaning information about users from their profiles without their active participation" [1]. This paper discusses tools intended to facilitate this latter approach. Research methods that use software to capture, produce, or repurpose digital data in order to investigate different aspects of the Internet have been in use for well over a decade.

Datasets can be exploited to analyze complex social and cultural phenomena, and digital methods [2] have a number of advantages compared with traditional ones: advantages concerning cost, speed, exhaustiveness, detail, and so forth, but also advantages related to the rich contextualization afforded by the close relationship between the data and the properties of the media (technologies, platforms, tools, websites, etc.) they are connected to. Data crawling necessarily engages these media through the specifics of their technical and functional structure and therefore produces data that can provide detailed views of the systems and the use practices they host.

The study of social networking services (SNS) like Facebook, however, introduces a number of challenges and considerations that make the scholarly investigation of these services, their users, and the various forms of content they hold significantly different from the study of the open Web. This paper examines some of the possibilities and difficulties of the data crawling approach applied to Facebook and introduces a tool that allows researchers to create data files in standard formats for different sections of the Facebook social networking service without having to resort to manual collection or custom programming. We will first introduce some of the approaches to data extraction on SNS in order to situate the proposed tool. We will then introduce the application and give a number of short examples of the kind of analysis it makes possible.

Before concluding, we will discuss two further aspects that are especially relevant to the matter at hand: research by means of Application Programming Interfaces (API) and the question of privacy and research ethics. While this paper contains technical descriptions, it is written from a media studies perspective and therefore focuses on the aspects most relevant to media researchers.


The study of Internet platforms by means of data extraction has seen rapid development over the last two decades, and the recent excitement around the notion of big data appears to have added further momentum to efforts in this direction. [9] For researchers from the humanities and social sciences, the possibility of analyzing the expressions and behavioral traces of sometimes very large numbers of individuals or groups using these platforms can provide valuable insights into the varieties of meaning and practice that emerge and manifest themselves online. Rather than merely shedding light on a "virtual" space, supposedly separate from "real life", the Internet can be considered "a source of data about society and culture" [2] at large.

The promise of producing observational data, i.e. data that records what people do rather than what they say they do, without having to manually record behavior, expressions, and interactions is especially enticing to researchers. SNS in general, and the gigantic Facebook platform in particular, can be compared, on a certain level, to observational devices or even to experimental designs: the "captured" data are closely related to carefully designed technical and visual structures (functionalities, interfaces, data structures, and so forth) that function as "grammars of action" [1], enabling and directing activities in distinct ways by providing and circumscribing possibilities for action and expression.

Even if the design of this large-scale social experiment is defined neither by nor for social scientists and humanists, the delimited and parameterized spaces provided by SNS offer a controlled frame of reference for the gathered data. No wonder that Cameron Marlow, one of the research scientists working at Facebook, considers the service to be "the world's most powerful instrument for studying human society" [5]. To better understand how such data can be gathered, a short overview of existing approaches is indispensable.


Fig. 1 - Facebook Data Analytics


The previously mentioned review paper [1] distinguishes five categories of empirical Facebook research: descriptive analysis of users, motivations for using Facebook, identity presentation, the role of Facebook in social interactions, and privacy and information disclosure. It is not hard to see how approaches that gather data from or through the platform can be useful for each of these areas of investigation. The question, then, is what data can actually be accessed and how this is to be done, considering that the specific method chosen has important repercussions for the scope of what can reasonably be acquired.

One can broadly distinguish two general orientations when it comes to collecting digital data from SNS through software-based tools. First, researchers can recruit participants, through Facebook itself or otherwise, and gather data by asking them to fill out questionnaires, often by means of so-called Facebook applications [1]. While this method certainly differs from traditional ways of recruiting participants in terms of logistics and sampling procedures, it is not fundamentally different from online surveying.


Second, data can be retrieved in various ways from the pools of information that the Facebook platform already collects as part of its normal operation. This latter approach, which is the focus of this paper, is fueled by data derived from both sides of the distinction Schafer makes between "implicit and explicit participation" [4], referring to the difference between information and content deliberately provided by users, e.g. by filling out their profiles, and the data gathered and produced by logging users' actions in sometimes minute detail.


Fig. 2 - Facebook Post Analytics Sample Illustration

While Facebook members share content, write messages, and curate their profiles, they also click, watch, read, navigate, and so forth, thereby providing additional data points that are stored and analyzed. Because these activities revolve around elements that have cultural significance (liking the page of a political party is more than "clicking"), these data are not merely behavioral but allow for deeper probing into culture.


For researchers, there are three ways to gain access to these data, with significant differences between approaches in terms of technical requirements and institutional positioning.

Direct database access to the company's servers is reserved for in-house researchers or for cooperation between an SNS and a research institution. [3] Certain companies also make data "donations", for example Twitter deciding to transfer its complete archive to the Library of Congress, albeit with a significant delay. The data made available in these ways are generally comprehensive and well structured, but often anonymized or aggregated. Partnering with a platform owner is certainly the only (legal) way to gain access to all collected data, at least in principle.

Access through sanctioned APIs makes use of the machine interfaces provided by many Web 2.0 services to third-party developers with the goal of stimulating application development and integration with other services, so as to provide additional functionality and utility to users. These interfaces also provide well-structured data, but are generally limited in terms of which data, how much data, and how often data can be retrieved. Conditions can vary significantly between services: compared with Twitter, for example, Facebook is quite restrictive regarding what data can be accessed.
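The limits on how much and how often data can be retrieved usually show up in practice as paginated responses and request quotas. The following is a minimal sketch of page-by-page retrieval, not the application's actual code: the response layout (a `data` list plus a `paging.next` link), the function name `iter_pages`, and the stand-in `fake_fetch` are assumptions made for illustration. A real client would replace `fake_fetch` with an authenticated HTTP request.

```python
import time
from typing import Callable, Dict, Iterator, Optional


def iter_pages(
    fetch: Callable[[str], Dict],
    first_url: str,
    max_pages: int = 10,
    pause_seconds: float = 0.0,
) -> Iterator[Dict]:
    """Yield items from a paginated JSON API, following 'next' links.

    Assumed response layout: {"data": [...], "paging": {"next": url}}.
    max_pages caps the number of requests; pause_seconds spaces them out.
    """
    url: Optional[str] = first_url
    pages_read = 0
    while url and pages_read < max_pages:
        page = fetch(url)
        for item in page.get("data", []):
            yield item
        # A missing "next" link means the last page has been reached.
        url = page.get("paging", {}).get("next")
        pages_read += 1
        if url and pause_seconds > 0:
            time.sleep(pause_seconds)


# Hypothetical canned responses standing in for a real HTTP client.
FAKE_PAGES = {
    "page-1": {"data": [{"id": "a"}, {"id": "b"}], "paging": {"next": "page-2"}},
    "page-2": {"data": [{"id": "c"}], "paging": {}},
}


def fake_fetch(url: str) -> Dict:
    return FAKE_PAGES[url]


post_ids = [item["id"] for item in iter_pages(fake_fetch, "page-1")]
first_page_only = list(iter_pages(fake_fetch, "page-1", max_pages=1))
```

Passing the fetch function as a parameter keeps the paging logic independent of any particular HTTP library and makes the request cap easy to test offline.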


For many people (myself among them), the Python language is easy to fall in love with. Since its first appearance in 1991, Python has become one of the most popular dynamic programming languages, alongside Perl, Ruby, and others. Python and Ruby have become especially popular in recent years for building websites using their numerous web frameworks, such as Rails (Ruby) and Django (Python). Such languages are often called scripting languages, as they can be used to write quick-and-dirty small programs, or scripts. I do not care for the term "scripting language", as it carries a connotation that they cannot be used for building mission-critical software.

Fig. 3 - Sample Python Data Extraction Result

Among interpreted languages, Python is distinguished by its large and active scientific computing community. Adoption of Python for scientific computing in both industry applications and academic research has increased significantly since the mid-2000s. For data analysis, interactive and exploratory computing, and data visualization, Python will inevitably draw comparisons with the many other domain-specific open source and commercial programming languages and tools in wide use, such as R, MATLAB, SAS, Stata, and others. In recent years, Python's improved library support (primarily pandas) has made it a strong alternative for data manipulation tasks. Combined with Python's strength in general-purpose programming, it is an excellent choice as a single language for building data-driven applications.
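As a small illustration of the kind of data manipulation task meant here, the sketch below averages engagement per post type. The records and field names (`type`, `likes`, `comments`) are invented for the example, and the code deliberately uses only the standard library so that it runs anywhere; in pandas the same summary would typically be a `groupby` followed by `mean`.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical post records; the field names are assumptions for illustration.
posts = [
    {"type": "photo", "likes": 120, "comments": 14},
    {"type": "link", "likes": 30, "comments": 6},
    {"type": "photo", "likes": 80, "comments": 10},
    {"type": "status", "likes": 45, "comments": 21},
]


def mean_engagement_by_type(records):
    """Return the mean of likes + comments for each post type."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["type"]].append(record["likes"] + record["comments"])
    return {post_type: mean(values) for post_type, values in grouped.items()}


summary = mean_engagement_by_type(posts)
for post_type, value in sorted(summary.items()):
    print(f"{post_type}: {value}")
```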


UI crawling can be done manually, but generally uses so-called bots or spiders that read the HTML documents used to provide graphical interfaces to users, either directly at the HTTP protocol level or by means of browser automation from the rendered DOM. [8] These techniques can circumvent the limitations of APIs, but often at the cost of technical and legal uncertainties if a platform provider's permission is not explicitly granted. In the case of Facebook, bot detection mechanisms are in place and suspicious activity can quickly lead to account suspension.
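To make the idea of a spider reading HTML documents concrete, here is a minimal sketch that collects link targets from a page using Python's standard `html.parser` module. The markup in `SAMPLE_HTML` and the class name `LinkCollector` are invented for illustration and do not reflect Facebook's actual pages; the caveats about permission and account suspension described above apply to any real use.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag encountered."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


# Hypothetical markup standing in for a downloaded HTML document.
SAMPLE_HTML = """
<html><body>
  <a href="/pages/example-page">Example page</a>
  <a name="anchor-without-href">No link here</a>
  <a href="/groups/example-group">Example group</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
print(collector.links)
```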

If performed on a large scale, these approaches require either custom programming or considerable amounts of manual work. The focal points and requirements of research and teaching do, however, show signs of similarity, and Facebook itself is organized around a limited number of functionalities or "spaces". One can therefore argue that general-purpose tools may be envisioned that provide utility to a variety of research projects and interests.

A few such data extractors targeting Facebook have been developed over recent years, invariably using sanctioned APIs for data gathering. These tools generally export data in common formats and concentrate on specific sections of the platform, partly by choice and partly because of limitations imposed by the platform itself. Their goals are also similar: to lower the technical and logistical requirements for empirical research by means of data analysis, in order to further the ability of researchers to study a medium that connects over a billion users in a system that is essentially conceived as a walled garden.
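Exporting to a common format can be as simple as writing the retrieved records to CSV, which spreadsheet and statistics packages read directly. The sketch below shows one way this might look; the function name `posts_to_csv`, the sample records, and the column names are assumptions for illustration and not the output format of any particular tool.

```python
import csv
import io


def posts_to_csv(posts, fieldnames):
    """Serialize a list of dicts to CSV text, keeping only the given columns.

    Missing values are written as empty cells; extra keys are dropped.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    for post in posts:
        writer.writerow({name: post.get(name, "") for name in fieldnames})
    return buffer.getvalue()


# Hypothetical records; a real export would use data returned by the API.
sample_posts = [
    {"id": "1", "type": "photo", "likes": 120, "internal": "dropped"},
    {"id": "2", "type": "link"},
]

csv_text = posts_to_csv(sample_posts, ["id", "type", "likes"])
print(csv_text)
```

Fixing the column list up front keeps the file layout stable even when individual records lack a field, which matters when the file is later loaded into another analysis tool.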

In what follows, I describe the application, a tool designed to help researchers extract data from Facebook.


This paper has described the application tools and Python for various subsections of the Facebook platform. With a focus on questions relevant to media researchers in particular, I have contextualized the application within a wider set of research concerns. With Facebook now counting more than one billion active users, it is becoming crucial to develop and consolidate research approaches to a service, largely designed as a walled garden, that is part of an ongoing privatization of communication, both in terms of economics and accessibility.

While there are important limits to what can be done without entering into a partnership with the company, the application tools demonstrate that certain parts of Facebook are nonetheless amenable to empirical analysis. These tools are continuously being developed further, and additional features will be added in the future. Providing more in-depth data on temporal aspects of user engagement with content will certainly be one of the next steps.



[1] Agre, P.E. Surveillance and Capture: Two Models of Privacy. The Information Society 10, 2 (1994), 101-127.

[2] Anscombe, F.J. Graphs in Statistical Analysis. The American Statistician 27, 1 (1973), 17-

[3] Emerson, R.M. Social Exchange Theory. Annual Review of Sociology 2, (1976), 335-

[4] Facebook Key Facts. http://newsroom.fb.com/Key-Facts.

[5] Freeman, L.C. Centrality in Social Networks. Conceptual Clarification. Social Networks 1, 3 (1979), 215-239.