Social Network Analysis With Python
@ PyCon APAC 2014
David Chiu
About Me
Co-founder of NumerInfo
Ex-Trend Micro Engineer
ywchiu.com
Social Network
http://libeltyseo.com/wp-content/uploads/2013/03/social-networking.png
Human Nature
http://cdn.macado.com/assets/2010/03/peeping-tom.gif
What do we want to know?
Who knows whom, and which people are common to their social networks?
How frequently are particular people communicating with one another?
Which social network connections generate the most value for a particular niche?
How does geography affect your social connections in an online world?
Who are the most influential/popular people in a social network?
What are people chatting about (and is it valuable)?
What are people interested in based upon the human language that they use in a digital world?
Explore Facebook
OAuth2 Flow
Open standard for authorization. OAuth provides a method for clients to access server resources on behalf of a resource owner.
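As a rough sketch of the first step of that flow, the client sends the resource owner to an authorization URL built from a few standard OAuth 2.0 parameters. The endpoint, client ID, redirect URI, and state value below are hypothetical placeholders, not Facebook's actual values; only the parameter names (response_type, client_id, redirect_uri, scope, state) come from the OAuth 2.0 specification.

```python
from urllib.parse import urlencode, urlparse, parse_qs


def build_authorization_url(endpoint, client_id, redirect_uri, scopes, state):
    """Build the URL the resource owner visits to grant access."""
    params = {
        "response_type": "token",      # implicit grant: token comes back in the redirect
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),     # OAuth 2.0 defines scope as space-delimited
        "state": state,                # opaque value echoed back to the client
    }
    return endpoint + "?" + urlencode(params)


# All values here are placeholders chosen for illustration.
auth_url = build_authorization_url(
    "https://auth.example.com/authorize",
    "my-app-id",
    "https://app.example.com/callback",
    ["user_likes", "read_friendlists"],
    "xyz123",
)
print(auth_url)
```

Once the owner approves, the server redirects back to redirect_uri with an access token, which is the token the later slides paste into each request.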
Connect to Facebook
https://developers.facebook.com/
Get Access Token
https://developers.facebook.com/tools/explorer/
User Permission
Permission List
User Data Permissions: user_hometown, user_location, user_interests, user_likes, user_relationships
Friends Data Permissions: friends_hometown, friends_location, friends_interests, friends_likes, friends_relationships
Extended Permissions: read_friendlists
Copy Token
Let's Hack
Get Information From Facebook
Test On API Explorer
Required Packages
requests: Sending HTTP Requests to Retrieve Data From Facebook
json: For Parsing JSON Format
Facebook Connect
import requests
import json

access_token = "<access_token>"
url = "https://graph.facebook.com/me?access_token=%s"

# fetch the current user's profile and parse the JSON body
response = requests.get(url % access_token)
fb_data = json.loads(response.text)
print(fb_data)
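An expired or mistyped token makes the request above come back with an error object instead of profile data. The sketch below shows one way to check for that before using the result. The GraphAPIError class, the parse_graph_response helper, and both sample bodies are made up for illustration; the bodies are only shaped like Graph API responses and were not captured from a real call.

```python
import json


class GraphAPIError(Exception):
    """Raised when a response body carries an 'error' object (hypothetical helper)."""


def parse_graph_response(body):
    """Parse a JSON body and raise if it describes an error."""
    data = json.loads(body)
    if isinstance(data, dict) and "error" in data:
        err = data["error"]
        raise GraphAPIError("%s: %s" % (err.get("type"), err.get("message")))
    return data


# Illustrative sample bodies, not real API output.
ok_body = '{"id": "12345", "name": "Sample User"}'
bad_body = ('{"error": {"message": "Invalid OAuth access token.", '
            '"type": "OAuthException", "code": 190}}')

profile = parse_graph_response(ok_body)
try:
    parse_graph_response(bad_body)
    failure = None
except GraphAPIError as exc:
    failure = str(exc)

print(profile["name"])
print(failure)
```

In the slide code, response.text would be passed to parse_graph_response in place of the bare json.loads call.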
Question:
Who Likes My Post The Most?
Get Likes Count of Posts
access_token = '<access_token>'
url = "https://graph.facebook.com/me/posts?access_token=%s"
response = requests.get(url % access_token)
fb_data = json.loads(response.text)

count_dic = {}
for post in fb_data['data']:
    if 'likes' in post:
        for rec in post['likes']['data']:
            if rec['name'] in count_dic:
                count_dic[rec['name']] += 1
            else:
                count_dic[rec['name']] = 1
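The loop above only tallies likes; answering "who likes my posts the most" still needs a ranking step. A minimal sketch of both steps, using collections.Counter from the standard library in place of the hand-rolled dictionary. The count_likes helper and the sample posts are hypothetical and only mirror the 'likes' / 'data' / 'name' layout the slide code reads.

```python
from collections import Counter


def count_likes(posts):
    """Tally how many posts each person liked."""
    likes = Counter()
    for post in posts:
        # posts with no likes simply lack the 'likes' key
        for rec in post.get("likes", {}).get("data", []):
            likes[rec["name"]] += 1
    return likes


# Made-up sample data in the same shape as fb_data['data'].
sample_posts = [
    {"id": "1", "likes": {"data": [{"name": "Alice"}, {"name": "Bob"}]}},
    {"id": "2", "likes": {"data": [{"name": "Alice"}]}},
    {"id": "3"},
]

like_counts = count_likes(sample_posts)
top_fans = like_counts.most_common(1)   # highest count first
print(top_fans)
```

With real data, count_likes(fb_data['data']) would replace the sample list.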
Simple, Huh?
Ask a Harder Question!
Question:
What Are People Talking About?
Take the Cross-Strait Agreement As an Example
import re
import json
import requests

# userid and access_token are set earlier (not shown on this slide)
keyword_dic = {}
posts_url = 'https://graph.facebook.com/%s/posts?access_token=%s'
post_response = requests.get(posts_url % (userid, access_token))
post_json = json.loads(post_response.text)
for post in post_json['data']:
    if 'message' in post:
        # 服貿 is the short name for the Cross-Strait Service Trade Agreement
        m = re.search(u'服貿', post['message'])
        if m:
            if userid not in keyword_dic:
                keyword_dic[userid] = 1
            else:
                keyword_dic[userid] += 1
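The same counting idea can be tried offline on a few made-up posts. The sketch below swaps the regular expression for a plain substring test, which is enough for a fixed keyword. The count_keyword_posts helper, the user IDs, and the messages are all hypothetical.

```python
def count_keyword_posts(posts_by_user, keyword):
    """Return {user_id: number of posts whose message contains keyword}."""
    counts = {}
    for user_id, posts in posts_by_user.items():
        # posts without a 'message' key (e.g. shared links) count as no match
        hits = sum(1 for post in posts if keyword in post.get("message", ""))
        if hits:
            counts[user_id] = hits
    return counts


# Made-up posts grouped by a made-up user ID.
sample = {
    "u1": [
        {"message": "服貿協議 should be reviewed"},
        {"message": "lunch"},
        {"message": "反服貿"},
    ],
    "u2": [
        {"message": "weekend hiking"},
        {"story": "shared a link"},
    ],
}

keyword_counts = count_keyword_posts(sample, "服貿")
print(keyword_counts)
```

Users who never mention the keyword are left out of the result, matching the behaviour of keyword_dic on the slide.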
Text Mining
NLTK!
Sorry! My Facebook Friends Speak Mandarin
Jieba!
Using Jieba For Word Tokenization
import jieba

data = post_json['data']
dic = {}
for rec in data:
    if 'message' in rec:
        seg_list = jieba.cut(rec['message'])
        for seg in seg_list:
            if seg in dic:
                dic[seg] = dic[seg] + 1
            else:
                dic[seg] = 1
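The counting step does not depend on which tokenizer produces the segments, so it can be sketched with the tokenizer passed in as a function. The word_frequencies helper and the sample messages are hypothetical, and str.split stands in for jieba.cut so the sketch runs without jieba installed; for Mandarin text, jieba.cut would be passed instead.

```python
from collections import Counter


def word_frequencies(messages, tokenize):
    """Count tokens across messages using the supplied tokenizer."""
    freq = Counter()
    for message in messages:
        for token in tokenize(message):
            token = token.strip()
            if token:              # skip whitespace-only tokens
                freq[token] += 1
    return freq


# Stand-in tokenizer: whitespace splitting works for these English samples.
# With jieba installed: word_frequencies(messages, jieba.cut)
messages = ["service trade agreement", "trade talks", "agreement signed"]
freq = word_frequencies(messages, str.split)
top_two = freq.most_common(2)
print(top_two)
```

Sorting by count is what turns the raw dictionary into a list of the topics people mention most.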
Question:
How to Identify Social Groups?
Required Packages
networkx: Analyze Social Network
community: Community Detection Using Louvain Method
Social Network
Man As Node, Connection As Edge
Build Friendship Matrix
import networkx as nx

# friends_obj: parsed JSON of the friend list, fetched earlier (not shown on this slide)
mutual_friends = {}
for friend in friends_obj['data']:
    mutual_url = "https://graph.facebook.com/%s/mutualfriends?access_token=%s"
    res = requests.get(mutual_url % (friend['id'], access_token))
    response_data = json.loads(res.text)['data']
    mutual_friends[friend['name']] = [data['name'] for data in response_data]

nxg = nx.Graph()
# connect 'me' to every friend
for mf in mutual_friends:
    nxg.add_edge('me', mf)
# connect each friend to the friends we have in common
for f1 in mutual_friends:
    for f2 in mutual_friends[f1]:
        nxg.add_edge(f1, f2)
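To see what edges that construction produces without calling Facebook or installing networkx, the same two rules can be applied to a small made-up mutual_friends dictionary. The build_edges helper and the names are hypothetical; each edge is stored as a frozenset so that a friendship reported from both sides collapses into one undirected edge, much as nx.Graph keeps a single edge per pair.

```python
def build_edges(mutual_friends, ego="me"):
    """Return the set of undirected edges implied by a mutual-friends mapping."""
    edges = set()
    for friend, shared in mutual_friends.items():
        edges.add(frozenset((ego, friend)))          # ego knows every friend
        for other in shared:
            if other != friend:                      # guard against self-loops
                edges.add(frozenset((friend, other)))
    return edges


# Made-up data: Bob is a mutual friend of both Alice and Carol.
sample_mutual = {
    "Alice": ["Bob"],
    "Bob": ["Alice", "Carol"],
    "Carol": ["Bob"],
}

edges = build_edges(sample_mutual)
print(sorted(sorted(edge) for edge in edges))
```

Alice and Carol end up unconnected, which is exactly the kind of gap that community detection later picks up on.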
Draw Network Plot
nx.draw(nxg)
Calculate Network Property
nx.betweenness_centrality(nxg)
nx.degree_centrality(nxg)
nx.closeness_centrality(nxg)
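Of these measures, degree centrality is the simplest to work out by hand: a node's number of neighbours divided by the number of other nodes in the graph. A pure-Python sketch of that formula follows, assuming the graph is given as a dictionary of neighbour sets. The function, the sample graph, and the choice to return 0.0 for a graph with fewer than two nodes are all assumptions of this sketch, not taken from the talk.

```python
def degree_centrality(adjacency):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adjacency)
    if n <= 1:
        # edge-case choice made for this sketch only
        return {node: 0.0 for node in adjacency}
    return {node: len(neighbours) / (n - 1)
            for node, neighbours in adjacency.items()}


# Made-up ego network: 'me' knows everyone, A and B also know each other.
adjacency = {
    "me": {"A", "B", "C"},
    "A": {"me", "B"},
    "B": {"me", "A"},
    "C": {"me"},
}

centrality = degree_centrality(adjacency)
most_central = max(centrality, key=centrality.get)
print(most_central, centrality)
```

In a network built around one person's friend list, that person always scores highest, so the interesting comparison is among the friends.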
Community Detection
import community

def find_partition(graph):
    g = graph
    partition = community.best_partition(g)
    return partition

new_G = find_partition(nxg)
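The Louvain method searches for the partition with the highest modularity, a score comparing the edges inside each community with what a random graph of the same degrees would give. The sketch below computes only that score for a given partition; it is not the Louvain search itself. It assumes an unweighted graph with at least one edge and no self-loops, and the modularity helper, edge list, and partitions are made up for illustration. The partition uses the same node-to-community-id dictionary shape that best_partition returns.

```python
def modularity(edges, partition):
    """Modularity Q = sum over communities of L_c/m - (d_c / 2m)^2."""
    m = len(edges)
    internal = {}      # community -> number of edges with both ends inside
    degree_sum = {}    # community -> sum of the degrees of its nodes
    for u, v in edges:
        cu, cv = partition[u], partition[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())


# Two triangles joined by a single bridge edge (c-d).
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
single = {node: 0 for node in split}

q_split = modularity(edges, split)
q_single = modularity(edges, single)
print(q_split, q_single)
```

Splitting along the bridge scores higher than lumping everything together, which is why a modularity-driven method would report the two triangles as separate communities.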
Draw Social Network Communities
import matplotlib.pyplot as plt

size = float(len(set(new_G.values())))
pos = nx.spring_layout(nxg)
count = 0.
for com in set(new_G.values()):
    count = count + 1.
    list_nodes = [nodes for nodes in new_G.keys() if new_G[nodes] == com]
    nx.draw_networkx_nodes(nxg, pos, list_nodes, node_size=20, node_color=str(count / size))
nx.draw_networkx_edges(nxg, pos, alpha=0.5)
plt.show()
Community Partitioned Plot
Gephi
Gephi, an open source graph visualization and manipulation software
One More Thing
To build your own data service
jsnetworkx
A JavaScript port of the NetworkX graph library.
juimee.com
THANK YOU