Stack Overflow
Network Graph
Team: Heineken
Motivation
What do we want?
- We want to see the overall quality of a question
- We want to see how closely similar questions are related to a given question
Solution
Visualize the Question Network as a Graph
- Inspired by http://www.visuwords.com/
Dataset
How Can We Translate the Questions Into a Network Graph?
Three Properties of a Graph
1. Size of Node: Quality of a Question
2. Distance Between Nodes: Similarity Between Questions (content based)
3. Directed Edges: Link Between Questions (item based)
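The mapping above can be sketched as a small data structure. This is a minimal illustration in Python, not the team's implementation: the names `QuestionNode` and `QuestionGraph`, the quality scores, and the `1 - similarity` distance rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class QuestionNode:
    question_id: int
    quality: float  # drives the node size (hypothetical score)


@dataclass
class QuestionGraph:
    nodes: dict = field(default_factory=dict)       # question_id -> QuestionNode
    similarity: dict = field(default_factory=dict)  # frozenset({a, b}) -> score in [0, 1]
    links: set = field(default_factory=set)         # directed (source_id, target_id) pairs

    def add_question(self, question_id, quality):
        self.nodes[question_id] = QuestionNode(question_id, quality)

    def set_similarity(self, a, b, score):
        # Similarity is symmetric, so the pair is stored without order.
        self.similarity[frozenset((a, b))] = score

    def add_link(self, source, target):
        # A link is directed: `source` refers to `target`.
        self.links.add((source, target))

    def distance(self, a, b):
        # Assumed rule: more similar questions sit closer together.
        return 1.0 - self.similarity.get(frozenset((a, b)), 0.0)


graph = QuestionGraph()
graph.add_question(1, quality=8.0)
graph.add_question(2, quality=3.0)
graph.set_similarity(1, 2, 0.75)
graph.add_link(2, 1)
```

Storing similarity under an unordered key while keeping links as ordered pairs reflects the difference between the two relations: distance is the same in both directions, a link is not.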
Quality
Similarity
Association
Question
Concept
Data Exploration
Data Cleansing
- Bash: Sed
- Pig: Pig Latin
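As an illustration of the kind of cleansing this stage performs, the sketch below strips markup from a question body and normalizes whitespace. It is written in Python rather than Sed or Pig, and the specific steps (removing HTML tags, unescaping entities, lowercasing) are assumptions; the slides do not list the actual rules. The function name `clean_body` is hypothetical.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")
SPACE_RE = re.compile(r"\s+")


def clean_body(raw):
    """Return plain lowercase text from an HTML question body."""
    text = TAG_RE.sub(" ", raw)      # drop tags such as <p> or <code>
    text = html.unescape(text)       # turn entities such as &amp; back into characters
    text = SPACE_RE.sub(" ", text)   # collapse runs of whitespace
    return text.strip().lower()


sample = "<p>How do I  parse <code>JSON</code> in Java?</p>"
cleaned = clean_body(sample)
```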
Data Analysis
Cosine Similarity
Aggregation
Integration
Hive
Finding Similarities Among the Questions in RStudio With the R Language
● Library: "tm"
● Data Format: Corpus
● Standardization:
○ Cleansing
○ SMART IR system
○ Porter Stemmer Approach
● Term Frequency matrix
○ Normalization: Cornell SMART system
● Generate Vector Space Model Matrix
● Dot Product of the Matrix
Number of pairwise similarities: n(n - 1)/2 for n questions (one per pair)
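The pipeline above (term frequencies, weighting, vector space model, dot product) can be sketched with the Python standard library. This is not the R "tm" workflow itself: the weighting used here (raw term frequency times log(N / document frequency)) is an assumed, simpler stand-in for the SMART weighting, and stop-word removal and stemming are left out. The sample questions are invented.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (a dict) per tokenized document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: tf * math.log(n_docs / doc_freq[term])
            for term, tf in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity: dot product divided by the two vector lengths."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


questions = [
    "how to parse json in java".split(),
    "parse json string in python".split(),
    "center a div with css".split(),
]
vectors = tfidf_vectors(questions)
# One score per unordered pair of questions.
similarities = {
    (i, j): cosine(vectors[i], vectors[j])
    for i, j in combinations(range(len(vectors)), 2)
}
```

Questions that share no terms get a similarity of 0, and the number of scores grows with the number of question pairs, which is why the full matrix becomes expensive on a large dataset.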
Visualization
Discovery - First 300 Questions (TF-IDF)
Discovery - First 300 Questions (Boolean)
Discovery - Links Rank Validation
- Initial (OpenOrd)
- Filter
- Edge View (OpenOrd)
- GEXF
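GEXF is the XML graph format that Gephi reads and writes. The sketch below builds a tiny directed graph in that format with Python's standard library, to show roughly what such a file contains. It is an illustration only: the helper name `build_gexf`, the sample questions, and the weight are assumptions, and a real Gephi export carries additional data such as positions, sizes, and colors.

```python
import xml.etree.ElementTree as ET

GEXF_NS = "http://www.gexf.net/1.2draft"


def build_gexf(nodes, edges):
    """nodes: {id: label}; edges: [(source, target, weight)]."""
    root = ET.Element("gexf", {"xmlns": GEXF_NS, "version": "1.2"})
    graph = ET.SubElement(root, "graph", {"defaultedgetype": "directed"})
    nodes_el = ET.SubElement(graph, "nodes")
    for node_id, label in nodes.items():
        ET.SubElement(nodes_el, "node", {"id": str(node_id), "label": label})
    edges_el = ET.SubElement(graph, "edges")
    for index, (source, target, weight) in enumerate(edges):
        ET.SubElement(edges_el, "edge", {
            "id": str(index),
            "source": str(source),
            "target": str(target),
            "weight": str(weight),
        })
    return ET.tostring(root, encoding="unicode")


gexf_text = build_gexf(
    {1: "How to parse JSON in Java?", 2: "Parse a JSON string"},
    [(2, 1, 0.75)],
)
```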
Deployment
Sigma.js
Graphs can be exported from Gephi, but the exported images are static
What we want is for the user to interact with the graph
Sigma.js is a JavaScript library that renders graphs on web pages
Export the graph from Gephi and parse it with Sigma.js
Sigma.js: Init function
Sigma.js: parse.Gexf function, Creating Nodes
Sigma.js: Creating Edges
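The parsing step reads the exported GEXF file and turns its `node` and `edge` elements into the nodes and edges of the rendered graph. The sketch below shows that idea in Python with the standard library; it is not the Sigma.js API, and the function name `parse_gexf` and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

NS = {"g": "http://www.gexf.net/1.2draft"}

GEXF_SAMPLE = """<gexf xmlns="http://www.gexf.net/1.2draft" version="1.2">
  <graph defaultedgetype="directed">
    <nodes>
      <node id="1" label="Question A"/>
      <node id="2" label="Question B"/>
    </nodes>
    <edges>
      <edge id="0" source="2" target="1" weight="0.75"/>
    </edges>
  </graph>
</gexf>"""


def parse_gexf(text):
    """Return (nodes, edges) as lists of plain dictionaries."""
    root = ET.fromstring(text)
    # Creating nodes: one dictionary per <node> element.
    nodes = [
        {"id": n.get("id"), "label": n.get("label")}
        for n in root.findall("./g:graph/g:nodes/g:node", NS)
    ]
    # Creating edges: one dictionary per <edge> element.
    edges = [
        {
            "source": e.get("source"),
            "target": e.get("target"),
            "weight": float(e.get("weight", "1.0")),
        }
        for e in root.findall("./g:graph/g:edges/g:edge", NS)
    ]
    return nodes, edges


nodes, edges = parse_gexf(GEXF_SAMPLE)
```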
Final Product
Code Available on Git Repo
Cheers!