rforcecom: an r package which provides a connection to force.com and salesforce.com

Post on 11-Aug-2014


RForcecom: An R package which provides a connection to Force.com and Salesforce.com

Takekatsu Hiramura

2014-07-02 The R User Conference 2014 @ UCLA


Agenda


1. About me

2. Brief introduction of Force.com and Salesforce.com

3. Overview of RForcecom

4. Features of RForcecom

5. Example of the analysis using RForcecom

-Visualizing consumers’ voice-


About Salesforce.com/Force.com

» “Salesforce.com” is one of the best-known SaaS (Software-as-a-Service) based CRM (Customer Relationship Management) services.

» “Force.com” is the application platform of Salesforce.com; it is classified as PaaS (Platform-as-a-Service).

Specific features of CRM: Campaign Management, Contract Management, Customer Management, Product Management, Sales Forecasting, Case Management, etc.

Overview of the Application/Service Architecture

Layers: Service, Application Platform

Components: Custom Object, Apex / VisualForce, Web API, etc.


The RForcecom package

I developed an R package, “RForcecom”, which provides a connection to Salesforce.com and Force.com via the REST API.

R: Statistical Analysis, Machine Learning, Data Manipulation, Visualization

Salesforce.com: Customer Relationship Management, Dashboard, Collaboration Platform (Chatter, Schedule, ToDo, etc.)

Operations between R and Salesforce.com: Insert, Update/Upsert, Delete, Data extract, SOQL query, Search

The CRAN page of RForcecom

http://cran.r-project.org/web/packages/RForcecom/

Source code is available on GitHub


https://github.com/hiratake55/RForcecom


Features of RForcecom

#    Feature                        Function name
1    Sign in to Force.com           rforcecom.login()
2    Create a record                rforcecom.create()
3    Retrieve records               rforcecom.retrieve()
4    Update a record                rforcecom.update()
5    Upsert a record                rforcecom.upsert()
6    Delete a record                rforcecom.delete()
7    Execute a SOQL query           rforcecom.query()
8    Execute a SOSL search          rforcecom.search()
9    Retrieve a list of objects     rforcecom.getObjectList()
10   Retrieve object descriptions   rforcecom.getObjectDescription()
11   Retrieve a server timestamp    rforcecom.getServerTimestamp()

Sign in to Force.com

> library(RForcecom)
> username <- "yourname@yourcompany.com"
> password <- "YourPasswordSECURITY_TOKEN"
> instanceURL <- "https://xxx.salesforce.com/"
> apiVersion <- "26.0"
> session <- rforcecom.login(username, password, instanceURL, apiVersion)

Retrieve records

> objectName <- "Case"
> fields <- c("CaseNumber", "Subject", "Status")
> rforcecom.retrieve(session, objectName, fields, order=c("CaseNumber"), limit=12)

Execute a SOQL query

> soqlQuery <- "SELECT Id, Name, Industry FROM Account ORDER BY CreatedDate"
> rforcecom.query(session, soqlQuery)


RForcecom demo: Visualizing consumers’ voice

» Assume you are a manager at a company and want to hear the consumers’ voice from the CRM. The consumers’ voices are stored in Salesforce.com, where they were registered by the call center staff.

Actor            Role                         Components
Call Center      Data collection/Operation
Salesforce.com   Data Management              Customer Mgmt., Case Mgmt., REST/SOAP API
R                Data Analysis                API Client (RForcecom), NLP (TreeTagger), Visualization (Wordcloud), …
Managers         Reporting

Sample Dataset: Delta’s Twitter Social Customer Support Account

» It is difficult to use an actual dataset, so I crawled Delta Air Lines’ Twitter account (@DeltaAssist) and stored the tweets in Salesforce.com instead.

https://twitter.com/DeltaAssist/with_replies

Step 1: Retrieving a dataset from Salesforce.com

» Tweets sent to @DeltaAssist are stored in Salesforce.com.


Step 1: Retrieving a dataset from Salesforce.com

» Load the required library and sign in to Salesforce.com.

> library(RForcecom)
> username <- "yourname@yourcompany.com"
> password <- "YourPasswordSECURITY_TOKEN"
> instanceURL <- "https://xxx.salesforce.com/"
> apiVersion <- "26.0"
> session <- rforcecom.login(username, password, instanceURL, apiVersion)

Step 1: Retrieving a dataset from Salesforce.com

» Retrieve the dataset by specifying the object name and the field names.

> CustomerVoice <- rforcecom.retrieve(session, "CustomerVoice__c", c("TweetDate__c", "Tweet__c"))
> head(CustomerVoice$Tweet__c, 10)

Step 2: Extracting high-frequency keywords

» Tag the word class of each word using the “koRpus” package and TreeTagger.

> library(koRpus)
> temp.file.name <- tempfile()
> write.table(CustomerVoice$Tweet__c, temp.file.name, col.names=F, row.names=F)
> tagged <- treetag(temp.file.name, lang="en", treetagger="manual",
+   TT.options=list(path="C:/Apps/TreeTagger", preset="en", encoding="UTF-8"))
> tagged.DF <- tagged@TT.res
> head(tagged.DF, 10)

Step 2: Extracting high-frequency keywords

» Filter the nouns from the tagged list.

> term <- tagged.DF[tagged.DF$wclass=="noun",]$token
> term <- tolower(term)
> head(term, 20)

Step 2: Extracting high-frequency keywords

» Count the frequencies of the terms.

> term.unique <- unique(term)
> term.freq <- unlist(lapply(term.unique, function(x){length(term[term==x])}))
> termfreq <- data.frame(term=term.unique, freq=term.freq)
> termfreq <- termfreq[order(termfreq$freq, decreasing=T),]
> head(termfreq, 10)
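The same count can also be obtained with base R's table(), which tallies every distinct term in one pass. This is only a sketch: the `term` vector below is made-up toy data standing in for the lower-cased nouns produced by the tagger.

```r
# Toy input (hypothetical) standing in for the lower-cased noun tokens
term <- c("flight", "bag", "flight", "seat", "bag", "flight")

# table() counts every distinct term in a single pass
counts <- table(term)

# Convert to the same term/freq data frame layout used on the slide
termfreq <- data.frame(term = names(counts),
                       freq = as.vector(counts),
                       stringsAsFactors = FALSE)
termfreq <- termfreq[order(termfreq$freq, decreasing = TRUE), ]
print(termfreq)
```

On long token vectors this avoids rescanning the whole vector once per unique term.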

Step 3: Visualizing the words as a word cloud

» Visualize the terms using the wordcloud package. “Flight” is the most frequent term.

> library(wordcloud)
> termfreq.top <- head(termfreq, n=100)
> pal <- brewer.pal(8, "Dark2")
> windowsFonts(SegoeUI = "Segoe UI")  # Windows only
> wordcloud(termfreq.top$term, termfreq.top$freq, random.color=T, colors=pal, family="SegoeUI")

Step 4: Visualize the Buzz-word of the day

» Retrieve the daily datasets with SOQL.

> CustomerVoice.sun <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-25T00:00:00-04:00 and TweetDate__c < 2014-05-26T00:00:00-04:00")
> CustomerVoice.mon <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-26T00:00:00-04:00 and TweetDate__c < 2014-05-27T00:00:00-04:00")
> CustomerVoice.tue <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-27T00:00:00-04:00 and TweetDate__c < 2014-05-28T00:00:00-04:00")
> CustomerVoice.wed <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-28T00:00:00-04:00 and TweetDate__c < 2014-05-29T00:00:00-04:00")
> CustomerVoice.thu <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-29T00:00:00-04:00 and TweetDate__c < 2014-05-30T00:00:00-04:00")
> CustomerVoice.fri <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-30T00:00:00-04:00 and TweetDate__c < 2014-05-31T00:00:00-04:00")
> CustomerVoice.sat <- rforcecom.query(session, "select Tweet__c, TweetDate__c from CustomerVoice__c where TweetDate__c >= 2014-05-31T00:00:00-04:00 and TweetDate__c < 2014-06-01T00:00:00-04:00")
> CustomerVoice.all <- rbind(CustomerVoice.sun, CustomerVoice.mon, CustomerVoice.tue, CustomerVoice.wed, CustomerVoice.thu, CustomerVoice.fri, CustomerVoice.sat)
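Since the seven queries differ only in their date bounds, the query strings could be generated rather than typed by hand. The following is a minimal sketch of that idea; the helper name build.daily.queries() and its parameters are hypothetical and not part of RForcecom. It only builds strings, so it runs without a Salesforce.com session.

```r
# Build one SOQL query string per day (string construction only).
# build.daily.queries() is a hypothetical helper, not an RForcecom function.
build.daily.queries <- function(first.day, n.days, tz.offset = "-04:00") {
  days <- as.Date(first.day) + 0:(n.days - 1)
  queries <- sprintf(
    paste("select Tweet__c, TweetDate__c from CustomerVoice__c",
          "where TweetDate__c >= %sT00:00:00%s and TweetDate__c < %sT00:00:00%s"),
    format(days), tz.offset, format(days + 1), tz.offset)
  names(queries) <- format(days)  # name each query by its date
  queries
}

daily.queries <- build.daily.queries("2014-05-25", 7)
print(daily.queries[["2014-05-25"]])

# With a live session, each query could then be passed to rforcecom.query():
# CustomerVoice.daily <- lapply(daily.queries, function(q) rforcecom.query(session, q))
```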

Step 4: Visualize the Buzz-word of the day

» Tag, extract the nouns and calculate the Term Frequency (TF).

# Tag, extract nouns, calculate TF
make.treetag <- function(CustomerVoice){
  temp.file.name <- tempfile()
  write.table(CustomerVoice$Tweet__c, temp.file.name, col.names=F, row.names=F)
  tagged <- treetag(temp.file.name, lang="en", treetagger="manual",
    TT.options=list(path="C:/Apps/TreeTagger", preset="en", encoding="UTF-8"))
  tagged.DF <- tagged@TT.res
  # Extract nouns, convert to lower case
  term <- tagged.DF[tagged.DF$wclass=="noun",]$token
  term <- tolower(term)
  # Count frequency of each term
  term.unique <- unique(term)
  term.freq <- unlist(lapply(term.unique, function(x){length(term[term==x])}))
  termfreq.DF <- data.frame(term=term.unique, freq=term.freq, stringsAsFactors=F)
  termfreq.DF <- termfreq.DF[order(termfreq.DF$freq, decreasing=T),]
  return(termfreq.DF)
}

# Apply to each dataset
termfreq.sun <- make.treetag(CustomerVoice.sun)
termfreq.mon <- make.treetag(CustomerVoice.mon)
termfreq.tue <- make.treetag(CustomerVoice.tue)
termfreq.wed <- make.treetag(CustomerVoice.wed)
termfreq.thu <- make.treetag(CustomerVoice.thu)
termfreq.fri <- make.treetag(CustomerVoice.fri)
termfreq.sat <- make.treetag(CustomerVoice.sat)
termfreq.all <- make.treetag(CustomerVoice.all)

*TF (Term Frequency): number of occurrences of term i in document j
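The per-day calls could also be written by keeping the daily datasets in a named list and applying one function to every element. A minimal sketch, assuming toy tweets and a hypothetical stand-in count.terms() that splits on whitespace instead of calling TreeTagger, so that it runs without the tagger installed:

```r
# Hypothetical stand-in for make.treetag(): counts whitespace-separated
# words instead of tagger-selected nouns (no TreeTagger required).
count.terms <- function(tweets) {
  term <- tolower(unlist(strsplit(tweets, "\\s+")))
  counts <- sort(table(term), decreasing = TRUE)
  data.frame(term = names(counts), freq = as.vector(counts),
             stringsAsFactors = FALSE)
}

# Toy data: one character vector of tweets per day
daily.tweets <- list(
  sun = c("flight delayed", "flight to Boston"),
  mon = c("lost bag", "bag fee")
)

# One call replaces the repeated per-day assignments
termfreq.daily <- lapply(daily.tweets, count.terms)
print(termfreq.daily$sun)
```

Individual days are then available as termfreq.daily$sun, termfreq.daily$mon, and so on.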


Step 4: Visualize the Buzz-word of the day

» Calculate the Inverse Document Frequency (IDF).

# Calculate IDF
IDF.documents <- sapply(termfreq.all$term, function(x){
  sum(
    nrow(termfreq.sun[termfreq.sun$term==x,])>0,
    nrow(termfreq.mon[termfreq.mon$term==x,])>0,
    nrow(termfreq.tue[termfreq.tue$term==x,])>0,
    nrow(termfreq.wed[termfreq.wed$term==x,])>0,
    nrow(termfreq.thu[termfreq.thu$term==x,])>0,
    nrow(termfreq.fri[termfreq.fri$term==x,])>0,
    nrow(termfreq.sat[termfreq.sat$term==x,])>0
  )
})
IDF <- data.frame(term=termfreq.all$term, IDF=log(7/IDF.documents))

*IDF (Inverse Document Frequency):

IDF measures “term specificity”.

IDF = log(N / df)

df: number of documents containing term i.

N: total number of documents (here N = 7, one document per day).

Reference: http://dovgalecs.com/blog/matlab-simple-tf-idf/
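To make the definition concrete, here is a small worked example of IDF = log(N / df) using the natural logarithm, as the slide code does. The three "documents" are invented toy data, not the tweets from the talk.

```r
# Three toy "documents" (hypothetical data), each a vector of terms
docs <- list(
  sun = c("flight", "vancouver", "flight"),
  mon = c("flight", "bag"),
  tue = c("flight", "bag", "seat")
)

N <- length(docs)               # total number of documents
vocab <- unique(unlist(docs))   # every distinct term

# df: number of documents that contain each term at least once
df <- sapply(vocab, function(x) sum(sapply(docs, function(d) x %in% d)))

idf <- log(N / df)
print(round(idf, 3))
```

A term that occurs in every document ("flight" here) gets an IDF of 0, while terms confined to a single document get the largest value.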

Step 4: Visualize the Buzz-word of the day

» Calculate the TF-IDF of each dataset.

# Calculate TF and return TF-IDF
calc.tfidf <- function(termfreq){
  # TF
  termfreq$TF <- termfreq$freq/sum(termfreq$freq)
  # TF-IDF
  tfidf.val <- lapply(termfreq$term, function(x){
    tf_i <- termfreq[termfreq$term==x,]$TF
    idf_i <- IDF[IDF$term==x,]$IDF
    return(tf_i * idf_i)
  })
  tfidf <- data.frame(term=termfreq$term, TFIDF=unlist(tfidf.val))
  return(tfidf)
}

tfidf.sun <- calc.tfidf(termfreq.sun)
tfidf.mon <- calc.tfidf(termfreq.mon)
tfidf.tue <- calc.tfidf(termfreq.tue)
tfidf.wed <- calc.tfidf(termfreq.wed)
tfidf.thu <- calc.tfidf(termfreq.thu)
tfidf.fri <- calc.tfidf(termfreq.fri)
tfidf.sat <- calc.tfidf(termfreq.sat)

*TF-IDF (Term Frequency–Inverse Document Frequency):

TF-IDF measures how important a word is in a document.

TF-IDF = TF × IDF
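As an illustration of TF-IDF = TF × IDF, the sketch below computes the product for one day in a vectorized way, looking up each term's IDF with match(). The counts and IDF values are invented for the example (assuming seven daily documents) and are not results from the talk.

```r
# Hypothetical term counts for one day
termfreq <- data.frame(term = c("flight", "bag", "seat"),
                       freq = c(6, 3, 1), stringsAsFactors = FALSE)

# Hypothetical IDF table over N = 7 documents (rows in a different order)
IDF <- data.frame(term = c("seat", "flight", "bag"),
                  IDF = c(log(7 / 1), log(7 / 7), log(7 / 2)),
                  stringsAsFactors = FALSE)

tf <- termfreq$freq / sum(termfreq$freq)        # relative frequency per term
idf <- IDF$IDF[match(termfreq$term, IDF$term)]  # align IDF rows to the terms
tfidf <- data.frame(term = termfreq$term, TFIDF = tf * idf,
                    stringsAsFactors = FALSE)
print(tfidf)
```

Although "flight" is the most frequent term, its TF-IDF is 0 because it appears in all seven documents, which is why the daily word clouds surface day-specific words instead.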

Step 4: Visualize the Buzz-word of the day

» Output the word clouds in PNG format.

# Wordcloud
draw.wordcloud <- function(tfidf, title=""){
  png.filename <- paste("wordcloud-", title, ".png", sep="")
  png(png.filename, width=7, height=7, units="in", res=600)
  tfidf <- tfidf[order(tfidf$TFIDF, decreasing=T),]
  tfidf.head <- head(tfidf, n=100)  # Extract top 100 terms
  par(oma = c(0, 1, 2, 1))  # Set margin
  pal <- brewer.pal(8, "Dark2")
  wordcloud(tfidf.head$term, tfidf.head$TFIDF, random.color=T, colors=pal, main=title)  # Plot Wordcloud
  par(oma = c(0, 0, 0, 0))  # Unset margin
  title(title)  # Add title
  dev.off()  # Close file
}

# Plot Wordcloud for each dataset
draw.wordcloud(tfidf.sun, title="2014-05-25(Sun)")
draw.wordcloud(tfidf.mon, title="2014-05-26(Mon)")
draw.wordcloud(tfidf.tue, title="2014-05-27(Tue)")
draw.wordcloud(tfidf.wed, title="2014-05-28(Wed)")
draw.wordcloud(tfidf.thu, title="2014-05-29(Thu)")
draw.wordcloud(tfidf.fri, title="2014-05-30(Fri)")
draw.wordcloud(tfidf.sat, title="2014-05-31(Sat)")

Step 4: Visualize the Buzz-word of the day

» These word clouds describe the trends of each day.

Word cloud of a Sunday

There are specific locations such as “Vancouver”, “Boston” and “Phoenix”.

It seems that this day had more questions about routes or bookings than other days, or troubles that happened at specific airports.

Word cloud of a Thursday

The words “seatback”, “pain” and “captains” appeared in the word cloud.

It seems that there were troubles with the fleet, cabin or in-flight service somewhere.

Word cloud of a Friday

The words “award” and “platinum” appeared in the word cloud.

It seems that this day had more questions about the frequent flyer program than a normal day.

Conclusion

» I gave a brief introduction to the SaaS-based CRM “Salesforce.com” and its application platform “Force.com”.

» The R package RForcecom has various features for exchanging data with Salesforce.com/Force.com.

» I presented a sample use case of RForcecom using a Twitter dataset and visualized the customers’ voice.

» The framework might also be applied to sentiment (negative/positive) analysis and to analyzing customer feedback on a specific product or service in order to improve customer satisfaction.


Thank you

» Any Questions?

Takekatsu Hiramura

thira@plavox.info

http://thira.plavox.info/

http://rforcecom.plavox.info/