METHODOLOGY
Data & Methodology
For analyzing the above objectives and hypotheses, data mining tools and techniques (R programming, R Rattle, WEKA and SPSS 20) were used for data validation, summarizing, visualizing, modeling and testing. We implemented the following data mining and machine learning techniques:
1. Data Mining Tools
2. Data Validation and Imputation
3. Classifying models
4. Clustering models
5. Models Comparison
6. Best fitted Model
Reachout Analytics Client Sample Report
E-commerce influence variables were gathered from secondary sources: research journals, articles and publications, and e-commerce consultants. The list of variables is below.
1. I look for the popular brands when purchasing online.
2. If I am satisfied with the product purchased, I intend to buy other products from the same brand.
3. I can shop online by comparing the products and can access it at all hours.
4. I feel we can save time online rather than going to buy in retail stores.
5. I feel we can have better offers in online purchasing.
6. I feel the products that are bought online satisfy my requirements.
7. I read the reviews and the product information before purchasing a product.
8. I feel secure when purchasing online.
9. I prefer to buy products which are approved or certified by quality experts.
10. I feel it is easy to book the product in advance and order it when the stock is available.
11. I feel the price is an important factor while purchasing online.
12. I feel the promotions influence my buying decisions.
13. I feel the product advertising will influence me to some extent to buy the product.
14. I'm willing to pay more for high value products in certain categories.
15. I still buy the same product/brand if there is an increase in price.
16. I think the convenient sizes and the packages offered may significantly impact my purchase decision.
17. I look for discounts and gift coupons while purchasing the product.
18. I choose to buy products endorsed by my favorite celebrities.
19. The opinions and recommendations of your family members would affect your purchase decision.
20. The opinions of your friends and colleagues would affect your purchase decision.
21. I think buying an expensive product indicates a high standard of living.
22. I think international brands understand my requirements well.
23. I buy products online when they are recommended by my friends and colleagues.
24. I buy products online when they are recommended by the doctor.
25. I like to check things out by trialing them before I buy the product.
26. I prefer to buy online from websites that have up-to-date contents.
27. Attractive website design encourages me to spend more time searching for products.
28. I buy personal care products whenever I go out shopping.
29. I buy personal care products every month.
30. I prefer to buy through user-friendly web portals.
2. Data Validation and Imputation
Data Validation
Dimension Reduction (Principal Component Method)
E-Commerce Influencing Variables: 30 variables
Loaded Factors:
1. Customer Perception
2. Customer Attitude
3. Customer Branding
4. Customer Behavior
5. Packaging
6. Customer Self Belief
7. Offers and Discounts
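The principal component idea behind the dimension reduction can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SPSS procedure used for the report: it extracts only the first component by power iteration, applies no rotation, and runs on hypothetical toy responses (`toy_responses`) rather than the 727 survey records. The function names are illustrative.

```python
import math
import statistics

def correlation_matrix(rows):
    """Pearson correlation matrix of the columns of `rows` (one row per respondent)."""
    columns = list(zip(*rows))
    n = len(rows)
    size = len(columns)
    means = [statistics.fmean(col) for col in columns]
    sds = [statistics.pstdev(col) for col in columns]
    corr = [[0.0] * size for _ in range(size)]
    for i in range(size):
        for j in range(size):
            cov = sum((columns[i][k] - means[i]) * (columns[j][k] - means[j])
                      for k in range(n)) / n
            corr[i][j] = cov / (sds[i] * sds[j])
    return corr

def first_principal_component(corr, iterations=500):
    """Largest eigenvalue and its unit eigenvector, found by power iteration."""
    size = len(corr)
    vector = [1.0 / math.sqrt(size)] * size
    eigenvalue = 0.0
    for _ in range(iterations):
        product = [sum(corr[i][j] * vector[j] for j in range(size)) for i in range(size)]
        eigenvalue = math.sqrt(sum(x * x for x in product))
        vector = [x / eigenvalue for x in product]
    return eigenvalue, vector

# Hypothetical 5-point Likert answers: 6 respondents x 4 items (not the survey data).
toy_responses = [
    [5, 4, 5, 2],
    [4, 4, 3, 3],
    [2, 1, 2, 4],
    [3, 3, 4, 3],
    [1, 2, 1, 5],
    [4, 5, 4, 1],
]

corr = correlation_matrix(toy_responses)
eigenvalue, vector = first_principal_component(corr)
# Loadings are the eigenvector entries scaled by the square root of the eigenvalue.
loadings = [math.sqrt(eigenvalue) * x for x in vector]
print(f"variance explained by component 1: {eigenvalue / len(corr):.1%}")
print("loadings:", [round(x, 3) for x in loadings])
```

Items with large loadings of the same sign on a component are the ones that would be grouped into one factor.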
Data Reliability:
Cronbach's Alpha, a reliability statistic, should be at least 0.70 (70%); if it is lower, the sample size should be increased. Here it is 0.939 (about 94%), which means the data is reliable. The table for this is shown below.
Sample Validity:
The KMO statistic confirms that a sample of 727 is adequate for the study. The minimum is 0.50 (50%); if the value is below 50%, the sample size should be increased. The result of 0.964 (about 96%) indicates good sample validity.
Reliability Statistics
Cronbach's Alpha: .939
N of Items: 30
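For reference, Cronbach's Alpha can be computed from item-level responses as k/(k-1) * (1 - sum of item variances / variance of the total score), where k is the number of items. The sketch below applies that standard formula to hypothetical toy responses (`toy_responses`); it is an illustration, not the SPSS output above.

```python
import statistics

def cronbach_alpha(rows):
    """Cronbach's alpha for `rows`, one list of item scores per respondent."""
    k = len(rows[0])
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical 5-point Likert answers: 6 respondents x 3 items (not the survey data).
toy_responses = [
    [5, 4, 5],
    [4, 4, 3],
    [2, 1, 2],
    [3, 3, 4],
    [1, 2, 1],
    [4, 5, 4],
]

alpha = cronbach_alpha(toy_responses)
print(f"Cronbach's alpha: {alpha:.3f}")
```

When every item moves in lockstep with the others the statistic reaches 1; unrelated items pull it toward 0.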
KMO and Bartlett's Test
Kaiser-Meyer-Olkin Measure of Sampling Adequacy: 0.964
Bartlett's Test of Sphericity
Approx. Chi-Square: 62659.45
df: 435
Sig.: 0
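The degrees of freedom reported for Bartlett's test can be checked by hand: for p items, df = p(p-1)/2, which gives 435 for 30 items. The sketch below also shows the usual chi-square approximation, -(n - 1 - (2p + 5)/6) * ln|R|, where R is the correlation matrix and n the sample size. The 2 x 2 matrix is a hypothetical example, and the helper names are illustrative.

```python
import math

def bartlett_df(p):
    """Degrees of freedom of Bartlett's test of sphericity for p variables."""
    return p * (p - 1) // 2

def log_determinant(matrix):
    """ln|det| by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    total = 0.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return total

def bartlett_chi_square(corr, n_obs):
    """Approximate chi-square statistic for a correlation matrix and sample size."""
    p = len(corr)
    return -(n_obs - 1 - (2 * p + 5) / 6) * log_determinant(corr)

print("df for 30 items:", bartlett_df(30))

# Hypothetical correlation matrix for two items with r = 0.5.
example = [[1.0, 0.5], [0.5, 1.0]]
print("chi-square:", round(bartlett_chi_square(example, 727), 2))
```

A correlation matrix close to the identity gives a statistic near zero, which is the case where factor analysis would not be worthwhile.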
Validation and Imputation of PCA Factors
[Figure: panels for 4. Customer Behavior, 5. Packaging, 6. Customer Self Belief, 7. Offers and Discounts]
By default, 2.16% of observations are flagged as outliers at the 95% confidence level; at the 99% level there are no outliers.
3. Classifying models
Classification Regression Model
1.1 Ordinal Regression
1.2 Multinomial Regression
1.3 Binary Regression Model
1.4 Naive Bayes Model
1.5 Decision Tree
1.6 KNN Model
1.7 SVM Model
Each model is built with a demographic variable versus the 7 factor variables (Customer Perception, Customer Attitude, Customer Branding, Customer Behavior, Packaging, Customer Self Belief, and Offers and Discounts) and the 6 behavior variables (CBB_9, CBB_10, CBB_11, CBB_12, CBB_13, CBB_14) that affect e-commerce. A detailed study is done on each model, and conclusions are drawn about the factors that influence consumer buying behavior. All the models are evaluated by the confusion matrix and the respective model diagnostics for each variable, which are shown below.
Binary, Ordinal and Multinomial Model Summary
1. Marital Status: Model Validation Summary Report [Binary Logistic Model]
Confusion Matrix and Model Diagnostics: Marital Status
            Unmarried   Married   ROC
Unmarried   146         154       0.677
Married     81          346       0.677
RMSE: 46%
Classification % (Count): 67.67% (492)
Misclassification % (Count): 32.32% (235)
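The classification and misclassification figures in these summaries follow directly from the confusion matrix: the diagonal cells are the correctly classified cases and everything else is misclassified. A minimal sketch using the Marital Status counts from the report (the function name is illustrative):

```python
def classification_rates(matrix):
    """Return (correct, wrong, correct %, wrong %) for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    wrong = total - correct
    return correct, wrong, 100 * correct / total, 100 * wrong / total

# Marital Status confusion matrix as reported (Unmarried, Married).
marital_status = [
    [146, 154],
    [81, 346],
]

correct, wrong, correct_pct, wrong_pct = classification_rates(marital_status)
print(f"classified: {correct} ({correct_pct:.2f}%)")
print(f"misclassified: {wrong} ({wrong_pct:.2f}%)")
```

The same function applies unchanged to the larger matrices for Region, Income Group, Occupation and Education.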
2. Customer Gender: Model Validation Summary Report [Binary Model]
Confusion Matrix and Model Diagnostics: Gender
         Female   Male   ROC
Female   159      195    0.633
Male     73       300    0.633
RMSE: 48%
Classification % (Count): 63.14% (459)
Misclassification % (Count): 36.86% (268)
3. Customer Region: Model Validation Summary Report [Multinomial Regression]
Confusion Matrix and Model Diagnostics: Region
          Town   Urban   Village   Rural   Metro   ROC
Town      137    6       4         0       2       0.976
Urban     6      206     6         1       11      0.955
Village   3      3       130       0       4       0.965
Rural     0      1       0         0       0       0.437
Metro     0      25      0         0       182     0.967
RMSE: 18%
Classification % (Count): 90.10% (655)
Misclassification % (Count): 9.90% (72)
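Overall accuracy can hide how individual categories fare, so it helps to look at each row of the Region matrix separately. The sketch below reports each row's total and the share of that row that falls on the diagonal. Reading rows as the observed category is an assumption, since the report does not state the orientation. Under that reading, the Rural row contains a single case, which may be why its ROC value is so much lower than the others.

```python
REGIONS = ["Town", "Urban", "Village", "Rural", "Metro"]

# Region confusion matrix as reported.
REGION_MATRIX = [
    [137, 6, 4, 0, 2],
    [6, 206, 6, 1, 11],
    [3, 3, 130, 0, 4],
    [0, 1, 0, 0, 0],
    [0, 25, 0, 0, 182],
]

def per_row_rates(labels, matrix):
    """Map each label to (row total, share of the row on the diagonal)."""
    summary = {}
    for index, label in enumerate(labels):
        total = sum(matrix[index])
        rate = matrix[index][index] / total if total else 0.0
        summary[label] = (total, rate)
    return summary

summary = per_row_rates(REGIONS, REGION_MATRIX)
for label, (total, rate) in summary.items():
    print(f"{label:8s} n={total:3d} diagonal share={rate:.1%}")
```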
4. Customer Income Group: Model Validation Summary Report [Ordinal Regression]
Confusion Matrix and Model Diagnostics: Income Group
                   Above 60,000   60,000 to 80,000   30,000 to 40,000   40,000 to 60,000   ROC
Above 60,000       205            23                 6                  8                  0.976
60,000 to 80,000   23             195                5                  6                  0.955
30,000 to 40,000   1              3                  113                13                 0.965
40,000 to 60,000   0              26                 5                  95                 0.437
RMSE: 27%
Classification % (Count): 83.63% (608)
Misclassification % (Count): 16.36% (119)
5. Customer Occupation: Model Validation Summary Report [Multinomial Regression]
Confusion Matrix and Model Diagnostics: Occupation
                Self Employee   Employed   Home Maker   Professional   ROC
Self Employee   28              68         1            55             0.272
Employed        18              197        10           29             0.712
Home Maker      0               10         104          3              0.842
Professional    8               24         15           157            0.701
RMSE: 48%
Classification % (Count): 66.85% (486)
Misclassification % (Count): 33.14% (241)
6. Education: Model Validation Summary Report [Ordinal Regression]
Confusion Matrix and Model Diagnostics: Education
                      Graduation   Intermediate/10+2   Professional Degree   Post-Graduation   ROC
Graduation            133          20                  10                    17                0.884
Intermediate/10+2     13           132                 6                     7                 0.927
Professional Degree   2            0                   181                   7                 0.992
Post-Graduation       26           4                   0                     169               0.965
RMSE: 26%
Classification % (Count): 84.59% (615)
Misclassification % (Count): 15.40% (112)
7. Age Group: Model Validation Summary Report [Ordinal Regression]
Confusion Matrix and Model Diagnostics: Age Group
        36-45   46-55   26-35   Below 25   Above 56   ROC
36-45   152     19      35      1          28         0.772
RMSE: 32%
Classification % (Count): 60.52% (440)
Misclassification % (Count): 39.48% (287)
Model Fitting Formula: Gender
e^(4.3407 + 0.2414 * Offers and Discounts + 0.3309 * Customer Self Belief - 0.1517 * Packaging - 0.3154 * Customer Behavior + 0.1796 * Customer Branding - 0.3178 * Customer Attitude + 0.1457 * Customer Perception - 0.6757 * CBB_14 - 0.2964 * CBB_13 + 0.133 * CBB_12 - 0.0137 * CBB_11 + 0.0389 * CBB_10 - 0.4398 * CBB_9)
Model Fitting Formula: Marital Status
e^(2.8687 + 0.227 * Offers and Discounts + 0.4861 * Customer Self Belief - 0.1331 * Packaging - 0.2495 * Customer Behavior + 0.2038 * Customer Branding - 0.4819 * Customer Attitude + 0.6287 * Customer Perception - 0.535 * CBB_14 - 0.1468 * CBB_13 + 0.2218 * CBB_12 + 0.1082 * CBB_11 - 0.0836 * CBB_10 - 0.5071 * CBB_9)
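Assuming each fitting formula is the exponential of a linear predictor, as in a binary logit model, the exponentiated value is the odds of the modeled category and the probability is odds / (1 + odds). The sketch below evaluates the Marital Status coefficients from the report for a hypothetical respondent. The input scores are invented for illustration, and which category the odds refer to is not stated in the report.

```python
import math

MARITAL_INTERCEPT = 2.8687
MARITAL_COEFFICIENTS = {
    "Offers and Discounts": 0.227,
    "Customer Self Belief": 0.4861,
    "Packaging": -0.1331,
    "Customer Behavior": -0.2495,
    "Customer Branding": 0.2038,
    "Customer Attitude": -0.4819,
    "Customer Perception": 0.6287,
    "CBB_14": -0.535,
    "CBB_13": -0.1468,
    "CBB_12": 0.2218,
    "CBB_11": 0.1082,
    "CBB_10": -0.0836,
    "CBB_9": -0.5071,
}

def linear_predictor(intercept, coefficients, values):
    """Intercept plus the weighted sum of predictor values (missing values count as 0)."""
    return intercept + sum(weight * values.get(name, 0.0)
                           for name, weight in coefficients.items())

def probability_from_logit(z):
    """Convert a logit z into a probability: e^z / (1 + e^z)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical respondent: a few illustrative scores, everything else left at 0.
respondent = {"Customer Perception": 1.0, "Customer Attitude": 0.5, "CBB_9": 3.0}

z = linear_predictor(MARITAL_INTERCEPT, MARITAL_COEFFICIENTS, respondent)
print(f"linear predictor: {z:.4f}")
print(f"odds: {math.exp(z):.4f}")
print(f"probability: {probability_from_logit(z):.4f}")
```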
Model Fitting Formula: Age Group [36-45]
e^(7.1193 + 0.4248 * Offers and Discounts - 0.1704 * Customer Self Belief + 0.4935 * Packaging - 0.1931 * Customer Behavior - 1.5526 * Customer Branding + 1.1133 * Customer Attitude - 2.6848 * Customer Perception + 0.5719 * CBB_14 - 0.621 * CBB_13 + 0.0359 * CBB_12 + 0.0245 * CBB_11 + 0.157 * CBB_10 - 1.8955 * CBB_9)
Model Fitting Formula: Age Group [46-55]
e^(6.9429 - 0.1778 * Offers and Discounts - 0.2134 * Customer Self Belief + 0.2055 * Packaging - 0.296 * Customer Behavior - 1.3673 * Customer Branding + 0.7912 * Customer Attitude - 1.2464 * Customer Perception + 0.6122 * CBB_14 - 0.5026 * CBB_13 - 0.2009 * CBB_12 - 0.1683 * CBB_11 + 0.0932 * CBB_10 - 1.4382 * CBB_9)
Model Fitting Formula: Age Group [26-35]
e^(8.9252 + 0.3884 * Offers and Discounts - 0.0226 * Customer Self Belief + 0.1511 * Packaging - 0.0189 * Customer Behavior - 1.0493 * Customer Branding + 0.8993 * Customer Attitude - 2.5456 * Customer Perception + 1.0159 * CBB_14 - 0.7425 * CBB_13 - 0.0238 * CBB_12 - 0.3246 * CBB_11 + 0.2426 * CBB_10 - 2.985 * CBB_9)
Model Fitting Formula: Age Group [Below 25]
e^(-57.1636 + 0.1955 * Offers and Discounts + 0.1775 * Customer Self Belief - 0.9933 * Packaging + 0.2122 * Customer Behavior - 2.0521 * Customer Branding + 1.9345 * Customer Attitude - 1.8265 * Customer Perception + 0.1878 * CBB_14 - 1.8165 * CBB_13 + 11.5065 * CBB_12 + 0.1293 * CBB_11 + 0.1547 * CBB_10 - 0.2915 * CBB_9)
Model Fitting Formula: Education [Graduation]
e^(3.7728 + 0.4904 * Offers and Discounts - 0.3615 * Customer Self Belief + 0.6794 * Packaging + 0.108 * Customer Behavior - 1.3465 * Customer Branding + 0.3239 * Customer Attitude + 0.1711 * Customer Perception - 0.076 * CBB_14 - 0.3695 * CBB_13 + 0.2737 * CBB_12 + 0.0713 * CBB_11 - 0.1435 * CBB_10 - 1.6017 * CBB_9)
Model Fitting Formula: Education [Intermediate/10+2]
e^(10.3017 + 0.4006 * Offers and Discounts - 0.6479 * Customer Self Belief + 0.5467 * Packaging + 0.1217 * Customer Behavior - 1.6277 * Customer Branding + 0.3364 * Customer Attitude + 0.9111 * Customer Perception + 0.4626 * CBB_14 - 0.5513 * CBB_13 + 0.156 * CBB_12 - 0.404 * CBB_11 - 0.1571 * CBB_10 - 4.006 * CBB_9)
Model Fitting Formula: Education [Professional Degree]
e^(-432.5593 - 8.1589 * Offers and Discounts + 13.6019 * Customer Self Belief - 8.4274 * Packaging - 0.1061 * Customer Behavior - 4.5206 * Customer Branding - 2.2585 * Customer Attitude - 17.2684 * Customer Perception + 0.0567 * CBB_14 - 0.1061 * CBB_13 - 0.1708 * CBB_12 + 0.2005 * CBB_11 - 0.2711 * CBB_10 + 103.7066 * CBB_9)
Model Fitting Formula: Employment Status [Occupation] - Self Employee
e^(6.5139 - 0.4707 * Offers and Discounts + 0.4177 * Customer Self Belief - 0.1122 * Packaging - 0.5012 * Customer Behavior - 0.1658 * Customer Branding - 0.3998 * Customer Attitude + 0.2741 * Customer Perception - 0.109 * CBB_14 + 0.0188 * CBB_13 - 0.8888 * CBB_12 - 0.2882 * CBB_11 + 0.0388 * CBB_10 - 1.2381 * CBB_9)
Model Fitting Formula: Employment Status [Occupation] - Employed
e^(11.9245 - 0.062 * Offers and Discounts + 0.6931 * Customer Self Belief - 0.2403 * Packaging - 0.7339 * Customer Behavior - 0.907 * Customer Branding - 0.7062 * Customer Attitude + 0.3251 * Customer Perception - 1.0708 * CBB_14 + 0.0438 * CBB_13 + 0.1745 * CBB_12 - 0.0651 * CBB_11 - 0.0585 * CBB_10 - 1.7213 * CBB_9)
Model Fitting Formula: Employment Status [Occupation] - Home Maker
e^(18.2195 - 0.0781 * Offers and Discounts + 0.7273 * Customer Self Belief - 0.2368 * Packaging - 0.6571 * Customer Behavior - 1.4237 * Customer Branding - 0.7015 * Customer Attitude - 0.0353 * Customer Perception - 0.6185 * CBB_14 + 0.4073 * CBB_13 - 0.803 * CBB_12 - 0.5537 * CBB_11 + 0.0062 * CBB_10 - 4.8174 * CBB_9)
Model Fitting Formula: Income Group [Above Rs.60,000]
e^(-5.9494 + 0.6643 * Offers and Discounts + 1.259 * Customer Self Belief - 0.6403 * Packaging - 0.1426 * Customer Behavior + 1.8045 * Customer Branding - 0.0252 * Customer Attitude + 1.2725 * Customer Perception - 1.3824 * CBB_14 + 0.0523 * CBB_13 - 0.1824 * CBB_12 - 0.0497 * CBB_11 - 0.1176 * CBB_10 + 3.7996 * CBB_9)
Model Fitting Formula: Income Group [Between Rs.60,000 to Rs.80,000]
e^(1.0753 + 0.2151 * Offers and Discounts + 0.6689 * Customer Self Belief - 0.586 * Packaging - 0.2852 * Customer Behavior + 1.4364 * Customer Branding - 0.0087 * Customer Attitude + 0.9552 * Customer Perception - 0.9507 * CBB_14 + 0.1349 * CBB_13 - 0.334 * CBB_12 - 0.1056 * CBB_11 + 0.1011 * CBB_10 + 1.7476 * CBB_9)
Model Fitting Formula: Income Group [Between Rs.30,000 to Rs.40,000]
e^(4.9492 - 0.4401 * Offers and Discounts - 0.5695 * Customer Self Belief + 0.0038 * Packaging - 0.1781 * Customer Behavior - 0.0711 * Customer Branding - 0.0633 * Customer Attitude - 0.7938 * Customer Perception - 1.1732 * CBB_14 - 0.5434 * CBB_13 + 0.1875 * CBB_12 - 0.5506 * CBB_11 + 0.062 * CBB_10 - 4.1027 * CBB_9)
Model Fitting Formula: Region [Town]
e^(8.1358 + 0.6813 * Offers and Discounts - 0.1903 * Customer Self Belief + 1.0127 * Packaging - 0.7039 * Customer Behavior - 4.7333 * Customer Branding + 2.0802 * Customer Attitude - 2.6166 * Customer Perception + 1.6688 * CBB_14 - 0.4427 * CBB_13 + 0.5599 * CBB_12 - 0.1691 * CBB_11 + 0.1847 * CBB_10 - 4.4126 * CBB_9)
Model Fitting Formula: Region [Urban]
e^(7.571 - 0.0429 * Offers and Discounts - 0.0619 * Customer Self Belief + 0.2488 * Packaging - 0.6561 * Customer Behavior - 2.5204 * Customer Branding + 1.036 * Customer Attitude - 1.5069 * Customer Perception + 1.0613 * CBB_14 - 0.35 * CBB_13 + 0.1758 * CBB_12 - 0.4029 * CBB_11 + 0.1373 * CBB_10 - 2.3338 * CBB_9)
Model Fitting Formula: Region [Village]
e^(12.468 + 0.8203 * Offers and Discounts - 0.338 * Customer Self Belief + 0.8843 * Packaging - 0.6863 * Customer Behavior - 4.9745 * Customer Branding + 2.3208 * Customer Attitude - 2.4159 * Customer Perception + 2.7935 * CBB_14 - 0.3705 * CBB_13 + 0.6338 * CBB_12 - 0.6742 * CBB_11 + 0.4286 * CBB_10 - 8.4662 * CBB_9)
Model Fitting Formula: Region [Rural]
e^(-201.1738 + 0.3586 * Offers and Discounts + 0.5564 * Customer Self Belief + 2.2098 * Packaging - 0.4867 * Customer Behavior - 2.6323 * Customer Branding + 1.7746 * Customer Attitude + 2.894 * Customer Perception + 28.0273 * CBB_14 - 29.8167 * CBB_13 + 14.2044 * CBB_12 + 1.2923 * CBB_11 + 6.7061 * CBB_10 - 5.4107 * CBB_9)
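For a multinomial model, the per-category expressions are usually combined into probabilities by dividing each exponentiated predictor by one plus the sum of all of them, with the remaining share going to a reference category. The sketch below assumes Metro is the reference for Region, because it is the only region without a fitting formula; the report does not state this. The linear predictor values are hypothetical, not computed from survey data.

```python
import math

def multinomial_probabilities(linear_predictors, reference="Metro"):
    """Category probabilities from linear predictors measured against a reference.

    The reference category has predictor 0. The largest predictor is subtracted
    before exponentiating so that big values do not overflow.
    """
    scores = dict(linear_predictors)
    scores[reference] = 0.0
    peak = max(scores.values())
    weights = {name: math.exp(value - peak) for name, value in scores.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}

# Hypothetical linear predictor values for one respondent.
example_predictors = {"Town": 0.5, "Urban": 1.2, "Village": -0.3, "Rural": -4.0}

probabilities = multinomial_probabilities(example_predictors)
for region, probability in probabilities.items():
    print(f"{region:8s} {probability:.3f}")
print("predicted region:", max(probabilities, key=probabilities.get))
```

The predicted category is the one with the highest probability, which is how a row of the confusion matrix would be filled in.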
Naive Bayes Model
A Naive Bayes model was developed for all the demographic variables versus the factor and behavior variables, and the result summary is attached in an Excel file. File name: Analysis of Naive Bayes-Summary Results
Decision Tree Model
A Decision Tree model has also been built, and the result documentation is in its final stage. The best-fit model will be finalized once the remaining clustering model and the comparison among the models are complete.
Publication / Proceeding / Articles
1. IIM Bangalore, 3rd International Conference on Business Analytics and Intelligence, 17th-19th Dec 2015: "A Study of Customer buying behavior & E commerce: A Data mining Approach" by B. Naveena Devi, K. Venkata Rao, Y. Rama Devi and C. Rajeswara Rao, http://dcal.iimb.ernet.in/baiconf2015/pdf/Presentation%20Schedule_Multiple%20Tracks.pdf