data analysis spotlights # 1

Data Analysis Spotlights - I M.Elkharashy

Upload: muhammad-elkharashy

Post on 07-May-2015


Category:

Data & Analytics



DESCRIPTION

An introduction to variable types, types of data analysis, random sampling and random assignment, and exploring numerical and categorical variables.

TRANSCRIPT

Page 1: Data analysis spotlights # 1

Data Analysis Spotlights - I

M.Elkharashy

Page 2: Data analysis spotlights # 1

Introduction

● Variable Types
● Analysis Types
● Random Sampling vs. Assignment
● Exploring Numerical Variables
  o Measures of Center
  o Robust Statistics

● Exploring Categorical Variables

Page 3: Data analysis spotlights # 1

Variable Types

Quantitative/Numerical
● Continuous
● Discrete

Qualitative/Categorical
● Regular
● Ordinal

Notes:
Categorical values can be represented by numbers, but arithmetic operations on them are not meaningful.
Ordinal: categorical variables whose levels have a natural order.
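
To make the notes above concrete, here is a minimal sketch using invented survey fields (every name and value is hypothetical). It contrasts numerical variables, where arithmetic is meaningful, with a categorical variable stored as numbers and an ordinal variable that can be ranked but not averaged.

```python
from statistics import mean

# Hypothetical records: names and values are invented for illustration.
heights_cm = [162.5, 170.1, 181.3]          # numerical, continuous
siblings = [0, 2, 1]                        # numerical, discrete
zip_codes = [90210, 10001, 60614]           # categorical, stored as numbers
satisfaction = ["low", "high", "medium"]    # categorical, ordinal

# Arithmetic is meaningful for numerical variables.
avg_height = mean(heights_cm)
avg_siblings = mean(siblings)

# mean(zip_codes) would run without error, but the result describes
# nothing real: the digits are labels, not quantities.

# Ordinal levels have an order, so they can be ranked and sorted.
LEVELS = ["low", "medium", "high"]
ranked = sorted(satisfaction, key=LEVELS.index)
print(ranked)
```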

Page 4: Data analysis spotlights # 1

Variable Types

Dependent/Associated
● Positive (+ve) association
● Negative (-ve) association

Independent
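
One common way to check the direction of an association between two numerical variables is the sign of the Pearson correlation coefficient. The sketch below is an illustration on invented data, not something taken from the slides. Note that it measures linear association only, and a value near zero does not by itself prove independence.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 65, 70, 79]     # tends to rise with hours
errors_made = [9, 8, 6, 5, 2]         # tends to fall with hours

r_pos = pearson_r(hours_studied, exam_score)   # positive association
r_neg = pearson_r(hours_studied, errors_made)  # negative association
print(f"r_pos = {r_pos:.2f}, r_neg = {r_neg:.2f}")
```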

Page 5: Data analysis spotlights # 1
Page 6: Data analysis spotlights # 1
Page 7: Data analysis spotlights # 1

Analysis Types

Observational Study
● Merely “Observe”
● Retrospective/Prospective: use data from the past (retrospective) or collected throughout the study (prospective)
● Can only establish an association between the explanatory and response variables.

Experiment

● Randomly assign subjects to various treatments.

● Can establish causal connections between the explanatory and response variables.

Page 8: Data analysis spotlights # 1
Page 9: Data analysis spotlights # 1

Random Sampling

Random Assignment
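
The difference between the two ideas can be sketched in a few lines. This is a minimal illustration with a hypothetical population of subject IDs and an arbitrary seed: random sampling decides who enters the study, and random assignment decides which treatment each sampled subject receives.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 subject IDs.
population = list(range(1000))

# Random sampling: every subject has the same chance of being selected,
# which supports generalizing results to the population.
sample = rng.sample(population, 20)

# Random assignment: shuffle the sampled subjects, then split them into
# treatment and control groups, which supports causal conclusions.
shuffled = sample[:]
rng.shuffle(shuffled)
treatment, control = shuffled[:10], shuffled[10:]

print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```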

Page 10: Data analysis spotlights # 1

ideal experiment

most experiments

most observational studies

bad observational studies

Page 11: Data analysis spotlights # 1

Exploring Numerical Variables

Page 12: Data analysis spotlights # 1

Dot Plot

Useful when individual values are of interest

Page 13: Data analysis spotlights # 1

Histogram

Page 14: Data analysis spotlights # 1
Page 15: Data analysis spotlights # 1

Box Plot

Useful for highlighting outliers, median, and the interquartile range.
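
The quantities a box plot displays can be computed directly. The sketch below uses invented data and assumes the common convention of flagging points more than 1.5 × IQR beyond the quartiles. The slides do not state which outlier rule or quartile method they use, so both are assumptions here.

```python
from statistics import quantiles

# Hypothetical data with one unusually large value.
data = [2, 4, 4, 5, 6, 7, 8, 9, 10, 30]

# Quartiles; method="inclusive" is one of several quartile conventions.
q1, median, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: flag points beyond 1.5 * IQR from the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}, outliers={outliers}")
```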

Page 16: Data analysis spotlights # 1
Page 17: Data analysis spotlights # 1

Intensity Map

Useful for highlighting the spatial distribution

Page 18: Data analysis spotlights # 1

Measures of Center

Page 19: Data analysis spotlights # 1

Measures of Center

Sample Statistics/Point Estimates
● mean: arithmetic average
● median: midpoint of the distribution (50th percentile)
● mode: most frequent observation
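
The three measures listed above can be computed with Python's standard `statistics` module. The sample below is invented for illustration.

```python
from statistics import mean, median, mode

# Hypothetical sample of household sizes.
sample = [1, 2, 2, 3, 4, 6, 10]

center_mean = mean(sample)      # arithmetic average
center_median = median(sample)  # midpoint of the sorted values
center_mode = mode(sample)      # most frequent observation

print(center_mean, center_median, center_mode)
```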

Page 20: Data analysis spotlights # 1

Measures of Center

Page 21: Data analysis spotlights # 1

Skewness vs. Measures of Center

Page 22: Data analysis spotlights # 1

Robust Statistics

Page 23: Data analysis spotlights # 1

Robust Statistics

Measures on which extreme observations have little effect

Page 24: Data analysis spotlights # 1

Robust Statistics

Measures on which extreme observations have little effect
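
A quick way to see robustness in action is to change a single observation into an extreme value and compare statistics before and after. In this sketch (invented income figures, with the `iqr` helper defined here for illustration) the median and IQR stay the same while the mean and standard deviation shift sharply.

```python
from statistics import mean, median, stdev, quantiles

def iqr(values):
    """Interquartile range, using the 'inclusive' quartile convention."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical incomes in thousands.
incomes = [30, 32, 35, 38, 40, 41, 45]
with_outlier = incomes[:-1] + [450]   # largest value replaced by an extreme one

for name, stat in [("mean", mean), ("median", median),
                   ("stdev", stdev), ("IQR", iqr)]:
    print(f"{name:>6}: {stat(incomes):8.2f} -> {stat(with_outlier):8.2f}")
```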

Page 25: Data analysis spotlights # 1

Exploring Categorical Variables

Page 26: Data analysis spotlights # 1

Exploring Categorical Variables

● Exploring a single categorical variable
● Exploring the relationship between two categorical variables
● Exploring the relationship between a numerical variable and a categorical variable

Page 27: Data analysis spotlights # 1

1- Single Categorical Variable

Frequency Table & Bar Plot
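
A frequency table is just a count per level, and a bar plot draws one bar per count. The sketch below builds both in plain text from invented survey responses.

```python
from collections import Counter

# Hypothetical survey responses.
responses = ["bus", "car", "car", "bike", "car", "bus", "walk", "car"]

freq = Counter(responses)
total = sum(freq.values())

# Frequency table with a crude text bar plot, most common level first.
for level, count in freq.most_common():
    print(f"{level:<5} {count:>2} {count / total:>5.0%} {'#' * count}")
```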

Page 28: Data analysis spotlights # 1

1- Single Categorical Variable

Pie Chart

Page 29: Data analysis spotlights # 1

Contingency Table

2- Relationship between 2 categorical variables
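
A contingency table counts how often each combination of levels occurs. The sketch below tallies invented (group, outcome) pairs and prints the table with row and column totals.

```python
from collections import Counter

# Hypothetical (group, outcome) observations.
observations = [
    ("treatment", "improved"), ("treatment", "improved"),
    ("treatment", "improved"), ("treatment", "not improved"),
    ("control", "improved"), ("control", "not improved"),
    ("control", "not improved"), ("control", "not improved"),
]

cells = Counter(observations)
rows = ["treatment", "control"]
cols = ["improved", "not improved"]

# Header, one line per row with its total, then column totals.
print(f"{'':<10}" + "".join(f"{c:>14}" for c in cols) + f"{'total':>8}")
for r in rows:
    counts = [cells[(r, c)] for c in cols]
    print(f"{r:<10}" + "".join(f"{n:>14}" for n in counts) + f"{sum(counts):>8}")
col_totals = [sum(cells[(r, c)] for r in rows) for c in cols]
print(f"{'total':<10}" + "".join(f"{n:>14}" for n in col_totals)
      + f"{sum(col_totals):>8}")
```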

Page 30: Data analysis spotlights # 1

Segmented Bar Plot

2- Relationship between 2 categorical variables

Page 31: Data analysis spotlights # 1

Relative Freq. Segmented Bar Plot

2- Relationship between 2 categorical variables
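
A relative frequency segmented bar plot shows each group's counts as proportions of that group's total, so groups of different sizes can be compared. The sketch below computes those proportions from an invented table. Clearly different proportions across groups suggest the two variables are associated.

```python
# Hypothetical counts: rows are groups, columns are outcomes.
table = {
    "treatment": {"improved": 30, "not improved": 10},
    "control":   {"improved": 12, "not improved": 48},
}

# Divide each count by its row total so every group sums to 1.
proportions = {
    group: {k: v / sum(row.values()) for k, v in row.items()}
    for group, row in table.items()
}

for group, row in proportions.items():
    print(group, {k: round(p, 2) for k, p in row.items()})
```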

Page 32: Data analysis spotlights # 1

Mosaicplot

2- Relationship between 2 categorical variables

Page 33: Data analysis spotlights # 1

Side-by-Side Box Plots

3- Relationship between numerical & categorical variables
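
Side-by-side box plots draw one box per category, and each box is built from that group's five-number summary. The sketch below groups invented (category, value) pairs and computes the summaries. The `five_number_summary` helper and the "inclusive" quartile method are choices made for this illustration.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical (category, value) pairs.
records = [("A", 12), ("A", 15), ("A", 11), ("A", 14), ("A", 13),
           ("B", 20), ("B", 22), ("B", 19), ("B", 25), ("B", 21)]

# Split the numerical values by category.
groups = defaultdict(list)
for category, value in records:
    groups[category].append(value)

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for one group."""
    q1, med, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

summaries = {cat: five_number_summary(vals) for cat, vals in groups.items()}
for cat, summary in summaries.items():
    print(cat, summary)
```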

Page 34: Data analysis spotlights # 1

In Next Session

● Disjoint events vs. independent processes
● Conditional probability & probability trees
● Normal distribution
● Binomial distribution
● Central Limit Theorem (CLT)
● Confidence Interval
● Introduction to Inference

Page 35: Data analysis spotlights # 1

In 3rd Session

● Another introduction to inference
● Hypothesis testing (for mean)
  o p-value
  o test statistics (Z & T)
  o ANOVA
● Hypothesis testing (for proportion)
  o Chi-Square (GOF & Independence Test)
● Frequentist vs. Bayesian Inference
● Linear Regression

Page 36: Data analysis spotlights # 1

Questions?