UNLOCKING THE SECRETS HIDDEN IN YOUR DATA
Data Analysis
Why Do Data Analysis?
Avoids incorrect assumptions
Does the data make sense?
Which one is better?
time    elevation
0       400000
10      399000
20      393000
30      384000
[Plot: elevation vs. time with cubic trendline y = 0.3333x^3 - 35x^2 + 216.67x + 400000, R^2 = 1]
[Plot: elevation vs. time with quadratic trendline y = -20x^2 + 60x + 400100, R^2 = 0.9988]
Why Do Data Analysis?
Are your assumptions correct?
Did you collect enough data?
If this is a model of a falling body, which is better?
Be careful: what's better mathematically is not always better scientifically
[Plot: both trendlines extended to time = 90]
  y = -20x^2 + 60x + 400100, R^2 = 0.9988
  y = 0.3333x^3 - 35x^2 + 216.67x + 400000, R^2 = 1
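The comparison between the two trendlines can be checked numerically. The following is a minimal Python sketch, not part of the original slides, that evaluates both trendline equations against the four (time, elevation) points and then extends them past the measured range. The function names `quadratic`, `cubic`, and `r_squared` are illustrative.

```python
# Data points from the time/elevation table on the earlier slide.
times = [0, 10, 20, 30]
elevations = [400000, 399000, 393000, 384000]


def quadratic(x):
    """Quadratic trendline from the slide."""
    return -20 * x**2 + 60 * x + 400100


def cubic(x):
    """Cubic trendline from the slide."""
    return 0.3333 * x**3 - 35 * x**2 + 216.67 * x + 400000


def r_squared(model, xs, ys):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - model(x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


for name, model in [("quadratic", quadratic), ("cubic", cubic)]:
    print(name, "R^2 =", round(r_squared(model, times, elevations), 4))

# Extrapolate beyond the measured range (time > 30).
for t in [60, 70, 80, 90]:
    print(t, round(quadratic(t)), round(cubic(t)))
```

The cubic scores the higher R^2 on the four measured points, yet past roughly time = 67 it starts to rise again, which a body falling under constant acceleration would not do. The quadratic keeps decreasing.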
Ways to Analyze Data
Plotting Data
  Ways to visually understand data
Statistics
  Makes it easier to compare data: Mean, Median, Mode
  Makes it clear if you have NOISY data: Range, Variance, Standard Deviation
[Plot: Pink and Blue data series with Mean Pink and Mean Blue lines]
Ways to Analyze Data
Derivatives (Slopes)
  Tell if changes in parameters affect data
  Parameter 2 has a greater effect than Parameter 1
  Get more information from data
[Plot: Base Case, Parameter 1, and Parameter 2 series; slope labels 0.08, 0.16, and 0.39; callout: Great Derivative]
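One way to put a number on "greater effect" is to fit a straight line to each series and compare the slopes. The sketch below is an illustration only. The data are hypothetical straight lines built to have the three slopes shown on the slide, the pairing of slopes with series is assumed from the legend order, and `least_squares_slope` is an illustrative helper name.

```python
def least_squares_slope(xs, ys):
    """Slope of the best-fit straight line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


# Hypothetical data: straight lines built to have the slopes on the slide.
xs = [0, 2, 4, 6, 8, 10, 12]
series = {
    "Base Case": [0.5 + 0.08 * x for x in xs],
    "Parameter 1": [0.5 + 0.16 * x for x in xs],
    "Parameter 2": [0.5 + 0.39 * x for x in xs],
}

slopes = {name: least_squares_slope(xs, ys) for name, ys in series.items()}
for name, slope in slopes.items():
    print(f"{name}: slope = {slope:.2f}")
```

The series with the steepest slope is the one that responds most strongly to the parameter change.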
Plotting Data – Extracting from NetLogo
Two ways
1st way: Write code to extract the data you want (see File Output Example in the Code Examples)
  Open the file in the setup procedure
  Create a write-to-file procedure
Plotting Data – Extracting from NetLogo
2nd way: Extract data from NetLogo graphs
  Have NetLogo generate the graph on the Interface page (example on later slide)
  Create a setup-plot procedure and a do-plot procedure
  Call the setup-plot procedure in the setup procedure
  Call the do-plot procedure in the go procedure
Plotting Data – Extracting from NetLogo
Run the model until sufficient data is obtained
(PC) Right Click on Graph/(Mac) Select Export
Choose a location and file name, then select Save
An Excel file is created (next slide)
  Contains all the information in the plot and the input parameters used
  Contains excess information about the plot (color, pen down, mode, interval…)
LET’S DO IT – Open Rabbits Grass Weeds
Plotting Data – Extracting from NetLogo
This is what you need
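If you would rather pull the x and y columns out of the exported file with a script than by hand, something like the following Python sketch could work. It rests on an assumption about the file layout: a row of pen names followed by a header row that repeats "x", "y", "color", "pen down?" once per pen. The sample text, pen names, and the function `extract_plot_data` are hypothetical, so check them against your own exported file.

```python
import csv
import io

# Hypothetical excerpt of an exported plot file. A real file has more header
# rows above this; the layout of the data block is an assumption.
SAMPLE_EXPORT = '''"pen name","pen down?","mode","interval","color","x"
"grass","true","0","1.0","55.0","2.0"
"rabbits","true","0","1.0","15.0","2.0"

"grass",,,,"rabbits"
"x","y","color","pen down?","x","y","color","pen down?"
"0.0","140.0","55.0","true","0.0","150.0","15.0","true"
"1.0","152.0","55.0","true","1.0","148.0","15.0","true"
"2.0","161.0","55.0","true","2.0","143.0","15.0","true"
'''


def extract_plot_data(text):
    """Return {pen name: [(x, y), ...]} from exported plot text."""
    rows = list(csv.reader(io.StringIO(text)))
    # Locate the header row of the data block (it starts with "x", "y").
    header_index = next(
        i for i, row in enumerate(rows) if row[:2] == ["x", "y"]
    )
    header = rows[header_index]
    names_row = rows[header_index - 1]
    # Each pen's columns start where the header cell is "x".
    pen_columns = [i for i, cell in enumerate(header) if cell == "x"]
    data = {names_row[col]: [] for col in pen_columns}
    for row in rows[header_index + 1:]:
        for col in pen_columns:
            # Skip blank rows and pens with fewer points than the others.
            if col + 1 < len(row) and row[col] != "" and row[col + 1] != "":
                point = (float(row[col]), float(row[col + 1]))
                data[names_row[col]].append(point)
    return data


plot_data = extract_plot_data(SAMPLE_EXPORT)
for pen, points in plot_data.items():
    print(pen, points)
```

Only the x and y values are kept; the color and pen-down columns are the excess information the slide mentions.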
Plotting Data – Different Types of Plots
All plots from http://www.statcan.ca
Pie Charts – music preference
Pets purchased at pet store
Bar Charts – preferred snacks
Plotting Data – Different Types of Plots
All plots from http://www.statcan.ca
Line Graphs – cell phone use http://www.statcan.ca
Scatter Plots http://en.wikipedia.org/wiki/Scatterplot
Plotting Data – Activity in Excel
Open File Car Data
Insert Chart
Select type of chart: XY Scatter
Select Data Range: highlight data to be plotted
LET’S DO IT
Plotting Data – Activity in Excel
Label each data series
Label the graph and axes
Select where you want the graph to be (on the same worksheet or on another worksheet in the same file)
[Plot: Noisy and Noisier data series with Mean (both), Noisy ± 2SD, and Noisier ± 2SD lines]
Statistics
Statistics help you
  Summarize data
  Describe data
  Analyze data
[Plot: Noisy and Noisier data series]
Hard to describe the difference between the two data sets
Now it is easy to summarize, describe and analyze the data… The blue and the pink data have the same AVERAGE value (mean), but the blue data is “NOISIER” (greater standard deviation). Therefore…
Statistics – How to Calculate in Excel
+, -, *, / are used for addition, subtraction, multiplication and division.
Each cell has a label based on its column and row.
Use cells to perform calculations instead of numbers. Example: =(A4+B4)/C4
Perform calculations on an entire column: copy and paste the equation. Warning: this changes the row number in each cell reference for every line.
Fix a specific cell: use the $ symbol. Example: =(A4+B4)/$C$1
Excel has many built in statistical functions
Makes life easy!
Statistics – Measurements of Central Tendency
Mean (Average), Median, and Mode
Definitions
  Mean (Average) – Sum divided by the number of data points
  Median – Middle data point when arranged from highest to lowest
  Mode – Most frequent value
Use data set to calculate Mean (Average), Median, Mode, Max and Min
  Select the cell where you want the value of the function to appear
  Select Insert, then Function
  Select Statistical
  Select the function wanted (AVERAGE, MEDIAN, or MODE), then hit OK
  Select the range of data you want to analyze by clicking on the range symbol and highlighting the range. Hit Enter or OK
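The same measures can be computed outside Excel. Here is a short Python sketch using the standard `statistics` module. The rabbit counts are made-up numbers for illustration, so substitute your own data.

```python
import statistics

# Hypothetical rabbit counts; replace with your own exported data.
counts = [120, 135, 150, 135, 160, 145, 135, 170, 150]

mean_value = statistics.mean(counts)      # sum divided by number of points
median_value = statistics.median(counts)  # middle value of the sorted data
mode_value = statistics.mode(counts)      # most frequent value

print("Mean:  ", round(mean_value, 2))
print("Median:", median_value)
print("Mode:  ", mode_value)
print("Max:   ", max(counts))
print("Min:   ", min(counts))
```

These correspond to the AVERAGE, MEDIAN, MODE, MAX, and MIN functions in Excel.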
LET’S DO IT: StarLogo TNG: Fish and Plankton; NetLogo: Rabbits and Grass
Statistics – Measurements of Data Spread
Range, Variance and Standard Deviation
Definitions
  Range = maximum - minimum
  Variance = a measure of the noise (scatter) of the data around the mean value
  Standard Deviation (S) = the square root of the variance. Most commonly used measure of spread (same units as the data).
  Another reason to use S (for roughly bell-shaped data):
    ~68% of the data are in the interval Mean – S to Mean + S
    ~95% of the data are in the interval Mean – 2 S to Mean + 2 S
    ~99.7% of the data are in the interval Mean – 3 S to Mean + 3 S
EXCEL does it for you!!!
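For comparison, the spread measures can also be sketched in Python. The data below are hypothetical, and `fraction_within` is an illustrative helper that counts how much of the data falls inside Mean ± k S. `statistics.variance` and `statistics.stdev` are the sample versions, which is also what Excel's VAR and STDEV compute.

```python
import statistics

# Hypothetical measurements; replace with your own data.
data = [12, 15, 11, 14, 13, 16, 10, 14, 13, 12]

data_range = max(data) - min(data)
mean = statistics.mean(data)
variance = statistics.variance(data)  # sample variance
std_dev = statistics.stdev(data)      # square root of the sample variance


def fraction_within(values, center, spread, k):
    """Fraction of values inside [center - k*spread, center + k*spread]."""
    low, high = center - k * spread, center + k * spread
    inside = [v for v in values if low <= v <= high]
    return len(inside) / len(values)


print("Range:", data_range)
print("Variance:", round(variance, 3))
print("Standard deviation:", round(std_dev, 3))
for k in (1, 2, 3):
    print(f"Within {k} S of the mean:", fraction_within(data, mean, std_dev, k))
```

With only ten points the fractions will not match the 68/95/99.7 percentages closely; those figures describe large, roughly bell-shaped data sets.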
[Plot: Rabbit Population; x-axis: Ticks; y-axis: Number of Rabbits; series: Rabbits, Mean, Mean - 2 S, Mean + 2 S]
LET’S DO IT: StarLogo TNG: Fish and Plankton; NetLogo: Rabbits and Grass
Derivatives
What are Derivatives?
  A simple calculation using data
  Instantaneous rate of change = SLOPE
Why use Derivatives?
  Get more information from data
  More ways to compare data
  Car moving down a road
    Data = the distance traveled
    Velocity = the 1st derivative of distance
    Acceleration = the 2nd derivative of distance = the 1st derivative of velocity
[Plots: Distance, Velocity, and Acceleration vs. Time; annotations: Slope of distance, Slope of velocity]
How to Calculate a Derivative
Mathematically: x = position, t = time
  dx/dt ≈ Δx/Δt = (x_2 - x_1) / (t_2 - t_1)   (you don't have to use this)
In Excel, with t in column A and x in column B
  =(B3-B2)/(A3-A2)   (use this in Excel)
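The same point-to-point slope calculation can be sketched in Python. The car data here are hypothetical, and `derivative` is an illustrative helper that applies (x2 - x1) / (t2 - t1) to each pair of consecutive rows, which is what the Excel formula does when copied down a column.

```python
def derivative(ts, xs):
    """Slope between consecutive points: (x2 - x1) / (t2 - t1)."""
    return [
        (xs[i + 1] - xs[i]) / (ts[i + 1] - ts[i])
        for i in range(len(xs) - 1)
    ]


# Hypothetical car data: time in seconds, distance in meters.
times = [0, 2, 4, 6, 8, 10]
distance = [0, 4, 12, 24, 32, 36]

# 1st derivative of distance: one value per interval between data points.
velocity = derivative(times, distance)

# 2nd derivative of distance = 1st derivative of velocity. Each velocity
# value is paired with the start time of its interval.
acceleration = derivative(times[:-1], velocity)

print("Velocity:    ", velocity)
print("Acceleration:", acceleration)
```

Each derivative has one fewer value than the data it came from, because a slope needs two points.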
LET’S DO IT: StarLogo TNG: Fish and Plankton; NetLogo: Rabbits and Grass