final stat project

Upload: parth-patel

Post on 06-Jul-2018


  • 8/17/2019 Final Stat Project

    November 21, 2011

    [By: Parth Patel Period 2 and Rachel Seitsinger Period 5]

    Table of Contents

    I. Introduction
        a. Question
        b. Problem
        c. Prediction
    II. Data table
    III. Graph your data
        a. Number of contacts graph and analysis
            i. Appropriate statistical measures for number of contacts
        b. Number of Facebook friends graph and analysis
            i. Appropriate statistical measures for number of Facebook friends
        c. Exploring the Association
            i. Scatter Plot
            ii. Least Squares Regression Line
            iii. Residual Plot
            iv. Correlation Coefficient
            v. Unusual Observations
        d. Discuss
            i. The association of our variables
            ii. The strength of the linear association
            iii. The usefulness of this linear model for prediction
        f. Alligator Data
        g. Choosing an explanatory value
        h. Answering the original question

    I. Introduction

    a. Question: The question that we will be answering is: Does the number of Facebook friends determine the number of friends a person has in real life?

    b. Problem: The question we would like to answer is whether the number of Facebook friends a person has is an accurate predictor of whether or not that person has a lot of friends in real life. To measure the number of friends a person has in real life, we decided to take the number of phone numbers one has in their cell phone, with the hope that this measurement gives us a good idea of how many people they talk to. This interests us because Facebook is such a huge part of teenagers' everyday lives, and we would like to determine if this social networking site holds any truth when it states on a profile how many friends someone has.

    c. Prediction: We predict that the more Facebook friends someone has, the more friends they will have in real life, and therefore the more contacts they will have in their phone. The scatter plot should have a positive, linear association.

    II. Data Table

    #     Name            # of contacts in cell phone    # of Facebook friends
    1     Anh Truong      100                            2hou
    @0    Jenny Foster    C

    i. Appropriate Statistical Measures for number of contacts

    Measure    # of contacts
    Minimum    15
    Q1

    b. Number of Facebook Friends

    [Figure: Collection 1, histogram and box plot of number of Facebook friends]

    The shape of the histogram for number of Facebook friends is bimodal and skewed to the lower numbers. This shows that most people had a number of friends around 500 or 00, and fewer people had friends below C00. Judging by the box plot, the middle 50% of people had friend numbers from about C00 friends to about

    i. Appropriate Statistical Measures for number of Facebook friends

    Measure    # of Facebook Friends
    Minimum    ?0
    Q1         C00
    Median     5@?.5
    Q3

    c. Exploring the Association

    i. Scatter Plot

    [Figure: Collection 1, scatter plot]

    The scatter plot between number of Facebook friends (x) and number of contacts in one's cell phone (y) has a positive, moderately weak, linear association.

    ii. Least Squares Regression Line

    number_of_contacts = 0.166 * number_of_fb_friends + 70; r^2 = 0.21
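    As a rough illustration of how the fitted line above could be used for prediction, the sketch below plugs a friend count into the reported equation. The function name `predict_contacts` and the sample input of 500 friends are illustrative assumptions, not part of the project.

```python
# Coefficients taken from the regression output reported above.
SLOPE = 0.166      # predicted change in contacts per additional Facebook friend
INTERCEPT = 70.0   # predicted contacts for someone with 0 Facebook friends


def predict_contacts(fb_friends: float) -> float:
    """Predicted number of phone contacts for a given Facebook friend count."""
    return SLOPE * fb_friends + INTERCEPT


if __name__ == "__main__":
    # 500 friends is a made-up input chosen only for illustration.
    print(round(predict_contacts(500), 1))
```

    For 500 Facebook friends the line predicts about 153 contacts. Because r^2 is low, any single prediction like this should be read as a rough guess.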

    [Figure: Collection 1, scatter plot with least squares regression line; axes: number_of_fb_friends and Number of Contacts]

    The correlation coefficient is 0.CC
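    For readers who want to see how a correlation coefficient like the one above is obtained, here is a minimal sketch of the standard Pearson formula. The function name `correlation` and the five data pairs are hypothetical; they are not the survey data from this project.

```python
import math


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical (friends, contacts) pairs, NOT the project's data.
    friends = [200, 350, 500, 650, 800]
    contacts = [90, 150, 120, 210, 160]
    r = correlation(friends, contacts)
    # Squaring r gives the r^2 value used to judge the linear model.
    print(f"r = {r:.2f}, r^2 = {r * r:.2f}")
```

    Squaring r gives r^2, the fraction of variation in y that the linear model accounts for.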

    iii. Usefulness of this Linear Model for Prediction

    Our r^2 value, which is 0.200C, tells us that 20.0C% of the variation in number of contacts can be explained by the linear model for number of Facebook friends and number of phone contacts. This means that the other

    This is the data before the transformation.

    The residual plot is clearly curved, making the linear model inappropriate for this set of data.
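    To illustrate what a curved residual plot looks like in numbers, the sketch below fits a least squares line to deliberately curved data and lists the residuals. The helper names `fit_line` and `residuals` and the data values are assumptions made up for this example; they are not the alligator data.

```python
def fit_line(xs, ys):
    """Least squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(xs, ys):
    """Observed y minus the y predicted by the fitted line, for each point."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


if __name__ == "__main__":
    # Made-up curved data (y = x squared), NOT the project's data.
    xs = [1, 2, 3, 4, 5]
    ys = [1, 4, 9, 16, 25]
    print(residuals(xs, ys))
```

    For this made-up data the residuals are positive at both ends and negative in the middle. A U-shaped pattern like that is the usual sign that a straight line is the wrong model.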

    To re-express the data, we first used the natural log of the dependent variable (Weight in lbs). The data came out straightened like this:

    e^Ln(weight) = e^C.
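    The re-expression described above (take the natural log of weight, fit a line, then exponentiate to get back to pounds) can be sketched as follows. This is only an illustration under stated assumptions: the explanatory variable is assumed to be length, and the function names and data values are hypothetical, not the project's alligator data.

```python
import math


def fit_line(xs, ys):
    """Least squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_log_model(xs, weights):
    """Fit ln(weight) = intercept + slope * x; returns (slope, intercept)."""
    return fit_line(xs, [math.log(w) for w in weights])


def predict_weight(x, slope, intercept):
    """Undo the log by exponentiating: weight = e^(intercept + slope * x)."""
    return math.exp(intercept + slope * x)


if __name__ == "__main__":
    # Hypothetical lengths (inches) and weights (lbs), NOT the project's data.
    lengths = [60, 70, 80, 90, 100]
    weights = [30, 50, 85, 140, 230]
    slope, intercept = fit_log_model(lengths, weights)
    print(predict_weight(85, slope, intercept))
```

    Raising e to both sides of the fitted equation is what converts a prediction of ln(weight) back into a weight in pounds.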

    meant that only a small 20.0C% of variation in number of contacts can be explained by the linear model for number of Facebook friends. So while there was a clear linear association, the weakness of the association means that the number of Facebook friends can only weakly determine the number of real friends. However, some flaws could be that not everyone puts friends into their cell phones, or may have had their contacts deleted recently, which would change the data.