sl lecture01

Upload: rajat

Post on 25-Feb-2018

    Lecture 1: Programming with R

    Renuka Sane

    July 29, 2015

    The kinds of questions economists ask

    Has the mid-day meal scheme improved school attendance?

    Does an increase in police presence lead to a reduction in the crime rate?

    Did the ban on commissions to mutual funds lead to a reduction in fund flows?

    Do minimum wages cause unemployment?

    Do FII inflows increase stock price volatility?

    Components of writing such papers

    Hypothesis

    Econometric model

    Data!

    Statistical packages

    Users supply data

    Run pre-defined routines. Example: regress Y X

    The problems:

    It is non-trivial to get Y and X in the same data-set.

    Our work is actually 90% data handling & graphs and 10% estimation.

    What if the routine you want is not part of the package?
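
    A minimal sketch of the first problem, assuming two hypothetical tables (y_data and x_data) keyed by district and year. The names and numbers are made up for illustration. The outcome and the regressor have to be merged into one data frame before any estimation routine can be run.

```r
# Hypothetical data: the outcome Y and the regressor X live in separate
# tables, both keyed by district and year.
y_data <- data.frame(
  district = c("A", "A", "B", "B"),
  year     = c(2010, 2011, 2010, 2011),
  Y        = c(1.2, 1.5, 0.8, 0.9)
)
x_data <- data.frame(
  district = c("A", "B", "B", "C"),
  year     = c(2010, 2010, 2011, 2010),
  X        = c(10, 7, 8, 5)
)

# Keep only the district-year pairs that appear in both tables.
panel <- merge(y_data, x_data, by = c("district", "year"))

# Only now can the pre-defined routine be applied.
fit <- lm(Y ~ X, data = panel)
print(panel)
print(coef(fit))
```

    Note that merge() silently drops the pairs that do not match, so checking the number of rows before and after is part of the data handling work.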

    Data handling example: MG-NREGA

    Data handling example: Trading data on NIFTY

    Data handling example: Many files to be put in one
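
    One way this task might look in R, as a sketch only. The file names and columns below are invented, and the files are written to a temporary directory so that the example is self-contained.

```r
# Create three small CSV files in a temporary directory
# (stand-ins for the real files, which are not part of these slides).
data_dir <- file.path(tempdir(), "many_files_demo")
dir.create(data_dir, showWarnings = FALSE)
for (yr in 2011:2013) {
  write.csv(
    data.frame(year = yr, id = 1:2, value = c(yr - 2000, yr - 1990)),
    file = file.path(data_dir, paste0("data_", yr, ".csv")),
    row.names = FALSE
  )
}

# Read every CSV file in the directory and stack them into one data frame.
files    <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
pieces   <- lapply(files, read.csv)
combined <- do.call(rbind, pieces)
print(dim(combined))
```

    The same three lines at the end work for three files or three thousand, provided all files share the same columns.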

    The elements for a computational toolkit

    Price

    Freedom

    Computer science

    Network effects

    The three main alternatives

    System   Price       Freedom   CS                              Network
    SAS      Very high   Zero      Bad                             Small
    Stata    High        Zero      Better than SAS but not great   High among economists
    R        Free        Free      Great                           Slowly growing

    Part I

    R

    The origins of S

    The predecessor of R was S. It was developed at Bell Labs, and is a child of the Unix philosophy.

    1970s Initial implementation (Fortran, mostly for internal use)

    1980s Unix version, wider distribution in academia. New-S.

    1990s Statistical modeling language. Licensing (S-PLUS). Addition of formal object-oriented programming.

    ACM Software System Award

    John Chambers was awarded the Software System Award in 1998.

    For the S system, which has forever altered how people analyze, visualise, and manipulate data.

    http://awards.acm.org/software_system/year.cfm

    The hall of fame includes:

    1983 Unix

    1986 TeX

    1995 World-Wide Web

    2002 Java

    From there we have R

    R is the free S.

    Started as a teaching tool by Robert Gentleman and Ross Ihaka at the University of Auckland, around 1993.

    Released as Free software around 1995

    Version 1.0 released in 2000.

    R is now the dominant statistics software in the world.

    R is a GNU project.

    R is available as Free Software under the terms of the Free Software Foundation's GNU General Public License in source code form.

    A digression on free software

    Free as in free speech, and not free beer

    The freedom to run the program, for any purpose

    The freedom to study how the program works, and adapt it to your needs.

    The freedom to redistribute copies so that you can help your neighbour

    The freedom to improve the program, and release your improvements to the public, so that the whole community benefits

    Free software can be commercial software

    Why open-source?

    Do not need a license.

    Do not need a department.

    Reproducible research.

    R is a programming language

    Designed for interactive use

    With a focus on data analysis

    Basic data structures are vectors

    Large collection of statistical functions

    Advanced statistical graphics capabilities
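
    A short sketch of what working with vectors looks like at the interactive prompt. The wages vector is made-up data for illustration.

```r
# A numeric vector; even a single number is a vector of length one.
wages <- c(120, 150, 95, 210)

# Arithmetic is applied element by element, with no explicit loop.
wages_in_thousands <- wages / 1000

# Logical indexing selects the elements that satisfy a condition.
high <- wages[wages > 100]

# Built-in statistical functions take vectors directly.
print(mean(wages))
print(sd(wages))
print(high)
```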

    R in the real world

    .. It is becoming their lingua franca partly because data mining has entered a golden age, whether being used to set ad prices, find new drugs more quickly or fine-tune financial models. Companies as diverse as Google, Pfizer, Merck, Bank of America, the InterContinental Hotels Group and Shell use it.

    Source: New York Times, 7 January, 2009

    Where do you get R?

    http://www.r-project.org/

    http://www.rstudio.com/

    Installing R

    R can be installed on Windows, Mac or Linux.

    Homework: Visit the R website and follow the installation directions. You will want to install the base system.

    There are several additional user-contributed add-on packages.

    To install a package, be connected to the internet and type

    > install.packages("plm")

    You will be asked to select the mirror site nearest to you. After that everything is automatic.

    Load the package before using it.

    > library(plm)
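
    In a script that is run repeatedly, the two steps can be combined so that a package is downloaded only when it is missing. The sketch below assumes a helper called ensure_package(), which is hypothetical and not part of R, and assumes the CRAN mirror at https://cloud.r-project.org as the default.

```r
# ensure_package() is a hypothetical helper, not part of R or of any package.
ensure_package <- function(pkg, repos = "https://cloud.r-project.org") {
  # requireNamespace() returns FALSE instead of stopping when pkg is missing.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = repos)
  }
  # character.only = TRUE lets library() take the package name from a variable.
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so this call needs no download.
# ensure_package("plm") would fetch plm the first time it is called.
ensure_package("stats")
```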

    Preferred operating system

    Ubuntu on Linux: http://www.ubuntu.com/download

    Part II

    Programming for projects

    Reusability of code

    Don't design a chainsaw for only an oak tree

    There might be a million species of trees

    But they are all trees
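
    In code, the chainsaw is a function written for the general case rather than for one data set. A minimal sketch, where summarise_numeric() and the height data are hypothetical names and numbers chosen for illustration:

```r
# Narrow version: tied to exactly one object and one column.
# mean(oak_data$height)   # oak_data is a hypothetical object

# General version: works for any numeric vector, whatever the "species".
summarise_numeric <- function(x, na.rm = TRUE) {
  stopifnot(is.numeric(x))
  c(n    = sum(!is.na(x)),
    mean = mean(x, na.rm = na.rm),
    sd   = sd(x, na.rm = na.rm))
}

oak_heights  <- c(20.1, 22.4, NA, 19.8)
pine_heights <- c(30.5, 28.9, 31.2)

print(summarise_numeric(oak_heights))
print(summarise_numeric(pine_heights))
```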

    Automation: writing and coding

    Frequent changes to the data-set

    Frequent changes to the statistics we want reported

    Embed coding in the writing of your paper

    Tools: LaTeX, R, knitr.
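
    The idea behind embedding code in the paper can be sketched in plain R: the sentence that reports a statistic is generated from the data, so it changes whenever the data change. The attendance numbers below are made up. In a knitr .Rnw document the same expression could sit inside an inline \Sexpr{} call instead of sprintf().

```r
# Hypothetical data; in practice this would be re-read from disk on each run.
attendance <- c(0.82, 0.91, 0.77, 0.88, 0.95)

# The reported statistic is computed, never typed by hand.
report_sentence <- sprintf(
  "Mean attendance across %d schools is %.1f%%.",
  length(attendance), 100 * mean(attendance)
)
print(report_sentence)
```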

    Writing code is like writing a proof

    Each step follows the other

    You cannot jump!

    Many days later you may not remember how you got there after all

    Part III

    This course

    Goals

    Learn R as a statistics toolbox with a strong emphasis on programming language aspects

    Learn automation using R, LaTeX and knitr

    Broad syllabus

    Data manipulation in R: objects; tables and cross-classifications; array and matrix operations

    Writing functions

    Graphics: Basic plots; lattice; ggplot2

    Univariate statistics: generating random data; univariate statistics; bootstrap and permutation methods

    High performance computing: Introduction to high performance computing

    Grading

    Examination    Marks
    Class tests    20
    Midterm exam   40
    Final exam     40

    Resources

    The R Manuals: www.r-project.org

    Introductory Statistics with R, by Peter Dalgaard

    Modern Applied Statistics with S, by W. N. Venables and B. D. Ripley

    Tutorials on the web: http://www.r-bloggers.com/google-developers-r-programming-video-lectures/

    Consultation

    Tuesdays: 15:30-17:00, Room 204

    Email: [email protected]
