r as supporting tool for analytics and simulation
TRANSCRIPT
![Page 1: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/1.jpg)
R as Supporting Tool for Analytics and Simulation
Alvaro Gil
Simulation & Optimization Consultant
http://agiltools.com
June 2016
![Page 2: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/2.jpg)
Agenda
Introduction
What is R? Why use it?
What to Install
Example of Companies Using R
Some Facts About R
Interesting R Applications
R: The Generic Scripting Language
R + AnyLogic
Interfacing Programming Languages from R
R and IoT
Useful Links
![Page 3: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/3.jpg)
Introduction
Pre- and post-processing of information is a necessary step in modeling and simulation.
Information processing is part of the Analytics field.
◦ Analytics is a discipline that combines descriptive, predictive and prescriptive techniques on all types of data (INFORMS).
Applying Analytics requires special skills as well as knowledge of specialized software (SPSS, SAS, R, Python, JMP, Stata, etc.).
Several specialists are promoting the use of R as the standard language for data analysis (the reasons follow in the next slides).
This presentation is an overview of R and what we expect to achieve with it.
![Page 4: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/4.jpg)
What is R? Why use it?
R is a high-level matrix programming language for statistical and data analysis.
It runs on multiple platforms, including Windows, macOS and Linux.
R is an interpreted language, meaning that the user gets an immediate response to each command without having to compile a program.
R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, …) and graphical techniques, and is highly extensible.
Free and open source.
R's main selling point is the large number of packages that let you perform almost any statistical procedure in a single command.
◦ There are more than 8,000 packages available on CRAN, each of which must pass CRAN's automated checks (R CMD check) before publication.
R is great for performing analysis on a dataset and presenting the findings in a static set of graphics.
R is very useful for running distributed, automated data analysis processes.
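To illustrate the "single command" point above, here is a minimal sketch using only base R and the built-in `mtcars` dataset. The dataset and the variables chosen are illustrative assumptions, not part of the original slides.

```r
# Linear model in a single command: fuel consumption (mpg) vs. weight (wt).
# mtcars ships with base R; the choice of variables is only illustrative.
fit <- lm(mpg ~ wt, data = mtcars)

# Classical statistical test, also a single call: Welch two-sample t-test
# comparing mpg between automatic (am = 0) and manual (am = 1) cars.
tt <- t.test(mpg ~ am, data = mtcars)

print(coef(fit))   # intercept and slope of the fitted line
print(tt$p.value)  # p-value of the t-test
```

Because R is interpreted, each of these lines can be typed at the console and inspected immediately.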
![Page 5: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/5.jpg)
R Video
![Page 6: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/6.jpg)
What to Install?
R Language
◦ R from CRAN, or
◦ Microsoft R Open (MRO)
An IDE
◦ RStudio
◦ Red-R
◦ Rattle
◦ Emacs + Emacs Speaks Statistics (ESS)
◦ Eclipse (StatET)
◦ Visual Studio
A full set of packages
![Page 7: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/7.jpg)
Microsoft R
MRO: Microsoft R Open (personal version link)
MRS: Microsoft R Server (professional version link)
Enhanced distribution of R from Microsoft Corporation. It includes the R language plus additional capabilities for improved performance, reproducibility and platform support.
◦ The installation includes all base and recommended R packages plus a set of specialized packages released by Microsoft Corporation to further enhance the Microsoft R Open experience.
◦ Multi-threaded math libraries (Math Kernel Library, MKL)
◦ A high-performance default CRAN repository that provides a consistent and static set of packages to all Microsoft R Open users.
◦ The checkpoint package, which makes it easy to share R code and replicate results using specific R package versions.
◦ Platforms: Windows, Mac OS X, and Linux
◦ MRS also includes specialized packages for big data.
Visit https://mran.microsoft.com/open/ for more info
![Page 9: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/9.jpg)
R Packages
More than 8,000 available packages
source('http://agiltools.com/R/rp.R')
![Page 10: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/10.jpg)
Examples of Companies Using R
Sources:
http://www.revolutionanalytics.com/companies-using-r
http://www.r-bloggers.com/airbnb-uses-r-to-scale-data-science/
http://data-informed.com/companies-use-r-compete-data-driven-world/
http://www.r-bloggers.com/companies-using-open-source-r-in-2013/
![Page 11: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/11.jpg)
Some Facts About R
R is the highest paid IT skill (Dice.com survey, January 2014)
R is the most-used data science language after SQL (O'Reilly survey, January 2014)
R is used as an analytics tool by 75% of professionals (Rexer survey, October 2015)
R is #13 of all programming languages (RedMonk language rankings, June 2015)
R is growing faster than any other data science language (KDNuggets survey, August 2014)
R is the #1 Google search for advanced analytics software (Google Trends, April 2016)
R has more than 2 million users worldwide (Oracle estimate, February 2012)
![Page 12: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/12.jpg)
Interesting R Applications
Complete Libraries Specialized by Topic, e.g.:
◦ Econometrics
◦ Finance (e.g. actuar, fPortfolio, financial, etc.)
◦ Machine Learning (e.g. nnet, neuralnet, RSNNS, deepnet, darch, h2o, etc.)
◦ Optimization (e.g. Rquadprog, optmix, etc.)
◦ Simulation (e.g. simmer)
◦ Social Sciences
◦ Spatial (e.g. maps)
◦ See more at https://cran.r-project.org/web/views/
Markdown (R-Studio)
Shiny (R-Studio)
Big Data (e.g. bigmemory, ff, RevoScaleR)
![Page 13: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/13.jpg)
Interesting R Applications: Markdown
Markdown is a text-to-HTML conversion tool for reporting.
It allows users to share and/or present their work.
External examples:
◦ 1 (pdf): https://github.com/yihui/knitr/releases/download/doc/knitr-minimal.pdf
◦ 2 (html): https://rawgit.com/yihui/knitr-examples/master/003-minimal.html
◦ 3 (knitr + googleVis): https://cran.r-project.org/web/packages/googleVis/vignettes/Using_googleVis_with_knitr.html
◦ 4 (with Shiny): https://cpsievert.shinyapps.io/animintRmarkdown/
◦ 5 (combined with JavaScript): http://www.nytimes.com/interactive/2014/01/23/business/case-shiller-slider.html?_r=0
![Page 14: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/14.jpg)
Interesting R Applications: Shiny
Web application framework for R. Interactive visualization tool based on JavaScript libraries like d3, Leaflet and Google Charts.
This reporting tool runs on all types of devices.
It can be connected to R to perform any kind of data analysis in real time (data mining, optimization, etc.).
See some examples at:
◦ Shiny User Showcase
◦ Shiny + JavaScript (https://frissdemo.shinyapps.io/FrissDashboard/)
Shiny can be deployed on individual servers to add security and increase performance.
Shiny is available on Predix through cf-buildpack-r (check link)
![Page 15: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/15.jpg)
Interesting R Applications: Big Data
Specialized libraries to manipulate big data:
◦ bigmemory + biganalytics (article)
◦ ff + ffbase (article)
R has proven very effective at manipulating millions of rows in a short time (e.g. less than 30 seconds to fit a linear regression on a sample of 10M rows).
Machine learning algorithms with millions of rows can run in seconds with the right libraries and configuration.
MRS implements RevoScaleR to manipulate big data and handle parallelism.
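The regression timing mentioned above can be explored with a small base-R sketch. It uses simulated data and plain `lm()`, not bigmemory, ff or RevoScaleR. The sample size `n`, the coefficients and the seed are illustrative assumptions, and the elapsed time will depend entirely on the machine.

```r
# Simulate a simple dataset. n is an illustrative choice; it can be raised
# toward 1e7 if enough memory is available.
set.seed(42)
n <- 1e6
x <- rnorm(n)
y <- 2 + 3 * x + rnorm(n)   # true intercept 2, true slope 3 (assumed values)

# Time an ordinary least-squares fit with base R.
elapsed <- system.time(fit <- lm(y ~ x))[["elapsed"]]

print(coef(fit))                        # estimates should be close to 2 and 3
cat("Elapsed seconds:", elapsed, "\n")  # machine-dependent
```

For data that does not fit in memory, the specialized packages listed on this slide are the intended route; this sketch only shows how such a timing could be measured.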
![Page 16: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/16.jpg)
R: The Generic Scripting Language
Given the popularity and versatility of R, many companies are adapting their services to be compatible with R:
Oracle, Microsoft and GE, among others.
Since SQL Server 2016, R scripts can run directly in the database through SQL Server R Services. This means the R code runs on the server itself, as opposed to first extracting the data to a local R session.
In the words of Joseph Sirosh, corporate VP at Microsoft Data Group, “[Microsoft R Server enables] enterprise customers to standardize advanced analytics on one core tool, regardless of whether they are using Hadoop (Hortonworks, Cloudera and MapR), Linux (Red Hat and SUSE) or Teradata. [We are committed to] building R and Revolution’s technology into our broader database, big data and business intelligence offerings and to bring these benefits to customers and students – on-premises, in the Azure cloud and to new platforms.”
Source: Forbes, January 2016, https://t.co/AJicDBqv47
![Page 17: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/17.jpg)
R: The Generic Scripting Language
R and Azure
Microsoft is adapting services like Azure to include R as the scripting language for data analysis
![Page 18: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/18.jpg)
Calling R from AnyLogic
AnyLogic can work with R by using the Java library RCaller.
RCaller is a software library developed to simplify calling R from Java (see link).
It wraps and simplifies type conversions and makes the variables of each language accessible from the other.
Multiple R processes can be created and handled by multiple RCaller instances in Java.
![Page 19: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/19.jpg)
Example
Watch demo video
More info: http://agiltools.com/blogsp/anylogic_r_qchart/
![Page 20: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/20.jpg)
Interfacing Programming Languages from R
The R environment can interface with other programming languages, such as Fortran, C and Java.
Examples of interfaces with C and Java can be found in:
C: http://adv-r.had.co.nz/C-interface.html
Java: http://rforge.net/rJava/
![Page 21: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/21.jpg)
R and IoT
R can be executed inside Internet of Things (IoT) platforms like Bluemix, Amazon Web Services, Azure and Predix.
Buildpacks like cf-buildpack-r allow users to execute R scripts on Cloud Foundry-based platforms and even embed Shiny applications.
On Microsoft platforms, R scripting is already built into Azure.
![Page 22: R as supporting tool for analytics and simulation](https://reader036.vdocuments.site/reader036/viewer/2022062522/587adbab1a28ab542b8b54c5/html5/thumbnails/22.jpg)
Useful Links
R Project
CRAN
R Packages
Books and Tutorials
http://www.statmethods.net/
R Bloggers
R Journal
R Graphical Manuals