Using Oracle R Advanced Analytics for Hadoop (ORAAH)

Post on 20-May-2020

Hello and welcome to this online, self-paced course titled Administering and Managing the

Oracle Big Data Appliance (BDA). This course contains several lessons. This lesson is titled

Using Oracle R Advanced Analytics for Hadoop (ORAAH). My name is Lauran Serhal. I

am a curriculum developer at Oracle and I have helped educate customers on Oracle

products since 1995. I'll be guiding you through this course, which consists of lectures,

demos, and review sessions.

The goal of this lesson is to describe Oracle R Advanced Analytics for Hadoop (ORAAH) and

identify the benefits of using simple R functions.

Using Oracle R Advanced Analytics for Hadoop - 1

Introduction

Before we begin, take a look at some of the features of this course player. If you’ve viewed a

similar self-paced course in the past, feel free to skip this slide.

Menu

This is the Menu tab. It’s set up to automatically progress through the course in a linear

fashion, but you can also review the material in any order. Just click a slide title in the outline

to display its contents.

Notes

Click the Notes tab to view the audio transcript for each slide.

Search

Use the Search field to find specific information in the course.

Player Controls

Use these controls to pause, play, or move to the previous or next slide. Use the interactive

progress bar to fast forward or rewind the current slide. Some interactive slides in this course

may contain additional navigation and controls. The view for certain slides may change so

that you can see additional details.

Resources (Optional)

Click the Resources button to access any attachments associated with this course.

Glossary (Optional)

Click the Glossary button to view key terms and their definitions.

So, you know the title of the course, but you may be asking yourself, “Is this the right course

for me?” Click the bars to learn about the course objectives, target audience, and

prerequisites.

What can you expect to get out of this course? Here are the core learning objectives.

After completing this course, you should be able to do the following:

1. Define the Hadoop ecosystem and its components, including the Hadoop Distributed File System (HDFS), MapReduce, Spark, YARN, and some other related projects.

2. Complete the BDA Site Checklists.

3. Run the Oracle BDA Configuration Utility.

4. Install the Oracle BDA Mammoth software on the Oracle BDA.

5. Describe how to secure data on the Oracle BDA.

6. Work with the Oracle Big Data Connectors.

Who is this course for? Here is the intended audience.

• Application Developers

• Database Administrators

• Hadoop/Big Data Cluster Administrators

• Hadoop Programmers

Before taking this course, you should have some exposure to Big Data, and optionally some

basic database knowledge.

In this course, we'll talk about the following lessons:

In the Introduction to the Hadoop Ecosystem, you define the Hadoop ecosystem and describe the Hadoop core components and some of the related projects. You will also learn about the components of HDFS and review MapReduce, Spark, and YARN.

In the Introduction to the Oracle BDA lesson, you identify the Oracle Big Data Appliance (BDA) and its hardware and software components.

In the Oracle BDA Pre-Installation Steps lesson, you learn how to download and complete the BDA Site Checklists. You also learn how to download and run the Oracle BDA Configuration Utility and then review the generated configuration files.

In the Working With Mammoth lesson, you learn how to download the Oracle BDA Mammoth Software Deployment Bundle from My Oracle Support. You also learn how to install a CDH or NoSQL cluster based on your specifications. You then learn how to install the Oracle BDA Mammoth Software Deployment Bundle using the Mammoth utility.

In the Securing the Oracle BDA lesson, you learn how to secure data on the Oracle Big Data Appliance.

In the Working With the Oracle Big Data Connectors lessons, you learn how to use Oracle SQL Connector for Hadoop Distributed File System, Oracle Loader for Hadoop, Oracle Data Integrator, Oracle XQuery for Hadoop, and Oracle R Advanced Analytics for Hadoop.

Now that you’ve learned about the other Oracle Big Data Connectors, let’s take a look at the Oracle R Advanced Analytics for Hadoop (ORAAH) connector.

After completing this lesson, you should be able to:

• Describe Oracle Advanced Analytics, Oracle Data Mining, and Oracle R Enterprise at a

high level

• Describe Oracle R Advanced Analytics for Hadoop (ORAAH) and identify the benefits of

using simple R functions

Let's get started.

Oracle SQL Connector for Hadoop Distributed File System (previously Oracle Direct Connector

for HDFS): Enables an Oracle external table to access data stored in Hadoop Distributed File System

(HDFS) files or a table in Apache Hive. The data can remain in HDFS or the Hive table, or it can be

loaded into an Oracle database.

Oracle Loader for Hadoop: Provides an efficient and high-performance loader for fast movement of

data from a Hadoop cluster into a table in an Oracle database. Oracle Loader for Hadoop pre-partitions

the data if necessary and transforms it into a database-ready format. It optionally sorts records by

primary key or user-defined columns before loading the data or creating output files.

Oracle Data Integrator: Extracts, loads, and transforms data from sources such as files and databases

into Hadoop and from Hadoop into Oracle or third-party databases. Oracle Data Integrator provides a

graphical user interface to utilize the native Hadoop tools and transformation engines such as Hive,

HBase, Sqoop, Oracle Loader for Hadoop, and Oracle SQL Connector for HDFS.

Oracle XQuery for Hadoop (OXH): Runs transformations expressed in the XQuery language by

translating them into a series of MapReduce jobs, which are executed in parallel on the Hadoop cluster.

Oracle R Advanced Analytics for Hadoop (ORAAH): Provides a general computation framework, in

which you can use the R language to write your custom logic as mappers or reducers.

OSCH, OLH, ODI, and OXH are covered in two other lessons. ORAAH is covered in this lesson.

Oracle R Advanced Analytics for Hadoop (ORAAH):

• Is a collection of R packages that enable Big Data analytics from an R environment.

• Enables a data scientist or analyst to:

- Work on data from multiple platforms from the R environment

- Benefit from the R ecosystem while leveraging the massively distributed Hadoop computational infrastructure

• Leverages the cluster compute infrastructure for parallel, distributed computation while shielding the R user from Hadoop’s complexity through a small number of easy-to-use APIs.

Note: For a complete description of the ORAAH features, refer to the latest ORAAH 2.5.0

Release Notes at:

http://download.oracle.com/otn/other/bigdata/ORAAH-2.5.0-ReleaseNotes.pdf

Before we get started, to use Oracle R Advanced Analytics for Hadoop, you should be familiar

with the following:

• MapReduce and Spark programming concepts

• R programming

• Statistical methods

Oracle Advanced Analytics (OAA) is an option for Oracle Database Enterprise Edition that extends the database into a comprehensive advanced analytics platform for big data.

• The Oracle Advanced Analytics Option is a comprehensive advanced analytics platform

comprising Oracle Data Mining and Oracle R Enterprise.

• Oracle Data Mining delivers in-Database predictive analytics, data mining, and text

mining.

• Oracle R Enterprise enables R programmers to leverage the power and scalability of

Oracle Database, while also delivering R’s world-class statistical analytics, advanced

numerical computations, and interactive graphics.

With these two technologies, OAA brings powerful computations to the database, resulting in

dramatic improvements in information discovery, scalability, security, and savings.

Data analysts, data scientists, statistical programmers, application developers, and DBAs can

develop and automate sophisticated analytical methodologies inside the database and gain

competitive advantage by leveraging the OAA Option.

ORAAH is a set of packages that provides an interface between the local R environment, Oracle Database, and Cloudera Distribution for Hadoop. From the ORAAH interface, you can use simple R functions to do the following:

• Execute Machine Learning algorithms directly on the Hadoop cluster using either

MapReduce or Spark

• Sample data in HDFS

• Copy data between Oracle Database and HDFS

• Schedule R programs to execute as MapReduce and Spark jobs

• Return the results to Oracle Database, HDFS, or your client

Notes:

• ORAAH is one of the Oracle Big Data Connectors. It is a separate product from ORE.

• ORAAH is available only as part of the Oracle Big Data Connectors.

ORAAH includes a collection of R packages that provides:

• Interfaces to work with:

- Apache Hive tables

- Apache Hadoop compute infrastructure

- Local R environment

- Oracle Database tables

• Predictive analytic techniques

- ORAAH implements the techniques in either Java or R as distributed, parallel

MapReduce jobs, thereby leveraging all nodes of your Hadoop cluster

Using simple R functions, you can perform the following tasks:

• Access and transform HDFS data using a Hive-enabled transparency layer

• Use the R language for writing mappers and reducers

• Copy data between R memory, the local file system, HDFS, Hive, and Oracle Database

instances

• Manipulate Hive data transparently from R

• Execute R programs as Hadoop MapReduce jobs and return the results to any of those

locations

• Submit MapReduce jobs from R for both non-cluster (local) execution and Hadoop

cluster execution

We will provide some examples of the various ORAAH functions in this lesson.

This architecture diagram depicts ORAAH as an interface between the local R environment,

Oracle Database, and Cloudera CDH.

This architecture enables:

• Expanded user population that can build models on Hadoop

• Accelerated rate at which business problems are tackled

• Analytics that scale with data volumes, variables, and techniques

• Transparent access to the Hadoop Cluster

• The ability to:

- Manipulate data in HDFS, database, and the file system—all from R

- Write and execute MapReduce jobs with R

- Leverage CRAN R packages to work on HDFS-resident data

- Run specialized algorithms from R (Logistic Regression and Neural Networks) using an in-memory Spark context on data in HDFS, for up to 200x faster execution than the same algorithms in MapReduce

- Move from lab to production without requiring knowledge of Hadoop internals,

Hadoop CLI, or IT infrastructure

ORAAH provides access from a local R client to Apache Hadoop using functions with these

prefixes:

• hadoop: Identifies functions that provide an interface to Hadoop MapReduce

• hdfs: Identifies functions that provide an interface to HDFS

• orch: Identifies a variety of functions; orch is a general prefix for ORCH functions (ORCH stands for Oracle R Connector for Hadoop, the earlier name of ORAAH)

• ore: Identifies functions that provide an interface to a Hive data store

• spark: Identifies a variety of functions that provide an interface between the local R

instance and a Spark cluster

On the next few pages, we will look at some of the functions from the hadoop, hdfs, orch,

and ore categories.
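To make the prefix convention concrete, here is a minimal Python sketch (not ORAAH code) that maps a function name to the area its prefix implies. The prefix descriptions come from the list above, and the sample names are ones mentioned in this lesson; the PREFIX_AREAS table and the area_of helper are hypothetical names introduced only for illustration.

```python
# Prefix descriptions taken from the list above; the lookup helper itself
# is a hypothetical illustration, not part of ORAAH.
PREFIX_AREAS = {
    "hadoop": "Hadoop MapReduce interface",
    "hdfs": "HDFS interface",
    "orch": "general ORCH functions",
    "ore": "Hive data store interface",
    "spark": "Spark cluster interface",
}


def area_of(function_name: str) -> str:
    """Return the area implied by the prefix before the first dot."""
    prefix = function_name.split(".", 1)[0]
    return PREFIX_AREAS.get(prefix, "unknown prefix")


if __name__ == "__main__":
    for name in ["hadoop.exec", "hdfs.put", "orch.keyval", "ore.connect", "spark.connect"]:
        print(f"{name:15s} -> {area_of(name)}")
```

Reading the prefix is usually enough to tell which part of the stack a given call touches.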

This table describes functions that you use when creating and running MapReduce programs.

ORAAH contains a variety of functions that enable an R user to interact with HDFS. You can explore files in HDFS, including the following activities: change directories,

list files in a directory, show the current directory, create new directories, move files, copy

files, determine the size of files, and sample files.

This table describes some of the functions that execute HDFS commands from within the R

environment.
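Sampling is the least obvious of the activities listed above, because a file on a cluster can be too large to load in full. The source does not say how ORAAH samples a file, so the following is only a generic Python sketch of one common technique, reservoir sampling, which draws a fixed-size uniform sample in a single pass. The sample_lines function and its parameters are hypothetical.

```python
import random
from typing import Iterable, List


def sample_lines(lines: Iterable[str], k: int, seed: int = 0) -> List[str]:
    """Reservoir sampling: keep k lines chosen uniformly from a stream.

    The stream is read once and never held in memory as a whole.
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        if i < k:
            # Fill the reservoir with the first k lines.
            reservoir.append(line)
        else:
            # Replace an existing entry with probability k / (i + 1).
            j = rng.randint(0, i)  # both bounds inclusive
            if j < k:
                reservoir[j] = line
    return reservoir


if __name__ == "__main__":
    rows = (f"row{i}" for i in range(1000))
    print(sample_lines(rows, 5))
```

Because the input is consumed as a stream, the same approach works whether the lines come from a local file or from a reader over a distributed file.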

In addition, ORAAH contains a variety of functions that enable an R user to work with HDFS content. You can interact with HDFS content in the ORAAH environment, including the

following: discovery (or creation) of metadata and easy access to in-memory R objects,

database objects, and local files.

This table describes the functions for copying data between platforms, including R data

frames, HDFS files, local files, and tables in an Oracle database.

ORAAH includes a few commands to connect to a distributed, in-memory Spark cluster (usually through YARN) and to create a Spark context that can be used by machine learning algorithms, in particular Multi-Layer Neural Networks (orch.neural) and Logistic Regression (orch.glm2).

For an example of usage on a cluster with 1 billion records, along with the detailed steps, refer to the following link: https://blogs.oracle.com/R/entry/oracle_r_advanced_analytics_for

This table describes the functions for establishing a connection to Oracle Database.

ORAAH also includes functions that expose statistical algorithms through the orch API.

These predictive algorithms are available for execution on a Hadoop cluster.

This table describes the native analytic functions.

You can connect to Hive and analyze and transform Hive table objects using R functions that have an ore prefix, such as ore.connect. If you are also using Oracle R Enterprise, then

you will recognize these functions. The ore functions in Oracle R Enterprise create and

manage objects in an Oracle database, and the ore functions in Oracle R Advanced Analytics

for Hadoop create and manage objects in a Hive database.

Without Oracle R Advanced Analytics for Hadoop, you will need Java skills to write

MapReduce programs in order to access the Hadoop data.

For example, you need the mapper and reducer programs shown on this page to perform a

simple word-count task on text data that is stored in Hadoop.

With ORAAH, the same word-count task is performed with a simple R script. ORAAH

functions greatly simplify access to Hadoop data by leveraging the familiar R interface.

In this example, the R script uses several common ORAAH functions to perform the task, including hdfs.put(), hadoop.exec(), and orch.keyval().

The R script:

• Loads the R data into HDFS and creates a function named wordcount by using the

input data

• Specifies and invokes the MapReduce job with the hadoop.exec() function

• Splits words and outputs each word in the mapper step

• Sums the count of each word in the reducer step

• Specifies the job configuration and returns the MapReduce output as the result

With ORAAH, mapper and reducer functions can be written in R, greatly simplifying access to

Hadoop data.
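The slide's R script is not reproduced in this transcript. As a stand-in, the following plain-Python sketch (not ORAAH code) walks through the same word-count logic: a mapper that splits words and emits each one, a grouping step that the framework normally performs between the two phases, and a reducer that sums the count of each word. All function names here are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Split a line into words and emit a (word, 1) pair for each."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(word: str, counts: List[int]) -> Tuple[str, int]:
    """Sum the counts collected for one word."""
    return word, sum(counts)


def wordcount(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce in sequence on a single machine."""
    pairs = (pair for line in lines for pair in mapper(line))
    return dict(reducer(word, counts) for word, counts in shuffle(pairs).items())


if __name__ == "__main__":
    print(wordcount(["the quick brown fox", "the lazy dog", "the fox"]))
```

On a cluster, the mapper runs on many input splits at once and the reducer runs once per key; only those two functions carry task-specific logic, which is why being able to write them in R removes the need for Java.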

Using ORAAH with Hive tables is transparent. You write R as if you were writing a function to work on an R data frame, and ORAAH translates the commands into the proper HiveQL (Hive Query Language) commands. The commands are executed across the entire cluster, and only the results are returned to the end user, so ORAAH can operate on very large data sets.

This example shows how to use the newly defined variable in any type of computation and

how to check the content of this new variable. The example also shows how to review the first

six records of the new variable.
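The idea of a transparency layer can be illustrated with a small Python sketch (not ORAAH code, and not how ORAAH is implemented internally, which the source does not describe). The object below records data-frame-style operations and renders them as a single HiveQL-style query, so that the work could run where the data lives and only the result would travel back. The LazyTable class, its methods, and the flights table with its columns are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LazyTable:
    """Records data-frame-style operations and renders them as one query string."""

    name: str
    columns: Tuple[str, ...] = ("*",)
    filters: Tuple[str, ...] = ()
    row_limit: Optional[int] = None

    def select(self, *cols: str) -> "LazyTable":
        return LazyTable(self.name, cols, self.filters, self.row_limit)

    def where(self, condition: str) -> "LazyTable":
        return LazyTable(self.name, self.columns, self.filters + (condition,), self.row_limit)

    def head(self, n: int = 6) -> "LazyTable":
        # Default of 6 mirrors the "first six records" mentioned above.
        return LazyTable(self.name, self.columns, self.filters, n)

    def to_query(self) -> str:
        query = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.filters:
            query += " WHERE " + " AND ".join(self.filters)
        if self.row_limit is not None:
            query += f" LIMIT {self.row_limit}"
        return query


if __name__ == "__main__":
    delayed = LazyTable("flights").select("carrier", "delay").where("delay > 15")
    print(delayed.head().to_query())
```

Nothing is computed until the query is rendered, which is what lets the user write ordinary-looking code against a table far larger than local memory.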

In this example, ORAAH is used to launch open-source R algorithms and functions, including plots, as MapReduce jobs. It returns an ORAAH packed object that can be unpacked after execution to show the graphical contents to the user.

This example illustrates how to use ORAAH with a MapReduce Linear Regression algorithm.

It is as simple as attaching the file on HDFS (first line of code) and then invoking the orch.lm() model with the formula (exactly as in R) and data settings. The Mappers and Reducers settings are optional.

The result is an R model object that can be queried, for example with summary(), to give the same results you would expect from the open-source R lm() function.
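The source does not show how orch.lm() distributes the fit, so the following Python sketch is an assumption about the general technique rather than a description of ORAAH. It illustrates one standard way that least-squares regression fits the MapReduce pattern: each mapper reduces its partition to a few sums, the reducer adds those sums together, and the coefficients are solved from the totals. The example uses a single predictor, and all names are hypothetical.

```python
from functools import reduce
from typing import Sequence, Tuple

# Sufficient statistics for one predictor: n, sum(x), sum(y), sum(x*x), sum(x*y)
Stats = Tuple[float, float, float, float, float]
Row = Tuple[float, float]


def map_partition(rows: Sequence[Row]) -> Stats:
    """Map step: summarize one partition as five sums."""
    return (
        float(len(rows)),
        sum(x for x, _ in rows),
        sum(y for _, y in rows),
        sum(x * x for x, _ in rows),
        sum(x * y for x, y in rows),
    )


def combine(a: Stats, b: Stats) -> Stats:
    """Reduce step: add the per-partition sums element by element."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3], a[4] + b[4])


def fit_line(partitions: Sequence[Sequence[Row]]) -> Tuple[float, float]:
    """Solve the least-squares line y = intercept + slope * x from the totals."""
    n, sx, sy, sxx, sxy = reduce(combine, map(map_partition, partitions))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


if __name__ == "__main__":
    # Three partitions of points that lie on y = 2x + 1.
    parts = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)], [(4.0, 9.0)]]
    print(fit_line(parts))
```

Only the five totals cross the network, regardless of how many rows each partition holds, which is why this kind of model scales with the number of nodes.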

The new Spark-based algorithms can be executed in Spark by adding one line that passes the connection settings to spark.connect(), and then invoking an algorithm that supports Spark.

Currently, the available algorithms are Logistic Regression (orch.glm2) and Neural Networks (orch.neural). They require a formula and an attached HDFS file as inputs, and run from 100x to 200x faster than their MapReduce counterparts, depending on data volumes.

You can find examples of the performance of those two algorithms at:

https://blogs.oracle.com/R/entry/oracle_r_advanced_analytics_for
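The source states the speedup but not its cause. A commonly cited reason, offered here as an assumption, is that algorithms such as logistic regression are iterative: every pass reads the same data again, so keeping the partitions in memory avoids rereading them from disk on each pass. The plain-Python sketch below (not ORAAH or Spark code) shows that access pattern with a one-feature logistic regression trained by gradient descent; all names and parameter values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Row = Tuple[float, int]  # (feature x, label y in {0, 1})


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def partition_gradient(rows: Sequence[Row], w: float, b: float) -> Tuple[float, float]:
    """Per-partition work: sum the log-loss gradient over the partition's rows."""
    gw = gb = 0.0
    for x, y in rows:
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    return gw, gb


def fit_logistic(partitions: List[Sequence[Row]], iterations: int = 500,
                 lr: float = 0.1) -> Tuple[float, float]:
    """Gradient descent: every iteration visits every partition again."""
    n = sum(len(p) for p in partitions)
    w = b = 0.0
    for _ in range(iterations):
        grads = [partition_gradient(p, w, b) for p in partitions]
        w -= lr * sum(g[0] for g in grads) / n
        b -= lr * sum(g[1] for g in grads) / n
    return w, b


if __name__ == "__main__":
    parts = [[(-2.0, 0), (-1.0, 0)], [(1.0, 1), (2.0, 1)]]
    w, b = fit_logistic(parts)
    print(sigmoid(w * 2.0 + b) > 0.5, sigmoid(w * -2.0 + b) < 0.5)
```

With 500 iterations the partitions are scanned 500 times, so the cost of each scan dominates the total run time.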

In this lesson, you should have learned how to:

• Describe Oracle Advanced Analytics, Oracle Data Mining, and Oracle R Enterprise at a

high level

• Describe Oracle R Advanced Analytics for Hadoop (ORAAH) and identify the benefits of

using simple R functions

In this course, we discussed the following lessons:

• Introduction to the Hadoop Ecosystem

• Introduction to the Oracle BDA

• Oracle BDA Pre-Installation Steps

• Working With Mammoth

• Securing the Oracle BDA

• Working With the Oracle Big Data Connectors:

- Oracle SQL Connector for Hadoop Distributed File System (HDFS)

- Oracle Loader for Hadoop (OLH)

- Oracle Data Integrator (ODI)

- Oracle XQuery for Hadoop (OXH)

- Oracle R Advanced Analytics for Hadoop (ORAAH)

The Oracle Learning Library offers other self-paced courses about the Oracle Big Data

Appliance and other related topics. Visit the Oracle Learning Library to learn about the

courses.

Oracle University offers in-class courses about Oracle Big Data and other related topics.

Visit Oracle University to learn about the following courses:

• Oracle Big Data Fundamentals

• XML Fundamentals

• Oracle Database 12c: Use XML DB

• Oracle NoSQL Database for Developers

• Oracle NoSQL Database for Administrators

• Oracle R Enterprise Essentials

The Oracle Learning Library offers many free demonstrations and tutorials.

And, of course, the Oracle Big Data Appliance documentation and online help embedded

within the product are also valuable resources.
