How to configure Eclipse for developing with Python and Spark on Hadoop

Upload: prossblad

Post on 12-Apr-2017

Category:

Software


TRANSCRIPT

Page 1: How to configure Eclipse for developing with Python and Spark on Hadoop

How to configure Eclipse for developing with Python and Spark on Hadoop

https://enahwe.wordpress.com/2015/11/25/how-to-configure-eclipse-for-developing-with-python-and-spark-on-hadoop

Page 2: How to configure Eclipse for developing with Python and Spark on Hadoop

Python is one of the most popular programming languages among data scientists, who use it to implement feature engineering and machine learning algorithms.

Spark (DataFrame and Machine Learning) lets data scientists who develop in Python improve their programs' performance by running them on a Spark cluster.

But what if data scientists want their Python projects to be more industrial?

There are many benefits to developing with an IDE like Eclipse in addition to developing in web mode on notebook servers like Jupyter and Zeppelin.

This roadmap describes how to configure the Eclipse V4.3 IDE with the PyDev V4.x+ plugin in order to develop with Python V2.6 or higher and Spark V1.5 or V1.6 on Hadoop YARN.

Page 3: How to configure Eclipse for developing with Python and Spark on Hadoop

In this roadmap you will learn how to complete the following tasks:

• How to execute the basic Spark example code “Word Counts”

• How to read a CSV file directly as a Spark DataFrame for processing SQL

• How to execute your Python-Spark application on a cluster with Hadoop YARN

• How to deploy your Python-Spark application in a production environment
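To give a feel for the first topic, a word count in PySpark might look roughly like the sketch below. It is not the roadmap's own code: the script name `wordcount.py`, the application name `WordCountSketch`, and the helper functions `split_words`, `to_pair`, and `count_words` are all hypothetical, and the input path is whatever text file you pass on the command line. It only uses RDD calls that should be available in Spark V1.5 and V1.6.

```python
from __future__ import print_function  # keeps print() usable on Python 2.6+

import sys
from operator import add


def split_words(line):
    """Split one line of text into whitespace-separated words."""
    return line.split()


def to_pair(word):
    """Map a word to a (word, 1) pair so counts can be summed per key."""
    return (word, 1)


def count_words(lines_rdd):
    """Apply the flatMap / map / reduceByKey pipeline to an RDD of lines."""
    return lines_rdd.flatMap(split_words).map(to_pair).reduceByKey(add)


def main(input_path):
    # pyspark is imported here (an assumption of this sketch) so that the
    # helpers above can be exercised without a Spark installation.
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAppName("WordCountSketch")  # hypothetical app name
    sc = SparkContext(conf=conf)
    try:
        # collect() brings all results to the driver; fine for small inputs.
        for word, count in count_words(sc.textFile(input_path)).collect():
            print("%s: %d" % (word, count))
    finally:
        sc.stop()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        main(sys.argv[1])
    else:
        print("usage: spark-submit wordcount.py <input-path>", file=sys.stderr)
```

Such a script would typically be launched from Eclipse with PyDev during development, or handed to `spark-submit` when running on a cluster. Keeping the per-line logic in small named functions rather than lambdas makes it easier to step through in the IDE's debugger.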
