
Page 1

Sqoop In Action
Lecturer: Alex Wang
QQ: 532500648
QQ Communication Group: 286081824

Page 2

Agenda

• Set up the Sqoop environment
• Import data
• Incremental import
• Free-Form Query Import
• Export data
• Sqoop and Hive

Page 3

Apache Sqoop Links

• http://sqoop.apache.org/
• http://sqoop.apache.org/docs/1.4.6/index.html

Page 4

Introduction

• Sqoop is a tool designed to transfer data between Hadoop and relational databases or mainframes. You can use Sqoop to import data from a relational database management system (RDBMS) such as MySQL or Oracle or a mainframe into the Hadoop Distributed File System (HDFS), transform the data in Hadoop MapReduce, and then export the data back into an RDBMS.

• Sqoop automates most of this process, relying on the database to describe the schema for the data to be imported. Sqoop uses MapReduce to import and export the data, which provides parallel operation as well as fault tolerance.

Page 5

Apache Sqoop-1 Architecture

Page 6

Apache Sqoop-2 Architecture

Page 7

Prerequisites

• The following prerequisite knowledge is assumed for this course:

• Basic computer technology and terminology
• Familiarity with command-line interfaces such as bash
• Relational database management systems
• Basic familiarity with the purpose and operation of Hadoop

Page 8

Set up the Sqoop environment

• Download the Sqoop tarball and uncompress it.
• Configure the environment variables:

export SQOOP_HOME=/usr/local/sqoop-1.4.3.bin__hadoop-0.20
export PATH=$SQOOP_HOME/bin:$PATH
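A quick way to confirm the setup (a minimal sketch; it assumes Hadoop is already installed and on the PATH, and that the two export lines above were added to ~/.bashrc):

source ~/.bashrc     # or open a new shell so SQOOP_HOME and PATH take effect
sqoop version        # prints the installed Sqoop release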

Page 9

Download the database connectors
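Sqoop talks to the database through a JDBC driver, so the connector JAR needs to be on Sqoop's classpath. A minimal sketch for MySQL (the exact JAR name depends on the Connector/J version you downloaded):

# copy the downloaded MySQL Connector/J JAR into Sqoop's lib directory
cp mysql-connector-java-*-bin.jar $SQOOP_HOME/lib/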

Page 10

Introduce the sqoop command
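A few commands that are useful for exploring the tool (a sketch; the list-* examples assume the MySQL connector from the previous step and a MySQL server reachable on the host master, as used later in this deck):

sqoop help                      # list all available Sqoop tools
sqoop help import               # show every option accepted by the import tool
sqoop list-databases \
  --connect jdbc:mysql://master:3306 \
  --username root -P            # enumerate databases over JDBC
sqoop list-tables \
  --connect jdbc:mysql://master:3306/sqoop \
  --username root -P            # enumerate tables in the sqoop database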

Page 11

Prepare MySQL

• Install mysql-server.
• Create a test database (sqoop).
• Create two tables.
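A minimal sketch of the test schema. The exact column layout is not shown on the slide; the definitions below are assumptions chosen to match the columns the later examples rely on (id, country, city in cities; country_id, country in countries), and the sample rows are only illustrative:

mysql -u root -p <<'SQL'
CREATE DATABASE IF NOT EXISTS sqoop;
USE sqoop;

-- imported in most of the examples that follow
CREATE TABLE cities (
  id      INT PRIMARY KEY,
  country VARCHAR(64),
  city    VARCHAR(64)
);

-- joined in the free-form query examples
CREATE TABLE countries (
  country_id INT PRIMARY KEY,
  country    VARCHAR(64)
);

INSERT INTO cities VALUES
  (1, 'USA', 'Palo Alto'),
  (2, 'Czech Republic', 'Brno');
SQL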

Page 12

Import--Transferring an Entire Table

sqoop import \
  --connect jdbc:mysql://master:3306/sqoop \
  --username username \
  --password password \
  --table cities

Page 13

Import--Specifying a Target Directory

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --target-dir /etl/input/cities

Page 14

Import--Using --warehouse-dir

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --warehouse-dir /etl/input/
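The difference from --target-dir: with --warehouse-dir Sqoop appends a subdirectory named after the table, so the files for this job end up under /etl/input/cities rather than directly in /etl/input/. A quick check of the result (a sketch):

hadoop fs -ls /etl/input/cities                       # one part-m-* file per mapper
hadoop fs -cat /etl/input/cities/part-m-00000 | head  # peek at the imported CSV rows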

Page 15

Import--Importing Only a Subset of Data

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --target-dir /alex/input/subset/cities \
  --where "country = 'USA'"

Page 16

Protecting Your Password

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --table cities \
  -P

Page 17

Protecting Your Password

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --table cities \
  --password-file my-sqoop-password

echo "my-secret-password" > sqoop.password
hadoop dfs -put sqoop.password /user/$USER/sqoop.password
hadoop dfs -chmod 400 /user/$USER/sqoop.password

Page 18

Import--Using a File Format Other Than CSV

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --as-sequencefile

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --as-avrodatafile

Page 19

Import--Compressing Imported Data

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --table cities \
  --compress

To pick a specific codec instead of the default gzip, add:

  --compression-codec org.apache.hadoop.io.compress.BZip2Codec

Page 20

Import--Speeding Up Transfers

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --table cities \
  --direct

Page 21

Import--Overriding Type Mapping

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --table cities \
  --map-column-java id=Long

Page 22

Import--Controlling Parallelism

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --num-mappers 10

Page 23

Import--Encoding NULL Values

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --null-string '\\N' \
  --null-non-string '\\N'

Page 24

Import--Importing All Your Tables

sqoop import-all-tables \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop

sqoop import-all-tables \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --exclude-tables cities,countries

Page 25

Incremental Import

• So far we’ve covered use cases where you had to transfer an entire table’s contents from the database into Hadoop as a one-time operation. What if you need to keep the imported data on Hadoop in sync with the source table on the relational database side? While you could obtain a fresh copy every day by reimporting all data, that would not be optimal. The amount of time needed to import the data would increase in proportion to the amount of additional data appended to the table daily. This would put an unnecessary performance burden on your database. Why reimport data that has already been imported? For transferring deltas of data, Sqoop offers the ability to do incremental imports.

Page 26

Importing Only New Data

• Incremental import in append mode will allow you to transfer only the newly created rows. This saves a considerable amount of resources compared with doing a full import every time you need the data to be in sync. One downside is the need to know the value of the last imported row so that next time Sqoop can start off where it ended. Sqoop, when running in incremental mode, always prints out the value of the last imported row. This allows you to easily pick up where you left off.

Page 27

Importing Only New Data

sqoop import \
  --connect jdbc:mysql://master:3306/sqoop \
  --username root \
  --password root \
  --table cities \
  --target-dir /alex/input/append \
  --incremental append \
  --check-column id \
  --last-value 1

Page 28

Incrementally Importing Mutable Data

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table visits \
  --incremental lastmodified \
  --check-column last_update_date \
  --last-value "2013-05-22 01:01:01"
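On later runs, lastmodified mode re-imports rows whose check column changed, so the new files can overlap with data already in HDFS. Sqoop 1.4.x can reconcile the two copies if you name the table's primary key with --merge-key; a hedged variant of the command above (it assumes id is the primary key of visits):

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table visits \
  --incremental lastmodified \
  --check-column last_update_date \
  --last-value "2013-05-22 01:01:01" \
  --merge-key id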

Page 29

Preserving the Last Imported Value

sqoop job \
  --create visits \
  -- \
  import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table visits \
  --incremental append \
  --check-column id \
  --last-value 0

sqoop job --exec visits

Page 30

Sqoop Job

• The Sqoop metastore is a powerful part of Sqoop that allows you to retain your job definitions and to easily run them anytime. Each saved job has a logical name that is used for referencing. You can list all retained jobs using the --list parameter:

sqoop job --list

You can remove old job definitions that are no longer needed with the --delete parameter, for example:

sqoop job --delete visits

And finally, you can view the content of a saved job definition using the --show parameter, for example:

sqoop job --show visits

Output of the --show command will be in the form of properties. Unfortunately, Sqoop currently can't rebuild the command line that you used to create the saved job.

Page 31

Storing Passwords in the Metastore

<configuration>
  ...
  <property>
    <name>sqoop.metastore.client.record.password</name>
    <value>true</value>
  </property>
</configuration>

Page 32

Overriding the Arguments to a Saved Job

sqoop job --exec visits -- --verbose

Page 33

Sharing the Metastore Between Sqoop Clients

sqoop job \
  --create cities \
  --meta-connect jdbc:hsqldb:hsql://master:16000/sqoop \
  -- \
  import \
  --connect jdbc:mysql://master:3306/sqoop \
  --username root \
  --password root \
  --table cities \
  --target-dir /alex/input/append \
  --incremental append \
  --check-column id \
  --last-value 1
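The shared metastore itself has to be running before clients can connect to it. A minimal sketch, assuming it runs on the host master and listens on the default port 16000 (the port is configurable through sqoop.metastore.server.port):

# on the metastore host
sqoop metastore &

# any client can then work with the shared job definitions
sqoop job --list --meta-connect jdbc:hsqldb:hsql://master:16000/sqoop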

Page 34

sqoop-site.xml

<configuration>
  ...
  <property>
    <name>sqoop.metastore.client.autoconnect.url</name>
    <value>jdbc:hsqldb:hsql://your-metastore:16000/sqoop</value>
  </property>
</configuration>

Page 35

Free-Form Query Import

• The previous chapters covered the use cases where you had an input table on the source database system and you needed to transfer the table as a whole or one part at a time into the Hadoop ecosystem. This chapter, on the other hand, will focus on more advanced use cases where you need to import data from more than one table or where you need to customize the transferred data by calling various database functions.

Page 36

Importing Data from Two Tables

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --query 'SELECT normcities.id, \
           countries.country, \
           normcities.city \
           FROM normcities \
           JOIN countries USING(country_id) \
           WHERE $CONDITIONS' \
  --split-by id \
  --target-dir cities

Page 37

Using Custom Boundary Queries

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --query 'SELECT normcities.id, \
           countries.country, \
           normcities.city \
           FROM normcities \
           JOIN countries USING(country_id) \
           WHERE $CONDITIONS' \
  --split-by id \
  --target-dir cities \
  --boundary-query "select min(id), max(id) from normcities"

Page 38

Renaming Sqoop Job Instances

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --query 'SELECT normcities.id, \
           countries.country, \
           normcities.city \
           FROM normcities \
           JOIN countries USING(country_id) \
           WHERE $CONDITIONS' \
  --split-by id \
  --target-dir cities \
  --mapreduce-job-name normcities

Page 39

Importing Queries with Duplicated Columns

--query "SELECT \
  cities.city AS first_city, \
  normcities.city AS second_city \
  FROM cities \
  LEFT JOIN normcities USING(id)"

Page 40

Exporting Data to the Database

• The previous three chapters had one thing in common: they described various use cases of transferring data from a database server to the Hadoop ecosystem. What if you have the opposite scenario and need to transfer generated, processed, or backed-up data from Hadoop to your database? Sqoop also provides facilities for this use case, and the following recipes in this chapter will help you understand how to take advantage of this feature.

Page 41

Transferring Data from Hadoop

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --export-dir cities
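Sqoop export does not create the target table; it must already exist in MySQL with column types compatible with the exported files. A minimal sketch, reusing the cities layout assumed earlier in this deck:

# -p prompts for the password; the trailing "sqoop" is the database name
mysql -u sqoop -p sqoop -e "CREATE TABLE IF NOT EXISTS cities (
  id      INT PRIMARY KEY,
  country VARCHAR(64),
  city    VARCHAR(64)
)"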

Page 42

Inserting Data in Batches

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --export-dir cities \
  --batch

Page 43

Inserting Data in Batches

sqoop export \
  -Dsqoop.export.records.per.statement=10 \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --export-dir cities

sqoop export \
  -Dsqoop.export.statements.per.transaction=10 \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --export-dir cities

Page 44

Exporting with All-or-Nothing Semantics

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --staging-table staging_cities
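The staging table must already exist, have the same structure as the final table, and be empty when the export starts; Sqoop fills it first and moves the rows into cities only after every mapper has succeeded (adding --clear-staging-table lets Sqoop empty it for you). A hedged sketch of preparing it:

mysql -u sqoop -p sqoop -e "CREATE TABLE staging_cities LIKE cities"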

Page 45

Updating an Existing Data Set

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --update-key id

Page 46

Updating or Inserting at the Same Time

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --update-key id \
  --update-mode allowinsert

Page 47

Using Stored Procedures

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --call populate_cities
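With --call, Sqoop invokes the named stored procedure once per exported record, passing the record's columns as arguments, instead of generating INSERT statements. The procedure itself is not shown on the slide; a hypothetical definition matching the three-column cities layout assumed earlier:

mysql -u sqoop -p sqoop -e "
CREATE PROCEDURE populate_cities(IN p_id INT, IN p_country VARCHAR(64), IN p_city VARCHAR(64))
  INSERT INTO cities (id, country, city) VALUES (p_id, p_country, p_city)"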

Page 48

Exporting into a Subset of Columns

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --columns country,city

Page 49

Encoding the NULL Value Differently

sqoop export \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --input-null-string '\\N' \
  --input-null-non-string '\\N'

Page 50

Using Sqoop to Import Data into Hive

• Sqoop can import your data directly into Hive.

Page 51

Importing Data Directly into Hive

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --hive-import
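After the job finishes, the imported table should be visible in Hive's default database; a quick hedged check:

hive -e "DESCRIBE cities"
hive -e "SELECT * FROM cities LIMIT 5"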

Page 52

Using Partitioned Hive Tables

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --hive-import \
  --hive-partition-key day \
  --hive-partition-value "2013-05-22"

Page 53

Replacing Special Delimiters During Hive Import

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --hive-import \
  --hive-drop-import-delims

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --hive-import \
  --hive-delims-replacement "SPECIAL"

Page 54

Using the Correct NULL String in Hive

sqoop import \
  --connect jdbc:mysql://mysql.example.com/sqoop \
  --username sqoop \
  --password sqoop \
  --table cities \
  --hive-import \
  --null-string '\\N' \
  --null-non-string '\\N'

Page 55

Sqoop summary

• Sqoop depends on JDBC connectivity to the source database.
• Sqoop imports and exports put load on the source database and can affect its performance.

Page 56