Hadoop integration with SAP HANA
DESCRIPTION
This guide was prepared in the January-February 2014 timeframe. With SAP HANA Smart Data Access (SDA), it is possible to access remote data without first replicating it to the SAP HANA database. The following sources are supported (as of 2013):
- Teradata database
- SAP Sybase ASE
- SAP Sybase IQ
- Intel Distribution for Apache Hadoop
- SAP HANA
SAP HANA handles the remote data like local tables in the database. Automatic data type conversion maps the data types of databases connected via SAP HANA Smart Data Access to SAP HANA data types.
This guide explains the step-by-step approach to SAP HANA SDA for Hadoop data, which includes the following:
- Hadoop installation
- Data load in the Hadoop system
- Activities on unstructured data in the Hadoop system
- ODBC driver installation and configuration on the HANA server for access to Hadoop system data
- Smart Data Access in SAP HANA (through SAP HANA Studio), using Hadoop as a remote data source
Setup used for this guide:
1) Hadoop: HDP 1.3 for Windows (Hortonworks Data Platform), standalone, on a Dell laptop running Windows 7 64-bit with 8 GB RAM
2) SAP HANA server: running on a VM with 24 GB, standalone HANA 1.0 SPS 07 on SLES 11 SP1
TRANSCRIPT
SAP HANA Smart Data Access using Hadoop/Hive
Prepared by Debajit Banerjee Page 1
SAP HANA Smart Data Access using Hadoop/Hive
By Debajit Banerjee
Table of Contents
Introduction about SAP HANA Smart Data Access
I. HDP 1.3 for Windows Installation Pre-requisite
II. HDP 1.3 for Windows (Hortonworks Data Platform) Standalone Installation
III. Validation of HDP 1.3 for Windows - Standalone Installation
IV. Data Load in Hadoop System : eBook Upload
V. Unstructured Data Transformation into Table/View in Hadoop System
VI. ODBC Driver Installation & Configuration on SAP HANA Server
VII. Smart Data Access (Hadoop Data) in SAP HANA
SAP HANA Smart Data Access
With SAP HANA Smart Data Access, it is possible to access remote data without first replicating it to the SAP HANA database. The following sources are supported (as of 2013):
- Teradata database
- SAP Sybase ASE
- SAP Sybase IQ
- Intel Distribution for Apache Hadoop
- SAP HANA
SAP HANA handles the remote data like local tables in the database. Automatic data type conversion maps the data types of databases connected via SAP HANA Smart Data Access to SAP HANA data types.
Steps/Procedure:
- Hadoop installation
- Data load in the Hadoop system
- Activities on unstructured data in the Hadoop system
- ODBC driver installation and configuration on the HANA server for access to Hadoop system data
- Smart Data Access in SAP HANA (through SAP HANA Studio), using Hadoop as a remote data source
Assumption: the SAP HANA system is already up and running.
Scenario / Lab Setup Details:
1) Hadoop installation prerequisites: HDP 1.3 for Windows (Hortonworks Data Platform), standalone
2) Hadoop installation: HDP 1.3 for Windows (Hortonworks Data Platform), standalone, on a Dell laptop (Windows 7 64-bit, 8 GB)
3) SAP HANA server installation: lab server running on a VM (24 GB, standalone HANA 1.0 SPS 07) on SLES 11 SP1
4) Validation of the Hadoop installation
5) Data load in the Hadoop system: eBook upload
6) Transformation of unstructured data into tables/views, so that the HANA server can understand the Hadoop data
7) ODBC driver installation and configuration on the HANA server
8) Smart Data Access in SAP HANA (through SAP HANA Studio), using Hadoop as a remote data source
I. HDP 1.3 for Windows Installation Pre-requisite
- On HANA server: Simba Apache Hive ODBC Driver (Linux 64-bit)
- On Hadoop system: Microsoft Visual C++ 2010 Redistributable Package (64-bit)
- On Hadoop system: Microsoft .NET Framework 4.0
- On Hadoop system: Java JDK 1.6/1.7, with the PATH and JAVA_HOME environment variables set
- On Hadoop system: Python 2.7, with the PATH environment variable set
In Linux
In Windows
MS Visual C++ 2010
MS .NET Framework 4
The installer offers the Repair option, which indicates that the component is already installed, so the setup is cancelled.
Oracle JDK
i. Open the Control Panel -> System pane and click Advanced system settings.
ii. Click the Advanced tab.
iii. Click the Environment Variables button.
iv. Under System variables, click New.
v. Enter JAVA_HOME as the Variable Name.
vi. Enter the installation path of the Java Development Kit as the Variable Value. For example, if your JDK is installed at C:\Java\jdk1.6.0_31, enter this path as the Variable Value.
vii. Click OK.
viii. Click OK to close the Environment Variables dialog box.
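Once the variables are set, a quick consistency check can catch typos before the HDP installer runs. The following is a minimal sketch, not part of the original guide: the function name check_java_env is hypothetical, and it assumes that PATH is expected to contain the bin folder under JAVA_HOME. It uses Python's ntpath module so that Windows path rules (case-insensitive comparison, backslashes) apply.

```python
import ntpath  # Windows path rules, usable on any platform
import os


def check_java_env(env):
    """Return a list of problems with JAVA_HOME and PATH in a Windows-style environment."""
    java_home = env.get("JAVA_HOME", "").strip().rstrip("\\")
    if not java_home:
        return ["JAVA_HOME is not set"]

    # Assumption: the JDK's bin folder should be one of the PATH entries.
    expected_bin = ntpath.join(java_home, "bin")
    path_entries = [
        ntpath.normcase(entry.strip().rstrip("\\"))
        for entry in env.get("PATH", "").split(";")
        if entry.strip()
    ]

    problems = []
    if ntpath.normcase(expected_bin) not in path_entries:
        problems.append("PATH does not contain " + expected_bin)
    return problems


if __name__ == "__main__":
    # Check the environment of the current process and print the findings.
    for line in check_java_env(os.environ) or ["JAVA_HOME and PATH look consistent"]:
        print(line)
```

Run it from a newly opened command window, because windows that were already open do not pick up changed system variables.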
Python
As with the Oracle JDK above, C:\Python27 must also be added to the PATH variable.
II. HDP 1.3 for Windows (Hortonworks Data Platform) Standalone Installation
Now update C:\hdp-1.3.0.0-GA\clusterproperties.txt to match the standalone setup.
In a command window (with administrator privileges):
msiexec /i "C:\hdp-1.3.0.0-GA\hdp-1.3.0.0.winpkg.msi" /lv "C:\DEBAJIT\HD\hdp13\hdp.log" HDP_LAYOUT="C:\hdp-1.3.0.0-GA\clusterproperties.txt" HDP_DIR="C:\hdp\hadoop" DESTROY_DATA="Yes"
The installation creates three shortcuts on the desktop.
III. Validation of HDP 1.3 for Windows - Standalone Installation
Now we have to start Hadoop.
The HBase services did not start because their .xml files (master and regionserver) were 0 bytes in size.
The rest.xml, thrift.xml and thrift2.xml files were also 0 bytes.
1) Navigate to the HBase install directory: C:\hdp\hadoop\hbase-0.94.6.1.3.0.0-0380\bin
2) Open hbase.cmd in a text editor.
3) Look for the line that says: set PATH=%PATH%;%HADOOP_HOME%\bin
4) Delete it or comment it out with @rem.
Now open a command prompt, navigate to the HBase install directory (C:\hdp\hadoop\hbase-0.94.6.1.3.0.0-0380\bin) and rebuild the .xml files:
hbase.cmd --service master start > master.xml
hbase.cmd --service regionserver start > regionserver.xml
hbase.cmd --service rest > rest.xml
hbase.cmd --service thrift > thrift.xml
hbase.cmd --service thrift2 > thrift2.xml
Now all of the above .xml files have content.
Stop and start Hadoop. All services now start, with no more failures.
Hadoop Smoketest
IV. Data Load in Hadoop System : eBook Upload
Now check whether Hadoop can read the uploaded file.
It can.
After refresh
From the Namenode server, click on “Browse the filesystem”
Click on “user”
Click on the .txt file to see the book.
Click on the .out entry to see the part file.
V. Unstructured Data Transformation into Table/View in Hadoop System
Now the files have to be converted into a table format that HANA can read. Hive is used for this.
A table called “debajit_wc” is created for the wordcount part file. At this point it is still empty.
Now load the data.
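The table creation and data load can be pictured as two HiveQL statements. The sketch below only builds the statement text; it assumes the wordcount part file holds one tab-separated word and count per line, and the column names word and cnt, the function name and the HDFS path in the usage note are hypothetical, since the guide shows the actual statements only as screenshots.

```python
import re


def wordcount_table_statements(table, hdfs_path):
    """Build HiveQL for a two-column, tab-delimited wordcount table and its data load."""
    # Keep the generated text safe: plain identifiers and quote-free paths only.
    if not re.match(r"^[A-Za-z_][A-Za-z0-9_]*$", table):
        raise ValueError("unexpected table name: " + table)
    if "'" in hdfs_path:
        raise ValueError("path must not contain single quotes")

    create = (
        "CREATE TABLE {t} (word STRING, cnt INT) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
        "STORED AS TEXTFILE"
    ).format(t=table)
    # LOAD DATA INPATH moves the file from its HDFS location into the table's directory.
    load = "LOAD DATA INPATH '{p}' INTO TABLE {t}".format(p=hdfs_path, t=table)
    return [create, load]
```

For example, wordcount_table_statements("debajit_wc", "/user/hadoop/wordcount.out/part-r-00000") returns the two statements to paste into the Hive command line, with the path replaced by the real location of the part file.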
A configuration change is required in the hive-site.xml file.
The server mode is changed from http to thrift.
Then restart Hadoop.
Now we can test whether SAP HANA can connect to Hadoop.
The license file received by email was downloaded and deployed, which solved the problem.
VI. ODBC Driver Installation & Configuration on SAP HANA Server
The renaming is done in WinSCP.
Stopping HANA System
SIMBA Driver
Changed items are as follows:
UNIXODBC
unixODBC has to be upgraded because of a compatibility issue with the Simba driver.
ODBC.INI - defines the DSN
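A DSN entry in odbc.ini is an INI section whose name is the DSN and whose keys describe the driver and the Hive server. The sketch below generates such a section. Everything specific in it is an assumption rather than a value from the guide: the key names Driver, HOST and PORT, the function name, and the DSN, driver path and host in the usage note are illustrative, and the Simba driver documentation should be checked for the exact keys it expects.

```python
import configparser
import io


def build_odbc_ini(dsn, driver_path, host, port):
    """Return the text of an odbc.ini section that defines one DSN."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the key names exactly as written
    # Assumed key names; further driver-specific settings are left out.
    parser[dsn] = {"Driver": driver_path, "HOST": host, "PORT": str(port)}

    buffer = io.StringIO()
    parser.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()
```

For example, build_odbc_ini("hive1", "/path/to/hive/odbc/driver.so", "hadoop-host", 10000) yields a [hive1] section. Port 10000 is the usual default of the Hive Thrift service, but the value configured on the Hadoop system takes precedence.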
The ODBC information is now added to customer.sh.
The connection between the HANA server and the Hadoop system now works at OS level.
VII. Smart Data Access (Hadoop Data) in SAP HANA
SAP HANA Studio
The connection between the HANA server and the Hadoop system now also works from SAP HANA Studio.
Next, a schema is created in HP7.
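The guide performs these steps through the SAP HANA Studio dialogs. As a rough SQL equivalent, the sketch below builds a CREATE REMOTE SOURCE statement for a Hive DSN and a CREATE VIRTUAL TABLE statement that points at a Hive table. It is an illustration under assumptions: the adapter name hiveodbc and the "HIVE" path element follow the usual SDA convention for Hive sources, and all object names, credentials and the function names are hypothetical.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote an SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def sda_statements(source, dsn, user, password, schema, virtual_table,
                   remote_db, remote_table):
    """Build the remote-source and virtual-table statements for a Hive DSN."""
    create_source = (
        "CREATE REMOTE SOURCE {src} ADAPTER \"hiveodbc\" "
        "CONFIGURATION {cfg} WITH CREDENTIAL TYPE 'PASSWORD' USING {cred}"
    ).format(
        src=quote_ident(source),
        cfg=quote_literal("DSN=" + dsn),
        cred=quote_literal("user={0};password={1}".format(user, password)),
    )
    # Remote path: <remote source>."HIVE".<Hive database>.<Hive table>
    create_table = "CREATE VIRTUAL TABLE {s}.{v} AT {src}.\"HIVE\".{db}.{t}".format(
        s=quote_ident(schema),
        v=quote_ident(virtual_table),
        src=quote_ident(source),
        db=quote_ident(remote_db),
        t=quote_ident(remote_table),
    )
    return [create_source, create_table]
```

The DSN passed in must match the section name defined in odbc.ini on the HANA server, and the remote table would be the Hive table created earlier (debajit_wc in this guide).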
Query and connection monitoring are available by clicking “Smart Data Access” under “Provisioning”.
That’s all.
**** END OF DOCUMENT ****