Big Data with Hadoop and SSIS 2016
##SQLSatMadrid
Big Data with Hadoop in SQL Server SSIS 2016
Ángel M. Rayo
Who am I?
• Ángel M. Rayo
• twitter.com/oyara
• Technology Lead Expert at Netmind
• More than 9,000 hours of training experience
• Microsoft Certified Trainer since 2005
• MCDBA SQL 2000 – MCSA SQL 2014
Agenda
• Hadoop
• HDInsight
• SQL Server SSIS 2016
• References
HADOOP
Big Data with Hadoop in SQL Server SSIS 2016
Hadoop
• Distributed processing
• Large data sets
• Clusters of computers
• Simple programming models
Apache™ Hadoop®
Hadoop
• 2003 – Google File System
• 2004 – MapReduce
• 2006 – Hadoop 0.1.0
• 2011 – Hadoop 1.0
• 2015 – Hadoop 2.7
• August 25, 2016 – Hadoop 2.7.3
Hadoop – Components
• Hadoop Common
• Hadoop Distributed File System (HDFS™)
• Hadoop YARN
• Hadoop MapReduce
Apache™ Hadoop®
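The MapReduce component follows a map, shuffle, reduce pattern. The sketch below imitates that pattern in a single Python process with a word count. It is not Hadoop's API: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are illustrative assumptions, and a real cluster would distribute each phase across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data with hadoop", "hadoop and big clusters"]))
```

Because each `map_phase` call depends only on its own line and each `reduce_phase` call only on its own key, both steps can run in parallel, which is what makes the model a fit for clusters.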
Hadoop – Other components
Hadoop
HDINSIGHT
Big Data with Hadoop in SQL Server SSIS 2016
HDInsight
• Hadoop service hosted on Microsoft Azure
• Cluster management
• Framework designed for:
  • Management
  • Analysis
  • Reporting
• Uses the Hortonworks Data Platform (HDP) distribution
HDInsight
HDInsight – Usage
• Hadoop as a Service (HaaS)
• Build Big Data solutions and services
• Manage and monitor Hadoop clusters
• Analyze and generate statistics on:
  • Availability
  • Utilization
HDInsight – Creation
SQL SERVER SSIS 2016
Big Data with Hadoop in SQL Server SSIS 2016
SQL Server
• Does it really need an introduction? ;-)
• Microsoft's relational database management system
• 1989 – SQL Server 1.0
• June 1, 2016 – SQL Server 2016 (13.0)
SQL Server 2016 – Services and tools
Service Broker
Replication Services
Analysis Services
Reporting Services
Notification Services
Visual Studio
Integration Services
SQL Server Management Studio
Full Text Search Service
Business Intelligence Dev Studio
SQLCMD
SQL Server 2016 SSIS
• Platform for data integration and workflow applications
• Fast, flexible data warehousing tool
• ETL
  • Extraction
  • Transformation
  • Loading
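The three ETL stages can be illustrated outside SSIS with a small Python sketch. This is not how an SSIS package is built (packages are designed graphically). It only shows the extract, transform, load sequence, using an in-memory CSV as the source and SQLite as the destination. The `sales` table, the column names, and the sample data are assumptions made for the example.

```python
import csv
import io
import sqlite3


def extract(csv_text):
    """Extraction: read rows from a CSV source into dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transformation: tidy names, convert amounts, drop rows with no amount."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip incomplete records
        cleaned.append((row["customer"].strip().title(), float(row["amount"])))
    return cleaned


def load(rows, connection):
    """Loading: write the cleaned rows into the destination table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    connection.commit()


def run_etl(csv_text, connection):
    """Chain the three stages and return the total amount per customer."""
    load(transform(extract(csv_text)), connection)
    return connection.execute(
        "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    sample = "customer,amount\n ana ,10.5\nluis,\nana,4.5\n"
    print(run_etl(sample, sqlite3.connect(":memory:")))
```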
SQL Server 2016 SSIS
Data Transformation Services (DTS)
• Available from SQL Server 7.0 through SQL Server 2000
SQL Server Integration Services (SSIS)
• Available since SQL Server 2005
• .NET as the execution base
• Control Flow, Data Flow, Event Handlers, Package Explorer
SQL Server 2016 SSIS + Hadoop
From SQL Server 2016 onward
• Hadoop is included as a data source
• SSIS 2016 integrates with Big Data solutions
Up to SQL Server 2014
• Hadoop can be used through an ODBC connection
• Resources are accessed with PowerShell scripts
SQL Server 2016 SSIS + Hadoop
Connection manager
• WebHCat – REST API for Apache™ Hive
• WebHDFS – REST API for HDFS
SQL Server 2016 SSIS + Hadoop
Control flow tasks (Control Flow)
• Hadoop File System Task
• Hadoop Hive Task
• Hadoop Pig Task
SQL Server 2016 SSIS + Hadoop
Hadoop File System Task
How it works
• Gets, copies, or moves files
• Uses the cluster's REST API
• Accesses the HDFS store directly
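To give an idea of what the REST calls behind file operations look like, the sketch below builds WebHDFS URLs of the form `/webhdfs/v1/<path>?op=<OPERATION>`. It only constructs URLs and sends nothing. The host name, the user, and the paths are placeholders, and port 50070 is the usual NameNode HTTP default in Hadoop 2.x, which a given cluster may override. How the SSIS task issues its requests internally is not described in the slides, so treat this as an illustration of the API and not of the task's implementation.

```python
from urllib.parse import quote, urlencode

# HTTP verb that WebHDFS expects for a few common operations.
HTTP_METHOD = {
    "OPEN": "GET",         # read a file
    "LISTSTATUS": "GET",   # list a directory
    "CREATE": "PUT",       # write a file
    "RENAME": "PUT",       # move or rename a file
    "DELETE": "DELETE",    # remove a file or directory
}


def webhdfs_url(host, path, op, port=50070, user=None, **params):
    """Build a WebHDFS URL; `path` is an absolute HDFS path such as /data/in.csv."""
    query = {"op": op}
    if user:
        query["user.name"] = user
    query.update(params)
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    host = "namenode.example.com"  # placeholder host
    print(HTTP_METHOD["OPEN"], webhdfs_url(host, "/data/in/sales.csv", "OPEN", user="hdfs"))
    print(HTTP_METHOD["RENAME"], webhdfs_url(
        host, "/data/in/sales.csv", "RENAME", destination="/data/archive/sales.csv"))
```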
SQL Server 2016 SSIS + Hadoop
Hadoop File System Task
SQL Server 2016 SSIS + Hadoop
Hadoop Hive Task
How it works
• Submits HiveQL queries
• Uses the WebHCat REST API (aka Templeton)
• Jobs are processed through a queue system
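A HiveQL submission through WebHCat is an HTTP POST to the `templeton/v1/hive` resource, with the query in the `execute` form field and an HDFS directory for job output in `statusdir`. The sketch below only builds such a request object and does not send it. The host, user, query, and status directory are placeholders, and 50111 is the common WebHCat default port, which may differ on a given cluster.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_hive_request(host, query, status_dir, user, port=50111):
    """Build (but do not send) a WebHCat request that queues a HiveQL query."""
    url = f"http://{host}:{port}/templeton/v1/hive?{urlencode({'user.name': user})}"
    body = urlencode({"execute": query, "statusdir": status_dir}).encode("utf-8")
    return Request(url, data=body, method="POST")


if __name__ == "__main__":
    request = build_hive_request(
        "headnode.example.com",            # placeholder host
        "SELECT COUNT(*) FROM sales",      # placeholder query
        "/tmp/hive-status",                # placeholder HDFS status directory
        "analyst",                         # placeholder user
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the object to `urllib.request.urlopen` would send it. The server is expected to answer with a job identifier, not with query results, which matches the queue-based behaviour described above.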
SQL Server 2016 SSIS + Hadoop
Hadoop Hive Task
SQL Server 2016 SSIS + Hadoop
Hadoop Pig Task
How it works
• Submits Pig scripts
• Pig Latin + the WebHCat REST API (aka Templeton)
• Jobs are processed through a queue system
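Because submitted scripts are queued, a client normally submits the job and then polls its status until it finishes. The sketch below shows that pattern. `pig_form_body` builds the form fields for a POST to `templeton/v1/pig`, and `wait_for_job` polls until a terminal state is reached. `fetch_status` is a hypothetical caller-supplied function standing in for a GET on `templeton/v1/jobs/<job_id>`. The assumption that the response exposes the state under `status.state` should be checked against the cluster's WebHCat version.

```python
import time
from urllib.parse import urlencode

TERMINAL_STATES = {"SUCCEEDED", "FAILED", "KILLED"}


def pig_form_body(script, status_dir):
    """Form fields for a Pig submission: the Pig Latin text and a status directory."""
    return urlencode({"execute": script, "statusdir": status_dir})


def wait_for_job(fetch_status, job_id, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll a queued job until it reaches a terminal state or the poll budget runs out.

    fetch_status(job_id) must return the decoded JSON status document as a dict.
    """
    for _ in range(max_polls):
        state = fetch_status(job_id).get("status", {}).get("state")
        if state in TERMINAL_STATES:
            return state
        sleep(poll_seconds)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")


if __name__ == "__main__":
    # Simulated status responses, so the example runs without a cluster.
    states = iter(["PREP", "RUNNING", "SUCCEEDED"])
    fake_fetch = lambda job_id: {"status": {"state": next(states)}}
    print(pig_form_body("A = LOAD 'x';", "/tmp/pig"))
    print(wait_for_job(fake_fetch, "job_0001", sleep=lambda seconds: None))
```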
SQL Server 2016 SSIS + Hadoop
Hadoop Pig Task
SQL Server 2016 SSIS + Hadoop
Data flow components (Data Flow)
• HDFS File Source
• HDFS File Destination
SQL Server 2016 SSIS + Hadoop
HDFS File Source
SQL Server 2016 SSIS + Hadoop
HDFS File Destination
SQL Server 2016 SSIS + Hadoop
Other components
• Azure Feature Pack for Integration Services (SSIS)
• Components:
  • Azure connection managers
  • Tasks
  • Data flow components (Data Flow)
  • Azure Blob Enumerator
• https://www.microsoft.com/en-us/download/details.aspx?id=49492
SQL Server 2016 SSIS + Hadoop
Azure connection managers
• Azure Storage Connection Manager
• Azure Subscription Connection Manager
SQL Server 2016 SSIS + Hadoop
Tasks
• Azure HDInsight Hive Task
• Azure HDInsight Pig Task
• Azure HDInsight Create Cluster Task
• Azure HDInsight Delete Cluster Task
• Azure Blob Upload Task
• Azure Blob Download Task
SQL Server 2016 SSIS + Hadoop
Data Flow components
• Azure Blob Source
• Azure Blob Destination
SQL Server 2016 SSIS + Hadoop
Azure Blob Enumerator
References
Hadoop
• http://hadoop.apache.org/
HDInsight
• https://azure.microsoft.com/es-es/services/hdinsight/
SQL Server 2016
• http://www.microsoft.com/es-es/server-cloud/products/sql-server/default.aspx
Thanks
THANK YOU!
@oyara
@netmindIT
BIG Thanks to SQLSatMadrid Sponsors
4 Sponsor Sessions at 11:40
Don’t miss them, they might be giving away some awesome prizes!
HPE, SolidQ, KABEL, TSD Consulting
Also BIG Raffle prizes at the end of the event provided by: Plainconcepts, SolidQ, Kabel, TSD Consulting, Pyramid Analytics & sqlpass.es