Apache NiFi: Ingesting Enterprise Data at Scale

Post on 11-Apr-2017

© Hortonworks Inc. 2011–2017. All Rights Reserved

Timothy Spann, 2017. Future of Data – Princeton Meetup, hosted by TRAC Intermodal

Apache NiFi: Ingesting Enterprise Data @ Scale

DataWorks Summit / Hadoop Summit, June 13–15, 2017, San Jose McEnery Convention Center

dataworkssummit.com

Agenda

• Apache NiFi: RDBMS, EDI, JSON, CSV, Sensors

• EDI
– https://community.hortonworks.com/content/kbentry/59975/ingesting-edi-into-hdfs-using-hdf-20.html
– https://github.com/tspannhw/EnterpriseNIFI

HDF 2.1 – Data in Motion Platform

[Diagram. Labels: Flow Management; Flow Management + Stream Processing; Data in Motion; Data at Rest; IoT Data Sources; MiNiFi; NiFi; Kafka; Storm; Others…; AWS; Azure; Google Cloud; Hadoop. Enterprise Services: Ambari, Ranger, other services.]

Actionable Insights Architecture

[Diagram. Labels: Ingestion; Simple Event Processing; Engine; Complex Event Processing; Destination; Data Bus; Build Predictive Model From Historical Data; Deploy Predictive Model For Real-time Insights; Perishable Insights; Historical Insights.]

Actionable Intelligence Transforms Industrial, Transportation & Utilities

Data inputs and systems shown:
• Asset Data
• Customer Surveys
• Weather & Environmental
• Service Fleet GPS Data
• Smart Meter Streams
• Commodity Prices
• Social Media
• GIS Data
• SCADA
• Outage Histories
• CIS Records
• EDW

Use cases shown:
• Revenue Protection
• Single View of Customer
• Predictive Equipment Maintenance
• Conservation Voltage Reduction
• Commodity Trading

What is Apache NiFi?

• Created to address the challenges of global enterprise dataflow
• Key features:

– Visual Command and Control

– Data Lineage (Provenance)

– Data Prioritization

– Data Buffering/Back-Pressure

– Control Latency vs. Throughput

– Secure Control Plane/Data Plane

– Scale Out Clustering

– Extensibility
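
To make the buffering and back-pressure feature above concrete, here is a minimal simulation in plain Python. It is not NiFi code. It assumes a queue with an object-count threshold: once the queue reaches that threshold, the upstream source is simply not run until the consumer drains it. The names `BoundedConnection` and `run_flow`, and the tick-based scheduling, are hypothetical and exist only for illustration.

```python
from collections import deque


class BoundedConnection:
    """Toy queue with an object-count back-pressure threshold (illustrative only)."""

    def __init__(self, object_threshold):
        self.object_threshold = object_threshold
        self.queue = deque()

    def is_full(self):
        # Back-pressure applies once the queued object count reaches the threshold.
        return len(self.queue) >= self.object_threshold

    def offer(self, item):
        self.queue.append(item)

    def poll(self):
        return self.queue.popleft() if self.queue else None


def run_flow(source_items, connection, consume_every):
    """Simulate scheduling ticks.

    The source tries to emit one item per tick but is skipped while the
    connection is full. The slower consumer takes one item every
    `consume_every` ticks. Returns (delivered, skipped_ticks, max_depth).
    """
    pending = deque(source_items)
    delivered = []
    skipped_ticks = 0
    max_depth = 0
    tick = 0
    while pending or connection.queue:
        tick += 1
        if pending:
            if connection.is_full():
                skipped_ticks += 1  # source is held back, nothing is dropped
            else:
                connection.offer(pending.popleft())
                max_depth = max(max_depth, len(connection.queue))
        if tick % consume_every == 0:
            item = connection.poll()
            if item is not None:
                delivered.append(item)
    return delivered, skipped_ticks, max_depth


delivered, skipped_ticks, max_depth = run_flow(range(10), BoundedConnection(3), consume_every=2)
```

In this sketch every item is still delivered in order; the fast source just waits on some ticks, and the queue depth never exceeds the threshold.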

Apache NiFi

What is Apache NiFi used for?
• Reliable and secure transfer of data between systems
• Delivery of data from sources to analytic platforms
• Enrichment and preparation of data:
– Conversion between formats
– Extraction/Parsing
– Routing decisions

What is Apache NiFi NOT used for?
• Distributed Computation
• Complex Event Processing
• Complex Rolling Window Operations
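
The three enrichment steps named on this slide (conversion between formats, extraction/parsing, routing decisions) can be sketched in plain Python. This is an analogy, not NiFi configuration: in NiFi these steps are done by processors wired together in a flow. The function names, the sample sensor data, and the 100-degree alert rule below are all made up for illustration.

```python
import csv
import io
import json


def csv_to_json_records(csv_text):
    """Conversion between formats: each CSV row becomes one JSON document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row, sort_keys=True) for row in reader]


def extract_attributes(json_text, fields):
    """Extraction/parsing: pull selected fields out of the content."""
    record = json.loads(json_text)
    return {name: record[name] for name in fields if name in record}


def route(attributes, rules):
    """Routing decision: return the first relationship whose predicate matches."""
    for relationship, predicate in rules:
        if predicate(attributes):
            return relationship
    return "unmatched"


# Hypothetical sample input and routing rules.
SAMPLE_CSV = "sensor_id,temperature\ns1,71.5\ns2,104.2\n"

RULES = [
    ("alert", lambda attrs: float(attrs.get("temperature", "nan")) > 100.0),
    ("normal", lambda attrs: "temperature" in attrs),
]


def process(csv_text):
    """Run convert -> extract -> route and group sensor ids by relationship."""
    routed = {}
    for doc in csv_to_json_records(csv_text):
        attrs = extract_attributes(doc, ["sensor_id", "temperature"])
        routed.setdefault(route(attrs, RULES), []).append(attrs["sensor_id"])
    return routed


result = process(SAMPLE_CSV)
```

With the sample data, sensor `s1` lands in the `normal` group and `s2` in the `alert` group. Note that each record is handled independently, which fits the slide's point that windowed or cross-record computation belongs in a different tool.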

NiFi Terminology

FlowFile
• Unit of data moving through the system
• Content + Attributes (key/value pairs)

Processor
• Performs the work, can access FlowFiles

Connection
• Links between processors
• Queues that can be dynamically prioritized
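
The three terms above can be modeled in a few lines of Python. This is a toy data model under stated assumptions, not NiFi's actual classes: the `priority` attribute, the `uppercase_processor` function, and the `set_prioritizer` method are hypothetical names chosen to show content plus attributes, a unit of work, and a queue whose ordering can be changed while items are waiting.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class FlowFile:
    """Unit of data: content plus key/value attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


class Connection:
    """Queue between processors; a lower prioritizer value dequeues first."""

    def __init__(self, prioritizer):
        self._prioritizer = prioritizer
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps insertion order

    def put(self, flowfile):
        entry = (self._prioritizer(flowfile), next(self._counter), flowfile)
        heapq.heappush(self._heap, entry)

    def get(self):
        return heapq.heappop(self._heap)[2] if self._heap else None

    def set_prioritizer(self, prioritizer):
        """Re-order the already queued FlowFiles under a new prioritizer."""
        queued = [entry[2] for entry in sorted(self._heap, key=lambda e: e[:2])]
        self._prioritizer = prioritizer
        self._heap = [(prioritizer(ff), next(self._counter), ff) for ff in queued]
        heapq.heapify(self._heap)

    def __len__(self):
        return len(self._heap)


def uppercase_processor(flowfile):
    """A processor does the work: it reads a FlowFile and emits a new one."""
    attrs = dict(flowfile.attributes, processed="true")
    return FlowFile(flowfile.content.upper(), attrs)


def by_priority_attribute(flowfile):
    # Assumed convention for this sketch: smaller number means more urgent.
    return int(flowfile.attributes.get("priority", "9"))


def by_content_size(flowfile):
    return len(flowfile.content)


conn = Connection(by_priority_attribute)
conn.put(FlowFile(b"bulk", {"priority": "5"}))
conn.put(FlowFile(b"urgent", {"priority": "1"}))
out = uppercase_processor(conn.get())
```

Here the `urgent` FlowFile is dequeued first even though it was queued second. Calling `conn.set_prioritizer(by_content_size)` would re-order whatever is still waiting, which is the "dynamically prioritized" idea in miniature.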

Contact:

Timothy Spann @PaaSDeV

www.meetup.com/futureofdata-princeton

community.hortonworks.com/users/9304/tspann.html

Hortonworks Community Connection

Read access for everyone, join to participate and be recognized

• Full Q&A Platform (like StackOverflow)

• Knowledge Base Articles

• Code Samples and Repositories

Community Engagement

Participate now at: community.hortonworks.com

4,000+ Registered Users

10,000+ Answers

15,000+ Technical Assets

One Website!
