Design a Dataflow in 7 Minutes with Apache NiFi/HDF
TRANSCRIPT
© Hortonworks Inc. 2011 – 2016. All Rights Reserved
Create a live dataflow in minutes
How would that change your business?
Add a processor for data intake. Time: 1 minute
1. Drag and drop a processor from the top menu
Choose the specific processor
2. Choose one of the processors (currently 170+ available)
Example: Pick Twitter Processor
Configure the processor. Time: 2 minutes
3. Select the processor and choose the option to Configure
4. Adjust parameters as required
Add another processor for data output. Time: 1 minute
5. Drag and drop a processor from the top menu
6. Filter for and select a “Put” processor
Configure the second processor. Time: 1 minute
7. Configure the second processor
Connect processors, configure connection. Time: 2 minutes
8. Configure the connection
Note: The sample flow shown here uses PutFile rather than the PutHDFS processor from the previous example. The same concepts apply.
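Steps 1 to 8 can be summarized as a small data model: two processors, each with its own properties, joined by one connection. The sketch below is only an illustration in plain Python. It is not NiFi's API; the class names `Processor` and `Connection`, the `success` relationship choice, and the `/tmp/tweets` directory are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Processor:
    """Hypothetical stand-in for a processor dropped on the canvas."""
    name: str
    type: str
    properties: dict = field(default_factory=dict)


@dataclass
class Connection:
    """Hypothetical stand-in for the link drawn between two processors."""
    source: Processor
    destination: Processor
    relationships: tuple = ("success",)  # assumed relationship name


def build_sample_flow():
    # Steps 1-4: add and configure the intake processor.
    intake = Processor("intake", "GetTwitter")
    # Steps 5-7: add and configure the output processor.
    # The directory value is an illustrative assumption.
    output = Processor("output", "PutFile", {"Directory": "/tmp/tweets"})
    # Step 8: connect the two processors.
    link = Connection(intake, output)
    return intake, output, link


intake, output, link = build_sample_flow()
print(f"{link.source.type} -> {link.destination.type} via {link.relationships}")
```

Swapping PutFile for PutHDFS in this model would change only the `type` and `properties` of the second processor, which is the sense in which the same concepts apply.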
Click Start to Begin Processing. Total time: 7 minutes
9. Click Start (the “play” button) to begin processing. The flow runs continuously until you select Stop.
See Processors Update with Real-Time Changes
10. As data flows, the GUI updates in real time.
11. If the destination is stopped or unable to receive data, the queue builds up.
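The queue behavior in step 11 can be illustrated with a toy simulation. This is a sketch under simplifying assumptions, not a model of NiFi internals: the function `simulate` and its parameters are hypothetical, and time advances in fixed ticks with a constant production rate.

```python
from collections import deque


def simulate(ticks, produced_per_tick, consumed_per_tick, destination_running):
    """Return the queue depth after each tick of a two-processor flow."""
    queue = deque()
    depths = []
    next_id = 0
    for _ in range(ticks):
        # The source processor keeps emitting items into the connection queue.
        for _ in range(produced_per_tick):
            queue.append(next_id)
            next_id += 1
        # The destination drains the queue only while it is running.
        if destination_running:
            for _ in range(min(consumed_per_tick, len(queue))):
                queue.popleft()
        depths.append(len(queue))
    return depths


print(simulate(5, 3, 3, destination_running=True))   # [0, 0, 0, 0, 0]
print(simulate(5, 3, 3, destination_running=False))  # [3, 6, 9, 12, 15]
```

With the destination running, the queue stays empty; with it stopped, the depth grows by the production rate on every tick, which is the buildup the GUI would show on the connection.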
Dynamically adjust and tune data flow as needed
12. Dynamically configure, start, stop, tune, reroute, change, or pause dataflows as needed.
Powerful Tools to Quickly Replicate, Group, Repurpose, Tune and Test in Real-Time
13. Group multiple processes together to create a process group
14. Create a new template
Provenance Means Real-Time Traceability of:
Data Flow
Data Content
Data Context
Watch Real Time Flow of Data: Data Provenance
15. Select Data Provenance
Trace Lineage of a Particular Piece of Data
16. Icon for Data Lineage
Every Change to Data is Tracked in Real-Time: processing, views
17. Every event is traceable
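One way to picture "every event is traceable" is as an append-only event log that can be filtered down to the history of a single piece of data. The sketch below is a toy model only: the `ProvenanceEvent` record, the `lineage` helper, the identifiers such as `ff-1`, and the pairing of event types with processors are all assumptions made for illustration, not NiFi's provenance API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceEvent:
    """Hypothetical record of one thing that happened to one piece of data."""
    timestamp: int
    flowfile_id: str
    event_type: str
    component: str


def lineage(events, flowfile_id):
    """Return the events for one piece of data, ordered by time."""
    matching = (e for e in events if e.flowfile_id == flowfile_id)
    return sorted(matching, key=lambda e: e.timestamp)


# Illustrative log, deliberately out of order and mixing two pieces of data.
events = [
    ProvenanceEvent(3, "ff-1", "SEND", "PutFile"),
    ProvenanceEvent(1, "ff-1", "RECEIVE", "GetTwitter"),
    ProvenanceEvent(2, "ff-2", "RECEIVE", "GetTwitter"),
    ProvenanceEvent(4, "ff-1", "DROP", "PutFile"),
]

for event in lineage(events, "ff-1"):
    print(event.timestamp, event.event_type, event.component)
```

Filtering by identifier and sorting by time is the essence of a lineage view: the same log answers both "what happened overall" and "what happened to this item".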
Real-Time Updates of Dataflow: Traceable Context & Content
18. Know immediately both context and content
Easily access and trace changes to dataflow
Audit trail of Hortonworks DataFlow User Actions
Questions?
Hortonworks Community Connection: Data Ingestion and Streaming
https://community.hortonworks.com/