London Apache Kafka Meetup (Jan 2017)
TRANSCRIPT
![Page 1: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/1.jpg)
Delivering Fast Data Systems with Kafka
LANDOOP
www.landoop.com
Antonios Chalkiopoulos 18/1/2017
![Page 2: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/2.jpg)
@chalkiopoulos
Open Source contributor
Big Data projects in Media, Betting, Retail and Investment Banks in London
Book author, Programming MapReduce with Scalding
Founder of Landoop
![Page 3: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/3.jpg)
DevOps Big Data Scala
Automation Distributed Systems Monitoring
Hadoop Fast Data / Streams Kafka
![Page 4: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/4.jpg)
KAFKA CONNECT
a bit of context
![Page 5: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/5.jpg)
KAFKA CONNECT
“a common framework for allowing stream data flow
between kafka and other systems”
![Page 6: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/6.jpg)
Data is produced from a source and consumed to a sink.
Data Source -> Kafka Connect -> KAFKA -> Kafka Connect -> Data Sink
Stream processing
![Page 7: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/7.jpg)
Data Source -> Kafka Connect -> KAFKA -> Kafka Connect -> Data Sink
Stream processing
E T L
![Page 8: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/8.jpg)
Developers don’t care about:
- Move data to/from sink/source
- Support delivery semantics
- Offset Management
- Serialization / de-serialization
- Partitioning / Scalability
- Fault tolerance / fail-over
- Schema Registry integration
Developers care about:
Domain specific transformations
![Page 9: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/9.jpg)
CONNECTORS
Kafka Connect’s framework allows developers to create connectors that copy data to/from other systems just by writing configuration files and
submitting them to Connect with no code necessary
![Page 10: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/10.jpg)
Connector configurations are key-value mappings:
- name: connector’s unique name
- connector.class: connector’s Java class
- tasks.max: maximum tasks to create
- topics: list of topics (to source or sink data)
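As a rough illustration of how these key-value mappings could be assembled and submitted, the sketch below builds the JSON body that the Kafka Connect REST API accepts on `POST /connectors` (a `name` plus a `config` map of string values). The helper name `build_connector_request`, the connector class and the connector name are hypothetical, chosen only for this example.

```python
import json


def build_connector_request(name, connector_class, topics, tasks_max=1, extra=None):
    """Build the JSON body for creating a connector via POST /connectors.

    Hypothetical helper for illustration; Connect expects config values as strings.
    """
    config = {
        "connector.class": connector_class,
        "tasks.max": str(tasks_max),
        "topics": ",".join(topics),
    }
    config.update(extra or {})  # connector-specific keys, if any
    return {"name": name, "config": config}


request = build_connector_request(
    name="example-sink",  # hypothetical connector name
    connector_class="com.example.ExampleSinkConnector",  # hypothetical class
    topics=["TransactionTopic", "AuditsTopic"],
    tasks_max=2,
)

# Print the payload; nothing is sent over the network in this sketch.
print(json.dumps(request, indent=2))
```

In a distributed Connect setup, a payload like this would typically be posted to the worker's REST port; no request is made here so the sketch stays runnable offline.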
![Page 11: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/11.jpg)
Introducing a query language for the connectors:
- name: connector’s unique name
- connector.class: connector’s Java class
- tasks.max: maximum tasks to create
- topics: list of topics (to source or sink data)
- query: KCQL query that specifies fields/actions for the target system
![Page 12: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/12.jpg)
KCQL: Kafka Connect Query Language
is a SQL-like syntax allowing streamlined configuration of Kafka Sink Connectors, and then some more.
Example:
Project fields, rename or ignore them and further customise in plain text
INSERT INTO transactions SELECT field1 AS column1, field2 AS column2, field3 FROM TransactionTopic;
INSERT INTO audits SELECT * FROM AuditsTopic;
INSERT INTO logs SELECT * FROM LogsTopic AUTOEVOLVE;
INSERT INTO invoices SELECT * FROM InvoiceTopic PK invoiceID;
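To make the projection and renaming in the first statement concrete, here is a minimal sketch of what `SELECT field1 AS column1, field2 AS column2, field3` does to a single record. This is not the KCQL implementation: the `project` function and the list-of-pairs representation of the SELECT clause are assumptions made for illustration, and the sample record is invented.

```python
def project(record, fields):
    """Keep only the selected fields, renaming those that carry an alias.

    `fields` is a list of (source_field, alias_or_None) pairs, a hypothetical
    stand-in for a parsed SELECT clause.
    """
    return {alias or name: record[name] for name, alias in fields}


# Invented sample record; field4 is not selected, so it is dropped.
record = {"field1": 10, "field2": "GBP", "field3": True, "field4": "dropped"}

# SELECT field1 AS column1, field2 AS column2, field3
select = [("field1", "column1"), ("field2", "column2"), ("field3", None)]

row = project(record, select)
print(row)  # {'column1': 10, 'column2': 'GBP', 'field3': True}
```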
![Page 13: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/13.jpg)
KCQL: What does it look like?
So while integrating Kafka with in-memory data grids, key-value stores, document stores, NoSQL, search and other systems:
INSERT INTO $TARGET
SELECT * | columns (i.e. col1, col2 | col1 AS column1, col2)
FROM $TOPIC_NAME
  [ IGNORE columns ]
  [ AUTOCREATE ]
  [ PK columns ]
  [ AUTOEVOLVE ]
  [ BATCH = N ]
  [ CAPITALIZE ]
  [ INITIALIZE ]
  [ PARTITIONBY cola[,colb] ]
  [ DISTRIBUTEBY cola[,colb] ]
  [ CLUSTERBY cola[,colb] ]
  [ TIMESTAMP cola|sys_current ]
  [ STOREAS $YOUR_TYPE([key=value, .....]) ]
  [ WITHFORMAT TEXT|AVRO|JSON|BINARY|OBJECT|MAP ]
![Page 14: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/14.jpg)
Why KCQL?
- Topic to target mapping
- Field selection
- Auto creation
- Auto evolution
- Error policies
- Multiple KCQLs / topic
- Field extraction
- Access to Key & Metadata
![Page 15: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/15.jpg)
KCQL: Advanced Features Examples
![Page 16: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/16.jpg)
KCQL
Example Kafka topic with IoT data:
{ "sensor_id": "01", "temperature": 52.7943, "ts": 1484648810 }
{ "sensor_id": "02", "temperature": 28.8597, "ts": 1484648810 }
INSERT INTO sensor_ringbuffer SELECT sensor_id, temperature, ts FROM coap_sensor_topic WITHFORMAT JSON STOREAS RING_BUFFER
INSERT INTO sensor_reliabletopic SELECT sensor_id, temperature, ts FROM coap_sensor_topic WITHFORMAT AVRO STOREAS RELIABLE_TOPIC
![Page 17: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/17.jpg)
INSERT INTO FXSortedSet SELECT symbol, price FROM yahooFX-topic STOREAS SortedSet(score=ts)
SELECT price FROM yahooFX-topic PK symbol STOREAS SortedSet(score=ts)
KCQL
Example Kafka topic with FX data:
{ "symbol": "USDGBP", "price": 0.7943, "ts": 1484648810 }
{ "symbol": "EURGBP", "price": 0.8597, "ts": 1484648810 }
B:1 A:2 D:3 C:20
Sorted Set -> { value : score }
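The `{ value : score }` idea can be sketched with a tiny in-memory model, using the `B:1 A:2 D:3 C:20` entries from the slide. This is only an illustration of sorted-set ordering, not the connector's or the target store's implementation; the `SortedSet` class below is hypothetical, and breaking ties by value is an assumption.

```python
class SortedSet:
    """Toy sorted set: unique values, each with a numeric score."""

    def __init__(self):
        self._scores = {}

    def add(self, value, score):
        # Re-adding an existing value simply updates its score.
        self._scores[value] = score

    def range(self):
        # Order by score; ties are broken by value (an assumption of this sketch).
        return sorted(self._scores.items(), key=lambda kv: (kv[1], kv[0]))


s = SortedSet()
for value, score in [("C", 20), ("A", 2), ("B", 1), ("D", 3)]:
    s.add(value, score)

print(s.range())  # [('B', 1), ('A', 2), ('D', 3), ('C', 20)]
```

With `STOREAS SortedSet(score=ts)`, the record's `ts` field would play the role of the score, so entries come back in time order.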
![Page 18: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/18.jpg)
Stream reactor connectors support KCQL
- kafka-connect-blockchain
- kafka-connect-bloomberg
- kafka-connect-cassandra
- kafka-connect-coap
- kafka-connect-druid
- kafka-connect-elastic
- kafka-connect-ftp
- kafka-connect-hazelcast
- kafka-connect-hbase
- kafka-connect-influxdb
- kafka-connect-jms
- kafka-connect-kudu
- kafka-connect-mongodb
- kafka-connect-mqtt
- kafka-connect-redis
- kafka-connect-rethink
- kafka-connect-voltdb
- kafka-connect-yahoo
Source: https://github.com/datamountaineer/stream-reactor
Integration Tests: http://coyote.landoop.com/connect/
![Page 19: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/19.jpg)
DEMO: Kafka Connect InfluxDB
We’ll need:
• Zookeeper
• Kafka Broker
• Schema Registry
• Kafka Connect Distributed
• Kafka REST Proxy
We’ll also use:
• StreamReactor connectors
• Landoop Fast Data Web Tools
docker run --rm -it \
  -p 2181:2181 -p 3030:3030 -p 8081:8081 \
  -p 8082:8082 -p 8083:8083 -p 9092:9092 \
  -e ADV_HOST=192.168.99.100 \
  landoop/fast-data-dev
We’ll generate some Avro messages:
case class DeviceMeasurements(
  deviceId: Int,
  temperature: Int,
  moreData: String,
  timestamp: Long)
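As a hedged sketch of what such test data might look like, the snippet below writes out an Avro record schema mirroring the `DeviceMeasurements` fields and generates a few sample records as plain dictionaries. It does not perform Avro binary encoding or talk to the Schema Registry; the value ranges, the `sample-N` strings and the millisecond timestamp unit are assumptions made for this example.

```python
import json
import random
import time

# Avro record schema mirroring the DeviceMeasurements case class.
SCHEMA = {
    "type": "record",
    "name": "DeviceMeasurements",
    "fields": [
        {"name": "deviceId", "type": "int"},
        {"name": "temperature", "type": "int"},
        {"name": "moreData", "type": "string"},
        {"name": "timestamp", "type": "long"},
    ],
}


def generate(n, seed=0):
    """Return n sample records shaped like the schema (values are invented)."""
    rng = random.Random(seed)
    now_ms = int(time.time() * 1000)  # assumed unit: epoch milliseconds
    return [
        {
            "deviceId": rng.randint(1, 10),
            "temperature": rng.randint(-10, 40),
            "moreData": f"sample-{i}",
            "timestamp": now_ms + i,
        }
        for i in range(n)
    ]


messages = generate(3)
print(json.dumps(SCHEMA))
for message in messages:
    print(json.dumps(message))
```

An Avro-aware producer would take the schema and records from here; that step is left out because it depends on the client library in use.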
![Page 20: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/20.jpg)
DEMO: Kafka Development Environment @ fast-data-dev Docker image
https://hub.docker.com/r/landoop/fast-data-dev/
![Page 21: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/21.jpg)
DEMO: Integration testing with Coyote for connectors & infrastructure
https://github.com/Landoop/coyote
![Page 22: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/22.jpg)
Schema Registry UI
https://github.com/Landoop/schema-registry-ui
![Page 23: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/23.jpg)
Kafka Topics UI
https://github.com/Landoop/kafka-topics-ui
![Page 24: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/24.jpg)
Kafka Connect UI
https://github.com/Landoop/kafka-connect-ui
![Page 25: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/25.jpg)
Connectors Performance
![Page 26: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/26.jpg)
Monitoring & Alerting
via JMX
![Page 27: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/27.jpg)
Deployment apps
Containers: Mesos - Kubernetes
Hadoop integration
* stateless apps = container-friendly: Schema Registry, Kafka Connect
How do I IT?
Available features:
- Kafka ecosystem
- StreamReactor Connectors
- Landoop web tools
- Monitoring & Alerting
- Security features
![Page 28: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/28.jpg)
![Page 29: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/29.jpg)
![Page 30: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/30.jpg)
Wrap up
- KCQL
- Connectors
- Kafka Web Tools
- Automation & Integrations
![Page 31: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/31.jpg)
Coming up
- Kafka backend enhanced UIs | Timetravel
![Page 32: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/32.jpg)
$ locate
https://github.com/Landoop
https://hub.docker.com/r/landoop/
https://github.com/datamountaineer/stream-reactor
http://www.landoop.com
![Page 33: London Apache Kafka Meetup (Jan 2017)](https://reader033.vdocuments.site/reader033/viewer/2022042706/58870f351a28abf2228b5cc9/html5/thumbnails/33.jpg)
Thank you ;)