Spring Batch Performance Tuning

© 2014 SpringOne 2GX. All rights reserved. Do not distribute without permission. Spring Batch Performance Tuning By Chris Schaefer & Gunnar Hillert

Upload: spring-io

Post on 14-Jun-2015


Category:

Software



DESCRIPTION

Speakers: Gunnar Hillert, Chris Schaefer
Data / Integration Track

In this presentation we will examine various scalability options in order to improve the robustness and performance of your Spring Batch applications. We start out with a single-threaded Spring Batch application that we will refactor so we can demonstrate how to run it using:

• Concurrent Steps
• Remote Chunking
• AsyncItemProcessor and AsyncItemWriter
• Remote Partitioning

Additionally, we will show how you can deploy Spring Batch applications to Spring XD, which provides high availability and failover capabilities. Spring XD also allows you to integrate Spring Batch applications with other Big Data processing needs.

TRANSCRIPT

Page 1: Spring Batch Performance Tuning


Spring Batch Performance Tuning
By Chris Schaefer & Gunnar Hillert

Page 2: Spring Batch Performance Tuning

Agenda

• Spring Batch
• Spring Integration
• Spring Batch Integration
• Scaling Spring Batch
• Spring XD

Page 3: Spring Batch Performance Tuning

Spring Batch


http://projects.spring.io/spring-batch/

Page 4: Spring Batch Performance Tuning

“Batch processing ... is defined as the processing of data without interaction or interruption.”

Michael T. Minella, Pro Spring Batch

Page 5: Spring Batch Performance Tuning

Batch Jobs

• Generally long-running
• Non-interactive
  o Often include logic for handling errors and restartability options
• Process large volumes of data
  o More than what may fit in memory or a single transaction

Page 6: Spring Batch Performance Tuning

Batch and offline processing

• Close of business processing
  o Order processing, business reporting, account reconciliation, payroll
• Import / export handling
  o a.k.a. ETL jobs (Extract-Transform-Load)
  o Data warehouse synchronization
• Large-scale output jobs
  o Loyalty program emails, bank statements
• Hadoop job orchestration

Page 7: Spring Batch Performance Tuning

Features

• Transaction management
• Chunk-based processing
• Schema and Java Config support
• Annotations for callback-type scenarios such as Listeners
• Start/Restart/Skip capabilities
• Based on the Spring framework
• JSR 352: Batch Applications for the Java Platform

Page 8: Spring Batch Performance Tuning

Concepts

• Job
• Step
• Chunk
• Item

Repeat | Retry | Skip | Restart

Page 9: Spring Batch Performance Tuning

Chunk-Oriented Processing

• Read items one at a time, optionally process them, and write them out together as a “chunk” within a single transaction boundary.
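The read, process, write cycle described on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: `ChunkLoopSketch` and its `run` method are hypothetical names, the reader, processor, and writer are ordinary JDK functional types rather than Spring Batch's `ItemReader`, `ItemProcessor`, and `ItemWriter`, and the transaction boundary is marked only by comments.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

// Plain-JDK illustration of chunk-oriented processing.
// None of these types are Spring Batch classes.
final class ChunkLoopSketch {

    // Reads up to chunkSize items, processes each one, and hands the
    // processed items to the writer as a single chunk. A null result from
    // the processor filters the item out. Returns the number of chunks.
    static <I, O> int run(Iterator<I> reader,
                          Function<I, O> processor,
                          Consumer<List<O>> writer,
                          int chunkSize) {
        int chunks = 0;
        while (reader.hasNext()) {
            // A framework would begin a transaction here ...
            List<I> inputs = new ArrayList<>();
            while (inputs.size() < chunkSize && reader.hasNext()) {
                inputs.add(reader.next());
            }
            List<O> outputs = new ArrayList<>();
            for (I item : inputs) {
                O result = processor.apply(item);
                if (result != null) {
                    outputs.add(result);
                }
            }
            writer.accept(outputs);
            // ... and commit here, once per chunk.
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> written = new ArrayList<>();
        int chunks = ChunkLoopSketch.<Integer, String>run(
                List.of(1, 2, 3, 4, 5).iterator(),
                n -> "item-" + n,
                chunk -> written.addAll(chunk),
                2);
        System.out.println(chunks + " chunks, wrote " + written);
    }
}
```

The chunk size is the main tuning knob in this model: a larger chunk means fewer commits but more work to redo if a chunk fails.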

Page 10: Spring Batch Performance Tuning

JobLauncher


Page 11: Spring Batch Performance Tuning

ItemReaders and ItemWriters

• Flat File
• XML (StAX)
• Multi-File Input
• Database
  o JDBC, JPA/Hibernate, Stored Procedures, Spring Data
• JMS
• AMQP
• Email
• Implement your own...

Page 12: Spring Batch Performance Tuning

Simple File Load Job


Page 13: Spring Batch Performance Tuning

Job Repository


Page 14: Spring Batch Performance Tuning

Spring Integration


http://projects.spring.io/spring-integration/

Page 15: Spring Batch Performance Tuning

Integration Styles

• File Transfer
• Shared Database
• Remoting
• Messaging

Page 16: Spring Batch Performance Tuning

Integration Styles

• Business to Business Integration (B2B)
• Inter Application Integration (EAI)
• Intra Application Integration

[Diagram labels: JVM, JVM, EAI, External Business Partner, B2B, Core Messaging]

Page 17: Spring Batch Performance Tuning

Common Patterns

Retrieve → Parse → Transform → Transmit

Page 18: Spring Batch Performance Tuning

Enterprise Integration Patterns

• By Gregor Hohpe & Bobby Woolf
• Published 2003
• Collection of well-known patterns
• Icon library provided

http://www.eaipatterns.com/eaipatterns.html

Page 19: Spring Batch Performance Tuning

“Spring Integration provides an extension of the Spring programming model to support the well-known enterprise integration patterns.”

Spring Integration Website

Page 20: Spring Batch Performance Tuning

Adapters

• AMQP/RabbitMQ
• AWS
• File/Resource
• FTP/FTPS/SFTP
• GemFire
• HTTP (REST)
• JDBC
• JMS
• JMX
• JPA
• MongoDB
• POP3/IMAP/SMTP
• Print
• Redis
• RMI
• RSS/Atom
• SMB
• Splunk
• Spring ApplicationEvents
• Stored Procedures
• TCP/UDP
• Twitter
• Web Services
• XMPP
• XPath
• XQuery
• Custom Adapters

Page 21: Spring Batch Performance Tuning

Samples

• https://github.com/spring-projects/spring-integration-samples
• Contains 50 Samples and Applications
• Several Categories:
  o Basic
  o Intermediate
  o Advanced
  o Applications

Page 22: Spring Batch Performance Tuning

Spring Batch Integration


Page 23: Spring Batch Performance Tuning

Launching batch jobs through messages

• Event-driven execution of the JobLauncher
• Spring Integration retrieves the data (e.g. file system, FTP, ...)
• Easy to support separate input sources simultaneously

[Diagram labels: FTP, Inbound Channel Adapter, Transformer (File to JobLaunchRequest), JobLauncher]

Page 24: Spring Batch Performance Tuning

JobLaunchRequest

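import java.io.File;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.integration.launch.JobLaunchRequest;
import org.springframework.integration.annotation.Transformer;
import org.springframework.messaging.Message;

public class FileMessageToJobRequest {
    private Job job;
    private String fileParameterName;
    ...

    // Turns an incoming file message into a request to launch the job,
    // passing the file's absolute path as a job parameter.
    @Transformer
    public JobLaunchRequest toRequest(Message<File> message) {
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        jobParametersBuilder.addString(fileParameterName,
                message.getPayload().getAbsolutePath());
        return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
    }
}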

Page 25: Spring Batch Performance Tuning

JobLaunchRequest

<batch-int:job-launching-gateway request-channel="requestChannel"
    reply-channel="replyChannel"
    job-launcher="jobLauncher"/>

Page 26: Spring Batch Performance Tuning

Get feedback with informational messages

• Spring Batch provides support for listeners:
  o StepExecutionListener
  o ChunkListener
  o JobExecutionListener

Page 27: Spring Batch Performance Tuning

Get feedback with informational messages

<batch:job id="importPayments">
    ...
    <batch:listeners>
        <batch:listener ref="notificationExecutionsListener"/>
    </batch:listeners>
</batch:job>

<int:gateway id="notificationExecutionsListener"
    service-interface="org.springframework.batch.core.JobExecutionListener"
    default-request-channel="jobExecutions"/>

Page 28: Spring Batch Performance Tuning

A demo of launching jobs via messages and of informational messages follows in the next section.

Page 29: Spring Batch Performance Tuning

Scaling Spring Batch


Page 30: Spring Batch Performance Tuning

Scaling and externalizing batch process execution

• Uses Spring Integration for multi-process communication
• Distribute complex processing
• Single process
  o Multi-threaded steps
  o Parallel steps
  o Local partitioning
• Multi process
  o Remote chunking
  o Remote partitioning
• Asynchronous item processing support
  o AsyncItemProcessor
  o AsyncItemWriter

Page 31: Spring Batch Performance Tuning

Single Thread

[Diagram: Gateway, Reader, Processor, Writer; labels: Input, Item, Result, Output]

Page 32: Spring Batch Performance Tuning

Single Thread - Demo


Page 33: Spring Batch Performance Tuning

Multi-threaded

[Diagram: Gateway, Reader, Processor, Writer; labels: Input, Item, Result, Output]

• Simply add a TaskExecutor to your Tasklet configuration
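What adding a TaskExecutor changes can be sketched with a plain JDK thread pool: several threads run the same read, process, write loop over one shared reader. This is a minimal sketch under assumptions, not Spring Batch code. `MultiThreadedStepSketch`, `readChunk`, and `run` are hypothetical names, and the `synchronized` block stands in for whatever makes the reader thread-safe in a real job.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Plain-JDK illustration of a multi-threaded step; not Spring Batch code.
final class MultiThreadedStepSketch {

    // Several threads share one reader, so reading a chunk is synchronized.
    static <I> List<I> readChunk(Iterator<I> reader, int chunkSize) {
        List<I> chunk = new ArrayList<>();
        synchronized (reader) {
            while (chunk.size() < chunkSize && reader.hasNext()) {
                chunk.add(reader.next());
            }
        }
        return chunk;
    }

    // Runs the read-process-write loop on `threads` workers. Chunks finish
    // in no guaranteed order, so the output order is not deterministic.
    static <I, O> List<O> run(Iterator<I> reader, Function<I, O> processor,
                              int chunkSize, int threads) {
        ConcurrentLinkedQueue<O> written = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                List<I> chunk;
                while (!(chunk = readChunk(reader, chunkSize)).isEmpty()) {
                    for (I item : chunk) {
                        written.add(processor.apply(item));
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new ArrayList<>(written);
    }

    public static void main(String[] args) {
        List<Integer> squares = MultiThreadedStepSketch.<Integer, Integer>run(
                List.of(1, 2, 3, 4, 5, 6).iterator(), n -> n * n, 2, 3);
        System.out.println(squares.size() + " items written: " + squares);
    }
}
```

The sketch makes two trade-offs visible: the shared reader becomes a point of contention, and item order is no longer preserved, which also makes restarting from a known position harder.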

Page 34: Spring Batch Performance Tuning

Multi-Threaded - Demo


Page 35: Spring Batch Performance Tuning

Asynchronous Processors

• AsyncItemProcessor
  o Dispatches the ItemProcessor logic on a new thread, returning a Future to the AsyncItemWriter
• AsyncItemWriter
  o Writes the processed items after processing is complete
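The hand-off between the two classes can be sketched with a plain `ExecutorService` and `Future`. This is an illustration of the pattern only: `AsyncProcessingSketch`, `processAsync`, and `writeAsync` are hypothetical names, and the real AsyncItemProcessor and AsyncItemWriter wrap delegate Spring Batch components rather than JDK functions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Plain-JDK illustration of the async processor/writer pairing.
final class AsyncProcessingSketch {

    // "Processor" half: submit each item's processing to the pool and
    // return immediately with one Future per item.
    static <I, O> List<Future<O>> processAsync(List<I> chunk,
                                               Function<I, O> delegate,
                                               ExecutorService pool) {
        List<Future<O>> futures = new ArrayList<>();
        for (I item : chunk) {
            Callable<O> task = () -> delegate.apply(item);
            futures.add(pool.submit(task));
        }
        return futures;
    }

    // "Writer" half: unwrap each Future, blocking until it completes,
    // and collect the results in the original item order.
    static <O> List<O> writeAsync(List<Future<O>> futures) {
        List<O> written = new ArrayList<>();
        for (Future<O> future : futures) {
            try {
                written.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("processing failed", e.getCause());
            }
        }
        return written;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<String>> futures = AsyncProcessingSketch.<String, String>processAsync(
                    List.of("a", "b", "c"), s -> s.toUpperCase(), pool);
            System.out.println(writeAsync(futures));
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the writer walks the futures in submission order, the output keeps the order of the input even though processing runs in parallel. This suits jobs where the processor is the bottleneck and the reader and writer are cheap.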

Page 36: Spring Batch Performance Tuning

Asynchronous Processors - Demo


Page 37: Spring Batch Performance Tuning

Remote Chunking

[Diagram: Step 1, Step 2, and Step 3 in sequence. Steps 1 and 3 each have an ItemReader, ItemProcessor, and ItemWriter. Step 2 has an ItemReader and ItemWriter and fans out to Steps 2a, 2b, and 2c, each with an ItemReader, ItemProcessor, and ItemWriter.]
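The division of labour in remote chunking (one side reads and ships chunks, the workers process and write them) can be sketched in a single JVM. The assumptions here are substantial: a `BlockingQueue` stands in for the messaging middleware, threads stand in for remote processes, and `RemoteChunkingSketch` and `run` are hypothetical names. A real setup also has to deal with reply handling and failures, which this sketch leaves out.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// In-process illustration of remote chunking: a queue stands in for the
// message channel and threads stand in for the remote workers.
final class RemoteChunkingSketch {

    static <I, O> List<O> run(Iterator<I> reader, Function<I, O> processor,
                              int chunkSize, int workers) {
        BlockingQueue<List<I>> requests = new LinkedBlockingQueue<>();
        ConcurrentLinkedQueue<O> written = new ConcurrentLinkedQueue<>();

        List<Thread> pool = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread worker = new Thread(() -> {
                try {
                    // Worker side: receive a chunk, then process and write it.
                    // An empty chunk is the stop signal.
                    for (List<I> chunk = requests.take(); !chunk.isEmpty(); chunk = requests.take()) {
                        for (I item : chunk) {
                            written.add(processor.apply(item));
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            pool.add(worker);
        }

        // Sending side: only reads, and ships each chunk to the workers.
        while (reader.hasNext()) {
            List<I> chunk = new ArrayList<>();
            while (chunk.size() < chunkSize && reader.hasNext()) {
                chunk.add(reader.next());
            }
            requests.add(chunk);
        }
        for (int w = 0; w < workers; w++) {
            requests.add(new ArrayList<>());   // one stop signal per worker
        }
        for (Thread worker : pool) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return new ArrayList<>(written);
    }
}
```

Since every item travels over the channel, this approach pays off mainly when processing costs far more than reading and transporting the items.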

Page 38: Spring Batch Performance Tuning

Remote Chunking - Demo


Page 39: Spring Batch Performance Tuning

Remote Partitioning

[Diagram: Step 1, a Master step with a Partitioner, and Step 3 in sequence. The Master delegates to Slave 1, Slave 2, and Slave 3. Step 1, Step 3, and each slave have an ItemReader, ItemProcessor, and ItemWriter.]
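Partitioning differs from remote chunking in that only a description of the work is handed out, and each worker reads its own data. The following sketch shows that idea in one JVM under assumptions: `PartitioningSketch`, `Range`, `partition`, `processRange`, and `run` are hypothetical names, threads stand in for the remote slaves, and summing ids stands in for a full reader/processor/writer step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// In-process illustration of partitioning: the master only decides who
// handles which id range; each worker handles its own range end to end.
final class PartitioningSketch {

    // Hypothetical partition description: an inclusive range of record ids.
    static final class Range {
        final int min;
        final int max;
        Range(int min, int max) { this.min = min; this.max = max; }
    }

    // "Partitioner": split [min, max] into at most gridSize contiguous ranges.
    static List<Range> partition(int min, int max, int gridSize) {
        List<Range> ranges = new ArrayList<>();
        int total = max - min + 1;
        int size = (total + gridSize - 1) / gridSize;   // ceiling division
        for (int start = min; start <= max; start += size) {
            ranges.add(new Range(start, Math.min(start + size - 1, max)));
        }
        return ranges;
    }

    // "Worker step": stands in for reading, processing, and writing one
    // range; here it just sums the ids it owns.
    static long processRange(Range range) {
        long sum = 0;
        for (int id = range.min; id <= range.max; id++) {
            sum += id;
        }
        return sum;
    }

    // "Master": hand each partition to a worker and aggregate the results.
    static long run(int min, int max, int gridSize) {
        ExecutorService workers = Executors.newFixedThreadPool(gridSize);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Range range : partition(min, max, gridSize)) {
                Callable<Long> step = () -> processRange(range);
                results.add(workers.submit(step));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a partition failed", e.getCause());
        } finally {
            workers.shutdown();
        }
    }
}
```

Only the small range descriptions cross the boundary between master and workers, so this layout tends to fit jobs that are bound by I/O on the data itself.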

Page 40: Spring Batch Performance Tuning

Remote Partitioning - Demo


Page 41: Spring Batch Performance Tuning

Demo - Launching via messages & informational messages

This demo does not provide scaling, but it demonstrates how to launch a job via messages and send informational messages to integration points.

Page 42: Spring Batch Performance Tuning

Spring XD


http://projects.spring.io/spring-xd/

Page 43: Spring Batch Performance Tuning

Tackling Big Data Complexity

• Data Ingestion
• Real-time Analytics
• Workflow Orchestration
• Data Export

Page 44: Spring Batch Performance Tuning

Tackling Big Data Complexity cont.

• Built on existing Spring assets
  o Spring Integration
  o Spring Batch
  o Spring Data
  o Spring Boot
  o Spring for Apache Hadoop
  o Spring Shell
• Redis, GemFire, Hadoop

Page 45: Spring Batch Performance Tuning

Data Ingestion Streams

• DSL based on Unix pipes and filters syntax

    http | file

• Modules are parameterizable

    twittersearch --query=spring | file --dir=/spring

• Simple logic can be added via expressions or scripts

    http | filter --expression=payload=='Spring' | hdfs

Page 46: Spring Batch Performance Tuning

Hadoop workflow managed by Spring Batch

• Reuse Batch infrastructure and features to manage Hadoop workflows
  o Job state management, launching, monitoring, restart/retry policies, etc.
• Step can be any Hadoop job type or HDFS script
• Can mix and match with other Batch readers/writers, e.g. JDBC for import/export use-cases

Page 47: Spring Batch Performance Tuning

Manage Batch Jobs with Spring XD


Page 48: Spring Batch Performance Tuning


Spring XD - Demo

Page 49: Spring Batch Performance Tuning

Books


Page 50: Spring Batch Performance Tuning

Learn More. Stay Connected.

Demo code and slides:

https://github.com/SpringOne2GX-2014/spring-batch-performance-tuning

THANK YOU!