Transcript
Page 1: Spring Batch Performance Tuning

© 2014 SpringOne 2GX. All rights reserved. Do not distribute without permission.

Spring Batch Performance Tuning
By Chris Schaefer & Gunnar Hillert

Page 2: Spring Batch Performance Tuning

Agenda

• Spring Batch
• Spring Integration
• Spring Batch Integration
• Scaling Spring Batch
• Spring XD

Page 3: Spring Batch Performance Tuning

Spring Batch

http://projects.spring.io/spring-batch/

Page 4: Spring Batch Performance Tuning

“Batch processing ... is defined as the processing of data without interaction or interruption.”
Michael T. Minella, Pro Spring Batch

Page 5: Spring Batch Performance Tuning

Batch Jobs

• Generally long-running
• Non-interactive
• Often include logic for handling errors and restartability options
• Process large volumes of data
• More than what may fit in memory or a single transaction

Page 6: Spring Batch Performance Tuning

Batch and offline processing

• Close of business processing
• Order processing, Business reporting, Account reconciliation, Payroll
• Import / export handling
• a.k.a. ETL jobs (Extract-Transform-Load)
• Data warehouse synchronization
• Large-scale output jobs
• Loyalty program emails, Bank statements
• Hadoop job orchestration

Page 7: Spring Batch Performance Tuning

Features

• Transaction management
• Chunk based processing
• Schema and Java Config support
• Annotations for callback type scenarios such as Listeners
• Start/Restart/Skip capabilities
• Based on the Spring framework
• JSR 352: Batch Applications for the Java Platform

Page 8: Spring Batch Performance Tuning

Concepts

• Job
• Step
• Chunk
• Item

Repeat | Retry | Skip | Restart

Page 9: Spring Batch Performance Tuning

Chunk-Oriented Processing

• Read data, optionally process and write out the “chunk” within a transaction boundary.
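
The read, process, write cycle above can be sketched with nothing but the JDK. This is a simplified illustration, not Spring Batch's implementation: the class name `ChunkLoopSketch`, the `run` method and the use of an in-memory list as the "writer" are all assumptions made for the example, and each pass of the outer loop merely stands in for one transaction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

// Plain-JDK illustration of chunk-oriented processing; hypothetical names throughout.
class ChunkLoopSketch {

    // Reads up to chunkSize items, processes them, then writes them as one chunk.
    // Each pass of the outer loop stands in for one transaction boundary.
    static <I, O> List<List<O>> run(Iterator<I> reader, Function<I, O> processor, int chunkSize) {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be at least 1");
        }
        List<List<O>> writtenChunks = new ArrayList<>();
        while (reader.hasNext()) {
            // Read phase: collect at most chunkSize items.
            List<I> items = new ArrayList<>();
            while (items.size() < chunkSize && reader.hasNext()) {
                items.add(reader.next());
            }
            // Process phase: transform every item that was read.
            List<O> processed = new ArrayList<>();
            for (I item : items) {
                processed.add(processor.apply(item));
            }
            // Write phase: the whole chunk is handed over in one call.
            writtenChunks.add(processed);
        }
        return writtenChunks;
    }

    public static void main(String[] args) {
        Function<String, String> upperCase = s -> s.toUpperCase();
        List<String> input = Arrays.asList("a", "b", "c", "d", "e");
        List<List<String>> chunks = run(input.iterator(), upperCase, 2);
        System.out.println(chunks); // [[A, B], [C, D], [E]]
    }
}
```

With five items and a chunk size of two, the writer is called three times; a failure in the last chunk would leave the first two already committed, which is what makes restart from the last good chunk possible.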

Page 10: Spring Batch Performance Tuning

JobLauncher

Page 11: Spring Batch Performance Tuning

ItemReaders and ItemWriters

• Flat File
• XML (StAX)
• Multi-File Input
• Database
• JDBC, JPA/Hibernate, Stored Procedures, Spring Data
• JMS
• AMQP
• Email
• Implement your own...

Page 12: Spring Batch Performance Tuning

Simple File Load Job

Page 13: Spring Batch Performance Tuning

Job Repository

Page 14: Spring Batch Performance Tuning

Spring Integration

http://projects.spring.io/spring-integration/

Page 15: Spring Batch Performance Tuning

Integration Styles

• File Transfer
• Shared Database
• Remoting
• Messaging

Page 16: Spring Batch Performance Tuning

Integration Styles

• Business to Business Integration (B2B)
• Inter Application Integration (EAI)
• Intra Application Integration

[Diagram labels: JVM, JVM, EAI, External Business Partner, B2B, Core Messaging]

Page 17: Spring Batch Performance Tuning

Common Patterns

[Diagram: Retrieve, Parse, Transform, Transmit]

Page 18: Spring Batch Performance Tuning

Enterprise Integration Patterns

• By Gregor Hohpe & Bobby Woolf
• Published 2003
• Collection of well-known patterns
• Icon library provided

http://www.eaipatterns.com/eaipatterns.html

Page 19: Spring Batch Performance Tuning

“Spring Integration provides an extension of the Spring programming model to support the well-known enterprise integration patterns.”
Spring Integration Website

Page 20: Spring Batch Performance Tuning

Adapters

AMQP/RabbitMQ, AWS, File/Resource, FTP/FTPS/SFTP, GemFire, HTTP (REST), JDBC, JMS, JMX, JPA,
MongoDB, POP3/IMAP/SMTP, Print, Redis, RMI, RSS/Atom, SMB, Splunk, Spring ApplicationEvents,
Stored Procedures, TCP/UDP, Twitter, Web Services, XMPP, XPath, XQuery, Custom Adapters

Page 21: Spring Batch Performance Tuning

Samples

• https://github.com/spring-projects/spring-integration-samples
• Contains 50 Samples and Applications
• Several Categories:
  • Basic
  • Intermediate
  • Advanced
  • Applications

Page 22: Spring Batch Performance Tuning

Spring Batch Integration

Page 23: Spring Batch Performance Tuning

Launching batch jobs through messages

• Event-Driven execution of the JobLauncher
• Spring Integration retrieves the data (e.g. file system, FTP, ...)
• Easy to support separate input sources simultaneously

[Diagram labels: FTP, Inbound Channel Adapter, Transformer, File, JobLaunchRequest, JobLauncher]

Page 24: Spring Batch Performance Tuning

JobLaunchRequest

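import java.io.File;

import org.springframework.batch.core.Job;
import org.springframework.batch.core.JobParametersBuilder;
import org.springframework.batch.integration.launch.JobLaunchRequest;
import org.springframework.integration.annotation.Transformer;
import org.springframework.messaging.Message;

public class FileMessageToJobRequest {
    private Job job;
    private String fileParameterName;
    ...
    // Turns an incoming file message into a request to launch the job
    @Transformer
    public JobLaunchRequest toRequest(Message<File> message) {
        JobParametersBuilder jobParametersBuilder = new JobParametersBuilder();
        // The file's absolute path becomes a job parameter
        jobParametersBuilder.addString(fileParameterName,
                message.getPayload().getAbsolutePath());
        return new JobLaunchRequest(job, jobParametersBuilder.toJobParameters());
    }
}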

Page 25: Spring Batch Performance Tuning

JobLaunchRequest

<batch-int:job-launching-gateway request-channel="requestChannel"
    reply-channel="replyChannel"
    job-launcher="jobLauncher"/>

Page 26: Spring Batch Performance Tuning

Get feedback with informational messages

• Spring Batch provides support for listeners:
  • StepExecutionListener
  • ChunkListener
  • JobExecutionListener

Page 27: Spring Batch Performance Tuning

Get feedback with informational messages

<batch:job id="importPayments">
    ...
    <batch:listeners>
        <batch:listener ref="notificationExecutionsListener"/>
    </batch:listeners>
</batch:job>

<int:gateway id="notificationExecutionsListener"
    service-interface="org.springframework.batch.core.JobExecutionListener"
    default-request-channel="jobExecutions"/>

Page 28: Spring Batch Performance Tuning

The demo of launching jobs and sending informational messages is in the next section

Page 29: Spring Batch Performance Tuning

Scaling Spring Batch

Page 30: Spring Batch Performance Tuning

Scaling and externalizing batch process execution

• Utilization of Spring Integration for multi-process communication
• Distribute complex processing
• Single process
  o Multi-threaded steps
  o Parallel steps
  o Local partitioning
• Multi process
  o Remote chunking
  o Remote partitioning
• Asynchronous Item processing support
• AsyncItemProcessor
• AsyncItemWriter

Page 31: Spring Batch Performance Tuning

Single Thread

[Diagram labels: Input, Reader, Item, Processor, Result, Writer, Gateway, Output]

Page 32: Spring Batch Performance Tuning

Single Thread - Demo

Page 33: Spring Batch Performance Tuning

Multi-threaded

[Diagram labels: Input, Reader, Item, Processor, Result, Writer, Gateway, Output]

• Simply add a TaskExecutor to your Tasklet configuration
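
What a multi-threaded step does conceptually can be sketched in plain Java: several threads pull chunks from one shared reader and process them concurrently. This is an illustration under stated assumptions, not Spring Batch code. The class `MultiThreadedStepSketch`, its methods and the doubling "processor" are hypothetical, and a fixed thread pool stands in for the TaskExecutor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-JDK illustration of a multi-threaded step; hypothetical names throughout.
class MultiThreadedStepSketch {

    // The reader is shared by all threads, so access to it is synchronized.
    static List<Integer> readChunk(Iterator<Integer> reader, int chunkSize) {
        List<Integer> chunk = new ArrayList<>();
        synchronized (reader) {
            while (chunk.size() < chunkSize && reader.hasNext()) {
                chunk.add(reader.next());
            }
        }
        return chunk;
    }

    static List<Integer> run(List<Integer> input, int chunkSize, int threads) {
        Iterator<Integer> reader = input.iterator();
        ConcurrentLinkedQueue<Integer> written = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                // Each thread keeps taking chunks until the reader is exhausted.
                List<Integer> chunk = readChunk(reader, chunkSize);
                while (!chunk.isEmpty()) {
                    for (Integer item : chunk) {
                        written.add(item * 2); // stand-in for process + write
                    }
                    chunk = readChunk(reader, chunkSize);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Chunks complete in no particular order, so sort for a stable result.
        List<Integer> result = new ArrayList<>(written);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList(1, 2, 3, 4, 5, 6), 2, 3));
    }
}
```

The sketch makes two consequences visible: the reader has to be safe for concurrent use, and items are no longer written in input order.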

Page 34: Spring Batch Performance Tuning

Multi-Threaded - Demo

Page 35: Spring Batch Performance Tuning

Asynchronous Processors

• AsyncItemProcessor
  • Dispatches ItemProcessor logic on a new thread, returning a Future to the AsyncItemWriter
• AsyncItemWriter
  • Writes the processed items after processing is complete
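
The hand-off between the two classes can be illustrated with plain `java.util.concurrent` types. The sketch below is not the Spring Batch implementation: `AsyncProcessingSketch`, `processAsync` and `writeAll` are hypothetical names, and `String::length` is an arbitrary stand-in for the delegate ItemProcessor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Plain-JDK illustration of the async processor/writer pair; hypothetical names.
class AsyncProcessingSketch {

    // "Processor" side: run the delegate on another thread and return a Future at once.
    static <I, O> Future<O> processAsync(ExecutorService pool, Function<I, O> delegate, I item) {
        Callable<O> task = () -> delegate.apply(item);
        return pool.submit(task);
    }

    // "Writer" side: unwrap each Future, waiting until its item has been processed.
    static <O> List<O> writeAll(List<Future<O>> futures) {
        List<O> written = new ArrayList<>();
        for (Future<O> future : futures) {
            try {
                written.add(future.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            } catch (ExecutionException e) {
                throw new IllegalStateException(e.getCause());
            }
        }
        return written;
    }

    static List<Integer> run(List<String> items, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            Function<String, Integer> length = String::length;
            List<Future<Integer>> futures = new ArrayList<>();
            for (String item : items) {
                futures.add(processAsync(pool, length, item));
            }
            return writeAll(futures);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("spring", "batch", "xd"), 2)); // [6, 5, 2]
    }
}
```

Because the futures are unwrapped in the order they were created, the output keeps the input order even though processing runs in parallel.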

Page 36: Spring Batch Performance Tuning

Asynchronous Processors - Demo

Page 37: Spring Batch Performance Tuning

Remote Chunking

[Diagram: Step 1, Step 2 and Step 3 in sequence. Step 1 and Step 3 each have an ItemReader, ItemProcessor and ItemWriter; Step 2 has an ItemReader and ItemWriter only. Step 2a, Step 2b and Step 2c each have an ItemReader, ItemProcessor and ItemWriter.]
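
As a rough sketch of the idea behind remote chunking: one side reads items and ships whole chunks over a channel, while several workers take chunks off the channel, process them and write the results. The plain-Java version below is an assumption-laden illustration, not Spring Batch code. An in-memory `BlockingQueue` stands in for the messaging middleware, worker threads stand in for remote processes, and the class name `RemoteChunkingSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Plain-JDK illustration of remote chunking; hypothetical names throughout.
class RemoteChunkingSketch {

    static List<String> run(List<String> items, int chunkSize, int workers) {
        BlockingQueue<List<String>> channel = new LinkedBlockingQueue<>();
        List<String> written = Collections.synchronizedList(new ArrayList<String>());
        List<String> stop = new ArrayList<>(); // sentinel, compared by identity

        // Workers: take a chunk, process each item, write the result.
        List<Thread> threads = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            Thread worker = new Thread(() -> {
                try {
                    for (List<String> chunk = channel.take(); chunk != stop; chunk = channel.take()) {
                        for (String item : chunk) {
                            written.add(item.toUpperCase());
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            worker.start();
            threads.add(worker);
        }

        try {
            // Reading side: read items, group them into chunks, send each chunk.
            for (int i = 0; i < items.size(); i += chunkSize) {
                int end = Math.min(i + chunkSize, items.size());
                channel.put(new ArrayList<>(items.subList(i, end)));
            }
            // One stop signal per worker, then wait for all of them to finish.
            for (int w = 0; w < workers; w++) {
                channel.put(stop);
            }
            for (Thread worker : threads) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        List<String> result = new ArrayList<>(written);
        Collections.sort(result); // arrival order depends on scheduling
        return result;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c", "d", "e"), 2, 3));
    }
}
```

Reading stays in one place here, so this shape only pays off when processing or writing is the bottleneck rather than reading.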

Page 38: Spring Batch Performance Tuning

Remote Chunking - Demo

Page 39: Spring Batch Performance Tuning

Remote Partitioning

[Diagram: Step 1, a Master step with a Partitioner, and Step 3, plus Slave 1, Slave 2 and Slave 3. Each step and each slave has its own ItemReader, ItemProcessor and ItemWriter.]
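
The partitioner's job is to describe slices of the input so that each worker can read its own slice. A minimal sketch of one common strategy, splitting a numeric id range evenly, follows. It is plain Java rather than an implementation of Spring Batch's `Partitioner` interface: the class `RangePartitionerSketch`, the partition names and the `long[] {start, end}` representation are assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-JDK illustration of range partitioning; hypothetical names throughout.
class RangePartitionerSketch {

    // Splits the inclusive id range [minId, maxId] into at most gridSize
    // contiguous, non-overlapping sub-ranges of roughly equal size.
    static Map<String, long[]> partition(long minId, long maxId, int gridSize) {
        if (gridSize < 1) {
            throw new IllegalArgumentException("gridSize must be at least 1");
        }
        Map<String, long[]> partitions = new LinkedHashMap<>();
        long targetSize = (maxId - minId) / gridSize + 1;
        long start = minId;
        int number = 0;
        while (start <= maxId) {
            long end = Math.min(start + targetSize - 1, maxId);
            partitions.put("partition" + number, new long[] {start, end});
            start = end + 1;
            number++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Each entry would be handed to one worker, which reads only its own range.
        for (Map.Entry<String, long[]> entry : partition(1, 100, 3).entrySet()) {
            long[] range = entry.getValue();
            System.out.println(entry.getKey() + ": " + range[0] + ".." + range[1]);
        }
    }
}
```

For ids 1 to 100 and a grid size of 3 this yields the ranges 1..34, 35..68 and 69..100. Unlike remote chunking, only these range descriptions travel to the workers, not the items themselves.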

Page 40: Spring Batch Performance Tuning

Remote Partitioning - Demo

Page 41: Spring Batch Performance Tuning

Demo - Launching via messages & informational messages

Does not provide scaling, but demonstrates how to launch a job via messages and send informational messages to integration points

Page 42: Spring Batch Performance Tuning

Spring XD

http://projects.spring.io/spring-xd/

Page 43: Spring Batch Performance Tuning

Tackling Big Data Complexity

• Data Ingestion
• Real-time Analytics
• Workflow Orchestration
• Data Export

Page 44: Spring Batch Performance Tuning

Tackling Big Data Complexity cont.

• Built on existing Spring assets
• Spring Integration
• Spring Batch
• Spring Data
• Spring Boot
• Spring for Apache Hadoop
• Spring Shell
• Redis, GemFire, Hadoop

Page 45: Spring Batch Performance Tuning

Data Ingestion Streams

• DSL based on Unix pipes and filters syntax
    http | file
• Modules are parameterizable
    twittersearch --query=spring | file --dir=/spring
• Simple logic can be added via expressions or scripts
    http | filter --expression=payload=='Spring' | hdfs

Page 46: Spring Batch Performance Tuning

Hadoop workflow managed by Spring Batch

• Reuse Batch infrastructure and features to manage Hadoop workflows
• Job state management, launching, monitoring, restart/retry policies, etc.
• Step can be any Hadoop job type or HDFS script
• Can mix and match with other Batch readers/writers, e.g. JDBC for import/export use-cases

Page 47: Spring Batch Performance Tuning

Manage Batch Jobs with Spring XD

Page 48: Spring Batch Performance Tuning

Spring XD - Demo

Page 49: Spring Batch Performance Tuning

Books

Page 50: Spring Batch Performance Tuning

Learn More. Stay Connected.

Demo code and slides:

https://github.com/SpringOne2GX-2014/spring-batch-performance-tuning

THANK YOU!

