Speed Up Your Build Pipeline for Faster Feedback
Ashish Parkhi
IDeaS a SAS Company
@AshishParkhi
http://ashishparkhi.com
Impact on life.
Key Principles to Speed Up Your Build Pipeline
• Focus on the Bottlenecks
• Divide and Conquer
• Fail Fast
Focus on Bottleneck: Disk IO – Example
Database operations.
Focus on Bottleneck: Disk IO – Alternative
• Avoid file operations – e.g. duplicating workspace
Focus on Bottleneck: Disk IO – Alternative
• Avoid file operations – e.g. JAR creation.
Focus on Bottleneck: Disk IO – Alternative
• Robocopy/rsync.
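Robocopy and rsync save disk IO mainly by skipping files that have not changed. The slides do not show how the tools were invoked, so the following is only a minimal Java sketch of that idea, not the author's setup: the class name IncrementalCopy and the size-plus-timestamp comparison are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: copies only what changed, in the spirit of rsync/robocopy.
final class IncrementalCopy {

    /** Copies files that are missing or differ in size/mtime; returns how many were copied. */
    static int sync(Path source, Path target) throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(source)) {
            files = walk.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        int copied = 0;
        for (Path file : files) {
            Path dest = target.resolve(source.relativize(file).toString());
            if (isUpToDate(file, dest)) {
                continue; // unchanged since the last sync: no disk write needed
            }
            Files.createDirectories(dest.getParent());
            Files.copy(file, dest,
                    StandardCopyOption.REPLACE_EXISTING,
                    StandardCopyOption.COPY_ATTRIBUTES); // keeps mtime so the next run can compare
            copied++;
        }
        return copied;
    }

    /** Same size and same mtime (at millisecond precision) is treated as "unchanged". */
    static boolean isUpToDate(Path file, Path dest) throws IOException {
        return Files.isRegularFile(dest)
                && Files.size(file) == Files.size(dest)
                && Files.getLastModifiedTime(file).toMillis()
                        == Files.getLastModifiedTime(dest).toMillis();
    }

    /** Demo on temp directories: files copied by the first, second and third sync. */
    static int[] demo() {
        try {
            Path src = Files.createTempDirectory("workspace");
            Path dst = Files.createTempDirectory("workspace-copy");
            Files.write(src.resolve("a.txt"), "hello".getBytes(StandardCharsets.UTF_8));
            int first = sync(src, dst);
            int second = sync(src, dst); // nothing changed in between
            Files.write(src.resolve("a.txt"), "hello, world".getBytes(StandardCharsets.UTF_8));
            int third = sync(src, dst);
            return new int[] {first, second, third};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In the demo, only the first and third sync touch the disk; the second finds the file up to date and skips it. Real tools add more (deletions, checksums, retries), which this sketch leaves out.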
Focus on Bottleneck: Disk IO – Alternative
• Test on a smaller but apt data set.
Focus on Bottleneck: Disk IO – Alternative – SSD
• HDD (Toshiba MQ01ACF050 500GB SATA III) vs SSD (Samsung PM851 512GB mSATA)
Focus on Bottleneck: Disk IO – Alternative – In Memory DB
MySQL Memory (Heap) Engine
– had some limitations compared with the MyISAM engine.
Focus on Bottleneck: Disk IO – Alternative – In Memory DB
– did not support many MySQL queries, so it was discarded.
Focus on Bottleneck: Disk IO – Alternative – In Memory DB
database
– looked promising as it could support many MySQL queries, but still required a couple of modifications to our code.
Focus on Bottleneck: Disk IO – Alternative – In Memory DB
– MemSQL looked most promising as it is wire-compatible with MySQL, which means I could just point to MemSQL, without code changes, and be done with it.
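"Just point to memsql" works without code changes only if the connection URL lives in configuration rather than in code. A minimal sketch of that arrangement follows; the property name test.db.url, the class name and both URLs are hypothetical and do not come from the talk.

```java
import java.util.Properties;

// Hypothetical configuration helper: the database location is an override, not a code change.
final class TestDatabaseConfig {

    // Assumed default: the regular disk-backed MySQL instance.
    static final String DEFAULT_URL = "jdbc:mysql://localhost:3306/app";

    /** Returns the JDBC URL to use, letting a property override the default. */
    static String jdbcUrl(Properties overrides) {
        return overrides.getProperty("test.db.url", DEFAULT_URL);
    }

    public static void main(String[] args) {
        // e.g. java -Dtest.db.url=jdbc:mysql://memsql-host:3306/app TestDatabaseConfig.java
        System.out.println(jdbcUrl(System.getProperties()));
    }
}
```

Because the in-memory server speaks the MySQL wire protocol, the same MySQL JDBC driver and the same jdbc:mysql:// scheme can be kept; only the host in the URL changes.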
Focus on Bottleneck: CPU – Profiling – Insights
Scanning resource bundle files from jars.
Focus on Bottleneck: CPU – Profiling – Insights
Loading Spring Application Context.
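The chart near the end of the deck lists "Caching Resource" and "Caching Spring Context" among the fixes. The slides do not show how the caching was done, so this is only a sketch of the general idea: build an expensive object once per JVM and hand the same instance to every test. The class name SharedFixtureCache and the key are hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Supplier;

// Hypothetical per-JVM cache for fixtures that are expensive to build,
// such as an application context or parsed resource bundles.
final class SharedFixtureCache {

    private static final ConcurrentMap<String, Object> CACHE = new ConcurrentHashMap<>();

    /** Runs the loader only on the first request for a key; later calls reuse the result. */
    @SuppressWarnings("unchecked")
    static <T> T getOrCreate(String key, Supplier<T> loader) {
        return (T) CACHE.computeIfAbsent(key, k -> loader.get());
    }
}
```

A test base class could call SharedFixtureCache.getOrCreate("app-context", ...) in its setup, so the loading cost is paid once per test run instead of once per test class. This only helps when tests run in the same JVM and do not mutate the shared object.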
Focus on Bottleneck: CPU – Profiling – Insights
Avoid unnecessary activities during the build, e.g. sending out email.
Focus on Bottleneck: CPU – Profiling – Insights
java.util.Calendar is horribly slow.
Total processing took 20.72 minutes, of which date arithmetic took 18.15 minutes, about 87.6% of the total processing time!
Focus on Bottleneck: CPU – Profiling – Insights
java.util.Calendar is horribly slow. We switched to the Joda-Time date library and deprecated the java.util.Date API.
Now date arithmetic takes 1.30 minutes; that’s a massive saving of 93.77%.
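The slides do not show the date arithmetic that was profiled. As an illustration only, the sketch below contrasts a Calendar-based day count with a single library call. It uses the JDK's java.time API as a stand-in for Joda-Time so that it runs without an extra dependency. The method names and the day-counting task are hypothetical, and no timing claim is implied.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;
import java.util.Calendar;
import java.util.GregorianCalendar;

// Hypothetical example of date arithmetic done two ways.
final class DateArithmetic {

    /** Counts days by stepping a mutable Calendar forward one day at a time. */
    static long daysBetweenCalendar(int y1, int m1, int d1, int y2, int m2, int d2) {
        Calendar cursor = new GregorianCalendar(y1, m1 - 1, d1); // Calendar months are 0-based
        Calendar end = new GregorianCalendar(y2, m2 - 1, d2);
        long days = 0;
        while (cursor.before(end)) {
            cursor.add(Calendar.DAY_OF_MONTH, 1); // each step recomputes the calendar fields
            days++;
        }
        return days;
    }

    /** Same result from immutable date values in one call (java.time used in place of Joda-Time). */
    static long daysBetweenJavaTime(int y1, int m1, int d1, int y2, int m2, int d2) {
        return ChronoUnit.DAYS.between(LocalDate.of(y1, m1, d1), LocalDate.of(y2, m2, d2));
    }

    public static void main(String[] args) {
        System.out.println(daysBetweenCalendar(2014, 1, 1, 2015, 1, 1));
        System.out.println(daysBetweenJavaTime(2014, 1, 1, 2015, 1, 1));
    }
}
```

Both methods agree on the result. Which one is faster for a given workload is something to confirm with a profiler, as the talk did.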
Divide and Conquer: CPU – Running Tests Concurrently
• Distribute tasks across multiple slaves.
Divide and Conquer: CPU – Running Tests Concurrently
• Using @RunWith(ConcurrentJunitRunner.class).
– Courtesy of Mathieu Carbou:
http://java.dzone.com/articles/concurrent-junit-tests
– The Maven Surefire plugin has a built-in mechanism.
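The runner itself is described in the linked article and is not reproduced here. As a rough sketch of the underlying idea, assuming the tests are independent of each other, the hypothetical class below submits a list of checks to a fixed thread pool and counts how many pass. It is not the ConcurrentJunitRunner implementation.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration of running independent checks in parallel.
final class ParallelChecks {

    /** Runs every check on a fixed-size pool and returns how many passed. */
    static int runAll(List<Callable<Boolean>> checks, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // invokeAll blocks until every check has finished or thrown
            List<Future<Boolean>> results = pool.invokeAll(checks);
            int passed = 0;
            for (Future<Boolean> result : results) {
                try {
                    if (Boolean.TRUE.equals(result.get())) {
                        passed++;
                    }
                } catch (ExecutionException e) {
                    // a check that throws counts as failed
                }
            }
            return passed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while running checks", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The approach only pays off when checks share no mutable state (files, static fields, database rows); otherwise parallel runs produce flaky failures rather than faster feedback.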
Fail Fast: Restructure The Build Pipeline
• We want our builds to give us fast feedback. Hence it is very important to
prioritize our build tasks based on what is most likely to fail first.
• Push unnecessary stuff to a separate build – Things like JavaDocs can be
done nightly.
• Separate out fast and slow running tests.
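One way to picture the prioritization described above is a pipeline that sorts its steps by how often they have failed, breaks ties in favor of the quicker step, and stops at the first failure. This is a sketch only: the Step fields, the failure rates and the step names are hypothetical, and a real CI server would express this through job ordering rather than Java code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical model of a fail-fast build pipeline.
final class FailFastPipeline {

    static final class Step {
        final String name;
        final double failureRate;     // assumed: share of past builds in which this step failed
        final long typicalSeconds;    // assumed: how long the step usually takes
        final BooleanSupplier action; // returns true when the step passes

        Step(String name, double failureRate, long typicalSeconds, BooleanSupplier action) {
            this.name = name;
            this.failureRate = failureRate;
            this.typicalSeconds = typicalSeconds;
            this.action = action;
        }
    }

    /** Runs the most failure-prone steps first and stops at the first failure; returns the steps run. */
    static List<String> run(List<Step> steps) {
        List<Step> ordered = new ArrayList<>(steps);
        ordered.sort(Comparator.comparingDouble((Step s) -> s.failureRate)
                .reversed()
                .thenComparingLong(s -> s.typicalSeconds));
        List<String> executed = new ArrayList<>();
        for (Step step : ordered) {
            executed.add(step.name);
            if (!step.action.getAsBoolean()) {
                break; // fail fast: later steps would only delay the feedback
            }
        }
        return executed;
    }
}
```

With a rarely failing, slow step such as documentation generation at the bottom of the order, a broken compile or unit test is reported before the slow step ever starts.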
Fail Fast: Incremental Build vs. Clean Build
• Local dev builds are incremental rather than clean builds, as this gives faster feedback and helps fail fast.
Summary
• Focus on bottlenecks
– Avoid disk IO: file operations and file-based database operations.
– Use smaller datasets, robocopy, rsync.
– Use in-memory databases, RAM drives, SSDs.
– Perform CPU profiling and scan logs to uncover the unknown.
– Verify build tool settings.
• Divide and Conquer
– Create smaller jobs that can run in parallel.
– Distribute jobs across multiple slaves.
– Write tests that can run in isolation and use ConcurrentJunitRunner to run them in parallel.
• Fail Fast
– Restructure the build pipeline to uncover failures sooner.
– Use incremental builds.
Build Time vs. Number of Builds (chart, annotated with the changes made)
• Removed Workspace Duplication
• Ant JUnit Task – Fork Once
• RAM Disk
• Caching Resource
• Caching Spring Context
• Avoided Email
• Joda DateTime
• Deprecated Date API
• Concurrent JUnit Runner
Impact on life
Resources
• Jenkins – http://jenkins-ci.org/
• CI – http://en.wikipedia.org/wiki/Continuous_integration
• Mklink – http://technet.microsoft.com/en-us/library/cc753194.aspx
• http://ant.apache.org/manual/Tasks/junit.html
• http://java.dzone.com/articles/javalangoutofmemory-permgen
• SSD – http://en.wikipedia.org/wiki/Solid-state_drive
• Hybrid disk – http://en.wikipedia.org/wiki/Hybrid_drive
• HSQL – http://hsqldb.org/
• H2 – http://www.h2database.com/html/main.html
• Memsql – http://www.memsql.com/
• MySQL is bazillion times faster than MemSQL
• Tmpfs – http://en.wikipedia.org/wiki/Tmpfs
• http://blog.laptopmag.com/faster-than-an-ssd-how-to-turn-extra-memory-into-a-ram-disk
• RAM Disk Software Benchmarked
• http://jvmmonitor.org/
• http://searchvmware.techtarget.com/tip/VMware-snapshot-size-and-other-causes-for-slow-snapshots
• http://blogs.agilefaqs.com/2014/10/03/key-principles-for-reducing-continuous-integration-build-time/
• http://googletesting.blogspot.com/2011/06/testing-at-speed-and-scale-of-google.html
• http://www.infoq.com/presentations/Development-at-Google
• http://crystalmark.info/software/CrystalDiskMark/index-e.html
Thank You!