current status of the development of the unified solids library marek gayer cern ph/sft

45
Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Upload: egbert-davis

Post on 19-Jan-2018

216 views

Category:

Documents


0 download

DESCRIPTION

Solids implemented so far Box Orb Trapezoid Sphere (+ sphere section) Tube (+ cylindrical section) Cone (+ conical section) Generic trapezoid Tetrahedron Arbitrary Trapezoid (ongoing) Multi-Union Tessellated Solid Polycone Polyhedra Extruded solid (ongoing) Current status of the development of the Unified Solids library3

TRANSCRIPT

Page 1: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of

the Unified Solids libraryMarek GayerCERN PH/SFT

Page 2: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 2

Motivations for a common solids

library• Optimize and guarantee better long-term

maintenance of ROOT and Gean4 solids librarieso A rough estimation indicates that about 70-80% of code investment for

the geometry modeler concerns solids, to guarantee the required precision and efficiency in a huge variety of combinations

• Create a single high quality library to replace solid libraries in Geant4 and ROOTo Starting from what exists today in Geant4 and ROOTo Adopt a single type for each shapeo Significantly optimize (Multi-Union, Tessellated Solid, Polyhedra,

Polycone)o Reach complete conformance to GDML solids schema

• Create extensive testing suite

Page 3: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 3

Solids implemented so far

• Box• Orb• Trapezoid• Sphere (+ sphere section)• Tube (+ cylindrical section) • Cone (+ conical section) • Generic trapezoid• Tetrahedron• Arbitrary Trapezoid (ongoing)• Multi-Union• Tessellated Solid• Polycone• Polyhedra• Extruded solid (ongoing)

0

500

1000

0

500

1000

200

400

600

800

1000

-10-5

05x 10

4 -5

0

5

x 104

-6

-4

-2

0

2

4

6

x 104

-4000-3000

-2000-1000

01000

20003000

4000

-4000

-3000

-2000

-10000

10002000

3000

4000-1000

0

1000

-1000

-500

0

500

1000

-1000

-500

0

500

1000-1000

-500

0

500

1000

-1000

-500

0

500

1000

-1000

-500

0

500

1000-1000

-500

0

500

1000

Page 4: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 4

Testing Suite• Unified Solid Batch Test• Optical Escape• Specialized tests (e.g. quick performance

scalability test for multi-union)

Page 5: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 5

Unified Solid Batch Test• New powerful diagnosis tool for testing solids of ROOT,

Geant4 and Unified Solids library designed for transition to Unified Solids

• Compares performance and output values from different codes

• Helps us to make sure we have similar or better performance in each method and different test cases

• Support for batch configuration and execution of tests• Useful to detect, report and fix numerous bugs or suspect

return values in Geant4, ROOT and Unified Solids• Two phases

o Sampling phase (generation of data sets, implemented in C++o Analysis phase (data post-processing, implemented as MATLAB

scripts)

Page 6: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 6

Visualization of scalar and vector data sets

Page 7: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 7

3D plots allowing to overview data sets

Page 8: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 8

3D visualization of investigated shapes

-5000

0

5000

0

50000

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

x 104

Page 9: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 9

Support for regions of data, focusing on sub-parts

Page 10: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 10

Visual analysis of differences

Page 11: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 11

Visual analysis of differences in 3D

Page 12: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 12

Inside DistanceToOut DistanceToIn Normal SafetyFromOutside SafetyFromInside0

20

40

60

80

100

120

140

Method

Tim

e pe

r one

met

hod

call

[nan

osec

onds

]

Performance of methods at folder trap-1m-performance

Geant4ROOTUSolids

Graphs with comparison of performance

Page 13: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 13

Visualization of scalability performance for specific

solids

2 4 6 8 10 20 500

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2x 10

4

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method DistanceToOut at folder log

Geant4ROOTUSolid

Page 14: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 14

New Multi-Union solid

Page 15: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 15

Multi-Union solid• We implemented a new

solid as a union of many solids using voxelization technique to optimize speed and scalabilityo 3D space partition for fast

localization of componentso Aiming for a log(n) scalability,

unlike the Geant4 Boolean solid• Useful for complex

composites made of many solids

-0.8-0.6

-0.4-0.2

00.2

0.40.6

0.8

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

-0.8

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

Page 16: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 16

Test multi-union solid for scalability measurements

Union of random boxes, orbs and trapezoids (on pictures with 5, 10, 20 solids)

Page 17: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 17

The most performance critical methods

Page 18: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 18

Fast Tessellated Solid

Page 19: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 19

Fast Tessellated Solid• Made from connected triangular and

quadrangular facets forming solid• Old implementation was slow, no

spatial optimization• We use spatial division of facets into

3D grid forming voxels• Voxelization is based on:

o Bitmasks and logical and operations during initialization

o NEW: Bitmasks are not used at runtime, only pre-calculated lists of facets candidates are used during runtime

• Speedup factor ~10x additional improvement since the previous Geant4 collaboration meeting

Page 20: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 20

Fast Tessellated Solid• Old Tessellated Solid had also several weak parts

of algorithm, used at initialization with n2 complexityo This sometimes caused very huge delays when loading (e.g. in case of

foil with 164k faces)o Rewrote to have n ∙ log n complexityo Loading the solid is much faster. What before took minutes, now takes

seconds• Backported in autumn 2012 also to Geant4 9.6• By improving design of classes and removing

inefficiencies, total memory save is about ~50%, o Voxelization has overhead vs. original G4TS, which reduces to a total memory

save of ~25%

Page 21: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 21

LHCb VELO foil benchmarkMethod Speedu

pInside 2423xDistanceToIn 1334xDistanceToOut 1976xInformation ValueNumber of facets

164.149

Number of voxels

158.928

Memory saved compared with original Geant4

22% (51MB)

Page 22: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 22

Inside Normal SafetyFromOutside SafetyFromInside DistanceToIn DistanceToOut0

5

10

15x 10

6

Method

Tim

e pe

r one

met

hod

call

[nan

osec

onds

]

Performance of methods at folder tessellatedsolid-test-t100-100k

G4TessellatedSolidNew G4TessellatedSolid

• OLD Speedup: 240x None 10000x+ None 133x 397x• NEW Speedup: 894x 10.3x 10000x+ ~8.7x 412x

1183x• NEW Speedup: 2423x 72x 10000x+ ~10.9x 1334x

1976x• NEW Speedup: 2925x 46x 10000x+ ~9.8x 1332x

2968x

Performance – 164k/SCL5 with

10k/100k/1M voxels

Page 23: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 23

Polycone

Page 24: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 24

New Ordinary Polycone

implementation• Polycone is an important solid, heavily used in most

experimental setups• Special optimization for common cases: Z sections

can only increase (big majority of real cases)o Based on composition of separate instances of cones, tubes (or their

sections)o This way, providing brief, readable, clean and fast codeo Excellent scalability over the number of Z sectionso Very significant performance improvement

• Will make visible CPU improvement in full simulation of complex setups like ATLAS and CMS o In CMS, 6.92% was measured as the total CPU share of Polycone methodso Assuming possible speedup factor ~5x, this would save around 5%

total time of the whole CMS experimento Cost of polycone in ATLAS is “around 10%”

-500005000-500005000

2

4

6

8

10

12

x 104

Page 25: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 25

Inside DistanceToOut DistanceToIn Normal SafetyFromOutside SafetyFromInside0

500

1000

1500

2000

2500

Method

Tim

e pe

r one

met

hod

call

[nan

osec

onds

]

Performance of methods at folder polycone-3s-360-perf

Geant4ROOTUSolids

Ordinary Polycone performance, 3 z

sections • Speedup factor 3.3x vs. Geant4, 7.6x vs. ROOT

for most performance critical methods, i.e.: Inside, DistanceToOut, DistanceToIn

Page 26: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 26

Ordinary Polycone performance, 10 z

sections • Speedup factor 8.3x vs. Geant4, 7.9x vs. ROOT

for most performance critical methods

Inside DistanceToOut DistanceToIn Normal SafetyFromOutsideSafetyFromInside0

200

400

600

800

1000

1200

1400

1600

1800

2000

Method

Tim

e pe

r one

met

hod

call

[nan

osec

onds

]

Performance of methods at folder polycone-10s-360-perf

Geant4ROOTUSolids

Page 27: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 27

Ordinary Polycone performance, 100 z

sections • Speedup factor 34.3x vs. Geant4, 10.7x vs.

ROOT for most performance critical methods

Inside DistanceToOut DistanceToIn Normal SafetyFromOutsideSafetyFromInside0

1000

2000

3000

4000

5000

6000

7000

Method

Tim

e pe

r one

met

hod

call

[nan

osec

onds

]

Performance of methods at folder polycone-100s-360-perf

Geant4ROOTUSolids

Page 28: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 28

3s 10s 100s0

1000

2000

3000

4000

Count of solid partsTim

e [n

anos

econ

ds p

er o

pera

tion]Performance of method SafetyFromInside at folder ordinary

Geant4ROOTUSolid

3s 10s 100s0

2000

4000

6000

8000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion] Performance of method DistanceToOut at folder ordinary

Geant4ROOTUSolid

3s 10s 100s0

500

1000

1500

2000

2500

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion] Performance of method Inside at folder ordinary

Geant4ROOTUSolid

Ordinary Polycone scalability

-500005000-500005000

2

4

6

8

10

12

x 104

3s 10s 100s0

1000

2000

3000

4000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion] Performance of method DistanceToIn at folder ordinary

Geant4ROOTUSolid

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 29: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 29

Ordinary Polycone: sometimes only proper, careful coding makes big

difference• Performance gain over Geant4 could be explained

due to lack of spatial division by voxelization• But new USolids polycone uses very similar

algorithms and voxelization as ROOT does: why it is factor 7 – 10x faster, much more clear, shorter, readable and easier to maintain?

• Answer is given in 2 backup slides in the end of presentation

Page 30: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 30

Generic Polycone• Optimization made as well

o Using voxelization on generalized surface facet model

o Performance improvement over Geant4 (generic polycone does not exist in ROOT)

o Performance improvement depends on the number of sections and is less than in case of ordinary polycone

o Scalability is excellent as well as for the ordinary polycone case

o Tested and measured on original Geant4 test cases of solids

o Values are 100% conformal with Geant4• This work was achieved thanks to

the 3 month extension negotiated by John Harvey and supported by the Linear Collider Detector Team

Page 31: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 31

3s 10s 100s0

1000

2000

3000

4000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]Performance of method SafetyFromInside at folder combined

Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone

3s 10s 100s0

2000

4000

6000

8000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion] Performance of method DistanceToOut at folder combined

Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone

3s 10s 100s0

1000

2000

3000

4000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method DistanceToIn at folder combined

Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone

Ordinary vs. Generic Polycone

-500005000-500005000

2

4

6

8

10

12

x 104

3s 10s 100s0

500

1000

1500

2000

2500

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion] Performance of method Inside at folder combined

Geant4 PolyconeNew Generic PolyconeNew Ordinary Polycone

Geant4 poor performance scalability is explained by the lack

of spatial optimization

Page 32: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 32

Polyhedra

Page 33: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 33

New Polyhedra implementation

• Uses similar methodology as was done for the Generic Polycone

• Works for both ordinary and generic polyhedra

• Provides improvement of performance and scalability

• Values are 100% conformal with Geant4• Same as for the case of generic polycone, this work

was made during my last 3 months contract extension

Page 34: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 34

Inside scalability

2 4 6 8 10 20 500

500

1000

1500

2000

2500

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method Inside at folder log

Geant4ROOTUSolid

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 35: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 35

DistanceToIn scalability

2 4 6 8 10 20 500

2000

4000

6000

8000

10000

12000

14000

16000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method DistanceToIn at folder log

Geant4ROOTUSolid

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 36: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 36

DistanceToOut scalability

2 4 6 8 10 20 500

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2x 10

4

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method DistanceToOut at folder log

Geant4ROOTUSolid

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 37: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 37

Normal scalability

2 4 6 8 10 20 500

1000

2000

3000

4000

5000

6000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method Normal at folder log

Geant4ROOTUSolid

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 38: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 38

SafetyFromInside scalability

2 4 6 8 10 20 500

1000

2000

3000

4000

5000

6000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method SafetyFromInside at folder log

Geant4ROOTUSolid

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 39: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 39

2 4 6 8 10 20 500

1000

2000

3000

4000

5000

6000

7000

Count of solid parts

Tim

e [n

anos

econ

ds p

er o

pera

tion]

Performance of method SafetyFromOutside at folder log

Geant4ROOTUSolid

SafetyFromOutside scalability

TODO

Geant4 poor performance scalability is explained by the lack of spatial optimization

Page 40: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 40

Status of work Types and Unified Solids interface are defined Bridge classes for comparison with Geant4 and

ROOT implemented Testing suite defined and deployed Implemented 9 solid primitives New, high performance implementation of

composed solids: Multi-Union, Tessellated solid, Polycone, Polyhedra and Extruded-Solid (ongoing)

Code and knowledge is being passed since Q1 2013 to Tatiana Nikitina

Page 41: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 41

Future work plan• Marek Gayer

o Finish documentation• Tatiana Nikitina

o Analyze and implement remaining solids for the new libraryo Verify implementation of secondary methods for all solidso Integration to Geant4 10

Page 42: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Thank you for your attention.

???Questions and Answers

Current status of the development of the Unified Solids library 42

Page 43: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Backup

Current status of the development of the Unified Solids library 43

Page 44: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 44

Polycone: sometimes only proper, careful coding makes huge difference

1/2• But new USolids polycone uses very similar

algorithms and voxelization as ROOT does: why it is factor 7 – 10x faster, much more clear, shorter, readable and easier to maintain?

• Pre-calculates as much as possible in the constructor and not during runtime, not needing to do it in navigation methods again (e.g. if it is cylindrical or tubular section)

• Using directly methods from Tubs on Cons for polycone section to separate and move the logic from Polycone to Polycone sections o not re-implementing it again in new methods => difficult to read, maintain

• Lot of polycone data is instead contained in Tubs and Cons

Page 45: Current status of the development of the Unified Solids library Marek Gayer CERN PH/SFT

Current status of the development of the Unified Solids library 45

Polycone: sometimes only proper, careful coding makes huge difference

2/2• No use of recursion, troubling compiler, processor and

code readability• DistanceToIn moves point to bounding box• std::lower_bound is much faster than doing binary

search by own coding• C++, not feeling of mere C wrapped in classes• Using class for vector (x,y,z) is better than using C array• No use of pointers, memcpy, memset, …• Section data not contained in Tubs, Cons are grouped

together in a std::vector of struct’s• Logical components, namely repeatedly used are in

separated methods, often inlined