halo finder


Upload: wathsala-widanagamaachchi

Post on 06-Aug-2015


DESCRIPTION

Halo finder - piston based implementation

TRANSCRIPT

Page 1: Halo Finder

Abstract

PISTON is a portable framework which supports the development of visualization and analysis operators using a platform-independent, data-parallel programming model. Operators such as isosurface, cut-surface and threshold have been implemented in this framework, with the exact same operator code achieving good parallel performance on different architectures.

An important analysis operator in cosmology is the halo finder. A halo is a cluster of particles and is considered a common feature of interest found in cosmology data. As the number of cosmological simulations carried out in the recent past has increased, the resultant data of these simulations and the required analysis tasks have increased as well. As a consequence, there is a need to develop scalable and efficient tools to carry out the needed analysis.

Therefore, we are currently implementing a halo finder operator using PISTON. Researchers have developed a wide variety of techniques to identify halos in raw particle data. The most basic algorithm is the friend-of-friends (FOF) halo finder, where the particles are clustered based on two parameters: linking length and halo size. In an FOF halo finder, all particles that lie within the linking length of one another are considered one halo, and the halos are then filtered based on the halo size parameter. A naive implementation of an FOF halo finder compares every particle pair, requiring O(n^2) operations. Our data-parallel halo finder operator uses a balanced k-d tree to reduce this number of operations in the average case, and implements the algorithm using only data-parallel primitives in order to achieve portability and performance.

Page 2: Halo Finder

Data-Parallel Halo Finder Operator in PISTON

Wathsala Widanagamaachchi (CCS-7)

University of Utah

Mentor: Christopher Sewell

Page 3: Halo Finder

Outline

● PISTON & motivation behind it
● Data-parallel programming
● Halos & halo finder
● Naive approach & data-parallel approach
● Results

Page 4: Halo Finder

What is PISTON?

● Portable framework
● Development of visualization & analysis operators
● Uses a platform-independent, data-parallel programming model
● Motivation
  ● Lack of visualization software that takes full advantage of acceleration hardware and multi-core architectures

Page 5: Halo Finder

Data-Parallel programming & Thrust

● What is data parallelism?
  ● The same operation is performed by different processors on different pieces of data
● What is Thrust?
  ● Thrust is an NVIDIA C++ template library that provides CUDA and OpenMP backends
  ● Most of the STL-style algorithms in Thrust are data-parallel
    – sorting: thrust::sort and thrust::sort_by_key
      4 5 6 8 7 2 1 3 : sort: 1 2 3 4 5 6 7 8
    – scans: thrust::inclusive_scan, thrust::exclusive_scan etc.
      4 5 6 7 8 2 1 3 : sum scan: 4 9 15 22 30 32 33 36
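The sort and scan examples on this slide can be reproduced with a small sketch. It is plain Python, not Thrust: `sorted` and `itertools.accumulate` stand in for thrust::sort and thrust::inclusive_scan, the exclusive scan assumes an initial value of 0, and the key/value pair used for the sort-by-key case is made up for illustration.

```python
from itertools import accumulate

# Sort example from the slide (stand-in for thrust::sort).
keys = [4, 5, 6, 8, 7, 2, 1, 3]
sorted_keys = sorted(keys)

# Sum scan example from the slide (stand-in for thrust::inclusive_scan).
values = [4, 5, 6, 7, 8, 2, 1, 3]
inclusive = list(accumulate(values))

# Exclusive scan: each output is the sum of all *earlier* inputs.
# Assumes an initial value of 0.
exclusive = [0] + inclusive[:-1]

# Sort-by-key: sort the keys and carry the values along with them
# (stand-in for thrust::sort_by_key; the data here is hypothetical).
demo_keys = [3, 1, 2]
demo_values = ["c", "a", "b"]
pairs = sorted(zip(demo_keys, demo_values), key=lambda kv: kv[0])
by_key_keys = [k for k, _ in pairs]
by_key_values = [v for _, v in pairs]

print(sorted_keys)
print(inclusive)
print(exclusive)
print(by_key_keys, by_key_values)
```

Each of these operations has a well-known parallel formulation, which is why an operator written only in terms of them can run unchanged on different backends.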

Page 6: Halo Finder

Operators in PISTON

● Isosurface, cut-surface & threshold

Page 7: Halo Finder

Halos & Halo Finder

● What is a halo?
  ● Feature of interest found in cosmology data
  ● Cluster of particles
● Halo finder
  ● Important analysis operator
  ● Friend-of-friends (FOF) halo finder
    – linking length & halo size
● Motivation behind a data-parallel solution
  ● Increased amount of simulation data available & analysis needed

Page 8: Halo Finder

Naive Approach

● Compares every particle pair
● Requires O(n^2) comparisons

[Figure: particles A to G]
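A minimal sketch of the naive approach, under stated assumptions: it uses the seven sample particles that appear on the later k-d tree slides, links pairs with a small union-find structure (the slides do not say how groups are merged), and uses a linking length of 2.5 and a minimum halo size of 2, both chosen only so that the toy data forms a group. The function name `naive_fof` is hypothetical.

```python
from itertools import combinations

def naive_fof(points, linking_length, min_halo_size):
    """Group particles by comparing every pair (O(n^2) comparisons)."""
    names = list(points)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    comparisons = 0
    limit_sq = linking_length ** 2
    for a, b in combinations(names, 2):
        comparisons += 1
        dist_sq = sum((p - q) ** 2 for p, q in zip(points[a], points[b]))
        if dist_sq <= limit_sq:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Filter the groups by the halo size parameter.
    halos = sorted(sorted(g) for g in groups.values() if len(g) >= min_halo_size)
    return halos, comparisons

particles = {
    "A": (1, 1, 0), "B": (0, 4, 0), "C": (2, 6, 0), "D": (8, 3, 0),
    "E": (4, 2, 0), "F": (5, 5, 0), "G": (3, 4, 0),
}
halos, comparisons = naive_fof(particles, linking_length=2.5, min_halo_size=2)
print(halos, comparisons)
```

For n = 7 particles this performs n(n-1)/2 = 21 distance checks, which is the quadratic growth the slide refers to.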

Page 9: Halo Finder

Data-Parallel FOF Halo Finder Operator

● Balanced k-d tree from the particles
  ● A k-d tree is a space-partitioning data structure for organizing points in k-dimensional space
● Use the k-d tree to reduce the number of comparisons
● Implement using only the data-parallel primitives
  ● thrust::for_each, thrust::sort, thrust::transform, thrust::scatter, thrust::gather & thrust::copy
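For readers unfamiliar with the primitives listed above, here is a plain-Python sketch of what gather, scatter and transform compute. The helper names and the sample data are illustrative only; they mirror the semantics of the Thrust calls (gather reads through an index map, scatter writes through one) but are not the Thrust API.

```python
def gather(index_map, source):
    """result[i] = source[index_map[i]] (read through the map)."""
    return [source[i] for i in index_map]

def scatter(source, index_map, size):
    """result[index_map[i]] = source[i] (write through the map)."""
    result = [None] * size
    for value, i in zip(source, index_map):
        result[i] = value
    return result

def transform(source, fn):
    """Apply the same function to every element independently."""
    return [fn(v) for v in source]

data = [10, 20, 30, 40]
order = [2, 0, 3, 1]

gathered = gather(order, data)
restored = scatter(gathered, order, len(data))  # undoes the gather
doubled = transform(data, lambda v: 2 * v)

print(gathered, restored, doubled)
```

Every output element depends on one input element only, so each of these maps naturally onto one thread per element.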

Page 10: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

          A  B  D  C  F  E  G
X rank    1  0  6  2  5  4  3
Y rank    0  3  2  6  5  1  4
Z rank    0  1  2  3  5  4  6

K-d tree
  Level 0: 0 (A, B, C, D, E, F, G)
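The X and Y ranks in the table can be reproduced with a sort followed by a scatter. The sketch below is plain Python and assumes ties are broken by input order (a stable sort); the Z ranks are left out because every particle has z = 0, so their order depends entirely on the tie-breaking rule. The helper name `ranks` is hypothetical.

```python
def ranks(values):
    """Rank of each value along one axis, via argsort then scatter."""
    # Stable argsort: indices of the values in ascending order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    rank = [0] * len(values)
    for position, index in enumerate(order):
        rank[index] = position  # scatter the sorted position back
    return rank

names = ["A", "B", "C", "D", "E", "F", "G"]
xs = [1, 0, 2, 8, 4, 5, 3]
ys = [1, 4, 6, 3, 2, 5, 4]

x_rank = dict(zip(names, ranks(xs)))
y_rank = dict(zip(names, ranks(ys)))
print(x_rank)
print(y_rank)
```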

Page 11: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

Segment in X axis
Split value: 2.5 in X axis

          A  B  D  C  F  E  G
X rank    1  0  6  2  5  4  3
Y rank    0  3  2  6  5  1  4
Z rank    0  1  2  3  5  4  6

K-d tree
  Level 0: 0 (A, B, C, D, E, F, G)

Page 12: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

Segment in X axis
Split value: 2.5 in X axis

          A  B  C  D  F  E  G
X rank    1  0  2  6  5  4  3
Y rank    0  3  6  2  5  1  4
Z rank    0  1  3  2  5  4  6

K-d tree
  Level 0: 0
  Level 1: 1 (A, B, C), 2 (D, E, F, G)

Page 13: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

Segment in X axis
Split value: 2.5 in X axis

          A  B  C  D  F  E  G
X rank    1  0  2  3  2  1  0
Y rank    0  1  2  1  3  0  2
Z rank    0  1  2  0  2  1  3

K-d tree
  Level 0: 0
  Level 1: 1 (A, B, C), 2 (D, E, F, G)
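After the split, the ranks are recomputed within each segment rather than over all particles. A minimal plain-Python sketch of such a segmented rank follows; it sorts by (segment id, value) and counts positions per segment. The segment ids 1 and 2 are taken from the tree node numbers on the slide, and the helper name is hypothetical. Only X and Y are shown, since the Z ranks depend on how ties are broken.

```python
def segmented_ranks(values, segment_ids):
    """Rank each value among the values that share its segment id."""
    order = sorted(range(len(values)),
                   key=lambda i: (segment_ids[i], values[i]))
    rank = [0] * len(values)
    next_rank = {}  # next free rank per segment
    for index in order:
        seg = segment_ids[index]
        rank[index] = next_rank.get(seg, 0)
        next_rank[seg] = rank[index] + 1
    return rank

names = ["A", "B", "C", "D", "E", "F", "G"]
xs = [1, 0, 2, 8, 4, 5, 3]
ys = [1, 4, 6, 3, 2, 5, 4]
segments = [1, 1, 1, 2, 2, 2, 2]  # A, B, C in node 1; D, E, F, G in node 2

x_seg_rank = dict(zip(names, segmented_ranks(xs, segments)))
y_seg_rank = dict(zip(names, segmented_ranks(ys, segments)))
print(x_seg_rank)
print(y_seg_rank)
```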

Page 14: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

Segment in Y axis
Split value: 2.5 in Y axis (node 1)
Split value: 3.5 in Y axis (node 2)

          A  B  C  D  F  E  G
X rank    1  0  2  3  2  1  0
Y rank    0  1  2  1  3  0  2
Z rank    0  1  2  0  2  1  3

K-d tree
  Level 0: 0
  Level 1: 1 (A, B, C), 2 (D, E, F, G)

Page 15: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

Segment in Y axis
Split value: 2.5 in Y axis (node 1)
Split value: 3.5 in Y axis (node 2)

          A  B  C  D  E  F  G
X rank    1  0  2  3  1  2  0
Y rank    0  1  2  1  0  3  2
Z rank    0  1  2  0  1  2  3

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4 (B, C), 5 (D, E), 6 (F, G)

Page 16: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

Segment in Y axis
Split value: 2.5 in Y axis (node 1)
Split value: 3.5 in Y axis (node 2)

          A  B  C  D  E  F  G
X rank    0  0  1  1  0  1  0
Y rank    0  0  1  1  0  1  0
Z rank    0  0  1  0  1  0  1

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4 (B, C), 5 (D, E), 6 (F, G)

Page 17: Halo Finder

Balanced k-d tree Creation

[Figure: particles A to G]

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

          A  B  C  D  E  F  G
X rank    0  0  0  0  0  0  0
Y rank    0  0  0  0  0  0  0
Z rank    0  0  0  0  0  0  0

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4, 5, 6
  Level 3: 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)

At each k-d tree node, store the parent, child details, segment details & split value
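The construction walked through on the preceding slides can be sketched recursively. This is a sequential plain-Python illustration, not the data-parallel implementation: it assumes the split axis cycles X, Y, Z by depth, that the lower half of each segment goes left, and that the split value is the midpoint between the two middle particles. Those rules are inferred from the split values 2.5, 2.5 and 3.5 shown on the slides. The function name and dictionary layout are hypothetical.

```python
def build_kd_tree(points, names=None, depth=0):
    """Build a balanced k-d tree over named 3-D points (a sketch)."""
    if names is None:
        names = list(points)
    if len(names) == 1:
        return {"particles": list(names)}  # leaf node
    axis = depth % 3  # assumed: cycle X, Y, Z by depth
    ordered = sorted(names, key=lambda n: points[n][axis])
    mid = len(ordered) // 2
    # Assumed: split halfway between the two middle particles.
    split = (points[ordered[mid - 1]][axis] + points[ordered[mid]][axis]) / 2
    return {
        "particles": ordered,
        "axis": "XYZ"[axis],
        "split": split,
        "left": build_kd_tree(points, ordered[:mid], depth + 1),
        "right": build_kd_tree(points, ordered[mid:], depth + 1),
    }

particles = {
    "A": (1, 1, 0), "B": (0, 4, 0), "C": (2, 6, 0), "D": (8, 3, 0),
    "E": (4, 2, 0), "F": (5, 5, 0), "G": (3, 4, 0),
}
tree = build_kd_tree(particles)
print(tree["axis"], tree["split"])
print(sorted(tree["left"]["particles"]), sorted(tree["right"]["particles"]))
```

On the sample particles this yields the same segments as the slides: {A, B, C} and {D, E, F, G} at level 1, then {A}, {B, C}, {D, E} and {F, G} at level 2.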

Page 18: Halo Finder

Finding Halos

● Bottom-up approach
● At each level, consider all nodes in the level

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4, 5, 6
  Level 3: 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)

Page 19: Halo Finder

Finding Halos

● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles

Split value at node 0 is 2.5

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4, 5, 6
  Level 3: 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)

Page 20: Halo Finder

Finding Halos

● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
● Determine the particles within the linking length in the split axis

Split value at node 0 is 2.5
Linking length: 2

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4, 5, 6
  Level 3: 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)

Page 21: Halo Finder

Finding Halos

● Bottom-up approach
● At each level, consider all nodes in the level
● Look at the split value & segment particles
● Determine the particles within the linking length in the split axis
● Do m*n comparisons & determine halos
● Filter halos

Split value at node 0 is 2.5
Linking length: 2

Particles: A (1,1,0), B (0,4,0), C (2,6,0), D (8,3,0), E (4,2,0), F (5,5,0), G (3,4,0)

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4, 5, 6
  Level 3: 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
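The step at a single node can be sketched as follows, reading m and n as the number of candidate particles on each side of the split (an interpretation; the slides do not define them). The sketch is plain Python and sequential; the function name `cross_links` is hypothetical. It relies on the fact that a left particle and a right particle can only be within the linking length of each other if both are within the linking length of the split plane.

```python
def cross_links(points, left, right, axis, split, linking_length):
    """Find linked pairs that straddle a node's split plane (a sketch)."""
    # Candidates: particles within the linking length of the split
    # value, measured along the split axis only.
    near_left = [n for n in left if points[n][axis] >= split - linking_length]
    near_right = [n for n in right if points[n][axis] <= split + linking_length]
    limit_sq = linking_length ** 2
    # m*n full distance comparisons between the two candidate sets.
    links = [
        (a, b)
        for a in near_left
        for b in near_right
        if sum((p - q) ** 2 for p, q in zip(points[a], points[b])) <= limit_sq
    ]
    return near_left, near_right, links

particles = {
    "A": (1, 1, 0), "B": (0, 4, 0), "C": (2, 6, 0), "D": (8, 3, 0),
    "E": (4, 2, 0), "F": (5, 5, 0), "G": (3, 4, 0),
}
# Node 0 from the slides: split value 2.5 in X, linking length 2.
near_left, near_right, links = cross_links(
    particles, ["A", "B", "C"], ["D", "E", "F", "G"],
    axis=0, split=2.5, linking_length=2,
)
print(near_left, near_right, links)
```

With the slide's values, only A and C on the left and E and G on the right are candidates, so node 0 needs 2*2 = 4 comparisons instead of 3*4 = 12. None of those four pairs is within a distance of 2, so no halos are merged across node 0 for this toy data.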

Page 22: Halo Finder

Optimization - Use of Bounding Boxes

● Each node has a bounding box (BB) calculated by looking at its segment particles
● Use the BB to reduce the comparisons

K-d tree
  Level 0: 0
  Level 1: 1, 2
  Level 2: 3 (A), 4, 5, 6
  Level 3: 7 (B), 8 (C), 9 (D), 10 (E), 11 (F), 12 (G)
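The slides do not spell out how the bounding boxes are used, so the following is only one plausible reading: if the gap between two sibling nodes' boxes exceeds the linking length, no pair across them can be linked and the comparisons can be skipped. The helper names are hypothetical and the sample segments are the level-2 nodes from the slides.

```python
def bounding_box(points, names):
    """Axis-aligned bounding box (min corner, max corner) of a segment."""
    coords = [points[n] for n in names]
    lo = tuple(min(c[d] for c in coords) for d in range(3))
    hi = tuple(max(c[d] for c in coords) for d in range(3))
    return lo, hi

def box_gap_sq(box_a, box_b):
    """Squared minimum distance between two boxes (0 if they overlap)."""
    (lo_a, hi_a), (lo_b, hi_b) = box_a, box_b
    gap_sq = 0
    for d in range(len(lo_a)):
        gap = max(lo_a[d] - hi_b[d], lo_b[d] - hi_a[d], 0)
        gap_sq += gap * gap
    return gap_sq

def may_link(box_a, box_b, linking_length):
    """False means no pair across the two boxes can be linked."""
    return box_gap_sq(box_a, box_b) <= linking_length ** 2

particles = {
    "A": (1, 1, 0), "B": (0, 4, 0), "C": (2, 6, 0), "D": (8, 3, 0),
    "E": (4, 2, 0), "F": (5, 5, 0), "G": (3, 4, 0),
}
box3 = bounding_box(particles, ["A"])
box4 = bounding_box(particles, ["B", "C"])
box5 = bounding_box(particles, ["D", "E"])
box6 = bounding_box(particles, ["F", "G"])

print(may_link(box3, box4, 2))  # boxes are 3 apart in Y
print(may_link(box5, box6, 2))  # boxes are 1 apart in Y
```

With a linking length of 2, nodes 3 and 4 can be skipped outright, while nodes 5 and 6 still need their particles compared.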

Page 23: Halo Finder

Results

24474 particles

Page 24: Halo Finder

Results

24474 particles

Linking length: 0.2

Halo size: 100

Halos found: 10

Page 25: Halo Finder

Results

24474 particles

Linking length: 1.1

Halo size: 100

Halos found: 5

Page 26: Halo Finder

Results

Some preliminary results on halo finding using OpenMP

Number of   Number of   Timings                                                       Halos
particles   threads     k-d tree creation   Bounding box computation   Finding halos  found
21441       1           0.066s              0.00049s                   0.092s         14
            2           0.041s              0.00029s                   0.052s
            4           0.026s              0.00021s                   0.044s
42882       1           0.141s              0.0011s                    0.256s         23
            2           0.085s              0.0007s                    0.142s
            4           0.054s              0.0005s                    0.090s

Next steps

● Get this running on CUDA
● Compare this with the VTK halo finder implementation

Page 27: Halo Finder

Thank You.