high-performance cluster analysis implemented with parallel genetic algorithms on the nvidia cuda...

Upload: dariusz-cieslakiewicz

Post on 04-Apr-2018

  • 7/29/2019 High-performance cluster analysis implemented with Parallel Genetic Algorithms on the NVIDIA CUDA platform

    1/34

    Motivation

    Background

    Approach

    Results

    References

    High-performance asset cluster analysis implemented with Parallel Genetic Algorithms on the NVIDIA CUDA platform

    by Dariusz Cieslakiewicz (1), Diane Wilcox (1), Tim Gebbie (2)

    (1) Department of Computational and Applied Mathematics, University of the Witwatersrand

    (2) Investec Bank

    CHPC Conference, Durban, December 2012

    The financial assistance of the National Research Foundation (NRF) towards this research is hereby acknowledged. Opinions expressed and conclusions arrived at are those of the author and are not necessarily to be attributed to the NRF.

    Outline

    1 Motivation

    Non-parametric cluster analysis

    2 Background

    Genetic Algorithms

    Genetic operators

    Parallel Genetic Algorithms

    Computing Platform

    3 Approach

    Technical architecture

    4 Results

    Results

    5 References

    Non-parametric cluster analysis

    Cluster analysis

    Cluster analysis I

    Cluster analysis groups objects and describes their associations.

    Configuration of clusters is represented by a set $S = \{C_1, \ldots, C_i, C_{i+1}, \ldots, C_n\}$, where $C_i$ is the cluster that object $i$ belongs to, and which adheres to the following conditions:

    $C_i \neq \emptyset$ for $i = 1, \ldots, K$,

    $C_i \cap C_j = \emptyset$ for $i = 1, \ldots, K$, $j = 1, \ldots, K$ and $i \neq j$,

    and $\bigcup_{i=1}^{K} C_i = S$.

    [Giada and Marsili, 2002] addressed the problem of data clustering by introducing an unsupervised, parameter-free approach based on the maximum likelihood (ML) principle.

    Log-likelihood function I

    Log-likelihood function

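    [Giada and Marsili, 2002] computed the probability density $P(\{x_i\} \mid G, S)$ and used this to derive the maximum likelihood of a feature $L_c$, thus resulting in a structure $S$ as $P(G, S \mid x_i) \propto e^{D L(S)}$, where the resulting likelihood function is denoted by

    $$L_c(S) = \frac{1}{2} \sum_{s : n_s > 1} \left[ \log\frac{n_s}{c_s} + (n_s - 1) \log\frac{n_s^2 - n_s}{n_s^2 - c_s} \right] \qquad (1)$$

    The likelihood only depends on the Pearson's coefficient of data:

    $$c_{i,j} = \frac{\overline{x_i x_j}}{\sqrt{\overline{x_i^2}\,\overline{x_j^2}}} \qquad (2)$$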
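The log-likelihood in equation (1) can be evaluated directly from a correlation matrix and a cluster assignment. The following is a minimal Python sketch, not the authors' CUDA implementation. It assumes, following Giada and Marsili rather than anything stated on the slide, that $n_s$ is the number of objects in cluster $s$ and that $c_s$ is the sum of $c_{i,j}$ over all pairs in that cluster (diagonal included). The names `log_likelihood`, `corr` and `labels` are hypothetical, and skipping clusters whose $c_s$ falls outside $(n_s, n_s^2)$ is an assumption made to keep the logarithms defined.

```python
import math


def log_likelihood(corr, labels):
    """Evaluate L_c(S) for a cluster assignment.

    corr   -- square matrix (list of lists) of Pearson coefficients c_{i,j}
    labels -- labels[i] is the cluster that object i belongs to
    """
    # Group object indices by cluster label.
    clusters = {}
    for obj, cluster in enumerate(labels):
        clusters.setdefault(cluster, []).append(obj)

    total = 0.0
    for members in clusters.values():
        n_s = len(members)
        if n_s <= 1:
            continue  # the sum in equation (1) runs over n_s > 1 only
        # Assumed definition: intra-cluster correlation sum, diagonal included.
        c_s = sum(corr[i][j] for i in members for j in members)
        # Assumption of this sketch: treat clusters outside n_s < c_s < n_s^2
        # as contributing nothing, so both logarithms stay finite.
        if c_s <= n_s or c_s >= n_s * n_s:
            continue
        total += math.log(n_s / c_s)
        total += (n_s - 1) * math.log((n_s ** 2 - n_s) / (n_s ** 2 - c_s))
    return 0.5 * total


# Toy data: objects 0-1 and 2-3 form two correlated pairs.
toy_corr = [
    [1.0, 0.8, 0.0, 0.0],
    [0.8, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.8],
    [0.0, 0.0, 0.8, 1.0],
]
score_pairs = log_likelihood(toy_corr, [0, 0, 1, 1])
score_single = log_likelihood(toy_corr, [0, 0, 0, 0])
```

On this toy matrix the assignment that matches the two correlated pairs scores higher than lumping all four objects into one cluster, which is the behaviour a maximiser of $L_c$ relies on.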

    Genetic Algorithms

    What are Genetic Algorithms? I

    In 1975, Holland, in his book Adaptation in Natural and Artificial Systems, described how to apply the principles of natural evolution to optimization problems and built the first Genetic Algorithms.

    Genetic algorithms are computationally intensive global search heuristics.

    The search space contains all feasible solutions, among which the desired solution resides.

    Genetic operators I

    Selection

    Crossover

    Mutation

    Elitism

    Replacement
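To show how selection, elitism and replacement fit together in one generation, here is a minimal Python sketch. The slide names the operators but not their variants, so binary tournament selection and full generational replacement are assumptions of this example. `next_generation`, `make_child`, `n_elite` and `k` are hypothetical names, and `make_child` stands in for whatever crossover and mutation are applied.

```python
import random


def next_generation(population, fitness_fn, make_child, rng, n_elite=1, k=2):
    """Build one new generation of the same size as `population`."""
    fitness = [fitness_fn(ind) for ind in population]

    # Elitism: copy the n_elite fittest individuals over unchanged.
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    survivors = [list(population[i]) for i in order[:n_elite]]

    def select():
        # Selection (assumed variant): tournament of size k, fittest wins.
        contenders = rng.sample(range(len(population)), k)
        return population[max(contenders, key=lambda i: fitness[i])]

    # Replacement: the rest of the old population is replaced by children.
    children = [
        make_child(select(), select(), rng)
        for _ in range(len(population) - n_elite)
    ]
    return survivors + children


# Usage with a toy fitness (number of ones) and a fixed-cut child builder.
toy_rng = random.Random(1)
toy_pop = [[0, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 1]]
toy_next = next_generation(toy_pop, sum, lambda a, b, r: a[:2] + b[2:], toy_rng)
```

Because the elite is copied before any children are produced, the best fitness in the population cannot decrease from one generation to the next.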

    Cluster determination with GA

    Representation: Integer-based individual

    $Ch = \{C_{i,1}, C_{j,2}, \ldots, C_{l,K}\}$ for $i, j, l \in \{1, \ldots, K\}$, where $C_i$ is the cluster that the object at the ordinal position belongs to.

    Objective function:

    $$L_c(S) = \frac{1}{2} \sum_{s : n_s > 1} \left[ \log\frac{n_s}{c_s} + (n_s - 1) \log\frac{n_s^2 - n_s}{n_s^2 - c_s} \right] \qquad (3)$$

    Parallel Genetic Algorithms

    Parallelisation Schemes

    GAs lend themselves to parallelisation.

    Parallelisation schemes: master-slave and multiple-deme

    Master-slave models

    Multiple-deme PGAs

    Migration Topologies

    Migration

    Deme Size

    Migration Topologies: Stepping Stone, One-Way Ring, Ring, Ring+1+2 and Ring+1+2+3

    Migration rate

    Computing Platform

    NVIDIA CUDA platform

    1 Compute Unified Device Architecture (CUDA) is NVIDIA's platform for massively parallel high-performance computing on NVIDIA GPUs.

    2 Utilising Nsight Eclipse Edition 5 with CUDA 5 on Linux Mint 13, using two cards: GTX 560 Ti and GTX 660 Ti

    Technical architecture

    Physical View: Master-slave GA on the NVIDIA CUDA platform

    Physical View: Multiple-deme GA on the NVIDIA CUDA platform [Pospichal et al., 2010]

    Physical View: Migration of individuals between

    islands/demes [Pospichal et al., 2010]

    Genetic Operators

    Crossover Operator

    Single-Point, Two-Point and Custom Crossover implemented and executed at $P_c$ of 0.6 and 0.8-0.9.

    Custom Crossover utilises a Single-Point Crossover approach with a knowledge-directed operator applying the following measure:

    $$f = \arg\max \frac{1}{2} \sum_{s : n_s > 1} \left[ \log\frac{n_s}{c_s} + (n_s - 1) \log\frac{n_s^2 - n_s}{n_s^2 - c_s} \right]$$

    Mutation Operator

    Mutation implemented with $P_m$ at $\frac{1}{\text{genome length}}$.

    Replacement with randomly generated value.
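The single-point crossover and the mutation rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' templatised C++/CUDA code. Individuals are taken to be lists of integer cluster labels in 1..K, the default `p_c=0.8` is one value from the range quoted above, and the function names are hypothetical. The knowledge-directed Custom Crossover is not reproduced here because the slide does not give enough detail.

```python
import random


def single_point_crossover(parent_a, parent_b, rng, p_c=0.8):
    """Swap the tails of two parents at one random cut, with probability p_c."""
    if rng.random() >= p_c or len(parent_a) < 2:
        # No crossover this time: children are plain copies of the parents.
        return list(parent_a), list(parent_b)
    cut = rng.randrange(1, len(parent_a))  # cut strictly inside the genome
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b


def mutate(individual, n_clusters, rng):
    """Replace each gene with a random cluster label with P_m = 1/genome length."""
    p_m = 1.0 / len(individual)
    return [
        rng.randint(1, n_clusters) if rng.random() < p_m else gene
        for gene in individual
    ]


# Usage: cross two 5-gene parents, then mutate one child over K = 3 clusters.
demo_rng = random.Random(0)
kid_a, kid_b = single_point_crossover([1, 1, 1, 1, 1], [2, 2, 2, 2, 2], demo_rng, p_c=1.0)
mutant = mutate(kid_a, 3, demo_rng)
```

With $P_m$ set to one over the genome length, one gene per individual is expected to change on average, whatever the number of objects being clustered.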

    Migration Topologies

    Migration

    Deme size: variable

    Migration topologies: Stepping Stone, One-Way Ring and Ring

    Migration rate: variable (default: 2)
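As an illustration of the One-Way Ring topology with the default migration rate of 2, the Python sketch below moves individuals from each deme to its successor in the ring. The policy of sending each deme's fittest individuals and overwriting the receiver's least fit ones is an assumption of this example, since the slides do not state the emigrant or replacement policy. `migrate_one_way_ring` and its parameters are hypothetical names.

```python
def migrate_one_way_ring(demes, fitness_fn, rate=2):
    """Deme i sends `rate` individuals to deme (i + 1) mod N, in place."""
    # Choose all emigrants first so every deme sends from its
    # pre-migration population (assumed policy: the fittest emigrate).
    emigrants = [
        sorted(deme, key=fitness_fn, reverse=True)[:rate] for deme in demes
    ]
    for idx, deme in enumerate(demes):
        incoming = emigrants[(idx - 1) % len(demes)]  # predecessor in the ring
        deme.sort(key=fitness_fn)  # least fit first
        # Assumed policy: migrants overwrite the least fit residents.
        for slot, migrant in enumerate(incoming[:len(deme)]):
            deme[slot] = list(migrant)
    return demes


# Usage: three demes of three one-gene individuals, fitness = gene sum.
islands = [
    [[1], [2], [3]],
    [[10], [20], [30]],
    [[100], [200], [300]],
]
migrate_one_way_ring(islands, sum, rate=2)
```

Deme sizes are unchanged by migration: each deme gives up its `rate` weakest slots to copies of the previous deme's strongest individuals.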

    Implementation: Multiple-deme PGA I

    1 Generic Framework for PGAs comprising 50 code artefacts to research different aspects of PGAs on GPUs.

    2 Includes templatised C++ artefacts: Processing Context, Chromosome, Individual, SubPopulation, Population, Archipelago, etc.

    Implementation: Multiple-deme PGA II

    Results

    Master-Slave model

    Conclusion

    Successfully implemented PGAs on the GPU.

    Speedups of more than 20x are attainable.

    The Marsili-Giada log-likelihood function is a viable approach for isolating residual clusters.

    References

    Chen, D., Chen, W., and Zheng, W. (2012). CUDA-Zero: a framework for porting shared memory GPU applications to multi-GPUs. SCIENCE CHINA Information Sciences, 55(3):663-676.

    Coley, D. (1999). An Introduction to Genetic Algorithms for Scientists and Engineers. World Scientific.

    Giada, L. and Marsili, M. (2001). Data clustering and noise undressing of correlation matrices. Phys. Rev. E, 63:061101.

    Giada, L. and Marsili, M. (2002). Algorithms of maximum likelihood data clustering with applications. Physica A: Statistical Mechanics and its Applications, 315(3-4):650-664.

    Hoberock, J. and Bell, N. (2010). Thrust: A parallel template library. Version 1.3.0.

    Langdon, W. B. (2011). Debugging CUDA. In Proceedings of the 13th annual conference companion on Genetic and evolutionary computation, GECCO '11, pages 415-422, New York, NY, USA. ACM.

    Noh, J. (2000). Phys. Rev. E, 61.

    NVIDIA (2012a). CUDA Toolkit 4.1 CURAND Guide. NVIDIA Corporation.

    NVIDIA (2012b). NVIDIA CUDA C Programming Guide. NVIDIA Corporation.

    Pospichal, P., Jaros, J., and Schwarz, J. (2010). Parallel genetic algorithm on the CUDA architecture. In Proceedings of the 2010 international conference on Applications of Evolutionary Computation - Volume Part I, EvoApplicatons'10, pages 442-451, Berlin, Heidelberg. Springer-Verlag.

    Robilliard, D., Marion-Poty, V., and Fonlupt, C. (2009). Genetic programming on graphics processing units. Genetic Programming and Evolvable Machines, 10(4):447-471.

    Sivanandam, S. and Deepa, S. (2010). Introduction to Genetic Algorithms. Springer.

    Xiao, S. and Feng, W.-c. Inter-block GPU communication via fast barrier synchronization.

    Zhang, S. and He, Z. (2009). Implementation of parallel genetic algorithm based on CUDA. In Proceedings of the 4th International Symposium on Advances in Computation and Intelligence, ISICA '09, pages 24-30, Berlin, Heidelberg. Springer-Verlag.
