Learn How to Develop a Distributed Game of Life with DDS

Post on 15-Jan-2015

Category:

Technology

DESCRIPTION

The Game of Life is a famous zero-player game devised by John Conway more than forty years ago. It is a cellular automaton that runs on a rectangular grid of cells, each of which is either alive or dead. A set of four simple transition rules prescribes how the cells in the grid evolve from one generation to the next. Life, as it is called for short, has fascinated programmers ever since it was published, mostly because of the surprising behavior it can produce. Many programmers have written an implementation of Life at some point, whether in an educational context or just for fun. This webinar is intended as a combination of the two: education and fun. It shows how a distributed version of Life can be implemented using the Data Distribution Service (DDS) standard from the Object Management Group (OMG). The presented Distributed Life system can deal with real-life system requirements such as fault tolerance, scalability and deployment flexibility. By leveraging advanced DDS data-management features, system developers can off-load most of the complexity associated with distribution and fault tolerance onto the infrastructure and focus on the application logic.

TRANSCRIPT

Your systems. Working as one.

May 15, 2013
Reinier Torenbeek
reinier@rti.com

Learn How to Develop a Distributed Game of Life with DDS

Agenda

• Problem definition: Life Distributed
• A solution: RTI Connext DDS
• Applying DDS to Life Distributed: concepts
• Applying DDS to Life Distributed: (pseudo-)code
• Advanced Life Distributed: leveraging DDS
• Summary
• Questions and Answers

Conway's Game of Life

• Devised by John Conway in 1970
• Zero-player game
  – evolution determined by initial state
  – no further input required
• Plays in a two-dimensional, orthogonal grid of square cells
  – originally of infinite size
  – for this webinar, a toroidal array is used
• At any moment in time, each cell is either dead or alive
• Neighboring cells interact with each other
  – horizontally, vertically, or diagonally adjacent

Conway's Game of Life

At each step in time, the following transitions occur:

1. Any live cell with fewer than two live neighbors dies, as if caused by under-population.

2. Any live cell with two or three live neighbors lives on to the next generation.

3. Any live cell with more than three live neighbors dies, as if by overcrowding.

4. Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.

These rules continue to be applied repeatedly to create further generations.
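The four rules can be sketched in a few lines of Python. This is a minimal single-process illustration, not the webinar's code: the function names `step` and `from_rows` and the `#`/`.` text encoding are assumptions made for this example. The modulo arithmetic implements the toroidal array used in this webinar.

```python
def step(grid):
    """Return the next generation of a toroidal Life grid (list of lists of bool)."""
    rows, cols = len(grid), len(grid[0])
    nxt = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Count the eight neighbors, wrapping around the grid edges.
            live = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            if grid[r][c]:
                nxt[r][c] = live in (2, 3)   # rules 1-3: survive with 2 or 3 neighbors
            else:
                nxt[r][c] = live == 3        # rule 4: birth with exactly 3 neighbors
    return nxt


def from_rows(lines):
    """Build a grid from text rows, where '#' marks a live cell."""
    return [[ch == "#" for ch in line] for line in lines]


block = from_rows(["....", ".##.", ".##.", "...."])
blinker = from_rows([".....", ".....", ".###.", ".....", "....."])
```

A block is a still life, so `step(block)` returns the same grid, while the blinker returns to its starting state after two steps.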

Conway's Game of Life

[Figure: example Life patterns: glider gun, pulsar, block]

Conway's Game of Life – Distributed

Problem description: how can Life be properly implemented in a distributed fashion?
• have multiple processes work on parts of the Universe in parallel
• have these processes exchange the required information for the evolutionary steps

This problem and its solution serve as an example for developing distributed applications in general.

Conway's Game of Life – Distributed

Properly here means:
• with minimal impact on the application logic
  – let distribution artifacts be dealt with transparently
  – let the developer focus on Life and its algorithms
• allowing for mixed environments
  – multiple programming languages, operating systems and hardware
  – asymmetric processing power
• supporting scalability
  – for very large Life Universes on many machines
  – for load balancing of CPU-intensive calculations
• in a fault-tolerant fashion

RTI Connext DDS

A few words describing RTI Connext DDS:
• an implementation of the Object Management Group (OMG) Data Distribution Service (DDS)
  – standardized, multi-language API
  – standardized wire protocol
  – see www.rti.com/elearning for tutorials (some free)
• a high-performance, scalable, anonymous publish/subscribe infrastructure
• an advanced distributed data management technology
  – supporting many features known from DBMSs

RTI Connext DDS

DDS revolves around the concept of a typed data-space that
• consists of a collection of structured, observable items which
  – go through their individual lifecycle of creation, updating and deletion (CRUD)
  – are updated by Publishers
  – are observed by Subscribers
• is managed in a distributed fashion
  – by Connext libraries and (optionally) services
  – transparent to applications

RTI Connext DDS

DDS revolves around the concept of a typed data-space that
• allows for extensive fine-tuning
  – to adjust distribution behavior according to application needs
  – using standard Quality of Service (QoS) mechanisms
• can evolve dynamically
  – allowing Publishers and Subscribers to join and leave at any time
  – automatically discovering communication paths between Publishers and Subscribers

Applying DDS to Life Distributed

First step is to define the data-model in IDL
• cells are observable items, or "instances"
  – row and col identify their location in the grid
  – generation identifies the "tick nr" in evolution
  – alive identifies the state of the cell

    module life {
        struct CellType {
            long row;  //@key
            long col;  //@key
            unsigned long generation;
            boolean alive;
        };
    };

Applying DDS to Life Distributed

• the collection of all cells is the CellTopic Topic
  – cells exist side by side and form the Universe
  – conceptually stored "in the data-space"
  – in reality, local copies where needed

Example instance: row: 16, col: 4, generation: 25, alive: false

Applying DDS to Life Distributed

Each process is responsible for publishing the state of a certain subset of cells of the Universe:
• a rectangular or square area with corners (rowmin, colmin)_i and (rowmax, colmax)_i for process i
  – example: (1,1) – (10,10)
• each cell is individually updated using the write() call on a CellTopic DataWriter
  – middleware analyzes the key values (row, col) and maintains the individual states of all cells
• updating happens generation by generation
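The effect of the key fields can be illustrated with a small in-memory stand-in. This is only a sketch of the concept, not the DDS API: the names `Cell`, `CellStore` and `publish_sub_universe` are hypothetical. Writing a sample whose (row, col) key already exists updates that instance instead of creating a new one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: int
    col: int
    generation: int
    alive: bool


class CellStore:
    """In-memory stand-in for the keyed data-space (hypothetical, not a DDS class)."""

    def __init__(self):
        self._instances = {}

    def write(self, cell):
        # (row, col) is the key: a new sample replaces the state of that instance.
        self._instances[(cell.row, cell.col)] = cell

    def latest(self, row, col):
        return self._instances.get((row, col))

    def __len__(self):
        return len(self._instances)


def publish_sub_universe(store, row_min, col_min, row_max, col_max,
                         generation, alive_cells):
    """Write every cell of a rectangular sub-universe, one sample per cell."""
    for row in range(row_min, row_max + 1):
        for col in range(col_min, col_max + 1):
            store.write(Cell(row, col, generation, (row, col) in alive_cells))


store = CellStore()
publish_sub_universe(store, 1, 1, 10, 10, 0, {(5, 5)})
publish_sub_universe(store, 1, 1, 10, 10, 1, set())
```

After two generations the store still holds 100 instances, each carrying its most recent state.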

Applying DDS to Life Distributed

Each process subscribes to the required subset of cells in order to determine its current state:
• all neighboring cells, as well as its "own" cells
• using a SQL expression to identify the cells subscribed to (content-based filtering)
  – complexity is "Life-specific", not "DDS-specific"
  – example: "((row >= 1 AND row <= 11) OR row = 20) AND ((col >= 1 AND col <= 11) OR col = 20)"
• middleware will deliver cell updates to those DataReaders that are interested in them
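The filter expression for a sub-universe can be derived mechanically. The sketch below assumes 1-based indices on a toroidal Universe, which is inferred from the slide example (a 20 x 20 Universe with sub-universe (1,1) – (10,10)). The helper names `axis_clause` and `neighbor_filter` are hypothetical.

```python
def axis_clause(name, lo, hi, size):
    """Clause selecting indices lo-1 .. hi+1 on a 1-based toroidal axis of given size."""
    if hi - lo + 3 >= size:
        # Own range plus both neighbors already covers the whole axis.
        return f"({name} >= 1 AND {name} <= {size})"
    if lo == 1:
        # The neighbor "before" index 1 wraps around to the last index.
        return f"(({name} >= 1 AND {name} <= {hi + 1}) OR {name} = {size})"
    if hi == size:
        # The neighbor "after" the last index wraps around to index 1.
        return f"(({name} >= {lo - 1} AND {name} <= {size}) OR {name} = 1)"
    return f"({name} >= {lo - 1} AND {name} <= {hi + 1})"


def neighbor_filter(row_min, col_min, row_max, col_max, rows, cols):
    """Filter expression covering a sub-universe and all of its neighboring cells."""
    return (axis_clause("row", row_min, row_max, rows)
            + " AND "
            + axis_clause("col", col_min, col_max, cols))


expression = neighbor_filter(1, 1, 10, 10, 20, 20)
```

For the sub-universe (1,1) – (10,10) this reproduces the expression shown on the slide. The wrap-around logic lives entirely in the application, which is what "Life-specific, not DDS-specific" refers to.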

Applying DDS to Life Distributed

Additional processes can be added to peek at the evolution of Life:
• subscribing to (a subset of) the CellTopic
  – example: "row >= 8 AND row <= 13 AND col >= 8 AND col <= 13"
• using any supported language, OS, platform
  – C, C++, Java, C#, Ada
  – Windows, Linux, AIX, Mac OS X, Solaris, INTEGRITY, LynxOS, VxWorks, QNX…
• without changes to the existing applications
  – middleware discovers new topology and distributes updates accordingly

Life Distributed (pseudo-)code

Life Distributed prototype applications were developed on Mac OS X
• Life evolution application written in C
• Life observer application written in Python
  – using Python's extension API
• (Pseudo-)code covers the basic scenario only
  – more advanced aspects are covered in the next section

Life Distributed (pseudo-)code

Life evolution application written in C:
• application is responsible for
  – knowing about the Life seed (initial state of cells)
  – executing the Life rules based on cell updates coming from DDS
  – updating cell states after a full generation tick has been processed
• evolution of Life takes place one generation at a time
  – consequently, Life applications run in "lock-step"

    initialize DDS
    current generation = 0
    write sub-universe Life seed to DDS
    repeat
        repeat
            wait for DDS cell update for current generation
            update sub-universe with cell
        until 8 neighbors seen for all cells
        execute Life rules on sub-universe
        increase current generation
        write all new cell states to DDS
    until last generation reached
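The lock-step behavior of this loop can be tried out without any middleware. The sketch below is a single-threaded simulation under stated assumptions: a plain dictionary named `bus` stands in for DDS, and two hypothetical `LifeProcess` objects each own half of a 0-based toroidal Universe. Instead of blocking, `try_advance` returns False while updates are still missing.

```python
def neighbors(r, c, rows, cols):
    """The eight neighbors of (r, c) on a toroidal grid."""
    return [((r + dr) % rows, (c + dc) % cols)
            for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def rule(alive, live_neighbors):
    return live_neighbors in (2, 3) if alive else live_neighbors == 3


class LifeProcess:
    def __init__(self, bus, own_cells, rows, cols, seed):
        self.bus, self.own, self.rows, self.cols = bus, own_cells, rows, cols
        self.generation = 0
        # Cells that must be seen before evolving: own cells plus their neighbors.
        self.needed = set(own_cells)
        for r, c in own_cells:
            self.needed.update(neighbors(r, c, rows, cols))
        self.write({cell: cell in seed for cell in own_cells})

    def write(self, states):
        for cell, alive in states.items():
            self.bus[(cell, self.generation)] = alive

    def try_advance(self):
        """Evolve one generation if all needed cell updates have arrived."""
        g = self.generation
        if any((cell, g) not in self.bus for cell in self.needed):
            return False  # still waiting for updates from another process
        new_states = {}
        for r, c in self.own:
            live = sum(self.bus[(n, g)] for n in neighbors(r, c, self.rows, self.cols))
            new_states[(r, c)] = rule(self.bus[((r, c), g)], live)
        self.generation += 1
        self.write(new_states)
        return True


def run(rows, cols, seed, generations):
    """Run two processes (top and bottom half) and return the final live cells."""
    bus = {}
    top = [(r, c) for r in range(rows // 2) for c in range(cols)]
    bottom = [(r, c) for r in range(rows // 2, rows) for c in range(cols)]
    procs = [LifeProcess(bus, top, rows, cols, seed),
             LifeProcess(bus, bottom, rows, cols, seed)]
    while any(p.generation < generations for p in procs):
        for p in procs:
            if p.generation < generations:
                p.try_advance()
    return {cell for (cell, g), alive in bus.items() if g == generations and alive}


blinker = {(2, 1), (2, 2), (2, 3)}
```

With a 6 x 6 Universe the blinker crosses the boundary between the two processes, so each process depends on cell updates written by the other. Neither process can get more than one generation ahead of its neighbor.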

Life Distributed (pseudo-)code

Worth noting about the Life application:
• loss of one cell update will eventually stall the complete evolution
  – this is by nature of the Life algorithm
  – implies RELIABLE reliability QoS for DDS
  – a history of 2 generations needs to be stored to avoid overwriting the state of a single cell
• startup-order issues are resolved by the DDS durability QoS
  – newly joined applications will be delivered the current state
  – delivery of historical data is transparent to applications
  – applications are not waiting for other applications, but for cell updates

    <dds>
      <qos_library name="GameOfLifeQosLibrary">
        <qos_profile name="CellProfile">
          <topic_qos>
            <reliability>
              <kind>RELIABLE_RELIABILITY_QOS</kind>
            </reliability>
            <history>
              <kind>KEEP_LAST_HISTORY_QOS</kind>
              <depth>2</depth>
            </history>
            <durability>
              <kind>TRANSIENT_LOCAL_DURABILITY_QOS</kind>
            </durability>
          </topic_qos>
        </qos_profile>
      </qos_library>
    </dds>
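Why a depth of 2 is needed can be shown with a per-instance history buffer. In lock-step, a fast neighbor may already write generation g+1 of a cell while a slower process still has to read generation g. The class below is a hypothetical stand-in for KEEP_LAST history, built on `collections.deque`. It is not the DDS API.

```python
from collections import defaultdict, deque


class KeepLastCache:
    """Per-instance KEEP_LAST history (illustrative stand-in, not a DDS class)."""

    def __init__(self, depth):
        self._history = defaultdict(lambda: deque(maxlen=depth))

    def write(self, row, col, generation, alive):
        # Appending beyond maxlen silently drops the oldest sample of this instance.
        self._history[(row, col)].append((generation, alive))

    def read(self, row, col, generation):
        """Return the alive flag stored for that generation, or None if it is gone."""
        for gen, alive in self._history[(row, col)]:
            if gen == generation:
                return alive
        return None


shallow, deep = KeepLastCache(depth=1), KeepLastCache(depth=2)
for cache in (shallow, deep):
    cache.write(16, 4, 0, True)    # generation 0, not yet read by the slow process
    cache.write(16, 4, 1, False)   # the fast neighbor is already one tick ahead
```

With depth 1 the generation-0 sample has been overwritten, so `shallow.read(16, 4, 0)` returns None, while `deep.read(16, 4, 0)` still returns True.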

Life Distributed (pseudo-)code

Worth noting about the Life application:
• DDS cell updates come from different places
  – mostly from the application's own DataWriter
  – also from neighboring sub-Universes' DataWriters
  – all transparently arranged based on the filter
• algorithm relies on reading cell updates for a single generation
  – evolving one tick at a time
  – leverages DDS QueryCondition
  – "generation = %0" with the %0 value changing

    create DDS DomainParticipant
    with DomainParticipant, create DDS Topic "CellTopic"
    with CellTopic and filter expression, create DDS ContentFilteredTopic "FilteredCellTopic"
    with DomainParticipant, create DDS Subscriber
    with Subscriber and FilteredCellTopic, create DDS CellTopicDataReader
    with CellTopicDataReader, query expression and parameter list, create QueryCondition
    with DomainParticipant, create WaitSet
    attach QueryCondition to WaitSet
    in main loop:
        in generation loop:
            block thread in WaitSet, wait for data from DDS
            read with QueryCondition from CellTopicDataReader
        increase generation
        update query parameter list with new generation
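The idea of a query whose parameter changes per generation can be sketched as follows. `GenerationQuery` and `read_generations` are hypothetical names for this illustration. The real application blocks in a WaitSet until matching data arrives, which this sketch leaves out.

```python
class GenerationQuery:
    """Stand-in for the query "generation = %0" (not the DDS QueryCondition class)."""

    def __init__(self, generation=0):
        self.parameters = [generation]  # plays the role of %0

    def matches(self, sample):
        return sample["generation"] == self.parameters[0]


def read_generations(samples, last_generation):
    """Read the samples one generation at a time by updating the query parameter."""
    query = GenerationQuery(0)
    per_generation = []
    for generation in range(last_generation + 1):
        query.parameters[0] = generation
        per_generation.append([s for s in samples if query.matches(s)])
    return per_generation


samples = [
    {"row": 1, "col": 1, "generation": 1, "alive": False},
    {"row": 1, "col": 1, "generation": 0, "alive": True},
    {"row": 1, "col": 2, "generation": 0, "alive": False},
]
by_generation = read_generations(samples, 1)
```

Samples of a later generation that arrive early stay in the reader until the parameter reaches their generation.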

Life Distributed (pseudo-)code

Life observer application written in Python:
• application is responsible for
  – subscribing to cell updates
  – printing cell states to display evolution
  – ignoring any generations that have missing cell updates

    import clifedds as life

    # omitted option parsing

    filterString = 'row>={} and col>={} and row<={} and col<={}'.format(
        options.minRow, options.minCol, options.maxRow, options.maxCol)

    life.open(options.domainId, filterString)

    generation = 0
    while generation is not None:
        # read from DDS, block if nothing available;
        # returns Nones in case of time-out after 10 seconds
        row, col, generation, isAlive = life.read(10)

        # omitted administration for building and printing strings

    life.close()

Life Distributed (pseudo-)code

Life observer application written in Python:
• for minimal impact, the DataReader uses default QoS settings
  – BEST_EFFORT reliability, so updates might be lost
  – VOLATILE durability, so no delivery of historical updates
  – still a history depth of 2

    <dds>
      <qos_library name="GameOfLifeQosLibrary">
        <qos_profile name="CellProfile">
          <topic_qos>
            <history>
              <kind>KEEP_LAST_HISTORY_QOS</kind>
              <depth>2</depth>
            </history>
          </topic_qos>
        </qos_profile>
      </qos_library>
    </dds>
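The administration omitted from the observer could look roughly like the sketch below. It is an assumption about one possible approach, not the author's code: the generator `render_complete` is a hypothetical helper that collects updates per generation and yields a printable grid only when every cell in the observed window has been seen.

```python
from collections import defaultdict


def render_complete(updates, min_row, min_col, max_row, max_col):
    """Yield (generation, text) for each generation with no missing cell updates.

    updates is an iterable of (row, col, generation, alive) tuples, in the
    shape the observer's read call returns them.
    """
    expected = (max_row - min_row + 1) * (max_col - min_col + 1)
    pending = defaultdict(dict)  # generation -> {(row, col): alive}
    for row, col, generation, alive in updates:
        cells = pending[generation]
        cells[(row, col)] = alive
        if len(cells) == expected:
            lines = ["".join("#" if cells[(r, c)] else "."
                             for c in range(min_col, max_col + 1))
                     for r in range(min_row, max_row + 1)]
            yield generation, "\n".join(lines)
            # Drop this generation and any older, incomplete ones.
            for g in [g for g in pending if g <= generation]:
                del pending[g]


updates = [
    (1, 1, 0, True), (1, 2, 0, False), (2, 1, 0, False), (2, 2, 0, True),
    (1, 1, 1, True), (1, 2, 1, True), (2, 1, 1, False),   # (2, 2) was lost
    (1, 1, 2, False), (1, 2, 2, False), (2, 1, 2, True), (2, 2, 2, True),
]
printed = list(render_complete(updates, 1, 1, 2, 2))
```

Generation 1 is skipped because one of its updates never arrived, which matches the best-effort behavior described above.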

Life Distributed (pseudo-)code

Example of running a single life application:
[Screenshot; annotated command-line arguments: domainId, number of generations, universe dimensions, sub-universe dimensions]

Life Distributed (pseudo-)code

Example of running a life scenario:
[Screenshot]

Fault tolerance

Not all is lost if a Life application crashes
• if using TRANSIENT durability QoS, the infrastructure keeps the cell state available independently of the application
  – requires an extra service, the persistence service, to run (possibly redundantly)
• after restart, the current status is available automatically
  – the new incarnation can continue seamlessly
• results in high robustness
• even more advanced QoS settings are possible

Reliability and flow control

Running the Python app with a larger grid:
• whenever at least one cell update is missing, the generation is not printed (by design)
• with the current QoS, a faster writer with a slower reader will overwrite samples in the reader
• this is often the desired result, to avoid system-wide impact of asymmetric processing power
• if not desired, KEEP_ALL QoS can be leveraged
  – flow control will slow down the writer to avoid loss

    <dds>
      <qos_library name="GameOfLifeQosLibrary">
        <qos_profile name="CellProfile">
          <topic_qos>
            <reliability>
              <kind>RELIABLE_RELIABILITY_QOS</kind>
            </reliability>
            <history>
              <kind>KEEP_ALL_HISTORY_QOS</kind>
            </history>
            <resource_limits>
              <max_samples_per_instance>2</max_samples_per_instance>
            </resource_limits>
            <durability>
              <kind>TRANSIENT_LOCAL_DURABILITY_QOS</kind>
            </durability>
          </topic_qos>
        </qos_profile>
      </qos_library>
    </dds>
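The difference between the two history settings can be imitated with standard-library containers. This is an analogy rather than DDS code: a `deque` with `maxlen` overwrites old samples like KEEP_LAST, while a bounded `queue.Queue` blocks the writer when full, similar in spirit to the flow control that KEEP_ALL with a resource limit provides. The function names are hypothetical.

```python
import queue
import threading
from collections import deque


def keep_last_run(n, depth=2):
    """Writer outpaces the reader: only the newest `depth` samples survive."""
    cache = deque(maxlen=depth)
    for generation in range(n):
        cache.append(generation)  # never blocks, overwrites the oldest sample
    return list(cache)


def keep_all_run(n, max_samples=2):
    """Writer is slowed down by the reader: no sample is lost."""
    channel = queue.Queue(maxsize=max_samples)
    received = []

    def reader():
        while True:
            sample = channel.get()
            if sample is None:  # sentinel: writer is done
                return
            received.append(sample)

    thread = threading.Thread(target=reader)
    thread.start()
    for generation in range(n):
        channel.put(generation)  # blocks while the queue is full
    channel.put(None)
    thread.join()
    return received
```

`keep_last_run(10)` leaves only the last two samples, whereas `keep_all_run(10)` delivers all ten in order, at the cost of the writer waiting for the reader.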

More advanced problem solving

Other ways to improve the Life implementation:
• for centralized grid configuration, distribute grid sizes with DDS
  – with TRANSIENT or PERSISTENT QoS
  – this isolates configuration features to one single app
  – dynamic grid reconfiguration can be done by re-publishing grid sizes
• for centralized seed (generation 0) management, distribute the seed with DDS
  – with TRANSIENT or PERSISTENT QoS
  – this isolates seeding to one single app

More advanced problem solving

Other ways to improve the Life implementation:
• in addition to separate cells, distribute complete sub-Universe state using a more compact data-type
  – DDS supports a very rich set of data-types; a bitmap-like type would work well
  – especially useful for very large-scale Universes
  – can be used for seeding as well
  – with TRANSIENT QoS
• multiple Universes can exist and evolve side by side using Partitions
  – only readers and writers that have a Partition in common will interact
  – Partitions can be added and removed on the fly
  – Partitions are string names, allowing good flexibility
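The Partition matching rule can be illustrated with a few lines of Python. This is a simplified sketch: `Endpoint` and `matching_readers` are hypothetical names, only exact name matches are considered, and details such as the default partition are left out.

```python
class Endpoint:
    """A reader or writer together with the Partition names it belongs to."""

    def __init__(self, name, partitions):
        self.name = name
        self.partitions = set(partitions)


def matching_readers(writer, readers):
    """Names of the readers that share at least one Partition with the writer."""
    return [r.name for r in readers if writer.partitions & r.partitions]


writer_a = Endpoint("life-A", ["UniverseA"])
readers = [
    Endpoint("observer-A", ["UniverseA"]),
    Endpoint("observer-B", ["UniverseB"]),
    Endpoint("observer-all", ["UniverseA", "UniverseB"]),
]
```

Here `matching_readers(writer_a, readers)` selects the observers of Universe A only, so a second Universe can evolve on the same Topic without interference.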

Summary

Distributed Life can be properly implemented leveraging DDS
• communication complexity is off-loaded to the middleware, so the developer can focus on the application
• advanced QoS settings allow for adjustment to requirements and deployment characteristics
• DDS features simplify extending Distributed Life beyond its basic implementation
• all of this in a standardized, multi-language, multi-platform environment with an infrastructure built to scale and perform

Questions?

Thanks!
