UW Bothell Computing & Software Systems
CSS 700: MASS CUDA Parallel-Computing Library for Multi-Agent Spatial Simulation
Fall Quarter 2014
Nathaniel Hart
Project Milestones
M1: The ability to execute simulations that fit on a single device. Initial focus is on creating the command & control logic, with hooks to add multi-device functionality later.
M2: The ability to execute simulations that fit on multiple devices. This will require extensible border exchange logic.
M3: The ability to execute simulations that exceed the memory of a host’s available devices. This will require a great deal of partition and border exchange logic.
Quarter Goals (M1)
Date Range: 9/15/14 - 10/15/14
Activity:
• Specify MASS CUDA architecture and single-GPU implementation
Deliverable:
• Specifications for algorithms and design choices for phase I implementation
Date Range: 10/16/14 - 12/15/14
Activity:
• Implement part I of MASS Agents
• Design multi-GPU communication algorithms
Deliverable:
• A minimally viable version of MASS CUDA that can run a simulation whose size does not exceed the available memory on a single GPU
• A version of the MASS CUDA library suitable for use in CSS 534
• Performance statistics for this initial implementation
• Specifications for algorithms and design choices for phase II implementation
Project Status
[Charts: "M1 Status" and "Code Completion", each divided into Finished and Unfinished]
Status Details
• Still need to lock down final ghost space solution for Agents
• Debugging is now possible on Hercules
• Need to talk to Jason to figure out how to remote in via GUI
Blocking Issues
• Hercules 2 is still not operational. Depends on Chris Fox to unblock.
• Places will not instantiate properly
Virtual Functions
Image Source: http://www.learncpp.com/cpp-tutorial/125-the-virtual-table/
The Workaround
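// Place is taken to be the base class of PlaceImpl, so the array holds Place pointers.
__global__ void createPlaces(Place **places, int qty) {
    // One thread per place: compute this thread's global index.
    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < qty) {
        // Constructing the object on the GPU creates its virtual
        // function table pointer on the GPU, where it is valid.
        places[idx] = new PlaceImpl();
    }
}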
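The `idx < qty` guard matters because a launch normally rounds the thread count up to a whole number of blocks, so some threads have no place to create. A minimal host-side sketch of that rounding in plain C++; the helper names (`blocksFor`, `globalIndex`, `activeThreads`) and the block size of 256 are illustrative assumptions, not part of MASS CUDA:

```cpp
#include <cassert>

// Hypothetical helper: smallest block count such that
// blocks * threadsPerBlock >= qty (ceiling division).
inline int blocksFor(int qty, int threadsPerBlock) {
    assert(qty >= 0 && threadsPerBlock > 0);
    return (qty + threadsPerBlock - 1) / threadsPerBlock;
}

// Mirrors the kernel's index computation for one (block, thread) pair.
inline int globalIndex(int blockIdx, int blockDim, int threadIdx) {
    return blockIdx * blockDim + threadIdx;
}

// Counts how many of the launched threads pass the `idx < qty` guard.
inline int activeThreads(int qty, int threadsPerBlock) {
    int active = 0;
    const int blocks = blocksFor(qty, threadsPerBlock);
    for (int b = 0; b < blocks; ++b) {
        for (int t = 0; t < threadsPerBlock; ++t) {
            if (globalIndex(b, threadsPerBlock, t) < qty) {
                ++active;
            }
        }
    }
    return active;
}
```

For 1000 places and 256 threads per block this yields 4 blocks (1024 threads), of which exactly 1000 pass the guard. Under these assumptions the launch would read `createPlaces<<<blocksFor(qty, 256), 256>>>(places, qty)`.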
Old Architecture
[Diagram] Components, top to bottom:
• User Application
• Dispatcher
• Places, Mass, Agents
• Places Partition[0] ... Places Partition[n]
• Agents Partition[0] ... Agents Partition[n]
• DeviceConfig[0], DeviceConfig[1], ... DeviceConfig[n]
Layer labels: API, Model Layer, User Layer, Command & Control Layer, Hardware
New Architecture
[Diagram] Components:
• User Application
• Places, Mass, Agents
• Dispatcher
• DeviceConfig [0], DeviceConfig [1]
• Model (PlacesModel, AgentsModel)
Arrow labels: API calls & return values; Passes Calls To; Returns State; Manipulates
The New Problem
• Instances created on the GPU cannot be copied to the host and used for analysis.
• Just as copying an object from the host to the GPU breaks its virtual function table, copying it from the GPU to the host breaks it too.
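One way to see the problem: a class with virtual functions carries a hidden pointer to its virtual function table, and that pointer is only meaningful in the address space where the object was constructed. A minimal host-only C++ sketch using standard type traits; `Place`, `PlaceImpl`, `PlaceState`, and `byteCopyable` are simplified stand-ins chosen for illustration, not the library's real classes:

```cpp
#include <cassert>
#include <type_traits>

// Plain data only: no virtual functions, so no hidden table pointer.
struct PlaceState {
    int x;
    int agentCount;
};

// Polymorphic base: every object carries a pointer to a virtual table.
class Place {
public:
    virtual ~Place() = default;
    virtual void update() = 0;
};

class PlaceImpl : public Place {
public:
    void update() override { ++state.x; }
    PlaceState state{0, 0};
};

// True only for types whose bytes can be copied verbatim (memcpy-style).
template <class T>
constexpr bool byteCopyable() {
    return std::is_trivially_copyable<T>::value;
}

static_assert(std::is_polymorphic<PlaceImpl>::value,
              "PlaceImpl has virtual functions");
static_assert(!byteCopyable<PlaceImpl>(),
              "objects with virtual functions are not safe to copy byte-for-byte");
static_assert(byteCopyable<PlaceState>(),
              "plain state can be copied byte-for-byte");
```

The language itself treats the polymorphic object as unsafe for a raw byte copy, while the plain state struct is fine.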
The New Workaround
Place
+ void callAll(int funcID, void *arg)
+ void update( )
+ etc....
- State *myState

State
+ int x
+ Agent *agents[10]
+ etc....
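The split in this class diagram could look roughly as follows in host-only C++: behavior lives in a polymorphic `Place`, and all data lives in a plain `State` struct that the place reaches through a pointer. The `Agent` stub, the `attach` method, the `CounterPlace` subclass, and the meaning of `funcID == 0` are assumptions made for illustration; `myState` is declared protected here (rather than private) so the derived class can reach it.

```cpp
#include <cassert>
#include <type_traits>

struct Agent { int id; };  // stub standing in for the real Agent class

// State: data only, so it can be copied byte-for-byte.
struct State {
    int x;
    Agent *agents[10];
};
static_assert(std::is_trivially_copyable<State>::value,
              "State must stay free of virtual functions");

// Place: behavior only; it owns no data besides the pointer to its State.
class Place {
public:
    virtual ~Place() = default;
    void attach(State *s) { myState = s; }  // hypothetical binding step
    virtual void callAll(int funcID, void *arg) = 0;
    virtual void update() = 0;
protected:
    State *myState = nullptr;
};

// Hypothetical user-defined place: funcID 0 adds *arg to x.
class CounterPlace : public Place {
public:
    void callAll(int funcID, void *arg) override {
        if (funcID == 0 && arg != nullptr) {
            myState->x += *static_cast<int *>(arg);
        }
    }
    void update() override { ++myState->x; }
};

// Runs both behaviors against one State and returns the resulting x.
inline int demoSeparation() {
    State s{};          // zero-initialized: x == 0, agents all null
    CounterPlace p;
    p.attach(&s);
    int delta = 5;
    p.callAll(0, &delta);  // x becomes 5
    p.update();            // x becomes 6
    return s.x;
}
```

Because every mutation goes through `myState`, the `State` struct holds everything an analysis needs and the `Place` object itself never has to leave the side it was created on.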
The New Workaround
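[Diagram: mirrored arrays on host and device, five elements each]
HOST
Place **h_ptrs: placeObjects [0] [1] [2] [3] [4]
MyState *h_state: [0] [1] [2] [3] [4]
DEVICE
Place **d_ptrs: placeObjects [0] [1] [2] [3] [4]
MyState *d_state: [0] [1] [2] [3] [4]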
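A host-only sketch of this mirrored layout: each side keeps its own place objects plus a contiguous state array, and only the state array is copied across. Here a second `std::vector` stands in for device memory and `std::memcpy` stands in for the device-to-host transfer (in CUDA that step would be a `cudaMemcpy` with `cudaMemcpyDeviceToHost`). The `Side` struct, `attach`, `copyState`, and the single `x` field are illustrative assumptions:

```cpp
#include <cassert>
#include <cstring>
#include <memory>
#include <vector>

struct MyState { int x; };  // plain data, safe to copy byte-for-byte

class Place {
public:
    virtual ~Place() = default;
    virtual void update() = 0;
    void attach(MyState *s) { myState = s; }  // hypothetical binding step
    int x() const { return myState->x; }
protected:
    MyState *myState = nullptr;
};

class PlaceImpl : public Place {
public:
    void update() override { myState->x += 1; }
};

// One side of the mirror (host, or the simulated device): place objects
// built locally, each bound to its slot in the local state array.
struct Side {
    std::vector<MyState> state;
    std::vector<std::unique_ptr<Place>> ptrs;
    explicit Side(int qty) : state(qty) {  // state is zero-initialized
        for (int i = 0; i < qty; ++i) {
            ptrs.emplace_back(new PlaceImpl());
            ptrs[i]->attach(&state[i]);
        }
    }
};

// Only the state bytes cross between sides; place objects never move.
inline void copyState(const Side &from, Side &to) {
    assert(from.state.size() == to.state.size());
    std::memcpy(to.state.data(), from.state.data(),
                from.state.size() * sizeof(MyState));
}

// Runs three update steps on the "device", copies state back, and reads
// the results through the host-side place objects.
inline std::vector<int> demoMirror() {
    Side device(5), host(5);
    for (int step = 0; step < 3; ++step) {
        for (auto &p : device.ptrs) p->update();
    }
    copyState(device, host);
    std::vector<int> xs;
    for (auto &p : host.ptrs) xs.push_back(p->x());
    return xs;
}
```

The host-side objects keep their own valid virtual function tables and simply see the new data once the state array has been overwritten.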
Big TODOs
1. Refactor existing code to allow for place instantiation and separation of behavior and state
2. Implement all control kernel functions
3. Write more tests
?