INTEL® HPC DEVELOPER CONFERENCE

Fuel Your Insight

Large-scale Distributed Rendering with the OSPRay Ray Tracing Framework

Carson Brownlee

Shared-memory

Distributed-memory

Why MPI?

Data that exceeds the memory limits of a single node

Performance limitations

Tiled displays

In Situ

Strong Scaling

Weak Scaling

High Fidelity Rendering

Related Work

Sending Rays, Kilauea - Kato '01, '02, '03

Interactive Ray Tracing on Clusters - Wald et al. '03

Distributed Shared Memory - DeMarle et al. '03

IceT Compositing - Moreland et al. '11

Multiple Device

API commands are processed on the appropriate active device. This provides a modular backend for processing API calls. Currently these include:

1. Local

2. MPI

3. COI (Now deprecated in favor of MPI)

Using OSPRay MPI

Compile

OSPRAY_BUILD_MPI_DEVICE=ON

Requires MPI Library with multi-threading support (IMPI recommended)

OSPRAY_EXP_DATA_PARALLEL=ON (experimental)

Run

mpirun -n 3 ./ospGlutViewer --osp:mpi teapot.obj (mpirun args vary)

mpirun -ppn 1 -n 1 -host localhost ./ospGlutViewer --osp:mpi teapot.obj : -n 2 -host n1,n2 ./ospray_mpi_worker --osp:mpi

ParaView

VTKOSPRAY_ARGS="--osp:mpi" mpirun -ppn 1 -n 1 -host localhost ./paraview : -n 1 -host n1,n2 ./ospray_mpi_worker --osp:mpi

Distributed Framebuffer

Data replicated and Data distributed

Tile ownership

Stores accumulation buffer locally

Pixel Operations

Processed tiles with framebuffer colors sent to display node
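A minimal sketch of the tile-ownership idea above, not OSPRay's actual distributed framebuffer code: the rank that owns a tile keeps the accumulation buffer locally, applies a pixel operation, and produces packed colors that could then be sent to the display node. The names `TileAccum`, `Color`, and `kTileSize`, and the choice of averaging plus 8-bit packing as the pixel operation, are assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical tile edge length; OSPRay's real TILE_SIZE may differ.
constexpr int kTileSize = 4;

// One RGBA sample in floating point.
struct Color { float r, g, b, a; };

// Accumulation state kept only on the rank that owns the tile.
struct TileAccum {
  std::array<Color, kTileSize * kTileSize> sum{};  // running per-pixel sums
  int frames = 0;                                  // frames accumulated so far

  // Add one rendered frame's samples for this tile.
  void accumulate(const std::array<Color, kTileSize * kTileSize> &frame) {
    for (std::size_t i = 0; i < sum.size(); ++i) {
      sum[i].r += frame[i].r;
      sum[i].g += frame[i].g;
      sum[i].b += frame[i].b;
      sum[i].a += frame[i].a;
    }
    ++frames;
  }

  // Pixel operation: average the samples, clamp to [0, 1], pack as 8-bit RGBA.
  std::uint32_t packed(int pixel) const {
    assert(frames > 0 && pixel >= 0 && pixel < kTileSize * kTileSize);
    const float inv = 1.0f / static_cast<float>(frames);
    auto to8 = [inv](float v) {
      const float c = std::min(1.0f, std::max(0.0f, v * inv));
      return static_cast<std::uint32_t>(c * 255.0f + 0.5f);
    };
    const Color &s = sum[static_cast<std::size_t>(pixel)];
    return to8(s.r) | (to8(s.g) << 8) | (to8(s.b) << 16) | (to8(s.a) << 24);
  }
};
```

Only the packed 32-bit colors would need to travel to the display node; the floating-point sums stay with the owner, which is what keeps per-frame traffic small in this sketch.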

Tiling Pseudocode

Load Balancing

Static load balancing

Tiles are strided to avoid work imbalance

1 2 3 1 2 3

2 3 1 2 3 1

3 1 2 3 1 2
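The grid above can be reproduced with a diagonal stride. This is a sketch under the assumption that the owner is `(x + y) % numRanks`; that formula is inferred from the slide's pattern and is not taken from OSPRay's source, and `tileOwner` and `ownerRow` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Owner (0-based) of tile (x, y) among numRanks ranks.
// Assumption: a diagonal stride, chosen because it matches the grid on the slide.
int tileOwner(int x, int y, int numRanks) {
  assert(x >= 0 && y >= 0 && numRanks > 0);
  return (x + y) % numRanks;
}

// Format one row of the ownership grid with 1-based rank labels, as on the slide.
std::string ownerRow(int y, int tilesX, int numRanks) {
  std::string row;
  for (int x = 0; x < tilesX; ++x) {
    if (x > 0) row += ' ';
    row += std::to_string(tileOwner(x, y, numRanks) + 1);
  }
  return row;
}
```

With 6 tiles per row and 3 ranks, a plain round-robin over the linear tile index would give every row the same owners, so expensive image columns would always land on the same rank. Offsetting by the row index avoids that.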

Work API Comm

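API:

ospRenderFrame() {…}

MPIDevice:

MPIDevice::renderFrame()
{
  // Wrap the API call in a work item, then hand it to processWork()
  work::RenderFrame work(_fb, _renderer, fbChannelFlags);
  processWork(&work);
}

Work:

void RenderFrame::serialize(SerialBuffer &b) const {
  // The two handles and the channel flags are all a worker needs to replay the call
  b << (int64)fbHandle << (int64)rendererHandle << channels;
}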

Work API Comm

Work:

void RenderFrame::run() {
  // Resolve the received handles into local objects, then render
  FrameBuffer *fb = (FrameBuffer*)fbHandle.lookup();
  Renderer *renderer = (Renderer*)rendererHandle.lookup();
  renderer->renderFrame(fb, channels);
}

Worker:

// Receive the pending work commands and execute them in order
mpi::recv(mpi::Address(&mpi::app, (int32)mpi::RECV_ALL), workCommands);
for (work::Work *&w : workCommands)
  w->run();
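The serialize-then-run pattern on these two slides can be tried without MPI. The following is a self-contained sketch: `SerialBuffer` here is a hypothetical stand-in for the class named on the slide, `RenderFrameCmd` and its `deserialize` method are invented for illustration, and the handles are plain integers rather than OSPRay object handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical byte buffer with stream-style operators.
class SerialBuffer {
 public:
  // Append the raw bytes of a trivially copyable value.
  template <typename T>
  SerialBuffer &operator<<(const T &v) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy only");
    const auto *p = reinterpret_cast<const std::uint8_t *>(&v);
    bytes_.insert(bytes_.end(), p, p + sizeof(T));
    return *this;
  }

  // Read the next value back in the order it was written.
  template <typename T>
  SerialBuffer &operator>>(T &v) {
    static_assert(std::is_trivially_copyable<T>::value, "raw copy only");
    assert(readPos_ + sizeof(T) <= bytes_.size());
    std::memcpy(&v, bytes_.data() + readPos_, sizeof(T));
    readPos_ += sizeof(T);
    return *this;
  }

  std::size_t size() const { return bytes_.size(); }

 private:
  std::vector<std::uint8_t> bytes_;
  std::size_t readPos_ = 0;
};

// Illustrative work item carrying the same three fields as the slide's RenderFrame.
struct RenderFrameCmd {
  std::int64_t fbHandle = 0;
  std::int64_t rendererHandle = 0;
  std::uint32_t channels = 0;

  void serialize(SerialBuffer &b) const {
    b << fbHandle << rendererHandle << channels;
  }
  void deserialize(SerialBuffer &b) {
    b >> fbHandle >> rendererHandle >> channels;
  }
};
```

The sender and receiver must agree on field order and widths. A raw byte copy like this also assumes both sides share endianness, which usually holds within a homogeneous cluster.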

Async Comm Layer

Actions are separated into

receive queue

process queue

send queue
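One way to picture the three queues is as hand-off points between threads. The sketch below uses only the C++ standard library and is not OSPRay's comm layer: `BlockingQueue` and `runPipeline` are hypothetical names, messages are plain integers, and "processing" simply doubles each value.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Minimal thread-safe FIFO that can be closed once no more items will arrive.
template <typename T>
class BlockingQueue {
 public:
  void push(T v) {
    {
      std::lock_guard<std::mutex> lock(m_);
      q_.push(std::move(v));
    }
    cv_.notify_one();
  }

  void close() {
    {
      std::lock_guard<std::mutex> lock(m_);
      closed_ = true;
    }
    cv_.notify_all();
  }

  // Blocks until an item is available; returns false once closed and drained.
  bool pop(T &out) {
    std::unique_lock<std::mutex> lock(m_);
    cv_.wait(lock, [this] { return closed_ || !q_.empty(); });
    if (q_.empty()) return false;
    out = std::move(q_.front());
    q_.pop();
    return true;
  }

 private:
  std::mutex m_;
  std::condition_variable cv_;
  std::queue<T> q_;
  bool closed_ = false;
};

// Pass messages through receive -> process -> send, one thread per stage.
std::vector<int> runPipeline(const std::vector<int> &incoming) {
  BlockingQueue<int> receiveQ, processQ, sendQ;
  std::vector<int> sent;

  std::thread receiver([&] {  // moves arrived messages to the process queue
    int msg;
    while (receiveQ.pop(msg)) processQ.push(msg);
    processQ.close();
  });
  std::thread processor([&] {  // handles each message, queues the result
    int msg;
    while (processQ.pop(msg)) sendQ.push(msg * 2);
    sendQ.close();
  });
  std::thread sender([&] {  // drains the send queue (stands in for the network)
    int msg;
    while (sendQ.pop(msg)) sent.push_back(msg);
  });

  for (int msg : incoming) receiveQ.push(msg);  // stands in for message arrival
  receiveQ.close();
  receiver.join();
  processor.join();
  sender.join();
  return sent;
}
```

Because each stage blocks only on its own queue, a slow send does not stall message reception, which is the point of separating the actions.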

Async Comm Layer

struct MasterTileMessage : public mpi::async::CommLayer::Message {
  vec2i coords;
  float error;
  uint32 color[TILE_SIZE][TILE_SIZE];
};

void DFB::incoming(mpi::async::CommLayer::Message *_msg) {
  // Dispatch on the command tag carried by the message
  switch (_msg->command) {
  case MASTER_WRITE_TILE_NONE:
    this->processMessage((MasterTileMessage_NONE*)_msg);
    break;
  }
}

Distributed Data

Currently experimental and only for Volume data

env var OSPRAY_DATA_PARALLEL=blockXxBlockYxBlockZ

Data is projected onto tiles; all nodes determine tile overlap

Tiles sent to owning node for compositing
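A sketch of what the owning node might do with the tiles it receives, for a single pixel. The slide does not name the compositing operator, so this assumes front-to-back "over" blending of premultiplied colors, ordered by each block's depth; `Fragment` and `compositePixel` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One rank's contribution to a pixel: premultiplied color, opacity,
// and the depth of the data block it came from.
struct Fragment {
  float r, g, b, a;
  float depth;
};

// Blend all fragments for one pixel, nearest block first.
Fragment compositePixel(std::vector<Fragment> frags) {
  std::sort(frags.begin(), frags.end(),
            [](const Fragment &x, const Fragment &y) { return x.depth < y.depth; });

  Fragment out{0.0f, 0.0f, 0.0f, 0.0f, 0.0f};
  for (const Fragment &f : frags) {
    assert(f.a >= 0.0f && f.a <= 1.0f);
    const float t = 1.0f - out.a;  // transparency left after nearer blocks
    out.r += t * f.r;
    out.g += t * f.g;
    out.b += t * f.b;
    out.a += t * f.a;
  }
  if (!frags.empty()) out.depth = frags.front().depth;  // depth of nearest block
  return out;
}
```

Sorting by block depth gives a correct result only when the blocks do not interpenetrate along the ray, which holds for an axis-aligned grid of volume bricks.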

Strong Scaling

Distributed API

Ability to specify what is run where

3 Modes:

Master/Slave

- All non-master ranks run the commands issued by the master rank

Collaborative

- All ranks issue the same commands

Independent

- Commands run locally, on the calling rank only

D-API Example - Distributed Volume Rendering

Sync: initialization

Sync: create shared volume

Local: create resident volume section

Local: add local volume to synchronous volume

Master: add annotations

Sync: render

Distributed API

// Independent mode: the calls below run only on the local rank
ospdApiMode(OSPD_MODE_INDEPENDENT);
OSPVolume localVol = ospNewVolume("shared_structured_volume");
OSPData ospLocalVolData = ospNewData(volumeData.size(), OSP_UCHAR, volumeData.data(), OSP_DATA_SHARED_BUFFER);
ospCommit(ospLocalVolData);

// Switch back to collaborative mode, commit the collaborative volume, and add it to the world
ospdApiMode(OSPD_MODE_COLLABORATIVE);
ospCommit(volume);
ospAddVolume(world, volume);
ospCommit(world);

D-API Implementation

void MPIDevice::processWork(work::Work* work)
{
  if (currentApiMode == OSPD_MODE_MASTER) {
    // Master mode: forward the command to every worker
    mpi::send(mpi::Address(&mpi::worker,(int32)mpi::SEND_ALL), work);
  } else if (currentApiMode == OSPD_MODE_COLLABORATIVE) {
    // sync calls
  }
  // In every mode the command also runs on the calling rank
  work->run();
}

Tiled Displays

DisplayWald - Experimental

Built as an OSPRay module

Requires MPI

Stereo supported

Routing through single head node supported if display nodes not accessible from compute nodes

DisplayWald - Experimental

Server (displays):

mpirun -perhost 1 -n 6 ./ospDisplayWald -w 3 -h 2 --no-head-node

mpirun -perhost 1 -n 6 ./ospDisplayWald -w 3 -h 2 --head-node

// will output hostname and port

Client (renderer):

mpirun -n ./ospDwViewer --display-wall-host host:port
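The server commands above start 6 display ranks for a wall that is 3 displays wide and 2 high. As a rough sketch of how each rank could work out which part of the full image it shows, assuming a row-major layout (rank 0 in the first row, left-most column); the actual DisplayWald ordering is not stated on the slides, and `Region` and `displayRegion` are hypothetical names.

```cpp
#include <cassert>

// Pixel rectangle [x0, x1) x [y0, y1) within the full wall image.
struct Region {
  int x0, y0, x1, y1;
};

// Region of the wall shown by one display rank.
// Assumption: ranks are laid out row-major across a wallW x wallH grid
// of identical displays, each dispPxW x dispPxH pixels.
Region displayRegion(int rank, int wallW, int wallH, int dispPxW, int dispPxH) {
  assert(wallW > 0 && wallH > 0 && dispPxW > 0 && dispPxH > 0);
  assert(rank >= 0 && rank < wallW * wallH);
  const int col = rank % wallW;
  const int row = rank / wallW;
  return Region{col * dispPxW, row * dispPxH,
                (col + 1) * dispPxW, (row + 1) * dispPxH};
}
```

For the 3 x 2 wall with 1920 x 1080 displays, rank 4 would cover x in [1920, 3840) and y in [1080, 2160) under this layout.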

Performance Tips

Wayness - a single MPI process per node is ideal

Excessive API calls can currently cause very long load times

Affinity issues - check that CPU utilization is pegged at 100%

KNL cache mode - OSPRay runs best in cache/quadrant mode

Samples Per Pixel - negative values render only a subset of the image each frame

Questions?

Legal Notices and Disclaimers

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.

Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.

Copyright © 2016 Intel Corporation. All rights reserved. Intel, Intel Inside, the Intel logo, Intel Xeon and Intel Xeon Phi are trademarks of Intel Corporation in the United States and other countries. *Other names and brands may be claimed as the property of others.