TRANSCRIPT
STREAMING VIDEO DATA INTO 3D APPLICATIONS
Session 2116
Christopher Mayer
AMD
Sr. Software Engineer
3 | Streaming Video Data Into 3D Applications | June 2011
CONTENT
Introduction
Pinned Memory
Streaming Video Data
How does the APU change the game
INTRODUCTION | Use Cases
Why stream video content to the GPU?
– Integrate video in a 3D scene
– Process video on the GPU
– Render additional information on the video
Broadcast applications
– Moderate number of video streams
– Complex rendering
Surveillance systems
– Usually a huge number of video streams
– Simple rendering only
INTRODUCTION | AMD Ventuz Demo Shown at ISE 2011
REQUIREMENTS
Fast data transfer
– Low latency
– High bandwidth
– Short setup time for each transfer
– Reduced number of memory copies
Constant frame rates
No frame drops
Easy access to data buffer
REQUIREMENTS | Data Size
Resolution                        720x525        1280x720       1920x1080      2048x1536
Number of pixels                  378 000        921 600        2 073 600      3 145 728
Size of one frame (RGB)           1.08 MB        2.64 MB        5.93 MB        9 MB
Bandwidth when playing at 60 Hz   64.88 MB/sec   158 MB/sec     356 MB/sec     540 MB/sec
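The figures in the table follow from the resolution alone: pixels times bytes per pixel gives the frame size, and frame size times refresh rate gives the sustained bandwidth. A minimal sketch of that arithmetic, assuming 3 bytes per pixel (8-bit RGB) and 1 MB = 1024 x 1024 bytes, which is what the numbers on the slide imply. The function names are illustrative and do not come from the talk.

```cpp
#include <cstdint>

// Bytes needed for one uncompressed frame.
// Assumption: 3 bytes per pixel (8-bit RGB) unless stated otherwise.
constexpr std::uint64_t frameBytes(std::uint64_t width, std::uint64_t height,
                                   std::uint64_t bytesPerPixel = 3) {
    return width * height * bytesPerPixel;
}

// Convert a byte count to MB, assuming 1 MB = 1024 * 1024 bytes.
constexpr double toMB(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Sustained bandwidth in MB/sec when every frame is transferred at `hz` frames per second.
constexpr double bandwidthMBps(std::uint64_t width, std::uint64_t height, double hz) {
    return toMB(frameBytes(width, height)) * hz;
}
```

For example, bandwidthMBps(1920, 1080, 60.0) evaluates to roughly 356, matching the 1920x1080 column.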
DATA PATH
[Diagram: the capture device and the graphics device exchange data through system memory]
AMD PINNED MEMORY
PINNED MEMORY ON AMD FIREPRO™
Pinned memory is non-swappable system memory
The memory can directly be accessed by the GPU
Memory needs to be allocated by the application
The memory needs to be aligned to the page size (usually 4K)
The driver will pin the memory
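On AMD FirePro™, the extension AMD_pinned_memory can be used to create buffers
GL_EXTERNAL_VIRTUAL_MEMORY_AMD is available as a target for glBindBuffer and glBufferData (the published AMD_pinned_memory specification names this token GL_EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD)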
Access to the memory is not synchronized by the driver. The application needs to control access to the buffers.
GLsync objects can be used to check whether a transfer into or from a buffer has finished
Pinned memory buffers can be used in the same way as other OpenGL buffer objects, e.g., they can be bound as GL_PIXEL_UNPACK_BUFFER
PINNED MEMORY | Buffer Creation
// Allocate system memory and add 4K for alignment
m_pBufferMemory[i].pBasePointer = new char[m_uiBufferSize + 4096];
ZeroMemory(m_pBufferMemory[i].pBasePointer, m_uiBufferSize + 4096);

// Align memory to 4K boundaries. uintptr_t (from <cstdint>) holds a full pointer
// on 32-bit and 64-bit builds; long is only 32 bits wide on 64-bit Windows.
uintptr_t addr = (uintptr_t) m_pBufferMemory[i].pBasePointer;
m_pBufferMemory[i].pAlignedPointer = (char*)((addr + 4095) & ~(uintptr_t)0xfff);

// Create buffer to downstream data and pin the memory
glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_AMD, m_pBuffer[i]);
glBufferData(GL_EXTERNAL_VIRTUAL_MEMORY_AMD, m_uiBufferSize, m_pBufferMemory[i].pAlignedPointer, GL_STREAM_DRAW);
glBindBuffer(GL_EXTERNAL_VIRTUAL_MEMORY_AMD, 0);
• The application can update the buffer at any time by writing to m_pBufferMemory[i].pAlignedPointer
• The application can read the buffer content at any time by accessing m_pBufferMemory[i].pAlignedPointer
• No map / unmap calls needed
• Make sure the buffer is currently not accessed by the GPU
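The allocation and alignment step can be tried without an OpenGL context. The sketch below isolates it: over-allocate by one page, then round the base address up to the next page boundary. The helper names alignUp and allocatePageAligned are hypothetical, and the 4096-byte default page size is an assumption carried over from the slide ("usually 4K"). A real application would query the page size from the operating system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round an address up to the next multiple of `alignment`.
// `alignment` must be a power of two.
constexpr std::uintptr_t alignUp(std::uintptr_t addr, std::uintptr_t alignment) {
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Over-allocate by one page and return the first page-aligned address inside the block.
// `base` receives the pointer that must later be released with delete[].
inline char* allocatePageAligned(std::size_t size, char*& base,
                                 std::size_t pageSize = 4096) {
    assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);  // power of two
    base = new char[size + pageSize]();  // value-initialized, i.e. zero-filled
    return reinterpret_cast<char*>(
        alignUp(reinterpret_cast<std::uintptr_t>(base), pageSize));
}
```

The aligned pointer is the one handed to glBufferData. The base pointer is kept only so the allocation can be freed once the buffer object has been deleted.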
PINNED MEMORY | Buffer Access
Copy data from a buffer into a texture
// Bind buffer as unpack buffer to copy data into a texture object
glBindBuffer(GL_PIXEL_UNPACK_BUFFER, m_pBuffer[m_uiBufferIdx]);
// Copy pinned memory to texture
glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, m_uiTexWidth, m_uiTexHeight, m_nExtFormat, m_nType, NULL);
// Insert sync object to check for completion
m_UnPackFence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
Copy data from framebuffer into pinned memory buffer
// Bind buffer as pack buffer to copy framebuffer data into the pinned memory buffer
glBindBuffer(GL_PIXEL_PACK_BUFFER, m_pPackBuffer[m_uiBufferIdx]);
// Copy FB into pinned mem buffer
glReadPixels(0, 0, m_uiBufferWidth, m_uiBufferHeight, m_nExtFormat, m_nType, NULL);
// Insert sync object to check for completion
m_PackFence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
Synchronizing the buffer access
if (glIsSync(Fence))
{
    // Make sure that buffer memory is no longer accessed by drawing.
    // OneSecond is the timeout; glClientWaitSync expects it in nanoseconds.
    glClientWaitSync(Fence, GL_SYNC_FLUSH_COMMANDS_BIT, OneSecond);
    glDeleteSync(Fence);
}
PINNED MEMORY - PERFORMANCE
[Chart: PBO vs. Pinned Memory. Speedup (vertical axis, 0.00 to 3.00) for frame sizes 256x256, 720x525, 720x625, 1280x720, 1920x1080 and 2048x1536]
PINNED MEMORY | Summary
Easy access since memory is always present
No mapping/un-mapping is required
Reduced overhead for data transfer
Lower latency
Best choice for downloading data that changes continuously
Buffer access needs to be synchronized by the application
STREAMING DATA
STREAMING DATA | Goals
[Diagram: pipeline Capture -> Transfer to memory -> Transfer to GPU -> Render, spanning the data acquisition and rendering stages]
Continuous data acquisition at constant rate
– e.g., DVD player at 59.94 Hz
No input frames should be dropped
Rendering needs to happen at constant frame rate
No tearing on video data
No stuttering while displaying video data as texture
STREAMING DATA
[Timing diagram of frame numbers over time]
Capture: N, N+1, N+2, N+3, N+4, N+5, N+6
Render (first row): N, N+1, N+2, N+3, N+5
Render (second row): N, N+1, N+2, N+3, N+4
STREAMING DATA | Buffer Access
[Diagram: two threads exchange frames through Buffer 1 and Buffer 2]
Data Acquisition: WaitForVBlank -> GetBuffer -> CopyToBuffer -> ReleaseBuffer
– Wait for an ‘empty’ buffer
– Grant write access
Rendering: GetBuffer -> Copy to Texture -> Image Processing -> Draw -> ReleaseBuffer
– Wait for a ‘full’ buffer
– Grant read access
STREAMING DATA | Synchronizing the Buffer
// Get a buffer for writing. Produce new data
unsigned int SyncedBuffer::getBufferForWriting(char* &pBuffer)
{
    // Wait until an empty slot is available
    WaitForSingleObject(m_hNumEmpty, INFINITE);

    // Enter critical section
    WaitForSingleObject(m_pBuffer[m_uiHead].hMutex, INFINITE);

    pBuffer = m_pBuffer[m_uiHead].pData;

    return m_uiHead;
}

void SyncedBuffer::releaseWriteBuffer()
{
    // Leave critical section
    ReleaseSemaphore(m_pBuffer[m_uiHead].hMutex, 1, NULL);

    // Increment the number of full buffers. ReleaseSemaphore reports the
    // previous count, so add one to obtain the current count.
    ReleaseSemaphore(m_hNumFull, 1, &m_lNumFullElements);
    ++m_lNumFullElements;

    // Switch to next buffer
    m_uiHead = (m_uiHead + 1) % m_uiSize;
}
// Get a buffer for reading. Consume data
unsigned int SyncedBuffer::getBufferForReading(char* &pBuffer)
{
    // Wait until a full buffer is available
    WaitForSingleObject(m_hNumFull, INFINITE);

    // Block buffer
    WaitForSingleObject(m_pBuffer[m_uiTail].hMutex, INFINITE);

    pBuffer = m_pBuffer[m_uiTail].pData;

    return m_uiTail;
}

void SyncedBuffer::releaseReadBuffer()
{
    // Release buffer
    ReleaseSemaphore(m_pBuffer[m_uiTail].hMutex, 1, NULL);

    // Increase number of empty buffers
    ReleaseSemaphore(m_hNumEmpty, 1, NULL);

    // Switch to next buffer
    m_uiTail = (m_uiTail + 1) % m_uiSize;
}
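The same empty/full handshake can be written without the Win32 API. The sketch below is a portable analogue and not the SyncedBuffer class from the talk: it swaps the semaphores for a std::mutex and a std::condition_variable and keeps explicit counters for empty and full slots. The class name FrameRing and its members are hypothetical, and it assumes exactly one capture thread and one render thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical portable analogue of a synchronized ring of frame buffers.
// m_numEmpty / m_numFull play the role of the two counting semaphores.
// Assumption: one producer (capture) thread and one consumer (render) thread.
class FrameRing {
public:
    FrameRing(std::size_t slots, std::size_t bytesPerSlot)
        : m_slots(slots, std::vector<char>(bytesPerSlot)), m_numEmpty(slots) {
        assert(slots > 0);
    }

    // Producer: block until an empty slot exists, then hand out its memory.
    char* getBufferForWriting() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_numEmpty > 0; });
        --m_numEmpty;
        return m_slots[m_head].data();
    }

    // Producer: mark the slot as full and advance to the next one.
    void releaseWriteBuffer() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_head = (m_head + 1) % m_slots.size();
            ++m_numFull;
        }
        m_cv.notify_all();
    }

    // Consumer: block until a full slot exists, then hand out its memory.
    const char* getBufferForReading() {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_numFull > 0; });
        --m_numFull;
        return m_slots[m_tail].data();
    }

    // Consumer: mark the slot as empty and advance to the next one.
    void releaseReadBuffer() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_tail = (m_tail + 1) % m_slots.size();
            ++m_numEmpty;
        }
        m_cv.notify_all();
    }

private:
    std::vector<std::vector<char>> m_slots;
    std::size_t m_head = 0;      // next slot to write
    std::size_t m_tail = 0;      // next slot to read
    std::size_t m_numEmpty;      // slots available to the producer
    std::size_t m_numFull = 0;   // slots available to the consumer
    std::mutex m_mutex;
    std::condition_variable m_cv;
};
```

In use, the capture thread loops over getBufferForWriting, copy, releaseWriteBuffer, while the render thread loops over getBufferForReading, upload, releaseReadBuffer. With two slots this reproduces the double-buffer scheme of the previous slide, and the producer blocks rather than overwriting a frame that has not been consumed.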
HOW DOES THE APU CHANGE THE GAME
HOW DOES THE APU CHANGE THE GAME
Having an APU and discrete graphics in a system allows distribution of work to two GPUs
Additional computing steps that can be implemented efficiently on a GPU can be handled by the APU in parallel to the rendering on the discrete GPU
More time for rendering is available on the discrete GPU
HOW DOES THE APU CHANGE THE GAME
Usually we have time left in the capture thread
The remaining time can be used to augment quality
– De-interlacing
– Color space conversion
– Post-processing of image data
– …
Those tasks can benefit greatly from running on a SIMD engine
Running those tasks on the APU frees time in the Render thread to augment complexity of 3D content
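To make one of these tasks concrete, here is a CPU reference sketch of color space conversion. It is only an illustration of the per-pixel arithmetic: the talk proposes running such work on the APU's SIMD engine and does not give code for it. The sketch assumes packed 8-bit YCbCr 4:4:4 input and the full-range BT.601 coefficients used by JFIF. Real capture hardware often delivers subsampled or limited-range data, which needs different constants. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Rgb8 { std::uint8_t r, g, b; };

// Clamp to [0, 255] and round to the nearest integer.
inline std::uint8_t clampToByte(double v) {
    if (v <= 0.0) return 0;
    if (v >= 255.0) return 255;
    return static_cast<std::uint8_t>(v + 0.5);
}

// Convert one pixel from full-range BT.601 YCbCr (JFIF convention) to RGB.
inline Rgb8 ycbcrToRgb(std::uint8_t y, std::uint8_t cb, std::uint8_t cr) {
    const double c = cb - 128.0;  // blue-difference chroma, centered on zero
    const double d = cr - 128.0;  // red-difference chroma, centered on zero
    return Rgb8{ clampToByte(y + 1.402 * d),
                 clampToByte(y - 0.344136 * c - 0.714136 * d),
                 clampToByte(y + 1.772 * c) };
}

// Convert a packed YCbCr 4:4:4 frame (3 bytes per pixel) to packed RGB.
inline void convertFrame(const std::uint8_t* ycbcr, std::uint8_t* rgb,
                         std::size_t pixelCount) {
    assert(ycbcr != nullptr && rgb != nullptr);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        const Rgb8 p = ycbcrToRgb(ycbcr[3 * i], ycbcr[3 * i + 1], ycbcr[3 * i + 2]);
        rgb[3 * i]     = p.r;
        rgb[3 * i + 1] = p.g;
        rgb[3 * i + 2] = p.b;
    }
}
```

Every pixel is processed independently with the same few multiply-adds, which is why this kind of work maps well onto a SIMD engine.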
[Timing diagram of frame numbers over time]
Capture: N, N+1
Render: N
HOW DOES THE APU CHANGE THE GAME
[Diagram: two threads exchange frames through Buffer 1 and Buffer 2]
Data Acquisition and Processing Using the APU: WaitForVBlank -> GetBuffer -> CopyToBuffer -> Image processing -> ReleaseBuffer
– Wait for an ‘empty’ buffer
– Grant write access
Rendering Using the Discrete GPU: GetBuffer -> Copy to Texture -> Draw -> ReleaseBuffer
– Wait for a ‘full’ buffer
– Grant read access
HOW DOES THE APU CHANGE THE GAME
Pinned memory can be used for data exchange between APU and discrete GPU
Since the data has to be loaded into system memory anyway, the additional cost of transferring it to the APU remains small
The SIMD engine offers great benefit for image processing algorithms
For video streaming the APU is a great additional resource to offload tasks from the discrete GPU
QUESTIONS
Disclaimer & Attribution
The information presented in this document is for informational purposes only and may contain technical inaccuracies, omissions and typographical errors.
The information contained herein is subject to change and may be rendered inaccurate for many reasons, including but not limited to product and roadmap changes, component and motherboard version changes, new model and/or product releases, product differences between differing manufacturers, software changes, BIOS flashes, firmware upgrades, or the like. There is no obligation to update or otherwise correct or revise this information. However, we reserve the right to revise this information and to make changes from time to time to the content hereof without obligation to notify any person of such revisions or changes.
NO REPRESENTATIONS OR WARRANTIES ARE MADE WITH RESPECT TO THE CONTENTS HEREOF AND NO RESPONSIBILITY IS ASSUMED FOR ANY INACCURACIES, ERRORS OR OMISSIONS THAT MAY APPEAR IN THIS INFORMATION.
ALL IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR PURPOSE ARE EXPRESSLY DISCLAIMED. IN NO EVENT WILL ANY LIABILITY TO ANY PERSON BE INCURRED FOR ANY DIRECT, INDIRECT, SPECIAL OR OTHER CONSEQUENTIAL DAMAGES ARISING FROM THE USE OF ANY INFORMATION CONTAINED HEREIN, EVEN IF EXPRESSLY ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
AMD, AMD FirePro, the AMD arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. All other names used in this presentation are for informational purposes only and may be trademarks of their respective owners.
© 2011 Advanced Micro Devices, Inc. All rights reserved.