2017 GTC San Jose

S7706 - Essential CUDA Optimization Techniques - Presented by Acceleware (Session 4 of 4)

Session Description

This tutorial is for those with some background in CUDA, including an understanding of the CUDA memory model and the streaming multiprocessor. Our previous three tutorials provide the background information necessary for this session. This tutorial will provide an overview of the performance analysis tools and key optimization strategies for compute-bound, latency-bound, and memory-bound problems. The session will include techniques for ensuring peak utilization of CUDA cores by choosing the optimal block size. It will also include code examples and a programming demonstration highlighting the optimal global memory access pattern, which is applicable to all GPU architectures. We'll provide printed copies of the material to all attendees for each session. Collect all four!


Additional Session Information
All
Tutorial
Programming Languages Other
Consulting Services
1h 20m