This lab will show three approaches for deployment. The first approach is to directly use the inference functionality within a deep learning framework, in this case DIGITS and Caffe. The second approach is to integrate inference within a custom application by using a deep learning framework API, again using Caffe, but this time through its Python API. The final approach is to use NVIDIA TensorRT, which automatically creates an optimized inference runtime from a trained Caffe model and network description file. In this lab, you will learn about the role of batch size in inference performance, as well as various optimizations that can be made in the inference process. You will also explore inference for a variety of different DNN architectures trained in other DLI labs.
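The effect of batch size mentioned above can be previewed with a simple cost model: each inference call pays a fixed overhead (kernel launch, data transfer) plus a per-sample compute cost, so larger batches amortize the overhead and raise throughput at the price of higher per-call latency. The sketch below is a framework-free illustration of that trade-off; the overhead and per-sample figures are hypothetical assumptions, not measurements from Caffe or TensorRT.

```python
# Hypothetical cost model for batched inference:
# total latency = fixed per-call overhead + per-sample compute time.
# The numbers below are illustrative assumptions, not measured values.

OVERHEAD_MS = 5.0      # assumed fixed cost per inference call (launch, transfer)
PER_SAMPLE_MS = 0.5    # assumed compute cost per image

def latency_ms(batch_size: int) -> float:
    """Latency of one inference call for a given batch size."""
    return OVERHEAD_MS + batch_size * PER_SAMPLE_MS

def throughput(batch_size: int) -> float:
    """Images processed per millisecond of wall-clock time."""
    return batch_size / latency_ms(batch_size)

if __name__ == "__main__":
    for bs in (1, 8, 32, 128):
        print(f"batch={bs:>3}  latency={latency_ms(bs):6.1f} ms  "
              f"throughput={throughput(bs):.3f} img/ms")
```

Under this model, throughput climbs with batch size while per-call latency also grows, which is the latency/throughput trade-off the lab asks you to measure with real models.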
Prerequisites: C++ programming experience.
Because this lab utilizes GPU resources in the cloud, you are required to bring your own laptop.