As GPUs get more widely deployed for machine learning, training is being done over larger datasets than ever before resulting in longer training time. Reducing training time from days to hours or less, requires clustering of large number of GPUs. As more users are starting to see the benefits of machine learning to their businesses, there is also a need to provide on-demand access to the users of these data center-based clusters. The ideal technology for such large-scale clustering in the data center is Ethernet. We'll discuss the work Broadcom is doing with NVIDIA to enable GPUDirect using its RoCE v2 line of Ethernet NICs.