Add to My Interests
Remove from My Interests
Data scientists love tools like R and Scikit-Learn, as they offer a convenient and familiar syntax for analysis tasks. However, these systems are limited to operating serially on datasets that can fit on a single node and don't allow for distributed execution. Mahout-Samsara is a linear algebra environment that offers both an easy-to-use Scala DSL and efficient distributed execution for linear algebra operations. Data scientists transitioning from R to Mahout can use the Samsara DSL for large-scale data sets with familiar R-like semantics. Machine learning and deep learning algorithms built with the Mahout-Samsara DSL are automatically parallelized and optimized to execute on distributed processing engines like Apache Spark and Apache Flink accelerated natively by CUDA, OpenCL, and OpenMP. We'll look at Mahout's distributed linear algebra capabilities and demonstrate an EigenFaces classification using Distributed SSVD executing on a GPU cluster. Machine learning practitioners will come away from this talk with a better understanding of how Samsara's linear algebra environment can help simplify developing highly scalable, CPU/GPU-accelerated machine learning and deep learning algorithms by focusing solely on the declarative specification of the algorithm without having to worry about the implementation details of a scalable distributed engine or having to learn to program with native math libraries.
Additional Session Information
Do Not Sell My Personal Information