Machine learning research on medical images has lagged behind similar work on conventional visible-light images because medical images are more complex and large annotated image sets are scarce. To address this limitation, Stanford researchers are creating a massive clinical imaging research resource containing de-identified versions of all Stanford radiology images, annotated with concepts from a medical imaging ontology and linked to genomic data, tissue banks, and information from patients' electronic medical records. The dataset contains 0.5 petabytes of clinical radiology data, comprising 4.5 million studies and over 1 billion images. The broad long-term objective of this resource is to dramatically reduce diagnostic imaging errors by: (1) facilitating reproducible science through standardization of data and algorithms for medical image machine learning research, (2) enabling patients to participate in the scientific enterprise by volunteering their data for these experiments, (3) spurring innovation by hosting competitions on clinically validated image sets, and (4) disseminating the resulting data, informatics tools, and decision support algorithms to the widest possible scientific audience. We'll review progress toward the creation of the Stanford Medical ImageNet, including details of the database structure and contents, and recent results from deep learning experiments on the data it contains.