The basic principle of machine learning is to train a function that maps an input representation x to a desired target representation y, using a set of parallel training data (paired examples) X and Y. The pairs could be images and labels (image categorization), texts in two languages (machine translation), sensor data and motor commands (sensory-motor action), and so on. With the spread of the Internet of Things (IoT), various new types of data (modalities) are becoming available and are expected to enable many novel applications within this machine learning framework. One fundamental limitation is that sufficient parallel data for X and Y is not always available. We propose an approach, called pivot-based machine learning, that trains the mapping indirectly with the help of a third modality.
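
The text does not say how the third modality is used. One possible reading is a composition through a pivot modality, and the toy sketch below follows that reading. It is an illustration under assumptions, not the proposed method: the pivot symbol z, the scalar data, the linear least-squares fit, and the names `fit_slope` and `predict_y_from_x` are all hypothetical. No X-Y pairs are used. Only X-Z pairs and separately collected Z-Y pairs are available.

```python
# Toy illustration (assumed setup): scalar modalities x, z, y and linear maps.
# There is NO parallel X-Y data; z plays the role of the pivot modality.

def fit_slope(inputs, targets):
    """Least-squares slope w for targets ~= w * inputs (no intercept)."""
    numerator = sum(a * b for a, b in zip(inputs, targets))
    denominator = sum(a * a for a in inputs)
    return numerator / denominator

# Parallel data between X and the pivot Z (here z = 2 * x).
x_with_z = [1.0, 2.0, 3.0]
z_for_x = [2.0, 4.0, 6.0]

# Separately collected parallel data between Z and Y (here y = 3 * z).
z_with_y = [1.0, 5.0, 10.0]
y_for_z = [3.0, 15.0, 30.0]

# Train each half of the mapping on the data that exists.
w_xz = fit_slope(x_with_z, z_for_x)   # x -> z
w_zy = fit_slope(z_with_y, y_for_z)   # z -> y

def predict_y_from_x(x):
    """Map x to y indirectly by passing through the pivot modality z."""
    return w_zy * (w_xz * x)

print(predict_y_from_x(4.0))
```

In this toy setting the composed predictor recovers the underlying relation y = 6x even though no (x, y) pair was ever observed. With real data, errors in the two learned maps would compound through the composition.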