We have developed a multi-GPU, massively parallel implementation of resolution-of-identity second-order Møller-Plesset perturbation (RI-MP2) energy calculations, suitable for large molecules on CPU/GPU hybrid supercomputers. We report an overview of the implementation and the results of its performance evaluation using up to 1,349 nodes and 4,047 GPUs of the TSUBAME 2.5 supercomputer. GPU computation speeds up the RI-MP2 calculations considerably, by a factor of 4.1 to 6.6. The present GPU implementation scales well in parallel with the number of nodes. A measured peak performance of 514.7 TFLOPs is attained for the GPU job on (C96H24)2 using 1,349 nodes and 4,047 GPUs of TSUBAME 2.5, which is much higher than that of the corresponding CPU jobs (87.5 TFLOPs). We also present an application to the analysis of intermolecular interactions in nanocarbon molecular assemblies such as nanographenes.
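
For readers unfamiliar with the method, the following is a minimal single-process sketch of the standard closed-shell RI-MP2 energy expression. It is not the implementation reported here. The function name `ri_mp2_energy`, the array layout of `B`, and the use of NumPy are assumptions made for illustration only. The sketch shows that the dominant step is one dense matrix multiplication per occupied-orbital pair, which is the kind of operation that maps naturally onto GPUs.

```python
import numpy as np


def ri_mp2_energy(B, e_occ, e_vir):
    """Closed-shell RI-MP2 correlation energy from three-index factors.

    Assumed layout (illustrative): B has shape (n_occ, n_vir, n_aux) and
    approximates the four-index integrals as
        (ia|jb) ~= sum_P B[i, a, P] * B[j, b, P].
    e_occ and e_vir hold the occupied and virtual orbital energies.
    """
    n_occ = B.shape[0]
    energy = 0.0
    for i in range(n_occ):
        for j in range(n_occ):
            # One matrix multiplication per (i, j) pair: V[a, b] = (ia|jb).
            V = B[i] @ B[j].T
            # Denominator e_i + e_j - e_a - e_b for every (a, b).
            D = e_occ[i] + e_occ[j] - e_vir[:, None] - e_vir[None, :]
            # V.T[a, b] = (ib|ja), the exchange-type integral.
            energy += np.sum(V * (2.0 * V - V.T) / D)
    return energy
```

In a sketch like this, the loop over (i, j) pairs consists of independent tasks, so it could in principle be divided among processes or devices. How the work is actually distributed in the reported implementation is not specified in this abstract.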