We'll cover the optimizing details and the inspiring performance result using NVIDIA Kepler GPUs to accelerate the 10th-order three-dimensional elastic Reverse-Time-Migration (RTM) algorithm. As an essential migration method in seismic application to image the underground geology, RTM algorithm is particularly complex due to its computational workflow and is generally the most time-consuming kernel. Especially, RTM algorithms based on elastic wave equations (elastic RTM) are generally more computationally intense compared to RTM methods for acoustic constant-density media (acoustic RTM). In recent years, the desire for covering larger regions and acquiring better resolution has further increased the algorithmic complexity of RTM. Therefore, computing platforms and optimizing methods that can better meet such challenges in seismic applications become great demands. In this work, we first modify the backward process in the RTM matrix format by adding extra layers, to generate a straightforward stencil that fits well with GPU architecture. A set of optimizing techniques, such as memory tuning and computing occupancy configuration, is then performed to exploit the performance over a set of different GPU cards. By further using the the streaming mechanism, we manage to obtain a communication-computation overlapping among multiple GPUs. The best performance employing four Tesla K40 GPU cards is 28 times better over a fully optimized reference based on a socket with two E5-2697 CPUs. This work proves the great potential to employ NVIDIA GPU accelerators in future geophysics exploration algorithms.