Coarse-grained reconfigurable architectures (CGRAs) provide a flexible and efficient solution for data-intensive applications. The loop kernels of these applications typically account for a large share of the program's total execution time. However, it remains difficult to map loop kernels onto a CGRA while meeting performance and cost constraints. This paper proposes a novel approach for automatically mapping loop kernels onto a CGRA with loop self-pipelining (lspCGRA) to optimize data-intensive applications. We first formulate the problem. We then present the resource sharing and pipelining of lspCGRA, together with its template standard. Further, we propose a pipelined mapping for loop kernels. The results show that, compared with the previous advanced SPKM approach, our approach reduces resource occupation by 16.3% and increases throughput by 169.1%.