In-memory computing with nanoscale memristive devices such as phase-change memory (PCM) has emerged as an alternative to conventional von Neumann systems for training deep neural networks (DNNs), with each synaptic weight represented by a device conductance. However, PCM devices exhibit a temporal evolution of their conductance values, referred to as conductance drift, which makes it difficult to maintain synaptic weights reliably. Based on the mean behavior of 10,000 GST-based PCM devices, we observe that the drift coefficient depends on the conductance value. Moreover, we show that the application of even a partial SET pulse re-initializes PCM drift and erases the drift history, regardless of how much the device has drifted beforehand. With models capturing these features, we show that drift has a detrimental impact on training DNNs, but that drift resilience can be significantly improved with a recently proposed multi-PCM synaptic architecture.
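The two drift features described above can be illustrated with a minimal sketch. It assumes the widely used power-law form G(t) = G(t0) (t / t0)^(-nu) for conductance drift. The linear dependence of nu on conductance, all numerical values, and the names `drift_coefficient`, `drifted_conductance`, and `DriftingDevice` are hypothetical choices made for illustration, not the models or fitted parameters of this work.

```python
import numpy as np

T0 = 1.0  # reference time after a pulse, in seconds (assumed value)


def drift_coefficient(g, g_max=1.0, nu_low=0.1, nu_high=0.01):
    """Hypothetical state-dependent drift coefficient.

    Interpolates linearly from nu_low at zero conductance to nu_high
    at g_max. The linear form and both endpoints are assumptions.
    """
    frac = np.clip(g / g_max, 0.0, 1.0)
    return nu_low + (nu_high - nu_low) * frac


def drifted_conductance(g_prog, t_since_pulse):
    """Power-law drift G(t) = G(t0) * (t / t0) ** (-nu)."""
    nu = drift_coefficient(g_prog)
    # Times earlier than T0 are clamped, so no drift is applied before T0.
    t = np.maximum(t_since_pulse, T0)
    return g_prog * (t / T0) ** (-nu)


class DriftingDevice:
    """Toy PCM device whose drift clock restarts on every SET pulse."""

    def __init__(self, g=0.1):
        self.g_prog = g      # conductance right after the last pulse
        self.t_pulse = 0.0   # time at which the last pulse was applied

    def read(self, now):
        """Return the drifted conductance at absolute time `now`."""
        return float(drifted_conductance(self.g_prog, now - self.t_pulse))

    def partial_set(self, now, delta_g, g_max=1.0):
        """Apply a partial SET pulse that raises the conductance by delta_g."""
        g_now = self.read(now)
        self.g_prog = min(g_now + delta_g, g_max)
        # Drift is re-initialized: earlier drift history is discarded.
        self.t_pulse = now
```

In this sketch, a device read long after programming shows a reduced conductance, while a read immediately after `partial_set` returns the newly programmed value, because the elapsed time is measured from the most recent pulse only.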