In this paper, we propose and evaluate different learning strategies based on Multi-Armed Bandit (MAB) algorithms. They allow Internet of Things (IoT) devices to improve their access to the network and their autonomy, while taking into account the impact of the radio collisions they encounter. To that end, we examine several heuristics employing Upper Confidence Bound (UCB) algorithms, in order to explore the contextual information provided by the number of retransmissions. Our results show that UCB-based approaches significantly improve the probability of successful transmission. They also reveal that a pure UCB channel access is as efficient as more sophisticated learning strategies.
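
To make the notion of a pure UCB channel access concrete, the following is a minimal sketch under stated assumptions, not the implementation evaluated in this paper. It treats each radio channel as one arm and each acknowledged transmission as a reward of 1. The class name `UCB1ChannelSelector`, the exploration weight `alpha`, the helper `simulate`, and the fixed per-channel success probabilities are all hypothetical choices made for illustration; retransmissions are not modeled.

```python
import math
import random


class UCB1ChannelSelector:
    """Hypothetical UCB1-style learner: each arm is one radio channel."""

    def __init__(self, n_channels, alpha=0.5):
        self.n_channels = n_channels
        self.alpha = alpha                   # exploration weight (assumed value)
        self.counts = [0] * n_channels       # transmissions attempted per channel
        self.successes = [0] * n_channels    # acknowledged transmissions per channel
        self.t = 0                           # total number of selections so far

    def select(self):
        """Return the index of the channel to use for the next transmission."""
        self.t += 1
        # Try every channel once before relying on the index.
        for k in range(self.n_channels):
            if self.counts[k] == 0:
                return k

        def index(k):
            # Empirical success rate plus an exploration bonus that shrinks
            # as the channel is sampled more often.
            mean = self.successes[k] / self.counts[k]
            bonus = math.sqrt(self.alpha * math.log(self.t) / self.counts[k])
            return mean + bonus

        return max(range(self.n_channels), key=index)

    def update(self, channel, success):
        """Record the outcome (acknowledged or not) of one transmission."""
        self.counts[channel] += 1
        if success:
            self.successes[channel] += 1


def simulate(success_probs, horizon=5000, seed=0):
    """Toy environment: channel k succeeds with fixed probability success_probs[k]."""
    rng = random.Random(seed)
    learner = UCB1ChannelSelector(len(success_probs))
    for _ in range(horizon):
        k = learner.select()
        learner.update(k, rng.random() < success_probs[k])
    return learner.counts
```

Calling `simulate([0.2, 0.5, 0.9])` returns the number of times each channel was selected; in this toy setting the learner is expected to concentrate its transmissions on the channel with the highest success probability. A stationary environment is assumed here only to keep the sketch short.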