The urban environments represent challenging areas for handheld device pose estimation (i.e., 3D position and 3D orientation) in large displacements. It is even more challenging with low-cost sensors and computational resources that are available in pedestrian mobile devices (i.e., monocular camera and Inertial Measurement Unit). To address these challenges, we propose a continuous pose estimation based on monocular Visual Odometry. To solve the scale ambiguity and suppress the scale drift, an adaptive pedestrian step lengths estimation is used for the displacements on the horizontal plane. To complete the estimation, a handheld equipment height model, with respect to the Digital Terrain Model contained in Geographical Information Systems, is used for the displacement on the vertical axis. In addition, an accurate pose estimation based on the recognition of known objects is punctually used to correct the pose estimate and reset the monocular Visual Odometry. To validate the benefit of our framework, experimental data have been collected on a 0.7 km pedestrian path in an urban environment for various people. Thus, the proposed solution allows to achieve a positioning error of 1.6–7.5% of the walked distance, and confirms the benefit of the use of an adaptive step length compared to the use of a fixed-step length.