Current Augmented Reality systems for outdoor use are developed almost exclusively at universities and research centres. Most of them combine several sensors to determine the user's position and head orientation. Adding video-based techniques to these systems yields more accurate results: frames from a video camera are compared with reference data such as a digital elevation model, images, or a 3D GIS. These systems are isolated approaches, however, and can only be applied in areas where the reference data has been prepared or is at least available. Offering a tracking system that works throughout a city therefore requires combining different tracking techniques. Approaches developed at the Technical Universities of Munich and Vienna aim to offer different tracking techniques in different areas. They allow tracking a person who not only approaches a building but also enters it. In these approaches, the system always uses the technique that offers the most accurate results. This does not take into account that not every task requires the most accurate tracking technique, which in most cases is also the most time- and power-consuming one. The approach at the Fraunhofer Institute of Computer Graphics combines different tracking sensors and techniques to offer the best-fitting solution given the current needs and the availability of the individual techniques. This paper briefly describes our approach to sensor selection, sensor fusion, and filtering of the data with a Kalman filter.