A wireless ad hoc network consists of a number of mobile nodes that temporarily form a dynamic, infrastructure-less network. Such networks need routing protocols that can adapt to the frequent topology changes induced by node mobility and varying link quality. During the last decade, dozens of ad hoc routing protocols have been proposed, optimized and partially compared, mainly through simulation studies. This thesis takes an experimental approach to the evaluation of ad hoc routing protocols. We argue that real-world experiments are needed to complement simulation studies, and to gain practical experience and insights that can provide feedback to routing protocol design and to existing simulation models. For example, we discovered a performance discrepancy for the AODV protocol between real-world experiments and the corresponding simulation studies. We explored this so-called ``communication gray zone'' problem and implemented countermeasures, which eliminated the performance problem to a large extent. We have implemented a software-based testbed called APE to carry out efficient and systematic experimental evaluation of ad hoc routing protocols. Experiments with up to 37 participating ad hoc nodes have demonstrated APE's ability to scale efficiently and to assess repeatability between test runs. APE is part of our methodology for test repeatability in a real-world ad hoc routing protocol testbed. The methodology addresses the repeatability issue induced by stochastic factors such as the radio environment and node mobility. Using APE, we have performed a systematic experimental evaluation of three ad hoc routing protocols (AODV, OLSR and LUNAR). Our results show that TCP does not work satisfactorily even in very small networks with limited mobility.