Understanding travel mode choice behaviour is key to the effective management of transport networks, many of which are under increasing strain from rising travel demand. Conventional approaches to simulating mode choice typically use behavioural models that are either derived from stated preference choice experiments or calibrated to observed average mode shares. Whilst these models have played, and continue to play, a key role in economic, social, and environmental assessments of transport investments, there is a growing need for a deeper understanding of how people interact with transport services, gained by exploiting available but fragmented data on passenger movements and transport networks. This thesis addresses this need by developing a novel approach to urban mode choice prediction and applying it to historical trip records in the Greater London area. The new approach consists of two parts: (i) a data generation framework, which combines multiple data sources to build trip datasets containing the likely mode alternatives faced by a passenger at the time of travel, and (ii) a modelling framework, which uses these datasets to fit, optimise, validate, and select mode choice classifiers. This approach is used to compare the predictive performance of a comprehensive suite of Machine Learning (ML) classification algorithms and traditional utility-based choice models. A new assisted specification approach, in which a fitted ML classifier is used to inform the structure of the utility function in a utility-based choice model, is also explored. The results yield three key findings. Firstly, the Gradient Boosting Decision Trees (GBDT) model is the highest performing classifier for this task. Secondly, the relative differences in predictive performance between classifiers are far smaller than has been suggested by previous research.
In particular, the performance gap identified between Random Utility Models (RUMs) and ML classifiers is much smaller than previously reported. Finally, the assisted specification approach succeeds in using the structure of a fitted ML classifier to improve the performance of a RUM. The resulting model achieves significantly better performance than all but the GBDT classifier, whilst remaining a robust, interpretable behavioural model.

Funding was provided by the UK Engineering and Physical Sciences Research Council via the Future Infrastructure and Built Environment Centre for Doctoral Training (EP/L016095/1).