Forecasting public travel demand is of great importance to public transport management. It is a challenging task because demand depends on temporal, spatial, and exogenous factors (e.g., weather, events, service breakdowns). This paper investigates short-term multi-step-ahead forecasting (t+1, ..., t+8) of passenger demand aggregated into 15-minute time steps. The forecasting is performed with smart card data from a railway public transport network. Predicted flows could help optimize resource allocation, propose the best trip planning to passengers, and improve the understanding of passenger flows during special events. We propose a state-of-the-art deep learning approach, namely a gated recurrent unit (GRU) recurrent neural network, to tackle the short-term forecasting problem. We compare it to a well-known machine learning model, Random Forest, and to long-term forecasting models. The experiments are conducted on a real two-year smart card dataset provided by the transport organization authority of Ile-de-France (Ile-de-France Mobilites). The dataset describes the passenger demand of 30 stations of La Defense, the main Paris business district, which is served by several transportation modes: train (suburban railway service), metro, RER (Regional Express Network), and tramway. The evaluation focuses on model performance in the presence of specific events, using two subsets of data extracted from the whole dataset. These subsets correspond to periods of transport network service anomalies, such as service breakdowns, and to special days in terms of passenger flow patterns, such as public holidays.
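To make the forecasting task concrete, the multi-step-ahead setting (t+1, ..., t+8 on 15-minute steps) can be framed as a supervised learning problem over sliding windows. The following is a minimal sketch only, not the pipeline used in the paper: the function name `make_multistep_windows`, the number of input lags `n_lags`, and the synthetic series are illustrative assumptions.

```python
import numpy as np

def make_multistep_windows(counts, n_lags=4, horizon=8):
    """Build (X, Y) pairs from a series of 15-minute passenger counts.

    X[i] holds n_lags consecutive counts; Y[i] holds the next `horizon`
    counts, i.e. the targets t+1, ..., t+horizon.
    n_lags=4 is an assumed value, not one taken from the paper.
    """
    counts = np.asarray(counts, dtype=float)
    n_samples = len(counts) - n_lags - horizon + 1
    if n_samples <= 0:
        raise ValueError("series too short for the requested window")
    X = np.stack([counts[i : i + n_lags] for i in range(n_samples)])
    Y = np.stack(
        [counts[i + n_lags : i + n_lags + horizon] for i in range(n_samples)]
    )
    return X, Y

# Synthetic example: one day at a single station is 96 steps of 15 minutes.
counts = np.arange(96)
X, Y = make_multistep_windows(counts, n_lags=4, horizon=8)
print(X.shape, Y.shape)
```

Any regressor that accepts multi-output targets, whether a recurrent network or a Random Forest, could then be fitted on such `(X, Y)` pairs. Exogenous features such as calendar or event indicators would be appended to `X` in a fuller setup.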