A large part of this big data is most valuable when analysed quickly, as it is generated. Under several emerging application scenarios, such as in smart cities, operational monitoring of large infrastructure, and Internet of Things (IoT), continuous data streams must be processed under very short delays. In multiple domains, there is a need for processing data streams to detect patterns, identify failures, and gain insights. Data is often gathered and analysed by Data Stream Processing Engines (DSPEs).A DSPE commonly structures an application as a directed graph or dataflow. A dataflow has one or multiple sources (i.e., gateways or actuators); operators that perform transformations on the data (e.g., filtering); and sinks (i.e., queries that consume or store the data). Most complex operator transformations store information about previously received data as new data is streamed in. Also, a dataflow has stateless operators that consider only the current data. Traditionally, Data Stream Processing (DSP) applications were conceived to run in clusters of homogeneous resources or on the cloud. In a cloud deployment, the whole application is placed on a single cloud provider to benefit from virtually unlimited resources. This approach allows for elastic DSP applications with the ability to allocate additional resources or release idle capacity on demand during runtime to match the application requirements.We introduce a set of strategies to place operators onto cloud and edge while considering characteristics of resources and meeting the requirements of applications. In particular, we first decompose the application graph by identifying behaviours such as forks and joins, and then dynamically split the dataflow graph across edge and cloud. Comprehensive simulations and a real testbed considering multiple application settings demonstrate that our approach can improve the end-to-end latency in over 50% and even other QoS metrics. The solution search space for operator reassignment can be enormous depending on the number of operators, streams, resources and network links. Moreover, it is important to minimise the cost of migration while improving latency. Reinforcement Learning (RL) and Monte-Carlo Tree Search (MCTS) have been used to tackle problems with large search spaces and states, performing at human-level or better in games such as Go. We model the application reconfiguration problem as a Markov Decision Process (MDP) and investigate the use of RL and MCTS algorithms to devise reconfiguring plans that improve QoS metrics.