Menou, Kristen
Numerical simulations are versatile predictive tools that permit explorations of complex systems. The ability of LLM agents to simulate real-world scenarios will expand the AI risk landscape. In the proxy-simulation threat model, a user (or a deceptively aligned AI) can obfuscate the goal behind simulation-based predictions by leveraging the generali...
Liu, Yi (author)
Is healthy eating simply the intake of correct quantities of certain nutrients, regardless of how and where? This project addresses improving wellbeing within the broad context of eating.
Eating-related guidance products focus heavily on nutrition and/or weight loss, to the detriment of a wider definition of “health”. Wellbeing is not only phy...
Wang, Ziyi (author)
Orthopedic surgeries are identified as among the noisiest procedures in the operating room (OR). Peak sound levels can reach around 130 dB, which is harmful to patients’ well-being and likely to evoke negative feelings during surgery. In the previous study investigating the emotional experience of patients who receive ...
Yilmaz, Mert Can
This study aimed to investigate the impact of religion on individuals' expectations for AI alignment. Drawing from an online survey with over 200 participants from the United States, the study revealed that one's religious affiliation shapes certain expectations towards AI alignment, notably indicating a tendency among religiously affiliated indivi...
Lindahl, Caroline; Saeid, Helin
Recent technological breakthroughs in natural language processing and artificial intelligence (AI), and the subsequent release of OpenAI's generative AI system, ChatGPT, have attracted much attention from researchers and the general public alike. Some respond with praise, foreseeing a brighter future for all, while others predict the end of humanity. As AI ag...
Gleave, Adam R
Real-world applications of machine learning often have complex objectives and safety-critical constraints. Contemporary machine learning systems excel at achieving high average-case performance at tasks with simple procedurally specified objectives, but they struggle at many more demanding real-world tasks. In this thesis, we work towards developin...
Chen, P.Y. (author)
This project proposes a different way of looking at AI alignment, namely by introducing AI Alignment Dialogues. We argue that alignment dialogues have a number of advantages in comparison to data-driven approaches, especially for behaviour support agents, which aim to support users in achieving their desired future behaviours rather than their curr...
Shah, Rohin Monish
Typically when learning about what people want and don't want, we look to human action as evidence: what reward they specify, how they perform a task, or what preferences they express can all provide useful information about what an agent should do. This is essential in order to build AI systems that do what we intend them to do. However, existing ...