Project descriptions

The projects listed below are those available to the 2024 cohort. In year 1, students will undertake group rotation projects in 3 of the listed projects before selecting a single project as their PhD topic.


Inferring ecological interactions from metagenomics

Primary supervisor: Dr. Christina Faust

Project summary

Advances in genomic sequencing have enabled the characterisation of entire viral communities within individual hosts without the need for isolation or a priori information on species. Metagenomic surveys have accelerated the discovery of new pathogen species across all kingdoms of life, vastly expanding our knowledge of global pathogen diversity. However, metagenomic approaches have generally been underutilised for advancing our understanding of ecological interactions. This project will leverage machine learning methods, using viral host predictor models to classify plant vs rodent viruses and to identify vector-borne viruses. This will allow the student to characterise the community dynamics of viruses among rodents, identify demographic and ecological drivers of virome composition, and construct ecological interaction networks.

Modelling mobile animal groups as attention-based graph neural networks

Primary supervisor: Prof. Colin Torney

Project summary

Collective movement is ubiquitous throughout the natural world; from fish schools and marching ants to herds of wildebeest, living organisms live and move collectively. Understanding how animal groups are able to achieve large-scale coordinated motion in the absence of centralised control is an important challenge for movement ecology. Recent developments in artificial intelligence have made it possible to develop and train intelligent algorithms that are able to process large, complex datasets and extract meaningful insight from these data. A new class of machine learning models, graph neural networks, presents an exciting opportunity for understanding interacting systems, including animal groups. These models can be trained on simulated or empirical data and can learn to predict the behaviour of complex interacting systems. This project seeks to develop a framework for using attention-based graph neural networks to model collective movement of animal groups using data from UAV studies of wildebeest, caribou, and bison. A core aim of the project will be to investigate the connections between information processing in animal groups and the emergent intelligence of attention-based neural networks.
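As a rough illustration of what "attention" over neighbours means in this setting, the sketch below weights each neighbour's velocity by a softmax score and averages them to give a predicted velocity for a focal animal. It is only a toy under stated assumptions: the function names, the dictionary layout, and the distance-based scoring rule are all hypothetical, and in a trained attention-based graph neural network the scoring function would be learned from data rather than set by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_update(focal, neighbours, temperature=1.0):
    """Attention-weighted average of neighbour velocities.

    Assumption (hypothetical, for illustration only): the attention score
    is the negative squared distance to the focal animal, so nearer
    neighbours receive more weight. A trained model would replace this
    hand-set rule with a learned scoring function.
    """
    fx, fy = focal["pos"]
    scores = []
    for n in neighbours:
        nx, ny = n["pos"]
        scores.append(-((nx - fx) ** 2 + (ny - fy) ** 2) / temperature)
    weights = softmax(scores)
    # Weighted average of neighbour velocities, one component at a time.
    vx = sum(w * n["vel"][0] for w, n in zip(weights, neighbours))
    vy = sum(w * n["vel"][1] for w, n in zip(weights, neighbours))
    return weights, (vx, vy)


# Toy example: one near neighbour moving east, one far neighbour moving north.
focal = {"pos": (0.0, 0.0)}
neighbours = [
    {"pos": (1.0, 0.0), "vel": (1.0, 0.0)},
    {"pos": (10.0, 0.0), "vel": (0.0, 1.0)},
]
weights, predicted_velocity = attention_update(focal, neighbours)
```

In this toy example the near neighbour dominates the weights, so the predicted velocity points almost entirely east.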

Estimating the size of partially observed disease outbreaks from sequence data

Primary supervisor: Prof. Daniel Haydon

Project summary

Many important pathogens have now been shown to evolve fast enough that pathogen genomic diversity accumulates over transmission events during even a relatively short outbreak. The increasing affordability of pathogen genome sequence data and the speed at which it can accumulate provide many potential opportunities to study the epidemiology of disease outbreaks in new ways. We are particularly interested in how this diversity can be used to estimate the overall number of cases generated over an outbreak. It is inevitable that sequence data will be acquired from only a subset of infected cases, so it is necessary to account quantitatively for cases for which genomic samples were not collected. Different approaches have been described in the literature, but they often require or assume detailed knowledge of the underlying epidemiology and host-pathogen population structure and dynamics. Here we propose to develop a semi-parametric and non-mechanistic approach for estimating the size of a partially observed (i.e. incompletely genomically sampled) outbreak from monophyletic pathogen sequence data that uses the internal and external lengths of the branches of a phylogenetic tree constructed from the genome data. We will need to develop a simple and unbiased estimator of outbreak size that is applicable to the generally quite low observation frequencies that are inevitable in practice, and that can accommodate various forms of common biases and heterogeneities in the observation process.
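To make the distinction between internal and external branch lengths concrete, the sketch below totals them for a small rooted tree. External branches are those leading to a tip (a sampled sequence); internal branches connect internal nodes. This is not the proposed estimator, only the kind of summary statistic the text refers to. The nested-dictionary tree format and the function name are hypothetical, and the root's own branch is ignored by assumption.

```python
def branch_length_totals(node, is_root=True):
    """Return (internal_total, external_total) branch lengths for a tree.

    Assumed (hypothetical) format: each node is a dict with a "length"
    (the branch leading to it) and an optional list of "children".
    The root's own branch length is ignored.
    """
    children = node.get("children", [])
    length = 0.0 if is_root else node["length"]
    if not children:
        # A tip: its branch counts as external.
        return 0.0, length
    # An internal node: its branch counts as internal.
    internal, external = length, 0.0
    for child in children:
        child_internal, child_external = branch_length_totals(child, is_root=False)
        internal += child_internal
        external += child_external
    return internal, external


# Toy tree: tip A joins the root directly; tips B and C share an internal node.
tree = {
    "length": 0.0,
    "children": [
        {"length": 1.0},
        {
            "length": 0.5,
            "children": [{"length": 2.0}, {"length": 3.0}],
        },
    ],
}
internal_total, external_total = branch_length_totals(tree)
```

For this toy tree the single internal branch has length 0.5 and the three external branches sum to 6.0.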


Mathematical and statistical modelling of coordinated animal behaviour

Primary supervisor: Prof. Dirk Husmeier

Project summary

Understanding collective movement is of key interest in ecology, as animal migration connects ecological communities and mediates their diversity and stability. The purpose of this project is to develop advanced statistical and computational tools to learn about the key driving factors of collective movement. The student will study a flock of 60 sheep, and formulate hypotheses about their collective movement with a set of mathematical models. The animals are equipped with GPS and accelerometers, from which high-resolution time series of their positions are available. This unique dataset will allow the student to understand the emergence of coordinated large-scale movement from individual decisions. The PhD project aims to contribute to the study of collective animal behaviour and movement, which sees the emergence of coordinated large-scale phenomena from individual-based processes, such as switches between internal behavioural states and local interactions with other group members and the environment. The student will develop a novel suite of detailed mathematical models to describe these processes, and address the important challenges of sensitivity analysis and statistical inference: which model parameters have a significant effect on the large-scale phenomena of interest, how can they be estimated, and how can their uncertainty be quantified? The student's work will particularly focus on the study of a flock of free-roaming sheep in Patagonia, with location data from GPS tracking combined with accelerometer and magnetometer recordings and a detailed environmental map of vegetation and topography. This not only provides an excellent testbed for the evaluation of our new models and inference methods, but is also a unique resource for the elucidation of collective animal movement processes per se.

Understanding the impact of human presence on migratory wildebeest using edge machine learning

Primary supervisor: Prof. Grant Hopcraft

Project summary

Understanding how animals respond to human presence is a vital component of balancing conservation and development. This project will use a combination of GPS tracking data, accelerometers, edge machine learning, and AI remote sensing to understand how human presence and tourism infrastructure affect the movement and behaviour of wildebeest in the Serengeti National Park, Tanzania. The project will use a combination of statistical and mechanistic modelling approaches to understand how wildebeest alter their movement patterns and activity budgets in response to tourism and will specifically seek to develop strategies for high-volume, low-impact, sustainable tourism practices.

Combining data sources in multi-layer point process models to distinguish habitat preferences and dispersal limitation in tropical forest plant communities

Primary supervisor: Prof. Janine Illian

Project summary

There is an interest in understanding the specific habitat preferences of different rainforest species and which ecological processes dominate in shaping plant communities. Large data sets exist from plots across multiple tropical forest sites that detail the exact spatial location of each tree as well as environmental covariates such as topography and soil nutrients. Some plots have a nested grid of seed traps that accumulate data on the spatial pattern of seed arrival through time. Analyses of these point patterns have shown that rainforest trees cluster in space. The processes that determine the emergence of these spatial structures are complex, but potentially include spatial aggregation in more favourable habitats, competition between species, non-random seed dispersal linked to the distribution of adult trees, and spatially structured patterns of individual survival. Since these processes operate contemporaneously, they cannot be distinguished based on a single point pattern. For this project, additional data have been collected through seed traps and seedling maps, which can be used to gain a better understanding of seed dispersal. The project will develop multi-layered, multi-likelihood spatial point process models that combine the different data sources using hierarchical Bayesian methods, fitted using computationally efficient approaches.


Useful ecological predictions in a changing and data-deficient world

Primary supervisor: Prof. Jason Matthiopoulos

Project summary

The predictions of applied tools such as species distribution modelling and population viability analysis are increasingly used to answer what-if questions about anthropogenic risk. This confronts us with two interlocking challenges: transferability and data sparsity. How can we use data from a given place, time and species to make useful (i.e., both accurate and precise) predictions about related species at some other place, in the future? This project will exploit the convergence between mechanistic ecological theory and modern statistical inference to drastically increase the predictive reach of ecological models in a world where change is the rule, not the exception. It will achieve this by combining models and data efficiently, according to six principles: 1) Space-for-time substitution, 2) Using contrasting environmental scenarios, 3) Using data from contrasting population scenarios, 4) Connecting habitat to population change, 5) Fully propagating uncertainty to predictions, 6) Sharing insights and inference between species.

Linking behaviour and space use with population dynamics

Primary supervisor: Prof. Juan Morales

Project summary

Ecology is undergoing a revolutionary transformation with a flood of new types of data from GPS tracking tags and biologging sensors that are providing an unprecedented level of detail about what animals are doing and where they are doing it. This wealth of data has been used to create step-changes in our understanding of the scales at which animals move, what habitats are selected, and the choices that determine behaviours such as vigilance and foraging. However, these new data could also be used to address new and larger-scale questions about population dynamics. In principle, forging quantitative links between fine-scale decision-making and individual fitness could permit projections up to a population scale, enabling predictions about demography and even the dynamics of ecological communities. The challenge is that the analytical methodology required to make this link is in a relatively undeveloped state, and these links are rarely rigorously established. Furthermore, the enormous volume of data being collected requires innovative and efficient ways to perform inference and uncertainty quantification. Our goal is to develop a comprehensive framework to link behaviour, space use, and population dynamics, combining the recent technological advances in fine-scale data acquisition with cutting-edge statistical methodology. Application of this framework would provide us with much deeper and more wide-ranging insights into the drivers of demographic change and would allow the development of pre-emptive conservation strategies for endangered populations and management plans for free-ranging livestock.

Interactive machine learning for the automated detection of marine biological sounds to understand how biodiversity changes in time and space

Primary supervisor: Dr. Laurence De Clippele

Project summary

This project aims to automate marine biodiversity monitoring, using a combination of long-term deep-sea image and acoustic datasets, by developing a user-friendly interactive machine learning (IML) software tool. Deep-sea habitats, such as cold-water coral reefs and sponge grounds, are hotspots of biodiversity and are fundamental to our planet’s health, resilience and food provision, all of which are closely linked to human well-being. A growing number of long-term image and sound (acoustic) datasets are being collected to understand how the biodiversity of these habitats is affected by natural-, human- and climate-driven changes. Analysing these datasets manually for the presence of species-specific sounds, produced by e.g. fish and crustaceans, can be prohibitively time-consuming. While IML has proven useful for species detection in underwater images, using IML to detect fish and crustacean sounds is a new and exciting field of research. The PhD student will be part of international projects focusing on understanding the importance of deep-sea habitats and their restoration for biodiversity and commercially important species. So that end-users, such as biologists and ecologists, can trust and use the IML tool, the PhD student will develop skills in understanding the ecological relevance of the outputs and in science communication.