Every summer, the Department of Statistics offers scholarships to a number of students so they can work with staff on real-world projects. Jale is exploring evolutionary relationships with Dr Steffen Klaere. Jale explains:
“To understand the evolution of life on earth, we need to make inferences about evolutionary events leading up to the diversity of life we see now. My project is about phylogenetic inference – the set of bioinformatic tools for estimating evolutionary relationships between different species, or taxa.
“The set of taxa and their divergence is usually represented by a DNA-sequence alignment. The basic assumption is that each sequence represents a taxon and the evolutionary divergence between species is identified by differences in the respective sequences. The relatedness of taxa is then represented by a phylogenetic tree, where closely-related species are identified with leaves that are close together in the tree.
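As a rough illustration of how differences between aligned sequences can be read as divergence, the sketch below computes the proportion of differing sites for each pair of taxa. The toy alignment, the taxon names and the function name `p_distance` are made up for this example and are not part of Jale's project; real analyses use model-based distances and dedicated phylogenetics software.

```python
from itertools import combinations

# Hypothetical toy alignment: every sequence has the same length,
# and position i in each sequence is assumed to be homologous.
alignment = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTACGA",
    "taxon_C": "ACTTACCA",
    "taxon_D": "TCTTGCCA",
}


def p_distance(seq_a, seq_b):
    """Return the proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# Pairwise distances for all unordered pairs of taxa.
distances = {
    (a, b): p_distance(alignment[a], alignment[b])
    for a, b in combinations(alignment, 2)
}

for (a, b), d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{a} vs {b}: {d:.3f}")
```

In this toy data, taxon_A and taxon_B differ at a single site, so they would be expected to sit close together in a tree, while taxon_A and taxon_D differ at most sites.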
“In my project, I will investigate different statistics to assess the fit between model and data. In particular, I am interested in identifying sites in the alignment that are not well represented by the model.
“It has been established that alignments for species with an old most recent common ancestor will have more sites that do not inform the phylogenetic hypothesis. Such sites are often called saturated sites, on the assumption that they have accumulated many mutations over time. It has been hypothesised that such sites can lead to systematic error in the inference.
“That is why we want to identify influential outliers and mask them before the inference. Statistics like observed variability (OV) distances, which are easier to compute, have been proposed, but they tend to overestimate the number of saturated sites.
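To make the idea of scoring and masking sites concrete, here is a minimal sketch. It assumes one common formulation of observed variability, in which a site's score is the fraction of sequence pairs that disagree at that site; the article does not define the statistic, so this definition, the threshold of 0.7, the toy sequences and all function names are assumptions for illustration only.

```python
from itertools import combinations


def observed_variability(column):
    """Fraction of sequence pairs that differ at one alignment column.

    Assumed definition: mismatching pairs divided by all pairs.
    """
    pairs = list(combinations(column, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(a != b for a, b in pairs) / len(pairs)


def site_scores(sequences):
    """Score every column of an alignment given as equal-length strings."""
    return [observed_variability(col) for col in zip(*sequences)]


def mask_sites(sequences, threshold):
    """Drop columns whose score exceeds the (hypothetical) threshold."""
    scores = site_scores(sequences)
    keep = [i for i, score in enumerate(scores) if score <= threshold]
    return ["".join(seq[i] for i in keep) for seq in sequences]


# Hypothetical toy alignment with one highly variable column (index 3).
sequences = ["ACGTA", "ACGCA", "ATGGA", "ATGAC"]
scores = site_scores(sequences)
masked = mask_sites(sequences, threshold=0.7)

print([round(s, 2) for s in scores])
print(masked)
```

A fixed cut-off like this flags every fast-changing column, whether or not it actually misleads the inference, which is one way a simple statistic could end up overestimating the number of saturated sites.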
“My task will be to investigate methods that propose to identify saturated sites and test them on datasets known to suffer from systematic error. In particular, I want to test the utility of combining different statistics to address such problems.
“I am from Duelmen in Germany, but I have lived and studied in Greifswald for the past three years. I have just completed my Bachelor of Science in Biomathematics at the University of Greifswald. After this research project, I would like to pursue my Master of Science in Statistics at the University of Dortmund, also in Germany.
“What I like about statistics is that it has a wide range of uses and deals with diverse topics in biology, medical science and economics. Furthermore, I like that statistics deals with all aspects of data, including planning data collection through the design of surveys and experiments. You can work together with people from different institutions and you benefit from the different knowledge they bring.
“As I have some free time over Christmas and New Year, I am thinking about travelling to the Bay of Islands to spend some time at the beaches and enjoy the beautiful countryside. After the research project, I will also have some time to travel to the South Island before flying back to Germany.”