Big Data group

Our principal aim is to develop novel research methodology in Big Data to solve problems of international importance in biomedicine, business, engineering, environmental monitoring, epidemiology, finance, medical imaging, multimedia, public health, social science and visualisation, taking full account of cyber security and ethics. Our impact-focused research capitalises on the skills and expertise of group members and is built around a number of strategic themes.

Theoretical foundations

Research in this theme ranges from the development of statistical and machine learning methodologies that extract meaningful information from Big Data to the study of theoretical computer science. The underpinning aim is to develop new theoretical approaches, motivated by applications, to extract, manipulate and model ever-increasing amounts of high-dimensional information.

Statistics and data science

Research in Statistics and Data Science is grouped around five areas: modelling and inference, statistical learning, applications to medicine and health, applications to business and finance, and statistical education. Our particular areas of expertise include multivariate dependency and time series modelling, data integration, social media information extraction, evidence synthesis and meta-analysis, and business internationalisation.


Images and videos are among the largest and fastest-growing sources of information and present some of the biggest challenges for Data Science because of their volume and complexity. Image analysis is important in a wide range of research areas, including medical imaging, developmental biology, the analysis of biometric images, video traffic surveillance and general multimedia. The explosion in data from imaging sources makes unaided processing and interpretation by human beings impossible and requires the development of automated storage, management, processing and analysis algorithms.

Data management

Data volumes and streaming rates are expanding. In particle physics and multimedia, exascale data volumes present huge challenges for data processing and storage, and new technologies, systems and infrastructure must be developed to handle them. Big Data is also complex and heterogeneous, and the development of tools to link and query heterogeneous datasets offers a great opportunity to extract knowledge unavailable from individual datasets. However, proper schemes to manage and store data, and the efficient management of metadata, including data on sample preparation, experimental parameters and provenance, are essential to enable Big Data to deliver trustworthy results.

Real-world problems

Data Science offers immense potential for economic and societal impact, making a major difference to the way we live and work. Research in this theme takes an interdisciplinary approach to addressing a range of real-world problems using Big Data from application areas including health, business and the economy. Such data include, but are not limited to, major survey databases, digital media and transaction data.

Cyber security

Research in this theme ranges from the theoretical and technical implications of cloud computing, through anonymisation, disclosure limitation, privacy and cyber security, to the legal and social dimensions of data.