Open data is a leap forward in how we tackle global disease outbreaks

19 March 2020

by Dr Moritz Kraemer
Associate Professor of Computational and Genomic Epidemiology

Moritz's research addresses questions related to the spatial spread of infectious diseases. Specifically he is concerned with the integration of epidemiological, spatial and genomic data and how novel insights can be best used to reduce the burden of...

The scope of COVID-19 transmission is global, but we have in place a global understanding that enables a better-informed global response than has ever been possible before.

If you’ve been paying close attention to what scientists have been saying about coronavirus, you might have come across HealthMap, an online map that is tracking the outbreak in real time. Beneath the seemingly simple face of the COVID-19 map is a deep and meticulous dataset that is freely accessible to anyone involved in COVID-19 research. This represents a completely new approach to how we collect data and make it readily available during an outbreak, and it changes how we can respond to new global threats such as this one.

Let’s rewind to early January, when a novel pneumonia appeared in the Wuhan region of China. One of the first important aspects to understand was whether all of these cases were the result of animal-to-human transmission events at the wet market in Wuhan, the centre of the outbreak, or whether person-to-person transmission was carrying the disease through the general population.

To help reach this understanding, I started tracking a timeline of key events and confirmed cases, collecting data from social media and local reporting within China. For each confirmed case I recorded the travel history, the date of symptom onset, the date of hospitalisation and the date of confirmation as part of what was then called the 2019-nCoV outbreak (now known as COVID-19). As cases accelerated and I shared my data across a network of researchers, it became clear that there was wide demand for data on this disease across the scientific community, and that it needed to be tabulated so that it was more ‘ready for analysis’.

Over a few short weeks this project evolved into the Open COVID-19 Data Working Group, the public face of which is the HealthMap project. Every day it brings together between 15 and 25 researchers across a number of global institutions, working full-time to ensure national governments, health organisations like the US Centers for Disease Control and Prevention (CDC), and the research community have access to accurate, usable and truly global data to understand and combat what is now a global pandemic.

One important example of how this data has been used is how early in the outbreak the global scientific community was able to identify the 5- to 14-day incubation period of the virus. The detailed location and timeline data made openly available by the Open COVID-19 Data Working Group, in conjunction with other datasets and clinical information, enabled researchers to ascertain this with a high level of confidence early on. This early understanding has led directly to the government advice and controls being put in place around the world to contain the virus and delay its transmission, allowing individual healthcare systems more time to prepare.

Given the importance of the data and the direct global impact that any findings from its analysis might have, it is essential that the dataset is maintained in near real time. Our co-leaders in this working group are helping to ensure its quality as the quantity of data grows. The format remains the same: every line in our open-access database still aims to track the same data as we did originally, namely travel history, date of onset, date of confirmation and date of hospitalisation. Professor David Pigott of the University of Washington is leading on precise geo-location for each data line (each confirmed COVID-19 case) in the database. Mathematical modelling is being led by Professor Samuel Scarpino of the Network Science Institute (NetSI) at Northeastern University, and Harvard’s HealthMap team is leading on data visualisation. Next week the Open COVID-19 Data Working Group gets its first two full-time, fully funded researchers thanks to the support of the Oxford Martin School. This is a vital development, as up until now all researchers have done this work on unpaid time, as a sideline to the work they are funded for.
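
To make the idea of one line per confirmed case concrete, here is a minimal Python sketch of what such a record might look like. It is an illustration only and not the working group's actual schema: the class name, field names, helper function and the example values are all hypothetical, chosen simply to mirror the four items listed above.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CaseRecord:
    """One line of a hypothetical line list: a single confirmed case.

    Field names are assumptions for illustration, not the real database columns.
    """
    case_id: str
    travel_history: Optional[str]
    date_onset: Optional[date]
    date_hospitalisation: Optional[date]
    date_confirmation: Optional[date]


def onset_to_confirmation_days(record: CaseRecord) -> Optional[int]:
    """Days between symptom onset and confirmation, or None if either date is missing."""
    if record.date_onset is None or record.date_confirmation is None:
        return None
    return (record.date_confirmation - record.date_onset).days


# Entirely made-up example values, not taken from the real dataset.
example = CaseRecord(
    case_id="example-001",
    travel_history="Wuhan",
    date_onset=date(2020, 1, 10),
    date_hospitalisation=date(2020, 1, 13),
    date_confirmation=date(2020, 1, 15),
)

print(onset_to_confirmation_days(example))
```

Keeping every date field optional reflects the fact that early reports are often incomplete; a record can still be added and filled in later without changing the format.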

Fundamentally, this is a completely new approach to data collation and availability during a global disease outbreak. Not only are we crowdsourcing and rationalising large volumes of data into a single global dataset, but we are making that data open and accessible to the whole research community and the public. In previous outbreaks this sort of data was only available to governments and the WHO, and even then it was often available only at a country level and in different formats depending on who had collected it.

This means the global community has more data and analysis available than ever before to respond to the spread of the disease, and to understand if and how those responses are working. As the outbreak progresses we can evaluate the approaches being taken by, for example, Hong Kong, South Korea and Singapore, and compare them with those of other countries to understand what has worked well and what lessons should be taken forward. For that purpose we already have people focussing on regions like Latin America, where there have been few recorded cases, to make early recommendations based on an evidence-based understanding of what is effective.

With the Open COVID-19 Data Working Group we have built a unique collaboration, supporting the scientific community and the general public with never-before-seen access to data and easy-to-understand visualisations. However, we have also created more than that: a world-leading consortium that is now in place to mobilise and respond to similar threats in the future. The scope of COVID-19 transmission is global, but we have in place a global understanding that enables a better-informed global response than has ever been possible before.

This opinion piece reflects the views of the author, and does not necessarily reflect the position of the Oxford Martin School or the University of Oxford. Any errors or omissions are those of the author.