A journey to open data

Despite scientific breakthroughs and the invention of increasingly effective medical technologies, the root causes of many neurodegenerative disorders remain unknown. This is particularly the case for dementia, whose global economic impact was estimated at US$604 billion in 2010. Alzheimer’s disease (AD) is the most common type of dementia. Clinical studies allow a better understanding of the underlying biological mechanisms of neurodegenerative disorders as well as advances in treatment. For these reasons, the contribution of research participants is invaluable. Unfortunately, data acquisition is time-consuming, complex and costly, which limits the number and quality of discoveries made. Open science, in which scientists support the open dissemination of their data, can be a means of overcoming these challenges.

An impressive shared dataset

During the first Canadian Open Neuroscience Platform (CONP) annual meeting, held in Toronto on April 28-30, researcher Dr. Sylvia Villeneuve and research coordinator Jennifer Tremblay-Mercier of the Douglas Mental Health University Institute, McGill University, presented the first dataset to be made available on the CONP portal. These data were collected as part of the PREVENT-AD project launched by Dr. John Breitner, who wished to identify biomarkers specific to AD before the occurrence of clinical symptoms. This longitudinal study was initially financed and conducted from 2011 to 2017 among 425 people with a family history of AD, who are at a two- to threefold increased risk of developing the disease.

Around 25 researchers, healthcare professionals and support staff were mobilized to recruit participants and to gather, process and analyze the data. Notably, the team remained unchanged during the entire research period, which added to the strength of the experimental design. The data collected as part of PREVENT-AD are very rich and include, as is the case for many other AD studies, neuropsychological tests, genetic data, and structural and functional magnetic resonance imaging (MRI) scans. Moreover, this research project stands out for its inclusion of other types of data that are less commonly collected, among them olfactory and auditory processing tests and cerebrospinal fluid analyses. Some hardy participants, keen to advance scientific knowledge, received up to six spinal taps over the course of the study.

In 2017, Dr. Breitner decided to share the collected data openly, giving access even to researchers with whom he had not already collaborated, in order to accelerate AD-related discoveries. He entrusted Tremblay-Mercier with coordinating the preparation of the data for sharing, while Villeneuve decided to continue following the participant cohort to study the evolution of their cognitive profile.

Jennifer Tremblay-Mercier (left), Dr. Alan Evans (centre) and Dr. Sylvia Villeneuve (right) at the CONP annual meeting.

Obstacles to overcome

The transition to open data sharing was not without pitfalls, Tremblay-Mercier pointed out. First, it required going back to the research ethics committee that had initially approved Breitner’s methodology to verify that sharing participants’ data would not violate their rights. Because the CONP infrastructure and its ethics and governance framework were not yet in place, and because there was no precedent for this kind of procedure, the review process took several months. The committee eventually gave its approval, but it was conditional on obtaining the participants’ consent for their personal data to be publicly disseminated. To that end, Villeneuve and Tremblay-Mercier met with the participants during an annual PREVENT-AD gala held in 2018 to explain the issues and challenges surrounding the public release of their research data. Participants started giving their consent shortly after the gala through individual consultations with the research coordinator. This allowed the participants to make a free and informed decision, and ensured that the process of data sharing was ethical and respectful of participants’ privacy.


Registered access, managing the risks of open access

Open data sharing does not necessarily mean that access is given to the entire public. In the case of PREVENT-AD, it was planned that most of the data would be accessible through registered access, granted only to researchers affiliated with an academic or research institution. The research team also had to make sure that anyone accessing the data would be unable to identify the participants who provided them. Tremblay-Mercier explained that data de-identification was particularly important for ethical reasons, since the clinical and biological information gathered could be used maliciously by individuals or private organizations. Another ethical risk that the research team had to manage was the possibility of participants themselves looking for their own data. Some of them might wish to access the data to try to find out their odds of developing AD, even though they may lack the necessary training to interpret their data. Data de-identification thus necessitated certain precautions, such as the removal of certain details and the recoding of others (e.g. occupation). Some information also required more technical work before it could be shared, such as the removal of participants’ faces from their MRI scans.


Making a dataset ready for open science: a costly technical process

Tremblay-Mercier emphasized that other barriers could prevent researchers from openly sharing their research data. Her experience with PREVENT-AD made her realize that an enormous amount of resources must be deployed to prepare datasets for open sharing when this was not the original intention of the study. Even though numerous already-collected datasets could be reused by the scientific community, researchers do not always have the means to make them shareable with a broader community or through open science practices. Although this is a complex process, it remains feasible. Based on their first-hand experience, the PREVENT-AD team is developing a resource that details the different steps required to achieve this goal more effectively. Tremblay-Mercier believes that scientists should plan to make their data openly available from the outset, by constructing an appropriate data sharing plan before seeking research protocol approval from ethics committees and before data acquisition.


Beyond the sharing of results

According to Tremblay-Mercier, making research results accessible in scientific journals is a step in the right direction for society, but it is no longer enough. She emphasized that researchers are placing more and more importance on transparency in research; they find it important to access the raw data and the methods of analysis used in order to be assured of the validity of the conclusions drawn. Tremblay-Mercier noted that funding is a key element in the establishment of an open-science culture and mentioned that funding agencies are already prioritizing projects for which data would be openly shared. 

Moreover, Tremblay-Mercier noted that some research participants would also appreciate greater transparency with regard to their own data: “Participants often ask us for their test results, they want to know.” She explained that researchers are currently barred for ethical reasons from informing participants about experimental data or genetic information even if some of these data are considered risk factors. “Currently, ethics committees usually ban transmitting experimental information [to our participants for good reasons]. If we transmit sensitive information that might have an impact on their lives, we have to be sure this information is true and if it is, that it will not cause more harm than benefit to the participant.” Tremblay-Mercier remained optimistic, however: “I believe there could be a way of structuring a protocol to be put into place so that we could be a little bit more transparent with our research participants, so we can transmit some of our results, in a very managed way, accompanied by a physician, with follow-ups to be sure that participants understand what we tell them and to measure the impact of such information on their lives following the disclosure of certain results.” While this may appear utopian, Tremblay-Mercier believes that such transparency could encourage clinical study participation and in turn increase the pace of discovery to address the unmet therapeutic needs in dementia and other diseases.

Data collected from the majority of participants in the PREVENT-AD project are currently available in the LORIS database, and more data will be added progressively as additional participants consent to open access to their data. These data and other datasets will be accessible via CONP once the infrastructure has been established.