NIH invests $32M to study ‘biomedical big data explosion’

The National Institutes of Health today announced grants totaling $32 million that will help researchers across the country study and develop strategies to analyze and leverage “the explosion of increasingly complex biomedical data sets.”

The grants are part of an initiative called Big Data to Knowledge (BD2K). NIH projects that its total investment in the initiative will reach nearly $656 million through 2020, pending available funds.

With advancements in DNA sequencing and imaging, biomedical data generation is currently “exceeding researchers’ ability to capitalize on the data,” according to NIH. The awards will support the development of new approaches, software, tools and training programs to improve access to the data and hasten discoveries using the data.

The data offer many opportunities to improve human health. Examples include a better ability to identify patients at increased risk for breast cancer and heart disease, along with more effective ways to treat and prevent such conditions.

“Data creation in today’s research is exponentially more rapid than anything we anticipated even a decade ago,” NIH Director Francis S. Collins, M.D., Ph.D., said in a statement. “Mammoth data sets are emerging at an accelerated pace in today’s biomedical research and these funds will help us overcome the obstacles to maximizing their utility. The potential of these data, when used effectively, is quite astounding.”

The funding will establish 12 centers, each of which will tackle specific data science challenges. The awards will also support a consortium that will take a community-based approach to developing a data discovery index, as well as data science training and workforce development.

Studies generating large amounts of data continue to proliferate, from imaging projects to epidemiological studies examining thousands of participants to large disease-oriented efforts such as The Cancer Genome Atlas, which examines the genomic underpinnings of more than 30 types of cancer, and the ENCODE Project, which seeks to identify all functional elements in the human genome.

Such efforts, NIH said, have generated billions of data points, giving both the original researchers and other investigators the opportunity to use the results in their own work to advance knowledge of biology and biomedicine.

“The future of biomedical research is about assimilating data across biological scales from molecules to populations,” said Philip E. Bourne, Ph.D., NIH associate director for data science.

The four main components of the awards are:

— Centers of Excellence for Big Data Computing. These 11 centers will develop approaches, methods, software, tools and other resources. While the development efforts will focus on specific research questions, their output is expected to be more generally relevant to various aspects of big data science, such as data integration and use, analysis of genomic data and managing data from electronic health records.

— BD2K-LINCS Perturbation Data Coordination and Integration Center. This center will be a data coordination center for the NIH Common Fund’s Library of Integrated Network-based Cellular Signatures program, which aims to characterize how a variety of types of cells, tissues and networks respond to disruption by drugs and other factors. The center will support data science research focusing on interpreting and integrating LINCS-generated data from different data types and databases in the LINCS-funded projects. This center is co-funded by BD2K and the NIH Common Fund.

— BD2K Data Discovery Index Coordination Consortium. This program will create a consortium to begin a community-based development of a biomedical data discovery index that will enable discovery, access and citation of biomedical research data sets.

— Training and Workforce Development. These awards support the education and training of current and future generations of researchers who will specialize in data science fields, as well as those whose work may require certain expertise in the use of or generation of large amounts of data and data resources.

Collectively, the centers and award recipients in these four components aim to overcome the many challenges in putting the growing streams of biomedical data to their best use. According to NIH, those challenges include locating data, applying the appropriate software to analyze it, a lack of data standards and low adoption across the research community of the standards that do exist.

New policies are also needed to facilitate data sharing while protecting patient privacy. “A lack of standards and an unwillingness to make data available to colleagues hampers efforts to make data fully useful to the broad research community,” the NIH said in announcing the awards.
