
Carnegie Mellon builds new algorithm for analyzing the cancer genome

No tool has been able to simultaneously analyze genomic sequencing data for both copy number variations, including aneuploidy, in which whole chromosomes get duplicated, and structural variants, such as DNA insertions, deletions, duplications or rearrangements. Until now.

Extra copies of normally paired chromosomes. Variations in chromosome color show where DNA has become rearranged and duplicated within and between chromosomes.


A cancer genome can be insanely complicated, making the disease difficult to study and treat. Large chunks of DNA — including millions of base pairs or even whole chromosomes — can get yanked from their original locations and moved elsewhere, duplicated or even flipped. But an algorithm, named Weaver, developed by researchers at Carnegie Mellon University, may offer new ways to break down some of that complexity.

The name comes from a character in the video game Defense of the Ancients, and it also describes the algorithm’s function: weaving together disparate pieces of genomic information.

“The cancer genome is reshuffled and scrambled compared to the normal genome,” said associate professor of computational biology Jian Ma, whose Computational Comparative Genomics Lab is leading the project. “Weaver’s goal is to interlace genomic pieces and keep things in the right order.”

To accomplish this, Weaver analyzes two major classes of mutations in tumor DNA. The first is copy number variation, including aneuploidy, in which whole chromosomes are duplicated or lost. The second is structural variation, such as DNA insertions, deletions, duplications or rearrangements. The algorithm uses a probabilistic model called a Markov random field, which captures how interrelated variables in complex data depend on one another.
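To give a rough sense of how a Markov random field can tie copy number to structural breakpoints, here is a toy sketch in Python. It is not Weaver's actual model, which the article does not spell out. The function name, the Gaussian read-depth score, the switch penalty and every parameter value are assumptions chosen for illustration. Each genomic segment gets a copy-number label, neighboring segments are encouraged to agree unless a known breakpoint separates them, and the most probable labeling is found with the Viterbi (max-product) algorithm.

```python
# Toy chain-structured Markov random field. Illustrative only, NOT Weaver's model.
# All names and parameter values here are hypothetical.

def map_copy_numbers(depths, breakpoints, per_copy_depth=10.0,
                     max_cn=4, sigma=3.0, switch_penalty=4.0):
    """Return the most probable copy number for each segment.

    depths      -- observed read depth per genomic segment
    breakpoints -- set of indices i where a structural breakpoint lies
                   between segment i-1 and segment i
    """
    states = range(max_cn + 1)

    def unary(i, cn):
        # Gaussian log-likelihood (up to a constant) of the observed depth
        # given that the segment is present in `cn` copies.
        diff = depths[i] - cn * per_copy_depth
        return -(diff * diff) / (2.0 * sigma * sigma)

    def pairwise(i, prev_cn, cn):
        # Neighbors prefer the same copy number, unless a breakpoint
        # explains the change, in which case the change is free.
        if prev_cn == cn or i in breakpoints:
            return 0.0
        return -switch_penalty

    if not depths:
        return []

    # Forward pass: best score of any labeling ending in each state.
    score = [unary(0, cn) for cn in states]
    back = []
    for i in range(1, len(depths)):
        new_score, pointers = [], []
        for cn in states:
            best_prev = max(states, key=lambda p: score[p] + pairwise(i, p, cn))
            new_score.append(score[best_prev] + pairwise(i, best_prev, cn)
                             + unary(i, cn))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)

    # Backward pass: trace the best labeling from the last segment.
    cn = max(states, key=lambda s: score[s])
    path = [cn]
    for pointers in reversed(back):
        cn = pointers[cn]
        path.append(cn)
    path.reverse()
    return path


# A noisy bump in depth is smoothed away without breakpoint evidence,
# but accepted as a copy-number gain when breakpoints flank it.
noisy = [20, 20, 26, 20, 20]
print(map_copy_numbers(noisy, breakpoints=set()))
print(map_copy_numbers(noisy, breakpoints={2, 3}))
```

In this sketch the same read-depth data yields different copy-number calls depending on whether structural breakpoints are present, which is the kind of joint reasoning the two classes of mutations call for.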

Until now, no tool has been able to simultaneously analyze genomic sequencing data for both types of variations. It’s like being able to identify how furniture is aligned in a room, or how many rooms are in the house, but not both. Understanding how the rooms are arranged adds context to the furniture.

“The goal is to look at the sequencing data from the cancer genome and recognize these complex alterations,” said Ma. “None of the current structural variant detection methods are specifically designed for genomes with aneuploidy, a hallmark of cancer. Our algorithm can more precisely quantify complex rearrangement structure variants in the context of aneuploidy.”

By identifying and quantifying both types of alterations, Weaver provides a more comprehensive view of the cancer genome and sheds light on how these different variations interact.

“In cancer, we can see that certain regions are frequently amplified,” said Ma. “Typically, we don’t know why that amplification is happening. By applying this method, we should at least be able to get a sense that the amplification is due to a specific type of structural variation.”

There’s also the possibility of putting these genomic shifts into temporal context, which might cast light on tumor evolution and generate a better understanding of genomic cause and effect.

“Which variation happened first?” asked Ma. “Did the structural variants or deletion happen before or after the chromosome duplication? This approach could give us a better picture of how these copy number alterations and structural variants are connected.”

While this approach has more immediate applications in research, Ma envisions a possible future for Weaver in the clinic, as these structural mutations could directly impact cancer behavior.

“I think this method to view the genome globally, and in an unbiased fashion, can find complementary information for us to understand the cancer,” said Ma.

Ma’s team has successfully tested Weaver on a variety of cancer cell lines, including HeLa and MCF-7, and on samples from the National Institutes of Health’s Cancer Genome Atlas program. The research was recently published in the journal Cell Systems. The next step will be to study specific tumors.

“We would like to apply this to more samples and identify patterns in the same types of cancer, such as breast, ovarian, glioblastoma,” said Ma. “Do these structural changes have an impact on gene expression or phenotypes? If we have a better understanding of the structure of the genome, we’ll be better able to interpret functional genomic information.”

Image: Ella Marushchenko via Carnegie Mellon University