Scientific sessions, CRG Group Leader Seminars
Genomic and Epigenomic Variation in Disease Group, Bioinformatics and Genomics Programme, CRG
In the early 1970s, the heterogeneous nature of tumors was recognized in terms of the biological properties of sub-clones, such as antigen resistance, immunogenicity and growth rate. These discoveries led to the hypothesis of tumor evolution, which is based on the acquisition of alterations that confer a selective growth advantage on a neoplastic cell over surrounding normal or tumor cells. The hypothesis suggests that further alterations can give rise to new clones that, owing to their selective advantages, can become the next predominant cancer subpopulation. Cancer treatment can be thought of as a strong form of selective pressure that leads to the selection and growth of cancer cells that have acquired resistance through specific mutations. Today, next-generation sequencing (NGS) technologies allow the identification of these heterogeneous alterations and the quantification of the fraction of tumor cells harboring a specific alteration (often termed the cancer cell fraction).
Although many cancer-driving mutations and genes are already known, they cannot explain tumor development in the majority of cases. We therefore studied 326 cases of chronic lymphocytic leukemia (CLL), a tumor of circulating white blood cells, namely B lymphocytes. Exome-seq data from these 326 CLL tumor-normal pairs, obtained from the ICGC-CLL project, were analyzed to identify germline and somatic SNVs, indels and copy number variants, which were subsequently used to identify recurrently mutated driver genes. Using the deviation of the non-reference allele frequency (BAF) values of SNPs and CNVs from perfect heterozygosity, we characterized tumor sub-clones down to a cancer cell fraction of 10%.
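The link between allele frequencies and clonal fractions can be illustrated with a minimal Python sketch. It is not the group's pipeline: the function names, the purity and copy-number parameters, and the simplifying assumptions (a diploid normal genome, and a one-copy loss as the only copy number change considered) are hypothetical and chosen for illustration only.

```python
def ccf_from_vaf(vaf, purity=1.0, tumor_cn=2, multiplicity=1):
    """Estimate the cancer cell fraction (CCF) of a somatic SNV.

    Assumes (hypothetically) a diploid normal genome, a known tumor
    purity, a known total copy number at the locus in tumor cells, and
    a known number of mutated copies per mutated cell (multiplicity).
    """
    # Average number of allele copies per cell in the mixed sample.
    total_copies = purity * tumor_cn + 2 * (1 - purity)
    ccf = vaf * total_copies / (purity * multiplicity)
    return min(1.0, ccf)  # sampling noise can push the estimate above 1


def deleted_fraction_from_baf(baf):
    """Fraction of cells carrying a one-copy loss, from a germline SNP.

    At a heterozygous SNP the expected BAF is 0.5. If a fraction f of
    the cells lost one allele, the retained allele has BAF = 1 / (2 - f),
    so the deviation d = |BAF - 0.5| gives f = 4d / (1 + 2d).
    """
    d = abs(baf - 0.5)
    return 4 * d / (1 + 2 * d)


if __name__ == "__main__":
    # A somatic SNV seen at 5% VAF in a pure, diploid tumor sample.
    print(round(ccf_from_vaf(0.05), 3))
    # A heterozygous SNP whose BAF has shifted slightly away from 0.5.
    print(round(deleted_fraction_from_baf(1 / 1.9), 3))
```

Under these assumptions, a sub-clone at a 10% cell fraction shifts the BAF of a heterozygous SNP only from 0.5 to about 0.526, which suggests why many SNPs across a region need to be combined to detect small sub-clones.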
Next, we developed a novel Bayesian model for the identification of recurrently mutated driver genes that takes into account measures of positive selection, clonal fitness and locally adjusted background mutation rates. Our model can be applied to single genes as well as to groups of genes retrieved from either annotated pathways or protein-protein interaction networks. Applying our model to the 326 CLL cases, we identified most of the previously reported drivers of CLL as well as several novel candidate driver genes.
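The abstract does not specify the model, so the following Python sketch is only a toy illustration of one ingredient, namely comparing the observed mutation count of a gene with a locally adjusted background rate in a Bayesian way. It uses a simple gamma-Poisson Bayes factor as a stand-in technique; the function name, the prior parameters and the example numbers are all hypothetical, and measures of positive selection and clonal fitness are not modeled here.

```python
import math


def log_bayes_factor(k, n_samples, gene_length, local_mu,
                     prior_shape=2.0, prior_rate=0.2):
    """Log Bayes factor for 'mutated above background' vs. 'background'.

    H0: k ~ Poisson(m0), with m0 = n_samples * gene_length * local_mu.
    H1: k ~ Poisson(m0 * s), where the rate multiplier s has a
        Gamma(prior_shape, prior_rate) prior (hypothetical values,
        prior mean 10), so k is marginally negative binomial.
    """
    m0 = n_samples * gene_length * local_mu  # expected background count
    a, b = prior_shape, prior_rate
    # Terms shared by both hypotheses (k * log(m0) - log(k!)) cancel out.
    return (a * math.log(b) - math.lgamma(a) + math.lgamma(a + k)
            - (a + k) * math.log(b + m0) + m0)


if __name__ == "__main__":
    # Hypothetical gene: 1,500 coding bases, local background rate of
    # 1e-6 mutations per base per sample, 326 tumor samples.
    for k in (0, 1, 20):
        lbf = log_bayes_factor(k, 326, 1500, 1e-6)
        print(k, "candidate" if lbf > 0 else "background-like")
```

In this toy setting a positive log Bayes factor favors an elevated mutation rate. The same count-versus-expectation comparison could in principle be applied to a group of genes by summing their counts and expected background counts, although whether the actual model does so is not stated in the abstract.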