DATA SCIENCE HUB
This dedicated effort aims to maximize the discovery potential — and long-term value — of data generated across Break Through Cancer’s TeamLab projects. By spurring creation of robust tools for research data gathering and analysis, the Hub also aims to accelerate discovery at our collaborating institutions and in the global cancer research community.
PROJECT HIGHLIGHTS
- Ensure that all our scientific and clinical projects yield scientifically robust and technically reproducible findings — and that the resulting data are broadly accessible over the long term, thereby breaking through traditional research data silos.
- Expand computational discovery by developing and applying wholly new analytic methods and by integrating data sets across our projects.
- Create and adapt algorithms and methods necessary to make best use of data generated with new and emerging research technologies.
- Execute integrated, pan-cancer analyses that enable disease-specific findings to be explored in the context of other disease types.
- Provide a unique collaborative framework for training and mentoring future leaders in computational biology and for co-developing new technical approaches across laboratories and institutions.
- Draw on the guidance of world-class experts in cancer data science with expertise in creating large-scale data infrastructures, developing computational algorithms and methods, and deploying widely used software tools and databases on a global scale.
MEET THE TEAM
The Data Science Hub creates synergies and new opportunities for Break Through Cancer projects and collaborating researchers at five leading clinical and research centers.
We invite you to learn about the institutions and individual investigators driving this important research and development project.
Alex K. Shalek, PhD
MIT’s Koch Institute for Integrative Cancer Research
TeamLab(s): Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies, Eradicating Minimal Residual Disease in AML, Targeting Clonal Hematopoiesis to Prevent AML, The Data Science Hub
Alex K. Shalek, PhD (pronouns: he/him/his) is the Director of the Institute for Medical Engineering & Science (IMES), the Director of the Health Innovation Hub at MIT, and the J. W. Kieckhefer Professor in IMES and the Department of Chemistry at MIT, as well as an Extramural Member of its Koch Institute for Integrative Cancer Research. He is also an Institute Member of the Broad Institute, a Member of the Ragon Institute, an Assistant in Immunology at MGB, and an Instructor in Health Sciences & Technology at HMS. Dr. Shalek received his bachelor’s degree summa cum laude from Columbia University and his PhD in chemical physics from Harvard University under the guidance of Hongkun Park, and performed postdoctoral training under Hongkun Park and Aviv Regev (Broad/MIT). His lab’s research is directed towards the development and application of new approaches to elucidate cellular and molecular features that inform tissue-level function and dysfunction across the spectrum of human health and disease. Dr. Shalek and his work have received numerous honors, including an NIH New Innovator Award, a Beckman Young Investigator Award, a Searle Scholar Award, a Pew-Stewart Scholar Award, the Avant-Garde (DP1 Pioneer) Award from the National Institute on Drug Abuse (NIDA), and an Alfred P. Sloan Research Fellowship in Chemistry, as well as the 2019-2020 Harold E. Edgerton Faculty Achievement Award at MIT and the 2020 HMS Young Mentor Award.
Caroline Chung, MD, MSc, FRCPC, CIP
The University of Texas MD Anderson Cancer Center
TeamLab: The Data Science Hub
Dr. Chung is Vice President and Chief Data Officer and Director of Data Science Development and Implementation of the Institute of Data Science in Oncology at MD Anderson Cancer Center. She is a clinician-scientist and associate professor in Radiation Oncology and Diagnostic Imaging, with a clinical practice focused on CNS malignancies and a computational imaging lab focused on quantitative imaging and modeling to detect and characterize tumors and toxicities of treatment to enable personalized cancer treatment. Motivated by challenges observed in her own clinical and research pursuits, Dr. Chung has developed and leads institutional efforts to enable quantitative measurements for clinically impactful utilization and interpretation of data through a collaborative team science approach, including the Tumor Measurement Initiative (TMI) at MD Anderson. Internationally, Dr. Chung leads several multidisciplinary efforts to improve the generation and utilization of high-quality, quantitative data to drive research and impact clinical practice, including her role as Vice Chair of the Radiological Society of North America (RSNA) Quantitative Imaging Biomarker Alliance (QIBA), Co-Chair of the Quantitative Imaging for Assessment of Response in Oncology Committee of the International Commission on Radiation Units and Measurements (ICRU) and National Academies of Sciences, Engineering, and Medicine-appointed committee addressing Foundational Research Gaps and Future Directions for Digital Twins. Beyond her clinical, research and administrative roles, Dr. Chung enjoys serving as an active educator and mentor with a passion to support the growth of diversity, equity and inclusion in STEM, including her role as Chair of Women in Cancer (http://www.womenincancer.org/), a not-for-profit organization that is committed to advancing cancer care by encouraging the growth, leadership and connectivity of current and future oncologists, trainees and medical researchers.
Charlie Whittaker, PhD
MIT’s Koch Institute for Integrative Cancer Research
TeamLab: The Data Science Hub
Charlie Whittaker is a research scientist and leader of the Bioinformatics and Computing Core Facility in MIT’s Koch Institute for Integrative Cancer Research. Charlie and his staff provide expertise in bioinformatics, statistical support and powerful computational resources to the KI community. His facility has made numerous contributions to research projects in the Koch Institute, and its work is frequently recognized in publications through authorship or acknowledgements.
Whittaker received his BS in biology from the University of Vermont in 1990. His PhD thesis in cell and developmental biology was performed with Douglas DeSimone at the University of Virginia. He was a postdoctoral fellow in Richard Hynes’ lab at the Center for Cancer Research at MIT, where he developed an interest in bioinformatics and computing. He then contributed to the Human Genome Project as part of Chinnappa Kordira’s group at the Broad Institute. He rejoined the Koch Institute in his current role in 2004.
Cheng-Zhong Zhang, PhD
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Cheng-Zhong Zhang, PhD, is an Assistant Professor of Pathology at Dana-Farber Cancer Institute and Harvard Medical School/Brigham and Women’s Hospital. He is also an associate member of the Cancer program at the Broad Institute of MIT and Harvard. Dr. Zhang received his PhD in Chemical Engineering with a minor in Physics from the California Institute of Technology. He did postdoctoral research in single-molecule biophysics at Harvard Medical School and in cancer genomics at the Broad Institute. Zhang’s research focuses on the etiology and evolution of chromosomal aberrations during tumor development and progression. Using a combination of computational genomics and experimental biology, Zhang and collaborators have elucidated how simple cell division errors can instigate continuous chromosomal instability that both causes oncogenic alterations and contributes to therapy resistance. Zhang’s long-term goal is to determine the biological mechanisms of cancer genomic rearrangements and use genome instability as a biomarker both for early cancer detection and for new therapeutic development.
Elana Fertig, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab(s): The Data Science Hub, Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies
Dr. Fertig directs a hybrid computational and experimental lab in the systems biology of cancer and therapeutic response to develop a new predictive medicine paradigm in cancer. Her wet lab develops time-course models of therapeutic resistance and performs single-cell technology development. Her computational methods blend mathematical modeling and artificial intelligence to determine the biomarkers and molecular mechanisms of therapeutic resistance from multi-platform genomics data. These techniques have broad applicability to the analysis of clinical biospecimens, developmental biology, and neuroscience. Dr. Fertig is a Professor of Oncology and Director of the Division / Associate Cancer Center Director in Quantitative Sciences, co-Director of the Convergence Institute, and co-Director of the Single-Cell Training and Analysis Center. She has secondary appointments in Biomedical Engineering and Applied Mathematics and Statistics, affiliations in the Institute of Computational Medicine, Center for Computational Genomics, Machine Learning, Mathematical Institute for Data Science, and the Center for Computational Biology, and is a Daniel Nathans Scientific Innovator. Prior to entering the field of computational cancer biology, Dr. Fertig was a NASA research fellow in numerical weather prediction.
Emma Dyer, MS
Dana-Farber Cancer Institute
TeamLabs: Revolutionizing GBM Drug Development Through Serial Biopsies, The Data Science Hub
Emma C. Dyer is a PhD student in the Harvard Biological Sciences in Public Health program, studying in the laboratory of Dr. Franziska Michor at the Dana-Farber Cancer Institute. She received her Bachelor’s and Master’s degrees from The University of Chicago, where she studied Biological Sciences and Bioinformatics. She completed her Master’s work under the supervision of Dr. Alexander Pearson with a focus on computational pathology. Her work applied deep learning models for biomarker identification and survival prediction of patients with head and neck cancers. Currently, she applies spatial statistics and deep learning methods to spatial multi-omics datasets to study tumor evolution and heterogeneity.
Ethan Cerami, PhD
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Ethan Cerami, PhD, is the Director of the Knowledge Systems Group and Principal Scientist in the Department of Data Sciences at Dana-Farber Cancer Institute. Prior to joining Dana-Farber, he was the Director of Computational Biology at Blueprint Medicines, and Director of Cancer Informatics Development at Memorial Sloan Kettering Cancer Center (MSKCC). While at MSKCC, he co-founded the cBioPortal for Cancer Genomics, and his group remains active in its continued development. He is currently the Co-PI of the Human Tumor Atlas Network (HTAN) Data Coordinating Center, Co-PI of the National Cancer Institute Cancer Immunologic Data Commons (CIDC), and the Co-PI of the DFCI MatchMiner platform for algorithmically matching patients to precision cancer medicine trials. Dr. Cerami has an MS in Computer Science from New York University and a PhD in Computational Biology from Cornell University.
Greg Raskind, BS
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Greg Raskind is a graduate student in the laboratory of Dr. Rameen Beroukhim at Harvard Medical School. He works on developing computational methods to identify patterns of structural variants associated with specific genetic alterations in cancer genomes. He received his BS in biochemistry and mathematical biology with a minor in computer science from the University of Michigan.
Kimal Rajapakshe, PhD
The University of Texas MD Anderson Cancer Center
TeamLab(s): Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies, The Data Science Hub
Kimal Rajapakshe is a Computational Scientist at The University of Texas MD Anderson Cancer Center with a decade of experience in analyzing and integrating multi-omics data from both solid tumors and liquid biopsies. He specializes in analyzing RNA-Seq (bulk and single-cell), ATAC-Seq (bulk and single-cell), genomic sequencing (WES and targeted), ChIP-Seq, methylation array, proteomics and metabolomics data.
Linghua Wang, MD, PhD
The University of Texas MD Anderson Cancer Center
TeamLabs: Intercepting Ovarian Cancer, Conquering KRAS in Pancreatic Cancer, and The Data Science Hub
Dr. Wang is currently a tenure-track Assistant Professor in the Department of Genomic Medicine at MD Anderson Cancer Center. Dr. Wang received her MD in Medicine and her PhD in Cancer Genomics and completed her postdoctoral training at the Human Genome Sequencing Center, Department of Molecular and Human Genetics, at Baylor College of Medicine. She was recruited to MD Anderson in 2017 and set up the Computational Biology Laboratory. Dr. Wang has significant expertise in computational biology, cancer immunogenomics, and single-cell and spatial multiomics. Over the past few years, she has built a leading research program in cancer immunogenomics at MD Anderson and developed a collaborative, team-based approach to tackle cancer research. Her group has extensive experience in unraveling the heterogeneity and evolution of complex tumor-immune ecosystems using cutting-edge single-cell and spatial sequencing technologies, coupled with state-of-the-art computation and modeling. Dr. Wang is the principal investigator of the CPRIT Individual Investigator Research Award, and she serves as a co-Investigator for several peer-reviewed grants from NIH/NCI and the U.S. Department of Defense. Dr. Wang is also the recipient of the Sabin Fellow Award, two SPORE Career Enhancement Program Awards and three Institutional Research Grant Awards. She serves as the Bioinformatics Lead and project co-Leader for two MD Anderson Cancer Moon Shot Projects, and she also leads or co-leads several additional single-cell studies. When she was at Baylor, Dr. Wang also contributed significantly to the NHGRI rare cancer projects, the NCI Exceptional Responder Initiative, TCGA and pan-cancer projects. Dr. Wang is a productive investigator and has published 32 first- or senior-authored papers over the past few years. Among them, 23 were published in top-tier or other high-impact journals. As site Lead of Data Science for the pancreatic and ovarian cancer programs funded by Break Through Cancer, Dr. Wang is extremely enthusiastic about collaborating with world-renowned leaders, talented data scientists, and the multidisciplinary research teams across five participating institutions to develop effective data science strategies to better understand, detect, and treat the most lethal cancers.
Luciane T. Kagohara, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab(s): Demystifying Pancreatic Cancer Therapies, Conquering KRAS in Pancreatic Cancer, The Data Science Hub
Dr. Luciane Kagohara is a hybrid molecular and computational biologist. She obtained her bachelor’s degree in Biomedical Sciences from Universidade Estadual Paulista (Botucatu, Brazil) and her PhD from the Department of Pathology at the A.C. Camargo Cancer Center (Sao Paulo, Brazil). Earlier in her career, she focused on the identification of epigenetic and genetic cancer biomarkers. During her postdoctoral fellowship at Johns Hopkins University, she pursued training in bioinformatics and developed the expertise to perform combined experimental and computational research on epigenetic regulation of gene expression in acquired resistance. As a molecular biologist, she has extensive experience in experimental molecular biology and the aptitude to design, optimize and execute a wide range of approaches, including single-cell and spatial transcriptomics, ATAC-seq, and RNA-seq, among other molecular biology techniques. As a computational biologist, Dr. Kagohara can perform integrated analysis of genetic and epigenetic high-throughput data generated with different platforms. Her background is well suited to her research program, which applies state-of-the-art technologies, like single-cell and spatial transcriptomics, to investigate mechanisms of resistance to different immuno-, chemo- and targeted therapies. By applying experimental and computational single-cell and spatial approaches to clinical trial samples, Dr. Kagohara expects to discover molecular mechanisms and cellular interaction components of therapeutic resistance in cancers.
Manuel Schuerch, PhD
Dana-Farber Cancer Institute
TeamLabs: Revolutionizing GBM Drug Development Through Serial Biopsies, The Data Science Hub
Manuel Schuerch, PhD, is a Postdoctoral Fellow at Dana-Farber Cancer Institute in the Department of Data Science.
Manuel has a diverse background in Machine Learning, Statistics, and a range of biomedical applications. He earned his Bachelor’s degree in Computer Science and a Master’s in Statistics from ETH Zurich, Switzerland, followed by a PhD in Machine Learning from USI Lugano, Switzerland. Afterward, he pursued postdoctoral research at UZH and USZ Zurich, where he developed machine learning methods for personalized decision support in fields such as rheumatology, ICU care, diabetes, delirium, immunology, oncology, and solid organ transplantation. Currently, at Dana-Farber Cancer Institute, Manuel’s research focuses on designing advanced machine learning and AI models to analyze biomedical data, particularly for cancer progression models, perturbation effects, and multi-omics cancer data integration. His technical expertise and interest span probabilistic modeling, generative time series, counterfactual treatment effect estimation, uncertainty quantification, explainable AI, and foundational AI models in biomedicine. Manuel’s work is driven by the goal of advancing AI methods for personalized medicine, with applications in oncology, organ transplantation, and beyond.
Rachel Karchin, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab: Demystifying Pancreatic Cancer Therapies, The Data Science Hub
Rachel Karchin, PhD, is a Professor of Biomedical Engineering at Johns Hopkins University and a core member of the Institute for Computational Medicine. She holds joint appointments in the Departments of Oncology and Computer Science, and is a member of the Cancer Biology Program and of the Multidisciplinary Pancreatic Cyst Team, both within the Kimmel Cancer Center. Since 2007, she has been an affiliate member of the McKusick-Nathans Institute of Genetic Medicine at the Johns Hopkins School of Medicine. Karchin co-led The Cancer Genome Atlas (TCGA) PanCan Atlas Essential Genes and Drivers Analysis Working Group (2017-2018). She received a BS in Computer Engineering (1998) and MS (2000) and PhD (2003) in Computer Science from the University of California, Santa Cruz, and completed her postdoctoral work at the University of California, San Francisco in the Department of Biopharmaceutical Sciences. Her lab develops algorithms and software to analyze genomic data and interpret its impact on cancer, the immune system and tumor evolution. Karchin was the Whiting School of Engineering’s William R. Brody Faculty Scholar from 2013 to 2019. She was inducted as a Fellow of the American Institute for Medical and Biological Engineering in 2017, received the AACR Team Science Award in 2020 (TCGA) and was named a Distinguished Graduate Alumnus of the Jack Baskin School of Engineering at the University of California, Santa Cruz in 2021.
Rameen Beroukhim, MD, PhD
Dana-Farber Cancer Institute
TeamLabs: Intercepting Ovarian Cancer, The Data Science Hub, and Revolutionizing GBM Drug Development Through Serial Biopsies
Dr. Beroukhim is a practicing neuro-oncologist whose research focus is to understand tumor evolution, with emphases on brain tumors and alterations in chromosome structure. This work spans computational methods development, genomic studies of human cancers, and experiments in model systems. In early work describing integrated genomic profiling of glioblastomas, he developed the Genomic Identification of Significant Targets In Cancer (GISTIC) method that is now widely used to analyze copy-number changes across a range of cancers. He has also contributed to the development of several other genomic analysis methods and has led integrated genomic profiling efforts in multiple cancer types, including pan-cancer analyses across thousands of tumors. This work has identified novel mechanisms by which cancers develop and progress, and novel cancer dependencies that have spurred the development of new cancer therapeutics.
Shahab Sarmashghi, PhD
Dana-Farber Cancer Institute
TeamLab(s): The Data Science Hub, Revolutionizing GBM Drug Development Through Serial Biopsies
Shahab Sarmashghi, PhD, is a Postdoctoral Associate at the Broad Institute of MIT and Harvard and a Research Fellow at Dana-Farber Cancer Institute and HMS. He earned his BS and MS degrees from Sharif University of Technology, Iran, and his PhD from UC San Diego in Electrical Engineering. During his PhD, he developed several methods to utilize low-pass whole genome sequencing to study eukaryotic genomes. In the Beroukhim lab, Dr. Sarmashghi is interested in developing computational methods to study the biology of tumors, in particular GBM, and identify novel therapeutic targets. His main focus is on understanding positive and negative selection in cancer using somatic copy number alterations. He is also interested in studying cancer dependencies caused by loss of chromosome arms. He also works on developing new copy number calling pipelines and deploying them to the cloud.
Siri Palreddy
Dana-Farber Cancer Institute
TeamLabs: The Data Science Hub, Intercepting Ovarian Cancer, Revolutionizing GBM Drug Development Through Serial Biopsies, Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies
Siri Palreddy is a Clinical Research Coordinator for Dana-Farber Cancer Institute and works across the Break Through Cancer TeamLabs. She recently graduated magna cum laude from Amherst College, holding a BA in Biology and English.
Sophie Webster
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Sophie Webster is an Associate Computational Biologist in Rameen Beroukhim’s lab at the Broad Institute and at Dana-Farber Cancer Institute. She currently studies mechanisms of double-strand break repair across cancer types and is developing methods to detect sequence patterns in genomic rearrangements. Sophie graduated from Harvard College in 2022 with a bachelor’s degree in Integrative Biology.
Stuart Levine, PhD
MIT’s Koch Institute for Integrative Cancer Research
TeamLab(s): Revolutionizing GBM Drug Development Through Serial Biopsies, The Data Science Hub
Stuart S. Levine, PhD, is the director of the MIT BioMicro Center and co-leader of the Koch Institute Integrated Genomics and Bioinformatics core facility. Dr. Levine received his bachelor’s degree from MIT and his PhD from Harvard University under the guidance of Robert Kingston, and performed postdoctoral training under Richard Young (Whitehead Institute). The Integrated Genomics and Bioinformatics core provides MIT researchers with facilities for high-throughput data-intensive genomics and bioinformatic analysis, as well as large-scale database storage, management, data mining and data modeling required to fully implement systems approaches to investigate a broad spectrum of biological problems. Dr. Levine is currently president of the Northeast Regional Laboratory Staff and Core Directors, a chapter of the Association of Biomolecular Resource Facilities.
Wesley Tansey, PhD
Memorial Sloan Kettering Cancer Center
TeamLab: The Data Science Hub
Wesley Tansey, PhD, is an Assistant Professor in Computational Oncology at Memorial Sloan Kettering Cancer Center and in Physiology, Biophysics, and Systems Biology at Weill Cornell Medical College. Dr. Tansey received his PhD in Computer Science from the University of Texas at Austin and completed his postdoctoral training at Columbia University under the guidance of Raul Rabadan and David Blei. The Tansey lab works on statistical machine learning methods to address pressing problems in cancer data science, including spatial modeling of the tumor microenvironment, biomarker detection, and combination therapy discovery. Dr. Tansey is the recipient of an R37 MERIT grant award from the National Cancer Institute.
Yasmine Ahmed, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab: The Data Science Hub
Yasmine Ahmed, PhD, is a Postdoctoral Fellow in the Karchin lab. She has a PhD in Electrical and Computer Engineering from the University of Pittsburgh in Pennsylvania, an MSc degree in Biomedical Engineering from Nile University in Egypt and a BSc degree in Biomedical Engineering from Cairo University in Egypt. Her interdisciplinary research unites Natural Language Processing, Artificial Intelligence, Machine Learning and Graph Theoretical Analysis with applications in Computational and Systems Biology.
MEET THE TEAM
The Data Science Hub creates synergies and new opportunities for Break Through Cancer projects and collaborating researchers at five leading clinical and research centers.
We invite you to learn about the institutions and individual investigators driving this important research and development project.
View Team
Alex K. Shalek, PhD
MIT’s Koch Institute for Integrative Cancer Research
TeamLab(s): Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies, Eradicating Minimal Residual Disease in AML<, Targeting Clonal Hematopoiesis to Prevent AML, The Data Science Hub
Alex K. Shalek, PhD (pronouns: he/him/his) is the Director of the Institute for Medical Engineering & Science (IMES), the Director of the Health Innovation Hub at MIT, and the J. W. Kieckhefer Professor in IMES and the Department of Chemistry at MIT, as well as an Extramural Member of its Koch Institute for Integrative Cancer Research. He is also an Institute Member of the Broad Institute, a Member of the Ragon Institute, an Assistant in Immunology at MGB, and an Instructor in Health Sciences & Technology at HMS. Dr. Shalek received his bachelor’s degree summa cum laude from Columbia University and his Ph.D. from Harvard University in chemical physics under the guidance of Hongkun Park, and performed postdoctoral training under Hongkun Park and Aviv Regev (Broad/MIT). His lab’s research is directed towards the development and application of new approaches to elucidate cellular and molecular features that inform tissue-level function and dysfunction across the spectrum of human health and disease. Dr. Shalek and his work have received numerous honors including a NIH New Innovator Award, a Beckman Young Investigator Award, a Searle Scholar Award, a Pew-Stewart Scholar Award, the Avant-Garde (DP1 Pioneer) Award from the National Institute for Drug Abuse (NIDA), and an Alfred P. Sloan Research Fellowship in Chemistry, as well as the 2019-2020 Harold E. Edgerton Faculty Achievement Award at MIT and the 2020 HMS Young Mentor Award.
Caroline Chung, MD, MSc, FRCPC, CIP
The University of Texas MD Anderson Cancer Center
TeamLab: The Data Science Hub
Dr. Chung is vice president and Chief Data Office and Director of Data Science Development and Implementation of the Institute of Data Science in Oncology at MD Anderson Cancer Center. She is a clinician-scientist, associate professor in Radiation Oncology and Diagnostic Imaging with a clinical practice focused on CNS malignancies and a computational imaging lab focused on quantitative imaging and modeling to detect and characterize tumors and toxicities of treatment to enable personalized cancer treatment. Motivated by challenges observed in her own clinical and research pursuits, Dr. Chung has developed and leads institutional efforts to enable quantitative measurements for clinically impactful utilization and interpretation of data through a collaborative team science approach, including the Tumor Measurement Initiative (TMI) at MD Anderson. Internationally, Dr. Chung leads several multidisciplinary efforts to improve the generation and utilization of high quality, quantitative data to drive research and impact clinical practice, including her role as Vice Chair of the Radiological Society of North America (RSNA) Quantitative Imaging Biomarker Alliance (QIBA), Co-Chair of the Quantitative Imaging for Assessment of Response in Oncology Committee of the International Commission on Radiation Units and Measurements (ICRU) and National Academies of Sciences, Engineering, and Medicine-appointed committee addressing Foundational Research Gaps and Future Directions for Digital Twins. Beyond her clinical, research and administrative roles, Dr. Chung enjoys serving as an active educator and mentor with a passion to support the growth of diversity, equity and inclusion in STEM, including her role as Chair of Women in Cancer (http://www.womenincancer.org/) , a non-for-profit organization that is committed to advancing cancer care by encouraging the growth, leadership and connectivity of current and future oncologists, trainees and medical researchers.
Charlie Whittaker, PhD
MIT’s Koch Institute for Integrative Cancer Research
TeamLab: The Data Science Hub
Charlie Whittaker is a research scientist and leader of the Bioinformatics and Computing Core Facility in MIT’s Koch Institute for Integrative Cancer Research. Charlie and his staff provides expertise in bioinformatics, statistical support and powerful computational resources to the KI community. His facility has made numerous contributions to research projects in the Koch Institute and their work is frequently recognized in publications through authorship or acknowledgements.
Whittaker received his BS in biology from University of Vermont in 1990. His PhD thesis in cell and developmental biology was performed with Douglas DeSimone at the University of Virginia. He was a post-doctoral fellow in Richard Hynes’ lab at the Center for Cancer Research at MIT where he developed an interest in bioinformatics and computing. He then contributed to the Human Genome Project as part of Chinnappa Kordira’s group at the Broad Institute. He then rejoined the Koch Institute in his current role in 2004.
Cheng-Zhong Zhang, PhD
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Cheng-Zhong Zhang, PhD, is an Assistant Professor of Pathology at Dana-Farber Cancer Institute and Harvard Medical School/Brigham and Women’s Hospital. He is also an associate member of the Cancer program at the Broad Institute of MIT and Harvard. Dr. Zhang received his PhD in Chemical Engineering with a minor in Physics from California Institute of Technology. He did postdoctoral research in single-molecule biophysics at Harvard Medical School and in cancer genomics at the Broad Institute. Zhang’s research focuses on the etiology and evolution of chromosomal aberrations during tumor development and progression. Using a combination of computational genomics and experimental biology, Zhang and collaborators have elucidated how simple cell division errors can instigate continuous chromosomal instability that both cause oncogenic alterations and contribute to therapy resistance. Zhang’s long-term goal is to determine the biological mechanisms of cancer genomic rearrangements and use genome instability as a biomarker both for early cancer detection and for new therapeutic development.
Elana Fertig, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab(s): The Data Science Hub, Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies
Dr. Fertig directs a hybrid computational and experimental lab in the systems biology of cancer and therapeutic response to develop a new predictive medicine paradigm in cancer. Her wet lab develops time course models of therapeutic resistance and performs single cell technology development. Her computational methods blend mathematical modeling and artificial intelligence to determine the biomarkers and molecular mechanisms of therapeutic resistance from multi-platform genomics data. These techniques have broad applicability to the analysis of clinical biospecimens, developmental biology, and neuroscience. Dr. Fertig is a Professor of Oncology and Director of the Division / Associate Cancer Center Director in Quantitative Sciences, co-Director of the Convergence Institute, and co-Director of the Single-Cell Training and Analysis Center. She has secondary appointments in Biomedical Engineering and Applied Mathematics and Statistics, holds affiliations in the Institute of Computational Medicine, Center for Computational Genomics, Machine Learning, Mathematical Institute for Data Science, and the Center for Computational Biology, and is a Daniel Nathans Scientific Innovator. Prior to entering the field of computational cancer biology, Dr. Fertig was a NASA research fellow in numerical weather prediction.
Emma Dyer, MS
Dana-Farber Cancer Institute
TeamLabs: Revolutionizing GBM Drug Development Through Serial Biopsies, The Data Science Hub
Emma C. Dyer is a PhD student in the Harvard Biological Sciences in Public Health program studying in the laboratory of Dr. Franziska Michor at the Dana-Farber Cancer Institute. She received her Bachelor’s and Master’s degrees from The University of Chicago, where she studied Biological Sciences and Bioinformatics. She completed her Master’s work under the supervision of Dr. Alexander Pearson with a focus on computational pathology. Her work applied deep learning models for biomarker identification and survival prediction of patients with head and neck cancers. Currently she applies spatial statistics and deep learning methods to spatial multi-omics datasets to study tumor evolution and heterogeneity.
Ethan Cerami, PhD
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Ethan Cerami, PhD, is the Director of the Knowledge Systems Group and Principal Scientist in the Department of Data Sciences at Dana-Farber Cancer Institute. Prior to joining Dana-Farber, he was the Director of Computational Biology at Blueprint Medicines, and Director of Cancer Informatics Development at Memorial Sloan Kettering Cancer Center (MSKCC). While at MSKCC, he co-founded the cBioPortal for Cancer Genomics, and his group remains active in its continued development. He is currently the Co-PI of the Human Tumor Atlas Network (HTAN) Data Coordinating Center, Co-PI of the National Cancer Institute Cancer Immunologic Data Commons (CIDC), and the Co-PI of the DFCI MatchMiner platform for algorithmically matching patients to precision cancer medicine trials. Dr. Cerami has an MS in Computer Science from New York University and a PhD in Computational Biology from Cornell University.
Greg Raskind, BS
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Greg Raskind is a graduate student in the laboratory of Dr. Rameen Beroukhim at Harvard Medical School. He works on developing computational methods to identify patterns of structural variants associated with specific genetic alterations in cancer genomes. He received his BS in biochemistry and mathematical biology with a minor in computer science from the University of Michigan.
Kimal Rajapakshe, PhD
The University of Texas MD Anderson Cancer Center
TeamLabs: Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies, The Data Science Hub
Kimal Rajapakshe is a Computational Scientist at The University of Texas MD Anderson Cancer Center with a decade of experience in analyzing and integrating multi-omics data from both solid tumor and liquid biopsy. He specializes in analyzing RNA-Seq (bulk and single cell), ATAC-Seq (bulk and single cell), genomic sequencing (WES and targeted), ChIP-Seq, methylation array, proteomics, and metabolomics data.
Linghua Wang, MD, PhD
The University of Texas MD Anderson Cancer Center
TeamLabs: Intercepting Ovarian Cancer, Conquering KRAS in Pancreatic Cancer, and The Data Science Hub
Dr. Wang is currently a tenure-track Assistant Professor in the Department of Genomic Medicine at MD Anderson Cancer Center. Dr. Wang received her MD in Medicine and her PhD in Cancer Genomics and completed her postdoctoral training at the Human Genome Sequencing Center, Department of Molecular and Human Genetics, at Baylor College of Medicine. She was recruited to MD Anderson in 2017 and set up the Computational Biology Laboratory. Dr. Wang has significant expertise in computational biology, cancer immunogenomics, and single-cell and spatial multiomics. Over the past few years, she has built a leading research program in cancer immunogenomics at MD Anderson and developed a collaborative, team-based approach to tackle cancer research. Her group has vast experience in unraveling the heterogeneity and evolution of complex tumor-immune ecosystems using cutting-edge single-cell and spatial sequencing technologies, coupled with state-of-the-art computation and modeling. Dr. Wang is the principal investigator of the CPRIT Individual Investigator Research Award and she serves as a co-Investigator for several peer-reviewed grants from NIH/NCI and the U.S. Department of Defense. Dr. Wang is also the recipient of the Sabin Fellow Award, two SPORE Career Enhancement Program Awards, and three Institutional Research Grant Awards. She serves as the Bioinformatics Lead and project co-Leader for two MD Anderson Cancer Moon Shot Projects and she also leads or co-leads several additional single-cell studies. When she was at Baylor, Dr. Wang also contributed significantly to the NHGRI rare cancer projects, the NCI Exceptional Responder Initiative, TCGA, and pan-cancer projects. Dr. Wang is a productive investigator and she has published 32 first- or senior-authored papers over the past few years. Among them, 23 were published in top-tier or other high-impact journals. As site Lead of Data Science for the pancreatic and ovarian cancer programs funded by Break Through Cancer, Dr. Wang is extremely enthusiastic about collaborating with world-renowned leaders, talented data scientists, and the multidisciplinary research teams across five participating institutions to develop effective data science strategies to better understand, detect, and treat the most lethal cancers.
Luciane T. Kagohara, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab(s): Demystifying Pancreatic Cancer Therapies, Conquering KRAS in Pancreatic Cancer, The Data Science Hub
Dr. Luciane Kagohara is a hybrid molecular and computational biologist. She obtained her bachelor’s degree in Biomedical Sciences from Universidade Estadual Paulista (Botucatu, Brazil) and her Ph.D. from the A.C. Camargo Cancer Center (Sao Paulo, Brazil) at the Department of Pathology. Earlier in her career, she focused on the identification of epigenetic and genetic cancer biomarkers. During her postdoctoral fellowship at Johns Hopkins University, she pursued training in bioinformatics and developed the expertise to perform combined experimental and computational research on epigenetic regulation of gene expression in acquired resistance. As a molecular biologist, she has extensive experience in experimental molecular biology and the aptitude to design, optimize, and execute a wide range of approaches, including single-cell, spatial transcriptomics, ATAC-seq, and RNA-seq, among other molecular biology techniques. As a computational biologist, Dr. Kagohara can perform integrated analysis of genetic and epigenetic high-throughput data generated with different platforms. Her unique background suits her research program, which applies state-of-the-art technologies, like single-cell and spatial transcriptomics, to investigate mechanisms of resistance to different immuno-, chemo-, and targeted therapies. Using experimental and computational single-cell and spatial approaches to study clinical trial samples, Dr. Kagohara expects to discover molecular mechanisms and cellular interaction components of therapeutic resistance in cancers.
Manuel Schuerch, PhD
Dana-Farber Cancer Institute
TeamLabs: Revolutionizing GBM Drug Development Through Serial Biopsies, The Data Science Hub
Manuel Schuerch, PhD, is a Postdoctoral Fellow at Dana-Farber Cancer Institute in the Department of Data Science.
Manuel has a diverse background in Machine Learning, Statistics, and a range of biomedical applications. He earned his Bachelor’s degree in Computer Science and a Master’s in Statistics from ETH Zurich, Switzerland, followed by a PhD in Machine Learning from USI Lugano, Switzerland. Afterward, he pursued postdoctoral research at UZH and USZ Zurich, where he developed machine learning methods for personalized decision support in fields such as rheumatology, ICU care, diabetes, delirium, immunology, oncology, and solid organ transplantation. Currently, at Dana-Farber Cancer Institute, Manuel’s research focuses on designing advanced machine learning and AI models to analyze biomedical data, particularly for cancer progression models, perturbation effects, and multi-omics cancer data integration. His technical expertise and interest span probabilistic modeling, generative time series, counterfactual treatment effect estimation, uncertainty quantification, explainable AI, and foundational AI models in biomedicine. Manuel’s work is driven by the goal of advancing AI methods for personalized medicine, with applications in oncology, organ transplantation, and beyond.
Rachel Karchin, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab: Demystifying Pancreatic Cancer Therapies, The Data Science Hub
Rachel Karchin, PhD, is a Professor of Biomedical Engineering at Johns Hopkins University and is a core member of the Institute for Computational Medicine. She holds joint appointments in the Departments of Oncology and Computer Science, and is a member of the Cancer Biology Program and of the Multidisciplinary Pancreatic Cyst Team, both within the Kimmel Cancer Center. Since 2007, she has been an affiliate member of the McKusick-Nathans Institute of Genetic Medicine at the Johns Hopkins School of Medicine. Karchin co-led The Cancer Genome Atlas (TCGA) PanCan Atlas Essential Genes and Drivers Analysis Working Group (2017-2018). She received a BS in Computer Engineering (1998) and MS (2000) and PhD (2003) in Computer Science from the University of California, Santa Cruz, and completed her postdoctoral work at the University of California, San Francisco in the Department of Biopharmaceutical Sciences. Her lab develops algorithms and software to analyze genomic data and interpret its impact on cancer, the immune system, and tumor evolution. Karchin was the Whiting School of Engineering’s William R. Brody Faculty Scholar from 2013 to 2019. She was inducted as a Fellow of the American Institute for Medical and Biological Engineering in 2017, received the AACR Team Science Award in 2020 (TCGA), and was appointed a Distinguished Graduate Alumnus of the Jack Baskin School of Engineering at the University of California, Santa Cruz in 2021.
Rameen Beroukhim, MD, PhD
Dana-Farber Cancer Institute
TeamLabs: Intercepting Ovarian Cancer, The Data Science Hub, and Revolutionizing GBM Drug Development Through Serial Biopsies
Dr. Beroukhim is a practicing neuro-oncologist whose research focus is to understand tumor evolution, with emphases on brain tumors and alterations in chromosome structure. This work spans computational methods development, genomic studies of human cancers, and experiments in model systems. In early work describing integrated genomic profiling of glioblastomas, he developed the Genomic Identification of Significant Targets In Cancer (GISTIC) method that is now widely used to analyze copy-number changes across a range of cancers. He has also contributed to the development of several other genomic analysis methods and has led integrated genomic profiling efforts in multiple cancer types, including pan-cancer analyses across thousands of tumors. This work has identified novel mechanisms by which cancers develop and progress, and novel cancer dependencies that have spurred the development of new cancer therapeutics.
Shahab Sarmashghi, PhD
Dana-Farber Cancer Institute
TeamLab(s): The Data Science Hub, Revolutionizing GBM Drug Development Through Serial Biopsies
Shahab Sarmashghi, PhD, is a Postdoctoral Associate at the Broad Institute of MIT and Harvard and a Research Fellow at Dana-Farber Cancer Institute and HMS. He earned his BS and MS degrees from Sharif University of Technology, Iran, and his PhD from UC San Diego in Electrical Engineering. During his PhD, he developed several methods to utilize low-pass whole genome sequencing to study eukaryotic genomes. In the Beroukhim lab, Dr. Sarmashghi is interested in developing computational methods to study the biology of tumors, in particular GBM, and identify novel therapeutic targets. His main focus is on understanding positive and negative selection in cancer using somatic copy number alterations. He is also interested in studying cancer dependencies caused by loss of chromosome arms. He also works on developing new copy number calling pipelines and deploying them to the cloud.
Siri Palreddy
Dana-Farber Cancer Institute
TeamLabs: The Data Science Hub, Intercepting Ovarian Cancer, Revolutionizing GBM Drug Development Through Serial Biopsies, Conquering KRAS in Pancreatic Cancer, Demystifying Pancreatic Cancer Therapies
Siri Palreddy is a Clinical Research Coordinator for Dana-Farber Cancer Institute and works across the Break Through Cancer TeamLabs. She recently graduated magna cum laude from Amherst College, holding a BA in Biology and English.
Sophie Webster
Dana-Farber Cancer Institute
TeamLab: The Data Science Hub
Sophie Webster is an Associate Computational Biologist in Rameen Beroukhim’s lab at the Broad Institute and at Dana-Farber Cancer Institute. She currently studies mechanisms of double-strand break repair across cancer types and is developing methods to detect sequence patterns in genomic rearrangements. Sophie graduated from Harvard College in 2022 with a bachelor’s degree in Integrative Biology.
Stuart Levine, PhD
MIT’s Koch Institute for Integrative Cancer Research
TeamLab(s): Revolutionizing GBM Drug Development Through Serial Biopsies, The Data Science Hub
Stuart S. Levine, PhD, is the director of the MIT BioMicro Center and co-leader of the Koch Institute Integrated Genomics and Bioinformatics core facility. Dr. Levine received his bachelor’s degree from MIT and his PhD from Harvard University under the guidance of Robert Kingston, and performed postdoctoral training under Richard Young (Whitehead Institute). The Integrated Genomics and Bioinformatics core provides MIT researchers with facilities for high-throughput, data-intensive genomics and bioinformatic analysis, as well as the large-scale database storage, management, data mining, and data modeling required to fully implement systems approaches to investigate a broad spectrum of biological problems. Dr. Levine is currently president of the Northeast Regional Laboratory Staff and Core Directors, a chapter of the Association of Biomolecular Resource Facilities.
Wesley Tansey, PhD
Memorial Sloan Kettering Cancer Center
TeamLab: The Data Science Hub
Wesley Tansey, PhD, is an Assistant Professor in Computational Oncology at Memorial Sloan Kettering Cancer Center and in Physiology, Biophysics, and Systems Biology at Weill Cornell Medical College. Dr. Tansey received his PhD in Computer Science from the University of Texas at Austin and completed his postdoctoral training at Columbia University under the guidance of Raul Rabadan and David Blei. The Tansey lab works on statistical machine learning methods to address pressing problems in cancer data science, including spatial modeling of the tumor microenvironment, biomarker detection, and combination therapy discovery. Dr. Tansey is the recipient of an R37 MERIT grant award from the National Cancer Institute.
Yasmine Ahmed, PhD
The Sidney Kimmel Comprehensive Cancer Center at Johns Hopkins
TeamLab: The Data Science Hub
Yasmine Ahmed, PhD, is a Postdoctoral Fellow in the Karchin lab. She has a PhD in Electrical and Computer Engineering from the University of Pittsburgh in Pennsylvania, an MSc degree in Biomedical Engineering from Nile University in Egypt, and a BSc degree in Biomedical Engineering from Cairo University in Egypt. Her interdisciplinary research unites Natural Language Processing, Artificial Intelligence, Machine Learning, and Graph Theoretical Analysis with applications in Computational and Systems Biology.
PROJECT SUMMARY
Break Through Cancer’s projects will generate large and varied datasets, reflecting work at multiple institutions, across different scales, using a variety of technologies and tissue types. The kinds of data range, for example, from clinical parameters and single cell multi-omics to high resolution spatial profiling and medical imaging. Achieving robust, reproducible research findings across this complex data stream, and ensuring long-term access to the high volumes of multifaceted data it will yield, requires substantial technical infrastructure, expert data analysis, and strong data governance.
Break Through Cancer also has a unique opportunity to expand computational discovery by developing and applying wholly new analytic methods for the data our projects produce, and by integrating data sets across the diseases we are studying. For example, meta-analyses of our projects could focus on cancer evolution and cellular plasticity, spatial determinants of the tumor microenvironment, and cancer-immune-stromal cell interactions. But pursuing these novel computational approaches to discovery will depend on a rigorous application of data science principles and systems from the projects’ inception.
The Data Science Hub will conceptualize, maintain, and, where necessary, create the technical resources needed to pursue this ambitious vision for leveraging data created in Break Through Cancer’s projects. The Hub will also provide a unique collaborative framework for training and mentoring future leaders in computational biology, for co-developing new technical approaches across laboratories and institutions, and for establishing enduring professional networks.
The Hub is pursuing six primary aims:
- Implementing bioinformatics best practices, tools, and pipelines—in order to enhance Break Through Cancer projects’ efficiency and interoperability, address their considerable analytical needs, and enable end-to-end reproducibility of findings. This process will include harmonizing metadata collection, data generation and processing, and quality control, as well as applying standards for data quality assessment.
- Creating a robust, cloud-based Data Science and Data Governance Platform that will support standardized pipelines, ensure secure data collaborations, and offer effective data governance across collaborating institutions — while protecting patient privacy and satisfying security and regulatory constraints.
- Advancing algorithms and methods necessary to make best and fullest use of data generated with new and emerging technologies such as multiplexed spatial imaging and spatial transcriptomic and proteomic analysis. Given that novel statistical and machine learning models for these data types are still in development, the Hub will pursue a targeted effort to develop new computational methods, prioritizing tools for three interconnected areas: cancer evolution and cellular plasticity; spatial determinants of microenvironments; and tumor cell-immune cell interactions.
- Enabling robust statistical evaluation of existing and emerging data analysis tools, allowing for continuous monitoring of performance and early detection of data anomalies, code issues, or problematic assumptions. Toward this aim, the Hub will deliver production-grade software for running new tools, benchmarking datasets for all data types, and visualization software tools to evaluate performance and generate interpretable results.
- Executing integrated, pan-cancer analysis across all Break Through Cancer-funded projects. By harnessing, for example, the multi-omics datasets generated by all TeamLabs, these integrated analyses can maximize insights from individual projects and enable disease-specific findings to be explored in the context of other disease types. Ultimately, the integrated data analyses the Hub creates hold the potential to drive a better understanding of therapy resistance, tumor-immune interaction, the mechanisms underlying cancer immune evasion, and tumor evolution.
- Creating a Break Through Cancer Data Science Network that will forge close and ongoing connections among scientists at participating institutions, and will support ongoing development of the broader cancer data-science community. Not only will the Network engage computational scientists at multiple institutions in generating cutting-edge algorithms, it will also create a data portal accessible to cancer researchers globally, host events that advance community building and methods development, and offer unique learning opportunities for cancer data science trainees.
MAKE A DIFFERENCE
Break Through Cancer was created in February 2021 with an extraordinary matching gift of $250,000,000. Every gift to the Foundation supports groundbreaking cancer research and helps us to meet our matching commitment.
For questions about giving, please email Lisa Schwarz, Chief Philanthropy Officer, at LMS@BreakThroughCancer.org.