CRISPR/Cas9 gene editing has made possible a multitude of biomedical experiments, including studies that systematically turn off genes in cancer cells to look for ones that the cancer cells heavily depend on to survive and grow. These genes, or "cancer dependencies," are often promising drug targets. But new research shows that many of these CRISPR screening experiments rely on components, called CRISPR/Cas9 guides, that do not perform equally well in cells from people of all ancestries, which can cause CRISPR screens to miss cancer dependencies.
These CRISPR guides are short sequences of RNA that steer the Cas9 enzyme to a specific site in the genome to cut DNA and deactivate a targeted gene. The new findings, from scientists at the Broad Institute of MIT and Harvard, show that about 2 percent of these guides miss their target. When that happens, Cas9 won't make a cut and disable the targeted gene, which obscures any role that gene may have in cancer growth. The team found that this happens disproportionately in cells from people of African ancestry, because CRISPR guides were designed using reference genomes from people who are largely of European ancestry and do not fully represent global genetic diversity.
"These inaccuracies exist in places we might not recognize and in ways that we wouldn't have predicted," said Rameen Beroukhim, an associate member at the Broad and a co-senior author on the paper, which appeared recently in Nature Communications. "This work shows that it's really worthwhile to conduct a systematic assessment of all the tools and datasets that we're using so that we can fix these hidden biases before they become an issue."
"CRISPR is used ubiquitously in preclinical research, but only a minority of researchers are thinking carefully about the specific germline and ancestries that relate to their model systems. This is a warning call for the community that functional genomics is not immune to ancestry bias, and a source of opportunity to look more closely at this kind of data."
Jesse Boehm, associated scientist at the Broad and co-senior author on the paper
In their study, the team analyzed data from the Broad's Cancer Dependency Map (DepMap), the largest cancer dependency resource, which currently includes genome-wide screens in more than 1,000 cancer cell lines, about 90 percent of which are from people of European or East Asian descent.
Francisca Vazquez, director of the DepMap at the Broad, said that less than 1 percent of cell line-guide pairs in the DepMap are affected by the ancestry bias shown by this study, but that these biases are important to recognize and fix in future libraries. After these results were first posted as a preprint in 2022, the DepMap team removed from their library all guide RNAs that didn't work, so that instead of falsely returning no dependencies for the affected genes, the database indicates that there is not sufficient data to draw conclusions.
A New Kind of Dependency Search
Previously, the search for cancer dependencies focused on genetic changes that arise in some cells during a person's life, called somatic mutations. But when postdoctoral researcher and study first author Sean Misek joined Boehm's and Beroukhim's labs in 2020, he wanted to know how germline genetic variants, which are inherited and present in all cells throughout the body, influence how tumors respond to treatment.
Misek found many strong associations between ancestry and genetic dependencies, and that most of those associations came from artifacts related to germline variants. In particular, he saw these effects in CRISPR guides. The sequence of the guide RNAs didn't sufficiently match the target genetic sequence because that target sequence varied depending on ancestry.
The scientists found that 89 percent of guides in genome-scale libraries have a mismatch in at least one cell line. They also found that mismatches occur to a greater degree in cells from people of African ancestry.
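To make the idea of a guide "mismatch" concrete, here is a minimal Python sketch of how a single inherited variant in the target site can stop a guide from matching. It is an illustration only, not the study's method: the 20-letter guide sequence, the variant position, and the function names are all hypothetical, the guide is written in DNA letters for simplicity, and the article does not say how many mismatches the researchers treated as enough to disable a guide.

```python
def count_mismatches(guide: str, target: str) -> int:
    """Count positions where the guide sequence differs from the target site."""
    if len(guide) != len(target):
        raise ValueError("guide and target must be the same length")
    return sum(g != t for g, t in zip(guide.upper(), target.upper()))


def apply_snv(site: str, position: int, alt: str) -> str:
    """Return the site with one single-letter variant applied (0-based position)."""
    return site[:position] + alt + site[position + 1:]


# Hypothetical guide, designed to match a hypothetical reference genome exactly.
guide = "GACCTGAAGTTCATCTGCAC"
reference_site = guide

# A cell line carrying an inherited variant inside the target site
# (position and alternate base are made up for illustration).
variant_site = apply_snv(reference_site, 17, "T")

print(count_mismatches(guide, reference_site))  # 0: guide matches the reference
print(count_mismatches(guide, variant_site))    # 1: the variant breaks the match
```

In a screen, a guide like this would work as intended in cell lines that match the reference but could fail in lines carrying the variant, which is how a real dependency could go undetected.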
"These sorts of experimental biases are probably everywhere in preclinical research," Misek said. "We hope that this paper is part of a larger conversation."
Understanding the extent of this bias in a research project can be challenging for a scientist because it can take several days to download all the necessary data to do so. To address this, Boehm, Beroukhim, and the Pattern team at the Broad built Ancestry Garden, a website based on data from the Genome Aggregation Database (gnomAD) that can help researchers determine the effect of ancestry on a guide of their choosing.
"A lot of labs use CRISPR in some sense, and they should have a mechanism to check their reagents," Misek said. "Our goal is to make it a little bit easier for people to mitigate this issue in their own hands."
Library Lessons
Boehm said that genetic variation due to ancestry affects research far beyond the search for cancer dependencies, and that the extent to which the team's findings will impact individual studies will vary. Although the effect of this bias was relatively modest in the DepMap, it may be much larger in experiments that study only one or a small number of cell lines, Boehm said.
Going forward, the study team and DepMap researchers say that an important way to address this bias is to increase the genetic diversity in large-scale cell line libraries.
"We encourage the community to send us cell lines from under-represented populations if they have them," Vazquez said. "This is a very important issue to address."
Journal reference:
Misek, S. A., et al. (2024). Germline variation contributes to false negatives in CRISPR-based experiments with varying burden across ancestries. Nature Communications. doi.org/10.1038/s41467-024-48957-z.