DAJIN enables multiplex genotyping to simultaneously validate intended and unintended target genome editing outcomes

・DAJIN, a machine-learning-based model, identifies alleles, quantifies their numbers and mutation patterns, and reports consensus sequences that visualize the mutations in each allele at single-nucleotide resolution.
・DAJIN’s distinguishing features are its automatic allele clustering and annotation and its use of a long-read sequencer; it can identify cis- or trans-heterozygosity and complex mutant alleles such as unexpected indels and large rearrangements.
・It excels at multi-specimen processing: its PCR-based barcoding enables multiplexed sequencing with sufficient coverage of numerous samples in a single run.
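
To make the idea of per-sample allele quantification concrete, the following is a minimal sketch in Python. It is not DAJIN's implementation: it assumes each read has already been assigned an allele label by some upstream classifier, and the sample names, allele labels, read counts, and the `min_percent` cutoff are all hypothetical values chosen for illustration.

```python
from collections import Counter


def allele_percentages(read_labels, min_percent=1.0):
    """Return {allele label: percent of reads} for one sample.

    Alleles supported by less than `min_percent` of reads are dropped.
    The cutoff is an illustrative assumption, not a DAJIN parameter.
    """
    counts = Counter(read_labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        allele: round(100.0 * n / total, 1)
        for allele, n in counts.most_common()
        if 100.0 * n / total >= min_percent
    }


# Hypothetical per-read allele labels for two barcoded samples.
samples = {
    "barcode01": ["intended"] * 48 + ["wild_type"] * 50 + ["large_deletion"] * 2,
    "barcode02": ["wild_type"] * 100,
}

# One allele report per barcode.
report = {name: allele_percentages(labels) for name, labels in samples.items()}

for name, alleles in report.items():
    print(name, alleles)
```

In this toy input, the first sample carries three alleles, including a minor unintended one, while the second contains only one allele; a real analysis would derive the labels from the sequencing reads rather than from a hand-written list.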

Abstract

Genome editing can introduce designed mutations into a target genomic site. Recent research has revealed that it can also induce various unintended events such as structural variations, small indels, and substitutions at, and in some cases, away from the target site. These rearrangements may result in confounding phenotypes in biomedical research samples and cause concern in clinical or agricultural applications. However, current genotyping methods do not allow a comprehensive analysis of diverse mutations for phasing and mosaic variant detection. Here, we developed a genotyping method with an on-target site analysis software named Determine Allele mutations and Judge Intended genotype by Nanopore sequencer (DAJIN) that can automatically identify and classify both intended and unintended diverse mutations, including point mutations, deletions, inversions, and cis double knock-in at single-nucleotide resolution. Our approach with DAJIN can handle approximately 100 samples under different editing conditions in a single run. With its high versatility, scalability, and convenience, DAJIN-assisted multiplex genotyping may become a new standard for validating genome editing outcomes.

Benefit

Compared with current methods, DAJIN has an advantage in cost and workability for primary, comprehensive analysis when multiple genome-edited samples are processed simultaneously. Its machine-learning-based model could bypass molecular tagging, providing a feasible approach for routine assessment of genome editing outcomes.

Market Application

Long-read DNA sequencing platforms, such as Nanopore or PacBio

Publications

https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001507

Other

https://www.tsukuba.ac.jp/en/research-news/20220119040000.html
