Toronto, Ontario george.stefan@mail.utoronto.ca

Develop Novel Statistical Methods For Analyzing Longread Sequencing Data


Presenter: Gengming He

Supervisory Committee: Lisa Strug (Supervisor), Lei Sun, and Dehan Kong

Date and Time: Wednesday, September 14, 2022, 1-3pm EDT

Zoom Meeting Info: https://utoronto.zoom.us/j/83156954317

Abstract: Phase information and variable number tandem repeats (VNTRs) are known to contribute to genetic diseases, but they have been missing from most genetic association studies because of the constraints of short-read sequencing assays, which limits our understanding of their roles in disease development. Recently developed long-read sequencing technology can measure phase information and complete VNTR sequences at the individual level, and novel statistical methods need to be tailored to incorporate these new data into analysis. In this project, we aim to 1) utilize phase information to detect cis-acting effects between genetic variants based on a matrix regression framework and 2) develop a reference-free association method based on k-mers to detect disease-related VNTRs. For objective 1), we introduced a cis-interaction matrix to analyze genotypes and their phase information simultaneously. Simulation studies demonstrated that this approach has improved power to detect cis-acting effects compared to current genotype-only methods. For objective 2), we developed a k-mer-based method to genotype the number of VNTR repeats without using the reference genome (reference-free). Genotyping errors can occur because of point mutations in the region, and we proposed a score statistic to remedy their impact.

All are welcome!