We use computational approaches to understand how genomic sequences function. The lab currently focuses on developing machine learning and AI methods for decoding the logic of gene regulation, improving the interpretation of genetic variation, and enabling flexible design of regulatory sequences. Our broader interests span genomics, human genetics, evolution, machine learning, and AI.
Princeton University
Ph.D. - Quantitative and Computational Biology
2017
Peking University
B.S. - Biological Sciences
2011
An oligodendrocyte silencer element underlies the pathogenic impact of lamin B1 structural variants. Nat Commun. 2025 Feb 05; 16(1):1373.
PMID: 39910058
Sequence basis of transcription initiation in the human genome. Science. 2024 Apr 26; 384(6694):eadj0116.
PMID: 38662817
Structural variation cooperates with permissive chromatin to control enhancer hijacking-mediated oncogenic transcription. Blood. 2023 Jul 27; 142(4):336-351.
PMID: 36947815
A sequence-based global map of regulatory activity for deciphering human genetics. Nat Genet. 2022 Jul; 54(7):940-949.
PMID: 35817977
Sequence-based modeling of three-dimensional genome architecture from kilobase to chromosome scale. Nat Genet. 2022 May; 54(5):725-734.
PMID: 35551308
An analytical framework for interpretable and generalizable single-cell data analysis. Nat Methods. 2021 Nov; 18(11):1317-1321.
PMID: 34725480
Whole-genome deep-learning analysis identifies contribution of noncoding mutations to autism risk. Nat Genet. 2019 Jun; 51(6):973-980.
PMID: 31133750
Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk. Nat Genet. 2018 Aug; 50(8):1171-1179.
PMID: 30013180
Predicting effects of noncoding variants with deep learning-based sequence model. Nat Methods. 2015 Oct; 12(10):931-934.
PMID: 26301843
NIH Director's New Innovator Award
2021 - 2026