La Trobe

Machine learning optimized polygenic scores for blood cell traits identify sex-specific trajectories and genetic correlations with disease.

journal contribution
posted on 2023-08-02, 23:31 authored by Yu Xu, Dragana Vuckovic, Scott C Ritchie, Parsa Akbari, Tao Jiang, Jason Gavin Grealey, Adam S Butterworth, Willem H Ouwehand, David J Roberts, Emanuele Di Angelantonio, John Danesh, Nicole Soranzo, Michael Inouye
Genetic association studies for blood cell traits, which are key indicators of health and immune function, have identified several hundred associations and defined a complex polygenic architecture. Polygenic scores (PGSs) for blood cell traits have potential clinical utility in disease risk prediction and prevention, but designing PGSs remains challenging and the optimal methods are unclear. To address this, we evaluated the relative performance of 6 methods to develop PGSs for 26 blood cell traits, including a standard method of pruning and thresholding (P + T) and 5 learning methods: LDpred2, elastic net (EN), Bayesian ridge (BR), multilayer perceptron (MLP) and convolutional neural network (CNN). We evaluated these optimized PGSs on blood cell trait data from UK Biobank and INTERVAL. We find that PGSs designed using the common machine learning methods EN and BR show improved prediction of blood cell traits and consistently outperform other methods. Our analyses suggest EN/BR as the top choices for PGS construction, showing improved performance for 25 blood cell traits in the external validation, with correlations with the directly measured traits increasing by 10%–23%. Ten PGSs showed significant statistical interaction with sex, and sex-specific PGS stratification showed that all of them had substantial variation in the trajectories of blood cell traits with age. Genetic correlations between the PGSs for blood cell traits and common human diseases identified well-known as well as new associations. We develop machine learning-optimized PGSs for blood cell traits, demonstrate their relationships with sex, age, and disease, and make these publicly available as a resource.
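The abstract evaluates each PGS by its correlation with the directly measured trait. As an illustration only, the sketch below shows what that evaluation involves in its simplest form: a PGS as a weighted sum of allele dosages, scored against a trait with Pearson's r. It is not the authors' pipeline. The data are simulated, the weights are the simulated true effects rather than weights learned by P + T, LDpred2, EN, BR, MLP or CNN, and every name in it (`polygenic_score`, `pearson_r`, `simulate`) is hypothetical.

```python
import math
import random


def polygenic_score(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies of each variant)."""
    return sum(w * d for w, d in zip(weights, dosages))


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate(n_people=500, n_variants=50, noise_sd=1.0, seed=0):
    """Simulate genotypes and a trait; all parameters are illustrative assumptions."""
    rng = random.Random(seed)
    # Assumed per-variant effect sizes (not taken from the paper).
    weights = [rng.gauss(0.0, 0.2) for _ in range(n_variants)]
    genotypes = [
        [rng.choice((0, 1, 2)) for _ in range(n_variants)]
        for _ in range(n_people)
    ]
    # Trait = genetic component + non-genetic noise.
    traits = [
        polygenic_score(g, weights) + rng.gauss(0.0, noise_sd)
        for g in genotypes
    ]
    return genotypes, traits, weights


genotypes, traits, weights = simulate()
scores = [polygenic_score(g, weights) for g in genotypes]
r = pearson_r(scores, traits)
print(f"Correlation between PGS and simulated trait: r = {r:.3f}")
```

In a real analysis the weights would be estimated on a training cohort and the correlation computed on an independent one, as the abstract describes for UK Biobank and INTERVAL.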

History

Publication Date

2022-01-12

Journal

Cell Genomics

Volume

2

Issue

1

Article Number

100086

Pagination

18p.

Publisher

Cell Press

ISSN

2666-979X

Rights Statement

© 2021 The Author(s). This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
