pygscatalog


pygscatalog provides a set of Python CLI applications and developer libraries for working with polygenic scores (PGS), including integration with the PGS Catalog.

These applications and libraries are used internally by the PGS Catalog Calculator, which is an automated workflow for calculating PGS, including adjustment of scores in the context of genetic ancestry similarity.

If you’re interested in PGS but aren’t sure where to begin, the calculator is the best place to start.

If you’re working with PGS data and need a bespoke analysis that the calculator doesn’t support, these tools might be helpful.

Credits

pygscatalog (aka pgscatalog_utils) is developed as part of the PGS Catalog project, a collaboration between the University of Cambridge’s Department of Public Health and Primary Care (Michael Inouye, Samuel Lambert) and the European Bioinformatics Institute (Helen Parkinson, Laura Harris).

This package contains code libraries and apps for working with PGS Catalog data and calculating PGS within the PGS Catalog Calculator (pgsc_calc) workflow. It is based on an earlier codebase (pgscatalog_utils), with contributions and input from members of the PGS Catalog team (Samuel Lambert, Benjamin Wingfield, Aoife McMahon, Laurent Gil) and the Inouye lab (Rodrigo Canovas, Scott Ritchie, Jingqin Wu).

If you use this package in your analysis, please cite:

Lambert, Wingfield, et al. (2024) Enhancing the Polygenic Score Catalog with tools for score calculation and ancestry normalization. Nature Genetics. doi:10.1038/s41588-024-01937-x.

All of our code is open source and permissively licensed under the Apache License 2.0.

This work has received funding from EMBL-EBI core funds, the Baker Institute, the University of Cambridge, Health Data Research UK (HDRUK), and the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101016775 INTERVENE.