How to download scoring files from the PGS Catalog

pgscatalog-download is a command-line application for downloading scoring files from the PGS Catalog using any mixture of PGS, publication, or trait accessions. The application:

  • automatically retries downloads if they fail

  • validates the checksum of downloaded scoring files

  • automatically selects scoring files aligned to a requested genome build
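
To make the first two behaviours concrete, here is a minimal conceptual sketch in shell of what "retry on failure" and "validate the checksum" mean. It is not the application's actual implementation: the helper names retry and checksum_ok are hypothetical, and the use of MD5 via GNU md5sum is an assumption made only for illustration.

```shell
# Conceptual sketch only: pgscatalog-download handles this internally.
# Both helpers below are hypothetical and not part of the application.

# retry N CMD...: run CMD up to N times, stopping at the first success.
retry() {
    attempts=$1
    shift
    i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
    done
    return 1
}

# checksum_ok FILE EXPECTED: succeed only if FILE's digest equals EXPECTED.
# Assumption: MD5 digests computed with GNU md5sum.
checksum_ok() {
    actual=$(md5sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}
```

With the application itself none of this is needed; the sketch only shows the kind of checks it performs on your behalf.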

Installation

$ pip install pgscatalog-core

Usage

Downloading scoring files for specific PGS IDs, aligned to GRCh38

$ mkdir downloads
$ pgscatalog-download --pgs PGS000822 PGS001229 --build GRCh38 -o downloads

Note

Setting --build downloads scoring files that have been harmonised by the PGS Catalog. Harmonised scoring files have consistent fields, such as genomic coordinates.
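
To see which fields a downloaded file contains, you can print its column header. The sketch below assumes that scoring files are gzip-compressed, tab-separated text with leading comment lines that start with "#". The helper name show_columns is hypothetical, and the file name in the usage comment is an assumed example that may differ from what you find in your output directory.

```shell
# show_columns FILE: print the column header of a gzip-compressed scoring
# file, i.e. the first line that is not a '#' comment line.
# Hypothetical helper, not part of pgscatalog-download.
show_columns() {
    gzip -cd "$1" | awk '!/^#/ { print; exit }'
}

# Usage (assumed file name; check your output directory for the real one):
# show_columns downloads/PGS000822_hmPOS_GRCh38.txt.gz
```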

Downloading all scores associated with a trait

To download all scores associated with Alzheimer’s disease:

$ mkdir downloads
$ pgscatalog-download --efo MONDO_0004975 -b GRCh38 -o downloads

By default, scores associated with child traits, such as late-onset Alzheimer’s disease, are included. To exclude them, add --efo_direct:

$ mkdir downloads
$ pgscatalog-download --efo MONDO_0004975 -b GRCh38 -o downloads --efo_direct

Downloading all scores associated with a publication

If you’re interested in scores from a specific publication:

$ mkdir downloads
$ pgscatalog-download --pgp PGP000517 -b GRCh38 -o downloads
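
Because the application accepts a mixture of accession types, the options shown in the sections above can presumably be combined in a single call. The sketch below wraps that in a small helper that also creates the output directory if it does not exist yet. The name fetch_scores and the PGS_DRY_RUN variable are hypothetical and exist only in this example; only the pgscatalog-download options come from this guide.

```shell
# Hypothetical helper (not part of pgscatalog-download): creates the output
# directory if needed, then forwards the remaining arguments to the CLI.
# With PGS_DRY_RUN=1 it prints the command instead of running it.
fetch_scores() {
    outdir=$1
    shift
    mkdir -p "$outdir" || return 1
    if [ "${PGS_DRY_RUN:-0}" = "1" ]; then
        echo "pgscatalog-download $* -o $outdir"
    else
        pgscatalog-download "$@" -o "$outdir"
    fi
}

# Dry run: one score, one trait and one publication in a single call.
PGS_DRY_RUN=1 fetch_scores downloads --pgs PGS000822 --efo MONDO_0004975 --pgp PGP000517 -b GRCh38
# prints: pgscatalog-download --pgs PGS000822 --efo MONDO_0004975 --pgp PGP000517 -b GRCh38 -o downloads
```

Drop PGS_DRY_RUN=1 to run the download for real. Using mkdir -p inside the helper means the call can be repeated without failing when the directory already exists.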

Help

$ pgscatalog-download --help