match.lib.targetvariants

This module contains classes for working with target variants. When a scoring file is reused to calculate scores for new genotypes, those new genotypes are called target genomes.

Classes

TargetType

An enumeration of the supported target variant file types (BIM and PVAR).

TargetVariant

A single target variant, including genomic coordinates and allele information

TargetVariants

A container of TargetVariant objects.

Functions

read_bim(path)

Read PLINK 1 bim variant information files using the Python standard library.

read_pvar(path)

Read PLINK 2 pvar variant information files using the Python standard library.

Module Contents

class match.lib.targetvariants.TargetType(*args, **kwds)

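An enumeration of the supported target variant file types (BIM and PVAR). TargetType is a standard Python Enum, which creates a collection of name/value pairs, so the general enumeration behaviour described here applies to it.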

Example enumeration:

>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3

Access them by:

  • attribute access:

    >>> Color.RED
    <Color.RED: 1>
    
  • value lookup:

    >>> Color(1)
    <Color.RED: 1>
    
  • name lookup:

    >>> Color['RED']
    <Color.RED: 1>
    

Enumerations can be iterated over, and know how many members they have:

>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]

Methods can be added to enumerations, and members can have their own attributes; see the Python enum module documentation for details.

BIM
PVAR
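
As a rough illustration of how an enumeration like this can tag a file's format, the sketch below defines a stand-in enum with the member values shown in this reference (PVAR is 1, BIM is 2) and a hypothetical helper, guess_target_type, that inspects the file name. The helper and its suffix rules are assumptions made for illustration, not the library's own detection logic.

```python
from enum import Enum
from pathlib import Path


class FileType(Enum):
    """Stand-in for TargetType; values mirror the reprs shown in this reference."""

    PVAR = 1
    BIM = 2


def guess_target_type(path):
    """Hypothetical helper: infer the file type from the file name suffixes."""
    suffixes = Path(path).suffixes  # e.g. ['.pvar', '.zst']
    if ".pvar" in suffixes:
        return FileType.PVAR
    if ".bim" in suffixes:
        return FileType.BIM
    raise ValueError(f"Unrecognised target variant file: {path}")


print(guess_target_type("hapnest.pvar.zst"))  # FileType.PVAR
print(guess_target_type("hapnest.bim.gz"))    # FileType.BIM
```

Checking every suffix rather than only the last one lets compressed names such as .bim.gz resolve to the same member as their uncompressed counterparts.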
class match.lib.targetvariants.TargetVariant(*, chrom, pos, ref, alt, id)

A single target variant, including genomic coordinates and allele information

>>> a = TargetVariant(chrom="1", pos=12, ref="A", alt="C", id='1:12:A:C')
>>> a
TargetVariant(chrom='1', pos=12, ref='A', alt='C', id='1:12:A:C')
>>> b = a
>>> b == a
True
alt
chrom
id
pos
ref
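
The record shape above (five named fields, a readable repr, equality checks) can be approximated with a frozen dataclass. This is only a sketch under that assumption: Variant is a hypothetical stand-in, and this reference does not say how TargetVariant itself is implemented or whether it compares by value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Illustrative stand-in for TargetVariant (not the library's class)."""

    chrom: str
    pos: int
    ref: str
    alt: str
    id: str


a = Variant(chrom="1", pos=12, ref="A", alt="C", id="1:12:A:C")
b = Variant(chrom="1", pos=12, ref="A", alt="C", id="1:12:A:C")

print(a)       # Variant(chrom='1', pos=12, ref='A', alt='C', id='1:12:A:C')
print(a == b)  # True: two separate objects with equal fields
```

Freezing the dataclass makes instances immutable and hashable, which is convenient when variants are collected into sets or used as dictionary keys.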
class match.lib.targetvariants.TargetVariants(path, chrom=None)

A container of TargetVariant objects.

Raises: FileNotFoundError

>>> from pgscatalog.match.lib._config import Config  # ignore, only to load test data
>>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.pvar")
>>> pvar.ftype
<TargetType.PVAR: 1>

Iterating over TargetVariants is done via a read-only generator attribute:

>>> pvar.variants  # doctest: +ELLIPSIS
<generator object read_pvar at ...>
>>> for variant in pvar:
...    variant
...    break
TargetVariant(chrom='14', pos=65003549, ref='T', alt='C', id='14:65003549:T:C')

gzip and zstandard compression is transparently handled for pvar:

>>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.pvar.zst")
>>> for variant in pvar:
...    variant
...    break
TargetVariant(chrom='14', pos=65003549, ref='T', alt='C', id='14:65003549:T:C')

The same is true for bim files:

>>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim.gz")
>>> bim.ftype
<TargetType.BIM: 2>

>>> for variant in bim:
...    variant
...    break
TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')
>>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim.zst")
>>> for variant in bim:
...    variant
...    break
TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')
>>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim")
>>> for variant in bim:
...    variant
...    break
TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')
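
One way transparent decompression like that shown above can work is to check a file's leading magic bytes before choosing how to open it. The sketch below does this for gzip only, using the standard library; open_text is a hypothetical helper, and zstandard support is left out because it needs a package outside the standard library on most Python versions. It is not a description of how TargetVariants opens files.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_text(path):
    """Hypothetical helper: open a plain or gzip-compressed text file for reading."""
    path = Path(path)
    with path.open("rb") as handle:
        magic = handle.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", encoding="utf-8")
    return path.open("rt", encoding="utf-8")
```

Checking the content rather than the file extension means a compressed file is still read correctly if it has been renamed.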

Note: A1/A2 isn’t guaranteed to be ref/alt because of PLINK1 file format limitations. PGS Catalog libraries handle this internally, but you should be aware REF/ALT can be swapped by plink during VCF to bim conversion.

Some pvar files can contain a lot of comments in the header, which are ignored:

>>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "1000G.pvar")
>>> for variant in pvar:
...    variant
...    break
TargetVariant(chrom='1', pos=10390, ref='CCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA', alt='C', id='1:10390:CCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA:C')
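
Because REF and ALT may be swapped in bim-derived variants, code that compares alleles directly may want to ignore their orientation. The sketch below shows one simple approach with a hypothetical helper, same_alleles. It is an assumption for illustration, not how PGS Catalog libraries resolve the swap internally, and it does not account for strand flips.

```python
def same_alleles(ref_a, alt_a, ref_b, alt_b):
    """Return True when two biallelic variants carry the same allele pair,
    regardless of which allele is labelled ref and which alt."""
    return {ref_a, alt_a} == {ref_b, alt_b}


print(same_alleles("C", "T", "T", "C"))  # True: same pair, swapped labels
print(same_alleles("C", "T", "C", "G"))  # False: different alternate allele
```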

property chrom
property path
property variants
match.lib.targetvariants.read_bim(path)

Read PLINK 1 bim variant information files using the Python standard library.
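
For orientation, a bim file has six whitespace-separated columns per line: chromosome, variant ID, position in centimorgans, base-pair coordinate, allele 1 and allele 2. The sketch below parses that layout with plain string handling. parse_bim_lines is a hypothetical function, not the body of read_bim, and mapping allele 1 to alt and allele 2 to ref is an assumption that follows the PLINK 2 convention.

```python
def parse_bim_lines(lines):
    """Yield (chrom, pos, ref, alt, id) tuples from bim-formatted lines."""
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # this sketch skips blank or malformed lines
        chrom, variant_id, _centimorgans, pos, allele1, allele2 = fields
        # Assumption: allele 1 is treated as alt and allele 2 as ref
        yield chrom, int(pos), allele2, allele1, variant_id


example = ["1\t1:12:A:C\t0\t12\tC\tA\n"]  # made-up line, not from the test data
print(list(parse_bim_lines(example)))
# [('1', 12, 'A', 'C', '1:12:A:C')]
```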

match.lib.targetvariants.read_pvar(path)

Read PLINK 2 pvar variant information files using the Python standard library.
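
A pvar file resembles the start of a VCF: optional comment lines beginning with ##, a column header line beginning with #CHROM, then one variant per line. The sketch below skips the comments and uses the header to locate columns. parse_pvar_lines is a hypothetical function, not the body of read_pvar, and it assumes the file has a #CHROM header line; data lines seen before a header are ignored.

```python
def parse_pvar_lines(lines):
    """Yield (chrom, pos, ref, alt, id) tuples from pvar-formatted lines."""
    header = None
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information comments before the column header
        if line.startswith("#CHROM"):
            header = line.lstrip("#").split()
            continue
        if header is None or not line.strip():
            continue  # assumption: a #CHROM header precedes the data rows
        row = dict(zip(header, line.split()))
        yield row["CHROM"], int(row["POS"]), row["REF"], row["ALT"], row["ID"]


example = [  # made-up lines, not from the test data
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "1\t12\t1:12:A:C\tA\tC\n",
]
print(list(parse_pvar_lines(example)))
# [('1', 12, 'A', 'C', '1:12:A:C')]
```

Looking columns up by header name rather than by position keeps the sketch working when extra columns such as QUAL, FILTER or INFO are present.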