match.lib.targetvariants¶
This module contains classes and functions for working with target variants. When an existing scoring file is reused to calculate scores for new genotypes, those new genotypes are the target genomes.
Classes¶
TargetType | Create a collection of name/value pairs.
TargetVariant | A single target variant, including genomic coordinates and allele information
TargetVariants | A container of TargetVariant
Functions¶
read_bim(path) | Read plink1 bim variant information files using the Python standard library
read_pvar(path) | Read plink2 pvar variant information files using the Python standard library
Module Contents¶
- class match.lib.targetvariants.TargetType(*args, **kwds)¶
Create a collection of name/value pairs.
Example enumeration:
>>> class Color(Enum):
...     RED = 1
...     BLUE = 2
...     GREEN = 3
Access them by:
attribute access:
>>> Color.RED
<Color.RED: 1>
value lookup:
>>> Color(1)
<Color.RED: 1>
name lookup:
>>> Color['RED']
<Color.RED: 1>
Enumerations can be iterated over, and know how many members they have:
>>> len(Color)
3
>>> list(Color)
[<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]
Methods can be added to enumerations, and members can have their own attributes – see the documentation for details.
- BIM¶
- PVAR¶
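As a rough illustration of how the two members might be used, the sketch below defines a stand-in enum with the member names and values shown in this documentation (PVAR = 1, BIM = 2). The helper `guess_target_type` and its suffix rules are hypothetical and are not part of the documented API; how the library actually detects the file type is not stated here.

```python
from enum import Enum
from pathlib import Path


class TargetType(Enum):
    """Stand-in enum mirroring the member names and values shown in these docs."""
    PVAR = 1
    BIM = 2


# Compression suffixes stripped before looking at the real extension (assumption).
_COMPRESSION_SUFFIXES = {".gz", ".zst"}


def guess_target_type(path):
    """Hypothetical helper: infer the file type from the file name alone."""
    suffixes = [s.lower() for s in Path(path).suffixes]
    # Drop a trailing compression suffix such as .gz or .zst, if present.
    if suffixes and suffixes[-1] in _COMPRESSION_SUFFIXES:
        suffixes = suffixes[:-1]
    if suffixes and suffixes[-1] == ".pvar":
        return TargetType.PVAR
    if suffixes and suffixes[-1] == ".bim":
        return TargetType.BIM
    raise ValueError(f"Cannot infer target type from {path!r}")
```

Under these assumptions, `guess_target_type("hapnest.pvar.zst")` returns `TargetType.PVAR` and `guess_target_type("hapnest.bim.gz")` returns `TargetType.BIM`.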
- class match.lib.targetvariants.TargetVariant(*, chrom, pos, ref, alt, id)¶
A single target variant, including genomic coordinates and allele information
>>> a = TargetVariant(chrom="1", pos=12, ref="A", alt="C", id='1:12:A:C')
>>> a
TargetVariant(chrom='1', pos=12, ref='A', alt='C', id='1:12:A:C')
>>> b = a
>>> b == a
True
- alt¶
- chrom¶
- id¶
- pos¶
- ref¶
- class match.lib.targetvariants.TargetVariants(path, chrom=None)¶
A container of TargetVariant
Raises: FileNotFoundError
>>> from pgscatalog.match.lib._config import Config  # ignore, only to load test data
>>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.pvar")
>>> pvar.ftype
<TargetType.PVAR: 1>
Iterating over TargetVariants is done via a read-only generator attribute:
>>> pvar.variants  # doctest: +ELLIPSIS
<generator object read_pvar at ...>
>>> for variant in pvar:
...     variant
...     break
TargetVariant(chrom='14', pos=65003549, ref='T', alt='C', id='14:65003549:T:C')
gzip and zstandard compression is transparently handled for pvar:
>>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.pvar.zst")
>>> for variant in pvar:
...     variant
...     break
TargetVariant(chrom='14', pos=65003549, ref='T', alt='C', id='14:65003549:T:C')
The same is true for bim files:
>>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim.gz")
>>> bim.ftype
<TargetType.BIM: 2>
>>> for variant in bim:
...     variant
...     break
TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')
>>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim.zst")
>>> for variant in bim:
...     variant
...     break
TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')
>>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim")
>>> for variant in bim:
...     variant
...     break
TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')
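One common way to handle compression transparently is to inspect the first bytes of the file instead of trusting its name. The sketch below does this for gzip using only the standard library. `open_text` is a hypothetical helper and is not how the library is documented to work. zstandard is left out because decompressing it would usually need a third-party package.

```python
import gzip
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def open_text(path):
    """Hypothetical helper: open a plain or gzip-compressed file as text."""
    path = Path(path)
    # Peek at the first two bytes to decide how to open the file.
    with open(path, "rb") as raw:
        magic = raw.read(2)
    if magic == GZIP_MAGIC:
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "rt", encoding="utf-8")
```

Callers can then read lines from the returned handle the same way whether or not the file was compressed.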
Note: A1/A2 aren't guaranteed to be ref/alt because of PLINK1 file format limitations. PGS Catalog libraries handle this internally, but be aware that plink can swap REF/ALT during VCF to bim conversion.
Some pvar files can contain a lot of comments in the header, which are ignored:
>>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "1000G.pvar")
>>> for variant in pvar:
...     variant
...     break
TargetVariant(chrom='1', pos=10390, ref='CCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA', alt='C', id='1:10390:CCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA:C')
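To make the note on A1/A2 concrete, the sketch below classifies how two allele pairs relate. `allele_orientation` is a hypothetical helper, not the way PGS Catalog libraries handle swaps internally. The usage example also assumes the variant ID encodes chrom:pos:ref:alt, which this documentation does not guarantee.

```python
def allele_orientation(ref, alt, other_ref, other_alt):
    """Hypothetical helper: classify how two allele pairs relate."""
    if (ref, alt) == (other_ref, other_alt):
        return "same"
    if (ref, alt) == (other_alt, other_ref):
        return "swapped"
    return "different"


# Alleles reported for the first bim variant in the examples above.
ref, alt = "C", "T"
# Alleles taken from its ID, assuming a chrom:pos:ref:alt layout.
id_ref, id_alt = "1:10180:T:C".split(":")[2:]

print(allele_orientation(ref, alt, id_ref, id_alt))
```

For the bim variant shown earlier, the alleles in the ID are in the opposite order to the reported ref/alt, so the sketch reports a swap.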
- property chrom¶
- property path¶
- property variants¶
- match.lib.targetvariants.read_bim(path)¶
Read plink1 bim variant information files using the Python standard library
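A minimal sketch of parsing bim lines lazily with only the standard library is shown below. It relies on the PLINK 1 layout of six whitespace-delimited columns (chromosome, variant ID, centimorgan position, base-pair position, A1, A2). `iter_bim` and `BimRow` are hypothetical names, the input line is invented, and A1/A2 are deliberately not mapped to ref/alt because the documentation does not say how `read_bim` does that.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class BimRow(NamedTuple):
    """One .bim line; A1/A2 are kept as-is rather than mapped to ref/alt."""
    chrom: str
    id: str
    pos: int
    a1: str
    a2: str


def iter_bim(handle: TextIO) -> Iterator[BimRow]:
    """Hypothetical sketch: lazily parse whitespace-delimited .bim lines."""
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # The third column (centimorgan position) is read but not kept.
        chrom, variant_id, _cm, pos, a1, a2 = fields[:6]
        yield BimRow(chrom, variant_id, int(pos), a1, a2)


# Invented single-line input, for illustration only.
example = io.StringIO("1\trs1\t0\t100\tA\tG\n")
first = next(iter_bim(example))
print(first)
```

Yielding rows from a generator keeps memory use flat for large files, which matches the generator-based iteration shown for `TargetVariants`.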
- match.lib.targetvariants.read_pvar(path)¶
Read plink2 pvar variant information files using the Python standard library
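The sketch below shows one way to skip header comments when reading a pvar file with only the standard library. It assumes comment lines start with `##` and that a tab-delimited `#CHROM` header line names the columns. `iter_pvar` is a hypothetical name, not the documented `read_pvar` implementation. The header comment in the example input is invented, and the variant line mirrors the first hapnest variant shown earlier.

```python
import io
from typing import Dict, Iterator, TextIO


def iter_pvar(handle: TextIO) -> Iterator[Dict[str, str]]:
    """Hypothetical sketch: yield one dict per variant line of a .pvar file."""
    columns = None
    for line in handle:
        line = line.rstrip("\r\n")
        if not line or line.startswith("##"):
            continue  # header comments carry no variant data
        if line.startswith("#CHROM"):
            # Column header line: strip the leading '#' and remember the names.
            columns = line[1:].split("\t")
            continue
        if columns is None:
            raise ValueError("expected a #CHROM header line before variants")
        yield dict(zip(columns, line.split("\t")))


# Illustrative input: the comment line is invented.
text = (
    "##invented header comment\n"
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "14\t65003549\t14:65003549:T:C\tT\tC\n"
)
rows = list(iter_pvar(io.StringIO(text)))
print(rows[0])
```

Reading the column names from the header line, rather than hard-coding positions, lets the sketch tolerate pvar files with extra columns.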