match.lib.targetvariants
========================

.. py:module:: match.lib.targetvariants

.. autoapi-nested-parse::

   This module contains classes to work with target variants. When a scoring file is being reused to
   calculate scores for new genotypes, the new genotypes are target genomes.


Classes
-------

.. autoapisummary::

   match.lib.targetvariants.TargetType
   match.lib.targetvariants.TargetVariant
   match.lib.targetvariants.TargetVariants


Functions
---------

.. autoapisummary::

   match.lib.targetvariants.read_bim
   match.lib.targetvariants.read_pvar


Module Contents
---------------

.. py:class:: TargetType(*args, **kwds)

   Create a collection of name/value pairs.

   Example enumeration:

   >>> class Color(Enum):
   ...     RED = 1
   ...     BLUE = 2
   ...     GREEN = 3

   Access them by:

   - attribute access:

     >>> Color.RED
     <Color.RED: 1>

   - value lookup:

     >>> Color(1)
     <Color.RED: 1>

   - name lookup:

     >>> Color['RED']
     <Color.RED: 1>

   Enumerations can be iterated over, and know how many members they have:

   >>> len(Color)
   3

   >>> list(Color)
   [<Color.RED: 1>, <Color.BLUE: 2>, <Color.GREEN: 3>]

   Methods can be added to enumerations, and members can have their own
   attributes -- see the documentation for details.


   .. py:attribute:: BIM

   .. py:attribute:: PVAR


.. py:class:: TargetVariant(*, chrom, pos, ref, alt, id)

   A single target variant, including genomic coordinates and allele information

   >>> a = TargetVariant(chrom="1", pos=12, ref="A", alt="C", id='1:12:A:C')
   >>> a
   TargetVariant(chrom='1', pos=12, ref='A', alt='C', id='1:12:A:C')
   >>> b = a
   >>> b == a
   True


   .. py:attribute:: alt

   .. py:attribute:: chrom

   .. py:attribute:: id

   .. py:attribute:: pos

   .. py:attribute:: ref


.. py:class:: TargetVariants(path, chrom=None)

   A container of :class:`TargetVariant`

   :raises: FileNotFoundError

   >>> from pgscatalog.match.lib._config import Config  # ignore, only to load test data
   >>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.pvar")
   >>> pvar.ftype

   Iterating over TargetVariants is done via a read-only generator attribute:

   >>> pvar.variants  # doctest: +ELLIPSIS

   >>> for variant in pvar:
   ...     variant
   ...     break
   TargetVariant(chrom='14', pos=65003549, ref='T', alt='C', id='14:65003549:T:C')

   gzip and zstandard compression is transparently handled for pvar:

   >>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.pvar.zst")
   >>> for variant in pvar:
   ...     variant
   ...     break
   TargetVariant(chrom='14', pos=65003549, ref='T', alt='C', id='14:65003549:T:C')

   The same is true for bim files:

   >>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim.gz")
   >>> bim.ftype

   >>> for variant in bim:
   ...     variant
   ...     break
   TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')

   >>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim.zst")
   >>> for variant in bim:
   ...     variant
   ...     break
   TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')

   >>> bim = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "hapnest.bim")
   >>> for variant in bim:
   ...     variant
   ...     break
   TargetVariant(chrom='1', pos=10180, ref='C', alt='T', id='1:10180:T:C')

   Note, A1/A2 isn't guaranteed to be ref/alt because of PLINK1 file format
   limitations. PGS Catalog libraries handle this internally, but you should be
   aware REF/ALT can be swapped by plink during VCF to bim conversion.

   Some pvar files can contain a lot of comments in the header, which are ignored:

   >>> pvar = TargetVariants(Config.ROOT_DIR / "tests" / "data" / "1000G.pvar")
   >>> for variant in pvar:
   ...     variant
   ...     break
   TargetVariant(chrom='1', pos=10390, ref='CCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA', alt='C', id='1:10390:CCCCTAACCCCTAACCCTAACCCTAACCCTAACCCTAACCCTAA:C')


   .. py:property:: chrom

   .. py:property:: path

   .. py:property:: variants


.. py:function:: read_bim(path)

   Read plink1 bim variant information files using python core library


.. py:function:: read_pvar(path)

   Read plink2 pvar variant information files using python core library
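
As a rough illustration of what reading these files with only the Python standard library can look like, here is a minimal sketch. It is not the implementation of ``read_bim`` or ``read_pvar``: the names ``BimRow``, ``PvarRow``, ``open_text``, ``parse_bim_lines`` and ``parse_pvar_lines`` are hypothetical, the pvar input is assumed to carry a ``#CHROM`` header line with ``POS``, ``ID``, ``REF`` and ``ALT`` columns, and only gzip compression is handled (zstandard is left out). The bim parser keeps the alleles as ``a1``/``a2`` rather than guessing which one is the reference allele.

```python
import csv
import gzip
from pathlib import Path
from typing import Iterable, Iterator, NamedTuple


class BimRow(NamedTuple):
    """Hypothetical record for one bim row (not the library's TargetVariant)."""
    chrom: str
    pos: int
    id: str
    a1: str
    a2: str


class PvarRow(NamedTuple):
    """Hypothetical record for one pvar row (not the library's TargetVariant)."""
    chrom: str
    pos: int
    id: str
    ref: str
    alt: str


def open_text(path):
    """Open a plain or gzip-compressed text file; zstandard is not handled here."""
    path = Path(path)
    if path.suffix == ".gz":
        return gzip.open(path, mode="rt", newline="")
    return open(path, mode="rt", newline="")


def parse_bim_lines(lines: Iterable[str]) -> Iterator[BimRow]:
    """Yield one record per non-empty line of a headerless, six-column bim file."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        # Column order: chromosome, variant ID, centimorgan position,
        # base-pair position, allele 1, allele 2.
        chrom, variant_id, _cm, pos, a1, a2 = fields
        yield BimRow(chrom, int(pos), variant_id, a1, a2)


def parse_pvar_lines(lines: Iterable[str]) -> Iterator[PvarRow]:
    """Yield one record per data line, skipping '##' comment lines.

    Assumption: the first non-comment line is a '#CHROM ...' header.
    """
    body = (line for line in lines if not line.startswith("##"))
    for row in csv.DictReader(body, delimiter="\t"):
        yield PvarRow(row["#CHROM"], int(row["POS"]), row["ID"], row["REF"], row["ALT"])


# Usage with in-memory lines. The pvar row reuses the variant shown in the
# doctests above; the bim row is made up for illustration.
pvar_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\n",
    "14\t65003549\t14:65003549:T:C\tT\tC\n",
]
bim_lines = ["1\texample_id\t0\t100\tA\tG\n"]

first_pvar = next(parse_pvar_lines(pvar_lines))
first_bim = next(parse_bim_lines(bim_lines))
```

Both parsers are generators, so a caller can stop after the first variant without reading the rest of the file, in the same way the doctests above ``break`` out of the loop. To read from disk, pass the handle returned by ``open_text`` to either parser inside a ``with`` block.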