calc.lib.cache.targetvariants
=============================

.. py:module:: calc.lib.cache.targetvariants

.. autoapi-nested-parse::

   Provides the TargetVariants class for representing hard-called genotypes and variants

   This module defines a lightweight container of variant information, including:

   - chromosome name
   - chromosome position
   - reference allele
   - alternate allele

   Genotypes are stored in numpy unsigned 8-bit integer arrays (values between 0 and 255).
   Valid genotype values include 0, 1, and a sentinel value to represent missing data.

   The class exposes accessors for:

   - genotypes: a 3D numpy array of shape (n_variants, n_samples, ploidy)
   - samples: a list of sample identifiers
   - variant_df: a polars dataframe which contains variant metadata

   The module depends on:

   - numpy for genotype matrix ops
   - polars for building variant metadata tables quickly


Attributes
----------

.. autoapisummary::

   calc.lib.cache.targetvariants.logger


Classes
-------

.. autoapisummary::

   calc.lib.cache.targetvariants.TargetVariants


Functions
---------

.. autoapisummary::

   calc.lib.cache.targetvariants.add_missing_positions_to_lists


Module Contents
---------------

.. py:class:: TargetVariants(chr_name: list[str], pos: list[int], refs: list[str | None], alts: list[list[str] | None], gts: list[numpy.typing.NDArray[numpy.uint8]], samples: list[str], target_path: pgscatalog.calc.lib.types.Pathish, sampleset: str)

   .. py:method:: write_zarr(zarr_group: zarr.Group) -> None

      Write TargetVariants to a zarr group

      Sample IDs, variant metadata, and a genotype array are written to the zarr group

      The group must be at a file level in the hierarchy


   .. py:property:: genotypes
      :type: numpy.typing.NDArray[numpy.uint8]


   .. py:property:: samples
      :type: list[str]


   .. py:property:: variant_ids
      :type: list[str]


   .. py:property:: variant_metadata
      :type: calc.lib.cache.zarrmodels.ZarrVariantMetadata

      Convert variant metadata to a dict


.. py:function:: add_missing_positions_to_lists(*, chroms: list[str], positions: list[int], ref_alleles: list[str | None], alt_alleles: list[list[str] | None], hard_calls: list[numpy.typing.NDArray[numpy.uint8]], scoring_file_regions: list[tuple[str, int]], seen_positions: set[tuple[str, int]], n_samples: int) -> None

   Mutates lists in place!


.. py:data:: logger
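
The genotype layout described in the module summary can be illustrated with a small numpy sketch. This is not the module's implementation: the sentinel value ``255``, the toy data, and the helper ``alt_allele_dosage`` are all assumptions made for illustration, since the documentation only says that "a sentinel value" marks missing data.

```python
import numpy as np

# Assumption: 255 stands in for the missing-data sentinel. The module only
# documents "a sentinel value", so the real value may differ.
MISSING = 255

# Hypothetical toy data with shape (n_variants, n_samples, ploidy) = (2, 3, 2).
genotypes = np.array(
    [
        [[0, 1], [1, 1], [MISSING, MISSING]],  # variant 0
        [[0, 0], [0, 1], [1, 0]],              # variant 1
    ],
    dtype=np.uint8,
)


def alt_allele_dosage(gts, missing=MISSING):
    """Count alternate alleles per (variant, sample).

    Hypothetical helper: returns a float array in which any sample with a
    missing call at a variant is reported as NaN.
    """
    missing_mask = gts == missing
    # Zero out sentinel entries so they do not inflate the allele count.
    dosage = np.where(missing_mask, 0, gts).sum(axis=2, dtype=np.float64)
    # Flag samples with at least one missing call at a variant.
    dosage[missing_mask.any(axis=2)] = np.nan
    return dosage


dosage = alt_allele_dosage(genotypes)
```

Storing calls as ``uint8`` keeps the array compact, but the sentinel has to be masked before any arithmetic, otherwise it would be summed as if it were an allele count.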