calc.lib.cache.zarrmodels¶
Pydantic models for loading data from, and saving data to, zarr attributes (metadata).
These models help to read and write structured data about variants and samples. This data is helpful when working with the genotype array (variants = row names, samples = column names).
Attributes¶
logger |
Classes¶
ZarrVariantMetadata | A dataframe-y model, suitable for ingesting with pandas / polars / databases
Functions¶
is_valid_allele(alleles) |
Module Contents¶
- class calc.lib.cache.zarrmodels.ZarrVariantMetadata¶
A dataframe-y model, suitable for ingesting with pandas / polars / databases
Useful for saving / loading data about variants into a group-level zarr attribute.
Each target genome file will have its own variant metadata.
- check_even_length() ZarrVariantMetadata¶
- merge(other: ZarrVariantMetadata) ZarrVariantMetadata¶
- to_df() polars.DataFrame¶
- to_numpy() collections.abc.Mapping[str, numpy.typing.NDArray[Any]]¶
Convert the dataframe into 1D arrays.
Handles converting strings to a consistent fixed-width dtype.
- alts: Annotated[list[list[str] | None], AfterValidator(is_valid_allele)]¶
- chr_name: list[str]¶
- chr_pos: list[pydantic.PositiveInt]¶
- ref: Annotated[list[str | None], AfterValidator(is_valid_allele)]¶
- variant_id: list[str]¶
- calc.lib.cache.zarrmodels.is_valid_allele(alleles: list[list[str] | None] | list[str | None]) list[list[str] | None] | list[str | None]¶
- calc.lib.cache.zarrmodels.logger¶
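The column-oriented ("dataframe-y") layout described above can be illustrated with a small sketch. This is not the module's implementation: `VariantColumns` and `to_rows` are hypothetical names, a stdlib dataclass stands in for the Pydantic model so the example has no dependencies, and the behavior of the length check and of `merge` (appending rows) is an assumption based only on the method names `check_even_length` and `merge`.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class VariantColumns:
    """Hypothetical stand-in for ZarrVariantMetadata: one list per column."""

    variant_id: list[str]
    chr_name: list[str]
    chr_pos: list[int]
    ref: list[str | None]
    alts: list[list[str] | None]

    def __post_init__(self) -> None:
        # Assumed check: every column must describe the same number of variants.
        lengths = {name: len(col) for name, col in vars(self).items()}
        if len(set(lengths.values())) > 1:
            raise ValueError(f"column lengths differ: {lengths}")

    def merge(self, other: VariantColumns) -> VariantColumns:
        # Assumed semantics: append the other object's rows to this one's.
        return VariantColumns(
            **{name: col + getattr(other, name) for name, col in vars(self).items()}
        )

    def to_rows(self) -> list[dict]:
        # Row-wise view, a shape most dataframe constructors accept.
        names = list(vars(self))
        return [dict(zip(names, row)) for row in zip(*vars(self).values())]


a = VariantColumns(
    variant_id=["rs1", "rs2"],
    chr_name=["1", "1"],
    chr_pos=[100, 200],
    ref=["A", "G"],
    alts=[["T"], ["C", "A"]],
)
b = VariantColumns(
    variant_id=["rs3"], chr_name=["2"], chr_pos=[50], ref=["C"], alts=[None]
)
merged = a.merge(b)
rows = merged.to_rows()
```

Storing one list per column keeps the structure JSON-serializable, which is what a zarr attribute requires, while the row view is a convenient hand-off to a dataframe library.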