calc.lib.scorefile¶
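Attributes¶
logger |
Classes¶
Scorefiles | One or more scoring files processed with the pgscatalog-format program.
Functions¶
get_position_df(zarr_group) | Get variants from the "meta" zarr group array.
load_scoring_files(db_path, scorefile_paths, max_memory_gb, threads) | Load scoring files into the score_variant_table.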
Module Contents¶
- class calc.lib.scorefile.Scorefiles(paths: calc.lib.types.Pathish | calc.lib.types.PathishList)¶
One or more scoring files processed with the pgscatalog-format program.
- get_unique_positions(chrom: str | None = None, zarr_group: zarr.Group | None = None) → list[tuple[str, int]]¶
- column_types¶
- property paths: list[pathlib.Path]¶
Return the list of scoring file paths.
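The signature of get_unique_positions suggests it returns distinct (chromosome, position) pairs, optionally restricted to one chromosome. The sketch below illustrates that idea on plain tuples. It is an assumption based on the signature alone, not the library's implementation: the helper name unique_positions and the sample rows are hypothetical, the sort order is a choice made here for readability, and the zarr_group argument is not modelled.

```python
from __future__ import annotations

from collections.abc import Iterable


def unique_positions(
    rows: Iterable[tuple[str, int]], chrom: str | None = None
) -> list[tuple[str, int]]:
    """Return distinct (chromosome, position) pairs from rows.

    Hypothetical stand-in for illustration only. If chrom is given,
    only pairs on that chromosome are kept.
    """
    # A set comprehension drops duplicate pairs; the filter is skipped when chrom is None.
    distinct = {(c, p) for c, p in rows if chrom is None or c == chrom}
    # Sorting is an assumption made here so the result is deterministic.
    return sorted(distinct)


# Hypothetical variant rows, with one duplicated position on chromosome "1".
rows = [("1", 100), ("1", 100), ("2", 50), ("1", 20)]

all_positions = unique_positions(rows)
chrom_two_only = unique_positions(rows, chrom="2")
```

Chromosomes are compared as strings here, matching the str annotation in the signature, so "10" sorts before "2".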
- calc.lib.scorefile.get_position_df(zarr_group: zarr.Group) → polars.DataFrame¶
Get variants from the “meta” zarr group array.
- calc.lib.scorefile.load_scoring_files(db_path: calc.lib.types.Pathish, scorefile_paths: calc.lib.types.PathishList, max_memory_gb: str, threads: int) → None¶
Load scoring files into the score_variant_table.
Parameters¶
- db_path : Pathish
Path to the DuckDB database file.
- scorefile_paths : PathishList
A list of paths to the scoring CSV file(s). Scoring files must be in the structured format created by pgscatalog-format.
- max_memory_gb : str
Maximum memory DuckDB is allowed to use (e.g., “4GB”).
- threads : int
Number of threads for DuckDB to use.
Notes¶
The score_variant_table is created or replaced each time this function is called.
Effect weights are stored as double-precision floating-point numbers (equivalent to np.float64). All earlier processing (e.g. by pgscatalog.core) treats effect weights as strings to prevent precision problems.
pgscatalog-format can process scoring files from the PGS Catalog or custom scoring files.
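The note on effect weights can be illustrated with the standard library alone. The sketch below shows why keeping a weight as text preserves it exactly, while converting to a double can change the digits. The weight values are hypothetical, and nothing here calls load_scoring_files or DuckDB.

```python
from decimal import Decimal

# Hypothetical effect weight with more digits than a double can hold.
weight_text = "0.1234567890123456789"

# Converting to a Python float (an IEEE 754 double, like np.float64) rounds the value.
as_float = float(weight_text)
round_tripped = repr(as_float)

# The text no longer matches after a round trip through float.
float_changed_text = round_tripped != weight_text

# Decimal(as_float) is the exact value of the stored double, so this is the rounding error.
rounding_error = abs(Decimal(weight_text) - Decimal(as_float))

# A short weight survives the round trip unchanged.
short_weight_text = "0.05"
short_round_trip_ok = repr(float(short_weight_text)) == short_weight_text
```

The rounding error is tiny for a single conversion, which is why a single final conversion to double when loading is a reasonable design, while passing strings through the earlier steps avoids repeated parse and format cycles altering the text.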
- calc.lib.scorefile.logger¶