match.lib.scoringfileframe
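Attributes
logger |
Classes
ScoringFileFrame | Like pgscatalog.core.NormalisedScoringFile, but backed by the polars dataframe library
Functions
match_variants(score_df, target_df, target) | Get all match candidates for a VariantFrame dataframe and ScoringFileFrame dataframe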
Module Contents
- class match.lib.scoringfileframe.ScoringFileFrame(paths, chrom=None, cleanup=True, tmpdir=None)
Like pgscatalog.core.NormalisedScoringFile, but backed by the polars dataframe library.
Instantiated with a pgscatalog.core.NormalisedScoringFile that has been written to a file. This is a long format (melted) CSV file containing normalised variant data, i.e. the output of the combine scorefiles application:
>>> from ._config import Config
>>> path = Config.ROOT_DIR.parent / "pgscatalog.core" / "tests" / "data" / "combined.txt.gz"
>>> x = ScoringFileFrame(path)
>>> x
ScoringFileFrame([NormalisedScoringFile('.../combined.txt.gz')])
Use the instance as a context manager to prepare a polars dataframe. The backing Arrow files exist only inside the with block and are cleaned up on exit:
>>> import os
>>> with x as arrow:
...     assert all(os.path.exists(p) for p in x.arrowpaths)
...     arrow.collect().shape
(154, 11)
>>> assert not any(os.path.exists(p) for p in x.arrowpaths)  # all cleaned up
>>> from .variantframe import VariantFrame
>>> path = Config.ROOT_DIR.parent / "pgscatalog.core" / "tests" / "data" / "hapnest.bim"
>>> target = VariantFrame(path, dataset="hapnest")
>>> with target as target_df, x as score_df:
...     match_variants(score_df=score_df, target_df=target_df, target=target)
MatchResult(dataset=hapnest, matchresult=[<LazyFrame ...
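The lifecycle shown in the context manager example above (files present inside the with block, gone afterwards) can be illustrated with the standard library alone. This is a minimal sketch of the general pattern, not the ScoringFileFrame implementation: prepared_files is a hypothetical helper, the placeholder bytes stand in for real Arrow IPC data, and the meaning given here to cleanup and tmpdir (remove temporary files on exit, and where to create them) is an assumption based on the parameter names in the class signature.

```python
import contextlib
import os
import shutil
import tempfile


@contextlib.contextmanager
def prepared_files(n_files, cleanup=True, tmpdir=None):
    """Hypothetical helper: create n_files placeholder files and yield their paths."""
    workdir = tempfile.mkdtemp(dir=tmpdir)
    paths = []
    try:
        for i in range(n_files):
            path = os.path.join(workdir, f"part{i}.ipc")
            with open(path, "wb") as f:
                f.write(b"placeholder")  # stands in for real Arrow IPC data
            paths.append(path)
        yield paths
    finally:
        if cleanup:
            # Remove the working directory and everything inside it
            shutil.rmtree(workdir, ignore_errors=True)


# Files exist inside the block and are removed when it exits
with prepared_files(2) as paths:
    exist_inside = all(os.path.exists(p) for p in paths)
exist_after = any(os.path.exists(p) for p in paths)

# With cleanup disabled (assumed semantics), the files survive the block
with prepared_files(1, cleanup=False) as kept:
    pass
kept_after = os.path.exists(kept[0])
shutil.rmtree(os.path.dirname(kept[0]))  # tidy up manually
```

Putting the removal in a finally clause means the temporary files are also deleted when the body of the with block raises an exception.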
- save_ipc(destination)
Save the dataframe prepared by the context manager to an Arrow IPC file.
The context manager deletes its IPC files on exit, so call this method inside the with block to persist the data.
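The reason for saving before the context manager exits can be shown with a small standard-library sketch. TempBackedData and its save method are hypothetical stand-ins, and copying a placeholder file is only an analogy: how save_ipc actually writes the Arrow IPC file is not described here and is not modelled.

```python
import os
import shutil
import tempfile


class TempBackedData:
    """Hypothetical stand-in for an object whose files live only inside a with block."""

    def __enter__(self):
        self._workdir = tempfile.mkdtemp()
        self.path = os.path.join(self._workdir, "data.ipc")
        with open(self.path, "wb") as f:
            f.write(b"placeholder bytes")  # not real Arrow IPC data
        return self

    def __exit__(self, exc_type, exc, tb):
        # Temporary files are removed on exit, as in the cleanup described above
        shutil.rmtree(self._workdir, ignore_errors=True)
        return False  # do not swallow exceptions

    def save(self, destination):
        """Copy the temporary file to a location that outlives the with block."""
        shutil.copyfile(self.path, destination)
        return destination


outdir = tempfile.mkdtemp()
destination = os.path.join(outdir, "saved.ipc")
with TempBackedData() as data:
    data.save(destination)  # must happen before the block exits

temp_gone = not os.path.exists(data.path)
saved_ok = os.path.exists(destination)
with open(destination, "rb") as f:
    saved_bytes = f.read()
shutil.rmtree(outdir)  # remove the demo output
```

Calling save after the with block would fail in this sketch because the source file no longer exists.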
- arrowpaths = None
- chrom = None
- match.lib.scoringfileframe.match_variants(score_df, target_df, target)
Get all match candidates for a VariantFrame dataframe and ScoringFileFrame dataframe.
Returns a MatchResult.
- match.lib.scoringfileframe.logger