match.lib.scoringfileframe
==========================

.. py:module:: match.lib.scoringfileframe


Attributes
----------

.. autoapisummary::

   match.lib.scoringfileframe.logger


Classes
-------

.. autoapisummary::

   match.lib.scoringfileframe.ScoringFileFrame


Functions
---------

.. autoapisummary::

   match.lib.scoringfileframe.match_variants


Module Contents
---------------

.. py:class:: ScoringFileFrame(paths, chrom=None, cleanup=True, tmpdir=None)

   Like :class:`pgscatalog.core.NormalisedScoringFile`, but backed by the polars dataframe library

   Instantiated with a :class:`pgscatalog.core.NormalisedScoringFile` written to a file. This is a long format/melted CSV file containing normalised variant data (i.e. the output of combine scorefiles application):

   >>> from ._config import Config
   >>> path = Config.ROOT_DIR.parent / "pgscatalog.core" / "tests" / "data" / "combined.txt.gz"
   >>> x = ScoringFileFrame(path)
   >>> x # doctest: +ELLIPSIS
   ScoringFileFrame([NormalisedScoringFile('.../combined.txt.gz')])

   Using a context manager is important to prepare a polars dataframe:

   >>> with x as arrow:
   ...     assert all(os.path.exists(x) for x in x.arrowpaths)
   ...     arrow.collect().shape
   (154, 11)
   >>> assert not any(os.path.exists(x) for x in x.arrowpaths) # all cleaned up

   >>> from .variantframe import VariantFrame
   >>> path = Config.ROOT_DIR.parent / "pgscatalog.core" / "tests" / "data" / "hapnest.bim"
   >>> target = VariantFrame(path, dataset="hapnest")
   >>> with target as target_df, x as score_df:
   ...     match_variants(score_df=score_df, target_df=target_df, target=target) # doctest: +ELLIPSIS
   MatchResult(dataset=hapnest, matchresult=[