pandas_plink.Chunk
- class pandas_plink.Chunk(nsamples=1024, nvariants=1024)
Chunk specification.
It is effectively a contiguous submatrix of the dosage matrix.
- Parameters
nsamples (Optional[int]) – Number of samples in a single chunk, capped at the total number of samples. Set to None to include all samples. Defaults to 1024.
nvariants (Optional[int]) – Number of variants in a single chunk, capped at the total number of variants. Set to None to include all variants. Defaults to 1024.
Note
Small chunks might increase computation time, while large chunks might increase IO usage. If you have a small data set, try setting both nsamples and nvariants to None. If the data set is too large but your application will use every sample, try setting nsamples = None and choosing a small value for nvariants.
Examples
>>> from pandas_plink import Chunk
>>>
>>> Chunk()
Chunk(nsamples=1024, nvariants=1024)
>>> Chunk(nsamples=None)
Chunk(nsamples=None, nvariants=1024)
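To make the trade-off described in the note more concrete, the following is a minimal sketch in plain Python of how a requested chunk shape might be capped by the matrix dimensions, and how many chunks would result. The helpers effective_chunk_shape and chunk_count are hypothetical names introduced here for illustration; they are not part of pandas_plink, and the library's internal logic may differ. The data set size (500 samples, 100,000 variants) is also made up.

```python
import math
from typing import Optional, Tuple


def effective_chunk_shape(
    nsamples: Optional[int],
    nvariants: Optional[int],
    total_samples: int,
    total_variants: int,
) -> Tuple[int, int]:
    """Cap the requested chunk shape at the matrix shape (hypothetical helper).

    None means "use the whole axis". Totals are assumed to be positive.
    """
    rows = total_samples if nsamples is None else min(nsamples, total_samples)
    cols = total_variants if nvariants is None else min(nvariants, total_variants)
    return rows, cols


def chunk_count(
    chunk_shape: Tuple[int, int], total_samples: int, total_variants: int
) -> int:
    """Number of chunks needed to tile the full matrix (hypothetical helper)."""
    rows, cols = chunk_shape
    return math.ceil(total_samples / rows) * math.ceil(total_variants / cols)


# Assumed data set: 500 samples by 100,000 variants.
total_samples, total_variants = 500, 100_000

# Default request (1024, 1024): the sample axis is capped at 500.
default_shape = effective_chunk_shape(1024, 1024, total_samples, total_variants)

# Every sample in each chunk, with a smaller variant width.
narrow_shape = effective_chunk_shape(None, 256, total_samples, total_variants)

print(default_shape, chunk_count(default_shape, total_samples, total_variants))
print(narrow_shape, chunk_count(narrow_shape, total_samples, total_variants))
```

Under these assumptions, narrowing the variant axis yields more, smaller chunks, which is the computation-versus-IO balance the note refers to.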
- __init__(nsamples=1024, nvariants=1024)
Initialize self. See help(type(self)) for accurate signature.