pandas_plink.Chunk
- class pandas_plink.Chunk(nsamples=1024, nvariants=1024)
Chunk specification.
It is effectively a contiguous submatrix of the dosage matrix.
- Parameters
nsamples (Optional[int]) – Number of samples in a single chunk, capped at the total number of samples. Set to None to include all samples. Defaults to 1024.
nvariants (Optional[int]) – Number of variants in a single chunk, capped at the total number of variants. Set to None to include all variants. Defaults to 1024.
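The capping behaviour of the two parameters can be pictured as a simple clamp. The following is a minimal sketch of that idea, not pandas_plink's actual implementation; `effective_chunk_shape` and its `total_samples` / `total_variants` arguments are hypothetical names introduced only for illustration.

```python
from typing import Optional, Tuple


def effective_chunk_shape(
    nsamples: Optional[int],
    nvariants: Optional[int],
    total_samples: int,
    total_variants: int,
) -> Tuple[int, int]:
    """Hypothetical helper: clamp a requested chunk size to the data set size.

    Returns (samples per chunk, variants per chunk).
    """
    # None means "include everything along this axis".
    n_s = total_samples if nsamples is None else min(nsamples, total_samples)
    n_v = total_variants if nvariants is None else min(nvariants, total_variants)
    return n_s, n_v


# Default request on a data set with only 500 samples: the sample axis is capped.
print(effective_chunk_shape(1024, 1024, 500, 20000))  # (500, 1024)
# nsamples=None keeps every sample in each chunk.
print(effective_chunk_shape(None, 1024, 5000, 20000))  # (5000, 1024)
```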
Note
Small chunks might increase computation time, while large chunks might increase IO usage. If you have a small data set, try setting both nsamples and nvariants to None. If the data set is too large but your application will use every sample, try setting nsamples=None and choosing a small value for nvariants.
Examples
>>> from pandas_plink import Chunk
>>>
>>> Chunk()
Chunk(nsamples=1024, nvariants=1024)
>>> Chunk(nsamples=None)
Chunk(nsamples=None, nvariants=1024)
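To make the trade-off described in the note more concrete, the sketch below counts how many chunks a given setting would produce for a data set of a given size. It is only an illustration under the assumption that chunks tile the matrix in a regular grid; `count_chunks` is a hypothetical helper, not part of pandas_plink, and the data set sizes are made up.

```python
import math
from typing import Optional


def count_chunks(
    total_samples: int,
    total_variants: int,
    nsamples: Optional[int] = 1024,
    nvariants: Optional[int] = 1024,
) -> int:
    """Hypothetical helper: number of chunks in a regular grid tiling.

    Assumes total_samples and total_variants are both positive.
    """
    # Clamp the requested chunk size to the data set size; None means "all".
    n_s = total_samples if nsamples is None else min(nsamples, total_samples)
    n_v = total_variants if nvariants is None else min(nvariants, total_variants)
    # Chunks along each axis, rounding up to cover the ragged edge.
    return math.ceil(total_samples / n_s) * math.ceil(total_variants / n_v)


# Illustrative data set: 5000 samples, 100000 variants.
print(count_chunks(5000, 100000))              # 490 chunks with the defaults
print(count_chunks(5000, 100000, None, 256))   # 391 chunks, each holding every sample
print(count_chunks(300, 2000, None, None))     # 1 chunk for a small data set
```

Fewer, larger chunks mean less per-chunk overhead but more data read at once; many small chunks have the opposite effect.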
- __init__(nsamples=1024, nvariants=1024)
Initialize self. See help(type(self)) for accurate signature.