معرفی شرکت ها


biobear-0.4.0


Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

-
ویژگی مقدار
سیستم عامل -
نام فایل biobear-0.4.0
نام biobear
نسخه کتابخانه 0.4.0
نگهدارنده []
ایمیل نگهدارنده []
نویسنده -
ایمیل نویسنده WHERE TRUE devs <thauck+biobear@wheretrue.com>
آدرس صفحه اصلی -
آدرس اینترنتی https://pypi.org/project/biobear/
مجوز -
# biobear (v0.4.0) biobear is a Python library designed for reading and searching bioinformatic file formats, using Rust as its backend and producing Arrow or Polars DataFrames as its output. The python package has minimal dependencies and only requires Polars. Biobear can be used to read various bioinformatic file formats, including FASTA, FASTQ, VCF, BAM, and GFF. It can also query some indexed file formats, including VCF and BAM. - [Installation](#installation) - [Usage](#usage) - [Similar Packages](#similar-packages) - [API Documentation](#api-documentation) - [vcf\_reader](#vcf_reader) - [VCFReader](#vcfreader) - [VCFIndexedReader](#vcfindexedreader) - [genbank\_reader](#genbank_reader) - [GenbankReader](#genbankreader) - [fasta\_reader](#fasta_reader) - [FastaReader](#fastareader) - [compression](#compression) - [Compression](#compression-1) - [\_\_init\_\_](#__init__-4) - [bam\_reader](#bam_reader) - [BamReader](#bamreader) - [BamIndexedReader](#bamindexedreader) - [fastq\_reader](#fastq_reader) - [FastqReader](#fastqreader) - [gff\_reader](#gff_reader) - [GFFReader](#gffreader) ## Installation ```bash pip install biobear ``` Prefer python 3.10 or higher, though python 3.7+ should work. ## Usage Read a FASTQ file: ```python import biobear as bb df = bb.FastqReader("test.fq").read() print(df.head()) # ┌─────────┬───────────────────────┬───────────────────────────────────┬───────────────────────────────────┐ # │ name ┆ description ┆ sequence ┆ quality │ # │ --- ┆ --- ┆ --- ┆ --- │ # │ str ┆ str ┆ str ┆ str │ # ╞═════════╪═══════════════════════╪═══════════════════════════════════╪═══════════════════════════════════╡ # │ SEQ_ID ┆ This is a description ┆ GATTTGGGGTTCAAAGCAGTATCGATCAAATA… ┆ !''*((((***+))%%%++)(%%%%).1***-… │ # │ SEQ_ID2 ┆ null ┆ GATTTGGGGTTCAAAGCAGTATCGATCAAATA… ┆ !''*((((***+))%%%++)(%%%%).1***-… │ # └─────────┴───────────────────────┴───────────────────────────────────┴───────────────────────────────────┘ ``` Read a gzipped FASTQ file: ```python import biobear as bb from biobear.compression import Compression df = bb.FastqReader("./python/tests/data/test.fastq.gz", compression=Compression.GZIP).read() print(df.head()) # ┌─────────┬─────────────┬───────────────────────────────────┬───────────────────────────────────┐ # │ name ┆ description ┆ sequence ┆ quality │ # │ --- ┆ --- ┆ --- ┆ --- │ # │ str ┆ str ┆ str ┆ str │ # ╞═════════╪═════════════╪═══════════════════════════════════╪═══════════════════════════════════╡ # │ SEQ_ID ┆ null ┆ GATTTGGGGTTCAAAGCAGTATCGATCAAATA… ┆ !''*((((***+))%%%++)(%%%%).1***-… │ # │ SEQ_ID2 ┆ null ┆ GATTTGGGGTTCAAAGCAGTATCGATCAAATA… ┆ !''*((((***+))%%%++)(%%%%).1***-… │ # └─────────┴─────────────┴───────────────────────────────────┴───────────────────────────────────┘ # The compression type is also inferred from the extension of the file df = bb.FastqReader("test.fq.gz").read() print(df.head()) # ┌─────────┬─────────────┬───────────────────────────────────┬───────────────────────────────────┐ # │ name ┆ description ┆ sequence ┆ quality │ # │ --- ┆ --- ┆ --- ┆ --- │ # │ str ┆ str ┆ str ┆ str │ # ╞═════════╪═════════════╪═══════════════════════════════════╪═══════════════════════════════════╡ # │ SEQ_ID ┆ null ┆ GATTTGGGGTTCAAAGCAGTATCGATCAAATA… ┆ !''*((((***+))%%%++)(%%%%).1***-… │ # │ SEQ_ID2 ┆ null ┆ GATTTGGGGTTCAAAGCAGTATCGATCAAATA… ┆ !''*((((***+))%%%++)(%%%%).1***-… │ # └─────────┴─────────────┴───────────────────────────────────┴───────────────────────────────────┘ ``` Query an indexed VCF file: ```python import biobear as bb # Will error if test.vcf.gz.tbi is not present df = bb.VCFIndexedReader("test.vcf.gz").query("1") print(df.head()) # ┌────────────┬──────────┬───────┬───────────┬───┬───────────────┬────────┬───────────────────────────────────┬────────────────┐ # │ chromosome ┆ position ┆ id ┆ reference ┆ … ┆ quality_score ┆ filter ┆ info ┆ format │ # │ --- ┆ --- ┆ --- ┆ --- ┆ ┆ --- ┆ --- ┆ --- ┆ --- │ # │ str ┆ i32 ┆ str ┆ str ┆ ┆ f32 ┆ str ┆ str ┆ str │ # ╞════════════╪══════════╪═══════╪═══════════╪═══╪═══════════════╪════════╪═══════════════════════════════════╪════════════════╡ # │ 1 ┆ 3000150 ┆ ┆ C ┆ … ┆ 59.200001 ┆ PASS ┆ AN=4;AC=2 ┆ GT:GQ │ # │ 1 ┆ 3000151 ┆ ┆ C ┆ … ┆ 59.200001 ┆ PASS ┆ AN=4;AC=2 ┆ GT:DP:GQ │ # │ 1 ┆ 3062915 ┆ id3D ┆ GTTT ┆ … ┆ 12.9 ┆ q10 ┆ DP4=1,2,3,4;AN=4;AC=2;INDEL;STR=… ┆ GT:GQ:DP:GL │ # │ 1 ┆ 3062915 ┆ idSNP ┆ G ┆ … ┆ 12.6 ┆ test ┆ TEST=5;DP4=1,2,3,4;AN=3;AC=1,1 ┆ GT:TT:GQ:DP:GL │ # │ 1 ┆ 3106154 ┆ ┆ CAAA ┆ … ┆ 342.0 ┆ PASS ┆ AN=4;AC=2 ┆ GT:GQ:DP │ # └────────────┴──────────┴───────┴───────────┴───┴───────────────┴────────┴───────────────────────────────────┴────────────────┘ ``` ## Similar Packages Similar packages and/or inspiration for this package: - https://github.com/abdenlab/saimin/ - https://github.com/tshauck/brrrr/ - https://github.com/natir/vcf2parquet/ - https://github.com/zaeleus/noodles/ - https://github.com/eto-ai/lance ## API Documentation These docs are auto-generated, please file an issue if something is amiss. <a id="vcf_reader"></a> ### vcf\_reader VCF File Readers. <a id="vcf_reader.VCFReader"></a> #### VCFReader ```python class VCFReader() ``` A VCF File Reader. This class is used to read a VCF file and convert it to a polars DataFrame. <a id="vcf_reader.VCFReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path) ``` Initialize the VCFReader. **Arguments**: - `path` _Path_ - Path to the VCF file. <a id="vcf_reader.VCFReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the VCF reader to an arrow batch reader. <a id="vcf_reader.VCFReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the VCF reader to an arrow scanner. <a id="vcf_reader.VCFReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the VCF file and return a polars DataFrame. <a id="vcf_reader.VCFIndexedReader"></a> #### VCFIndexedReader ```python class VCFIndexedReader() ``` An Indexed VCF File Reader. This class is used to read or query an indexed VCF file and convert it to a polars DataFrame. <a id="vcf_reader.VCFIndexedReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path) ``` Initialize the VCFIndexedReader. <a id="vcf_reader.VCFIndexedReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the VCF file and return a polars DataFrame. <a id="vcf_reader.VCFIndexedReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the VCF reader to an arrow batch reader. <a id="vcf_reader.VCFIndexedReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the VCF reader to an arrow scanner. <a id="vcf_reader.VCFIndexedReader.query"></a> ##### query ```python def query(region: str) -> pl.DataFrame ``` Query the VCF file and return a polars DataFrame. **Arguments**: - `region` _str_ - The region to query. <a id="genbank_reader"></a> ### genbank\_reader Genbank file reader. <a id="genbank_reader.GenbankReader"></a> #### GenbankReader ```python class GenbankReader() ``` <a id="genbank_reader.GenbankReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path) ``` Read a fasta file. **Arguments**: - `path` _Path_ - Path to the fasta file. <a id="genbank_reader.GenbankReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the fasta file and return a polars DataFrame. <a id="genbank_reader.GenbankReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the fasta reader to an arrow scanner. <a id="genbank_reader.GenbankReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the fasta reader to an arrow batch reader. <a id="fasta_reader"></a> ### fasta\_reader FASTA file reader. <a id="fasta_reader.FastaReader"></a> #### FastaReader ```python class FastaReader() ``` <a id="fasta_reader.FastaReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path, compression: Compression = Compression.INFERRED) ``` Read a fasta file. **Arguments**: - `path` _Path_ - Path to the fasta file. Kwargs: - `compression` _Compression_ - Compression type of the file. Defaults to Compression.INFERRED. <a id="fasta_reader.FastaReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the fasta file and return a polars DataFrame. <a id="fasta_reader.FastaReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the fasta reader to an arrow scanner. <a id="fasta_reader.FastaReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the fasta reader to an arrow batch reader. <a id="compression"></a> ### compression Compression configuration. <a id="compression.Compression"></a> #### Compression ```python class Compression(Enum) ``` Compression types for files. <a id="compression.Compression.from_file"></a> ##### from\_file ```python @classmethod def from_file(cls, path: os.PathLike) -> "Compression" ``` Infer the compression type from the file extension. <a id="compression.Compression.infer_or_use"></a> ##### infer\_or\_use ```python def infer_or_use(path: os.PathLike) -> "Compression" ``` Infer the compression type from the file extension if needed. <a id="__init__"></a> ### \_\_init\_\_ Main biobear package. <a id="bam_reader"></a> ### bam\_reader BAM File Readers. <a id="bam_reader.BamReader"></a> #### BamReader ```python class BamReader() ``` A BAM File Reader. <a id="bam_reader.BamReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path) ``` Initialize the BamReader. **Arguments**: - `path` _Path_ - Path to the BAM file. <a id="bam_reader.BamReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the BAM reader to an arrow batch reader. <a id="bam_reader.BamReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the BAM reader to an arrow scanner. <a id="bam_reader.BamReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the BAM file and return a polars DataFrame. <a id="bam_reader.BamIndexedReader"></a> #### BamIndexedReader ```python class BamIndexedReader() ``` An Indexed BAM File Reader. <a id="bam_reader.BamIndexedReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path, index: Path) ``` Initialize the BamIndexedReader. **Arguments**: - `path` _Path_ - Path to the BAM file. - `index` _Path_ - Path to the BAM index file. <a id="bam_reader.BamIndexedReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the BAM file and return a polars DataFrame. <a id="bam_reader.BamIndexedReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the BAM reader to an arrow batch reader. <a id="bam_reader.BamIndexedReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the BAM reader to an arrow scanner. <a id="bam_reader.BamIndexedReader.query"></a> ##### query ```python def query(chrom: str, start: int, end: int) -> pl.DataFrame ``` Query the BAM file and return a polars DataFrame. **Arguments**: - `chrom` _str_ - The chromosome to query. - `start` _int_ - The start position to query. - `end` _int_ - The end position to query. <a id="fastq_reader"></a> ### fastq\_reader FASTQ reader. <a id="fastq_reader.FastqReader"></a> #### FastqReader ```python class FastqReader() ``` <a id="fastq_reader.FastqReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: os.PathLike, compression: Compression = Compression.INFERRED) ``` Read a fastq file. **Arguments**: - `path` _Path_ - Path to the fastq file. Kwargs: - `compression` _Compression_ - Compression type of the file. Defaults to Compression.INFERRED. <a id="fastq_reader.FastqReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the fasta file and return a polars DataFrame. <a id="fastq_reader.FastqReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the fasta reader to an arrow scanner. <a id="fastq_reader.FastqReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the fasta reader to an arrow batch reader. <a id="gff_reader"></a> ### gff\_reader GFF File Reader. <a id="gff_reader.GFFReader"></a> #### GFFReader ```python class GFFReader() ``` A GFF File Reader. <a id="gff_reader.GFFReader.__init__"></a> ##### \_\_init\_\_ ```python def __init__(path: Path, compression: Compression = Compression.INFERRED) ``` Initialize the GFFReader. **Arguments**: - `path` - The path to the GFF file. <a id="gff_reader.GFFReader.read"></a> ##### read ```python def read() -> pl.DataFrame ``` Read the GFF file and return a polars DataFrame. <a id="gff_reader.GFFReader.to_arrow_record_batch_reader"></a> ##### to\_arrow\_record\_batch\_reader ```python def to_arrow_record_batch_reader() -> pa.RecordBatchReader ``` Convert the GFF reader to an arrow batch reader. <a id="gff_reader.GFFReader.to_arrow_scanner"></a> ##### to\_arrow\_scanner ```python def to_arrow_scanner() -> ds.Scanner ``` Convert the GFF reader to an arrow scanner.


نیازمندی

مقدار نام
- polars[pyarrow]>=0.17.0


زبان مورد نیاز

مقدار نام
>=3.7 Python


نحوه نصب


نصب پکیج whl biobear-0.4.0:

    pip install biobear-0.4.0.whl


نصب پکیج tar.gz biobear-0.4.0:

    pip install biobear-0.4.0.tar.gz