معرفی شرکت ها


bcf-extras-0.2.0


Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

A set of variant file helper utilities built on top of bcftools and htslib.
ویژگی مقدار
سیستم عامل -
نام فایل bcf-extras-0.2.0
نام bcf-extras
نسخه کتابخانه 0.2.0
نگهدارنده []
ایمیل نگهدارنده []
نویسنده David Lougheed
ایمیل نویسنده david.lougheed@gmail.com
آدرس صفحه اصلی https://github.com/davidlougheed/bcf_extras
آدرس اینترنتی https://pypi.org/project/bcf-extras/
مجوز GPLv3
# bcf_extras ![Test Status](https://github.com/davidlougheed/bcf_extras/workflows/Tests/badge.svg) Extra variant file helper utilities built on top of `bcftools` and `htslib`. ## License `bcf_extras` is licensed under GPLv3. See [the license](./LICENSE) for more information. ## Dependencies The package requires that `bcftools` and `htslib` are installed on your operating system. For the `str` extra, `TRtools` is required as a pip dependency. This should be handled automatically upon installation, but I have encountered installation issues before (especially when copies of `htslib` interfere with one another.) Feel free to file an issue to ask about this. ## Installation One can use pip to install either the base `bcf-extras` or `bcf-extras[str]`, which has additional dependencies and adds some more utilities related to STR calling. ```bash pip install bcf-extras # or pip install bcf-extras[str] ``` ## What's Included (Base) ### `copy-compress-index` Creates `.vcf.gz` files with a corresponding tabix indices from one or more VCFs, sorting the VCFs if necessary. For example, the following would generate `sample-1.vcf.gz` and `sample-1.vcf.gz.tbi`: ```bash bcf-extras copy-compress-index sample-1.vcf ``` ### `add-header-lines` Adds header lines from a text file to a particular position in the VCF header. Useful for e.g. inserting missing `##contig` lines to a bunch of VCFs at once (taking advantage of this + something like GNU parallel.) For the `##contig` lines example, inserting the contents of [`tests/vcfs/new_lines.txt`](tests/vcfs/new_lines.txt), we could run the following command on [`tests/vcfs/ahl.vcf`](tests/vcfs/ahl.vcf), replacing the file with a new copy: ```bash bcf-extras add-header-lines tests/vcfs/ahl.vcf tests/vcfs/new_lines.txt ``` There is also a flag, `--tmp-dir`, for specifying a temporary folder location into which header artifacts will be placed. This is especially useful when running jobs on clusters, which may have specific locations for temporary I/O. Using GNU parallel, we can do multiple VCFs at once, e.g.: ```bash parallel 'bcf-extras add-header-lines {} tests/vcfs/new_lines.txt --keep-old' ::: /path/to/my/vcfs/*.vcf ``` The `--keep-old` flag keeps the original VCFs as a copy. ### `arg-join` Some bioinformatics utilities take in comma-separated file lists rather than the more standard whitespace-separated lists that something like a glob (`*.vcf.gz`) generates. This command can be run by itself, e.g.: ```bash bcf-extras arg-join --sep ";" *.vcf # Outputs e.g. sample1.vcf;sample2.vcf ``` It can be used embedded in another command, e.g. with `mergeSTR`, [a tool for merging STR caller VCFs](https://github.com/gymreklab/TRTools) which takes as input a *comma-separated* list of files: ```bash mergeSTR --vcfs $(bcf-extras arg-join *.vcf) --out my_merge ``` The default separator (specified via `--sep`) is `,`. ### `filter-gff3` This command can filter a GFF3 (or similarly formatted) file and filter it by various columns using regular expressions. It prints the filtered lines to `stdout`, which can then be redirected to a file or piped to another process. Currently, you can filter by the `seqid`, `source`, `type`, `strand`, and `phrase` columns using Python-formatted regular expressions, e.g. the following, which filters `type` to be either `gene` or `exon` and stores that in a new file: ```bash bcf-extras filter-gff3 --type '^(gene|exon)$' example.gff3 > example-genes-exons.gff3 ``` For help, run the sub-command with no arguments: ```bash bcf-extras filter-gff3 ``` ### `reformat-vcf-contigs` TODO ## What's Included (STR) ### `parallel-mergeSTR` [mergeSTR](https://github.com/gymreklab/TRTools) is a tool by the Gymrek lab used to merge STR call VCFs. It proceeds linearly over a list of files, which cannot easily take advantage of multiple cores. This utility merges VCFs in a tree fashion to produce a final merged result, and is handy when merging 100s of STR call VCFs at once. ```bash bcf-extras parallel-mergeSTR *.vcf.gz --out my_merge --ntasks 10 ``` In a dataset of 148 single-sample gangSTR call VCFs, merging with `parallel-mergeSTR` on 10 cores resulted in an 60% speedup versus running on a single core (~2 hours versus ~5 hours.) Speedup is not linear with number of cores used, so it only makes sense to use this if turnaround time is important and resources are available. To not over-allocate resources on a cluster, the process can be split further into first a parallelized task, and then a task which only uses one core: ```bash # Runs on multiple cores bcf-extras parallel-mergeSTR *.vcf.gz --ntasks 10 --out my_merge --step1-only # Intermediate files generated by the first step will feed into the second step. # Bottlenecked single process step; the ntasks argument is still needed to # calculate the names of the intermediate output files (but sub-processes are # not spawned) - thus, the provided value must match the value above. bcf-extras parallel-mergeSTR --ntasks 10 --out my_merge --step2-only ```


نیازمندی

مقدار نام
- trtools


زبان مورد نیاز

مقدار نام
~=3.7 Python


نحوه نصب


نصب پکیج whl bcf-extras-0.2.0:

    pip install bcf-extras-0.2.0.whl


نصب پکیج tar.gz bcf-extras-0.2.0:

    pip install bcf-extras-0.2.0.tar.gz