

badlon-0.1.2



Description

A bioinformatics tool for analyzing the pan-genome and other features based on synteny blocks.

| Attribute | Value |
|---|---|
| Operating system | - |
| File name | badlon-0.1.2 |
| Name | badlon |
| Library version | 0.1.2 |
| Maintainer | [] |
| Maintainer email | [] |
| Author | Alexey Zabelkin |
| Author email | a.zabelkin@itmo.ru |
| Home page | https://github.com/oxygen311/badlon |
| URL | https://pypi.org/project/badlon/ |
| License | - |
# Badlon

## Installation

Badlon can be installed with `pip`:

```bash
pip install badlon
```

Now you can run the tool from any directory as `badlon`.

## Usage

### Modules

Badlon includes multiple modules to process data. They can be listed with the help command:

```bash
$ badlon --help
usage: badlon [-h] {prepare,analysis,match} ...

Tool for block based analysis of bacterial populations. Choose one of available modules.

positional arguments:
  {prepare,analysis,match}
    prepare             Prepare draft dataset for SibeliaZ.
    analysis            Analyze pan-genome and other block-based features based on synteny blocks.
    match               Performs matching of block and genes based on coordinates.

optional arguments:
  -h, --help            show this help message and exit
```

### Step 1: prepare data with [`PanACoTA` pipeline](https://github.com/gem-pasteur/PanACoTA)

If you have genomes in a folder called `some_folder` (one file per genome), we suggest preparing the data with the [`PanACoTA` pipeline](https://github.com/gem-pasteur/PanACoTA). To do so, you can use the following commands:

#### 1.1 Preparing data and tables with `PanACoTA prepare` module:

`PanACoTA prepare --norefseq --min 0 --max 1 -o 1-prepare -d some_folder --cutn 125`

* `--min 0 --max 1` are used to keep all genomes; these and all other parameters can be changed depending on the task;
* To check the other parameters, see the [`PanACoTA prepare`](https://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc/usage.html#prepare-subcommand) documentation.

#### 1.2 Annotating genomes with `PanACoTA annotate` module:

`PanACoTA annotate --info 1-prepare/L* -r 2-annotate -n ESCO --threads 16`

* You can change the label `-n ESCO` depending on your species (ESCO is for *Escherichia coli*);
* To check the parameters, see the [`PanACoTA annotate`](https://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc/usage.html#annotate-subcommand) documentation.
#### 1.3 Calling orthology genes using `PanACoTA pangenome` module:

`PanACoTA pangenome -l 2-annotate/LSTINFO-* -n ESCO -d 2-annotate/Proteins/ -o 3-pangenome`

* You can change `-i`, the minimum sequence identity for genes to be placed in the same cluster (float between 0 and 1). Default is 0.8.
* To check the parameters, see the [`PanACoTA pangenome`](https://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc/usage.html#annotate-subcommand) documentation.

### Step 2: `prepare` module of `badlon`

The `prepare` module prepares data for the SibeliaZ package while keeping all necessary information: genome labels and chromosome numbers. Parameters can be checked with the help option:

```bash
$ badlon prepare --help
usage: badlon prepare [-h] --folder FOLDER [--contigs CONTIGS] [--output OUTPUT]
                      [--annotate_subfolder ANNOTATE_SUBFOLDER] [--min_len MIN_LEN]

optional arguments:
  -h, --help            show this help message and exit
  --contigs CONTIGS, -c CONTIGS
                        Number of maximum contigs to take from every genome. By default, keeps all.
  --output OUTPUT, -o OUTPUT
                        Output file path.
  --annotate_subfolder ANNOTATE_SUBFOLDER, -a ANNOTATE_SUBFOLDER
                        Subfolder of PanACoTA contains results of annotate module. Used for finding
                        LSTINFO file. Default is '2-annotate'.
  --min_len MIN_LEN, -l MIN_LEN
                        Minimum contig length, less then that value will be filtered. Default is 1000.

Required arguments:
  --folder FOLDER, -f FOLDER
                        Folder with PanACoTA output. Will be used to search genome files based on
                        LSTINFO file from annotate module.
```

Example command:

`badlon prepare -f 2-annotate -o for_sibeliaz.fna`

### Step 2. Obtaining blocks with [SibeliaZ](https://github.com/medvedevgroup/SibeliaZ)

#### 2.1 Running SibeliaZ with recommended command based on `badlon prepare` output.

Example:

`sibeliaz -k 15 -a 100 -n -t 32 -o sibeliaz_out for_sibeliaz.fna`

* Note that `-a` should be roughly `number_of_genomes * 20`; `badlon prepare` calculates it automatically.
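The rule of thumb for `-a` (roughly the number of genomes times 20) can also be reproduced by hand when the `badlon prepare` output is not available. The following is a minimal sketch, assuming the input folder holds exactly one file per genome as in Step 1; `sibeliaz_a_value` is a hypothetical helper name and is not part of `badlon` or SibeliaZ.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of badlon or SibeliaZ):
# estimate the SibeliaZ -a value as number_of_genomes * 20,
# assuming the folder holds exactly one file per genome.
sibeliaz_a_value() {
    local folder="$1"
    local n_genomes
    # Count regular files directly inside the folder (no recursion).
    n_genomes=$(find "$folder" -maxdepth 1 -type f | wc -l)
    echo $(( n_genomes * 20 ))
}

# Usage (assumes the genomes are in some_folder):
#   sibeliaz -k 15 -a "$(sibeliaz_a_value some_folder)" -n -t 32 -o sibeliaz_out for_sibeliaz.fna
```

With five genome files the function prints `100`, which matches the `-a 100` used in the example command. If `badlon prepare` reports a different value, prefer the value it reports.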
#### 2.2 Obtaining blocks from alignment

Check the recommended command in the `badlon prepare` module output. Usually it is the following (minimal block size 3000):

```bash
cd sibeliaz_out
echo $'30 150\n100 500\n500 1500' > fine.txt
maf2synteny -s fine.txt -b 3000 blocks_coords.gff
```

### Step 3. Calculating block based statistics and charts with `badlon analysis` module:

Parameters can be checked with the help option:

```bash
$ badlon analysis --help
usage: badlon analysis [-h] --blocks_file BLOCKS_FILE --type {chr,contig} [--output OUTPUT]

optional arguments:
  -h, --help            show this help message and exit
  --output OUTPUT, -o OUTPUT
                        Path to output folder. Default: blockomics_output.

Required arguments:
  --blocks_file BLOCKS_FILE, -b BLOCKS_FILE
                        Blocks resulted as output of original Sibelia or maf2synteny tool. Usually it's
                        sibeliaz_out/3000/block_coords.txt file.
  --type {chr,contig}, -t {chr,contig}
                        Type of genome assembly, either 'chr' or 'contig'
```

Example command (`--type` is listed as required in the help above; `contig` is used here as in the `match` example):

```bash
cd ..
badlon analysis -b sibeliaz_out/3000/blocks_coords.txt -t contig
```

### Step 4 (optional): Match blocks and gene annotations with `badlon match` module

Parameters can be checked with the help option:

```bash
$ badlon match --help
usage: badlon match [-h] --blocks_file BLOCKS_FILE --annotated_folder ANNOTATED_FOLDER
                    --pangenome_file PANGENOME_FILE --type {chr,contig} [--output OUTPUT]

optional arguments:
  -h, --help            show this help message and exit
  --output OUTPUT, -o OUTPUT
                        Path to output folder. Default: blockomics_output.

Required arguments:
  --blocks_file BLOCKS_FILE, -b BLOCKS_FILE
                        Blocks folder resulted as output of original Sibelia or maf2synteny tool.
                        Usually it's `sibeliaz_out/3000/` folder.
  --annotated_folder ANNOTATED_FOLDER, -a ANNOTATED_FOLDER
                        LSTINFO folder path, output of `annotate` step of PanACoTA.
  --pangenome_file PANGENOME_FILE, -pg PANGENOME_FILE
                        File .lst with orthologous genes, output of `pangenome` step of PanACoTA.
  --type {chr,contig}, -t {chr,contig}
                        Type of genome assembly, either 'chr' or 'contig'
```

Example command:

`badlon match -b sibeliaz_out/3000/blocks_coords.txt -a 2-annotate/ -pg 3-pangenome/*.lst -t contig`
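
Because `badlon match` depends on outputs of three earlier steps, it can help to confirm that they exist before starting the run. The following is a minimal sketch under the assumption that the paths follow the example command above; `check_match_inputs` is a hypothetical helper and not part of `badlon`.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of badlon): verify that the
# three inputs of `badlon match` exist before starting the run.
check_match_inputs() {
    local blocks_file="$1" annotated_folder="$2" pangenome_file="$3"
    local status=0
    if [ ! -f "$blocks_file" ]; then
        echo "missing blocks file: $blocks_file" >&2
        status=1
    fi
    if [ ! -d "$annotated_folder" ]; then
        echo "missing annotated folder: $annotated_folder" >&2
        status=1
    fi
    if [ ! -f "$pangenome_file" ]; then
        echo "missing pangenome file: $pangenome_file" >&2
        status=1
    fi
    return "$status"
}

# Usage (the glob must expand to exactly one .lst file for this check):
#   check_match_inputs sibeliaz_out/3000/blocks_coords.txt 2-annotate/ 3-pangenome/*.lst \
#       && badlon match -b sibeliaz_out/3000/blocks_coords.txt -a 2-annotate/ -pg 3-pangenome/*.lst -t contig
```

The function reports every missing input on standard error and returns a non-zero status if any is absent, so it can be chained with `&&` in front of the real command.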


Requirements

| Name | Value |
|---|---|
| pandas | - |
| numpy | - |
| seaborn | - |
| biopython | - |


Required language

| Name | Value |
|---|---|
| Python | >=3.6, <4 |
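
The version constraint can be checked before installing. The following is a minimal sketch, assuming `python3` is the interpreter that `pip` installs into; `python_version_ok` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed if a version string such as "3.10.4"
# satisfies the constraint ">=3.6, <4" (major is 3 and minor is at least 6).
python_version_ok() {
    local version="$1" major minor
    major=${version%%.*}      # text before the first dot
    minor=${version#*.}       # text after the first dot
    minor=${minor%%.*}        # keep only the minor component
    [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]
}

# Usage (assumes python3 is on PATH):
#   python_version_ok "$(python3 -c 'import platform; print(platform.python_version())')" \
#       || echo "Python >=3.6, <4 is required" >&2
```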


Installation


Install the badlon-0.1.2 whl package:

    pip install badlon-0.1.2.whl


Install the badlon-0.1.2 tar.gz package:

    pip install badlon-0.1.2.tar.gz
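
After installation, the workflow described above also needs the external tools it calls. The following is a minimal sketch for listing which of them are not yet on `PATH`; `missing_tools` is a hypothetical helper, and the tool names are the commands used in the examples above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the name of every given command
# that cannot be found on PATH (prints nothing if all are present).
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage:
#   missing_tools badlon PanACoTA sibeliaz maf2synteny
```

An empty output means every listed command was found; any printed name still needs to be installed before running the corresponding step.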