معرفی شرکت ها


chembl-downloader-0.4.0


Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Reproducibly download, open, parse, and query ChEMBL
ویژگی مقدار
سیستم عامل OS Independent
نام فایل chembl-downloader-0.4.0
نام chembl-downloader
نسخه کتابخانه 0.4.0
نگهدارنده ['Charles Tapley Hoyt']
ایمیل نگهدارنده ['cthoyt@gmail.com']
نویسنده Charles Tapley Hoyt
ایمیل نویسنده cthoyt@gmail.com
آدرس صفحه اصلی https://github.com/cthoyt/chembl_downloader
آدرس اینترنتی https://pypi.org/project/chembl-downloader/
مجوز MIT
<h1 align="center"> chembl_downloader </h1> <p align="center"> <a href="https://pypi.org/project/chembl_downloader"> <img alt="PyPI" src="https://img.shields.io/pypi/v/chembl_downloader" /> </a> <a href="https://pypi.org/project/chembl_downloader"> <img alt="PyPI - Python Version" src="https://img.shields.io/pypi/pyversions/chembl_downloader" /> </a> <a href="https://github.com/cthoyt/chembl_downloader/blob/main/LICENSE"> <img alt="PyPI - License" src="https://img.shields.io/pypi/l/chembl_downloader" /> </a> <a href="https://zenodo.org/badge/latestdoi/390113187"> <img src="https://zenodo.org/badge/390113187.svg" alt="DOI" /> </a> <a href="https://github.com/psf/black"> <img src="https://img.shields.io/badge/code%20style-black-000000.svg" alt="Code style: black" /> </a> <a href='https://chembl-downloader.readthedocs.io/en/latest/?badge=latest'> <img src='https://readthedocs.org/projects/chembl-downloader/badge/?version=latest' alt='Documentation Status' /> </a> </p> Don't worry about downloading/extracting ChEMBL or versioning - just use ``chembl_downloader`` to write code that knows how to download it and use it automatically. Install with: ```bash $ pip install chembl-downloader ``` Full technical documentation can be found on [ReadTheDocs](https://chembl-downloader.readthedocs.io. Tutorials can be found in Jupyter notebooks in the [notebooks/](notebooks/) directory of the repository. ## Database Usage ### Download A Specific Version ```python import chembl_downloader path = chembl_downloader.download_extract_sqlite(version='28') ``` After it's been downloaded and extracted once, it's smart and does not need to download again. It gets stored using [`pystow`](https://github.com/cthoyt/pystow) automatically in the `~/.data/chembl` directory. We'd like to implement something such that it could load directly into SQLite from the archive, but it appears this is a [paid feature](https://sqlite.org/purchase/zipvfs). ### Download the Latest Version You can modify the previous code slightly by omitting the `version` keyword argument to automatically find the latest version of ChEMBL: ```python import chembl_downloader path = chembl_downloader.download_extract_sqlite() ``` The `version` keyword argument is available for all functions in this package (e.g., including `connect()`, `cursor()`, and `query()`), but will be omitted below for brevity. ### Automate Connection Inside the archive is a single SQLite database file. Normally, people manually untar this folder then do something with the resulting file. Don't do this, it's not reproducible! Instead, the file can be downloaded and a connection can be opened automatically with: ```python import chembl_downloader with chembl_downloader.connect() as conn: with conn.cursor() as cursor: cursor.execute(...) # run your query string rows = cursor.fetchall() # get your results ``` The `cursor()` function provides a convenient wrapper around this operation: ```python import chembl_downloader with chembl_downloader.cursor() as cursor: cursor.execute(...) # run your query string rows = cursor.fetchall() # get your results ``` ### Run a query and get a pandas DataFrame The most powerful function is `query()` which builds on the previous `connect()` function in combination with [`pandas.read_sql`](https://pandas.pydata.org/docs/reference/api/pandas.read_sql.html) to make a query and load the results into a pandas DataFrame for any downstream use. ```python import chembl_downloader sql = """ SELECT MOLECULE_DICTIONARY.chembl_id, MOLECULE_DICTIONARY.pref_name FROM MOLECULE_DICTIONARY JOIN COMPOUND_STRUCTURES ON MOLECULE_DICTIONARY.molregno == COMPOUND_STRUCTURES.molregno WHERE molecule_dictionary.pref_name IS NOT NULL LIMIT 5 """ df = chembl_downloader.query(sql) df.to_csv(..., sep='\t', index=False) ``` Suggestion 1: use `pystow` to make a reproducible file path that's portable to other people's machines (e.g., it doesn't have your username in the path). Suggestion 2: RDKit is now pip-installable with `pip install rdkit-pypi`, which means most users don't have to muck around with complicated conda environments and configurations. One of the powerful but understated tools in RDKit is the [rdkit.Chem.PandasTools](https://rdkit.org/docs/source/rdkit.Chem.PandasTools.html) module. ### Access an RDKit supplier over entries in the SDF dump This example is a bit more fit-for-purpose than the last two. The `supplier()` function makes sure that the latest SDF dump is downloaded and loads it from the gzip file into a `rdkit.Chem.ForwardSDMolSupplier` using a context manager to make sure the file doesn't get closed until after parsing is done. Like the previous examples, it can also explicitly take a `version`. ```python from rdkit import Chem import chembl_downloader with chembl_downloader.supplier() as suppl: data = [] for i, mol in enumerate(suppl): if mol is None or mol.GetNumAtoms() > 50: continue fp = Chem.PatternFingerprint(mol, fpSize=1024, tautomerFingerprints=True) smi = Chem.MolToSmiles(mol) data.append((smi, fp)) ``` This example was adapted from Greg Landrum's RDKit blog post on [generalized substructure search](https://greglandrum.github.io/rdkit-blog/tutorial/substructure/2021/08/03/generalized-substructure-search.html). ## SDF Usage ### Get an RDKit substructure library Building on the `supplier()` function, the `get_substructure_library()` makes the preparation of a [substructure library](https://www.rdkit.org/docs/cppapi/classRDKit_1_1SubstructLibrary.html) automated and reproducible. Additionally, it caches the results of the build, which takes on the order of tens of minutes, only has to be done once and future loading from a pickle object takes on the order of seconds. The implementation was inspired by Greg Landrum's RDKit blog post, [Some new features in the SubstructLibrary](https://greglandrum.github.io/rdkit-blog/tutorial/substructure/2021/12/20/substructlibrary-search-order.html). The following example shows how it can be used to accomplish some of the first tasks presented in the post: ```python from rdkit import Chem import chembl_downloader library = chembl_downloader.get_substructure_library() query = Chem.MolFromSmarts('[O,N]=C-c:1:c:c:n:c:c:1') matches = library.GetMatches(query) ``` ## Morgan Fingerprints Usage ### Get the Morgan Fingerprint file ChEMBL makes a file containing pre-computed 2048 bit radius 2 morgan fingerprints for each molecule available. It can be downloaded using: ```python import chembl_downloader path = chembl_downloader.download_fps() ``` The `version` and other keyword arguments are also valid for this function. ### Load fingerprints with [`chemfp`](https://chemfp.com/) The following wraps the `download_fps` function with `chemfp`'s fingerprint loader: ```python import chembl_downloader arena = chembl_downloader.chemfp_load_fps() ``` The `version` and other keyword arguments are also valid for this function. More information on working with the `arena` object can be found [here](https://chemfp.readthedocs.io/en/latest/using-api.html#working-with-a-fingerprintarena). ## Extras ### Store in a Different Place If you want to store the data elsewhere using `pystow` (e.g., in [`pyobo`](https://github.com/pyobo/pyobo) I also keep a copy of this file), you can use the `prefix` argument. ```python import chembl_downloader # It gets downloaded/extracted to # ~/.data/pyobo/raw/chembl/29/chembl_29/chembl_29_sqlite/chembl_29.db path = chembl_downloader.download_extract_sqlite(prefix=['pyobo', 'raw', 'chembl']) ``` See the `pystow` [documentation](https://github.com/cthoyt/pystow#%EF%B8%8F-configuration) on configuring the storage location further. The `prefix` keyword argument is available for all functions in this package (e.g., including `connect()`, `cursor()`, and `query()`). ### Download via CLI After installing, run the following CLI command to ensure it and send the path to stdout ```bash $ chembl_downloader ``` Use `--test` to show two example queries ```bash $ chembl_downloader --test ``` ## Contributing Please read the contribution guidelines in [CONTRIBUTING.md](.github/CONTRIBUTING.md). If you'd like to contribute, there's a submodule called `chembl_downloader.queries` where you can add a useful SQL queries along with a description of what it does for easy importing and reuse. ## Statistics | ChEMBL Version | Release Date | |------------------|----------------| | 31 | 2022-07-12 | | 30 | 2022-02-22 | | 29 | 2021-07-01 | | 28 | 2021-01-15 | | 27 | 2020-05-18 | | 26 | 2020-02-14 | | 25 | 2019-02-01 | | 24_1 | 2018-05-01 | | 24 | | | 23 | 2017-05-18 | | 22_1 | 2016-11-17 | | 22 | | | 21 | 2015-02-12 | | 20 | 2015-02-03 | | 19 | 2014-07-2333 | | 18 | 2014-04-02 | | 17 | 2013-09-16 | | 16 | 2013-055555-15 | | 15 | 2013-01-30 | | 14 | 2012 -07-18 | | 13 | 2012-02-29 | | 12 | 2011-11-30 | | 11 | 2011-06-07 | | 10 | 2011-06-07 | | 09 | 2011-01-04 | | 08 | 2010-11-05 | | 07 | 2010-09-03 | | 06 | 2010-09-03 | | 05 | 2010-06-07 | | 04 | 2010-05-26 | | 03 | 2010-04-30 | | 02 | 2009-12-07 | | 01 | 2009-10-28 |


نیازمندی

مقدار نام
- click
- more-click
- pystow
- tqdm
- sphinx
- sphinx-rtd-theme
- sphinx-click
- sphinx-autodoc-typehints
- sphinx-automodapi
- pandas
- rdkit-pypi
- pytest
- coverage


زبان مورد نیاز

مقدار نام
>=3.7 Python


نحوه نصب


نصب پکیج whl chembl-downloader-0.4.0:

    pip install chembl-downloader-0.4.0.whl


نصب پکیج tar.gz chembl-downloader-0.4.0:

    pip install chembl-downloader-0.4.0.tar.gz