dataquality-0.9.1a3



Description

dataquality is a library for tracking and analyzing your machine learning models.
Property Value
Operating system -
File name dataquality-0.9.1a3
Name dataquality
Library version 0.9.1a3
Maintainer []
Maintainer email []
Author Galileo Technologies, Inc
Author email team@rungalileo.io
Home page https://www.github.com/rungalileo/dataquality
Package URL https://pypi.org/project/dataquality/
License -
# dataquality

The Official Python Client for [Galileo](https://rungalileo.io).

Galileo is a tool for understanding and improving the quality of your NLP (and soon CV!) data.

Galileo gives you access to all of the information you need, at a UI and API level, to continuously build better and more robust datasets and models.

`dataquality` is your entrypoint to Galileo. It helps you start and complete the loop of data quality improvements.

# ToC
* [Getting Started](#getting-started)
* [Custom Integrations](#can-i-analyze-data-using-a-custom-model)
* [No labels? No problem](#what-if-i-dont-have-labels-to-train-with-can-you-help-with-labeling)
* [Programmatic Access](#is-there-a-python-api-for-programmatically-interacting-with-the-console)
* [Contributing](#contributing)

<details>
<summary><h2>Getting Started</h2></summary>

Install the package.
```sh
pip install dataquality
```

Create an account at [Galileo](https://console.cloud.rungalileo.io/sign-up).

Grab your [token](https://console.cloud.rungalileo.io/get-token).

Get your dataset and analyze it with `dq.auto` (you will be prompted for your token here):
```python
import dataquality as dq

dq.auto(
    train_data="/path/to/train.csv",
    val_data="/path/to/val.csv",
    test_data="/path/to/test.csv",
    project_name="my_first_project",
    run_name="my_first_run",
)
```

☕️ Wait for Galileo to train your model and analyze the results.
✨ A link to your run will be provided automatically.

#### Pro tip: Set your token programmatically for automated workflows

Once the token is set, you will not be prompted to log in.
```python
import dataquality as dq

dq.config.token = 'MY-TOKEN'
```

For long-lived flows like CI/CD, see our docs on [environment variables](https://rungalileo.gitbook.io/galileo/python-library-api/environment-variables).

<details>
<summary><h3>What kinds of datasets can I analyze?</h3></summary>

Currently, you can analyze **Text Classification** and **NER** datasets.

If you want support for other kinds, [reach out!](https://github.com/rungalileo/dataquality/issues/new?assignees=ben-epstein&labels=enhancement&template=feature.md&title=%5BFEATURE%5D)
</details>

<details>
<summary><h3>Can I use auto with other data forms?</h3></summary>

The `auto` params `train_data`, `val_data`, and `test_data` can also take pandas dataframes and huggingface datasets as input!
</details>

<details>
<summary><h3>What if all my data is in huggingface?</h3></summary>

Use the `hf_data` param to point to a dataset in huggingface:
```python
import dataquality as dq

dq.auto(hf_data="rungalileo/emotion")
```
</details>

<details>
<summary><h3>Anything else? Can I learn more?</h3></summary>

Run `help(dq.auto)` for more information on usage.<br>
Check out our [docs](https://rungalileo.gitbook.io/galileo/getting-started/add-your-data-to-galileo/dq-auto) for the inspiration behind this methodology.
</details>

</details>

<details>
<summary><h2>Can I analyze data using a custom model?</h2></summary>

Yes! Check out our [full documentation](https://rungalileo.gitbook.io/galileo/getting-started/byom-bring-your-own-model) and [example notebooks](https://rungalileo.gitbook.io/galileo/example-notebooks) on how to integrate your own model with Galileo.
</details>

<details>
<summary><h2>What if I don't have labels to train with? Can you help with labeling?</h2></summary>

We have an [app for that](https://github.com/rungalileo/bulk-labeling/)!
It currently supports text classification only, but [reach out](https://github.com/rungalileo/bulk-labeling/issues/new?assignee=ben-epstein) if you want a new modality!<br>

This is currently in development and is not an official part of the Galileo product, but rather an open source tool for the community.

We've built a bulk-labeling tool (and hosted it on streamlit) to help you generate labels quickly using semantic embeddings and text search.

For more info on how it works and how to use it, check out the [open source repo](https://github.com/rungalileo/bulk-labeling/).
</details>

<details>
<summary><h2>Is there a Python API for programmatically interacting with the console?</h2></summary>

Yes! See our docs on [`dq.metrics`](https://rungalileo.gitbook.io/galileo/python-library-api/dq.metrics) to access things like overall metrics, your analyzed dataframe, and even your embeddings.
</details>

<details>
<summary><h2>Contributing</h2></summary>

Read our [contributing doc](./CONTRIBUTING.md)!
</details>


Requirements

Value Name
- pydantic>=1.8.2
- requests>=2.25.1
- types-requests>=2.25.2
- pandas>=0.20.0
- pyarrow>=5.0.0
- vaex-core==4.16.0
- vaex-hdf5>=0.12,<0.13
- diskcache>=5.2.1
- resource>=0.2.1
- tqdm>=4.62.3
- blake3>=0.2.1
- wrapt>=1.13.3
- scipy>=1.7.0
- cachetools>=4.2.4
- importlib-metadata<6.0.1
- datasets>=2.6
- transformers>=4.17.0
- seqeval
- sentence-transformers>=2.2
- Pillow
- h5py (version constraint truncated in source: =3.1.)
- numpy<1.24.0
- tenacity>=8.1.0
- ucx-py-cu11<=0.30
- rmm-cu11==23.2.0
- raft-dask-cu11==23.2.0
- pylibraft-cu11==23.2.0
- dask-cudf-cu11==23.2.0
- cudf-cu11==23.2.0
- cuml-cu11==23.2.0
- flake8
- black>=23.1.0
- isort>=5.11.5
- autoflake
- jupyter==1.0.0
- evaluate
- furo
- sphinx
- sphinx-autodoc-typehints
- myst-parser
- sphinx-markdown-builder
- sphinx-autobuild
- sphinx-markdown-builder
- evaluate
- minio>=7.1.0,<7.2.0
- cachetools>=5.2.0
- types-cachetools>=5.3.0.0
- importlib-metadata<5.0.0
- ultralytics
- pytest>=7.2.1
- mypy>=1.0.0
- freezegun>=1.2.2
- coverage>=7.0.5
- pytest-cov>=4.0.0
- scikit-learn>=1.0
- tensorflow>=2.9.1
- pytest-env>=0.8.1
- spacy==3.2.1
- types-setuptools>=67.3.0.1
- types-cachetools>=4.2.4
- torchvision>=0.13.1
- torch>=1.12.1
- torchtext>=0.13.1
- torchdata>=0.4.1
- xgboost>=1.6.2
- timm>=0.6.12
- fastai>=2.7.11
- portalocker==2.7.0
- types-PyYAML==6.0.12.9
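
The list above appears to be flattened from more than one dependency group, since a few names occur several times with different constraints (for example `cachetools` and `importlib-metadata`). As a minimal sketch, assuming the entries are plain requirement strings, the standard-library snippet below groups specifiers by package name to surface such repeats. The helper names `group_by_name` and `repeated` and the shortened sample list are illustrative and not part of `dataquality`.

```python
import re
from collections import defaultdict

# A few entries copied from the requirements list above (not the full list).
REQUIREMENTS = [
    "cachetools>=4.2.4",
    "importlib-metadata<6.0.1",
    "evaluate",
    "cachetools>=5.2.0",
    "importlib-metadata<5.0.0",
    "evaluate",
    "tqdm>=4.62.3",
]

# Package name = leading run of letters, digits, '.', '_' or '-';
# whatever follows is treated as the version specifier.
NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(.*)$")


def group_by_name(requirements):
    """Map each package name to the list of version specifiers seen for it."""
    grouped = defaultdict(list)
    for req in requirements:
        match = NAME_RE.match(req)
        if match is None:
            continue  # skip lines that do not look like a requirement
        name = match.group(1).lower()
        spec = match.group(2).strip()
        grouped[name].append(spec)
    return dict(grouped)


def repeated(requirements):
    """Return only the names that occur more than once."""
    grouped = group_by_name(requirements)
    return {name: specs for name, specs in grouped.items() if len(specs) > 1}


if __name__ == "__main__":
    for name, specs in sorted(repeated(REQUIREMENTS).items()):
        print(name, specs)
```

An empty string in a specifier list means the entry carried no version constraint.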


Required language

Value Name
>=3.7 Python
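
The `>=3.7` constraint can be checked before installing. The sketch below uses only the standard library; the function name `python_is_supported` is a hypothetical helper, not something `dataquality` provides.

```python
import sys

MIN_PYTHON = (3, 7)  # taken from the ">=3.7" requirement listed above


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```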


How to install


Install the dataquality-0.9.1a3 whl package:

    pip install dataquality-0.9.1a3.whl


Install the dataquality-0.9.1a3 tar.gz package:

    pip install dataquality-0.9.1a3.tar.gz
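
After either command finishes, one way to confirm that the package landed in the active environment is to query its installed metadata. This is a minimal sketch using `importlib.metadata`, which is in the standard library from Python 3.8 onward (on 3.7 the `importlib_metadata` backport would be needed instead). The helper name `installed_version` is an assumption made for illustration.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("dataquality")
    if found is None:
        print("dataquality is not installed in this environment")
    else:
        print(f"dataquality {found} is installed")
```

Reading the metadata avoids importing the package itself, so the check stays quick and does not trigger any login prompt.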