معرفی شرکت ها

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Dynamic item set clustering UI tool: The goal of this package is to split datasets (e.g. words defined by several variables) into subsets that are as comparable as possible.

ویژگی	مقدار
سیستم عامل	-
نام فایل	discuit-0.2.1
نام	discuit
نسخه کتابخانه	0.2.1
نگهدارنده	[]
ایمیل نگهدارنده	[]
نویسنده	Dörte de Kok
ایمیل نویسنده	me@doerte.eu
آدرس صفحه اصلی	https://github.com/doerte/discuit-project
آدرس اینترنتی	https://pypi.org/project/discuit/
مجوز	-

## Badges (Customize these badges with your own links, and check https://shields.io/ or https://badgen.net/ to see which other badges are available.) | fair-software.eu recommendations | | | :-- | :-- | | (1/5) code repository | [![github repo badge](https://img.shields.io/badge/github-repo-000.svg?logo=github&labelColor=gray&color=blue)](https://github.com/doerte/discuit-project) | | (2/5) license | [![github license badge](https://img.shields.io/github/license/doerte/discuit-project)](https://github.com/doerte/discuit-project) | | (3/5) community registry | | | (4/5) citation | [![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.7671856.svg)](https://doi.org/10.5281/zenodo.7671856) | | (5/5) checklist | [![workflow cii badge](https://bestpractices.coreinfrastructure.org/projects/<replace-with-created-project-identifier>/badge)](https://bestpractices.coreinfrastructure.org/projects/<replace-with-created-project-identifier>) | | howfairis | [![fair-software badge](https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8B-yellow)](https://fair-software.eu) | | **Other best practices** |   | | Static analysis | [![workflow scq badge](https://sonarcloud.io/api/project_badges/measure?project=doerte_discuit-project&metric=alert_status)](https://sonarcloud.io/dashboard?id=doerte_discuit-project) | | Coverage | [![workflow scc badge](https://sonarcloud.io/api/project_badges/measure?project=doerte_discuit-project&metric=coverage)](https://sonarcloud.io/dashboard?id=doerte_discuit-project) | | Documentation |  | | **GitHub Actions** |   | | Build | [![build](https://github.com/doerte/discuit-project/actions/workflows/build.yml/badge.svg)](https://github.com/doerte/discuit-project/actions/workflows/build.yml) | | Citation data consistency | [![cffconvert](https://github.com/doerte/discuit-project/actions/workflows/cffconvert.yml/badge.svg)](https://github.com/doerte/discuit-project/actions/workflows/cffconvert.yml) | | SonarCloud | [![sonarcloud](https://github.com/doerte/discuit-project/actions/workflows/sonarcloud.yml/badge.svg)](https://github.com/doerte/discuit-project/actions/workflows/sonarcloud.yml) | | MarkDown link checker | [![markdown-link-check](https://github.com/doerte/discuit-project/actions/workflows/markdown-link-check.yml/badge.svg)](https://github.com/doerte/discuit-project/actions/workflows/markdown-link-check.yml) | ## About Discuit Discuit is a **D**ynamic **i**tem **s**et **c**lustering **UI** **t**ool (with the UI part yet to come) Discuit can split datasets (e.g. words defined by several variables) into subsets that are as comparable as possible. The package takes a csv file as input and generates a defined number of matched sets for a given number of continuous and categorical variables. One of the categorical variables can be selected to be split absolutely even across sets. Discuit generates the following output: - csv file ammended with the set membership for each item - txt file that reports on the outcomes of statistics tests In the future, this will be integrated in an GUI. The project setup is documented in [project_setup.md](project_setup.md). ## Installation Discuit is available on PyPI! ```bash pip install discuit ``` Or install the latest development version directly from GitHub: ```console git clone git@github.com:doerte/discuit-project.git cd discuit-project python3 -m pip install . ```  ## Using Discuit In the terminal, run Discuit with the following command: python3 run_discuit.py "name of input file" [number of desired sets] --columns l/c/n/a/d --runs [desired number of runs] Example: python3 discuit/run_discuit.py example/input.csv 2 --columns l a n c n d --runs 3 This will run Discuit with the [provided testfile](example/input.csv) and create 2 subsets. The columns in the file are identified as "label", "categorical", "numerical", "absolute", "numerical" and "disregard" (in that order). The program will run 3 times (and create 3 output files). ### Required input The input file needs to be a .csv file with a first line containing headings followed by rows that represent the different items. Each column specifies one variable. When launching the script, please specify per column what kind a data the script should expect: - (l)abel: just a label, will not be taken into consideration, could be the itemname or itemnumber. This can only be assigned once. - (n)umerical: a numerical variable, such as frequency or AoA, - (c)ategorical: a categorical variable, such as "transitivity" or "accuracy", - (a)bsolute: this needs to be perfectly divided between sets. This can only be assigned once. - (d)isregard: a column that does not need to be taken into account for the split, but contains other information you have in the same file. The package will try maximally 20 times to come-up with a good split. If it doesn't it will give up and output it's last try. You can always run it again. Often it will succeed eventually. If not, consider dropping variables. If you run the script without specifying --columns, you will be asked what you want per column. If you don't specify the desired number of runs, it will generate 1 output file. ### Missing data If you choose an "absolute split variable", this variable cannot have missing data. The program will exit if it does. For categorical variables, a dummy category is created that holds the items with missing data. For numerical variables, missing data is replaced with the average of this variable. If you prefer a different approach, please prepare your input file in a way that does not include missing data. ## Contributing If you want to contribute to the development of Discuit, have a look at the [contribution guidelines](CONTRIBUTING.md). ## Credits This package was created with [Cookiecutter](https://github.com/audreyr/cookiecutter) and the [NLeSC/python-template](https://github.com/NLeSC/python-template).

نیازمندی

مقدار	نام
>=1.4.0dev	argparse
>=1.0.0	pathlib
>=3.7.4.0	typing
>=1.3.0	pandas
>=0.12.0	kmodes
>=1.7.0	scipy
>=1.0.0	scikit-learn
-	bump2version
-	coverage[toml]
-	prospector[with_pyroma]
-	isort
-	pytest
-	pytest-cov
-	sphinx
-	sphinx-rtd-theme
-	sphinx-autoapi
-	tox
-	myst-parser
-	twine
-	wheel

زبان مورد نیاز

مقدار	نام
>=3.7	Python

نحوه نصب

نصب پکیج whl discuit-0.2.1:

pip install discuit-0.2.1.whl

نصب پکیج tar.gz discuit-0.2.1:

pip install discuit-0.2.1.tar.gz