معرفی شرکت ها

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Generic Anti-Clustering

ویژگی	مقدار
سیستم عامل	-
نام فایل	anti-clustering-0.2.1
نام	anti-clustering
نسخه کتابخانه	0.2.1
نگهدارنده	[]
ایمیل نگهدارنده	[]
نویسنده	Matthias Als
ایمیل نویسنده	mata@ecco.com
آدرس صفحه اصلی	-
آدرس اینترنتی	https://pypi.org/project/anti-clustering/
مجوز	-

# Anti-clustering A generic Python library for solving the anti-clustering problem. While clustering algorithms will achieve high similarity within a cluster and low similarity between clusters, the anti-clustering algorithms will achieve the opposite; namely to minimise similarity within a cluster and maximise the similarity between clusters. Currently, a handful of algorithms are implemented in this library: * An exact approach using a BIP formulation. * An enumerated exchange heuristic. * A simulated annealing heuristic. Keep in mind anti-clustering is computationally difficult problem and may run slow even for small instance sizes. The current ILP does not finish in reasonable time when anti-clustering the Iris dataset (150 data points). The two former approaches are implemented as described in following paper:\ *Papenberg, M., & Klau, G. W. (2021). Using anticlustering to partition data sets into equivalent parts. Psychological Methods, 26(2), 161–174. [DOI](https://doi.org/10.1037/met0000301). [Preprint](https://psyarxiv.com/3razc/)* \ The paper is accompanied by a library for the R programming language: [anticlust](https://github.com/m-Py/anticlust). Differently to the [anticlust](https://github.com/m-Py/anticlust) R package, this library currently only have one objective function. In this library the objective will maximise intra-cluster distance: Euclidean distance for numerical columns and Hamming distance for categorical columns. ## Use cases Within software testing, anti-clustering can be used for generating test and control groups in AB-testing. Example: You have a webshop with a number of users. The webshop is undergoing active development and you have a new feature coming up. This feature should be tested against as many different users as possible without testing against the entire user-base. For that you can create a maximally diverse subset of the user-base to test against (the A group). The remaining users (B group) will not test this feature. For dividing the user-base you can use the anti-clustering algorithms. A and B groups should be as similar as possible to have a reliable basis of comparison, but internally in group A (and B) the elements should be as dissimilar as possible. This is just one use case, probably many more exists. ## Installation The anti-clustering package is available on [PyPI](https://pypi.org/project/anti-clustering/). To install it, run the following command: ```bash pip install anti-clustering ``` The package currently supports Python 3.8 and above. ## Usage The input to the algorithm is a Pandas dataframe with each row representing a data point. The output is the same dataframe with an extra column containing integer encoded cluster labels. Below is an example based on the Iris dataset: ```python from anti_clustering import ExactClusterEditingAntiClustering from sklearn import datasets import pandas as pd iris_data = datasets.load_iris(as_frame=True) iris_df = pd.DataFrame(data=iris_data.data, columns=iris_data.feature_names) algorithm = ExactClusterEditingAntiClustering() df = algorithm.run( df=iris_df, numerical_columns=list(iris_df.columns), categorical_columns=None, num_groups=2, destination_column='Cluster' ) ``` ## Contributions If you have any suggestions or have found a bug, feel free to open issues. If you have implemented a new algorithm or know how to tweak the existing ones; PRs are very appreciated. ## License This library is licensed under the Apache 2.0 license.

نیازمندی

مقدار	نام
==9.3.10497	ortools
>=1.4.4,<1.5.0	pandas
>=1.23.1,<1.24.0	numpy
>=1.9.0,<1.10.0	scipy
>=1.1.1,<1.2.0	scikit-learn

زبان مورد نیاز

مقدار	نام
>=3.8,<3.11	Python

نحوه نصب

نصب پکیج whl anti-clustering-0.2.1:

pip install anti-clustering-0.2.1.whl

نصب پکیج tar.gz anti-clustering-0.2.1:

pip install anti-clustering-0.2.1.tar.gz