معرفی شرکت ها


blitzml-0.9.0


Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Automate machine learning pipelines rapidly
ویژگی مقدار
سیستم عامل -
نام فایل blitzml-0.9.0
نام blitzml
نسخه کتابخانه 0.9.0
نگهدارنده []
ایمیل نگهدارنده []
نویسنده AI Team
ایمیل نویسنده AI Team <CV.Team.CSE.2023@gmail.com>
آدرس صفحه اصلی -
آدرس اینترنتی https://pypi.org/project/blitzml/
مجوز -
<div align="center"> # <span style="color:violet">BlitzML</span> ### **Automate machine learning pipelines rapidly** <div align="left"> - [Install BlitzML](#install-blitzml) - [Classification](#classification) - [Regression](#regression) # Install BlitzML ```bash pip install blitzml ``` # Classification ```python from blitzml.tabular import Classification import pandas as pd # prepare your dataframes train_df = pd.read_csv("auxiliary/datasets/banknote/train.csv") test_df = pd.read_csv("auxiliary/datasets/banknote/test.csv") # create the pipeline auto = Classification(train_df, test_df, classifier = 'RF', n_estimators = 50) # first perform data preprocessing auto.preprocess() # second train the model auto.train_the_model() # After training the model we can generate: auto.gen_pred_df(auto.test_df) auto.gen_metrics_dict() # We can get their values using: pred_df = auto.pred_df metrics_dict = auto.metrics_dict print(pred_df.head()) print(metrics_dict) ``` ## Available Classifiers - Random Forest 'RF' - LinearDiscriminantAnalysis 'LDA' - Support Vector Classifier 'SVC' - KNeighborsClassifier 'KNN' - GaussianNB 'GNB' - LogisticRegression 'LR' - AdaBoostClassifier 'AB' - GradientBoostingClassifier 'GB' - DecisionTreeClassifier 'DT' - MLPClassifier 'MLP' ## **Parameters** **classifier** options: {'RF','LDA','SVC','KNN','GNB','LR','AB','GB','DT','MLP', 'auto', 'custom'}, default = 'RF' `auto: selects the best scoring classifier based on f1-score` `custom: enables providing a custom classifier through *file_path* and *class_name* parameters` **file_path** when using 'custom' classifier, pass the path of the file containing the custom class, default = 'none' **class_name** when using 'custom' classifier, pass the class name through this parameter, default = 'none' **feature_selection** options: {'correlation', 'importance', 'none'}, default = 'none' `correlation: use feature columns with the highest correlation with the target` `importance: use feature columns that are important for the model to predict the target` `none: use all feature columns` **validation_percentage** value determining the validation split percentage (value from 0 to 1), default = 0.1 **average_type** when performing multiclass classification, provide the average type for the resulting metrics, default = 'macro' **cross_validation_k_folds** number of k-folds for cross validation, if 1 then no cv will be performed, default = 1 ****kwargs** optional parameters for the chosen classifier. you can find available parameters in the [sklearn docs](https://scikit-learn.org/stable/user_guide.html) ## **Attributes** **train_df** the preprocessed train dataset (after running `Classification.preprocess()`) **test_df** the preprocessed test dataset (after running `Classification.preprocess()`) **model** the trained model (after running `Classification.train_the_model()`) **pred_df** the prediction dataframe (test_df + predicted target) (after running `Classification.gen_pred_df(Classification.test_df)`) **metrics_dict** the validation metrics (after running `Classification.gen_metrics_dict()`) {     "accuracy": acc,     "f1": f1,     "precision": pre,     "recall": recall,     "hamming_loss": h_loss,     "cross_validation_score":cv_score, `returns None if cross_validation_k_folds==1` } ## **Methods** **preprocess()** perform preprocessing on train_df and test_df **train_the_model()** train the chosen classifier on the train_df **accuracy_history()** accuracy scores when varying the sampling size of the train_df (after running `Classification.train_the_model()`). *returns:* {     'x':train_df_sample_sizes,     'y1':train_scores_mean,     'y2':test_scores_mean,     'title':title } **gen_pred_df(test_df)** generates the prediction dataframe and assigns it to the `pred_df` attribute **gen_metrics_dict()** generates the validation metrics and assigns it to the `metrics_dict` **run()** a shortcut that runs the following methods: preprocess() train_the_model() gen_pred_df(Classification.test_df) gen_metrics_dict() # Regression ```python from blitzml.tabular import Regression import pandas as pd # prepare your dataframes train_df = pd.read_csv("auxiliary/datasets/house prices/train.csv") test_df = pd.read_csv("auxiliary/datasets/house prices/test.csv") # create the pipeline auto = Regression(train_df, test_df, regressor = 'RF') # first perform data preprocessing auto.preprocess() # second train the model auto.train_the_model() # After training the model we can generate: auto.gen_pred_df(auto.test_df) auto.gen_metrics_dict() # We can get their values using: pred_df = auto.pred_df metrics_dict = auto.metrics_dict print(pred_df.head()) print(metrics_dict) ``` ## Available Regressors - Random Forest 'RF' - Support Vector Regressor 'SVR' - KNeighborsRegressor 'KNN' - Lasso Regressor 'LSS' - LinearRegression 'LR' - Ridge Regressor 'RDG' - GaussianProcessRegressor 'GPR' - GradientBoostingRegressor 'GB' - DecisionTreeRegressor 'DT' - MLPRegressor 'MLP' ## **Parameters** **regressor** options: {'RF','SVR','KNN','LSS','LR','RDG','GPR','GB','DT','MLP', 'auto', 'custom'}, default = 'RF' `auto: selects the best scoring regressor based on r2 score` `custom: enables providing a custom regressor through *file_path* and *class_name* parameters` **file_path** when using 'custom' regressor, pass the path of the file containing the custom class, default = 'none' **class_name** when using 'custom' regressor, pass the class name through this parameter, default = 'none' **feature_selection** options: {'correlation', 'importance', 'none'}, default = 'none' `correlation: use feature columns with the highest correlation with the target` `importance: use feature columns that are important for the model to predict the target` `none: use all feature columns` **validation_percentage** value determining the validation split percentage (value from 0 to 1), default = 0.1 **cross_validation_k_folds** number of k-folds for cross validation, if 1 then no cv will be performed, default = 1 ****kwargs** optional parameters for the chosen regressor. you can find available parameters in the [sklearn docs](https://scikit-learn.org/stable/user_guide.html) ## **Attributes** **train_df** the preprocessed train dataset (after running `Regression.preprocess()`) **test_df** the preprocessed test dataset (after running `Regression.preprocess()`) **model** the trained model (after running `Regression.train_the_model()`) **pred_df** the prediction dataframe (test_df + predicted target) (after running `Regression.gen_pred_df(Regression.test_df)`) **metrics_dict** the validation metrics (after running `Regression.gen_metrics_dict()`) {     "r2_score": r2,     "mean_squared_error": mse,     "root_mean_squared_error": rmse,     "mean_absolute_error" : mae,     "cross_validation_score":cv_score, `returns None if cross_validation_k_folds==1` } ## **Methods** **preprocess()** perform preprocessing on train_df and test_df **train_the_model()** train the chosen regressor on the train_df **RMSE_history()** RMSE scores when varying the sampling size of the train_df (after running `Regression.train_the_model()`). *returns:* {     'x':train_df_sample_sizes,     'y1':train_scores_mean,     'y2':test_scores_mean,     'title':title } **gen_pred_df(test_df)** generates the prediction dataframe and assigns it to the `pred_df` attribute **gen_metrics_dict()** generates the validation metrics and assigns it to the `metrics_dict` **run()** a shortcut that runs the following methods: preprocess() train_the_model() gen_pred_df(Regression.test_df) gen_metrics_dict() ## Development - Clone the repo - run `pip install virtualenv` - run `python -m virtualenv venv` - run `. ./venv/bin/activate` on UNIX based systems or `. ./venv/Scripts/activate.ps1` if on windows - run `pip install -r requirements.txt` - run `pre-commit install`


نیازمندی

مقدار نام
>=1.2.0 joblib
<=1.23.4 numpy
>=1.5.1 pandas
>=1.1.3 scikit-learn
>=0.3 Boruta
- pip-tools
- pytest


زبان مورد نیاز

مقدار نام
>=3.8 Python


نحوه نصب


نصب پکیج whl blitzml-0.9.0:

    pip install blitzml-0.9.0.whl


نصب پکیج tar.gz blitzml-0.9.0:

    pip install blitzml-0.9.0.tar.gz