# bn2vec
Boolean Network embedding techniques & ML-based Boolean Network classification.
## 0. Introduction:
bn2vec is the result of research conducted from March to September 2021 as part of a larger project named [BNediction](https://bnediction.github.io/). The research focused on developing new embedding techniques built specifically for Boolean Networks, with the aim of using these techniques to classify Boolean Networks and to develop a solid set of features able to explain the performance of a given BN. <br/>
The full master's thesis wrapping the work done in this package can be found here: [Master's Thesis](https://drive.google.com/file/d/1I8tlNt7-CV9RZhmOJ5rv5Hxi_padirUl/view?usp=sharing); all details of how the embedding and the classification work are discussed in the report. <br/>
For a walkthrough example, please check [test.ipynb](./tests/test.ipynb).
## 1. Setting up:
Step 1: create and activate a new virtual environment.
```bash
python -m venv env
source env/bin/activate  # on Windows: env\Scripts\activate
```
Step 2: for a manual setup, install the packages listed in the requirements.txt file and then install bn2vec itself with pip.
```bash
pip install -r requirements.txt
```
```bash
pip install -e .
```
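To verify that the install worked (an optional sanity check):
```python
# a quick sanity check: the package should import without errors
import bn2vec
```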
## 2. Config:
When creating a ConfigParser object, you will be asked to provide the path to your configuration file. The file should be a YAML file and must conform to the validation rules in order to be used; in the absence of a config file, a default (allow-all) file is used, see [Default Config File](./bn2vec/config.yaml).<br/>
Under the **Memory** section, six options are allowed:
- memorize_dnf_graphs (resp. memorize_bn_graphs): if set to true, allows remembering the graph data generated from DNFs (resp. BNs).
- memorize_dnf_sequences (resp. memorize_bn_sequences): if set to true, allows remembering the sequence data generated from DNFs (resp. BNs).
- hard_memory: if set to true, allows storing the data generated from an ensemble of BNs on disk.
- hard_memory_loc: the folder path used for hard_memory.
Under the **Embeddings** section, we can specify any of the following options:
- rsf: stands for **Relaxed Structural Features**; if specified, the system generates RSF features of the given ensemble of BNs.
- lsf: stands for **Lossy Structural Features**; if specified, the system generates LSF features of the given ensemble of BNs.
- ptrns: short for **Patterns**; if specified, the system generates PTRNS features of the given ensemble of BNs.
- igf: stands for **Influence Graph Features**; if specified, the system generates IGF features of the given ensemble of BNs.
For more details about the rest of the file, please have a look at the [Default File](./bn2vec/config.yaml) and the [Full Report](https://drive.google.com/file/d/1I8tlNt7-CV9RZhmOJ5rv5Hxi_padirUl/view?usp=sharing).
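As an illustration, a config file using the options above might look like this (a hypothetical sketch: the key names come from this section, but the exact schema is defined by the validation rules of the [Default Config File](./bn2vec/config.yaml), which should be taken as the reference):
```yaml
# hypothetical layout, for illustration only
Memory:
  memorize_dnf_graphs: true
  memorize_bn_graphs: true
  memorize_dnf_sequences: false
  memorize_bn_sequences: false
  hard_memory: false
  hard_memory_loc: path/to/hard_memory_folder
Embeddings:
  rsf:    # generate Relaxed Structural Features
  ptrns:  # generate Patterns features
```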
## 3. Embeddings:
Let us have a look at the different ways of using the feature engineering module.<br/>
Necessary imports:
```python
from colomoto import minibn
from bn2vec.feature_engineering import Dnf2Vec, Bn2Vec, Ens2Mat
from bn2vec.utils import ConfigParser
```
When using **Dnf2Vec** (embedding a single DNF) or **Bn2Vec** (embedding a single BN, i.e. an ensemble of DNFs), we have to tell the system to parse the config file ourselves.
```python
ConfigParser.parse("path/to/configfile")
```
We use minibn.BooleanNetwork to parse Boolean Network files.
```python
bn = minibn.BooleanNetwork("path/to/boolean_network")
BN = list(bn.items())
```
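For quick experiments, minibn can also build a network directly from a dict of update rules, as in colomoto's documentation (a minimal sketch; the toy rules below are made up):
```python
# hypothetical toy network, for illustration only
bn = minibn.BooleanNetwork({
    "a": "b & !c",
    "b": "a",
    "c": "a | b",
})
BN = list(bn.items())  # list of (component, DNF) pairs
```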
Then, when using **Dnf2Vec**, we can embed one of the BN's DNFs this way.
```python
gen = Dnf2Vec(dnf=BN[0][1], comp_name=BN[0][0])
graphs, seqs, features = gen.generate_features()
```
The generate_features method returns three objects:
- graphs (resp. seqs): a dictionary containing the graph (resp. sequence) data of the given DNF (if asked for in the config).
- features: a pandas Series object containing the final features extracted from the given DNF.
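Since features is a pandas Series and graphs is a plain dictionary, both can be inspected directly (a minimal illustration; the actual contents depend on your config):
```python
print(features.head())      # the first few extracted features
print(list(graphs.keys()))  # which graph representations were memorized, if any
```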
Likewise, we can embed the whole BN.
```python
gen = Bn2Vec(BN)
bn_graphs, bn_seqs, dnfs_data, bn_features = gen.generate_features()
```
This time we have more complicated semi-structured data to look at:
- bn_graphs (resp. bn_seqs): a dictionary containing the graph (resp. sequence) data of the given BN (if asked for in the config).
- dnfs_data: contains the DNF graphs, sequences and features generated by Dnf2Vec for all DNFs in the given BN.
- bn_features: a pandas Series object containing the final features extracted from the given BN.
If we want to embed an ensemble of BNs, we simply use **Ens2Mat** (ensemble to matrix).
```python
gen = Ens2Mat(
    config_path='path/to/config_file',
    master_model_src='path/to/master_model'
)
X, Y = gen.vectorize_BNs(
    'path/to/base_directory',
    '',  # bundle file name (under base_directory)
    size='all'  # or an integer (the number of BNs to embed)
)
```
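With the feature matrix X and the labels Y in hand, any off-the-shelf classifier can be trained on top of the embedding. A minimal sketch with scikit-learn, assuming X is a feature matrix and Y a label vector (bn2vec itself does not prescribe a classifier):
```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# hold out part of the ensemble for evaluation
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, Y_train)
print(clf.score(X_test, Y_test))  # mean accuracy on the held-out BNs
```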
## 4. Features Selector:
In order to use **BnFeaturesSelector**, we have to import one extra module:
```python
from bn2vec.feature_selection import BnFeaturesSelector
```
This module has three main methods:
- drop_zero_variance_features: removes features that show no variation at all across samples.
- cluster_collinear_features_leiden: uses the Leiden algorithm to cluster features based on their collinearities, then selects the best representative feature from each cluster. This method is only useful in the case of LSF and RSF (mostly LSF, where eliminating collinearities is important, but deciding which features to remove is even more important).
- correct_collinearity: takes a set of features and returns another set of features (highly collinear with the input features) which are easier to interpret than the originals.
```python
selector = BnFeaturesSelector(X, mode='lsf')
X = selector.drop_zero_variance_features()
X, clusters = selector.cluster_collinear_features_leiden(thresh=0.8)
```
The thresh argument is the minimal absolute correlation value between two features above which they are considered correlated.
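The third method, correct_collinearity, is described above but not shown; a hypothetical call might look as follows (the argument is an assumption based on the description above; check the method's signature in the source):
```python
# hypothetical: pass the features kept after clustering
X_corrected = selector.correct_collinearity(X)
```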
## 5. Rules Extractor:
Necessary imports for using the rules extraction module:
```python
import os

from bn2vec.utils import BnDataset
from bn2vec.rules_extraction import DTC, RulesExtractor
```
Creating a BnDataset object is necessary:
```python
base_dir = 'path/to/base_directory'
BN = BnDataset(
    dataset_X=os.path.join(base_dir, 'path/to/X_file'),
    dataset_Y=os.path.join(base_dir, 'path/to/Y_file'),
    score_threshold=1
)
```
Then we can create our **DTC** (short for Decision Tree Classifier) object:
```python
dtc = DTC(
    dataset=BN,
    save_dir="path/to/saving_directory",
    ensemble="ens1",
    embedding="ptrns"
)
```
The 'ensemble' and 'embedding' arguments are there just for naming conventions. To train deep decision tree classifiers, we use the train_deep_dtcs method:
```python
dtc.train_deep_dtcs(test_size=0.3)
```
This will train a balanced and an unbalanced version of the tree, save the trees and the metrics in the save_dir folder, and print the metrics for visual inspection.<br/>
In order to extract useful rules from these trees, we should use the **RulesExtractor** class:
```python
rule_extractor = RulesExtractor(
    dataset=BN,
    dtc="path/to/dtc",
)
rules = rule_extractor.extract_rules(
    thresh=0,
    tpr_weight=0.5,  # importance of the true positive rate
    tnr_weight=0.5   # importance of the true negative rate
)
```
For training singleton decision trees (trees with a single split), we use train_singleton_dtcs:
```python
rules = dtc.train_singleton_dtcs(
    test_size=0.3,
    balanced=False,
    thresh=0.5,
    tpr_weight=0.5,
    tnr_weight=0.5
)
```