Description

Fibber is a benchmarking suite for adversarial attacks on text classification.

Attribute	Value
Operating system	-
File name	fibber-0.4.0
Name	fibber
Library version	0.4.0
Maintainer	[]
Maintainer email	[]
Author	MIT Data To AI Lab
Author email	dailabmit@gmail.com
Home page	https://github.com/DAI-Lab/fibber
URL	https://pypi.org/project/fibber/
License	MIT license

<p align="left"> <img width="15%" src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt="DAI-Lab" /> <i>An open source project from Data to AI Lab at MIT.</i> </p>

[![PyPI Shield](https://img.shields.io/pypi/v/fibber.svg)](https://pypi.python.org/pypi/fibber)
[![Downloads](https://pepy.tech/badge/fibber)](https://pepy.tech/project/fibber)
![build](https://github.com/DAI-Lab/fibber/workflows/build/badge.svg?branch=stable)

# Fibber

Fibber is a library to evaluate different strategies to paraphrase natural language, especially how these strategies can break text classifiers without changing the meaning of a sentence.

- Documentation: [https://DAI-Lab.github.io/fibber](https://DAI-Lab.github.io/fibber)
- GitHub: [https://github.com/DAI-Lab/fibber](https://github.com/DAI-Lab/fibber)

# Overview

Fibber is a library to evaluate different strategies to paraphrase natural language. The library ships with several built-in paraphrasing strategies and a benchmark framework to evaluate the quality of the paraphrases. In particular:

- We use the GPT2 language model to measure how meaningful the paraphrased text is.
- We use a universal sentence encoder to evaluate the semantic similarity between the original and the paraphrased text.
- We train a BERT classifier on the original dataset and check whether paraphrased sentences can break the text classifier.

# Try it now!

No matter how much experience you have with natural language processing and adversarial attacks, we encourage you to try the demo. Our demo runs on Colab, **so you can try it without installing anything!**

The Colab notebook automatically downloads a sentiment classifier and all required resources. Once the resources are downloaded, you can type in your own sentences and use Fibber to rewrite them. You can then read the rewritten sentences and their evaluation metrics. You will see that some rewritten sentences have the same meaning as your input but are misclassified by the classifier.

**[Click here to Launch Colab!](https://colab.research.google.com/drive/1zefsU19P3HdrBUqJy7HU9b9cSaB_nBMP#scrollTo=uNcmhgzHJ3VQ)**

# Install

## Requirements

**fibber** has been developed and tested on [Python 3.6, 3.7 and 3.8](https://www.python.org/downloads/)

Also, although it is not strictly required, the usage of [conda](https://docs.conda.io/en/latest/miniconda.html) is highly recommended to avoid interfering with other software installed in the system in which **fibber** is run.

These are the minimum commands needed to create a conda environment using python3.6 for **fibber**:

```bash
# First you should install conda.
conda create -n fibber_env python=3.6
```

Afterward, you have to execute this command to activate the environment:

```bash
conda activate fibber_env
```

**Then you should install tensorflow and pytorch.** Please follow the instructions for [tensorflow](https://www.tensorflow.org/install) and [pytorch](https://pytorch.org). Fibber requires `tensorflow>=2.0.0` and `pytorch>=1.5.0`. Please choose versions of tensorflow and pytorch that match the CUDA version on your computer.

Remember to execute `conda activate fibber_env` every time you start a new console to work on **fibber**!

**Install Java**

Please install a Java runtime environment on your computer.

## Install from PyPI

After creating the conda environment and activating it, we recommend using [pip](https://pip.pypa.io/en/stable/) in order to install **fibber**:

```bash
pip install fibber
```

This will pull and install the latest stable release from [PyPI](https://pypi.org/).
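Before running Fibber, it can help to confirm that the active environment meets the stated minimums (`tensorflow>=2.0.0` and `pytorch>=1.5.0`). The following is a minimal sketch that uses only the Python standard library and is not part of Fibber. The helper names `parse_version`, `meets_minimum`, and `check_environment` are hypothetical. It assumes Python 3.8 or later, where `importlib.metadata` is available, and it looks packages up under the distribution names `tensorflow` and `torch`. If TensorFlow was installed under another distribution name such as `tensorflow-gpu`, adjust the dictionary accordingly.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def parse_version(text):
    """Turn a version such as '1.13.1+cu117' into a tuple of ints, (1, 13, 1).

    Local suffixes after '+' and non-numeric tails such as 'rc1' are ignored.
    """
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '2.0' and '2.0.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions stated in the requirements above.
# The distribution names are an assumption about how the packages were installed.
MINIMUMS = {"tensorflow": "2.0.0", "torch": "1.5.0"}


def check_environment(minimums=MINIMUMS):
    """Map each distribution name to 'ok', 'too old (...)' or 'not installed'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Run the script inside the activated `fibber_env` environment. A "not installed" line only means that no distribution with that exact name was found.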
## Use without install

If you are using this project for research purposes and want to make changes to the code, you can clone the repository and install all requirements:

```bash
git clone git@github.com:DAI-Lab/fibber.git
cd fibber
pip install --requirement requirement.txt
```

Then you can use fibber by running:

```bash
python -m fibber.datasets.download_datasets
python -m fibber.benchmark.benchmark
```

In this case, any changes you make to the code take effect immediately.

## Install from source

With your conda environment activated, you can clone the repository and install it from source by running `make install` on the `stable` branch:

```bash
git clone git@github.com:DAI-Lab/fibber.git
cd fibber
git checkout stable
make install
```

# Quickstart

In this short tutorial, we will guide you through a series of steps that will help you get started with **fibber**.

**(1) [Install Fibber](#Install)**

**(2) Get a demo dataset and resources.**

```python
from fibber.datasets import get_demo_dataset

trainset, testset = get_demo_dataset()

from fibber.resources import download_all

# resources are downloaded to ~/.fibber
download_all()
```

**(3) Create a Fibber object.**

```python
from fibber.fibber import Fibber

# args starting with "asrs_" are hyperparameters for the ASRSStrategy.
arg_dict = {
    "use_gpu_id": 0,
    "gpt2_gpu_id": 0,
    "transformer_clf_gpu_id": 0,
    "ce_gpu_id": 0,
    "strategy_gpu_id": 0,
    "asrs_block_size": 3,
    "asrs_wpe_weight": 10000,
    "asrs_sim_weight": 500,
    "asrs_sim_threshold": 0.95,
    "asrs_ppl_weight": 5,
    "asrs_clf_weight": 3,
    "asrs_sim_metric": "CESimilarityMetric"
}

# create a fibber object.
# This step may take a while (about 1 hour) on RTX TITAN, and requires 20G of
# GPU memory. If there's not enough memory on your GPU, consider assigning
# GPT2, BERT, and the strategy to different GPUs.
#
fibber = Fibber(arg_dict, dataset_name="demo", strategy_name="ASRSStrategy",
                trainset=trainset, testset=testset, output_dir="exp-demo")
```

**(4) You can also ask fibber to paraphrase your sentence.**

The following call randomly paraphrases the sentence in 5 different ways.

```python
# Try sentences you like.
# label 0 means negative, and 1 means positive.
fibber.paraphrase(
    {"text0": ("The Avengers is a good movie. Although it is 3 hours long, every scene has something to watch."),
     "label": 1},
    field_name="text0",
    n=5)
```

The output is a tuple of (str, list, list).

```python
# Original Text
'The Avengers is a good movie. Although it is 3 hours long, every scene has something to watch.'

# 5 paraphrase_list
['the avengers is a good movie. even it is 2 hours long, there is not enough to watch.',
 'the avengers is a good movie. while it is 3 hours long, it is still very watchable.',
 'the avengers is a good movie and although it is 2 ¹⁄₂ hours long, it is never very interesting.',
 'avengers is not a good movie. while it is three hours long, it is still something to watch.',
 'the avengers is a bad movie. while it is three hours long, it is still something to watch.']

# Evaluation metrics of these 5 paraphrase_list.
[{'EditingDistance': 8,
  'USESimilarityMetric': 0.9523628950119019,
  'GloVeSimilarityMetric': 0.9795315341042675,
  'GPT2PerplexityMetric': 1.492070198059082,
  'BertClassifier': 0},
 {'EditingDistance': 9,
  'USESimilarityMetric': 0.9372092485427856,
  'GloVeSimilarityMetric': 0.9575780832312993,
  'GPT2PerplexityMetric': 0.9813404679298401,
  'BertClassifier': 1},
 {'EditingDistance': 11,
  'USESimilarityMetric': 0.9265919327735901,
  'GloVeSimilarityMetric': 0.9710499628056698,
  'GPT2PerplexityMetric': 1.325406551361084,
  'BertClassifier': 0},
 {'EditingDistance': 7,
  'USESimilarityMetric': 0.8913971185684204,
  'GloVeSimilarityMetric': 0.9800737898362042,
  'GPT2PerplexityMetric': 1.2504483461380005,
  'BertClassifier': 1},
 {'EditingDistance': 8,
  'USESimilarityMetric': 0.9124080538749695,
  'GloVeSimilarityMetric': 0.9744155151490856,
  'GPT2PerplexityMetric': 1.1626977920532227,
  'BertClassifier': 0}]
```

**(5) You can ask fibber to randomly pick a sentence from the dataset and paraphrase it.**

```python
fibber.paraphrase_a_random_sentence(n=5)
```

# Supported strategies

In this version, we implement the following strategies:

- IdentityStrategy:
    - The identity strategy outputs the original text as its paraphrase.
    - This strategy generates exactly 1 paraphrase for each original text regardless of the `--num_paraphrases_per_text` flag.
- RandomStrategy:
    - The random strategy outputs a random shuffle of the words in the original text.
- TextAttackStrategy:
    - We create a wrapper around [TextAttack](https://github.com/QData/TextAttack). To use TextAttack, run `pip install textattack` first.
- ASRSStrategy:
    - The implementation of [ASRS](https://arxiv.org/abs/2104.08453).

# Citing Fibber

If you use Fibber, please cite the following work:

- *Lei Xu, Kalyan Veeramachaneni.* Attacking Text Classifiers via Sentence Rewriting Sampler.
```
@article{xu2021attacking,
  title={Attacking Text Classifiers via Sentence Rewriting Sampler},
  author={Xu, Lei and Veeramachaneni, Kalyan},
  journal={arXiv preprint arXiv:2104.08453},
  year={2021}
}
```

# What's next?

For more details about **fibber** and all its possibilities and features, please check the [documentation site](https://DAI-Lab.github.io/fibber/).

# History

## Version 0.4.0 - 2022-06-29

This release includes the following updates.

### New Features

- Add RewriteRollbackStrategy, SAPStrategy.
- Redesign defense strategy API.
- Add AdvTrainStrategy, SEMStrategy, SAPDStrategy.

## Version 0.3.1 - 2021-07-20

This release includes the following updates.

### New Features

- Add `dynamic_len` seed option to ASRS, so the paraphrase can have a different length.
- Fix bug in `asrs_utils_lm`.

## Version 0.3.0 - 2021-06-05

This release includes the following updates.

### New Features

- Rename BertSamplingStrategy model as ASRSStrategy.
- Update ASRS to use Cross Encoder as default similarity metric.
- Add timeout feature to TextAttackStrategy.
- Update benchmark results.
- Add paper reference.

## Version 0.2.5 - 2021-03-22

This release is an emergency bug fix.

- Fix the bug in DatasetForBert introduced by the previous update.

## Version 0.2.4 - 2021-03-03

This release includes the following updates.

### New Features

- Improve the doc string and documentation for adversarial training.
- Add experimental code about non-autoregressive paraphrase strategy.

## Version 0.2.3 - 2021-02-17

This release adds experimental code for adversarial training.

### New Features

- Add a default adversarial tuning strategy.
- Add API in classifiers to support adversarial tuning.
- Add args in benchmark for adversarial tuning.

## Version 0.2.2 - 2021-02-03

This release fixes bugs and adds unit tests.

### New Features

- Add Sentence BERT metric and corresponding unit test.
- Fix the bug of the colab demo.

## Version 0.2.1 - 2021-01-20

This release improves documentation and unit tests.

### New Features

- Add integrity test for IdentityStrategy, TextAttackStrategy, and BertSamplingStrategy.
- For IdentityStrategy and TextAttackStrategy, accuracy is verified.
- Improve documentation, split data format from benchmark.

## Version 0.2.0 - 2021-01-06

This release updates the structure of the project and improves documentation.

### New Features

- Metric module is redesigned to have a consistent API. ([Issue #12](https://github.com/DAI-Lab/fibber/issues/12))
- More unit tests are added. Slow unit tests are skipped in CI. ([Issue #11](https://github.com/DAI-Lab/fibber/issues/11))
- Benchmark table is updated. ([Issue #10](https://github.com/DAI-Lab/fibber/issues/10))
- Better support for `TextAttack`. Users can choose any implemented attacking method in `TextAttack` using the `ta_recipe` arg. ([Issue #9](https://github.com/DAI-Lab/fibber/issues/9))

## Version 0.1.3

This release includes the following updates:

- Add a benchmark class. Users can integrate the fibber benchmark into other projects. The class supports customized datasets, target classifiers and attacking methods.
- Migrate from Travis CI to GitHub Actions.
- Move adversarial-attack-related aggregation functions from benchmark module to metric module.

## Version 0.1.2

This minor release adds pretrained classifiers and downloadable resources on a demo dataset, and a demo Colab.

## Version 0.1.1

This minor release removes the dependency on `textattack` because it produces dependency conflicts. Users can install it manually to use attacking strategies in `textattack`.

## Version 0.1.0

This release is a major update to the Fibber library. Advanced paraphrase algorithms are included.

- Add two strategies: TextFoolerStrategy and BertSamplingStrategy.
- Improve the benchmarking framework: add more metrics specifically designed for adversarial attack.
- Datasets: add a variation of AG's news dataset, `ag_no_title`.
- Bug fixes and improvements.

## Version 0.0.1

This is the first release of the Fibber library. This release contains:

- Datasets: fibber contains 6 built-in datasets.
- Metrics: fibber contains 6 metrics to evaluate the quality of paraphrased sentences. All metrics have a unified interface.
- Benchmark framework: the benchmark framework can easily evaluate the paraphrase strategies on built-in datasets and metrics.
- Strategies: this release contains 2 basic strategies, the identity strategy and the random strategy.
- A unified Fibber interface: users can easily use fibber by creating a Fibber object.
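Returning to the Quickstart output: each paraphrase comes with a dictionary of metrics, and a paraphrase is interesting as an adversarial example when the classifier's prediction differs from the true label while the similarity score stays high. The sketch below shows one way to select such paraphrases. It assumes the metric dictionaries use the `USESimilarityMetric` and `BertClassifier` keys printed in the Quickstart. The helper `find_adversarial` and the 0.9 similarity threshold are illustrative choices, not part of Fibber. The data is copied from the Quickstart output, keeping only the two keys the helper reads.

```python
def find_adversarial(paraphrases, metrics, true_label, min_similarity=0.9):
    """Return paraphrases whose predicted label differs from true_label
    while the USE similarity to the original stays at or above the threshold."""
    hits = []
    for text, m in zip(paraphrases, metrics):
        flipped = m["BertClassifier"] != true_label
        similar = m["USESimilarityMetric"] >= min_similarity
        if flipped and similar:
            hits.append(text)
    return hits


# Paraphrases and metric values copied from the Quickstart output.
PARAPHRASES = [
    "the avengers is a good movie. even it is 2 hours long, there is not enough to watch.",
    "the avengers is a good movie. while it is 3 hours long, it is still very watchable.",
    "the avengers is a good movie and although it is 2 ¹⁄₂ hours long, it is never very interesting.",
    "avengers is not a good movie. while it is three hours long, it is still something to watch.",
    "the avengers is a bad movie. while it is three hours long, it is still something to watch.",
]
METRICS = [
    {"USESimilarityMetric": 0.9523628950119019, "BertClassifier": 0},
    {"USESimilarityMetric": 0.9372092485427856, "BertClassifier": 1},
    {"USESimilarityMetric": 0.9265919327735901, "BertClassifier": 0},
    {"USESimilarityMetric": 0.8913971185684204, "BertClassifier": 1},
    {"USESimilarityMetric": 0.9124080538749695, "BertClassifier": 0},
]

# The original sentence was labeled 1 (positive).
hits = find_adversarial(PARAPHRASES, METRICS, true_label=1)
for text in hits:
    print(text)
```

With these values the helper selects the first, third, and fifth paraphrases. Reading them shows why the scores should not be trusted blindly: a high similarity score does not guarantee that the sentiment of the sentence was preserved.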

Requirements

Version	Name
>=1.18.0	numpy
>=2.0.0	tensorflow-gpu
>=0.9.0	tensorflow-hub
<2,>=1.0	torch
<1,>=0.4.2	torchvision
>=2.4.0	transformers
>=4.0.0	tqdm
>=2.0.0	spacy
>=1.0.0	pandas
>=3.0	nltk
>=1.0.4	rake-nltk
>=1.1.0	stanza
>=0.3.0	sentence-transformers
>=0.9.0	fasttext
>=1.2.0	expiringdict
>=0.5.3	bumpversion
>=9.0.1	pip
>=0.8.3	watchdog
<0.3,>=0.2.5	m2r2
<0.7,>=0.5.0	nbsphinx
==3.2.1	Sphinx
-	pydata-sphinx-theme
>=0.1.10	autodocsumm
<6,>=5.3.1	PyYaml
<1,>=0.26.2	argh
<1,>=0.4	sphinx-rtd-theme
<8,>=7	ipython
>=3.7.7	flake8
>=5	isort
>=1.2	autoflake
>=1.4.3	autopep8
>=1.10.0	twine
>=0.30.0	wheel
>=4.5.1	coverage
>=2.9.1	tox
>=3.4.2	pytest
>=2.6.0	pytest-cov

Required language

Version	Name
>=3.6	Python

How to install

Install the fibber-0.4.0 whl package:

pip install fibber-0.4.0.whl

Install the fibber-0.4.0 tar.gz package:

pip install fibber-0.4.0.tar.gz