معرفی شرکت ها

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

A tool to substitue patterns/names in a file tree

ویژگی	مقدار
سیستم عامل	-
نام فایل	anonymoUUs-0.0.8
نام	anonymoUUs
نسخه کتابخانه	0.0.8
نگهدارنده	[]
ایمیل نگهدارنده	[]
نویسنده	C.S. Kaandorp
ایمیل نویسنده	c.s.kaandorp@uu.nl
آدرس صفحه اصلی	https://github.com/UtrechtUniversity/anonymouus
آدرس اینترنتی	https://pypi.org/project/anonymoUUs/
مجوز	MIT

# anonymoUUs This description can be found [on GitHub here](https://github.com/UtrechtUniversity/anonymouus) anonymoUUs facilitates the replacement of keywords or regex-patterns within a file tree or zipped archive. It recursively traverses the tree, opens supported files and substitutes any found pattern or keyword with a replacement. Besides contents, anonymoUUs will substitue keywords/patterns in file/folder-paths as well. The result will be either a copied or replaced version of the original file-tree with all substitutions made. As of now, anonymoUUs supports text-based files, like .txt, .html, .json and .csv. UTF-8 encoding is assumed. Besides text files, anonymoUUs is also able to handle (nested) zip archives. These archives will be unpacked in a temp folder, processed and zipped again. ## Installation `$ pip install anonymoUUs` ## Usage In order to replace words or patterns you need a replacement-mapping in the form of: 1. a dictionary - the keys will be replaced by the values 2. the path to a csv file - a csv file will be converted in a dictionary, the first column provides keys, the second value provides values. Path can be a String, Path or PosixPath! 3. a function - a replacement function can be passed if a pattern is used. The function takes a found match and should return its replacement. The function must have at least one input argument. ### Example of replacement with a dictionary Import the Anomymize class in your code and create an anonymization object like this: ```python from anonymoUUs import Anonymize # refer to csv files in which keywords and substitutions are paired anonymize_csv = Anonymize('/Users/casper/Desktop/keys.csv') # using a dictionary instead of a csv file: my_dict = { 'A1234': 'aaaa', 'B9876': 'bbbb', } anonymize_dict = Anonymize(my_dict) ``` Putting regular expression in dictionaries is also possible.When using a dictionary only (absence of the `pattern` argument), the keys-pattern will be replaced by its value: ``` anon = Anonymize( { 'regular-key': 'replacement-1', re.compile('ca.*?er'): 'replacement-2' } ) ``` ### Example of replacement with a CSV file ```python # specifying a zip-format to zip unpacked archives after processing (.zip is default) anonymize_zip = Anonymize('/Users/casper/Desktop/keys.csv') ``` When using a csv-file, anonymoUUs will assume your file contains two columns: the left column contains the keywords which need to be replaced, the right column contains their substitutions. **Column headers are mandatory**, but don't have to follow a specific format. It is possible to add a regular expression as keyword in the csv-file. Make sure they start with the prefix 'r#'. Example: ``` r#ca.*?er, replacement_string ``` The key will be compiles as a regex and replace every match with 'replacement_string'. ### Example of replacement by regex pattern and function If you are replacing with a pattern you can also use a function to 'calculate' the replacement string: ```python def replace(match, **kwargs): result = 'default-replacement' match = int(match) threshold = kwargs.get("threshold", 4000) if match < threshold: result = 'special-replacement' return result anon = Anonymize(replace, pattern=r'\d{4}', threshold=1000) anon.substitute( '/Users/casperkaandorp/Desktop/test.json', '/Users/casperkaandorp/Desktop/result-folder' ) ``` Note the possibility to provide additional arguments when you initialize an Anonymize object that will be passed to the replcement function (in the previous example, the `threshold` is passed to the `replace` function). ### Other arguments Performance is probably best when your keywords can be generalized into a single regular expressions. anonymoUUs will search these patterns and replace them instead of matching the entire dictionary/csv-file against file contents or file/folder-paths. Example: ``` anonymize_regex = Anonymize(my_dict, pattern=r'[A-B]\d{4}') ``` By default is case sensitive by default. The regular expressions that take care of the replacements can be modified by using the `flag` parameter. It takes one or more variables [which can be found here](https://docs.python.org/3/library/re.html). Multiple variables are combined by a bitwise OR (the | operator). Example for a case-insensitive substitution: ``` anonymize_regex = Anonymize(my_dict, flags=re.IGNORECASE) ``` By using the `use_word_boundaries` argument (defaults to False), the algorithm ignores substring matches. If 'ted' is a key in your dictionary, without `use_word_boundaries` the algorithm will replace the 'ted' part in f.i. 'created_at'. You can overcome this problem by setting `use_word_boundaries` to True. It will put the `\b`-anchor around your regex pattern or dictionary keys. The beauty of the boundary anchors is that '@' is considered a boundary as well, and thus names in email addresses can be replaced. Example: ``` anonymize_regex = Anonymize(my_dict, use_word_boundaries=True) ``` It is also to specify how to re-zip unzipped folders: ```python # specifying a zip-format to zip unpacked archives after processing (.zip is default) anonymize_zip = Anonymize('/Users/casper/Desktop/keys.csv', zip_format='gztar') ``` ### Windows usage There is an issue with creating zip archives. Make sure you **run anonymoUUs as administrator**. ### Inplace replacements vs. replacements in a copy anonymoUUs is able to create a copy of the processed file-tree or replace it. The `substitute` method takes a mandatory source-path argument (path to a file, folder or zip-archive, either a string or a [Path](https://docs.python.org/3/library/pathlib.html#basic-use) object) and an optional target-path argument (again, a string or [Path](https://docs.python.org/3/library/pathlib.html#basic-use) object). The target **needs to refer to a folder**, which can't be a sub-folder of the source-folder. The target-folder will be created if it doesn't exist. When the target argument is provided, anonymoUUs will create a processed copy of the source into the target-folder. If the source is a single file, and the file path does not contain elements that will be replaced, and the target-folder is identical to the source folder, than the processed result will get a 'copy' extension to prevent overwriting. When the target argument is omitted, the source will be overwritten by a processed version of it: ``` # process the datadownload.zip file, replace all patterns and write # a copy to the 'bucket' folder. anonymize_regex.substitute( '/Users/casper/Desktop/datadownload.zip', '/Users/casper/Desktop/bucket' ) # process the 'download' folder and replace the original by its processed # version anonymize_regex.substitute('/Users/casper/Desktop/download') # process a single file, and replace it anonymize_regex.substitute('/Users/casper/Desktop/my_file.json') ``` ### Reading contents of a file Files will be opened depending on their extension. Non refognized extensions will be skipped. The standard version of this package assumes 'UTF-8' encoding. Errors are going to be ignored. Since reading file-contents is done with a single function, it will be easy to adjust (different encodings,etc) by overloading it in an extension: ```python # standard reading function def _read_file(self, source: Path): f = open(source, 'r', encoding='utf-8', errors='ignore') contents = list(f) f.close() return contents ``` ## Todo Cleaning up this document Testing! Sweet momma, it needs testing.

زبان مورد نیاز

مقدار	نام
>=3.6	Python

نحوه نصب

نصب پکیج whl anonymoUUs-0.0.8:

pip install anonymoUUs-0.0.8.whl

نصب پکیج tar.gz anonymoUUs-0.0.8:

pip install anonymoUUs-0.0.8.tar.gz