معرفی شرکت ها


dirhash-0.2.1


Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر
Card image cap
تبلیغات ما

مشتریان به طور فزاینده ای آنلاین هستند. تبلیغات می تواند به آنها کمک کند تا کسب و کار شما را پیدا کنند.

مشاهده بیشتر

توضیحات

Python module and CLI for hashing of file system directories.
ویژگی مقدار
سیستم عامل -
نام فایل dirhash-0.2.1
نام dirhash
نسخه کتابخانه 0.2.1
نگهدارنده []
ایمیل نگهدارنده []
نویسنده Anders Huss
ایمیل نویسنده andhus@kth.se
آدرس صفحه اصلی https://github.com/andhus/dirhash-python
آدرس اینترنتی https://pypi.org/project/dirhash/
مجوز MIT
[![Build Status](https://travis-ci.com/andhus/dirhash-python.svg?branch=master)](https://travis-ci.com/andhus/dirhash-python) [![codecov](https://codecov.io/gh/andhus/dirhash-python/branch/master/graph/badge.svg)](https://codecov.io/gh/andhus/dirhash-python) # dirhash A lightweight python module and CLI for computing the hash of any directory based on its files' structure and content. - Supports all hashing algorithms of Python's built-in `hashlib` module. - Glob/wildcard (".gitignore style") path matching for expressive filtering of files to include/exclude. - Multiprocessing for up to [6x speed-up](#performance) The hash is computed according to the [Dirhash Standard](https://github.com/andhus/dirhash), which is designed to allow for consistent and collision resistant generation/verification of directory hashes across implementations. ## Installation From PyPI: ```commandline pip install dirhash ``` Or directly from source: ```commandline git clone git@github.com:andhus/dirhash-python.git pip install dirhash/ ``` ## Usage Python module: ```python from dirhash import dirhash dirpath = "path/to/directory" dir_md5 = dirhash(dirpath, "md5") pyfiles_md5 = dirhash(dirpath, "md5", match=["*.py"]) no_hidden_sha1 = dirhash(dirpath, "sha1", ignore=[".*", ".*/"]) ``` CLI: ```commandline dirhash path/to/directory -a md5 dirhash path/to/directory -a md5 --match "*.py" dirhash path/to/directory -a sha1 --ignore ".*" ".*/" ``` ## Why? If you (or your application) need to verify the integrity of a set of files as well as their name and location, you might find this useful. Use-cases range from verification of your image classification dataset (before spending GPU-$$$ on training your fancy Deep Learning model) to validation of generated files in regression-testing. There isn't really a standard way of doing this. There are plenty of recipes out there (see e.g. these SO-questions for [linux](https://stackoverflow.com/questions/545387/linux-compute-a-single-hash-for-a-given-folder-contents) and [python](https://stackoverflow.com/questions/24937495/how-can-i-calculate-a-hash-for-a-filesystem-directory-using-python)) but I couldn't find one that is properly tested (there are some gotcha:s to cover!) and documented with a compelling user interface. `dirhash` was created with this as the goal. [checksumdir](https://github.com/cakepietoast/checksumdir) is another python module/tool with similar intent (that inspired this project) but it lacks much of the functionality offered here (most notably including file names/structure in the hash) and lacks tests. ## Performance The python `hashlib` implementation of common hashing algorithms are highly optimised. `dirhash` mainly parses the file tree, pipes data to `hashlib` and combines the output. Reasonable measures have been taken to minimize the overhead and for common use-cases, the majority of time is spent reading data from disk and executing `hashlib` code. The main effort to boost performance is support for multiprocessing, where the reading and hashing is parallelized over individual files. As a reference, let's compare the performance of the `dirhash` [CLI](https://github.com/andhus/dirhash-python/blob/master/src/dirhash/cli.py) with the shell command: `find path/to/folder -type f -print0 | sort -z | xargs -0 md5 | md5` which is the top answer for the SO-question: [Linux: compute a single hash for a given folder & contents?](https://stackoverflow.com/questions/545387/linux-compute-a-single-hash-for-a-given-folder-contents) Results for two test cases are shown below. Both have 1 GiB of random data: in "flat_1k_1MB", split into 1k files (1 MiB each) in a flat structure, and in "nested_32k_32kB", into 32k files (32 KiB each) spread over the 256 leaf directories in a binary tree of depth 8. Implementation | Test Case | Time (s) | Speed up ------------------- | --------------- | -------: | -------: shell reference | flat_1k_1MB | 2.29 | -> 1.0 `dirhash` | flat_1k_1MB | 1.67 | 1.36 `dirhash`(8 workers)| flat_1k_1MB | 0.48 | **4.73** shell reference | nested_32k_32kB | 6.82 | -> 1.0 `dirhash` | nested_32k_32kB | 3.43 | 2.00 `dirhash`(8 workers)| nested_32k_32kB | 1.14 | **6.00** The benchmark was run a MacBook Pro (2018), further details and source code [here](https://github.com/andhus/dirhash-python/tree/master/benchmark). ## Documentation Please refer to `dirhash -h`, the python [source code](https://github.com/andhus/dirhash-python/blob/master/src/dirhash/__init__.py) and the [Dirhash Standard](https://github.com/andhus/dirhash).


نیازمندی

مقدار نام
>=0.0.1 scantree


نحوه نصب


نصب پکیج whl dirhash-0.2.1:

    pip install dirhash-0.2.1.whl


نصب پکیج tar.gz dirhash-0.2.1:

    pip install dirhash-0.2.1.tar.gz