audiodatasets-1.0.0



Description

Pulls and pre-processes major Open Source (non-commercial mostly) datasets for spoken audio
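==================  ===========================================
Attribute           Value
==================  ===========================================
Operating system    -
File name           audiodatasets-1.0.0
Name                audiodatasets
Library version     1.0.0
Maintainer          []
Maintainer email    []
Author              Mike C. Fletcher
Author email        mcfletch@vrplumber.com
Home page           https://github.com/mcfletch/audiodatasets
URL                 https://pypi.org/project/audiodatasets/
License             MIT license
==================  ===========================================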
==============
Audio Datasets
==============

.. image:: https://img.shields.io/pypi/v/audiodatasets.svg
        :target: https://pypi.python.org/pypi/audiodatasets

.. image:: https://img.shields.io/travis/mcfletch/audiodatasets.svg
        :target: https://travis-ci.org/mcfletch/audiodatasets

.. image:: https://readthedocs.org/projects/audiodatasets/badge/?version=latest
        :target: https://audiodatasets.readthedocs.io/en/latest/?badge=latest
        :alt: Documentation Status

.. image:: https://pyup.io/repos/github/mcfletch/audiodatasets/shield.svg
        :target: https://pyup.io/repos/github/mcfletch/audiodatasets/
        :alt: Updates

Pulls and pre-processes major Open Source datasets for spoken audio

* Supported Datasets:

  * `Librispeech <http://www.openslr.org/resources/12/>`_ (60GB)
  * `TEDLIUM_release2 <http://www.openslr.org/resources/19/>`_ (35GB)
  * `VCTK-Corpus <http://homepages.inf.ed.ac.uk/jyamagis/release/>`_ (11GB)

* This is intended for use on Linux servers, and it is expected that you will use the library to feed a machine learning system (not required, but that is the point of collecting these datasets)
* MIT license for the software, but please note that the datasets themselves are generally for non-commercial use only

Features
--------

* Downloads common Open Source datasets and performs basic preprocessing on them
* Provides iterables that produce Numpy arrays from the audio data in common formats
* Uses `sphfile` to directly access sph files instead of needing to convert to `wav` first
* Uses a single shared location for the datasets, intended to be used by multiple projects

Installation/Setup
------------------

You need to create the download directory and make it writable by the running user.
Preferably you will do that via group-based permissions to allow sharing, but here we show creation with user-specific ownership::

    $ mkdir -p /var/datasets
    $ chown user:group /var/datasets
    $ chmod g+rw /var/datasets

If `/var/datasets` doesn't exist, or isn't writable, the downloader will instead populate `~/.config/datasets` with the data. You may wish to link that directory to `/var/datasets` so that you can use default instantiations of the corpora::

    $ ln -s /var/datasets ~/.config/datasets

Note that the downloader expects the following tools to be available; this may not yet be the case in a docker or minimal OS installation:

* `tar`
* `wget`

Now you can download the datasets.

.. note:: The datasets are big (100+GB)! If you are paying for data or are working on a slow connection you will likely want to arrange to do this step during a low-rated period or on a separate data connection.

From a command prompt::

    $ pip install audiodatasets
    # this will download 100+GB and then unpack it on disk, it will take a while...
    $ audiodatasets-download

Creating MFCC data-files::

    # this will generate Mel-frequency Cepstral Coefficient (MFCC) summaries for the
    # audio datasets (and download them if that hasn't been done). This isn't necessary
    # if you are doing only raw-audio processing
    $ audiodatasets-preprocess

Playing some audio::

    # this will iterate through playing every utterance that includes 'moon' in the transcript
    $ audiodatasets-search 'moon'

Usage
-----

Once set up, you will likely want to iterate over the datasets using, for instance, a partition to separate out train/validation/test data. To iterate over the raw audio:

.. code:: python

    import random

    from audiodatasets.corpora import build_corpora, partition

    def train_valid_test():
        """Create training, validation and test datasets

        returns three iterators yielding (array[10:512],transcript) batches
        """
        utterances = []
        for corpus in build_corpora():
            utterances.extend( corpus.iter_utterances())
        random.shuffle(utterances)
        train, test,valid = partition( utterances, (3,1,1) )
        def generation( utterances ):
            while True:
                # pick a fresh random starting offset on every pass over the data
                offset = random.randint(0,511)
                for name,transcript,audio_file in utterances:
                    # NOTE: `t` is not defined in this snippet as published; it
                    # presumably names whatever object provides `iter_batches`
                    for batch in t.iter_batches( audio_file, batch_size=10, input=512, offset=offset ):
                        yield batch,transcript
        # returned in the order promised by the function name
        return generation(train),generation(valid),generation(test)

To iterate over the 10ms MFCC preprocessed data, which yields 20 frequency bins per processing window (10ms):

.. code:: python

    import random

    from audiodatasets.corpora import build_corpora, partition

    def train_valid_test():
        """Create training, validation and test datasets

        Note: the batches vary in *time* at highest frequency, while
        the frequency bins are the second-highest frequency.

        See: `LibRosa MFCC <https://librosa.github.io/librosa/generated/librosa.feature.mfcc.html>`_

        returns three iterators yielding (array[10:20:63],transcript) batches
        """
        utterances = []
        for corpus in build_corpora():
            utterances.extend( corpus.mfcc_utterances())
        random.shuffle(utterances)
        train, test,valid = partition( utterances, (3,1,1) )
        def generation( utterances ):
            while True:
                # pick a fresh random starting offset on every pass over the data
                offset = random.randint(0,62)
                for name,transcript,audio_file in utterances:
                    # NOTE: `t` is not defined in this snippet as published; it
                    # presumably names whatever object provides `iter_batches`
                    for batch in t.iter_batches( audio_file, batch_size=10, input=63, offset=offset ):
                        yield batch,transcript
        # returned in the order promised by the function name
        return generation(train),generation(valid),generation(test)
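
Both usage examples call `partition( utterances, (3,1,1) )` without saying how the weights are applied. The sketch below shows one plausible reading, assuming the tuple holds relative proportions and the result is a set of consecutive slices. The name `weighted_partition` is hypothetical and this is not the audiodatasets implementation; check the library source for its actual behaviour.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def weighted_partition(items: Sequence[T], weights: Sequence[int]) -> List[List[T]]:
    """Split items into consecutive slices sized in proportion to weights.

    Hypothetical stand-in for illustration only; the real
    audiodatasets `partition` may behave differently.
    """
    if not weights or any(w < 0 for w in weights) or sum(weights) == 0:
        raise ValueError("weights must be non-negative and sum to a positive number")
    total = sum(weights)
    parts: List[List[T]] = []
    start = 0
    cumulative = 0
    for weight in weights:
        cumulative += weight
        # Integer arithmetic on the cumulative weight means no item is
        # dropped or duplicated, and the last slice ends at len(items).
        stop = len(items) * cumulative // total
        parts.append(list(items[start:stop]))
        start = stop
    return parts
```

With ten shuffled utterances and weights `(3, 1, 1)`, this yields slices of six, two, and two items. Shuffling before the split, as the usage examples do, matters because the slices are taken in order.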


How to install


Installing the audiodatasets-1.0.0 whl package:

    pip install audiodatasets-1.0.0.whl


Installing the audiodatasets-1.0.0 tar.gz package:

    pip install audiodatasets-1.0.0.tar.gz
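
After either install command, you may want to confirm which version ended up in the environment. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The helper name `installed_version` is an illustrative choice and not part of audiodatasets.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("audiodatasets"))
```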