MMSA-FET-0.4.1

Description

A tool for extracting multimodal features from videos.

| Attribute | Value |
| --- | --- |
| Operating system | OS Independent |
| File name | MMSA-FET-0.4.1 |
| Package name | MMSA-FET |
| Version | 0.4.1 |
| Maintainer | FlameSky |
| Maintainer email | - |
| Author | THUIAR |
| Author email | - |
| Homepage | https://github.com/thuiar/MMSA_FET |
| PyPI page | https://pypi.org/project/MMSA-FET/ |
| License | - |
# MMSA-Feature Extraction Toolkit

[![](https://badgen.net/badge/license/GPL-3.0/green)](#License) [![](https://badgen.net/pypi/v/MMSA-FET)](https://pypi.org/project/MMSA-FET/) ![](https://badgen.net/pypi/python/MMSA-FET/) [![](https://badgen.net/badge/contact/THUIAR/purple)](https://thuiar.github.io/)

MMSA-Feature Extraction Toolkit extracts multimodal features for Multimodal Sentiment Analysis datasets. It integrates several commonly used tools for the visual, acoustic, and text modalities. The extracted features are compatible with the [MMSA](https://github.com/thuiar/MMSA) framework and can therefore be used directly. The tool can also extract features for single videos.

This work is included in the ACL-2022 DEMO paper: [M-SENA: An Integrated Platform for Multimodal Sentiment Analysis](https://arxiv.org/abs/2203.12441). If you find our work useful, don't hesitate to cite our paper. Thank you!

```text
@article{mao2022m,
  title={M-SENA: An Integrated Platform for Multimodal Sentiment Analysis},
  author={Mao, Huisheng and Yuan, Ziqi and Xu, Hua and Yu, Wenmeng and Liu, Yihe and Gao, Kai},
  journal={arXiv preprint arXiv:2203.12441},
  year={2022}
}
```

### Features

- Extracts fully customized features for single videos or datasets.
- Integrates some of the most commonly used tools, including Librosa, OpenFace, Transformers, etc.
- Supports Active Speaker Detection when multiple faces exist in a video.
- Easy to use, providing both Python APIs and command-line tools.
- Extracted features are compatible with [MMSA](https://github.com/thuiar/MMSA), a unified training & testing framework for Multimodal Sentiment Analysis.

## 1. Installation

MMSA-Feature Extraction Toolkit is available from PyPI. Due to the package size limit on PyPI, large model files cannot be shipped with the package, so users need to run a post-install command to download them. If you can't access Google Drive, please refer to [this page](https://github.com/FlameSky-S/MMSA-FET/wiki/Dependency-Installation#2-post-installation-script) for manual download.

```bash
# Install package from PyPI
$ pip install MMSA-FET
# Download models & libraries from Google Drive. Use --proxy if needed.
$ python -m MSA_FET install
```

> **Note:** A few system-wide dependencies need to be installed manually. See [Dependency Installation](https://github.com/FlameSky-S/MMSA-FET/wiki/Dependency-Installation) for more information.

## 2. Quick Start

MMSA-FET is fairly easy to use. You can either call the API in Python or use the command-line interface. Below is a basic example using the Python APIs.

> **Note:** To extract features for datasets, the datasets need to be organized in a specific file structure, and a `label.csv` file is needed. See [Dataset and Structure](https://github.com/FlameSky-S/MMSA-FET/wiki/Dataset-and-Structure) for details. Raw video files and label files for MOSI, MOSEI and CH-SIMS can be downloaded from [BaiduYunDisk](https://pan.baidu.com/s/14dDwya98MK4fFI0qn-ujhw?pwd=gsxk) or [Google Drive](https://drive.google.com/drive/folders/1A2S4pqCHryGmiqnNSPLv7rEg63WvjCSk?usp=sharing).
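For orientation, a dataset folder prepared for `run_dataset` might look like the sketch below. This layout is an illustrative assumption (the `Raw/` folder name and the clip naming are not guaranteed); the wiki page linked above is the authoritative reference.

```text
MOSI/
├── label.csv          # labels and clip metadata
└── Raw/               # raw video clips
    ├── video_0001/
    │   ├── 0001.mp4
    │   └── 0002.mp4
    └── video_0002/
        └── 0001.mp4
```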
```python
from MSA_FET import FeatureExtractionTool

# initialize with the default librosa config, which only extracts audio features
fet = FeatureExtractionTool("librosa")

# alternatively, initialize with a custom config file
fet = FeatureExtractionTool("custom_config.json")

# extract features for a single video
feature = fet.run_single("input.mp4")
print(feature)

# extract for a dataset & save the features to a file
feature = fet.run_dataset(dataset_dir="~/MOSI", out_file="output/feature.pkl")
```

`custom_config.json` is the path to a custom config file, whose format is introduced below. For detailed usage, please read [APIs](https://github.com/FlameSky-S/MMSA-FET/wiki/APIs) and [Command Line Arguments](https://github.com/FlameSky-S/MMSA-FET/wiki/Command-Line-Arguments).

## 3. Config File

MMSA-FET comes with a few example configs, which can be used as shown below.

```python
# Each supported tool has an example config
fet = FeatureExtractionTool(config="librosa")
fet = FeatureExtractionTool(config="opensmile")
fet = FeatureExtractionTool(config="wav2vec")
fet = FeatureExtractionTool(config="openface")
fet = FeatureExtractionTool(config="mediapipe")
fet = FeatureExtractionTool(config="bert")
fet = FeatureExtractionTool(config="roberta")
```

For customized features, you can:

1. Edit the default configs and pass a dictionary to the `config` parameter, as in the example below:

```python
from MSA_FET import FeatureExtractionTool, get_default_config

# here we only extract audio and video features
config_a = get_default_config('opensmile')
config_v = get_default_config('openface')

# modify the default config
config_a['audio']['args']['feature_level'] = 'LowLevelDescriptors'

# combine the audio and video configs
config = {**config_a, **config_v}

# initialize
fet = FeatureExtractionTool(config=config)
```

2. Provide a config JSON file. The example below extracts features for all three modalities. To extract unimodal features, just remove the unnecessary sections from the file.

```json
{
  "audio": {
    "tool": "librosa",
    "sample_rate": null,
    "args": {
      "mfcc": {"n_mfcc": 20, "htk": true},
      "rms": {},
      "zero_crossing_rate": {},
      "spectral_rolloff": {},
      "spectral_centroid": {}
    }
  },
  "video": {
    "tool": "openface",
    "fps": 25,
    "average_over": 3,
    "args": {
      "hogalign": false,
      "simalign": false,
      "nobadaligned": false,
      "landmark_2D": true,
      "landmark_3D": false,
      "pdmparams": false,
      "head_pose": true,
      "action_units": true,
      "gaze": true,
      "tracked": false
    }
  },
  "text": {
    "model": "bert",
    "device": "cpu",
    "pretrained": "models/bert_base_uncased",
    "args": {}
  }
}
```
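Whichever config route you choose, the `out_file` produced by `run_dataset` in the Quick Start is a regular Python pickle and can be inspected directly. The following is a minimal sketch; the exact dictionary layout is an assumption here, so print the top-level keys rather than relying on any particular structure.

```python
import pickle

# load the feature file produced by run_dataset in the Quick Start example
with open("output/feature.pkl", "rb") as f:
    features = pickle.load(f)

# the top-level layout is not documented in this README, so inspect it
# first; MMSA-compatible feature files are typically dicts
print(type(features))
if isinstance(features, dict):
    print(list(features.keys()))
```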
## 4. Supported Tools & Features

### 4.1 Audio Tools

- **Librosa** ([link](https://librosa.org/doc/latest/index.html)) Supports all librosa features listed [here](https://librosa.org/doc/latest/feature.html), including [mfcc](https://librosa.org/doc/latest/generated/librosa.feature.mfcc.html#librosa.feature.mfcc), [rms](https://librosa.org/doc/latest/generated/librosa.feature.rms.html#librosa.feature.rms), [zero_crossing_rate](https://librosa.org/doc/latest/generated/librosa.feature.zero_crossing_rate.html#librosa.feature.zero_crossing_rate), [spectral_rolloff](https://librosa.org/doc/latest/generated/librosa.feature.spectral_rolloff.html#librosa.feature.spectral_rolloff), [spectral_centroid](https://librosa.org/doc/latest/generated/librosa.feature.spectral_centroid.html#librosa.feature.spectral_centroid), etc. Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#11-librosa).
- **openSMILE** ([link](https://audeering.github.io/opensmile-python/)) Supports all feature sets listed [here](https://audeering.github.io/opensmile-python/api-smile.html#featureset), including [ComParE_2016](http://www.tangsoo.de/documents/Publications/Schuller16-TI2.pdf), [GeMAPS](https://sail.usc.edu/publications/files/eyben-preprinttaffc-2015.pdf), [eGeMAPS](https://sail.usc.edu/publications/files/eyben-preprinttaffc-2015.pdf), emobase, etc. Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#12-opensmile).
- **Wav2vec2** ([link](https://huggingface.co/docs/transformers/model_doc/wav2vec2)) Integrated from Hugging Face Transformers. Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#13-wav2vec2).

### 4.2 Video Tools

- **OpenFace** ([link](https://github.com/TadasBaltrusaitis/OpenFace)) Supports all features in OpenFace's FeatureExtraction binary, including 2D and 3D facial landmarks, head pose, gaze, facial action units, and HOG binary files. Details of these features can be found in the OpenFace wiki [here](https://github.com/TadasBaltrusaitis/OpenFace/wiki/Output-Format) and [here](https://github.com/TadasBaltrusaitis/OpenFace/wiki/Action-Units). Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#21-openface).
- **MediaPipe** ([link](https://google.github.io/mediapipe/)) Supports the [face mesh](https://google.github.io/mediapipe/solutions/face_mesh.html) and [holistic](https://google.github.io/mediapipe/solutions/holistic) (face, hand, pose) solutions. Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#22-mediapipe).
- **TalkNet** ([link](https://github.com/TaoRuijie/TalkNet_ASD)) TalkNet supports Active Speaker Detection in case there are multiple human faces in the video.

### 4.3 Text Tools

- **BERT** ([link](https://huggingface.co/docs/transformers/model_doc/bert)) Integrated from Hugging Face Transformers. Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#31-bert).
- **XLNet** ([link](https://huggingface.co/docs/transformers/model_doc/xlnet)) Integrated from Hugging Face Transformers. Detailed configurations can be found [here](https://github.com/FlameSky-S/MMSA-FET/wiki/Configurations#32-xlnet).
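The `get_default_config` pattern from Section 3 should apply to the text modality as well. The following is a minimal sketch under that assumption (it assumes `get_default_config('bert')` exists, mirroring the audio/video examples, and swaps an illustrative Hugging Face model id into the `pretrained` field shown in the JSON config above):

```python
from MSA_FET import FeatureExtractionTool, get_default_config

# start from the default BERT text config (assumed to exist, mirroring
# the opensmile/openface examples in Section 3)
config_t = get_default_config('bert')

# point it at a different pretrained checkpoint; the model id below is
# an illustrative Hugging Face identifier, not one tested with MMSA-FET
config_t['text']['pretrained'] = 'bert-base-multilingual-cased'

fet = FeatureExtractionTool(config=config_t)
feature = fet.run_single("input.mp4")
```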


Requirements

| Name | Value |
| --- | --- |
| torch | >=1.9.1 |
| mediapipe | >=0.8.8.1 |
| opencv-contrib-python | >=4.5.4.58 |
| transformers | >=4.4.0 |
| opensmile | >=2.2.0 |
| librosa | >=0.8.1 |
| numpy | >=1.20.3 |
| pandas | >=1.2.5 |
| tqdm | >=4.62.2 |
| gdown | >=4.4.0 |
| scenedetect | >=0.5.6.1 |
| scipy | >=1.7.3 |
| python-speech-features | >=0.6 |
| ctc-segmentation | >=1.7.3 |


Required Language

| Name | Value |
| --- | --- |
| Python | >=3.10 |


How to Install

Install the MMSA-FET-0.4.1 whl package:

    pip install MMSA-FET-0.4.1.whl

Install the MMSA-FET-0.4.1 tar.gz package:

    pip install MMSA-FET-0.4.1.tar.gz
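After installing by either route, a quick smoke test confirms that the package imports and can construct the default librosa extractor from the Quick Start (a minimal sketch; it assumes the post-install model download from Section 1 of the README has been run if your config needs model files):

```python
# smoke test: import the package and build the lightweight librosa
# extractor, which relies only on the audio dependencies
from MSA_FET import FeatureExtractionTool

fet = FeatureExtractionTool("librosa")
print("MMSA-FET is ready:", type(fet).__name__)
```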