Description

TensorFlow 2.X reimplementation of CvT: Introducing Convolutions to Vision Transformers, Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.

Attribute	Value
Operating system	-
File name	cvt-tensorflow-1.1.4
Name	cvt-tensorflow
Library version	1.1.4
Maintainer	[]
Maintainer email	[]
Author	EMalagoli92
Author email	emala.892@gmail.com
Home page	https://github.com/EMalagoli92/CvT-TensorFlow
Package URL	https://pypi.org/project/cvt-tensorflow/
License	MIT

<div align="center">
  <a href="https://www.tensorflow.org">![TensorFlow](https://img.shields.io/badge/TensorFlow-2.X-orange?style=for-the-badge)</a>
  <a href="https://github.com/EMalagoli92/CvT-TensorFlow/blob/main/LICENSE">![License](https://img.shields.io/github/license/EMalagoli92/CvT-TensorFlow?style=for-the-badge)</a>
  <a href="https://www.python.org">![Python](https://img.shields.io/badge/python-%3E%3D%203.9-blue?style=for-the-badge)</a>
</div>

# CvT-TensorFlow

TensorFlow 2.X reimplementation of [CvT: Introducing Convolutions to Vision Transformers](https://arxiv.org/abs/2103.15808), Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang.

- Exact TensorFlow reimplementation of the official PyTorch repo, including the `timm` modules used by the authors, preserving the structure of models and layers.
- ImageNet pretrained weights ported from the official PyTorch implementation.

## Table of contents
- [Abstract](#abstract)
- [Results](#results)
- [Installation](#installation)
- [Usage](#usage)
- [Acknowledgement](#acknowledgement)
- [Citations](#citations)
- [License](#license)

<div id="abstract"/>

## Abstract

Convolutional vision Transformers (CvT) improve Vision Transformers (ViT) in performance and efficiency by introducing convolutions into ViT to yield the best of both designs. This is accomplished through two primary modifications: a hierarchy of Transformers containing a new convolutional token embedding, and a convolutional Transformer block leveraging a convolutional projection. These changes introduce desirable properties of convolutional neural networks (CNNs) to the ViT architecture (e.g. shift, scale, and distortion invariance) while maintaining the merits of Transformers (e.g. dynamic attention, global context, and better generalization). Moreover, the achieved results show that the positional encoding, a crucial component in existing Vision Transformers, can be safely removed from the model, simplifying the design for higher-resolution vision tasks.
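To make the hierarchical token embedding more concrete, here is a minimal arithmetic sketch; it is not code from this package. It assumes that each stage's Convolutional Token Embedding behaves like a standard strided 2D convolution with symmetric padding, and it reuses the `PATCH_SIZE`, `PATCH_STRIDE`, `PATCH_PADDING` and `DIM_EMBED` values from the custom configuration shown in the Usage section. The helper names `conv_output_size` and `stage_shapes` are hypothetical.

```python
# Assumption: standard convolution arithmetic with symmetric padding.
# These helpers are illustrative only and are not part of cvt_tensorflow.

def conv_output_size(size, kernel, stride, padding):
    """Output length of a strided convolution along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


# Values copied from the custom configuration in the Usage section.
PATCH_SIZE = [7, 3, 3]
PATCH_STRIDE = [4, 2, 2]
PATCH_PADDING = [2, 1, 1]
DIM_EMBED = [64, 192, 384]


def stage_shapes(input_size):
    """Return (height, width, channels) after each stage's token embedding."""
    shapes = []
    size = input_size
    for kernel, stride, padding, dim in zip(
        PATCH_SIZE, PATCH_STRIDE, PATCH_PADDING, DIM_EMBED
    ):
        # Each stage downsamples the token map produced by the previous one.
        size = conv_output_size(size, kernel, stride, padding)
        shapes.append((size, size, dim))
    return shapes


shapes_224 = stage_shapes(224)
shapes_384 = stage_shapes(384)
print(shapes_224)
print(shapes_384)
```

Under these assumptions, a 224x224 input is reduced to 56x56, 28x28 and 14x14 token maps while the channel width grows from 64 to 384. The 14x14 result agrees with the `stage2` output shape printed in the model summary in the Usage section.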
![Alt text](https://raw.githubusercontent.com/EMalagoli92/CvT-TensorFlow/266afd1057827d10f0dfb842f8ef73f5b19e471d/assets/images/pipeline.svg)
<p align = "center"><sub>The pipeline of the CvT architecture. (a) Overall architecture, showing the hierarchical multi-stage structure facilitated by the Convolutional Token Embedding layer. (b) Details of the Convolutional Transformer Block, which contains the convolution projection as the first layer.</sub></p>

<div id="results"/>

## Results

The TensorFlow implementation and the ported ImageNet weights were compared with the official PyTorch implementation on the [ImageNet-V2](https://www.tensorflow.org/datasets/catalog/imagenet_v2) test set.

### Models pre-trained on ImageNet-1K

| Configuration | Resolution | Top-1 (Original) | Top-1 (Ported) | Top-5 (Original) | Top-5 (Ported) | #Params |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| CvT-13 | 224x224 | 69.81 | 69.81 | 89.13 | 89.13 | 20M |
| CvT-13 | 384x384 | 71.31 | 71.31 | 89.97 | 89.97 | 20M |
| CvT-21 | 224x224 | 71.18 | 71.17 | 89.31 | 89.31 | 32M |
| CvT-21 | 384x384 | 71.61 | 71.61 | 89.71 | 89.71 | 32M |

### Models pre-trained on ImageNet-22K

| Configuration | Resolution | Top-1 (Original) | Top-1 (Ported) | Top-5 (Original) | Top-5 (Ported) | #Params |
| ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| CvT-13 | 384x384 | 71.76 | 71.76 | 91.39 | 91.39 | 20M |
| CvT-21 | 384x384 | 74.97 | 74.97 | 92.63 | 92.63 | 32M |
| CvT-W24 | 384x384 | 78.15 | 78.15 | 94.48 | 94.48 | 277M |

Maximum metric difference: `9e-5`.
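For readers unfamiliar with the Top-1 and Top-5 columns, the following is a minimal pure-Python sketch of how a top-k accuracy can be computed. It is not the evaluation script used for the tables above; the function name `top_k_accuracy` and the toy scores and labels are hypothetical and unrelated to ImageNet.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the best k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        if label in top_k:
            hits += 1
    return hits / len(labels)


# Toy data: 3 samples, 4 classes (made-up values for illustration).
scores = [
    [0.1, 0.6, 0.2, 0.1],  # best class: 1
    [0.5, 0.1, 0.3, 0.1],  # best class: 0, second best: 2
    [0.3, 0.1, 0.1, 0.5],  # best class: 3, second best: 0
]
labels = [1, 2, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
print(f"top-1: {top1:.4f}, top-2: {top2:.4f}")
```

Only the first sample is correct at k=1, while all three true labels fall within the two best-scoring classes. Top-k accuracy never decreases as k grows, which is why the Top-5 values in the tables are higher than the Top-1 values.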
<div id="installation"/>

## Installation

- Install from PyPI
```
pip install cvt-tensorflow
```
- Install from GitHub
```
pip install git+https://github.com/EMalagoli92/CvT-TensorFlow
```
- Clone the repo and install the necessary packages
```
git clone https://github.com/EMalagoli92/CvT-TensorFlow.git
cd CvT-TensorFlow
pip install -r requirements.txt
```

Tested on *Ubuntu 20.04.4 LTS x86_64*, *python 3.9.7*.

<div id="usage"/>

## Usage

- Define a custom CvT configuration.
```python
from cvt_tensorflow import CvT

# Define a custom CvT configuration
model = CvT(
    in_chans=3,
    num_classes=1000,
    classifier_activation="softmax",
    data_format="channels_last",
    spec={
        "INIT": "trunc_norm",
        "NUM_STAGES": 3,
        "PATCH_SIZE": [7, 3, 3],
        "PATCH_STRIDE": [4, 2, 2],
        "PATCH_PADDING": [2, 1, 1],
        "DIM_EMBED": [64, 192, 384],
        "NUM_HEADS": [1, 3, 6],
        "DEPTH": [1, 2, 10],
        "MLP_RATIO": [4.0, 4.0, 4.0],
        "ATTN_DROP_RATE": [0.0, 0.0, 0.0],
        "DROP_RATE": [0.0, 0.0, 0.0],
        "DROP_PATH_RATE": [0.0, 0.0, 0.1],
        "QKV_BIAS": [True, True, True],
        "CLS_TOKEN": [False, False, True],
        "QKV_PROJ_METHOD": ["dw_bn", "dw_bn", "dw_bn"],
        "KERNEL_QKV": [3, 3, 3],
        "PADDING_KV": [1, 1, 1],
        "STRIDE_KV": [2, 2, 2],
        "PADDING_Q": [1, 1, 1],
        "STRIDE_Q": [1, 1, 1],
    },
)
```
- Use a predefined CvT configuration.
```python
from cvt_tensorflow import CvT

model = CvT(
    configuration="cvt-21", data_format="channels_last", classifier_activation="softmax"
)

model.build((None, 224, 224, 3))
# summary() prints the table itself, so no print() wrapper is needed
model.summary()
```
```
Model: "cvt-21"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 stage0 (VisionTransformer)  multiple                  62080
 stage1 (VisionTransformer)  multiple                  1920576
 stage2 (VisionTransformer)  ((None, 384, 14, 14),     29296128
                              (None, 1, 384))
 norm (LayerNorm_)           (None, 1, 384)            768
 head (Linear_)              (None, 1000)              385000
 pred (Activation)           (None, 1000)              0
=================================================================
Total params: 31,664,552
Trainable params: 31,622,696
Non-trainable params: 41,856
_________________________________________________________________
```
- Train the model from scratch.
```python
# Example (x and y are training inputs and integer labels supplied by the caller)
model.compile(
    optimizer="sgd",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy", "sparse_top_k_categorical_accuracy"],
)
model.fit(x, y)
```
- Use ported ImageNet pretrained weights.
```python
# Example
from cvt_tensorflow import CvT

# Use cvt-13-384x384_22k ImageNet pretrained weights
model = CvT(
    configuration="cvt-13",
    pretrained=True,
    pretrained_resolution=384,
    pretrained_version="22k",
    classifier_activation="softmax",
)
# image is a preprocessed input batch supplied by the caller
y_pred = model(image)
```

<div id="acknowledgement"/>

## Acknowledgement

[CvT](https://github.com/microsoft/CvT) (Official PyTorch implementation)

<div id="citations"/>

## Citations

```bibtex
@article{wu2021cvt,
  title={Cvt: Introducing convolutions to vision transformers},
  author={Wu, Haiping and Xiao, Bin and Codella, Noel and Liu, Mengchen and Dai, Xiyang and Yuan, Lu and Zhang, Lei},
  journal={arXiv preprint arXiv:2103.15808},
  year={2021}
}
```

<div id="license"/>

## License

This work is made available under the [MIT License](https://github.com/EMalagoli92/CvT-TensorFlow/blob/main/LICENSE).

Requirements

Version	Name
==1.3.0	absl-py
==1.6.3	astunparse
==5.2.0	cachetools
==2022.9.24	certifi
==2.1.1	charset-normalizer
==2.2.0	cloudpickle
==5.1.1	decorator
==0.1.7	dm-tree
==0.4.1	einops
==1.12	flatbuffers
==0.4.0	gast
==2.12.0	google-auth
==0.4.6	google-auth-oauthlib
==0.2.0	google-pasta
==1.49.1	grpcio
==3.7.0	h5py
==3.4	idna
==5.0.0	importlib-metadata
==2.9.0	keras
==1.1.2	Keras-Preprocessing
==14.0.6	libclang
==3.4.1	Markdown
==2.1.1	MarkupSafe
==1.23.1	numpy
==3.2.1	oauthlib
==3.3.0	opt-einsum
==21.3	packaging
==3.19.6	protobuf
==0.4.8	pyasn1
==0.2.8	pyasn1-modules
==3.0.9	pyparsing
==2.28.1	requests
==1.3.1	requests-oauthlib
==4.9	rsa
==1.16.0	six
==2.9.1	tensorboard
==0.6.1	tensorboard-data-server
==1.8.1	tensorboard-plugin-wit
==2.9.0	tensorflow
==0.17.1	tensorflow-addons
==2.9.0	tensorflow-estimator
==0.27.0	tensorflow-io-gcs-filesystem
==0.17.0	tensorflow-probability
==2.0.1	termcolor
==2.13.3	typeguard
==4.4.0	typing-extensions
==1.26.12	urllib3
==2.2.2	Werkzeug
==1.14.1	wrapt
==3.9.0	zipp

Required language

Version	Name
>=3.9	Python

How to install

Install the cvt-tensorflow-1.1.4 whl package:

pip install cvt-tensorflow-1.1.4.whl

Install the cvt-tensorflow-1.1.4 tar.gz package:

pip install cvt-tensorflow-1.1.4.tar.gz