%global _empty_manifest_terminate_build 0
Name: python-pytorch-pretrained-vit
Version: 0.0.7
Release: 1
Summary: Visual Transformers (ViT) in PyTorch.
License: Apache
URL: https://github.com/lukemelas/ViT-PyTorch
Source0: https://mirrors.aliyun.com/pypi/web/packages/02/8d/b404fe410a984ce2bc95a8ce02d397e3b8b12d6dd3118db6ac9b8edaa370/pytorch-pretrained-vit-0.0.7.tar.gz
BuildArch: noarch
%description
# ViT PyTorch
### Quickstart
Install with `pip install pytorch_pretrained_vit` and load a pretrained ViT with:
```python
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
```
Or find a Google Colab example [here](https://colab.research.google.com/drive/1muZ4QFgVfwALgqmrfOkp7trAvqDemckO?usp=sharing).
### Overview
This repository contains an op-for-op PyTorch reimplementation of the [Visual Transformer](https://openreview.net/forum?id=YicbFdNTTy) architecture from [Google](https://github.com/google-research/vision_transformer), along with pre-trained models and examples.
The goal of this implementation is to be simple, highly extensible, and easy to integrate into your own projects.
At the moment, you can easily:
* Load pretrained ViT models
* Evaluate on ImageNet or your own data
* Finetune ViT on your own dataset
Coming soon:
* Train ViT from scratch on ImageNet (1K)
* Export to ONNX for efficient inference
### Table of contents
1. [About ViT](#about-vit)
2. [About ViT-PyTorch](#about-vit-pytorch)
3. [Installation](#installation)
4. [Usage](#usage)
    * [Load pretrained models](#loading-pretrained-models)
    * [Example: Classification](#example-classification)
5. [Contributing](#contributing)
### About ViT
Visual Transformers (ViT) are a straightforward application of the [transformer architecture](https://arxiv.org/abs/1706.03762) to image classification. Even in computer vision, it seems, attention is all you need.
The ViT architecture works as follows: (1) it splits an image into fixed-size patches and treats them as a 1-dimensional sequence, (2) it prepends a classification token to the sequence, (3) it passes the sequence through a transformer encoder (like [BERT](https://arxiv.org/abs/1810.04805)), and (4) it passes the first token of the encoder output through a small MLP to obtain the classification logits.
ViT is trained on a large-scale dataset (ImageNet-21k) with a huge amount of compute.
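To make the four steps above concrete, here is a minimal, self-contained toy version of the architecture in plain PyTorch. This is an illustrative sketch, not this package's implementation: the sizes are made up, and details such as pre-norm and GELU activations are omitted.
```python
import torch
import torch.nn as nn

class MiniViT(nn.Module):
    """Toy ViT: patchify (1), prepend a class token (2), encode (3), classify (4)."""
    def __init__(self, image_size=32, patch_size=8, dim=64, depth=2, heads=4, num_classes=10):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2
        # (1) Patch embedding: a strided conv turns the image into a sequence of patch vectors
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        # (2) Learnable classification token and position embeddings
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, dim))
        # (3) A standard transformer encoder over the token sequence
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        # (4) A small head on the first (class) token produces the logits
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        b = x.shape[0]
        patches = self.patch_embed(x).flatten(2).transpose(1, 2)    # (B, N, dim)
        cls = self.cls_token.expand(b, -1, -1)                      # (B, 1, dim)
        tokens = torch.cat([cls, patches], dim=1) + self.pos_embed  # (B, N+1, dim)
        return self.head(self.encoder(tokens)[:, 0])                # logits from class token

print(MiniViT()(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])
```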
### About ViT-PyTorch
ViT-PyTorch is a PyTorch re-implementation of ViT. It is consistent with the [original Jax implementation](https://github.com/google-research/vision_transformer), so that it's easy to load Jax-pretrained weights.
At the same time, we aim to make our PyTorch implementation as simple, flexible, and extensible as possible.
### Installation
Install with pip:
```bash
pip install pytorch_pretrained_vit
```
Or from source:
```bash
git clone https://github.com/lukemelas/ViT-PyTorch
cd ViT-PyTorch
pip install -e .
```
### Usage
#### Loading pretrained models
Loading a pretrained model is easy:
```python
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
```
Details about the models are below:
| *Name* | *Pretrained on* | *Finetuned on* | *Available?* |
|:-----------------:|:---------------:|:------------:|:-----------:|
| `B_16` | ImageNet-21k | - | ✓ |
| `B_32` | ImageNet-21k | - | ✓ |
| `L_16` | ImageNet-21k | - | - |
| `L_32` | ImageNet-21k | - | ✓ |
| `B_16_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `B_32_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `L_16_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `L_32_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
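The models pretrained only on ImageNet-21k are intended as starting points for finetuning. Below is a sketch of loading one with a fresh head for your own dataset; the `num_classes` and `image_size` keyword arguments are assumptions based on this repository's constructor, so check the source if they differ:
```python
from pytorch_pretrained_vit import ViT

# Assumed keyword arguments: `num_classes` swaps in a fresh classification head,
# `image_size` adapts the position embeddings. Verify both against the ViT
# constructor in this repository before relying on them.
model = ViT('B_16', pretrained=True, num_classes=10, image_size=128)
```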
#### Custom ViT
Loading custom configurations is just as easy:
```python
from pytorch_pretrained_vit import ViT
# An example custom configuration (for comparison, 'B_16' uses hidden_size=768, num_heads=12, num_layers=12)
config = dict(hidden_size=512, num_heads=8, num_layers=6)
model = ViT.from_config(config)
```
#### Example: Classification
Below is a simple, complete example. It may also be found as a Jupyter notebook in `examples/simple` or in the Colab notebook linked in the Quickstart above.
```python
from PIL import Image
import torch
from torchvision import transforms

# Load ViT
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
model.eval()

# Load image
# NOTE: Assumes an image `img.jpg` exists in the current directory
img = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(0.5, 0.5),
])(Image.open('img.jpg')).unsqueeze(0)
print(img.shape)  # torch.Size([1, 3, 384, 384])

# Classify
with torch.no_grad():
    outputs = model(img)
print(outputs.shape)  # torch.Size([1, 1000])
```
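To turn the logits into readable predictions, you can decode the top-scoring classes. A follow-on sketch, continuing from `outputs` above and assuming a `labels_map.txt` JSON file mapping class indices to ImageNet names (the `examples/simple` notebook ships such a file; the exact filename here is an assumption):
```python
import json
import torch

# Assumed file: a JSON object mapping "0".."999" to ImageNet-1k class names.
with open('labels_map.txt') as f:
    labels_map = json.load(f)

# Report the five highest-probability classes for the logits computed above.
probs = torch.softmax(outputs, dim=-1)
top_probs, top_idxs = probs.topk(5, dim=-1)
for p, i in zip(top_probs[0], top_idxs[0]):
    print(f'{labels_map[str(i.item())]:<40} {p.item():.4f}')
```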
#### ImageNet
See `examples/imagenet` for details about evaluating on ImageNet.
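For orientation, here is a minimal top-1 evaluation loop over an ImageNet-style directory; the path and batch size are placeholders, and the script in `examples/imagenet` is the authoritative version:
```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from pytorch_pretrained_vit import ViT

# Placeholder path: point this at a directory laid out like ImageNet's val/ split.
transform = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(0.5, 0.5),
])
loader = DataLoader(datasets.ImageFolder('/path/to/imagenet/val', transform),
                    batch_size=32, num_workers=4)

model = ViT('B_16_imagenet1k', pretrained=True).eval()
correct = total = 0
with torch.no_grad():
    for images, targets in loader:
        preds = model(images).argmax(dim=-1)
        correct += (preds == targets).sum().item()
        total += targets.numel()
print(f'top-1 accuracy: {correct / total:.4f}')
```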
#### Credit
Other great repositories with this model include:
- [Ross Wightman's repo](https://github.com/rwightman/pytorch-image-models)
- [Phil Wang's repo](https://github.com/lucidrains/vit-pytorch)
### Contributing
If you find a bug, create a GitHub issue, or even better, submit a pull request. Similarly, if you have questions, simply post them as GitHub issues.
I look forward to seeing what the community does with these models!
%package -n python3-pytorch-pretrained-vit
Summary: Visual Transformers (ViT) in PyTorch.
Provides: python-pytorch-pretrained-vit
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-pytorch-pretrained-vit
# ViT PyTorch
### Quickstart
Install with `pip install pytorch_pretrained_vit` and load a pretrained ViT with:
```python
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
```
Or find a Google Colab example [here](https://colab.research.google.com/drive/1muZ4QFgVfwALgqmrfOkp7trAvqDemckO?usp=sharing).
### Overview
This repository contains an op-for-op PyTorch reimplementation of the [Visual Transformer](https://openreview.net/forum?id=YicbFdNTTy) architecture from [Google](https://github.com/google-research/vision_transformer), along with pre-trained models and examples.
The goal of this implementation is to be simple, highly extensible, and easy to integrate into your own projects.
At the moment, you can easily:
* Load pretrained ViT models
* Evaluate on ImageNet or your own data
* Finetune ViT on your own dataset
Coming soon:
* Train ViT from scratch on ImageNet (1K)
* Export to ONNX for efficient inference
### Table of contents
1. [About ViT](#about-vit)
2. [About ViT-PyTorch](#about-vit-pytorch)
3. [Installation](#installation)
4. [Usage](#usage)
    * [Load pretrained models](#loading-pretrained-models)
    * [Example: Classification](#example-classification)
5. [Contributing](#contributing)
### About ViT
Visual Transformers (ViT) are a straightforward application of the [transformer architecture](https://arxiv.org/abs/1706.03762) to image classification. Even in computer vision, it seems, attention is all you need.
The ViT architecture works as follows: (1) it splits an image into fixed-size patches and treats them as a 1-dimensional sequence, (2) it prepends a classification token to the sequence, (3) it passes the sequence through a transformer encoder (like [BERT](https://arxiv.org/abs/1810.04805)), and (4) it passes the first token of the encoder output through a small MLP to obtain the classification logits.
ViT is trained on a large-scale dataset (ImageNet-21k) with a huge amount of compute.
### About ViT-PyTorch
ViT-PyTorch is a PyTorch re-implementation of ViT. It is consistent with the [original Jax implementation](https://github.com/google-research/vision_transformer), so that it's easy to load Jax-pretrained weights.
At the same time, we aim to make our PyTorch implementation as simple, flexible, and extensible as possible.
### Installation
Install with pip:
```bash
pip install pytorch_pretrained_vit
```
Or from source:
```bash
git clone https://github.com/lukemelas/ViT-PyTorch
cd ViT-PyTorch
pip install -e .
```
### Usage
#### Loading pretrained models
Loading a pretrained model is easy:
```python
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
```
Details about the models are below:
| *Name* | *Pretrained on* | *Finetuned on* | *Available?* |
|:-----------------:|:---------------:|:------------:|:-----------:|
| `B_16` | ImageNet-21k | - | ✓ |
| `B_32` | ImageNet-21k | - | ✓ |
| `L_16` | ImageNet-21k | - | - |
| `L_32` | ImageNet-21k | - | ✓ |
| `B_16_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `B_32_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `L_16_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `L_32_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
#### Custom ViT
Loading custom configurations is just as easy:
```python
from pytorch_pretrained_vit import ViT
# An example custom configuration (for comparison, 'B_16' uses hidden_size=768, num_heads=12, num_layers=12)
config = dict(hidden_size=512, num_heads=8, num_layers=6)
model = ViT.from_config(config)
```
#### Example: Classification
Below is a simple, complete example. It may also be found as a Jupyter notebook in `examples/simple` or in the Colab notebook linked in the Quickstart above.
```python
from PIL import Image
import torch
from torchvision import transforms

# Load ViT
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
model.eval()

# Load image
# NOTE: Assumes an image `img.jpg` exists in the current directory
img = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(0.5, 0.5),
])(Image.open('img.jpg')).unsqueeze(0)
print(img.shape)  # torch.Size([1, 3, 384, 384])

# Classify
with torch.no_grad():
    outputs = model(img)
print(outputs.shape)  # torch.Size([1, 1000])
```
#### ImageNet
See `examples/imagenet` for details about evaluating on ImageNet.
#### Credit
Other great repositories with this model include:
- [Ross Wightman's repo](https://github.com/rwightman/pytorch-image-models)
- [Phil Wang's repo](https://github.com/lucidrains/vit-pytorch)
### Contributing
If you find a bug, create a GitHub issue, or even better, submit a pull request. Similarly, if you have questions, simply post them as GitHub issues.
I look forward to seeing what the community does with these models!
%package help
Summary: Development documents and examples for pytorch-pretrained-vit
Provides: python3-pytorch-pretrained-vit-doc
%description help
# ViT PyTorch
### Quickstart
Install with `pip install pytorch_pretrained_vit` and load a pretrained ViT with:
```python
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
```
Or find a Google Colab example [here](https://colab.research.google.com/drive/1muZ4QFgVfwALgqmrfOkp7trAvqDemckO?usp=sharing).
### Overview
This repository contains an op-for-op PyTorch reimplementation of the [Visual Transformer](https://openreview.net/forum?id=YicbFdNTTy) architecture from [Google](https://github.com/google-research/vision_transformer), along with pre-trained models and examples.
The goal of this implementation is to be simple, highly extensible, and easy to integrate into your own projects.
At the moment, you can easily:
* Load pretrained ViT models
* Evaluate on ImageNet or your own data
* Finetune ViT on your own dataset
Coming soon:
* Train ViT from scratch on ImageNet (1K)
* Export to ONNX for efficient inference
### Table of contents
1. [About ViT](#about-vit)
2. [About ViT-PyTorch](#about-vit-pytorch)
3. [Installation](#installation)
4. [Usage](#usage)
    * [Load pretrained models](#loading-pretrained-models)
    * [Example: Classification](#example-classification)
5. [Contributing](#contributing)
### About ViT
Visual Transformers (ViT) are a straightforward application of the [transformer architecture](https://arxiv.org/abs/1706.03762) to image classification. Even in computer vision, it seems, attention is all you need.
The ViT architecture works as follows: (1) it splits an image into fixed-size patches and treats them as a 1-dimensional sequence, (2) it prepends a classification token to the sequence, (3) it passes the sequence through a transformer encoder (like [BERT](https://arxiv.org/abs/1810.04805)), and (4) it passes the first token of the encoder output through a small MLP to obtain the classification logits.
ViT is trained on a large-scale dataset (ImageNet-21k) with a huge amount of compute.
### About ViT-PyTorch
ViT-PyTorch is a PyTorch re-implementation of ViT. It is consistent with the [original Jax implementation](https://github.com/google-research/vision_transformer), so that it's easy to load Jax-pretrained weights.
At the same time, we aim to make our PyTorch implementation as simple, flexible, and extensible as possible.
### Installation
Install with pip:
```bash
pip install pytorch_pretrained_vit
```
Or from source:
```bash
git clone https://github.com/lukemelas/ViT-PyTorch
cd ViT-PyTorch
pip install -e .
```
### Usage
#### Loading pretrained models
Loading a pretrained model is easy:
```python
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
```
Details about the models are below:
| *Name* | *Pretrained on* | *Finetuned on* | *Available?* |
|:-----------------:|:---------------:|:------------:|:-----------:|
| `B_16` | ImageNet-21k | - | ✓ |
| `B_32` | ImageNet-21k | - | ✓ |
| `L_16` | ImageNet-21k | - | - |
| `L_32` | ImageNet-21k | - | ✓ |
| `B_16_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `B_32_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `L_16_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
| `L_32_imagenet1k` | ImageNet-21k | ImageNet-1k | ✓ |
#### Custom ViT
Loading custom configurations is just as easy:
```python
from pytorch_pretrained_vit import ViT
# An example custom configuration (for comparison, 'B_16' uses hidden_size=768, num_heads=12, num_layers=12)
config = dict(hidden_size=512, num_heads=8, num_layers=6)
model = ViT.from_config(config)
```
#### Example: Classification
Below is a simple, complete example. It may also be found as a Jupyter notebook in `examples/simple` or in the Colab notebook linked in the Quickstart above.
```python
from PIL import Image
import torch
from torchvision import transforms

# Load ViT
from pytorch_pretrained_vit import ViT
model = ViT('B_16_imagenet1k', pretrained=True)
model.eval()

# Load image
# NOTE: Assumes an image `img.jpg` exists in the current directory
img = transforms.Compose([
    transforms.Resize((384, 384)),
    transforms.ToTensor(),
    transforms.Normalize(0.5, 0.5),
])(Image.open('img.jpg')).unsqueeze(0)
print(img.shape)  # torch.Size([1, 3, 384, 384])

# Classify
with torch.no_grad():
    outputs = model(img)
print(outputs.shape)  # torch.Size([1, 1000])
```
#### ImageNet
See `examples/imagenet` for details about evaluating on ImageNet.
#### Credit
Other great repositories with this model include:
- [Ross Wightman's repo](https://github.com/rwightman/pytorch-image-models)
- [Phil Wang's repo](https://github.com/lucidrains/vit-pytorch)
### Contributing
If you find a bug, create a GitHub issue, or even better, submit a pull request. Similarly, if you have questions, simply post them as GitHub issues.
I look forward to seeing what the community does with these models!
%prep
%autosetup -n pytorch-pretrained-vit-0.0.7
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-pytorch-pretrained-vit -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Tue Jun 20 2023 Python_Bot - 0.0.7-1
- Package Spec generated