Diffstat (limited to 'python-sru.spec')
-rw-r--r-- | python-sru.spec | 395
1 file changed, 395 insertions, 0 deletions
diff --git a/python-sru.spec b/python-sru.spec
new file mode 100644
index 0000000..e97c05f
--- /dev/null
+++ b/python-sru.spec
@@ -0,0 +1,395 @@
+%global _empty_manifest_terminate_build 0
+Name: python-sru
+Version: 2.6.0
+Release: 1
+Summary: Simple Recurrent Units for Highly Parallelizable Recurrence
+License: MIT
+URL: https://github.com/taolei87/sru
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/40/ca/7537e0ef8c3361402b1787474f0960521d4de82673ab45c1f11909e1c7a1/sru-2.6.0.tar.gz
+BuildArch: noarch
+
+Requires: python3-torch
+Requires: python3-ninja
+
+%description
+
+## News
+SRU++, a new SRU variant, has been released. [[tech report](https://arxiv.org/pdf/2102.12459.pdf)] [[blog](https://www.asapp.com/blog/reducing-the-high-cost-of-training-nlp-models-with-sru/)]
+
+The experimental code and SRU++ implementation are available on [the dev branch](https://github.com/asappresearch/sru/tree/3.0.0-dev/experiments/srupp_experiments), which will be merged into master later.
+
+## About
+
+**SRU** is a recurrent unit that can run over 10 times faster than cuDNN LSTM, with no loss of accuracy on the many tasks tested.
+<p align="center">
+<img width=620 src="https://raw.githubusercontent.com/taolei87/sru/master/imgs/speed.png"><br>
+<i>Average processing time of LSTM, conv2d and SRU, tested on a GTX 1070</i><br>
+</p>
+For example, the figure above shows the processing time of a single mini-batch of 32 samples. SRU achieves a 10- to 16-fold speed-up over LSTM and runs as fast as (or faster than) word-level convolution using conv2d.
+
+#### Reference:
+Simple Recurrent Units for Highly Parallelizable Recurrence [[paper](https://arxiv.org/abs/1709.02755)]
+```
+@inproceedings{lei2018sru,
+  title={Simple Recurrent Units for Highly Parallelizable Recurrence},
+  author={Tao Lei and Yu Zhang and Sida I. Wang and Hui Dai and Yoav Artzi},
+  booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
+  year={2018}
+}
+```
+
+When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute [[paper](https://arxiv.org/pdf/2102.12459)]
+```
+@article{lei2021srupp,
+  title={When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute},
+  author={Tao Lei},
+  journal={arXiv preprint arXiv:2102.12459},
+  year={2021}
+}
+```
+<br>
+
+## Requirements
+ - [PyTorch](http://pytorch.org/) >=1.6 recommended
+ - [ninja](https://ninja-build.org/)
+
+Install requirements via `pip install -r requirements.txt`.
+
+<br>
+
+## Installation
+
+#### From source:
+SRU can be installed as a regular package via `python setup.py install` or `pip install .`.
+
+#### From PyPI:
+`pip install sru`
+
+
+#### Use the source directly without installation:
+Make sure this repo and the CUDA library can be found by the system, e.g.
+```
+export PYTHONPATH=path_to_repo/sru
+export LD_LIBRARY_PATH=/usr/local/cuda/lib64
+```
+
+<br>
+
+## Examples
+The usage of SRU is similar to `nn.LSTM`. SRU likely requires more stacked layers than LSTM. We recommend starting with 2 layers and adding more if necessary (see our report for more experimental details).
+```python
+import torch
+from sru import SRU, SRUCell
+
+# random input: sequence length 20, batch size 32, feature dimension 128
+x = torch.randn(20, 32, 128).cuda()
+
+input_size, hidden_size = 128, 128
+
+rnn = SRU(input_size, hidden_size,
+          num_layers=2,          # number of stacked RNN layers
+          dropout=0.0,           # dropout applied between RNN layers
+          bidirectional=False,   # bidirectional RNN
+          layer_norm=False,      # apply layer normalization on the output of each layer
+          highway_bias=-2,       # initial bias of the highway gate (<= 0)
+)
+rnn.cuda()
+
+output_states, c_states = rnn(x)  # forward pass
+
+# output_states has shape (length, batch size, number of directions * hidden size)
+# c_states has shape (layers, batch size, number of directions * hidden size)
+```
+
+<br>
+
+## Contributing
+Please read and follow the [guidelines](CONTRIBUTING.md).
+
+
+### Other Implementations
+
+[@musyoku](https://github.com/musyoku) has a very nice [SRU implementation](https://github.com/musyoku/chainer-sru) in Chainer.
+
+[@adrianbg](https://github.com/adrianbg) implemented the first [CPU version](https://github.com/taolei87/sru/pull/42).
+
+<br>
+
+
+%package -n python3-sru
+Summary: Simple Recurrent Units for Highly Parallelizable Recurrence
+Provides: python-sru
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-sru
+
+## News
+SRU++, a new SRU variant, has been released. [[tech report](https://arxiv.org/pdf/2102.12459.pdf)] [[blog](https://www.asapp.com/blog/reducing-the-high-cost-of-training-nlp-models-with-sru/)]
+
+The experimental code and SRU++ implementation are available on [the dev branch](https://github.com/asappresearch/sru/tree/3.0.0-dev/experiments/srupp_experiments), which will be merged into master later.
+
+## About
+
+**SRU** is a recurrent unit that can run over 10 times faster than cuDNN LSTM, with no loss of accuracy on the many tasks tested.
+<p align="center">
+<img width=620 src="https://raw.githubusercontent.com/taolei87/sru/master/imgs/speed.png"><br>
+<i>Average processing time of LSTM, conv2d and SRU, tested on a GTX 1070</i><br>
+</p>
+For example, the figure above shows the processing time of a single mini-batch of 32 samples. SRU achieves a 10- to 16-fold speed-up over LSTM and runs as fast as (or faster than) word-level convolution using conv2d.
+
+#### Reference:
+Simple Recurrent Units for Highly Parallelizable Recurrence [[paper](https://arxiv.org/abs/1709.02755)]
+```
+@inproceedings{lei2018sru,
+  title={Simple Recurrent Units for Highly Parallelizable Recurrence},
+  author={Tao Lei and Yu Zhang and Sida I. Wang and Hui Dai and Yoav Artzi},
+  booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
+  year={2018}
+}
+```
+
+When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute [[paper](https://arxiv.org/pdf/2102.12459)]
+```
+@article{lei2021srupp,
+  title={When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute},
+  author={Tao Lei},
+  journal={arXiv preprint arXiv:2102.12459},
+  year={2021}
+}
+```
+<br>
+
+## Requirements
+ - [PyTorch](http://pytorch.org/) >=1.6 recommended
+ - [ninja](https://ninja-build.org/)
+
+Install requirements via `pip install -r requirements.txt`.
+
+<br>
+
+## Installation
+
+#### From source:
+SRU can be installed as a regular package via `python setup.py install` or `pip install .`.
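+
+For example, a from-source install from a fresh clone might look like the following (using the repository URL listed above; the `sru` directory name is simply what `git clone` creates by default):
+```
+git clone https://github.com/taolei87/sru
+cd sru
+pip install .
+```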
+
+#### From PyPI:
+`pip install sru`
+
+
+#### Use the source directly without installation:
+Make sure this repo and the CUDA library can be found by the system, e.g.
+```
+export PYTHONPATH=path_to_repo/sru
+export LD_LIBRARY_PATH=/usr/local/cuda/lib64
+```
+
+<br>
+
+## Examples
+The usage of SRU is similar to `nn.LSTM`. SRU likely requires more stacked layers than LSTM. We recommend starting with 2 layers and adding more if necessary (see our report for more experimental details).
+```python
+import torch
+from sru import SRU, SRUCell
+
+# random input: sequence length 20, batch size 32, feature dimension 128
+x = torch.randn(20, 32, 128).cuda()
+
+input_size, hidden_size = 128, 128
+
+rnn = SRU(input_size, hidden_size,
+          num_layers=2,          # number of stacked RNN layers
+          dropout=0.0,           # dropout applied between RNN layers
+          bidirectional=False,   # bidirectional RNN
+          layer_norm=False,      # apply layer normalization on the output of each layer
+          highway_bias=-2,       # initial bias of the highway gate (<= 0)
+)
+rnn.cuda()
+
+output_states, c_states = rnn(x)  # forward pass
+
+# output_states has shape (length, batch size, number of directions * hidden size)
+# c_states has shape (layers, batch size, number of directions * hidden size)
+```
+
+<br>
+
+## Contributing
+Please read and follow the [guidelines](CONTRIBUTING.md).
+
+
+### Other Implementations
+
+[@musyoku](https://github.com/musyoku) has a very nice [SRU implementation](https://github.com/musyoku/chainer-sru) in Chainer.
+
+[@adrianbg](https://github.com/adrianbg) implemented the first [CPU version](https://github.com/taolei87/sru/pull/42).
+
+<br>
+
+
+%package help
+Summary: Development documents and examples for sru
+Provides: python3-sru-doc
+%description help
+
+## News
+SRU++, a new SRU variant, has been released. [[tech report](https://arxiv.org/pdf/2102.12459.pdf)] [[blog](https://www.asapp.com/blog/reducing-the-high-cost-of-training-nlp-models-with-sru/)]
+
+The experimental code and SRU++ implementation are available on [the dev branch](https://github.com/asappresearch/sru/tree/3.0.0-dev/experiments/srupp_experiments), which will be merged into master later.
+
+## About
+
+**SRU** is a recurrent unit that can run over 10 times faster than cuDNN LSTM, with no loss of accuracy on the many tasks tested.
+<p align="center">
+<img width=620 src="https://raw.githubusercontent.com/taolei87/sru/master/imgs/speed.png"><br>
+<i>Average processing time of LSTM, conv2d and SRU, tested on a GTX 1070</i><br>
+</p>
+For example, the figure above shows the processing time of a single mini-batch of 32 samples. SRU achieves a 10- to 16-fold speed-up over LSTM and runs as fast as (or faster than) word-level convolution using conv2d.
+
+#### Reference:
+Simple Recurrent Units for Highly Parallelizable Recurrence [[paper](https://arxiv.org/abs/1709.02755)]
+```
+@inproceedings{lei2018sru,
+  title={Simple Recurrent Units for Highly Parallelizable Recurrence},
+  author={Tao Lei and Yu Zhang and Sida I. Wang and Hui Dai and Yoav Artzi},
+  booktitle={Empirical Methods in Natural Language Processing (EMNLP)},
+  year={2018}
+}
+```
+
+When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute [[paper](https://arxiv.org/pdf/2102.12459)]
+```
+@article{lei2021srupp,
+  title={When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute},
+  author={Tao Lei},
+  journal={arXiv preprint arXiv:2102.12459},
+  year={2021}
+}
+```
+<br>
+
+## Requirements
+ - [PyTorch](http://pytorch.org/) >=1.6 recommended
+ - [ninja](https://ninja-build.org/)
+
+Install requirements via `pip install -r requirements.txt`.
+
+<br>
+
+## Installation
+
+#### From source:
+SRU can be installed as a regular package via `python setup.py install` or `pip install .`.
+
+#### From PyPI:
+`pip install sru`
+
+
+#### Use the source directly without installation:
+Make sure this repo and the CUDA library can be found by the system, e.g.
+```
+export PYTHONPATH=path_to_repo/sru
+export LD_LIBRARY_PATH=/usr/local/cuda/lib64
+```
+
+<br>
+
+## Examples
+The usage of SRU is similar to `nn.LSTM`. SRU likely requires more stacked layers than LSTM. We recommend starting with 2 layers and adding more if necessary (see our report for more experimental details).
+```python
+import torch
+from sru import SRU, SRUCell
+
+# random input: sequence length 20, batch size 32, feature dimension 128
+x = torch.randn(20, 32, 128).cuda()
+
+input_size, hidden_size = 128, 128
+
+rnn = SRU(input_size, hidden_size,
+          num_layers=2,          # number of stacked RNN layers
+          dropout=0.0,           # dropout applied between RNN layers
+          bidirectional=False,   # bidirectional RNN
+          layer_norm=False,      # apply layer normalization on the output of each layer
+          highway_bias=-2,       # initial bias of the highway gate (<= 0)
+)
+rnn.cuda()
+
+output_states, c_states = rnn(x)  # forward pass
+
+# output_states has shape (length, batch size, number of directions * hidden size)
+# c_states has shape (layers, batch size, number of directions * hidden size)
+```
+
+<br>
+
+## Contributing
+Please read and follow the [guidelines](CONTRIBUTING.md).
+
+
+### Other Implementations
+
+[@musyoku](https://github.com/musyoku) has a very nice [SRU implementation](https://github.com/musyoku/chainer-sru) in Chainer.
+
+[@adrianbg](https://github.com/adrianbg) implemented the first [CPU version](https://github.com/taolei87/sru/pull/42).
+
+<br>
+
+
+%prep
+%autosetup -n sru-2.6.0
+
+%build
+%py3_build
+
+%install
+%py3_install
+# Ship any upstream doc/example directories as package documentation
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+# Collect the installed files into filelist.lst and doclist.lst
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
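+
+# A possible %%check sketch (an assumption, not part of the generated spec): a
+# minimal import smoke test. It would need python3-torch in BuildRequires, so
+# it is left commented out here.
+#%%check
+#python3 -c "import sru"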
+
+%files -n python3-sru -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 2.6.0-1
+- Package Spec generated