Diffstat (limited to 'python-torch-stoi.spec')
| Mode | File | Lines |
| --- | --- | --- |
| -rw-r--r-- | python-torch-stoi.spec | 330 |

1 file changed, 330 insertions, 0 deletions
diff --git a/python-torch-stoi.spec b/python-torch-stoi.spec
new file mode 100644
index 0000000..c3d7ba8
--- /dev/null
+++ b/python-torch-stoi.spec
@@ -0,0 +1,330 @@
+%global _empty_manifest_terminate_build 0
+Name: python-torch-stoi
+Version: 0.1.2
+Release: 1
+Summary: Computes Short Term Objective Intelligibility in PyTorch
+License: MIT
+URL: https://github.com/mpariente/pytorch_stoi
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/4a/bb/0a3122124f18d1091274af1fe59bc77218143f4fc35fa48914e89a7431e9/torch_stoi-0.1.2.tar.gz
+BuildArch: noarch
+
+
+%description
+## PyTorch implementation of STOI
+[![Build Status][travis-badge]][travis]
+[PyPI](https://badge.fury.io/py/torch-stoi)
+
+
+Implementation of the classical and extended Short
+Term Objective Intelligibility in PyTorch.
+See also [Cees Taal's website](http://www.ceestaal.nl/code/) and
+the [python implementation](https://github.com/mpariente/pystoi).
+
+### Install
+```bash
+pip install torch_stoi
+```
+
+## Important warning
+**This implementation is intended to be used as a loss function only.**
+It doesn't replicate the exact behavior of the original metrics
+but the results should be close enough that it can be used
+as a loss function. See the Notes in the
+ [`NegSTOILoss`](./torch_stoi/stoi.py) class.
+
+Quantitative comparison coming soon hopefully :rocket:
+
+### Usage
+```python
+import torch
+from torch import nn
+from torch_stoi import NegSTOILoss
+
+sample_rate = 16000
+loss_func = NegSTOILoss(sample_rate=sample_rate)
+# Your nnet and optimizer definition here
+nnet = nn.Module()
+
+noisy_speech = torch.randn(2, 16000)
+clean_speech = torch.randn(2, 16000)
+# Estimate clean speech
+est_speech = nnet(noisy_speech)
+# Compute loss and backward (then step etc...)
+loss_batch = loss_func(est_speech, clean_speech)
+loss_batch.mean().backward()
+```
+
+### Comparing NumPy and PyTorch versions: the static test
+Values obtained with the NumPy version are compared to
+the PyTorch version in the following graphs.
+##### 8kHz
+Classic STOI measure
+
+<img src="./plots/8kHzwithVAD.png" width="400"/> <img src="./plots/8kHzwoVAD.png" width="400"/>
+
+Extended STOI measure
+
+<img src="./plots/8kHzExtendedwithVAD.png" width="400"/> <img src="./plots/8kHzExtendedwoVAD.png" width="400">
+
+##### 16kHz
+Classic STOI measure
+
+<img src="./plots/16kHzwithVAD.png" width="400"> <img src="./plots/16kHzwoVAD.png" width="400">
+
+Extended STOI measure
+
+<img src="./plots/16kHzExtendedwithVAD.png" width="400"> <img src="./plots/16kHzExtendedwoVAD.png" width="400">
+
+
+The 16kHz signals used to compare both versions contained a lot
+of silence, which explains why the match is poor without
+VAD.
+
+### Comparing NumPy and PyTorch versions: Training a DNN
+Coming in the near future
+
+### References
+* [1] C. H. Taal, R. C. Hendriks, R. Heusdens, J. Jensen, 'A Short-Time
+ Objective Intelligibility Measure for Time-Frequency Weighted Noisy Speech',
+ ICASSP 2010, Dallas, Texas.
+* [2] C. H. Taal, R. C. Hendriks, R. Heusdens, J. Jensen, 'An Algorithm for
+ Intelligibility Prediction of Time-Frequency Weighted Noisy Speech',
+ IEEE Transactions on Audio, Speech, and Language Processing, 2011.
+* [3] J. Jensen and C. H. Taal, 'An Algorithm for Predicting the
+ Intelligibility of Speech Masked by Modulated Noise Maskers',
+ IEEE Transactions on Audio, Speech and Language Processing, 2016.
+
+
+[travis]: https://travis-ci.com/mpariente/pytorch_stoi
+[travis-badge]: https://travis-ci.com/mpariente/pytorch_stoi.svg?branch=master
+
+%package -n python3-torch-stoi
+Summary: Computes Short Term Objective Intelligibility in PyTorch
+Provides: python-torch-stoi
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-torch-stoi
+## PyTorch implementation of STOI
+[![Build Status][travis-badge]][travis]
+[PyPI](https://badge.fury.io/py/torch-stoi)
+
+
+Implementation of the classical and extended Short
+Term Objective Intelligibility in PyTorch.
+See also [Cees Taal's website](http://www.ceestaal.nl/code/) and
+the [python implementation](https://github.com/mpariente/pystoi).
+
+### Install
+```bash
+pip install torch_stoi
+```
+
+## Important warning
+**This implementation is intended to be used as a loss function only.**
+It doesn't replicate the exact behavior of the original metrics
+but the results should be close enough that it can be used
+as a loss function. See the Notes in the
+ [`NegSTOILoss`](./torch_stoi/stoi.py) class.
+
+Quantitative comparison coming soon hopefully :rocket:
+
+### Usage
+```python
+import torch
+from torch import nn
+from torch_stoi import NegSTOILoss
+
+sample_rate = 16000
+loss_func = NegSTOILoss(sample_rate=sample_rate)
+# Your nnet and optimizer definition here
+nnet = nn.Module()
+
+noisy_speech = torch.randn(2, 16000)
+clean_speech = torch.randn(2, 16000)
+# Estimate clean speech
+est_speech = nnet(noisy_speech)
+# Compute loss and backward (then step etc...)
+loss_batch = loss_func(est_speech, clean_speech)
+loss_batch.mean().backward()
+```
+
+### Comparing NumPy and PyTorch versions: the static test
+Values obtained with the NumPy version are compared to
+the PyTorch version in the following graphs.
+##### 8kHz
+Classic STOI measure
+
+<img src="./plots/8kHzwithVAD.png" width="400"/> <img src="./plots/8kHzwoVAD.png" width="400"/>
+
+Extended STOI measure
+
+<img src="./plots/8kHzExtendedwithVAD.png" width="400"/> <img src="./plots/8kHzExtendedwoVAD.png" width="400">
+
+##### 16kHz
+Classic STOI measure
+
+<img src="./plots/16kHzwithVAD.png" width="400"> <img src="./plots/16kHzwoVAD.png" width="400">
+
+Extended STOI measure
+
+<img src="./plots/16kHzExtendedwithVAD.png" width="400"> <img src="./plots/16kHzExtendedwoVAD.png" width="400">
+
+
+The 16kHz signals used to compare both versions contained a lot
+of silence, which explains why the match is poor without
+VAD.
+
+### Comparing NumPy and PyTorch versions: Training a DNN
+Coming in the near future
+
+### References
+* [1] C. H. Taal, R. C. Hendriks, R. Heusdens, J. Jensen, 'A Short-Time
+ Objective Intelligibility Measure for Time-Frequency Weighted Noisy Speech',
+ ICASSP 2010, Dallas, Texas.
+* [2] C. H. Taal, R. C. Hendriks, R. Heusdens, J. Jensen, 'An Algorithm for
+ Intelligibility Prediction of Time-Frequency Weighted Noisy Speech',
+ IEEE Transactions on Audio, Speech, and Language Processing, 2011.
+* [3] J. Jensen and C. H. Taal, 'An Algorithm for Predicting the
+ Intelligibility of Speech Masked by Modulated Noise Maskers',
+ IEEE Transactions on Audio, Speech and Language Processing, 2016.
+
+
+[travis]: https://travis-ci.com/mpariente/pytorch_stoi
+[travis-badge]: https://travis-ci.com/mpariente/pytorch_stoi.svg?branch=master
+
+%package help
+Summary: Development documents and examples for torch-stoi
+Provides: python3-torch-stoi-doc
+%description help
+## PyTorch implementation of STOI
+[![Build Status][travis-badge]][travis]
+[PyPI](https://badge.fury.io/py/torch-stoi)
+
+
+Implementation of the classical and extended Short
+Term Objective Intelligibility in PyTorch.
+See also [Cees Taal's website](http://www.ceestaal.nl/code/) and
+the [python implementation](https://github.com/mpariente/pystoi).
+
+### Install
+```bash
+pip install torch_stoi
+```
+
+## Important warning
+**This implementation is intended to be used as a loss function only.**
+It doesn't replicate the exact behavior of the original metrics
+but the results should be close enough that it can be used
+as a loss function. See the Notes in the
+ [`NegSTOILoss`](./torch_stoi/stoi.py) class.
+
+Quantitative comparison coming soon hopefully :rocket:
+
+### Usage
+```python
+import torch
+from torch import nn
+from torch_stoi import NegSTOILoss
+
+sample_rate = 16000
+loss_func = NegSTOILoss(sample_rate=sample_rate)
+# Your nnet and optimizer definition here
+nnet = nn.Module()
+
+noisy_speech = torch.randn(2, 16000)
+clean_speech = torch.randn(2, 16000)
+# Estimate clean speech
+est_speech = nnet(noisy_speech)
+# Compute loss and backward (then step etc...)
+loss_batch = loss_func(est_speech, clean_speech)
+loss_batch.mean().backward()
+```
+
+### Comparing NumPy and PyTorch versions: the static test
+Values obtained with the NumPy version are compared to
+the PyTorch version in the following graphs.
+##### 8kHz
+Classic STOI measure
+
+<img src="./plots/8kHzwithVAD.png" width="400"/> <img src="./plots/8kHzwoVAD.png" width="400"/>
+
+Extended STOI measure
+
+<img src="./plots/8kHzExtendedwithVAD.png" width="400"/> <img src="./plots/8kHzExtendedwoVAD.png" width="400">
+
+##### 16kHz
+Classic STOI measure
+
+<img src="./plots/16kHzwithVAD.png" width="400"> <img src="./plots/16kHzwoVAD.png" width="400">
+
+Extended STOI measure
+
+<img src="./plots/16kHzExtendedwithVAD.png" width="400"> <img src="./plots/16kHzExtendedwoVAD.png" width="400">
+
+
+The 16kHz signals used to compare both versions contained a lot
+of silence, which explains why the match is poor without
+VAD.
+
+### Comparing NumPy and PyTorch versions: Training a DNN
+Coming in the near future
+
+### References
+* [1] C. H. Taal, R. C. Hendriks, R. Heusdens, J. Jensen, 'A Short-Time
+ Objective Intelligibility Measure for Time-Frequency Weighted Noisy Speech',
+ ICASSP 2010, Dallas, Texas.
+* [2] C. H. Taal, R. C. Hendriks, R. Heusdens, J. Jensen, 'An Algorithm for
+ Intelligibility Prediction of Time-Frequency Weighted Noisy Speech',
+ IEEE Transactions on Audio, Speech, and Language Processing, 2011.
+* [3] J. Jensen and C. H. Taal, 'An Algorithm for Predicting the
+ Intelligibility of Speech Masked by Modulated Noise Maskers',
+ IEEE Transactions on Audio, Speech and Language Processing, 2016.
+
+
+[travis]: https://travis-ci.com/mpariente/pytorch_stoi
+[travis-badge]: https://travis-ci.com/mpariente/pytorch_stoi.svg?branch=master
+
+%prep
+%autosetup -n torch-stoi-0.1.2
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-torch-stoi -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Tue May 30 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.2-1
+- Package Spec generated
