%global _empty_manifest_terminate_build 0
Name:           python-commoncorrections
Version:        1.0.12
Release:        1
Summary:        A small python implementation of common ASR corrections
License:        Apache Software License
URL:            https://github.com/robmsmt/CommonCorrections
Source0:        https://mirrors.nju.edu.cn/pypi/web/packages/1e/e5/dbc5090c12042e0413311b378329b9777971faaed225ee09e292e6e7c814/commoncorrections-1.0.12.tar.gz
BuildArch:      noarch

Requires:       python3-pandas
Requires:       python3-inflect
Requires:       python3-requests

%description
# CC - CommonCorrections

A simple repo that is used to correct common ASR outputs. The aim is not on mistakes but different ways of transcribing the same thing with a focus on how something may sound as opposed to the shortened form. The primary use case is to align the ground-truth and output from ASRs just before the WER is calculated.

### Static Examples
```text
there's -> there is
google.com -> google dot com
```

### Dynamic Examples
```text
1 2 3 -> one two three
53.4 -> fifty three point four
23:59 -> twenty three fifty nine
```

## Features
1. Designed to be used and fast (ish) with Pandas dataframes
2. Lots of built in corrections for free
3. Ability to easily extend with private corrections

## Getting Started
1. Install with: `pip install commoncorrections`
2. Import with: `from commoncorrections import CommonCorrections`

## Usage Examples

Turn numbers into words:
```python
>>> cc = CommonCorrections()
>>> print(cc.correct_str("1 2 3"))
one two three
```

Turn times into words:
```python
>>> cc = CommonCorrections()
>>> print(cc.correct_str("23:59"))
twenty three fifty nine
```

Correct a pandas dataframe:
```python
df = pd.DataFrame(data={"transcript": ['5 4 3', "123 the time is 1:23"],
                        "asr_1": ["five four three", "one two three the time is one twenty three"],
                        "filename": ["./my_local_file.wav", "file2.wav"]})
cc = CommonCorrections()

# to correct only specific columns
new_df = cc.correct_df(df, column_list=['transcript', 'asr_1'])

# to apply to whole dataframe
new_whole_df = cc.correct_df(df)
```

## mypy Type Checks
I tested installing mypy to check that types are compatible
```bash
(py) rob@rob-T480s:~/projects/CommonCorrections/commoncorrections (master)$ mypy commoncorrections.py
Success: no issues found in 1 source file
```

## Change Log
- v1.0.0 - First release
- v1.0.1 - Fixed packaging issue
- v1.0.3 - Fixed pip packaging issue
- v1.0.4 - Fixed pip packaging issue
- v1.0.5 - Fixed issue single digits
- v1.0.6 - Fixed case where dataframe contains a non-str type (e.g. int)
- v1.0.7 - Fixed adding additional dict works and added print(cc) object
- v1.0.8 - Fixed print bug with repl
- v1.0.9 - Added some words with space in default corrections csv
- v1.0.10 - Typo in some corrections
- v1.0.11 - Added test case and fixed mistake
- v1.0.12 - Fixed pinning requirements

%package -n python3-commoncorrections
Summary:        A small python implementation of common ASR corrections
Provides:       python-commoncorrections
BuildRequires:  python3-devel
BuildRequires:  python3-setuptools
BuildRequires:  python3-pip

%description -n python3-commoncorrections
# CC - CommonCorrections

A simple repo that is used to correct common ASR outputs. The aim is not on mistakes but different ways of transcribing the same thing with a focus on how something may sound as opposed to the shortened form. The primary use case is to align the ground-truth and output from ASRs just before the WER is calculated.

### Static Examples
```text
there's -> there is
google.com -> google dot com
```

### Dynamic Examples
```text
1 2 3 -> one two three
53.4 -> fifty three point four
23:59 -> twenty three fifty nine
```

## Features
1. Designed to be used and fast (ish) with Pandas dataframes
2. Lots of built in corrections for free
3. Ability to easily extend with private corrections

## Getting Started
1. Install with: `pip install commoncorrections`
2. Import with: `from commoncorrections import CommonCorrections`

## Usage Examples

Turn numbers into words:
```python
>>> cc = CommonCorrections()
>>> print(cc.correct_str("1 2 3"))
one two three
```

Turn times into words:
```python
>>> cc = CommonCorrections()
>>> print(cc.correct_str("23:59"))
twenty three fifty nine
```

Correct a pandas dataframe:
```python
df = pd.DataFrame(data={"transcript": ['5 4 3', "123 the time is 1:23"],
                        "asr_1": ["five four three", "one two three the time is one twenty three"],
                        "filename": ["./my_local_file.wav", "file2.wav"]})
cc = CommonCorrections()

# to correct only specific columns
new_df = cc.correct_df(df, column_list=['transcript', 'asr_1'])

# to apply to whole dataframe
new_whole_df = cc.correct_df(df)
```

## mypy Type Checks
I tested installing mypy to check that types are compatible
```bash
(py) rob@rob-T480s:~/projects/CommonCorrections/commoncorrections (master)$ mypy commoncorrections.py
Success: no issues found in 1 source file
```

## Change Log
- v1.0.0 - First release
- v1.0.1 - Fixed packaging issue
- v1.0.3 - Fixed pip packaging issue
- v1.0.4 - Fixed pip packaging issue
- v1.0.5 - Fixed issue single digits
- v1.0.6 - Fixed case where dataframe contains a non-str type (e.g. int)
- v1.0.7 - Fixed adding additional dict works and added print(cc) object
- v1.0.8 - Fixed print bug with repl
- v1.0.9 - Added some words with space in default corrections csv
- v1.0.10 - Typo in some corrections
- v1.0.11 - Added test case and fixed mistake
- v1.0.12 - Fixed pinning requirements

%package help
Summary:        Development documents and examples for commoncorrections
Provides:       python3-commoncorrections-doc

%description help
# CC - CommonCorrections

A simple repo that is used to correct common ASR outputs. The aim is not on mistakes but different ways of transcribing the same thing with a focus on how something may sound as opposed to the shortened form. The primary use case is to align the ground-truth and output from ASRs just before the WER is calculated.

### Static Examples
```text
there's -> there is
google.com -> google dot com
```

### Dynamic Examples
```text
1 2 3 -> one two three
53.4 -> fifty three point four
23:59 -> twenty three fifty nine
```

## Features
1. Designed to be used and fast (ish) with Pandas dataframes
2. Lots of built in corrections for free
3. Ability to easily extend with private corrections

## Getting Started
1. Install with: `pip install commoncorrections`
2. Import with: `from commoncorrections import CommonCorrections`

## Usage Examples

Turn numbers into words:
```python
>>> cc = CommonCorrections()
>>> print(cc.correct_str("1 2 3"))
one two three
```

Turn times into words:
```python
>>> cc = CommonCorrections()
>>> print(cc.correct_str("23:59"))
twenty three fifty nine
```

Correct a pandas dataframe:
```python
df = pd.DataFrame(data={"transcript": ['5 4 3', "123 the time is 1:23"],
                        "asr_1": ["five four three", "one two three the time is one twenty three"],
                        "filename": ["./my_local_file.wav", "file2.wav"]})
cc = CommonCorrections()

# to correct only specific columns
new_df = cc.correct_df(df, column_list=['transcript', 'asr_1'])

# to apply to whole dataframe
new_whole_df = cc.correct_df(df)
```

## mypy Type Checks
I tested installing mypy to check that types are compatible
```bash
(py) rob@rob-T480s:~/projects/CommonCorrections/commoncorrections (master)$ mypy commoncorrections.py
Success: no issues found in 1 source file
```

## Change Log
- v1.0.0 - First release
- v1.0.1 - Fixed packaging issue
- v1.0.3 - Fixed pip packaging issue
- v1.0.4 - Fixed pip packaging issue
- v1.0.5 - Fixed issue single digits
- v1.0.6 - Fixed case where dataframe contains a non-str type (e.g. int)
- v1.0.7 - Fixed adding additional dict works and added print(cc) object
- v1.0.8 - Fixed print bug with repl
- v1.0.9 - Added some words with space in default corrections csv
- v1.0.10 - Typo in some corrections
- v1.0.11 - Added test case and fixed mistake
- v1.0.12 - Fixed pinning requirements

%prep
%autosetup -n commoncorrections-1.0.12

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-commoncorrections -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue May 30 2023 Python_Bot - 1.0.12-1
- Package Spec generated