%global _empty_manifest_terminate_build 0
Name: python-word-forms
Version: 2.1.0
Release: 1
Summary: Generate all possible forms of an English word.
License: MIT
URL: https://github.com/gutfeeling/word_forms
Source0: https://mirrors.aliyun.com/pypi/web/packages/31/39/e0f24b7c3f228561b346ae8c046817ff3d3929d77b0c3ca14a12e4d106b2/word_forms-2.1.0.tar.gz
BuildArch: noarch
Requires: python3-inflect
Requires: python3-nltk
%description
## Accurately generate all possible forms of an English word
Word Forms can accurately generate all possible forms of an English word. It can conjugate verbs, connect different
parts of speech (e.g. noun to adjective, adjective to adverb, noun to verb), and pluralize singular nouns. It does all of this in one function. Enjoy!
## Examples
Some very timely examples :-P
```python
>>> from word_forms.word_forms import get_word_forms
>>> get_word_forms("president")
{'n': {'presidents', 'presidentships', 'presidencies', 'presidentship', 'president', 'presidency'},
 'a': {'presidential'},
 'v': {'preside', 'presided', 'presiding', 'presides'},
 'r': {'presidentially'}}
>>> get_word_forms("elect")
{'n': {'elects', 'electives', 'electors', 'elect', 'eligibilities', 'electorates', 'eligibility', 'elector', 'election', 'elections', 'electorate', 'elective'},
 'a': {'eligible', 'electoral', 'elective', 'elect'},
 'v': {'electing', 'elects', 'elected', 'elect'},
 'r': set()}
>>> get_word_forms("politician")
{'n': {'politician', 'politics', 'politicians'},
 'a': {'political'},
 'v': set(),
 'r': {'politically'}}
>>> get_word_forms("am")
{'n': {'being', 'beings'},
 'a': set(),
 'v': {'was', 'be', "weren't", 'am', "wasn't", "aren't", 'being', 'were', 'is', "isn't", 'been', 'are', 'am not'},
 'r': set()}
>>> get_word_forms("ran")
{'n': {'run', 'runniness', 'runner', 'runninesses', 'running', 'runners', 'runnings', 'runs'},
 'a': {'running', 'runny'},
 'v': {'running', 'run', 'ran', 'runs'},
 'r': set()}
>>> get_word_forms('continent', 0.8)  # with configurable similarity threshold
{'n': {'continents', 'continency', 'continences', 'continent', 'continencies', 'continence'},
 'a': {'continental', 'continent'},
 'v': set(),
 'r': set()}
```
As you can see, the output is a dictionary with four keys. "r" stands for adverb, "a" for adjective, "n" for noun
and "v" for verb. Don't ask me why "r" stands for adverb. This is what WordNet uses, so this is why I use it too :-)
Help can be obtained at any time by typing the following:
```python
>>> help(get_word_forms)
```
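Since the returned value is a plain dictionary of sets, it is easy to post-process. As a small sketch (the sample dict below is copied from the "politician" output above), here is how you might flatten a result into one set of related words, or report it grouped by part of speech:

```python
# Map WordNet's part-of-speech keys to readable names.
POS_NAMES = {"n": "noun", "v": "verb", "a": "adjective", "r": "adverb"}

# Sample output of get_word_forms("politician"), copied from above.
forms = {
    "n": {"politician", "politics", "politicians"},
    "a": {"political"},
    "v": set(),
    "r": {"politically"},
}

# Flatten every part of speech into one set of related words.
all_forms = set().union(*forms.values())
print(sorted(all_forms))

# Or report the forms grouped by part of speech.
for pos, words in forms.items():
    print(f"{POS_NAMES[pos]}: {sorted(words)}")
```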
## Why?
In Natural Language Processing and Search, one often needs to treat words like "run" and "ran", "love" and "lovable"
or "politician" and "politics" as the same word. This is usually done by algorithmically reducing each word into a
base word and then comparing the base words. The process is called Stemming.
For example, the [Porter Stemmer](http://text-processing.com/demo/stem/) reduces both "love" and "lovely"
into the base word "love".
Stemmers have several shortcomings. First, the base word produced by a Stemmer is not always a valid English word.
For example, the Porter Stemmer reduces the word "operation" to "oper". Second, Stemmers have a high false negative rate.
For example, "run" is reduced to "run" and "ran" is reduced to "ran", so the two forms are never matched. This happens because Stemmers use a set of
rational rules for finding the base words, and as we all know, the English language does not always behave very rationally.
Lemmatizers are more accurate than Stemmers because they produce a base form that is present in the dictionary (also called the Lemma), so the reduced word is always a valid English word. However, Lemmatizers also have false negatives because they are not very good at connecting words across different parts of speech. The [WordNet Lemmatizer](http://textanalysisonline.com/nltk-wordnet-lemmatizer) included with NLTK fails at almost all such examples: "operations" is reduced to "operation" and "operate" is reduced to "operate".
Word Forms tries to solve this problem by finding all possible forms of a given English word. It can perform verb conjugations, connect noun forms to verb forms, adjective forms and adverb forms, pluralize singular forms, etc.
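To make those stemmer failure modes concrete, here is a toy suffix-stripping stemmer (purely illustrative; this is not the real Porter algorithm). It produces the non-word "oper" and fails to connect "ran" to "run":

```python
def naive_stem(word):
    """Strip the first matching suffix; a crude stand-in for a real stemmer."""
    for suffix in ("ation", "ing", "ly", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print(naive_stem("operation"))            # 'oper' is not a valid English word
print(naive_stem("run"), naive_stem("ran"))  # 'run' vs. 'ran': a false negative
```

No fixed list of suffix rules can know that the base of "ran" is "run", which is why rule-based stemming has irreducible false negatives.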
## Bonus: A simple lemmatizer
We also offer a very simple lemmatizer based on ``word_forms``. Here is how to use it.
```python
>>> from word_forms.lemmatizer import lemmatize
>>> lemmatize("operations")
'operant'
>>> lemmatize("operate")
'operant'
```
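One way such a lemmatizer can work (a hypothetical sketch, not necessarily this package's exact algorithm) is to collect every form of the word and return the shortest one:

```python
def shortest_form_lemma(word, forms):
    """Return the shortest related form, breaking ties alphabetically.

    `forms` is a get_word_forms-style dict of part-of-speech -> set of words.
    Illustrative strategy only; word_forms' real lemmatizer may differ.
    """
    candidates = set().union(*forms.values()) or {word}
    return min(candidates, key=lambda w: (len(w), w))

# Sample forms for the "elect" family, trimmed from the example output above.
forms_of_elect = {
    "n": {"election", "elections", "elector", "elect"},
    "a": {"electoral", "elect"},
    "v": {"elected", "elects", "electing", "elect"},
    "r": set(),
}
print(shortest_form_lemma("elections", forms_of_elect))  # 'elect'
```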
Enjoy!
## Compatibility
Tested on Python 3
## Installation
Using `pip`:
```
pip install -U word_forms
```
### From source
Or you can install it from source:
1. Clone the repository:
```
git clone https://github.com/gutfeeling/word_forms.git
```
2. Install it using `pip` or `setup.py`
```
pip install -e word_forms
# or
cd word_forms
python setup.py install
```
## Acknowledgement
1. [The XTAG project](http://www.cis.upenn.edu/~xtag/) for information on [verb conjugations](word_forms/en-verbs.txt).
2. [WordNet](http://wordnet.princeton.edu/)
## Maintainer
Hi, I am Dibya and I maintain this repository. I would love to hear from you. Feel free to get in touch with me
at dibyachakravorty@gmail.com.
## Contributors
- Tom Aarsen @CubieDev is a major contributor and is singlehandedly responsible for v2.0.0.
- Sajal Sharma @sajal2692 is a major contributor.
## Contributions
Word Forms is not perfect. In particular, a couple of aspects can be improved.
1. It sometimes generates non-dictionary words like "runninesses" because the pluralization/singularization algorithm is
not perfect. At the moment, I am using [inflect](https://pypi.python.org/pypi/inflect) for this.
If you like this package, feel free to contribute. Your pull requests are most welcome.
%package -n python3-word-forms
Summary: Generate all possible forms of an English word.
Provides: python-word-forms
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-word-forms
## Accurately generate all possible forms of an English word
Word Forms can accurately generate all possible forms of an English word. It can conjugate verbs, connect different
parts of speech (e.g. noun to adjective, adjective to adverb, noun to verb), and pluralize singular nouns. It does all of this in one function. Enjoy!
## Examples
Some very timely examples :-P
```python
>>> from word_forms.word_forms import get_word_forms
>>> get_word_forms("president")
{'n': {'presidents', 'presidentships', 'presidencies', 'presidentship', 'president', 'presidency'},
 'a': {'presidential'},
 'v': {'preside', 'presided', 'presiding', 'presides'},
 'r': {'presidentially'}}
>>> get_word_forms("elect")
{'n': {'elects', 'electives', 'electors', 'elect', 'eligibilities', 'electorates', 'eligibility', 'elector', 'election', 'elections', 'electorate', 'elective'},
 'a': {'eligible', 'electoral', 'elective', 'elect'},
 'v': {'electing', 'elects', 'elected', 'elect'},
 'r': set()}
>>> get_word_forms("politician")
{'n': {'politician', 'politics', 'politicians'},
 'a': {'political'},
 'v': set(),
 'r': {'politically'}}
>>> get_word_forms("am")
{'n': {'being', 'beings'},
 'a': set(),
 'v': {'was', 'be', "weren't", 'am', "wasn't", "aren't", 'being', 'were', 'is', "isn't", 'been', 'are', 'am not'},
 'r': set()}
>>> get_word_forms("ran")
{'n': {'run', 'runniness', 'runner', 'runninesses', 'running', 'runners', 'runnings', 'runs'},
 'a': {'running', 'runny'},
 'v': {'running', 'run', 'ran', 'runs'},
 'r': set()}
>>> get_word_forms('continent', 0.8)  # with configurable similarity threshold
{'n': {'continents', 'continency', 'continences', 'continent', 'continencies', 'continence'},
 'a': {'continental', 'continent'},
 'v': set(),
 'r': set()}
```
As you can see, the output is a dictionary with four keys. "r" stands for adverb, "a" for adjective, "n" for noun
and "v" for verb. Don't ask me why "r" stands for adverb. This is what WordNet uses, so this is why I use it too :-)
Help can be obtained at any time by typing the following:
```python
>>> help(get_word_forms)
```
## Why?
In Natural Language Processing and Search, one often needs to treat words like "run" and "ran", "love" and "lovable"
or "politician" and "politics" as the same word. This is usually done by algorithmically reducing each word into a
base word and then comparing the base words. The process is called Stemming.
For example, the [Porter Stemmer](http://text-processing.com/demo/stem/) reduces both "love" and "lovely"
into the base word "love".
Stemmers have several shortcomings. First, the base word produced by a Stemmer is not always a valid English word.
For example, the Porter Stemmer reduces the word "operation" to "oper". Second, Stemmers have a high false negative rate.
For example, "run" is reduced to "run" and "ran" is reduced to "ran", so the two forms are never matched. This happens because Stemmers use a set of
rational rules for finding the base words, and as we all know, the English language does not always behave very rationally.
Lemmatizers are more accurate than Stemmers because they produce a base form that is present in the dictionary (also called the Lemma), so the reduced word is always a valid English word. However, Lemmatizers also have false negatives because they are not very good at connecting words across different parts of speech. The [WordNet Lemmatizer](http://textanalysisonline.com/nltk-wordnet-lemmatizer) included with NLTK fails at almost all such examples: "operations" is reduced to "operation" and "operate" is reduced to "operate".
Word Forms tries to solve this problem by finding all possible forms of a given English word. It can perform verb conjugations, connect noun forms to verb forms, adjective forms and adverb forms, pluralize singular forms, etc.
## Bonus: A simple lemmatizer
We also offer a very simple lemmatizer based on ``word_forms``. Here is how to use it.
```python
>>> from word_forms.lemmatizer import lemmatize
>>> lemmatize("operations")
'operant'
>>> lemmatize("operate")
'operant'
```
Enjoy!
## Compatibility
Tested on Python 3
## Installation
Using `pip`:
```
pip install -U word_forms
```
### From source
Or you can install it from source:
1. Clone the repository:
```
git clone https://github.com/gutfeeling/word_forms.git
```
2. Install it using `pip` or `setup.py`
```
pip install -e word_forms
# or
cd word_forms
python setup.py install
```
## Acknowledgement
1. [The XTAG project](http://www.cis.upenn.edu/~xtag/) for information on [verb conjugations](word_forms/en-verbs.txt).
2. [WordNet](http://wordnet.princeton.edu/)
## Maintainer
Hi, I am Dibya and I maintain this repository. I would love to hear from you. Feel free to get in touch with me
at dibyachakravorty@gmail.com.
## Contributors
- Tom Aarsen @CubieDev is a major contributor and is singlehandedly responsible for v2.0.0.
- Sajal Sharma @sajal2692 is a major contributor.
## Contributions
Word Forms is not perfect. In particular, a couple of aspects can be improved.
1. It sometimes generates non-dictionary words like "runninesses" because the pluralization/singularization algorithm is
not perfect. At the moment, I am using [inflect](https://pypi.python.org/pypi/inflect) for this.
If you like this package, feel free to contribute. Your pull requests are most welcome.
%package help
Summary: Development documents and examples for word-forms
Provides: python3-word-forms-doc
%description help
## Accurately generate all possible forms of an English word
Word Forms can accurately generate all possible forms of an English word. It can conjugate verbs, connect different
parts of speech (e.g. noun to adjective, adjective to adverb, noun to verb), and pluralize singular nouns. It does all of this in one function. Enjoy!
## Examples
Some very timely examples :-P
```python
>>> from word_forms.word_forms import get_word_forms
>>> get_word_forms("president")
{'n': {'presidents', 'presidentships', 'presidencies', 'presidentship', 'president', 'presidency'},
 'a': {'presidential'},
 'v': {'preside', 'presided', 'presiding', 'presides'},
 'r': {'presidentially'}}
>>> get_word_forms("elect")
{'n': {'elects', 'electives', 'electors', 'elect', 'eligibilities', 'electorates', 'eligibility', 'elector', 'election', 'elections', 'electorate', 'elective'},
 'a': {'eligible', 'electoral', 'elective', 'elect'},
 'v': {'electing', 'elects', 'elected', 'elect'},
 'r': set()}
>>> get_word_forms("politician")
{'n': {'politician', 'politics', 'politicians'},
 'a': {'political'},
 'v': set(),
 'r': {'politically'}}
>>> get_word_forms("am")
{'n': {'being', 'beings'},
 'a': set(),
 'v': {'was', 'be', "weren't", 'am', "wasn't", "aren't", 'being', 'were', 'is', "isn't", 'been', 'are', 'am not'},
 'r': set()}
>>> get_word_forms("ran")
{'n': {'run', 'runniness', 'runner', 'runninesses', 'running', 'runners', 'runnings', 'runs'},
 'a': {'running', 'runny'},
 'v': {'running', 'run', 'ran', 'runs'},
 'r': set()}
>>> get_word_forms('continent', 0.8)  # with configurable similarity threshold
{'n': {'continents', 'continency', 'continences', 'continent', 'continencies', 'continence'},
 'a': {'continental', 'continent'},
 'v': set(),
 'r': set()}
```
As you can see, the output is a dictionary with four keys. "r" stands for adverb, "a" for adjective, "n" for noun
and "v" for verb. Don't ask me why "r" stands for adverb. This is what WordNet uses, so this is why I use it too :-)
Help can be obtained at any time by typing the following:
```python
>>> help(get_word_forms)
```
## Why?
In Natural Language Processing and Search, one often needs to treat words like "run" and "ran", "love" and "lovable"
or "politician" and "politics" as the same word. This is usually done by algorithmically reducing each word into a
base word and then comparing the base words. The process is called Stemming.
For example, the [Porter Stemmer](http://text-processing.com/demo/stem/) reduces both "love" and "lovely"
into the base word "love".
Stemmers have several shortcomings. First, the base word produced by a Stemmer is not always a valid English word.
For example, the Porter Stemmer reduces the word "operation" to "oper". Second, Stemmers have a high false negative rate.
For example, "run" is reduced to "run" and "ran" is reduced to "ran", so the two forms are never matched. This happens because Stemmers use a set of
rational rules for finding the base words, and as we all know, the English language does not always behave very rationally.
Lemmatizers are more accurate than Stemmers because they produce a base form that is present in the dictionary (also called the Lemma), so the reduced word is always a valid English word. However, Lemmatizers also have false negatives because they are not very good at connecting words across different parts of speech. The [WordNet Lemmatizer](http://textanalysisonline.com/nltk-wordnet-lemmatizer) included with NLTK fails at almost all such examples: "operations" is reduced to "operation" and "operate" is reduced to "operate".
Word Forms tries to solve this problem by finding all possible forms of a given English word. It can perform verb conjugations, connect noun forms to verb forms, adjective forms and adverb forms, pluralize singular forms, etc.
## Bonus: A simple lemmatizer
We also offer a very simple lemmatizer based on ``word_forms``. Here is how to use it.
```python
>>> from word_forms.lemmatizer import lemmatize
>>> lemmatize("operations")
'operant'
>>> lemmatize("operate")
'operant'
```
Enjoy!
## Compatibility
Tested on Python 3
## Installation
Using `pip`:
```
pip install -U word_forms
```
### From source
Or you can install it from source:
1. Clone the repository:
```
git clone https://github.com/gutfeeling/word_forms.git
```
2. Install it using `pip` or `setup.py`
```
pip install -e word_forms
# or
cd word_forms
python setup.py install
```
## Acknowledgement
1. [The XTAG project](http://www.cis.upenn.edu/~xtag/) for information on [verb conjugations](word_forms/en-verbs.txt).
2. [WordNet](http://wordnet.princeton.edu/)
## Maintainer
Hi, I am Dibya and I maintain this repository. I would love to hear from you. Feel free to get in touch with me
at dibyachakravorty@gmail.com.
## Contributors
- Tom Aarsen @CubieDev is a major contributor and is singlehandedly responsible for v2.0.0.
- Sajal Sharma @sajal2692 is a major contributor.
## Contributions
Word Forms is not perfect. In particular, a couple of aspects can be improved.
1. It sometimes generates non-dictionary words like "runninesses" because the pluralization/singularization algorithm is
not perfect. At the moment, I am using [inflect](https://pypi.python.org/pypi/inflect) for this.
If you like this package, feel free to contribute. Your pull requests are most welcome.
%prep
%autosetup -n word_forms-2.1.0
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-word-forms -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Thu Jun 08 2023 Python_Bot - 2.1.0-1
- Package Spec generated