| author | CoprDistGit <infra@openeuler.org> | 2023-05-05 13:48:38 +0000 |
|---|---|---|
| committer | CoprDistGit <infra@openeuler.org> | 2023-05-05 13:48:38 +0000 |
| commit | 705d13e32ce6caa55ffddb172dafb1bcc306ca15 | |
| tree | 1f9ef95dac22fd9680f5aa6c3d8b663d738b11a5 | |
| parent | 51e4dfcbc0330cb21f3351f472fc01db503259d5 | |
automatic import of python-instancelib [openeuler20.03]
| -rw-r--r-- | .gitignore | 1 |
| -rw-r--r-- | python-instancelib.spec | 299 |
| -rw-r--r-- | sources | 1 |
3 files changed, 301 insertions, 0 deletions
@@ -0,0 +1 @@
+/instancelib-0.4.9.1.tar.gz
diff --git a/python-instancelib.spec b/python-instancelib.spec
new file mode 100644
index 0000000..f95b654
--- /dev/null
+++ b/python-instancelib.spec
@@ -0,0 +1,299 @@
+%global _empty_manifest_terminate_build 0
+Name: python-instancelib
+Version: 0.4.9.1
+Release: 1
+Summary: A typed dataset abstraction toolkit for machine learning projects
+License: GNU LGPL v3
+URL: https://pypi.org/project/instancelib/
+Source0: https://mirrors.nju.edu.cn/pypi/web/packages/6e/3d/7ee9dccc7fa94386539a2f528f8f2916f5061dbf82cc41a18497345578f7/instancelib-0.4.9.1.tar.gz
+BuildArch: noarch
+
+Requires: python3-numpy
+Requires: python3-pandas
+Requires: python3-h5py
+Requires: python3-scikit-learn
+Requires: python3-openpyxl
+Requires: python3-xlrd
+Requires: python3-tqdm
+Requires: python3-more-itertools
+Requires: python3-typing-extensions
+Requires: python3-gensim
+Requires: python3-tables
+
+%description
+`instancelib` provides a **generic architecture** for datasets.
+© Michiel Bron, 2021
+## Quick tour
+**Load dataset**: Load the dataset in an environment
+```python
+import instancelib as il
+text_env = il.read_excel_dataset("./datasets/testdataset.xlsx",
+                                  data_cols=["fulltext"],
+                                  label_cols=["label"])
+ds = text_env.dataset # A `dict-like` interface for instances
+labels = text_env.labels # An object that stores all labels
+labelset = labels.labelset # All labels that can be given to instances
+ins = ds[20] # Get instance with identifier key `20`
+ins_data = ins.data # Get the raw data for instance 20
+ins_vector = ins.vector # Get the vector representation for 20 if any
+ins_labels = labels.get_labels(ins)
+```
+**Dataset manipulation**: Divide the dataset in a train and test set
+```python
+train, test = text_env.train_test_split(ds, train_size=0.70)
+print(20 in train) # May be true or false, because of random sampling
+```
+**Train a model**:
+```python
+from sklearn.pipeline import Pipeline
+from sklearn.naive_bayes import MultinomialNB
+from sklearn.feature_extraction.text import TfidfTransformer, CountVectorizer
+pipeline = Pipeline([
+     ('vect', CountVectorizer()),
+     ('tfidf', TfidfTransformer()),
+     ('clf', MultinomialNB()),
+     ])
+model = il.SkLearnDataClassifier.build(pipeline, text_env)
+model.fit_provider(train, labels)
+predictions = model.predict(test)
+```
+## Installation
+See [installation.md](docs/installation.md) for an extended installation guide.
+| Method | Instructions |
+|--------|--------------|
+| `pip` | Install from [PyPI](https://pypi.org/project/instancelib/) via `pip install instancelib`. |
+| Local | Clone this repository and install via `pip install -e .` or locally run `python setup.py install`.
+## Documentation
+Full documentation of the latest version is provided at [https://instancelib.readthedocs.org](https://instancelib.readthedocs.org).
+## Example usage
+See [usage.py](usage.py) to see an example of how the package can be used.
+## Releases
+`instancelib` is officially released through [PyPI](https://pypi.org/project/instancelib/).
+See [CHANGELOG.md](CHANGELOG.md) for a full overview of the changes for each version.
+## Citation
+```bibtex
+@misc{instancelib,
+  title = {Python package instancelib},
+  author = {Michiel Bron},
+  howpublished = {\url{https://github.com/mpbron/instancelib}},
+  year = {2021}
+}
+```
+## Library usage
+This library is used in the following projects:
+- [python-allib](https://github.com/mpbron/allib). A typed Active Learning framework for Python for both Classification and Technology-Assisted Review systems.
+- [text_explainability](https://marcelrobeer.github.io/text_explainability/). A generic explainability architecture for explaining text machine learning models
+- [text_sensitivity](https://marcelrobeer.github.io/text_sensitivity/). Sensitivity testing (fairness & robustness) for text machine learning models.
+## Maintenance
+### Contributors
+- [Michiel Bron](https://www.uu.nl/staff/MPBron) (`@mpbron`)
+### Todo
+Tasks yet to be done:
+* Implement support for ONNX models
+* Implement support for Python DataLoaders
+* Make the external dataset interface more user friendly
+* Redesign LabelProvider to support more attribute levels
+* CI/CD tests
+
+%package -n python3-instancelib
+Summary: A typed dataset abstraction toolkit for machine learning projects
+Provides: python-instancelib
+BuildRequires: python3-devel
+BuildRequires: python3-setuptools
+BuildRequires: python3-pip
+%description -n python3-instancelib
+`instancelib` provides a **generic architecture** for datasets.
+© Michiel Bron, 2021
+## Quick tour
+**Load dataset**: Load the dataset in an environment
+```python
+import instancelib as il
+text_env = il.read_excel_dataset("./datasets/testdataset.xlsx",
+                                  data_cols=["fulltext"],
+                                  label_cols=["label"])
+ds = text_env.dataset # A `dict-like` interface for instances
+labels = text_env.labels # An object that stores all labels
+labelset = labels.labelset # All labels that can be given to instances
+ins = ds[20] # Get instance with identifier key `20`
+ins_data = ins.data # Get the raw data for instance 20
+ins_vector = ins.vector # Get the vector representation for 20 if any
+ins_labels = labels.get_labels(ins)
+```
+**Dataset manipulation**: Divide the dataset in a train and test set
+```python
+train, test = text_env.train_test_split(ds, train_size=0.70)
+print(20 in train) # May be true or false, because of random sampling
+```
+**Train a model**:
+```python
+from sklearn.pipeline import Pipeline
+from sklearn.naive_bayes import MultinomialNB
+from sklearn.feature_extraction.text import TfidfTransformer, CountVectorizer
+pipeline = Pipeline([
+     ('vect', CountVectorizer()),
+     ('tfidf', TfidfTransformer()),
+     ('clf', MultinomialNB()),
+     ])
+model = il.SkLearnDataClassifier.build(pipeline, text_env)
+model.fit_provider(train, labels)
+predictions = model.predict(test)
+```
+## Installation
+See [installation.md](docs/installation.md) for an extended installation guide.
+| Method | Instructions |
+|--------|--------------|
+| `pip` | Install from [PyPI](https://pypi.org/project/instancelib/) via `pip install instancelib`. |
+| Local | Clone this repository and install via `pip install -e .` or locally run `python setup.py install`.
+## Documentation
+Full documentation of the latest version is provided at [https://instancelib.readthedocs.org](https://instancelib.readthedocs.org).
+## Example usage
+See [usage.py](usage.py) to see an example of how the package can be used.
+## Releases
+`instancelib` is officially released through [PyPI](https://pypi.org/project/instancelib/).
+See [CHANGELOG.md](CHANGELOG.md) for a full overview of the changes for each version.
+## Citation
+```bibtex
+@misc{instancelib,
+  title = {Python package instancelib},
+  author = {Michiel Bron},
+  howpublished = {\url{https://github.com/mpbron/instancelib}},
+  year = {2021}
+}
+```
+## Library usage
+This library is used in the following projects:
+- [python-allib](https://github.com/mpbron/allib). A typed Active Learning framework for Python for both Classification and Technology-Assisted Review systems.
+- [text_explainability](https://marcelrobeer.github.io/text_explainability/). A generic explainability architecture for explaining text machine learning models
+- [text_sensitivity](https://marcelrobeer.github.io/text_sensitivity/). Sensitivity testing (fairness & robustness) for text machine learning models.
+## Maintenance
+### Contributors
+- [Michiel Bron](https://www.uu.nl/staff/MPBron) (`@mpbron`)
+### Todo
+Tasks yet to be done:
+* Implement support for ONNX models
+* Implement support for Python DataLoaders
+* Make the external dataset interface more user friendly
+* Redesign LabelProvider to support more attribute levels
+* CI/CD tests
+
+%package help
+Summary: Development documents and examples for instancelib
+Provides: python3-instancelib-doc
+%description help
+`instancelib` provides a **generic architecture** for datasets.
+© Michiel Bron, 2021
+## Quick tour
+**Load dataset**: Load the dataset in an environment
+```python
+import instancelib as il
+text_env = il.read_excel_dataset("./datasets/testdataset.xlsx",
+                                  data_cols=["fulltext"],
+                                  label_cols=["label"])
+ds = text_env.dataset # A `dict-like` interface for instances
+labels = text_env.labels # An object that stores all labels
+labelset = labels.labelset # All labels that can be given to instances
+ins = ds[20] # Get instance with identifier key `20`
+ins_data = ins.data # Get the raw data for instance 20
+ins_vector = ins.vector # Get the vector representation for 20 if any
+ins_labels = labels.get_labels(ins)
+```
+**Dataset manipulation**: Divide the dataset in a train and test set
+```python
+train, test = text_env.train_test_split(ds, train_size=0.70)
+print(20 in train) # May be true or false, because of random sampling
+```
+**Train a model**:
+```python
+from sklearn.pipeline import Pipeline
+from sklearn.naive_bayes import MultinomialNB
+from sklearn.feature_extraction.text import TfidfTransformer, CountVectorizer
+pipeline = Pipeline([
+     ('vect', CountVectorizer()),
+     ('tfidf', TfidfTransformer()),
+     ('clf', MultinomialNB()),
+     ])
+model = il.SkLearnDataClassifier.build(pipeline, text_env)
+model.fit_provider(train, labels)
+predictions = model.predict(test)
+```
+## Installation
+See [installation.md](docs/installation.md) for an extended installation guide.
+| Method | Instructions |
+|--------|--------------|
+| `pip` | Install from [PyPI](https://pypi.org/project/instancelib/) via `pip install instancelib`. |
+| Local | Clone this repository and install via `pip install -e .` or locally run `python setup.py install`.
+## Documentation
+Full documentation of the latest version is provided at [https://instancelib.readthedocs.org](https://instancelib.readthedocs.org).
+## Example usage
+See [usage.py](usage.py) to see an example of how the package can be used.
+## Releases
+`instancelib` is officially released through [PyPI](https://pypi.org/project/instancelib/).
+See [CHANGELOG.md](CHANGELOG.md) for a full overview of the changes for each version.
+## Citation
+```bibtex
+@misc{instancelib,
+  title = {Python package instancelib},
+  author = {Michiel Bron},
+  howpublished = {\url{https://github.com/mpbron/instancelib}},
+  year = {2021}
+}
+```
+## Library usage
+This library is used in the following projects:
+- [python-allib](https://github.com/mpbron/allib). A typed Active Learning framework for Python for both Classification and Technology-Assisted Review systems.
+- [text_explainability](https://marcelrobeer.github.io/text_explainability/). A generic explainability architecture for explaining text machine learning models
+- [text_sensitivity](https://marcelrobeer.github.io/text_sensitivity/). Sensitivity testing (fairness & robustness) for text machine learning models.
+## Maintenance
+### Contributors
+- [Michiel Bron](https://www.uu.nl/staff/MPBron) (`@mpbron`)
+### Todo
+Tasks yet to be done:
+* Implement support for ONNX models
+* Implement support for Python DataLoaders
+* Make the external dataset interface more user friendly
+* Redesign LabelProvider to support more attribute levels
+* CI/CD tests
+
+%prep
+%autosetup -n instancelib-0.4.9.1
+
+%build
+%py3_build
+
+%install
+%py3_install
+install -d -m755 %{buildroot}/%{_pkgdocdir}
+if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
+if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
+if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
+if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
+pushd %{buildroot}
+if [ -d usr/lib ]; then
+	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/lib64 ]; then
+	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/bin ]; then
+	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+if [ -d usr/sbin ]; then
+	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
+fi
+touch doclist.lst
+if [ -d usr/share/man ]; then
+	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
+fi
+popd
+mv %{buildroot}/filelist.lst .
+mv %{buildroot}/doclist.lst .
+
+%files -n python3-instancelib -f filelist.lst
+%dir %{python3_sitelib}/*
+
+%files help -f doclist.lst
+%{_docdir}/*
+
+%changelog
+* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 0.4.9.1-1
+- Package Spec generated
@@ -0,0 +1 @@
+4fd63b3f8706dd366caf713ee0142ee4 instancelib-0.4.9.1.tar.gz
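
Note on the `%install` scriptlet in the spec above: it builds `filelist.lst` and `doclist.lst` by walking the buildroot with `find`, and the two `%files` sections then consume those lists via `-f`. The Python below is a minimal sketch of that same listing logic, meant only to make the behaviour easier to follow. The function names `list_files` and `collect_file_lists` are hypothetical and are not part of the spec or of any RPM tooling; the sorted output is also a choice made here, since `find` emits entries in directory order. The `.gz` suffix on man pages mirrors the `-printf "/%h/%f.gz\n"` call, which presumably anticipates a later compression step in the build.

```python
import os
from pathlib import Path


def list_files(buildroot: Path, subdir: str, suffix: str = "") -> list[str]:
    """Approximate `find <subdir> -type f -printf "/%h/%f<suffix>\\n"` run inside buildroot."""
    root = buildroot / subdir
    if not root.is_dir():
        # Matches the `if [ -d ... ]` guards: a missing directory contributes nothing.
        return []
    found = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = Path(dirpath) / name
            # `-type f` selects regular files only, so skip symlinks and special files.
            if full.is_symlink() or not full.is_file():
                continue
            # Paths are written as they will appear on the installed system.
            found.append("/" + full.relative_to(buildroot).as_posix() + suffix)
    return sorted(found)


def collect_file_lists(buildroot: Path) -> tuple[list[str], list[str]]:
    """Return (filelist, doclist) in the spirit of filelist.lst and doclist.lst."""
    filelist: list[str] = []
    for subdir in ("usr/lib", "usr/lib64", "usr/bin", "usr/sbin"):
        filelist += list_files(buildroot, subdir)
    # Man pages are recorded with ".gz" appended, as in the scriptlet.
    doclist = list_files(buildroot, "usr/share/man", suffix=".gz")
    return filelist, doclist
```

Calling `collect_file_lists(Path("/path/to/buildroot"))` on a populated buildroot would return the two lists; in the spec itself the equivalent output is appended to the `.lst` files and moved out of the buildroot before `%files` runs.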
