%global _empty_manifest_terminate_build 0
Name:		python-gokart
Version:	1.2.2
Release:	1
Summary:	Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipeline. [Documentation](https://gokart.readthedocs.io/en/latest/)
License:	MIT
URL:		https://github.com/m3dev/gokart
Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/a9/59/d3e333f2c6b6871187a7b15d078cbdddb72fe2f052696e6e43386a699e1b/gokart-1.2.2.tar.gz
BuildArch:	noarch
Requires:	python3-luigi
Requires:	python3-boto3
Requires:	python3-slack-sdk
Requires:	python3-pandas
Requires:	python3-numpy
Requires:	python3-tqdm
Requires:	python3-google-auth
Requires:	python3-pyarrow
Requires:	python3-uritemplate
Requires:	python3-google-api-python-client
Requires:	python3-APScheduler
Requires:	python3-redis
Requires:	python3-matplotlib
%description
# gokart
[![Test](https://github.com/m3dev/gokart/workflows/Test/badge.svg)](https://github.com/m3dev/gokart/actions?query=workflow%3ATest)
[![](https://readthedocs.org/projects/gokart/badge/?version=latest)](https://gokart.readthedocs.io/en/latest/)
[![Python Versions](https://img.shields.io/pypi/pyversions/gokart.svg)](https://pypi.org/project/gokart/)
[![](https://img.shields.io/pypi/v/gokart)](https://pypi.org/project/gokart/)
![](https://img.shields.io/pypi/l/gokart)
Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
[Documentation](https://gokart.readthedocs.io/en/latest/) for the latest release is hosted on readthedocs.
# About gokart
Here are some good things about gokart.
- The following metadata for each Task is stored separately in a `pkl` file with a hash value
    - task output data
    - versions of all imported modules
    - task processing time
    - random seed in task
    - displayed log
    - all parameters set as class variables in the task
- Automatically reruns the pipeline if parameters of Tasks are changed.
- Supports GCS and S3 as a data store for intermediate results of Tasks in the pipeline.
- The above output is exchanged between tasks as an intermediate file, which is memory-friendly
- `pandas.DataFrame` type and column checking during I/O
- Directory structure of saved files is automatically determined from structure of script
- Seeds for numpy and random are automatically fixed
- Lets you write code that adheres to [SOLID](https://en.wikipedia.org/wiki/SOLID) principles as much as possible
- Tasks are locked via redis even if they run in parallel
**All the functions above are created for constructing Machine Learning batches. They provide an excellent environment for reproducibility and team development.**
Here are some non-goals and downsides of gokart.
- Batch execution in parallel is supported, but parallel and concurrent execution of tasks in memory is not.
- Gokart is focused on reproducibility. So, I/O and capacity of data storage can become a bottleneck.
- No support for task visualization.
- Gokart is not an experiment management tool. The management of execution results is split out into [Thunderbolt](https://github.com/m3dev/thunderbolt).
- Gokart does not recommend writing pipelines in toml, yaml, json, and so on. Gokart prefers pipelines written in Python.
# Getting Started
Within the activated Python environment, use the following command to install gokart.
```
pip install gokart
```
# Quickstart
A minimal gokart task looks something like this:
```python
import gokart


class Example(gokart.TaskOnKart):
    def run(self):
        self.dump('Hello, world!')


task = Example()
output = gokart.build(task)
print(output)
```
`gokart.build` returns the value dumped by the `gokart.TaskOnKart`. The example will output the following.
```
Hello, world!
```
This is an introduction to only some of gokart's features.
There are still more useful features.
Please see the [Documentation](https://gokart.readthedocs.io/en/latest/).
Have a good gokart life.
# Achievements
Gokart is a proven product.
- It's actually been used by [m3.inc](https://corporate.m3.com/en) for over 3 years
- 2nd prize in the Natural Language Processing Competition by [Nishika.inc](https://nishika.com): [Solution Repository](https://github.com/vaaaaanquish/nishika_akutagawa_2nd_prize)
# Thanks
gokart is a wrapper for luigi. Thanks to luigi and dependent projects!
- [luigi](https://github.com/spotify/luigi)
%package -n python3-gokart
Summary:	Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for Machine Learning Pipeline. [Documentation](https://gokart.readthedocs.io/en/latest/)
Provides:	python-gokart
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-gokart
# gokart
[![Test](https://github.com/m3dev/gokart/workflows/Test/badge.svg)](https://github.com/m3dev/gokart/actions?query=workflow%3ATest)
[![](https://readthedocs.org/projects/gokart/badge/?version=latest)](https://gokart.readthedocs.io/en/latest/)
[![Python Versions](https://img.shields.io/pypi/pyversions/gokart.svg)](https://pypi.org/project/gokart/)
[![](https://img.shields.io/pypi/v/gokart)](https://pypi.org/project/gokart/)
![](https://img.shields.io/pypi/l/gokart)
Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
[Documentation](https://gokart.readthedocs.io/en/latest/) for the latest release is hosted on readthedocs.
# About gokart
Here are some good things about gokart.
- The following metadata for each Task is stored separately in a `pkl` file with a hash value
    - task output data
    - versions of all imported modules
    - task processing time
    - random seed in task
    - displayed log
    - all parameters set as class variables in the task
- Automatically reruns the pipeline if parameters of Tasks are changed.
- Supports GCS and S3 as a data store for intermediate results of Tasks in the pipeline.
- The above output is exchanged between tasks as an intermediate file, which is memory-friendly
- `pandas.DataFrame` type and column checking during I/O
- Directory structure of saved files is automatically determined from structure of script
- Seeds for numpy and random are automatically fixed
- Lets you write code that adheres to [SOLID](https://en.wikipedia.org/wiki/SOLID) principles as much as possible
- Tasks are locked via redis even if they run in parallel
**All the functions above are created for constructing Machine Learning batches. They provide an excellent environment for reproducibility and team development.**
Here are some non-goals and downsides of gokart.
- Batch execution in parallel is supported, but parallel and concurrent execution of tasks in memory is not.
- Gokart is focused on reproducibility. So, I/O and capacity of data storage can become a bottleneck.
- No support for task visualization.
- Gokart is not an experiment management tool. The management of execution results is split out into [Thunderbolt](https://github.com/m3dev/thunderbolt).
- Gokart does not recommend writing pipelines in toml, yaml, json, and so on. Gokart prefers pipelines written in Python.
# Getting Started
Within the activated Python environment, use the following command to install gokart.
```
pip install gokart
```
# Quickstart
A minimal gokart task looks something like this:
```python
import gokart


class Example(gokart.TaskOnKart):
    def run(self):
        self.dump('Hello, world!')


task = Example()
output = gokart.build(task)
print(output)
```
`gokart.build` returns the value dumped by the `gokart.TaskOnKart`. The example will output the following.
```
Hello, world!
```
This is an introduction to only some of gokart's features.
There are still more useful features.
Please see the [Documentation](https://gokart.readthedocs.io/en/latest/).
Have a good gokart life.
# Achievements
Gokart is a proven product.
- It's actually been used by [m3.inc](https://corporate.m3.com/en) for over 3 years
- 2nd prize in the Natural Language Processing Competition by [Nishika.inc](https://nishika.com): [Solution Repository](https://github.com/vaaaaanquish/nishika_akutagawa_2nd_prize)
# Thanks
gokart is a wrapper for luigi. Thanks to luigi and dependent projects!
- [luigi](https://github.com/spotify/luigi)
%package help
Summary:	Development documents and examples for gokart
Provides:	python3-gokart-doc
%description help
# gokart
[![Test](https://github.com/m3dev/gokart/workflows/Test/badge.svg)](https://github.com/m3dev/gokart/actions?query=workflow%3ATest)
[![](https://readthedocs.org/projects/gokart/badge/?version=latest)](https://gokart.readthedocs.io/en/latest/)
[![Python Versions](https://img.shields.io/pypi/pyversions/gokart.svg)](https://pypi.org/project/gokart/)
[![](https://img.shields.io/pypi/v/gokart)](https://pypi.org/project/gokart/)
![](https://img.shields.io/pypi/l/gokart)
Gokart solves reproducibility, task dependencies, constraints of good code, and ease of use for machine learning pipelines.
[Documentation](https://gokart.readthedocs.io/en/latest/) for the latest release is hosted on readthedocs.
# About gokart
Here are some good things about gokart.
- The following metadata for each Task is stored separately in a `pkl` file with a hash value
    - task output data
    - versions of all imported modules
    - task processing time
    - random seed in task
    - displayed log
    - all parameters set as class variables in the task
- Automatically reruns the pipeline if parameters of Tasks are changed.
- Supports GCS and S3 as a data store for intermediate results of Tasks in the pipeline.
- The above output is exchanged between tasks as an intermediate file, which is memory-friendly
- `pandas.DataFrame` type and column checking during I/O
- Directory structure of saved files is automatically determined from structure of script
- Seeds for numpy and random are automatically fixed
- Lets you write code that adheres to [SOLID](https://en.wikipedia.org/wiki/SOLID) principles as much as possible
- Tasks are locked via redis even if they run in parallel
**All the functions above are created for constructing Machine Learning batches. They provide an excellent environment for reproducibility and team development.**
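The idea of storing outputs under a hash of the task parameters, and rerunning only when those parameters change, can be sketched with the standard library alone. This is an illustrative sketch, not gokart's actual implementation: the helper names `unique_id` and `run_cached`, the file naming scheme, and the use of MD5 over a JSON payload are all assumptions made for the example.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path


def unique_id(task_name, params):
    """Derive a stable hash from the task name and its parameters (hypothetical helper)."""
    payload = json.dumps({'task': task_name, 'params': params}, sort_keys=True)
    return hashlib.md5(payload.encode('utf-8')).hexdigest()


def run_cached(task_name, params, compute, workdir):
    """Return (result, was_computed), reusing a pickle file if one already exists."""
    path = Path(workdir) / f'{task_name}_{unique_id(task_name, params)}.pkl'
    if path.exists():
        # Same name and parameters: load the intermediate file and skip the work.
        with path.open('rb') as f:
            return pickle.load(f), False
    # New or changed parameters: compute and store the result under the new hash.
    result = compute(**params)
    with path.open('wb') as f:
        pickle.dump(result, f)
    return result, True


def scale(values, factor):
    return [v * factor for v in values]


with tempfile.TemporaryDirectory() as workdir:
    first = run_cached('scale', {'values': [1, 2, 3], 'factor': 2}, scale, workdir)
    second = run_cached('scale', {'values': [1, 2, 3], 'factor': 2}, scale, workdir)
    changed = run_cached('scale', {'values': [1, 2, 3], 'factor': 3}, scale, workdir)

print(first, second, changed)
```

In this sketch the second call is served from the pickle file, while the third call has a different `factor`, so it gets a different hash and is computed again.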
Here are some non-goals and downsides of gokart.
- Batch execution in parallel is supported, but parallel and concurrent execution of tasks in memory is not.
- Gokart is focused on reproducibility. So, I/O and capacity of data storage can become a bottleneck.
- No support for task visualization.
- Gokart is not an experiment management tool. The management of execution results is split out into [Thunderbolt](https://github.com/m3dev/thunderbolt).
- Gokart does not recommend writing pipelines in toml, yaml, json, and so on. Gokart prefers pipelines written in Python.
# Getting Started
Within the activated Python environment, use the following command to install gokart.
```
pip install gokart
```
# Quickstart
A minimal gokart task looks something like this:
```python
import gokart


class Example(gokart.TaskOnKart):
    def run(self):
        # dump() stores the task output so that gokart.build can return it.
        self.dump('Hello, world!')


task = Example()
output = gokart.build(task)
print(output)
```
`gokart.build` returns the value dumped by the `gokart.TaskOnKart`. The example will output the following.
```
Hello, world!
```
This is an introduction to only some of gokart's features.
There are still more useful features.
Please see the [Documentation](https://gokart.readthedocs.io/en/latest/).
Have a good gokart life.
# Achievements
Gokart is a proven product.
- It's actually been used by [m3.inc](https://corporate.m3.com/en) for over 3 years
- 2nd prize in the Natural Language Processing Competition by [Nishika.inc](https://nishika.com): [Solution Repository](https://github.com/vaaaaanquish/nishika_akutagawa_2nd_prize)
# Thanks
gokart is a wrapper for luigi. Thanks to luigi and dependent projects!
- [luigi](https://github.com/spotify/luigi)
%prep
%autosetup -n gokart-1.2.2
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-gokart -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Tue May 30 2023 Python_Bot