path: root/python-bert-etl.spec
%global _empty_manifest_terminate_build 0
Name:		python-bert-etl
Version:	0.4.77
Release:	1
Summary:	A microframework for simple ETL solutions
License:	MIT
URL:		https://github.com/jbcurtin/bert
Source0:	https://mirrors.nju.edu.cn/pypi/web/packages/a3/2a/a8bf07796f295711794c2462c94a88d2178003fef372bed55b0def1f490c/bert-etl-0.4.77.tar.gz
BuildArch:	noarch


%description
[![Documentation Status](https://readthedocs.org/projects/bert-etl/badge/?version=latest)](https://bert-etl.readthedocs.io/en/latest/?badge=latest)

# Bert
A microframework for simple ETL solutions.


## Architecture

At its core, `bert-etl` uses DynamoDB Streams to communicate between Lambda functions. `bert-etl.yaml` controls how the initial Lambda function is invoked: by periodic events, SNS topics, or S3 bucket events (planned). Passing an event to `bert-etl` is straightforward from `zappa` or from a generic AWS Lambda function you've hooked up to API Gateway.

At the moment there are no plans to attach API Gateway to `bert-etl.yaml`, because existing software (like `zappa`) already does this well.

## Warning: aws-lambda deploy target still considered beta

`bert-etl` ships with a deploy target for `aws-lambda`. This feature isn't well documented yet, and quite a bit of work remains before it functions consistently. Be aware that AWS Lambda is a product run and controlled by AWS: if you incur charges while using `bert-etl` with `aws-lambda`, the authors are not responsible. `bert-etl` is offered under the `MIT` license, which includes a `Use at your own risk` clause.

## Getting started

Let's begin with an example that loads data from a file server and then loads it into NumPy arrays.

```
$ virtualenv -p $(which python3) env
$ source env/bin/activate
$ pip install bert-etl
$ pip install librosa # for demo project
$ docker run -p 6379:6379 -d redis # bert-etl runs on redis to share data across CPUs
$ bert-runner.py -n demo
$ PYTHONPATH='.' bert-runner.py -m demo -j sync_sounds -f
```

## Release Notes

### 0.3.0

* Added error management. When an error occurs, bert-runner logs the error and re-runs the job; if the same error recurs often enough, the job is aborted

### 0.2.1

* Added Release Notes

### 0.2.0

* Added Redis service auto-run. Using Docker, Redis is pulled and started in the background
* Added Redis service channels, for when you want to run two ETL jobs on the same machine

## Fund Bounty Target Upgrades

Bert provides a boilerplate framework for writing concurrent ETL code with Python's `multiprocessing` module. One function starts the process, piping data into a Redis backend that is then consumed by the next function. The queues are named for their role in the scope of the function: a Work (start) queue and a Done (end) queue. Please consider contributing to Bert Bounty Targets to improve this documentation.
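The Work/Done queue pattern described above can be sketched with Python's standard `multiprocessing` module. This is a minimal illustration of the pattern only, assuming in-process queues for simplicity; the names (`producer`, `consumer`, `run_pipeline`) are hypothetical and are not bert-etl's actual API, which shares data through Redis instead.

```python
import multiprocessing as mp


def producer(work_queue, items):
    """First stage: pipe raw items into the Work queue."""
    for item in items:
        work_queue.put(item)
    work_queue.put(None)  # sentinel: signals no more work


def consumer(work_queue, done_queue):
    """Next stage: consume the Work queue, emit results to the Done queue."""
    while True:
        item = work_queue.get()
        if item is None:
            break
        done_queue.put(item * 2)  # stand-in for a real transform step
    done_queue.put(None)


def run_pipeline(items):
    """Run both stages as separate processes and collect the output."""
    work_queue, done_queue = mp.Queue(), mp.Queue()
    stages = [
        mp.Process(target=producer, args=(work_queue, items)),
        mp.Process(target=consumer, args=(work_queue, done_queue)),
    ]
    for stage in stages:
        stage.start()
    results = []
    while (result := done_queue.get()) is not None:
        results.append(result)
    for stage in stages:
        stage.join()
    return sorted(results)


if __name__ == '__main__':
    print(run_pipeline([1, 2, 3]))  # [2, 4, 6]
```

Each stage only touches the queue it reads from and the queue it writes to, which is what lets the stages run concurrently without sharing any other state.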

https://www.patreon.com/jbcurtin


## Roadmap

* Create configuration file, `bert-etl.yaml`
* Support conda venv
* Support pyenv venv
* Support DynamoDB flush
* Support multiple invocations per AWS account
* Support undeploy AWS Lambda
* Support Bottle functions in AWS Lambda


## Tutorial Roadmap

* Introduce Bert API
* Explain `bert.binding`
* Explain `comm_binder`
* Explain `work_queue`
* Explain `done_queue`
* Explain `ologger`
* Explain `DEBUG` and how turning it off enables multiple concurrent processes
* Show an example of how to load timeseries data, calculate the mean, and display the resulting mean
* Expand the example to show how to scale the application implicitly
* Show how to run locally using Redis
* Show how to run locally without Redis, using DynamoDB instead
* Show how to run remotely using AWS Lambda and DynamoDB
* Talk about DynamoDB and eventual consistency

%package -n python3-bert-etl
Summary:	A microframework for simple ETL solutions
Provides:	python-bert-etl
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-bert-etl
[![Documentation Status](https://readthedocs.org/projects/bert-etl/badge/?version=latest)](https://bert-etl.readthedocs.io/en/latest/?badge=latest)

# Bert
A microframework for simple ETL solutions.


## Architecture

At its core, `bert-etl` uses DynamoDB Streams to communicate between Lambda functions. `bert-etl.yaml` controls how the initial Lambda function is invoked: by periodic events, SNS topics, or S3 bucket events (planned). Passing an event to `bert-etl` is straightforward from `zappa` or from a generic AWS Lambda function you've hooked up to API Gateway.

At the moment there are no plans to attach API Gateway to `bert-etl.yaml`, because existing software (like `zappa`) already does this well.

## Warning: aws-lambda deploy target still considered beta

`bert-etl` ships with a deploy target for `aws-lambda`. This feature isn't well documented yet, and quite a bit of work remains before it functions consistently. Be aware that AWS Lambda is a product run and controlled by AWS: if you incur charges while using `bert-etl` with `aws-lambda`, the authors are not responsible. `bert-etl` is offered under the `MIT` license, which includes a `Use at your own risk` clause.

## Getting started

Let's begin with an example that loads data from a file server and then loads it into NumPy arrays.

```
$ virtualenv -p $(which python3) env
$ source env/bin/activate
$ pip install bert-etl
$ pip install librosa # for demo project
$ docker run -p 6379:6379 -d redis # bert-etl runs on redis to share data across CPUs
$ bert-runner.py -n demo
$ PYTHONPATH='.' bert-runner.py -m demo -j sync_sounds -f
```

## Release Notes

### 0.3.0

* Added error management. When an error occurs, bert-runner logs the error and re-runs the job; if the same error recurs often enough, the job is aborted

### 0.2.1

* Added Release Notes

### 0.2.0

* Added Redis service auto-run. Using Docker, Redis is pulled and started in the background
* Added Redis service channels, for when you want to run two ETL jobs on the same machine

## Fund Bounty Target Upgrades

Bert provides a boilerplate framework for writing concurrent ETL code with Python's `multiprocessing` module. One function starts the process, piping data into a Redis backend that is then consumed by the next function. The queues are named for their role in the scope of the function: a Work (start) queue and a Done (end) queue. Please consider contributing to Bert Bounty Targets to improve this documentation.

https://www.patreon.com/jbcurtin


## Roadmap

* Create configuration file, `bert-etl.yaml`
* Support conda venv
* Support pyenv venv
* Support DynamoDB flush
* Support multiple invocations per AWS account
* Support undeploy AWS Lambda
* Support Bottle functions in AWS Lambda


## Tutorial Roadmap

* Introduce Bert API
* Explain `bert.binding`
* Explain `comm_binder`
* Explain `work_queue`
* Explain `done_queue`
* Explain `ologger`
* Explain `DEBUG` and how turning it off enables multiple concurrent processes
* Show an example of how to load timeseries data, calculate the mean, and display the resulting mean
* Expand the example to show how to scale the application implicitly
* Show how to run locally using Redis
* Show how to run locally without Redis, using DynamoDB instead
* Show how to run remotely using AWS Lambda and DynamoDB
* Talk about DynamoDB and eventual consistency

%package help
Summary:	Development documents and examples for bert-etl
Provides:	python3-bert-etl-doc
%description help
[![Documentation Status](https://readthedocs.org/projects/bert-etl/badge/?version=latest)](https://bert-etl.readthedocs.io/en/latest/?badge=latest)

# Bert
A microframework for simple ETL solutions.


## Architecture

At its core, `bert-etl` uses DynamoDB Streams to communicate between Lambda functions. `bert-etl.yaml` controls how the initial Lambda function is invoked: by periodic events, SNS topics, or S3 bucket events (planned). Passing an event to `bert-etl` is straightforward from `zappa` or from a generic AWS Lambda function you've hooked up to API Gateway.

At the moment there are no plans to attach API Gateway to `bert-etl.yaml`, because existing software (like `zappa`) already does this well.

## Warning: aws-lambda deploy target still considered beta

`bert-etl` ships with a deploy target for `aws-lambda`. This feature isn't well documented yet, and quite a bit of work remains before it functions consistently. Be aware that AWS Lambda is a product run and controlled by AWS: if you incur charges while using `bert-etl` with `aws-lambda`, the authors are not responsible. `bert-etl` is offered under the `MIT` license, which includes a `Use at your own risk` clause.

## Getting started

Let's begin with an example that loads data from a file server and then loads it into NumPy arrays.

```
$ virtualenv -p $(which python3) env
$ source env/bin/activate
$ pip install bert-etl
$ pip install librosa # for demo project
$ docker run -p 6379:6379 -d redis # bert-etl runs on redis to share data across CPUs
$ bert-runner.py -n demo
$ PYTHONPATH='.' bert-runner.py -m demo -j sync_sounds -f
```

## Release Notes

### 0.3.0

* Added error management. When an error occurs, bert-runner logs the error and re-runs the job; if the same error recurs often enough, the job is aborted

### 0.2.1

* Added Release Notes

### 0.2.0

* Added Redis service auto-run. Using Docker, Redis is pulled and started in the background
* Added Redis service channels, for when you want to run two ETL jobs on the same machine

## Fund Bounty Target Upgrades

Bert provides a boilerplate framework for writing concurrent ETL code with Python's `multiprocessing` module. One function starts the process, piping data into a Redis backend that is then consumed by the next function. The queues are named for their role in the scope of the function: a Work (start) queue and a Done (end) queue. Please consider contributing to Bert Bounty Targets to improve this documentation.

https://www.patreon.com/jbcurtin


## Roadmap

* Create configuration file, `bert-etl.yaml`
* Support conda venv
* Support pyenv venv
* Support DynamoDB flush
* Support multiple invocations per AWS account
* Support undeploy AWS Lambda
* Support Bottle functions in AWS Lambda


## Tutorial Roadmap

* Introduce Bert API
* Explain `bert.binding`
* Explain `comm_binder`
* Explain `work_queue`
* Explain `done_queue`
* Explain `ologger`
* Explain `DEBUG` and how turning it off enables multiple concurrent processes
* Show an example of how to load timeseries data, calculate the mean, and display the resulting mean
* Expand the example to show how to scale the application implicitly
* Show how to run locally using Redis
* Show how to run locally without Redis, using DynamoDB instead
* Show how to run remotely using AWS Lambda and DynamoDB
* Talk about DynamoDB and eventual consistency

%prep
%autosetup -n bert-etl-0.4.77

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-bert-etl -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Fri May 05 2023 Python_Bot <Python_Bot@openeuler.org> - 0.4.77-1
- Package Spec generated