%global _empty_manifest_terminate_build 0
Name: python-pycausalimpact
Version: 0.1.1
Release: 1
Summary: Python version of Google's Causal Impact model
License: MIT
URL: https://github.com/dafiti/causalimpact
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/e3/62/4b471c8ceb8f9a2115892bf80a438b7e2567a8a4fe0d9f95544a1fc53918/pycausalimpact-0.1.1.tar.gz
BuildArch: noarch
Requires: python3-numpy
Requires: python3-scipy
Requires: python3-statsmodels
Requires: python3-matplotlib
Requires: python3-jinja2
Requires: python3-ipython
Requires: python3-jupyter
Requires: python3-pytest
Requires: python3-pytest-cov
Requires: python3-mock
Requires: python3-tox
%description
# Causal Impact
A Python implementation of [Google's](https://github.com/google/CausalImpact) Causal Impact (causal inference) model, with all functionalities fully ported and tested.
## How it works
The main goal of the algorithm is to infer the expected effect a given intervention (or any action) had on some response variable by analyzing differences between expected and observed time series data.
Data is divided into two parts: the first is what is known as the "pre-intervention" period, where the concept of [Bayesian Structural Time Series](https://en.wikipedia.org/wiki/Bayesian_structural_time_series) is used to fit a model that best explains what has been observed. The fitted model is then used on the second part of the data (the "post-intervention" period) to forecast what the response would have looked like had the intervention not taken place. Inferences are based on the differences between the observed response and the predicted one, which yield the absolute and relative effects the intervention had on the data.
The model makes the assumption (which you should confirm holds in your data) that the response variable can be precisely modeled by a linear regression on what are known as "covariates" (or `X`), which **must not** be affected by the intervention that took place. For instance, if a company wants to infer what impact a given marketing campaign had on its "revenue", then its daily "visits" cannot be used as a covariate, since total visits were probably affected by the campaign.
The model is most commonly used to infer the impact that marketing interventions have on businesses, such as the expected revenue associated with a given campaign, or to measure more precisely the revenue a given channel brings in by turning it off completely (also known as "hold-out" tests). Note, though, that the model can be applied in many other areas and subjects; any intervention on time series data can potentially be modeled and inferences made from the observed and predicted data.
Please refer to [getting started](http://nbviewer.jupyter.org/github/dafiti/causalimpact/blob/master/examples/getting_started.ipynb) in the `examples` folder for more information.
## Installation
pip install pycausalimpact
## Requirements
- python{2.7, 3.6, 3.7, 3.8} \*
- numpy
- scipy
- statsmodels
- matplotlib
- jinja2
\* **We no longer support Python 2.7!** Please refer to the tag `0.0.16` (`pip install pycausalimpact==0.0.16`) for the latest version that still supports it.
## Getting Started
We recommend this [presentation](https://www.youtube.com/watch?v=GTgZfCltMm8) by Kay Brodersen (one of the creators of the causal impact implementation in R).
We also created this introductory [IPython notebook](http://nbviewer.jupyter.org/github/dafiti/causalimpact/blob/master/examples/getting_started.ipynb) with examples of how to use this package.
### Simple Example
Here's a simple example (which can also be found in Google's original R implementation) running in Python:
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_process import ArmaProcess
from causalimpact import CausalImpact
np.random.seed(12345)

# Simulate a covariate X from an ARMA process and a response y that follows X
# linearly; from time step 70 onwards an intervention adds +5 to y.
ar = np.r_[1, 0.9]
ma = np.array([1])
arma_process = ArmaProcess(ar, ma)
X = 100 + arma_process.generate_sample(nsample=100)
y = 1.2 * X + np.random.normal(size=100)
y[70:] += 5

data = pd.DataFrame({'y': y, 'X': X}, columns=['y', 'X'])
pre_period = [0, 69]    # pre-intervention period
post_period = [70, 99]  # post-intervention period
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
print(ci.summary(output='report'))
ci.plot()
```

## Differences Between Python and R Packages
One thing you'll notice when using this package is that its results sometimes closely match the R package's output and at other times yield different conclusions.
This is quite a complex topic, which we have discussed more thoroughly in issues [#34](https://github.com/dafiti/causalimpact/issues/34), [#37](https://github.com/dafiti/causalimpact/issues/37) and [#40](https://github.com/dafiti/causalimpact/issues/40); we highly recommend reading them.
In a nutshell, the Python implementation relies on [statsmodels](https://github.com/statsmodels/statsmodels), which uses a classical Kalman filter approach to solve the state-space equations, whereas the R package uses a Bayesian approach (from the [bsts](https://github.com/cran/bsts) package) with a stochastic Kalman filter technique; both algorithms are expected to converge to a similar final state-space solution [(ref)](https://stackoverflow.com/questions/57300211/local-level-model-not-fully-optimizing-irregular-state/57316141?noredirect=1#comment101157526_57316141).
Still, despite the similarities, the two packages use different assumptions for prior initializations as well as for the steps involved in the optimization process: while R relies on user prior knowledge, Python uses classical statistical techniques aiming to maximize the likelihood function expressed in terms of the structural time series components.
As we discuss in the issues mentioned above, it's hard to tell which is right or "more right"; each package has its own assumptions and techniques, leaving it up to the final user to decide what is appropriate. We recommend comparing results from both packages on your use cases to get a better idea of whether the conclusions converge.
As a final note, when using this Python package **we highly recommend setting the prior to `None`**, like so:
    ci = CausalImpact(data, pre_period, post_period, prior_level_sd=None)
This lets statsmodels itself optimize the prior for the local level component. If you are confident that your local level prior should be a specific value (say, `0.01`), then it's probably fine to set it there; otherwise you run the risk of obtaining sub-optimal solutions.
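For illustration, here's a minimal sketch of both options, reusing `data`, `pre_period` and `post_period` from the example above (the value `0.01` is only the illustrative figure mentioned in the previous paragraph, not a recommended default):
```python
from causalimpact import CausalImpact

# Recommended: let statsmodels optimize the local level prior.
ci = CausalImpact(data, pre_period, post_period, prior_level_sd=None)

# Only if you are confident about a specific prior value (0.01 is merely
# the example figure from the paragraph above, not a suggested default):
ci_fixed = CausalImpact(data, pre_period, post_period, prior_level_sd=0.01)
```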
## Contributing, Bugs, Questions
Contributions are more than welcome! If you want to propose new changes, fix bugs, or improve something, feel free to fork the repository and send us a Pull Request. You can also open new [`Issues`](https://github.com/dafiti/causalimpact/issues) to report bugs and general problems.
%package -n python3-pycausalimpact
Summary: Python version of Google's Causal Impact model
Provides: python-pycausalimpact
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-pycausalimpact
# Causal Impact
A Python implementation of [Google's](https://github.com/google/CausalImpact) Causal Impact (causal inference) model, with all functionalities fully ported and tested.
## How it works
The main goal of the algorithm is to infer the expected effect a given intervention (or any action) had on some response variable by analyzing differences between expected and observed time series data.
Data is divided into two parts: the first is what is known as the "pre-intervention" period, where the concept of [Bayesian Structural Time Series](https://en.wikipedia.org/wiki/Bayesian_structural_time_series) is used to fit a model that best explains what has been observed. The fitted model is then used on the second part of the data (the "post-intervention" period) to forecast what the response would have looked like had the intervention not taken place. Inferences are based on the differences between the observed response and the predicted one, which yield the absolute and relative effects the intervention had on the data.
The model makes the assumption (which you should confirm holds in your data) that the response variable can be precisely modeled by a linear regression on what are known as "covariates" (or `X`), which **must not** be affected by the intervention that took place. For instance, if a company wants to infer what impact a given marketing campaign had on its "revenue", then its daily "visits" cannot be used as a covariate, since total visits were probably affected by the campaign.
The model is most commonly used to infer the impact that marketing interventions have on businesses, such as the expected revenue associated with a given campaign, or to measure more precisely the revenue a given channel brings in by turning it off completely (also known as "hold-out" tests). Note, though, that the model can be applied in many other areas and subjects; any intervention on time series data can potentially be modeled and inferences made from the observed and predicted data.
Please refer to [getting started](http://nbviewer.jupyter.org/github/dafiti/causalimpact/blob/master/examples/getting_started.ipynb) in the `examples` folder for more information.
## Installation
pip install pycausalimpact
## Requirements
- python{2.7, 3.6, 3.7, 3.8} \*
- numpy
- scipy
- statsmodels
- matplotlib
- jinja2
\* **We no longer support Python 2.7!** Please refer to the tag `0.0.16` (`pip install pycausalimpact==0.0.16`) for the latest version that still supports it.
## Getting Started
We recommend this [presentation](https://www.youtube.com/watch?v=GTgZfCltMm8) by Kay Brodersen (one of the creators of the causal impact implementation in R).
We also created this introductory [IPython notebook](http://nbviewer.jupyter.org/github/dafiti/causalimpact/blob/master/examples/getting_started.ipynb) with examples of how to use this package.
### Simple Example
Here's a simple example (which can also be found in Google's original R implementation) running in Python:
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_process import ArmaProcess
from causalimpact import CausalImpact
np.random.seed(12345)

# Simulate a covariate X from an ARMA process and a response y that follows X
# linearly; from time step 70 onwards an intervention adds +5 to y.
ar = np.r_[1, 0.9]
ma = np.array([1])
arma_process = ArmaProcess(ar, ma)
X = 100 + arma_process.generate_sample(nsample=100)
y = 1.2 * X + np.random.normal(size=100)
y[70:] += 5

data = pd.DataFrame({'y': y, 'X': X}, columns=['y', 'X'])
pre_period = [0, 69]    # pre-intervention period
post_period = [70, 99]  # post-intervention period
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
print(ci.summary(output='report'))
ci.plot()
```

## Differences Between Python and R Packages
One thing you'll notice when using this package is that its results sometimes closely match the R package's output and at other times yield different conclusions.
This is quite a complex topic, which we have discussed more thoroughly in issues [#34](https://github.com/dafiti/causalimpact/issues/34), [#37](https://github.com/dafiti/causalimpact/issues/37) and [#40](https://github.com/dafiti/causalimpact/issues/40); we highly recommend reading them.
In a nutshell, the Python implementation relies on [statsmodels](https://github.com/statsmodels/statsmodels), which uses a classical Kalman filter approach to solve the state-space equations, whereas the R package uses a Bayesian approach (from the [bsts](https://github.com/cran/bsts) package) with a stochastic Kalman filter technique; both algorithms are expected to converge to a similar final state-space solution [(ref)](https://stackoverflow.com/questions/57300211/local-level-model-not-fully-optimizing-irregular-state/57316141?noredirect=1#comment101157526_57316141).
Still, despite the similarities, the two packages use different assumptions for prior initializations as well as for the steps involved in the optimization process: while R relies on user prior knowledge, Python uses classical statistical techniques aiming to maximize the likelihood function expressed in terms of the structural time series components.
As we discuss in the issues mentioned above, it's hard to tell which is right or "more right"; each package has its own assumptions and techniques, leaving it up to the final user to decide what is appropriate. We recommend comparing results from both packages on your use cases to get a better idea of whether the conclusions converge.
As a final note, when using this Python package **we highly recommend setting the prior to `None`**, like so:
    ci = CausalImpact(data, pre_period, post_period, prior_level_sd=None)
This lets statsmodels itself optimize the prior for the local level component. If you are confident that your local level prior should be a specific value (say, `0.01`), then it's probably fine to set it there; otherwise you run the risk of obtaining sub-optimal solutions.
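For illustration, here's a minimal sketch of both options, reusing `data`, `pre_period` and `post_period` from the example above (the value `0.01` is only the illustrative figure mentioned in the previous paragraph, not a recommended default):
```python
from causalimpact import CausalImpact

# Recommended: let statsmodels optimize the local level prior.
ci = CausalImpact(data, pre_period, post_period, prior_level_sd=None)

# Only if you are confident about a specific prior value (0.01 is merely
# the example figure from the paragraph above, not a suggested default):
ci_fixed = CausalImpact(data, pre_period, post_period, prior_level_sd=0.01)
```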
## Contributing, Bugs, Questions
Contributions are more than welcome! If you want to propose new changes, fix bugs, or improve something, feel free to fork the repository and send us a Pull Request. You can also open new [`Issues`](https://github.com/dafiti/causalimpact/issues) to report bugs and general problems.
%package help
Summary: Development documents and examples for pycausalimpact
Provides: python3-pycausalimpact-doc
%description help
# Causal Impact
A Python implementation of [Google's](https://github.com/google/CausalImpact) Causal Impact (causal inference) model, with all functionalities fully ported and tested.
## How it works
The main goal of the algorithm is to infer the expected effect a given intervention (or any action) had on some response variable by analyzing differences between expected and observed time series data.
Data is divided into two parts: the first is what is known as the "pre-intervention" period, where the concept of [Bayesian Structural Time Series](https://en.wikipedia.org/wiki/Bayesian_structural_time_series) is used to fit a model that best explains what has been observed. The fitted model is then used on the second part of the data (the "post-intervention" period) to forecast what the response would have looked like had the intervention not taken place. Inferences are based on the differences between the observed response and the predicted one, which yield the absolute and relative effects the intervention had on the data.
The model makes the assumption (which you should confirm holds in your data) that the response variable can be precisely modeled by a linear regression on what are known as "covariates" (or `X`), which **must not** be affected by the intervention that took place. For instance, if a company wants to infer what impact a given marketing campaign had on its "revenue", then its daily "visits" cannot be used as a covariate, since total visits were probably affected by the campaign.
The model is most commonly used to infer the impact that marketing interventions have on businesses, such as the expected revenue associated with a given campaign, or to measure more precisely the revenue a given channel brings in by turning it off completely (also known as "hold-out" tests). Note, though, that the model can be applied in many other areas and subjects; any intervention on time series data can potentially be modeled and inferences made from the observed and predicted data.
Please refer to [getting started](http://nbviewer.jupyter.org/github/dafiti/causalimpact/blob/master/examples/getting_started.ipynb) in the `examples` folder for more information.
## Installation
pip install pycausalimpact
## Requirements
- python{2.7, 3.6, 3.7, 3.8} \*
- numpy
- scipy
- statsmodels
- matplotlib
- jinja2
\* **We no longer support Python 2.7!** Please refer to the tag `0.0.16` (`pip install pycausalimpact==0.0.16`) for the latest version that still supports it.
## Getting Started
We recommend this [presentation](https://www.youtube.com/watch?v=GTgZfCltMm8) by Kay Brodersen (one of the creators of the causal impact implementation in R).
We also created this introductory [IPython notebook](http://nbviewer.jupyter.org/github/dafiti/causalimpact/blob/master/examples/getting_started.ipynb) with examples of how to use this package.
### Simple Example
Here's a simple example (which can also be found in Google's original R implementation) running in Python:
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima_process import ArmaProcess
from causalimpact import CausalImpact
np.random.seed(12345)

# Simulate a covariate X from an ARMA process and a response y that follows X
# linearly; from time step 70 onwards an intervention adds +5 to y.
ar = np.r_[1, 0.9]
ma = np.array([1])
arma_process = ArmaProcess(ar, ma)
X = 100 + arma_process.generate_sample(nsample=100)
y = 1.2 * X + np.random.normal(size=100)
y[70:] += 5

data = pd.DataFrame({'y': y, 'X': X}, columns=['y', 'X'])
pre_period = [0, 69]    # pre-intervention period
post_period = [70, 99]  # post-intervention period
ci = CausalImpact(data, pre_period, post_period)
print(ci.summary())
print(ci.summary(output='report'))
ci.plot()
```

## Differences Between Python and R Packages
One thing you'll notice when using this package is that its results sometimes closely match the R package's output and at other times yield different conclusions.
This is quite a complex topic, which we have discussed more thoroughly in issues [#34](https://github.com/dafiti/causalimpact/issues/34), [#37](https://github.com/dafiti/causalimpact/issues/37) and [#40](https://github.com/dafiti/causalimpact/issues/40); we highly recommend reading them.
In a nutshell, the Python implementation relies on [statsmodels](https://github.com/statsmodels/statsmodels), which uses a classical Kalman filter approach to solve the state-space equations, whereas the R package uses a Bayesian approach (from the [bsts](https://github.com/cran/bsts) package) with a stochastic Kalman filter technique; both algorithms are expected to converge to a similar final state-space solution [(ref)](https://stackoverflow.com/questions/57300211/local-level-model-not-fully-optimizing-irregular-state/57316141?noredirect=1#comment101157526_57316141).
Still, despite the similarities, the two packages use different assumptions for prior initializations as well as for the steps involved in the optimization process: while R relies on user prior knowledge, Python uses classical statistical techniques aiming to maximize the likelihood function expressed in terms of the structural time series components.
As we discuss in the issues mentioned above, it's hard to tell which is right or "more right"; each package has its own assumptions and techniques, leaving it up to the final user to decide what is appropriate. We recommend comparing results from both packages on your use cases to get a better idea of whether the conclusions converge.
As a final note, when using this Python package **we highly recommend setting the prior to `None`**, like so:
    ci = CausalImpact(data, pre_period, post_period, prior_level_sd=None)
This lets statsmodels itself optimize the prior for the local level component. If you are confident that your local level prior should be a specific value (say, `0.01`), then it's probably fine to set it there; otherwise you run the risk of obtaining sub-optimal solutions.
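For illustration, here's a minimal sketch of both options, reusing `data`, `pre_period` and `post_period` from the example above (the value `0.01` is only the illustrative figure mentioned in the previous paragraph, not a recommended default):
```python
from causalimpact import CausalImpact

# Recommended: let statsmodels optimize the local level prior.
ci = CausalImpact(data, pre_period, post_period, prior_level_sd=None)

# Only if you are confident about a specific prior value (0.01 is merely
# the example figure from the paragraph above, not a suggested default):
ci_fixed = CausalImpact(data, pre_period, post_period, prior_level_sd=0.01)
```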
## Contributing, Bugs, Questions
Contributions are more than welcome! If you want to propose new changes, fix bugs, or improve something, feel free to fork the repository and send us a Pull Request. You can also open new [`Issues`](https://github.com/dafiti/causalimpact/issues) to report bugs and general problems.
%prep
%autosetup -n pycausalimpact-0.1.1
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-pycausalimpact -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Sun Apr 23 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.1-1
- Package Spec generated