%global _empty_manifest_terminate_build 0
Name:		python-correlation
Version:	1.0.0
Release:	1
Summary:	Calculate the confidence intervals of correlation coefficients
License:	BSD 2-Clause License
URL:		https://github.com/XiangwenWang/correlation
Source0:	https://mirrors.aliyun.com/pypi/web/packages/30/b9/f4fcac90062b340c0fa5d015979818f6d9c61d37393163dfbbe369ad4d70/correlation-1.0.0.tar.gz
BuildArch:	noarch

Requires:	python3-numpy
Requires:	python3-scipy

%description
# correlation

Calculate confidence intervals for correlation coefficients, including Pearson's r, Kendall's tau, Spearman's rho, and customized correlation measures.

## Methodology  
Two approaches are offered for calculating the confidence intervals: a parametric approach based on a normal approximation, and a nonparametric approach based on bootstrapping.
### Parametric Approach
Let r\_hat be the correlation obtained from the data. Applying the transformation  
```
z = ln((1+r)/(1-r))/2,
```  
z approximately follows a normal distribution  
with mean z(r\_hat)  
and variance sigma^2 equal to 1/(n-3), 0.437/(n-4), and (1+r\_hat^2/2)/(n-3) for Pearson's r, Kendall's tau, and Spearman's rho, respectively (see Refs. [1, 2] for details). Here n is the array length.

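The (1-alpha) CI for r is then  
```
(T(z_lower), T(z_upper)),
```
where T is the inverse of the transformation above,  
```
T(x) = (exp(2x) - 1) / (exp(2x) + 1),
```
and the endpoints on the z scale are  
```
z_lower = z - z_(1-alpha/2) sigma,
z_upper = z + z_(1-alpha/2) sigma.
```
Here z is evaluated at r\_hat, and z\_(1-alpha/2) is the (1-alpha/2) quantile of the standard normal distribution.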

This normal approximation works when the absolute values of Pearson's r, Kendall's tau, and Spearman's rho are less than 1, 0.8, and 0.95, respectively.
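
The parametric steps above can be sketched in a few lines of standard-library Python. This is only an illustration of the formulas, not the package's own code: the function name `fisher_ci` and its `method` keys are hypothetical.

```python
import math
from statistics import NormalDist


def fisher_ci(r_hat, n, alpha=0.05, method="pearson"):
    """Normal-approximation CI for a correlation coefficient (illustrative)."""
    # Variance of z for each coefficient, as listed in the text above.
    variances = {
        "pearson": 1.0 / (n - 3),
        "kendall": 0.437 / (n - 4),
        "spearman": (1.0 + r_hat ** 2 / 2.0) / (n - 3),
    }
    sigma = math.sqrt(variances[method])
    z = math.atanh(r_hat)  # identical to ln((1+r)/(1-r))/2
    q = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_(1-alpha/2)
    # math.tanh is the inverse transformation T(x).
    return math.tanh(z - q * sigma), math.tanh(z + q * sigma)


lower, upper = fisher_ci(-0.1, 2000, method="spearman")
print(lower, upper)
```

With r\_hat = -0.1, n = 2000 and the Spearman variance, this sketch gives an interval of roughly (-0.143, -0.056), consistent with the endpoints shown under Example Usage.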

### Nonparametric Approach
For the nonparametric approach, we simply adopt a naive bootstrap method.

* Sample pairs (x\_i, y\_i) with replacement from the original (paired) samples until the resample has size n, and calculate the correlation coefficient of the resample.  
* Repeat this process a large number of times (5000 by default).
* Obtain the (1-alpha) CI for r by taking the alpha/2 and (1-alpha/2) quantiles of the resulting correlation coefficients.
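
The bootstrap procedure above can be sketched as follows, again using only the standard library. The names `pearson_r` and `bootstrap_ci` are hypothetical helpers written for this illustration, and the nearest-rank quantile rule is an assumption; the package may compute quantiles differently.

```python
import math
import random


def pearson_r(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_ci(x, y, corr_func=pearson_r, alpha=0.05, n_boot=5000, seed=None):
    """Percentile bootstrap CI for corr_func(x, y) (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        # Resample n positions with replacement so each (x_i, y_i) pair stays together.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(corr_func([x[i] for i in idx], [y[i] for i in idx]))
    estimates.sort()
    # Nearest-rank alpha/2 and (1-alpha/2) quantiles of the bootstrap estimates.
    lo = estimates[math.floor((alpha / 2) * (n_boot - 1))]
    hi = estimates[math.ceil((1 - alpha / 2) * (n_boot - 1))]
    return lo, hi


data_rng = random.Random(0)
x = [data_rng.gauss(0, 1) for _ in range(60)]
y = [a + data_rng.gauss(0, 1) for a in x]
lo, hi = bootstrap_ci(x, y, n_boot=1000, seed=1)
print(lo, hi)
```

Any function that takes two sequences and returns a coefficient can be passed as `corr_func`, which mirrors the idea of customized correlation measures. A resample in which one variable happens to be constant would make `pearson_r` divide by zero; the sketch does not guard against that case.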


## References
[1] Bonett, Douglas G., and Thomas A. Wright. "Sample size requirements for estimating Pearson, Kendall and Spearman correlations." Psychometrika 65, no. 1 (2000): 23-28.  
[2] Bishara, Anthony J., and James B. Hittner. "Confidence intervals for correlations when data are not normal." Behavior Research Methods 49, no. 1 (2017): 294-309.


## Installation:  
```
pip install correlation
```  
or

```
conda install -c wangxiangwen correlation
```

## Example Usage:  
```python
>>> import correlation
>>> a, b = list(range(2000)), list(range(200, 0, -1)) * 10
>>> correlation.corr(a, b, method='spearman_rho')
(-0.0999987624920335,          # correlation coefficient
 -0.14330929583811683,         # lower endpoint of CI
 -0.056305939127336606,        # upper endpoint of CI
 7.446171861744971e-06)        # p-value
```


%package -n python3-correlation
Summary:	Calculate the confidence intervals of correlation coefficients
Provides:	python-correlation
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-correlation
# correlation

Calculate confidence intervals for correlation coefficients, including Pearson's r, Kendall's tau, Spearman's rho, and customized correlation measures.

## Methodology  
Two approaches are offered for calculating the confidence intervals: a parametric approach based on a normal approximation, and a nonparametric approach based on bootstrapping.
### Parametric Approach
Let r\_hat be the correlation obtained from the data. Applying the transformation  
```
z = ln((1+r)/(1-r))/2,
```  
z approximately follows a normal distribution  
with mean z(r\_hat)  
and variance sigma^2 equal to 1/(n-3), 0.437/(n-4), and (1+r\_hat^2/2)/(n-3) for Pearson's r, Kendall's tau, and Spearman's rho, respectively (see Refs. [1, 2] for details). Here n is the array length.

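The (1-alpha) CI for r is then  
```
(T(z_lower), T(z_upper)),
```
where T is the inverse of the transformation above,  
```
T(x) = (exp(2x) - 1) / (exp(2x) + 1),
```
and the endpoints on the z scale are  
```
z_lower = z - z_(1-alpha/2) sigma,
z_upper = z + z_(1-alpha/2) sigma.
```
Here z is evaluated at r\_hat, and z\_(1-alpha/2) is the (1-alpha/2) quantile of the standard normal distribution.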

This normal approximation works when the absolute values of Pearson's r, Kendall's tau, and Spearman's rho are less than 1, 0.8, and 0.95, respectively.

### Nonparametric Approach
For the nonparametric approach, we simply adopt a naive bootstrap method.

* Sample pairs (x\_i, y\_i) with replacement from the original (paired) samples until the resample has size n, and calculate the correlation coefficient of the resample.  
* Repeat this process a large number of times (5000 by default).
* Obtain the (1-alpha) CI for r by taking the alpha/2 and (1-alpha/2) quantiles of the resulting correlation coefficients.


## References
[1] Bonett, Douglas G., and Thomas A. Wright. "Sample size requirements for estimating Pearson, Kendall and Spearman correlations." Psychometrika 65, no. 1 (2000): 23-28.  
[2] Bishara, Anthony J., and James B. Hittner. "Confidence intervals for correlations when data are not normal." Behavior Research Methods 49, no. 1 (2017): 294-309.


## Installation:  
```
pip install correlation
```  
or

```
conda install -c wangxiangwen correlation
```

## Example Usage:  
```python
>>> import correlation
>>> a, b = list(range(2000)), list(range(200, 0, -1)) * 10
>>> correlation.corr(a, b, method='spearman_rho')
(-0.0999987624920335,          # correlation coefficient
 -0.14330929583811683,         # lower endpoint of CI
 -0.056305939127336606,        # upper endpoint of CI
 7.446171861744971e-06)        # p-value
```


%package help
Summary:	Development documents and examples for correlation
Provides:	python3-correlation-doc
%description help
# correlation

Calculate confidence intervals for correlation coefficients, including Pearson's r, Kendall's tau, Spearman's rho, and customized correlation measures.

## Methodology  
Two approaches are offered for calculating the confidence intervals: a parametric approach based on a normal approximation, and a nonparametric approach based on bootstrapping.
### Parametric Approach
Let r\_hat be the correlation obtained from the data. Applying the transformation  
```
z = ln((1+r)/(1-r))/2,
```  
z approximately follows a normal distribution  
with mean z(r\_hat)  
and variance sigma^2 equal to 1/(n-3), 0.437/(n-4), and (1+r\_hat^2/2)/(n-3) for Pearson's r, Kendall's tau, and Spearman's rho, respectively (see Refs. [1, 2] for details). Here n is the array length.

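The (1-alpha) CI for r is then  
```
(T(z_lower), T(z_upper)),
```
where T is the inverse of the transformation above,  
```
T(x) = (exp(2x) - 1) / (exp(2x) + 1),
```
and the endpoints on the z scale are  
```
z_lower = z - z_(1-alpha/2) sigma,
z_upper = z + z_(1-alpha/2) sigma.
```
Here z is evaluated at r\_hat, and z\_(1-alpha/2) is the (1-alpha/2) quantile of the standard normal distribution.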

This normal approximation works when the absolute values of Pearson's r, Kendall's tau, and Spearman's rho are less than 1, 0.8, and 0.95, respectively.

### Nonparametric Approach
For the nonparametric approach, we simply adopt a naive bootstrap method.

* Sample pairs (x\_i, y\_i) with replacement from the original (paired) samples until the resample has size n, and calculate the correlation coefficient of the resample.  
* Repeat this process a large number of times (5000 by default).
* Obtain the (1-alpha) CI for r by taking the alpha/2 and (1-alpha/2) quantiles of the resulting correlation coefficients.


## References
[1] Bonett, Douglas G., and Thomas A. Wright. "Sample size requirements for estimating Pearson, Kendall and Spearman correlations." Psychometrika 65, no. 1 (2000): 23-28.  
[2] Bishara, Anthony J., and James B. Hittner. "Confidence intervals for correlations when data are not normal." Behavior Research Methods 49, no. 1 (2017): 294-309.


## Installation:  
```
pip install correlation
```  
or

```
conda install -c wangxiangwen correlation
```

## Example Usage:  
```python
>>> import correlation
>>> a, b = list(range(2000)), list(range(200, 0, -1)) * 10
>>> correlation.corr(a, b, method='spearman_rho')
(-0.0999987624920335,          # correlation coefficient
 -0.14330929583811683,         # lower endpoint of CI
 -0.056305939127336606,        # upper endpoint of CI
 7.446171861744971e-06)        # p-value
```


%prep
%autosetup -n correlation-1.0.0

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-correlation -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 1.0.0-1
- Package Spec generated