%global _empty_manifest_terminate_build 0
Name: python-pandas-ply
Version: 0.2.1
Release: 1
Summary: Functional data manipulation for pandas
License: Apache-2.0
URL: https://github.com/coursera/pandas-ply
Source0: https://mirrors.aliyun.com/pypi/web/packages/8d/6b/434ef2f9c96e10ba6f75a1f82a85cf46ac98199f581627c9e732504a62f3/pandas-ply-0.2.1.tar.gz
BuildArch: noarch
%description
**pandas-ply** is a thin layer which makes it easier to manipulate data with `pandas <http://pandas.pydata.org/>`_. In particular, it provides elegant, functional, chainable syntax in cases where **pandas** would require mutation, saved intermediate values, or other awkward constructions. In this way, it aims to move **pandas** closer to the "grammar of data manipulation" provided by the `dplyr <http://cran.r-project.org/web/packages/dplyr/index.html>`_ package for R.
For example, take the **dplyr** code below:
    flights %>%
      group_by(year, month, day) %>%
      summarise(
        arr = mean(arr_delay, na.rm = TRUE),
        dep = mean(dep_delay, na.rm = TRUE)
      ) %>%
      filter(arr > 30 & dep > 30)
The most common way to express this in **pandas** is probably:
    grouped_flights = flights.groupby(['year', 'month', 'day'])
    output = pd.DataFrame()
    output['arr'] = grouped_flights.arr_delay.mean()
    output['dep'] = grouped_flights.dep_delay.mean()
    filtered_output = output[(output.arr > 30) & (output.dep > 30)]
**pandas-ply** lets you instead write:
    (flights
        .groupby(['year', 'month', 'day'])
        .ply_select(
            arr=X.arr_delay.mean(),
            dep=X.dep_delay.mean())
        .ply_where(X.arr > 30, X.dep > 30))
In our opinion, this **pandas-ply** code is cleaner, more expressive, more readable, more concise, and less error-prone than the original **pandas** code.
Explanatory notes on the **pandas-ply** code sample above:
* **pandas-ply**'s methods (like ``ply_select`` and ``ply_where`` above) are attached directly to **pandas** objects and can be used immediately, without any wrapping or redirection. They start with a ``ply_`` prefix to distinguish them from built-in **pandas** methods.
* **pandas-ply**'s methods are named for (and modelled after) SQL's operators. (But keep in mind that these operators will not always appear in the same order as they do in a SQL statement: ``SELECT a FROM b WHERE c GROUP BY d`` probably maps to ``b.ply_where(c).groupby(d).ply_select(a)``.)
* **pandas-ply** includes a simple system for building "symbolic expressions" to provide as arguments to its methods. ``X`` above is an instance of ``ply.symbolic.Symbol``. Operations on this symbol produce larger compound symbolic expressions. When **pandas-ply** receives a symbolic expression as an argument, it converts it into a function. So, for instance, ``X.arr > 30`` in the above code could instead have been written as ``lambda x: x.arr > 30``. Symbolic expressions let you drop the ``lambda x:`` prefix, which makes the code less cluttered.
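As a rough illustration of the last note, the sketch below shows one way a symbolic expression could be turned into a function. The ``Expr`` class and its ``to_function`` method are hypothetical names invented for this example; they are not the **pandas-ply** implementation, and only attribute access and ``>`` are handled.

```python
from types import SimpleNamespace


class Expr:
    """Hypothetical deferred expression (an assumption, not pandas-ply's Symbol)."""

    def __init__(self, fn):
        self._fn = fn  # evaluates this expression for a concrete input

    def __getattr__(self, name):
        # Only reached for attributes that are not found normally.
        if name.startswith("_"):
            raise AttributeError(name)
        return Expr(lambda x: getattr(self._fn(x), name))

    def __gt__(self, other):
        # Comparison is recorded, not performed, until an input is supplied.
        return Expr(lambda x: self._fn(x) > other)

    def to_function(self):
        return self._fn


X = Expr(lambda x: x)  # the root symbol evaluates to the input itself

is_late = (X.arr > 30).to_function()  # behaves like: lambda x: x.arr > 30
print(is_late(SimpleNamespace(arr=45)))  # prints True
```

Each operation on ``X`` wraps the previous function in a new one, so the finished expression is an ordinary callable that can stand in wherever a ``lambda`` would be accepted.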
%package -n python3-pandas-ply
Summary: Functional data manipulation for pandas
Provides: python-pandas-ply
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-pandas-ply
**pandas-ply** is a thin layer which makes it easier to manipulate data with `pandas <http://pandas.pydata.org/>`_. In particular, it provides elegant, functional, chainable syntax in cases where **pandas** would require mutation, saved intermediate values, or other awkward constructions. In this way, it aims to move **pandas** closer to the "grammar of data manipulation" provided by the `dplyr <http://cran.r-project.org/web/packages/dplyr/index.html>`_ package for R.
For example, take the **dplyr** code below:
    flights %>%
      group_by(year, month, day) %>%
      summarise(
        arr = mean(arr_delay, na.rm = TRUE),
        dep = mean(dep_delay, na.rm = TRUE)
      ) %>%
      filter(arr > 30 & dep > 30)
The most common way to express this in **pandas** is probably:
    grouped_flights = flights.groupby(['year', 'month', 'day'])
    output = pd.DataFrame()
    output['arr'] = grouped_flights.arr_delay.mean()
    output['dep'] = grouped_flights.dep_delay.mean()
    filtered_output = output[(output.arr > 30) & (output.dep > 30)]
**pandas-ply** lets you instead write:
    (flights
        .groupby(['year', 'month', 'day'])
        .ply_select(
            arr=X.arr_delay.mean(),
            dep=X.dep_delay.mean())
        .ply_where(X.arr > 30, X.dep > 30))
In our opinion, this **pandas-ply** code is cleaner, more expressive, more readable, more concise, and less error-prone than the original **pandas** code.
Explanatory notes on the **pandas-ply** code sample above:
* **pandas-ply**'s methods (like ``ply_select`` and ``ply_where`` above) are attached directly to **pandas** objects and can be used immediately, without any wrapping or redirection. They start with a ``ply_`` prefix to distinguish them from built-in **pandas** methods.
* **pandas-ply**'s methods are named for (and modelled after) SQL's operators. (But keep in mind that these operators will not always appear in the same order as they do in a SQL statement: ``SELECT a FROM b WHERE c GROUP BY d`` probably maps to ``b.ply_where(c).groupby(d).ply_select(a)``.)
* **pandas-ply** includes a simple system for building "symbolic expressions" to provide as arguments to its methods. ``X`` above is an instance of ``ply.symbolic.Symbol``. Operations on this symbol produce larger compound symbolic expressions. When **pandas-ply** receives a symbolic expression as an argument, it converts it into a function. So, for instance, ``X.arr > 30`` in the above code could instead have been written as ``lambda x: x.arr > 30``. Symbolic expressions let you drop the ``lambda x:`` prefix, which makes the code less cluttered.
%package help
Summary: Development documents and examples for pandas-ply
Provides: python3-pandas-ply-doc
%description help
**pandas-ply** is a thin layer which makes it easier to manipulate data with `pandas <http://pandas.pydata.org/>`_. In particular, it provides elegant, functional, chainable syntax in cases where **pandas** would require mutation, saved intermediate values, or other awkward constructions. In this way, it aims to move **pandas** closer to the "grammar of data manipulation" provided by the `dplyr <http://cran.r-project.org/web/packages/dplyr/index.html>`_ package for R.
For example, take the **dplyr** code below:
    flights %>%
      group_by(year, month, day) %>%
      summarise(
        arr = mean(arr_delay, na.rm = TRUE),
        dep = mean(dep_delay, na.rm = TRUE)
      ) %>%
      filter(arr > 30 & dep > 30)
The most common way to express this in **pandas** is probably:
    grouped_flights = flights.groupby(['year', 'month', 'day'])
    output = pd.DataFrame()
    output['arr'] = grouped_flights.arr_delay.mean()
    output['dep'] = grouped_flights.dep_delay.mean()
    filtered_output = output[(output.arr > 30) & (output.dep > 30)]
**pandas-ply** lets you instead write:
    (flights
        .groupby(['year', 'month', 'day'])
        .ply_select(
            arr=X.arr_delay.mean(),
            dep=X.dep_delay.mean())
        .ply_where(X.arr > 30, X.dep > 30))
In our opinion, this **pandas-ply** code is cleaner, more expressive, more readable, more concise, and less error-prone than the original **pandas** code.
Explanatory notes on the **pandas-ply** code sample above:
* **pandas-ply**'s methods (like ``ply_select`` and ``ply_where`` above) are attached directly to **pandas** objects and can be used immediately, without any wrapping or redirection. They start with a ``ply_`` prefix to distinguish them from built-in **pandas** methods.
* **pandas-ply**'s methods are named for (and modelled after) SQL's operators. (But keep in mind that these operators will not always appear in the same order as they do in a SQL statement: ``SELECT a FROM b WHERE c GROUP BY d`` probably maps to ``b.ply_where(c).groupby(d).ply_select(a)``.)
* **pandas-ply** includes a simple system for building "symbolic expressions" to provide as arguments to its methods. ``X`` above is an instance of ``ply.symbolic.Symbol``. Operations on this symbol produce larger compound symbolic expressions. When **pandas-ply** receives a symbolic expression as an argument, it converts it into a function. So, for instance, ``X.arr > 30`` in the above code could instead have been written as ``lambda x: x.arr > 30``. Symbolic expressions let you drop the ``lambda x:`` prefix, which makes the code less cluttered.
%prep
%autosetup -n pandas-ply-0.2.1
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-pandas-ply -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Tue Jun 20 2023 Python_Bot <Python_Bot@openeuler.org> - 0.2.1-1
- Package Spec generated