%global _empty_manifest_terminate_build 0
Name: python-datasketch
Version: 1.5.9
Release: 1
Summary: Probabilistic data structures for processing and searching very large datasets
License: MIT
URL: https://ekzhu.github.io/datasketch
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/34/42/22ca877495066c15f05ed0fef1769545ff81efc97de0bfca49e703e06a49/datasketch-1.5.9.tar.gz
BuildArch: noarch
Requires: python3-numpy
Requires: python3-scipy
Requires: python3-pyhash
Requires: python3-matplotlib
Requires: python3-scikit-learn
Requires: python3-pandas
Requires: python3-SetSimilaritySearch
Requires: python3-pyfarmhash
Requires: python3-nltk
Requires: python3-cassandra-driver
Requires: python3-aiounittest
Requires: python3-motor
Requires: python3-redis
Requires: python3-mock
Requires: python3-mockredispy
Requires: python3-coverage
Requires: python3-pymongo
Requires: python3-nose
Requires: python3-nose-exclude
Requires: python3-pytest
%description
datasketch gives you probabilistic data structures that can process and
search very large amounts of data very quickly, with little loss of
accuracy.

This package contains the following data sketches:

+-------------------------+-----------------------------------------------+
| Data Sketch             | Usage                                         |
+=========================+===============================================+
| `MinHash`_              | estimate Jaccard similarity and cardinality   |
+-------------------------+-----------------------------------------------+
| `Weighted MinHash`_     | estimate weighted Jaccard similarity          |
+-------------------------+-----------------------------------------------+
| `HyperLogLog`_          | estimate cardinality                          |
+-------------------------+-----------------------------------------------+
| `HyperLogLog++`_        | estimate cardinality                          |
+-------------------------+-----------------------------------------------+

The following indexes for data sketches are provided to support
sub-linear query time:

+---------------------------+-----------------------------+------------------------+
| Index                     | For Data Sketch             | Supported Query Type   |
+===========================+=============================+========================+
| `MinHash LSH`_            | MinHash, Weighted MinHash   | Jaccard Threshold      |
+---------------------------+-----------------------------+------------------------+
| `MinHash LSH Forest`_     | MinHash, Weighted MinHash   | Jaccard Top-K          |
+---------------------------+-----------------------------+------------------------+
| `MinHash LSH Ensemble`_   | MinHash                     | Containment Threshold  |
+---------------------------+-----------------------------+------------------------+

datasketch must be used with Python 2.7 or above, NumPy 1.11 or above, and SciPy.
Note that `MinHash LSH`_ and `MinHash LSH Ensemble`_ also support Redis and Cassandra
storage layers (see `MinHash LSH at Scale`_).
%package -n python3-datasketch
Summary: Probabilistic data structures for processing and searching very large datasets
Provides: python-datasketch
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-datasketch
datasketch gives you probabilistic data structures that can process and
search very large amounts of data very quickly, with little loss of
accuracy.

This package contains the following data sketches:

+-------------------------+-----------------------------------------------+
| Data Sketch             | Usage                                         |
+=========================+===============================================+
| `MinHash`_              | estimate Jaccard similarity and cardinality   |
+-------------------------+-----------------------------------------------+
| `Weighted MinHash`_     | estimate weighted Jaccard similarity          |
+-------------------------+-----------------------------------------------+
| `HyperLogLog`_          | estimate cardinality                          |
+-------------------------+-----------------------------------------------+
| `HyperLogLog++`_        | estimate cardinality                          |
+-------------------------+-----------------------------------------------+

The following indexes for data sketches are provided to support
sub-linear query time:

+---------------------------+-----------------------------+------------------------+
| Index                     | For Data Sketch             | Supported Query Type   |
+===========================+=============================+========================+
| `MinHash LSH`_            | MinHash, Weighted MinHash   | Jaccard Threshold      |
+---------------------------+-----------------------------+------------------------+
| `MinHash LSH Forest`_     | MinHash, Weighted MinHash   | Jaccard Top-K          |
+---------------------------+-----------------------------+------------------------+
| `MinHash LSH Ensemble`_   | MinHash                     | Containment Threshold  |
+---------------------------+-----------------------------+------------------------+

datasketch must be used with Python 2.7 or above, NumPy 1.11 or above, and SciPy.
Note that `MinHash LSH`_ and `MinHash LSH Ensemble`_ also support Redis and Cassandra
storage layers (see `MinHash LSH at Scale`_).
%package help
Summary: Development documents and examples for datasketch
Provides: python3-datasketch-doc
%description help
datasketch gives you probabilistic data structures that can process and
search very large amounts of data very quickly, with little loss of
accuracy.

This package contains the following data sketches:

+-------------------------+-----------------------------------------------+
| Data Sketch             | Usage                                         |
+=========================+===============================================+
| `MinHash`_              | estimate Jaccard similarity and cardinality   |
+-------------------------+-----------------------------------------------+
| `Weighted MinHash`_     | estimate weighted Jaccard similarity          |
+-------------------------+-----------------------------------------------+
| `HyperLogLog`_          | estimate cardinality                          |
+-------------------------+-----------------------------------------------+
| `HyperLogLog++`_        | estimate cardinality                          |
+-------------------------+-----------------------------------------------+

The following indexes for data sketches are provided to support
sub-linear query time:

+---------------------------+-----------------------------+------------------------+
| Index                     | For Data Sketch             | Supported Query Type   |
+===========================+=============================+========================+
| `MinHash LSH`_            | MinHash, Weighted MinHash   | Jaccard Threshold      |
+---------------------------+-----------------------------+------------------------+
| `MinHash LSH Forest`_     | MinHash, Weighted MinHash   | Jaccard Top-K          |
+---------------------------+-----------------------------+------------------------+
| `MinHash LSH Ensemble`_   | MinHash                     | Containment Threshold  |
+---------------------------+-----------------------------+------------------------+

datasketch must be used with Python 2.7 or above, NumPy 1.11 or above, and SciPy.
Note that `MinHash LSH`_ and `MinHash LSH Ensemble`_ also support Redis and Cassandra
storage layers (see `MinHash LSH at Scale`_).
%prep
%autosetup -n datasketch-1.5.9
%build
%py3_build
%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
%files -n python3-datasketch -f filelist.lst
%dir %{python3_sitelib}/*
%files help -f doclist.lst
%{_docdir}/*
%changelog
* Mon Apr 10 2023 Python_Bot <Python_Bot@openeuler.org> - 1.5.9-1
- Package Spec generated