%global _empty_manifest_terminate_build 0
Name:		python-FreeProxyScraper
Version:	0.1.17
Release:	1
Summary:	Plugin-driven web scraper for retrieving and testing free proxies
License:	MIT
URL:		https://github.com/Themis3000/FreeProxyScraper
Source0:	https://mirrors.aliyun.com/pypi/web/packages/d0/58/cd036bc135080c8da969bbf352369f69f7285c0ecd1fac14477ddc1ba619/FreeProxyScraper-0.1.17.tar.gz
BuildArch:	noarch

Requires:	python3-requests
Requires:	python3-beautifulsoup4
Requires:	python3-fake-useragent

%description
# FreeProxyScraper
This is a plugin-driven web scraper that retrieves and tests free proxies for use. Note that this package may be unstable and should not be used in a production environment.

## Installation
Run the following to install:

```bash
pip install FreeProxyScraper
```

## Usage

```python
import FreeProxyScraper

pq = FreeProxyScraper.ProxyQuery()

# Yields any proxies found
for proxy in pq.find_proxies(limit=20):
    print(proxy)

# Yields only proxies that are anonymous or "elite"
for proxy in pq.find_filter(limit=20, min_anon_level=1):
    print(proxy)
```

There are three anonymity levels, represented as integers from 0 to 2.

- Level 0: Transparent. The end server can see your real IP even though traffic is routed through a proxy
- Level 1: Anonymous. The end server knows you are using a proxy, but does not know your real IP
- Level 2: High anonymity, also sometimes called "elite". The end server does not know you are using a proxy and does not know your real IP. However, the end server may keep a database of known proxies, so it may still detect that you are using a proxy by matching your IP against that database.
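The level scheme above can be sketched as a small filter. The names and records below are illustrative only, not the library's actual API:

```python
from enum import IntEnum

class AnonLevel(IntEnum):
    TRANSPARENT = 0  # end server sees your real IP
    ANONYMOUS = 1    # proxy use is visible, real IP is hidden
    ELITE = 2        # neither proxy use nor real IP is visible

# Illustrative records: (host, port, anonymity level)
proxies = [
    ("203.0.113.5", 8080, AnonLevel.TRANSPARENT),
    ("198.51.100.7", 3128, AnonLevel.ANONYMOUS),
    ("192.0.2.9", 1080, AnonLevel.ELITE),
]

def filter_by_level(proxies, min_anon_level):
    """Keep only proxies at or above the requested anonymity level."""
    return [p for p in proxies if p[2] >= min_anon_level]

# min_anon_level=1 drops the transparent proxy, as in find_filter above
for host, port, level in filter_by_level(proxies, 1):
    print(host, port, level.name)
```

This mirrors the `min_anon_level=1` behavior of `find_filter` shown in the usage example: a threshold comparison against the integer level.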

## List of sites implemented for scraping:
- https://www.sslproxies.org/
- http://free-proxy.cz/en/
- https://spys.one/en/
- https://hidemy.name/en/proxy-list/
- https://geonode.com/free-proxy-list

## FAQ
- Why implement so many websites for scraping?

Websites are always changing, going down, or banning IPs quickly. To keep this package reliable, it is essential that it implements many websites.

- I want to make sure that I am truly not using transparent proxies. How do I know the websites being scraped aren't lying about the anonymity of the proxies?

By default, if you specify an anon_level higher than 0, every proxy is checked for transparency before it is ever given to you. There's no need to worry; your IP should be safe.
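The usual way such a transparency check works is to see whether the end server can recover your real IP from the connection or forwarded headers. The helper below is a self-contained sketch of that idea, not necessarily how FreeProxyScraper implements it:

```python
def leaks_real_ip(real_ip, echoed_headers, seen_ip):
    """Heuristic transparency check: a proxy is transparent if the
    end server can recover your real IP.

    real_ip        -- your actual public IP
    echoed_headers -- headers the end server received, echoed back
    seen_ip        -- the source IP the end server observed
    """
    if seen_ip == real_ip:
        return True  # no masking at all
    # Transparent proxies commonly forward the client address
    for header in ("X-Forwarded-For", "Via", "Forwarded"):
        if real_ip in echoed_headers.get(header, ""):
            return True
    return False

# A transparent proxy forwards the client address in X-Forwarded-For
print(leaks_real_ip("198.51.100.1",
                    {"X-Forwarded-For": "198.51.100.1"},
                    "203.0.113.5"))  # True
# An anonymous proxy strips it
print(leaks_real_ip("198.51.100.1", {}, "203.0.113.5"))  # False
```

In practice a checker would request an IP-echo endpoint through the proxy and feed the response into a function like this.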

## Development
To install FreeProxyScraper along with the tools you need for development, run the following in the directory containing this repo:

```bash
pip install -e .[dev]
```

If you'd like to contribute to development, the most needed thing right now is writing more plugins. To help, you need basic knowledge of BeautifulSoup4 and a little patience with websites that purposely make it hard to scrape information. Check out `src/plugins/examplePlugin.py` to see an example layout of a plugin file.
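As a rough sketch of the shape a plugin tends to take — one class per proxy-list site that fetches a page, parses the rows, and yields proxy records. The class name, `find` method, and `Proxy` tuple below are hypothetical; the real interface is defined by `src/plugins/examplePlugin.py` and may differ:

```python
from collections import namedtuple

# Hypothetical record shape for a scraped proxy
Proxy = namedtuple("Proxy", ["ip", "port", "anon_level"])

class ExamplePlugin:
    """One plugin wraps one proxy-list site."""
    url = "https://example.com/proxy-list"  # hypothetical site

    def find(self):
        # A real plugin would fetch self.url with requests and parse
        # the HTML table with BeautifulSoup; here we yield canned rows
        # so the sketch runs without network access.
        rows = [("203.0.113.5", "8080", 0), ("192.0.2.9", "1080", 2)]
        for ip, port, level in rows:
            yield Proxy(ip, int(port), level)

for proxy in ExamplePlugin().find():
    print(proxy)
```

The generator style matters: yielding proxies one at a time lets the query layer stop as soon as its `limit` is reached instead of scraping the whole site.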



%package -n python3-FreeProxyScraper
Summary:	Plugin-driven web scraper for retrieving and testing free proxies
Provides:	python-FreeProxyScraper
BuildRequires:	python3-devel
BuildRequires:	python3-setuptools
BuildRequires:	python3-pip
%description -n python3-FreeProxyScraper
# FreeProxyScraper
This is a plugin-driven web scraper that retrieves and tests free proxies for use. Note that this package may be unstable and should not be used in a production environment.

## Installation
Run the following to install:

```bash
pip install FreeProxyScraper
```

## Usage

```python
import FreeProxyScraper

pq = FreeProxyScraper.ProxyQuery()

# Yields any proxies found
for proxy in pq.find_proxies(limit=20):
    print(proxy)

# Yields only proxies that are anonymous or "elite"
for proxy in pq.find_filter(limit=20, min_anon_level=1):
    print(proxy)
```

There are three anonymity levels, represented as integers from 0 to 2.

- Level 0: Transparent. The end server can see your real IP even though traffic is routed through a proxy
- Level 1: Anonymous. The end server knows you are using a proxy, but does not know your real IP
- Level 2: High anonymity, also sometimes called "elite". The end server does not know you are using a proxy and does not know your real IP. However, the end server may keep a database of known proxies, so it may still detect that you are using a proxy by matching your IP against that database.

## List of sites implemented for scraping:
- https://www.sslproxies.org/
- http://free-proxy.cz/en/
- https://spys.one/en/
- https://hidemy.name/en/proxy-list/
- https://geonode.com/free-proxy-list

## FAQ
- Why implement so many websites for scraping?

Websites are always changing, going down, or banning IPs quickly. To keep this package reliable, it is essential that it implements many websites.

- I want to make sure that I am truly not using transparent proxies. How do I know the websites being scraped aren't lying about the anonymity of the proxies?

By default, if you specify an anon_level higher than 0, every proxy is checked for transparency before it is ever given to you. There's no need to worry; your IP should be safe.

## Development
To install FreeProxyScraper along with the tools you need for development, run the following in the directory containing this repo:

```bash
pip install -e .[dev]
```

If you'd like to contribute to development, the most needed thing right now is writing more plugins. To help, you need basic knowledge of BeautifulSoup4 and a little patience with websites that purposely make it hard to scrape information. Check out `src/plugins/examplePlugin.py` to see an example layout of a plugin file.



%package help
Summary:	Development documents and examples for FreeProxyScraper
Provides:	python3-FreeProxyScraper-doc
%description help
# FreeProxyScraper
This is a plugin-driven web scraper that retrieves and tests free proxies for use. Note that this package may be unstable and should not be used in a production environment.

## Installation
Run the following to install:

```bash
pip install FreeProxyScraper
```

## Usage

```python
import FreeProxyScraper

pq = FreeProxyScraper.ProxyQuery()

# Yields any proxies found
for proxy in pq.find_proxies(limit=20):
    print(proxy)

# Yields only proxies that are anonymous or "elite"
for proxy in pq.find_filter(limit=20, min_anon_level=1):
    print(proxy)
```

There are three anonymity levels, represented as integers from 0 to 2.

- Level 0: Transparent. The end server can see your real IP even though traffic is routed through a proxy
- Level 1: Anonymous. The end server knows you are using a proxy, but does not know your real IP
- Level 2: High anonymity, also sometimes called "elite". The end server does not know you are using a proxy and does not know your real IP. However, the end server may keep a database of known proxies, so it may still detect that you are using a proxy by matching your IP against that database.

## List of sites implemented for scraping:
- https://www.sslproxies.org/
- http://free-proxy.cz/en/
- https://spys.one/en/
- https://hidemy.name/en/proxy-list/
- https://geonode.com/free-proxy-list

## FAQ
- Why implement so many websites for scraping?

Websites are always changing, going down, or banning IPs quickly. To keep this package reliable, it is essential that it implements many websites.

- I want to make sure that I am truly not using transparent proxies. How do I know the websites being scraped aren't lying about the anonymity of the proxies?

By default, if you specify an anon_level higher than 0, every proxy is checked for transparency before it is ever given to you. There's no need to worry; your IP should be safe.

## Development
To install FreeProxyScraper along with the tools you need for development, run the following in the directory containing this repo:

```bash
pip install -e .[dev]
```

If you'd like to contribute to development, the most needed thing right now is writing more plugins. To help, you need basic knowledge of BeautifulSoup4 and a little patience with websites that purposely make it hard to scrape information. Check out `src/plugins/examplePlugin.py` to see an example layout of a plugin file.



%prep
%autosetup -n FreeProxyScraper-0.1.17

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
	find usr/lib -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
	find usr/lib64 -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
	find usr/bin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
	find usr/sbin -type f -printf "\"/%h/%f\"\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
	find usr/share/man -type f -printf "\"/%h/%f.gz\"\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-FreeProxyScraper -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Thu Jun 08 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.17-1
- Package Spec generated