-rw-r--r--   .gitignore                      1
-rw-r--r--   python-freeproxyscraper.spec  246
-rw-r--r--   sources                         1

3 files changed, 248 insertions, 0 deletions
@@ -0,0 +1 @@
+/FreeProxyScraper-0.1.17.tar.gz
diff --git a/python-freeproxyscraper.spec b/python-freeproxyscraper.spec
new file mode 100644
index 0000000..212560c
--- /dev/null
+++ b/python-freeproxyscraper.spec
@@ -0,0 +1,246 @@
%global _empty_manifest_terminate_build 0
Name: python-FreeProxyScraper
Version: 0.1.17
Release: 1
Summary: A plugin-driven web scraper for finding and testing free proxies
License: MIT
URL: https://github.com/Themis3000/FreeProxyScraper
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/d0/58/cd036bc135080c8da969bbf352369f69f7285c0ecd1fac14477ddc1ba619/FreeProxyScraper-0.1.17.tar.gz
BuildArch: noarch

Requires: python3-requests
Requires: python3-beautifulsoup4
Requires: python3-fake-useragent

%description
# FreeProxyScraper
This is a plugin-driven web scraper that retrieves and tests free proxies. Note that this package may be unstable and should not be used in a production environment.

## Installation
Run the following to install:

```bash
pip install FreeProxyScraper
```

## Usage

```python
import FreeProxyScraper

pq = FreeProxyScraper.ProxyQuery()

# Yields any proxies found
for proxy in pq.find_proxies(limit=20):
    print(proxy)

# Yields only proxies that are anonymous or "elite"
for proxy in pq.find_filter(limit=20, min_anon_level=1):
    print(proxy)
```

There are 3 anonymity levels, represented as integers from 0 to 2.

- Level 0: Transparent. The end server can see your real IP even though your traffic is routed through a proxy
- Level 1: Anonymous. The end server knows you are using a proxy, but does not know your real IP
- Level 2: High anonymity, also sometimes called "elite". The end server does not know you are using a proxy or your real IP. The end server may have a database of known proxies, however, so it may still recognize that you are using a proxy by matching your IP against that database.

## List of sites implemented for scraping:
- https://www.sslproxies.org/
- http://free-proxy.cz/en/
- https://spys.one/en/
- https://hidemy.name/en/proxy-list/
- https://geonode.com/free-proxy-list

## FAQ
- Why implement so many websites for scraping?

Websites are always changing, going down, or banning IPs. To keep this package reliable, it is essential that it implements many websites.

- I want to make sure I am truly not using transparent proxies. How do I know the websites being scraped aren't lying about the anonymity of their proxies?

By default, whenever you request an anon_level higher than 0, every proxy is checked for transparency before it is ever given to you (a sketch of such a check appears at the end of this description). There's no need to worry; your IP should be safe.

## Development
To install FreeProxyScraper along with the tools you need for development, run the following in the directory containing this repo:

```bash
pip install -e .[dev]
```

If you'd like to contribute, the most needed work right now is writing more plugins. To help, you need basic knowledge of BeautifulSoup4 and a little patience with websites that purposely make it hard to scrape information. Check out `src/plugins/examplePlugin.py` to see an example layout of a plugin file.
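For orientation, here is a minimal sketch of the shape a plugin might take. The `Proxy` container, the `find` generator, and the table selectors below are illustrative assumptions, not the package's confirmed plugin interface; `src/plugins/examplePlugin.py` remains the authoritative reference.

```python
# Hypothetical plugin sketch; all names here are illustrative only.
import requests
from bs4 import BeautifulSoup


class Proxy:
    """Assumed container for one scraped proxy entry."""
    def __init__(self, ip, port, anon_level):
        self.ip = ip
        self.port = port
        self.anon_level = anon_level  # 0-2, per the levels described above


class SSLProxiesPlugin:
    url = "https://www.sslproxies.org/"  # one of the sites listed above

    def find(self):
        """Fetch the site's proxy table and yield Proxy objects."""
        page = requests.get(self.url, timeout=10)
        soup = BeautifulSoup(page.text, "html.parser")
        for row in soup.select("table tbody tr"):
            cells = [td.get_text(strip=True) for td in row.find_all("td")]
            if len(cells) >= 2 and cells[1].isdigit():
                # A real plugin would also parse the anonymity column;
                # 0 (transparent) is a conservative default here.
                yield Proxy(ip=cells[0], port=int(cells[1]), anon_level=0)
```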
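As for the transparency check mentioned in the FAQ, the package does not document its exact method here, but the general idea can be sketched as follows: request an IP-echo service through the proxy and confirm that your real address does not leak. The echo endpoint and the `looks_transparent` helper are illustrative assumptions, not FreeProxyScraper's API.

```python
import requests

ECHO_URL = "https://httpbin.org/ip"  # illustrative IP-echo endpoint


def looks_transparent(proxy_address, real_ip, timeout=10):
    """Return True if the proxy leaks real_ip to the end server.

    Hypothetical helper; proxy_address is expected in the
    form "http://host:port".
    """
    proxies = {"http": proxy_address, "https": proxy_address}
    try:
        reply = requests.get(ECHO_URL, proxies=proxies, timeout=timeout)
        reply.raise_for_status()
    except requests.RequestException:
        return True  # treat unreachable or failing proxies as unsafe
    # httpbin echoes the requesting address in the "origin" field
    return real_ip in reply.json().get("origin", "")
```

A caller would first learn its real IP by hitting the echo endpoint without a proxy, then discard any candidate for which this check returns True.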

%package -n python3-FreeProxyScraper
Summary: A plugin-driven web scraper for finding and testing free proxies
Provides: python-FreeProxyScraper
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-FreeProxyScraper
# FreeProxyScraper
This is a plugin-driven web scraper that retrieves and tests free proxies. Note that this package may be unstable and should not be used in a production environment.

## Installation
Run the following to install:

```bash
pip install FreeProxyScraper
```

## Usage

```python
import FreeProxyScraper

pq = FreeProxyScraper.ProxyQuery()

# Yields any proxies found
for proxy in pq.find_proxies(limit=20):
    print(proxy)

# Yields only proxies that are anonymous or "elite"
for proxy in pq.find_filter(limit=20, min_anon_level=1):
    print(proxy)
```

There are 3 anonymity levels, represented as integers from 0 to 2.

- Level 0: Transparent. The end server can see your real IP even though your traffic is routed through a proxy
- Level 1: Anonymous. The end server knows you are using a proxy, but does not know your real IP
- Level 2: High anonymity, also sometimes called "elite". The end server does not know you are using a proxy or your real IP. The end server may have a database of known proxies, however, so it may still recognize that you are using a proxy by matching your IP against that database.

## List of sites implemented for scraping:
- https://www.sslproxies.org/
- http://free-proxy.cz/en/
- https://spys.one/en/
- https://hidemy.name/en/proxy-list/
- https://geonode.com/free-proxy-list

## FAQ
- Why implement so many websites for scraping?

Websites are always changing, going down, or banning IPs. To keep this package reliable, it is essential that it implements many websites.

- I want to make sure I am truly not using transparent proxies. How do I know the websites being scraped aren't lying about the anonymity of their proxies?

By default, whenever you request an anon_level higher than 0, every proxy is checked for transparency before it is ever given to you. There's no need to worry; your IP should be safe.

## Development
To install FreeProxyScraper along with the tools you need for development, run the following in the directory containing this repo:

```bash
pip install -e .[dev]
```

If you'd like to contribute, the most needed work right now is writing more plugins. To help, you need basic knowledge of BeautifulSoup4 and a little patience with websites that purposely make it hard to scrape information. Check out `src/plugins/examplePlugin.py` to see an example layout of a plugin file.

%package help
Summary: Development documents and examples for FreeProxyScraper
Provides: python3-FreeProxyScraper-doc
%description help
# FreeProxyScraper
This is a plugin-driven web scraper that retrieves and tests free proxies. Note that this package may be unstable and should not be used in a production environment.

## Installation
Run the following to install:

```bash
pip install FreeProxyScraper
```

## Usage

```python
import FreeProxyScraper

pq = FreeProxyScraper.ProxyQuery()

# Yields any proxies found
for proxy in pq.find_proxies(limit=20):
    print(proxy)

# Yields only proxies that are anonymous or "elite"
for proxy in pq.find_filter(limit=20, min_anon_level=1):
    print(proxy)
```

There are 3 anonymity levels, represented as integers from 0 to 2.

- Level 0: Transparent. The end server can see your real IP even though your traffic is routed through a proxy
- Level 1: Anonymous. The end server knows you are using a proxy, but does not know your real IP
- Level 2: High anonymity, also sometimes called "elite". The end server does not know you are using a proxy or your real IP. The end server may have a database of known proxies, however, so it may still recognize that you are using a proxy by matching your IP against that database.

## List of sites implemented for scraping:
- https://www.sslproxies.org/
- http://free-proxy.cz/en/
- https://spys.one/en/
- https://hidemy.name/en/proxy-list/
- https://geonode.com/free-proxy-list

## FAQ
- Why implement so many websites for scraping?

Websites are always changing, going down, or banning IPs. To keep this package reliable, it is essential that it implements many websites.

- I want to make sure I am truly not using transparent proxies. How do I know the websites being scraped aren't lying about the anonymity of their proxies?

By default, whenever you request an anon_level higher than 0, every proxy is checked for transparency before it is ever given to you. There's no need to worry; your IP should be safe.

## Development
To install FreeProxyScraper along with the tools you need for development, run the following in the directory containing this repo:

```bash
pip install -e .[dev]
```

If you'd like to contribute, the most needed work right now is writing more plugins. To help, you need basic knowledge of BeautifulSoup4 and a little patience with websites that purposely make it hard to scrape information. Check out `src/plugins/examplePlugin.py` to see an example layout of a plugin file.

%prep
%autosetup -n FreeProxyScraper-0.1.17

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
# Copy any documentation and example directories shipped in the source tree
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
# Record every installed file, one absolute path per line
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
# Man pages are compressed by rpm, so list them with a .gz suffix
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .
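# The filelist.lst and doclist.lst moved out of the buildroot above are
# consumed by the %%files sections below via "-f": rpm reads one absolute
# path per line, so the package contents never have to be listed by hand.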

%files -n python3-FreeProxyScraper -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Mon May 15 2023 Python_Bot <Python_Bot@openeuler.org> - 0.1.17-1
- Package Spec generated

@@ -0,0 +1 @@
+5c828e8b938d7cf18e2def9433d943dd FreeProxyScraper-0.1.17.tar.gz
