%global _empty_manifest_terminate_build 0
Name: python-geograpy3
Version: 0.2.6
Release: 1
Summary: Extract countries, regions and cities from a URL or text
License: Apache
URL: https://github.com/somnathrakshit/geograpy3
Source0: https://mirrors.nju.edu.cn/pypi/web/packages/e3/91/79ee302e5ab4a47d3f844539b0a247d65755f2fb163471f418a02c790747/geograpy3-0.2.6.tar.gz
BuildArch: noarch

Requires: python3-newspaper3k
Requires: python3-nltk
Requires: python3-jellyfish
Requires: python3-numpy
Requires: python3-pylodstorage
Requires: python3-sphinx-rtd-theme
Requires: python3-scikit-learn
Requires: python3-pandas

%description
geograpy extracts place names from a URL or text, and adds context to those names -- for example distinguishing between a country, region or city.

The extraction is a two-step process. The first step is a Natural Language Processing task which analyzes a text for potential mentions of geographic locations. In the next step the words which represent such locations are looked up using the Locator.

If you already know that your content has geographic information you might want to use the Locator interface directly.

## Examples/Tutorial
* [see Examples/Tutorial Wiki](http://wiki.bitplan.com/index.php/Geograpy#Examples)

## Install & Setup

Grab the package using `pip` (this will take a few minutes)
```bash
pip install geograpy3
```

geograpy3 uses [NLTK](http://www.nltk.org/) for entity recognition, so you'll also need to download the models we're using. Fortunately there's a command that'll take care of this for you.
```bash
geograpy-nltk
```

## Getting the source code
```bash
git clone https://github.com/somnathrakshit/geograpy3
cd geograpy3
scripts/install
```

## Basic Usage

Import the module, give some text or a URL, and presto.
```python
import geograpy
url = 'https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay'
places = geograpy.get_geoPlace_context(url=url)
```

Now you have access to information about all the places mentioned in the linked article.

* `places.countries` _contains a list of country names_
* `places.regions` _contains a list of region names_
* `places.cities` _contains a list of city names_
* `places.other` _lists everything that wasn't clearly a country, region or city_

Note that the `other` list might be useful for shorter texts, to pull out information like street names, points of interest, etc., but at the moment is a bit messy when scanning longer texts that contain adjectival forms of proper nouns (like "Russian" instead of "Russia").

## But Wait, There's More

In addition to listing the names of discovered places, you'll also get some information about the relationships between places.

* `places.country_regions` _regions broken down by country_
* `places.country_cities` _cities broken down by country_
* `places.address_strings` _city, region, country strings useful for geocoding_

## Last But Not Least

While a text might mention many places, it's probably focused on one or two, so geograpy3 also breaks down countries, regions and cities by number of mentions.

* `places.country_mentions`
* `places.region_mentions`
* `places.city_mentions`

Each of these returns a list of tuples. The first item in the tuple is the place name and the second item is the number of mentions. For example:

    [('Russian Federation', 14), ('Ukraine', 11), ('Lithuania', 1)]

## If You're Really Serious

You can of course use each of Geograpy's modules on their own. For example:

```python
from geograpy import extraction

e = extraction.Extractor(url='https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay')
e.find_geoEntities()

# You can now access all of the places found by the Extractor
print(e.places)
```

Place context is handled in the `places` module.
For example:

```python
from geograpy import places

pc = places.PlaceContext(['Cleveland', 'Ohio', 'United States'])

pc.set_countries()
print(pc.countries)  # ['United States']

pc.set_regions()
print(pc.regions)  # ['Ohio']

pc.set_cities()
print(pc.cities)  # ['Cleveland']

print(pc.address_strings)  # ['Cleveland, Ohio, United States']
```

And of course all of the other information shown above (`country_regions` etc.) is available after the corresponding `set_` method is called.

## Stack Overflow
* [Questions tagged with 'geograpy'](https://stackoverflow.com/questions/tagged/geograpy)

## Credits

geograpy3 uses the following excellent libraries:

* [NLTK](http://www.nltk.org/) for entity recognition
* [newspaper](https://github.com/codelucas/newspaper) for text extraction from HTML
* [jellyfish](https://github.com/sunlightlabs/jellyfish) for fuzzy text match
* [pylodstorage](https://pypi.org/project/pylodstorage/) for storage and retrieval of tabular data from SQL and SPARQL sources

geograpy3 uses the following data sources:

* [ISO3166ErrorDictionary](https://github.com/bodacea/countryname/blob/master/countryname/databases/ISO3166ErrorDictionary.csv) for common country misspellings _via [Sara-Jayne Terp](https://github.com/bodacea)_
* [Wikidata](https://www.wikidata.org) for country/region/city information with disambiguation via population

Hat tip to [Chris Albon](https://github.com/chrisalbon) for the name.

%package -n python3-geograpy3
Summary: Extract countries, regions and cities from a URL or text
Provides: python-geograpy3
BuildRequires: python3-devel
BuildRequires: python3-setuptools
BuildRequires: python3-pip
%description -n python3-geograpy3
geograpy extracts place names from a URL or text, and adds context to those names -- for example distinguishing between a country, region or city.

The extraction is a two-step process. The first step is a Natural Language Processing task which analyzes a text for potential mentions of geographic locations.
In the next step the words which represent such locations are looked up using the Locator.

If you already know that your content has geographic information you might want to use the Locator interface directly.

## Examples/Tutorial
* [see Examples/Tutorial Wiki](http://wiki.bitplan.com/index.php/Geograpy#Examples)

## Install & Setup

Grab the package using `pip` (this will take a few minutes)
```bash
pip install geograpy3
```

geograpy3 uses [NLTK](http://www.nltk.org/) for entity recognition, so you'll also need to download the models we're using. Fortunately there's a command that'll take care of this for you.
```bash
geograpy-nltk
```

## Getting the source code
```bash
git clone https://github.com/somnathrakshit/geograpy3
cd geograpy3
scripts/install
```

## Basic Usage

Import the module, give some text or a URL, and presto.
```python
import geograpy
url = 'https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay'
places = geograpy.get_geoPlace_context(url=url)
```

Now you have access to information about all the places mentioned in the linked article.

* `places.countries` _contains a list of country names_
* `places.regions` _contains a list of region names_
* `places.cities` _contains a list of city names_
* `places.other` _lists everything that wasn't clearly a country, region or city_

Note that the `other` list might be useful for shorter texts, to pull out information like street names, points of interest, etc., but at the moment is a bit messy when scanning longer texts that contain adjectival forms of proper nouns (like "Russian" instead of "Russia").

## But Wait, There's More

In addition to listing the names of discovered places, you'll also get some information about the relationships between places.

* `places.country_regions` _regions broken down by country_
* `places.country_cities` _cities broken down by country_
* `places.address_strings` _city, region, country strings useful for geocoding_

## Last But Not Least

While a text might mention many places, it's probably focused on one or two, so geograpy3 also breaks down countries, regions and cities by number of mentions.

* `places.country_mentions`
* `places.region_mentions`
* `places.city_mentions`

Each of these returns a list of tuples. The first item in the tuple is the place name and the second item is the number of mentions. For example:

    [('Russian Federation', 14), ('Ukraine', 11), ('Lithuania', 1)]

## If You're Really Serious

You can of course use each of Geograpy's modules on their own. For example:

```python
from geograpy import extraction

e = extraction.Extractor(url='https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay')
e.find_geoEntities()

# You can now access all of the places found by the Extractor
print(e.places)
```

Place context is handled in the `places` module. For example:

```python
from geograpy import places

pc = places.PlaceContext(['Cleveland', 'Ohio', 'United States'])

pc.set_countries()
print(pc.countries)  # ['United States']

pc.set_regions()
print(pc.regions)  # ['Ohio']

pc.set_cities()
print(pc.cities)  # ['Cleveland']

print(pc.address_strings)  # ['Cleveland, Ohio, United States']
```

And of course all of the other information shown above (`country_regions` etc.) is available after the corresponding `set_` method is called.
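
The mention lists described under "Last But Not Least" are plain lists of `(name, count)` tuples, so they can be post-processed with the standard library alone. The following is a minimal sketch and not part of geograpy3 itself: `top_mentions` is a hypothetical helper, and the sample list is copied from the example output above rather than produced by a live run.

```python
from typing import List, Tuple

Mention = Tuple[str, int]


def top_mentions(mentions: List[Mention], limit: int = 2) -> List[str]:
    """Return the names of the `limit` most-mentioned places.

    Hypothetical helper for illustration; it only assumes the
    (name, count) tuple shape described for `places.country_mentions`.
    """
    # Sort by count, highest first; ties fall back to alphabetical order
    ranked = sorted(mentions, key=lambda item: (-item[1], item[0]))
    return [name for name, _count in ranked[:limit]]


# Sample data copied from the example output above (not a live result)
country_mentions = [('Russian Federation', 14), ('Ukraine', 11), ('Lithuania', 1)]

print(top_mentions(country_mentions))  # ['Russian Federation', 'Ukraine']
```

The same helper should work unchanged for `region_mentions` and `city_mentions`, assuming they share the tuple shape described above.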

## Stack Overflow
* [Questions tagged with 'geograpy'](https://stackoverflow.com/questions/tagged/geograpy)

## Credits

geograpy3 uses the following excellent libraries:

* [NLTK](http://www.nltk.org/) for entity recognition
* [newspaper](https://github.com/codelucas/newspaper) for text extraction from HTML
* [jellyfish](https://github.com/sunlightlabs/jellyfish) for fuzzy text match
* [pylodstorage](https://pypi.org/project/pylodstorage/) for storage and retrieval of tabular data from SQL and SPARQL sources

geograpy3 uses the following data sources:

* [ISO3166ErrorDictionary](https://github.com/bodacea/countryname/blob/master/countryname/databases/ISO3166ErrorDictionary.csv) for common country misspellings _via [Sara-Jayne Terp](https://github.com/bodacea)_
* [Wikidata](https://www.wikidata.org) for country/region/city information with disambiguation via population

Hat tip to [Chris Albon](https://github.com/chrisalbon) for the name.

%package help
Summary: Development documents and examples for geograpy3
Provides: python3-geograpy3-doc
%description help
geograpy extracts place names from a URL or text, and adds context to those names -- for example distinguishing between a country, region or city.

The extraction is a two-step process. The first step is a Natural Language Processing task which analyzes a text for potential mentions of geographic locations. In the next step the words which represent such locations are looked up using the Locator.

If you already know that your content has geographic information you might want to use the Locator interface directly.

## Examples/Tutorial
* [see Examples/Tutorial Wiki](http://wiki.bitplan.com/index.php/Geograpy#Examples)

## Install & Setup

Grab the package using `pip` (this will take a few minutes)
```bash
pip install geograpy3
```

geograpy3 uses [NLTK](http://www.nltk.org/) for entity recognition, so you'll also need to download the models we're using. Fortunately there's a command that'll take care of this for you.
```bash
geograpy-nltk
```

## Getting the source code
```bash
git clone https://github.com/somnathrakshit/geograpy3
cd geograpy3
scripts/install
```

## Basic Usage

Import the module, give some text or a URL, and presto.
```python
import geograpy
url = 'https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay'
places = geograpy.get_geoPlace_context(url=url)
```

Now you have access to information about all the places mentioned in the linked article.

* `places.countries` _contains a list of country names_
* `places.regions` _contains a list of region names_
* `places.cities` _contains a list of city names_
* `places.other` _lists everything that wasn't clearly a country, region or city_

Note that the `other` list might be useful for shorter texts, to pull out information like street names, points of interest, etc., but at the moment is a bit messy when scanning longer texts that contain adjectival forms of proper nouns (like "Russian" instead of "Russia").

## But Wait, There's More

In addition to listing the names of discovered places, you'll also get some information about the relationships between places.

* `places.country_regions` _regions broken down by country_
* `places.country_cities` _cities broken down by country_
* `places.address_strings` _city, region, country strings useful for geocoding_

## Last But Not Least

While a text might mention many places, it's probably focused on one or two, so geograpy3 also breaks down countries, regions and cities by number of mentions.

* `places.country_mentions`
* `places.region_mentions`
* `places.city_mentions`

Each of these returns a list of tuples. The first item in the tuple is the place name and the second item is the number of mentions. For example:

    [('Russian Federation', 14), ('Ukraine', 11), ('Lithuania', 1)]

## If You're Really Serious

You can of course use each of Geograpy's modules on their own.
For example:

```python
from geograpy import extraction

e = extraction.Extractor(url='https://en.wikipedia.org/wiki/2012_Summer_Olympics_torch_relay')
e.find_geoEntities()

# You can now access all of the places found by the Extractor
print(e.places)
```

Place context is handled in the `places` module. For example:

```python
from geograpy import places

pc = places.PlaceContext(['Cleveland', 'Ohio', 'United States'])

pc.set_countries()
print(pc.countries)  # ['United States']

pc.set_regions()
print(pc.regions)  # ['Ohio']

pc.set_cities()
print(pc.cities)  # ['Cleveland']

print(pc.address_strings)  # ['Cleveland, Ohio, United States']
```

And of course all of the other information shown above (`country_regions` etc.) is available after the corresponding `set_` method is called.

## Stack Overflow
* [Questions tagged with 'geograpy'](https://stackoverflow.com/questions/tagged/geograpy)

## Credits

geograpy3 uses the following excellent libraries:

* [NLTK](http://www.nltk.org/) for entity recognition
* [newspaper](https://github.com/codelucas/newspaper) for text extraction from HTML
* [jellyfish](https://github.com/sunlightlabs/jellyfish) for fuzzy text match
* [pylodstorage](https://pypi.org/project/pylodstorage/) for storage and retrieval of tabular data from SQL and SPARQL sources

geograpy3 uses the following data sources:

* [ISO3166ErrorDictionary](https://github.com/bodacea/countryname/blob/master/countryname/databases/ISO3166ErrorDictionary.csv) for common country misspellings _via [Sara-Jayne Terp](https://github.com/bodacea)_
* [Wikidata](https://www.wikidata.org) for country/region/city information with disambiguation via population

Hat tip to [Chris Albon](https://github.com/chrisalbon) for the name.

%prep
%autosetup -n geograpy3-0.2.6

%build
%py3_build

%install
%py3_install
install -d -m755 %{buildroot}/%{_pkgdocdir}
if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi
if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi
if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi
if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi
pushd %{buildroot}
if [ -d usr/lib ]; then
    find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/lib64 ]; then
    find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/bin ]; then
    find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst
fi
if [ -d usr/sbin ]; then
    find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst
fi
touch doclist.lst
if [ -d usr/share/man ]; then
    find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst
fi
popd
mv %{buildroot}/filelist.lst .
mv %{buildroot}/doclist.lst .

%files -n python3-geograpy3 -f filelist.lst
%dir %{python3_sitelib}/*

%files help -f doclist.lst
%{_docdir}/*

%changelog
* Wed Apr 12 2023 Python_Bot - 0.2.6-1
- Package Spec generated