%global _empty_manifest_terminate_build 0 Name: python-lbsntransform Version: 0.25.1 Release: 1 Summary: Location based social network (LBSN) data structure format & transfer tool License: GNU GPLv3 or any higher URL: https://gitlab.vgiscience.de/lbsn/lbsntransform Source0: https://mirrors.nju.edu.cn/pypi/web/packages/18/61/fe7960bc1a74d44fd818f69d6373c755ec926cbde4bcb4549b7fd845d477/lbsntransform-0.25.1.tar.gz BuildArch: noarch Requires: python3-lbsnstructure Requires: python3-protobuf Requires: python3-psycopg2-binary Requires: python3-ppygis3 Requires: python3-shapely Requires: python3-emoji Requires: python3-requests Requires: python3-geos Requires: python3-numpy Requires: python3-regex Requires: python3-nltk %description [![PyPI version](https://lbsn.vgiscience.org/lbsntransform/pypi.svg?&kill_cache=2)](https://pypi.org/project/lbsntransform/) [![pylint](https://lbsn.vgiscience.org/lbsntransform/pylint.svg)](https://gitlab.vgiscience.de/lbsn/lbsntransform) [![pipeline status](https://gitlab.vgiscience.de/lbsn/lbsntransform/badges/master/pipeline.svg)](https://gitlab.vgiscience.de/lbsn/lbsntransform/-/commits/master) [![Documentation](https://lbsn.vgiscience.org/lbsntransform/documentation.svg)](https://lbsn.vgiscience.org/lbsntransform/docs/) # LBSNTransform A python package that uses the [common location based social network (LBSN) data structure][lbsnstructure] (ProtoBuf) to import, transform and export Social Media data such as Twitter and Flickr. ![Illustration of functions](https://lbsn.vgiscience.org/lbsntransform/docs/inputoutput.svg) ## Motivation The goal is to provide a common interface to handle Social Media Data, without the need to individually adapt to the myriad API endpoints available. As an example, consider the ProtoBuf spec [lbsn.Post][lbsnpost], which can be a Tweet on Twitter, a Photo shared on Flickr, or a post on Reddit. However, all of these objects share a common set of attributes, which is reflected in the lbsnstructure. The tool is based on a 4-Facet conceptual framework for LBSN, introduced in a paper by [Dunkel et al. (2018)](https://www.tandfonline.com/doi/full/10.1080/13658816.2018.1546390). The GDPR directly requests Social Media Network operators to allow users to transfer accounts and data in-between services. While there are attempts by Google, Facebook etc. (e.g. see the [data-transfer-project][data-transfer-project]), this is not currently possible. With the lbsnstructure, a primary motivation is to systematically characterize LBSN data aspects in a common, cross-network data scheme that enables privacy-by-design for connected software, data handling and database design. ## Description This tool enables data import from a Postgres database, JSON, or CSV and export to CSV, [LBSN ProtoBuf][lbsnstructure] or the [hll][hlldb] and [raw][rawdb] versions of the LBSN prepared Postgres Databases. The tool will map Social Media endpoints (e.g. Twitter tweets) to a common [LBSN Interchange Structure][lbsnstructure] format in ProtoBuf. LBSNTransform can be used using the command line (CLI) or imported to other Python projects with `import lbsntransform`, for on-the-fly conversion. ## Quick Start The recommended way to install lbsntransform, for both Linux and Windows, is through the conda package manager. 1. Create a conda env using `environment.yml` First, create an environment with the dependencies for lbsntransform using the [environment.yml][environment.yml] that is provided in the root of the repository. ```bash git clone https://github.com/Sieboldianus/lbsntransform.git cd lbsntransform # not necessary, but recommended: conda config --env --set channel_priority strict conda env create -f environment.yml ``` 2. Install lbsntransform without dependencies Afterwards, install lbsntransform using pip, without dependencies. ```bash conda activate lbsntransform pip install lbsntransform --no-deps --upgrade # or locally, from the latest commits on master # pip install . --no-deps --upgrade ``` 3. Import data using a mapping For each data source, a mapping must be provided that defines how data is mapped to the [lbsnstructure][lbsnstructure]. The default mapping is [lbsnraw][lbsnraw]. Additional mappings can be dynamically loaded from a folder. We have provided two [example mappings][mappings] for the [Flickr YFCC100M dataset][yfcc100m] (CSV) and Twitter (json). For example, to import the first 1000 records from json data from Twitter to the [lbsn raw database][rawdb], clone [field_mapping_twitter.py][field_mapping_twitter] to a local folder `./resources/mappings/`, startup the Docker [rawdb][rawdb] container, and use: ```shell lbsntransform --origin 3 \ --mappings_path ./resources/mappings/ \ --file_input \ --file_type "json" \ --mappings_path ./resources/mappings/ \ --dbpassword_output "sample-key" \ --dbuser_output "postgres" \ --dbserveraddress_output "127.0.0.1:5432" \ --dbname_output "rawdb" \ --dbformat_output "lbsn" \ --transferlimit 1000 ``` .. with the above input args, the the tool will: - read local json from `./01_Input/` - and store lbsn records to the [lbsn rawdb][rawdb]. Vice versa, to import data directly to the privacy-aware version of lbsnstructure, called [hlldb][hlldb], startup the Docker container, and use: ```shell lbsntransform --origin 3 \ --mappings_path ./resources/mappings/ \ --file_input \ --file_type "json" \ --mappings_path ./resources/mappings/ \ --dbpassword_output "sample-key" \ --dbuser_output "postgres" \ --dbserveraddress_output "127.0.0.1:25432" \ --dbname_output "hlldb" \ --dbformat_output "hll" \ --dbpassword_hllworker "sample-key" \ --dbuser_hllworker "postgres" \ --dbserveraddress_hllworker "127.0.0.1:25432" \ --dbname_hllworker "hlldb" \ --include_lbsn_objects "origin,post" \ --include_lbsn_bases hashtag,place,date,community \ --transferlimit 1000 ``` .. with the above input args, the the tool will: - read local json from `./01_Input/` - and store lbsn records to the privacy-aware [lbsn hlldb][hlldb] - by converting only lbsn objects of type [origin][lbsnorigin] and [post][lbsnpost] - and updating the HyperLogLog (HLL) target tables `hashtag`, `place`, `date` and `community` A full list of possible input and output args is available in the [documentation](https://lbsn.vgiscience.org/lbsntransform/docs/). ## Built With * [lbsnstructure](https://pypi.org/project/lbsnstructure/) - A common language independend and cross-network social-media datascheme * [protobuf](https://github.com/google/protobuf) - Google's data interchange format * [psycopg2](https://github.com/psycopg/psycopg2) - Python-PostgreSQL Database Adapter * [ppygis3](https://github.com/AlexImmer/ppygis3) - A PPyGIS port for Python * [shapely](https://github.com/Toblerity/Shapely) - Geometric objects processing in Python * [emoji](https://github.com/carpedm20/emoji/) - Emoji handling in Python ## Authors * **Alexander Dunkel** - Initial work See also the list of [contributors](/../graphs/master). ## License This project is licensed under the GNU GPLv3 or any higher - see the [LICENSE.md](LICENSE.md) file for details. [lbsnstructure]: https://lbsn.vgiscience.org/structure/ [lbsnpost]: https://lbsn.vgiscience.org/structure/#post [lbsnorigin]: https://lbsn.vgiscience.org/structure/#origin [data-transfer-project]: https://datatransferproject.dev/ [rawdb]: https://gitlab.vgiscience.de/lbsn/databases/rawdb [hlldb]: https://gitlab.vgiscience.de/lbsn/databases/hlldb [lbsnraw]: lbsntransform/input/mappings/field_mapping_lbsn.py [mappings]: resources/mappings [field_mapping_twitter]: resources/mappings/field_mapping_twitter.py [yfcc100m]: http://projects.dfki.uni-kl.de/yfcc100m/ %package -n python3-lbsntransform Summary: Location based social network (LBSN) data structure format & transfer tool Provides: python-lbsntransform BuildRequires: python3-devel BuildRequires: python3-setuptools BuildRequires: python3-pip %description -n python3-lbsntransform [![PyPI version](https://lbsn.vgiscience.org/lbsntransform/pypi.svg?&kill_cache=2)](https://pypi.org/project/lbsntransform/) [![pylint](https://lbsn.vgiscience.org/lbsntransform/pylint.svg)](https://gitlab.vgiscience.de/lbsn/lbsntransform) [![pipeline status](https://gitlab.vgiscience.de/lbsn/lbsntransform/badges/master/pipeline.svg)](https://gitlab.vgiscience.de/lbsn/lbsntransform/-/commits/master) [![Documentation](https://lbsn.vgiscience.org/lbsntransform/documentation.svg)](https://lbsn.vgiscience.org/lbsntransform/docs/) # LBSNTransform A python package that uses the [common location based social network (LBSN) data structure][lbsnstructure] (ProtoBuf) to import, transform and export Social Media data such as Twitter and Flickr. ![Illustration of functions](https://lbsn.vgiscience.org/lbsntransform/docs/inputoutput.svg) ## Motivation The goal is to provide a common interface to handle Social Media Data, without the need to individually adapt to the myriad API endpoints available. As an example, consider the ProtoBuf spec [lbsn.Post][lbsnpost], which can be a Tweet on Twitter, a Photo shared on Flickr, or a post on Reddit. However, all of these objects share a common set of attributes, which is reflected in the lbsnstructure. The tool is based on a 4-Facet conceptual framework for LBSN, introduced in a paper by [Dunkel et al. (2018)](https://www.tandfonline.com/doi/full/10.1080/13658816.2018.1546390). The GDPR directly requests Social Media Network operators to allow users to transfer accounts and data in-between services. While there are attempts by Google, Facebook etc. (e.g. see the [data-transfer-project][data-transfer-project]), this is not currently possible. With the lbsnstructure, a primary motivation is to systematically characterize LBSN data aspects in a common, cross-network data scheme that enables privacy-by-design for connected software, data handling and database design. ## Description This tool enables data import from a Postgres database, JSON, or CSV and export to CSV, [LBSN ProtoBuf][lbsnstructure] or the [hll][hlldb] and [raw][rawdb] versions of the LBSN prepared Postgres Databases. The tool will map Social Media endpoints (e.g. Twitter tweets) to a common [LBSN Interchange Structure][lbsnstructure] format in ProtoBuf. LBSNTransform can be used using the command line (CLI) or imported to other Python projects with `import lbsntransform`, for on-the-fly conversion. ## Quick Start The recommended way to install lbsntransform, for both Linux and Windows, is through the conda package manager. 1. Create a conda env using `environment.yml` First, create an environment with the dependencies for lbsntransform using the [environment.yml][environment.yml] that is provided in the root of the repository. ```bash git clone https://github.com/Sieboldianus/lbsntransform.git cd lbsntransform # not necessary, but recommended: conda config --env --set channel_priority strict conda env create -f environment.yml ``` 2. Install lbsntransform without dependencies Afterwards, install lbsntransform using pip, without dependencies. ```bash conda activate lbsntransform pip install lbsntransform --no-deps --upgrade # or locally, from the latest commits on master # pip install . --no-deps --upgrade ``` 3. Import data using a mapping For each data source, a mapping must be provided that defines how data is mapped to the [lbsnstructure][lbsnstructure]. The default mapping is [lbsnraw][lbsnraw]. Additional mappings can be dynamically loaded from a folder. We have provided two [example mappings][mappings] for the [Flickr YFCC100M dataset][yfcc100m] (CSV) and Twitter (json). For example, to import the first 1000 records from json data from Twitter to the [lbsn raw database][rawdb], clone [field_mapping_twitter.py][field_mapping_twitter] to a local folder `./resources/mappings/`, startup the Docker [rawdb][rawdb] container, and use: ```shell lbsntransform --origin 3 \ --mappings_path ./resources/mappings/ \ --file_input \ --file_type "json" \ --mappings_path ./resources/mappings/ \ --dbpassword_output "sample-key" \ --dbuser_output "postgres" \ --dbserveraddress_output "127.0.0.1:5432" \ --dbname_output "rawdb" \ --dbformat_output "lbsn" \ --transferlimit 1000 ``` .. with the above input args, the the tool will: - read local json from `./01_Input/` - and store lbsn records to the [lbsn rawdb][rawdb]. Vice versa, to import data directly to the privacy-aware version of lbsnstructure, called [hlldb][hlldb], startup the Docker container, and use: ```shell lbsntransform --origin 3 \ --mappings_path ./resources/mappings/ \ --file_input \ --file_type "json" \ --mappings_path ./resources/mappings/ \ --dbpassword_output "sample-key" \ --dbuser_output "postgres" \ --dbserveraddress_output "127.0.0.1:25432" \ --dbname_output "hlldb" \ --dbformat_output "hll" \ --dbpassword_hllworker "sample-key" \ --dbuser_hllworker "postgres" \ --dbserveraddress_hllworker "127.0.0.1:25432" \ --dbname_hllworker "hlldb" \ --include_lbsn_objects "origin,post" \ --include_lbsn_bases hashtag,place,date,community \ --transferlimit 1000 ``` .. with the above input args, the the tool will: - read local json from `./01_Input/` - and store lbsn records to the privacy-aware [lbsn hlldb][hlldb] - by converting only lbsn objects of type [origin][lbsnorigin] and [post][lbsnpost] - and updating the HyperLogLog (HLL) target tables `hashtag`, `place`, `date` and `community` A full list of possible input and output args is available in the [documentation](https://lbsn.vgiscience.org/lbsntransform/docs/). ## Built With * [lbsnstructure](https://pypi.org/project/lbsnstructure/) - A common language independend and cross-network social-media datascheme * [protobuf](https://github.com/google/protobuf) - Google's data interchange format * [psycopg2](https://github.com/psycopg/psycopg2) - Python-PostgreSQL Database Adapter * [ppygis3](https://github.com/AlexImmer/ppygis3) - A PPyGIS port for Python * [shapely](https://github.com/Toblerity/Shapely) - Geometric objects processing in Python * [emoji](https://github.com/carpedm20/emoji/) - Emoji handling in Python ## Authors * **Alexander Dunkel** - Initial work See also the list of [contributors](/../graphs/master). ## License This project is licensed under the GNU GPLv3 or any higher - see the [LICENSE.md](LICENSE.md) file for details. [lbsnstructure]: https://lbsn.vgiscience.org/structure/ [lbsnpost]: https://lbsn.vgiscience.org/structure/#post [lbsnorigin]: https://lbsn.vgiscience.org/structure/#origin [data-transfer-project]: https://datatransferproject.dev/ [rawdb]: https://gitlab.vgiscience.de/lbsn/databases/rawdb [hlldb]: https://gitlab.vgiscience.de/lbsn/databases/hlldb [lbsnraw]: lbsntransform/input/mappings/field_mapping_lbsn.py [mappings]: resources/mappings [field_mapping_twitter]: resources/mappings/field_mapping_twitter.py [yfcc100m]: http://projects.dfki.uni-kl.de/yfcc100m/ %package help Summary: Development documents and examples for lbsntransform Provides: python3-lbsntransform-doc %description help [![PyPI version](https://lbsn.vgiscience.org/lbsntransform/pypi.svg?&kill_cache=2)](https://pypi.org/project/lbsntransform/) [![pylint](https://lbsn.vgiscience.org/lbsntransform/pylint.svg)](https://gitlab.vgiscience.de/lbsn/lbsntransform) [![pipeline status](https://gitlab.vgiscience.de/lbsn/lbsntransform/badges/master/pipeline.svg)](https://gitlab.vgiscience.de/lbsn/lbsntransform/-/commits/master) [![Documentation](https://lbsn.vgiscience.org/lbsntransform/documentation.svg)](https://lbsn.vgiscience.org/lbsntransform/docs/) # LBSNTransform A python package that uses the [common location based social network (LBSN) data structure][lbsnstructure] (ProtoBuf) to import, transform and export Social Media data such as Twitter and Flickr. ![Illustration of functions](https://lbsn.vgiscience.org/lbsntransform/docs/inputoutput.svg) ## Motivation The goal is to provide a common interface to handle Social Media Data, without the need to individually adapt to the myriad API endpoints available. As an example, consider the ProtoBuf spec [lbsn.Post][lbsnpost], which can be a Tweet on Twitter, a Photo shared on Flickr, or a post on Reddit. However, all of these objects share a common set of attributes, which is reflected in the lbsnstructure. The tool is based on a 4-Facet conceptual framework for LBSN, introduced in a paper by [Dunkel et al. (2018)](https://www.tandfonline.com/doi/full/10.1080/13658816.2018.1546390). The GDPR directly requests Social Media Network operators to allow users to transfer accounts and data in-between services. While there are attempts by Google, Facebook etc. (e.g. see the [data-transfer-project][data-transfer-project]), this is not currently possible. With the lbsnstructure, a primary motivation is to systematically characterize LBSN data aspects in a common, cross-network data scheme that enables privacy-by-design for connected software, data handling and database design. ## Description This tool enables data import from a Postgres database, JSON, or CSV and export to CSV, [LBSN ProtoBuf][lbsnstructure] or the [hll][hlldb] and [raw][rawdb] versions of the LBSN prepared Postgres Databases. The tool will map Social Media endpoints (e.g. Twitter tweets) to a common [LBSN Interchange Structure][lbsnstructure] format in ProtoBuf. LBSNTransform can be used using the command line (CLI) or imported to other Python projects with `import lbsntransform`, for on-the-fly conversion. ## Quick Start The recommended way to install lbsntransform, for both Linux and Windows, is through the conda package manager. 1. Create a conda env using `environment.yml` First, create an environment with the dependencies for lbsntransform using the [environment.yml][environment.yml] that is provided in the root of the repository. ```bash git clone https://github.com/Sieboldianus/lbsntransform.git cd lbsntransform # not necessary, but recommended: conda config --env --set channel_priority strict conda env create -f environment.yml ``` 2. Install lbsntransform without dependencies Afterwards, install lbsntransform using pip, without dependencies. ```bash conda activate lbsntransform pip install lbsntransform --no-deps --upgrade # or locally, from the latest commits on master # pip install . --no-deps --upgrade ``` 3. Import data using a mapping For each data source, a mapping must be provided that defines how data is mapped to the [lbsnstructure][lbsnstructure]. The default mapping is [lbsnraw][lbsnraw]. Additional mappings can be dynamically loaded from a folder. We have provided two [example mappings][mappings] for the [Flickr YFCC100M dataset][yfcc100m] (CSV) and Twitter (json). For example, to import the first 1000 records from json data from Twitter to the [lbsn raw database][rawdb], clone [field_mapping_twitter.py][field_mapping_twitter] to a local folder `./resources/mappings/`, startup the Docker [rawdb][rawdb] container, and use: ```shell lbsntransform --origin 3 \ --mappings_path ./resources/mappings/ \ --file_input \ --file_type "json" \ --mappings_path ./resources/mappings/ \ --dbpassword_output "sample-key" \ --dbuser_output "postgres" \ --dbserveraddress_output "127.0.0.1:5432" \ --dbname_output "rawdb" \ --dbformat_output "lbsn" \ --transferlimit 1000 ``` .. with the above input args, the the tool will: - read local json from `./01_Input/` - and store lbsn records to the [lbsn rawdb][rawdb]. Vice versa, to import data directly to the privacy-aware version of lbsnstructure, called [hlldb][hlldb], startup the Docker container, and use: ```shell lbsntransform --origin 3 \ --mappings_path ./resources/mappings/ \ --file_input \ --file_type "json" \ --mappings_path ./resources/mappings/ \ --dbpassword_output "sample-key" \ --dbuser_output "postgres" \ --dbserveraddress_output "127.0.0.1:25432" \ --dbname_output "hlldb" \ --dbformat_output "hll" \ --dbpassword_hllworker "sample-key" \ --dbuser_hllworker "postgres" \ --dbserveraddress_hllworker "127.0.0.1:25432" \ --dbname_hllworker "hlldb" \ --include_lbsn_objects "origin,post" \ --include_lbsn_bases hashtag,place,date,community \ --transferlimit 1000 ``` .. with the above input args, the the tool will: - read local json from `./01_Input/` - and store lbsn records to the privacy-aware [lbsn hlldb][hlldb] - by converting only lbsn objects of type [origin][lbsnorigin] and [post][lbsnpost] - and updating the HyperLogLog (HLL) target tables `hashtag`, `place`, `date` and `community` A full list of possible input and output args is available in the [documentation](https://lbsn.vgiscience.org/lbsntransform/docs/). ## Built With * [lbsnstructure](https://pypi.org/project/lbsnstructure/) - A common language independend and cross-network social-media datascheme * [protobuf](https://github.com/google/protobuf) - Google's data interchange format * [psycopg2](https://github.com/psycopg/psycopg2) - Python-PostgreSQL Database Adapter * [ppygis3](https://github.com/AlexImmer/ppygis3) - A PPyGIS port for Python * [shapely](https://github.com/Toblerity/Shapely) - Geometric objects processing in Python * [emoji](https://github.com/carpedm20/emoji/) - Emoji handling in Python ## Authors * **Alexander Dunkel** - Initial work See also the list of [contributors](/../graphs/master). ## License This project is licensed under the GNU GPLv3 or any higher - see the [LICENSE.md](LICENSE.md) file for details. [lbsnstructure]: https://lbsn.vgiscience.org/structure/ [lbsnpost]: https://lbsn.vgiscience.org/structure/#post [lbsnorigin]: https://lbsn.vgiscience.org/structure/#origin [data-transfer-project]: https://datatransferproject.dev/ [rawdb]: https://gitlab.vgiscience.de/lbsn/databases/rawdb [hlldb]: https://gitlab.vgiscience.de/lbsn/databases/hlldb [lbsnraw]: lbsntransform/input/mappings/field_mapping_lbsn.py [mappings]: resources/mappings [field_mapping_twitter]: resources/mappings/field_mapping_twitter.py [yfcc100m]: http://projects.dfki.uni-kl.de/yfcc100m/ %prep %autosetup -n lbsntransform-0.25.1 %build %py3_build %install %py3_install install -d -m755 %{buildroot}/%{_pkgdocdir} if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi pushd %{buildroot} if [ -d usr/lib ]; then find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst fi if [ -d usr/lib64 ]; then find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst fi if [ -d usr/bin ]; then find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst fi if [ -d usr/sbin ]; then find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst fi touch doclist.lst if [ -d usr/share/man ]; then find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst fi popd mv %{buildroot}/filelist.lst . mv %{buildroot}/doclist.lst . %files -n python3-lbsntransform -f filelist.lst %dir %{python3_sitelib}/* %files help -f doclist.lst %{_docdir}/* %changelog * Mon May 29 2023 Python_Bot - 0.25.1-1 - Package Spec generated