From 5b6e6a8a42561cb237439d80f4def395dbccdd4e Mon Sep 17 00:00:00 2001 From: CoprDistGit Date: Wed, 10 May 2023 04:14:47 +0000 Subject: automatic import of python-transbigdata --- .gitignore | 1 + python-transbigdata.spec | 1083 ++++++++++++++++++++++++++++++++++++++++++++++ sources | 1 + 3 files changed, 1085 insertions(+) create mode 100644 python-transbigdata.spec create mode 100644 sources diff --git a/.gitignore b/.gitignore index e69de29..cd7f6f6 100644 --- a/.gitignore +++ b/.gitignore @@ -0,0 +1 @@ +/transbigdata-0.4.17.tar.gz diff --git a/python-transbigdata.spec b/python-transbigdata.spec new file mode 100644 index 0000000..d00ce11 --- /dev/null +++ b/python-transbigdata.spec @@ -0,0 +1,1083 @@ +%global _empty_manifest_terminate_build 0 +Name: python-transbigdata +Version: 0.4.17 +Release: 1 +Summary: A Python package developed for transportation spatio-temporal big data processing and analysis. +License: BSD +URL: https://github.com/ni1o1/transbigdata +Source0: https://mirrors.nju.edu.cn/pypi/web/packages/18/94/fd5784ab7c74eba7c5fbac9167d552ff92236b049693b382f63a2b3790a1/transbigdata-0.4.17.tar.gz +BuildArch: noarch + +Requires: python3-numpy +Requires: python3-pandas +Requires: python3-shapely +Requires: python3-geopandas +Requires: python3-scipy +Requires: python3-matplotlib + +%description +English [中文版](README-zh_CN.md) + +# TransBigData + + + +[![Documentation Status](https://readthedocs.org/projects/transbigdata/badge/?version=latest)](https://transbigdata.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/transbigdata)](https://pepy.tech/project/transbigdata) [![Downloads](https://pepy.tech/badge/transbigdata/week)](https://pepy.tech/project/transbigdata) [![Tests](https://github.com/ni1o1/transbigdata/actions/workflows/tests.yml/badge.svg)](https://github.com/ni1o1/transbigdata/actions/workflows/tests.yml) 
[![codecov](https://codecov.io/gh/ni1o1/transbigdata/branch/main/graph/badge.svg?token=GLAVYYCD9L)](https://codecov.io/gh/ni1o1/transbigdata) + +## Introduction + +`TransBigData` is a Python package developed for transportation spatio-temporal big data processing, analysis and visualization. `TransBigData` provides fast and concise methods for processing common transportation spatio-temporal big data such as Taxi GPS data, bicycle sharing data and bus GPS data. `TransBigData` provides a variety of processing methods for each stage of transportation spatio-temporal big data analysis. The code with `TransBigData` is clean, efficient, flexible, and easy to use, allowing complex data tasks to be achieved with concise code. + +For some specific types of data, `TransBigData` also provides targeted tools for specific needs, such as extraction of Origin and Destination (OD) of taxi trips from taxi GPS data and identification of arrival and departure information from bus GPS data. The latest stable release of the software can be installed via pip and full documentation +can be found at https://transbigdata.readthedocs.io/en/latest/. Introductory slides can be found [here](https://github.com/ni1o1/transbigdata/blob/main/introduction/IntroductionofTransBigData.pdf) and [here (in Chinese)](https://github.com/ni1o1/transbigdata/blob/main/introduction/gridbasedframework.pdf). + +### Target Audience + +The target audience of `TransBigData` includes: + +- Data science researchers and data engineers in the field of transportation big data, smart transportation systems, and urban computing, particularly those who want to integrate innovative algorithms into intelligent transportation systems. +- Governments, enterprises, or other entities who expect efficient and reliable management decision support through transportation spatio-temporal data analysis. + +### Technical Features + +* Provides a variety of processing methods for each stage of transportation spatio-temporal big data analysis. 
+* The code with `TransBigData` is clean, efficient, flexible, and easy to use, allowing complex data tasks to be achieved with concise code. + +### Main Functions + +Currently, `TransBigData` mainly provides the following methods: + +* **Data Quality**: Provides methods to quickly obtain the general information of the dataset, including the data amount, the time period and the sampling interval. +* **Data Preprocess**: Provides methods to clean multiple types of data errors. +* **Data Gridding**: Provides methods to generate multiple types of geographic grids (Rectangular grids, Hexagonal grids) in the research area. Provides fast algorithms to map GPS data to the generated grids. +* **Data Aggregating**: Provides methods to aggregate GPS data and OD data into geographic polygons. +* **Data Visualization**: Built-in visualization capabilities leverage the visualization package keplergl to interactively visualize data in Jupyter notebooks with simple code. +* **Trajectory Processing**: Provides methods to process trajectory data, including generating trajectory linestrings from GPS points, trajectory densification, etc. +* **Basemap Loading**: Provides methods to display Mapbox basemaps on matplotlib figures. + +## Installation + +`TransBigData` supports Python >= 3.6. + +### Using pypi [![PyPI version](https://badge.fury.io/py/transbigdata.svg)](https://badge.fury.io/py/transbigdata) + +`TransBigData` can be installed by using `pip install`. Before installing `TransBigData`, make sure that you have installed the [geopandas package](https://geopandas.org/en/stable/getting_started/install.html). 
If you already have geopandas installed, run the following command directly from the command prompt to install `TransBigData`: + + pip install transbigdata + +### Using conda-forge [![Conda Version](https://img.shields.io/conda/vn/conda-forge/transbigdata.svg)](https://anaconda.org/conda-forge/transbigdata) [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/transbigdata.svg)](https://anaconda.org/conda-forge/transbigdata) + +You can also install `TransBigData` from `conda-forge`, which resolves the dependencies automatically: + + conda install -c conda-forge transbigdata + +## Contributing to TransBigData [![GitHub contributors](https://img.shields.io/github/contributors/ni1o1/transbigdata.svg)](https://github.com/ni1o1/transbigdata/graphs/contributors) [![Join the chat at https://gitter.im/transbigdata/community](https://badges.gitter.im/transbigdata/community.svg)](https://gitter.im/transbigdata/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) ![GitHub commit activity](https://img.shields.io/github/commit-activity/m/ni1o1/transbigdata) + +All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome. A detailed overview on how to contribute can be found in the [contributing guide](https://github.com/ni1o1/transbigdata/blob/master/CONTRIBUTING.md) on GitHub. 
+ +## Examples + +### Example of data visualization + +#### Visualize trajectories (with keplergl) + +![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample1.gif) + +#### Visualize data distribution (with keplergl) + +![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample2.gif) + +#### Visualize OD (with keplergl) + +![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample3.gif) + +### Example of taxi GPS data processing + +The following example shows how to use the `TransBigData` to perform data gridding, data aggregating and data visualization for taxi GPS data. + +#### Read the data + +```python +import transbigdata as tbd +import pandas as pd +#Read taxi gps data +data = pd.read_csv('TaxiData-Sample.csv',header = None) +data.columns = ['VehicleNum','time','lon','lat','OpenStatus','Speed'] +data +``` + +
+|        | VehicleNum | time     | lon        | lat       | OpenStatus | Speed |
+|--------|------------|----------|------------|-----------|------------|-------|
+| 0      | 34745      | 20:27:43 | 113.806847 | 22.623249 | 1          | 27    |
+| 1      | 34745      | 20:24:07 | 113.809898 | 22.627399 | 0          | 0     |
+| 2      | 34745      | 20:24:27 | 113.809898 | 22.627399 | 0          | 0     |
+| 3      | 34745      | 20:22:07 | 113.811348 | 22.628067 | 0          | 0     |
+| 4      | 34745      | 20:10:06 | 113.819885 | 22.647800 | 0          | 54    |
+| ...    | ...        | ...      | ...        | ...       | ...        | ...   |
+| 544994 | 28265      | 21:35:13 | 114.321503 | 22.709499 | 0          | 18    |
+| 544995 | 28265      | 09:08:02 | 114.322701 | 22.681700 | 0          | 0     |
+| 544996 | 28265      | 09:14:31 | 114.336700 | 22.690100 | 0          | 0     |
+| 544997 | 28265      | 21:19:12 | 114.352600 | 22.728399 | 0          | 0     |
+| 544998 | 28265      | 19:08:06 | 114.137703 | 22.621700 | 0          | 0     |
+
+544999 rows × 6 columns
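To try the column layout without the sample file, the sketch below loads the first five rows shown above from an in-memory string. The `sample_csv` name and the use of `io.StringIO` are illustrative assumptions and not part of `TransBigData`; the real `TaxiData-Sample.csv` has 544999 rows.

```python
import io

import pandas as pd

# First five rows of the sample output above, in the same headerless CSV layout.
# `sample_csv` is a hypothetical stand-in for 'TaxiData-Sample.csv'.
sample_csv = io.StringIO(
    "34745,20:27:43,113.806847,22.623249,1,27\n"
    "34745,20:24:07,113.809898,22.627399,0,0\n"
    "34745,20:24:27,113.809898,22.627399,0,0\n"
    "34745,20:22:07,113.811348,22.628067,0,0\n"
    "34745,20:10:06,113.819885,22.647800,0,54\n"
)

# Same reading step as in the example: no header row, columns named afterwards
data = pd.read_csv(sample_csv, header=None)
data.columns = ['VehicleNum', 'time', 'lon', 'lat', 'OpenStatus', 'Speed']

# Check that coordinates were parsed as floats before cleaning and gridding
print(data.dtypes)
```

Confirming that `lon` and `lat` are numeric at this point makes problems in the later cleaning and gridding steps easier to trace.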
+ +#### Data pre-processing + +Define the study area and use the `tbd.clean_outofbounds` method to delete the data outside the study area: + +```python +#Define the study area as [lon_min, lat_min, lon_max, lat_max] +bounds = [113.75, 22.4, 114.62, 22.86] +#Delete the data outside the study area +data = tbd.clean_outofbounds(data,bounds = bounds,col = ['lon','lat']) +``` + +#### Data gridding + +The most basic way to express the data distribution is in the form of geographic grids. `TransBigData` provides methods to generate multiple types of geographic grids (Rectangular grids, Hexagonal grids) in the research area. For rectangular gridding, you first need to determine the gridding parameters (which can be interpreted as defining a grid coordinate system): + +```python +#Obtain the gridding parameters +params = tbd.area_to_params(bounds,accuracy = 1000) +params +``` + +> {'slon': 113.75, +> 'slat': 22.4, +> 'deltalon': 0.00974336289289822, +> 'deltalat': 0.008993210412845813, +> 'theta': 0, +> 'method': 'rect', +> 'gridsize': 1000} + +The gridding parameters store the initial position, the size and the angle of the gridding system. + +The next step is to map the GPS data to their corresponding grids. For rectangular grids, `tbd.GPS_to_grid` generates the `LONCOL` column and the `LATCOL` column. 
The two columns together can specify a grid: + +```python +#Map the GPS data to grids +data['LONCOL'],data['LATCOL'] = tbd.GPS_to_grid(data['lon'],data['lat'],params) +``` + +Count the amount of data in each grid, generate the geometry of the grids and transform the result into a GeoDataFrame: + +```python +#Aggregate data into grids +grid_agg = data.groupby(['LONCOL','LATCOL'])['VehicleNum'].count().reset_index() +#Generate grid geometry +grid_agg['geometry'] = tbd.grid_to_polygon([grid_agg['LONCOL'],grid_agg['LATCOL']],params) +#Change the type into GeoDataFrame +import geopandas as gpd +grid_agg = gpd.GeoDataFrame(grid_agg) +#Plot the grids +grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r') +``` + +![png](https://github.com/ni1o1/transbigdata/raw/main/image/README/output_5_1.png) + +#### Triangle and Hexagon grids & rotation angle + +`TransBigData` also supports triangle and hexagon grids, as well as a given rotation angle for the grids. We can alter the gridding parameters: + +```python +#set to the hexagon grids +params['method'] = 'hexa' +#or set as triangle grids: params['method'] = 'tri' +#set a rotation angle (degree) +params['theta'] = 5 +``` + +Then we can do the GPS data matching again: + +```python +#Triangle and Hexagon grids require three columns to store ID +data['loncol_1'],data['loncol_2'],data['loncol_3'] = tbd.GPS_to_grid(data['lon'],data['lat'],params) +#Aggregate data into grids +grid_agg = data.groupby(['loncol_1','loncol_2','loncol_3'])['VehicleNum'].count().reset_index() +#Generate grid geometry +grid_agg['geometry'] = tbd.grid_to_polygon([grid_agg['loncol_1'],grid_agg['loncol_2'],grid_agg['loncol_3']],params) +#Change the type into GeoDataFrame +import geopandas as gpd +grid_agg = gpd.GeoDataFrame(grid_agg) +#Plot the grids +grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r') +``` + +![1648714436503.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648714436503.png) + +#### Data Visualization (with basemap) + +For a 
geographical data visualization figure, we still have to add the basemap, the colorbar, the compass and the scale. Use `tbd.plot_map` to load the basemap and `tbd.plotscale` to add the compass and scale to a matplotlib figure: + +```python +import matplotlib.pyplot as plt +fig = plt.figure(1,(8,8),dpi=300) +ax = plt.subplot(111) +plt.sca(ax) +#Load basemap +tbd.plot_map(plt,bounds,zoom = 11,style = 4) +#Define colorbar +cax = plt.axes([0.05, 0.33, 0.02, 0.3]) +plt.title('Data count') +plt.sca(ax) +#Plot the data +grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r',ax = ax,cax = cax,legend = True) +#Add compass and scale +tbd.plotscale(ax,bounds = bounds,textsize = 10,compasssize = 1,accuracy = 2000,rect = [0.06,0.03],zorder = 10) +plt.axis('off') +plt.xlim(bounds[0],bounds[2]) +plt.ylim(bounds[1],bounds[3]) +plt.show() +``` + +![1648714582961.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648714582961.png) + +#### Gridding framework offered by TransBigData + +Here is an overview of the gridding framework offered by `TransBigData`. + +![1648715064154.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648715064154.png) + +See [This Example](https://github.com/ni1o1/transbigdata/blob/main/docs/source/gallery/Example%202-Grid-base%20processing%20framework%20of%20TransBigData.ipynb) for further details. + +## Citation information [![DOI](https://zenodo.org/badge/419559811.svg)](https://zenodo.org/badge/latestdoi/419559811) [![status](https://joss.theoj.org/papers/d1055fe3105dfa2dcff4cb6c7688a79b/status.svg)](https://joss.theoj.org/papers/d1055fe3105dfa2dcff4cb6c7688a79b) + +Please cite [this](https://doi.org/10.21105/joss.04021) when using `TransBigData` in your research. Citation information can be found at [CITATION.cff](https://github.com/ni1o1/transbigdata/blob/main/CITATION.cff). 
+ +## Introducing Video (In Chinese) [![bilibili](https://img.shields.io/badge/bilibili-%E5%90%8C%E6%B5%8E%E5%B0%8F%E6%97%AD%E5%AD%A6%E9%95%BF-green.svg)](https://space.bilibili.com/3051484) + +* [Bilibili](https://www.bilibili.com/video/BV1na411i7sd/) +* [Youtube](https://www.youtube.com/watch?v=ynqJ01WmPiQ) + + +%package -n python3-transbigdata +Summary: A Python package developed for transportation spatio-temporal big data processing and analysis. +Provides: python-transbigdata +BuildRequires: python3-devel +BuildRequires: python3-setuptools +BuildRequires: python3-pip +%description -n python3-transbigdata +English [中文版](README-zh_CN.md) + +# TransBigData + + + +[![Documentation Status](https://readthedocs.org/projects/transbigdata/badge/?version=latest)](https://transbigdata.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/transbigdata)](https://pepy.tech/project/transbigdata) [![Downloads](https://pepy.tech/badge/transbigdata/week)](https://pepy.tech/project/transbigdata) [![Tests](https://github.com/ni1o1/transbigdata/actions/workflows/tests.yml/badge.svg)](https://github.com/ni1o1/transbigdata/actions/workflows/tests.yml) [![codecov](https://codecov.io/gh/ni1o1/transbigdata/branch/main/graph/badge.svg?token=GLAVYYCD9L)](https://codecov.io/gh/ni1o1/transbigdata) + +## Introduction + +`TransBigData` is a Python package developed for transportation spatio-temporal big data processing, analysis and visualization. `TransBigData` provides fast and concise methods for processing common transportation spatio-temporal big data such as Taxi GPS data, bicycle sharing data and bus GPS data. `TransBigData` provides a variety of processing methods for each stage of transportation spatio-temporal big data analysis. The code with `TransBigData` is clean, efficient, flexible, and easy to use, allowing complex data tasks to be achieved with concise code. 
+ +For some specific types of data, `TransBigData` also provides targeted tools for specific needs, such as extraction of Origin and Destination (OD) of taxi trips from taxi GPS data and identification of arrival and departure information from bus GPS data. The latest stable release of the software can be installed via pip and full documentation +can be found at https://transbigdata.readthedocs.io/en/latest/. Introductory slides can be found [here](https://github.com/ni1o1/transbigdata/blob/main/introduction/IntroductionofTransBigData.pdf) and [here (in Chinese)](https://github.com/ni1o1/transbigdata/blob/main/introduction/gridbasedframework.pdf). + +### Target Audience + +The target audience of `TransBigData` includes: + +- Data science researchers and data engineers in the field of transportation big data, smart transportation systems, and urban computing, particularly those who want to integrate innovative algorithms into intelligent transportation systems. +- Governments, enterprises, or other entities who expect efficient and reliable management decision support through transportation spatio-temporal data analysis. + +### Technical Features + +* Provides a variety of processing methods for each stage of transportation spatio-temporal big data analysis. +* The code with `TransBigData` is clean, efficient, flexible, and easy to use, allowing complex data tasks to be achieved with concise code. + +### Main Functions + +Currently, `TransBigData` mainly provides the following methods: + +* **Data Quality**: Provides methods to quickly obtain the general information of the dataset, including the data amount, the time period and the sampling interval. +* **Data Preprocess**: Provides methods to clean multiple types of data errors. +* **Data Gridding**: Provides methods to generate multiple types of geographic grids (Rectangular grids, Hexagonal grids) in the research area. Provides fast algorithms to map GPS data to the generated grids. 
+* **Data Aggregating**: Provides methods to aggregate GPS data and OD data into geographic polygons. +* **Data Visualization**: Built-in visualization capabilities leverage the visualization package keplergl to interactively visualize data in Jupyter notebooks with simple code. +* **Trajectory Processing**: Provides methods to process trajectory data, including generating trajectory linestrings from GPS points, trajectory densification, etc. +* **Basemap Loading**: Provides methods to display Mapbox basemaps on matplotlib figures. + +## Installation + +`TransBigData` supports Python >= 3.6. + +### Using pypi [![PyPI version](https://badge.fury.io/py/transbigdata.svg)](https://badge.fury.io/py/transbigdata) + +`TransBigData` can be installed by using `pip install`. Before installing `TransBigData`, make sure that you have installed the [geopandas package](https://geopandas.org/en/stable/getting_started/install.html). If you already have geopandas installed, run the following command directly from the command prompt to install `TransBigData`: + + pip install transbigdata + +### Using conda-forge [![Conda Version](https://img.shields.io/conda/vn/conda-forge/transbigdata.svg)](https://anaconda.org/conda-forge/transbigdata) [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/transbigdata.svg)](https://anaconda.org/conda-forge/transbigdata) + +You can also install `TransBigData` from `conda-forge`, which resolves the dependencies automatically: + + conda install -c conda-forge transbigdata + +## Contributing to TransBigData [![GitHub contributors](https://img.shields.io/github/contributors/ni1o1/transbigdata.svg)](https://github.com/ni1o1/transbigdata/graphs/contributors) [![Join the chat at https://gitter.im/transbigdata/community](https://badges.gitter.im/transbigdata/community.svg)](https://gitter.im/transbigdata/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) ![GitHub commit 
activity](https://img.shields.io/github/commit-activity/m/ni1o1/transbigdata) + +All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome. A detailed overview on how to contribute can be found in the [contributing guide](https://github.com/ni1o1/transbigdata/blob/master/CONTRIBUTING.md) on GitHub. + +## Examples + +### Example of data visualization + +#### Visualize trajectories (with keplergl) + +![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample1.gif) + +#### Visualize data distribution (with keplergl) + +![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample2.gif) + +#### Visualize OD (with keplergl) + +![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample3.gif) + +### Example of taxi GPS data processing + +The following example shows how to use the `TransBigData` to perform data gridding, data aggregating and data visualization for taxi GPS data. + +#### Read the data + +```python +import transbigdata as tbd +import pandas as pd +#Read taxi gps data +data = pd.read_csv('TaxiData-Sample.csv',header = None) +data.columns = ['VehicleNum','time','lon','lat','OpenStatus','Speed'] +data +``` + +
+|        | VehicleNum | time     | lon        | lat       | OpenStatus | Speed |
+|--------|------------|----------|------------|-----------|------------|-------|
+| 0      | 34745      | 20:27:43 | 113.806847 | 22.623249 | 1          | 27    |
+| 1      | 34745      | 20:24:07 | 113.809898 | 22.627399 | 0          | 0     |
+| 2      | 34745      | 20:24:27 | 113.809898 | 22.627399 | 0          | 0     |
+| 3      | 34745      | 20:22:07 | 113.811348 | 22.628067 | 0          | 0     |
+| 4      | 34745      | 20:10:06 | 113.819885 | 22.647800 | 0          | 54    |
+| ...    | ...        | ...      | ...        | ...       | ...        | ...   |
+| 544994 | 28265      | 21:35:13 | 114.321503 | 22.709499 | 0          | 18    |
+| 544995 | 28265      | 09:08:02 | 114.322701 | 22.681700 | 0          | 0     |
+| 544996 | 28265      | 09:14:31 | 114.336700 | 22.690100 | 0          | 0     |
+| 544997 | 28265      | 21:19:12 | 114.352600 | 22.728399 | 0          | 0     |
+| 544998 | 28265      | 19:08:06 | 114.137703 | 22.621700 | 0          | 0     |
+
+544999 rows × 6 columns
+ +#### Data pre-processing + +Define the study area and use the `tbd.clean_outofbounds` method to delete the data outside the study area: + +```python +#Define the study area as [lon_min, lat_min, lon_max, lat_max] +bounds = [113.75, 22.4, 114.62, 22.86] +#Delete the data outside the study area +data = tbd.clean_outofbounds(data,bounds = bounds,col = ['lon','lat']) +``` + +#### Data gridding + +The most basic way to express the data distribution is in the form of geographic grids. `TransBigData` provides methods to generate multiple types of geographic grids (Rectangular grids, Hexagonal grids) in the research area. For rectangular gridding, you first need to determine the gridding parameters (which can be interpreted as defining a grid coordinate system): + +```python +#Obtain the gridding parameters +params = tbd.area_to_params(bounds,accuracy = 1000) +params +``` + +> {'slon': 113.75, +> 'slat': 22.4, +> 'deltalon': 0.00974336289289822, +> 'deltalat': 0.008993210412845813, +> 'theta': 0, +> 'method': 'rect', +> 'gridsize': 1000} + +The gridding parameters store the initial position, the size and the angle of the gridding system. + +The next step is to map the GPS data to their corresponding grids. For rectangular grids, `tbd.GPS_to_grid` generates the `LONCOL` column and the `LATCOL` column. 
The two columns together can specify a grid: + +```python +#Map the GPS data to grids +data['LONCOL'],data['LATCOL'] = tbd.GPS_to_grid(data['lon'],data['lat'],params) +``` + +Count the amount of data in each grid, generate the geometry of the grids and transform the result into a GeoDataFrame: + +```python +#Aggregate data into grids +grid_agg = data.groupby(['LONCOL','LATCOL'])['VehicleNum'].count().reset_index() +#Generate grid geometry +grid_agg['geometry'] = tbd.grid_to_polygon([grid_agg['LONCOL'],grid_agg['LATCOL']],params) +#Change the type into GeoDataFrame +import geopandas as gpd +grid_agg = gpd.GeoDataFrame(grid_agg) +#Plot the grids +grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r') +``` + +![png](https://github.com/ni1o1/transbigdata/raw/main/image/README/output_5_1.png) + +#### Triangle and Hexagon grids & rotation angle + +`TransBigData` also supports triangle and hexagon grids, as well as a given rotation angle for the grids. We can alter the gridding parameters: + +```python +#set to the hexagon grids +params['method'] = 'hexa' +#or set as triangle grids: params['method'] = 'tri' +#set a rotation angle (degree) +params['theta'] = 5 +``` + +Then we can do the GPS data matching again: + +```python +#Triangle and Hexagon grids require three columns to store ID +data['loncol_1'],data['loncol_2'],data['loncol_3'] = tbd.GPS_to_grid(data['lon'],data['lat'],params) +#Aggregate data into grids +grid_agg = data.groupby(['loncol_1','loncol_2','loncol_3'])['VehicleNum'].count().reset_index() +#Generate grid geometry +grid_agg['geometry'] = tbd.grid_to_polygon([grid_agg['loncol_1'],grid_agg['loncol_2'],grid_agg['loncol_3']],params) +#Change the type into GeoDataFrame +import geopandas as gpd +grid_agg = gpd.GeoDataFrame(grid_agg) +#Plot the grids +grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r') +``` + +![1648714436503.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648714436503.png) + +#### Data Visualization (with basemap) + +For a 
geographical data visualization figure, we still have to add the basemap, the colorbar, the compass and the scale. Use `tbd.plot_map` to load the basemap and `tbd.plotscale` to add the compass and scale to a matplotlib figure: + +```python +import matplotlib.pyplot as plt +fig = plt.figure(1,(8,8),dpi=300) +ax = plt.subplot(111) +plt.sca(ax) +#Load basemap +tbd.plot_map(plt,bounds,zoom = 11,style = 4) +#Define colorbar +cax = plt.axes([0.05, 0.33, 0.02, 0.3]) +plt.title('Data count') +plt.sca(ax) +#Plot the data +grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r',ax = ax,cax = cax,legend = True) +#Add compass and scale +tbd.plotscale(ax,bounds = bounds,textsize = 10,compasssize = 1,accuracy = 2000,rect = [0.06,0.03],zorder = 10) +plt.axis('off') +plt.xlim(bounds[0],bounds[2]) +plt.ylim(bounds[1],bounds[3]) +plt.show() +``` + +![1648714582961.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648714582961.png) + +#### Gridding framework offered by TransBigData + +Here is an overview of the gridding framework offered by `TransBigData`. + +![1648715064154.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648715064154.png) + +See [This Example](https://github.com/ni1o1/transbigdata/blob/main/docs/source/gallery/Example%202-Grid-base%20processing%20framework%20of%20TransBigData.ipynb) for further details. + +## Citation information [![DOI](https://zenodo.org/badge/419559811.svg)](https://zenodo.org/badge/latestdoi/419559811) [![status](https://joss.theoj.org/papers/d1055fe3105dfa2dcff4cb6c7688a79b/status.svg)](https://joss.theoj.org/papers/d1055fe3105dfa2dcff4cb6c7688a79b) + +Please cite [this](https://doi.org/10.21105/joss.04021) when using `TransBigData` in your research. Citation information can be found at [CITATION.cff](https://github.com/ni1o1/transbigdata/blob/main/CITATION.cff). 
+ +## Introducing Video (In Chinese) [![bilibili](https://img.shields.io/badge/bilibili-%E5%90%8C%E6%B5%8E%E5%B0%8F%E6%97%AD%E5%AD%A6%E9%95%BF-green.svg)](https://space.bilibili.com/3051484) + +* [Bilibili](https://www.bilibili.com/video/BV1na411i7sd/) +* [Youtube](https://www.youtube.com/watch?v=ynqJ01WmPiQ) + + +%package help +Summary: Development documents and examples for transbigdata +Provides: python3-transbigdata-doc +%description help +English [中文版](README-zh_CN.md) + +# TransBigData + + + +[![Documentation Status](https://readthedocs.org/projects/transbigdata/badge/?version=latest)](https://transbigdata.readthedocs.io/en/latest/?badge=latest) [![Downloads](https://pepy.tech/badge/transbigdata)](https://pepy.tech/project/transbigdata) [![Downloads](https://pepy.tech/badge/transbigdata/week)](https://pepy.tech/project/transbigdata) [![Tests](https://github.com/ni1o1/transbigdata/actions/workflows/tests.yml/badge.svg)](https://github.com/ni1o1/transbigdata/actions/workflows/tests.yml) [![codecov](https://codecov.io/gh/ni1o1/transbigdata/branch/main/graph/badge.svg?token=GLAVYYCD9L)](https://codecov.io/gh/ni1o1/transbigdata) + +## Introduction + +`TransBigData` is a Python package developed for transportation spatio-temporal big data processing, analysis and visualization. `TransBigData` provides fast and concise methods for processing common transportation spatio-temporal big data such as Taxi GPS data, bicycle sharing data and bus GPS data. `TransBigData` provides a variety of processing methods for each stage of transportation spatio-temporal big data analysis. The code with `TransBigData` is clean, efficient, flexible, and easy to use, allowing complex data tasks to be achieved with concise code. + +For some specific types of data, `TransBigData` also provides targeted tools for specific needs, such as extraction of Origin and Destination(OD) of taxi trips from taxi GPS data and identification of arrival and departure information from bus GPS data. 
The latest stable release of the software can be installed via pip and full documentation +can be found at https://transbigdata.readthedocs.io/en/latest/. Introductory slides can be found [here](https://github.com/ni1o1/transbigdata/blob/main/introduction/IntroductionofTransBigData.pdf) and [here (in Chinese)](https://github.com/ni1o1/transbigdata/blob/main/introduction/gridbasedframework.pdf). + +### Target Audience + +The target audience of `TransBigData` includes: + +- Data science researchers and data engineers in the field of transportation big data, smart transportation systems, and urban computing, particularly those who want to integrate innovative algorithms into intelligent transportation systems. +- Governments, enterprises, or other entities who expect efficient and reliable management decision support through transportation spatio-temporal data analysis. + +### Technical Features + +* Provides a variety of processing methods for each stage of transportation spatio-temporal big data analysis. +* The code with `TransBigData` is clean, efficient, flexible, and easy to use, allowing complex data tasks to be achieved with concise code. + +### Main Functions + +Currently, `TransBigData` mainly provides the following methods: + +* **Data Quality**: Provides methods to quickly obtain the general information of the dataset, including the data amount, the time period and the sampling interval. +* **Data Preprocess**: Provides methods to clean multiple types of data errors. +* **Data Gridding**: Provides methods to generate multiple types of geographic grids (Rectangular grids, Hexagonal grids) in the research area. Provides fast algorithms to map GPS data to the generated grids. +* **Data Aggregating**: Provides methods to aggregate GPS data and OD data into geographic polygons. +* **Data Visualization**: Built-in visualization capabilities leverage the visualization package keplergl to interactively visualize data in Jupyter notebooks with simple code. 
+* **Trajectory Processing**: Provides methods to process trajectory data, including generating trajectory linestrings from GPS points, trajectory densification, etc. +* **Basemap Loading**: Provides methods to display Mapbox basemaps on matplotlib figures. + +## Installation + +`TransBigData` supports Python >= 3.6. + +### Using pypi [![PyPI version](https://badge.fury.io/py/transbigdata.svg)](https://badge.fury.io/py/transbigdata) + +`TransBigData` can be installed by using `pip install`. Before installing `TransBigData`, make sure that you have installed the [geopandas package](https://geopandas.org/en/stable/getting_started/install.html). If you already have geopandas installed, run the following command directly from the command prompt to install `TransBigData`: + + pip install transbigdata + +### Using conda-forge [![Conda Version](https://img.shields.io/conda/vn/conda-forge/transbigdata.svg)](https://anaconda.org/conda-forge/transbigdata) [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/transbigdata.svg)](https://anaconda.org/conda-forge/transbigdata) + +You can also install `TransBigData` from `conda-forge`, which resolves the dependencies automatically: + + conda install -c conda-forge transbigdata + +## Contributing to TransBigData [![GitHub contributors](https://img.shields.io/github/contributors/ni1o1/transbigdata.svg)](https://github.com/ni1o1/transbigdata/graphs/contributors) [![Join the chat at https://gitter.im/transbigdata/community](https://badges.gitter.im/transbigdata/community.svg)](https://gitter.im/transbigdata/community?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge) ![GitHub commit activity](https://img.shields.io/github/commit-activity/m/ni1o1/transbigdata) + +All contributions, bug reports, bug fixes, documentation improvements, enhancements and ideas are welcome. 
A detailed overview of how to contribute can be found in the [contributing guide](https://github.com/ni1o1/transbigdata/blob/master/CONTRIBUTING.md) on GitHub.
+
+## Examples
+
+### Example of data visualization
+
+#### Visualize trajectories (with keplergl)
+
+![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample1.gif)
+
+#### Visualize data distribution (with keplergl)
+
+![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample2.gif)
+
+#### Visualize OD (with keplergl)
+
+![gif](https://github.com/ni1o1/transbigdata/raw/main/image/README/tbdexample3.gif)
+
+### Example of taxi GPS data processing
+
+The following example shows how to use `TransBigData` to perform data gridding, data aggregating and data visualization for taxi GPS data.
+
+#### Read the data
+
+```python
+import transbigdata as tbd
+import pandas as pd
+#Read taxi GPS data (the CSV file has no header row)
+data = pd.read_csv('TaxiData-Sample.csv',header = None)
+#Assign the column names
+data.columns = ['VehicleNum','time','lon','lat','OpenStatus','Speed']
+data
+```
+
+|        | VehicleNum | time     | lon        | lat       | OpenStatus | Speed |
+|--------|------------|----------|------------|-----------|------------|-------|
+| 0      | 34745      | 20:27:43 | 113.806847 | 22.623249 | 1          | 27    |
+| 1      | 34745      | 20:24:07 | 113.809898 | 22.627399 | 0          | 0     |
+| 2      | 34745      | 20:24:27 | 113.809898 | 22.627399 | 0          | 0     |
+| 3      | 34745      | 20:22:07 | 113.811348 | 22.628067 | 0          | 0     |
+| 4      | 34745      | 20:10:06 | 113.819885 | 22.647800 | 0          | 54    |
+| ...    | ...        | ...      | ...        | ...       | ...        | ...   |
+| 544994 | 28265      | 21:35:13 | 114.321503 | 22.709499 | 0          | 18    |
+| 544995 | 28265      | 09:08:02 | 114.322701 | 22.681700 | 0          | 0     |
+| 544996 | 28265      | 09:14:31 | 114.336700 | 22.690100 | 0          | 0     |
+| 544997 | 28265      | 21:19:12 | 114.352600 | 22.728399 | 0          | 0     |
+| 544998 | 28265      | 19:08:06 | 114.137703 | 22.621700 | 0          | 0     |
+
+544999 rows × 6 columns
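As a quick check before preprocessing, the sketch below rebuilds the first five rows shown above as an in-memory CSV, so it runs without `TaxiData-Sample.csv`. The names `sample_csv`, `sample`, `lon_ok` and `lat_ok` are illustrative, and the sorting and range checks are an assumed sanity step using plain pandas, not part of the `TransBigData` workflow described here.

```python
import io
import pandas as pd

# First five rows of the sample output above, inlined so the sketch is self-contained
sample_csv = """34745,20:27:43,113.806847,22.623249,1,27
34745,20:24:07,113.809898,22.627399,0,0
34745,20:24:27,113.809898,22.627399,0,0
34745,20:22:07,113.811348,22.628067,0,0
34745,20:10:06,113.819885,22.647800,0,54
"""

columns = ['VehicleNum', 'time', 'lon', 'lat', 'OpenStatus', 'Speed']

# The file has no header row, so the column names are supplied explicitly
sample = pd.read_csv(io.StringIO(sample_csv), header=None, names=columns)

# The raw records are not in chronological order; sort them per vehicle.
# Zero-padded HH:MM:SS strings sort correctly as plain text.
sample = sample.sort_values(['VehicleNum', 'time']).reset_index(drop=True)

# Check that the coordinates are valid longitude/latitude values
lon_ok = sample['lon'].between(-180, 180).all()
lat_ok = sample['lat'].between(-90, 90).all()

print(sample)
print('coordinates valid:', lon_ok and lat_ok)
```

With the full dataset, the same checks could be applied to `data` directly after assigning the column names.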
+
+#### Data pre-processing
+
+Define the study area and use the `tbd.clean_outofbounds` method to delete the data outside the study area:
+
+```python
+#Define the study area as [min lon, min lat, max lon, max lat]
+bounds = [113.75, 22.4, 114.62, 22.86]
+#Delete the data outside the study area
+data = tbd.clean_outofbounds(data,bounds = bounds,col = ['lon','lat'])
+```
+
+#### Data gridding
+
+The most basic way to express the data distribution is in the form of geographic grids. `TransBigData` provides methods to generate multiple types of geographic grids (rectangular grids, hexagonal grids) in the research area. For rectangular gridding, you first need to determine the gridding parameters (which can be interpreted as defining a grid coordinate system):
+
+```python
+#Obtain the gridding parameters
+params = tbd.area_to_params(bounds,accuracy = 1000)
+params
+```
+
+> {'slon': 113.75,
+> 'slat': 22.4,
+> 'deltalon': 0.00974336289289822,
+> 'deltalat': 0.008993210412845813,
+> 'theta': 0,
+> 'method': 'rect',
+> 'gridsize': 1000}
+
+The gridding parameters store the initial position, the size and the angle of the gridding system.
+
+The next step is to map the GPS data to their corresponding grids. `tbd.GPS_to_grid` generates the `LONCOL` column and the `LATCOL` column (for rectangular grids).
The two columns together specify a grid:
+
+```python
+#Map the GPS data to grids
+data['LONCOL'],data['LATCOL'] = tbd.GPS_to_grid(data['lon'],data['lat'],params)
+```
+
+Count the number of records in each grid, generate the geometry of the grids and convert the result into a GeoDataFrame:
+
+```python
+#Aggregate data into grids
+grid_agg = data.groupby(['LONCOL','LATCOL'])['VehicleNum'].count().reset_index()
+#Generate grid geometry
+grid_agg['geometry'] = tbd.grid_to_polygon([grid_agg['LONCOL'],grid_agg['LATCOL']],params)
+#Change the type into GeoDataFrame
+import geopandas as gpd
+grid_agg = gpd.GeoDataFrame(grid_agg)
+#Plot the grids
+grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r')
+```
+
+![png](https://github.com/ni1o1/transbigdata/raw/main/image/README/output_5_1.png)
+
+#### Triangle and Hexagon grids & rotation angle
+
+`TransBigData` also supports triangle and hexagon grids, as well as a given rotation angle for the grids. We can alter the gridding parameters:
+
+```python
+#Set to the hexagon grids
+params['method'] = 'hexa'
+#Or set as triangle grids: params['method'] = 'tri'
+#Set a rotation angle (in degrees)
+params['theta'] = 5
+```
+
+Then we can do the GPS data matching again:
+
+```python
+#Triangle and hexagon grids require three columns to store the grid ID
+data['loncol_1'],data['loncol_2'],data['loncol_3'] = tbd.GPS_to_grid(data['lon'],data['lat'],params)
+#Aggregate data into grids
+grid_agg = data.groupby(['loncol_1','loncol_2','loncol_3'])['VehicleNum'].count().reset_index()
+#Generate grid geometry
+grid_agg['geometry'] = tbd.grid_to_polygon([grid_agg['loncol_1'],grid_agg['loncol_2'],grid_agg['loncol_3']],params)
+#Change the type into GeoDataFrame
+import geopandas as gpd
+grid_agg = gpd.GeoDataFrame(grid_agg)
+#Plot the grids
+grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r')
+```
+
+![1648714436503.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648714436503.png)
+
+#### Data Visualization (with basemap)
+
+For a
geographical data visualization figure, we still have to add the basemap, the colorbar, the compass and the scale. Use `tbd.plot_map` to load the basemap and `tbd.plotscale` to add the compass and scale to a matplotlib figure:
+
+```python
+import matplotlib.pyplot as plt
+fig = plt.figure(1,(8,8),dpi=300)
+ax = plt.subplot(111)
+plt.sca(ax)
+#Load basemap
+tbd.plot_map(plt,bounds,zoom = 11,style = 4)
+#Define colorbar
+cax = plt.axes([0.05, 0.33, 0.02, 0.3])
+plt.title('Data count')
+plt.sca(ax)
+#Plot the data
+grid_agg.plot(column = 'VehicleNum',cmap = 'autumn_r',ax = ax,cax = cax,legend = True)
+#Add compass and scale
+tbd.plotscale(ax,bounds = bounds,textsize = 10,compasssize = 1,accuracy = 2000,rect = [0.06,0.03],zorder = 10)
+plt.axis('off')
+#Limit the axes to the study area
+plt.xlim(bounds[0],bounds[2])
+plt.ylim(bounds[1],bounds[3])
+plt.show()
+```
+
+![1648714582961.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648714582961.png)
+
+#### Gridding framework offered by TransBigData
+
+Here is an overview of the gridding framework offered by `TransBigData`.
+
+![1648715064154.png](https://github.com/ni1o1/transbigdata/raw/main/image/README/1648715064154.png)
+
+See [this example](https://github.com/ni1o1/transbigdata/blob/main/docs/source/gallery/Example%202-Grid-base%20processing%20framework%20of%20TransBigData.ipynb) for further details.
+
+## Citation information [![DOI](https://zenodo.org/badge/419559811.svg)](https://zenodo.org/badge/latestdoi/419559811) [![status](https://joss.theoj.org/papers/d1055fe3105dfa2dcff4cb6c7688a79b/status.svg)](https://joss.theoj.org/papers/d1055fe3105dfa2dcff4cb6c7688a79b)
+
+Please cite [this paper](https://doi.org/10.21105/joss.04021) when using `TransBigData` in your research. Citation information can be found in [CITATION.cff](https://github.com/ni1o1/transbigdata/blob/main/CITATION.cff).
+ +## Introducing Video (In Chinese) [![bilibili](https://img.shields.io/badge/bilibili-%E5%90%8C%E6%B5%8E%E5%B0%8F%E6%97%AD%E5%AD%A6%E9%95%BF-green.svg)](https://space.bilibili.com/3051484) + +* [Bilibili](https://www.bilibili.com/video/BV1na411i7sd/) +* [Youtube](https://www.youtube.com/watch?v=ynqJ01WmPiQ) + + +%prep +%autosetup -n transbigdata-0.4.17 + +%build +%py3_build + +%install +%py3_install +install -d -m755 %{buildroot}/%{_pkgdocdir} +if [ -d doc ]; then cp -arf doc %{buildroot}/%{_pkgdocdir}; fi +if [ -d docs ]; then cp -arf docs %{buildroot}/%{_pkgdocdir}; fi +if [ -d example ]; then cp -arf example %{buildroot}/%{_pkgdocdir}; fi +if [ -d examples ]; then cp -arf examples %{buildroot}/%{_pkgdocdir}; fi +pushd %{buildroot} +if [ -d usr/lib ]; then + find usr/lib -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/lib64 ]; then + find usr/lib64 -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/bin ]; then + find usr/bin -type f -printf "/%h/%f\n" >> filelist.lst +fi +if [ -d usr/sbin ]; then + find usr/sbin -type f -printf "/%h/%f\n" >> filelist.lst +fi +touch doclist.lst +if [ -d usr/share/man ]; then + find usr/share/man -type f -printf "/%h/%f.gz\n" >> doclist.lst +fi +popd +mv %{buildroot}/filelist.lst . +mv %{buildroot}/doclist.lst . + +%files -n python3-transbigdata -f filelist.lst +%dir %{python3_sitelib}/* + +%files help -f doclist.lst +%{_docdir}/* + +%changelog +* Wed May 10 2023 Python_Bot - 0.4.17-1 +- Package Spec generated diff --git a/sources b/sources new file mode 100644 index 0000000..936623b --- /dev/null +++ b/sources @@ -0,0 +1 @@ +a9a86600b846a7d1e4843df0895f4292 transbigdata-0.4.17.tar.gz -- cgit v1.2.3