
Geospatial partitioning of open transit data
- Author
- Harm Delva (UGent) , Julian Andres Rojas Melendez (UGent) , Pieter-Jan Vandenberghe, Pieter Colpaert (UGent) and Ruben Verborgh (UGent)
- Abstract
- Public transit operators often publish their open data as a single data dump, but developers with limited computational resources may not be able to process all this data. Existing work has already focused on fragmenting the data by departure time, so that data consumers can be more selective in the data they process. However, each fragment still contains data from the entire operator's service area. We build upon this idea by fragmenting geospatially as well as by departure time. Our method is robust to changes in the original data, such as the deletion or the addition of stops, which is crucial in scenarios where data publishers do not control the data itself. In this paper we explore popular clustering methods such as k-means and METIS, alongside two simple domain-specific methods of our own. We compare the effectiveness of each for the use case of client-side route planning, focusing on the ease of use of the data and the cacheability of the data fragments. Our results show that simply clustering stops by their proximity to 8 transport hubs yields the most promising results: queries are 2.4 times faster and download 4 times less data. More than anything though, our results show that the difference between clustering methods is small, and that engineers can safely choose practical and simple solutions. We expect that this insight also holds true for publishing other geospatial data such as road networks, sensor data, or points of interest.
- Keywords
- Linked open data, Mobility, Maintainability, Web API engineering
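The abstract's best-performing approach, clustering stops by their proximity to a small set of transport hubs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of great-circle distance, and all coordinates below are assumptions made purely for illustration.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS84 points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def assign_to_nearest_hub(stops, hubs):
    """Map each stop id to the id of its closest hub (hypothetical helper)."""
    return {
        stop_id: min(hubs, key=lambda hub_id: haversine_km(lat, lon, *hubs[hub_id]))
        for stop_id, (lat, lon) in stops.items()
    }


# Hypothetical hub and stop coordinates (latitude, longitude); not from the paper.
hubs = {"hub_a": (51.0357, 3.7108), "hub_b": (50.8456, 4.3570)}
stops = {
    "stop_1": (51.05, 3.72),
    "stop_2": (50.85, 4.35),
    "stop_3": (50.95, 3.80),
}

partition = assign_to_nearest_hub(stops, hubs)
print(partition)
```

In this sketch each stop is assigned independently of the others, so adding or deleting a stop leaves every other assignment unchanged. That property is in the spirit of the robustness to changes in the original data that the abstract describes.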
Downloads
- DS333 i.pdf: full text (Accepted manuscript), open access, 1.62 MB
- (...).pdf: full text (Published version), UGent only, 1.64 MB
Citation
Please use this url to cite or link to this publication: http://hdl.handle.net/1854/LU-8664563
- MLA
- Delva, Harm, et al. “Geospatial Partitioning of Open Transit Data.” WEB ENGINEERING, ICWE 2020, edited by M Bielikova et al., vol. 12128, Springer, 2020, pp. 305–20, doi:10.1007/978-3-030-50578-3_21.
- APA
- Delva, H., Rojas Melendez, J. A., Vandenberghe, P.-J., Colpaert, P., & Verborgh, R. (2020). Geospatial partitioning of open transit data. In M. Bielikova, T. Mikkonen, & C. Pautasso (Eds.), WEB ENGINEERING, ICWE 2020 (Vol. 12128, pp. 305–320). https://doi.org/10.1007/978-3-030-50578-3_21
- Chicago author-date
- Delva, Harm, Julian Andres Rojas Melendez, Pieter-Jan Vandenberghe, Pieter Colpaert, and Ruben Verborgh. 2020. “Geospatial Partitioning of Open Transit Data.” In WEB ENGINEERING, ICWE 2020, edited by M Bielikova, T Mikkonen, and C Pautasso, 12128:305–20. Springer. https://doi.org/10.1007/978-3-030-50578-3_21.
- Chicago author-date (all authors)
- Delva, Harm, Julian Andres Rojas Melendez, Pieter-Jan Vandenberghe, Pieter Colpaert, and Ruben Verborgh. 2020. “Geospatial Partitioning of Open Transit Data.” In WEB ENGINEERING, ICWE 2020, edited by M Bielikova, T Mikkonen, and C Pautasso, 12128:305–320. Springer. doi:10.1007/978-3-030-50578-3_21.
- Vancouver
- 1. Delva H, Rojas Melendez JA, Vandenberghe P-J, Colpaert P, Verborgh R. Geospatial partitioning of open transit data. In: Bielikova M, Mikkonen T, Pautasso C, editors. WEB ENGINEERING, ICWE 2020. Springer; 2020. p. 305–20.
- IEEE
- [1] H. Delva, J. A. Rojas Melendez, P.-J. Vandenberghe, P. Colpaert, and R. Verborgh, “Geospatial partitioning of open transit data,” in WEB ENGINEERING, ICWE 2020, online, 2020, vol. 12128, pp. 305–320.
@inproceedings{8664563,
  abstract  = {{Public transit operators often publish their open data as a single data dump, but developers with limited computational resources may not be able to process all this data. Existing work has already focused on fragmenting the data by departure time, so that data consumers can be more selective in the data they process. However, each fragment still contains data from the entire operator's service area. We build upon this idea by fragmenting geospatially as well as by departure time. Our method is robust to changes in the original data, such as the deletion or the addition of stops, which is crucial in scenarios where data publishers do not control the data itself. In this paper we explore popular clustering methods such as k-means and METIS, alongside two simple domain-specific methods of our own. We compare the effectiveness of each for the use case of client-side route planning, focusing on the ease of use of the data and the cacheability of the data fragments. Our results show that simply clustering stops by their proximity to 8 transport hubs yields the most promising results: queries are 2.4 times faster and download 4 times less data. More than anything though, our results show that the difference between clustering methods is small, and that engineers can safely choose practical and simple solutions. We expect that this insight also holds true for publishing other geospatial data such as road networks, sensor data, or points of interest.}},
  author    = {{Delva, Harm and Rojas Melendez, Julian Andres and Vandenberghe, Pieter-Jan and Colpaert, Pieter and Verborgh, Ruben}},
  booktitle = {{WEB ENGINEERING, ICWE 2020}},
  editor    = {{Bielikova, M and Mikkonen, T and Pautasso, C}},
  isbn      = {{9783030505776}},
  issn      = {{0302-9743}},
  keywords  = {{Linked open data,Mobility,Maintainability,Web API engineering}},
  language  = {{eng}},
  location  = {{online}},
  pages     = {{305--320}},
  publisher = {{Springer}},
  title     = {{Geospatial partitioning of open transit data}},
  url       = {{http://dx.doi.org/10.1007/978-3-030-50578-3_21}},
  volume    = {{12128}},
  year      = {{2020}},
}