Web-scale provenance reconstruction of implicit information diffusion on social media

Abstract
Fast, massive, and viral data diffused on social media affects a large share of the online population, and thus the (prospective) information diffusion mechanisms behind it are of great interest to researchers. The (retrospective) provenance of such data is equally important because it contributes to the understanding of the relevance and trustworthiness of the information. Furthermore, computing provenance in a timely way is crucial for particular use cases and practitioners, such as online journalists who need to assess specific pieces of information promptly. Social media currently provide insufficient mechanisms for provenance tracking, publication and generation, while state-of-the-art social media research focuses mainly on explicit diffusion mechanisms (like retweets in Twitter or reshares in Facebook). The implicit diffusion mechanisms remain understudied because they are difficult to capture and properly understand. On the technical side, the state of the art for provenance reconstruction evaluates small datasets after the fact, sidestepping the requirements for scale and speed of current social media data. In this paper, we investigate the mechanisms of implicit information diffusion by computing its fine-grained provenance. We prove that explicit mechanisms are insufficient to capture influence, and our analysis unravels a significant part of implicit interactions and influence in social media. Our approach works incrementally and can be scaled up to cover a truly Web-scale scenario like major events. We can process datasets consisting of up to several million messages on a single machine at rates that cover bursty behaviour, without compromising result quality. By doing that, we provide online journalists, and social media users in general, with fine-grained provenance reconstruction which sheds light on implicit interactions not captured by social media providers. These results are provided in an online fashion, which also allows for fast relevance and trustworthiness assessment.
Keywords
Provenance, Information diffusion, Incremental clustering, Social media, Influence

Downloads

  • (...).pdf | full text | UGent only | PDF | 3.16 MB
  • DS74 i.pdf | full text | open access | PDF | 3.17 MB

Citation

MLA
Taxidou, Io, et al. “Web-Scale Provenance Reconstruction of Implicit Information Diffusion on Social Media.” DISTRIBUTED AND PARALLEL DATABASES, vol. 36, no. 1, Springer, 2018, pp. 47–79, doi:10.1007/s10619-017-7211-3.
APA
Taxidou, I., Lieber, S., Fischer, P. M., De Nies, T., & Verborgh, R. (2018). Web-scale provenance reconstruction of implicit information diffusion on social media. DISTRIBUTED AND PARALLEL DATABASES, 36(1), 47–79. https://doi.org/10.1007/s10619-017-7211-3
Chicago author-date
Taxidou, Io, Sven Lieber, Peter M. Fischer, Tom De Nies, and Ruben Verborgh. 2018. “Web-Scale Provenance Reconstruction of Implicit Information Diffusion on Social Media.” DISTRIBUTED AND PARALLEL DATABASES 36 (1): 47–79. https://doi.org/10.1007/s10619-017-7211-3.
Chicago author-date (all authors)
Taxidou, Io, Sven Lieber, Peter M. Fischer, Tom De Nies, and Ruben Verborgh. 2018. “Web-Scale Provenance Reconstruction of Implicit Information Diffusion on Social Media.” DISTRIBUTED AND PARALLEL DATABASES 36 (1): 47–79. doi:10.1007/s10619-017-7211-3.
Vancouver
1. Taxidou I, Lieber S, Fischer PM, De Nies T, Verborgh R. Web-scale provenance reconstruction of implicit information diffusion on social media. DISTRIBUTED AND PARALLEL DATABASES. 2018;36(1):47–79.
IEEE
[1] I. Taxidou, S. Lieber, P. M. Fischer, T. De Nies, and R. Verborgh, “Web-scale provenance reconstruction of implicit information diffusion on social media,” DISTRIBUTED AND PARALLEL DATABASES, vol. 36, no. 1, pp. 47–79, 2018.
@article{8557772,
  abstract     = {{Fast, massive, and viral data diffused on social media affects a large share of the online population, and thus the (prospective) information diffusion mechanisms behind it are of great interest to researchers. The (retrospective) provenance of such data is equally important because it contributes to the understanding of the relevance and trustworthiness of the information. Furthermore, computing provenance in a timely way is crucial for particular use cases and practitioners, such as online journalists who need to assess specific pieces of information promptly. Social media currently provide insufficient mechanisms for provenance tracking, publication and generation, while state-of-the-art social media research focuses mainly on explicit diffusion mechanisms (like retweets in Twitter or reshares in Facebook). The implicit diffusion mechanisms remain understudied because they are difficult to capture and properly understand. On the technical side, the state of the art for provenance reconstruction evaluates small datasets after the fact, sidestepping the requirements for scale and speed of current social media data. In this paper, we investigate the mechanisms of implicit information diffusion by computing its fine-grained provenance. We prove that explicit mechanisms are insufficient to capture influence, and our analysis unravels a significant part of implicit interactions and influence in social media. Our approach works incrementally and can be scaled up to cover a truly Web-scale scenario like major events. We can process datasets consisting of up to several million messages on a single machine at rates that cover bursty behaviour, without compromising result quality. By doing that, we provide online journalists, and social media users in general, with fine-grained provenance reconstruction which sheds light on implicit interactions not captured by social media providers. These results are provided in an online fashion, which also allows for fast relevance and trustworthiness assessment.}},
  author       = {{Taxidou, Io and Lieber, Sven and Fischer, Peter M. and De Nies, Tom and Verborgh, Ruben}},
  issn         = {{0926-8782}},
  journal      = {{DISTRIBUTED AND PARALLEL DATABASES}},
  keywords     = {{Provenance,Information diffusion,Incremental clustering,Social media,Influence}},
  language     = {{eng}},
  number       = {{1}},
  pages        = {{47--79}},
  publisher    = {{Springer}},
  title        = {{Web-scale provenance reconstruction of implicit information diffusion on social media}},
  url          = {{http://doi.org/10.1007/s10619-017-7211-3}},
  volume       = {{36}},
  year         = {{2018}},
}
