
Performance of four subjective video quality assessment protocols and impact of different rating pre-processing and analysis methods

Abstract
Standardization bodies recommend various protocols to conduct subjective quality assessment (QA) of imaging systems. While many studies have compared these QA protocols, few have assessed the impact of different approaches for preprocessing and analyzing quality ratings. Furthermore, the effect of large versus small quality differences on the discrimination performance of protocols has not been extensively studied in video QA. This study examines these issues for four QA protocols. H.264 compressed medical videos and denoised natural scene videos were evaluated by expert and naive subjects. Scores were collected with four QA protocols – forced choice (FC), two ratio-scaled paired comparison methods: preference (Pref) and dissimilarity (Dissim), and single stimulus (SS) – and analyzed using combinations of different rating pre-processing approaches, generating a total of 14 methods. Performance metrics – probability and effect size – quantified the ability of the methods to discriminate between quality levels. The Pref and Dissim methods analyzed with classical multidimensional scaling and the SS method with Z-score transformation consistently outperformed the other methods. The type of pre-processing introduced large differences in the performance of individual protocols. Grouping stimuli pairs by small and large quality differences introduced significant differences in the performance rankings of the methods, with Pref and Dissim most sensitive to small quality differences. For the future, we suggest further validation of the FC method, due to its simplicity and ease of use, and continued investigation into more robust raw scores transformation and statistical analysis methods for both SS and FC ratings.
Keywords
Video quality assessment, statistical analysis, subjective ratings, mean opinion score (MOS)

Citation

Chicago
Kumcu, Asli, Klaas Bombeke, Ljiljana Platisa, Ljubomir Jovanov, Jan Van Looy, and Wilfried Philips. 2017. “Performance of Four Subjective Video Quality Assessment Protocols and Impact of Different Rating Pre-processing and Analysis Methods.” IEEE Journal of Selected Topics in Signal Processing 11 (1): 48–63.
APA
Kumcu, A., Bombeke, K., Platisa, L., Jovanov, L., Van Looy, J., & Philips, W. (2017). Performance of four subjective video quality assessment protocols and impact of different rating pre-processing and analysis methods. IEEE Journal of Selected Topics in Signal Processing, 11(1), 48–63.
Vancouver
1. Kumcu A, Bombeke K, Platisa L, Jovanov L, Van Looy J, Philips W. Performance of four subjective video quality assessment protocols and impact of different rating pre-processing and analysis methods. IEEE Journal of Selected Topics in Signal Processing. Institute of Electrical and Electronics Engineers (IEEE); 2017;11(1):48–63.
MLA
Kumcu, Asli, Klaas Bombeke, Ljiljana Platisa, et al. “Performance of Four Subjective Video Quality Assessment Protocols and Impact of Different Rating Pre-processing and Analysis Methods.” IEEE Journal of Selected Topics in Signal Processing 11.1 (2017): 48–63. Print.
@article{8504490,
  abstract     = {Standardization bodies recommend various protocols to conduct subjective quality assessment (QA) of imaging systems. While many studies have compared these QA protocols, few have assessed the impact of different approaches for preprocessing and analyzing quality ratings. Furthermore, the effect of large versus small quality differences on the discrimination performance of protocols has not been extensively studied in video QA. This study examines these issues for four QA protocols. H.264 compressed medical videos and denoised natural scene videos were evaluated by expert and naive subjects. Scores were collected with four QA protocols -- forced choice (FC), two ratio-scaled paired comparison methods: preference (Pref) and dissimilarity (Dissim), and single stimulus (SS) -- and analyzed using combinations of different rating pre-processing approaches, generating a total of 14 methods. Performance metrics -- probability and effect size -- quantified the ability of the methods to discriminate between quality levels. The Pref and Dissim methods analyzed with classical multidimensional scaling and the SS method with Z-score transformation consistently outperformed the other methods. The type of pre-processing introduced large differences in the performance of individual protocols. Grouping stimuli pairs by small and large quality differences introduced significant differences in the performance rankings of the methods, with Pref and Dissim most sensitive to small quality differences. For the future, we suggest further validation of the FC method, due to its simplicity and ease of use, and continued investigation into more robust raw scores transformation and statistical analysis methods for both SS and FC ratings.},
  author       = {Kumcu, Asli and Bombeke, Klaas and Platisa, Ljiljana and Jovanov, Ljubomir and Van Looy, Jan and Philips, Wilfried},
  issn         = {1932-4553},
  journal      = {IEEE Journal of Selected Topics in Signal Processing},
  keyword      = {Video quality assessment,statistical analysis,subjective ratings,mean opinion score (MOS)},
  language     = {eng},
  number       = {1},
  pages        = {48--63},
  publisher    = {Institute of Electrical and Electronics Engineers (IEEE)},
  title        = {Performance of four subjective video quality assessment protocols and impact of different rating pre-processing and analysis methods},
  url          = {http://dx.doi.org/10.1109/jstsp.2016.2638681},
  volume       = {11},
  year         = {2017},
}
