
Clip-level feature aggregation : a key factor for video-based person re-identification

Chengjin Lyu (UGent) , Patrick Heyer Wollenberg (UGent) , Ljiljana Platisa (UGent) , Bart Goossens (UGent) , Peter Veelaert (UGent) and Wilfried Philips (UGent)
Abstract
In the task of video-based person re-identification, features of persons in the query and gallery sets are compared to search for the best match. Generally, most existing methods aggregate the frame-level features together using a temporal method to generate the clip-level features, instead of the sequence-level representations. In this paper, we propose a new method that aggregates the clip-level features to obtain the sequence-level representations of persons, which consists of two parts, i.e., Average Aggregation Strategy (AAS) and Raw Feature Utilization (RFU). AAS makes use of all frames in a video sequence to generate a better representation of a person, while RFU investigates how the batch normalization operation influences feature representations in person re-identification. The experimental results demonstrate that our method can boost the performance of existing models for better accuracy. In particular, we achieve 87.7% rank-1 and 82.3% mAP on the MARS dataset without any post-processing procedure, which outperforms the existing state-of-the-art.
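The averaging idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the clip length of 4, and the use of plain temporal averaging inside each clip are all assumptions made for illustration. It only shows how clip-level features could be averaged into a single sequence-level representation that uses every frame, and it does not cover RFU.

```python
from math import sqrt


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def split_into_clips(frame_features, clip_len):
    """Cut a sequence of frame-level features into consecutive clips.

    The last clip may be shorter when the sequence length is not a
    multiple of clip_len (hypothetical choice, not taken from the paper).
    """
    return [frame_features[i:i + clip_len]
            for i in range(0, len(frame_features), clip_len)]


def sequence_feature(frame_features, clip_len=4):
    """Average clip-level features into one sequence-level feature."""
    clips = split_into_clips(frame_features, clip_len)
    # Assumption: each clip is summarised by temporal average pooling.
    # A real model would apply its own temporal aggregation here.
    clip_features = [mean_vector(clip) for clip in clips]
    return mean_vector(clip_features)


def cosine_distance(a, b):
    """Cosine distance between two feature vectors (smaller = more similar)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy usage: compare a query sequence against a gallery sequence.
query_frames = [[float(i), float(2 * i)] for i in range(8)]
gallery_frames = [[float(i) + 0.5, float(2 * i)] for i in range(8)]
query_repr = sequence_feature(query_frames)
gallery_repr = sequence_feature(gallery_frames)
distance = cosine_distance(query_repr, gallery_repr)
```

In this sketch the comparison between query and gallery happens once per sequence rather than once per clip, which is the point of aggregating at the sequence level.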
Keywords
Person re-identification, Convolutional neural network, Feature aggregation

Downloads

  • (...).pdf | full text (Published version) | UGent only | PDF | 845.89 KB
  • Lyu C accepted version.pdf | full text (Accepted manuscript) | open access | PDF | 543.30 KB

Citation


MLA
Lyu, Chengjin, et al. “Clip-Level Feature Aggregation : A Key Factor for Video-Based Person Re-Identification.” Advanced Concepts for Intelligent Vision Systems - ACIVS 2020, edited by Jacques Blanc-Talon et al., vol. 12002, Springer, 2020, pp. 179–91, doi:10.1007/978-3-030-40605-9_16.
APA
Lyu, C., Heyer Wollenberg, P., Platisa, L., Goossens, B., Veelaert, P., & Philips, W. (2020). Clip-level feature aggregation : a key factor for video-based person re-identification. In J. Blanc-Talon, P. Delmas, W. Philips, D. Popescu, & P. Scheunders (Eds.), Advanced concepts for intelligent vision systems - ACIVS 2020 (Vol. 12002, pp. 179–191). https://doi.org/10.1007/978-3-030-40605-9_16
Chicago author-date
Lyu, Chengjin, Patrick Heyer Wollenberg, Ljiljana Platisa, Bart Goossens, Peter Veelaert, and Wilfried Philips. 2020. “Clip-Level Feature Aggregation : A Key Factor for Video-Based Person Re-Identification.” In Advanced Concepts for Intelligent Vision Systems - ACIVS 2020, edited by Jacques Blanc-Talon, Patrice Delmas, Wilfried Philips, Dan Popescu, and Paul Scheunders, 12002:179–91. Springer. https://doi.org/10.1007/978-3-030-40605-9_16.
Chicago author-date (all authors)
Lyu, Chengjin, Patrick Heyer Wollenberg, Ljiljana Platisa, Bart Goossens, Peter Veelaert, and Wilfried Philips. 2020. “Clip-Level Feature Aggregation : A Key Factor for Video-Based Person Re-Identification.” In Advanced Concepts for Intelligent Vision Systems - ACIVS 2020, ed. by Jacques Blanc-Talon, Patrice Delmas, Wilfried Philips, Dan Popescu, and Paul Scheunders, 12002:179–191. Springer. doi:10.1007/978-3-030-40605-9_16.
Vancouver
1. Lyu C, Heyer Wollenberg P, Platisa L, Goossens B, Veelaert P, Philips W. Clip-level feature aggregation : a key factor for video-based person re-identification. In: Blanc-Talon J, Delmas P, Philips W, Popescu D, Scheunders P, editors. Advanced concepts for intelligent vision systems - ACIVS 2020. Springer; 2020. p. 179–91.
IEEE
[1] C. Lyu, P. Heyer Wollenberg, L. Platisa, B. Goossens, P. Veelaert, and W. Philips, “Clip-level feature aggregation : a key factor for video-based person re-identification,” in Advanced concepts for intelligent vision systems - ACIVS 2020, Auckland, New Zealand, 2020, vol. 12002, pp. 179–191.
@inproceedings{8634968,
  abstract     = {{In the task of video-based person re-identification, features
of persons in the query and gallery sets are compared to search the
best match. Generally, most existing methods aggregate the frame-level
features together using a temporal method to generate the clip-level
features, instead of the sequence-level representations. In this paper, we
propose a new method that aggregates the clip-level features to obtain
the sequence-level representations of persons, which consists of two parts,
i.e., Average Aggregation Strategy (AAS) and Raw Feature Utilization
(RFU). AAS makes use of all frames in a video sequence to generate
a better representation of a person, while RFU investigates how batch
normalization operation influences feature representations in person
re-identification. The experimental results demonstrate that our method
can boost the performance of existing models for better accuracy. In
particular, we achieve 87.7\% rank-1 and 82.3\% mAP on MARS dataset
without any post-processing procedure, which outperforms the existing
state-of-the-art.}},
  author       = {{Lyu, Chengjin and Heyer Wollenberg, Patrick and Platisa, Ljiljana and Goossens, Bart and Veelaert, Peter and Philips, Wilfried}},
  booktitle    = {{Advanced concepts for intelligent vision systems - ACIVS 2020}},
  editor       = {{Blanc-Talon, Jacques and Delmas, Patrice and Philips, Wilfried and Popescu, Dan and Scheunders, Paul}},
  isbn         = {{9783030406042}},
  issn         = {{0302-9743}},
  keywords     = {{Person re-identification,Convolutional neural network,Feature aggregation}},
  language     = {{eng}},
  location     = {{Auckland, New Zealand}},
  pages        = {{179--191}},
  publisher    = {{Springer}},
  title        = {{Clip-level feature aggregation : a key factor for video-based person re-identification}},
  url          = {{https://doi.org/10.1007/978-3-030-40605-9_16}},
  volume       = {{12002}},
  year         = {{2020}},
}
