On label dependence and loss minimization in multi-label classification

(2012) MACHINE LEARNING. 88(1-2). p.5-45
Abstract
Most of the multi-label classification (MLC) methods proposed in recent years intend to exploit, in one way or another, dependencies between the class labels. When such methods are compared with simple binary relevance learning as a baseline, any gain in performance is normally explained by the fact that binary relevance ignores such dependencies. Without questioning the correctness of such studies, one has to admit that a blanket explanation of that kind hides many subtle details, and indeed, the underlying mechanisms and true reasons for the improvements reported in experimental studies are rarely laid bare. Rather than proposing yet another MLC algorithm, the aim of this paper is to elaborate more closely on the idea of exploiting label dependence, thereby contributing to a better understanding of MLC. Adopting a statistical perspective, we claim that two types of label dependence should be distinguished, namely conditional and marginal dependence. Subsequently, we present three scenarios in which the exploitation of one of these types of dependence may boost the predictive performance of a classifier. In this regard, a close connection with loss minimization is established, showing that the benefit of exploiting label dependence also depends on the type of loss to be minimized. Concrete theoretical results are presented for two representative loss functions, namely the Hamming loss and the subset 0/1 loss. In addition, we give an overview of state-of-the-art decomposition algorithms for MLC and try to reveal the reasons for their effectiveness. Our conclusions are supported by carefully designed experiments on synthetic and benchmark data.
Keywords
Label dependence, Multi-label classification, Loss functions
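
The two loss functions named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions (Hamming loss as the fraction of mislabeled positions, subset 0/1 loss as 1 unless the whole label vector matches); the function names and the toy label vectors are assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def hamming_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of individual labels that are predicted incorrectly."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label vectors must be non-empty and of equal length")
    mismatches = sum(t != p for t, p in zip(y_true, y_pred))
    return mismatches / len(y_true)


def subset_zero_one_loss(y_true: Sequence[int], y_pred: Sequence[int]) -> int:
    """1 if the predicted label vector differs anywhere from the truth, else 0."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label vectors must be non-empty and of equal length")
    return int(list(y_true) != list(y_pred))


# Hypothetical example: four labels, one of them predicted incorrectly.
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]
print(hamming_loss(y_true, y_pred))          # 0.25
print(subset_zero_one_loss(y_true, y_pred))  # 1
```

A single wrong label costs only a quarter of a unit under the Hamming loss but the full unit under the subset 0/1 loss, which gives an intuition for why the two losses can favor different ways of handling label dependence.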

Citation

Chicago
Dembczyński, Krzysztof, Willem Waegeman, Weiwei Cheng, and Eyke Hüllermeier. 2012. “On Label Dependence and Loss Minimization in Multi-label Classification.” Machine Learning 88 (1-2): 5–45.
APA
Dembczyński, Krzysztof, Waegeman, W., Cheng, W., & Hüllermeier, E. (2012). On label dependence and loss minimization in multi-label classification. MACHINE LEARNING, 88(1-2), 5–45.
Vancouver
1. Dembczyński K, Waegeman W, Cheng W, Hüllermeier E. On label dependence and loss minimization in multi-label classification. MACHINE LEARNING. 2012;88(1-2):5–45.
MLA
Dembczyński, Krzysztof, Willem Waegeman, Weiwei Cheng, et al. “On Label Dependence and Loss Minimization in Multi-label Classification.” MACHINE LEARNING 88.1-2 (2012): 5–45. Print.
@article{2919153,
  abstract     = {Most of the multi-label classification (MLC) methods proposed in recent years intend to exploit, in one way or another, dependencies between the class labels. When such methods are compared with simple binary relevance learning as a baseline, any gain in performance is normally explained by the fact that binary relevance ignores such dependencies. Without questioning the correctness of such studies, one has to admit that a blanket explanation of that kind hides many subtle details, and indeed, the underlying mechanisms and true reasons for the improvements reported in experimental studies are rarely laid bare. Rather than proposing yet another MLC algorithm, the aim of this paper is to elaborate more closely on the idea of exploiting label dependence, thereby contributing to a better understanding of MLC. Adopting a statistical perspective, we claim that two types of label dependence should be distinguished, namely conditional and marginal dependence. Subsequently, we present three scenarios in which the exploitation of one of these types of dependence may boost the predictive performance of a classifier. In this regard, a close connection with loss minimization is established, showing that the benefit of exploiting label dependence also depends on the type of loss to be minimized. Concrete theoretical results are presented for two representative loss functions, namely the Hamming loss and the subset 0/1 loss. In addition, we give an overview of state-of-the-art decomposition algorithms for MLC and try to reveal the reasons for their effectiveness. Our conclusions are supported by carefully designed experiments on synthetic and benchmark data.},
  author       = {Dembczy\'{n}ski, Krzysztof and Waegeman, Willem and Cheng, Weiwei and H{\"u}llermeier, Eyke},
  issn         = {0885-6125},
  journal      = {MACHINE LEARNING},
  keyword      = {Label dependence,Multi-label classification,Loss functions},
  language     = {eng},
  number       = {1-2},
  pages        = {5--45},
  title        = {On label dependence and loss minimization in multi-label classification},
  url          = {http://dx.doi.org/10.1007/s10994-012-5285-8},
  volume       = {88},
  year         = {2012},
}
