Conference Publication Details
Mandatory Fields
Guo J.;Foley C.;Gurrin C.;Lao S.
Proceedings - IEEE International Conference on Multimedia and Expo
Semantic concept detection in imbalanced datasets based on different under-sampling strategies
2011
November
Published
1
Optional Fields
Classification; Imbalanced Dataset; SVM; TRECVid; Under-sampling
Semantic concept detection is a very useful technique for developing powerful retrieval or filtering systems for multimedia data. To date, the methods for concept detection have been converging on generic classification schemes. However, classification algorithms often face imbalanced-dataset or rare-class problems, which degrade the performance of many classifiers. In this paper, we adopt three under-sampling strategies to handle the imbalanced dataset issue in an SVM classification framework and evaluate their performance on the TRECVid 2007 dataset and additional positive samples from the TRECVid 2010 development set. Experimental results show that our well-designed under-sampling methods (method SAK) increase the performance of concept detection by about 9.6% overall. In cases of extreme imbalance in the collection, the proposed methods perform worse than a baseline sampling method (method SI); in the majority of cases, however, they increase the performance of concept detection substantially. We also conclude that method SAK is a promising solution for SVM classification on datasets that are not extremely imbalanced. © 2011 IEEE.
10.1109/ICME.2011.6011923
Grant Details