Learning to Aggregate Ordinal Labels by Maximizing Separating Width

International Conference on Machine Learning 2017

Guangyong Chen¹, Shengyu Zhang¹, Di Lin², Hui Huang², Pheng-Ann Heng¹

Figure 1. Graphical model of our generative model.


While crowdsourcing is a cost- and time-efficient method for labeling massive sample sets, one critical issue is quality control, for which the key challenge is to infer the ground truth from noisy or even adversarial data provided by various users. A large class of crowdsourcing problems, such as those involving age, grade, level, or stage, have an ordinal structure in their labels. Based on a technique of sampling estimated labels from the posterior distribution, we define a novel separating width among the labeled observations to characterize the quality of the sampled labels, and develop an efficient algorithm to optimize it by solving for multiple linear decision boundaries and adjusting the prior distributions. Our algorithm is empirically evaluated on several real-world datasets and demonstrates its superiority over state-of-the-art methods.
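To make the idea of sampling an estimated label from a posterior over ordinal levels concrete, here is a minimal Python sketch. It is not the model or algorithm from the paper: the likelihood (a worker's vote becomes exponentially less likely as its ordinal distance from the true level grows), the `sharpness` parameter, and the function names `posterior` and `sample_label` are all assumptions made for illustration. The sketch does not compute the separating width or the linear decision boundaries.

```python
import math
import random


def posterior(votes, num_levels, prior=None, sharpness=1.0):
    """Posterior over ordinal levels 1..num_levels given worker votes.

    Assumed likelihood (illustrative only, not the paper's model):
    P(vote | truth) is proportional to exp(-sharpness * |vote - truth|).
    All prior entries must be strictly positive.
    """
    levels = range(1, num_levels + 1)
    if prior is None:
        prior = [1.0 / num_levels] * num_levels  # uniform prior
    log_post = []
    for k, p in zip(levels, prior):
        # Normaliser of the assumed likelihood when the true level is k.
        z = sum(math.exp(-sharpness * abs(v - k)) for v in levels)
        log_lik = sum(-sharpness * abs(v - k) - math.log(z) for v in votes)
        log_post.append(math.log(p) + log_lik)
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


def sample_label(post, rng):
    """Draw one ordinal label (1-based) from a posterior distribution."""
    return rng.choices(range(1, len(post) + 1), weights=post, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    post = posterior(votes=[2, 3, 3, 4], num_levels=5)
    print([round(p, 3) for p in post])
    print(sample_label(post, rng))
```

In this toy setting, adjusting `prior` shifts the posterior mass between neighbouring levels, which is the only knob the sketch shares with the prior adjustment mentioned above.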

[To reference our ALGORITHM, API, CODE or DATA in any publication, please include the bibtex below and a link to this webpage.]


We would like to thank the anonymous reviewers for their valuable comments, which helped improve the presentation of this paper. This work is supported by the China 973 Program (Project No. 2015CB351706) and a grant from the National Natural Science Foundation of China (Project No. 61233012). Shengyu Zhang was supported by the Research Grants Council of the Hong Kong S.A.R. (Project No. CUHK14239416).

Downloads (faster for people in China)

Downloads (faster for people in other places)