Abstract: Models learned by traditional pairwise ranking algorithms often perform poorly when evaluated with ranking measures such as NDCG. To address this, a new ranking algorithm is proposed. The algorithm still uses sample pairs as training data, but defines an objective function oriented to the NDCG evaluation measure. Since this objective is non-smooth and difficult to optimize directly, a cutting plane algorithm is adopted for learning, which not only solves this problem but also makes the number of iterations independent of the number of training pairs. Finally, experiments on benchmark datasets demonstrate the effectiveness of the algorithm.
Keywords: ranking algorithm; pairwise method; support vector machine; NDCG; cutting plane algorithm
CLC number: TP181   Document code: A
Improved ranking algorithm based on pairwise method
CHENG Fan1,2, ZHONG Hong1
1.College of Computer Science and Technology, Anhui University, Hefei Anhui 230039, China;
2.School of Computer Science and Technology, University of Science and Technology of China, Hefei Anhui 230027, China
Abstract: The model learned by a ranking algorithm based on the traditional pairwise method does not work well when evaluated by ranking measures such as Normalized Discounted Cumulative Gain (NDCG). To solve this problem, a new ranking algorithm was proposed. The algorithm used the same training data as the traditional approach, the difference being a new objective function oriented to NDCG. Since this function is non-smooth and difficult to optimize directly, a cutting plane algorithm was adopted, which not only solved the above problem but also made the number of iterations independent of the training size. The experimental results on benchmark datasets prove the effectiveness of the proposed algorithm.
Key words: ranking algorithm; pairwise method; Support Vector Machine (SVM); Normalized Discounted Cumulative Gain (NDCG); cutting plane algorithm
0 Introduction
Learning to rank, a new direction in machine learning research, has attracted increasing attention in recent years because of its wide application in information retrieval, collaborative filtering, expert finding and other areas. Learning to rank refers to using machine learning techniques and labeled data to produce a ranking model. It can be regarded as a new kind of learning that lies between classification and regression, hence the name; in essence it concerns the representation of class labels.
According to the form of the training data, existing learning-to-rank methods can be divided into the pointwise approach based on single samples [1-2], the pairwise approach based on sample pairs [3-8], and the listwise approach based on lists of samples [9-11]. Among them, the pairwise approach is currently the mainstream of learning-to-rank research and has received the most attention, mainly for the following reasons:
1) It casts learning to rank as a binary classification problem: the training data are sample pairs with a preference (partial-order) relation, and the learning objective is to minimize the number of misordered pairs. Because the core technique is classification, many mature classification algorithms can be applied directly; for example, the most classic pairwise works [3-6] all build on existing classification techniques and achieve good results (a minimal sketch of this reduction is given after this list).
2) Its training data, sample pairs with a preference relation, can be obtained directly by analyzing users' click streams [6] without manual labeling by experts. This not only saves labor but, more importantly, makes it possible in practice to build a more accurate personalized model for each user (a small example of such click-based pair extraction also follows the list).
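As a concrete illustration of the pairwise reduction described in 1), the following is a minimal sketch in the spirit of Ranking SVM [5-6]: preference pairs within each query are turned into feature-difference vectors and fed to an off-the-shelf linear classifier. It assumes NumPy and scikit-learn; the toy data and all names are illustrative and not taken from the paper.

```python
# Minimal sketch of the pairwise reduction (in the spirit of [5-6]):
# each pair of documents from the same query with different relevance
# labels becomes one binary example whose feature vector is the
# difference of the two document vectors.  Toy data, names illustrative.
import numpy as np
from sklearn.svm import LinearSVC

def make_pairs(X, y, qid):
    """Turn (features, graded relevance, query id) into difference vectors."""
    diffs, signs = [], []
    for q in np.unique(qid):
        idx = np.where(qid == q)[0]
        for i in idx:
            for j in idx:
                if y[i] > y[j]:                      # i should rank above j
                    diffs.append(X[i] - X[j]); signs.append(+1)
                    diffs.append(X[j] - X[i]); signs.append(-1)  # balance classes
    return np.asarray(diffs), np.asarray(signs)

rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))                         # 12 documents, 5 features
y = rng.integers(0, 3, size=12)                      # relevance grades in {0,1,2}
qid = np.repeat([0, 1, 2], 4)                        # three queries, 4 documents each

X_pairs, y_pairs = make_pairs(X, y, qid)
clf = LinearSVC(C=1.0).fit(X_pairs, y_pairs)         # w minimizes misordered pairs
scores = X @ clf.coef_.ravel()                       # rank documents by w^T x
```

For the click-stream construction mentioned in 2), one common heuristic from [6] treats a clicked document as preferred over every document that was ranked above it but skipped. A hedged sketch (the function name and data layout are illustrative):

```python
# Hedged sketch of the "clicked > skipped-above" heuristic of [6]:
# within one result list, a clicked document is taken to be preferred
# over each higher-ranked document that was not clicked.
def pairs_from_clicks(ranked_docs, clicked):
    """ranked_docs: doc ids in the order shown; clicked: set of clicked ids."""
    prefs = []
    for pos, doc in enumerate(ranked_docs):
        if doc in clicked:
            for skipped in ranked_docs[:pos]:
                if skipped not in clicked:
                    prefs.append((doc, skipped))     # doc preferred over skipped
    return prefs

# The user skipped d1 and d2, then clicked d3:
print(pairs_from_clicks(["d1", "d2", "d3", "d4"], {"d3"}))
# -> [('d3', 'd1'), ('d3', 'd2')]
```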
However, research in recent years has shown that models learned by the traditional pairwise method often perform poorly when evaluated with ranking measures such as Normalized Discounted Cumulative Gain (NDCG) [12]. The main reason is that the traditional method simply turns learning to rank into some form of classification and pays no particular attention to the difference between the two.
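Since NDCG is central to what follows, a small self-contained computation of NDCG@k is sketched below, using the common exponential-gain / logarithmic-discount form; [12] allows other gain and discount choices, and the snippet is illustrative rather than the paper's own implementation.

```python
# NDCG@k with the common gain/discount choice:
#   DCG@k  = sum_{i=1..k} (2^{rel_i} - 1) / log2(i + 1)
#   NDCG@k = DCG@k / DCG@k of the ideal (relevance-sorted) ordering
import numpy as np

def dcg_at_k(rels, k):
    rels = np.asarray(rels, dtype=float)[:k]
    positions = np.arange(1, len(rels) + 1)
    return np.sum((2.0 ** rels - 1.0) / np.log2(positions + 1))

def ndcg_at_k(rels_in_ranked_order, k):
    ideal = dcg_at_k(sorted(rels_in_ranked_order, reverse=True), k)
    return dcg_at_k(rels_in_ranked_order, k) / ideal if ideal > 0 else 0.0

# Relevance grades of the documents in the order the model ranked them:
print(ndcg_at_k([2, 0, 1, 2, 0], k=5))   # about 0.89; 1.0 only for a perfect ranking
```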
To address this, this paper proposes a new ranking algorithm. It improves the pairwise approach by designing a new objective function oriented to the NDCG measure. Because this objective is non-smooth, the optimization techniques used in existing learning-to-rank work are difficult to apply to it directly; the paper therefore uses a subgradient-based cutting plane algorithm for approximate optimization. This not only solves the above problem but also makes the number of iterations independent of the number of training pairs, which makes the algorithm better suited to the large-scale training data that future applications may involve.
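To make the cutting plane idea concrete, the following is a hedged sketch of the generic subgradient-based cutting plane (bundle) loop for regularized risk minimization in the style of [14]: the convex, non-smooth risk R(w) is lower-bounded by the maximum of linear cuts built from subgradients, and a small reduced problem is re-solved until the gap falls below a tolerance, so the number of iterations depends on the tolerance and the regularization constant rather than on the number of training pairs. The pairwise hinge risk below is only a stand-in, not the paper's NDCG-oriented objective, which would supply risk_and_subgrad() instead; all names are illustrative.

```python
# Hedged sketch of a cutting-plane / bundle loop for
#   min_w  (lam/2)*||w||^2 + R(w),   with R convex and non-smooth,
# in the style of [14].  R here is a stand-in pairwise hinge risk;
# the paper's NDCG-oriented loss would replace risk_and_subgrad().
import numpy as np
from scipy.optimize import minimize

def risk_and_subgrad(w, diffs):
    """diffs[k] = x_i - x_j for a pair where i should rank above j."""
    margins = diffs @ w
    active = margins < 1.0                        # pairs violating the margin
    risk = np.mean(np.maximum(0.0, 1.0 - margins))
    subgrad = -diffs[active].sum(axis=0) / len(diffs)
    return risk, subgrad

def cutting_plane(diffs, lam=0.1, eps=1e-3, max_iter=50):
    d = diffs.shape[1]
    w = np.zeros(d)
    cuts = []                                     # cuts (a, b) with R(w) >= a.w + b
    for _ in range(max_iter):
        r, g = risk_and_subgrad(w, diffs)
        cuts.append((g, r - g @ w))               # linearize R at the current w
        upper = 0.5 * lam * (w @ w) + r           # true objective at current w
        # Reduced problem over z = (w, xi): min (lam/2)*||w||^2 + xi
        # subject to xi >= a.w + b for every cut collected so far.
        cons = [{"type": "ineq",
                 "fun": lambda z, a=a, b=b: z[-1] - (a @ z[:-1] + b)}
                for a, b in cuts]
        res = minimize(lambda z: 0.5 * lam * (z[:-1] @ z[:-1]) + z[-1],
                       np.append(w, 0.0), constraints=cons, method="SLSQP")
        w, lower = res.x[:-1], res.fun            # lower bounds the true optimum
        if upper - lower < eps:                   # gap bounds the suboptimality
            break
    return w

# Toy usage with random "difference vectors" standing in for preference pairs:
rng = np.random.default_rng(0)
diffs = rng.normal(size=(40, 5)) + 0.5
w = cutting_plane(diffs)
```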
References:
[1]CHU W, KEERTHI S S. New approaches to support vector ordinal regression[C]// Proceedings of the International Conference on Machine Learning. New York: ACM, 2005:145-152.
[2]SHASHUA A, LEVIN A. Ranking with large margin principle: Two approaches[C]//Advances in Neural Information Processing Systems. New York: ACM, 2003: 937-944.
[3]BURGES C, SHAKED T, RENSHAW E. Learning to rank using gradient descent[C]// Proceedings of the International Conference on Machine Learning. New York: ACM, 2005: 89-96.
[4]FREUND Y, IYER R, SCHAPIRE R E. An efficient boosting algorithm for combining preferences[J]. Journal of Machine Learning Research, 2003,4(12):933-969.
[5]HERBRICH R, GRAPEL T, OBERMAYER K. Large margin rank boundaries for ordinal regression[M]. Cambridge: MIT Press, 2000.
[6]JOACHIMS T. Optimizing search engines using clickthrough data[C]// Proceedings of the ACM Conference on Knowledge Discovery and Data Mining. New York: ACM, 2002: 134-142.
[7]ZHENG ZHAOHUI, ZHA HONGYUAN, ZHANG TONG, et al. A general boosting method and its application to learning ranking functions for Web search[C]// Advances in Neural Information Processing Systems. New York: ACM, 2008: 1697-1704.
[8]USUNIER N, BUFFONI D, GALLINARI P. Ranking with ordered weighted pairwise classification[C]// Proceedings of the 26th International Conference on Machine Learning. New York: ACM, 2009:188-196.
[9]CAO ZHE, QIN TAO, LIU TIE-YAN, et al. Learning to rank: From pairwise approach to listwise approach[C]// Proceedings of the International Conference on Machine Learning. New York: ACM, 2007: 129-136.
[10]ARAMPATZIS A, KAMPS J. Where to stop reading a ranked list[C]// Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2009: 524-531.
[11]McFEE B, LANCKRIET G. Metric learning to rank[C]// Proceedings of the 27th International Conference on Machine Learning. New York: ACM, 2010:235-243.
[12]JARVELIN K, KEKALAINEN J. Evaluation methods for retrieving highly relevant documents[C]// Proceedings of the 23rd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2000: 41-48.
[13]HIRIART-URRUTY J B, LEMARÉCHAL C. Convex analysis and minimization algorithms I and II[M]. Berlin: Springer-Verlag, 1993.
[14]TEO C H, LE Q, SMOLA A, et al. A scalable modular convex solver for regularized risk minimization[C]// Proceedings of the International Conference on Knowledge Discovery and Data Mining (KDD). New York: ACM, 2007: 48-57.
[15]LIU TIE-YAN, QIN TAO, XU JUN, et al. LETOR: Benchmark dataset for research on learning to rank for information retrieval[C]// Learning to Rank for Information Retrieval. New York: ACM, 2007: 137-145.
[16]XU JUN, LI HANG. AdaRank: A boosting algorithm for information retrieval[C]// Learning to Rank for Information Retrieval. New York: ACM, 2007: 391-398.