Performance Improvement Based on Pinball Support Vector Machine
【Author】 伍玉莲;
【Supervisor】 於州;
【Author information】 East China Normal University, Statistics, 2021, Master's thesis
【Abstract】 For classification problems, the pinball support vector machine, which is based on the quantile distance of the data, is more robust than the classical hinge-loss support vector machine, which is based on the shortest distance of the data. Building on the pinball support vector machine, this thesis draws on the "distributed" and "composite" ideas in turn to improve the method and its algorithms in two respects: computational efficiency and the effectiveness of parameter estimation. To address the memory limits and computational cost that modern large-scale data impose, the thesis proposes and studies a distributed estimation method for the linear support vector machine with pinball loss (pin-SVM), suited to binary classification in a big data environment. Using a smooth approximation of the pinball loss function, the parameters are solved iteratively, which yields a multi-round distributed linear-type estimator of the pin-SVM parameters. Numerical simulations and applications to real data sets verify that the method estimates the parameters with high accuracy and is robust to noise. Distributed computation also greatly speeds up the calculation and reduces its cost. To further improve the effectiveness of pin-SVM parameter estimation, the thesis proposes a composite pinball support vector machine. First, the linear form of the composite pinball support vector machine is given, and its asymptotic convergence properties are proved theoretically. Second, the nonlinear kernel form is given, and its dual problem is derived from that form. Finally, to make the proposed method practical, a sequential minimal optimization algorithm is designed to solve the optimization problem of the composite pinball support vector machine, and numerical experiments show that the method performs well in terms of the effectiveness of parameter estimation.
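As a rough illustration of the losses contrasted in the abstract, the sketch below implements the hinge loss and the pinball loss on u = 1 - y f(x), together with a very simple distributed subgradient step for a linear pin-SVM. It is a minimal sketch under stated assumptions, not the thesis's method: plain subgradient averaging across shards stands in for the smoothing-based multi-round distributed linear-type estimator, and the function names, the parameters `tau` and `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np


def hinge_loss(u):
    """Hinge loss on u = 1 - y * f(x): only margin violations (u > 0) cost anything."""
    return np.maximum(u, 0.0)


def pinball_loss(u, tau):
    """Pinball loss on u = 1 - y * f(x): u when u >= 0, -tau * u when u < 0."""
    return np.where(u >= 0, u, -tau * u)


def subgradient(w, b, X, y, tau, lam):
    """Subgradient of (lam / 2) * ||w||^2 + mean pinball loss for a linear classifier."""
    u = 1.0 - y * (X @ w + b)
    slope = np.where(u >= 0, 1.0, -tau)          # d(loss)/du
    grad_w = lam * w - (slope * y) @ X / len(y)  # chain rule with du/dw = -y * x
    grad_b = -np.mean(slope * y)                 # chain rule with du/db = -y
    return grad_w, grad_b


def distributed_subgradient(w, b, shards, tau, lam):
    """Each shard computes its local subgradient; the results are averaged.

    Assumption: shards have equal size, so the average equals the full-data subgradient.
    """
    grads = [subgradient(w, b, Xk, yk, tau, lam) for Xk, yk in shards]
    grad_w = np.mean([g[0] for g in grads], axis=0)
    grad_b = np.mean([g[1] for g in grads])
    return grad_w, grad_b


# Hypothetical synthetic data, split across 4 equal-sized "machines".
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)
shards = [(X[i::4], y[i::4]) for i in range(4)]

w, b = np.zeros(2), 0.0
for t in range(300):
    gw, gb = distributed_subgradient(w, b, shards, tau=0.5, lam=0.01)
    step = 0.5 / np.sqrt(t + 1)  # decaying step size for subgradient descent
    w, b = w - step * gw, b - step * gb
```

With `tau = 0` the pinball loss reduces to the hinge loss. For `tau > 0`, correctly classified points beyond the margin are also penalised, which is what ties the fitted classifier to a quantile of the distances instead of the single shortest one.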
【Key words】 Big data; pinball support vector machine; distributed computing; composite quantile; kernel formulation; dual problem; sequential minimal optimization;
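- 【Online publication contributor】 East China Normal University 【Online publication year and issue】 2022, Issue 10
- 【Classification number】 TP181
- 【Download count】 11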