Jeongyoun Ahn
Department of Statistics and Operations Research
The University of North Carolina at Chapel Hill
High Dimension Low Sample Size Data Analysis
(practice talk for the preliminary dissertation oral exam)
Date: Friday, February 18
Time: Noon-1:00PM
Place: Smith 211
Abstract:
In a binary discrimination problem, a linear classifier finds a hyperplane that separates the two classes by partitioning the data space. In a High Dimension Low Sample Size (HDLSS) setting in particular, there exist separating hyperplanes for which the projections of the training data points onto the normal direction vector are identically zero, or equal to some non-zero constant. Of interest in this talk is a separating hyperplane for which the projections of the training data points onto its normal direction vector take exactly two distinct values, one for each class. This direction vector is uniquely defined within the subspace generated by the data, and a simple formula is given for finding it. In non-HDLSS settings, this direction vector is the same as the Fisher Linear Discrimination direction vector.
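As a rough illustration of the two-value projection property described above, the following Python sketch builds one such direction on simulated HDLSS data. It is not necessarily the formula presented in the talk. It assumes the direction can be obtained as the minimum-norm least-squares solution that maps the centered data to centered class labels; the function name `piling_direction`, the label coding (+1/-1), and the simulated data are all hypothetical choices made for this example.

```python
import numpy as np


def piling_direction(X, y):
    """Return a unit vector w, lying in the span of the centered data,
    such that the projections X @ w are constant within each class.

    Assumption for this sketch: X is n x d with d > n, the centered data
    has rank n - 1, and y holds two distinct numeric class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)   # center the data at the overall mean
    t = y - y.mean()          # centered labels: one value per class
    # Minimum-norm solution of Xc @ w = t; the explicit cutoff discards
    # the numerically zero singular value introduced by centering.
    w = np.linalg.pinv(Xc, rcond=1e-10) @ t
    return w / np.linalg.norm(w)


# Simulated HDLSS data: 20 observations in 200 dimensions.
rng = np.random.default_rng(0)
n_per_class, d = 10, 200
X = np.vstack([
    rng.normal(0.0, 1.0, size=(n_per_class, d)),
    rng.normal(0.5, 1.0, size=(n_per_class, d)),
])
y = np.array([1] * n_per_class + [-1] * n_per_class)

w = piling_direction(X, y)
proj = X @ w

# Within-class range of the projections (should be numerically zero)
# and the gap between the two class-wise projection values.
spread_pos = np.ptp(proj[y == 1])
spread_neg = np.ptp(proj[y == -1])
gap = proj[y == 1].mean() - proj[y == -1].mean()

print("within-class spread:", spread_pos, spread_neg)
print("gap between classes:", gap)
```

On data like this, the within-class spreads should be zero up to floating-point error while the gap stays clearly positive, so the training points from each class project onto a single value. When the sample size exceeds the dimension, an exact solution of this kind generally does not exist, which is consistent with the abstract's restriction to the HDLSS setting.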
If time permits, some results from the study of the sample covariance matrix regarding HDLSS asymptotics will be presented. The parameter selection problem in kernel-based classification will also be introduced.