"Test of significance when the data are curves"
(Under the direction of Jianqing Fan)
In the modern world, massive data are collected so easily that observations are often neither univariate nor multivariate; instead, they are sampled from continuous functions, and the observed data form a set of curves. More precisely, we have observations of the form \{ X_{i}(t_{j}) \}, where X_{i}(t_{j}) is the observation of the i^{th} subject at time t_{j}. For example, Rice and Silverman (1991) analyzed gait cycles, and Ramsay and Dalzell (1991) analyzed temperature and precipitation patterns.
Our goal is to provide a powerful method for testing whether two or more sets of curves differ significantly. The Kolmogorov-Smirnov test and the Cramér-von Mises test are two traditional methods. We will show that these two methods effectively test only a few dimensions and have low power in detecting features such as local bumps and high-frequency components.
We propose two orthogonal transforms to compress the information: a discrete Fourier transform and a discrete wavelet transform. The discrete Fourier transform is applied when the curves are smooth; we then apply the adaptive Neyman procedure to choose the statistic with the greatest power. The discrete wavelet transform is applied when the curves have local features; we then apply a thresholding procedure that selects which coefficients enter the test statistic. Simulation results and an application to business commercial data will be given in this talk.
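
The two procedures described above can be illustrated with a minimal sketch for the two-sample case. It is not the implementation used in this work, and it rests on several assumptions: the curves are observed on a common equally spaced grid, a Haar basis stands in for the wavelet transform, the adaptive Neyman statistic takes its commonly stated form max over m of (2m)^{-1/2} \sum_{k \le m} (Z_{k}^{2} - 1), and the threshold defaults to the universal value \sqrt{2 \log T}. All function names are hypothetical, and null distributions or critical values are not computed here.

```python
import numpy as np


def fourier_coefficients(curves):
    """Real orthonormal Fourier coefficients, ordered from low to high frequency."""
    curves = np.asarray(curves, dtype=float)
    T = curves.shape[-1]
    F = np.fft.rfft(curves, axis=-1, norm="ortho")
    # For even T the last rfft entry is the real-valued Nyquist term.
    stop = F.shape[-1] - 1 if T % 2 == 0 else F.shape[-1]
    inner = F[..., 1:stop]
    # Interleave sqrt(2)*Re and sqrt(2)*Im so the map preserves Euclidean norm.
    pairs = np.sqrt(2.0) * np.stack([inner.real, inner.imag], axis=-1)
    pairs = pairs.reshape(inner.shape[:-1] + (2 * inner.shape[-1],))
    parts = [F[..., :1].real, pairs]
    if T % 2 == 0:
        parts.append(F[..., -1:].real)
    return np.concatenate(parts, axis=-1)


def haar_coefficients(curves):
    """Orthonormal Haar wavelet coefficients (assumed basis; coarse scales first)."""
    curves = np.asarray(curves, dtype=float)
    T = curves.shape[-1]
    if T < 1 or T & (T - 1):
        raise ValueError("this Haar sketch needs a power-of-two number of time points")
    approx, details = curves, []
    while approx.shape[-1] > 1:
        even, odd = approx[..., 0::2], approx[..., 1::2]
        details.append((even - odd) / np.sqrt(2.0))
        approx = (even + odd) / np.sqrt(2.0)
    return np.concatenate([approx] + details[::-1], axis=-1)


def standardized_differences(coef_x, coef_y):
    """Coefficient-wise two-sample z-scores; rows are subjects."""
    n1, n2 = coef_x.shape[0], coef_y.shape[0]
    diff = coef_x.mean(axis=0) - coef_y.mean(axis=0)
    se = np.sqrt(coef_x.var(axis=0, ddof=1) / n1 + coef_y.var(axis=0, ddof=1) / n2)
    return diff / se


def adaptive_neyman(z):
    """Return the maximized statistic and the selected number of coefficients m."""
    m = np.arange(1, z.size + 1)
    scores = np.cumsum(z**2 - 1.0) / np.sqrt(2.0 * m)
    best = int(np.argmax(scores))
    return float(scores[best]), best + 1


def hard_threshold_statistic(z, delta=None):
    """Sum of squared z-scores exceeding delta, and how many were kept."""
    if delta is None:
        delta = np.sqrt(2.0 * np.log(z.size))  # assumed universal threshold
    keep = np.abs(z) > delta
    return float(np.sum(z[keep] ** 2)), int(keep.sum())
```

For two groups stored as arrays `x` and `y` with one curve per row, one would compute `adaptive_neyman(standardized_differences(fourier_coefficients(x), fourier_coefficients(y)))` for smooth alternatives, or replace the Fourier step with `haar_coefficients` and the statistic with `hard_threshold_statistic` for local alternatives. Ordering the Fourier coefficients from low to high frequency matters, because the adaptive Neyman statistic only considers the first m coefficients.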