Data miners should know about multilevel models!
by
John Maindonald
Centre for Bioinformation Science, Australian National University
Many, perhaps most, large datasets have a variation structure that is important for inference. Whether or not multilevel methodology is formally used in the analysis, multilevel models are often the right framework for assessing predictive accuracy. This is true however the assessment is made: from the results of a formal multilevel analysis, by cross-validation or bootstrapping, or by the training/test set methodology. The assessed accuracy depends on the structure of the population for which predictions are made, and the differences between target populations can be huge. Simplistic assessments can be seriously wrong, usually exaggerating the accuracy.
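As an illustration only, and not an example from the talk, the following Python sketch simulates two-level data (observations nested within groups) and compares two cross-validation schemes. All names, sample sizes and variance values are assumptions chosen for the illustration, and the predictor is a deliberately simple stand-in (the training mean of the same group where one exists, otherwise the grand mean) rather than a fitted multilevel model. Holding out individual observations assesses accuracy for new observations from groups already seen; holding out whole groups assesses accuracy for new groups.

```python
import numpy as np


def simulate(n_groups=30, n_per_group=10, sd_group=2.0, sd_within=1.0, seed=0):
    """Two-level data: y = group effect + within-group noise (assumed values)."""
    rng = np.random.default_rng(seed)
    group = np.repeat(np.arange(n_groups), n_per_group)
    group_effect = rng.normal(0.0, sd_group, n_groups)
    y = group_effect[group] + rng.normal(0.0, sd_within, group.size)
    return group, y


def predict(train_group, train_y, test_group):
    """Toy predictor: same-group training mean if available, else grand mean."""
    preds = np.full(test_group.shape, train_y.mean(), dtype=float)
    for g in np.unique(test_group):
        in_train = train_group == g
        if in_train.any():
            preds[test_group == g] = train_y[in_train].mean()
    return preds


def cv_mse(group, y, fold_of_obs):
    """Cross-validated mean squared error for a given fold assignment."""
    sq_err = np.empty_like(y)
    for k in np.unique(fold_of_obs):
        test = fold_of_obs == k
        preds = predict(group[~test], y[~test], group[test])
        sq_err[test] = (y[test] - preds) ** 2
    return sq_err.mean()


group, y = simulate()
rng = np.random.default_rng(1)
n_folds = 5

# Scheme 1: observations are assigned to folds at random, ignoring groups.
obs_folds = rng.permutation(y.size) % n_folds

# Scheme 2: whole groups are assigned to folds, so test groups are unseen.
group_folds = (rng.permutation(group.max() + 1) % n_folds)[group]

mse_obs = cv_mse(group, y, obs_folds)
mse_group = cv_mse(group, y, group_folds)

print(f"CV MSE, observations held out: {mse_obs:.2f}")
print(f"CV MSE, whole groups held out: {mse_group:.2f}")
```

Under these assumed settings the observation-level scheme mainly reflects within-group variation, whereas the group-level scheme also includes between-group variation. If predictions are wanted for new groups, the first figure would understate the prediction error.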
Predictions are typically for some future time, but are based on data that carry very limited information on the time-varying component, so that formal time series or repeated measures modelling is not usually possible. The time-varying component adds a further, usually unknown, error to model predictions.
Date received: September 24, 2001
Copyright © 2001 by the author(s). The author(s) of this document and the organizers of the conference have granted their consent to include this abstract in Atlas Conferences Inc. Document # caic-15.