On high-dimensional data p >> n in mathematical statistics and bio-medical applications

September 9-20, 2002

Leiden, Netherlands

Statistics

Host: Lorentz Center
Homepage: http://www.lc.leidenuniv.nl/lorentz_center/2002/20020909/info.php3?wsid=51

Organizers: D. L. Donoho (Univ. Stanford CA, USA); J. C. van Houwelingen (LUMC Leiden, The Netherlands); S. A. van de Geer (Univ. Leiden, The Netherlands)

Description:
In traditional statistics, the golden rule is that the number of observations n should be larger than the number of variables (or parameters) p. Nowadays, however, the data we collect (or the models we build) frequently violate this rule. Indeed, the new statistical philosophy has almost reversed the traditional golden rule.

High-dimensional data may arise naturally, for example, when the objects to be analysed are curves, images, micro-arrays, or films. The growing memory and computing capacity of computers have made it possible to store and manipulate large data sets. But even a regression of y on a single covariate x can be seen as a high-dimensional problem as soon as the regression curve is modeled nonparametrically.

There are various approaches to handling high-dimensionality. Empirical Bayes methods put (hierarchical) priors on the many parameters. There are also the (related) penalized methods and thresholding methods, as well as projection techniques, additive models, multiresolution analysis, and model selection techniques. High-dimensional data sets in Banach spaces give rise to interesting challenges in the mathematical world (think, for instance, of results on concentration of measure in Banach spaces), as well as in areas such as data mining, data compression, and more.
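As a rough illustration of why penalization helps when p >> n, the sketch below fits a ridge (L2-penalized) regression on simulated data. It is only one of the approaches listed above, and the sample sizes, the sparse signal, and the penalty level lam are assumptions chosen for the example, not values taken from the workshop.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 200  # illustrative sizes with p >> n

# Simulated design matrix and a hypothetical sparse coefficient vector.
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 3.0
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# X has rank at most n, and X^T X has the same rank as X, so the
# p x p normal equations of ordinary least squares are singular
# and have no unique solution.
rank = np.linalg.matrix_rank(X)

# Ridge regression: adding lam * I makes the system positive definite,
# so it has exactly one solution.
lam = 1.0  # assumed penalty level, not tuned
gram = X.T @ X
beta_ridge = np.linalg.solve(gram + lam * np.eye(p), X.T @ y)

print(f"rank of X: {rank} (n = {n}, p = {p})")
print(f"ridge solution has {beta_ridge.size} finite coefficients:",
      bool(np.all(np.isfinite(beta_ridge))))
```

In practice the penalty level would be chosen from the data, for example by cross-validation, rather than fixed in advance as it is here.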

In bio-medical applications, high-dimensional data are traditionally found in clinical chemistry, radiology and medical imaging. The explosive growth of genomics has led to a new stream of high-dimensional data sets arising either from the genome-wide search for genetic causes of complex diseases or from gene-expression data measured in micro-arrays.

There will be two concentration periods (CP's):

CP1. p >> n in mathematical statistics (September 9-13)

CP2. p >> n in bio-medical applications (September 16-20)

The first period CP1 will primarily focus on mathematical results and the theory justifying the methods. It will be an opportunity to meet experts in the mathematics of the problem.

The second period CP2 will concentrate on bio-medical applications and on the theoretical questions that arise from such applications. The workshop will be of interest to young researchers and will be a learning experience for all.

During each CP there will be two or three main lectures a day, and many opportunities for discussion and collaboration.

Participants can choose to attend both CP's or just one of them.

The workshop is jointly organized by the International Lorentz Center, Leiden University, and the Department of Medical Statistics, Leiden University Medical Center.

Date received: November 15, 2001


© 2008 Atlas Conferences Inc.