The problem of prediction is central in many areas of science,
and various general methods of prediction have been developed in
machine learning, statistics, and other areas of applied mathematics.
Our research project is in the machine-learning tradition.
A common feature of the standard methods of prediction is their
reliance on various stochastic assumptions made about the
data-generating mechanism. For example, the currently dominant approach
to prediction in machine learning is statistical learning theory. This
theory gives interesting and useful performance guarantees for various
prediction algorithms but makes a restrictive assumption on the way the
data are generated: they are assumed to consist of independent and
identically distributed labelled instances. Other possible assumptions
are the stationarity or the Gaussian distribution of the data sequence.
Competitive prediction does not make any assumptions on data generation
but still produces strong performance guarantees in the on-line
prediction framework; in many cases, these guarantees are as strong as
those obtained in statistical learning theory. The role of stochastic
assumptions is played by a benchmark class of prediction strategies:
the predictor first decides what class of prediction strategies he or
she wants to compete with, and methods of competitive prediction
produce predictions whose cumulative loss does not exceed the loss of
any prediction strategy in the benchmark class plus a small overhead
term. The task of competitive prediction is to design efficient
algorithms that minimize the overhead term.
An important development in machine learning in recent years has been
the explosion of interest in kernel methods (started by Vapnik's
invention of support vector machines). The so-called "kernel trick"
allows one to apply linear methods to non-linear problems. The kernel
trick has been applied to several algorithms of competitive prediction
to obtain competitive loss bounds for benchmark classes that are
Hilbert spaces of prediction strategies (basically, classes of
prediction strategies that are equipped with the notion of scalar
product). The prediction algorithms themselves work with kernels rather
than Hilbert spaces directly, which makes them computationally
efficient. Since a large number of kernels have been developed for
various applications, such as pattern recognition, web, bioinformatics,
linguistics, etc., this makes the methods of competitive prediction
very widely applicable.
The Hilbert spaces form a relatively small subclass of the Banach
spaces, and the main goal of this project is to develop methods of
competitive prediction for benchmark classes of prediction strategies
that form Banach rather than Hilbert spaces. Why is this important? For
us there are two main reasons.
The overhead term in the competitive loss bounds involves the norm of
the prediction strategy in the benchmark class we are competing with.
Therefore, norm of prediction strategies plays a fundamental role in
competitive prediction, whereas scalar product appears extraneous
(albeit important for some algorithms and methods of their analysis).
From the pragmatic point of view, Banach function spaces provide many
new interesting and practically important examples of benchmark classes
of prediction strategies. For example, some benchmark classes are just
too big to be equipped with scalar product. We believe that
Banach-space methods have potential to extend further the domain of
applicability of competitive prediction.