### Tutorial day

The problem of reliable prediction in data science is to decide whether a prediction assigned to a particular case (instance) is indeed the true one. Solving this problem is important for applications that are characterized by high error costs and risks. Examples include applications in medicine, economics, and education.

The tutorial day consists of three invited lectures that introduce several solutions to the problem of reliable prediction in the context of the conformity framework. The first lecture considers approaches to probabilistic prediction. These approaches allow estimating the probability that a prediction assigned to a particular case (instance) is correct. The second lecture considers approaches to confidence prediction. These approaches allow estimating the confidence that a prediction assigned to a particular case (instance) is correct (the confidence is defined as the probability of picking a case that is as contrary as, or more contrary than, the case being predicted). The third lecture considers applications of reliable prediction to instance transfer. It discusses several approaches to combining or selecting instances from different data sets (domains).

The tutorial day is organised in the context of the 7th Symposium on Conformal and Probabilistic Prediction with Applications (COPA 2018), June 11-13, 2018, Maastricht, The Netherlands (see http://clrc.rhul.ac.uk/copa2018). The tutorials are followed by the Kolmogorov lecture, given by Professor Vladimir Vapnik (the inventor of support vector machines). The title of the lecture is "Rethinking Statistical Learning Theory: Learning Using Statistical Invariants". The lecture will start at 16:00 in the main Aula of Maastricht University.

### Lars Carlsson

Stena Line, Sweden

Royal Holloway, University of London, UK

### Henrik Linusson

University of Borås, Sweden

### Introduction to Conformal Prediction

How good is your prediction? In risk-sensitive applications, it is crucial to be able to assess the quality of a prediction. However, traditional classification and regression models do not provide their users with any information regarding prediction trustworthiness. In contrast, conformal classification and regression models associate each of their multi-valued predictions with a measure of statistically valid confidence, and let their users specify an upper bound on the model's error rate. The price to be paid is that predictions made with a higher confidence cover a larger area of the possible output space. This tutorial aims to provide its attendees with the knowledge necessary to implement conformal prediction in their daily data science work, be it research or practice oriented, and to highlight current research topics on the subject.
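As a rough illustration of the trade-off described above, the following is a minimal sketch of an inductive (split) conformal classifier. It is not taken from the tutorial material: the function names, the toy class centres, and the distance-based nonconformity score are all assumptions made for this example. Each candidate label gets a p-value, namely the fraction of calibration examples that are at least as nonconforming as the test case, and the prediction set keeps every label whose p-value exceeds the chosen significance level `epsilon`. Under the usual exchangeability assumption, the true label falls outside the set with probability at most `epsilon`.

```python
from typing import Callable, Hashable, Iterable, List, Sequence, Set, Tuple

Example = Tuple[float, Hashable]  # (object, label); objects are plain floats in this toy


def conformal_p_value(calibration_scores: Sequence[float], test_score: float) -> float:
    """Fraction of scores (calibration plus the test case itself) that are
    at least as nonconforming as the test score."""
    at_least_as_strange = sum(1 for s in calibration_scores if s >= test_score)
    return (at_least_as_strange + 1) / (len(calibration_scores) + 1)


def prediction_set(
    x: float,
    labels: Iterable[Hashable],
    calibration: Sequence[Example],
    score: Callable[[float, Hashable], float],
    epsilon: float,
) -> Set[Hashable]:
    """Return every label whose p-value exceeds the significance level epsilon."""
    cal_scores: List[float] = [score(xc, yc) for xc, yc in calibration]
    return {y for y in labels if conformal_p_value(cal_scores, score(x, y)) > epsilon}


# Hypothetical toy setup: two classes with assumed centres on the real line.
CENTRES = {"a": 0.0, "b": 10.0}


def distance_to_centre(x: float, label: Hashable) -> float:
    """Assumed nonconformity score: distance from x to the centre of the label."""
    return abs(x - CENTRES[label])


CALIBRATION: List[Example] = [
    (0.5, "a"), (-1.0, "a"), (9.0, "b"), (11.5, "b"), (0.2, "a"),
    (10.1, "b"), (1.5, "a"), (8.0, "b"), (-0.3, "a"),
]

if __name__ == "__main__":
    for epsilon in (0.2, 0.05):
        labels = prediction_set(0.4, CENTRES, CALIBRATION, distance_to_centre, epsilon)
        print(f"epsilon={epsilon}: {sorted(labels)}")
```

With only nine calibration examples, the smallest attainable p-value is 1/10, so asking for 95% confidence (`epsilon=0.05`) cannot exclude any label and the set grows to both classes, whereas `epsilon=0.2` leaves a single label.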

Since its development the framework has been combined with many popular techniques, such as Support Vector Machines, k-Nearest Neighbours, Neural Networks, Ridge Regression etc., and has been successfully applied to many challenging real world problems, such as the early detection of ovarian cancer, the classification of leukaemia subtypes, the diagnosis of acute abdominal pain, the assessment of stroke risk, the recognition of hypoxia in electroencephalograms (EEGs), the prediction of plant promoters, the prediction of network traffic demand, the estimation of effort for software projects and the back calculation of non-linear pavement layer moduli. The framework has also been extended to additional problem settings such as semi-supervised learning, anomaly detection, feature selection, outlier detection, change detection in streams and active learning. The aim of this symposium is to serve as a forum for the presentation of new and ongoing work and the exchange of ideas between researchers on any aspect of Conformal Prediction and its applications.

### Probabilistic prediction: Venn-ABERS prediction

The tutorial will introduce Venn Predictors, a generic framework for generating non-parametric, calibrated probabilistic predictions from an underlying ML classifier. The framework is non-parametric in the sense that it does not rely on any assumptions about the specific form of the probability distribution. The probabilities output by the framework are calibrated in the sense that they are guaranteed to reflect long-term relative frequencies (within statistical fluctuation).

In the special case of binary classification (i.e. classifying objects into two classes), a special form of Venn Predictors called Venn-ABERS predictors can be applied on top of any scoring classifier. An example of the application of Venn-ABERS predictors will be presented, and the characteristics of the method and its differences from other approaches (Isotonic Regression, Platt’s Scaling) will be discussed.
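To make the idea concrete, here is a minimal sketch of an inductive Venn-ABERS predictor in plain Python. It is an illustration under stated assumptions rather than the presenter's implementation: the function names and the toy calibration scores are invented for this example, and the isotonic fit is a simple pool-adjacent-violators routine. For a test score `s`, isotonic regression is fitted twice on the calibration set, once with the test object added under label 0 and once under label 1. The two fitted values at `s` form the pair of probabilities `(p0, p1)` for class 1.

```python
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, int]  # (classifier score, binary label)


def isotonic_fit(points: Sequence[Point]) -> List[List[float]]:
    """Pool-adjacent-violators fit of labels on scores (non-decreasing).

    Returns blocks [label_sum, count, largest_score_in_block] in score order.
    """
    # Aggregate tied scores first so that equal scores share one fitted value.
    totals: Dict[float, List[float]] = {}
    for s, y in points:
        acc = totals.setdefault(s, [0.0, 0.0])
        acc[0] += y
        acc[1] += 1
    blocks: List[List[float]] = []
    for s in sorted(totals):
        label_sum, count = totals[s]
        blocks.append([label_sum, count, s])
        # Pool while the previous block's mean exceeds the last block's mean
        # (compared by cross-multiplication to avoid division).
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            top = blocks.pop()
            blocks[-1][0] += top[0]
            blocks[-1][1] += top[1]
            blocks[-1][2] = top[2]
    return blocks


def fitted_value(blocks: List[List[float]], s: float) -> float:
    """Fitted mean of the block that contains score s (s must be in the data)."""
    for label_sum, count, upper in blocks:
        if s <= upper:
            return label_sum / count
    raise ValueError("score lies above every fitted block")


def venn_abers(calibration: Sequence[Point], s: float) -> Tuple[float, float]:
    """Return (p0, p1): the two calibrated probabilities of class 1 at score s."""
    p0 = fitted_value(isotonic_fit(list(calibration) + [(s, 0)]), s)
    p1 = fitted_value(isotonic_fit(list(calibration) + [(s, 1)]), s)
    return p0, p1


# Hypothetical calibration scores from some underlying scoring classifier.
CALIBRATION: List[Point] = [
    (0.1, 0), (0.2, 0), (0.3, 1), (0.4, 0),
    (0.6, 1), (0.7, 1), (0.8, 0), (0.9, 1),
]

if __name__ == "__main__":
    print(venn_abers(CALIBRATION, 0.5))
```

The width of the interval between `p0` and `p1` gives an indication of how much the calibration data constrain the estimate at that score. A plain isotonic-regression calibrator would return a single number instead.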

### Paolo Toccaceli

Royal Holloway, University of London, UK

### Shuang Zhou

Chengdu University of Information Technology, China

### Conformal Instance Transfer

Instance transfer aims at improving predictive models for a target domain by exploiting data from a related source domain. In this tutorial, we will explain how reliable prediction (based on the conformity framework) can help instance transfer. We will start by presenting a statistical conformal test that can be used to decide whether the source data is relevant to the target data. When relevance is established by the test, the source data can help improve predictive models for the target domain. When this is not the case, successful instance transfer requires that a subset of the source data be selected and used. We will discuss several approaches to source-subset selection that differ in their functionality and computational complexity. We will end the tutorial by presenting approaches to instance transfer that employ feature selection in addition to source-subset selection. We will show that these approaches are the only solution to instance transfer when the relevance of the source data to the target data varies over the features.

## Preliminary Program (Monday, 11th June)

### Tutorial 1

Confidence Prediction: Introduction to Conformal Prediction, by Lars Carlsson, Stena Line, Sweden & Henrik Linusson, University of Borås, Sweden

### Coffee Break

### Tutorial 2

Probabilistic Prediction: Venn-ABERS Prediction, by Paolo Toccaceli, Royal Holloway, University of London, UK

### Lunch

### Tutorial 3

Conformal Instance Transfer, by Shuang Zhou, Chengdu University of Information Technology, China

## Student Tutorial Package

€**90**/Person

- Access to 3 tutorials on June 11
- Kolmogorov lecture
- Coffee breaks & a lunch

## Academia Tutorial Package

€**125**/Person

- Access to 3 tutorials on June 11
- Kolmogorov lecture
- Coffee breaks & a lunch

## Industrial Tutorial Package

€**200**/Person

- Access to 3 tutorials on June 11
- Kolmogorov lecture
- Coffee breaks & a lunch

The registration form can be found here.