
Fifth Van der Meulen Seminar: Data science at the intersection of information theory, statistics, and machine learning

You are cordially invited by the IEEE Benelux Information Theory Chapter and the Centre for Telematics and Information Technology of the University of Twente to attend the

Fifth Van der Meulen Seminar:
Data science at the intersection of information theory, statistics, and machine learning

Date

Thursday, 27 November, 2014

Time

13:00 – 17:00

Location

University of Twente, Building: Horst, Room: Horstring N109

Program

12:30 – 13:00

Welcome

13:00 – 13:45

Peter Grünwald (CWI Amsterdam & Leiden University)
Learning when all models are wrong: an information-theoretic perspective

13:45 – 14:15

Thijs Laarhoven (Eindhoven University of Technology)

Collusion-resistant fingerprinting and group testing

14:15 – 14:30

Break

14:30 – 15:00

Arnoud den Boer (University of Twente)
Model Selection in Stochastic Optimization Problems with Incomplete Information

15:00 – 15:30

Sicco Verwer (Delft University of Technology)
Learning timed state machines using maximum likelihood tests

15:30 – 16:00

Break

16:00 – 16:30

Farzad Farhadzadeh (Eindhoven University of Technology)
Active content fingerprinting and its application in content identification

16:30 – 17:00

Robin Aly (University of Twente)

Big Data Programming Models

17:00

Reception

Website

Abstracts of the talks and biographies of the speakers are available on the seminar website; they are also included below.

Registration

Attendance is free of charge, but registration is required. Please indicate your attendance by sending a message to [email protected]. The deadline for registration is 20 November.

Contact information

Dr. ir. Jasper Goseling

Faculty of Electrical Engineering, Mathematics and Computer Science,

University of Twente

[email protected]

+31 53 489 33 69

Abstracts and biographies

Peter Grünwald

CWI Amsterdam & Leiden University

Learning when all models are wrong: an information-theoretic perspective

Bayesian and MDL (Minimum Description Length) inference can behave badly if the model under consideration is wrong yet useful: the posterior may fail to concentrate even for large samples, leading to extreme overfitting in practice. We demonstrate this on a simple regression problem. We introduce a test that can tell from the data whether we are heading for such a situation. The test is based on the idea of hypercompression: if we can compress data more than our model predicts, then the model must be wrong and there is danger of overfitting. In this case we can adjust the learning rate (equivalently: make the prior lighter-tailed, or penalize the likelihood more). The resulting "safe" Bayesian/MDL estimator behaves as well as standard Bayes/MDL if the model is correct but continues to achieve good rates with wrong models. In classification problems, it learns faster in easy settings, i.e. when a Tsybakov condition holds, effectively solving the old problem of 'how to learn the learning rate'.

* For an informal introduction to the idea, see Larry Wasserman's blog post.
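The hypercompression idea in the abstract can be sketched numerically. The following is a minimal, hypothetical illustration for Bernoulli data, not code from the talk: the function names and the choice of a Beta(1, 1) prior are assumptions made here for concreteness. It computes the codelength (in bits) that the Bayesian mixture code assigns to a binary sequence via its sequential predictive probabilities, and compares it with the codelength achieved by some alternative code; a positive gap means the data were compressed more than the model predicts, the misspecification signal the test looks for.

```python
import math

def bayes_codelength(xs, a=1.0, b=1.0):
    """Codelength (bits) of binary sequence xs under the Bayesian
    mixture code for the Bernoulli model with a Beta(a, b) prior,
    accumulated via the chain rule of predictive probabilities."""
    n1 = n0 = 0
    total = 0.0
    for x in xs:
        # Posterior-predictive probability of observing a 1 next.
        p1 = (n1 + a) / (n0 + n1 + a + b)
        total += -math.log2(p1 if x == 1 else 1.0 - p1)
        n1 += x
        n0 += 1 - x
    return total

def hypercompression_gap(xs, alt_codelength, a=1.0, b=1.0):
    """Positive gap: an alternative code compresses xs better than the
    Bayesian mixture, i.e. the model is 'hypercompressed' and suspect."""
    return bayes_codelength(xs, a, b) - alt_codelength
```

For example, for the two-symbol sequence [1, 0] the mixture code assigns probability 1/2 to the first symbol and 1/3 to the second, giving 1 + log2(3) ≈ 2.58 bits; any alternative code achieving fewer bits yields a positive gap. In the safe-Bayes scheme described in the abstract, such a gap would trigger lowering the learning rate rather than trusting the posterior as-is.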

Biography

Peter Grünwald (1970, VIDI, VICI) heads the information-theoretic learning group at CWI <http://www.cwi.nl>, the Dutch national research institute for mathematics and computer science, located in Amsterdam. He is also professor of statistical learning at Leiden University. His research interests lie where statistics, computer science and information theory meet: theories of learning from data. In 2010 he was co-awarded the Van Dantzig prize, the highest Dutch award in statistics and OR. He is mostly known for his work on MDL (he is the author of The Minimum Description Length Principle, MIT Press, 2007, which has become the standard reference in the field) and his active involvement in the ultimately successf