IDA Machine Learning Seminars - Fall 2014
Wednesday, September 17, 3.15 pm, 2014.
Sequential Decision Making: Experiment Design, Big Data and Reinforcement Learning
Christos Dimitrakakis, Computer Science and Engineering at Chalmers University of Technology.
Abstract: Gone are the days when statisticians used to work with fixed, laboriously compiled and labelled datasets. Nowadays data collection is frequently active and therefore must be adaptive. This talk will give an overview of the field of sequential decision making and how it relates to experiment design, active learning and the general problem of reinforcement learning. The main technical problems encountered are how to plan and learn efficiently. The first problem requires efficient optimisation algorithms, while the second requires good models that are easy to update online and can deal with large amounts of data.
Organizer: Mattias Villani
Wednesday, October 15, 3.15 pm, 2014.
Inducing Semantic Representations from Text with Little or No Supervision
Ivan Titov, Institute for Logic, Language and Computation at University of Amsterdam.
Abstract: Inducing meaning representations from text is one of the key objectives of NLP. Most existing statistical semantic analyzers rely on large human-annotated datasets, which are expensive to create and exist only for a very limited number of languages. Even then, they are not very robust, cover only a small proportion of semantic constructions appearing in the labeled data, and are domain-dependent. We investigate Bayesian models which do not use any labeled data but induce semantic representations from unannotated texts. Unlike semantically-annotated data, unannotated texts are plentiful and available for many languages and many domains which makes our approach particularly promising. We show that these models induce linguistically-plausible semantic representations, significantly outperform current state-of-the-art approaches, and yield competitive results in applications (e.g., question answering in the biomedical domain). We also look into several extensions of the model, and specifically consider multilingual induction of semantics, where we show that multilingual parallel texts (i.e. sentences and their translations) provide an additional valuable source of supervision.
Organizer: Marco Kuhlmann
Thursday, October 16, 10.15 pm, 2014.
Knowledge Discovery and Optimization Heuristics for Massive Networks
Extra seminar organized jointly with Seminars in Optimization.
Panos M. Pardalos, Center for Applied Optimization, Department of Industrial and Systems Engineering, University of Florida.
Abstract: In recent years, data mining and optimization heuristics have been used to analyze many large (and massive) datasets that can be represented as a network. In these networks, certain attributes are associated with vertices and edges. This analysis often provides useful information about the internal structure of the datasets they represent. We are going to discuss our work on several networks from telecommunications (call graph), financial networks (market graph), social networks, and neuroscience. In addition, we are going to present recent results on critical element selection. In network analysis, the problem of detecting subsets of elements important to the connectivity of a network (i.e., critical elements) has become a fundamental task over the last few years. Identifying the nodes, arcs, paths, clusters, cliques, etc., that are responsible for network cohesion can be crucial for studying many fundamental properties of a network.
Organizer: Oleg Burdakov
Wednesday, November 12, 3.15 pm, 2014.
Probing Cortical Representations of Naturalistic Stimuli with Deep Learning
Marcel van Gerven, Donders Institute for Brain, Cognition and Behaviour at Radboud University Nijmegen
Abstract: Recent advances in machine learning have shown that deep learning achieves state-of-the-art performance in visual object recognition. In this talk I outline how we used deep learning to disentangle the functional organisation of the cortical visual stream. Our results show that downstream areas code for features that are also represented in deeper layers of artificial neural networks. Furthermore, the outlined framework can be used as a high-throughput method for analysing how individual stimulus features are represented across the cortical sheet as well as for estimating voxel-level receptive fields. I argue that the marriage of statistical machine learning with cognitive neuroscience yields new insights into human cognition that cannot be easily achieved via more conventional approaches.
Organizer: Jose M. Peña
Wednesday, December 10, 3.15 pm, 2014.
Discovering, Modeling, and Predicting Task-by-Task Behaviour of Search Engine Users
Salvatore Orlando, Dept. of Environmental Sciences, Informatics and Statistics at Università Ca' Foscari Venezia.
Abstract: Users of web search engines are increasingly issuing queries to accomplish their daily tasks (e.g., "finding a recipe", "booking a flight", "reading online news", etc.). In this work, we propose a two-step methodology for discovering latent tasks that users try to perform through search engines. Firstly, we identify user tasks from individual user query logs. In our vision, a user task is a set of possibly non-contiguous queries, within or crossing user search sessions, which refer to the same latent need. Secondly, we discover collective tasks by aggregating similar user tasks, possibly performed by distinct users. To discover tasks, we propose to adopt clustering algorithms based on novel query similarity measures, in turn obtained by exploiting specific features, and both unsupervised and supervised learning approaches. In particular, in a recent work we show that query similarity can be effectively learned by Learning to Rank (L2R) techniques. All the proposed solutions were evaluated by exploiting a couple of manually-built ground-truth datasets, derived from a real log of search engine queries.
Furthermore, we introduce the Task Relation Graph (TGR) as a representation of users' search behaviors from a task-by-task perspective, by exploiting the collective tasks obtained so far. The task-by-task behavior is captured by weighting the edges of TGR with a relatedness score computed between pairs of tasks, as mined from the query log. We validated our approach on a concrete application, namely a task recommender system, which suggests related tasks to users on the basis of the task predictions derived from the TGR. Finally, we showed that the task recommendations generated by our technique are beyond the reach of existing query suggestion schemes, and that our solution is able to recommend tasks that users will likely perform in the near future.
Organizer: Oleg Sysoev
Page responsible: Mattias Villani
Last updated: 2015-01-07