é-EGC : Antoine Cornuéjols (AgroParisTech) - “Machine learning: what is the place of reasoning?”
The talk will first show what role reasoning has played in machine learning throughout the history of Artificial Intelligence, since its beginnings. The second part of the talk will illustrate how this role is becoming important again in several applications and in recent developments of the field, and what we can anticipate. Room 201
é-EGC : Jean-Daniel Zucker - “Machine Learning and Interpretability: Examples in Precision Medicine”
In this talk we will review the challenges raised by the explainability of results produced by machine learning algorithms.
We will discuss their importance in medicine in particular. We will give a brief state of the art of the approaches for producing explanations, illustrate this notion on concrete examples, and present available tools for such analyses. Room 201
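As a concrete illustration of this kind of analysis, here is a minimal sketch; the dataset, model, and explanation technique are illustrative assumptions, not tools prescribed by the talk. It trains a classifier on a clinical-style dataset and ranks features by permutation importance, a common model-agnostic explanation technique.

```python
# Minimal interpretability sketch (illustrative choices throughout).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# A clinical-style dataset: tumor measurements, benign/malignant labels.
X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Permutation importance: how much does shuffling each feature hurt accuracy?
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
ranking = sorted(zip(X.columns, result.importances_mean), key=lambda t: -t[1])
for name, score in ranking[:5]:
    print(f"{name}: {score:.3f}")
```

Instance-level tools such as LIME or SHAP go further and explain individual predictions, which is often what matters when justifying a decision for a single patient.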
é-EGC : Jean-Gabriel Ganascia - “Graph and Intertextuality - Detection of Reuses in Large Masses of Texts”
Nowadays, no one really believes that intellectual works such as literary or philosophical writings are produced by spontaneous genius or by the visitation of the muses. In the seventies, many literary critics, such as Julia Kristeva, argued that social and cultural environments were essential to literary production. They based this hypothesis on their respective discoveries of segments of recycled expression (borrowed or quoted passages that betray reuse), which served as the basis for what they called intertextual studies.
With computers and the massive digitization of the classics, it is now possible to automate the detection of these markers with suitable text mining techniques. Software such as Phoebus, Philoline, or TextPair has been developed over the last few years to identify them.
However, when exploring corpora containing huge quantities of books (several hundred thousand, or even millions), the number of detected reuses becomes so high (several million) that it is impossible to grasp their meaning, especially when there are many approximate borrowings of the same fragment. After showing how reuses are detected, we will see how representing them as graphs makes it possible to use many concepts from graph theory, such as centrality, connected components, communities, bi-graphs, stream graphs, and link streams, to characterize the nature of reuses and their temporal evolution.
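As a toy illustration of this graph view (with invented data; the real systems operate on millions of detected reuses), one can load reuses as edges and compute components and centralities with a standard graph library:

```python
# Toy reuse graph (hypothetical data): nodes are works, edges are detected
# reuses. Connected components group works linked by chains of borrowings;
# degree centrality highlights the most-reused sources.
import networkx as nx

reuses = [  # (source_work, reusing_work, year) -- purely illustrative
    ("Montaigne", "Pascal", 1670),
    ("Pascal", "Voltaire", 1734),
    ("Pascal", "Rousseau", 1762),
    ("Voltaire", "Diderot", 1751),
]
G = nx.Graph()
for src, dst, year in reuses:
    G.add_edge(src, dst, year=year)

print("connected components:", list(nx.connected_components(G)))
print("degree centrality:", nx.degree_centrality(G))
```

Temporal structures such as stream graphs and link streams, mentioned above, additionally keep the time stamps on edges rather than flattening them into a static graph.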
We will see that, beyond their applications in the digital humanities, these techniques open up many perspectives in sectors that work with very large textual corpora.
Room 201
é-EGC : Vincent Lemaire - “Weakly supervised learning with a focus on Active Learning”
Machine learning from big labeled data is highly successful in speech recognition, image understanding, natural language translation, and more.
However, there are many applications where massive labeled data is not available: medicine, robotics, fraud detection, and others. In this talk I will discuss classification from limited information.
After a brief and general introduction to weakly supervised learning, we will give an overview of the active learning literature. Room 201
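As a minimal sketch of one classic strategy from that literature, uncertainty sampling (data and model are illustrative assumptions): starting from a handful of labels, each round queries the example the current model is least confident about.

```python
# Uncertainty-sampling active learning on synthetic data (illustrative).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, random_state=0)
# Seed the labeled set with five examples of each class.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
pool = [i for i in range(len(X)) if i not in labeled]

for _ in range(20):
    clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
    proba = clf.predict_proba(X[pool])
    # Least-confident sampling: pick the point with the smallest top-class
    # probability, i.e. the one the model is most unsure about.
    query = pool[int(np.argmin(proba.max(axis=1)))]
    labeled.append(query)   # the oracle reveals y[query]
    pool.remove(query)

print(f"{len(labeled)} labels used, accuracy: {clf.score(X, y):.3f}")
```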
é-EGC : Yves Kodratoff - “Moving to, or returning to, ‘Machine Reasoning’?”
Part 1: Some ‘memories’ of 1970s AI
Is AI a branch of computer science, or rather the science of explanations? “Alice” (1976) by Jean-Louis Laurière (1945-2005).
Room 201
Part 2: Human-machine interaction to implement “Extra-Strong Learning”: taking into account the failures and successes (innovations) of PROLOG programmers. The approach of Stephen Muggleton et al.
Part 3: Toward a theory of scientific creativity: symbiotic complex systems (‘goal-oriented’ symbiosis), ‘pulsating’ systems (knowing how to prove ‘existential’ theorems of the form ∃System ∀Problem solves(System, Problem)), and working with “almost complete” systems (handling incomplete specifications).
é-EGC : Michel Verleysen - “Dimensionality Reduction and Manifold Learning for High-Dimensional Data Analysis”
High-dimensional data are ubiquitous in many branches of science: sociology, psychometrics, medicine, and many others. Modern data science faces huge challenges in extracting useful information from these data. Indeed, high-dimensional data have statistical properties that make them ill-suited to conventional data analysis tools. In addition, the choice among the wide range of modern machine learning tools is difficult, because it should be guided by the (unknown) structure and properties of the data. Finally, in many applications, understanding the results of data analysis may be even more important than raw performance, because of the need to convince users and experts.
These reasons make dimensionality reduction of high-dimensional data, including (but not restricted to) visualization, an essential step in the data analysis process. Dimensionality reduction (DR) aims at providing faithful low-dimensional (LD) representations of high-dimensional (HD) data. Feature selection is a branch of DR that selects a small number of features among the original HD ones; keeping the original features helps interpretability. Other DR methods provide more flexibility by building new features as nonlinear combinations of the original ones, at the cost of lower interpretability.
This talk will cover advances in machine-learning-based dimensionality reduction. The curse of dimensionality and its influence on algorithms will be detailed as a motivation for DR methods. Next, the tutorial will cover information-theoretic criteria for feature selection, in particular mutual information used for multivariate selection. Finally, the talk will cover advances in nonlinear dimensionality reduction related to manifold learning: after a brief historical perspective, it will present modern DR methods relying on distance, neighborhood, or similarity preservation, using either spectral methods or nonlinear optimization tools. It will cover important issues such as scalability to big data, user interaction for dynamic exploration, reproducibility, stability, and performance evaluation. Room 201
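To make the two families concrete, here is a brief sketch (datasets and methods chosen for illustration; the univariate selection variant is shown for brevity, whereas the tutorial discusses multivariate criteria): mutual-information feature selection, which keeps original, interpretable features, next to a nonlinear manifold-learning embedding, which builds new coordinates.

```python
# Feature selection vs. nonlinear embedding (illustrative choices).
from sklearn.datasets import load_digits
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.manifold import Isomap

X, y = load_digits(return_X_y=True)   # 64-dimensional inputs

# Keep the 10 original features most informative about the label.
X_sel = SelectKBest(mutual_info_classif, k=10).fit_transform(X, y)

# Build a 2-D embedding that preserves neighborhood structure (Isomap is
# one classical manifold-learning method among many the field offers).
X_emb = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

print(X.shape, "->", X_sel.shape, "and", X_emb.shape)
```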
é-EGC : Arnaud Martin - “Classifier fusion and imperfect data management”
Classifier fusion methods are particular cases of information fusion. Voting methods allow the reliability of classifiers to be taken into account, but cannot represent the imperfections of classifier outputs. Probability theory makes it possible to model uncertainty, but not the imprecision of classifiers. The theory of belief functions is widely used in information fusion because it models the uncertainty and imprecision of classifiers as well as their reliability.
When combining imperfect experts' opinions, conflict is unavoidable. In the theory of belief functions, one of the major problems is the redistribution of global conflict, highlighted by Zadeh's famous example. As a consequence, a plethora of alternatives to Dempster's combination rule were born, in particular rules proposing alternative redistributions of conflict.
The global conflict is traditionally defined as the weight assigned to the empty set after a conjunctive rule. However, this quantity fails to adequately represent the disagreement between experts, in particular because the conflict between identical belief functions is not null, due to the non-idempotence of the majority of the rules.
This lecture presents information fusion principles with voting methods, probabilistic methods, and the theory of belief functions. Some definitions of conflict measures, and ways to manage conflict in the framework of the theory of belief functions, are presented.
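Zadeh's example can be worked out in a few lines. The sketch below (an illustration, not code from the lecture) combines two near-contradictory mass functions with the conjunctive rule, reads the global conflict off the empty set, and shows the counter-intuitive outcome of Dempster's normalization.

```python
# Zadeh's example: two experts on the frame {a, b, c}.
from itertools import product

def conjunctive(m1, m2):
    """Conjunctive combination of two mass functions (frozenset -> mass)."""
    out = {}
    for (A, v1), (B, v2) in product(m1.items(), m2.items()):
        out[A & B] = out.get(A & B, 0.0) + v1 * v2
    return out

a, b, c = frozenset("a"), frozenset("b"), frozenset("c")
m1 = {a: 0.99, b: 0.01}   # expert 1: almost certainly 'a'
m2 = {c: 0.99, b: 0.01}   # expert 2: almost certainly 'c'

m12 = conjunctive(m1, m2)
conflict = m12.get(frozenset(), 0.0)
print("global conflict:", round(conflict, 4))          # 0.9999
# Dempster's rule normalizes the conflict away: m({b}) becomes 1 even
# though both experts found 'b' highly implausible -- the motivation
# for the alternative redistribution rules discussed in the lecture.
print("Dempster m({b}):", round(m12[b] / (1 - conflict), 4))
```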
Room 201
The "minute of madness" session is a playful session in which authors present their work in two minutes with a single slide. It is a fun way for researchers to present their latest results and to tease their full presentations.
EGC awards ceremony.
Scientific awards will be presented during the conference: an award in the "academic paper" category (1500 euros), an award in the "applied paper" category (1500 euros), an award for the challenge (1500 euros, plus 500 euros in case of collaboration with a humanities and social sciences (SHS) team), and a thesis award (500 euros) given to a young doctor whose thesis was defended within the last three years on topics related to knowledge extraction and management.
"A quoi sert le management de la connaissance quand le Knowledge Manager ne connaît pas l'ingénierie de la connaissance?", Alain Berger, Ardans.
- “Estimating NOx air quality with maps and deep learning, a proof-of-concept for the European Environment Agency”, Wouter Labeeuw, Delaware.
The conference gala dinner takes place in the magnificent setting of the former theatre of the prestigious Plaza hotel, in the heart of Brussels.
The Plaza theatre.