e-EGC - Programme

Slides

Below are the slides from the winter school that are available to us:

Programme

Monday 27/01/2020

09h00 - 09h30 - Welcome! “From machine learning to machine reasoning”

09h30 - 11h00 - Antoine Cornuéjols (AgroParisTech) - “L’apprentissage automatique : où est la place du raisonnement ?” (Machine learning: where does reasoning fit in?)

11h00 - 11h30 - Coffee break

11h30 - 13h00 - Jean-Daniel Zucker (IRD) - “Machine Learning and interpretability: examples in precision medicine”

13h00 - 14h30 - Lunch

14h30 - 16h00 - Jean-Gabriel Ganascia (Sorbonne Université) - "Graph and Intertextuality - detection of reuses on big masses of texts"

16h00 - 16h30 - Coffee break

16h30 - 18h00 - Vincent Lemaire (Orange) - “Weakly supervised learning with a focus on Active Learning”

20h00 - Dinner

Tuesday 28/01/2020

09h00 - 10h30 - Yves Kodratoff - “Passer à ou revenir à « Machine Reasoning » ?” (Moving on to, or returning to, “Machine Reasoning”?)

10h30 - 11h00 - Coffee break

11h00 - 12h30 - Michel Verleysen (Ecole Polytechnique de Louvain) - “Dimensionality Reduction and Manifold Learning for High-Dimensional Data Analysis”

12h30 - 14h00 - Lunch

14h00 - 15h30 - Arnaud Martin (IRISA, Université de Rennes) - “Classifier fusion and imperfect data management”

15h30 - 16h00 - Closing of the school

Talk abstracts

Antoine Cornuéjols (AgroParisTech) - “L’apprentissage automatique : où est la place du raisonnement ?” (Machine learning: where does reasoning fit in?)

Abstract. The talk will first show what role reasoning has played in machine learning throughout the history of Artificial Intelligence, from its beginnings. The second part will illustrate how this role is becoming important again in several applications and in recent developments of the discipline, and what can be anticipated.

Biography. Antoine Cornuéjols is a professor at AgroParisTech, where he heads the LINK team (Learning and INtegration of Knowledge). He is particularly interested in transfer learning and in supervised and unsupervised collaborative learning methods. He is a member of a working group on learning and reasoning in the GDR-IA led by Henri Prade. He is co-author, with Laurent Miclet and Vincent Barra, of the book « Apprentissage Artificiel. Deep learning, concepts et algorithmes », and of the book “Phase Transitions in Machine Learning” (Cambridge University Press), which studies in particular how the knowledge representation used affects the complexity of learning and the results it can produce.

Jean-Daniel Zucker (IRD) - “Machine Learning and interpretability: examples in precision medicine”

Abstract. In this talk we will review the issues surrounding the explainability of results produced by machine learning algorithms.

We will discuss in particular their importance in medicine. We will give a brief state of the art of the approaches that make it possible to offer explanations. We will illustrate this notion with concrete examples and present tools available for such analyses.

Biography. Trained as an aeronautical engineer, Jean-Daniel Zucker completed a Master's in Artificial Intelligence (AI) for the life sciences in 1986, which led him to work for three years in the United States as an engineer and then vice-president of R&D at a startup of the New England Medical Center (Boston, USA). After a second Master's in Artificial Intelligence at Université Paris 6 in 1992, he obtained his doctorate (Paris 6) in machine learning in 1996 and became an associate professor (Maître de Conférences) the same year. He was appointed full professor (Professeur des Universités) of Artificial Intelligence at Université Paris 13 in 2002. He co-founded the LIM&BIO laboratory for medical informatics and bioinformatics at Université Paris 13. In 2007 he became a Research Director at the IRD (Institut de Recherche pour le Développement). For the past 5 years he has directed the UMMISCO laboratory for research in mathematical and computer modelling of complex systems, whose supervising institutions include the IRD, Sorbonne Université and UCA. He also leads the INTEGROMICS team of the university hospital institute (IHU) on cardiometabolic diseases and nutrition (ICAN). His research lies in AI and machine learning; it concerns the modelling of complex systems and high-dimensional predictive analysis applied to health, in particular to cardiometabolic diseases. He is the author of more than two hundred and fifty journal and conference articles. He has taught artificial intelligence and data mining at several universities in France (Sorbonne Université, UP13) and abroad (USTH, Vietnam). He has supervised 24 doctoral students, 20 of whom have already defended, and for ten years directed the International Doctoral Programme in Complex Systems Modelling (Programme Doctoral International Modélisation des Systèmes Complexes, PDI MSC) of SU and the IRD. He is also co-author of the book “Abstraction in Artificial Intelligence and Complex Systems”, published in 2013.

Jean-Gabriel Ganascia (Sorbonne Université) - “Graph and Intertextuality - detection of reuses on big masses of texts”

Abstract.

Nowadays, no one really believes that intellectual works such as literary or philosophical writings can be produced by spontaneous genius or by the visitation of the muses. In the seventies, many literary critics, such as Julia Kristeva, thought that social and cultural environments were essential to literary production. They based their hypothesis on their respective discoveries of segments of recycled expressions — that may be borrowed or quoted texts that betray reuses — which served as the basis for what they called intertextual studies.

With the use of computers and the massive digitization of classics, it is now possible to automate the detection of these markers with suitable text mining techniques. Software tools such as Phoebus, Philoline and TextPair have been developed over the last few years to identify them.

However, when exploring corpora containing huge quantities of books (several hundred thousand or even millions), the number of detected reuses becomes so high (several million) that it is impossible to understand their meaning, especially when there are many approximate borrowings of the same fragment. After showing how reuses are detected, we will see how, by representing them as graphs, many concepts from graph theory such as centrality, connected components, communities, bi-graphs, stream graphs or link streams can be used to characterize the nature of reuses and their temporal evolution.

We will see that beyond their applications in the digital humanities, these techniques open up many perspectives in many sectors where we work on very large textual corpora.
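
As a rough illustration of the graph view described in this abstract, here is a minimal Python sketch, not the speaker's method or any of the tools named above. It assumes reuses have already been detected and are given as pairs of passage identifiers (the `reuses` list and the names `p1` to `p5` are hypothetical). Connected components then group passages that borrow, directly or indirectly, from one another, and degree centrality points to the passages involved in the most reuses.

```python
from collections import defaultdict

# Hypothetical reuse links: each pair joins two passages sharing a reused fragment.
reuses = [("p1", "p2"), ("p2", "p3"), ("p4", "p5"), ("p1", "p3")]


def connected_components(edges):
    """Group nodes into connected components with a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    # Largest components first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


def degree_centrality(edges):
    """Fraction of the other nodes each node is directly linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    denom = max(len(neighbours) - 1, 1)  # avoid dividing by zero on tiny graphs
    return {node: len(nb) / denom for node, nb in neighbours.items()}


components = connected_components(reuses)
centrality = degree_centrality(reuses)
print(components)
print(centrality)
```

On a real corpus the edges would carry dates and similarity scores, which is where the stream-graph and community notions mentioned in the abstract come in; this sketch deliberately leaves them out.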

Biography. Professor of computer science at the Faculty of Science of Sorbonne Université (SU), senior member of the Institut Universitaire de France, chair of COMETS (the CNRS ethics committee) and chair of the steering committee of the CHEC (Cycle des Hautes Études de la Culture), Jean-Gabriel Ganascia pursues research on artificial intelligence, machine learning, data mining, the literary side of the digital humanities and computational ethics at LIP6 (Laboratoire d’Informatique de l’université Paris 6), where he leads the ACASA team.

CAREER
After initial training as an engineer and philosopher, he turned to computer science and artificial intelligence. He holds a doctoral thesis on knowledge-based systems, obtained at Université Paris-Sud in 1983, and a thèse d’État on symbolic learning, defended at Université Paris-Sud in 1987. He was appointed assistant at the Université d’Orsay (Paris XI) in 1982, then associate professor (maître de conférences) at the same university in 1987, and professor of computer science at UPMC (the former science faculty of Sorbonne Université) in 1988.

He directed the IARFA postgraduate diploma (Diplôme d'Etudes Approfondies in Intelligence Artificielle, Reconnaissance des Formes et Applications) for 12 years (1992-2004). He was also a scientific advisor (chargé de mission) to the CNRS directorate (1988-1992), before creating and directing the coordinated research programme (Programme de Recherches Coordonnées) « Sciences Cognitives » for the Ministry of Research (1993), and then the scientific interest group (Groupement d’Intérêt Scientifique) « Sciences de la cognition » (Ministry of Research, CNRS, CEA, INRIA, INRETS) (1995-2000). He coordinated, for Université Pierre et Marie Curie, the Erasmus Mundus master's programme DMKM (Data Mining and Knowledge Management) between 2010 and 2016.

PUBLICATIONS
Over the course of his career he has published more than 450 articles in scientific conference proceedings, scientific books and journals. He is also the author of several books for the general public, the most recent of which are:

Le mythe de la Singularité : faut-il craindre l’intelligence artificielle ?, Éditions du Seuil, 2017.
Intelligence Artificielle : vers une domination programmée ?, Le Cavalier Bleu, “Idées reçues” series, 2017.
Ce matin, maman a été téléchargée, Éditions Buchet-Chastel, under the pen name Gabriel Naëj, 2019.

Vincent Lemaire (Orange) - “Weakly supervised learning with a focus on Active Learning”

Abstract. Machine learning from big labeled data is highly successful: speech recognition, image understanding, natural language translation, and so on. However, there are various applications where massive labeled data is not available: medicine, robotics, fraud detection, and so on. In this talk I will discuss classification from limited information. After a brief, general introduction to weakly supervised learning, we will give an overview of the active learning literature.

Biography. Vincent Lemaire is a data scientist and a research project manager in the area "Data Analytics and Knowledge" in the research domain "Data and Knowledge" at Orange Labs, France. He obtained his undergraduate degree in signal processing from the University of Paris 12 and was, during the same period, an electronics teacher. He obtained a PhD in Computer Science from the University of Paris 6 in 1999. He thereafter joined the R&D Division of Orange (France Telecom), where he became a senior expert in data mining. He obtained his Research Accreditation (HDR) in Computer Science from the University of Paris-Sud 11 (Orsay) in 2008. His research interests are the application of machine learning in various areas for telecommunication companies, currently with a main application in data mining for business intelligence, fraud detection and churn prediction. He has organized several machine learning workshops and competitions, including the KDD Cup 2009, the AISTATS 2010 challenge on Active Learning, and other ECML, ICML and NeurIPS challenges, tutorials and workshops.

Website: http://vincentlemaire-labs.fr/

Yves Kodratoff - “Passer à ou revenir à « Machine Reasoning » ?” (Moving on to, or returning to, “Machine Reasoning”?)

Abstract.

Part 1: Some “memories” of the AI of the 1970s
Is AI a branch of computer science, or the science of explanations? “Alice” (1976) by Jean-Louis Laurière (1945-2005).

Part 2: Human-machine interaction to implement “Extra-Strong Learning”: taking into account the failures and successes (innovations) of PROLOG programmers. The approach of Stephen Muggleton et al.

Part 3: Towards a theory of scientific creativity: symbiotic complex systems (“goal-oriented” symbiosis), “pulsative” systems (being able to prove “existential” theorems of the form ∃ System ∀ Problem solves(System, Problem)), and working with “almost complete” systems (handling incomplete specifications).

Biography. To be announced.

Michel Verleysen (Ecole Polytechnique de Louvain) - “Dimensionality Reduction and Manifold Learning for High-Dimensional Data Analysis”

Abstract. High-dimensional data are ubiquitous in many branches of science: sociology, psychometrics, medicine, and many others. Modern data science faces huge challenges in extracting useful information from these data. Indeed, high-dimensional data have statistical properties that make them ill-suited to conventional data analysis tools. In addition, the choice among the wide range of modern machine learning tools is difficult because it should be guided by the (unknown) structure and properties of the data. Finally, in many applications understanding the results of data analysis may be even more important than performance, because of the need to convince users and experts.

These reasons make dimensionality reduction of high-dimensional data, including (but not restricted to) visualization, an essential step in the data analysis process. Dimensionality reduction (DR) aims at providing faithful low-dimensional (LD) representations of high-dimensional (HD) data. Feature selection is a branch of DR that selects a small number of features among the original HD ones; keeping the original features helps user interpretability. Other DR methods provide more flexibility by building new features as nonlinear combinations of the original ones, at the cost of lower interpretability.

This talk will cover advances in machine learning based dimensionality reduction. The curse of dimensionality and its influence on algorithms will be detailed as a motivation for DR methods. Next, the tutorial will cover information-theoretic criteria for feature selection, in particular mutual information used for multivariate selection. Finally the talk will cover advances in nonlinear dimensionality reduction related to manifold learning: after a brief historical perspective, it will present modern DR methods relying on distance, neighborhood or similarity preservation, and using either spectral methods or nonlinear optimization tools. It will cover important issues such as scalability to big data, user interaction for dynamical exploration, reproducibility, stability, and performance evaluation.
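
To make the information-theoretic criterion mentioned in this abstract concrete, here is a minimal Python sketch, under simplifying assumptions that are not taken from the talk: features and labels are discrete, mutual information is estimated from raw counts, and features are ranked one at a time (a univariate filter, simpler than the multivariate selection the abstract refers to). The toy `features` and `labels` are invented for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in count_xy.items()
    )


# Toy data (hypothetical): one informative, one noisy and one irrelevant feature.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
features = {
    "copy_of_label": [0, 0, 1, 1, 0, 0, 1, 1],
    "noisy": [0, 0, 1, 1, 0, 0, 1, 0],
    "independent": [0, 1, 0, 1, 0, 1, 0, 1],
}

scores = {name: mutual_information(values, labels) for name, values in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Count-based estimates like this one degrade quickly as the number of features considered jointly grows, which is one face of the curse of dimensionality the abstract uses as motivation.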

Biography. Michel Verleysen is Professor of Machine Learning at the Université catholique de Louvain, Belgium. He has been an invited professor at the Swiss E.P.F.L. (Ecole Polytechnique Fédérale de Lausanne, Switzerland), at the Université d'Evry Val d'Essonne (France), at the Université Paris 1 Panthéon-Sorbonne and at Université Paris Est. He is an Honorary Research Director of the Belgian F.N.R.S. (National Fund for Scientific Research), and the Dean of the Louvain School of Engineering. He is editor-in-chief of the Neural Processing Letters journal (published by Springer), chairman of the annual ESANN conference (European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning), past associate editor of the IEEE Trans. on Neural Networks journal, and member of the editorial board and program committee of several journals and conferences on neural networks and learning. He was the chairman of the IEEE Computational Intelligence Society Benelux chapter (2008-2010), and member of the executive board of the European Neural Networks Society (2005-2010). He is author or co-author of more than 250 scientific papers in international journals and books or communications to conferences with reviewing committee. He is the co-author of the scientific popularization book on artificial neural networks in the series “Que Sais-Je?”, in French, and of the "Nonlinear Dimensionality Reduction" book published by Springer in 2007. His research interests include machine learning, feature selection, nonlinear dimensionality reduction, visualization, high-dimensional data analysis, self-organization, time-series forecasting and biomedical signal processing.

Arnaud Martin (IRISA, Université de Rennes) - “Classifier fusion and imperfect data management”

Abstract. Classifier fusion methods are particular cases of information fusion. Voting methods allow the reliability of classifiers to be integrated but are not able to represent the imperfections of the output of classifiers. Probability theory makes it possible to model uncertainty but not the imprecision of classifiers. The theory of belief functions is widely used in information fusion because it models the uncertainty and imprecision of classifiers and their reliability.

When combining imperfect experts' opinions, conflict is unavoidable. In the theory of belief functions, one of the major problems is the redistribution of the global conflict, highlighted by Zadeh’s famous example. As a consequence, a plethora of alternatives to Dempster’s combination rule have been proposed, in particular offering alternative redistributions of the conflict.

The global conflict is traditionally defined as the weight assigned to the empty set by the conjunctive rule. However, this quantity fails to adequately represent the disagreement between experts, in particular because the conflict between identical belief functions is not null, owing to the non-idempotence of most of the rules.

This lecture presents information fusion principles with voting methods, probabilistic methods and the theory of belief functions. Some definitions of conflict measures are presented, along with ways to manage conflict in the framework of the theory of belief functions.
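
The two observations in this abstract, the conflict exposed by Zadeh's example and the non-zero conflict between identical belief functions, can be checked numerically. The following is a minimal Python sketch and not the lecture's material: mass functions are represented as dictionaries from `frozenset` focal elements to weights, the hypotheses `A`, `B`, `C` and the 0.99/0.01 values are chosen here for illustration in the style of Zadeh's example, and the function names are hypothetical.

```python
from itertools import product

EMPTY = frozenset()


def conjunctive(m1, m2):
    """Unnormalized conjunctive rule: each product of masses goes to the intersection."""
    combined = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        focal = a & b
        combined[focal] = combined.get(focal, 0.0) + wa * wb
    return combined


def dempster(m1, m2):
    """Dempster's rule: conjunctive rule, then renormalize away the empty-set mass."""
    combined = conjunctive(m1, m2)
    conflict = combined.pop(EMPTY, 0.0)
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {s: w / (1.0 - conflict) for s, w in combined.items()}, conflict


A, B, C = frozenset("A"), frozenset("B"), frozenset("C")

# Two experts who each consider B very unlikely but disagree on everything else.
m1 = {A: 0.99, B: 0.01}
m2 = {C: 0.99, B: 0.01}

fused, conflict = dempster(m1, m2)
# Conflict of a mass function with an identical copy of itself.
self_conflict = conjunctive(m1, m1).get(EMPTY, 0.0)

print(fused, conflict, self_conflict)
```

After normalization all the belief ends up on `B`, the hypothesis both experts considered nearly impossible, and combining `m1` with itself still puts mass on the empty set. This is why the empty-set mass alone is a poor measure of disagreement.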

Biography. Arnaud Martin is a full professor at the IRISA laboratory (UMR 6074) and co-head of the DRUID team (https://www-druid.irisa.fr/). Prof. Arnaud Martin teaches data fusion, data mining and computer science. His research interests mainly relate to the theory of belief functions for classification and include data fusion and data mining. He is the author of numerous papers and invited talks.

Web page: http://people.irisa.fr/Arnaud.Martin E-mail: Arnaud.Martin@univ-rennes1.fr
