Programme:

Jeudi le 07 juin 2018, de 10h30 à 12h00, Salle B011 (LORIA):

10h30-11h15. Nyoman Juniarta: Sequential Pattern Mining using FCA and Pattern Structures for Analyzing Visitor Trajectories in a MuseumRésumé : The presentation is about mining visitor trajectories in Hecht Museum (Haifa, Israel), within the framework of CrossCult European Project about cultural heritage. We present a theoretical and practical research work about the characterization of visitor trajectories and the mining of these trajectories as sequences. The mining process is based on two approaches in the framework of FCA, namely the mining of subsequences without any constraint and the mining of frequent contiguous subsequences. Both approaches are based on pattern structures. In parallel, a similarity measure allows us to build a hierarchical classification which is used for interpretation and characterization of the trajectories w.r.t. four well-known visiting styles.

11h15-12h00. Tatiana Makhalova: A first study on what MDL can do for FCARésumé :Formal Concept Analysis can be considered as a classification engine able to build classes of objects with a description or concepts and to organize these concepts within a concept lattice. The concept lattice can be navigated for selecting significant concepts. Then the problem of finding significant concepts among the potential exponential number of concepts arises. Some measures exist that can be used for focusing on interesting concepts such as support, stability, and other. MDL (minimum description length) is also a good candidate that was rarely used in FCA by now for such objective. In this talk it will be argued that MDL can give FCA practitioners a good measure for selecting significant and representative concepts.

Mardi le 22 mai 2018, de 11h00 à 12h00, Salle B013 (LORIA):

Kevin Dalleau : Unsupervised Extremely Randomized TreesRésumé : Many unsupervised learning algorithms rely on a (dis)similarity measure to evaluate the pairwise distance between samples. Despite the large number of measures already described in the literature, in many applications, the set of available metrics is reduced by intrinsic characteristics of the data and of the chosen algorithm. In this talk, I will present a novel method based on extremely randomized trees. This method, UET (Unsupervised Extremely Randomized Trees), is based on the work of T.Shi and S.Horvath and can be generalized to heterogeneous data, enabling the clustering of datasets with mixed-types variables. The evaluation of the methods shows promising results, both in terms of discrimination of instances and in terms of running time.

Joël Legrand : TreeLSTM and cross-corpus training for extracting pharmacogenomic relationships from textRésumé : A key aspect of machine learning algorithms for relationship extraction in textual data is the availability of sufficiently large training data. Manually annotated corpora are valuable resources for this task, but the time and expertise required for their development explains why only few annotated corpora are freely available. For tasks related to precision medicine, most of them are rather small (i.e., hundreds of sentences) or they only focus on specialized relationships (e.g., drug-drug interactions) that rarely fit what one wants to extract. In this talk, I will present two lines of research that I pursued to overcome the lack of annotated data for the task of pharmacogenomic relation extraction. I will first present a TreeLSTM-based transfer learning method that allows to achieve high performance for the extraction of biomedical relationships from text, for which initial resources are scarce. Then, I will present the PGxCorpus, a manually annotated training corpus designed for the supervised extraction of pharmaco-genomic relationships from text.

Mardi le 13 février 2018, de 11h00 à 12h00, Salle B013 (LORIA):

Michel Habib : Graph classes defined via vertex orderings avoiding patternsRésumé : Consider a graph H given with a fixed ordering on its vertices. Then a graph G will belong to the class if and only if there exists an ordering of its vertices such that no subgraph of G induces a copy of H with the given fixed ordering. Let us give first a very classical example : chordal graphs. These are defined as graphs which do not contain any induced cycle of length at least 4. Is is well known (Dirac 1961) that chordal graphs are exactly graphs which admit a simplicial elimination ordering, that is an ordering on the vertices such that the neighbors of a vertex that are placed before it in the ordering induce a clique. This corresponds to the previous definition when H is the path on 3 vertices where the middle vertex is placed last in the ordering. There exists several results of this type : proper interval graphs, interval graphs, permutation graphs, co-comparability graphs, …, and it usually gives very good recognition algorithms, whereas in general forbidden subgraph characterization of a given class does not provide good algorithms.

Tatiana Makhalova : Formal Concept Analysis: some applications in computational learning theory and machine learningRésumé : Formal Concept Analysis and closed itemsets proved to be of big impor-tance for knowledge discovery, both as a tool for concise representation of asso- ciation rules and a tool for constructing domain taxonomies and ontologies.
In this talk, FCA will be considered in an unconventional way. We will discuss the connection of FCA with Computational Learning Theory. In particular, we will present an approach to obtaining the tight theoretical bounds of probability of overfitting and discuss why the precise bounds are not applicable in practice.The second part will be devoted to a more practical question, that is rule-based classifiers. We will see how random forests relate to frequent closed itemsets. Numerous experiments have proved that the quality of the random forests depends on such parameters as the size of bootstrap samples, the minimal size of leafs or the depth of trees as well as the size of subsets of features used to grow a tree. In the talk we will examine the question of computing a set of rules in a deterministic way in order to build an ensemble of classifiers (i.e. closed itemsets) having accuracy comparable to the random forests.

Mardi le 05 décembre 2017, de 14h00 à 15h00, Salle A006 (LORIA):

Emmanuel Jeandel : Computability in closure spacesRésumé : In many instances in logic or computable algebra, classical theorems show that many problems are undecidable for general structures, but become de-cidable if some rigidity is imposed on the structure. This is the case for example for groups (the rigidity criterion is then simpleness), first order logic (complete-ness), or symbolic dynamics (minimality). We will present a topological frame-work based on closure spaces and Galois connections to show that many of these proofs can be obtained in a similar setting. We will show in particular that these statements can be generalized to cover arbitrary structures, with no finite or recursive presentation/axiomatization.

Mardi le 03 octobre 2017, de 11h00 à 12h00, Salle C103 (LORIA):

Jimmy Devillet : Enumerating quasitrivial semigroupsRésumé : We investigate the class of binary associative and quasitrivial operations on a given finite set. Here quasitriviality (also known as conserva-tiveness) means that the operation always outputs one of its input values. We also examine the special situations where the operations are commutative and nondecreasing. In the latter case, these operations reduce to discrete uninorms, which are discrete fuzzy connectives that play an important role in fuzzy logic. As we will see nondecreasing, associative and quasitrivial operations are chara-cterized in terms of total and weak orderings through the so-called single-peakedness property introduced in social choice theory by Duncan Black. This will enable visual interpretaions of the above mentioned algebraic properties. Motivated by these results, we will also address a number of counting issues: we enumerate all binary associative and quasitrivial operations on a given finite set as well as of those operations that are commutative, are nondecreasing, have neutral and/or annihilator elements. As we will see, these considerations lead to several, previously unknown, integer sequences.

Vendredi le 23 juin 2017, de 11h30 à 12h30, Salle C103 (LORIA):

Victor Codocedo : On Locality Sensitive Hashing for Sampling Extent Generators Résumé : In this talk we introduce a method for sampling formal concepts using locality sensitive hashing (LSH). LSH is a technique used for finding approximate nearest neighbours given a set of hashing functions. Through our approach, we are able to predict the probability of an extent in the concept lattice given a set of objects and their similarity index, a generalization of the Jaccard similarity betwe- en sets. Our approach allows defining a lattice-based amplification construction to design arbitrarily discriminative sampling settings.

10h45 – 11h15 Justine Reynaud. A proposal for classifying the content of the web of data based on FCA and pattern structures

11h30 – 12h00 Pierre Monnin. Using FCA for checking the structure of an ontology in LOD: the example of DBpedia

Lundi le 12 juin 2017, de 16h00 à 17h00, Salle B013 (LORIA)

Matěj Stehlík: Topological bounds on the chromatic numberRésumé : The chromatic number is arguably the most studied invariant in graph theory. We are going to present some general upper and lower bounds on the chromatic number, before introducing a lower bound based on ideas from algebraic topology. This bound originates from Lovasz’s celebrated proof of Kneser’s conjecture in 1977. We will explore a new approach to these bounds using what we call higher-dimensional projective quadrangulations. Joint work with Tomas Kaiser.

Jeudi le 08 juin 2017, de 14h00 à 15h00, Salle C103 (LORIA)

Pauli Miettinen: Hyperbolic Communities: Modelling Communities Beyond CliquesRésumé : Finding and studying communities in (social) networks is one of the key problems in (social) network analysis. Much of this research models the commu-nities as (quasi-)cliques. In recent years, there has been an increasing interest to go « beyond cliques », and consider a more varied collection of community patterns. In this talk, I will present our recent hyperbolic community model that, on the one hand, generalises the classical core-periphery community model from sociology, and on the other hand, contains graph patterns such as cliques, power-laws, and stars as special cases. The empirical evidence shows that our model fits very well to many real-world graphs, and that it also allows us to understand the shape of the communities in that graph. I will also discuss our ongoing work on finding these communities using novel methods based on tropical matrix factorisations and on our novel characterisation of nested matrices.

Thomas Kipf: Deep Learning on Graphs with Graph Convolutional Networks (P. 2)Résumé : Neural networks on graphs have gained renewed interest in the machine learning community. Recent results have shown that end-to-end trainable neural network models that operate directly on graphs can challenge well-established classical approaches, such as kernel-based methods or methods that rely on graph embeddings (e.g. DeepWalk). In this talk, I will introduce two extensions to the basic framework of defining neural networks on graphs: graph auto-encoders [1] and relational graph convolutional networks (R-GCNs) [2]. While graph auto-encoders provide a new way of approaching problems like link prediction and clustering, R-GCNs allow for efficient modeling of directed, relational graphs, such as knowledge bases (e.g. DBPedia). If time permits, I will talk about some recent work that combines such graph-based approaches with deep learning methods from the field of natural language processing, to achieve new state-of-the-art results on the task of semantic role labeling.
[1] TN Kipf and M Welling, Variational Graph Auto-Encoders, NIPS Bayesian Deep Learning Workshop (2016)
[2] M Schlichtkrull, TN Kipf, P Bloem, R vd Berg, I Titov and M Welling, Modeling Relational Data with Graph Convolutional Networks, arXiv:1703.06103 (2017)

Mercredi, le 15 mars 2017, de 11h à 12h, Salle: B013 (LORIA):

Maria Boritchev: On Politics and ArgumentationRésumé : Argumentation mining is a relatively new topic in the field of corpus-based discourse analysis. This talk presents a work which primarily concerns with the identification of argumentative structures that can be found within dialogical discourse. Starting from data from 2016 U.S. presidential elec- tions debates, the aim is to automatically mine intertextual correspondences between political argumentation and social media reactions. Identifying the most efficient (here, reaction-rising) political discourse techniques can not only give the politicians the possibility to adjust their discourse but also give an attentive audience clues to critical thinking. During this talk I will present the current state of this work, introducing the chosen argumentation mining techniques and showing some intermediate results. Then I will finish with a discussion of the future work on intertextual correspondence mining.

Vendredi, le 10 mars 2017, de 14h00 à 15h00, Salle B013 (LORIA):

Jan Volec: Graph limit approach in extremal combinatoricsRésumé : In the last decade, various theories of combinatorial limits emerged, and attracted a substantial attention. In this talk, I will focus on applications of combinatorial limits and the so-called flag algebra method to problems in extremal combinatorics. I will present the main ideas behind these methods and then use them to answer a problem of Erdos on pseudo-random hypergraphs, solve a conjecture of Erdos and Sos concerned with recursive graph constructions, and to study robust versions of Turan-type problems.

Mercredi, le 8 mars 2017, de 11h00 à 12h00, Salle C003 (LORIA):

Abdallah Saffidine: The (Parameterized) Complexity of (Positional) GamesRésumé : Classical Computational Complexity is a popular tool from Theoretical Computer Science used to characterize the difficulty of solving decision problems. I will present how a couple intuitive criteria allow one to guess the most likely complexity class of games of various sort. We’ll then move to Parameterized Complexity which addresses some of the practical shortcomings of classical com- putational complexity and has seen increased adoption for studying graph-theoretic problems. I will describe ongoing work on building a better under- standing of the Parameterized Complexity of Games. As an application, we’ll show that Short Generalized Hex is W[1]-complete parameterized by the number of moves. This disproves a conjecture from Downey and Fellows’ influential list of open problems from 1999.

Joel Legrand: Word Sequence Modeling using Deep LearningRésumé : For a long time, natural language processing (NLP) has relied on generative models with task specific and manually engineered features. Recently, the rapidly growing interest for deep learning has led to state-of-the-art results in various fields such as computer vision, speech processing and natural language processing. The central idea behind these approaches is to learn features and models simultaneously, in an end-to-end manner, and making as few assumptions as possible. In NLP, word embeddings, mapping words in a dictionary on a continuous low-dimensional vector space, have proven to be very efficient for a large variety of tasks while requiring almost no a-priori linguistic assumptions. In this talk, I will present the results of my research on continuous representations of segments of sentences for the purpose of solving NLP tasks that involve complex sentence-level relationships. I will first introduce the key concepts of deep learning for NLP. I will then focus on two recent empirical studies concerning the tasks of syntactic parsing and bilingual word alignment. For each of them, I will present the main challenges as well as the deep learning-based solutions used to overcome them.

Mercredi, le 5 octobre 2016, de 15h à 16h, Salle: B013 (LORIA):

François Pirot: The bounds for the distance-t chromatic number, and the effect of removing cyclesRésumé : It is well known that you can colour a graph G of maximum degree d greedily with d+1 colors. Moreover, this bound is tight, since it is reached for example by the cliques. Johansson proved with a pseudo-random colouring scheme that you can colour triangle-free graphs of maximum degree d with no more than O(d/log d) colours. This is tight up to a multiplicative constant, since you can pseudo-randomly construct a family of graphs of maximum degree d, arbitrary large girth, and chromatic number bigger than c(d/log d), for some constant c. When you introduce the distance-t chromatic number of a graph G of maximum degree d, which is simply the chromatic number of G^t, you can be interested in finding similar bounds, and proving that they are tight. It is straight-forward that d^t is an upper bound for the distance-t chromatic number, but determining how tight this bound is is far from being an easy problem. You may also wonder what happens to the bound of the distance-t chromatic number when you forbid some fixed lengths of cycles in your graph. The goal of my talk will be to give some answers to these natural questions.

Vendredi 3 juin 2016, de 11h à 12h00, Salle: B013 (LORIA):

Pauli Miettinen: Matrix factorization models for patterns beyond blocksRésumé : Finding patterns, or regularities, from data and using the patterns to summarise the data are in the core of data mining. In many cases, these pattens are « block-like »: when the data can be expressed as a table of values, or as a matrix, the patterns are sub-tables (or sub-matrices) that are somehow different from the other parts of the data. Clustering, community detection, and frequent itemset mining can all be considered to find « block-like » patterns, to name a few examples. Block-like patters also fit well to the language of linear algebra, where we can model the block-like patterns as (approximate) rank-1 matrices, and data summarisations as matrix factorisations. Not all patterns or summaries in data mining are easy to express as matrix factorisations, though, and in recent years, patterns that go « beyond blocks » have become under active investigation. In this talk, I will discuss why being able to express the data mining task in matrix factorisation terms is important and cover my recent and ongoing work on different types of matrix factorisations that can be used to model patterns beyond blocks.

Sergei O. Kuznetsov: Optimizing Therapy in Subgroups of Patients with Pattern StructuresRésumé : The effectiveness of a therapy of serious lethal diseases is measured by survival curves. A therapy is better than another one if its survival curve lies strictly above. For standard randomisation strategies accepted in medical science, two therapies may be of same quality, however if we consider subgroups of patients defined by common physiological features, one treatment can be strictly better than another for certain subgroups. Closed descriptions is a very natural way for concise representation of subgroups, so we describe an efficient approach for finding closed descriptions of groups where some treatments are definitely preferred to others. The approach can be applied in other domains like optimal credit scoring.

Lucien Haddad: Partial clones and the Erdös-Faber-Lovsáz conjectureRésumé : Let A be a finite set. A partial clone on A is a set of partial functions closed under composition and containing all projection functions on A. We survey some results in the theory of partial clones. In particular: (1) we show the link between The Erdös-Faber-Lovsáz conjecture for graphs and combinatorial descriptions of some maximal partial clones, and (2) we give a complete classification of certain intervals of partial clones, that solves an open problem by D. Lau. These results were obtained in collaboration with (1) C. Tardif and (2) M. Couceiro, K. Schölzel and T. Waldhauser.

Lundi, le 7 décembre 2015, de 11h30 à 12h30, Salle: B011 (LORIA):

Fabien Labernia: Conditional preferences network : an overviewRésumé : Let V be a set of binary variables. Elicitating preferences over the set of all possible assignments for V to obtain a partial (or total) order can be a very long and difficult task. Different models have been proposed to 1) help the expert to elicit these preferences (ceteris paribus preferences), 2) represent in a compact way this exponential set (conditional preference rules) and 3) catch in a formal way some preference properties (such as importance of a variable). In this presentation we will survey different models like CP-nets, TCP-nets, CP-theories and CI-nets (the different properties they catch, active learning of these models), and discuss some current research directions on these combinatorial structures.

Mardi, le 1 décembre 2015, de 11h à 12h00, Salle: B013 (LORIA):

Bruno Teheux: A modal logic for games with liesRésumé : In the Rényi-Ulam game, a liar picks out a number in a given search space. A second player has to guess this number by asking Yes/No questions to the liar who is allowed to lie a maximal given number of times. It is known that states of the game can be encoded in the algebraic language using MV-algebras (which are to Łukasiewicz logic what Boolean algebras are to classical propositional logic). The aim of the talk is to present an (n+1)-valued version of Propositional Dynamic Logic (PDL) that is suitable to model the interactions between players in the Rényi-Ulam game. The ingredients of this model are modal expansions of Łukasiewicz finitely-valued logics, algebras of regular programs and many-valued Kripke models. Our main result will be a completeness result. We will also discuss the gain of expressive power allowed by the new many-valued dynamic language.

Mardi, le 24 novembre 2015, de 13h à 14h00, Salle: B013 (LORIA):

Kevin Dalleau: Suggesting valid pharmacogenes by mining linked dataRésumé : A standard task in pharmacogenomics research is identifying genes that may be involved in drug response variability, i.e., pharmacogenes. Because genomic experiments tended to generate many false positives, computational approaches based on the use of background knowledge have been proposed. Until now, those have used only molecular networks or the biomedical literature. Here we propose a novel method that consumes an eclectic set of linked data sources to help validating uncertain drug–gene relationships. One of the advantages relies on that linked data are implemented in a standard framework that facilitates the joint use of various sources, making easy the consideration of features of various origins. Consequently, we propose an initial selection of linked data sources relevant to pharmacogenomics. We formatted these data to train a random forest algorithm, producing a model that enables classifying drug–gene pairs as related or not, thus confirming the validity of candidate pharmacogenes. Our model achieves the performance of F-measure=0.92, on a 100 folds cross-validation. A list of top candidates is provided and their obtention is discussed.

Renato Vimieiro: Mining disjunctive emerging patterns in high-dimensional data using hypergraphsRésumé : Pattern mining is one of the most known sub-fields of data mining. Its main goal, as originally proposed by Agrawal et al., is to identify sets of attributes that are frequent amongst objects in a dataset. Most of the applications of (frequent) pattern mining involve the analysis of market basket data. In this type of data the attributes are binary and the data table is a matrix where objects are rows, attributes columns and cells indicate whether an object has an attribute. Since the beginning, the area has been extensively investigated and a wealth of efficient algorithms proposed. Although the original proposition was very successful in many domains, it was not suitable for the analysis of labelled data, i.e., data where objects had class labels. Thus one extension to frequent pattern mining, to cope with this type of data, was proposed in 1999 by Dong and Li. Their idea, named emerging patterns, was to extract patterns that were frequent amongst objects with one class label and infrequent amongst the others. Emerging patterns showed to be effective in many applications involving labelled data, particularly those in bioinformatics. In this presentation a new extension to emerging patterns will be briefly discussed. An overview of disjunctive emerging patterns will be given and an efficient algorithm based on hypergraphs and their minimal transversals for mining these patterns will be introduced.

Maurice Pouzet:Boolean dimension of a graph and inversion index of a tournamentRésumé : The Boolean dimension of a graph G (undirected with no loops) is the least integer m for which there exist m subsets X1, … , Xm of the vertex set V(G) of G such that the edges of G are exactly the 2-element subsets e of V(G) which are included into an odd number of the Xi‘s. If G has n vertices, its Boolean dimension is at most n-1; it is n-1 if and only if G is a path. The inversion of a subset X of the vertex set V(T) of a tournament T consists to invert all arcs of T between two vertices of X. The inversion index of T is the least number, denoted by i(T), of subsets of V(T) to invert in order to transform T into an acyclic tournament. I will present the main results and two tantalizing problems: the value of the inversion index of paths of strong connectivity, a majoration of the size of obstructions of the class I(n) of finite tournaments whose inversion index is a most n (the fact that there is a bound follows from techniques of well quasi ordering). Results presented here originate in the thesis of H. Belkhechine [1]. They have been obtained in collaboration with H. Belkhechine, M. Bouaziz and I.Boudabbous and announced in [2].
[1] H. Belkhechine, Indécomposabilité des graphes et des tournois, thèse de doctorat, 15 juillet 2009, Université Claude-Bernard et Université de Sfax.
[2] H. Belkhechine, M. Bouaziz, I. Boudabbous, M. Pouzet, Inversion dans les tournois, C. R. Acad. Sci. Paris, Ser. I 348 (2010) 703-707.

Vendredi, le 5 juin 2015, de 11h00 à 12h00, Salle B013 (LORIA):

Abdallah Saffidine: Solving Games and all of thatRésumé : Efficient best-first search algorithms have been developed for deterministic two-player games with two outcomes. We present a formal framework to represent such best-first search algorithms. The framework is general enough to express popular algorithms such as Proof Number Search, Monte Carlo Tree Search, and the Product Propagation algorithm. We then show how a similar framework can be devised for two more general settings: two-player games with multiple outcomes, and the model checking problem in modal logic K. This gives rise to new Proof Number and Monte Carlo inspired search algorithms for these settings.Similarly, the alpha-beta pruning technique is known to be very important in games with sequential actions. We propose an extension of this technique for stacked-matrix games, a generalization of zero-sum perfect information two-player games that allows simultaneous moves.

Erkko Lehtonen: Homomorphisms of labeled posetsRésumé : A partially ordered set labeled with k labels (a k-poset) is a triple (P, ≤, c), where (P, ≤) is a partially ordered set and c : P → {0, …, k − 1} is a labeling function. A homomorphism of a k-poset (P, ≤, c) to another k-poset (P′, ≤′, c′) is a map h : P → P′ that preserves both the order and the labels, i.e., h(x) ≤′ h(y) whenever x ≤ y and c = c′ ⚬ h. The set of all k-posets is partially ordered by the existence of homomorphism. We present some results concerning the structure of this homomorphism order. In particular, for k ≥ 2, the homomorphism order of finite k-posets is a distributive lattice that is universal for countable posets. We also discuss the complexity of certain decision problems involving labeled posets. This talk is partly based on joint work with Léonard Kwuida.

Mardi, le 28 avril 2015, de 12h00 à 13h00, Salle B013 (LORIA):

Michel Grabisch: Games on concept latticesRésumé : We introduce cooperative TU-games on concept lattices, where a concept is a pair (S,S’) with S being a subset of players or objects, and S’ a subset of attributes. Any such game induces a game on the set of players/objects, which appears to be a TU-game whose collection of feasible coalitions is a lattice closed under intersection, and a game on the set of attributes. We investigate the geometrical properties of the core (non-emptiness, boundedness, pointedness, extremal rays). In particular, we derive the equivalence of the intent and extent core for the class of distributive concepts.