*Nyoman Juniarta, Yannick Toussaint*

#### Introduction

In this work, we applied FCA to Démonette with two objectives. The first is to systematically represent the relation among derivational families. This representation should allow us to observe which families share the same set of derivations, and to see which family’s derivation set is more complex than another family’s. The second objective is to detect families having anomalies, i.e. families having either missing or incorrect derivations.

#### Results

Each family is represented as a derivational graph. The attributes of a node are a lexeme and its part of speech. An edge thus corresponds to a derivation between two lexemes, and is described by its orientation (direct, indirect, or undecidable) and its morphological pattern (e.g. X-Xeur). Currently, Démonext contains 25,444 families. For each family, we also define a “fingerprint”, which is a graph with the same structure as the family’s derivational graph, but without the lexemes. Consequently, several families can share the same fingerprint. Among the 25,444 families, there are 6,657 unique fingerprints. We applied FCA, more specifically an AOC-poset (partially ordered set of attribute-object-concepts), to obtain the poset of families and fingerprints.
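The fingerprint construction can be illustrated with a minimal Python sketch. It rests on a simplifying assumption that is ours, not the paper's: a fingerprint is approximated by the set of edge descriptions (source part of speech, target part of speech, orientation, pattern), which ignores the finer graph structure when several nodes share a part of speech. The family names, patterns, and the helpers `fingerprint` and `group_by_fingerprint` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data (not taken from the database). Each edge is
# (source lexeme, source POS, target lexeme, target POS, orientation, pattern).
families = {
    "fam_a": [("a1", "V", "a2", "N", "direct", "X-Xage")],
    "fam_b": [("b1", "V", "b2", "N", "direct", "X-Xage")],
    "fam_c": [("c1", "N", "c2", "N", "indirect", "Xage-Xeur")],
}

def fingerprint(edges):
    """Drop the lexemes; keep only POS, orientation, and pattern of each edge.

    Simplification: the result is a set of edge descriptions, not a full graph.
    """
    return frozenset(
        (src_pos, tgt_pos, orientation, pattern)
        for _, src_pos, _, tgt_pos, orientation, pattern in edges
    )

def group_by_fingerprint(families):
    """Map each distinct fingerprint to the list of families that share it."""
    groups = defaultdict(list)
    for name, edges in families.items():
        groups[fingerprint(edges)].append(name)
    return dict(groups)

groups = group_by_fingerprint(families)
```

In this toy data, `fam_a` and `fam_b` differ only in their lexemes, so they fall under the same fingerprint, while `fam_c` gets its own.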

An example of an AOC-poset built from five families is shown in Figure 1. We see that f1 “grows” by adding an indirect X-X derivation to become f2, or by adding two derivations to become f3.

From the poset in Figure 1, we can observe the relation among families. The family *cramer* (with only one derivation *cramerV* → *cramageN*) and the family *roder* (with only one derivation *roderV* → *rodageN*) share the same set of derivations. These two families are less complex than *haubaner* and *jaunir*. Finally, the derivations of *ajouter* are a combination of the derivations of *haubaner* and *jaunir*.
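To make the ordering concrete, the following Python sketch computes an AOC-poset for a toy formal context whose shape mirrors the description above: two families with a single shared derivation, two families that extend it in different ways, and one family that combines both extensions. The family names `fam1` to `fam5`, the derivation labels `d1` to `d4`, and all function names are hypothetical, and the code is a naive illustration rather than the implementation used in this work.

```python
# Toy formal context: objects are families, attributes are derivations.
context = {
    "fam1": frozenset({"d1"}),
    "fam2": frozenset({"d1"}),
    "fam3": frozenset({"d1", "d2"}),
    "fam4": frozenset({"d1", "d3", "d4"}),
    "fam5": frozenset({"d1", "d2", "d3", "d4"}),
}

def extent(context, attrs):
    """Objects that have every attribute in attrs."""
    return frozenset(o for o, a in context.items() if attrs <= a)

def intent(context, objs):
    """Attributes shared by every object in objs."""
    attr_sets = [context[o] for o in objs]
    if not attr_sets:
        return frozenset().union(*context.values())
    return frozenset.intersection(*attr_sets)

def aoc_poset(context):
    """Return the object-concepts and attribute-concepts as (extent, intent) pairs."""
    concepts = set()
    for o in context:  # object-concept introduced by o
        i = context[o]
        concepts.add((extent(context, i), i))
    for a in frozenset().union(*context.values()):  # attribute-concept introduced by a
        e = extent(context, frozenset({a}))
        concepts.add((e, intent(context, e)))
    return concepts

def is_below(c1, c2):
    """c1 <= c2 in the poset when the extent of c1 is contained in that of c2."""
    return c1[0] <= c2[0]

concepts = aoc_poset(context)
```

With this context the poset has four concepts: the one for `{d1}` sits at the top with all five families in its extent, and the concept of `fam5` sits below both the `fam3` and the `fam4` concepts.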

By exploring the poset and the number of families corresponding to each fingerprint, we can detect anomalies (missing or false derivations). An example of a missing derivation that we found is that of the family *orpailleur*, which has only two lexemes, *orpaillageN* and *orpailleurN*, and an indirect derivation between them. This is considered an anomaly since many families having that derivation also contain a verb. We then propose the addition of the lexeme *orpaillerV* and the corresponding derivations.
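One possible way to turn this kind of observation into a list of candidates is sketched below in Python. This is an illustrative heuristic of our own, not necessarily the procedure followed here: a fingerprint is flagged when it is a strict subset of another fingerprint that covers many more families. The function name, the `min_ratio` threshold, the derivation labels, and the counts are all hypothetical.

```python
def missing_derivation_candidates(counts, min_ratio=5):
    """Flag rare fingerprints contained in a much more frequent, larger one.

    counts maps a fingerprint (frozenset of derivation labels) to its number
    of families. Returns (small, large, missing) triples, where missing is the
    set of derivations that the larger fingerprint has and the smaller lacks.
    """
    candidates = []
    for small, n_small in counts.items():
        for large, n_large in counts.items():
            # `small < large` is the strict-subset test on frozensets.
            if small < large and n_large >= min_ratio * n_small:
                candidates.append((small, large, large - small))
    return candidates

# Hypothetical counts: one family with only a noun-noun derivation, and many
# families that also have the two derivations starting from a verb.
counts = {
    frozenset({"N-N indirect"}): 1,
    frozenset({"N-N indirect", "V-N direct (action)", "V-N direct (agent)"}): 40,
}
candidates = missing_derivation_candidates(counts)
```

Each candidate is only a suggestion for manual inspection; the decision on whether a derivation is truly missing remains with the linguists.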

Furthermore, we observe a case of possibly false derivations in the family *détracter*. This family has two direct X-Xion derivations: *détracterV → détractionN* and *détracterV → détractationN*, while several other families contain only one X-Xion derivation from a verb. However, we found that certain families have extra valid derivations that correspond to spelling variants, e.g. *essuyement-essuiement*, *débuscage-débusquage*, etc., which should not be regarded as incorrect derivations.

These findings were presented to linguists for validation. We built a web application, which can be accessed via https://github.com/nyomanjuniarta/Demonext-web. It displays a family having anomalies and a “normal” family side by side, to help the linguists decide whether there are actually missing or incorrect derivations.