23 June – Thesis defense - Kim Savaroche

4 pm, Ecole Nationale Supérieure de Cognitique (Talence)

Shape tracking and visual field analysis in a software environment: a minimalist and externalist approach.

Gesture recognition and hand tracking attract attention from both academia and industry. The diversity of topics in one intersects with the problems of the other, and computer vision and machine learning now cover use cases that constantly challenge existing approaches. It is this set of constraints that we address here. A "constraint" is any kind of element that prevents an entity, here a piece of software, from progressing toward its predefined objective. Constraints take many forms and are not exclusively tied to the industrial context or to computer hardware. Beyond proposing a concrete software building block, this doctoral work examines the nature of the constraints encountered while building this necessarily plural software, as well as the conditions of its evolution.
Using a 2D camera, Clay locates and tracks a user's hands in real time, including in depth (Z), within a small video capture area. Once a movement has been identified, it can be linked to an action or instruction: raising or lowering the volume of an audio file, grabbing a 3D object, activating on-board controls in a vehicle cabin, etc. This principle of identifying the hand (and then tracking its shape) requires a fresh examination of the notions of automated segmentation, of classification of the hand (and its sub-parts) based on physiological criteria, and even of the very process of annotating joints. We provide a process for evaluating hand identification in an image stream, guided by the automation of qualitative tests.
Moreover, on the theoretical and epistemological levels, the analogical segmentation of the identified hand is based on the seminal concepts of the theories of Form; form is understood only in relation to how it is grasped, constituting what we describe as a "tooled perception", following the TAC thesis (Technique as Anthropologically Constitutive).
Finally, in terms of model engineering, we offer an annotation tool that combines the advantages of 3D modeling with the calibration of several 2D cameras to solve the problems inevitably linked to 2D annotation; in particular, it resolves inconsistent proportions and the blind spots affecting parts of the hand. Each of these key elements is accompanied by concrete results, viewed in light of the nature of the constraints that led the developer to choose one solution over another.

Event location