8 December - Thesis defense - Sara Akodad
10:00, Amphi Jean-Paul Dom - IMS laboratory (University of Bordeaux)
Ensemble learning methods on the space of covariance matrices: application to remote sensing scene and multivariate time series classification.
Given the growing success of second-order statistics in classification problems, this thesis focuses on the development of learning methods on manifolds. Covariance matrices are symmetric positive definite matrices, which live in a non-Euclidean space, so the classical tools of Euclidean geometry must be adapted to handle this type of data. To do so, we proposed to exploit the log-Euclidean metric. This metric makes it possible to project the set of covariance matrices onto a plane tangent to the manifold at a reference point, classically chosen to be the identity matrix; a vectorization step then yields the log-Euclidean representation. On this tangent plane, parametric Gaussian models and Gaussian mixture models (GMMs) can be defined. However, projecting onto a single tangent plane can introduce distortions. To overcome this limitation, we proposed a GMM built on several tangent planes, whose reference points are the centers of the clusters.
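As an illustration of the log-Euclidean representation described above, here is a minimal NumPy sketch with the identity matrix as the reference point. The function names `spd_log` and `log_euclidean_vector` are hypothetical, and the sqrt(2) scaling of the off-diagonal terms is a common convention assumed here, not necessarily the one used in the thesis.

```python
import numpy as np

def spd_log(cov):
    """Matrix logarithm of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(cov)
    # log(C) = U diag(log(lambda)) U^T for C = U diag(lambda) U^T
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_vector(cov):
    """Vectorize log(cov), i.e. the projection on the tangent plane at the identity."""
    log_cov = spd_log(cov)
    rows, cols = np.triu_indices(log_cov.shape[0])
    # Assumed convention: off-diagonal entries are scaled by sqrt(2) so that the
    # Euclidean norm of the vector equals the Frobenius norm of log(cov).
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_cov[rows, cols]

# Toy data: covariance of 200 random 3-dimensional samples.
rng = np.random.default_rng(0)
cov = np.cov(rng.normal(size=(200, 3)), rowvar=False)
vec = log_euclidean_vector(cov)  # vector of length d(d+1)/2 = 6
```

With this scaling, the Euclidean distance between two such vectors equals the log-Euclidean distance between the corresponding matrices, so standard Euclidean tools such as Gaussian mixture models can be applied directly to the vectors.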
Given the success of neural networks, in particular convolutional neural networks (CNNs), we proposed two hybrid transfer learning approaches based on covariance matrices computed locally and globally on the outputs of the CNN convolutional layers. The local approach relies on covariance matrices extracted locally from the first layers of a CNN, which are then encoded by Fisher vectors computed on their log-Euclidean representation. In the global approach, a single covariance matrix is computed on the feature maps of the deep layers of the CNN. To give more weight to the objects of interest in the images, we proposed to use a covariance matrix weighted by saliency information. Finally, to take advantage of both local and global aspects, the two approaches are combined in an ensemble strategy.
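The saliency-weighted covariance mentioned above can be sketched as follows. This is only one plausible formulation, assuming that the saliency values are normalized to sum to one over the spatial positions and that a small ridge `eps` is added to keep the matrix positive definite. The function name and the array shapes are hypothetical, and random arrays stand in for real CNN feature maps and saliency maps.

```python
import numpy as np

def saliency_weighted_covariance(feature_maps, saliency, eps=1e-6):
    """Covariance of CNN features with one saliency weight per spatial position.

    feature_maps: array of shape (H, W, C); saliency: non-negative array (H, W).
    """
    height, width, channels = feature_maps.shape
    feats = feature_maps.reshape(height * width, channels)
    weights = saliency.reshape(-1).astype(float)
    weights = weights / weights.sum()          # assumed normalization
    mean = weights @ feats                     # weighted mean feature vector
    centered = feats - mean
    cov = (centered * weights[:, None]).T @ centered
    return cov + eps * np.eye(channels)        # assumed regularization

# Random stand-ins for the feature maps of a deep layer and a saliency map.
rng = np.random.default_rng(0)
feature_maps = rng.normal(size=(8, 8, 4))
saliency = rng.random((8, 8))
weighted_cov = saliency_weighted_covariance(feature_maps, saliency)
```

With a uniform saliency map and `eps=0`, this reduces to the usual (biased) sample covariance of the feature vectors.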
In addition, the availability of multivariate time series has aroused the interest of the remote sensing community, and more generally of machine learning researchers, in developing new learning strategies dedicated to supervised classification. These include, in particular, methods based on computing point-to-point distances between series. However, two series belonging to the same class can evolve in different ways, which can induce temporal distortions (translation, compression, dilation, etc.). To compensate for this, warping methods make it possible to align the time series. To extend this approach to time series of covariance matrices while ensuring invariance to re-parametrization of the series, we studied the transported square-root vector field (TSRVF) representation. In the same context, several ensemble methods have been proposed in the literature, including the time series cluster kernel (TCK), which relies on similarity computation to classify time series. We proposed to extend this strategy to covariance matrices by introducing the SO-TCK approach, which relies on the log-Euclidean representation of such matrices.
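To make the effect of temporal distortions concrete, the sketch below compares a point-to-point distance with dynamic time warping (DTW) on two scalar series that differ only by a shift. DTW is used here as a standard example of a warping method; it is not the TSRVF framework studied in the thesis, and the function name `dtw_distance` is hypothetical.

```python
import numpy as np

def dtw_distance(series_a, series_b):
    """Dynamic time warping distance between two scalar time series."""
    a = np.asarray(series_a, dtype=float)
    b = np.asarray(series_b, dtype=float)
    n, m = len(a), len(b)
    # acc[i, j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]

# Same bump, shifted by one time step.
a = [0, 0, 1, 2, 1, 0]
b = [0, 1, 2, 1, 0, 0]
pointwise = float(np.sum(np.abs(np.array(a) - np.array(b))))  # 4.0
warped = dtw_distance(a, b)                                   # 0.0
```

The point-to-point distance penalizes the shift even though both series have the same shape, whereas the warped alignment absorbs it.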
Finally, the last axis of this thesis concerns the modeling of temporal trajectories of signals measured by radar (Sentinel-1) and optical (Sentinel-2) sensors. In particular, we address a forestry problem: chestnut ink disease in the Montmorency forest. For this purpose, we developed classification and regression models to predict a health status score from the covariance matrix computed on multi-temporal radiometric attributes.