15 February 2018
10:45 - 11:45
Speaker
Julien Chiquet
Event category: Séminaire Probabilités et Statistique
Many application domains, such as ecology or genomics, have to deal with multivariate count data. A typical example is the joint observation of the abundances of a set of species across a series of sites, with the aim of understanding how these species co-vary. The Gaussian setting provides a canonical way to model such dependencies, but it does not generally apply to counts. We adopt here the Poisson lognormal (PLN) model, which is attractive because it describes multivariate count data with a Poisson distribution as the emission law, while all the dependencies are kept in a hidden multivariate Gaussian layer that is easy to work with. Although the usual maximum-likelihood inference raises some issues in PLN, we show how to circumvent them by means of a variational algorithm to which gradient descent readily applies. We then derive several variants of our algorithm to apply PLN to PCA, LDA and sparse covariance inference on multivariate count data. We illustrate our method on microbial ecology datasets, and show the importance of accounting for covariate effects to better understand interactions between species.
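To make the two-layer structure described in the abstract concrete, here is a minimal simulation sketch in Python with NumPy. It only generates data from a PLN-type model (a hidden multivariate Gaussian layer followed by Poisson emissions); it is not the variational inference algorithm presented in the talk. The function name `simulate_pln`, the parameter values and the absence of covariates are illustrative assumptions, not details taken from the talk.

```python
import numpy as np

def simulate_pln(n_sites, mu, sigma, seed=0):
    """Draw counts from a Poisson lognormal model (hypothetical helper).

    Z_i ~ N(mu, sigma)                hidden Gaussian layer, one row per site
    Y_ij | Z_ij ~ Poisson(exp(Z_ij))  emission law, one column per species
    """
    rng = np.random.default_rng(seed)
    # Hidden layer: all dependencies between species live in sigma.
    z = rng.multivariate_normal(mu, sigma, size=n_sites)
    # Emission layer: counts are conditionally independent given z.
    counts = rng.poisson(np.exp(z))
    return counts, z

# Illustrative parameters (assumed): three species, the first two
# positively correlated in the hidden layer, the third independent.
mu = np.array([1.0, 0.5, 2.0])
sigma = np.array([[0.5, 0.3, 0.0],
                  [0.3, 0.5, 0.0],
                  [0.0, 0.0, 0.2]])

counts, z = simulate_pln(500, mu, sigma)
print(counts[:5])                          # first five sites
print(np.corrcoef(counts, rowvar=False))   # empirical correlation of counts
```

In this sketch the covariance matrix `sigma` of the hidden layer is the only source of dependence between species, which is why inference in PLN targets that layer rather than the raw counts.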