Statistical Machine Learning and Modelling of Biological Systems

Team Publications

Year of publication 2020

Sebastien Gauthier, Iwona Pranke, Vincent Jung, Loredana Martignetti, Véronique Stoven, Thao Nguyen-Khoa, Michaela Semeraro, Alexandre Hinzpeter, Aleksander Edelman, Ida Chiara Guerrera, Isabelle Sermet-Gaudelus (2020 Sep 15)

Urinary Exosomes of Patients with Cystic Fibrosis Unravel CFTR-Related Renal Disease.

International journal of molecular sciences : DOI : E6625
Summary

The prevalence of chronic kidney disease is increased in patients with cystic fibrosis (CF). The study of urinary exosomal proteins might provide insight into the pathophysiology of CF kidney disease. Urine samples were collected from 19 CF patients (among those 7 were treated by cystic fibrosis transmembrane conductance regulator (CFTR) modulators), and 8 healthy subjects. Urine exosomal protein content was determined by high resolution mass spectrometry. A heatmap of the differentially expressed proteins in urinary exosomes showed a clear separation between control and CF patients. Seventeen proteins were upregulated in CF patients (including epidermal growth factor receptor (EGFR); proteasome subunit beta type-6, transglutaminases, caspase 14) and 118 were downregulated (including glutathione S-transferases, superoxide dismutase, klotho, endosomal sorting complex required for transport, and matrisome proteins). Gene set enrichment analysis revealed 20 gene sets upregulated and 74 downregulated. Treatment with CFTR modulators yielded no significant modification of the proteomic content. These results highlight that CF kidney cells adapt to the CFTR defect by upregulating proteasome activity and that autophagy and endosomal targeting are impaired. Increased expression of EGFR and decreased expression of klotho and matrisome might play a central role in this CF kidney signature by inducing oxidation, inflammation, accelerated senescence, and abnormal tissue repair. Our study unravels novel insights into consequences of CFTR dysfunction in the urinary tract, some of which may have clinical and therapeutic implications.

Racha Chouaib, Adham Safieddine, Xavier Pichon, Arthur Imbert, Oh Sung Kwon, Aubin Samacoits, Abdel-Meneem Traboulsi, Marie-Cécile Robert, Nikolay Tsanov, Emeline Coleno, Ina Poser, Christophe Zimmer, Anthony Hyman, Hervé Le Hir, Kazem Zibara, Marion Peter, Florian Mueller, Thomas Walter, Edouard Bertrand (2020 Aug 14)

A Dual Protein-mRNA Localization Screen Reveals Compartmentalized Translation and Widespread Co-translational RNA Targeting.

Developmental cell : 773-791.e5 : DOI : S1534-5807(20)30584-0
Summary

Local translation allows spatial control of gene expression. Here, we performed a dual protein-mRNA localization screen, using smFISH on 523 human cell lines expressing GFP-tagged genes. 32 mRNAs displayed specific cytoplasmic localizations with local translation at unexpected locations, including cytoplasmic protrusions, cell edges, endosomes, Golgi, the nuclear envelope, and centrosomes, the latter being cell-cycle-dependent. Automated classification of mRNA localization patterns revealed a high degree of intercellular heterogeneity. Surprisingly, mRNA localization frequently required ongoing translation, indicating widespread co-translational RNA targeting. Interestingly, while P-body accumulation was frequent (15 mRNAs), four mRNAs accumulated in foci that were distinct structures. These foci lacked the mature protein, but nascent polypeptide imaging showed that they were specialized translation factories. For β-catenin, foci formation was regulated by Wnt, relied on APC-dependent polysome aggregation, and led to nascent protein degradation. Thus, translation factories uniquely regulate nascent protein metabolism and create a fine granular compartmentalization of translation.

Playe B., Stoven V. (2020 Jan 1)

Evaluation of deep and shallow learning methods in chemogenomics for the prediction of drugs specificity

Journal of Cheminformatics : 12 : 11

Boyd J., Gouveia Z., Perez F., Walter T. (2020 Jan 1)

Experimentally-Generated Ground Truth for Detecting Cell Types in an Image-Based Immunotherapy Screen

2020 IEEE 17th International Symposium on Biomedical Imaging (ISBI) : 886-890


Year of publication 2019

Judith Abécassis, Anne-Sophie Hamy, Cécile Laurent, Benjamin Sadacca, Hélène Bonsang-Kitzis, Fabien Reyal, Jean-Philippe Vert (2019 Nov 8)

Assessing reliability of intra-tumor heterogeneity estimates from single sample whole exome sequencing data.

PloS one : e0224143 : DOI : 10.1371/journal.pone.0224143
Summary

Tumors are made of evolving and heterogeneous populations of cells which arise from successive appearance and expansion of subclonal populations, following acquisition of mutations conferring them a selective advantage. Those subclonal populations can be sensitive or resistant to different treatments, and provide information about tumor aetiology and future evolution. Hence, it is important to be able to assess the level of heterogeneity of tumors with high reliability for clinical applications. In the past few years, a large number of methods have been proposed to estimate intra-tumor heterogeneity from whole exome sequencing (WES) data, but the accuracy and robustness of these methods on real data remains elusive. Here we systematically apply and compare 6 computational methods to estimate tumor heterogeneity on 1,697 WES samples from the cancer genome atlas (TCGA) covering 3 cancer types (breast invasive carcinoma, bladder urothelial carcinoma, and head and neck squamous cell carcinoma), and two distinct input mutation sets. We observe significant differences between the estimates produced by different methods, and identify several likely confounding factors in heterogeneity assessment for the different methods. We further show that the prognostic value of tumor heterogeneity for survival prediction is limited in those datasets, and find no evidence that it improves over prognosis based on other clinical variables. In conclusion, heterogeneity inference from WES data on a single sample, and its use in cancer prognosis, should be considered with caution. Other approaches to assess intra-tumoral heterogeneity such as those based on multiple samples may be preferable for clinical applications.

Joseph C Boyd, Alice Pinheiro, Elaine Del Nery, Fabien Reyal, Thomas Walter (2019 Oct 15)

Domain-invariant features for mechanism of action prediction in a multi-cell-line drug screen.

Bioinformatics (Oxford, England) : 1607-1613 : DOI : 10.1093/bioinformatics/btz774
Summary

High-content screening is an important tool in drug discovery and characterization. Often, high-content drug screens are performed on one single-cell line. Yet, a single-cell line cannot be thought of as a perfect disease model. Many diseases feature an important molecular heterogeneity. Consequently, a drug may be effective against one molecular subtype of a disease, but less so against another. To characterize drugs with respect to their effect not only on one cell line but on a panel of cell lines is therefore a promising strategy to streamline the drug discovery process.

Collier Olivier, Stoven Véronique, Vert Jean-Philippe (2019 Sep 25)

A Single- and Multitask Machine Learning Algorithm for the Prediction of Cancer Driver Genes

Plos Computational Biology

Dubois R., Imbert A., Samacoïts A., Peter M., Bertrand E., Müller F., Walter T. (2019 Sep 24)

A Deep Learning Approach To Identify mRNA Localization Patterns

IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019)

Héctor Climente-González, Chloé-Agathe Azencott, Samuel Kaski, Makoto Yamada (2019 Sep 13)

Block HSIC Lasso: model-free biomarker detection for ultra-high dimensional data.

Bioinformatics (Oxford, England) : i427-i435 : DOI : 10.1093/bioinformatics/btz333
Summary

Finding non-linear relationships between biomolecules and a biological outcome is computationally expensive and statistically challenging. Existing methods have important drawbacks, including among others lack of parsimony, non-convexity and computational overhead. Here we propose block HSIC Lasso, a non-linear feature selector that does not present the previous drawbacks.

Slim L., Chatelain C., Azencott C.A., Vert J.P. (2019 Jun 1)

kernelPSI: a Post-Selection Inference Framework for Nonlinear Variable Selection

International Conference on Machine Learning : 5857-5865
Summary

Model selection is an essential task for many applications in scientific discovery. The most common approaches rely on univariate linear measures of association between each feature and the outcome. Such classical selection procedures fail to take into account nonlinear effects and interactions between features. Kernel-based selection procedures have been proposed as a solution. However, current strategies for kernel selection fail to measure the significance of a joint model constructed through the combination of the basis kernels. In the present work, we exploit recent advances in post-selection inference to propose a valid statistical test for the association of a joint model of the selected kernels with the outcome. The kernels are selected via a step-wise procedure which we model as a succession of quadratic constraints in the outcome variable.

Mélanie Durand, Thomas Walter, Tiphène Pirnay, Thomas Naessens, Paul Gueguen, Christel Goudot, Sonia Lameiras, Qing Chang, Nafiseh Talaei, Olga Ornatsky, Tatiana Vassilevskaia, Sylvain Baulande, Sebastian Amigorena, Elodie Segura (2019 May 11)

Human lymphoid organ cDC2 and macrophages play complementary roles in T follicular helper responses.

The Journal of experimental medicine : DOI : jem.20181994
Summary

CD4+ T follicular helper (Tfh) cells are essential for inducing efficient humoral responses. T helper polarization is classically orientated by dendritic cells (DCs), which are composed of several subpopulations with distinct functions. Whether human DC subsets display functional specialization for Tfh polarization remains unclear. Here we find that tonsil cDC2 and CD14+ macrophages are the best inducers of Tfh polarization. This ability is intrinsic to the cDC2 lineage but tissue dependent for macrophages. We further show that human Tfh cells comprise two effector states producing either IL-21 or CXCL13. Distinct mechanisms drive the production of Tfh effector molecules, involving IL-12p70 for IL-21 and activin A and TGFβ for CXCL13. Finally, using imaging mass cytometry, we find that tonsil CD14+ macrophages localize in situ in the B cell follicles, where they can interact with Tfh cells. Our results indicate that human lymphoid organ cDC2 and macrophages play complementary roles in the induction of Tfh responses.

Romain Menegaux, Jean-Philippe Vert (2019 Feb 21)

Continuous Embeddings of DNA Sequencing Reads and Application to Metagenomics.

Journal of computational biology : a journal of computational molecular cell biology : 509-518 : DOI : 10.1089/cmb.2018.0174

Peter Naylor, Marick Lae, Fabien Reyal, Thomas Walter (2019 Feb 5)

Segmentation of Nuclei in Histopathology Images by Deep Regression of the Distance Map.

IEEE Transactions on Medical Imaging : 448-459 : DOI : 10.1109/TMI.2018.2865709
Summary

The advent of digital pathology provides us with the challenging opportunity to automatically analyze whole slides of diseased tissue in order to derive quantitative profiles that can be used for diagnosis and prognosis tasks. In particular, for the development of interpretable models, the detection and segmentation of cell nuclei is of the utmost importance. In this paper, we describe a new method to automatically segment nuclei from Haematoxylin and Eosin (H&E) stained histopathology data with fully convolutional networks. In particular, we address the problem of segmenting touching nuclei by formulating the segmentation problem as a regression task of the distance map. We demonstrate superior performance of this approach as compared to other approaches using Convolutional Neural Networks.
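As a toy illustration of the idea in this summary (not the authors' implementation), the sketch below builds a distance-map regression target from a labelled nucleus mask. It assumes the target for each nucleus pixel is the Euclidean distance to the nearest pixel outside that nucleus; the exact definition used in the paper may differ, and the function name `distance_map` and the small grid are hypothetical.

```python
import math

def distance_map(labels):
    """Return, for each nucleus pixel, the Euclidean distance to the nearest
    pixel that is not part of the same nucleus (background or a neighbour).
    Background pixels (label 0) keep the value 0.0.
    Assumed definition for illustration only; brute force, tiny grids only."""
    h, w = len(labels), len(labels[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lab = labels[y][x]
            if lab == 0:
                continue  # background stays at 0
            best = math.inf
            for yy in range(h):
                for xx in range(w):
                    if labels[yy][xx] != lab:
                        best = min(best, math.hypot(yy - y, xx - x))
            out[y][x] = best
    return out

# Two touching 3x3 nuclei (labels 1 and 2) on a 5x8 grid.
labels = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 2, 2, 2, 0],
    [0, 1, 1, 1, 2, 2, 2, 0],
    [0, 1, 1, 1, 2, 2, 2, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
dmap = distance_map(labels)
print(dmap[2])  # middle row: one peak per nucleus, with a dip where they touch
```

In a binary foreground mask the two nuclei above would merge into a single blob. In the distance map each nucleus keeps its own peak, which is why a network trained to regress such a map can, in principle, separate touching objects.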

Naylor P., Boyd J., Laé M., Reyal F., Walter T. (2019 Jan 1)

Predicting Residual Cancer Burden In A Triple Negative Breast Cancer Cohort

IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019) : 933-937


Year of publication 2018

Aubin Samacoits, Racha Chouaib, Adham Safieddine, Abdel-Meneem Traboulsi, Wei Ouyang, Christophe Zimmer, Marion Peter, Edouard Bertrand, Thomas Walter, Florian Mueller (2018 Nov 4)

A computational framework to study sub-cellular RNA localization.

Nature Communications : 4584 : DOI : 10.1038/s41467-018-06868-w
Summary

RNA localization is a crucial process for cellular function and can be quantitatively studied by single molecule FISH (smFISH). Here, we present an integrated analysis framework to analyze sub-cellular RNA localization. Using simulated images, we design and validate a set of features describing different RNA localization patterns including polarized distribution, accumulation in cell extensions or foci, at the cell membrane or nuclear envelope. These features are largely invariant to RNA levels, work in multiple cell lines, and can measure localization strength in perturbation experiments. Most importantly, they allow classification by supervised and unsupervised learning at unprecedented accuracy. We successfully validate our approach on representative experimental data. This analysis reveals a surprisingly high degree of localization heterogeneity at the single cell level, indicating a dynamic and plastic nature of RNA localization.
