UMR3244 – Dynamics of Genetic Information

Team Publications

Year of publication 2016

Nataliya Petryk, Malik Kahli, Yves d'Aubenton-Carafa, Yan Jaszczyszyn, Yimin Shen, Maud Silvain, Claude Thermes, Chun-Long Chen, Olivier Hyrien (2016 Jan 12)

Replication landscape of the human genome.

Nature communications : 10208 : DOI : 10.1038/ncomms10208

Despite intense investigation, human replication origins and termini remain elusive. Existing data have shown strong discrepancies. Here we sequenced highly purified Okazaki fragments from two cell types and, for the first time, quantitated replication fork directionality and delineated initiation and termination zones genome-wide. Replication initiates stochastically, primarily within non-transcribed, broad (up to 150 kb) zones that often abut transcribed genes, and terminates dispersively between them. Replication fork progression is significantly co-oriented with transcription. Initiation and termination zones are frequently contiguous, sometimes separated by regions of unidirectional replication. Initiation zones are enriched in open chromatin and enhancer marks, even when not flanked by genes, and often border ‘topologically associating domains’ (TADs). Initiation zones are enriched in origin recognition complex (ORC)-binding sites and align better to origins previously mapped using bubble-trap than to those mapped using λ-exonuclease. This novel panorama of replication reveals how chromatin and transcription modulate the initiation process to create cell-type-specific replication programs.
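The quantitation of replication fork directionality (RFD) from strand-specific Okazaki fragment counts can be illustrated with a minimal sketch. It assumes the definition RFD = (C - W) / (C + W), where C and W are the numbers of reads mapped to the Crick and Watson strands in a genomic window. This formula, the function name `fork_directionality` and the window counts are assumptions of this example and are not stated in the abstract above.

```python
def fork_directionality(crick_reads, watson_reads):
    """Return RFD per window, in [-1, 1]; None where a window has no reads.

    Assumed definition (not taken from the abstract): RFD = (C - W) / (C + W).
    """
    rfd = []
    for c, w in zip(crick_reads, watson_reads):
        total = c + w
        rfd.append((c - w) / total if total else None)
    return rfd


# Hypothetical read counts in five consecutive windows.
crick = [10, 30, 50, 70, 90]
watson = [90, 70, 50, 30, 10]
profile = fork_directionality(crick, watson)
print(profile)
```

Under this assumed convention, -1 and +1 would correspond to forks moving exclusively in one direction or the other, and an ascending stretch of the profile, as in the hypothetical counts above, would be read as a zone of net initiation.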


Year of publication 2013

Olivier Hyrien, Aurélien Rappailles, Guillaume Guilbaud, Antoine Baker, Chun-Long Chen, Arach Goldar, Nataliya Petryk, Malik Kahli, Emilie Ma, Yves d'Aubenton-Carafa, Benjamin Audit, Claude Thermes, Alain Arneodo (2013 Oct 8)

From simple bacterial and archaeal replicons to replication N/U-domains.

Journal of molecular biology : 4673-89 : DOI : 10.1016/j.jmb.2013.09.021

The Replicon Theory, proposed 50 years ago, has proven to apply to replicons of the three domains of life. Here, we review our knowledge of genome organization into single and multiple replicons in bacteria, archaea and eukarya. Bacterial and archaeal replicator/initiator systems are quite specific and efficient, whereas eukaryotic replicons show degenerate specificity and efficiency, allowing for complex regulation of origin firing time. We expand on recent evidence that ~50% of the human genome is organized as ~1,500 megabase-sized replication domains with a characteristic parabolic (U-shaped) replication timing profile and linear (N-shaped) gradient of replication fork polarity. These N/U-domains correspond to self-interacting segments of the chromatin fiber bordered by open chromatin zones and replicate by cascades of origin firing initiating at their borders and propagating to their center, possibly by fork-stimulated initiation. The conserved occurrence of this replication pattern in the germline of mammals has resulted, over evolutionary time, in the formation of megabase-sized domains with an N-shaped nucleotide compositional skew profile due to replication-associated mutational asymmetries. Overall, these results reveal an evolutionarily conserved but developmentally plastic organization of replication that is driving mammalian genome evolution.


Year of publication 2012

Benjamin Audit, Antoine Baker, Chun-Long Chen, Aurélien Rappailles, Guillaume Guilbaud, Hanna Julienne, Arach Goldar, Yves d'Aubenton-Carafa, Olivier Hyrien, Claude Thermes, Alain Arneodo (2012 Dec 15)

Multiscale analysis of genome-wide replication timing profiles using a wavelet-based signal-processing algorithm.

Nature protocols : 98-110 : DOI : 10.1038/nprot.2012.145

In this protocol, we describe the use of the LastWave open-source signal-processing command language for analyzing cellular DNA replication timing profiles. LastWave makes use of a multiscale, wavelet-based signal-processing algorithm that is based on a rigorous theoretical analysis linking timing profiles to fundamental features of the cell’s DNA replication program, such as the average replication fork polarity and the difference between replication origin density and termination site density. We describe the flow of signal-processing operations to obtain interactive visual analyses of DNA replication timing profiles. We focus on procedures for exploring the space-scale map of apparent replication speeds to detect peaks in the replication timing profiles that represent preferential replication initiation zones, and for delimiting U-shaped domains in the replication timing profile. In comparison with the generally adopted approach that involves genome segmentation into regions of constant timing separated by timing transition regions, the present protocol enables the recognition of more complex patterns of the spatio-temporal replication program and has a broader range of applications. Completing the full procedure should not take more than 1 h, although learning the basics of the program can take a few hours and achieving full proficiency in the use of the software may take days.

A Baker, C L Chen, H Julienne, B Audit, Y d'Aubenton-Carafa, C Thermes, A Arneodo (2012 Nov 27)

Linking the DNA strand asymmetry to the spatio-temporal replication program: II. Accounting for neighbor-dependent substitution rates.

The European physical journal. E, Soft matter : 123 : DOI : 10.1140/epje/i2012-12123-9

In paper I, we addressed the impact of the spatio-temporal replication program on the evolution of DNA composition in the case of time-homogeneous and neighbor-independent substitution rates. However, substitution rates do depend on the flanking nucleotides, as exemplified in vertebrates, where CpG sites are hypermutable, so that the C → T substitution rate depends dramatically (tenfold) on whether the cytosine belongs to a CG dinucleotide. With the specific goal of accounting for neighbor dependence, we revisit our minimal model of neutral substitution rates in the human genome. Assuming that r = CpG → TpG and its reverse complement r(c) = CpG → CpA are (by far) the main neighbor-dependent substitution rates, we demonstrate, using perturbative analysis, that neighbor dependence does not affect the decomposition of the compositional asymmetry into transcription- and replication-associated components: the former increases in magnitude with transcription rate and changes sign with gene orientation, whereas the latter is proportional to the replication fork polarity. Indeed, the neighbor-dependent case differs from the neighbor-independent model by an additional source term related to the CG dinucleotide content in both the transcription- and replication-associated components. Finally, we discuss the case of time-dependent substitution rates, confirming as a very general result that the skew can still be decomposed into transcription- and replication-associated components.

Benjamin Audit, Lamia Zaghloul, Antoine Baker, Alain Arneodo, Chun-Long Chen, Yves d'Aubenton-Carafa, Claude Thermes (2012 Nov 15)

Megabase replication domains along the human genome: relation to chromatin structure and genome organisation.

Sub-cellular biochemistry : 57-80 : DOI : 10.1007/978-94-007-4525-4_3

In higher eukaryotes, the absence of specific sequence motifs marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origin activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabase-sized replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data, which confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to the local open state of the chromatin fiber. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.

A Baker, H Julienne, C L Chen, B Audit, Y d'Aubenton-Carafa, C Thermes, A Arneodo (2012 Sep 25)

Linking the DNA strand asymmetry to the spatio-temporal replication program. I. About the role of the replication fork polarity in genome evolution.

The European physical journal. E, Soft matter : 92

Two key cellular processes, namely transcription and replication, require the opening of the DNA double helix and act differently on the two DNA strands, generating different mutational patterns (mutational asymmetry) that may result, after long evolutionary time, in different nucleotide compositions on the two DNA strands (compositional asymmetry). We elaborate on the simplest model of neutral substitution rates that takes into account the strand asymmetries generated by the transcription and replication processes. Using perturbation theory, we then solve the time evolution of the DNA composition under strand-asymmetric substitution rates. In our minimal model, the compositional and substitutional asymmetries are predicted to decompose into transcription- and replication-associated components. The transcription-associated asymmetry increases in magnitude with transcription rate and changes sign with gene orientation, while the replication-associated asymmetry is proportional to the replication fork polarity. These results are confirmed experimentally in the human genome, using substitution rates obtained by aligning the human and chimpanzee genomes with macaque and orangutan as outgroups, and replication fork polarity determined in the HeLa cell line as estimated from the derivative of the mean replication timing. When further investigating the dynamics of compositional skew evolution, we show that it is not yet at equilibrium and that its evolution is an extremely slow process with characteristic time scales of several hundred Myrs.
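To make the notion of compositional asymmetry concrete, the sketch below computes a strand skew for a short sequence. It assumes the commonly used definition S = S_TA + S_GC, with S_TA = (T - A) / (T + A) and S_GC = (G - C) / (G + C). This definition, the function name `compositional_skew` and the toy sequence are assumptions of this example, since the abstract above does not spell out how the skew is computed.

```python
def compositional_skew(sequence):
    """Return (S_TA, S_GC, S) for a DNA sequence read on one strand.

    Assumed definition (not taken from the abstract):
    S_TA = (T - A) / (T + A), S_GC = (G - C) / (G + C), S = S_TA + S_GC.
    A component is set to 0.0 when its denominator is zero.
    """
    seq = sequence.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    s_ta = (t - a) / (t + a) if (t + a) else 0.0
    s_gc = (g - c) / (g + c) if (g + c) else 0.0
    return s_ta, s_gc, s_ta + s_gc


# Toy sequence and its reverse complement (the opposite strand).
forward = "GGGTTTCA"
reverse_complement = forward[::-1].translate(str.maketrans("ACGT", "TGCA"))

skew_forward = compositional_skew(forward)
skew_reverse = compositional_skew(reverse_complement)
print(skew_forward, skew_reverse)
```

With this definition the skew of one strand is the exact opposite of the skew of the complementary strand, which is why a skew profile can carry information about processes that treat the two strands differently.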

Antoine Baker, Benjamin Audit, Chun-Long Chen, Benoit Moindrot, Antoine Leleu, Guillaume Guilbaud, Aurélien Rappailles, Cédric Vaillant, Arach Goldar, Fabien Mongelard, Yves d'Aubenton-Carafa, Olivier Hyrien, Claude Thermes, Alain Arneodo (2012 Apr 13)

Replication fork polarity gradients revealed by megabase-sized U-shaped replication timing domains in human cell lines.

PLoS computational biology : e1002443 : DOI : 10.1371/journal.pcbi.1002443

In higher eukaryotes, replication program specification in different cell types remains to be fully understood. We show for seven human cell lines that about half of the genome is divided into domains that display a characteristic U-shaped replication timing profile with early initiation zones at borders and late replication at centers. Significant overlap is observed between U-domains of different cell lines and also with germline replication domains exhibiting an N-shaped nucleotide compositional skew. Having demonstrated that the average fork polarity is directly reflected by both the compositional skew and the derivative of the replication timing profile, we argue that the N-shape displayed by this derivative in U-domains supports the existence of large-scale gradients of replication fork polarity in somatic and germline cells. Analysis of chromatin interaction (Hi-C) and chromatin marker data reveals that U-domains correspond to high-order chromatin structural units. We discuss possible models for replication origin activation within U/N-domains. The compartmentalization of the genome into replication U/N-domains provides new insights into the organization of the replication program in the human genome.
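The link between a U-shaped timing profile and a linear gradient of fork polarity can be sketched numerically. The example assumes the relation p(x) = v · dMRT/dx, where MRT is the mean replication time at position x and v is a constant fork velocity. The proportionality constant, the function name `fork_polarity`, the velocity value and the parabolic toy profile are assumptions of this illustration, not values taken from the abstract.

```python
def fork_polarity(mrt_min, step_kb, velocity_kb_per_min):
    """Central-difference estimate of p(x) = v * dMRT/dx at interior points.

    mrt_min: mean replication times (minutes) sampled every step_kb kilobases.
    The relation and the constant velocity are assumptions of this sketch.
    """
    return [
        velocity_kb_per_min * (mrt_min[i + 1] - mrt_min[i - 1]) / (2 * step_kb)
        for i in range(1, len(mrt_min) - 1)
    ]


# Hypothetical 1,000 kb domain sampled every 100 kb: replication is early at
# the borders and late at the center (parabolic timing profile).
domain_kb = 1000
step_kb = 100
positions = range(0, domain_kb + 1, step_kb)
mrt = [x * (domain_kb - x) / 2000 for x in positions]

polarity = fork_polarity(mrt, step_kb, velocity_kb_per_min=2.0)
print(polarity)
```

In this toy domain the estimated polarity decreases linearly from positive values near the left border, through zero at the center, to negative values near the right border, which is the kind of gradient the abstract describes.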

Guillaume Guilbaud, Aurélien Rappailles, Antoine Baker, Chun-Long Chen, Alain Arneodo, Arach Goldar, Yves d'Aubenton-Carafa, Claude Thermes, Benjamin Audit, Olivier Hyrien (2012 Jan 6)

Evidence for sequential and increasing activation of replication origins along replication timing gradients in the human genome.

PLoS computational biology : e1002322 : DOI : 10.1371/journal.pcbi.1002322

Genome-wide replication timing studies have suggested that mammalian chromosomes consist of megabase-scale domains of coordinated origin firing separated by large originless transition regions. Here, we report a quantitative genome-wide analysis of DNA replication kinetics in several human cell types that contradicts this view. DNA combing in HeLa cells sorted into four temporal compartments of S phase shows that replication origins are spaced at 40 kb intervals and fire as small clusters whose synchrony increases during S phase, and that replication fork velocity (mean 0.7 kb/min, maximum 2.0 kb/min) remains constant and narrowly distributed through S phase. However, multi-scale analysis of a genome-wide replication timing profile shows a broad distribution of replication timing gradients with practically no regions larger than 100 kb replicating at less than 2 kb/min. Therefore, HeLa cells lack large regions of unidirectional fork progression. Temporal transition regions are replicated by sequential activation of origins at a rate that increases during S phase, and replication timing gradients are set by the delay and the spacing between successive origin firings rather than by the velocity of single forks. Activation of internal origins in a specific temporal transition region is directly demonstrated by DNA combing of the IGH locus in HeLa cells. Analysis of published origin maps in HeLa cells and of published replication timing and DNA combing data in several other cell types corroborates these findings, with the interesting exception of embryonic stem cells, where regions of unidirectional fork progression seem more abundant. These results can be explained if origins fire independently of each other but under the control of long-range chromatin structure, or if replication forks progressing from early origins stimulate initiation in nearby unreplicated DNA. These findings shed new light on the replication timing program of mammalian genomes and provide a general model for their replication kinetics.


Year of publication 2011

Chun-Long Chen, Lauranne Duquenne, Benjamin Audit, Guillaume Guilbaud, Aurélien Rappailles, Antoine Baker, Maxime Huvet, Yves d'Aubenton-Carafa, Olivier Hyrien, Alain Arneodo, Claude Thermes (2011 Mar 4)

Replication-associated mutational asymmetry in the human genome.

Molecular biology and evolution : 2327-37 : DOI : 10.1093/molbev/msr056

During evolution, mutations occur at rates that can differ between the two DNA strands. In the human genome, nucleotide substitutions occur at different rates on the transcribed and non-transcribed strands, which may result from transcription-coupled repair. These mutational asymmetries generate transcription-associated compositional skews. To date, the existence of such asymmetries associated with replication has not been established. Here, we compute the nucleotide substitution matrices around replication initiation zones identified as sharp peaks in replication timing profiles and associated with abrupt jumps in the compositional skew profile. We show that the substitution matrices computed in these regions fully explain the jumps in the compositional skew profile when crossing initiation zones. In intergenic regions, we observe mutational asymmetries measured as differences between complementary substitution rates; their sign changes when crossing initiation zones. These mutational asymmetries are unlikely to result from cryptic transcription but can be explained by a model based on replication errors and strand-biased repair. In transcribed regions, mutational asymmetries associated with replication superimpose on the previously described mutational asymmetries associated with transcription. We separate the substitution asymmetries associated with both mechanisms, which allows us to determine, for the first time in eukaryotes, the mutational asymmetries associated with replication and to reevaluate those associated with transcription. Replication-associated mutational asymmetry may result from unequal rates of complementary base misincorporation by the DNA polymerases coupled with DNA mismatch repair (MMR) acting with different efficiencies on the leading and lagging strands. Replication, acting in germline cells over long evolutionary times, contributed equally with transcription to produce the present abrupt jumps in the compositional skew. These results demonstrate that DNA replication is one of the major processes that shape human genome composition.


Year of publication 2010

Chun-Long Chen, Aurélien Rappailles, Lauranne Duquenne, Maxime Huvet, Guillaume Guilbaud, Laurent Farinelli, Benjamin Audit, Yves d'Aubenton-Carafa, Alain Arneodo, Olivier Hyrien, Claude Thermes (2010 Jan 28)

Impact of replication timing on non-CpG and CpG substitution rates in mammalian genomes.

Genome research : 447-57 : DOI : 10.1101/gr.098947.109

Neutral nucleotide substitutions occur at varying rates along genomes, and it remains a major challenge to unravel the mechanisms that cause these variations and to analyze their evolutionary consequences. Here, we study the role of replication in the neutral substitution pattern. We obtained a high-resolution replication timing profile of the whole human genome by massively parallel sequencing of nascent BrdU-labeled replicating DNA. These data were compared to the neutral substitution rates along the human genome, obtained by aligning human and chimpanzee genomes using macaque and orangutan as outgroups. All substitution rates increase monotonically with replication timing even after controlling for local or regional nucleotide composition, crossover rate, distance to telomeres, and chromatin compaction. The increase in non-CpG substitution rates might result from several mechanisms, including an increase in mutation-prone activities or a decrease in the efficiency of DNA repair during S phase. In contrast, the rate of C → T transitions in CpG dinucleotides increases in later-replicating regions due to an increasing DNA methylation level that reflects a negative correlation between timing and gene expression. Similar results are observed in the mouse, which indicates that replication timing is a main factor affecting nucleotide substitution dynamics at non-CpG sites and constitutes a major neutral process driving mammalian genome evolution.
