Nucleosomes comprise duplex deoxyribonucleic acid (DNA) wrapped around histone octamers and are the basic unit of chromatin in eukaryotes. While it is now known that the position of nucleosomes is not random and that nucleosome architecture in chromatin has defined roles, the factors that determine the position of nucleosomes remain poorly understood.
In a recent study published in Nucleic Acids Research, Spanish researchers combined signal transmission theory and machine learning tools to develop a method for accurately predicting the location of nucleosomes in genes.
Importance of Nucleosome Positioning
The positioning of nucleosomes along the genome is not random. Regions that lack nucleosomes, called nucleosome-free regions (NFRs), are often found at the promoters, replication origins, and transcription termination sites of genes. NFRs are also functionally important because effector proteins target these regions to regulate gene activity.
Numerous theories have been proposed to explain the position of nucleosomes in the genome. Some suggest that the physical properties of the DNA influence nucleosome position, while others argue that cellular machinery, such as transcription factors and chromatin remodelers, dictates the placement of the nucleosome. However, thus far, the factors that determine the placement of the nucleosome in the genome remain unclear.
About the Study
The present study developed and tested a machine learning-based method that used signal transmission to predict nucleosome placement. The researchers used a series of experimental and computational techniques to explore the nucleosome architecture of the yeast Saccharomyces cerevisiae.
Four mutant strains of S. cerevisiae were created for the study by inserting 81-nucleotide-long DNA sequences into two genes per strain. The sequences were designed so that their insertion would not affect nucleosome formation or the reading frame of the gene.
The yeast cells were then grown to exponential phase, arrested, and processed for ribonucleic acid (RNA) extraction. The extracted RNA was used to synthesize complementary DNA (cDNA), which was then used to quantify gene expression through quantitative reverse transcription polymerase chain reaction (qRT-PCR).
The researchers used 1,10-phenanthroline to inhibit transcription and then measured the decay of messenger RNA (mRNA) for genes with different mRNA stabilities. Micrococcal nuclease digestion with deep sequencing (MNase-seq), a method used to investigate genome-wide nucleosome distribution, was performed 30 minutes after the inhibition of transcription.
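As an illustration of how mRNA decay measured after a transcription shut-off can be turned into a half-life, the sketch below assumes simple first-order decay and fits a straight line to the log-transformed levels. The article does not describe how the decay data were analyzed, so the function name, the time points, and the values are all hypothetical.

```python
import math

def estimate_half_life(times_min, mrna_levels):
    """Estimate an mRNA half-life (in minutes), assuming first-order decay.

    Fits ln(level) = ln(N0) - k * t by ordinary least squares and
    returns ln(2) / k. This is an assumed model, not the study's method.
    """
    logs = [math.log(level) for level in mrna_levels]
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_min, logs))
    decay_rate = -sxy / sxx  # k, the negative of the fitted slope
    return math.log(2) / decay_rate

# Made-up relative mRNA levels generated with a 15-minute half-life.
times = [0, 10, 20, 30]
levels = [100 * math.exp(-math.log(2) / 15 * t) for t in times]
half_life = estimate_half_life(times, levels)
```

With real qRT-PCR measurements the fit would be noisier, and a transcript that does not decay exponentially would need a different model.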
The researchers subjected semi-intact yeast cells to micrococcal nuclease digestion to obtain mononucleosomes. The DNA fragments obtained after the digestion were then sequenced on a HiSeq 2000 platform. The MNase-seq data were mapped onto the yeast genome, and nucleosomes were identified based on the height and width scores of the nucleosome peaks.
The spacing between the nucleosomes was used to classify genes as phased, unphased, or anti-phased. Furthermore, the researchers developed a signal decay model to predict nucleosome position based on the signals emitted by the first and last nucleosome at the gene boundaries.
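One way to picture this classification is to ask whether the distance between the first and last nucleosome of a gene is close to a whole number of nucleosome repeats. The sketch below is only an illustration of that idea: the repeat length of 165 base pairs, the tolerance, and the decision rule itself are assumptions, not the criteria used in the study.

```python
def classify_phasing(first_dyad, last_dyad, repeat_length=165.0, tolerance=0.15):
    """Label a gene by how its boundary nucleosomes line up (illustrative rule).

    first_dyad, last_dyad: centre positions (bp) of the first and last nucleosome.
    repeat_length: assumed nucleosome repeat length in bp (hypothetical default).
    tolerance: assumed allowed deviation, as a fraction of one repeat.
    """
    distance = abs(last_dyad - first_dyad)
    fraction = (distance / repeat_length) % 1.0
    # Distance to the nearest whole number of repeats, between 0 and 0.5.
    offset = min(fraction, 1.0 - fraction)
    if offset <= tolerance:
        return "phased"        # boundaries are close to a whole number of repeats apart
    if offset >= 0.5 - tolerance:
        return "anti-phased"   # boundaries are close to a half-integer number of repeats apart
    return "unphased"

labels = [
    classify_phasing(0, 990.0),     # 6 repeats
    classify_phasing(0, 1072.5),    # 6.5 repeats
    classify_phasing(0, 1031.25),   # 6.25 repeats
]
```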
Molecular dynamics simulations were used to derive stiffness matrices, which were then employed to calculate the elastic energy required to deform the DNA around the nucleosomes. A trained neural network model then used the DNA sequence features to predict the NFRs.
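A common way to compute such a deformation energy is a harmonic approximation, in which the energy grows quadratically with the displacement of the DNA helical parameters from their equilibrium values, weighted by the stiffness matrix. The sketch below assumes that form; the two-parameter example and its numbers are invented for illustration and are not taken from the study.

```python
def elastic_energy(observed, equilibrium, stiffness):
    """Harmonic deformation energy: 0.5 * dx^T * K * dx.

    observed, equilibrium: lists of helical parameters (for example roll, twist).
    stiffness: square matrix K as a list of rows, one row per parameter.
    The harmonic form is an assumption about how the energy is evaluated.
    """
    delta = [o - e for o, e in zip(observed, equilibrium)]
    n = len(delta)
    return 0.5 * sum(
        delta[i] * stiffness[i][j] * delta[j]
        for i in range(n)
        for j in range(n)
    )

# Hypothetical two-parameter example with a coupling term off the diagonal.
toy_stiffness = [[2.0, 0.5],
                 [0.5, 1.0]]
energy = elastic_energy([1.0, 2.0], [0.0, 0.0], toy_stiffness)
```

In practice each base-pair step would have its own sequence-dependent stiffness matrix, and the energy of wrapping a stretch of DNA would be the sum over its steps.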
Major Findings
The study found that nucleosome positioning in yeast could be predicted with high accuracy using signal transmission theory.
This theory assumes that two nucleosomes positioned at the first and last positions in a gene will emit periodic signals that decay with distance. Therefore, genes that are well-phased will have clear periodic signals, while unphased genes will show less defined patterns of nucleosomes.
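The idea can be sketched with two damped cosine waves, one emitted from each boundary nucleosome, that are summed along the gene. The functional form, the repeat length of 165 base pairs, and the decay length of 500 base pairs are assumptions chosen for illustration; the article does not give the model's actual equations or parameters.

```python
import math

def emitted_signal(position, origin, repeat_length=165.0, decay_length=500.0):
    """Periodic signal from one boundary nucleosome, decaying with distance.

    repeat_length and decay_length (both in bp) are hypothetical values.
    """
    distance = abs(position - origin)
    damping = math.exp(-distance / decay_length)
    return damping * math.cos(2 * math.pi * distance / repeat_length)

def combined_signal(position, first_dyad, last_dyad):
    """Sum of the signals emitted from the first and last nucleosome."""
    return emitted_signal(position, first_dyad) + emitted_signal(position, last_dyad)

# At the first nucleosome, the signal arriving from the far end either
# reinforces the local peak (6 repeats apart) or weakens it (6.5 repeats apart).
reinforced = combined_signal(0.0, 0.0, 990.0)
weakened = combined_signal(0.0, 0.0, 1072.5)
```

In this toy picture, peaks of the combined signal mark likely nucleosome positions, and genes in which the two waves arrive out of step show flatter, less defined patterns.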
The machine learning model predicted changes in the periodicity of nucleosomes based on the distance between the first and last nucleosomes.
Furthermore, the study showed that the placement of the nucleosomes was dependent on the NFRs, which could be accurately predicted by the neural network using the energy from DNA deformation and the transcription factor binding site density.
The study showed that the nucleosome positions throughout the genes could be accurately predicted by combining the predictions of the NFRs with the periodic signals.
The results also suggested that the basal nucleosome configuration could be modified by cellular processes such as transcription: inhibiting transcription resulted in a loss of nucleosome order. These findings point to a causal relationship in which gene expression influences nucleosome architecture.
Conclusions
To summarize, the study showed that periodic signals from well-defined nucleosomes, combined with predictions of NFRs, could accurately predict nucleosome positions within the genome.
Furthermore, the findings suggested that gene expression could influence nucleosome architecture, with RNA polymerase playing a major role in refining the patterns of nucleosomes during transcription.
Journal reference:
Sala, A., Labrador, M., Buitrago, D., De Jorge, P., Battistini, F., Heath, I., & Orozco, M. (2024). An integrated machine-learning model to predict nucleosome architecture. Nucleic Acids Research, 52(17), 10132–10143. doi:10.1093/nar/gkae689. https://academic.oup.com/nar/article/52/17/10132/7736801