Research

Research Area

A cell's gene expression profile (which RNAs and proteins are present in the cell) defines its identity. It ensures that cells that all contain the same DNA sequence can do very different things: a muscle cell can contract, an immune cell can (help) clear pathogens, and a neuron can transmit signals. A key step in gene expression is transcription: the process of transcribing parts of the DNA into RNAs. RNA Polymerase II (RNAPII) is the machine that transcribes all protein-coding genes into messenger RNAs (mRNAs). Transcription is a tightly regulated multi-step process, because an imbalance in any of the steps will cause gene expression changes, which can result in disease.

On the left, this image shows the steps leading to the transcription of a full-length mRNA at protein-coding genes. On the right, it shows that at many non-coding loci the transcription cycle is short-circuited by early termination.

One key "decision point" happens during early elongation: will RNAPII go into productive elongation, or will it terminate early? In recent years, we have learnt that early termination is not uncommon during transcription at protein-coding genes, yet, clearly, enough RNAPII goes into productive elongation to generate the needed full-length mRNAs. How the correct balance between termination and elongation is achieved is currently unclear.

Over the last 10-15 years, it's become apparent that RNAPII also transcribes many regions of the genome into non-coding RNAs. Some of these have well-established functions, such as small nuclear RNAs and microRNAs. The largest number, though, are generated from places in the genome called 'enhancers'. There, RNAPII produces enhancer RNAs, or eRNAs. Moreover, RNAPII transcribes a non-coding upstream antisense RNA (uaRNA, also called PROMPT) next to most protein-coding genes. The function of transcribing these eRNAs and uaRNAs remains an active topic of investigation and debate. What we do know is that most of these non-coding RNAs are much shorter than a typical mRNA, meaning that the termination/elongation balance is shifted much more towards early termination. This may be important for genome stability, as RNAPII transcription running rampant throughout the genome would cause collisions and subsequent DNA damage.

With our research, we aim to understand how the balance between early termination and productive elongation is achieved at both protein-coding genes and non-coding loci. By understanding this key step in gene expression, we will open new avenues to study how this process gets misregulated in disease and exploit this for therapeutic gain. 

Research Projects

In the Vlaming group, we study the regulation of this elongation/termination balance both from the DNA perspective and from the perspective of protein regulators.

To study what elements in the transcribed sequence (encoded in the DNA or RNA) control the fate of RNAPII, we developed the INSERT-seq approach. 

This image contains a schematic representation of the INSERT-seq methodology. A library of variable sequences is inserted in a genomic locus, and may lead to differences in the produced transcripts. These effects can be measured at the level of (nascent) RNA and protein fluorescence of a reporter.
Schematic representation of the INSERT-seq approach. Effects of thousands of DNA inserts on transcription/expression are measured in high throughput.

Using this approach, we found that the composition of the transcribed sequence is a critical determinant of RNAPII elongation potential. Specifically, the high GC content of the early transcribed regions of protein-coding genes favors transcription elongation. The GC content of most uaRNAs and eRNAs is much lower, and this contributes to their early termination. Furthermore, the presence of splice elements is important: not only does the process of splicing stimulate transcription, but the 5' splice site (5'SS) can also autonomously promote transcription.

This work forms a foundation for future work in the lab. We have evidence that additional sequence elements have a role in dictating the elongation/termination balance, and through a combination of additional INSERT-seq screens and advanced data analysis, we will uncover these elements. For these novel sequences, we will decipher in what contexts they act and how their signals are conveyed to RNAPII.

In parallel, we will use CRISPR screening to identify new protein regulators that differentially control coding and non-coding transcription. For these proteins, we will study their genome-wide transcriptional effects through cutting-edge nascent RNA sequencing approaches, and will uncover what underlies their target specificity.

This image shows schematics of how PRO-seq and TT-seq work.
Approaches used in the lab to measure transcription. PRO-seq maps the exact locations of transcriptionally engaged RNAPII, TT-seq measures recently produced RNA.

Seminar

Here is a seminar about our previous work, presented in the Fragile Nucleosome series: