Regulatory Genomics

The identity of a cell is fundamentally governed by the set of genes expressed from its genome. The regulation of gene expression is therefore key not only to brain development and cell type specification, but also to cognitive functions such as learning and memory. We are working on understanding how the human genome achieves the extraordinary regulatory complexity required for fine-tuning gene expression in the human brain. We focus on two key areas: non-coding regulatory elements such as enhancers, and non-coding RNAs such as circular RNAs.

The role of enhancer elements in the human brain

The vast portion of the human genome that does not encode proteins is a major player in orchestrating gene expression. Among non-coding DNA, enhancers are distal regulatory regions that play key roles in controlling tissue-specific and developmentally regulated gene expression. Thousands of candidate enhancers have been characterized biochemically in the human brain (e.g. PsychENCODE). The next challenge lies in determining which candidate enhancers are active in a given brain cell type and cell state, and which genes they regulate.

We have previously shown that enhancer RNAs mark brain-specific enhancers that are enriched for genetic variants associated with autism spectrum disorders (Yao et al. 2015). Genetic variants associated with schizophrenia, major depressive disorder, Alzheimer’s disease and brain cancers are also enriched in enhancers.

We are currently working on identifying the genes regulated by enhancers in brain cells through high-throughput functional testing of thousands of enhancers using CRISPRi screens.

The biogenesis and function of circular RNAs in the human brain

Circular RNAs (circRNAs) are RNA molecules formed by back-splicing, in which a downstream splice donor site is joined to an upstream splice acceptor site. A distinguishing property of circRNAs, observed across Drosophila, mice and humans, is their enrichment in the nervous system relative to other tissues. We became interested in circRNAs because of their enrichment and dynamic regulation in brain cells, and their increased stability, which may confer an advantage for regulatory roles in the cell.

We are exploring the biogenesis and functions of circular RNAs in the human brain. We have carried out a large-scale characterisation of circRNA expression in the human brain (Gokool et al. 2020) and, together with the Weatheritt lab, have explored the evolutionary aspects of circRNA biogenesis (Santos-Rodriguez et al. 2021).

We are currently working on: (i) identifying circRNAs with regulatory roles in brain cells, and (ii) better understanding the biogenesis of circRNAs in the brain, in particular the factors that lead to the high circRNA abundance in brain cells relative to other tissues.

Methodological projects

Benchmarking of gene expression deconvolution

Gene expression measurements, like DNA methylation and proteomic measurements, are strongly influenced by cellular composition. We are interested in controlling for the effect of cellular composition in our analyses of bulk gene expression, and have carried out an extensive benchmarking of deconvolution methods on human brain data (Sutton et al. 2021).
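
To make the idea of deconvolution concrete, the sketch below treats a bulk expression profile as a non-negative mixture of cell-type signature profiles and estimates the mixing proportions. It is an illustration only and not one of the methods benchmarked in the work cited above: the function name `deconvolve`, the multiplicative-update solver and the toy numbers are all assumptions made for this example.

```python
def deconvolve(signature, bulk, n_iter=2000, eps=1e-12):
    """Estimate cell-type proportions from one bulk expression profile.

    signature: list of rows, one per gene; each row holds one non-negative
               expression value per cell type (hypothetical reference data).
    bulk:      list of non-negative bulk expression values, one per gene.

    Solves a non-negative least-squares problem with multiplicative
    updates, then rescales the weights so that they sum to 1.
    """
    n_genes = len(signature)
    n_types = len(signature[0])

    # Precompute S^T b and S^T S, where S is the signature matrix.
    stb = [sum(signature[g][k] * bulk[g] for g in range(n_genes))
           for k in range(n_types)]
    sts = [[sum(signature[g][j] * signature[g][k] for g in range(n_genes))
            for k in range(n_types)]
           for j in range(n_types)]

    # Start from equal positive weights; the updates keep them non-negative.
    w = [1.0] * n_types
    for _ in range(n_iter):
        denom = [sum(sts[k][j] * w[j] for j in range(n_types))
                 for k in range(n_types)]
        w = [w[k] * stb[k] / (denom[k] + eps) for k in range(n_types)]

    total = sum(w)
    if total == 0:
        raise ValueError("bulk profile has no signal in the signature genes")
    return [x / total for x in w]


# Toy example (invented numbers): 4 marker genes, 2 cell types.
signature = [[10, 1], [8, 2], [1, 9], [2, 12]]
# Bulk profile simulated as 70% of type 0 plus 30% of type 1.
bulk = [7.3, 6.2, 3.4, 5.0]
print([round(p, 3) for p in deconvolve(signature, bulk)])  # approximately [0.7, 0.3]
```

Real brain data add complications that this toy ignores, such as noise, missing cell types in the reference and differences between reference and bulk platforms, which is why benchmarking on real tissue matters.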

Topological data analysis

We are interested in methods that can capture non-linear relationships in gene expression data. Topological data analysis uses concepts from topology to extract information from multi-dimensional data. We have developed TDAview (Walsh et al. 2020), a user-friendly implementation and visualization of the Mapper algorithm. We are also curious about the information that persistent homology can extract from gene expression data (Shnier et al. 2019).