MODELING THE SPATIO -TEMPORAL EFFECT OF 2,3,7,8 TETRACHLORODIBENZO-P-
DIOXIN ON HEPATIC GENE EXPRESSION 

By 

Daniel Kwabena Marri 

A DISSERTATION  

Submitted to  
Michigan State University  
in partial fulfillment of the requirements 
for the degree of 

Biomedical Engineering – Doctor of Philosophy  

 2024 

i 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
ABSTRACT 

Circadian  clocks  are  intrinsic  molecular  oscillators  present  in  cells  across  prokaryotes  and 

eukaryotes  that  synchronize  physiological  processes  with  external  cues,  enabling  organismal 

adaptation and survival. These clocks regulate crucial biological functions, including sleep-wake 

cycles,  thermoregulation,  hepatic  metabolism,  and  hormonal  secretion,  through  the  rhythmic 

expression  of  clock-controlled  genes.  The  mammalian  liver  comprises  structural  units  called 

lobules, with  hepatocytes arranged in  a hexagonal  pattern  along  a pericentral-to-periportal  axis 

extending from the central vein to the portal triad. Perturbations in the circadian clock network can 

contribute  to  the  pathogenesis  of  various  disorders,  such  as  obesity,  diabetes,  inflammatory 

conditions, and certain cancers. 

2,3,7,8-Tetrachlorodibenzo-p-dioxin  (TCDD)  is  an  exogenous  ligand  that  binds  to  the  aryl 

hydrocarbon  receptor  (AHR),  eliciting  diverse  toxic  effects  by  disrupting  the  circadian  clock 

mechanism. To investigate whether TCDD-activated AHR disrupts the intrahepatic circadian clock 

by interfering with genome-wide CLOCK:BMAL1 binding, potentially leading to a reduction or 

loss  of  rhythmicity  in  clock-controlled  genes,  interpretable  machine  learning  models  were 

developed to predict BMAL1 binding to DNA in liver, kidney, and heart tissues using genetic and 

epigenetic  features  (binding  sequence,  DNA  shape,  and  histone  modifications).  Thus,  TCDD 

activated AHR has been proposed to bind to the E-box binding motifs to disrupt the regulation of 

circadian clock genes. The findings demonstrated that BMAL1 binding to DNA is tissue-specific, 

and  the  combination  of  sequence,  DNA  shape,  and  histone  modification  features  yielded  the 

highest  binding  prediction  accuracy.  Additionally,  the  flanking  sequences  upstream  and 

downstream of the binding motifs played a crucial role in BMAL1 binding to DNA. 

Furthermore,  a  spatiotemporal  multicellular  mathematical  model  of  the  mammalian  circadian 

clock in the liver lobule was developed to investigate intercellular coupling for the synchronization  

ii 

 
 
 
 
 
 
of circadian clock expression across the portal-to-central axis. The analysis revealed that, similar 

to the coupling of autonomous circadian oscillators in the suprachiasmatic nucleus (SCN), hepatic 

clock  rhythms  are  likely  synchronized  by  an  unknown  coupling  factor.  Sensitivity  analysis, 

bifurcation  analysis,  and  parameter  estimation  from  the  model  provided  insights  into  the 

physiology of the hepatic clock and potential mechanisms of alteration. 

Lastly, to understand the interplay between the spatial and temporal axes of gene expression in the 

liver,  particularly  in  drug  metabolism  pathways,  the  existence  of  a  third  axis,  chemical 

perturbation, and its implications for hepatic function were uncovered. We developed a non-linear 

mixed  effect  models  to  investigate  the  effect  of  acute  TCDD  perturbation  on  the  spatial  and 

temporal  axes  of  gene  expression  in  the  liver  lobule. The  analysis  revealed  a  distortion  of  the 

spatial axis of gene expression but a low significant effect on the temporal axis. These findings 

provide a comprehensive examination of circadian rhythms and their disruption by TCDD in the 

liver, encompassing molecular mechanisms, predictive modeling, and spatiotemporal dynamics. 

The study offers valuable insights into the intricate regulatory mechanisms governing circadian 

rhythms, the significance of zonation in hepatic functions, and the interplay between spatial and 

temporal  gene  expression.  The  findings  have  the  potential  to  contribute  significantly  to  our 

understanding of circadian resilience and the mitigation of pathological conditions, particularly in 

the context of drug metabolism pathways and hepatic function. 

iii 

 
 
 
 
 
 
 
 
 
 
 
Copyright by 
DANIEL KWABENA MARRI 
2024 

iv 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
I dedicate this thesis with profound gratitude and love to my family, all those who love me 
and whom I love dearly. 

v 

 
 
                                                        
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
ACKNOWLEDGEMENTS 

I  would  like  to  extend  my  sincere  appreciation  to  my  thesis  supervisor  and  mentor,  Dr.  Sudin 

Bhattacharya, whose unwavering support, encouragement, and belief in me have been instrumental 

in shaping this endeavor. Your wealth of knowledge, patience, understanding, and provision of this 

wonderful opportunity have been instrumental in guiding me through this academic path. I also 

extend  my  heartfelt  thanks  to  the  esteemed  members  of  my  dissertation  committee:  Dr.  Ripla 

Arora, for your guidance and the invaluable experience of introducing me to wet laboratory work; 

Dr.  Dana  Spence,  for  your  steadfast  support,  teachings,  and  insightful  advice;  Dr.  Hanne 

Hoffmann, for your constant availability, particularly in the early stages of my thesis, to discuss 

the biological relevance of my work – your research papers and conference recommendations were 

instrumental in propelling me this far; and Dr. Eva Farre Prokosch, for the honor of having you on 

my committee and for your insightful feedback that enriched the quality of my work. Additionally, 

I would like to extend my gratitude to Dr. Timothy Zacharewski, Dr. Rance Nault, and Dr. Cholico 

Giovanni  for  generously  opening  their  doors  and  sharing  their  datasets  with  me;  Dr.  Pawel 

Danielewicz,  for  your  support  and  introduction  to  Michigan  State  University;  and  Dr.  David 

Arnosti, for being an exceptional mentor and teaching me the fundamentals of molecular biology, 

as well as for all the remarkable work you do for AIMS. Next, I would like to express my heartfelt 

appreciation to my family and friends who have been by my side every step of the way. Special 

thanks to Chido Matsika for your unwavering presence and support during the most challenging 

times, making them more manageable. I am grateful to Dr. David Filipovic, Dr. Omar Kana, and 

all my lab members for being a part of this significant journey and contributing to the completion 

of this thesis. 

vi 

 
 
 
 TABLE OF CONTENTS 

CHAPTER 1: INTRODUCTION ................................................................................................ 1 

CHAPTER 2: PREDICTING MAMMALIAN TISSUE SPECIFIC DNA-BINDING BY 
CLOCK-BMAL1........................................................................................................................... 8 
INTRODUCTION ..................................................................................................................... 8 
METHODS .............................................................................................................................. 13 
RESULTS ................................................................................................................................. 17 

CHAPTER 3: MATHEMATICAL MODELING OF THE SPATIAL AND TEMPORAL 
DYNAMICS OF CIRCADIAN CLOCK IN THE LIVER ...................................................... 37 
    INTRODUCTION................................................................................................................... 37 
METHODS .............................................................................................................................. 41 
RESULTS ................................................................................................................................. 51 

CHAPTER 4: SPATIAL-TEMPORAL PERTURBATION OF THE LIVER LOBULE BY 
2,3,7,8-TETRACHLORODIBENZO-P-DIOXIN .................................................................... 73 
INTRODUCTION ................................................................................................................... 73 
METHODS .............................................................................................................................. 77 
RESULTS ................................................................................................................................. 89 

DISCUSSION AND CONCLUSION ...................................................................................... 127 

BIBLOGRAPHY....................................................................................................................... 137 

vii 

 
 
 
 
 
 
 
CHAPTER 1: INTRODUCTION 

Circadian  clocks  are  endogenous  molecular  networks  present  in  the  cells  of  wide  spectrum  of 

organisms from prokaryotes to eukaryotes that regulate 24 -h physiological and behavior rhythms 

1–3.  The  circadian  clock  mechanism  synchronizes  physiological  activities  with  the  external  cue 

cycles for organismal survival. In 1729, Jean-Jacques d’Ortous de Mairan discovered that Mimosa 

pudica leaves exhibit daily folding and unfolding cycles in constant darkness4. Animal circadian 

rhythms  were  later  affirmed  by  experiments  in  1972,  which  demonstrated  persistent  rhythmic 

activity in rats under constant darkness and temperature5 and in Drosophila melanogaster and other 

insects6. The circadian clock rhythms are regulated by a molecular clock mechanism which exists 

in all cells in the mammalian body.  The hypothalamic suprachiasmatic nucleus (SCN) contains 

the master pacemaker clock that synchronizes subsidiary oscillators in peripheral tissues like the 

liver, heart, and kidney. The SCN receives input from external cues and, in turn, relays temporal 

signals  to  synchronize  peripheral  clocks2,7.    The  molecular  basis  for  the  mammalian  circadian 

clock mechanism involves a coordinated transcriptional-translational feedback loop that operates 

on  an  approximately  24  hr  period.  The  feedback  loop  is  driven  by  both  positive  and  negative 

regulatory interactions that allow for sustained oscillatory activity8–10. The positive arm of the loop 

centers on the activation of E-box mediated transcription by the circadian locomotor output cycles 

kaput (CLOCK) or Neuronal PAS domain protein 2 (NPAS2) with brain and muscle ARNT-like 

1  (BMAL1)  forming  the  heterodimeric  complex  CLOCK-BMAL1  or  NPAS2-BMAL1.  This 

heterodimer protein drives the expression of various clock-controlled genes (CCG’s) including the 

Period (Per 1, Per 2 and Per 3) and the Cryptochrome (Cry 1 and Cry 2) which comprise the 

negative  arm  by  repressing  their  own  transcription  by  interacting  with  CLOCK-BMAL1  or 

NPAS2-BMAL1 complex, thus constituting an auto-regulatory feedback process. Additionally, a 

1 

 
secondary  stabilizing  loop  involves  the  activation  of  Bmal1  by  retinoic  acid  receptor-related 

orphan nuclear receptors (RORα, RORβ, and RORγ) and its repression by REV-ERBα and REV-

ERBβ. The precise balance and delays involved in the kinetics of these multiple interconnected 

feedback processes allows for sustained 24-hour periodicity. 

Figure  1:  The  key  molecular  component  of  the  mammalian  clock  mechanism  with  the 
transcriptional – translational feedback loops driving by the positive and negative loop. Adapted 
from Lee, Y11. 

Several  physiological  and  biological  functions  and  processes  are  regulated  by  circadian  clock-

controlled genes12,13, display rhythmicity over time. These include core body temperature, which 

fluctuates on a circadian cycle; sleep-wake cycles, which follow a circadian rhythm dictated by 

the suprachiasmatic nucleus;  cardiovascular variables like heart rate and  blood  pressure, which 

demonstrate  both  circadian  and  ultradian  rhythms;  feeding  behavior  and  digestive  processes, 

which are influenced by circadian, hunger-satiety, and digestive cycles; secretion of hormones like 

2 

 
 
 
 
 
  
 
 
 
 
 
melatonin,  cortisol,  growth  hormone,  and  prolactin,  which  often  follow  circadian  or  ultradian 

patterns  determined  by  the  hypothalamic-pituitary  axis  and  other  oscillators;  liver  metabolic 

activity  and  gene  expression,  which  varies  cyclically  over  24-hour  periods  and  in  response  to 

feeding times and fasting; renal blood flow, glomerular filtration rate, and other measures of kidney 

function, which display circadian variations; as well as several other measurable bodily processes 

14–18.  

Figure 2: Physiological and biological functions regulated by circadian along with the implicated 
genes and proteins in parentheses. Green arrows represent induction and red arrows repression by 
circadian clock gene or protein. Adapted from Jacob et al19. 

3 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Studies  from  animal  models  and  human  subjects  have  shown 

that  perturbation  or 

desynchronization of circadian rhythms can contribute to, the progression of various pathologies20–

22. These include metabolic diseases like obesity and diabetes, inflammatory disorders, and certain 

cancers23–25. Additionally, times of day can affect the severity of symptoms or acute exacerbations 

in  some  conditions  with  a  circadian  component.  For  example,  asthma  attacks  and  rheumatoid 

arthritis flares display circadian patterns, as does the incidence of adverse cardiovascular events 

like myocardial infarction and stroke18,24. The cyclic nature of these conditions points to circadian 

misalignment and loss of coordination between external timing cues and internal oscillators as an 

integral  mechanism  in  disease  pathology.  Strengthening  circadian  resilience  through  timed 

behavioral,  or  pharmacologic,  or  genetic  interventions  represents  a  therapeutic  opportunity  to 

mitigate disease severity in select disorders.  

The  chemical  compound,  2,3,7,8-Tetrachlorodibenzo-p-dioxin  (TCDD)  is  a  persistent  organic 

pollutant that elicits diverse toxic effects in mammals. TCDD exposure has been associated with 

a myriad of adverse health outcomes, including developmental abnormalities, carcinogenesis, and 

hepatotoxicity.  It  exerts  these  effects  primarily  through  activation  of  the  cytosolic  protein  aryl 

hydrocarbon  receptor  (AHR),  acting  as  a  transcription  factor26,27.  TCDD  exposure  perturbs 

circadian  rhythms  in  peripheral  tissues  like  the  liver  by  disrupting  molecular  circadian  clock 

oscillations.  Specifically,  TCDD  activated  AHR  affects  circadian  rhythm  by  reducing  the 

amplitude,  shifting  the  phase,  and  lengthening  the  period  of  circadian  oscillations  in  clock-

controlled genes expression28.  

The liver exhibits strong rhythmicity, regulating various metabolic functions like, detoxification, 

glycogen  storage,  lipid  metabolism,  and  bile  production  in  accordance  with  the  body's  internal 

clock29. The mammalian liver is made up of structural units called lobules. The lobule is composed 

4 

 
of hepatocytes, the liver's parenchymal cells, arranged in a radial pattern around a central vein. 

The hepatocytes are arranged in a roughly hexagonal architecture with a pericentral to periportal 

axis  extending  from  the  central  to  the  portal  veins.  Hepatocytes  are  not  homogeneous  in  their 

metabolic activities; instead, they display distinct biological functions depending on their location 

within the lobule30–32.  Thus, metabolic functions within the liver lobule exhibit a distinct spatial 

organization, a phenomenon known as zonation. The periportal hepatocytes, being exposed to a 

higher  oxygen  concentration,  are  primarily  involved  in  processes  such  as  gluconeogenesis  (the 

synthesis  of  glucose  from  non-carbohydrate  precursors),  urea  synthesis  (detoxification  of 

ammonia), and bile acid synthesis. Conversely, the pericentral hepatocytes, which receive oxygen-

depleted blood, are more active in glycolysis (the breakdown of glucose for energy production), 

lipogenesis (the synthesis of fatty acids), and xenobiotic metabolism (the detoxification of drugs 

and other foreign compounds) 30,32–36.  

The hepatic expression of AHR is zonated across the lobular axis29,33. However, no zonation is 

observed in the basal expression of circadian clock genes. While a large number of mathematical 

models  of  the  natural  circadian  clock  oscillation  have  been  developed9,37,38,  none  of  these 

specifically address spatial disruption of the circadian clock in peripheral tissues like the liver. I 

aimed to address the question of whether TCDD-activated AHR disrupts the intra-hepatic circadian 

clock  in  a  zonated  manner  by  interfering  with  genome-wide  CLOCK:BMAL1  binding,  which 

could lead to a reduction or total loss of rhythmicity of clock-controlled genes. To do this, I first 

developed an interpretable machine learning model of genome-wide DNA binding by BMAL1 in 

the  liver,  heart  and  kidney  using  published  Chromatin  immunoprecipitation-sequencing  (ChIP-

seq) data.  My results confirmed an overlap between the binding motif of BMAL1 (E-box – 5’-

CANNTG-3’) and AHR (5'-GCGTG-3') in the promoter regions and bodies of core circadian clock 

5 

 
genes. This is consistent with the proposed mechanism of TCDD activated AHR binding to the E-

box motif to disrupt the regulation of circadian clock genes. Also, my model showed that, using 

both genomic and epigenomic features like the core motif plus flanking sequences, the shape of 

the DNA and histone modification, we could achieve significant prediction accuracy of BMAL1 

binding. Interpreting my model, I showed specific features with the highest contribution to tissue-

specific  binding  of  BMAL  to  DNA.  My  cross  tissue  predictive  model  demonstrated  that  while 

BMAL1 exhibits high specificity in binding certain DNA conformations and chromatin contexts, 

these specificities exhibit variation across different tissues. Our discoveries expand upon the notion 

that  DNA  shape  and  chromatin  context  can  modulate  the  binding  specificities  of  transcription 

factors  (TFs).  Specifically,  our  results  demonstrate  that  in  addition  to  conferring  differential 

binding specificities to distinct TFs within the same tissue, DNA shape and chromatin environment 

can also confer distinct binding specificities to a given TF in different tissue contexts.  

Secondly, I developed  spatiotemporal multicellular mathematical models of the mammalian liver 

lobule circadian clock network. I developed two sets of mathematical models to investigate the 

dynamics of circadian clock genes expression in hepatocyte across the portal-to-central axis of the 

liver  lobule:  one  model  with  communication  and  one  without  communication  among  cellular 

oscillators  (hepatocytes).    Simulations  of  the  model  revealed  the  dependencies  of  the  model 

observables (expressed mRNA levels of the various genes) on their corresponding transcription 

and  degradation  rates.  Estimating  the  model  parameters  and  fitting  the  coupled  model  to 

experimental data yielded a high correlation with R2 > 0.9, elucidating the way alterations in model 

parameters  modulate  the  reinforcing  and  attenuating  feedback  loops  that  govern  the  circadian 

rhythm. Collectively, my modeling framework establishes a foundation for probing the regulatory 

6 

 
mechanisms  underlying  circadian  rhythmicity  and  its  associated  aberrations  in  the  context  of 

hepatic physiology and pathological states. 

A recent investigation by Droin et al29 revealed the intricate interplay between spatial and temporal 

axes of gene expresssion, leading to rhythmic patterns of expression within established zonated 

pathways. Notably, these pathways encompass a significant number of gene sets implicated in drug 

metabolism  processes. The  observed  overlap  between  the  temporal  and  spatial  axes  with  drug 

metabolism pathways suggests the existence of a third axis, chemical perturbation, that warrants 

consideration when characterizing hepatic function. This led to the final project in my dissertation 

where  I  developed  a  nonlinear  mixed  effect  model  to  investigate  the  effects  of  acute  TCDD-

activated AHR on the temporal (rhythmicity) and spatial (zonated) expression patterns of genes 

along the portal-central axis of the liver lobule. 

7 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 2: PREDICTING MAMMALIAN TISSUE SPECIFIC DNA-BINDING BY 
CLOCK-BMAL1 

INTRODUCTION 

All living organisms possess a robust circadian timekeeping mechanism enabling anticipation and 

adaptation to recurring environmental changes. In mammals, this system consists of a hierarchical 

network of oscillators. The central  clock, situated in  the suprachiasmatic nucleus (SCN) of the 

hypothalamus,  synchronizes  peripheral  clocks  across  various  tissues3.  The  intracellular  gene 

regulatory network of the circadian clocks, both central and peripheral, involves a relatively small 

set of key transcription factors (TFs) that are interconnected through multiple negative and positive 

feedback  loops. These  feedback  loops  play  a  crucial  role  in  regulating  the  expression  of  clock 

genes and maintaining the oscillatory patterns of the circadian rhythm. In this regulatory network, 

specific TFs  bind  to  the  promoter  regions  of  target  genes,  either  activating  or  repressing  their 

transcription. The expression of these TFs is, in turn, regulated by the products of the genes they 

control, creating intricate feedback loops39.  The core activators of the circadian network are, the 

Clock Locomotor Output Cycles Kaput (CLOCK) and brain and muscle ARNT Like 1 (BMAL1) 

transcription factors, both of which belong to the basic helix-loop-helix (bHLH) family. These two 

proteins form a heterodimer complex (referred to hereon as CLOCK-BMAL1). In the absence of 

CLOCK, another member of the bHLH-PAS transcription factor family, the Neuronal PAS domain 

protein  2  (NPAS2),  can  compensate  by  forming  a  heterodimer  with  BMAL140.  In  the  classical 

model  of  clock  gene  regulation,  the  CLOCK-BMAL1  or  NPAS2-BMAL1  dimer  acts  as  a 

transcriptional activator, initiating the expression of various clock-controlled genes by binding to 

a specific hexanucleotide sequence known as the E-box motif (canonical sequence 5’-CANNTG-

3’, where N represents any nucleotide) within the promoter or enhancer regions of clock-controlled 

genes. This binding event regulates the transcription of these genes, which are crucial for the proper 

8 

 
 
functioning of the circadian clock machinery40,41. In vivo studies have shown that, BMAL1 binds 

also  to  non-canonical  E-box  sequences  such  as  5’-CACGTT-3’  in  the  promoter  region  of  the 

murine  Per2  gene42.  However,  comprehensive  experimental  evidence  supporting  genome-wide 

binding of BMAL1 to such sequences remains elusive. Consequently, in this study, I have focused 

solely  on  the  classical  E-box  motif  with  the  canonical  sequence  5’-CANNTG-3’  (where  N 

represents any nucleotide). Disruptions in the expression or binding functionality of the core clock 

transcription  factors  (TFs)  disturb  natural  circadian  oscillations  and  can  lead  to  various 

pathologies,  including  insomnia,  cancer,  cardiovascular  disease,  and  metabolic  disorders43,44. 

Here, I aim to enhance our understanding of gene regulation by the CLOCK-BMAL1 or NPAS2-

BMAL1  complex  and  its  perturbations  by  employing  interpretable  predictive  models  of  DNA 

binding  by  the  master  regulatory  factor  BMAL1.  By  developing  these  models,  I  seek  to  gain 

insights into the mechanisms underlying the binding of BMAL1 to target DNA sequences and the 

subsequent regulation of clock-controlled genes.  

The genome-wide identification of transcription factor binding sites (TFBS) poses a significant 

challenge.  Typically,  only  a  small  fraction  of  the  classically  defined  sequence  motifs  for  a 

particular transcription factor (TF) are bound45. For example, the canonical E-box binding motif 

occurs more than 7 million times across the mouse genome, but less than 0.7% of these motifs are 

bound by the CLOCK-BMAL1 or NPAS2-BMAL1 complexes in mouse peripheral tissues46. The 

binding  of  a  particular  TF  to  its  cognate  DNA  motif  depends  on  several  molecular  features, 

including  the  DNA  sequence  of  the  core  motif,  sequences  flanking  the  core  motif,  chromatin 

accessibility,  local  shape  of  the  DNA,  presence  of  co-factors,  histone  modifications,  DNA 

methylation, and other biophysical parameters47–50. These features and their relative contributions 

to  binding  can  vary  greatly  across  different  cell  and 

tissue 

types51,52.  Chromatin 

9 

 
immunoprecipitation followed by sequencing (ChIP-seq) is the current gold standard for assaying 

genome-wide TF binding locations53. However, assaying the binding of a given TF under various 

conditions  and  in  different  tissues  is  prohibitively  expensive.  Consequently,  several  predictive 

computational  models  of  genome-wide  TF-DNA  binding  have  been  developed.  From  these 

models, DNA sequence and chromatin accessibility emerge as the most important determinants of 

TF binding. Despite the vast number of potential binding sites in the genome, TFs exhibit highly 

specific binding patterns, with only a small fraction of the canonical motifs being occupied. This 

specificity is attributed to the interplay of various factors, including the DNA sequence context, 

chromatin accessibility, and other biophysical parameters. These features can vary significantly 

across different cell and tissue types, contributing to the complexity of TF binding patterns54–56. 

Chromatin  accessibility  assays,  such  as  deoxyribonuclease  hypersensitive  sites  sequencing 

(DNase-seq) and assay for transposase-accessible chromatin sequencing (ATAC-seq), have been 

employed  to  enhance  the  prediction  of  transcription  factor  binding  sites  (TFBS). These  assays 

provide valuable information about  the regions  of the genome that are accessible to  regulatory 

proteins,  a  crucial  determinant  of  TF  binding57.  Recently,  advancements  in  machine  learning, 

particularly  deep  learning  techniques,  have  led  to  improved  model  predictions  for TF  binding. 

Deep learning models have demonstrated the ability to capture complex patterns and interactions 

within  the  data,  leading  to  more  accurate  predictions  of  TF-DNA  binding  events.  While  deep 

learning models excel in predictive performance, they often tend to be "black boxes," where it is 

challenging  to  understand  the  reasoning  and  mechanism  behind  their  predictions.  This  lack  of 

interpretability  hinders  the  ability  to  elucidate  the  molecular  features  and  interactions  that 

contribute to the tissue-specific binding patterns of transcription factors58–60. 

10 

 
In  this  study,  I  present  interpretable  machine  learning  models  capable  of  predicting  which 

canonical E-box motifs occurring in accessible chromatin regions of the mouse liver, heart, and 

kidney are likely  to be bound by BMAL1. Our predictive models are based on the XGBoost61 

machine learning algorithm, with logistic regression used as a baseline algorithm to evaluate model 

performance. Published data from a BMAL1 ChIP-seq study46 was used to train and evaluate the 

models. When considering which features to include in our predictive models, I noted that DNA 

shape62,63 and histone modifications64 have been shown to be efficient predictors of transcription 

factor (TF) binding in addition to DNA sequence. Specifically, it has been proposed that TFs prefer 

specific 3D DNA conformations and not just specific sequences65. For example, the incorporation 

of DNA shape features led to improved model performance when predicting in vivo binding of 

TFs  from  the  basic  helix–loop–helix  (bHLH)  family. Particularly,  five  distinct  shape  features  - 

Electrostatic Potential (EP), Minor Groove Width (MGW), Propeller Twist (ProT), Roll, and Helix 

Twist (HelT) have been shown to be useful for TF-DNA binding prediction. These shape features 

capture the local three-dimensional structure of the DNA, which can influence the binding affinity 

and specificity of TFs66. In addition to DNA shape features, histone modifications have also been 

implicated in regulating TF-DNA binding. Specific histone modifications can alter the chromatin 

structure and accessibility, thereby affecting the ability of TFs to recognize and bind to their target 

sequences.  By  incorporating  information  about  histone  modifications,  predictive  models  can 

potentially capture additional regulatory mechanisms governing TF-DNA binding67. 

Our study revealed that while most of the flanking DNA sequence features showed low importance 

in  predicting  the binding of BMAL1 to  canonical  E-box motifs. However, the second flanking 

nucleotide upstream of the E-box motif in the liver, specific histone modifications and DNA shape 

features emerged as significant predictors of BMAL1-DNA binding across all tissues examined. 

11 

 
Specifically, the histone modifications H3K27ac, H3K4me1, H3K4me3, and H3K36me3, together 

with the DNA shape features electrostatic potential (EP), roll, and minor groove width (MGW), 

were identified as crucial determinants of BMAL1-DNA binding in the liver, heart, and kidney 

tissues. The  incorporation  of  these  features  into  our  machine  learning  models  resulted  in  high 

predictive  performance,  highlighting  their  importance  in  governing  the  binding  specificity  of 

BMAL1. However, our cross-tissue predictive model revealed that although BMAL1 exhibits a 

preference for specific DNA conformations and chromatin contexts, these specificities vary across 

different  tissues. This  finding  suggests that while certain  genomic and epigenomic features  are 

generally  important  for  BMAL1  binding,  the  relative  contributions  of  these  features  and  the 

specific  combinations  that  facilitate  binding  may  differ  among  tissues.  The  tissue-specific 

variations in BMAL1 binding preferences could be attributed to the unique chromatin landscapes, 

regulatory networks, and cellular environments present in different tissues. These differences may 

influence  the  interplay  between  DNA  sequence,  DNA  shape,  histone  modifications,  and  other 

regulatory factors, resulting in distinct binding patterns of BMAL1 across tissues. 

Our interpretable machine learning models not only achieved high predictive accuracy but also 

provided  insights  into  the  key  genomic  and  epigenomic  features  that  govern  BMAL1-DNA 

binding.  By  identifying  the  most  influential  features  and  their  relative  importance,  our  study 

contributes to a better understanding of the regulatory mechanisms underlying the tissue-specific 

binding patterns of BMAL1, a critical transcription factor in the circadian clock machinery. 

12 

 
 
 
 
 
METHODS 

ChIP-seq dataset preprocessing. 

Uniformly processed BMAL1 ChIP-seq peaks from the C57BL/6J mouse liver, kidney, and heart 

were retrieved from the Gene Expression Omnibus under accession code GSE110604946. BMAL1 

ChIP-seq experiments were conducted at Zeitgeber time 6 (ZT6). Accessible chromatin regions, 

represented by DNase I-hypersensitive (DHS) sites for all three tissues (DNase-seq), were obtained 

from  the  Encyclopedia  of  DNA  Elements,  ENCODE  (Supplementary  Materials).  DNase-seq 

experiments  were  performed  on  unsynchronized  tissues.  The  Genome  Reference  Consortium 

Mouse Build 38 (GRCm 38) served as the reference genome. DHS sequences were processed in 

Python  with  BEDTools68  to  extract  all  E-Box  sequences  (5’-CANNTG-3’)  within  accessible 

chromatin. E-box motifs in accessible chromatin regions not overlapping their respective tissue 

ChIP-seq  bed  files  were  considered  instances  of  unbound  motifs  (the  negative  dataset  for  the 

model). All accessible chromatin singleton E-boxes (instances of only one E-box motif under a 

BMAL1 peak) and E-boxes closest to the summit of the BMAL1 peaks for peaks with multiple E-

boxes  were  labeled  as  bound  (the  positive  dataset).  Other  E-boxes  under  BMAL1  peaks  were 

deemed ambiguous and excluded from further analysis. Specifically, 1175 E-boxes, 1082 E-boxes, 

and 663 E-boxes from the bound Bmal1 liver, kidney, and heart, respectively, were found to be 

ambiguous  due  to  multiple  E-boxes.  Each  E-box  motif  sequence  was  extended  to  include  4-

basepair  (bp)  flanking  sequences  upstream  and  downstream  of  the  E-box. As  the  E-box  motif 

sequence is a palindrome, the reverse complement was disregarded. Each E-box, thus represented 

by a 14-nucleotide sequence (6-bp core plus 4-bp sequence on either end), was one-hot encoded. 

The  binary  (bound  and  unbound)  E-box  data  resulted  in  highly  imbalanced  datasets,  with 

significantly more unbound than bound E-boxes in mouse accessible chromatin. Specifically, the 

13 

 
number  of  bound  E-box  motifs  in  the  liver,  kidney,  and  heart  were  3725,  3237,  and  1313, 

respectively. Conversely, the number of unbound E-box motifs in the liver, kidney, and heart were 

189581,  262053,  and  291840,  respectively.  Notably,  the  negative  samples  outnumbered  the 

positives  by  factors  of  51  in  the  liver,  223  in  the  heart,  and  82  in  the  kidney.  The  reported 

occupancies and the ratio of bound to unbound E-box motifs in the liver, kidney, and heart tissues 

are consistent with previous studies, indicating a low percentage of canonical E-box motifs being 

bound by BMAL1 in accessible chromatin regions. 

DNA shape preprocessing. 

The  DNA  sugar-phosphate  backbone  possesses  degrees  of  freedom,  allowing  neighboring  base 

pairs  and  bases  within  a  pair  to  vary  their  positions  relative  to  each  other  through  rotation  or 

translation, resulting in changes to the overall shape of the DNA molecule. To estimate DNA shape 

features, the R/Bioconductor package DNAshapeR69,70 was employed. The DNAshapeR algorithm 

predicts  DNA  shape  features  based  on  a  given  DNA  sequence  and  encodes  them  into  feature 

vectors. These feature vectors for each shape category were normalized to values between 0 and 1 

using Min-Max normalization and grouped into sets of 10 values for Minor Groove Width (MGW), 

Propeller Twist (ProT), and Electrophoretic Mobility (EP), and sets of 11 values for Helix Twist 

(HelT) and Roll, to serve as inputs for predictive models. The number of bins for each shape feature 

corresponds to the length of the sliding window used to generate the features – 5 base pairs (bp) 

for MGW, ProT, EP, and 6 bp for HelT and Roll. 

Histone modification preprocessing. 

I obtained ChIP-seq data for five histone modifications, namely H3K27ac, H3K4me1, H3K4me3, 

H3K27me3, and H3K36me3, in mouse liver, kidney, and heart tissues from the ENCODE database 

(Supplementary Materials). The ChIP-seq experiments for histone modifications were performed 

14 

 
on unsynchronized tissue samples. These specific histone modifications were chosen based on data 

availability  across  all  tissues  and  their  established  roles  in  transcription  factor  binding.  The 

corresponding bed files were utilized to generate signal profiles and heatmaps using the deepTools 

software71. From the generated profiles and heatmaps, we observed that the histone modification 

ChIP-seq signals extended meaningfully up to a 1.5-kb region (+/- 750 bp) centered on the E-box 

core  motif.  Focusing  on  this  1.5-kb  region  centered  at  the  E-box  core  motif,  we  extracted  the 

histone modification features for the binary dataset for each tissue using the bwtool software72. 

Subsequently, these features were divided into 10 bins, with an equal number of nucleotides in 

each bin. 

Machine learning models. 

XGBoost 

Extreme Gradient Boosting (XGBoost) is an ensemble learning method based on boosting decision 

trees for both classification and regression tasks61. I used up to 20 features as inputs for each E-

box motif - 10 sequence features (one for each nucleotide), 5 DNA-Shape features and 5 histone 

modification features. The first two and last two nucleotide of the E-box motifs were set because 

they  were  the  same  in  all  motifs.  Using  the  Scikit-learn  library  in  Python,  I  performed 

hyperparameter tuning of the following parameters to reduce the degree of overfitting - the number 

of iterations in training (n_estimators), the sum of sample  weight of the smallest leaf nodes to 

prevent overfitting (min_child_weight), the maximum depth of the tree in building a model while 

training  (max_depth),  the  sampling  rate  of  the  training  set  in  each  iteration  (subsample),  the 

learning  rate  (learning_rate),  and  the  feature  sampling  rate  when  constructing  each  tree 

(colsample_bytree).  Hyperparameter  tuning  of  the  XGBoost  model  involved  exploring  a  grid 

search of the hyperparameter space with specific values: n_estimators = {30, 40, 50, 60, 70, 80, 

15 

 
90, 100}, min_child_weight = {1, 2, 3, 4, 5, 6}, subsample={0.5, 0.6, 0.7, 0.8, 0.9, 1 }, max_depth 

= {1, 2, 3, 4, 5}, learning_rate = {0.1, 0.2, 0.3, 0.4, 0.5}, and colsample_bytree = {0.6, 0.7, 0.8, 

0.9, 1 }, resulting in a potential combination of 36,000 hyperparameters. Additionally, I evaluated 

the model's performance using 5-fold cross-validation on predicting the binding status of E-box 

motifs in accessible chromatin regions. 

Logistic Regression  

Logistic regression is a parametric classification model that estimates the probability of the output 

variable belonging to a particular class73. It serves as a baseline for most machine learning-based 

classification models. In this study, I tuned the following hyperparameters of the logistic regression 

model to reduce overfitting on our testing dataset: the regularization solver for the training dataset 

(solver)  and  the  maximum  number  of  iterations  the  solver  algorithm  is  allowed  to  run  before 

convergence (max_iter). The regularization solver determines the algorithm used for optimization 

and regularization. Common solvers include 'lbfgs' (limited-memory Broyden-Fletcher-Goldfarb-

Shanno), 

'newton-cg'  (Newton-Conjugate  Gradient),  and 

'liblinear'  (coordinate  descent 

algorithms). The  max_iter  parameter  specifies  the  maximum  number  of  iterations  the  solver  is 

permitted to run before terminating the optimization process. By tuning these hyperparameters, I 

aimed  to  find  the  optimal  combination  of  solver  and  maximum iterations  that  would  minimize 

overfitting on the testing dataset, thereby enhancing the generalization performance of the logistic 

regression model. 

16 

 
 
 
 
RESULTS 

BMAL1 binds most frequently to the CACGTG E-box motif in all tissues. 

To understand the molecular factors governing the binding of BMAL1 to DNA, I first delved into 

the role of the canonical 5’-CANNTG-3’ E-box motif in the binding of BMAL1 to DNA in the 

various  tissues.  I  conducted  a  comprehensive  scan  of  the  mouse  mm10  reference  genome  to 

identify occurrences of the E-box motif; 5’- CANNTG -3’. I considered all possible nucleotide 

permutations  of  the  central  two  nucleotides  (NN)  and  included  the  reverse  complement  of  the 

canonical E-box sequence. However, a particular E-box and its reverse complement were treated 

separately. Utilizing DNase-seq datasets  retrieved from  the ENCODE database74 for C57BL/6J 

mouse  tissues  (liver,  heart,  and  kidney),  I  identified  subsets  of  E-boxes  that  intersected  or 

overlapped with DNase-seq hypersensitive sites (DHS), indicative of accessible chromatin. The 

tissue-specific list of E-boxes from accessible chromatin were then compared with their respective 

tissue matched BMAL1 ChIP-seq46 peaks to extract all BMAL1 bound and unbound E-Boxes in 

accessible  chromatin.  Furthermore,  I  identified  instances  where  BMAL1-bound  E-boxes  were 

situated outside of accessible chromatin, accounting for 0.8% of all peaks. To prevent potential 

confounding effects between the two classes of bound E-boxes, I excluded these instances from 

our model training and evaluation processes. Comparison of BMAL1-bound E-box occurrences 

within  accessible  chromatin  across  liver,  heart,  and  kidney  tissues  revealed  marked  tissue 

specificity. Specifically, only  398 E-boxes were found to  be bound in  common across all three 

tissues  (Fig  3A-C).  E-boxes  bound  across  all  three  tissues  were  frequently  located  within  the 

promoters of core circadian clock genes, as indicated by our findings (data not shown). 

17 

 
 
 
 
 
B 

A 

C 

Figure 3: (A) BMAL1 ChIP-seq peaks in the liver (red), kidney (blue), and heart (green), and E-
box  binding  motifs  (black  vertical  bars)  under  the  peaks  at  the  Per1  locus.  (B)  Venn  diagram 
representing the overlap of bound E-boxes motifs in open chromatin across liver, kidney, and heart. 
(C) Venn diagram representing the overlap of unbound E-boxes motifs in open chromatin across 
liver, kidney, and heart. Adapted from Marri et al75. 

Subsequently,  I  quantified  all  occurrences  of  the  canonical  E-box  motif  (5’-CANNTG-3’) 

throughout  the  mouse  genome,  wherein  N  denotes  any  nucleotide  type.  The  canonical  E-box 

encompasses 16 unique E-box types, corresponding to each permutation of the NN dinucleotide 

situated at the center of the motif. I computed the fraction of each individual E-box type compared 

to  the  total  number  of  E-boxes  (Fig  4A).  Notably,  the  E-box  types  5’-CACATG-3’  and  5’-

18 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CATGTG-3’ collectively constituted the highest fraction of E-boxes within the mouse genome, 

accounting for 17.3% of all instances. These two motifs, being reverse complements of each other, 

demonstrated approximately equal frequencies, like all other non-palindromic E-boxes that exhibit 

reverse  complementarity.  Interestingly,  the  palindromic  BMAL1-preferred  E-Box  motif,  5’-

CACGTG-3’, occurs the fewest number of times, representing only 1.83% of all instances within 

the mouse genome. Subsequently, I employed the same methodology to analyze E-boxes within 

accessible chromatin regions of mouse liver, kidney, and heart. The palindromic motif CAGCTG 

was the most common E-box type across accessible chromatin regions in all three tissues, while 

the BMAL1-preferred E-Box CACGTG was among the three least common motifs, which were 

all palindromes (Fig 4B). Using the overlap between the BMAL1 ChIP-seq and DNase-seq peaks, 

I computed the percentage of BMAL1 bound E-boxes in the mouse liver, kidney and heart relative 

to the total number of E-boxes of the same type within accessible chromatin of their respective 

tissues. Interestingly, the BMAL1-preferred E-box 5’-CACGTG-3’ emerged as the most frequently 

bound E-box type across all three tissues. In addition, about 18% of CACGTG E-boxes accessible 

in  the  liver  were  bound,  and  for  the  kidney  and  heart  these  fractions  were  15%,  and  4%, 

respectively.  Furthermore,  less  than  20%  of  all  individual  E-boxes  identified  within  accessible 

chromatin in any given tissue were also bound in that same tissue (Fig 4C). The kidney and heart 

had a higher overall number of E-boxes within these open chromatin regions compared to the liver. 

However, despite having a lower total count of E-boxes in accessible chromatin, the liver exhibited 

a higher proportion of BMAL1-bound E-boxes. This observation suggests a differential regulatory 

landscape for BMAL1 binding among these tissues, potentially reflecting distinct functional roles 

and circadian regulatory mechanisms.  

19 

 
I observed instances where there were none (zero), exactly one (singleton), or two or more (multi) 

E-box motifs under a single BMAL1 ChIP-seq peak across all tissues examined (Fig 4D). I then 

extracted all singleton E-boxes and the E-boxes closest to the summit of the BMAL1 peak within 

multi-E-box peaks, categorizing them as bound peaks (positive dataset). The E-Boxes present in 

accessible  chromatin  regions  that  were  not  bound  by  BMAL1  were  labeled  as  unbound  peaks 

(negative dataset). All other E-boxes were excluded from further analysis. The ratios of the positive 

(bound) to negative (unbound) peaks were 1:51, 1:82, and 1:223 in the liver, kidney, and heart, 

respectively. These  ratios  indicate  a  significant  imbalance  between  the  bound  and  unbound  E-

boxes, with a substantially higher number of unbound E-boxes in all three tissues.  

Together, these results suggest that BMAL1 likely interacts with multiple different E-box types 

across the liver, kidney, and heart in a tissue-specific manner. Notably, the E-box motif CACGTG 

was found to be the most highly associated with BMAL1 binding among the different E-box types 

analyzed, despite its lower occurrence in the genome overall. The observation of varying ratios of 

bound to unbound E-boxes across tissues highlights the tissue-specific nature of BMAL1 binding 

and  suggests  that  additional  factors,  such  as  chromatin  accessibility,  co-factor  availability,  and 

regulatory mechanisms, may contribute to the differential binding patterns observed. Furthermore, 

the  identification  of  instances  where  BMAL1  ChIP-seq  peaks  do  not  overlap  with  any  E-box 

motifs or contain multiple E-boxes under a single peak underscores the complexity of BMAL1 

binding and the potential involvement of alternative binding mechanisms or indirect interactions 

with chromatin. 

20 

 
 
 
A 

e
m
o
n
e
g
e
h
t
n

i

s
f
ti
o
m
x
o
b
-
E
f
o
r
e
b
m
u
N

C 

E-box motifs 

l

e
b
i
s
s
e
c
c
a
n

i

n
ti
a
m
o
r
h
c

s
f
ti
o
m
d
n
u
o
b
f
o
e
g
a
t
n
e
c
r
e
P

B 

l

e
b
i
s
s
e
c
c
a
n

i

E-box motifs 

n
ti
a
m
o
r
h
c

s
f
ti
o
m
x
o
b
-
E
f
o
r
e
b
m
u
N

D 

s
k
a
e
p
1
L
A
M
B
f
o
r
e
b
m
u
N

E-box motifs 

Number of E-box motifs under BMAL1 peak 

Figure 4: (A) E-box binding motif distribution across the entire mouse genome. The canonical E-
Box motif CACGTG (marked with an arrow) is the least represented motif in the mouse genome. 
(B) Distribution of E-box binding motifs in open chromatin across the liver (blue), kidney (orange) 
and heart (green).  (C) Percentage of BMAL1 bound E-box motifs in open chromatin across the 
liver (blue), kidney (orange) and heart (green).  (D) Distribution of BMAL1 peaks with zero (0-E-
Box), exactly one (singleton E-box) and multi (two or more E-box) E-box motifs in the liver (blue), 
kidney (orange) and heart (green). In (A), (B) and (C) the sequences along the x-axis are ordered 
by  their  frequency  in  the  mouse  genome  shown  in  (A),  which  also  happens  to  group 
complementary sequences adjacent to each other. The four palindromic sequences, which are their 
own complements, are marked with asterisks. Adapted from Marri et al75. 

21 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Predicting genome wide BMAL1 binding within tissues. 

Given  that  nucleotides  flanking  the  E-box  have  been  demonstrated  to  influence  the  binding 

specificity  of  E-box  binding  transcription  factors  (TFs)55.  Therefore,  I  extended  and  one-hot 

encoded the genomic sequence for all BMAL1-bound (positive) and unbound (negative) E-boxes 

by  4  base  pairs  (bps)  upstream  and  downstream  of  the  E-box.  Additionally,  I  computed  the 

following DNA shape features for the extended 14 bp sequence: Electrostatic Potential (EP), Minor 

Groove Width (MGW), Propeller Twist (ProT), Roll, and Helix Twist (HelT), utilizing the k-mer 

+ k-shape (k=1) sequence feature model50. Although the shape features are derived from the DNA 

sequence,  they  can  potentially  capture  higher-order  interdependencies  between  neighboring 

nucleotides, thereby providing additional information to the model input. DNA shape features can 

also elucidate the importance of flanking sequences in TF-DNA binding specificity55. Visualization 

of the DNA shape features EP, ProT, and Roll revealed disparities in DNA shape between bound 

and  unbound  motifs  across  the  liver,  kidney,  and  heart,  whereas  the  MGW  feature  exhibited 

differences between bound and unbound motifs solely in the kidney. The shape feature vector for 

each category was then normalized to values ranging from 0 to 1 using Min-Max normalization, 

grouped into intervals of 10 values for the DNA shape features EP, MGW, and ProT, and intervals 

of 11 values for HelT and Roll. These normalized DNA shape feature vectors were utilized as input 

features for the predictive models, as depicted in Fig 5. 

Epigenetic modifications are known to influence transcription factor binding. Specifically, histone 

modifications intricately regulate transcription factor occupancy and subsequently modulate gene 

expression  patterns62,76.  Histone  modification  ChIP-seq    signal  values,  encompassing  genomic 

regions spanning ± 750 base pairs (bps) around the E-box, were utilized to compute feature vectors 

for five histone modifications: H3K27ac, H3K4me1, H3K4me3, H3K27me3, and H3K36me3. The 

22 

 
selection  of  the  ±  750-bp  region  was  guided  by  the  need  to  consider  local  profiles  of  histone 

modifications, approximating the size of a typical promoter or enhancer. The histone feature vector 

was partitioned into 10 bins, with the signal strength averaged across 150 bps within each bin. This 

approach  allowed  for  the  comprehensive  characterization  of  histone  modification  patterns 

surrounding the E-box, providing insights into the regulatory landscape governing transcription 

factor binding dynamics and gene expression regulation. 

Figure 5: Design of the machine learning algorithm input features. The local chromatin features 
(E-box DNA sequence features) and flanking sequences were one-hot encoded. The DNA shape 
genomic feature matrix from the k-mer + k-shape (k=1) sequence feature model and epigenomic 
(histone modification) features averaged and binned were used as the final feature matrix for the 
model. Adapted from Marri et al75. 

I  implemented  three  distinct  models  to  investigate  the  predictive  power  of  various  feature 

combinations for determining the binding status of E-boxes in accessible chromatin regions. The 

models were constructed using subsets of the final encoded feature set, encompassing (i) DNA 

sequence information  alone (DNA sequence-only  model);  (ii) a combination of DNA sequence 

and DNA shape features (sequence + shape model); and (iii) an integrated model incorporating 

23 

 
 
 
 
 
 
 
 
 
DNA sequence, DNA shape, and histone modification data (sequence + shape + HM model). To 

predict  the  binding  status  of  E-boxes,  I  employed  two  machine  learning  algorithms:  XGBoost 

(eXtreme Gradient Boosting) and logistic regression. XGBoost, a powerful tree-based ensemble 

learning algorithm, served as our principal predictive model, while logistic regression was utilized 

as a baseline for performance comparison. Aiming to optimize model performance, I employed a 

grid  search  strategy  in  conjunction  with  stratified  5-fold  cross-validation  to  tune  the 

hyperparameters  of  each  model. This  process  involved  partitioning  the  data  into  five  stratified 

folds, where each fold was used as a validation set once, while the remaining four folds were used 

for training. Hyperparameter tuning was performed independently for each model based on the 

liver, heart, and kidney datasets, allowing us to derive the optimal hyperparameter configurations 

tailored  to  each  tissue-specific  dataset.  Subsequently,  the  models  with  the  optimized 

hyperparameters  were  trained  through  five-fold  stratified  cross-validation,  and  their  predictive 

performance was evaluated for  the liver, heart, and kidney datasets (Figure 6 and table 1). The 

average performance across the five folds was reported for each model and tissue type. 

Figure  6:  Schematic  of  machine  learning-  predictive  model  training.  Based  on  5-fold  cross-
validation, the XGBoost classifier predicted the binding status of E-box motifs in open chromatin, 
training on all accessible bound E-boxes and unbound E-boxes. Adapted from Marri et al75. 

24 

 
 
 
 
 
 
 
 
 
 
DNA shape and histone modification features improve within-tissue model performance. 

Model performance was assessed using two widely adopted metrics: the area under the receiver 

operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC). 

These metrics provide a comprehensive evaluation of the models' ability to distinguish between 

positive (bound) and negative (unbound) instances, accounting for both sensitivity and 

specificity. 

Performance of sequence-only models: We conducted training and validation of our XGBoost 

classifier  across  liver,  heart,  and  kidney  datasets,  utilizing  10  sequence  features:  the  2  central 

nucleotides  of  the  E-Box,  along  with  an  additional  4  flanking  nucleotides  upstream  and 

downstream  of  the  E-box  (NNNNCANNTGNNNN,  excluding  the  conserved  CA  and  TG 

subsequences)  due  to  the  repetition  of  CA  and  TG  nucleotide  in  all  the  sequence  features.  To 

evaluate classifier performance, we computed the average AUROC and AUPRC scores for each 

tissue  via  stratified  5-fold  cross-validation.  Given  the  unbalanced  distribution  between  the  two 

classes - bound vs unbound E-boxes, AUPRC was deemed a more appropriate metric in our case. 

The mean AUROC scores across liver, kidney, and heart were 0.71, 0.78, and 0.80, respectively. 

Correspondingly,  the  mean AUPRC  scores  were  0.09,  0.10,  and  0.06  for  the  liver,  kidney,  and 

heart, respectively (Fig 7A&B). The relatively high AUROC and AUPRC scores across all tissues 

suggest differences in the two central nucleotides and flanking sequence between BMAL1 bound 

and unbound E-boxes. However, the predictive power solely derived from DNA sequence seems 

insufficient for robust prediction, indicating the need for additional features or more sophisticated 

modeling approaches to enhance prediction accuracy. 

Performance  of  sequence  +  shape  models:  The  specific  local  configurations  of  DNA  are 

determined  by  its  three-dimensional  structure,  which  in  turn  influences  various  biological 

25 

 
processes.  Computational  methods,  including  Monte  Carlo  simulations,  have  been  utilized  to 

derive  features  that  quantify  DNA  shape  from  its  local  sequence66,69.  Five  distinct  DNA  shape 

features – Electrostatic Potential (EP), Minor Groove Width (MGW), Propeller Twist (ProT), Roll, 

and  Helix Twist  (HelT)  –  have  been  identified  as  significant  contributors  to  the  DNA  binding 

affinity of transcription factors belonging to the basic helix-loop-helix (bHLH) family50. The 14 

bp  sequences  (NNNNCANNTGNNNN)  were  used  to  derive  five  distinct  DNA  shape  features. 

Integration of these DNA shape feature matrices  with  sequence  features  as  input for predictive 

models, help to evaluate the contribution of DNA shape to BMAL1 binding. The mean AUROC 

scores for the liver, kidney, and heart were 0.97, 0.98, and 0.98, respectively, which are higher than 

the sequence-only model. Additionally, the mean AUPRC metric increased significantly compared 

to the sequence-only model, rising from 0.09 to 0.79 for the liver, 0.10 to 0.51 for the kidney, and 

0.06  to  0.71  for  the  heart.  This  suggests  a  significant  difference  in  local  DNA  shape  features 

between the bound and unbound E-boxes. Further analysis revealed that the EP, Roll, and ProT 

DNA shape features contributed 33% to the prediction of BMAL1 binding to the E-boxes in the 

liver. In the kidney, the EP, ProT, and MGW DNA shape features contributed 68%, while in the 

heart, EP, Roll, and MGW contributed 70% to the prediction. Overall, the EP, Roll, MGW, and 

ProT DNA shape features had the most significant influence on the prediction of bound E-boxes 

across all three tissues. However, training and evaluating DNA shape-only models yielded lower 

performance  compared  to  DNA  sequence-only  models,  suggesting  that  the  local  shape  or 

configuration  of  DNA  near  the  E-box  alone  is  insufficient  to  predict  BMAL1  binding.  This 

demonstrates the importance of considering both DNA sequence and shape features in predicting 

transcription factor binding, with specific DNA shape features playing a crucial role in determining 

BMAL1 binding affinity to E-boxes across different tissues. 

26 

 
Performance of sequence + shape + histone modification (HM) models: Histone modifications 

(HMs)  within  gene  promoter  and  enhancer  regions  are  widely  recognized  to  correlate  with  the 

binding  of  transcription  factors  (TFs)77.  Despite  this  correlation,  the  intricate  mechanisms 

governing the interaction between TF binding and HMs remain incompletely understood. Recent 

investigations have shed light on the role of HMs in enhancing the predictive accuracy of models 

aimed  at  forecasting  TF  binding.  Notably,  the  impact  of  HMs  on  model  performance  varies 

depending  on  the  specific  TF  under  consideration.  For  instance,  models  predicting  binding  of 

bHLH transcription factors have exhibited significantly enhanced accuracy upon the inclusion of 

HMs78,79. Based on these findings, several models have been developed to improve TF binding 

prediction  using  results  from  epigenetic  assays80,81.  We  examined  the  importance  of  HMs  in 

prediction of BMAL1 binding by augmenting the sequence and DNA shape feature matrix with 

five  histone  features:  H3K27ac,  H3K4me1,  H3K4me3,  H3K27me3,  and  H3K36me3.  The 

selection of these HM features was based on both their availability in existing datasets and their 

established roles in facilitating transcription factor binding79. By incorporating these HM features 

into the models, I obtained mean AUROC scores of 0.99, 0.988, and 0.99 for the liver, kidney, and 

heart, respectively. Additionally, the mean AUPRC performance increased significantly to 0.95, 

0.65, and 0.79 for the liver, kidney, and heart, respectively (Figure 7). These results highlight the 

importance  of  considering  not  only  DNA  sequence  and  shape  features  but  also  epigenetic 

information,  such  as  histone  modifications,  in  accurately  modeling  transcription  factor  binding 

events across different tissue types. 

27 

 
 
 
 
A 

)
C
O
R
U
A
(

c
i
r
t
e
m
e
c
n
a
m
r
o
f
r
e
P

B 

)
C
R
P
U
A
(

c
i
r
t
e
m
e
c
n
a
m
r
o
f
r
e
P

Figure  7:  Adding  DNA  Shape  and  Histone  modification  (HM)  features  to  DNA  Sequence 
significantly improves prediction of BMAL1 binding across all tissues.  (A) The area under the 
receiver  operating  characteristics  (AUROC)  for  liver,  kidney,  and  heart  for  the  sequence-only 
model (blue), sequence plus DNA shape model (brown) and sequence plus DNA shape plus HM 
model (green).  The mean AUROC increases sharply with the addition of DNA shape features to 
the model, with a much smaller increase associated with the addition of HMs. (B) The area under 
the precision recall curve (AUPRC) in liver, kidney and heart for the sequence-only model (blue), 
sequence plus DNA shape model (brown) and sequence plus DNA shape plus HM model (green). 
As with AUROC, the mean AUPRC increased by a large margin with the addition of DNA shape 
features to the model, with a smaller increase associated with the addition of HMs. Adapted from 
Marri et al75. 

28 

 
  
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
XGBoost  

Logistic regression  

AUROC 

AUPRC 

AUROC 

AUPRC 

Liver 

DNA Sequence Only model  

0.71 ± 0.00 

0.09 ± 0.01 

0.71 ± 0.00 

0.08 ± 0.01 

DNA sequence plus DNA shape 
model 
DNA sequence plus histone 
modification model  
DNA shape plus histone 
modification model 
DNA sequence and shape plus 
histone modifications model  

Kidney 

0.97 ± 0.00 

0.79 ± 0.01 

0.97 ± 0.00 

0.79 ± 0.01 

0.85 ± 0.02 

0.13 ± 0.01 

0.80 ± 0.00 

0.12 ± 0.03 

0.90 ± 0.01 

0.22 ± 0.01 

0.81 ± 0.01 

0.16 ± 0.01 

0.99 ± 0.00 

0.95 ± 0.00 

0.97 ± 0.00 

0.91 ± 0.00 

DNA Sequence Only model 

0.78 ± 0.01 

0.10 ± 0.01 

0.77 ± 0.00 

0.10 ± 0.01 

DNA sequence plus DNA shape 
model  
DNA sequence plus histone 
modification model  
DNA shape plus histone 
modification model 
DNA sequence and shape plus 
histone modifications model 

Heart 

0.94 ± 0.00 

0.50 ± 0.01 

0.79 ± 0.01 

0.10 ± 0.01 

0.89 ± 0.00 

0.19 ± 0.01 

0.87 ± 0.00 

0.13 ± 0.01 

0.95 ± 0.01 

0.31 ± 0.01 

0.88 ± 0.01 

0.15 ± 0.01 

0.96 ± 0.00 

0.65 ± 0.01 

0.88 ± 0.01 

0.15 ± 0.01 

DNA Sequence Only model 

0.80 ± 0.01 

0.06 ± 0.01 

0.78 ± 0.01 

0.05 ± 0.01 

DNA sequence plus DNA shape 
model  
DNA sequence plus histone 
modification model  
DNA shape plus histone 
modification model 
DNA sequence and shape plus 
histone modifications (Heart) 

0.99 ± 0.00 

0.71 ± 0.03 

0.97 ± 0.01 

0.49 ± 0.02 

0.96 ± 0.00 

0.26 ± 0.01 

0.92 ± 0.01 

0.8 ± 0.01 

0.97 ± 0.00 

0.35 ± 0.00 

0.95 ± 0.00 

0.22 ± 0.00 

0.99 ± 0.00 

0.80 ± 0.04 

0.98 ± 0.01 

0.47 ± 0.02 

Table 1: Model Performance scores: The performance of models predicting BMAL1-DNA binding 
status in open chromatin of the liver, kidney, and heart using XGBoost and logistic regression.  

29 

 
 
 
 
 
 
 
Table 1 (cont’d) 
Performance of each model is represented as a mean value with a 95% confidence interval around 
the results from 5-fold cross validation. The highest model performance for each tissue is bolded. 
Adapted from Marri et al75. 

Feature importance reveals tissue-specific BMAL1 binding grammar. 

Given  the  improved  performance  of  the  models  incorporating  sequence,  shape,  and  histone 

modification (HM) features, I applied the ELI5 permutation importance method to discern the most 

predictive features governing BMAL1-DNA binding82. The importance of each DNA shape and 

histone modification feature was computed by aggregating the importance scores across all bins 

associated with that specific feature. Furthermore, to facilitate comparative analysis, the feature 

importance of each nucleotide type at a given position relative to the E-Box motif was normalized 

against the cumulative feature importance at that nucleotide position. The analysis revealed that 

the  immediate  flanking  sequences  upstream  and  downstream  of  the  core  E-box  binding  motif 

played pivotal roles in predicting BMAL1 binding across the liver, heart, and kidney tissues  as 

compared to distal flanking sequences (Fig 8). Previous studies examining the binding specificities 

of bHLH transcription factors CBf1 and Tye7 in yeast have shown that 2-bp flanking sequences 

contribute to binding of these transcription factors to the E-box55. In our quantitative analysis of 

the E-box sequence, we did not find the two central base pairs of the CANNTG E-box motif to 

directly contribute to the model performance, even though BMAL1 has a strong preference for the 

CG central dinucleotide across all three mouse tissues. The nucleotide G at the second proximal 

upstream flanking sequence (Seq-2) was a robust predictor of BMAL1-DNA binding in the liver 

(Fig 8A). Remarkably, this nucleotide accounted for over 50% of the feature weights utilized in 

predicting BMAL1-DNA binding in the liver. Additionally, other influential features included EP 

(10%) and H3K27ac (6%). The comprehensive assessment of feature importance highlighted the 

30 

 
 
significance of various DNA shape and histone modification features in predicting BMAL1 DNA 

binding in the liver, with the majority exhibiting individual weights exceeding 5%, underscoring 

their  critical  roles  in  this  process.  Conversely,  most  DNA  sequence  features,  barring  Seq-2, 

exhibited feature weights below the 5% threshold. In the kidney, H3K27ac emerged as the most 

influential feature, contributing 21% to the overall feature importance (Fig 8B), followed closely 

by  EP,  with  a  feature  importance  of  19%.  Notably,  three  histone  modifications  (H3K27ac, 

H3K4me3,  and  H3K4me1)  and  four  DNA  shape  features  (EP,  ProT,  MGW,  and  Roll)  all 

demonstrated  feature  weights  surpassing  5%.  Similarly,  in  the  heart,  H3K27ac  and  H3K4me3 

assumed  the  highest  feature  importance  (both  exceeding  20%),  followed  by  EP  (8%). 

Predominantly,  DNA  sequence  features  exhibited  weights  below  5%  in  both  heart  and  kidney 

tissues. Across all three tissues, histone modifications (H3K27ac, H3K4me1, H3K4me3) and DNA 

shape  features  (EP,  Roll)  consistently  exhibited  high  importance  scores  (Fig  8A-C).  Notably, 

H3K27ac and H3K4me1 emerged as the most influential histone modifications across all tissues, 

with H3K4me3 and H3K36me3 also making substantial contributions, particularly in the kidney 

and  heart.  These  results  show  that  the  combination  of  the  TF  binding  motif  and  its  flanking 

sequence, local shape of DNA, and histone modifications is sufficient to produce predictive models 

of BMAL1 binding to E-box motifs, especially in the mouse liver. The second upstream flanking 

nucleotide  (Seq-2)  stood  out  as  the most  crucial feature  in  the  liver,  with  nucleotides  G  and  C 

overrepresented in this region. This observation was corroborated by the sequence logo of bound 

E-box sequences, encompassing 4 base pairs upstream and downstream of the core motif (Fig 8D). 

Further  analysis  of  bound  E-box  motifs,  along  with  their  upstream  and  downstream  flanking 

sequences, unveiled a notable enrichment of the nucleotide G at the third position of the 5' flanking 

region, particularly prevalent among liver-bound E-boxes (1228 out of 3374 bound E-boxes) (Fig 

31 

 
9A-B), a trend not mirrored in bound E-box motifs kidney and heart. The histone modifications 

H3K27ac  (associated  with  active  enhancers  and  promoters),  H3K4me1  (enriched  at  enhancer 

regions),  and  H3K4me3  (found  at  active  promoters)  played  significant  roles  in  determining 

BMAL1 binding affinity across different tissues. 

B 

s
e
r
u
t
a
e
F

D 

Kidney 

Weight 

A 

s
e
r
u
t
a
e
F

C 

s
e
r
u
t
a
e
F

Liver 

Weight 

Heart 

Weight 

Figure 8: Feature importance of all genomic features (sequence and DNA shape) and epigenomic 
(histone  modification)  features  from  the  XGBoost  classifier  model  across  all  tissues.  Feature 
importance in the XGBoost classifier model in (A) liver, (B) kidney, and (C) heart. The feature 
importance for each DNA shape and histone modification feature is calculated as the sum of all 
the  feature  importance  of  all  bins  for  that  particular  histone  modification  feature.  The  feature 
importance  of  each  nucleotide  type  at  a  particular  position  relative  to  the  E-Box  motif  is 
normalized to the nucleotide type and the sum of all feature importance at that nucleotide position. 
(D)  Standard  plot  sequence  logo  for  BMAL1  bound  E-box  motifs  in  the  liver83. Adapted  from 
Marri et al75. 

32 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
A 

B 

Figure 9: Analysis of Liver bound E-box motifs to investigate the importance of the nucleotide G 
in the third position of the 5’ flanking sequence. (A) Analysis of the bound E-box motif with their 
upstream and downstream flanking sequence revealed that the nucleotide G at the position third of 
the 5’ flanking sequence is enriched in bound E-box motifs in the liver. 1228 out of 3374 of the 
motifs have nucleotide G at the third position of the 5’ flanking sequence. 48 out of the 1228 are 
palindromes and 16 out of the 48 are the sequence GTCACGTGAC. (B) Percentage of enriched 
flanking sequence nucleotide in the liver E-box motifs. (Orange bar corresponds to unbound E-
box motifs and blue bar corresponds to bound E-box motifs). Adapted from Marri et al75. 

33 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Cross-tissue models highlight differences in BMAL1-DNA binding among tissues. 

To test the hypothesis that the DNA binding of BMAL1 is determined by similar factors across the 

three tissues, we formulated cross-tissue models aimed at predicting binding with features based 

on  -  (a)  sequence  only;  (b)  sequence  plus  DNA  shape;  and  (c)  sequence  plus  DNA  shape  plus 

histone modifications. We trained these models on all data available in the respective tissue, using 

the optimal hyper-parameters previously derived for the respective within-tissue model. Models 

trained on one tissue were used to predict BMAL1 binding in a different tissue. The performance 

of  the  sequence-only  models  trained  on  tissue  X  and  predicting  tissue  Y  (X_Y  model)  was 

comparable to the performance of the within-tissue sequence-only model in tissue X, for all tissues 

(Fig 10A). However, the incorporation of DNA shape and histone modification (HM) features lead 

to a reduction in performance scores across some cross-tissue models in contrast to the sequence-

only  models (Fig 10A-C). Notably, the sequence plus  shape model, trained on liver data, aptly 

classified 22% of the E-boxes bound in both kidney (liver_kidney) and heart (liver_heart) (Fig 

10C).  This  model  predicted  most  of  the  bound  E-boxes  in  the  kidney  and  heart  as  unbound, 

yielding  a  high  false  negative  rate.  The  inclusion  of  histone  modification  features  improved 

AUROC and AUPRC metrics for most cross-tissue models (Fig 10 A-B) . However, the cross-

tissue sequence plus DNA shape plus HM model, trained on liver data, correctly classified only 

18% of E-boxes bound in the kidney and 19% in the heart, manifesting a higher false negative rate 

than the sequence plus shape model. The cross-tissue analysis revealed an intriguing pattern when 

models  trained  on  kidney  and  heart  data  were  evaluated  on  liver  data  (kidney_liver  and 

heart_liver). In these cases, the inclusion of DNA shape and histone modification (HM) features 

led to a significant improvement in performance compared to other model types. Specifically, the 

AUROC score of the kidney_liver model increased from 0.68 for the sequence plus DNA shape 

34 

 
model to 0.83 for the sequence plus DNA shape plus HM model. Furthermore, the AUPRC  score 

exhibited a sharp increase, from 0.055 to 0.38, when HM features were added. In contrast to the 

previous  findings,  where  the  addition  of  DNA  shape  and  HM  features  generally  decreased  the 

performance  of  cross-tissue  models,  the  kidney_liver  and  heart_liver  models  demonstrated  a 

significant  improvement in  predictive power  when these genomic  and epigenetic features were 

incorporated. These results suggest that while sequence-only models performed relatively poorly 

in  cross-tissue binding prediction,  the inclusion  of additional features, such as DNA shape and 

histone modifications, can enhance the predictive capability of cross-tissue models in certain cases. 

However,  the  effectiveness  of  these  features  appears  to  be  tissue-specific,  as  evidenced  by  the 

contrasting results obtained when predicting binding in the liver versus predicting binding in other 

tissues. Overall, the observed tissue specificity of BMAL1 DNA binding highlights the complexity 

of the regulatory mechanisms governing this transcription factor's binding patterns. The interplay 

between  sequence  features,  DNA  shape,  and  epigenetic  modifications,  such  as  histone 

modifications, appears to be context-dependent, with varying degrees of influence across different 

tissue types. These findings underscore the importance of considering tissue-specific factors in the 

development of accurate binding prediction models for transcription factors like BMAL1. 

35 

 
 
 
 
 
 
 
B 

)
C
R
P
U
A
(

c
i
r
t
e
m
e
c
n
a
m
r
o
f
r
e
P

A 

)
C
O
R
U
A
(

c
i
r
t
e
m
e
c
n
a
m
r
o
f
r
e
P

C  

e
v
ti
i
s
o
P
e
u
r
T

Figure 10: Performance metrics for cross-tissue prediction models. Scores for the liver, kidney and 
heart  sequence-only  model  (blue  bars),  sequence  plus  DNA  shape  model  (brown  bars)  and 
sequence plus  DNA shape plus  HM model (green bars):  (A) Area under  the receiver operating 
characteristics (AUROC); (B) Area under the precision recall curve (AUPRC); (C) True positive 
rates. (Notation explanation: liver_kidney refers to the model trained on the liver dataset and used 
to predict binding on the kidney dataset). Adapted from Marri et al75. 

36 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 3: MATHEMATICAL MODELING OF THE SPATIAL AND TEMPORAL 
DYNAMICS OF CIRCADIAN CLOCK IN THE LIVER 

INTRODUCTION 

In the previous chapter, the genetic and epigenetic determinants of BMAL1-DNA binding within 

peripheral  tissues  were  elucidated.  In  this  chapter,  the  focus  will  be  on  investigating  the 

spatiotemporal  expression  patterns  of  circadian  clock  genes  within  the  liver  lobule,  as  well  as 

exploring the factors that contribute to the synchronization of their expression. Within the intricate 

landscape  of  the  brain,  the  suprachiasmatic  nucleus  (SCN)  emerges  as  the  central  orchestrator, 

functioning  as  the  master  pacemaker  of  the  circadian  system.  It  coordinates  and  synchronizes 

cellular, tissue-specific, and systemic rhythms, thereby regulating vital biological processes84,85, 

including the regulation of body temperature, glucose metabolism, sleep-wake patterns, hormone 

secretion, and bone formation86–88. Disruption of the circadian cycle by environmental stimuli has 

been linked to several pathological conditions, including cardiovascular disease, diabetes, bipolar 

disorder, obesity, and cancer39,89–91.  At the molecular level, a complex network of transcriptional, 

translational, and post-translational feedback loops intricately governs the generation of circadian 

rhythms, not only within the SCN but also in peripheral tissues such as the liver10,92. At the core 

of this regulatory network lies a group of transcriptional activators including circadian locomotor 

output cycles kaput (Clock), Neuronal PAS domain protein 2 (Npas2), brain and muscle ARNT 

Like 1 (Bmal1), and the retinoic acid-related orphan receptors (Rora, Rorb, Rorc). These activators 

act  in  concert  with  a  cohort  of  repressors,  notably  the  period  genes  (Per1,  Per2,  Per3),  the 

cryptochrome genes (Cry1, Cry2), and the reverb-clear orphan receptors (Reverbα, Reverbβ), to 

establish the rhythmicity of gene expression75,84. 

As  discussed  in  the  previous  chapter,  the  master  regulator  heterodimer  composed  of  CLOCK-

BMAL1  (or  NPAS2-BMAL1),  binds  to  specific  DNA  motifs  known  as  E-boxes  within  the 

37 

 
 
regulatory  regions  of  target  genes  such  as  Per,  Cry,  Ror,  and  Reverb,  thereby  promoting  their 

transcriptional activation. Subsequently, the PER and CRY proteins form a cytoplasmic complex, 

which  translocates  into  the  nucleus  to  create  a  negative  feedback  loop  by  inhibiting  the 

transcription of their own genes. This intricate interplay also extends to the suppression of Ror and 

Reverb gene transcription. Further complexity arises from the competitive binding of ROR and 

REV-ERB proteins to the ROR regulatory element (RRE) within the promoter region of the Bmal1 

gene. Here, ROR acts as an activator, while REV-ERB serves as a repressor, thus exerting tight 

control  over  the  transcriptional  regulation  of  Bmal1  and  consequently  influencing  the  overall 

circadian rhythm92,93.  

The liver stands as a pivotal peripheral oscillator within the intricate machinery of mammalian 

physiology, serving as a crucial hub for the interplay of anabolic and catabolic processes involving 

lipids  and  amino  acids94.  Structurally,  the  mammalian  liver  is  organized  into  functional  units 

known as "lobules," each comprising predominantly hepatocytes, the primary cellular constituents 

of liver parenchyma. Hepatocytes in the lobule are aligned along a distinct axis extending from the 

portal vein to the central vein, thereby facilitating their categorization based on their proximity to 

either  endpoint  (Fig  11).  Gene  expression  and  resulting  metabolic  functions  exhibit  a  spatial 

gradation  along  the  portal-to-central  axis  of  the  lobule.  Specifically,  processes  such  as 

gluconeogenesis and β-oxidation predominate at the portal end, whereas glycolysis and lipogenesis 

are enriched towards the central region. This spatial organization underscores the liver's intricate 

orchestration of metabolic pathways to ensure optimal nutrient utilization and energy balance95. 

Recent insights from single-cell gene expression studies have unveiled an intriguing aspect of liver 

physiology: the expression of core circadian clock genes manifests in a non-zonated manner across 

hepatocytes along the portal-to-central axis. Unlike metabolic functions, where spatial gradients 

38 

 
are evident, circadian gene expression appears to be uniformly distributed across the lobular axis. 

This  observation  hints  at  a  sophisticated  mechanism  whereby  hepatocytes  synchronize  their 

circadian  rhythms  across  the  entire  liver  tissue,  despite  their  spatial  heterogeneity29.  This 

phenomenon  of  synchronized  circadian  oscillations  across  hepatocytes  has  been  attributed  to  a 

coupling mechanism that harmonizes gene expression among cells along the portal-to-central axis, 

thereby  generating  coordinated  and  synchronized  temporal  rhythms96–98.  While  the  precise 

molecular  machinery  orchestrating  this  synchronization  remains  elusive,  emerging  evidence 

suggests a potential role for transforming growth factor–beta (TGF-β) as a putative coupling factor. 

However,  the  intricacies  of  how  TGF-β  or  other  potential  mediators  precisely  regulate  the 

synchronized oscillations in liver gene expression warrant further investigation98–100. Unlike the 

established neurotransmitter-mediated coupling mechanism in the suprachiasmatic nucleus (SCN) 

regulating central nervous system circadian rhythms101, the mechanisms behind synchronized liver 

gene expression oscillations are fully not understood. Understanding this synchronization would 

offer insights into tissue-level circadian regulation and novel therapeutic strategies for metabolic 

disorders. 

We  developed  a  spatiotemporal  multicellular  mathematical  model  of  the  mammalian  liver 

circadian  clock  regulatory  network9,102–104.  Two  sets  of  models  were  developed  for  the  mouse 

hepatic  clock:  1)  Model  1,  which  assumed  no  communication  among  the  cellular  oscillators, 

leading to non-synchronized gene expression in hepatocytes across the central to portal axis of the 

lobule; and 2) Model 2, which incorporated communication among cells, resulting in synchronized 

gene expression across the lobular axis. The analysis revealed a positive correlation between the 

amplitudes  of  the  observables  (gene  expression  levels  or  protein  levels)  and  their  respective 

transcription rate parameters, while a negative correlation was observed between the amplitudes 

39 

 
and their respective degradation rates. This finding highlights the crucial role of transcription and 

degradation rates in determining the oscillatory amplitudes of the circadian clock components. To 

validate the models, the transcription and degradation rate parameters were estimated from single-

cell RNA-Seq data102. The simulated results exhibited a high  correlation  with  the experimental 

data,  with  an  R2  >  0.9,  indicating  a  strong  agreement  between  the  model  predictions  and  the 

observed gene expression patterns. In summary, this study computationally analyzed asynchronous 

and synchronous spatial and temporal circadian oscillations in the mammalian liver. The models 

revealed  the  dependence  of  the  oscillatory  amplitudes  of  circadian  clock  components  on  their 

respective  transcription  and  degradation  rates.  The  high  correlation  between  simulated  and 

experimental  data  demonstrated  the  robustness  of  the  mathematical  models  in  capturing  the 

intricate  dynamics  of  the  liver  circadian  clock  regulatory  network,  and  the  complex  interplay 

between spatial organization, cellular communication, and circadian rhythms. 

Figure 11: Schematic of a cross-section through the lobule, the fundamental structural unit of the 
mammalian liver, predominantly made up of hepatocytes extending in layers from the portal 
triad to the central vein (generated using https://biorender.com/). 

40 

 
 
 
 
 
 
 
 
 
 
METHODS 

Model design. 

This model was developed to study the communication among single cellular circadian oscillators 

(i.e. hepatocyres) in the liver lobule, or intercellular “coupling”. This coupling ensures a coherent 

output  in  the  suprachiasmatic  nuclei  (SCN)  and  other  peripheral  tissues.  In  our  model,  we 

assembled the key interactions within the clock gene network from the literature. Five known clock 

genes  were  used  in  developing  this  model.  For  simplicity,  we  do  not  distinguish  between  the 

multiple isoforms in these genes. For example, Per1, Per2 and Per3 are represented by a single Per 

gene  as  a  model  variable.  The  same  reasoning  applies  to  the  proteins  and  respective  protein 

complexes. The gene network used in this model consist of the interactions among (Per, Cry, Ror, 

Rev-erb and Bmal1). Per, Cry, and Rev-erb act as transcriptional repressors and Ror, and Bmal1 

as activators. The interactions among these genes were then transformed to a wiring diagram that 

consist of two main dependent feedback loops: the Per-Cry negative feedback loop and the Ror, 

Rev-erb, Bmal1positive feedback loop. 

The central component of the model, the CLOCK_BMAL transcription factor complex binds to 

the  promoter  regions  of  the  clock  genes  (Per,  Cry,  Ror,  Rev-erb)  activating  their  transcription. 

These various mRNAs are translated into the respective proteins in the cytosol. The RORc and 

REV-ERBc protein in the cytosol are phosphorylated reversibly and unphosphorylated RORc and 

REV-ERBc are transported to the nucleus. In the nucleus, RORn binds to the promoter region of 

the  clock  gene  Bmal1  activating  its  transcription. The  Bmal1  mRNA  is  translated  to  BMAL1c 

protein  in  the  cytosol  which  is  phosphorylated  reversibly.  Unphosphorylated  BMAL1c  is 

transported  in  the  nucleus  where  it  forms  a  reversible  heterodimer  with  the  CLOCK  protein 

denoted as CLOCK_BMAL. The nuclear REV-ERBn protein binds to the promoter region of the 

41 

 
Bmal1 gene to repress the activity of the nuclear RORn protein.The PER and CRY proteins in the 

cytosol  are  phosphorylated  reversibly.  Unphosphorylated  PER  and  CRY  protein  in  the  cytosol 

form the reversible PER-CRY heterodimer which is then transported into the nucleus. PER-CRY 

represses the transcription of their own genes (Per and Cry) by binding with the CLOCK_BMAL 

protein. This association represses the transcriptional activities of all the genes that are activated 

by  CLOCK_BMAL. The  dissociation  of  CLOCK_BMAL  from  the  PER-CRY  complex  allows 

CLOCK_BMAL to bind to the promoter regions of clock genes (Per, Cry, Ror, Rev-erb) starting 

the process all over again. 

Two sets of models were developed in this study. (1) Model without communication between the 

cellular oscillators leading to non-synchronized gene expression across the central to portal axis 

of the liver lobule and (2) model with communication between cells leading to synchronized gene 

expression across the lobule. 

Model 1: Model without communication between the cellular oscillators. 

•  The first five ordinary differential equations (ODEs) represent the transcription of Per, 

Cry, Bmal1, Ror & Rev-erb genes into their respective mRNA’s denoted by 

𝑀𝑃, 𝑀𝐶, 𝑀𝐵, 𝑀𝑅𝑜 & 𝑀𝑅𝑒 respectively by the nuclear CLOCK_BMAL protein (𝐶𝐵𝑛), 

PER_CRY protein (𝑃𝐶𝑛), ROR protein (𝑅𝑜𝑛) and REV_ERB protein (𝑅𝑒𝑛). 

𝑑𝑀𝑃
𝑑𝑡

=

𝐾𝑖𝑝𝑛 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑚 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑛 − 𝑑𝑝 ∗ 𝑀𝑃 

𝑉𝑠𝑝 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑛

𝑑𝑀𝐶
𝑑𝑡

=

𝑙𝑖𝑔ℎ𝑡 ∗ 𝑉𝑠𝑐 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑜

𝐾𝑖𝑐𝑜 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑝 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑜 − 𝑑𝑐 ∗ 𝑀𝐶 

𝑑𝑀𝐵
𝑑𝑡

=

𝑉𝑠𝑏 ∗ (𝑅𝑂𝑅)𝑝

𝐾𝑖𝑏𝑝 + (𝑅𝑂𝑅 ∗ 𝑅𝐸𝑉_𝐸𝑅𝐵)𝑞 + (𝑅𝑂𝑅)𝑝 − 𝑑𝑏 ∗ 𝑀𝐵 

42 

 
𝑑𝑀𝑅
𝑑𝑡

=

𝐾𝑖𝑟𝑟 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑠 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑟 − 𝑑𝑟 ∗ 𝑀𝑅 

𝑉𝑠𝑟 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑟

𝑑𝑀𝑅𝑒
𝑑𝑡

=

𝐾𝑖𝑟𝑒𝑡 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑢 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑡 − 𝑑𝑟𝑒 ∗ 𝑀𝑅𝑒 

𝑉𝑠𝑟𝑒 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑡

•  The next six ODEs represent the translation and reversible activities of the 

unphophorylated PER, CRY, BMAL1, REV-ERB, ROR protein and PER-CRY 

protein complex in the cytoplasm denoted by 𝑃𝑐, 𝐶𝑐, 𝐵𝑐,  𝑅𝑒𝑐 , 𝑅𝑜𝑐 & 𝑃𝐶𝑐 

respectively. 

𝑑𝑃𝑐
𝑑𝑡

𝑑𝐶𝑐
𝑑𝑡

𝑑𝐵𝑐
𝑑𝑡

= 𝑘  ∗  𝑃𝑒𝑟_𝑚𝑅𝑁𝐴  +  𝐾𝑝𝑐1 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐𝑜 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) − 𝐾𝑝𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))   +
𝐾𝑝𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) − 𝑑𝑝𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝑘1  ∗  𝐶𝑟𝑦_𝑚𝑅𝑁𝐴  +  𝐾𝑝𝑐1 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −   𝐾𝑝𝑐𝑜 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑐𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)) 
+𝐾𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) − 𝑑𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝑘2  ∗  𝐵𝑚𝑎𝑙1_𝑚𝑅𝑁𝐴  −  𝐾𝑏𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑏𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))

+𝐾𝑏𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝑑𝑏𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑜𝑐
𝑑𝑡

= 𝑘3  ∗  𝑅𝑜𝑟_𝑚𝑅𝑁𝐴  −  𝐾𝑟𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑟𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))

+𝐾𝑟𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝑑𝑟𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑒𝑐
𝑑𝑡

= 𝑘4  ∗  𝑅𝑒𝑣 𝑒𝑟𝑏 𝑚𝑅𝑁𝐴  −  𝐾𝑟𝑒𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   − 

𝐾𝑟𝑒𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)) + 𝐾𝑟𝑒𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝑑𝑟𝑒𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑃𝐶𝑐
𝑑𝑡

= 𝐾𝑝𝑐𝑜  ∗   ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))

−𝐾𝑝𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐1 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝐾𝑝𝑐𝑝 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   +  𝐾𝑝𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝑑𝑝𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

43 

 
 
 
 
 
 
 
•  The next six ODEs  represent the reversible activities of the phophorylated PER, 

CRY, BMAL1, REV-ERB, ROR protein and PER-CRY protein complex in the 

cytoplasm denoted by 𝑃𝑝𝑐, 𝐶𝑝𝑐, 𝐵𝑝𝑐,  𝑅𝑒𝑝𝑐 , 𝑅𝑜𝑝𝑐 & 𝑃𝐶𝑝𝑐 respectively. 

𝑑𝑃𝑝𝑐
𝑑𝑡

𝑑𝐶𝑝𝑐
𝑑𝑡

= 𝐾𝑝𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑝𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝐾𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝐵𝑝𝑐
𝑑𝑡

= 𝐾𝑏𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑏𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑏𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑜𝑝𝑐
𝑑𝑡

= 𝐾𝑟𝑐  ∗   ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)) − 𝐾𝑟𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑒𝑝𝑐
𝑑𝑡

𝑑𝑃𝐶𝑝𝑐
𝑑𝑡

= 𝐾𝑟𝑒𝑐  ∗   ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))   −  𝐾𝑟𝑒𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑒𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝐾𝑝𝑐𝑝  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑝𝑐𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

•  The last six ODEs represent the activites of BMAL1, ROR, REV-ERB, CLOCK-BMAL, 

PER-CRY, and PER-CRY/CLOCK-BMAL proteins in the nucleus denoted by 

𝐵𝑛 , 𝑅𝑜𝑛,  𝑅𝑒𝑛,  𝐶𝐵𝑛,  𝑃𝐶𝑛,  𝑃𝐶/𝐶𝐵𝑛,  respectively. 

𝑑𝐵𝑛
𝑑𝑡

= 𝐾𝑏𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑏𝑛.∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑜𝑛
𝑑𝑡

= 𝐾𝑟𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑟𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

44 

 
 
 
 
 
 
 
 
 
𝑑𝑅𝑒𝑛
𝑑𝑡

= 𝐾𝑟𝑒𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑟𝑒𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑒𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁);

𝑑𝐶𝐵𝑛
𝑑𝑡

= 𝐾𝑐𝑙𝑏𝑛  ∗   (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝐾𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
+ 𝐾𝑑𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
− 𝑑𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   +  𝑑𝑝𝑐𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑃𝐶𝑛
𝑑𝑡

= 𝐾𝑝𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝐾𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
+ 𝐾𝑑𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
− 𝑑𝑝𝑐𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   +  𝑑𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑃𝐶/𝐶𝐵𝑛
𝑑𝑡

= 𝐾𝑐𝑏𝑝𝑐  ∗   (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

− 𝐾𝑑𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
− 𝑑𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝑑𝑝𝑐𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

Model 2: Model with communication between the cellular oscillators. 

To incorporate the synchronicity of cells in the second model, we assume a global coupling among 

N cells through cell-to-cell communication. Using a parameter K_f (where K_f is the sensitivity 

of  individual  cell  oscillator  to  another  cell  oscillator,  i.e.  coupling  strength)  to  describe  the 

communication strength, we calculated a second parameter F, which is the mean communication 

signal among N cells in the system. This approach was initially used in a three-variable model by 

Gonze et al105  We assume the cell-to-cell communication leading to synchronization is mediated 

by a coupling protein. Previous studies have assumed Transforming growth factor beta (TGF-β) to 

be  the  coupling  factor  responsible  for  synchronization  of  the  cells  to  produce  a  sustained 

oscillation98 

45 

 
 
 
 
 
•  The time-evolution for a single synchronization factor (M, which will later be used to 

calculate the mean field F) in the cellular medium mediated by a coupling protein is 

given as: 

𝑑𝑀

𝑑𝑡

= 𝑝𝑟𝑡 ∗ 𝑋𝑖 −  

𝑣𝑐∗𝑀

𝑑𝑐+𝑀 

Where i = 1...5 represents each of the observable genes used in the model. 𝑋1 for Per, 𝑋2 for Cry, 

𝑋3 forBmal1, 𝑋4 for Ror, and 𝑋5 for Rev-erb. There are delays with respect to the clock genes Xi 

due  to  transcription,  translation,  phosphorylation,  diffusion,  etc.  Assuming  a  linear  production 

equation  of  the  synchronization  factor  by  the  clock  genes  and  a  nonlinear  diffusion  rate,  prt 

represents synchronization protein concentration, vc represents Maximum rate of synchronization 

factor synthesis and dc represents Activation constant for enhancement of synchronization factor 

synthesis. 

The mean field F was then calculated as: 

𝐹 =

1
𝑁

𝑁
∗ ∑ 𝑀𝑖
𝑖=1

•  The first five ODEs of this model represents the transcription of Per, Cry, Bmal1, Ror & 

Rev-erb genes into their respective mRNA’s denoted by 𝑀𝑃, 𝑀𝐶, 𝑀𝐵, 𝑀𝑅𝑜 & 𝑀𝑅𝑒 

induced by the nuclear proteins CLOCK_BMAL (𝐶𝐵𝑛), PER_CRY protein (𝑃𝐶𝑛), ROR 

protein (𝑅𝑜𝑛) and REV_ERB (𝑅𝑒𝑛). 

𝑑𝑀𝑃
𝑑𝑡

=

𝑉𝑠𝑝 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑛
𝐾𝑖𝑝𝑛 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑚 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑛 +

𝑉𝑑𝑝 ∗ 𝐾𝑓 ∗ 𝐹
𝐾𝑐 + 𝐾𝑓 ∗ 𝐹

− 𝑑𝑝 ∗ 𝑀𝑃 

𝑑𝑀𝐶
𝑑𝑡

=

𝑙𝑖𝑔ℎ𝑡 ∗ 𝑉𝑠𝑐 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑜
𝐾𝑖𝑐𝑜 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑝 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑜 +

𝑉𝑑𝑝 ∗ 𝐾𝑓 ∗ 𝐹
𝐾𝑐 + 𝐾𝑓 ∗ 𝐹

− 𝑑𝑐 ∗ 𝑀𝐶 

46 

 
 
 
 
𝑑𝑀𝐵
𝑑𝑡

=

𝑉𝑠𝑏 ∗ (𝑅𝑂𝑅)𝑝
𝐾𝑖𝑏𝑝 + (𝑅𝑂𝑅 ∗ 𝑅𝐸𝑉_𝐸𝑅𝐵)𝑞 + (𝑅𝑂𝑅)𝑝 +

𝑉𝑑𝑝 ∗ 𝐾𝑓 ∗ 𝐹
𝐾𝑐 + 𝐾𝑓 ∗ 𝐹

− 𝑑𝑏 ∗ 𝑀𝐵 

𝑑𝑀𝑅
𝑑𝑡

=

𝑉𝑠𝑟 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑟
𝐾𝑖𝑟𝑟 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑠 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑟 +

𝑉𝑑𝑝 ∗ 𝐾𝑓 ∗ 𝐹
𝐾𝑐 + 𝐾𝑓 ∗ 𝐹

− 𝑑𝑟 ∗ 𝑀𝑅 

𝑑𝑀𝑅𝑒
𝑑𝑡

=

𝑉𝑠𝑟𝑒 ∗ (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑡
𝐾𝑖𝑟𝑒𝑡 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1 ∗ 𝑃𝐸𝑅_𝐶𝑅𝑌)𝑢 + (𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1)𝑡 +

𝑉𝑑𝑝 ∗ 𝐾𝑓 ∗ 𝐹
𝐾𝑐 + 𝐾𝑓 ∗ 𝐹

− 𝑑𝑟𝑒 ∗ 𝑀𝑅𝑒 

•  The next six ODEs represent the translation and reversible activities of the 

unphosphorylated PER, CRY, BMAL1, REV-ERB, ROR protein and PER-CRY 

protein complex in the cytoplasm denoted by 𝑃𝑐, 𝐶𝑐, 𝐵𝑐,  𝑅𝑒𝑐 , 𝑅𝑜𝑐 & 𝑃𝐶𝑐 

respectively 

𝑑𝑃𝑐
𝑑𝑡

𝑑𝐶𝑐
𝑑𝑡

𝑑𝐵𝑐
𝑑𝑡

= 𝑘  ∗  𝑃𝑒𝑟_𝑚𝑅𝑁𝐴  +  𝐾𝑝𝑐1 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐𝑜 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) − 𝐾𝑝𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))   +
𝐾𝑝𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) − 𝑑𝑝𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝑘1  ∗  𝐶𝑟𝑦_𝑚𝑅𝑁𝐴  +  𝐾𝑝𝑐1 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐𝑜 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑐𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)) 
+𝐾𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) − 𝑑𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝑘2  ∗  𝐵𝑚𝑎𝑙1_𝑚𝑅𝑁𝐴  −  𝐾𝑏𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑏𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))

+𝐾𝑏𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝑑𝑏𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑜𝑐
𝑑𝑡

= 𝑘3  ∗  𝑅𝑜𝑟_𝑚𝑅𝑁𝐴  −  𝐾𝑟𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑟𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))

+𝐾𝑟𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝑑𝑟𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑒𝑐
𝑑𝑡

= 𝑘4  ∗  𝑅𝑒𝑣 𝑒𝑟𝑏 𝑚𝑅𝑁𝐴  −  𝐾𝑟𝑒𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   − 

𝐾𝑟𝑒𝑐 ∗ ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)) + 𝐾𝑟𝑒𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝑑𝑟𝑒𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

47 

 
 
 
 
 
 
𝑑𝑃𝐶𝑐
𝑑𝑡

= 𝐾𝑝𝑐𝑜  ∗   ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))

−𝐾𝑝𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐1 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝐾𝑝𝑐𝑝 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   +  𝐾𝑝𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝑑𝑝𝑐𝑐 ∗ (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

•  The next six ODEs represent the reversible activities of the phosphorylated PER, 

CRY, BMAL1, REV-ERB, ROR protein and PER-CRY protein complex in the 

cytoplasm denoted by 𝑃𝑝𝑐, 𝐶𝑝𝑐, 𝐵𝑝𝑐,  𝑅𝑒𝑝𝑐 , 𝑅𝑜𝑝𝑐 & 𝑃𝐶𝑝𝑐 respectively. 

𝑑𝑃𝑝𝑐
𝑑𝑡

𝑑𝐶𝑝𝑐
𝑑𝑡

= 𝐾𝑝𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑝𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝐾𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝐵𝑝𝑐
𝑑𝑡

= 𝐾𝑏𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑏𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑏𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑜𝑝𝑐
𝑑𝑡

= 𝐾𝑟𝑐  ∗   ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)) − 𝐾𝑟𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑒𝑝𝑐
𝑑𝑡

𝑑𝑃𝐶𝑝𝑐
𝑑𝑡

= 𝐾𝑟𝑒𝑐  ∗   ((𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁))   −  𝐾𝑟𝑒𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑒𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

= 𝐾𝑝𝑐𝑝  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑝𝑐𝑝𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑝𝑐𝑐 ∗ (𝑃𝐻𝑂𝑆_𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

•  The last six ODEs represent the activites of BMAL1, ROR, REV-ERB, CLOCK-BMAL, 

PER-CRY, and PER-CRY/CLOCK-BMAL protein in the nucleus denoted by 

𝐵𝑛 , 𝑅𝑜𝑛,  𝑅𝑒𝑛,  𝐶𝐵𝑛,  𝑃𝐶𝑛,  𝑃𝐶/𝐶𝐵𝑛,  respectively. 

48 

 
 
 
 
 
 
 
 
 
𝑑𝐵𝑛
𝑑𝑡

= 𝐾𝑏𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑏𝑛.∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑜𝑛
𝑑𝑡

= 𝐾𝑟𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑟𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝑂𝑅_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑅𝑒𝑛
𝑑𝑡

= 𝐾𝑟𝑒𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   −  𝐾𝑟𝑒𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝑑𝑟𝑒𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑅𝐸𝑉_𝐸𝑅𝐵_𝑃𝑅𝑂𝑇𝐸𝐼𝑁);

𝑑𝐶𝐵𝑛
𝑑𝑡

= 𝐾𝑐𝑙𝑏𝑛  ∗   (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝐾𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
+ 𝐾𝑑𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
− 𝑑𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   +  𝑑𝑝𝑐𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑃𝐶𝑛
𝑑𝑡

= 𝐾𝑝𝑐𝑐  ∗   (𝐶𝑌𝑇𝑂𝑆𝑂𝐿𝐼𝐶_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

−𝐾𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
+ 𝐾𝑑𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
− 𝑑𝑝𝑐𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)   +  𝑑𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

𝑑𝑃𝐶/𝐶𝐵𝑛
𝑑𝑡

= 𝐾𝑐𝑏𝑝𝑐  ∗   (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝑅𝑂𝑇𝐸𝐼𝑁) ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

− 𝐾𝑑𝑐𝑏𝑝𝑐 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
− 𝑑𝑐𝑙𝑏𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)
−𝑑𝑝𝑐𝑛 ∗ (𝑁𝑈𝐶𝐿𝐸𝐴𝑅_𝐶𝐿𝑂𝐶𝐾_𝐵𝑀𝐴𝐿1_𝑃𝐸𝑅_𝐶𝑅𝑌_𝑃𝑅𝑂𝑇𝐸𝐼𝑁)

In  the  proposed  model,  various  parameters  were  incorporated  to  capture  the  dynamics  of  the 

system  under  investigation.  These  parameters,  along  with  their  definitions  and  values,  are 

presented in the table below. To account for potential variations and uncertainties, three distinct 

sets of parameter values were defined, with parameter set 3 serving as the nominal set (wild type 

parameter; WT) employed in the model simulations.The simulation from parameter set 3 revealed 

the  experimental  peak  expression  for  the  circadian  clock  genes  in  the  model.  Also,  parameter 

estimation analysis from figure 8 was used to confirm the accurate reflection of parameter set 3 

(WT)  to  true  biology.  The  table  lists  the  parameters,  their  descriptions,  and  the  corresponding 

values for each of the three parameter sets. 

49 

 
 
 
 
 
 
 
Table 2: Model parameters, definition and values. 

50 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
RESULTS 

Circadian clock mechanism – model design. 

The development of the model involved identifying and compiling the key regulatory interactions 

in the mammalian circadian clock gene network from scientific literature9,103. The model postulates 

a reversible phosphorylation mechanism for all cytosolic circadian clock proteins. This inclusion 

is supported by the pivotal role phosphorylation plays in orchestrating circadian regulation. The 

reversible  nature  of  phosphorylation  allows  for  dynamic  fluctuations  in  protein  activity,  thus 

contributing to the intricate regulation of circadian processes. This hypothesis is based on evidence 

that phosphorylation of key clock proteins (e.g., PER, CRY, BMAL1) plays a regulatory role in 

the  circadian  system  by  altering  their  stability,  localization,  protein-protein  interactions,  and 

transcriptional activity over the circadian cycle. The reversibility of phosphorylation allows for 

dynamic changes in protein activity and regulation of circadian processes104,106.  The RORc and 

REV-ERBc  proteins  (where  the  subscript 

'c'  stands  for 

'cytosolic')  undergo  reversible 

phosphorylation, after which they are transported to the nucleus. In the nucleus, RORn and REV-

ERBn  (the  subscript  'n'  stands  for  'nuclear')  bind  to  the  promoter  region  of  the  Bmal1  gene  to 

regulate  its  transcription.  Specifically,  RORn  acts  as  an  activator,  while  REV-ERBn  serves  as  a 

repressor in this context. Similarly, the Bmal1 gene is translated into the BMAL1c protein in the 

cytosol, which subsequently undergoes reversible phosphorylation. BMAL1c is then transported 

to  the  nucleus,  where  it  forms  a  reversible  heterodimer  with  the  CLOCK  protein,  generating  a 

protein  complex  termed  "CLOCK/BMAL1".  In  parallel,  the  PER  and  CRY  proteins  undergo 

reversible  phosphorylation  within  the  cytosol.  Upon  dephosphorylation,  the  unphosphorylated 

PER and CRY proteins assemble into the reversible PER-CRYc heterodimer, which then undergoes 

nuclear translocation. Within  the nucleus, the PER-CRYn complex exerts  its regulatory role by 

51 

 
repressing  the  transcription  of  key  clock  genes,  including  Per,  Cry,  Rev-erb,  and  Ror.  This 

repression  is  achieved  by  binding  to  the  CLOCK_BMAL1  complex,  thereby  dampening  its 

activity. The dissociation of the transcription factor PER-CRY from the promoters of these genes 

permits CLOCK_BMAL1 to activate the transcription of Per, Cry, Ror, and Rev-erb, thus initiating 

a  new  cycle.  This  cyclical  regulatory  process  forms  the  basis  of  the  circadian  rhythm.  These 

interactions are summarized in a wiring diagram consisting of two main coupled feedback loops: 

a  positive  loop  involving  the  activation  of  Bmal1  transcription  by  RORn,  and  a  negative  loop 

involving the repression of Per, Cry, Ror, and Rev-erb transcription by PER-CRYn (Fig 12). 

A 

B 

Figure 12: Modeling framework and schematic network diagram. (A) Approach to construction of 
a deterministic model of the mammalian circadian clock, and data-driven sensitivity analysis and 
parameter  estimation.  (B)  Schematic  diagram  of  the  principal  positive  and  negative  regulatory 
feedback loops in the mammalian clock network. The core negative feedback loop is formed by 
PER:CRY  heterodimers  that  repress  their  own  transcription. Another  regulatory  (positive  and 
negative feedback) loop is formed by Ror and Rev-erb competing for binding to ROR/REV-ERB-
response element (RORE) to regulate Bmal1. Green lines indicate activation by the transcription 
factors CLOCK-BMAL1 and RORn, red lines indicate repression of genes by transcription factors,  
and  black  lines  indicate  the  translation  and  translocation  of  mRNAs  and  proteins. White  ovals 
represent genes, gray ovals proteins in the nucleus and orange ovals proteins in the cytosol. 

52 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
Multicellular spatiotemporal model of the core clock genes describes coupling among cells. 

Gene  regulatory  networks  can  be  mathematically  described  by  a  set  of  coupled  ordinary 

differential equations (ODEs), where molecular interactions, such as transcription factor binding 

and gene regulation, are represented by nonlinear functions like Michaelis-Menten or higher-order 

Hill  functions.  These  nonlinear  functions  capture  the  cooperative  or  saturable  nature  of  these 

interactions, providing a more accurate representation of the underlying biological processes107. In 

this study, the gene regulatory network depicted in Fig 12 B was translated into two deterministic 

spatiotemporal models, as detailed in the Method section. One model operates in isolation, devoid 

of  intercellular  communication  (coupling),  while  the  other  incorporates  intercellular  coupling 

mechanisms. In both models, I employed Michaelis-Menten and Hill function approximations to 

describe the interactions between transcription factors and genes. Additionally,  I assumed mass 

action  and  linear  kinetics  to  describe  processes  such  as  degradation,  translation,  and  complex 

formation.  These  mathematical  representations  provide  a  robust  framework  for  simulating  the 

dynamic behavior of the circadian clock gene network. In the absence of synchronization signals 

that enable cell-cell coupling, autonomous cells in the suprachiasmatic nucleus or other peripheral 

tissues  like  the  liver  would  oscillate  with  different  periods,  amplitudes,  and  phases.  This 

desynchronization  is  due  to  intrinsic  noise  and  variability  in  the  gene  regulatory  networks  of 

individual cells. However, intercellular communication induced by extrinsic or intrinsic signals 

ensures synchrony in the oscillations and resulting biological functions across the cell population. 

This  coupling  mechanism,  mediated  by  signaling  molecules  or  direct  cell-cell  interactions, 

promotes coordination and coherence in the circadian rhythms and associated processes98,103,105. 

The  two  models  developed  in  this  study  were  specifically  designed  to  investigate  the 

spatiotemporal  expression  patterns  of  circadian  clock  genes  in  the  liver.  By  incorporating 

53 

 
intercellular  communication  in  one  of  the  models,  I  aimed  to  explore  the  role  of  coupling 

mechanisms  in  synchronizing  the  circadian  rhythms  across  hepatocytes  (liver  cells).  I  used  a 

comprehensive  dataset  comprised  of  single-cell  RNA  sequencing  (scRNA-seq)  and  single 

molecule fluorescence in situ hybridization (smFISH) data , which previously demonstrated that 

the expression of core circadian clock genes  in  the liver lobule is non-zonated due to  coupling 

among  autonomous  cell  oscillators29.  To  dissect  the  regulatory  intricacies  underlying  these 

observations, 1 constructed an uncoupled model by translating the wiring diagram depicted in Fig 

12  B  into  a  system  of  23  differential  equations.  These  equations  encapsulated  the  dynamic 

processes of gene transcription and translation, protein complex formation, phosphorylation, and 

inhibition, thus providing a mechanistic framework to explore the underlying regulatory dynamics. 

This model was applied to an assembly of over 435 cells spatially arranged to mimic the geometry 

of  a  liver  lobule  (the  lobule  geometry  was  described  in  a  piff  file  for  use  in  the  Compucell3D 

simulations). To  capture  potential  heterogeneity  across  cells  in  transcription  rates,  which  could 

arise from desynchronization in gene expression, I assumed a Gaussian distribution to randomize 

the transcription rate parameters for every gene in each individual cell. This stochastic modeling 

approach  aimed  to  incorporate  the  inherent  variability  and  noise  present  in  gene  expression 

dynamics across the cellular population within the liver lobule microenvironment. The spatial and 

temporal  expression patterns for each gene were then simulated in  the Compucell3D modeling 

environment, a powerful platform for simulating and visualizing multicellular systems and their 

dynamics108–110.   

The simulations of the uncoupled model revealed that the mean expression of both Per and Bmal1 

genes at timepoints 6, 12, 18, and 24 hours was not zonated across the portal-central axis of the 

liver lobule (Fig 13 A). This finding confirms results from previous studies, which observed a non-

54 

 
zonated expression pattern for core circadian clock genes in the liver lobule111. In the uncoupled 

model, the temporal profiles of Per and Bmal1 gene expression exhibited distinctive autonomous 

oscillations  spanning  the  lobule,  characterized  by  periods  ranging  from  21  to  28  hours  and 

amplitudes varying between 2.8 to 4.7 (arbitrary units, a.u) for Per, and periods ranging from 21 

to 28 hours and amplitudes spanning from 0.8 to 3.2 a.u for Bmal1 (Fig 13 B, E and F). These 

autonomous oscillations reflect the desynchronization and variability among individual cells in the 

absence of coupling mechanisms. 

To explore the impact of intercellular communication on circadian synchronization, in the coupled 

model,  I  assume  that  synchronization  of  autonomous  cells  across  the  liver  lobule  is  achieved 

through  communication  mediated  by  a  putative  coupling  ligand  between  each  cell  and  its 

neighbors. The transmission of this coupling ligand across the liver lobule effectively synchronizes 

the  period  of  the  oscillations.  I  hypothesized  that  global  synchronization  across  the  lobule  is 

achieved  through  the  average  concentration  of  the  coupling  ligand  affecting  clock  genes  via 

receptor molecules,  as previously suggested98,105. Data from Finger et  al98. revealed a potential 

coupling factor: the transforming growth factor-beta (TGF-β), which activates early transcription 

factors  to  control  the  molecular  clock  machinery,  leading  to  a  significant  upregulation  of  Per2 

mRNA levels after 2-4 hours. This finding supports the role of TGF-β as a potential coupling ligand 

in  synchronizing  the  circadian  clock  across  the  liver  lobule103,112.  The  spatial  and  temporal 

expression  for  each  gene  in  the  model  was  then  simulated  in  Compucell3d,  showing  coupling 

across the portal-central axis (Fig 13 C). The temporal profiles of Per and Bmal1 gene expression 

exhibited synchronized oscillations, characterized by a period of 24-25 hours and an amplitude of 

5.2 for Per, and a period of 23-25 hours with an amplitude of 2.2 for Bmal1 (Fig 13 D, E, and F). 

These results demonstrate the impact of intercellular communication and coupling mechanisms on 

55 

 
synchronization of circadian rhythms across the liver lobule. While the uncoupled model showed 

desynchronized  oscillations  with  varying  periods  and  amplitudes,  the  coupled  model  exhibited 

synchronized oscillations with consistent periods and amplitudes, reflecting the role of coupling 

factors in coordinating the circadian clock across the cell population.  

B 

Figure 13: Multicellular spatiotemporal model simulations. Expression across the liver lobule of 
(A) Period gene (Per) and Brain and Muscle ARNT-Like 1 (Bmal1) without coupling at circadian 
times ZT 6, 12, 18, and 24 hours. The central- portal lobule display a non-zonated circadian clock 
genes expression as reported in Droin et al29. (B) Limit cycle oscillations of the Per and Bmal1 
genes for 435 cells without coupling with varying period and amplitude values for Per (period: 21-
28 hours, amplitude: 2.8-4.7) and Bmal1 (period: 21-28 hours, amplitude: 0.8-3.2).  

56 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 13 (cont’d) 
(C) The spatial expression of Per, and Bmal1 genes for 435 cells with coupling at circadian time 
ZT 6 12, 18 and 24 hours across the liver lobule. The expression profile shows a synchrony across 
all time points. (D) Limit cycle oscillations of the Per and Bmal1 genes with coupling producing 
synchronized period and amplitude for Per (period: 24-25 hours, amplitude: 5.0-5.2) and Bmal1 
(period: 23-25 hours, amplitude: 2.0-2.2). The color bar shows the expression values of genes in 
each cell across the central (CV) and portal (PV) in the liver lobule. Each dot represents the time-
dependent  expression  of  a  gene  per  cell.  (E)  The  distribution  of  uncoupled  Per  and  Bmal1 
oscillation periods for over 400 cells in the model lobule. The period ranges from 21 hours to 28 
hours. (F) The distribution of coupled Per and Bmal1 oscillation periods for over 400 cells in the 
model. The period ranges between 24 hours to 25 hours for Per oscillations and 23 hours to 25 
hours for Bmal1 oscillations. 

C 

D 

57 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 13 (cont’d) 

E 

F 

Sensitivity analysis 

To investigate the sensitivity of the model dynamics to variations in key parameters, I performed 

a  parameter  sensitivity  analysis  using  the  AMIGO2  toolbox  in  MATLAB113,114.  The  model 

dynamics analyzed included the period of oscillation, amplitude of oscillation, phase shift, and the 

average amount of mRNA produced. Specifically, the sensitivity analysis focused on the effects of 

transcription and degradation rate parameters on these model dynamics. I used the Latin hypercube 

sampling  (LHS)  method,  which  is  known  for  providing  more  precise  estimates  compared  to 

58 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
random sampling methods115. The LHS technique is used to divide the tested range of each variable 

into  intervals,  from  which  sample  points  are  selected  to  ensure  an  efficient  exploration  of  the 

parameter space. 

The LHS method involves the following steps: 

•  Defining the ranges of the parameters to be tested: For each transcription and degradation 

rate parameter, I specified a biologically relevant range within which the parameter can 

vary. The ranges for each parameter were determined from literature values. 

•  Dividing the parameter ranges into intervals: The LHS method divides the range of each 

parameter into equal intervals, ensuring that the entire range is adequately sampled. 

•  Sampling from the intervals: The LHS algorithm selects sample points from each interval 

in a stratified manner, ensuring that the entire parameter space is covered without clustering 

or gaps. 

•  Running  simulations:  For  each  set  of  sampled  parameter  values,  I  ran  the  model 

simulations and recorded the respective values of the model dynamics (period, amplitude, 

phase, and average mRNA levels). 

•  Sensitivity  analysis:  The  sensitivity  of  each  model  dynamic  to  variations  in  the 

transcription and degradation rate parameters was quantified by analyzing the relationship 

between  the  sampled  parameter  values  and  the  corresponding  model  dynamics. Various 

statistical measures, such as partial rank correlation coefficients (PRCC) or standardized 

regression coefficients, can be used to assess the sensitivity. 

By  employing  the  LHS  method  and  systematically  exploring  the  parameter  space,  I  aimed  to 

identify the most influential parameters affecting the circadian clock dynamics and quantify their 

59 

 
relative impacts. This sensitivity analysis provided valuable insights into the robustness and critical 

control points of the model, as well as potential targets for experimental validation or therapeutic 

interventions. 

The transcription and degradation rate parameters of the Reverb gene respectively show 
negative and positive correlation with the average expression level of Bmal1. 

I investigated the effect of the transcription and degradation rate parameters on average expression 

of  the  observables  (mRNA  levels  of  clock  genes)  for  a  complete  circadian  cycle  (24  hrs)  after 

reaching  a  stable  oscillation. A  10%  increase  in  the  transcription  rate  parameter  of  each  gene 

demonstrated  a  direct  correlation  with  the  corresponding  mRNA  level.  This  phenomenon  was 

evident  across  the  entire  network,  as  elevating  the  transcription  rate  parameter  resulted  in  a 

proportional  rise  in  the  average  expression  levels  of  the  associated  observables  (Fig  14  B). 

Moreover, the intricate interplay of positive and  negative feedback  regulations within the gene 

network  was  reflected  in  the  sensitivity  analysis  of  transcription  and  degradation  rates.  For 

instance, an increase in the transcription rate parameter of the Cry gene (vCs) led to a decrease in 

the  average  expression  of  Per  mRNA  due  to  the  delayed  inhibition  imposed  by  PER-CRYn. 

Similarly, elevating the transcription rate parameter of Bmal1 (vBs) resulted in a decrease in the 

average  expression  of  Rev-erb  and  an  increase  in  the  average  expression  of  Per  and  Cry, 

attributable to the positive feedback loop involving Bmal1, Per, and Cry, alongside the negative 

feedback loop involving Bmal1 and Rev-erb (Fig 14 A-B).   

Conversely, a 10% increase in  the degradation rate parameters  exhibited an inverse correlation 

with their corresponding observables, causing a decline in the average expression levels of their 

associated  genes  (Fig  14  C).  Notably,  an  increase  in  the  degradation  rate  parameter  of  Bmal1 

(d_mB)  led  to  an  increase  in  the  expression  level  of  the  Rev-erb  gene,  owing  to  the  negative 

60 

 
 
feedback  loop  involving  Bmal1,  PER-CRY,  and  Rev-erb  genes.  Similar  relationships  between 

degradation rate parameters and observables were elucidated across the network (Fig 14 C). 

This analysis revealed that the clock gene network demonstrated greater sensitivity to changes in 

transcription  rates  compared  to  degradation  rates.  Notably,  among  all  genes  analyzed,  Bmal1 

exhibited  the  highest  sensitivity  to  degradation  rates  across  the  entire  network,  despite  being 

directly regulated solely by Ror and Rev-erb. In summary, our study sheds light on the intricate 

regulatory dynamics of the circadian clock gene network, highlighting the differential effects of 

transcription and degradation rate parameters on gene expression dynamics. These findings deepen 

our  understanding  of  the  underlying  mechanisms  governing  circadian  rhythm  regulation  and 

underscore the complex interplay of feedback mechanisms within the gene network. 

61 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
A 

B 

C 

Figure  14:  (A)  Schematic  diagram  of  the  mammalian  clock  network  with  its  transcription  and 
degradation parameters. (B) Sensitivity analysis of transcription rate parameters.  The transcription 
rate parameter for the observables (mRNA levels of clock genes) were optimized and the dynamics 
revealed a positive correlation between the transcription rate parameters and the average mRNA 
level of the respective gene. (C) Sensitivity analysis of degradation rate parameters. Optimization 
of the degradation rate parameter for the observables (mRNA levels of model clock genes) revealed 
a  negative  correlation  between  the  degradation  rate  parameters  and  their  respective  average 
mRNA. Also, an increase in the degradation rate parameter of the Bmal1 gene, d_mB, led to an 
increase in the expression level of the Reverb gene due to the negative feedback loop between 
Bmal1, PER-CRY and Reverb gene.  

Increase in the transcription rate parameters lead to a non-monotonic decrease in the 
oscillatory period while increasing the amplitude of the oscillation. 

The  circadian  clock  genes  exhibit  oscillatory  behavior  with  periods  ranging  from  23.5  to  24.5 

hours10. While the oscillatory period is consistent across all genes, the amplitude, defined as the 

absolute difference between the peak and trough levels of the oscillation, varies from gene to gene  

due to the different biochemical reactions involved in the transcription of each gene. A systematic 

62 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
exploration  of  the  transcription  rate  parameters  within  the  range  of  0  to  5  a.u  revealed  a  non-

monotonic change in the oscillatory period. Specifically, the periods of the circadian clock genes 

Per  and  Reverb  initially  increased  linearly  at  very  low  transcription  rates,  but  subsequently 

decreased monotonically with increasing transcription rate parameters for Per (vPs) and Reverb 

(vRes)  (Fig  15 A).  Conversely,  the  oscillation  period  of  the  Ror  gene  exhibited  a  monotonic 

increase with an increase in its corresponding transcription rate parameter, consistent with previous 

studies attributing this behavior to the positive feedback loop mediated by ROR116. To investigate 

the effect of transcription rate parameters on the amplitude of circadian clock oscillations, the same 

range  of  parameter  values  (0-5)  was  explored. The  amplitude  of  the  oscillation  was  monitored 

across all clock genes in the model. The analysis revealed a consistent trend of monotonic increase 

in  the  amplitude  of  oscillation  for  all  clock  genes  as  their  corresponding  transcription  rate 

parameters  were  increased  within  the  specified  range.  This  observation  suggests  a  direct 

relationship  between  the  rate  of  gene  transcription  and  the  magnitude  of  oscillatory  behavior 

exhibited  by  the  circadian  clock  genes.  Specifically,  as  the  transcription  rate  parameters  were 

incrementally increased from 0 to 5, the amplitude of oscillation for each clock gene exhibited a 

continuous and unidirectional increase, without any deviations or fluctuations in the observed trend 

(Fig 15 B). This monotonic increase in amplitude indicates that higher transcription rates lead to 

more pronounced oscillations, with greater deviations from the mean value. It is noteworthy that 

the  rate  of  increase  in  amplitude  may  vary  among  different  clock  genes,  potentially  due  to  the 

specific  regulatory  mechanisms  and  feedback  loops  involved  in  their  respective  transcriptional 

processes.  However,  the  overall  trend  of  a  monotonic  increase  in  amplitude  with  increasing 

transcription rate parameters was consistently observed across all clock genes in the model. These 

findings  highlight  the  crucial  role  of  transcriptional  regulation  in  modulating  the  oscillatory 

63 

 
dynamics  of  the  circadian  clock  network.  By  adjusting  the  transcription  rate  parameters,  the 

amplitude  and  the  period  of  circadian  oscillations  can  be  effectively  modulated,  potentially 

influencing downstream processes and physiological rhythms governed by the circadian clock. 

A 

B 

Figure  15: A  non-monotonic  period  and  amplitude  dynamics  for  observables  Per  mRNA  and 
Reverb mRNA as gradient of the transcription rate parameters applied in the model. (A) Oscillation 
period changes with respect to changes in transcription rate parameter for Per mRNA and Reverb 
mRNA. The oscillatory period exhibits a monotonic decrease with an increase in the transcription 
rate parameter. The red dot corresponds to the wild-type parameter value from the simulated model 
for the transcription rate parameter for Per and Reverb (x=2.9 and x=2.425) respectively, and their 
corresponding oscillatory period (y=23.8 h and y=23.97) (B) Oscillation amplitude changes with 
respect to changes in transcription rate parameter for Per mRNA and Reverb mRNA.  A monotonic 
increase of oscillatory amplitude is observed with an increase in the transcription rate parameter. 
The red dot corresponds to the wild-type parameter value for the transcription rate parameter for 
Per and Reverb (x=2.9 and x=2.425) respectively, and their corresponding oscillatory amplitude 
(y=3.4 and y=2.6). 

The period shows a concave dependency, along with a monotonic decrease in amplitude as the 
degradation rate parameter for Per and Reverb is increased. 

We  conducted  a systematic exploration of the parameter space by varying the degradation  rate 

parameters in the model to investigate their impact on the two key characteristics of the circadian 

64 

 
 
 
 
 
 
 
 
 
 
 
 
 
rhythm: the oscillatory period and the amplitude. These parameters govern the rate at which the 

molecular  components  of  the  circadian  clock,  namely  the  clock  genes  and  their  associated 

regulatory elements, are degraded or broken down within the cellular environment. Our analysis 

revealed  a  non-linear,  concave  relationship  between  the  degradation  rate  parameters  and  the 

oscillatory period of the circadian clock genes. As the degradation rate parameters were increased 

incrementally from 0 to 1.0, the period initially decreased, reaching a minimum value, and then 

increased monotonically with further increments in the degradation rate (Fig. 16 A). This concave 

pattern was observed for the core clock genes, Per and Reverb, when their respective degradation 

rate parameters, d_mP and d_mRe, were varied independently. 

Conversely, our investigation unveiled a monotonic decrease in the amplitude of oscillations across 

all clock genes in the model, as their corresponding transcription rate parameters were increased 

(Fig. 16 B). The amplitude, which represents the magnitude of the oscillatory signal, exhibited an 

inverse relationship with the transcription rate parameters, such that higher transcription rates led 

to  lower  amplitudes  of  oscillation. These  findings  highlight  the  intricate  interplay  between  the 

degradation and transcription rates of the molecular components involved in the circadian clock 

machinery, and their profound influence on the temporal dynamics and robustness of the circadian 

rhythm.  Notably,  the  non-linear  relationship  between  the  degradation  rates  and  the  oscillatory 

period suggests the existence of an optimal range for these parameters, within which the circadian 

clock operates most effectively, ensuring the maintenance of a stable and consistent oscillatory 

pattern. 

65 

 
 
 
 
A 

B 

Figure 16: A concave dynamic period for Per and Reverb as degradation rate parameter increases. 
(A) The oscillation period of both Per and Reverb decreases and then increases giving a concave 
profile with respect to an increasing degradation rate parameter. (B) Oscillation amplitude for Per 
and Reverb showed a monotonic decrease with an increase in the degradation rate parameter.  The 
red dots in (A) and (B) denote the wild-type parameter value from the simulated model degradation 
rate parameter value and the resulting period/amplitude value respectively. 

Bifurcation analysis shows that Cry and Ror exhibit Hopf bifurcations with varying 
transcription rate parameter value. 

I further probed the dynamics of the oscillatory rhythm by conducting a bifurcation analysis, which 

allows for a systematic exploration of how the system's behavior, characterized by properties such 

as the oscillatory period, amplitude, phase, and average mRNA levels, responds to variations in 

key parameter values. This analysis was performed using XPPAUT117, a powerful computational 

66 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
tool  for  studying  dynamical  systems.  The  bifurcation  analysis  was  initially  carried  out  by 

considering  the  transcription  rates  as  the  primary  bifurcation  parameters.  Specifically,  we 

examined the dependence of the oscillatory dynamics on the transcription rates vCs for Cry mRNA 

and vRs for Ror mRNA. The resulting bifurcation diagrams, which depict the system's behavior 

as a function of these parameters, revealed a cyclic oscillatory pattern (Fig. 17 A and C). These 

diagrams illustrate the coexistence of stable and unstable limit cycles, represented by green and 

blue curves, respectively. The points at which these curves intersect correspond to Hopf bifurcation 

(HB) points, indicating a transition between steady-state and oscillatory behaviors. By leveraging 

the  information  contained  within  the  bifurcation  diagrams,  I  derived  the  stable  and  unstable 

oscillatory  periods  for  both  Cry  and  Ror  (Fig.  17  B  and  D).  Notably,  the  stable  and  unstable 

parameter values exhibited a remarkable alignment, both in terms of the oscillatory period and the 

limit cycle oscillation amplitude, for each gene under consideration. Furthermore, we extended the 

bifurcation analysis to incorporate the degradation parameters of the observable molecular species, 

thereby  assessing  the  stability  and  robustness  of the  model's  rhythmic  dynamics  under  varying 

degradation rates. The bifurcation analysis not only elucidated the intricate relationships between 

key  parameters  and  the  emergent  oscillatory  dynamics  but  also  served  as  a  powerful  tool  for 

validating  the  model's  capability  to  capture  and  maintain  robust  circadian  rhythms.  By 

systematically  exploring  the  parameter  space,  we  gained  valuable  insights  into  the  regions  of 

stability  and  instability,  as  well  as  the  transitions  between  these  regimes,  enabling  a  deeper 

understanding of the underlying mechanisms governing the circadian clock machinery. 

67 

 
 
 
 
 
A 

C 

B 

D 

Figure  17:  One  parameter  bifurcation  analysis  of  the  model  rhythmic  dynamics.  (A&C)  The 
bifurcation  diagram  of  the  transcription  rate  parameter  of  Cry  mRNA,  (vCs)  and  Ror  mRNA, 
(vRs). The green and blue lines represent stable unstable limit cycles respectively. HB denotes the 
Hopf  bifurcation  points  in  the  bifurcation  diagram.  (B&D)  Bifurcation  diagram  showing  the 
periodic stability of the transcription rate parameter of Cry mRNA, (vCs) and Ror mRNA, (vRs). 
The green line represents stable period and the blue line represent unstable period of the model.    

68 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Model parameter estimation 

I  used  parameter  estimation  to  calibrate  the  mathematical  model  of  the  circadian  clock  against 

experimental  data  obtained  from  single-nuclei  RNA-sequencing  of  mouse  liver  cells.  The 

experimental data was acquired from male C57BL/6 mice housed under a 12:12 light:dark cycle, 

ensuring synchronization with the environmental circadian rhythm102. Parameter estimation was 

carried out with the AMIGO2113,114 software package, which implements advanced algorithms for 

optimizing model parameters to achieve the best possible agreement between model predictions 

and  experimental  observations.  Specifically,  we  aimed  to  optimize  the  transcription  and 

degradation rate parameters of the model to fit the hepatic single-nuclei RNA-sequencing data. 

The goodness of fit between the model predictions and the experimental data was quantified by a 

cost  function,  which  measures  the  discrepancy  between  the  observed  values  and  the  values 

predicted by the model for a given set of parameters. The parameter estimation algorithm seeks to 

minimize  this  cost  function,  effectively  identifying  the  parameter  values  that  yield  the  closest 

agreement between the model's output and the empirical observations. A smaller overall value of 

the cost function indicates a better match between the model's predictions and the available data. 

I defined the cost function as a maximum (log-) likelihood function, which is a suitable choice 

given the availability and nature of the measured noise in the experimental data. The initial values 

for  the  transcription  and  degradation  rate  parameters  to  be  estimated  were  set  to  the  wild-type 

values used in the original model simulation. Additionally, we imposed upper and lower bounds 

on these parameters based on values reported in the literature on circadian system modeling37,38,118. 

Mathematically, I formulated a non-linear programming problem (NLP) with algebraic constraints, 

aiming to find the optimal transcription and degradation rate parameters that minimize the cost 

function. This formulation allows for the incorporation of additional constraints, such as parameter 

69 

 
bounds and other physiological considerations, ensuring that the estimated parameter values are 

biologically  plausible  and  consistent  with  prior  knowledge.  By  employing  this  parameter 

estimation approach, I was able to refine our mathematical model to better capture the dynamics 

observed  in  the  experimental  data,  ultimately  enhancing  our  understanding  of  the  underlying 

circadian clock machinery and its regulatory mechanisms.  

Estimation of transcription and degradation rate parameters. 

To  assess  the  predictive  accuracy  of  our  mathematical  model,  I  utilized  single-nuclei  gene 

expression data obtained from male C57BL/6 mice at specific time points: 2, 4, 8, 12, 18, and 24 

hours. My focus was on the core circadian genes incorporated in the model: Per, Cry, Bmal1, Ror, 

and  Reverb.  The  normalized  RNA-sequencing  data  replicates  for  these  genes  constituted  the 

experimental dataset against which I optimized the transcription and degradation rate parameters. 

To  represent  the  overall  gene  expression  levels,  I  averaged  the  expression  values  of  individual 

isoforms  for  each  gene.  Given  the  large  scale  and  non-linear  nature  of  our  model,  local 

optimization  techniques  can  often  converge  to  suboptimal  local  minima,  failing  to  identify  the 

globally  optimal  solution. To  overcome  this  challenge  and  efficiently  locate  an  accurate  global 

optimum,  I  employed  the  enhanced  Scatter  Search  (eSS)  hybrid  global-local  optimization 

algorithm available in the AMIGO toolbox119,120. This advanced algorithm combines global and 

local  search strategies, enabling a more  comprehensive exploration of the parameter space and 

increasing the likelihood of finding the globally optimal solution. In the optimization process, I 

targeted five transcription rate parameters and five degradation rate parameters, each associated 

with  one  of  the  observable  variables  in  the  model.  Notably,  the  optimized  parameter  values 

exhibited a positive correlation  with  the nominal wild-type parameter values used in the initial 

model formulation. Consequently, the model predictions generated using the optimized parameter 

70 

 
set demonstrated a positive correlation with the experimental measurements for all five observable 

variables (Fig. 18 A and B). This strong agreement between the model's output and the empirical 

data validates the effectiveness of my parameter estimation approach and highlights the model's 

ability  to  accurately  capture  the  dynamics  of  the  circadian  clock  machinery.  By  leveraging 

advanced optimization techniques and incorporating experimental data, I successfully calibrated 

the model, enhancing its predictive capabilities and enabling a more accurate representation of the 

underlying biological processes governing the circadian rhythm.  

A 

C 

B 

D 

R2 = 0.87 

R2 = 0.91 

Figure 18: The predicted model results and experimental data fit for observables using maximum 
(log-) likelihood function. (A) Correlation between  predicted transcription and degradation and 
wild  type  transcription  and  degradation  rate  parameter  values.  (B)  Correlation  between  model 
predictions and measured data. 

71 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 18 (cont’d) 
(C-F) The fitted curves for the model variables (Per, Cry, Bmal1 and Reverb). The red dotted line 
indicates the experimental data at timepoints 2,4,8,12,18 and 24 h. The solid blue line indicates 
the  predicted  results  from  using  maximum  (log-)  likelihood  function  to  fit  the  model  to  the 
experimental dataset. 

E 

F 

72 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
CHAPTER 4: SPATIAL-TEMPORAL PERTURBATION OF THE LIVER LOBULE BY 
2,3,7,8-TETRACHLORODIBENZO-P-DIOXIN 

INTRODUCTION 

The mammalian liver is a highly organized organ with a unique spatial architecture that facilitates 

its various metabolic functions. The fundamental structural unit of the liver is the hepatic lobule, 

primarily composed of hepatocytes, the parenchymal liver cells. Hepatocytes are arranged along 

the portal-central axis of the lobule, forming a gradient from the portal triad (comprising the portal 

vein, bile duct, and hepatic artery) to the central vein29. This spatial organization of hepatocytes 

within  the  liver  lobule  is  closely  linked  to  the  compartmentalization  of  gene  expression  and 

subsequent  metabolic  activities.  Hepatocytes  situated  closer  to  the  portal  triad  exhibit  higher 

expression  levels  of  genes  involved  in  gluconeogenesis,  oxygen  utilization,  and  β-oxidation 

processes33,121.  In  contrast,  hepatocytes  located  nearer  to  the  central  vein  display  increased 

expression of genes associated with glycolysis, lipogenesis, and xenobiotic metabolism mediated 

by  cytochrome  P450  enzymes122.  This  spatial  organization  of  gene  expression  and  metabolic 

function along the porto-central axis of the hepatic lobule is known as liver zonation. This is a 

continuous pattern of concentric layers of hepatocytes, reflecting the specialization of different 

regions  of  the  liver  lobule  for  specific  metabolic  tasks29,33,95.  Liver  zonation  is  established  and 

maintained  by  a  complex  interplay  of  chemical  cues,  metabolic  gradients,  and  cell-to-cell 

interactions. The establishment of metabolic gradients is facilitated by the direction of blood flow 

from  the  portal  triad  to  the  central  vein,  creating  spatial  gradients  of  nutrients  and  metabolites 

along the porto-central axis. Additionally, circadian rhythms and molecular signaling from core 

clock genes contribute to  the temporal  organization of metabolic processes within hepatocytes. 

The canonical master regulatory pathway governing liver zonation is the Wnt/β-catenin signaling 

pathway. Wnt proteins bind to Frizzled receptors, leading to the phosphorylation of the β-catenin 

73 

 
 
degradation complex by lipoprotein receptor-related proteins. This event causes the dissociation 

of  β-catenin  from  the  degradation  complex,  allowing  its  translocation  to  the  nucleus,  where  it 

activates transcriptional regulators that potentiate the zonation process34,123,124.  

The temporal regulation of liver functions is intricately governed by the biological phenomenon 

known as the circadian rhythm. This temporal compartmentalization ensures that metabolic and 

biological processes within the liver, such as glycolysis and gluconeogenesis, are synchronized 

with  the  feeding  and  fasting  cycles  of  the  organism28,29,121.  The  molecular  machinery  driving 

circadian rhythms is a complex network of circadian genes and proteins interconnected through 

intricate negative and positive feedback loops. The core components of this molecular oscillator 

include  the  CLOCK  and ARNTL  (or  NPAS2)  proteins,  which  form  heterodimeric  complexes. 

These  CLOCK-ARNTL  or  NPAS2-ARNTL  complexes  bind  to  E-box  motifs  in  the  promoter 

regions  of  downstream  circadian  genes,  such  as  Per,  Cry,  Ror,  and  Reverb,  activating  their 

transcription. The translated PER and CRY proteins form the PER-CRY complex in the cytoplasm, 

which subsequently inhibits the binding of the CLOCK-ARNTL or NPAS2-ARNTL complexes, 

thereby creating a negative feedback loop. Additionally, the REV-ERB protein competes with the 

ROR  protein  to  inhibit  the  transcription  of  the  Arntl  gene,  further  modulating  the  circadian 

rhythm3,75.  

A recent  study by Droin et  al29. demonstrated that  the spatial  and temporal  axes of the hepatic 

lobule  interact  with  one  another.  Consequently,  many  established  zonated  pathways  exhibit 

rhythmic  patterns  of  gene  expression. Among  these  pathways  are  gene  sets  involved  in  drug 

metabolism, suggesting a potential interplay between the spatial zonation, temporal rhythmicity, 

and chemical perturbations within the liver. Notably, while rhythmicity impacts the core pathways 

that determine zonation, the converse is not true, as the core circadian clock genes themselves do 

74 

 
not exhibit zonation patterns. This observation highlights the complexity of the interplay between 

the spatial and temporal axes in regulating liver function. The overlap between the temporal and 

spatial axes, particularly in the context of drug metabolism pathways, suggests the existence of a 

third dimension to consider when describing liver function: chemical perturbation. This third axis 

represents the influence of exogenous and endogenous chemical compounds on the intricate spatial 

and temporal dynamics within the liver. 

2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) is a persistent environmental toxicant known for its 

detrimental effects on various biological functions. Upon entry into the body, TCDD binds to the 

aryl hydrocarbon receptor (AHR), initiating a cascade of events that lead to toxic effects. Upon 

binding to TCDD, the AHR undergoes a conformational change, translocates to the nucleus, and 

binds to specific DNA sequences called dioxin response elements (DREs) located in the proximal 

promoter  regions  of  target  genes,  particularly  the  cytochrome  P450  (CYP)  genes28,51,125.  The 

binding of the TCDD-activated AHR to DREs modulates the transcription of these target genes, 

leading  to  the  subsequent  inhibition  or  activation  of  downstream  genes  involved  in  various 

biological  processes.  Notably, TCDD  exposure  has  been  shown  to  disrupt  both  the  spatial  and 

temporal  components  of hepatic function  within  the liver lobule, ultimately  contributing  to  the 

development  of  several  liver-related  diseases  and  disorders.  Exposure  to  TCDD  has  been 

implicated  in  the  pathogenesis  of  autoimmune  hepatitis,  non-alcoholic  fatty  liver  disease 

(NAFLD),  cardiovascular  disease,  bipolar  disorder,  obesity,  and  cancer.  The  mechanisms 

underlying  these  adverse  effects  involve  the  disruption  of  the  intricate  spatial  organization  and 

temporal  regulation  of  gene  expression  within  the  liver  lobule28,89,124.  Specifically,  sub-chronic 

exposure to TCDD has been demonstrated to  ablate or significantly dampen the oscillations of 

most  core  circadian  clock  genes,  altering  their  phase  and  leading  to  the  disorganization  of  the 

75 

 
rhythmic expression patterns of various zonated gene biomarkers. This disruption of the spatial 

and  temporal  axes  of  hepatic  function  by  TCDD  can  have  far-reaching  consequences  on  liver 

homeostasis and overall organismal health126. 

Despite our current understanding of how TCDD-activated AHR may impact these organizational 

pathways,  the  specific  mechanisms  underlying  the  interaction  between  these  regulatory  axes 

remain  elusive.  Furthermore,  the  acute  effects  of  TCDD-activated  AHR  on  the  temporal 

(rhythmicity)  and  spatial  (zonation)  components  of  the  hepatic  lobule  are  not  well  defined. To 

elucidate  the  transcriptional  impact  of  acute  TCDD  exposure  on  hepatocyte  rhythmicity  and 

zonation,  we  employed  hepatic  single-nuclei  RNA-sequencing  data  from  male  C57BL/6  mice 

administered  a  single  dose  of  30  μg/kg  TCDD  (or  sesame  oil  vehicle)  at  time-point  0.  Liver 

samples were collected and snap-frozen at various time-points (2, 4, 8, 12, 18, and 24 hours) after 

TCDD  administration.  Zonation  within  the  hepatic  lobule  was  inferred  from  pseudo-space  and 

benchmarked against zonal gene biomarkers from previous studies by Halpern et al95. and Droin 

et al29. Genes were then classified based on the effects of TCDD, zonation, temporal regulation, or 

a combination of these variables using a mixed-nonlinear effect model. Our analysis revealed that 

more  genes  were  regulated  by  zonation  independently  than  by  rhythmicity  and  TCDD  alone. 

Notably, a larger number of genes were regulated by a combination of TCDD and zonation than 

by  a  combination  of  TCDD  and  rhythmicity.  Chromatin  immunoprecipitation  followed  by 

sequencing  (ChIP-seq)  analysis  of  dose-affected  categories  revealed  the  interaction  of  dioxin 

response  elements  (DREs),  E-box  motifs,  and AHR  in  most  of  the  genes  affected  by  TCDD-

activated AHR. Overall, our study demonstrated that TCDD exerts a significant effect on hepatic 

zonation and rhythmicity even at acute exposure levels. The interplay between TCDD, zonation, 

and  rhythmicity  in  regulating  gene  expression  highlights  the  complexity  of  the  regulatory 

76 

 
mechanisms governing hepatic function. Importantly, our findings suggest that the disruption of 

zonation by TCDD may be a more prominent event than the disruption of rhythmicity, as evidenced 

by  the  greater  number  of  genes  regulated  by  a  combination  of  TCDD  and  zonation.  This 

observation underscores the potential impact of environmental toxicants on the spatial organization 

of  hepatic  function  and  the  associated  consequences  for  liver  homeostasis.  Furthermore,  the 

interaction between DREs, E-box motifs, and AHR in the genes affected by TCDD-activated AHR 

suggests a potential crosstalk between the AHR signaling pathway and the molecular machinery 

governing  circadian  rhythms. This  crosstalk  may  contribute  to  the  observed  disruption  of  both 

zonation and rhythmicity upon TCDD exposure. 

METHODS 

Single-nuclei RNA-seq expression dataset and preprocessing. 

Cholico et al127. conducted a time-series experiment to investigate the effects of TCDD on hepatic 

gene expression in male C57BL/6 mice. The experimental design involved treating the mice with 

either sesame oil (vehicle control) or a single dose of 30 μg/kg TCDD via gavage. At specific time 

points  (2,  4,  8,  12,  18,  and  24  hours)  post-treatment,  the  animals  were  euthanized  by  CO2 

asphyxiation, and their livers were immediately collected and snap-frozen for subsequent single-

nucleus  RNA-sequencing  (snRNA-seq)  analysis. The  single-nuclei  RNA-seq  dataset  generated 

from  this  experiment  was  processed  and  analyzed  using  methodologies  outlined  in  previous 

studies124,126.  Specifically,  the  clustering  and  cell  type  annotation  of  the  single-nucleus 

transcriptomes were performed to identify and characterize the various hepatic cell populations 

present in the samples.  

77 

 
 
 
Figure  19:  Extraction  and  sequencing  of  single  nuclei  data  from  male  C57BL/6  mice.  Male 
C57BL/6 mice housed in a room with a 12:12 light dark cycle was gavaged with single dose of 
sesame oil (as vehicle) or 30 μg/kg of TCDD (as treated). Their livers were harvested 2, 4, 8, 12, 
18, 24, hours post treatment and snap frozen. The snap frozen livers were sequenced to acquire the 
single nuclei sequencing dataset.  

We  employed  a  comprehensive  preprocessing  pipeline  to  prepare  the  single-nucleus  RNA-

sequencing (snRNA-seq) dataset for downstream analysis.  The preprocessing of the dataset was 

conducted using the scanpy package in Python128, a bioinformatics tool for analyzing single-cell 

RNA  sequencing  (scRNA-seq)  data.  Raw  counts  from  the  dataset  were  first  normalized  to  the 

median total cell count using the normalize_total function, ensuring consistency across samples. 

Subsequently, a log transformation with a pseudocount was applied using the log1p function to 

mitigate  the  effects  of  extreme  values  and  bring  the  data  into  a  more  interpretable  range  for 

subsequent  analyses.  Next,  the  preprocessing  pipeline  focused  on  filtering  cells  and  genes  to 

remove low-quality or uninformative data points. Since the study aimed to investigate the effects 

of TCDD on hepatocytes, the first filtering step involved removing all non-parenchymal cell types, 

retaining only hepatocytes in the dataset. This step ensures that the analysis focuses specifically 

on the cell type of interest. Subsequently, low-quality hepatocytes were filtered out based on two 

criteria:  1)  hepatocytes  with  fewer  than  1,500  counts,  and  2)  hepatocytes  with  fewer  than  200 

78 

 
 
 
 
 
 
 
 
 
 
 
detected genes. These thresholds were chosen to exclude cells with low sequencing coverage or 

poor transcriptome quality, as they may introduce noise and bias into the analysis. In addition to 

cell filtering, gene filtering was also performed. Genes that were expressed in fewer than 200 cells 

were removed from the dataset. This step helps to reduce the computational burden and focus the 

analysis on genes with sufficient expression levels across the cell population. We adopted a non-

standard preprocessing step due to the lower formation of clusters observed during batch correction 

analysis. This step highlights the importance of tailoring the preprocessing pipeline to the specific 

characteristics of the dataset and the research questions being addressed. Finally, the identification 

of highly variable genes (HVGs) was accomplished using the highly_variable_genes function from 

the scanpy package. HVGs are genes that exhibit substantial variability in expression levels across 

cells,  potentially  reflecting  biologically  relevant  differences. These  genes  are  typically  used  as 

input for downstream analyses, such as dimensionality reduction and clustering, as they capture 

the most informative aspects of the transcriptomic landscape. 

Batch correction using scVI. 

To mitigate sample-specific batch effects present in the snRNA-seq data, we employed scVI129 a 

variational autoencoder model designed for single-cell RNA-sequencing data analysis. Variational 

autoencoders are a type of deep learning model that can learn complex, nonlinear representations 

of high-dimensional data, making them well-suited for batch correction and other tasks in single-

cell genomics. To account for batch effects arising from different biological samples, we assigned 

a unique batch label to the cells originating from each individual sample. These sample-specific 

batch labels were then provided as input to the scVI model during training. By incorporating batch 

information during the model's training process, scVI learns to disentangle the biological signal 

from  the  technical  artifacts  introduced  by  batch  effects. The  scVI  model  was  trained  using  the 

79 

 
hyperparameters and architecture configurations specified in Tables 2.1 and 2.2, which detailed 

the model's neural network architecture, regularization techniques, optimizer settings, and other 

training specifications. These hyperparameter settings were deemed suitable for the given snRNA-

seq data and batch correction task. 

Hyperparameter 

Latent dimension 

Number of layers 

Layer width 

Dropout rate 

Value           

30 

1 

128 

0.1 

Kullback-Leibler weight 

5 ∗ 10−5 

Gene expression distribution 

NB 

Latent Distribution 

Normal 

Table 3.1: Hyperparameters for scVI’s variational autoencoder model. 

80 

 
 
 
 
 
 
 
Hyperparameter 

Training epochs 

Learning rate 

Learning rate decay 

Optimizer 

Optimizer epsilon 

Value 

46 

0.001 

10−6 

Adam 

0.01 

Table 3.2: Hyperparamters for scVI’s variatonal autoencoder training. 

Layer calculations. 

We utilized the latent space representation of the normalized cell counts obtained from the single-

cell variational inference (scVI) method as input into the diffusion maps algorithm. Diffusion maps 

were generated using the diffmap function from the scanpy Python package. This technique is a 

non-linear dimensionality reduction method that captures the underlying geometric structure of the 

data  in  a  low-dimensional  representation.  The  second  component  of  the  diffusion  maps 

representation was extracted and min-max scaled to generate a pseudo-space metric. This metric 

was oriented such that genes with  central zone enrichment had the highest values, while genes 

with portal zone enrichment had the lowest values. The rationale behind this approach is to capture 

the transcriptional heterogeneity along the central-portal axis of the liver lobule. The cells were 

then  divided  into  five  equal  bins  based  on  their  pseudo-space  metric  values,  representing 

transcriptional zones along the central-portal axis. Each bin contains cells within one-fifth of the 

full  pseudo-space  range  (i.e.,  cells  in  bin  i  have  pseudo-space  values  between 

𝑖−1

5

  and 

𝑖

). This 
5

binning approach allows for the identification of genes with varying expression patterns across the 

central-portal axis. For each treatment condition, time point, and bin (layer), the raw counts were 

81 

 
 
summed across the constituent cells. This step aggregates the gene expression data across cells 

within each bin, effectively capturing the transcriptional profile of each zone. These count sums 

then underwent normalization using the computeSumFactors function from the scran R package 

to obtain "Normalized Counts." This normalization step is crucial to account for technical biases 

and variations in sequencing depth, ensuring that the gene expression values are comparable across 

different samples and conditions. Genes were then categorized by zone based on the bin (layer) 

showing their maximum Normalized Counts. Genes with peak expression in bins 1 and 2 were 

classified as central zone genes, as they exhibit the highest expression levels in regions closer to 

the  central  vein.  Bin  3  genes  with  mid-range  pseudo-space  values  were  termed  mid-lobular, 

representing  genes  with  intermediate  expression  patterns.  Finally,  portal  zone  genes  displayed 

maximal Normalized Counts in bins 4 and 5, indicating their enrichment in regions closer to the 

portal vein. 

Design of Non-Linear Mixed Effect Model. 

We devised an analytical framework by constructing a a non-linear mixed effect model (NLMEM) 

using the MixedLM class in the statmodels Python package to investigate the effects of various 

factors on gene expression. Mixed effect models are statistical models that incorporate both fixed 

and random effects, allowing for the analysis of data with hierarchical or nested structures. In this 

study, the NLMEM models were meticulously designed to incorporate various factors, enabling 

the examination of the impact of TCDD exposure (D), rhythmicity (R), and zonation (Z) on gene 

expression dynamics. Table 2.3 provides descriptions  of the individual terms  representing each 

factor in the NLMEMs. These terms capture the specific aspects of each factor that influence gene 

expression. For instance, the term representing TCDD exposure may include parameters related to 

the  dose,  duration,  or  time  since  exposure,  while  the  term  for  rhythmicity  may  incorporate 

82 

 
parameters related to the amplitude, period, and phase of the oscillations. The specific equations 

for  the  NLMEM  classes  outlined  in  Table  2.3  are  presented  in  Table  2.4.  These  equations 

mathematically represent the relationships between the factors (D, R, and Z) and gene expression, 

incorporating both fixed and random effects. Fixed effects are parameters that are constant across 

all observations, while random effects account for the variability among different groups or levels 

within the data. The NLMEM approach offers several advantages over traditional linear models. 

First,  it  allows  for  the  incorporation  of  non-linear  relationships  between  the  predictors  and  the 

response variable, which is particularly relevant for biological systems where responses may be 

non-linear  or  exhibit  complex  patterns.  Second,  the  mixed  effect  structure  accounts  for  the 

hierarchical or nested nature of the data, where observations may be correlated within groups or 

clusters. This is crucial when analyzing gene expression data, where measurements from the same 

individual or experimental unit may be correlated. 

Term  Effect  Equation 

𝐷  D 

{

0 𝑖𝑓 𝑆𝑒𝑠𝑎𝑚𝑒 𝑂𝑖𝑙 𝐶𝑜𝑛𝑡𝑟𝑜𝑙
1 𝑖𝑓 𝑇𝐶𝐷𝐷 𝑡𝑟𝑒𝑎𝑡𝑚𝑒𝑛𝑡

𝑅𝑆𝑖𝑛  R 

𝑅𝐶𝑜𝑠  R 

𝑍𝑃1  Z 

𝑍𝑃2  Z 

sin (𝜔𝑡) 

cos (𝜔𝑡) 

𝑙 

3𝑙2
2

Table 3.3: Terms for mixed linear effects models. Each term is denoted by the name and its effect. 
D is TCDD (Dioxin) Influece, R is rhythmicity, and Z is zonation. 𝑡 is the time in hours after  

83 

 
 
 
 
 
Table 3.3 (cont’d) 
treatment. 𝑙 is the layer of the liver lobule. 𝜔 is the conversion factor between 𝑡 and radians which 
is equal to 

2𝜋

. 

24

Class 

Equation for Model 

F 

D 

R 

Z 

Z+R 

RxZ 

DxR 

DxZ 

𝑦 = 𝛽0 

𝑦 = 𝛽0 + 𝛽1𝐷 

𝑦 = 𝛽0 + 𝛽1𝑅𝑆𝑖𝑛 + 𝛽2𝑅𝐶𝑜𝑠 

𝑦 = 𝛽0 + 𝛽1𝑍𝑃1 + 𝛽2𝑍𝑃2  

𝑦 = 𝛽0 + 𝛽1𝑅𝑆𝑖𝑛 + 𝛽2𝑅𝐶𝑜𝑠  + 𝛽3𝑍𝑃1 + 𝛽4𝑍𝑃2 

𝑦 = 𝛽0 + (𝛽1𝑅𝑆𝑖𝑛 + 𝛽2𝑅𝐶𝑜𝑠 ) ∗ (𝛽3𝑍𝑃1 + 𝛽4𝑍𝑃2) 

𝑦 = 𝛽0 + 𝛽1𝐷 ∗ (𝛽2𝑅𝑆𝑖𝑛 + 𝛽3𝑅𝐶𝑜𝑠) 

𝑦 = 𝛽0 + 𝛽1𝐷 ∗ (𝛽2𝑍𝑃1 + 𝛽3𝑍𝑃2) 

Dx(Z+R) 

𝑦 = 𝛽0 + 𝛽1𝐷 ∗ (𝛽2𝑅𝑆𝑖𝑛 + 𝛽3𝑅𝐶𝑜𝑠  + 𝛽4𝑍𝑃1 + 𝛽4𝑍𝑃2) 

DxZxR 

𝑦 = 𝛽0 + 𝛽1𝐷 ∗ (𝛽2𝑅𝑆𝑖𝑛 + 𝛽3𝑅𝐶𝑜𝑠 ) ∗ (𝛽4𝑍𝑃1 + 𝛽5𝑍𝑃2) 

Table 3.4: Equations for each non-linear mixed effects model used for classification.  

Implementation of the NLMEM was almost identical to Droin C et al29. These equations were fit 

to normalized count of each individual gene using the Nelder-Mead optimization algorithm130. A 

noise  offset  (σ0 = 0.15)  was  added  to  the  data  to  avoid  overfitting.  Equations  with  the  smallest 

overall Bayesian information criterion131 (BIC) were classified with their corresponding class. BIC 

acts as a general multi-comparison analogue to the likelihood ratio (𝜒2) test as it penalizes more 

complex models. The exception to classifying with models that have the smallest BIC was when 

84 

 
 
 
models  tied  with  one  another.  Like  for  this  was  in  ties,  which  we  defined  as  having  a  relative 

difference of 1%. In the case of ties, models with fewer parameters were selected.  

Bar graphs of classification were calculated using Shwarz weights. Shwarz weights are calculated 

using  differences  between  the  BIC  values  and  the  minimum  BIC  value  in  across  all  models 

(𝐵𝐼𝐶𝑀𝑖𝑛) using the following equation: 

𝑠ℎ𝑤𝑎𝑟𝑧 𝑤𝑒𝑖𝑔ℎ𝑡 =  

𝐵𝐼𝐶𝑖 − 𝐵𝐼𝐶𝑀𝑖𝑛
exp
2
𝑛 (𝐵𝐼𝐶𝑖 − 𝐵𝐼𝐶𝑀𝑖𝑛)
Σ𝑖=0

Differential Rhythmicity and Zonation. 

To conduct a comprehensive analysis of differential rhythmicity and zonation between treated and 

control conditions, gene expression data were fitted to models similar to the approach described in 

section 2.4. and the Differential RhythmicitY analysis in R (DryR)132 framework model in R. DryR 

(Differential RhythmicitY analysis in R) is a statistical framework based on model selection that 

is designed to detect and estimate changes in rhythmic parameters (amplitude and phase) and mean 

expression  levels  across  multiple  conditions. This  framework  leverages  the  power  of  rhythmic 

regression models and likelihood-based model comparison techniques to rigorously analyze time-

course gene expression data. The core principle of DryR is to fit gene expression data to a set of 

competing  models  that  represent  different  rhythmic  behaviors  and  subsequently  compare  the 

goodness-of-fit of these models using likelihood ratio tests. By fitting these models to the time-

course  data  and  comparing  their  respective  likelihoods,  DryR  can  identify  genes  that  exhibit 

rhythmic expression patterns, as well as those that display significant changes in rhythmicity or 

mean expression levels between conditions. 

With respect to the models similar to the approach described in section 2.4, instead of using the 

Nelder-Mead optimization algorithm, we employed ordinary least squares (OLS) regression for 

85 

 
 
model fitting by leveraging the statsmodels Python library. The OLS regression method is a widely 

used technique for estimating the unknown parameters in a linear regression model. By minimizing 

the sum of squared residuals between the observed data and the predicted values from the model, 

OLS  provides  an  unbiased  and  efficient  estimate  of  the  model  parameters  under  certain 

assumptions. Specifically, the ols function from statsmodels.api was utilized to carry out OLS on 

each  gene's  expression  values,  which  were  separated  into  treated  and  control  groups.  This 

separation  enabled  the  quantification  of  differences  in  rhythmicity  parameters  and  spatial 

patterning between the two conditions across various zone regions, supporting subsequent analysis 

of spatial  and temporal  expression patterns under treatment conditions. These separated treated 

and  control  gene  expression  values  were  then  fitted  to  either  the  R  class  model  (capturing 

rhythmicity) or the F class  model (non-rhythmic  flatline) for  rhythmicity analysis. The R  class 

model incorporates parameters that describe the oscillatory behavior of gene expression, such as 

amplitude,  period,  and  phase,  while  the  F  class  model  represents  a  constant,  non-rhythmic 

expression pattern. Similarly, for spatial zonation analysis, the data were fitted to the Z class model 

(spatial patterning) or the F class model. The Z class model accounts for the spatial organization 

of  gene  expression  along  the  central-portal  axis  of  the  liver  lobule,  capturing  the  varying 

expression levels across different zones. To determine if the R class or Z class models, which have 

more parameters, provided a statistically significantly better fit than the simpler F class model, 

likelihood  ratio  tests  (also  known  as  χ2  tests)  were  employed.  The  likelihood  ratio  test  is  a 

statistical method used to compare the goodness of fit between nested models, where one model 

(the null model, in this case, the F class model) is a special case of the other model (the alternative 

model,  either  the  R  class  or  Z  class  model). The  likelihood  ratio  test  compared  the  maximum 

likelihood estimates  of the more complex model (R or Z class) to the simpler model (F class), 

86 

 
quantifying if the additional parameters justify the increase in model complexity. This enabled a 

rigorous  identification  of  differences  in  rhythmicity  and  spatial  patterning  between  treated  and 

control conditions for each gene tested. Genes with expression data that were significantly better 

explained by the R class model compared to the F class model, based on the likelihood ratio test, 

were  classified  as  exhibiting  differential  rhythmicity.  Similarly,  genes  better  fit  by  the  Z  class 

model  were  designated  as  displaying  differential  zonation.  The  likelihood  ratio  test  was 

implemented using the chi2 function from the scipy.stats Python package for this model selection 

approach. This  function calculates the p-value associated with  the likelihood ratio test  statistic, 

allowing for the assessment of statistical significance and the selection of the more appropriate 

model for each gene. 

We estimated rhythmicity parameters from the fitted parameters in R models. If we let 𝑎 = 𝛽1 and 

𝑏 = β2 then we can calculate the amplitude of the expression oscillation as: 

And we can define the phase as: 

𝐴𝑚𝑝𝑙𝑖𝑡𝑢𝑑𝑒 = 2√𝑎2 + 𝑏2 

𝑝ℎ𝑎𝑠𝑒 = 𝜔 𝑎𝑟𝑐𝑡𝑎𝑛(𝑏, 𝑎) 

Linear zonation slope was calculated for all zonal genes that kept zonation post TCDD treatment 

and were not mid-lobular. To calculate zonation slope we fit a simpler zonation model than the one 

in table 2.4: 

𝑦 = 𝛽0 + 𝛽1𝑙 

We fit this model much in the same way we did above during differential zonation. The zonation 

slope was equal to 𝛽1. The procedure was like how one would find the line of best fit. 

87 

 
 
 
Statistical Tests. 

Differential  expression  analysis  between  treatments  and  comparison  of  parameter  values  were 

performed using the non-parametric Mann-Whitney U test, implemented via the mannwhitneyu 

function in scipy.stats. The Mann-Whitney U test is a non-parametric statistical test that is used to 

compare  the  distributions  of  two  independent  samples.  It  is  particularly  useful  when  the  data 

violates  the  assumptions  of  normality  or  homogeneity  of  variance,  which  are  required  for 

parametric tests such as the t-test. To determine if there were differences between the distribution 

of  gene  zone  assignments  between  gene  lists,  the  Kolmogorov-Smirnov  two-sample  test  was 

utilized through the ks_2samp function. The Kolmogorov-Smirnov test is a non-parametric test 

that quantifies the distance between the empirical distribution functions of two samples. This test 

is useful for comparing the overall shapes and distributions of two datasets, rather than just their 

central tendencies or means. Correlation coefficients and associated p-values were also computed 

using scipy.stats functions. Correlation analysis is a statistical technique that measures the strength 

and direction of the linear relationship between two variables. The correlation coefficients quantify 

the degree of association, while the p-values indicate the statistical significance of the observed 

correlation. All statistical tests leveraged Python-based scipy.stats functions, which are part of the 

SciPy  library.  SciPy  is  a  widely  used  open-source  Python  library  for  scientific  and  technical 

computing, providing a comprehensive collection of mathematical algorithms and functions. For 

gene  set  enrichment  analysis,  the  Enrichr API  within  the  gseapy  package  was  used  to  identify 

enriched  pathways  and  processes.  Gene  set  enrichment  analysis  (GSEA)  is  a  computational 

method that determines whether predefined sets of genes (e.g., pathways or biological processes) 

are  overrepresented  or  underrepresented  in  a  given  gene  list,  compared  to  a  background  set  of 

genes.  The  Enrichr  API  provides  access  to  a  comprehensive  collection  of  gene  set  libraries, 

88 

 
enabling the identification of enriched biological annotations associated with the genes of interest. 

Significant pathways were called at a false discovery rate (FDR) threshold of < 0.2. The FDR is a 

statistical method used to correct for multiple hypothesis testing, which is a common issue when 

performing  numerous  statistical  tests  simultaneously.  By  controlling  the  FDR,  researchers  can 

balance the trade-off between false positive and false negative rates, ensuring that the identified 

enriched  pathways  are  statistically  robust.  This  integrative  analysis  enabled  connecting  the 

observed  differential  expression  and  rhythmicity  changes  to  impacted  biological  functions  and 

processes.  By  combining  the  results  of  differential  expression  analysis,  rhythmicity  parameter 

comparisons,  and  gene  set  enrichment  analysis,  researchers  can  gain  a  comprehensive 

understanding  of  the  transcriptional  responses  to  experimental  treatments  or  conditions.  This 

approach allows for the identification of dysregulated genes, alterations in rhythmic patterns, and 

the associated biological pathways and processes that may be affected by these changes. 

RESULTS 

Treatment with 2,3,7,8 Tetrachlorodibenzo-p-dioxin (TCDD) induces an oscillatory pattern in 
circadian clock genes and elevates the expression of TCDD response genes in acute 
hepatocyte response. 

To  investigate  the  transcriptional  impact  of  acute  2,3,7,8-tetrachlorodibenzo-p-dioxin  (TCDD) 

exposure on the disruption of rhythmicity and zonation in hepatocytes, we utilized single-nucleus 

RNA-sequencing  (snRNA-seq)  data  from  the  livers  of  male  C57BL/6  mice,  as  generated  by 

Cholico et al127. The mice were housed in a controlled environment with a 12-hour light/12-hour 

dark cycle. At the start of the light phase (6:00 AM), the mice received a single oral dose of 30 

μg/kg TCDD or a vehicle control (sesame oil) via gavage. Subsequently, the livers were harvested 

and snap-frozen at specific time points: 2, 4, 8, 12, 18, and 24-hours post-treatment. Following 

sequencing,  the  snRNA-seq  data  was  processed,  and  hepatocytes  were  identified  based  on 

89 

 
 
established hepatic biomarkers from previous studies124,126. After excluding non-hepatocyte cell 

types and filtering out low-quality hepatocytes with low read counts, a total of 129,373 hepatocytes 

remained  in  the  dataset.  The  top  15,000  highly  variable  genes  (HVGs)  were  selected  for 

downstream analysis. Dimensionality reduction techniques (Batch correction) were applied to the 

TCDD response genes for data visualization and analysis, as detailed in the methods section (2.2). 

Uniform  Manifold  Approximation  and  Projection  (UMAP)  analysis  was  conducted  on  the 

hepatocyte  population  to  examine  the  cell  clustering  patterns.  The  UMAP  analysis  revealed  a 

distinct circular pattern in the hepatocyte clusters, which corresponded to different time points and 

treatment  conditions  (Fig.  20  A).  This  circular  pattern  suggests  an  inherent  rhythmicity  in 

hepatocyte gene expression, potentially reflecting the circadian rhythm or other cyclical processes 

within  these  cells.  This  circular  pattern  in  the  UMAP  analysis,  indicative  of  rhythmic  gene 

expression, was not observed in other cell types present in the dataset. Notably, the hepatocyte 

cluster corresponding to the TCDD treatment group exhibited a distinct separation from the control 

group, suggesting a significant alteration in the gene expression profiles due to TCDD exposure 

(Fig.  20  B).  To  examine  the  immediate  impact  of  TCDD  over  time  and  space,  the  expression 

patterns of known circadian clock genes (Per1, Arntl, and Cry) and TCDD response genes (Cyp1a1 

and Ahrr) were analyzed. For the circadian clock gene Per1, the circular pattern (oscillatory pattern 

with  a  peak  at  the  8-hour  time  point)  was  observed  (Fig.  20  C),  indicating  the  preservation  of 

rhythmic  expression  despite  TCDD  treatment.  While  acute  TCDD  treatment  impacted  hepatic 

rhythmicity, it did not completely abolish circadian oscillation, as seen in previous studies with 

sub-chronic TCDD  exposure28  (Fig.  2E). This  finding  suggests  that  the  disruption  of  circadian 

rhythms  by TCDD  may  be  dependent  on  the  duration  and  dosage  of  exposure. As  expected,  a 

significant increase in the expression profiles of TCDD response genes (Cyp1a1 and Ahrr) was 

90 

 
observed after treatment (Fig. 20 D and 20 F). Interestingly, the expression of Ahrr and Cyp1a1 

achieved  saturation  at  12  hours  post-treatment  (Fig.  20  F),  while  Cyp1a2  and  Tiparp  reached 

saturation at or before the 2-hour time point (data not shown). This differential temporal response 

of  TCDD-inducible  genes  suggests  a  complex  regulatory  mechanism  underlying 

the 

transcriptional response to TCDD exposure. The observed circular pattern in the UMAP analysis, 

coupled with the analysis of circadian clock genes and TCDD response genes, provides insights 

into the impact of acute TCDD exposure on hepatocyte rhythmicity and gene expression dynamics. 

While  TCDD  treatment  altered  hepatic  rhythmicity,  it  did  not  completely  abolish  circadian 

oscillations, at least in the acute exposure scenario. Additionally, the varying temporal responses 

of TCDD-inducible genes highlight the intricate transcriptional regulation mechanisms at play in 

the liver's response to dioxin exposure. 

A 

C 

B 

D 

Figure 20: Visualization of acute TCDD in mouse hepatocytes. UMAP of hepatocytes colored by 

91 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 20 (cont’d) 
(A) Time in hours after treatment, (B) Dose in μg/kg of TCDD, (C) Circadian clock gene Per1 
showing peak expression between timepont 8 to  12 hrs and  (D) TCDD  activated gene Cyp1a1 
showing almost zero expression at dose 0 μg/kg (vehicle/sesame gavaged) and high expression at 
dose 30 μg/kg (TCDD gavaged). Each dot represents a cell. Time series expression of Circadian 
and TCDD  response  genes.  Normalized  counts  time  series  expression  of  known  (E)  Circadian 
clock genes Arntl (Bmal1), Cry1 and (F) TCDD activated genes Ahrr and Cyp1a1. Expression of 
genes are plotted as a function of sesame oil vehicle (0 μg/kg) in blue and TCDD (30 μg/kg) in 
orange. 

E 

F 

Zonation of hepatocyte from single nuclei gene expression profile. 

Identifying zonated genes in the liver lobule is a challenging task when using single-nucleus RNA 

sequencing (snRNA-seq) data, as zonation is not directly measured in these experiments. Instead, 

92 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
zonation patterns need to be inferred from the gene expression profiles of individual hepatocytes. 

However, this inference can be confounded by various experimental factors that can distort the 

zonation profiles. For example, the centrally zonated gene Cyp1a2133 is known to be activated by 

TCDD treatment,  while another centrally zonated gene, Slc1a2, exhibits a rhythmic expression 

pattern  over  time.  To  accurately  infer  zonation  from  hepatocyte  gene  expression,  these 

confounding  factors  need  to  be  corrected.  To  address  this  challenge,  we  utilized  the  approach 

developed  by  Nault  et  al124.  The  first  step  involved  batch  correction  of  the  single-cell  data  to 

remove  variance  introduced  by TCDD  exposure  and  temporal  effects. This  step  is  crucial  as  it 

eliminates the influence of these confounding factors on the gene expression profiles, allowing for 

a more accurate inference of zonation. After batch correction, trajectory inference was performed 

to calculate the latent zonation value for each individual hepatocyte. This approach leverages the 

inherent  structure  and  relationships  within  the  gene  expression  data  to  infer  the  underlying 

zonation patterns. By applying batch correction and trajectory inference, we aimed to disentangle 

the effects of TCDD treatment and temporal variations from the gene expression profiles, thereby 

enabling a more accurate inference of hepatocyte zonation. This approach is particularly important 

in the context of studying the impact of TCDD on zonation, as it allows for the identification of 

zonated genes that may be affected by dioxin exposure, while accounting for potential confounding 

factors. To perform batch correction on the single-nucleus RNA sequencing (snRNA-seq) data, we 

utilized  scVI129,  a  variational  autoencoder  (VAE)  method. The  choice  of  a VAE  approach  was 

driven  by  the  large  size  of  the  dataset,  which  comprised  approximately  130,000  cells  after 

preprocessing. Most single-cell integration tools are not designed to efficiently handle datasets of 

such a magnitude. However, scVI129 was specifically developed to integrate large gene expression 

atlases across multiple laboratories and to batch correct technical variations, making it well-suited 

93 

 
for  this  analysis.    The  batch  correction  process  using  scVI  aimed  to  disentangle  the  technical 

variation introduced by TCDD treatment and time of harvest from the biological variation in gene 

expression. By removing these confounding technical factors, the resulting batch-corrected data 

would  provide  a  more  accurate  representation  of  the  underlying  biological  processes,  such  as 

hepatocyte zonation and rhythmic gene expression patterns. Visualizations of the batch-corrected 

data (Fig. 21 A-C) demonstrate the effectiveness of the scVI approach in mitigating the effects of 

TCDD treatment and time of harvest on the gene expression profiles. This batch-corrected data 

served  as  the  foundation  for  subsequent  trajectory  inference  analyses,  enabling  a  more  reliable 

inference of hepatocyte zonation patterns while accounting for the potential confounding factors 

introduced by the experimental design. 

A 

B 

C 

Figure 21: UMAP of batch correction by scVI of single-nuclei RNA sequencing data. 

94 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 21 (cont’d) 
(A) UMAPs of single-nuclei RNA sequencing data and the latent space of scVI (batch correction). 
Cells are colored by the combined label of hours and TCDD dose treatment. (B) Batch corrected 
single-nuclei  RNA  sequencing  colored  by  0  μg/kg  dose  treatment  (blue)  and  30  μg/kg  dose 
treatment (orange). (C) Batch corrected single-nuclei sequencing colored by six (6) time-points in 
hours after treatment. Each dot represents a cell. 

After  batch  correcting  the  single-nucleus  RNA  sequencing  (snRNA-seq)  data  using  scVI,  we 

employed  a  trajectory  inference  algorithm,  diffusion  pseudo-time134,135  on  the  latent  space 

generated by scVI to infer the trajectory of the portal-central axis. This approach aimed to capture 

the zonation patterns within the liver lobule. Utilizing a second component of the diffusion pseudo-

time  plot  (analogous  to  components  of  PCA)  as  our  trajectory,  we  observed  most  zonal  genes 

follow along the component of either high expression in the portal axis with low expression in the 

central axis or high expression in the central axis with low expression in the portal axis. Expression 

values  were  normalized  and  reoriented  from  portal  to  central  expression.  We  referred  to  the 

trajectory inference values as "Pseudo-space," which we defined as an ordering of cells based on 

how  closely  they  approximate  the  expression  patterns  of  centrally  zonated  hepatocytes 

(represented  by  a  value  of  1)  and  portal  zonated  hepatocytes  (represented  by  a  value  of  0). To 

validate whether the inferred pseudo-space accurately captures liver lobule zonation, we examined 

the expression profiles of known zonated genes, Cyp2f2 and Slc1a2, across the pseudo-space axis. 

Significant  correlations  were  observed  between  the  expression  levels  of  these  genes  and  the 

pseudo-space values of the cells. Specifically, we observed high expression of Slc1a2 at the central 

axis (Fig. 22 B) and high expression of Cyp2f2 at the portal axis (Fig. 22 C), consistent with their 

known zonation patterns. As a negative control, we analyzed non-zonated genes, such as Arntl and 

Clock,  which  are  not  expected  to  exhibit  zone-dependent  expression  patterns¹. As  anticipated, 

these genes did not show significant correlations with the pseudo-space values, further validating 

the  accuracy  of  the  inferred  zonation  trajectory.  Our  analysis  confirmed  that  no  significant 

95 

 
 
correlations  exist  between  the  expression  of  the  non-zonated  genes  Arntl  and  Clock  and  the 

computed pseudo-space metric. The expression of these genes was observed to be non-zonated in 

nature (Fig. 22 D). The presence of strong correlations for known zonated genes, such as Cyp2f2 

and  Slc1a2,  and  the  absence  of  such  correlations  for  non-zonated  genes,  confirms  that  the 

computed pseudo-space metric accurately reflects the zonation pattern across the liver lobule. To 

streamline subsequent analyses, we binned the continuous pseudo-space values into distinct layers 

(Fig. 22 A). The number of layers used depends on the resolution of the data. For example, Halpern 

et al95. defined fifteen zones, whereas Droin et al29. used only eight zones. In our study, we elected 

to bin the data into five layers, ensuring at least two thousand cells in each layer. The pseudo-space 

axis was divided into five equal-length bins. As expected, fewer cells were binned into the first 

(central)  layer  compared  to  the  last  (portal)  layer  (Fig.  4A).  This  observation  aligns  with  the 

anatomy of the liver lobule, where there is a single central vein and numerous peripheral portal 

triads, resulting in fewer cells located proximal to the central vein. Using the binned pseudo-space 

values, we generated pseudo-bulk expression profiles for each treatment, time point, and layer by 

summing the counts across cells within each combination of these factors. These count sums were 

then normalized to account for differences in the number of cells comprising each pseudo-bulk 

profile  using  size  factor  estimation136.  The  resulting  normalized  expression  values  for  each 

treatment × time × layer combination were utilized in downstream analyses and gene classification. 

The binning of the continuous pseudo-space values into distinct layers facilitated the analysis of 

zonated gene expression patterns by providing a discrete representation of the spatial organization 

within the liver lobule. By generating pseudo-bulk  expression  profiles for each treatment,  time 

point, and layer combination, we could effectively capture the dynamic changes in gene expression 

across different experimental conditions and spatial locations within the liver lobule. The pseudo-

96 

 
space values obtained through trajectory inference provide a quantitative measure of hepatocyte 

zonation,  allowing  for  the  identification  and  analysis  of  genes  exhibiting  zonated  expression 

patterns within the liver lobule. By combining batch correction techniques (scVI) and trajectory 

inference  (diffusion  pseudo-time),  we  were  able  to  disentangle  the  confounding  effects  of 

experimental  factors  and  accurately  capture  the  underlying  zonation  patterns,  enabling  a  more 

comprehensive  understanding  of  the  spatial  organization  and  functional  specialization  of 

hepatocytes in response to TCDD exposure. 

A 

B 

C 

D 

Figure  22:  UMAP  visualization  of  pseudo-space  binning  and  zonation.  (A)  UMAP  plot  of  the 
pseudo space of hepatocytes binned into zonation layers. The pseudo-space value of 1 represents 

97 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 22 (cont’d) 
hepatocytes  with  high  pericentral  gene  expression  and  0  represents  hepatocytes  with  high 
periportal  gene  expression.  Zonation  layers  are  deduced  between  number  1  (pericentral)  and  5 
(periportal).    UMAP  plot  of  (B)  Pericentral  gene  marker  Slc1a2,  (C)  Periportal  gene  marker 
Cyp2f2 and (D) Circadian clock gene Arntl. Each dot represents a cell. 

Figure 23:  Bar plot of the number of hepatocytes in each inferred layer. Layers with smaller values 
represent  more  pericentral  hepatocytes  and  layers  with  larger  values  represent  more  periportal 
hepatocytes. 

Classifying hepatocytes genes according to rhythmicity, zonation, and effect of TCDD 
treatment using a non-linear mixed effects model. 

To investigate the spatial and zonated expression profiles of genes and assess how these profiles 

are altered by TCDD treatment, we adopted and extended the approach outlined in Droin et al29. 

In their approach, rhythmicity (R) is modeled using sine and cosine functions, and zonation (Z) is 

modeled  using  first  and  second-order  Legendre  polynomials.  We  expanded  on  their  model  by 

including the effect of TCDD exposure (Dioxin; D) on gene expression across the peri-central axis 

of the liver lobule. We represented TCDD treatment in the model with a simple indicator function, 

where TCDD-treated hepatocytes were denoted as D=1, and untreated hepatocytes as D=0. Using 

98 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
these  variables,  we  constructed  a  series  of  non-linear  mixed-effects  models  (NLMEMs) 

comprising different combinations of factors (D, Z, and R) that could potentially influence gene 

expression based on the experimental design and zonation inference (Fig. 24 and 25). The effects 

of these factors can be independent (e.g., Z + R) or dependent (e.g., D × Z), as illustrated in Fig. 

6. With this modeling approach, we classified the top 15,000 highly variable genes according to 

the  best  fitting  NLMEM  model  to  describe  each  gene's  expression  pattern  (see  methods). The 

NLMEM approach allowed us to model the complex interplay between TCDD exposure, rhythmic 

gene  expression,  and  zonation  patterns  within  the  liver  lobule.  By  incorporating  the  TCDD 

treatment factor (D) into the model, we could assess the impact of dioxin exposure on the spatial 

and  temporal  expression  dynamics  of  individual  genes. The  model  formulations  included  both 

independent  and dependent  effects,  enabling the capture of various scenarios. For instance, the 

independent effects (e.g., Z + R) would describe genes whose zonation and rhythmicity patterns 

are not influenced by TCDD treatment, while the dependent effects (e.g., D × Z) would capture 

genes whose zonation patterns are altered by TCDD exposure. By fitting these NLMEM models 

to the gene expression data and selecting the best-fitting model for each gene, we classify the top 

highly  variable  genes  into  different  categories  based  on  their  expression  patterns.  This 

classification  provides  insights  into  the  diverse  transcriptional  responses  to  TCDD  treatment, 

including genes exhibiting altered zonation patterns, disrupted rhythmicity, or a combination of 

both. 

99 

 
 
 
 
 
 
Figure 24: Schematic illustrations of each single or no effect non-linear mixed effect model profiles 
for classification of hepatocyte genes. Ideal simulations of each model plotted in terms of Time (in 
hours after treatment) versus expression or inferred layers of the liver lobule. Each line plot is 
colored either by time, layer of dose of TCDD of treatment. TCDD (Dioxin) influence is delineated 
by D, influence of liver rhythmicity is delineated by R, and influence of liver zonation (layers) are 
delineated Z. Models with no influence from either D, R, or Z are delineated as flat, F. 

100 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 25: Schematic illustrations of multiple effect  non-linear mixed effect  model profiles  for 
classification of hepatocyte genes. Ideal simulations of each multiple model category plotted in 
terms of Time (in hours after treatment) versus inferred layer of the liver lobule. Each line plot is 
colored either by time and layer of dose of TCDD of treatment. Effects that are separated with ‘+’ 
indicate models in which factors are independent of one another. Effects that are separated with a 
‘x’ indicate models in which factors are dependent on one another. Example: Z+R indicates genes 
that  are  zonated  and  rhythmic  independently  and  ZxR  indicates  genes  that  are  zonated  and 
rhythmic dependently. 

101 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
To evaluate the accuracy of my gene classification approach, we initially examined a set of known 

genes  with  established  rhythmic,  zonated,  or  TCDD-responsive  expression  patterns.  These 

included Arntl (known for rhythmic expression), Slc1a2 (exhibiting both zonated and rhythmic 

patterns), and Ahrr (a gene induced by TCDD with saturation observed at 12 hours).  As shown in 

Fig. 26, the model successfully captures the rhythmic profiles of Arntl and Slc1a2. The rhythmic 

expression  patterns  of  these  genes  are  accurately  represented  by  the  periodic  functions 

incorporated in the model formulations. Additionally, the model can differentiate between TCDD 

saturation effects and true temporal  rhythmicity. This distinction is crucial, as some genes may 

exhibit saturation in expression levels due to TCDD treatment, which should not be mistaken for 

rhythmic oscillations. Furthermore, the model correctly identifies the zonated pattern of Slc1a2. 

The  incorporation  of  Legendre  polynomials  in  the  model  allows  for  the  accurate  modeling  of 

spatial zonation patterns within the liver lobule. Finally, Ahrr is properly classified by the model 

as being influenced by TCDD. The model captures the TCDD-induced expression changes in this 

gene, as evidenced by the distinct expression profiles between the treated and untreated conditions. 

Taken together, the expected expression patterns of these control genes are recapitulated, thereby 

validating the performance of our classification model. The model's ability to accurately capture 

rhythmic, zonated, and TCDD-responsive expression patterns in these well-characterized genes 

provides confidence in its applicability to the classification of the larger gene set. 

102 

 
 
 
 
 
 
 
 
 
A 

B 

C 

Figure 26: Classification of canonical rhythmic, zonal and TCDD responsive genes. (A) Bar plots 
of Schwarz weigh classification of Arntl, Slc1a2, and Ahrr. Line plots of expression colored by (B) 
Layer and TCDD dose. 1_0 denote lineplot expression of gene in the pericentral region after 0 
μg/kg dose treatment and 5_30  denote lineplot expression of gene in the periportal region after 30 
μg/kg dose treatment. (C) Hours after treatment and dose of TCDD. 2_0  denote lineplot expression 
of gene at timepoint 2hrs after 0 μg/kg dose treatment and 24_30  denote lineplot expression of 
gene timepoint 24hrs after 30 μg/kg dose treatment. 

103 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Finally,  we  categorized  each  gene  mRNA  profile  into  one  of  ten  distinct  patterns  based  on  its 

expression  dynamics:  Flat  (F)  for  profiles  lacking  rhythmic,  zonated,  or  TCDD-induced 

influences; Rhythmicity (R) for genes exhibiting purely rhythmic expression patterns; Zonated (Z) 

for genes displaying solely zonated expression patterns; Dioxin (D) for genes induced by TCDD 

treatment;  independent  Zonated  and  Rhythmicity  (Z+R)  for  genes  showing  both  zonated  and 

rhythmic expression patterns independently; dependent Zonated and Rhythmicity (Z×R) for genes 

with  both  zonated  and  rhythmic  expression  patterns  dependently;  dependent  Dioxin  and 

Rhythmicity (D×R) for genes exhibiting both TCDD-induced influence and rhythmic expression 

patterns dependently; dependent Dioxin and Zonated (D×Z) for genes displaying both zonated and 

TCDD-induced  influence  independently;  independent  Zonated  and  Rhythmicity  with  TCDD 

influence  (D×(Z+R))  for  genes  showing  both  zonated  and  rhythmic  expression  patterns 

independently and influenced by TCDD treatment; and dependent Zonated and Rhythmicity with 

TCDD influence (D×(Z×R)) for genes exhibiting both zonated and rhythmic expression patterns 

dependently and influenced by TCDD treatment. The classification revealed that over 39% of the 

gene mRNA profiles were recorded as Flat (F), indicating no significant rhythmic, zonation, or 

TCDD influence. Additionally, 16.63% of the mRNA profiles were classified as Zonated (Z), while 

15.15%  and  8.84%  were  classified  as  Rhythmic  (R)  and  Dioxin  (D),  respectively.  These 

percentages agree with a previous study by Droin and Kholtei et al.¹, where they recorded more 

mRNA profiles classified as Zonated (Z) compared to the Rhythmicity (R), independent Zonated 

and Rhythmicity (Z+R), and dependent Zonated and Rhythmicity (ZxR) classes. The classification 

of gene expression profiles into these 10 patterns allowed for a comprehensive characterization of 

the diverse transcriptional responses to TCDD treatment, rhythmicity, and zonation within the liver 

lobule. By accounting for the independent and dependent effects of these factors, as well as their 

104 

 
potential interactions, we could gain insights into the complex regulatory mechanisms governing 

gene expression dynamics in hepatocytes. Furthermore, the agreement between our findings and 

those of the previous study by Droin et al29. regarding the prevalence of zonated gene expression 

patterns provides additional confidence in the validity of our approach and the biological relevance 

of the observed patterns. 

The  classification  also  revealed  that  about  32%  of  the  mRNA  profiles  exhibited  rhythmic 

expression patterns, encompassing classes with a rhythmic component like R, Z+R, RxZ, DxR, 

etc. Additionally, approximately 35% of the mRNA profiles showed zonated expression patterns, 

including  classes  with  a  zonation  component  such  as  Z,  Z+R,  RxZ,  DxZ,  etc.  Further  analysis 

revealed that nearly 50% of rhythmic genes (mRNA profiles classified with an R component) were 

impacted  by  TCDD  treatment,  compared  to  around  25%  of  zonated  genes  (mRNA  profiles 

classified with a Z component) (Fig. 27 B). This finding suggests that TCDD treatment had a more 

pronounced effect on disrupting rhythmic gene expression patterns compared to zonated patterns. 

The largest multi-effect category was D×(Z+R), which exhibited synergistic effects of TCDD on 

zonation and rhythmicity independently (Fig. 27 A). This class represents genes whose zonation 

and rhythmic expression patterns were independently influenced by TCDD treatment. Gene set 

enrichment analysis of the D×(Z+R) class showed that these genes were enriched for canonical 

pathways involved in TCDD liver toxicity, such as metabolism of xenobiotics and drug metabolism 

by cytochrome P450 (Fig. 27 C). This enrichment indicates that the genes in this class play crucial 

roles  in  the  liver's  response  to  TCDD  exposure,  potentially  contributing  to  the  observed 

toxicological  effects.  Notably,  no  genes  displayed  dependence  between  dose,  zonation,  and 

rhythmicity (D×Z×R), suggesting that the effects of TCDD treatment on zonation and rhythmicity 

were independent in the context of this study. Overall, our results demonstrate that gene expression 

105 

 
in  the  liver  is  not  static  but  is  partitioned  across  the  liver  lobule  in  space  (zonation),  time 

(rhythmicity),  and  by  treatment  dose. The  interplay  between  these  factors  results  in  a  complex 

landscape  of  gene  expression  patterns,  with  TCDD  treatment  exerting  differential  effects  on 

rhythmic and zonated gene expression profiles. By integrating spatial, temporal, and treatment-

related factors in our analysis, we could comprehensively characterize the transcriptional dynamics 

within the liver lobule and elucidate the impact of TCDD exposure on these intricate regulatory 

mechanisms. This  approach  provides  a  more  holistic  understanding  of  the  molecular  processes 

governing liver function and toxicological responses, contributing to the development of targeted 

interventions and therapeutic strategies. 

A 

B 

Figure 27: Classification distributions and enrichment of TCDD toxicity pathways. (A) Bar plot 
of the number of genes (mRNA profiles) in each class for the top 15,000 HVGs. (B) Bar plot of 
the proportion of genes that have TCDD influence for all zonal genes (5182 total zonal genes) and 
all rhythmic genes (4865 total rhythmic genes). (C) GSEA analysis of the Dx(Z+R) class of genes. 

106 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
C 

Figure 27 (cont’d)  

The effects of acute 2,3,7,8 Tetrachlorodibenzo-p-dioxin (TCDD) perturbation of rhythmic 
genes. 

Utilizing the classification framework illustrated in Figure 27A, our investigation delved into the 

acute toxicity effects on rhythmic genes within the liver lobule. We first analyzed the the impact 

of TCDD on a core set of circadian rhythm genes (Clock, Arntl, Per2, Cry1, Nr1d1, Npas2, Rorc) 

by by examining their mRNA transcript classification. Notably, we found that all core circadian 

clock genes exhibited rhythmicity in their classification classes. Only Arntl, Clock and Rorc were 

classified as purely rhythmic (Fig. 28 A). However, upon closer examination of their expression 

patterns, we found that Arntl and Clock exhibited significantly increased expression between 11 

and  16  hours  (p-value  <  0.01,  Mann-Whitney  U-test)  with  TCDD  treatment.  This  observation 

aligns with the saturation of TCDD response genes analyzed in Figure 20 F. Conversely, Npas2, 

Per2,  and  Nr1d2  were  classified  as  having TCDD  influence  (Fig  28 A).  Per2  had  significantly 

higher expression for all time points except 18 hours post-treatment. Npas2 had significantly lower 

107 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
expressions  at  timepoints  2,  4,  and  24.  Nr1d2  exhibited  significantly  higher  expression  at 

timepoints 2 and 4, but significantly lower expression at timepoint 12. Cry1 was classified as zonal 

and rhythmic, which corroborates previous studies. However, the second-highest classification of 

Cry1 implied TCDD influence, observable at timepoints 18 and 24.  

Figure  28:  Core  circadian  clock  genes  classified  as  only  rhythmic.  (A)  Bar  plots  of  Schwarz 
weights for gene classification. (B) Line plots of gene expression graphed with respect to hours 
after treatment and colored by dose of treatment of TCDD. (C) Violin plots plotting the distribution 
of expression for each treatment condition at each timepoint. 

108 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure  29:  Core  circadian  clock  genes  classified  as  having  multiple  effects.  (A)  Bar  plots  of 
Schwarz weights for gene classification. (B) Line plots of gene expression graphed with respect to 
hours  after  treatment  and  colored  by  dose  of  treatment  of TCDD.  (C) Violin  plots  plotting  the 
distribution of expression for each treatment condition at each timepoint. 

109 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
To  quantify  the  effects  of  TCDD  on  rhythmic  gene  expression,  we  explored  whether  TCDD 

treatment  induced  or  removed  rhythmicity  in  genes  classified  as  rhythmic  and  influenced  by 

TCDD (DxR and Dx(R+Z)). First, we used dryR (Differential RhythmicitY analysis in R)132,137, a 

package in R, to assess the differential rhythmicity of the control and treated genes found in the 

DxR and Dx(R+Z) classes. This analysis unveiled that genes such as Elovl7 and Cdh13 gained 

rhythmicity subsequent to TCDD exposure, whereas Klf9 and Myc lost their rhythmic patterns (as 

depicted in Fig 29 A and B). We further analyzed rhythmicity changes by using a likelihood ratio 

test to compare whether gene expression better fit a rhythmic or flat (non-rhythmic) model. Genes 

fitting a rhythmic model under control but a flat model under treatment were deemed to have lost 

rhythmicity. Conversely, those fitting a flat expression model in control but a rhythmic model in 

treatment were classified as having gained rhythmicity. Among TCDD-influenced rhythmic genes, 

distribution analysis revealed 13% lost rhythmicity after treatment, while 21% gained rhythmicity 

(Fig 29 C). Subsequent gene set enrichment analysis (GSEA) conducted on both the gained and 

lost  rhythmicity  groups  elucidated  hallmark  pathways  of  TCDD  toxicity.  The  lost  rhythmicity 

group showed enrichment for processes related to chemical carcinogenesis, drug metabolism and 

metabolism of xenobiotics by cytochrome P450, while the gained rhythmicity group was enriched 

for  retinol  metabolism  138.  In  summary,  TCDD  exposure  induced  both  gains  and  losses  of 

rhythmicity in  different  sets  of genes,  with  the affected genes enriched for processes related to 

TCDD toxicity and metabolism. 

110 

 
 
 
 
A 

B 

C 

E 

D 

Figure 30: Analysis of gain or loss of rhythmicity for TCDD influenced rhythmic genes. 

111 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 30 (cont’d) 
Time  series  expression  profile  (A)  gained  rhythmicity  genes  Elovl7,  and  Cdh13  (B)  lost 
rhythmicity genes Klf9, and Myc for control and treated set. (C) A total of 1072 genes that were 
classified to be influenced by rhythm and TCDD were analyzed as to whether they had gained, or 
lost rhythmicity based on the likelihood ratio test using dryR. A bar plot of the proportion of genes 
analyzed colored by whether they lost or gained rhythmicity. (D) GSEA analysis of gene sets for 
genes that gained rhythmicity. (E) GSEA analysis of gene sets for genes that lost rhythmicity. 

Further analyzing genes that maintained rhythmicity after TCDD exposure, we investigated how  

TCDD impacted the properties of their rhythmic expression patterns, specifically the amplitudes 

and phases. When analyzing the core circadian clock genes, we observed only small changes in 

the  phase  and  amplitude  of  these  genes  (Figure  30  B)  after  TCDD  treatment.  Extending  this 

analysis to all genes that kept rhythmicity, we similarly found no major trends reflecting a delay 

in  phase  or  reduction  of  amplitude  after  TCDD  exposure  (Figure  30  C).  To  determine  if  the 

magnitude of gene expression changes induced by TCDD treatment correlated with changes in the 

phase  or  amplitude  of  rhythmic  expression,  we  examined  the  relationship  between  these 

parameters. However, we found only a weak correlation between the magnitude of gene expression 

changes and the magnitude of phase or amplitude changes (Figure 30 D). These findings suggest 

that while TCDD exposure induced gains and losses of rhythmicity in certain gene sets, the genes 

that  maintained  rhythmicity  after  treatment  generally  exhibited  minimal  alterations  in  their 

rhythmic  properties,  such  as  phase  and  amplitude.  Furthermore,  the  extent  of  gene  expression 

changes induced by TCDD did not strongly correlate with the degree of phase or amplitude shifts 

in the rhythmic expression patterns of these genes. 

112 

 
 
 
 
 
 
 
Figure 31:  Analysis of rhythmic parameters for TCDD influenced rhythmic genes. (A) Schematic 
and equations describing phase and amplitude of gene’s expression. (B) Bar plot of core circadian 
clock genes and their respective phase and amplitude in each treatment group. Phase is measured 
in radians, and amplitude is measured in normalized counts (C) Violin plot of all genes that kept 
their rhythmicity’s rhythmic parameters. (D) Regression plot of the magnitude of change of phase 
or amplitude vs. the magnitude of the mean log fold change in expression with respect to treatment. 

113 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
The effects of acute 2,3,7,8 Tetrachlorodibenzo-p-dioxin (TCDD) perturbation of rhythmic 
genes. 

To elucidate the effects of TCDD on zonal gene expression patterns in the liver, we examined the 

enrichment  of  known  transcriptional  targets  of  the  Wnt/β-catenin,  Ras,  and  hypoxia  signaling 

pathways, which are primarily responsible for regulating hepatic zonation35,139. Our investigation 

aimed  to  ascertain  whether  each  zonated  gene  class  exhibited  enrichment  in  these  zonation 

pathways, considering the presence or absence of TCDD influence. Our analysis unveiled that all 

targets associated with zonation pathways demonstrated enrichment within the Z+R class of genes, 

which exhibited both zonal and rhythmic expression patterns. Interestingly, the TCDD-perturbed 

Dx(Z+R) genes and the dual-effect DxZ class, which exhibited both TCDD influence and zonal 

expression,  were  enriched  for Wnt  and  hypoxia  pathway  target  genes  but  not  for  Ras  pathway 

target genes (Figure 31). These results demonstrate that TCDD selectively disrupts Ras signaling-

mediated periportal gene expression, while the zonation programs regulated by the Wnt/β-catenin 

and  hypoxia  signaling  pathways  remain  largely  intact.  In  summary,  our  findings  suggest  that 

TCDD exposure selectively perturbs the Ras signaling pathway, which is responsible for regulating 

the  expression  of  periportal  genes,  while  leaving  the  Wnt/β-catenin  and  hypoxia-mediated 

zonation programs relatively unaffected. This selective disruption of Ras signaling may contribute 

to  the  hepatotoxic  effects  of  TCDD  and  highlights  the  complex  interplay  between  xenobiotic 

exposure, signaling pathways, and spatial gene expression patterns in the liver. 

114 

 
 
 
 
 
 
Figure 32: Enrichment analysis of zonation regulation pathway targets on all zonation gene classes. 

Building upon the analysis of rhythmic gene expression, we conducted an analogous investigation 

of differential zonation to discern whether genes in the DxZ and Dx(Z+R) categories experienced 

alterations in their zonal expression patterns following TCDD treatment. Using a likelihood ratio 

test, we performed zonation differential expression analysis to identify genes that gained, lost, or 

maintained zonal specificity with TCDD exposure. Our findings indicate that 18% of the analyzed 

genes  gained  zonation  after  TCDD  treatment,  while  13%  lost  their  zonal  expression  patterns 

(Figure  32).  Gene  set  enrichment  analysis  revealed 

that  pathways  related 

to  UDP-

glucuronosyltransferase  enzymes,  such  as  "Pentose  and  Glucuronate  Interconversions,"  were 

overrepresented among genes losing zonation (Figure 32), with no pathways enriched in the gained 

115 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
zonation set. Furthermore, in analyzing whether the genes that lost or gained zonation exhibited a 

more portal or central enrichment, we observed that genes losing zonation were significantly more 

centrally  enriched  compared  to  the  background  (p-value  <  0.001,  Kolmogorov-Smirnov  Test), 

indicating selective disruption of periportal gene expression by TCDD (Figure 32). Together, these 

results demonstrate that TCDD exposure leads to bidirectional alterations in hepatic zonation, with 

apparent centrilobular targeting of UDP-glucuronosyltransferase zonal discontinuity. The selective 

disruption  of  periportal  gene  expression  patterns  and 

the  enrichment  of  UDP-

glucuronosyltransferase pathways among genes losing zonation suggest a potential mechanism for 

TCDD-induced hepatotoxicity through impaired xenobiotic metabolism and clearance. 

B 

A 

C 

Figure 33: Analysis of loss or gain  of zonation based on TCDD treatment.  (A) A total  of 1059 
genes that were classified to be influenced by Zonation and TCDD were analyzed as to whether 

116 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 33 (cont’d) 
they had gained, or lost zonation based on the likelihood ratio. (B) GSEA analysis of genes that 
lost zonation with TCDD treatment. (C) Stacked bar plot describing the distribution of zonated 
genes and in which zone of the liver lobule are those genes most highly expressed. Differences in 
distribution  calculated  using  the  Kolmogorov-Smirnov  Test.  **  means  significant  difference 
between the two expressions at the time point and ns means no significant difference between the 
two expressions. 

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) analysis of dose affected 
categories. 

To  understand  the  transcriptional  regulation  across  various  gene  class  categories  under  the 

influence of TCDD, which impacts both zonal and rhythmic gene expression, we investigated the 

presence of binding motifs for the transcription factors BMAL175 (E-box motifs) and AHR75,140 

(Dioxin Response Elements, DREs) in these genes. While the presence of motifs is necessary but 

not  sufficient  for  transcription  factor  binding,  we  focused  specifically  on  motifs  located  in 

accessible  chromatin  regions,  mapped  by  DNase  I  hypersensitivity  (DHS)  data  in  mouse  liver 

tissue  from  the  ENCODE  project.  Using  the  GRCm38  reference  genome  and  BEDTools68,  we 

extracted all canonical E-box (CANNTG) and DRE (GCGTG) motifs located in DHS peaks. We 

then matched these accessible motifs to hepatic AHR ChIP-seq data in male C57BL/6 mice treated 

with  30  μg/kg  TCDD,  to  identify  motifs  likely  bound  by  transcription  factors.  Our  approach 

involved extracting genes with E-box and DRE binding motifs in their promoter or genome region, 

as well as identifying genes with an overlap of both E-box and DRE motifs (E-box intersect DREs). 

By  integrating  this  information  with  our  gene  classification  data,  we  aimed  to  elucidate  the 

potential transcriptional regulatory mechanisms governing the observed gene expression patterns 

under TCDD exposure. Specifically, we investigated whether the presence of accessible E-box, 

DRE, or overlapping E-box/DRE motifs could explain the differential expression, rhythmicity, and 

zonation patterns observed in various gene classes. By benchmarking these gene sets against the 

117 

 
 
 
classification classes derived from our non-linear mixed effects models (NLMEMs), we uncovered 

intriguing insights. Specifically, more than 55% of genes in the Dx(R+Z) class, which exhibited 

both  TCDD  influence  and  combined  zonal  and  rhythmic  expression  patterns,  contained  DRE 

(Dioxin Response Element) binding motifs, 20% exhibited E-box binding motifs, and over 5% 

possessed overlapping E-box and DRE binding motifs (E-box intersect DRE) (Figure 33 A). Gene 

set enrichment analysis further illuminated distinct pathways associated with these binding motifs. 

For  genes  enriched  with  only  E-box  binding  motifs  in  the  Dx(R+Z)  class,  pathways  related  to 

chemical carcinogenesis, drug metabolism, and xenobiotic metabolism by cytochrome P450 were 

prominent (Figure 33 B). Meanwhile, genes featuring only DRE binding motifs in the Dx(R+Z) 

class were enriched in  pathways like the PPAR signaling  pathway, p53 signaling pathway, and 

glycerolipid metabolism (Figure 33 C). Notably, genes with overlapping E-box and DRE binding 

motifs (E-box intersect DRE) in the Dx(R+Z) class exhibited enrichment in pathways related to 

vitamin  B6  metabolism  and  circadian  rhythm  (Figure  33  D).  These  findings  suggest  that  the 

interplay  between  the  circadian  clock  machinery  (E-box  motifs)  and  the  xenobiotic  response 

pathway  (DRE  motifs)  plays  a  crucial  role  in  regulating  hepatic  gene  expression  under TCDD 

exposure. The presence of accessible E-box and DRE motifs in specific gene classes may explain 

the observed differential expression, rhythmicity, and zonation patterns. Furthermore, the distinct 

pathway enrichments associated with genes containing E-box, DRE, or overlapping E-box/DRE 

motifs provide insights into the biological processes potentially regulated by these transcription 

factor binding sites under TCDD exposure. 

118 

 
 
 
 
 
A 

C 

B 

D 

Figure 34: ChIP-Seq and GSEA analysis of classification categories.  (A) ChIP-Seq analysis of the 
gene classification categories. Stacked histogram of genes enriched for BMAL1 binding (orange), 
genes enriched for AHR binding (green), genes enriched for AHR and BMAL1 binding (red) and 
genes enriched for neither AHR nor BMAL1 binding (blue). Gene-set enrichment analysis on (B) 
Genes enriched with BMAL1 binding (C) Genes enriched with AHR binding. (D) Genes enriched 
with AHR and BMAL1 binding. 

We  took  a  similar  approach  to  characterize  the  presence  of  E-box  motifs,  DRE  motifs,  and 

overlapping  E-box/DRE  motifs  in  genes  that  gained  or  lost  rhythmicity  following  TCDD 

treatment. Our analysis found that comparable numbers of rhythmicity-perturbed genes contained 

only  E-boxes,  only  DREs,  or  intersecting  E-box/DRE  motifs.  Over  50%  of  TCDD-affected 

zonated genes that lost spatial zonation contained DRE motifs in accessible promoter or genomic 

119 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
regions. Notably, about 38% of arrhythmic genes lacked both DRE and E-box motifs, suggesting 

other mechanisms of regulation (Figure 34 A). Gene set enrichment analysis linked E-box/DRE-

containing genes that lost zonation to pathways including starch/sucrose metabolism, fatty acid 

elongation,  and  proteoglycans  in  cancer.  E-box/DRE  motifs  were  also  enriched  in  arrhythmic 

genes  associated  with  bladder  cancer,  insulin  signaling,  FoxO  signaling,  and  related  pathways 

(Figure 34 B). No pathways were significantly enriched among gained rhythmicity genes based on 

our cutoff criteria. Overall, intersecting E-box and DRE motifs emerged as key potential mediators 

of  TCDD-disrupted  hepatic  rhythmicity  and  zonation,  though  additional  factors  appear 

contributory  for  approximately  38%  of  rhythmicity-lost  genes  that  lacked  these  motifs.  These 

findings suggest that the interplay between the circadian clock machinery (E-box motifs) and the 

xenobiotic response pathway (DRE motifs) plays a crucial role in mediating the effects of TCDD 

on hepatic gene expression rhythmicity and zonation. The presence of accessible E-box and DRE 

motifs in specific gene sets may explain the observed gains or losses of rhythmicity and zonation 

following TCDD exposure. Furthermore, the distinct pathway enrichments associated with genes 

containing  E-box,  DRE,  or  overlapping  E-box/DRE  motifs  provide  insights  into  the  biological 

processes  potentially  dysregulated  by  these  transcription  factor  binding  sites  under  TCDD 

exposure.  However,  for  a  subset  of  rhythmicity-lost  genes  (~38%),  the  lack  of  these  motifs 

suggests  the  involvement  of  other  regulatory  mechanisms  in  mediating  TCDD-induced 

arrhythmicity. 

120 

 
 
 
 
 
A 

B 

Figure 35: ChIP-Seq and GSEA analysis of TCDD perturbed rhythmic and zonated genes.  (A) 
ChIP-Seq analysis of the TCDD perturbed rhythmic genes (gained and lost rhythmicity genes) and 
TCDD  perturbed  zonated  genes  (gained  and  lost  zonation  genes).  Genes  enriched  for  BMAL1 
binding (orange), genes enriched for AHR binding (green), genes enriched for AHR and BMAL1 
binding  (red)  and  genes  enriched  for  neither AHR  nor  BMAL1  binding  (blue).  (B)  Gene-set 
enrichment analysis on lost rhythmicity and zonated genes with AHR and BMAL1 binding. 

Comparison of Gene regulatory networks (GRNs) of control and treated data. 

Gene regulatory networks (GRNs) are complex systemsthat govern gene expression patterns and 

cellular processes, influencing cellular functions and phenotypic outcomes. Traditional methods 

for  inferring  GRNs  relies  on  bulk  RNA-seq  and  chromatin  immunoprecipitation  sequencing 

(ChIP-seq)  average  out  cellular  heterogeneity  and  lack  the  resolution  to  capture  regulatory 

dynamics at  the single-cell level141,142. Recent  advances in  single-cell sequencing  technologies, 

121 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
such as single-cell RNA sequencing  (scRNA-seq) and single-nuclei  RNA sequencing  (snRNA-

seq)    have  enabled  the  measurement  of  gene  expression  levels  at  an  unprecedented  resolution, 

capturing  the  heterogeneity  and  dynamics  of  individual  cells  within  a  population126,143,144  To 

understand the gene regulatory mechanism across the portal-central axis of the liver lobule and 

how  this  mechanism  is  impacted  by  TCDD,  we  employ  ScGeneRAI145.  ScGeneRAI  is  an 

interpretable machine learning method that employs layer-wise relevance propagation (LRP) to 

infer GRNs from single-nucleus RNA sequencing data of cells in each layer. ScGeneRAI train a 

deep neural network to predict the expression of a target gene based on the expression of arbitrary 

sets  of  regulator  genes  in  single  cells.  Subsequently,  LRP  quantitatively assigns  each  regulator 

gene  a  relevance  score  for  the  target  gene  expression  prediction.  Thereby,  LRP  identifies  key 

transcription  factors  and  constructs  cell  type  specific  GRNs  de  novo.  Comparison  of  the  layer 

specific  GRNs  generated  by  scGeneRAI  then  allowed  us  to  determine  conserved  and  distinct 

regulatory interactions in central versus portal hepatocytes. Utilizing the computational framework 

tailored  for  inferring  GRNs  from  single-cell  data,  we  unraveled  the  transcriptional  regulation 

between transcription factors (TFs) and targeted genes along the portal-central axis of the liver 

lobule and the impact of TCDD on this mechanism. We analyzed the gene regulatory networks 

(GRNs)  of  central  cells  in  layer  1  to  portal  cells  in  layer  5  to  characterize  transcription  factor 

interactions with targeted genes and assess similarities and differences across the liver lobule. We 

compared the network similarities and differences across layers and across treatments (Control vs 

Treated).   

Our analysis of single-cell transcriptomic data from liver cells revealed intriguing insights into the 

gene regulatory networks (GRNs) governing circadian rhythms and zonal gene expression patterns 

along  the  portal-central  axis  of  the  liver  lobule.  Employing  the  state-of-the-art  ScGeneRAI 

122 

 
framework,  we  inferred  GRNs  from  both  control  and  TCDD-treated  datasets,  unveiling  the 

regulatory interactions between transcription factors (TFs) and their target genes. In the control 

dataset, our findings highlighted a crucial interaction between the TF NPAS2 and the gene Arntl, 

which  together  form  the  master  regulator  of  the  circadian  clock  mechanism  [2].  Notably,  this 

interaction between NPAS2 and Arntl was the only observed interaction involving the NPAS2 TF, 

underscoring its importance in maintaining circadian rhythmicity. Moreover, our analysis revealed 

that  several  TFs,  including  FOXO3,  PROX1,  RXRA,  and  GPAM,  exhibited  substantial 

interactions  with  the  Bach2  gene,  suggesting  its  potential  involvement  in  various  regulatory 

processes within liver cells. Intriguingly, upon analyzing the TCDD-treated dataset, we observed 

a  striking  disruption  of  the  interaction  between  the  NPAS2  TF  and  its  target  gene Arntl.  This 

disruption was consistent across all layers of the liver lobule, implying a potential dysregulation 

of the circadian clock mechanism following TCDD treatment. Furthermore, our findings unveiled 

a gradual reduction in the number of interactions between TFs and target genes as we traversed 

from  the  central  to  the  portal  region  of  the  liver  lobule. This  observation  aligns  with  the  well-

established  concept  of  zonal  gene  expression  and  metabolic  function  zonation  within  the  liver 

lobule. Cells in the central lobule exhibited a higher density of regulatory interactions compared 

to  those in  the portal  lobule, suggesting a more intense transcriptional  regulation  in  the central 

region.  

Finaly, we investigated the similarities between the gene regulatory network (GRN) graphs within 

the control and treated datasets. To achieve this, we utilized the average number of interactions 

based on transcription factors, their target genes, and the interactions between them. The analysis 

involved comparing the GRN similarities across different layers of the liver lobule. Our findings 

revealed that the GRN similarities decrease across the liver lobule, indicating a gradual divergence 

123 

 
in  gene  regulatory  networks  as  we  move  from  the  periportal  zone  to  the  pericentral  zone. 

Specifically, we observed that the similarity between the GRNs of adjacent layers, such as Zone 1 

and Zone 2 (Z1Z2), was higher than the similarity between the GRNs of more distant layers, such 

as  Zone  1  and  Zone  5  (Z1Z5). This  observation  suggests  that  the  gene  regulatory  mechanisms 

governing cellular processes may differ across the liver lobule, potentially reflecting the varying 

metabolic demands and functional specializations of hepatocytes in different zones. The higher 

similarity  between  adjacent  layers  could  be  attributed  to  the  gradual  transition  in  cellular 

environments and the shared regulatory mechanisms between neighboring zones. It is important 

to note that these findings are based on the analysis of transcription factors, their target genes, and 

the  interactions  between  them,  which  form  the  basis  of  gene  regulatory  networks.  Further 

investigations  may  be  required  to  elucidate  the  specific  factors  contributing  to  the  observed 

differences in GRN similarities across the liver lobule and their potential implications for liver 

function  and  disease  pathogenesis.  This  spatial  variation  in  transcription  factor-target  gene 

interactions reflects the functional specialization of hepatocytes along the central-portal axis, with 

cells  in  the  central  lobule  displaying  more  interactions  compared  to  those  in  the  portal  lobule. 

These findings underscore the intricate regulation of gene expression within the liver lobule and 

provide  insights  into  the  spatial  organization  of  transcriptional  networks  governing  hepatic 

physiology and response to environmental perturbations. 

124 

 
 
 
 
 
A 

C 

E 

B 

D 

F 

Figure 36: Weighted gene regulatory network in the form of a heatmap to understand the 
interactions between transcription factors and their weight of interaction for (A) Control Cells in 
Layer 1 (central axis) ,  (B) Treated cells in Layer 1 (central axis),  (C) Control Cells in Layer 2,  
(D) Treated cells in Layer 2, (E) Control Cells in Layer 3,  (F) Treated cells in Layer 3, (G) 
Control Cells in Layer 4,  (H) Treated cells in Layer 4, (I) Control Cells in Layer 5 (portal axis)  
and (J) Treated cells in Layer 5 (portal axis).    

125 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Figure 36 (cont’d) 
(K) Average similarity interactions of gene regulatory inference networks between layers of cells 
in the liver lobule. Z1Z2 refers to average similarities between Zone 1 (layer 1) and Zone 2 (layer 
2). Blue squares represent control average interactions (similarities) and red pentagon represent 
treated average interactions (similarities). 

H 

J 

G 

I 

K 

126 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
DISCUSSION AND CONCLUSION 

Transcription  factor  (TF)-DNA  binding  patterns  play  a  crucial  role  in  regulating  gene 

expression146. However, the precise DNA-binding sequences and the degree of flexibility in these 

sequences remain elusive for many TFs, including BMAL1, a key regulator of the circadian clock. 

While the core clock gene regulatory network in the brain's suprachiasmatic nucleus is believed to 

be similar to that in peripheral tissues, clock-controlled gene expression exhibits significant tissue 

specificity46,147.  In  this  study,  I  aimed  to  elucidate  the  determinants  of  BMAL1-DNA  binding 

beyond the simple DNA binding site sequence, leveraging features such as DNA shape and histone 

modifications.  I  employed  used  XGBoost,  an  ensemble  decision  tree-based  machine  learning 

algorithm,  to  predict  the  binding  of  BMAL1  to  its  putative  binding  motif  (the  E-box)  in  three 

mouse tissues – liver, heart and kidney.  We developed three different types of models: 1) sequence-

only, 2) sequence plus DNA shape, and 3) sequence plus DNA shape plus histone modifications 

(Fig 2A-B). Examining of the binding motifs revealed that the canonical 5’-CACGTG-3’ E-box 

type was the least frequent among all E-box types in the entire mouse genome and in accessible 

chromatin regions across the liver, heart, and kidney tissues.  However, this E-box type exhibited 

the  highest  frequency  of  BMAL1  binding  (Fig  1E),  consistent  with  previous  observations  that 

CACGTG is the preferred binding motif for BMAL1148. Despite the over-representation of the CG 

dinucleotide at  the center of BMAL1-bound E-boxes,  these nucleotides did  not  enhance model 

performance. Interestingly, the heart tissue harbored more E-boxes in accessible chromatin regions 

than the liver and kidney, yet it had the lowest number of bound E-boxes in accessible chromatin. 

The role of circadian rhythms in cardiac function is not well understood, and only 6% of protein-

coding genes in the mouse heart exhibit circadian regulation, compared to 11%-16% in the liver149. 

This observation is likely a consequence of the overall lower BMAL1 binding in the heart tissue. 

127 

 
However, the underlying reasons for the low level of BMAL1 binding to otherwise accessible E-

boxes in the heart remain unexplained. One possible explanation for the low BMAL1 binding to 

accessible E-boxes in the heart tissue could be the interference of a heart-specific E-box binding 

factor.  For  instance,  it  has  been  shown  that  elevated  levels  of  Usf1,  a  ubiquitous  transcription 

factor, can impede the binding of a mutant CLOCKΔ19:BMAL1 complex to E-box sites, and other 

such  factors  may  exist150.  Interestingly,  neither  the  kidney  nor  the  heart  within-tissue  models 

achieved  the  same  level  of  performance  as  the  liver  within-tissue  model. This  could  be  due  to 

evidence suggesting that the heart circadian rhythm might be phase-shifted compared to the liver, 

indicating  that  maximal  BMAL1  binding  in  the  heart  might  occur  at  a  different  time  than  the 

Zeitgeber time 6 (ZT06)46 used for the liver BMAL1 ChIP-seq experiment. Consequently, some 

of the heart BMAL1-bound E-boxes might have been labeled as unbound, affecting model learning 

and resulting in lower performance. A limitation of this work is that the authors considered only 

E-boxes  in  accessible  chromatin  regions  and  disregarded  inaccessible  E-boxes.  While  their 

observations  confirmed  that,  on  average,  more  than  75%  of  BMAL1  peaks  lie  in  accessible 

chromatin,  suggesting  a  higher  likelihood  of  BMAL1  binding  in  these  regions,  it  has  been 

demonstrated that the BMAL1-CLOCK complex can act as a pioneering factor and rhythmically 

control  the  accessibility  of  chromatin  surrounding  BMAL1-bound  sites151.  Recent  studies  have 

demonstrated that incorporating DNA shape features computed from core transcription factor (TF) 

binding  motifs  and  their  flanking  sequences  improves  the  prediction  of  TF  binding  for  many 

human TFs50,62,63. Furthermore, DNA topology is highly correlated with the structure and stability 

of  the  nucleosome,  suggesting  that  topological  changes  can  influence  the  binding  of  TFs  to 

DNA152. In the sequence plus shape models developed in this study, the DNA shape features EP, 

Roll, MGW, and ProT exhibited the highest influence on the prediction of bound E-boxes. A recent 

128 

 
study50 showed that for Max, a basic helix-loop-helix (bHLH) TF like BMAL1 and CLOCK, it 

was  Roll  and  ProT  that  were  the  dominant  determinants  of  TF-DNA  binding  affinity.  These 

observations  agree  with  our  findings.  Further,  in  agreement  with  previous  studies,  I  found  that 

DNA  shape  features  by  themselves  do  not  enhance  model  accuracy.  The  analysis  of  feature 

importance in the sequence plus shape plus histone modifications (HMs) models revealed that the 

HMs  H3K27ac  and  H3K4me3,  along  with  the  DNA  shape  feature  EP,  dominated  binding 

prediction across the kidney and heart tissues, and were also highly ranked in the liver. Previous 

studies  have  shown  that  H3  acetylation  and  methylation  modifications  surrounding  CLOCK-

BMAL1 bound sites exhibit rhythmic changes153. My results findings indicate that even with a 

single snapshot of these HMs from unsynchronized mice, we can accurately distinguish between 

bound and unbound E-boxes, likely owing to the information encoded in the shape and surrounding 

sequence of the E-box motif, in addition to the average levels of histone modifications. I propose 

that this is likely due to the information that is encoded in the shape and flanking sequence of the 

E-box motif in addition to the average levels of histone acetylation and methylation. The analysis 

suggest that this information is tissue-specific, as evidenced by the performance of the cross-tissue 

models. Intriguingly, the DNA sequence features alone had little to no effect on binding prediction 

in the kidney and heart. However, the second nucleotide upstream of the E-box had a significant 

contribution to predicting BMAL1-DNA binding in the liver, with the nucleotide G at this position 

contributing to approximately 50% of the feature importance score. Analysis of the bound E-box 

motifs and their flanking sequences revealed an enrichment of the G nucleotide at the third position 

of  the  5'  flanking  sequence  in  the  liver.  Since  the  heart  and  kidney  models  do  not  rely  on  this 

feature, it is understandable that the liver_kidney and liver_heart cross-tissue models showed an 

unexpected  decrease  in  performance  when  DNA  shape  and  histone  modification  features  were 

129 

 
added to the sequence features. On the other hand, the kidney_liver and heart_liver cross-tissue 

models exhibited a boost in performance with the addition of histone modification features. These 

results suggest that there is some degree of commonality in BMAL1 binding between different 

tissues.  However,  in  cross-tissue  models,  sequence-only  models  exhibited  the  most  robust 

performance, with the exception of the kidney_heart and heart_kidney models. This indicates that 

DNA shape and chromatin context features can exhibit a high degree of tissue specificity and are 

more  similar  between  the  kidney  and  heart  than  between  the  liver  and  the  other  two  tissues. 

Previous studies on the yeast bHLH transcription factors Cbf1 and Tye7 have shown that E-box 

binding specificity is governed by sequences flanking the E-box, as reflected in DNA shape154. My 

findings  extend this  concept,  indicating that not  only  might  DNA shape and chromatin context 

confer different binding specificities to different transcription factors within the same tissue, but 

they  might  also  confer  different  binding  specificities  to  the  same  transcription  factor  across 

different tissues. These observations suggest that while there may be some shared determinants of 

BMAL1  binding  across  tissues,  there  are  also  tissue-specific  factors,  such  as  DNA  shape  and 

chromatin  context,  that  modulate  BMAL1's  binding  specificity.  This  tissue-specific  regulation 

could contribute to the observed differences in clock-controlled gene expression patterns among 

various tissues. 

Dynamic modeling of biological processes from gene regulatory to multicellular network levels 

provides  critical  insights  into  the  fundamental  properties,  physiology,  and  behaviors  related  to 

circadian  rhythmicity155.  While  experimental  and  theoretical  explorations  have  extensively 

detailed the circadian clock gene regulatory network156, few studies have examined the cell-cell 

communication  processes  enabling  synchronization  of  circadian  period,  amplitude,  and  phase 

between autonomous cellular oscillators. Elucidating circadian synchronization through cell-cell 

130 

 
coupling can advance our understanding of circadian rhythm robustness and plasticity at the tissue 

level29.  Neurotransmitters,  acting  as  coupling  factors,  have  been  shown  to  regulate  the 

synchronization mechanism in the suprachiasmatic nucleus (SCN)105,157. However, the coupling 

factors in other peripheral tissues are unknown. In this study, I developed mathematical models of 

the mouse hepatic circadian clock to examine intercellular communication and synchronization of 

autonomous oscillators across the murine liver lobule. The models incorporated core clock genes 

and their regulatory interactions in individual hepatocytes. Simulations showed that incorporating 

cell-cell  coupling  led  to  synchronized  gene  expression  between  hepatocytes,  matching 

experimental findings29,158. Strong synchronicity of circadian oscillation has been associated with 

period  lengthening.  However,  the  models  suggest  that  without  synchronization,  the  period  is 

variable outside of the near 24-hr range. Therefore, optimal cell-cell coupling is required to achieve 

both synchronicity between cells and appropriate oscillatory periods159. Synchronicity of circadian 

oscillation  has  been  shown  to  induce  key  cell-cycle  events,  including  cyclin-dependent  kinase 

network activation, cell growth, DNA replication, and cytokinesis160. A weak synchronicity at the 

cellular level in the SCN manifests as mistimed sleep and impaired cognitive and psychomotor 

performance in humans. This circadian misalignment has detrimental effects on physiology and 

behavior,  including  deficits  in  reaction  time,  memory,  alertness,  and  mood8.  The  sensitivity 

analysis  revealed  dependencies  between  clock  components;  for  instance,  increased  Per 

transcription  decreased  Cry  expression,  likely  due  to  their  mutual  repression161.  Specifically, 

increased Cry transcription rates markedly diminished Per mRNA levels. However, perturbations 

in the Per transcription rate did not comparably suppress Cryptochrome transcripts37. This aligns 

with experimental reporter assays demonstrating that the repression strength of PER proteins on 

Clock/Bmal1-driven transcription is weaker relative to CRY162. The disproportionate parametric 

131 

 
sensitivities  suggest  that  CRY  dynamics  play  a  more  dominant  role  than  PER  in  governing 

circadian rhythmicity via interlocking negative feedback loops. Validating the coupled mechanism 

in  the  model,  the  authors  used  bifurcation  diagrams  to  show  the  periodic  stability  of  the 

transcription  rate  parameters.  These  theoretical  bifurcation  diagrams  generate  experimentally 

testable hypotheses, potentially through overexpression or knockdown of circadian factors in vivo 

and  in  vitro,  to  validate  predicted  changes  in  periodicity,  amplitude,  or  phase  shifts149.    Our 

knowledge of the complex mammalian circadian clock mechanism is still incomplete. More than 

40  genes  directly  interact  with  the  core  clock  genes  in  generating  the  circadian  oscillatory 

rhythm163.  It  is,  therefore,  essential  to  understand  the  temporal  and  spatial  dynamics  and  the 

regulatory mechanism of the circadian clock oscillation. I addressed this knowledge gap with a 

detailed  mathematical  model  incorporating  known  clock  and  associated  rhythmic  genes.  The 

current  model  focused  on  healthy  cells  under  normal  circadian  entrainment.  An  important 

extension would be to model circadian disruption by genetic alterations or toxicant exposure. The 

model could be used to predict the effects of parameter changes representing mutations or cellular 

damage. Linking the circadian clock model with zonated metabolism models could offer insights 

into compounding hepatic effects across scales. While the study provides valuable insights into the 

circadian clock mechanism, I acknowledge the limitations of my current model, which focuses on 

healthy  cells  under  normal  circadian  entrainment.  I  propose  extending  the  model  to  simulate 

circadian  disruption  caused  by  genetic  alterations  or  exposure  to  toxicants,  which  could  be 

achieved  by  modifying  parameters  to  represent  mutations  or  cellular  damage.  Additionally, 

integrating the circadian clock model with zonated metabolism models could provide a holistic 

understanding of the multi-scale effects on hepatic function. 

132 

 
 
The liver exhibits a remarkably intricate spatiotemporal metabolic organization. As the primary 

site of drug metabolism, elucidating the impact of chemical compounds on hepatic gene regulation 

in  both  spatial  and  temporal  domains  could  provide  critical  insights.  Firstly,  characterizing  the 

acute restructuring of rhythmic and zonal gene expression patterns in response to drug exposure 

could  reveal  adaptive  mechanisms.  Secondly,  disruption  of  these  regulatory  mechanisms  likely 

contributes  to  the  progression  of  pathological  states  such  as  non-alcoholic  fatty  liver  disease. 

Analysis of drug effects on the liver has often been limited by examining temporal rhythmicity 

and spatial zonation independently. In this study, we integrated these hepatic properties to account 

for chemical perturbations, an important consideration given the understudied aspect of how the 

liver reorganizes its metabolic framework in response to toxic stimuli. We expanded on previous 

work unifying rhythmicity and zonation to incorporate the effects of acute exposure to 2,3,7,8-

tetrachlorodibenzo-p-dioxin (TCDD). Despite the short timescale, we demonstrated that TCDD 

significantly  impacts  both  rhythms  and  zonation.  The  canonical  TCDD  receptor,  the  aryl 

hydrocarbon receptor (AhR), exhibited both rhythmic and zonal expression patterns. Moreover, 

numerous  TCDD-induced  effects  had  both  temporal  and  spatial  components.  Overall,  this 

integrated  methodology  enabled  a  more  comprehensive  characterization  of  rapid  liver 

reorganization in response to TCDD exposure than studying zonation or rhythms in isolation. Our 

findings  highlight  the  utility  of  integrating  diverse  regulatory  properties  when  elucidating  the 

impacts  of  chemical  compounds  on  hepatic  gene  regulation  and  metabolic  organization.  Our 

integrated analysis demonstrates that hepatic rhythmicity is more sensitive to acute exposure to 

2,3,7,8-tetrachlorodibenzo-p-dioxin  (TCDD)  compared  to  zonation.  This  aligns  with  previous 

findings  showing  that  subchronic  TCDD  administration  at  the  same  dose  greatly  dampened 

amplitude and significantly phase-shifted the core circadian clock28. Although acute TCDD did 

133 

 
not perturb all core clock genes analyzed, we still observed significant influences on nearly half of 

these critical circadian regulators. Effects were most notable for genes directly downstream of the 

CLOCK-ARNTL  heterodimer  (e.g.,  Per2,  Nr1d2),  supporting  the  hypothesis  that  the  aryl 

hydrocarbon  receptor  (AhR)  interferes  with  CLOCK-ARNTL  transcriptional  activation  by 

competing for ARNTL binding. Additional evidence lies in the expression pattern of Per1, a direct 

target  inhibited  by AHR-ARNTL  binding. Although  Per1,  a  core  circadian  gene,  was  excluded 

from  our  dataset  due  to  low  variance,  its  normalized  expression  showed  signs  of  repression  at 

multiple timepoints between 2-24 hours post-TCDD exposure, consistent with disrupted CLOCK-

ARNTL control. TCDD elicited variable effects on rhythmic hepatic genes; while most exhibited 

modest expression changes at specific times, overall oscillation patterns were largely maintained 

with treatment for approximately 85% of rhythmic transcripts. However, approximately 15% of 

rhythmic  genes  gained  or  lost  transcriptional  rhythmicity,  representing  selective  arrhythmic 

effects. Notably, gene set enrichment analysis linked these arrhythmia-gaining genes to canonical 

TCDD response pathways, including hallmark AhR-mediated responses126,164. This indicates that 

TCDD preferentially disrupts rhythmicity of major regulatory nodes and downstream processes 

most susceptible to aberrant AhR activation. Rather than inducing widespread distortions in cyclic 

waveforms, TCDD appears to  ablate oscillatory  control  of TCDD-vulnerable pathways. TCDD 

also significantly impacted zonal gene regulation, altering spatial expression patterns for nearly 

25%  of  zonated  transcripts.  Notably,  about  30%  of  these  genes  either  gained  or  lost  zonation, 

representing substantial chemical-induced re-patterning. Genes losing zonation were enriched for 

phase  II  UDP-glucuronosyltransferase  metabolism  and  localized  to  the  central  hepatic  layer, 

matching the enrichment of the AhR receptor itself. TCDD-perturbed genes were also enriched for 

known  regulators  of  zonation,  potentially  via  crosstalk  with Wnt/β-catenin  signaling. This  was 

134 

 
most prominent for genes jointly exhibiting rhythmic and zonal properties, which are hypothesized 

to  arise  from  Wnt-mediated  pericentral  signals.  As  AhR  shows  central  layer  enrichment,  its 

aberrant activation by TCDD could disrupt this external cue, explaining the observed perturbation 

in rhythmicity and zonation. While the precise mechanisms through which aberrant AhR signaling 

perturbs hepatic rhythmicity and zonation after TCDD exposure remain unclear, potential routes 

could  involve  direct  transcriptional  regulation  via  AhR  binding  to  target  gene  cis-regulatory 

elements  or  hierarchical  effects  by  disrupting  core  clock  regulators,  secondarily  altering  cyclic 

control of downstream processes. High-throughput approaches like ChIP-seq, profiling genome-

wide  binding  locations  for  AhR,  ARNTL,  and  binding  partners  after  TCDD  exposure,  could 

elucidate these questions and map binding sites to clarify pathways exhibiting direct, canonical 

AhR-mediated regulation versus downstream, non-canonical disruption. In summary, these results 

indicate that the disruption of temporal rhythmicity is a primary route through which TCDD alters 

hepatic  gene  regulation,  relative  to  spatial  patterning.  The  specific  effects  on  core  circadian 

components  highlight  direct  mechanisms  through  which  aberrant AhR  activation  propagates  to 

disrupt hepatic rhythmicity, with the immediate downstream targets of CLOCK-ARNTL appearing 

especially  sensitive  to  functional  interference  by  ligand-activated AhR.  This  approach  enables 

matching  of  precise  TCDD-induced  expression  changes  to  interactions  within  the  circadian 

regulatory  network,  refining  hypotheses  on  the  routes  by  which AhR  signaling  disrupts  clock 

function. The  method  enables  comparative  assessment  of  the  impacts  of  chemical  exposure  on 

rhythmic  versus  zonal  gene  expression,  revealing  that  perturbations  in  rhythmic  transcriptional 

outputs precede and potentially contribute to changes in zonation in the case of TCDD exposure. 

The  developed  methodology  provides  a  framework  readily  extensible  to  additional  chemical 

exposure  contexts,  dose-response  relationships,  and  single-nuclei  RNA  sequencing  datasets 

135 

 
assessing  hepatic  chemical  impacts.  By  matching  pathway-specific  expression  changes  to 

overarching spatiotemporal regulatory logic, this systems-based strategy has potential utility for 

pharmacology  and  toxicology,  enabling  researchers  to  situate  perturbed  pathways  within  the 

broader scheme of metabolic zonation and rhythmicity. Characterizing how xenobiotic disruption 

of  specific  functional  modules  scales  up  to  influence  zonation  control  pathways  and  circadian 

timing at the tissue level will drive a more holistic understanding of chemical threats to hepatic 

function.  Overall,  this  multi-parametric  modeling  approach  provides  a  blueprint  for  deep 

phenotyping of gene regulatory restructuring in this vital metabolic organ. 

136 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
BIBLOGRAPHY 

1. 

2. 

3. 

4. 

5. 

6. 

7. 

8. 

9. 

Huang, T. C., Tu, J., Chow, T. J. & Chen, T. H. Circadian Rhythm of the Prokaryote 
Synechococcus sp. RF-1. Plant Physiol 92, 531 (1990). 

Hastings, M. H. Circadian Clocks. Current biology : CB vol. 7 (1997). 

Ko, C. H. & Takahashi, J. S. Molecular components of the mammalian circadian clock. 
Hum Mol Genet 15 Spec No 2, (2006). 

Observation botanique. Histoire de l’Academie Royale de Sciences (Paris). 
https://www.researchgate.net/publication/291798032_Observation_botanique_Histoire_de
_l’Academie_Royale_de_Sciences_Paris. 

A Behavioristic Study of the Activity of the Rat. https://psycnet.apa.org/record/1926-
07705-001. 

Endogenous Rhythms in Insects and Microorganisms on JSTOR. 
https://www.jstor.org/stable/2458417. 

Vitaterna, M. H., Shimomura, K. & Jiang, P. Genetics of Circadian Rhythms. Neurol Clin 
37, 487–504 (2019). 

Patton, A. P. & Hastings, M. H. The Mammalian Circadian Time-Keeping System. J 
Huntingtons Dis 12, 91 (2023). 

Gonze, D. Modeling circadian clocks: Roles, advantages, and limitations. Cent Eur J Biol 
6, 712–729 (2011). 

10.  Takahashi, J. S. Transcriptional architecture of the mammalian circadian clock. Nat Rev 

Genet 18, 164–179 (2017). 

11.  Lee, Y. Roles of circadian clocks in cancer pathogenesis and treatment. Experimental and 

Molecular Medicine vol. 53 1529–1538 Preprint at https://doi.org/10.1038/s12276-021-
00681-0 (2021). 

12.  Maywood, E. S., O’Neill, J. S., Chesham, J. E. & Hastings, M. H. Minireview: The 

circadian clockwork of the suprachiasmatic nuclei--analysis of a cellular oscillator that 
drives endocrine rhythms. Endocrinology 148, 5624–5634 (2007). 

13. 

14. 

Skene, D. J. & Arendt, J. Human Circadian Rhythms: Physiological and Therapeutic 
Relevance of Light and Melatonin. Ann Clin Biochem vol. 43 (2006). 

Schibler, U. et al. Clock-Talk: Interactions between central and peripheral circadian 
oscillators in mammals. Cold Spring Harb Symp Quant Biol 80, 223–232 (2016). 

15.  Mohawk, J. A., Green, C. B. & Takahashi, J. S. Central and peripheral circadian clocks in 

mammals. Annu Rev Neurosci 35, 445–462 (2012). 

16.  Albrecht, U. Timing to perfection: the biology of central and peripheral circadian clocks. 

Neuron 74, 246–260 (2012). 

137 

 
 
 
17.  Gamble, K. L., Berry, R., Frank, S. J. & Young, M. E. Circadian clock control of 

endocrine factors. Nat Rev Endocrinol 10, 466–475 (2014). 

18.  Green, C. B., Takahashi, J. S. & Bass, J. The meter of metabolism. Cell 134, 728–742 

(2008). 

19.  Richards, J. & Gumz, M. L. Mechanism of the circadian clock in physiology. Am J 

Physiol Regul Integr Comp Physiol 304, 1053–1064 (2013). 

20.  Esquirol, Y. et al. Shift work and metabolic syndrome: Respective impacts of job strain, 

physical activity, and dietary rhythms. Chronobiol Int 26, 544–559 (2009). 

21.  Biggi, N., Consonni, D., Galluzzo, V., Sogliani, M. & Costa, G. Metabolic syndrome in 

permanent night workers. Chronobiol Int 25, 443–454 (2008). 

22.  Hadadi, E. et al. Chronic circadian disruption modulates breast cancer stemness and 

immune microenvironment to drive metastasis in mice. Nat Commun 11, (2020). 

23.  Masri, S. & Sassone-Corsi, P. The emerging link between cancer, metabolism, and 

circadian rhythms. Nat Med 24, 1795–1803 (2018). 

24.  Castanon-Cervantes, O. et al. Dysregulation of inflammatory responses by chronic 

circadian disruption. J Immunol 185, 5796–5805 (2010). 

25.  Maury, E., Ramsey, K. M. & Bass, J. Circadian rhythms and metabolic syndrome: from 

experimental genetics to human disease. Circ Res 106, 447–462 (2010). 

26.  Williamson, M. A., Gasiewicz, T. A. & Opanashuk, L. A. Aryl hydrocarbon receptor 

expression and activity in cerebellar granule neuroblasts: Implications for development 
and dioxin neurotoxicity. Toxicological Sciences 83, 340–348 (2005). 

27.  Mimura, J. & Fujii-Kuriyama, Y. Functional role of AhR in the expression of toxic effects 

by TCDD. Biochim Biophys Acta Gen Subj 1619, 263–268 (2003). 

28. 

Fader, K. A., Nault, R., Doskey, C. M., Fling, R. R. & Zacharewski, T. R. 2,3,7,8-
Tetrachlorodibenzo-p-dioxin abolishes circadian regulation of hepatic metabolic activity 
in mice. Sci Rep 9, 1–18 (2019). 

29.  Droin, C. et al. Space-time logic of liver gene expression at sub-lobular scale. Nat Metab 

3, 43–58 (2021). 

30.  Wang, Y. et al. A proteomics landscape of circadian clock in mouse liver. Nature 

Communications 2018 9:1 9, 1–16 (2018). 

31.  DeBruyne, J. P., Weaver, D. R. & Dallmann, R. The hepatic circadian clock modulates 

xenobiotic metabolism in mice. J Biol Rhythms 29, 277 (2014). 

32. 

Ferrell, J. M. & Chiang, J. Y. L. Circadian rhythms in liver metabolism and disease. Acta 
Pharm Sin B 5, 113 (2015). 

33.  Annunziato, S. & Tchorz, J. S. Liver zonation—a journey through space and time. Nat 

Metab 3, 7–8 (2021). 

138 

 
34. 

Jungermann, K. Zonation of metabolism and gene expression in liver. Histochem Cell Biol 
103, 81–91 (1995). 

35.  Burke, Z. D. & Tosh, D. The Wnt/β-catenin pathway: master regulator of liver zonation? 

BioEssays 28, 1072–1077 (2006). 

36.  Mukherji, A., Bailey, S. M., Staels, B. & Baumert, T. F. The circadian clock and liver 

function in health and disease. J Hepatol 71, 200–211 (2019). 

37.  Relógio, A. et al. Tuning the mammalian circadian clock: Robust synergy of two 

loopsRelógio, A., Westermark, P. O., Wallach, T., Schellenberg, K., Kramer, A., & Herzel, 
H. (2011). Tuning the mammalian circadian clock: Robust synergy of two loops. PLoS 
Computational Biology,. PLoS Comput Biol 7, 1–18 (2011). 

38. 

Forger, D. B. & Peskin, C. S. A detailed predictive model of the mammalian circadian 
clock. Proc Natl Acad Sci U S A 100, 14806–14811 (2003). 

39.  Takahashi, J. S., Hong, H. K., Ko, C. H. & McDearmon, E. L. The genetics of mammalian 
circadian order and disorder: implications for physiology and disease. Nature Reviews 
Genetics 2008 9:10 9, 764–775 (2008). 

40.  Landgraf, D., Wang, L. L., Diemer, T. & Welsh, D. K. NPAS2 Compensates for Loss of 
CLOCK in Peripheral Circadian Oscillators. PLoS Genet 12, e1005882 (2016). 

41.  Cox, K. H. & Takahashi, J. S. Circadian clock genes and the transcriptional architecture of 

the clock mechanism. J Mol Endocrinol 63, R93–R102 (2019). 

42.  Yoo, S. H. et al. A noncanonical E-box enhancer drives mouse Period2 circadian 

oscillations in vivo. Proc Natl Acad Sci U S A 102, 2608–2613 (2005). 

43.  Kathiresan, S. & Srivastava, D. Genetics of Human Cardiovascular Disease. Cell 148, 

1242 (2012). 

44. 

Schödel, J. et al. Common genetic variants at the 11q13.3 renal cancer susceptibility locus 
influence binding of HIF to an enhancer of cyclin D1 expression. Nat Genet 44, 420–425 
(2012). 

45.  Lambert, S. A. et al. The Human Transcription Factors. Cell 172, 650–665 (2018). 

46.  Beytebiere, J. R. et al. Tissue-specific BMAL1 cistromes reveal that rhythmic 

transcription is associated with rhythmic enhancer–enhancer interactions. Genes Dev 33, 
294–309 (2019). 

47.  Dror, I., Golan, T., Levy, C., Rohs, R. & Mandel-Gutfreund, Y. A widespread role of the 

motif environment in transcription factor binding across diverse protein families. Genome 
Res 25, 1268–1280 (2015). 

48.  Morgunova, E. & Taipale, J. Structural perspective of cooperative transcription factor 

binding. Curr Opin Struct Biol 47, 1–8 (2017). 

49.  Wang, J. et al. Sequence features and chromatin structure around the genomic regions 
bound by 119 human transcription factors. Genome Res 22, 1798–1812 (2012). 

139 

 
50.  Zhou, T. et al. Quantitative modeling of transcription factor binding specificities using 

DNA shape. Proc Natl Acad Sci U S A 112, 4654–4659 (2015). 

51. 

Filipovic, D. et al. Interpretable predictive models of genome-wide aryl hydrocarbon 
receptor-DNA binding reveal tissue specific binding determinants. Toxicol Sci (2023) 
doi:10.1093/TOXSCI/KFAD094. 

52. 

Steuernagel, L. et al. Computational identification of tissue-specific transcription factor 
cooperation in ten cattle tissues. PLoS One 14, e0216475 (2019). 

53.  Barski, A. et al. High-resolution profiling of histone methylations in the human genome. 

Cell 129, 823–837 (2007). 

54.  Arvey, A., Agius, P., Noble, W. S. & Leslie, C. Sequence and chromatin determinants of 
cell-type–specific transcription factor binding. Genome Res 22, 1723–1734 (2012). 

55.  Gordân, R. et al. Genomic Regions Flanking E-Box Binding Sites Influence DNA Binding 

Specificity of bHLH Transcription Factors through DNA Shape. Cell Rep 3, 1093–1104 
(2013). 

56. 

Pique-Regi, R. et al. Accurate inference of transcription factor binding from DNA 
sequence and chromatin accessibility data. Genome Res 21, 447–455 (2011). 

57.  Das, M. K. & Dai, H. K. A survey of DNA motif finding algorithms. BMC Bioinformatics 

8, 1–13 (2007). 

58.  Alipanahi, B., Delong, A., Weirauch, M. T. & Frey, B. J. Predicting the sequence 

specificities of DNA- and RNA-binding proteins by deep learning. Nature Biotechnology 
2015 33:8 33, 831–838 (2015). 

59.  Quang, D. & Xie, X. DanQ: a hybrid convolutional and recurrent deep neural network for 

quantifying the function of DNA sequences. Nucleic Acids Res 44, (2016). 

60.  Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep 
learning–based sequence model. Nature Methods 2015 12:10 12, 931–934 (2015). 

61.  Chen, T. & Guestrin, C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 
ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 13-
17-August-2016, 785–794 (2016). 

62.  Mathelier, A. et al. DNA Shape Features Improve Transcription Factor Binding Site 

Predictions In Vivo. Cell Syst 3, 278-286.e4 (2016). 

63.  Wang, S. et al. Predicting transcription factor binding sites using DNA shape features 

based on shared hybrid deep learning architecture. Mol Ther Nucleic Acids 24, 154–163 
(2021). 

64.  Wang, Y., Li, X. & Hu, H. H3K4me2 reliably defines transcription factor binding regions 

in different cells. Genomics 103, 222–228 (2014). 

65. 

Slattery, M. et al. Absence of a simple code: how transcription factors read the genome. 
Trends Biochem Sci 39, 381–399 (2014). 

140 

 
66.  Li, J. et al. Expanding the repertoire of DNA shape features for genome-scale studies of 

transcription factor binding. Nucleic Acids Res 45, 12877 (2017). 

67.  Benveniste, D., Sonntag, H. J., Sanguinetti, G. & Sproul, D. Transcription factor binding 
predicts histone modifications in human cell lines. Proc Natl Acad Sci U S A 111, 13367–
13372 (2014). 

68.  Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic 

features. Bioinformatics 26, 841–842 (2010). 

69.  Chiu, T. P. et al. DNAshapeR: an R/Bioconductor package for DNA shape prediction and 

feature encoding. Bioinformatics 32, 1211–1213 (2016). 

70.  Zhou, T. et al. DNAshape: a method for the high-throughput prediction of DNA structural 

features on a genomic scale. Nucleic Acids Res 41, (2013). 

71.  Ramírez, F. et al. deepTools2: a next generation web server for deep-sequencing data 

analysis. Nucleic Acids Res 44, W160–W165 (2016). 

72. 

73. 

Pohl, A. & Beato, M. bwtool: a tool for bigWig files. Bioinformatics 30, 1618–1619 
(2014). 

Peng, C. Y. J., Lee, K. L. & Ingersoll, G. M. An Introduction to Logistic Regression 
Analysis and Reporting. https://doi.org/10.1080/00220670209598786 96, 3–14 (2010). 

74.  ENCODE. https://www.encodeproject.org/. 

75.  Marri, D., Filipovic, D., Kana, O., Tischkau, S. & Bhattacharya, S. Prediction of 

mammalian tissue-specific CLOCK–BMAL1 binding to E-box DNA motifs. Sci Rep 13, 
(2023). 

76.  Liu, S. et al. Assessing the model transferability for prediction of transcription factor 
binding sites based on chromatin accessibility. BMC Bioinformatics 18, 1–11 (2017). 

77.  Benveniste, D., Sonntag, H. J., Sanguinetti, G. & Sproul, D. Transcription factor binding 
predicts histone modifications in human cell lines. Proc Natl Acad Sci U S A 111, 13367–
13372 (2014). 

78.  Guccione, E. et al. Myc-binding-site recognition in the human genome is determined by 

chromatin context. Nature Cell Biology 2006 8:7 8, 764–770 (2006). 

79.  Xin, B. & Rohs, R. Relationship between histone modifications and transcription factor 

binding is protein family specific. Genome Res 28, 321–333 (2018). 

80.  Heintzman, N. D. et al. Histone modifications at human enhancers reflect global cell-type-

specific gene expression. Nature 2009 459:7243 459, 108–112 (2009). 

81.  Ramsey, S. A. et al. Genome-wide histone acetylation data improve prediction of 

mammalian transcription factor binding sites. Bioinformatics 26, 2071–2075 (2010). 

82.  Korobov, M. & Lopuhin, K. ELI5 Documentation Release 0.11.0. (2021). 

83.  Crooks, G. E., Hon, G., Chandonia, J. M. & Brenner, S. E. WebLogo: a sequence logo 

generator. Genome Res 14, 1188–1190 (2004). 

141 

 
84.  Reppert, S. M. & Weaver, D. R. Coordination of circadian timing in mammals. Nature 

2002 418:6901 418, 935–941 (2002). 

85.  Moore, R. Y., Speh, J. C. & Leak, R. K. Suprachiasmatic nucleus organization. Cell Tissue 

Res 309, 89–98 (2002). 

86.  Maronde, E. et al. The clock genes Period 2 and Cryptochrome 2 differentially balance 

bone formation. PLoS One 5, (2010). 

87.  Takahashi, J. S. Molecular Neurobiology and Genetics of Circadian Rhythms in 

Mammals. https://doi.org/10.1146/annurev.ne.18.030195.002531 18, 531–553 (2003). 

88.  Lamia, K. A. & Evans, R. M. Metabolism: Tick, tock, a beta-cell clock. Nature 466, 571–

572 (2010). 

89. 

Sahar, S. & Sassone-Corsi, P. Metabolism and cancer: the circadian clock connection. Nat 
Rev Cancer 9, 886–896 (2009). 

90.  Hastings, M. H., Reddy, A. B. & Maywood, E. S. A clockwork web: circadian timing in 

brain and periphery, in health and disease. Nature Reviews Neuroscience 2003 4:8 4, 649–
661 (2003). 

91.  Marcheva, B. et al. Disruption of the clock components CLOCK and BMAL1 leads to 

hypoinsulinaemia and diabetes. Nature 466, 627–631 (2010). 

92.  Hardin, P. E. & Panda, S. Circadian timekeeping and output mechanisms in animals. Curr 

Opin Neurobiol 23, 724–731 (2013). 

93.  Guillaumond, F., Dardente, H., Giguère, V. & Cermakian, N. Differential control of Bmal1 
circadian transcription by REV-ERB and ROR nuclear receptors. J Biol Rhythms 20, 391–
403 (2005). 

94.  Gebhardt, R. Metabolic zonation of the liver: regulation and implications for liver 

function. Pharmacol Ther 53, 275–354 (1992). 

95.  Halpern, K. B. et al. Single-cell spatial reconstruction reveals global division of labour in 

the mammalian liver. (2017) doi:10.1038/nature21065. 

96.  Li, Y., Liu, Z., Luo, J. & Wu, H. Coupling-induced synchronization in multicellular 

circadian oscillators of mammals. Cogn Neurodyn 7, 59–65 (2013). 

97. 

98. 

99. 

Schmal, C., Herzog, E. D. & Herzel, H. Measuring Relative Coupling Strength in 
Circadian Systems. J Biol Rhythms 33, 84–98 (2018). 

Finger, A. M. et al. Intercellular coupling between peripheral circadian oscillators by 
TGF-β signaling. Sci Adv 7, 1–20 (2021). 

Sloin, H. E. et al. Interactions between the circadian clock and TGF-β signaling pathway 
in zebrafish. PLoS One 13, e0199777 (2018). 

100.  Kon, N. et al. Activation of TGF-β/activin signalling resets the circadian clock through 

rapid induction of Dec1 transcripts. Nature Cell Biology 2008 10:12 10, 1463–1469 
(2008). 

142 

 
101.  Shirakawa, T., Honma, S. & Honma, K. I. Multiple oscillators in the suprachiasmatic 

nucleus. Chronobiol Int 18, 371–387 (2001). 

102.  Cholico Giovani N. Time-Course Single-Nuclei RNA-Seq Analysis Identifies NRG/ERBB 

Signaling Dysregulation in TCDD-Induced Hepatotoxicity. (2023). 

103.  Finger, A. M., Dibner, C. & Kramer, A. Coupled network of the circadian clocks: a driving 

force of rhythmic physiology. FEBS Lett 594, 2734–2769 (2020). 

104.  Robles, M. S., Humphrey, S. J., Mann Correspondence, M. & Mann, M. Phosphorylation 
Is a Central Mechanism for Circadian Control of Metabolism and Physiology Cell 
Metabolism Resource Phosphorylation Is a Central Mechanism for Circadian Control of 
Metabolism and Physiology. Cell Metab 25, 118–127 (2017). 

105.  Gonze, D., Bernard, S., Waltermann, C., Kramer, A. & Herzel, H. Spontaneous 
synchronization of coupled circadian oscillators. Biophys J 89, 120–129 (2005). 

106.  Brenna, A. & Albrecht, U. Phosphorylation and Circadian Molecular Timing. Front 

Physiol 11, 612510 (2020). 

107.  Lei, J. Mathematical Models for Gene Regulatory Network Dynamics. 145–198 (2021) 

doi:10.1007/978-3-030-73033-8_5. 

108.  Swat, M. H. et al. Multi-Scale Modeling of Tissues Using CompuCell3D. Methods Cell 

Biol 110, 325–366 (2012). 

109.  Swat, M. H. et al. Modeling of Tissues Using CompuCell3D Methods. (2012). 

doi:10.1016/B978-0-12-388403-9.00013-8.Multi-Scale. 

110.  Appendix — CC3D Reference Manual 4.2.4 documentation. 

https://compucell3dreferencemanual.readthedocs.io/en/latest/appendix.html. 

111.  Kim, R. & Reed, M. C. A mathematical model of circadian rhythms and dopamine. Theor 

Biol Med Model 18, (2021). 

112.  Finger, A. Molecular Mechanisms of Intercellular Coupling among Peripheral Circadian 

Oscillators. (2020). 

113.  Balsa-Canto, E. & Banga, J. R. AMIGO: A model identification toolbox based on global 

optimization and its applications in biosystems. in IFAC Proceedings Volumes (IFAC-
PapersOnline) vol. 11 132–137 (IFAC Secretariat, 2010). 

114.  Balsa-Canto, E., Henriques, D., Gábor, A. & Banga, J. R. AMIGO 2 THEORETICAL 

BACKGROUND. (2016). 

115.  167293.167637. 

116.  Battogtokh, D. & Tyson, J. J. Deciphering the Dynamics of Interlocked Feedback Loops 
in a Model of the Mammalian Circadian Clock. Biophys J 115, 2055–2066 (2018). 

117.  XPP/XPPAUT Homepage. https://sites.pitt.edu/~phase/bard/bardware/xpp/xpp.html. 

118.  Leloup, J. C. & Goldbeter, A. Modeling the mammalian circadian clock: sensitivity 

analysis and multiplicity of oscillatory mechanisms. J Theor Biol 230, 541–562 (2004). 

143 

 
119.  Egea, J. A., Martí, R. & Banga, J. R. An evolutionary method for complex-process 

optimization. Comput Oper Res 37, 315–324 (2010). 

120.  Villaverde, A. F. et al. BioPreDyn-bench: A suite of benchmark problems for dynamic 

modelling in systems biology. BMC Syst Biol 9, (2015). 

121.  Dibner, C., Schibler, U. & Albrecht, U. The mammalian circadian timing system: 

organization and coordination of central and peripheral clocks. Annu Rev Physiol 72, 517–
549 (2010). 

122.  Santostefano, M. J. et al. Dose-Dependent Localization of TCDD in Isolated Centrilobular 

and Periportal Hepatocytes. (1999). 

123.  Wild, S. L. et al. The Canonical Wnt Pathway as a Key Regulator in Liver Development, 

Differentiation and Homeostatic Renewal. Genes (Basel) 11, 1–20 (2020). 

124.  Nault, R. et al. Single-cell transcriptomics shows dose-dependent disruption of hepatic 

zonation by TCDD in mice. Toxicol Sci 191, 135–148 (2023). 

125.  Huang, P., Tofighi, R., Emgard, M. & Ceccatelli, S. Cell death induced by 2,3,7,8-

tetrachlorodibenzo-p-dioxin (2,3,7,8-TCDD) in AtT-20 pituitary cells. Toxicology 207, 
391–399 (2005). 

126.  Nault, R., Fader, K. A., Bhattacharya, S. & Zacharewski, T. R. Single-Nuclei RNA 

Sequencing Assessment of the Hepatic Effects of 2,3,7,8-Tetrachlorodibenzo-p-dioxin. 
Cell Mol Gastroenterol Hepatol 11, 147–159 (2021). 

127.  Cholico, G. N. et al. Consequences of reprogramming acetyl-CoA metabolism by 2,3,7,8-

tetrachlorodibenzo-p-dioxin in the mouse liver. Scientific Reports 2023 13:1 13, 1–14 
(2023). 

128.  Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression 

data analysis. Genome Biol 19, (2018). 

129.  Lopez, R., Regier, J., Cole, M. B., Jordan, M. I. & Yosef, N. Deep generative modeling for 

single-cell transcriptomics. Nature Methods 2018 15:12 15, 1053–1058 (2018). 

130.  Nelder, J. A. & Mead, R. A simplex method for function minimization. Comput J 7, 308–

313 (1965). 

131.  Schwarz, G. Estimating the Dimension of a Model. 

https://doi.org/10.1214/aos/1176344136 6, 461–464 (1978). 

132.  Thaben, P. F. & Westermark, P. O. Differential rhythmicity: detecting altered rhythmicity 

in biological data. Bioinformatics 32, 2800–2808 (2016). 

133.  Yang, Y., Filipovic, D. & Bhattacharya, S. A Negative Feedback Loop and Transcription 

Factor Cooperation Regulate Zonal Gene Induction by 2, 3, 7, 8-Tetrachlorodibenzo-p-
Dioxin in the Mouse Liver. Hepatol Commun 6, 750–764 (2022). 

134.  Coifman, R. R. & Lafon, S. Diffusion maps. Appl Comput Harmon Anal 21, 5–30 (2006). 

144 

 
135.  Haghverdi, L., Büttner, M., Wolf, F. A., Buettner, F. & Theis, F. J. Diffusion pseudotime 

robustly reconstructs lineage branching. Nature Methods 2016 13:10 13, 845–848 (2016). 

136.  Lun, A. T. L., Bach, K. & Marioni, J. C. Pooling across cells to normalize single-cell RNA 

sequencing data with many zero counts. Genome Biol 17, (2016). 

137.  GitHub - naef-lab/dryR: Differential RhythmicitY analysis in R is an R package that 

provides the statistical framework to assess differential rhythmicity of a time series with 
two and more conditions. https://github.com/naef-lab/dryR. 

138.  Schmidt, C. K. et al. 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) alters the endogenous 
metabolism of all-trans-retinoic acid in the rat. Arch Toxicol 77, 371–383 (2003). 

139.  Gougelet, A. et al. T-cell factor 4 and β-catenin chromatin occupancies pattern zonal liver 

metabolism in mice. Hepatology 59, 2344–2357 (2014). 

140.  Lamb, C. L. et al. Aryl Hydrocarbon Receptor Activation by TCDD Modulates Expression 

of Extracellular Matrix Remodeling Genes during Experimental Liver Fibrosis. Biomed 
Res Int 2016, (2016). 

141.  Matsumoto, H. et al. SCODE: an efficient regulatory network inference algorithm from 
single-cell RNA-Seq during differentiation. Bioinformatics 33, 2314–2321 (2017). 

142.  Badia-i-Mompel, P. et al. Gene regulatory network inference in the era of single-cell 

multi-omics. Nat Rev Genet 24, 739–754 (2023). 

143.  Zhang, Q. et al. Embracing Systems Toxicology at Single-Cell Resolution. Curr Opin 

Toxicol 16, 49–57 (2019). 

144.  Zhang, S., Xiangtao, L. I., Jiecong, L. I. N., Qiuzhen, L. I. N. & Wong, K. C. Review of 

single-cell RNA-seq data clustering for cell-type identification and characterization. RNA 
29, (2023). 

145.  Keyl, P. et al. Single-cell gene regulatory network prediction by explainable AI. Nucleic 

Acids Res 51, E20–E20 (2023). 

146.  Levine, M. & Tjian, R. Transcription regulation and animal diversity. Nature 2003 

424:6945 424, 147–151 (2003). 

147.  Mure, L. S. et al. Diurnal transcriptome atlas of a primate across major neural and 

peripheral tissues. Science (1979) 359, (2018). 

148.  Hogenesch, J. B., Gu, Y. Z., Jain, S. & Bradfield, C. A. The basic-helix-loop-helix-PAS 

orphan MOP3 forms transcriptionally active complexes with circadian and hypoxia 
factors. Proc Natl Acad Sci U S A 95, 5474–5479 (1998). 

149.  Zhang, R., Lahens, N. F., Ballance, H. I., Hughes, M. E. & Hogenesch, J. B. A circadian 

gene expression atlas in mammals: Implications for biology and medicine. Proc Natl Acad 
Sci U S A 111, 16219–16224 (2014). 

150.  Shimomura, K. et al. Usf1, a suppressor of the circadian Clock mutant, reveals the nature 
of the DNA-binding of the CLOCK:BMAL1 complex in mice. Elife 2, 426 (2013). 

145 

 
151.  Menet, J. S., Pescatore, S. & Rosbash, M. CLOCK:BMAL1 is a pioneer-like transcription 

factor. Genes Dev 28, 8–13 (2014). 

152.  Gupta, P., Zlatanova, J. & Tomschik, M. Nucleosome Assembly Depends on the Torsion in 

the DNA Molecule: A Magnetic Tweezers Study. Biophys J 97, 3150 (2009). 

153.  Koike, N. et al. Transcriptional Architecture and Chromatin Landscape of the Core 

Circadian Clock in Mammals. Science 338, 349 (2012). 

154.  Grove, C. A. et al. A multiparameter network reveals extensive divergence between C. 

elegans bHLH transcription factors. Cell 138, 314–327 (2009). 

155.  DeWoskin, D. et al. Distinct roles for GABA across multiple timescales in mammalian 

circadian timekeeping. Proc Natl Acad Sci U S A 112, E3911–E3919 (2015). 

156.  Agostino, P. V. et al. Sildenafil accelerates reentrainment of circadian rhythms after 

advancing light schedules. PNAS 104, 9834–9839 (2007). 

157.  Bernard, S., Gonze, D., Čajavec, B., Herzel, H. & Kramer, A. Synchronization-induced 

rhythmicity of circadian oscillators in the suprachiasmatic nucleus. PLoS Comput Biol 3, 
667–679 (2007). 

158.  Gagliano, O. et al. Synchronization between peripheral circadian clock and feeding-

fasting cycles in microfluidic device sustains oscillatory pattern of transcriptome. Nature 
Communications 2021 12:1 12, 1–12 (2021). 

159.  Garcia-Ojalvo, J., Elowitz, M. B. & Strogatz, S. H. Modeling a Synthetic Multicellular 

Clock: Repressilators Coupled by Quorum Sensing. 
www.pnas.orgcgidoi10.1073pnas.0307095101 (2004). 

160.  Feillet, C. et al. Phase locking and multiple oscillating attractors for the coupled 

mammalian clock and cell cycle. Proc Natl Acad Sci U S A 111, 9828–9833 (2014). 

161.  Kim, J. K. & Forger, D. B. A mechanism for robust circadian timekeeping via 

stoichiometric balance. Mol Syst Biol 8, 1–14 (2012). 

162.  Ye, R. et al. Dual modes of CLOCK:BMAL1 inhibition mediated by Cryptochrome and 

period proteins in the mammalian circadian clock. Genes Dev 28, 1989–1998 (2014). 

163.  Lehmann, R. et al. Assembly of a comprehensive regulatory network for the mammalian 

circadian clock: A bioinformatics approach. PLoS One 10, (2015). 

164.  Nault, R. et al. Dose-dependent metabolic reprogramming and differential gene expression 

in TCDD-elicited hepatic fibrosis. Toxicological Sciences 154, 253–266 (2016). 

146