Large Scale Toxicoepigenetics on Histones: A Mass Spectrometry-based Screening Assay Applied to Developmental Toxicity

Toxicoepigenetics is an emerging eld that studies the toxicological impact of compounds on protein expression through heritable, non-genetic mechanisms, such as histone post-translational modications (hPTMs). Due to substantial progress in the large-scale study of hPTMs, integration into the eld of toxicology is promising and offers the opportunity to gain novel insights into toxicological phenomena. Moreover, there is a growing demand for high-throughput human-based in vitro assays for toxicity testing, especially for developmental toxicity. Consequently, we developed a mass spectrometry-based proof-of-concept to assess a histone code screening assay capable of simultaneously detecting multiple hPTM-changes in human embryonic stem cells. To prove the applicability and performance, we rst validated the untargeted workow with valproic acid (VPA), a histone deacetylase inhibitor. These results demonstrate that our workow is capable of mapping the hPTM-dynamics, with a general increase in acetylations as an internal control. To illustrate the scalability, a dose-response study was performed on a proof-of-concept library of ten compounds i) with a known effect on the hPTMs (BIX-01294, 3-Deazaneplanocin A, Trichostatin A, and VPA), ii) classied as highly embryotoxic by the European Centre for the Validation of Alternative Methods (ECVAM) (Methotrexate, and All-trans retinoic acid), iii) classied as non-embryotoxic by ECVAM (Penicillin G), and iv) compounds of abuse with a presumed developmental toxicity (ethanol, caffeine, and nicotine). In conclusion, we show that toxicoepigenetic screening on histones is feasible and yields very rich data that holds potential, not only for applications in the pharmaceutical industry, but also for environmental toxicity and food safety.


Introduction
In reproductive and developmental toxicity assessment, there is a considerable need for both alternative assays and additional targets (Adler et al. 2008; Krewski et al. 2020; Laustriat et al. 2010). This is because i) it is essentially the most animal-consuming area in drug development and chemical regulatory toxicity testing (Piersma et al. 2013), ii) interspecies extrapolation of developmental toxicity is not always possible (Nakanishi 2007), iii) current testing procedures are relatively time-consuming and resource-intensive (Krewski et al. 2010), and iv) molecular processes that mediate gene expression during differentiation are still largely understudied.
These challenges could be addressed by a high-throughput, but above all more sensitive and human-based invitro assay for the detection of developmental toxicity caused by pharmaceuticals and chemicals.
Brie y, histones are basic and positively charged proteins that form an octamer consisting of two dimers of histone H2A and H2B, and one tetramer of histone H3 and H4 that operate as a central point of attraction for the negatively charged DNA. Accordingly, approximately 147 base pairs of DNA tend to wrap around an octamer resulting in the formation of a nucleosome. Multiple nucleosomes are connected through the linker histone H1 and linker DNA to form chromatin, which is the structural level at which hPTMs play their in uential role (Luger et al. 1997). In fact, speci c hPTMs cause relaxation or reinforcement of the chromatin that leads to transcriptional activation or inhibition, respectively. Ergo, hPTMs may interfere with gene expression by rendering the DNA more or less accessible to transcription factors (Castillo-Aguilera et al. 2017). A scholarly example of this is histone acetylation, which is associated with transcriptional activation, whereas deacetylation of histones leads to transcriptional repression. This is either caused by the altered biophysical a nity between the histones and DNA (i.e. acetylation reduces the positive charge of the histones), or more importantly, by the recruitment of additional proteins and protein complexes. These proteins, so-called 'readers', contain a characteristic domain capable of 'reading' a speci c hPTM and its stored information (Engelberg et al. 2021; Taylor and Young 2021).
However, this simplistic view of one speci c hPTM underlying a biological outcome has long been abandoned and replaced by the concept of the so-called histone code. Herein dozens of hPTMs and the histone variants act together to decide on the nal outcome of the chromatin state and its transcriptional activity (Taylor and Young 2021). In fact, histone modi cations are chemical reactions involving energy-rich donors like acyl-CoA (acylations), Adenosine triphosphate (ATP) (phosphorylation), and S-adenosylmethionine (methylation) (Simithy et al. 2017; Zhang et al. 2019). It is therefore increasingly accepted that histone modi cation arose as an ancient mechanism to directly sense the energetic state of the eukaryotic cell by translating metabolic information into gene regulation via histones. In turn, this explains the full alphabet of hPTMs that have been discovered to date. Spatially, the PTM combination-centric model even suggests that functionally connected hPTMs can be found on different histone subunits or even on different nucleosomes (Taylor and Young 2021).
Unfortunately, the widely used antibody-based assays to study hPTMs are con ned by a limited number of targets that can be studied in a single experiment and therefore by a lack of combinatorial information. These assays are targeted to a single modi cation, making screening impossible. Moreover, it is very di cult to nd a speci c antibody for each modi cation of interest due to the sequence homology of histone variants and the wide range of hPTMs. As a result, histone antibodies often suffer from cross-reactivity and epitope occlusion (Taylor and Young 2021;Zheng et al. 2016). In part because of this targeted nature, the large-scale study of the histone code lacks behind compared to other epigenetic mechanisms, which urges for an untargeted screening method to capture the dynamics of the histone code. Fortunately, the introduction of high-end mass spectrometers in the proteomics eld enabled the large-scale study of histones and their hPTMs.
We developed a mass spectrometry-based proof-of-concept to develop an assay capable of screening the histone code and hence detecting multiple hPTMs changes simultaneously in response to treatment with compounds of interest. This work ow was applied on human embryonic stem cells (hESCs) treated with compounds from a proof-of-concept library. More speci cally, Oct4-reporter hESCs were treated with different concentrations of i) drugs with a known effect on the hPTMs (BIX-01294 (BIX), 3-Deazaneplanocin A (DZNep), trichostatin A (TSA), valproic acid (VPA), ii) drugs classi ed as highly embryotoxic by the European Centre for the Validation of Alternative Methods (ECVAM) (methotrexate (MTX) and all-trans retinoic acid (RA)), iii) a drug classi ed as non-embryotoxic by ECVAM (penicillin G (PenG)), and iv) common substances of abuse with a presumed developmental toxicity (caffeine, ethanol, and nicotine) (Brown 2002). Following this dose-response experiment, ow cytometric, Reverse Transcription-quantitative Polymerase Chain Reaction (RT-qPCR), and mass spectrometric (MS) data were acquired to respectively investigate cell death, level of differentiation and the hPTM changes. Accordingly, it is now attainable to detect alterations in hPTMs as an indication of potential toxicity following exposure to a compound of interest.

Materials And Methods
Cell culture and harvest Oct4-enhanced green uorescent protein (eGFP) knock-in hESCs (WA01, WiCell Research Institute) were cultured in Essential 8 (E8) medium on a precoated Vitronectin XF™ plate (0.5 µg/cm², Primorigen) in 5% O2, 5% CO2 and 37°C. Cells were routinely passaged with 0.5 mM Ethylenediaminetetraacetic acid (EDTA) in Dulbecco's phosphate-buffered saline (DPBS) according to the manufacturer's protocol of culturing hESCs in E8 medium. After every passage, the cells were replated in E8 medium; on day 4, the medium containing the test compound was added. Each compound was added in four different concentrations and a negative-and quality control were included (Table 1). After an incubation of 24 h, the cells were harvested. Brie y, cells were washed in PBS and incubated for 5 min at 37°C in 2.5 mL Trypsin-EDTA (0.05%). Subsequently, 2.5 mL trypsininhibitor was added, and the cells were dissociated. 150 µL of the suspension was transferred to an Eppendorf tube for subsequent ow cytometric analysis. 500.000 cells were isolated for mRNA expression studies. The remaining cells were frozen as a dry pellet in liquid nitrogen for histone extraction. Each concentration and the negative controls were conducted in fourfold, while the quality controls were conducted in twofold. Karyotype analysis was performed at the beginning and end of the experiment, indicating that the cells maintained normal karyotypes throughout the study (data not shown). The culture was free of mycoplasma contamination (data not shown).  Table 2. Relative quanti cation of the markers was calculated using the qbasePLUS software. Each sample is relative to a calibrator, in this case untreated hESCs (negative control), and was normalized for two reference loci: B2M and RPL13A. For each marker, statistical analysis was performed using a One-way ANOVA test. Histone extraction, propionylation and digestion

Data analysis
Mass spectrometric data analysis was performed as previously described (Verhelst et al. 2020) yet some modi cations were implemented. For every compound, raw data from all runs were imported in a single experiment and all runs were aligned against a bovine histone standard in Progenesis QIP 4.2.7 (Nonlinear Dynamics, Waters). Next, feature detection was performed on the samples excluding the bovine histone samples to eliminate features that are only present in the bovine histones and not in the hESCs samples. The twenty MS/MS spectra closest to the elution apex were selected for each precursor ion and merged into a single * .mgf le. On this le, two types of searches in Mascot (Matrix Science) were performed. Therefore, the experimental MS/MS-spectra were compared to theoretical spectra obtained after in silico digest of the appropriate protein database, resulting in a given score for each peptide, which enabled 1) a quality search to identify non-propionylated standards (ß-gal) and to assess the amount of over-and underpropionylation, which was acceptable (data not shown), and 2) an error tolerant search to identify the proteins present in the sample. For both searches, the following parameters were included: 1) mass error tolerances for the precursor ions and its fragment ions were set at 10 ppm and 50 ppm respectively; 2) enzyme speci city was set to Arg-C, allowing for up to one missed cleavage; 3) variable modi cations included N-terminal propionylation and propionylation on K for the quality search and deamidation on asparagine (N) and glutamine (Q) and oxidation of methionine (M) for the error tolerant search, 4) no xed modi cations were included for the quality search and N-terminal propionylation and propionylation on K were set as xed modi cations for the error tolerant search; and 5) a complete Human SwissProt database (downloaded from UniProt and supplemented with contaminants from the common Repository for Adventitious Proteins (cRAP) database (https://www.thegpm.org/crap/)) was used. Based on the error tolerant search, a FASTA-database was generated, and a xed hPTM set was determined for all 10 compounds for further analysis (i.e. based on the highest ranked hPTMs in the error tolerant searches for each compound, together with the biologically most commonly studied hPTMs (acetylations and methylations)). Next, a second * .mgf le containing the three MS/MS spectra closest to the elution apex per feature was exported to perform a Mascot-search with the following parameters: 1) mass error tolerances for the precursor ions and its fragment ions were set at 10 ppm and 50 ppm respectively; 2) enzyme speci city was set to Arg-C, allowing for up to one missed cleavage site; 3) variable modi cations included acetylation, butyrylation, crotonylation, trimethylation and formylation on K, methylation on R, dimethylation on both K and R, deamidation on N, Q and R and oxidation of M; and 4) N-  VPA induces (neuro)ectoderm differentiation Both ow cytometry and RT-qPCR ( Fig. 2a and 2b) analysis indicate that treatment with 1mM VPA or more results in cell differentiation within 24 hours of incubation. An increase in eGFP-negative cells, represents a decrease in Oct4 expression, i.e. a decrease in pluripotency, indicating that the cells are differentiating (Fig. 2a). RT-qPCR was applied for evaluating the expression of pluripotent and lineage speci c markers such as POU5F1 and Nestin (NES), respectively. As shown in Fig. 2b, the results are consistent with those re ected by the ow cytometric analysis i.e. a decrease in POU5F1 expression upon increasing concentrations of VPA. For NES, an increased expression is observed, especially at the highest concentration, indicating that the cells start to differentiate towards the (neuro)ectoderm as a result of the VPA treatment (Hermann et al. 2006).

VPA treatment results in a hyperacetylation of histone H3 and H4
To create a more comprehensive picture of the dynamic histone code, we report the RAb along with the peptidoform-centric data, i.e. the measured peptidoforms with their combinatorial hPTMs (Fig. 3). Figure 3a depicts the RAb of all hPTMs competing for nine different acetylated residues. RAb estimates the percentage of the chromatin that is occupied by a given hPTM at a given residue. Importantly, we are able to detect very signi cant changes in very low hPTM levels occupying below 1% of the chromatin (e.g. Figure 3a: X-XIII). This will be highly relevant in the context of toxicity testing because localized at promotor or enhancer regions, these changes could induce considerable differences in expression of developmental mediators. The red lines clearly display an overall gain in acetylation levels as the VPA-concentration increases, con rming its action as an HDACi. In general, these acetylations replace other hPTMs, as is shown by e.g. the pronounced decrease in  Interestingly, an additional internal validation of the performance of the work ow is the opposing trend exhibited by H31K36me2 (Fig. 3b highlighted by red arrowhead) compared to H31K27me3, a recently described direct interaction discovered by using advanced computational models (Alabert et al. 2020).
In conclusion, this data illustrates the applicability of our untargeted work ow to simultaneously map changes in many different hPTMs.
To illustrate the scalability of the work ow, hESCs were incubated with four different concentrations of in total

Compounds from the ECVAM-classi cation
The effect of MTX on hPTMs has, to the best of our knowledge, never been investigated, yet we show that this strong embryotoxic compound displays a very prominent and non-coherent dysregulation of the hPTMs. Some hPTMs do not show a concentration-dependence and are heavily affected, even at the lowest concentration (e.g.

Compounds of abuse
Finally, the compounds of abuse displayed relatively moderate uctuations in their histone signature. Still, caffeine exhibits the most pronounced pattern which, when directly matched, resembles that of MTX most closely ( Figure 5). This is a nding of concern. Currently, it is recommended by the World Health Organization (WHO) not to consume more than 300 mg of caffeine per day during pregnancy because excessive intake may be associated with growth restriction, decreased birth weight, preterm birth or stillbirth (Guilbert 2003;Sengpiel et al. 2013). Our data suggests that these toxic effects might be linked to changes induced in the histone code.
Nevertheless, it should be noted that the metabolization of caffeine was not considered in this experiment.
Next, we included ethanol because of its established negative impact during gestation, referred to as fetal alcohol spectrum disorders. Overall, ethanol displays very subtle fold changes, however it does seem to mirror PenG, our negative control in terms of embryotoxicity ( Figure 5). This suggests that the effect of a one-time  . Whereas the toxicity of nicotine has been widely studied, its impact on hPTMs has only been studied in differentiated tissues (Chase and Sharma 2013). Our toxicoepigenetic work ow shows that the hPTM-changes for nicotine are so subtle that it is very conceivable that a one-day intake of nicotine does not affect the hPTMs in stem cells. Again, advanced data analysis strategies need to be developed to make this conclusion more founded.

Future perspectives
To date, little is known about the effects of different compounds on the hPTM-landscape. Yet, our comprehensive overview of the hPTM changes induced by ten compounds in stem cells shows that most compounds have a (subtle) effect on the histone code.
Our study is the ideal steppingstone to extend the knowledge on this form of epigenetic toxicity in light of developmental toxicity. This can be done by i) including other compounds of interest, ii) adjustment of dose, iii) adjustment of incubation time or the use of time-lapse experimental designs and, iv) developing more advanced statistical methods and algorithms to cluster compounds to facilitate the decision-making toolbox.
Moreover, the applicability of our work ow goes far beyond developmental toxicity. Firstly, other forms of toxicity can also be investigated depending on the cell line used, e.g. hepato-and nephrotoxicity by using liver and kidney cells respectively. Secondly, this study is not only important in the context of toxicoepigenetics but is also a promising tool in the eld of pharmacoepigenetics. As these epigenetic modi cations are interesting targets due to their dynamic and reversible character, the development of epidrugs is gaining momentum.
Especially in oncology, the use of epidrugs is on the rise and our work ow may contribute to discovering or elucidating the mechanism of action of these drugs (Miranda Furtado et al. 2019; Montalvo-Casimiro et al.

2020
). Moreover, personalized medicine is receiving growing attention and this study can contribute to this as well (Rasool et al. 2015). For example, it is possible to determine whether a patient exhibits a particular hPTM characteristic on which the drug will act, thereby predicting whether or not the treatment is likely to succeed.
Finally, the scope of this study can be extended outside the pharmaceutical context including applications for environmental toxicity and food safety.
However, to make the results of this work ow easier to interpret, more reliable and consequently easier to implement, we are still working on some improvements both in terms of acquisition and data analysis. For instance, with our current LC-MS/MS settings, it is di cult to acquire modi ed forms of H3K4. There are two reasons for this: (i) this PTM site is located on a small tryptic peptide, that consequently elutes early, making it di cult to analyze and, (ii) our mobile phase contains DMSO, which improves ionization, but also causes charge state reduction. Therefore, the H3K4 peptide occurs mostly as a singly charged ion (Hahne et al. 2013).
Nevertheless, this modi cation site can be of interest as methylation of H3K4 is associated with active transcription. Consequently, besides optimization of the LC gradient, acquisition parameters can be adjusted to also target singly charged precursors or DMSO can be removed from the mobile phase to include modi cations of H3K4 in the future. These improvements in LC-MS/MS settings should also result in a better separation of other peptides, which in turn will allow more accurate quanti cation, so that differences will become even more apparent. Note that the effect of withdrawing DMSO on the other histone peptides should be assessed as well, since it is known that doubly charged peptides are best annotated as they mostly generate singly charged fragments. Furthermore, including data-independent acquisition technologies like (Scanning)SWATH will result in an improved quanti cation and will be a stepping stone in the transition towards a multiple or parallel reaction monitoring, respectively MRM and PRM, assay (De Clerck et al. 2019b).
When focusing on data-analysis, we already mentioned that caution is always required when reporting RAbs (De Clerck et al. 2019b). Depending on the peptidoforms used for the calculation in combination with the ionization effects, RAbs can lead to a confusing and even misleading form of reporting. Therefore, we are working on more advanced statistical approaches that could contribute to better reporting and consequently a better understanding of the outcomes.
In conclusion, we demonstrated that with our work ow toxicoepigenetic screening on histones is feasible and will yield very rich data, for which more streamlined interpretation tools are yet to be developed. Integration of this epigenetic information into the eld of toxicology is a promising addition that offers an opportunity to   12E9716N and 11B4518N awarded to MD and BVP, respectively and BOF Mandate BOF20DOC220 awarded to LC.

Con icts of interest
The authors declare that they have no con ict of interest.
Availability of data and material (data transparency)  Work ow overview. Commercial human embryonic stem cells (hESCs) were originally derived from an inner cell mass of blastocyst-stage embryos. In this experiment, Oct4-eGFP Knock-In hESCs (WiCell) were cultured in Essential 8 (E8) medium on a precoated feeder-free vitronectin plate. For the baseline culture, hESCs were passaged every 4 to 5 days. After every passage the cells were replated in E8 medium. For the toxicoepigenetic assessment, the medium with test compound was added on day 4. Each compound was added in four different log 10 concentrations with each concentration in quadruplicate. A negative (E8 medium + solvent) and a quality control (E8) were included in respectively quadruplicate and duplicate. After an incubation of 24 hours, hESCs were harvested. To perform cell count, and to monitor the number of dead cells and the level of differentiation, ow cytometry was carried out. To monitor differences in the level of gene expression of lineage speci cation markers (POU5F1, SOX2, Nanog, HAND1 and NES) RT-qPCR was used. Subsequently, the samples were subjected to our MS-based toxicoepigenetic screening of histones. First, the histones were extracted by direct acid (DA) extraction. The amount of extract which corresponds to 400.000 cells was used for quanti cation by SDS-PAGE to allow normalization against histones alone and to assess the purity of the extract. The remaining extract was subjected to a rst propionylation reaction and digested with trypsin, followed by a second propionylation reaction and a reversal of the overpropionylation. Acquisition of the samples was done in randomized batches per compound by using HPLC (capillary ow mode) coupled to MS/MS (DDA-mode). Database searching (Mascot) was performed to identify the peptidoforms present in the samples and was followed by relative quanti cation on single hPTM-level (RAb-plots) and on peptidoform-    Heatmap highlighting compound clustering between caffeine and methotrexate, as well as ethanol and penicillin G. Fold changes were calculated for a set of peptide targets of histone H3 and H4 against the solvent (H2O) control for the abundances normalized to all histone peptides.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download. S1Normalization.xlsx S3Heatmapinputqlucore.xlsx S4RelativeabundanceVPA.xlsx