Call to Action:

Concretizing the capabilities of clocks to accelerate discovery in aging biology

A piece by Satvik Dasariraju and Martin Borch Jensen

Executive Summary

Research into aging biology, and into therapeutic interventions that target this biology to modulate multiple diseases, is held back by the timelines of experimental studies involving natural aging. Biomarkers of the aging process (i.e., ‘aging clocks’) would unlock faster and more efficient breakthroughs, laying foundations for surrogate endpoints in human clinical trials. Many clocks have been proposed in the last 13 years [1], but no definitive process for validating their predictive power exists.

We envision the Clock Assessment Program (CAP) as a gold standard for benchmarking aging clocks, deployed to accelerate preclinical discovery by reducing the time it takes to identify compounds that meaningfully shift survival curves. Rather than requiring a wait of 30 months or more for mouse lifespan experiments to finish, CAP-validated clocks will predict eventual lifespan extension at earlier timepoints (e.g., 12 and 18 months of age) with sensitivity and specificity that clear a high bar.

The CAP would leverage coupled lifespan and longitudinal omics data (proteomics and DNA methylation) from the NIH-funded Interventions Testing Program (ITP), providing public training datasets for researchers developing clocks and an independent third-party institution to benchmark clock performance based on non-public test data. On such a foundation of rigor, aging clocks can be further developed for accelerating human studies. 

To stop waiting three years for each preclinical lifespan study, and to avoid taking a decade to validate life-saving therapies that treat universal age-related diseases, we need a rigorous, visionary program to enable progress. Now is the time to concretize.

If you are a founder, researcher, or someone passionate about this topic, we invite you to join us in tackling or sponsoring this initiative, and to reach out to us for further discussion.

Why

‘Aging clocks’ are designed to model biological (rather than chronological) age based on molecular measurements. These clocks have been proposed as a tool for research into the biology of aging [2-5]. An objective function that is clear and useful for the field is key for the success of these computational tools, but such an objective function has not yet been established in clock development. Notably, research on aging clocks sometimes conflates two separate goals, which are best pursued separately in a specialized manner:

  1. Biomarker development: rigorously building and validating algorithms predictive of specific characteristics or endpoints (e.g., frailty index or age-related mortality)

  2. Feature discovery: quantitatively pinpointing patterns and relationships to reveal insights about aging biology and bolster discovery of targets and pathways

Here, we focus on the first aim: biomarker development. We posit that a major bottleneck to aging research is the amount of time and resources necessary to conduct lifespan studies, which are the surest way to validate an aging indication. Studying molecular characteristics and phenotypes of mice in interventional studies can be informative for the aging field, but the gold standard for physiological implication of aging biology remains lifespan studies, which take over two years in mice. This principle also applies to potential clock use in human clinical trials.

Clocks can accelerate this process and massively upscale discovery, provided each clock is precisely designed to be a concretized surrogate readout for a specific goal (e.g., predicting the lifespan of a mouse, or predicting its frailty index at two years of age). The aging field has already developed clocks that track well with chronological age and biological age, but tight correlation has only been robustly validated in the forward direction. In other words, we still need to show that clocks can predict the changes to mouse lifespan that an intervention will induce. Very little work has been done on comparing the predictions of clocks with future ground truth results. Even less has been done on juxtaposing the responsiveness of clocks to interventions with the actual changes caused by those interventions, as measured by endpoint data. For interventions that have not previously been tested in a randomized placebo-controlled lifespan or mortality study, neither the interventions nor the clocks can be validated by examining how the interventions affect clock predictions; we need ground truth endpoint data.

Developing, testing, and rigorously validating responsive clocks (which can forecast future endpoints) alongside intervention testing in mouse lifespan studies can open the door to massive improvements in the quality and scale of in vivo screens at an organismal level. Notably, this type of benchmark measures practical utility of potential clocks, rather than technical measures of prediction on a specific dataset.

Validation of concretized clocks in mouse lifespan studies can enable the translation of these tools to human studies. Clinical endpoints for age-related diseases, and possibly for age-related multimorbidity or mortality itself, will require clinical trials spanning five or more years to be sufficiently powered. While a long duration will be necessary for the first generation of aging clinical trials, getting aging drugs to the market can be accelerated in the second generation of aging trials by establishing clocks as a surrogate endpoint. Robust validation and acceptance of surrogate endpoints can take many years, but omics and multimorbidity and/or mortality data from the first generation of aging clinical trials can be used for development and retrospective validation of clocks. Crucially, lessons from preclinical clocks that forecast mouse lifespan will inform and set the precedent for development of highly accurate clocks specialized for human clinical trials.

How

The first step to concretizing clocks is defining a clear goal; the clock algorithm will be trained and evaluated based on this aim. 

We suggest forecasting mouse lifespan in interventional studies as the specific goal for clocks. Optionally, specific morbidities or endpoints (e.g., frailty index) in these animals can be included as well. An excellent source of this data already exists in the NIH-funded Interventions Testing Program (ITP), which also banks samples potentially usable for clock development and testing. We propose the CAP to couple the development and validation of these mouse lifespan predictors with the ongoing ITP. Specifically, the CAP will be an independent institution that provides neutral third-party validation of the predictive power of clocks, serving as a gold standard for measuring progress in the capabilities of clocks developed by anyone in the field.

The ITP tests up to six interventions a year in groups of pathogen-free male and female mice of the UM-HET3 stock [6]. Treatment with the intervention begins at four to six months of age, and lab mice typically live to around 30 months. 

We suggest the following timeline for collecting blood samples in the ITP placebo-controlled lifespan studies:

  • 6 months: baseline plasma proteomics and DNA methylation collected

  • 6 months: intervention administration begins for treatment group

  • 12 months: plasma proteomics and DNA methylation collected

  • 18 months: plasma proteomics and DNA methylation collected

  • 24 months: plasma proteomics and DNA methylation collected

  • End of lifespan for mice in study, plasma proteomics and DNA methylation collected when possible

The CAP will provide centralized processing for these omics data. Coupled with lifespan data (and optionally mouse health assessment and disease incidence, where collected), this generates an integrated dataset for predicting lifespan from omics. This data will be split into a publicly released training dataset, and a test dataset held within the CAP. Both the training set and test set will include data from control mice and intervention-treated mice, while the test set will additionally include omics and lifespan data on mice treated with novel interventions (not included in the training set), to assess generalizability of submitted clocks. 
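The split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CAP's actual procedure: the `Mouse` record, the arm labels (`drug_A`, `drug_B`, `novel_drug_X`), the 30% test fraction, and the function name `split_cap_dataset` are all hypothetical. Mice on interventions designated as novel go only to the held-back test set, while every other arm, including controls, is divided between the two sets.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Mouse:
    mouse_id: str
    arm: str            # "control" or an intervention label (hypothetical names)
    lifespan_days: int  # ground-truth endpoint


def split_cap_dataset(mice, novel_arms, test_fraction=0.3, seed=0):
    """Split mice into (train, test) lists.

    Arms listed in `novel_arms` are placed entirely in the test set.
    Every other arm is shuffled and divided, so arms with at least two
    mice appear in both sets.
    """
    rng = random.Random(seed)
    by_arm = {}
    for mouse in mice:
        by_arm.setdefault(mouse.arm, []).append(mouse)

    train, test = [], []
    for arm in sorted(by_arm):
        members = list(by_arm[arm])
        if arm in novel_arms:
            test.extend(members)  # never released to clock developers
            continue
        rng.shuffle(members)
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Toy cohort with made-up lifespans, for illustration only.
cohort = [
    Mouse(f"{arm}-{i}", arm, 800 + 10 * i)
    for arm, n in [("control", 10), ("drug_A", 10), ("drug_B", 10), ("novel_drug_X", 5)]
    for i in range(n)
]
train_set, test_set = split_cap_dataset(cohort, novel_arms={"novel_drug_X"})
```

Splitting within each arm, rather than across the pooled cohort, is one simple way to keep controls and treated animals in both sets; a real split would likely also need to account for sex, site, and cohort year.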

The CAP’s platform will allow any scientist to use the training data to develop predictive clocks based on proteomics, DNA methylation, or both, and submit the clock models for evaluation by the CAP. Specifically, researchers will be tasked with developing and training an X-month omics clock: a predictive model that takes in proteomics and/or DNA methylation data from a mouse aged X months and is trained to use these data to output a prediction for eventual lifespan. Submissions would require public release of methodology and code. A leaderboard of predictive power will be publicly posted.
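To make the prediction task concrete, here is a minimal sketch of what a submitted clock and its scoring could look like. Everything in it is an assumption for illustration: the proposal does not specify a submission interface or scoring metrics, so the `Clock` type, the `score_clock` function, the choice of mean absolute error and Pearson correlation, and the feature name `feature_a` are all hypothetical.

```python
from math import sqrt
from typing import Callable, Mapping, Sequence

# A clock maps one mouse's omics features, measured at X months of age,
# to a predicted eventual lifespan in days.
Clock = Callable[[Mapping[str, float]], float]


def mean_absolute_error(pred: Sequence[float], true: Sequence[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # correlation is undefined for constant input
    return cov / (sx * sy)


def score_clock(clock: Clock, features, lifespans) -> dict:
    """Score a clock against held-out ground-truth lifespans."""
    preds = [clock(f) for f in features]
    return {
        "mae_days": mean_absolute_error(preds, lifespans),
        "pearson_r": pearson_r(preds, lifespans),
    }


def toy_12_month_clock(omics: Mapping[str, float]) -> float:
    # Hypothetical linear rule on a made-up feature; not a real biomarker.
    return 700.0 + 100.0 * omics["feature_a"]


# Made-up held-out data: 12-month features and eventual lifespans (days).
held_out_features = [{"feature_a": 1.0}, {"feature_a": 2.0}, {"feature_a": 3.0}]
held_out_lifespans = [810.0, 890.0, 1000.0]
scores = score_clock(toy_12_month_clock, held_out_features, held_out_lifespans)
```

Because the clock only ever sees features from the earlier timepoint, while lifespans stay with the evaluator, this shape matches the idea of benchmarking against non-public test data.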

In the first year of the CAP, the focus will be generating training data and creating infrastructure for data accessibility and unbiased benchmarking. The ITP already collects blood samples from lifespan studies, albeit at fewer timepoints than we propose, and these samples will be used to immediately generate data from previous years of studies. Meanwhile, all new ITP studies will collect additional samples as described above to generate training data. 

Importantly, the blood samples will be biobanked, and all omics data collected will be made publicly available. The longitudinal collection of data will also enable data-driven profiling of the dynamics of aging, supporting efforts aiming at feature discovery and mechanistic insight.

The Future

As an independent, expert-advised organization, the CAP will be the gold standard benchmark for validating aging clocks, first deployed to accelerate preclinical discovery by reducing the amount of time it takes to test the hypothesis that a compound or combination extends healthy lifespan in mice. Holding new clocks to a common standard of rigor should ensure that scientific research leads to useful outcomes.
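One way to quantify how well a clock tests that hypothesis is to treat each intervention as a binary call and compare the clock's early call with the eventual lifespan result. The sketch below computes sensitivity and specificity under that framing. The decision rule that turns clock output into an "extends lifespan" call is left out and would need to be defined by the program; the function name and the toy data are hypothetical.

```python
def sensitivity_specificity(predicted, actual):
    """Return (sensitivity, specificity) for per-intervention calls.

    predicted[i]: clock's early call that intervention i extends lifespan.
    actual[i]:    whether the completed lifespan study showed extension.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true extender and one non-extender")
    return tp / (tp + fn), tn / (tn + fp)


# Made-up calls for six interventions, for illustration only.
clock_calls = [True, True, False, False, True, False]
lifespan_results = [True, False, False, False, True, True]
sens, spec = sensitivity_specificity(clock_calls, lifespan_results)
```

In this toy example the clock catches two of three true extenders and correctly rejects two of three non-extenders, so both values are 2/3.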

The CAP proposal focuses on predicting mouse lifespan and age-modulation, because rigorous testing depends on robust experimental data showing both extension and shortening of lifespan by multiple mechanisms. Such data exists in mice, but not yet in humans. Benchmarking predictive power for mouse clocks will immediately accelerate preclinical discovery, but we consider it a natural extension to use high-performing clocks (and the principles of their design) as the basis for developing robust human clocks that could eventually become clinical surrogate endpoints. To bridge between mouse lifespan clocks and human clocks, a dataset will be developed for a cross-species frailty index clock benchmarking program. Achieving cross-species integration requires additional efforts both in sample acquisition and data generation, and in design of clocks; proteins and DNA methylation sites are not identical across species. The prediction task has to be carefully formulated for the clock to be concretized and meaningful. 

However, a cross-species clock that forecasts frailty index is quite feasible for two reasons: 

First, a pan-mammalian clock (a single regression model) can predict relative age (chronological age divided by maximum species lifespan) with strikingly high correlation [7]. Second, the relationship between frailty index and relative age in mice and humans is close to identical, implying a very similar rate of deficit accumulation [8]. The CAP will set up a prediction task compatible with data on mice and humans, in which omics from a certain relative age are used to forecast frailty index at a later relative age. The focus on frailty index clocks can be extended to age-related reduction in intrinsic capacity, which is included in ICD-11 as code MG2A, and thus may have a direct path to primary endpoint status in the future. We envision branching our benchmarks into two categories over time: those focused entirely on mouse models, and those aiming towards performance in human predictions.
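As a rough sketch of how such a cross-species task could be specified, the snippet below expresses both the sampling age and the forecast age as relative ages and converts them back to chronological ages per species. The maximum lifespan values are placeholder assumptions chosen for illustration, not values taken from the cited work, and the `FrailtyForecastTask` structure is hypothetical.

```python
from dataclasses import dataclass

# Placeholder maximum lifespans in years (assumed values for illustration).
MAX_LIFESPAN_YEARS = {"mouse": 4.0, "human": 120.0}


def relative_age(chronological_age_years: float, species: str) -> float:
    """Relative age = chronological age / maximum species lifespan."""
    return chronological_age_years / MAX_LIFESPAN_YEARS[species]


@dataclass(frozen=True)
class FrailtyForecastTask:
    """Forecast frailty index at a later relative age from earlier omics."""
    species: str
    input_relative_age: float   # when omics are sampled
    target_relative_age: float  # when frailty index is measured

    def input_age_years(self) -> float:
        return self.input_relative_age * MAX_LIFESPAN_YEARS[self.species]

    def target_age_years(self) -> float:
        return self.target_relative_age * MAX_LIFESPAN_YEARS[self.species]


# The same relative-age task, instantiated for each species.
mouse_task = FrailtyForecastTask("mouse", 0.375, 0.5)
human_task = FrailtyForecastTask("human", 0.375, 0.5)
```

With these placeholder values, the same task samples omics at 1.5 years and forecasts frailty at 2 years in mice, and samples at 45 years and forecasts at 60 years in humans.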

Extending the CAP to human clock development in the long term will create an engine for developing and validating clinical surrogate endpoints to enable rigorous, accelerated human trials for therapeutics seeking to treat, prevent, and reverse age-related disease. Just as LDL cholesterol was proposed, validated, and adopted as a surrogate endpoint for cardiovascular disease and death (enabling faster trials and thus faster approval of life-saving lipid-lowering drugs), the CAP will be a nexus for proposing, benchmarking, open-sourcing, and validating surrogate endpoints for human clinical trials. CAP-validated surrogate endpoints will provide an early signal about the eventual primary outcome (e.g., mortality or multimorbidity in aging trials, or six-minute walk distance in sarcopenia trials), thus efficiently and rigorously accelerating access to healthspan-extending drugs that give us more autonomy during old age.

References

  1. Bocklandt, S., Lin, W., Sehl, M. E., Sánchez, F. J., Sinsheimer, J. S., Horvath, S., & Vilain, E. (2011). Epigenetic predictor of age. PLoS ONE, 6(6), e14821. https://doi.org/10.1371/journal.pone.0014821

  2. Horvath, S., & Raj, K. (2018). DNA methylation-based biomarkers and the epigenetic clock theory of ageing. Nature Reviews Genetics, 19(6), 371–384. https://doi.org/10.1038/s41576-018-0004-3

  3. Horvath, S., & Topol, E. J. (2024). Digitising the ageing process with epigenetic clocks. Lancet, 404(10451), 423. https://doi.org/10.1016/S0140-6736(24)01554-X

  4. Moqri, M., Herzog, C., Poganik, J. R., Biomarkers of Aging Consortium, Justice, J., Belsky, D. W., Higgins-Chen, A., Moskalev, A., Fuellen, G., Cohen, A. A., Bautmans, I., Widschwendter, M., Ding, J., Fleming, A., Mannick, J., Han, J.-D. J., Zhavoronkov, A., Barzilai, N., Kaeberlein, M., … Gladyshev, V. N. (2023). Biomarkers of aging for the identification and evaluation of longevity interventions. Cell, 186(18), 3758–3775. https://doi.org/10.1016/j.cell.2023.08.003

  5. Moqri, M., Herzog, C., Poganik, J. R., Ying, K., Justice, J. N., Belsky, D. W., Higgins-Chen, A. T., Chen, B. H., Cohen, A. A., Fuellen, G., Hägg, S., Marioni, R. E., Widschwendter, M., Fortney, K., Fedichev, P. O., Zhavoronkov, A., Barzilai, N., Lasky-Su, J., Kiel, D. P., … Ferrucci, L. (2024). Validation of biomarkers of aging. Nature Medicine, 30(2), 360–372. https://doi.org/10.1038/s41591-023-02784-9

  6. Macchiarini, F., Miller, R. A., Strong, R., Rosenthal, N., & Harrison, D. E. (2021). Chapter 10 - NIA Interventions Testing Program: A collaborative approach for investigating interventions to promote healthy aging. In N. Musi & P. J. Hornsby (Eds.), Handbook of the Biology of Aging (Ninth Edition) (pp. 219–235). Academic Press. https://doi.org/10.1016/B978-0-12-815962-0.00010-X

  7. Lu, A. T., Fei, Z., Haghani, A., Robeck, T. R., Zoller, J. A., Li, C. Z., Lowe, R., Yan, Q., Zhang, J., Vu, H., Ablaeva, J., Acosta-Rodriguez, V. A., Adams, D. M., Almunia, J., Aloysius, A., Ardehali, R., Arneson, A., Baker, C. S., Banks, G., Belov, K., … Horvath, S. (2023). Universal DNA methylation age across mammalian tissues. Nature Aging, 3(9), 1144–1166. https://doi.org/10.1038/s43587-023-00462-6

  8. Whitehead, J. C., Hildebrand, B. A., Sun, M., Rockwood, M. R., Rose, R. A., Rockwood, K., & Howlett, S. E. (2014). A clinical frailty index in aging mice: comparisons with frailty index data in humans. The Journals of Gerontology. Series A, Biological Sciences and Medical Sciences, 69(6), 621–632. https://doi.org/10.1093/gerona/glt136

Inspired? Have something to contribute? Reach out to us: