Opportunities for Quantitative Translational Modeling in Oncology

A 2-day meeting was held by members of the UK Quantitative Systems Pharmacology Network (<http://www.qsp-uk.net/>) in November 2018 on the topic of Translational Challenges in Oncology. Participants from a wide range of backgrounds were invited to discuss current and emerging modeling applications in nonclinical and clinical drug development, and to identify areas for improvement. This resulting perspective explores opportunities for impactful quantitative pharmacology approaches. Four key themes arose from the presentations and discussions that were held, leading to the following recommendations:
• Evaluate the predictivity and reproducibility of animal cancer models through precompetitive collaboration.
• Apply mechanism of action (MoA) based mechanistic models derived from nonclinical data to clinical trial data.
• Apply MoA reflective models across trial data sets to more robustly quantify the natural history of disease and response to differing interventions.
• Quantify more robustly the dose and concentration dependence of adverse events through mathematical modeling techniques and modified trial design.
There is a growing worldwide population of patients with newly diagnosed cancer, 12.7 million in 2008 1 rising to 17 million in 2018, 2 with a consequent rise in cancer mortality. Thus, there is an existing and increasing need for cancer therapies. Estimates of the research and development (R&D) cost of a new drug are also increasing, with the latest estimate of pre-tax capitalized cost set in the order of $2.6 billion per approval. 3 Although the most recent estimated total oncology market size was US $123.8 billion in 2018, 4 the cost of failure is considerable as well. The data in ref. 3 suggest that across therapy areas the capitalized cost per program failing in phase III is of the order of $1.2 billion. Clearly there is a need to improve efficiency in oncology drug R&D to avoid exposing patients to ineffective treatments and the loss of capital that could be spent on other valuable research programs. The largest bottleneck in the delivery of new drugs is still in phase II, with success rates of merely 36%. This indicates that one of the major hurdles is the lack of translatability of the therapeutic index hypothesis, formulated in drug discovery, into the clinical setting. Improvement of translational tools to support the decision-making process from discovery to development could significantly reduce attrition and drug development costs. The purpose of this paper is, therefore, to describe the challenges currently faced within the field of translational oncology, and to propose recommendations on best practices and opportunities for modeling to address these.
The working definition of "translation" extends from nonclinical lead optimization to late development phases. In addition, the definition of "translation" is, here, geared toward supporting quantitative decision making. Despite a robust and diverse suite of in vivo models and successful mathematical models to describe tumor growth and treatment effect, examples where these have led to successful translational signals are still sparse. Several challenges remain, which can be summarized as:

NONCLINICAL IN VIVO TUMOR MODELS USED IN ONCOLOGY
Animal models are important tools for drug discovery, extending from target evaluation and validation through to characterization of the pharmacology and pharmacodynamics (PDs) of a novel therapeutic candidate drug. For targeted therapies, in vitro and in vivo tools are routinely used to characterize and quantify the pharmacology of new drug candidates, with some successful examples of quantitative translation to the clinic. [5][6][7][8] For the emergent field of immuno-oncology (IO), animal models are often essential to fully explore the drug target biology, given the complexities associated with the modulation of the immune system, which cannot be faithfully recapitulated using reductionist in vitro systems. Given the huge numbers of potential clinical combinations that could be explored, nonclinical in vivo models, in concert with in vitro systems, can also be used to evaluate combination approaches. Therefore, data generated in these models can influence go/no go decisions for a drug target/molecule in nonclinical development and influence the strategy for evaluation in patients. In consequence, it is critical that the data generated in these models are translationally relevant.
Transplantable xenograft models of cancer allow the study of tumor tissues of human origin using a living, but immunocompromised rodent as a host. There is a wide variety of well-characterized human cell lines available that can be grown in vivo, thus offering a varied menu of cell line-derived xenograft (CDX) models with which to study a specific disease setting. Another specific type of xenograft model is the patient-derived xenograft (PDX), in which primary patient tumor fragments are transplanted into mice without the need to establish cell lines. PDX models have experienced an extraordinary development in the last few years, as new transplantation and preservation techniques have allowed the increased availability of tumor models from different disease areas. Their genetic and microenvironment similarity to their disease of origin, albeit in the absence of an adaptive immune system, makes the results obtained from them potentially more translatable than those from CDX. As such, PDX models are increasingly used in "n = 1 mouse PDX trials" to explore genetic/epigenetic/transcriptome predictors of drug sensitivity for patient selection hypotheses. 9 Although tumor response end points similar to those used in the clinic are derived from these studies, a current gap is that quantitative translation has not been demonstrated. Although widely used to explore targets expressed in/on the cancer cell and vasculature, these CDX and PDX models provide no insight into the impact of the adaptive and (to varying extents) the innate immune system, as these tumors will only grow in immunodeficient animals.
In the era of IO, the need to use immunocompromised mice in CDX and PDX models becomes a major barrier to evaluating drug-induced tumor shrinkage of IO agents. Alternative models, such as syngeneic and genetically modified models of cancer, may be used when an immune system is required for efficacy. In contrast to xenografts, transplantable syngeneic models have been far less well-characterized. For standard mouse syngeneic models, recent studies have provided additional characterization of the genomic, transcriptomic, and tumor microenvironmental landscape. 10,11 With this increased level of characterization, transplantable syngeneic models can be more readily selected to test a specific hypothesis. This deeper characterization of the syngeneic models provides an opportunity to link therapeutic response to the phenotype/genotype of the tumor, and so to bridge data generated in these models to the clinical setting. Using this approach, insights from the syngeneic models can be used to generate hypotheses about patient selection and identify relevant biomarkers of response. 10 However, there are some limitations associated with transplantable syngeneic tumor models, namely the limited number of available models compared with xenografts as well as their rapid growth rates (outlined further in Table 1). Genetically engineered mouse models (GEMMs) recapitulate the anatomic site of disease through use of tissue-specific promoters and can be designed with mutations that frequently occur in human disease. GEMMs may, therefore, recapitulate the formation of a complex microenvironment shaped by the stochastic interaction of the tumor cells with the immune system. 12 However, their long latency times and requirement for large breeding colonies limit the throughput of these models. Moreover, when compared with chemically induced syngeneic transplantable models, GEMMs have a limited mutational burden and genetic mosaicism, which may influence the development of an antitumor response and, therefore, may not recapitulate the clinical setting. 13
For all these animal tumor models, it is standard practice to measure subcutaneous tumor volume by manual measurement of length and width using calipers. A spheroid mathematical formula is then used to approximate volume. This method suffers from inter-operator and intra-operator variations. 14,15 Further, the repeatability and accuracy of caliper measurements are negatively affected by the morphological complexity of tumors: tumors are often irregular, may contain fluid, and are mobile beneath the skin. 16 Three-dimensional scanning provides an opportunity to noninvasively derive tumor morphometry, such as length, width, area, height, or volume. 15 Thermography is used to enhance the localization of subcutaneous tumors, as xenografts tend to be cooler than the surrounding tissue. 17 This effect is probably due to the tortuous vasculature within the tumor being inefficient at distributing blood. Furthermore, thermal imaging is, in most instances, sufficiently sensitive to detect tumors before they can be palpated and can detect changes due to therapy within hours of exposure. 17
The frequency of these caliper measures is typically once every 2-3 days, which is a considerable burden on laboratory resources. Parra-Guillen et al. 18 demonstrated that reducing the number of measurements to twice per week, or even once per week for cell lines with low growth rates, showed little impact on model-estimated parameter precision. However, large studies (i.e., > 50) were still required to accurately characterize parameter variability. A further consideration is that for heterogeneously responding animal models, reporting, or modeling, the group mean tumor size is inappropriate. Analysis of xenograft efficacy data can also be supplemented by considering animal dropout in the studies. 19 This is particularly important where welfare issues due to tumor burden and drug toxicity can confound the interpretation of data.
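The caliper-based volume approximation described above can be illustrated with a minimal sketch. The text does not state which spheroid formula is used, so the prolate spheroid form V = (π/6) × length × width² is an assumption here (other laboratories use 0.5 × length × width²), and the function names are hypothetical.

```python
import math


def caliper_tumor_volume(length_mm: float, width_mm: float) -> float:
    """Approximate tumor volume (mm^3) from two caliper measurements.

    Assumes a prolate spheroid, V = (pi / 6) * L * W^2, which is one
    common convention and not necessarily the one used in a given study.
    """
    if length_mm <= 0 or width_mm <= 0:
        raise ValueError("length and width must be positive")
    # By convention the longer measurement is treated as the length.
    long_axis = max(length_mm, width_mm)
    short_axis = min(length_mm, width_mm)
    return math.pi / 6.0 * long_axis * short_axis ** 2


def relative_difference(volume_a: float, volume_b: float) -> float:
    """Relative difference between two volume estimates of the same tumor,
    for example from two operators, expressed as a fraction of their mean."""
    return abs(volume_a - volume_b) / ((volume_a + volume_b) / 2.0)
```

Because the width enters as a square, a small disagreement between operators on the shorter axis has a disproportionate effect on the volume estimate, which is one way to see why inter-operator variation matters.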
Tolerability assessment in animal disease and efficacy models is important to evaluate animal welfare and potential confounding effects in the interpretation of the studies. Body weight is measured daily (with > 20% body weight loss being judged as intolerable in the United Kingdom). This is probably largely indicative of gastrointestinal toxicities, which may be shown to be important dose limiting toxicities. General behavioral signs are also monitored (e.g., lethargy, posture, and gait changes). Usually, these are observed in a small fraction of animals, and their scoring can be idiosyncratic. Other more objective external signs, such as redness, ulceration, necrosis, and pallor, can be used as biomarkers for animal welfare. There are combined visual scanning and artificial intelligence solutions being developed to automatically identify these signs. 20 Being able to measure cardiovascular end points, including hemodynamics (e.g., blood pressure) and blood count changes, might also enable more informative animal pharmacology studies. However, given the size of a mouse, the volume and number of blood samples that can be obtained is limited, and competes with samples for pharmacokinetic (PK) studies, which are arguably more crucial to the exposure-response interpretation of an efficacy study. In addition, proper assessment of drug safety, and thus of the potential therapeutic index, is usually done in dedicated, and ultimately regulated, healthy animal safety studies, where disease status and progression are not a confounder.

TRANSLATABLE EFFICACY END POINTS
Even though both patients and animal models of cancer typically present with measurable tumor burden, the use of different measurement techniques and end points in each setting presents a challenge (see Figure 1). For xenografts, the usual end points are the percent of tumor growth inhibition, calculated by comparing the change in mean tumor size in the control group to that observed for the treatment group, and growth delay. For clinical tumor growth assessment, there is rarely a placebo-controlled or untreated comparison, so a similar analysis approach is not possible. Instead, clinical tumor response is typically evaluated using the Response Evaluation Criteria in Solid Tumors (RECIST)-based 21 objective response rate (ORR) and progression-free survival (PFS). In RECIST 1.1, tumor burden is measured as the sum of longest diameters (SLDs) of up to 5 measurable lesions. ORR is calculated as the percentage of patients who achieve at least a 30% decrease in SLD from baseline at any assessment. PFS is measured as time to tumor progression, or time to death (whichever comes first). Although PFS can be used to support new drug approval, it is time to death from any cause (a.k.a. overall survival (OS)), which is considered the gold standard for drug efficacy. For animal welfare reasons, survival is not an end point used in nonclinical studies: however, the time to a predefined death end point is the closest nonclinical end point comparable to OS. These nonclinical and clinical end points are often compared inappropriately: for example, tumor growth inhibition in an animal model is often directly correlated to clinical ORR, without proper PK/PD driven interpretation and without consideration of whether the xenograft/allograft model used was representative of the clinical population. 7 The work of Wong et al. 7 and Rocchetti et al., 22 as well as Inaba and co-workers 23,24 has demonstrated the potential to predict the clinical efficacy, at least as a ranking, of multiple compounds by considering PK differences between species as well as potency and clinical tolerability. An analysis of the National Cancer Institute nonclinical and clinical databases 25 is suggestive that broad activity across a range of animal cancer models is indicative of activity in patients. However, the same analysis did not demonstrate that matching histology increased predictivity.
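To make the difference between the nonclinical and clinical end points concrete, the sketch below computes a percent tumor growth inhibition from group mean tumor sizes and a simplified ORR from SLD values. It follows the definitions given in the text; the function names and input layout are hypothetical, and the ORR calculation deliberately ignores RECIST details such as response confirmation, nontarget lesions, and new lesions.

```python
from statistics import mean


def percent_tgi(control_start, control_end, treated_start, treated_end):
    """Percent tumor growth inhibition from group mean tumor sizes.

    %TGI = 100 * (1 - delta_treated / delta_control), where each delta is
    the change in group mean tumor size over the study period.
    """
    delta_control = mean(control_end) - mean(control_start)
    delta_treated = mean(treated_end) - mean(treated_start)
    if delta_control <= 0:
        raise ValueError("control group did not grow; %TGI is undefined")
    return 100.0 * (1.0 - delta_treated / delta_control)


def objective_response_rate(baseline_sld, best_sld):
    """Simplified ORR (%): share of patients whose smallest on-study SLD is
    at least 30% below baseline. Not a full RECIST 1.1 implementation."""
    if len(baseline_sld) != len(best_sld) or not baseline_sld:
        raise ValueError("inputs must be non-empty and of equal length")
    responders = sum(
        1 for base, best in zip(baseline_sld, best_sld)
        if (best - base) / base <= -0.30
    )
    return 100.0 * responders / len(baseline_sld)
```

The two quantities are built from different inputs: %TGI needs a concurrent control group and uses group means, whereas ORR is a per-patient classification relative to each patient's own baseline. This is part of why a direct correlation between them needs PK/PD-driven interpretation.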
Based upon the above observations in the literature, achieving unbound drug exposure 26 and biomarker modulation in patients similar to those observed in animal studies should be considered as necessary (e.g., ref. 27), but not sufficient for observing efficacy in the clinic. Obtaining PD data in patients is an opportunity to back-translate and build upon nonclinical learning and improve translation. The pembrolizumab case study 28 illustrates the benefits of such an approach and provides a useful roadmap for the challenging case of immunomodulation. However, obtaining tumor-based PD data in patients requires invasive testing, considerable resource, and may ultimately impact patient recruitment to clinical trials.
A clinical (and nonclinical) challenge is measuring PD longitudinally in the same subject. Failing to collect longitudinal data makes it very difficult to disentangle the various sources of variability from the actual PD trajectory. Paired (pre-start and post-start of treatment) biopsies are commonly collected in early clinical trials to demonstrate target engagement in tumors and to determine whether the necessary pharmacology for efficacy is achievable at tolerated doses. The two biopsies will not be from the same location of the tumor and, therefore, some variability between the samples is driven by variations in the composition of the samples, in addition to whether the sample is taken from a primary vs. metastatic lesion. The situation is even more challenging in nonclinical studies, where, despite the ease of measuring tumor burden longitudinally, only control and treatment groups are compared for PD, and there is no internal control for interindividual variability (i.e., tumor tissue samples for PD analysis are often collected at termination only).
Thus, there is the potential for significant inflation of PD variability estimates. Some laboratories have investigated the use of fine-needle aspirates to obtain serial samples for PD from subcutaneously implanted tumors in mice. 29,30 Although the data are encouraging, some spatial variability still confounds full analysis. Therefore, further investment in experimental methods that allow more nonterminal longitudinal tumor biopsies in animals is recommended.
There is also the potential to measure more clinically comparable circulating and imaging end points. Indeed, Rago et al. 31 have reported the measurement of circulating tumor DNA in mice as a longitudinal surrogate for tumor burden. Further, Mair et al. 32 demonstrated that, based upon data from a patient-derived orthotopic model of glioblastoma, mitochondrial DNA has potential as a sensitive end point for detecting tumors. Imaging of tumor metabolism has been investigated with fluorodeoxyglucose-based positron emission tomography imaging in nonclinical and clinical studies 33,34 demonstrating this marker to be predictive of clinical outcome after chemotherapy, as well as with fluorescence imaging. 35 Other metabolism-based imaging end points have been investigated in mouse models, including labeled pyruvate, fumarate, and lactate, 36,37 and shown to be predictive of treatment effect. These noninvasive longitudinal end points, which can be measured in both animal models and patients, present the opportunity for more translatable criteria to proceed in the clinic.

TRANSLATIONAL CONSIDERATIONS IN PHASE I ONCOLOGY TRIALS
All oncology first-in-human (FIH) trials make use of nonclinical data in their justification and study design. Aspects of FIH trials that may use these data include:
• The starting dose. This is typically derived from a combination of animal safety data and predicted human PK, derived from in vitro and in vivo PK data. 38
• The PK sampling scheme. The predicted human PK profile is commonly used to plan blood collection time points to optimally evaluate clinical PK. Nonlinearities observed in nonclinical PK studies may be incorporated.
• The planned dose range and dosing frequency. By identifying a target efficacious concentration through nonclinical biomarker/xenograft studies, this information may be coupled with the predicted human PK to project the efficacious dose range in humans. This helps evaluate the viability of the candidate molecule and helps to plan the dose amounts and dosing frequencies that may be explored in the clinical trial.
• The PD sampling schedule. If a biomarker for target engagement or biological activity is identified preclinically and might be relevant/detectable in the clinic, the dynamics of the PD response in vivo, as characterized preclinically, may be used to inform PD sample collection time points in the clinical study.
In seeking an understanding of the potential efficacious dose range from nonclinical data, robust translational animal models and appropriate PK/PD analysis tools are required. Where soluble or tissue PD biomarkers used in animals can be evaluated in patients, exploring the dose range during FIH trials in the clinic becomes a highly informed process. However, such biomarkers often do not readily exist, so clinical dose exploration is typically limited to identifying safe and tolerated doses up to a maximum tolerated dose (MTD) level. Nevertheless, by using a guided approach to dose escalation and schedule exploration based on in vivo potencies and efficacious drug concentrations identified in animal models, the potentially efficacious dose and dosing schedule may be better identified ahead of phase II studies. FIH dose escalation plans, which incorporate nonclinical predictions of the efficacious dose range, should be updated with clinical (and nonclinical) safety and/or efficacy data as they emerge, to inform dosing regimens for testing in subsequent studies.
The approach described above, where predicted exposure-response relationships are updated as clinical (and nonclinical) data become available, is most powerful when it is actively incorporated into FIH study design. Such alternative model-based study designs for FIH trials take emerging data and update the PK/PD model assumptions and parameters, then use the revised predictions to inform the next dose level. One such example of model-based study design, the time-to-event continual reassessment method, 39 incorporates a time-dependent dose-toxicity relationship, which predicts the likelihood of an adverse event. Time-to-event continual reassessment method designs provide the opportunity for longer dose limiting toxicity windows without excessive delays in running the study, which may be particularly relevant for radiotherapy studies. Another example of model-based study design is the exposure-driven escalation with overdose control, 40 which uses emerging PK information from the ongoing clinical study to update the model and to predict the next dose. In all, these designs allow the nonclinical data package to inform clinical dose escalation, while also considering human PK/PD data as they become available. These study designs typically offer a faster path to FIH study completion, partly because the Bayesian methodology accepts uncertainties associated with small patient data sets. Model-based approaches also incorporate data from all cohorts, something rule-based approaches generally do not do.
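The general idea of updating a dose-toxicity model with data from all cohorts and using it to choose the next dose can be sketched with a basic continual reassessment method. This is a deliberately simplified illustration, not the time-to-event variant 39 or the exposure-driven design 40 cited above: it assumes a one-parameter power model, p_i = skeleton_i ^ exp(a), a normal prior on a with an arbitrary standard deviation, a hypothetical target toxicity rate, and a grid approximation of the posterior. All names and default values are illustrative.

```python
import math


def _tox_prob(skeleton_value, a):
    """One-parameter power model for the probability of a dose limiting
    toxicity (DLT), clamped away from 0 and 1 for numerical safety."""
    p = skeleton_value ** math.exp(a)
    return min(max(p, 1e-12), 1.0 - 1e-12)


def crm_next_dose(skeleton, doses_given, dlt_observed,
                  target=0.25, prior_sd=1.0, grid_n=2001):
    """Return (recommended dose index, posterior mean DLT probability per dose).

    skeleton     -- prior guesses of DLT probability per dose level (increasing)
    doses_given  -- dose level index given to each patient treated so far
    dlt_observed -- 1 if that patient had a DLT, else 0
    target, prior_sd -- illustrative assumptions, not recommended values
    """
    # Grid over the single model parameter a, covering +/- 4 prior SDs.
    grid = [prior_sd * (-4.0 + 8.0 * k / (grid_n - 1)) for k in range(grid_n)]
    weights = []
    for a in grid:
        log_w = -0.5 * (a / prior_sd) ** 2  # unnormalized log prior
        for dose, dlt in zip(doses_given, dlt_observed):
            p = _tox_prob(skeleton[dose], a)
            log_w += math.log(p) if dlt else math.log(1.0 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    # Posterior mean DLT probability at every dose level, using all patients.
    post_tox = [
        sum(w * _tox_prob(s, a) for a, w in zip(grid, weights)) / total
        for s in skeleton
    ]
    # Recommend the dose whose estimated DLT probability is closest to target.
    best = min(range(len(skeleton)), key=lambda i: abs(post_tox[i] - target))
    return best, post_tox
```

A call such as `crm_next_dose([0.05, 0.10, 0.25, 0.40, 0.55], [0, 0, 0, 1, 1, 1], [0, 0, 0, 0, 1, 0])` returns an updated estimate for every dose level, including those not yet tested. A practical design would add safeguards omitted here, such as not skipping untested dose levels and explicit stopping rules.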
There is ongoing debate as to whether phase I studies are adequately designed for the purposes of the clinical development program that will follow. As an example, it is not clear whether current rule-based dose escalation designs 41 fully characterize the safety and tolerability profile of the drug, particularly chronic dose-limiting low-grade toxicities. Even the newer designs discussed above may fail to predict delayed or chronic toxicities due to the definition of the dose limiting toxicity period. This leads to dose interruptions and reductions in later clinical development 42 that the FIH study did not clearly predict. As data emerge from the FIH and phase II clinical trials, PK/PD modeling should be used during clinical development to evaluate the therapeutic window and propose optimized dosing regimens. This allows any newly proposed doses to be assessed prior to registration where possible (e.g., phase II/III). Making final dose decisions based solely on FIH data can be particularly error-prone, as cancer drugs often exhibit large PK and biological variability either due to drug characteristics or the use of highly heterogeneous patient populations (e.g., multiple tumor types) and small patient numbers in typical FIH study designs. In addition, as oncology FIH studies become more complex and potentially continue through to registration (e.g., via a multi-arm multistage study), such re-evaluation of the optimal dose and dosing regimen should be actively considered.
Historically, FIH studies have prioritized safety and MTD finding as the primary objective(s), with PK, immunogenicity (for biologics), and PDs being either secondary or exploratory objectives. However, the MTD approach does not necessarily best inform the recommended phase II dose. 43,44 This is particularly relevant for highly selective drugs (e.g., monoclonal antibodies), which elicit maximal effect at sub-MTD doses. 45 As drugs become highly targeted, PK and PD end points and combined PK/PD analyses should take greater prominence in FIH design and recommended phase II dose identification. Furthermore, cancer drugs are commonly used in combination with other novel drugs or standard of care therapies, and many studies now incorporate both single agent and combination arms, so PK and PD can be characterized in parallel. In these complex cases, PK/PD evaluation can (as study design allows) be used to identify and quantify the individual contributions of each combination drug to the observed combination therapy safety/efficacy profile, and thus optimize the combination dose to achieve the best possible benefit/risk ratio.

TRANSLATIONAL CONSIDERATIONS IN PHASE II AND PHASE III
Regulatory agencies consider OS as the most reliable and, if feasible, the preferred end point for oncology trials. 46 However, for cancer indications with median survival times > 1 year, relying solely on OS for drug approval can significantly delay the availability of treatments to patients. In lieu of OS, regulators frequently accept PFS, and occasionally ORR, as end points for accelerated approval pathways and registration. Furthermore, whereas OS is an unambiguous, precisely measured end point, its relationship to the study treatment under consideration can be confounded by the subsequent treatment sequence. A key challenge, therefore, is the prediction of OS from early indicators of drug activity, such as change in disease burden as assessed by RECIST criteria. 47,48 Early indicators of activity, such as change in SLDs at 8 weeks, may indicate extent, but not duration, of response. 49 Such work is useful to provide an early assessment of drug viability 50,51; but typically cannot be used to support drug registration. An interesting model-based observation 52 is the potential discordancy between initial rate of tumor shrinkage and the duration of response. Further, after progressing on an early line of treatment, a patient is often switched to a different treatment, which will hopefully improve their OS. It is likely that each line of treatment impacts the composition of progressing tumors 53 in a drug target-specific and mechanism-specific way, probably resulting in increasing within-patient and between-patient disease heterogeneity as treatment proceeds. Indeed, in some trials where the study drug becomes an approved treatment option, it is possible that some patients who were randomized to receive placebo may later receive study treatment. Statistical methods have been proposed to derive unbiased estimates of treatment effect in the presence of treatment switching. 54 It should also be recalled that phase I trials usually recruit patients with late-stage disease with a range of different tumor types. Clearly, the development of predictive early efficacy end points and modeling techniques is an area requiring ongoing modeling research. Of note is that for IO treatments a revised assessment criterion, iRECIST, 55 has been developed to account for differences in the initial response of tumors to those treatments.
In later phase trials, modeling of clinical efficacy data concentrates on SLD, PFS, and OS where available. When modeling tumor dynamics, researchers should be aware of the information that may be gained by using the spectrum of currently available and emerging radiographic features instead of reducing the richness of the imaging data to just SLD. These features include the following:
a. individual lesion diameters (ILDs) and their response to treatment
b. lesion location
c. lesion volume
d. lesion morphology
e. nontarget lesion data and appearance of new lesions
An example of a currently available feature is the ILD, and modeling data sets should include ILDs wherever possible. If an exploratory analysis reveals divergent or mixed responses between individual lesions within subjects, researchers should consider modeling ILDs rather than SLDs. Such an analysis can be achieved using a nested hierarchical model, such as a mixed effects approach. 56,57 There can be significant correlation between ILDs within a patient, 58 possibly driven by PK and common genetics; however, they are not perfectly correlated. The impact of dropout, compliance, and the existence of patient subpopulations, for example, via statistical mixture modeling, should also be explored. Promising efforts are under way to combine qualitative longitudinal data on nontarget lesions and new lesions with target lesion data to improve outcome predictions. 59 Moreover, incorporating biomarker responses in tumor kinetic modeling has the potential to resolve the pharmacological variability of different response types in order to better predict the duration of response. 60 Despite some progress, longitudinal tumor size models tend to be agnostic of mechanism of action (MoA). 61,62 Careful review of the dynamic features of nonclinical data accompanied by MoA-based hypothesis generation and model-based hypothesis validation can help identify mechanistic model features in nonclinical data, which may be hard to establish in clinical data due to small subject numbers and limitations of clinical practice. To this end, in vitro data on pazopanib's anti-angiogenic and cytotoxic effects together with dynamic features in mouse in vivo data guided the development of a semimechanistic model, including both MoAs, which resulted in improved characterization of both in vivo and clinical tumor size data. 63 Another opportunity to inform more mechanistic clinical tumor size models is provided by applying nonclinical mathematical models linking PD biomarker changes to tumor response 8,64 to clinical data.
A second challenge for oncology clinical development is the complexity arising from the need to combine an experimental treatment with the current standard of care as a "back bone" treatment 65 or with another experimental treatment to increase the pool of responding patients. With many novel-novel combinations available, nonclinical experiments and translational modeling play an important role to guide clinical development by: (1) prioritizing the choice of combination partner and the degree of synergy by in vivo efficacy ranking 27; (2) assessing whether concurrent administration or specific sequencing is more promising or worth exploring 64,66; and (3) characterizing the in vivo PD response to guide combination biomarker strategy and starting dose selection. 67 There are few examples of clinical trials that go on to test optimal sequencing of treatments, although one notable example is the GeparNuevo trial combining an immune checkpoint inhibitor with chemotherapy. 68
Third, there are challenges to applying PK/PD and other dose/exposure analyses in oncology. Partly, this is due to the impact disease can have on PK and so "PD feeds back on PK," as has been shown for checkpoint inhibitors. 69,70 Here, disease state variables with prognostic potential can impact drug exposure, and thus can act as confounders for exposure-response analyses. There are also the issues of adherence, dose interruptions, and dose reductions, which might confound standard exposure-response analyses. This makes interpreting efficacy and safety end points challenging. A good example of this 71 was the abemaciclib MONARCH 2 trial, where OS curves stratified by drug exposure on day 15 of treatment suggested that patients with the lowest exposure fared better. However, this is likely to be an artifact introduced by dose interruptions and reductions seen predominantly for patients with greater than average drug exposure. Indeed, for monoclonal antibodies whose PK depends on target expression, time-varying changes in drug clearance are often related to response, such as for avelumab where the magnitude of reduction in drug clearance over time was higher in responders compared with nonresponders. 72 Another challenge to exploring full exposure-response in oncology is that exposing patients with cancer to doses that are predicted to be ineffective (as would be expected in a full dose-response characterization) raises ethical questions, and many of the newer FIH trial designs minimize the number of patients treated in such low-dose cohorts.
Radiotherapy is an example where translatable nonclinical (in vitro) data inform the treatment of patients, including the optimization of dosing schedules when incorporated into mechanistic mathematical models of tumor response. 73 The Linear-Quadratic (LQ) model 74 works well in its domain of applicability: radiation doses of 1-8 Gray. Knowledge of the cumulative on-target effect of radiation, toxicities in neighboring healthy tissues, and the potential for tumor recovery during treatment holidays has led to a quantitative framework to guide patient treatment, in which LQ model parameters from in vitro studies with cell lines relevant to a patient's disease are used. The effectiveness of these approaches is backed up by prospective trial results. 75 However, the LQ model may not be appropriate for some of the hypofractionated high-dose regimens that are now being tested. 76 Therefore, an updated model is required to broaden the domain in which this modeling framework can be applied. Park et al. 76 proposed a "universal" survival model that combines the LQ equation at low doses with the multitarget model at high doses: below a certain cut-off dose the LQ equation is used, whereas above this dose the multitarget model is used.
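To make the piecewise idea concrete, the following is a minimal sketch rather than the published implementation. It assumes that the high-dose branch is the straight-line asymptote of the multitarget curve, ln S = -(d - Dq)/D0, and that the two branches are joined so that both the curve and its slope are continuous; under that assumption the LQ beta and the cut-off dose follow from alpha, D0, and Dq. The function names and the numerical parameter values are illustrative placeholders, not values taken from the text.

```python
import math


def usc_parameters(alpha, d0, dq):
    """Return (beta, transition_dose) implied by a smooth join of the branches.

    Assumption of this sketch: the LQ curve and the multitarget asymptote
    share the same value and slope at the transition dose.
    """
    if alpha * d0 >= 1.0:
        raise ValueError("need alpha * d0 < 1 for a positive transition dose")
    beta = (1.0 - alpha * d0) ** 2 / (4.0 * d0 * dq)
    transition_dose = 2.0 * dq / (1.0 - alpha * d0)
    return beta, transition_dose


def lq_log_survival(dose, alpha, beta):
    """Natural log of the surviving fraction under the LQ model."""
    return -(alpha * dose + beta * dose ** 2)


def piecewise_log_survival(dose, alpha, d0, dq):
    """LQ below the transition dose, multitarget asymptote above it."""
    beta, transition_dose = usc_parameters(alpha, d0, dq)
    if dose <= transition_dose:
        return lq_log_survival(dose, alpha, beta)
    return -(dose - dq) / d0


if __name__ == "__main__":
    # Hypothetical parameters (alpha in 1/Gy, D0 and Dq in Gy).
    alpha, d0, dq = 0.33, 1.25, 1.8
    beta, d_t = usc_parameters(alpha, d0, dq)
    print(f"beta = {beta:.4f} 1/Gy^2, transition dose = {d_t:.2f} Gy")
    for dose in (2.0, 8.0, 20.0):
        lq = math.exp(lq_log_survival(dose, alpha, beta))
        pw = math.exp(piecewise_log_survival(dose, alpha, d0, dq))
        print(f"{dose:5.1f} Gy  LQ S = {lq:.3e}  piecewise S = {pw:.3e}")
```

At conventional fraction sizes the two curves coincide by construction. They diverge only above the transition dose, where the pure LQ curve keeps bending downward and predicts more cell kill than the straight-line branch.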
An interesting development is the use of image-guided radiation, including the use of image density as a surrogate of cellularity, with the aim of concentrating the dose where the tumor cells are most prevalent. This touches on the application of radiomics 77 to provide surrogates of biological processes that might aid diagnosis and treatment decisions as well as provide useful mechanistic information for modeling approaches. As part of the deep learning revolution in imaging, new feature extraction methods are emerging that facilitate the extraction of biomarkers from images and their multifaceted analysis. 78

MATHEMATICAL MODELING APPROACHES TO TRANSLATION IN ONCOLOGY
For systemic therapies used to treat animal models, several mathematical modeling approaches have been applied. The most commonly adopted models focus primarily on drug effects in a nonspatially resolved framework of coupled equations, with tumor mass described by a growth law. The models for mass growth vary greatly, ranging from simple exponential 8 to more complex models, 79 including models based on tumor surface growth. 80 However, distinguishing between growth laws is challenging, given the available data. 81 In practice, the utility of the modeling has been largely to formally derive a concentration-effect relationship, 82 but also to investigate drug MoA, scheduling, and combinations (e.g., ref. 64) by incorporating biomarkers of target engagement and/or biological response. The end goal of the model is then to establish these longitudinal biomarkers as predictors of tumor response, together with their relationships to dose and/or concentration. These relationships are then used to set target concentrations (and so doses) that support taking a new molecular entity into the clinic, as well as to set PK, PD, and tumor kinetic criteria for progression into phase II trials. The rationale for this is that biomarker modulation enhances and augments the predictive power of changes in tumor mass. 60
Image-guided radiation systems provide near daily measurements of tumor size, something that is not seen in trials of systemic therapies. These informative data 83 allow the initial response and growth of tumors to be modeled mathematically in detail. Further mathematical modeling studies 84 have incorporated tumor composition and spatial detail to explain delayed treatment responses. However, for clinical data, such full spatial model parameterizations are typically infeasible, because the temporal resolution of the data only allows a concentration or dose dependency to be determined from the model. As a result, Lewin 84 also developed a simpler two-compartment ordinary differential equation-based model to describe tumor dynamics in response to radiotherapy, which could be used for the type of tumor size data typically collected in the clinic.
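As an illustration of how a two-compartment description can reproduce a delayed shrinkage in measured tumor size, the sketch below tracks a viable compartment that grows exponentially and a doomed compartment that is cleared at a first-order rate. Each radiation fraction moves the LQ-killed share of viable cells into the doomed compartment. This structure, the function name, and all parameter values are assumptions made for illustration; it is not the model of ref. 84.

```python
import math


def simulate_tumor(v0, growth_rate, clearance_rate, alpha, beta,
                   dose_per_fraction, fraction_days, n_days):
    """Return total tumor size (viable + doomed) at days 0..n_days.

    Hypothetical structure: viable cells grow exponentially (rate per day),
    doomed cells are cleared exponentially (rate per day), and a fraction
    given at the start of a day kills a share 1 - SF of viable cells, with
    SF taken from the LQ model.
    """
    fraction_days = set(fraction_days)
    survival = math.exp(-(alpha * dose_per_fraction
                          + beta * dose_per_fraction ** 2))
    viable, doomed = float(v0), 0.0
    sizes = [viable + doomed]
    for day in range(n_days):
        if day in fraction_days:
            killed = viable * (1.0 - survival)
            viable -= killed
            doomed += killed
        # Exact one-day update of the two linear ODEs between fractions.
        viable *= math.exp(growth_rate)
        doomed *= math.exp(-clearance_rate)
        sizes.append(viable + doomed)
    return sizes


if __name__ == "__main__":
    # Illustrative schedule: 2 Gy on weekdays for 6 weeks, followed to day 60.
    weekdays = [d for d in range(42) if d % 7 < 5]
    sizes = simulate_tumor(v0=1.0, growth_rate=0.05, clearance_rate=0.1,
                           alpha=0.3, beta=0.03, dose_per_fraction=2.0,
                           fraction_days=weekdays, n_days=60)
    for day in (0, 1, 7, 14, 28, 42, 60):
        print(day, round(sizes[day], 4))
```

Because killed cells remain in the measured volume until they are cleared, total size falls more slowly than the viable compartment. This is one simple way to represent a lag between treatment and observed response using only size data.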
As discussed earlier, translation of nonclinical responses to clinical outcome and drug registration is highly challenging. One potential solution is to incorporate additional mechanistic descriptors, such as proliferating fraction, cell cycle time, and drug-resistant cells, into the modeling framework to describe differences between animal models and the clinic and so potentially improve translation. Most models used in drug R&D programs are not spatially resolved, despite clear evidence that the spatial structure of tumors considerably affects tumor response to treatment. 85,86 This contrasts with the academic literature on tumor modeling, where spatially resolved models have been developed. 87 Spatial effects of clear relevance to predicting patient response include potential structural and genetic heterogeneity within tumors. Considering drug concentration gradients in tumors, stromal architecture can affect drug availability by several orders of magnitude 85 and can drive drug-resistant clonal selection. 88 Modeling approaches could range from detailed tumor simulations of drug availability 89 through to classification-based approaches where architecture is used to predict responsiveness. 90 In addition, the within-patient tumor physical and mechanical environment influences PD effects, 91,92 with recent results showing that efficacy across a range of cell lines and therapeutics is affected by environmental stiffness. 93 Using modeling to encompass the range of confounding effects would involve substantial complexity but has the potential to improve translation, for example, by accounting for the different sensitivities of tumors in different anatomic locations. A recent example of mechanistic modeling applied to understand drug distribution effects is that for antibody-drug conjugates, 94 where antibody and drug PK are integrated alongside the expression and internalization of the target cell surface receptor.
An important new and developing modeling application is that of tumor-immune interactions. Few examples with direct pharmaceutical applications have been developed so far, despite this class of therapy becoming increasingly important. There are, however, new models being developed that capture tumor-immune interaction in a mouse syngeneic tumor, allowing combinations of radiation and immune checkpoint inhibitors to be modeled. 95 A second example of the application of mechanistic modeling to IO is the quantitative modeling of chimeric antigen receptor T cell therapies. 96,97 Agent-based and individual-based simulations are increasingly used for this application; these simulations are spatially resolved in their outputs, 98 which increases their computational cost. This should be an area where the community revisits existing models proposed in mathematical immunology, in the current context of increased availability of data to inform these models (e.g., ref. 99). These (mostly differential equation-based) models, despite having been proposed some time ago, can still be used to describe complex immunological mechanisms, such as polyclonal responses, in a less computationally costly framework.

DISCUSSION: OPPORTUNITIES FOR TRANSLATIONAL ONCOLOGY
There are several opportunities that should be explored to further the application of quantitative techniques in oncology, not least because oncology has the largest body of literature on mathematical modeling. 100 It is becoming clear that in vivo to clinical translation is not the only challenge: prediction from one phase of clinical trials to another is still difficult. A key issue for the field is the lack of connection between modeling efforts: preclinical to clinical, and statistical to mechanistic. 101 Below is a summary of key opportunities, presented in the approximate order in which they arise as a drug R&D program progresses.
There are significant gaps in the literature in evaluating the predictivity and reproducibility of animal cancer models. Many studies consider relatively small data sets or do not correct for factors such as the differences in free drug exposure achieved in the mouse vs. patients. This can result in animal experiments that cannot be replicated, an issue compounded by the poor reporting of basic experimental parameters, such as animal numbers and randomization. 102 This gap could be filled via a precompetitive exercise to examine the reproducibility of in vivo cancer models between laboratories: does the "same" animal model grow the same way and show the same concentration-response relationship across laboratories? If not, then translation from a handful of models in one laboratory is a forlorn hope.
Following from this comes the idea of "cohorts of biological models": using multiple animal cancer models to fully elucidate the mechanism of action of direct tumor targeting vs. immune effects and to pre-empt potential heterogeneity in clinical response. Stressing the therapeutic index hypothesis fully in a range of nonclinical models might reduce the impact of false-positive findings from nonclinical research and guide the indication selection for early clinical development. There is some evidence that generating a more heterogeneous nonclinical data set will improve the robustness and reproducibility of results. 103 How the range is chosen (cell lines or PDX models) would depend on the specific question at hand. Importantly, all the models selected should be included in an integrated quantitative modeling framework, and bias toward just those animal models that provide a positive signal should be avoided.
Although research organizations use nonclinical data and modeling to inform early clinical studies, the question of quantitative translation remains open. One clear opportunity is to translate the mechanistic learning from in vivo models to the clinic by applying mathematical models that reflect the drug MoA. There is, therefore, a need for simple mechanistic models that can be applied to both animal and clinical data to allow direct comparison. Several reports on the performance of a broad range of existing models for tumor growth trajectory 81 point to there being no clear differences in explanatory power when applied to nonclinical data. However, few of these models mechanistically describe the key physical processes of tumor growth and response to treatment, so they offer little insight into what the key differences between animal models and the human disease might be. Thus, models that incorporate biomarker modulation and its impact on tumor growth inhibition 104 would be preferable, especially because these mechanistic aspects would, in principle, translate better than simple tumor mass growth patterns. It is also clear that further work is needed to develop mathematical models that incorporate aspects of the immune system and its interaction with tumors.
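One way to picture a simple model structure that can be applied to both animal and clinical data is sketched below. It is a hypothetical example, not a model from the cited literature: drug concentration drives biomarker (target) inhibition through a shared potency parameter, inhibition reduces the net tumor growth rate, and only the system-specific parameters (PK and untreated growth rate) differ between mouse and patient. All names and numbers are assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class SystemParams:
    """System-specific parameters (hypothetical; differ by species)."""
    growth_rate: float             # untreated net growth rate, 1/day
    elimination_rate: float        # first-order drug elimination, 1/day
    volume_of_distribution: float  # L


def concentration(t, dose, interval, system):
    """Superposition of repeated bolus doses given every `interval` days."""
    n_doses = int(t // interval) + 1
    return sum(
        dose / system.volume_of_distribution
        * math.exp(-system.elimination_rate * (t - i * interval))
        for i in range(n_doses)
    )


def target_inhibition(conc, ic50):
    """Fractional biomarker inhibition (simple Emax form, Emax = 1)."""
    return conc / (ic50 + conc)


def simulate_tumor_size(system, ic50, max_kill_rate, dose, interval,
                        v0, days, dt=0.01):
    """Tumor size after `days`, with net rate = growth - kill * inhibition."""
    size = float(v0)
    for step in range(int(round(days / dt))):
        t = step * dt
        inhibition = target_inhibition(
            concentration(t, dose, interval, system), ic50)
        net_rate = system.growth_rate - max_kill_rate * inhibition
        size *= math.exp(net_rate * dt)  # rate held constant within a step
    return size


if __name__ == "__main__":
    # Drug-specific parameters are shared; system parameters are not.
    ic50, max_kill_rate = 1.0, 0.2
    mouse = SystemParams(growth_rate=0.10, elimination_rate=2.0,
                         volume_of_distribution=1.0)
    patient = SystemParams(growth_rate=0.01, elimination_rate=0.2,
                           volume_of_distribution=50.0)
    for name, system, dose in (("mouse", mouse, 10.0),
                               ("patient", patient, 500.0)):
        final = simulate_tumor_size(system, ic50, max_kill_rate, dose,
                                    interval=1.0, v0=1.0, days=28.0)
        print(name, round(final, 4))
```

In this form, the concentration-inhibition relationship can be checked directly against biomarker measurements in either setting, which is the kind of comparison a pure tumor-mass growth law does not allow.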
Current phase I study designs and data analysis approaches do not always fully characterize the frequency and severity of adverse events observed in the wider patient population, 42 mainly due to logistical challenges in the design and execution of these trials. This leads to dose interruptions, reductions, and withdrawal of patients from subsequent trials, all of which can severely confound the exposure-response analyses used in phase II to further optimize the regimen. Therefore, further research in trial design and data analysis is needed to characterize drug-dependent efficacy profiles and adverse events more adequately.
Further exploration is required of new sources of clinical and nonclinical in vivo data that could help parameterize models: greater use of computed tomography scan data, such as radiomic approaches, to inform on the physical structure of tumor lesions and development and validation of one-dimensional quantitative measures of spatial characteristics, which can be integrated in clinical modeling approaches; circulating biomarkers, such as tumor antigens (including PSA and CA125), circulating tumor cells and circulating free tumor DNA; imaging mass-spectrometry data to give spatial resolution on drug and biomarker distribution in tumor tissue. 105 Data sharing through public databases (e.g., Project Data Sphere and Vol-PACT: https://fnih.org/what-we-do/biomarkers-consortium/programs/vol-pact) and precompetitive industrial collaborations could facilitate this.
Such mechanistic models should contain some of the more spatially relevant effects discussed in preceding sections, as well as potential intra-tumor and inter-tumor heterogeneity that would be predictive of patient response rate and emergence of therapy-resistant disease. However, when considering the development of more mechanistic models, the complexity of a model needs to be tailored to the level and richness of data available from clinical studies.
Another opportunity for a more mechanistic approach to modeling clinical data would be the ability to integrate apparently disparate clinical trial data sets by reflecting differences and commonalities via the model parameterization. This can now be done, for example, in pharmacometric analyses of clinical data, where PK and efficacy information from several clinical trials is pooled in a single analysis and the underlying model is shared across all data sets. 28 If extended to other cancers or treatments, this kind of analysis could bring a more global understanding of the disease course of a cancer, the response to different therapies, and how previous therapies affect the response to subsequent therapies, including evolutionary pressures and how different therapies result in the emergence of resistant clones. Similarly, MoA-based models would enable comparison of the effect of a drug or combination of drugs across a wide range of cancers, with the assumption that the MoA is conserved but the sensitivity to treatment might vary between cancers. Such models might enable a more refined prediction for new drugs with an MoA related to that of an existing therapy. The data sources to inform such models are likely to be a combination of nonclinical and clinical information.
A final opportunity for such a consistent, MoA-driven modeling approach is the optimization of dosing schedules in combinations, including gaps and sequences. Combination treatments can be explored in some detail in a nonclinical setting, and so reapplying the model in the clinic with minimal recalibration (purely simulating existing, sparse measurements, instead of refitting the models) allows knowledge to be quantitatively transferred and the nonclinical hypothesis to be formally tested against clinical data. The failure of such a test would nevertheless be informative. With a refined understanding of the effects of existing treatments, as discussed in the previous paragraphs, combination effects might be predicted and optimized computationally by combining mechanistic models of monotherapy. 106 However, one clear barrier to the full application of modeling and simulation to combination regimen optimization is the practical difficulty of altering the regimen for standard of care or advanced stage experimental medicines.
In conclusion, mathematical modeling approaches have impacted translation in oncology drug research and development. Coupled with more sophisticated nonclinical biological models of cancer and a growing range of measurement techniques, these have enabled more informed decision making. In this report, we have surveyed the existing state of the art and the challenges of translation, and identified where there are further opportunities to make gains in more efficient drug research.