Bayesian Data Assimilation to Support Informed Decision Making in Individualized Chemotherapy
Abstract
An essential component of therapeutic drug/biomarker monitoring (TDM) is to combine patient data with prior knowledge for model-based predictions of therapy outcomes. Current Bayesian forecasting tools typically rely only on the most probable model parameters (maximum a posteriori (MAP) estimate). This MAP-based approach, however, neither necessarily predicts the most probable outcome nor quantifies the risks of treatment inefficacy or toxicity. Bayesian data assimilation (DA) methods overcome these limitations by providing a comprehensive uncertainty quantification. We compare DA methods with MAP-based approaches and show how probabilistic statements about key markers related to chemotherapy-induced neutropenia can be leveraged for more informative decision support in individualized chemotherapy. Sequential Bayesian DA proved to be most computationally efficient for handling interoccasion variability and integrating TDM data. For new digital monitoring devices enabling more frequent data collection, these features will be of critical importance to improve patient care decisions in various therapeutic areas.
Study Highlights
- WHAT IS THE CURRENT KNOWLEDGE ON THE TOPIC?
☑ Typically, maximum a posteriori (MAP) estimation is used to combine prior knowledge with patient-specific monitoring data for model-based predictions to support individualized therapy (e.g., in chemotherapy).
- WHAT QUESTION DID THIS STUDY ADDRESS?
☑ The study addresses major limitations of MAP-based prediction for efficient and reliable clinical decision support: (i) it only provides a point estimate without associated uncertainties, (ii) it does not transform into the most probable observation (e.g., nadir concentration), and (iii) it processes data in a batch.
- WHAT DOES THIS STUDY ADD TO OUR KNOWLEDGE?
☑ Bayesian data assimilation (DA) methods are compared that allow a comprehensive uncertainty quantification by approximating the full posterior distribution. We demonstrate the advantages of uncertainty quantification in therapeutic drug/biomarker monitoring for chemotherapy-induced neutropenia. Sequential DA methods were found to be particularly suitable for long-run analyses as they process data sequentially.
- HOW MIGHT THIS CHANGE DRUG DISCOVERY, DEVELOPMENT, AND/OR THERAPEUTICS?
☑ Comprehensive uncertainty quantification and recursive data processing enable reliable, efficient, and individual decision support during ongoing treatment and will become increasingly relevant with the development of novel digital monitoring devices.
In the presence of a narrow therapeutic window and large interpatient variability, therapeutic drug/biomarker monitoring (TDM) is indicated for safe and efficacious therapies. With the help of Bayesian forecasting tools, patient-specific data are combined with prior knowledge from previous clinical studies and a drug-specific model to enable model-informed precision dosing (MIPD).1 Typically, only the most probable individual parameter values (i.e., the maximum a posteriori (MAP) estimates) are used to predict the individual therapy outcome without quantifying associated uncertainties.2 Thus, relevant risks associated with the selection of a dosing regimen (e.g., treatment inefficacy or unacceptable toxicity) are not determined, which hinders well-founded therapeutic decision making.
Quantifying associated uncertainties is at the heart of making more informed decisions, not only in clinical applications. In this article, we thoroughly compare in a TDM context different Bayesian data assimilation (DA) methods that either (i) estimate the full posterior distribution (termed full Bayesian approaches) to quantify uncertainties, or (ii) estimate only its mode (MAP estimation) without uncertainty quantification. The full Bayesian approaches comprise not only methods that process patient data collected over time in a batch (i.e., all at once), like Markov Chain Monte Carlo (MCMC), sampling importance resampling (SIR), and a normal approximation (NAP) to the posterior at the MAP estimate, but also particle filters (PFs) that allow for efficient sequential data processing. PFs are well-established in areas of application in which real-time predictions based on online/monitoring data are required, as in navigation, meteorology, and tracking.3-5
In the context of chemotherapy-induced neutropenia—the most frequent dose-limiting side effect for cytotoxic anticancer drugs with substantial decrease of neutrophil granulocytes and, thus, potentially life-threatening fever and infections6—we demonstrate the clear benefits of uncertainty quantification compared with purely MAP-based predictions (e.g., ref. 7) using the gold-standard model for neutropenia.8 Further, we compare the full Bayesian approaches regarding quality of uncertainty quantification and computational runtime for multiple cycle chemotherapy.9 Although MCMC, SIR, and PF all provide a reliable uncertainty quantification, the efficient data processing of the sequential approach will be clearly beneficial in a continuous monitoring context, where digital healthcare devices (e.g., wearables) allow patients to measure and report individual marker concentrations online.
Methods
First, the statistical framework of MIPD in TDM is introduced, which is used throughout the different methods described below. Then, the considered clinical application scenarios are described along with the prior knowledge from literature.
Statistical framework
Because direct sampling from the posterior is, in general, not possible, alternative approaches (described below) need to be used to generate a sample of the posterior. For a detailed description, see Section S3–Section S7.
MAP estimation
The MAP estimate is a one-point summary of the posterior distribution, without quantification of the associated uncertainty.
NAP
SIR
The SIR algorithm is a full Bayesian approach. It generates an unweighted sample from the posterior based on a sample from a so-called importance distribution G, from which samples can easily be generated (Section 10.4 in ref. 11). SIR proceeds in three steps. S-Step: A sample is drawn from the importance distribution G. We considered the prior as importance distribution, assuming that the patient under consideration is sufficiently well-represented by the clinical patient populations given by the prior. I-Step: Each sample point is assigned an importance weight given by the likelihood (in case G is the prior). R-Step: After normalization of the weights, a resampling is performed: S unweighted samples are drawn from the weighted sample according to the normalized weights.
Note that once a new data point becomes available, the SIR algorithm does not simply update the present sample points, but re-performs all three steps based on the updated posterior to determine a new sample.
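The three SIR steps can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the one-parameter model, the log-normal prior, the data values, and all function names (`sir`, `toy_prior`, `make_log_likelihood`) are hypothetical and chosen only so that the result can be checked against a closed-form posterior.

```python
import math
import random

def sir(prior_sampler, log_likelihood, n_samples, rng):
    """Sampling importance resampling with the prior as importance distribution."""
    # S-step: draw from the importance distribution (here: the prior).
    draws = [prior_sampler(rng) for _ in range(n_samples)]
    # I-step: the importance weight equals the likelihood because G is the prior.
    log_w = [log_likelihood(theta) for theta in draws]
    max_log_w = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]
    # R-step: resample with replacement according to the normalized weights.
    return rng.choices(draws, weights=weights, k=n_samples)

# Hypothetical toy problem: log-normal prior on a single positive parameter,
# observations of log(theta) with additive normal error of known size.
def toy_prior(rng):
    return math.exp(rng.gauss(0.0, 0.5))

def make_log_likelihood(data, sigma):
    def log_likelihood(theta):
        return sum(-0.5 * ((y - math.log(theta)) / sigma) ** 2 for y in data)
    return log_likelihood

rng = random.Random(1)
posterior = sir(toy_prior, make_log_likelihood([0.4, 0.6], 0.3), 5000, rng)
posterior_mean_log = sum(math.log(t) for t in posterior) / len(posterior)
```

For this conjugate toy problem the posterior mean of log(theta) is about 0.424 in closed form, which the resampled values should reproduce up to Monte Carlo error. Assimilating a further data point would mean calling `sir` again with the extended data set, which is the batch behavior described above.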
MCMC
MCMC methods, which comprise a wide range of different algorithms, are a popular alternative to SIR in Bayesian inference.15 MCMC generates an unweighted sample from the posterior by means of a Markov chain (Section 11 in ref. 11). Each new sample point is generated in two steps: a proposal step (generating a potential new sample point) and an acceptance step (accepting or rejecting the proposal as a new sample point). The challenge in MCMC is to design application-specific proposal distributions.
In TDM, MCMC was previously considered with the prior as fixed proposal distribution (independence sampler) for sparse patient monitoring data.16 We observed, however, large rejection rates with increasing number n of data points, because the posterior becomes narrower, see Section S6. To counteract large rejection rates, we used an adaptive Metropolis-Hastings sampler with a log-normal proposal distribution (see Section S6 and Figure S4 for details).
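The proposal and acceptance steps can be illustrated with a plain (non-adaptive) Metropolis-Hastings sampler that uses a log-normal random-walk proposal. This is a simplified sketch: the adaptive scheme of the study is described in Section S6 and is not reproduced here, and the toy target, the step size, and the function names are assumptions made for illustration.

```python
import math
import random

def metropolis_hastings(log_post, theta0, step, n_iter, rng):
    """Random-walk Metropolis-Hastings with a log-normal proposal (positive parameter)."""
    theta, lp = theta0, log_post(theta0)
    chain, accepted = [theta0], 0
    for _ in range(n_iter):
        # Proposal step: multiplicative log-normal perturbation keeps theta > 0.
        proposal = theta * math.exp(step * rng.gauss(0.0, 1.0))
        lp_prop = log_post(proposal)
        # Acceptance step: the log-normal proposal is asymmetric, so the
        # Hastings correction q(theta | proposal) / q(proposal | theta) = proposal / theta.
        log_ratio = lp_prop - lp + math.log(proposal) - math.log(theta)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            theta, lp = proposal, lp_prop
            accepted += 1
        chain.append(theta)
    return chain, accepted / n_iter

# Hypothetical toy target: log-normal prior (log-sd 0.5) times a normal
# likelihood for two observations of log(theta) with error sd 0.3.
def toy_log_post(theta, data=(0.4, 0.6), sigma=0.3, prior_sd=0.5):
    log_theta = math.log(theta)
    log_prior = -log_theta - 0.5 * (log_theta / prior_sd) ** 2
    log_lik = sum(-0.5 * ((y - log_theta) / sigma) ** 2 for y in data)
    return log_prior + log_lik

rng = random.Random(2)
chain, acceptance_rate = metropolis_hastings(toy_log_post, 1.0, 0.4, 20000, rng)
kept = chain[2000:]  # discard burn-in
mcmc_mean_log = sum(math.log(t) for t in kept) / len(kept)
```

A fixed step size is the main simplification. As the posterior narrows with more data, a fixed proposal width leads to the growing rejection rates mentioned above, which is what an adaptive sampler is meant to avoid.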
PF
Before any data are assimilated (n = 0), the distribution is identical to the prior (Section 3 in ref. 10). Note that, due to the independence assumption in Eq. 3, the likelihood of a new data point given the parameters does not depend on the previously assimilated data points. Analogously to the I-step in SIR, the weights are updated proportionally to the (local) likelihood, which involves, however, only the new data point. As in SIR and MCMC, evaluation of the likelihood involves solving the structural model (Eq. 1). Because the structural model is deterministic, one may either solve Eq. 1 from the initial condition over the total time span up to the new data point, or from the state at the previous observation time over the incremental time span between the two observations. The latter approach requires storing, for each sample point, the corresponding state at the previous observation time, because typically the structural model cannot be solved analytically. The incremental approach makes use of the Markov property that the future state is independent of the past when the present state is known.
The resulting triple of parameter values, state, and weight is called a particle. In our setting, the ensemble of particles can be interpreted as the state of a population of virtual individuals at the current time, whose “diversity” represents the uncertainty about the state/parameters of the patient at that time, given the individual measurements collected so far. The posterior obtained by n sequential update steps in Eq. 9 is mathematically identical to the posterior obtained in Eq. 5 by assimilating all data in a batch (Section 3.3.3 in ref. 19). However, the sequential update is much more efficient as it involves a reduced integration time span.
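The equivalence of sequential and batch weighting can be checked numerically on a toy problem. The sketch below assumes a hypothetical one-compartment decay model with a closed-form solution (unlike the neutropenia model, which needs a numerical ODE solver), log-scale observations with normal error, and no resampling or rejuvenation step, so that the identity holds exactly. All names and numbers are illustrative.

```python
import math
import random

def propagate(state, k, dt):
    """Toy structural model dx/dt = -k * x, advanced over the time increment dt."""
    return state * math.exp(-k * dt)

def pf_update(particles, t_prev, t_new, y_new, sigma):
    """Assimilate one data point; particles are (parameter, state, weight) triples."""
    updated = []
    for k, x, w in particles:
        x_new = propagate(x, k, t_new - t_prev)    # incremental time span only
        resid = (y_new - math.log(x_new)) / sigma  # observation on the log scale
        updated.append((k, x_new, w * math.exp(-0.5 * resid ** 2)))
    total = sum(w for _, _, w in updated)
    return [(k, x, w / total) for k, x, w in updated]

def batch_weights(params, x0, times, data, sigma):
    """Reference: weights from the full likelihood, always integrating from time 0."""
    raw = []
    for k in params:
        log_lik = sum(-0.5 * ((y - math.log(propagate(x0, k, t))) / sigma) ** 2
                      for t, y in zip(times, data))
        raw.append(math.exp(log_lik))
    total = sum(raw)
    return [w / total for w in raw]

rng = random.Random(0)
x0, sigma = 5.0, 0.2
times, data = [1.0, 2.0, 3.0], [1.3, 1.0, 0.7]                 # hypothetical data
params = [math.exp(rng.gauss(-1.2, 0.4)) for _ in range(200)]  # sample from a prior
particles = [(k, x0, 1.0 / len(params)) for k in params]

t_prev = 0.0
for t, y in zip(times, data):  # data arrive one by one
    particles = pf_update(particles, t_prev, t, y, sigma)
    t_prev = t
```

After the loop, the particle weights agree with `batch_weights(params, x0, times, data, sigma)` up to rounding, although each update only integrated the model over one observation interval and never revisited earlier data.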
Biomarker data during chemotherapy for single/multiple cycle simulation studies
Two simulation studies (see below) were performed in MATLAB R2017b/2018b, see supplementary information files SCode, to analyze the approaches regarding their suitability to support MIPD.
For the single cycle study with docetaxel (1-hour infusion), we used the NLME model in ref. 20. It is based on the well-known pharmacodynamic model in ref. 8 (Figure S6) and describes the effect of a single dose of the anticancer drug docetaxel based on monitoring neutrophil counts. Important model parameters are the drug effect parameter “Slope” and the pretreatment baseline neutrophil concentration “Circ0.” For inference, neutrophil concentrations were considered on a log scale at discrete time points postdose, see Section S8 for full details. This simulation study aims to demonstrate the limitations of MAP estimation for a model frequently used in MIPD for TDM.7, 21
Because recursive data processing and decision support gain in relevance for long-term monitoring, we performed a simulation study for multiple cycle therapy with paclitaxel using the NLME model in ref. 9. It describes the effect of the anticancer drug paclitaxel (200 mg/m², 3-hour infusion) over six cycles of 3 weeks each, corresponding to treatment arm A of the CEPAC/TDM study.22 In ref. 9, an aggravation of neutropenia over subsequent treatment cycles is accounted for by bone marrow exhaustion9 (Figure S6). The model includes interoccasion variability (IOV) on pharmacokinetic parameters, describing the variability between cycles within one patient. Therefore, the parameter vector comprises the interindividual parameters and one additional parameter for each occasion. As a consequence, its size increases with every occasion/cycle, see Section S9 for details. Neutrophil counts were assumed to be monitored every third day.
We were interested in a setting where data become available sequentially (one by one). To this end, neutrophil count data were simulated for a virtual patient using Eqs. 3 and 4 and the corresponding model. Then the individual parameter values were inferred based on the simulated neutrophil count data available up to a certain time point, using the same model. For the statistical analysis, this procedure was repeated for a population of virtual patients (with covariate characteristics mirroring the real study population underlying the NLME model).
Key characteristics for decision support in cytotoxic chemotherapy
We investigated different characteristics of the neutropenia time-course related to risk and recovery.7 Depending on the nadir (i.e., minimal neutrophil concentration), different grades of neutropenia are distinguished, see Figure 1. Neutropenia grade 4 is dose-limiting as this severe reduction in neutrophils exposes patients to life-threatening infections. Conversely, neutropenia grade 0 is also undesired as it is associated with a worse overall treatment outcome.23 The time at which the nadir is reached is important for the timing of interventions. We considered the patient out of risk at the time when neutropenia grade 2 is reached post nadir. For the initiation of the next treatment cycle, the recovery time to grade 0 is important. In addition, risk is also related to the durations an individual spends in grade 3 and grade 4 neutropenia, respectively.
Workflow in Bayesian forecasting
In full Bayesian forecasting, uncertainty is quantified on the parameter level and subsequently propagated to the observable level, possibly summarized for some key quantities of interest, see Figure 2. Prior to observing patient-specific data, the parameter uncertainty is characterized by the prior (cf. Eq. 4). It allows a priori predictions of the neutropenia time course and its uncertainty in the form of a confidence interval. In addition, a priori predictions for quantities of interest can be derived (e.g., the neutropenia grade (Figure 2, left column)). Once patient-specific data are assimilated into the Bayesian model, the remaining uncertainty on the parameter values is characterized by the posterior, which also allows updating the uncertainty in the observable space (credible intervals (CrIs)) and in the quantities of interest (Figure 2, middle column).
For the transformed MAP estimate, the first term on the right-hand side of Eq. 12 is zero, because its first factor vanishes by definition of the MAP estimate. The second term, however, is non-zero, because both its factors are non-zero for nonlinear T. Therefore, the transformed MAP estimate does not satisfy the condition for the mode of the transformed posterior and, hence, is in general not the most probable value of the transformed quantity.
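The argument can be made concrete in one dimension. The following is a sketch under simplifying assumptions (a scalar parameter and a smooth, strictly increasing transformation T with inverse g), not a reproduction of Eq. 12. The symbols g, p_θ, p_T, μ, and σ are introduced here for illustration only.

```latex
% Change of variables for t = T(\theta), with g = T^{-1} (scalar, strictly increasing):
\begin{align*}
p_T(t) &= p_\theta\bigl(g(t)\bigr)\, g'(t), \\
\frac{\mathrm{d}}{\mathrm{d}t}\, p_T(t)
  &= p_\theta'\bigl(g(t)\bigr)\, g'(t)^2 \;+\; p_\theta\bigl(g(t)\bigr)\, g''(t).
\end{align*}
% At t = T(\theta_{\mathrm{MAP}}) the first term vanishes because
% p_\theta'(\theta_{\mathrm{MAP}}) = 0, while the second term is non-zero
% whenever g'' \neq 0, i.e., whenever T is nonlinear.
%
% Worked example: \theta \sim \mathcal{N}(\mu, \sigma^2) and T(\theta) = e^{\theta},
% so that T(\theta) is log-normally distributed:
\begin{align*}
T(\theta_{\mathrm{MAP}}) = e^{\mu},
\qquad
\operatorname*{arg\,max}_{t}\; p_T(t) = e^{\mu - \sigma^2} .
\end{align*}
```

In this example the two values coincide only in the limit σ → 0, that is, when there is no posterior uncertainty left. The larger the uncertainty, the further the transformed MAP estimate lies from the most probable value of the transformed quantity.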
Method comparison
Results
First, we show the limitations of MAP estimation for MIPD and how full Bayesian approaches can overcome these limitations (using SIR with a large sample size as reference). Next, we compare different full Bayesian approaches with reduced sample sizes regarding accuracy and computational efficiency.
Unfavorable properties of MAP-based predictions
The first example of decision support in individualized chemotherapy uses the most frequently used model of neutropenia.8 The MAP estimate is derived from the parameter posterior given the experimental data, see Eq. 7. In the context of TDM, it is used to predict the future time course of the patient and the observables derived from it. In mathematical terms, the MAP estimate is mapped to some quantity of interest (e.g., the nadir concentration). As pharmacometric models are generally nonlinear, this does not, however, result in the most probable outcome (see also the paragraph preceding Eq. 12 in the Methods). The reason is that first determining the MAP estimate and then applying a nonlinear mapping is, in general, different from first applying the mapping to the full parameter posterior and then determining the mode of the result; see Figure 3 for an illustration and Figure S1 for more details.
Thus, MAP-based estimation lacks both a measure of uncertainty and the ability to predict the most probable observation/quantity of interest. In addition, relevant outcomes, such as the risk of grade 4 neutropenia, cannot be evaluated from the point estimate alone. MAP-based estimation, therefore, provides a biased basis for clinical decision making. In contrast, full Bayesian inference provides access to the full posterior distribution of the parameters and correctly propagates uncertainties forward to the observables and quantities of interest, which allows any desired summary statistic and relevant risk to be computed (Section 5.2 in ref. 19).
Uncertainty quantifications for more comprehensive, differentiated understanding, and better informed decision making
The first scenario served to demonstrate the limitations of MAP-based estimations for the gold-standard model.8 However, that model does not account for the observed cumulative neutropenia over multiple cycles. Therefore, we considered for dose adaptations a model accounting for bone marrow exhaustion over multiple cycles,9 see the paragraph about the multiple cycle study with paclitaxel. As an example, we considered the dose selection for the third treatment cycle based on prior information and patient-specific measurements during the first two cycles. The patient-specific data together with the full Bayes (reference) model fit and prediction are shown in Figure 4a, see also Figures S8 and S9. The CrIs (dashed) and prediction intervals (dotted) show the uncertainty about the “state of the patient,” without and with measurement errors, respectively.
For optimizing the dose of the third cycle, different dosing scenarios were considered: the standard dose and a −15%, −30%, and +10% adapted dose. Figure 4b shows the probability of the predicted grades of the third cycle for each dose. To find an effective and safe dose, the risk of being ineffective (neutropenia grade 0) should be minimized jointly with the risk of being unsafe (neutropenia grade 4). For illustration in Figure 4b, the dashed horizontal lines indicate a 10% and 5% level of being ineffective and unsafe, respectively. The standard dose and the increased dose have a risk of toxicity larger than 5% (lower horizontal line). A decrease in dose, in turn, leads to an increased risk of an ineffective dose (upper horizontal line). The 15% reduced dose is with 96% probability safe and efficacious (grades 1–3), with 3% probability ineffective (grade 0), and with 1% probability unacceptably toxic (grade 4). If grade 3 is also to be avoided, the 30% reduction would be preferable, as it is with 74% probability safe and efficacious (grades 1–2), with 15% probability ineffective (grade 0), and with 11% probability toxic (grades 3–4). Thus, the choice of an optimal dose might depend on the priority given to the risk of inefficacy relative to the risk of toxicity. As both risks are described by the tails of the posterior distribution, a point estimate is not able to adequately capture them. The MAP-based predicted grades were grade 2 (standard dose and +10% dose), grade 1 (−15% dose), and grade 0 (−30% dose), which not only makes it difficult to distinguish between some doses, but also does not reflect the true most probable grades.
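Turning posterior samples of the nadir into grade probabilities and checking them against risk thresholds is a short computation. The sketch below is illustrative only. The grade boundaries are an assumption (CTCAE-style limits in 10^9 cells/L with a lower limit of normal of 2.0; the article defines its grades in Figure 1), the nadir values are made up, and the function names are hypothetical. The 10% and 5% thresholds are the illustrative levels from Figure 4b.

```python
from collections import Counter

# Assumed lower bounds of the nadir (10^9 cells/L) for grades 0 to 3;
# anything below the last bound counts as grade 4.
GRADE_LOWER_BOUNDS = [(2.0, 0), (1.5, 1), (1.0, 2), (0.5, 3)]

def neutropenia_grade(nadir):
    """Map a nadir concentration to a neutropenia grade (0 to 4)."""
    for lower, grade in GRADE_LOWER_BOUNDS:
        if nadir >= lower:
            return grade
    return 4

def grade_probabilities(nadir_samples):
    """Posterior probability of each grade, estimated as a sample fraction."""
    counts = Counter(neutropenia_grade(n) for n in nadir_samples)
    return [counts.get(g, 0) / len(nadir_samples) for g in range(5)]

def is_acceptable(probs, max_ineffective=0.10, max_unsafe=0.05):
    """Check the risk of inefficacy (grade 0) and of toxicity (grade 4)."""
    return probs[0] <= max_ineffective and probs[4] <= max_unsafe

# Made-up predicted nadir values for one candidate dose (not study results).
example_nadirs = [2.5, 1.7, 1.2, 1.2, 0.8, 0.8, 0.8, 0.8, 0.6, 0.3]
example_probs = grade_probabilities(example_nadirs)
example_ok = is_acceptable(example_probs)
```

In practice the nadir samples would come from simulating the model for each posterior sample point under a candidate dose, and the check would be repeated per dose. A MAP-based prediction corresponds to calling `neutropenia_grade` on a single value, which cannot express either tail risk.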
Posterior-based predictions of important statistics related to the neutropenia time course can help to answer questions like “How probable is it that the patient will suffer from grade 4 neutropenia?” or “How probable is it that the patient will recover in time for the next scheduled dose so that the therapy can be continued as planned?” To answer such questions, Figure 4c shows important predicted quantities of interest, illustrated for the standard dose in cycle 3. We inferred that the risk of grade 4 neutropenia is 8%, and if the patient were to reach grade 4, it would be most probable (68%) on day 12. The probability that the patient’s duration in grade 4 is a day or longer is very small (< 7%). As the probability of not having recovered by day 21 is negligible, the administration can remain scheduled on day 21 for cycle 4. Uncertainty quantification thus improves the decision-making process by quantifying the a posteriori probabilities of relevant risks and quantities of interest. Repeating the above analysis for different doses allows for an improved distinction between dose adjustments.
Approximation accuracies comparable across different full Bayesian approaches
We next compared different established methods for uncertainty quantification with regard to their approximation accuracy. To this end, posterior inference was investigated for a patient at day 5 of the first cycle (Figure 5). Whereas the marginal posterior distribution for the parameter “Circ0” (pretreatment neutrophil concentration) is close to a normal distribution, the marginal posterior for the drug effect parameter (“Slope”) is closer to a log-normal distribution. Accordingly, the NAP is reasonable for “Circ0,” but questionable for the “Slope” parameter. In addition, sampling from the normal distribution can lead to unrealistic (negative) parameter values (Figure 5a). Moreover, NAP very clearly underestimated the patient’s risk of reaching grade 4 neutropenia (Figure 5b,c), which could possibly lead to a fatal dose selection. Considering a Student’s t distribution instead of the normal approximation, as in ref. 13, did not lead to an adequate improvement (Figure S3). Consequently, the NAP approach can result in overoptimistic, overpessimistic, and unrealistic predictions. In contrast, the full Bayesian methods (SIR, MCMC, and PF) adequately represent the tails and respect the positivity constraint of parameter values. The resulting CrIs are comparable to the reference CrIs. For illustration, Figure 6a shows the approximation error for the predicted probability of neutropenia grades, measured in the Hellinger distance (see Eq. 13). Overall, SIR and PF showed the best approximation, whereas NAP resulted in the largest errors.
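For two discrete distributions over the five neutropenia grades, the Hellinger distance reduces to a short formula. The sketch below uses the common normalization with a factor 1/2, so that the distance lies between 0 and 1. Whether Eq. 13 uses exactly this convention is an assumption, and the function name is illustrative.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete probability vectors of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                               for a, b in zip(p, q)))
```

The distance is 0 for identical grade probabilities and 1 for distributions that put all their mass on different grades, which makes approximation errors comparable across methods and time points.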
Sequential DA processes patient data most efficiently
The need for real-time inference algorithms is increasing with the possibilities to more frequently collect patient-specific data (online collection) during treatment. Sequential DA methods (e.g., PF) provide an efficient framework for real-time data processing. At any time, all information (including associated uncertainty) is present in a collection of particles that can be interpreted as representing the current state and associated uncertainty of a patient via a virtual population. With a new datum, this information is updated. Approaches that rely on batch data analysis (i.e., MAP, SIR, and MCMC) need to redo the inference from scratch. This affects the computational effort as the number of data points increases. Figure 6b shows a comparison of the computational cost to assimilate an additional data point. All approaches show an increase in effort every 21 days, which is due to the IOV on some parameters. Clearly, PF shows the lowest and almost constant costs, whereas for batch mode approaches computational costs increase over time due to an increasing number of parameters (one additional parameter for every cycle due to the IOV, see the paragraph about the multiple cycle study with paclitaxel) and an increasing integration time span to determine the likelihood. This could become computationally expensive in view of long-term treatments and the higher time resolution of data points provided by new digital healthcare devices. Figure 6c summarizes the features of the different inference approaches. Note that all sampling-based approaches can be accelerated by parallel computing. In summary, PF processed patient monitoring data most efficiently and facilitated the handling of IOV because only the IOV parameter of the current occasion needs to be considered.
Discussion
In the context of chemotherapy-induced neutropenia, we illustrated the severe drawbacks of MAP-based approaches for forecasting and for the decision making based on it. A prediction based on the MAP estimate neither corresponds to the most probable outcome nor allows relevant risks to be quantified, as the uncertainties are not quantified. Both are highly undesirable characteristics and make MAP-based inference difficult to interpret in a TDM setting. A normal approximation of the posterior at the MAP estimate is no alternative, as it retains the same point estimate and proved to be unsuitable in case of skewed parameter distributions. We demonstrated that full Bayesian approaches, like SIR, MCMC, or PF, provide accurate approximations to the posterior distribution, enabling comprehensive uncertainty quantification of the quantities of interest (e.g., nadir concentration). Among the three considered approaches, PF is a sequential approach that is beneficial in a more continuous monitoring context.
Uncertainty quantification in TDM is scarce. In ref. 27, the SIR algorithm was used in the TDM setting to construct CrIs using a Student’s t distribution located at the MAP estimate as importance function. A sequential approach in the context of MAP estimation is discussed in ref. 28 with a moving estimation horizon (window of data points that are considered). A sequential DA approach has been investigated previously for glucose forecasting,29, 30 yet not in combination with an NLME modeling framework and without decision-support statistics. A systematic comparison of approaches, as presented herein, has been lacking.
In this study, particle filtering is applied in TDM within an NLME modeling framework to represent the current patient status via an uncertainty ensemble. A challenge in the application of PF is the potential for weight degeneracy (i.e., a gradual separation into a few large and many very small weights). A rejuvenation approach (as applied in this study) resolves this problem, but requires specification of an additional parameter (magnitude of the rejuvenation). Too large a value might result in an artificially increased uncertainty, whereas too small a value might hinder exploration of the parameter space. In the present application context, however, IOV counteracts weight degeneracy in addition to the rejuvenation step.
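Weight degeneracy is commonly monitored via the effective sample size, and a resampling step followed by a small random perturbation is one simple way to rejuvenate the particles. The sketch below is an assumption-laden illustration: the effective sample size criterion and the multiplicative log-normal jitter are generic choices and not necessarily the scheme used in this study, and `magnitude` stands in for the rejuvenation parameter mentioned above.

```python
import math
import random

def effective_sample_size(weights):
    """Effective sample size for normalized weights (between 1 and len(weights))."""
    return 1.0 / sum(w * w for w in weights)

def resample_and_rejuvenate(params, weights, magnitude, rng):
    """Resample by weight, then jitter each positive parameter multiplicatively."""
    chosen = rng.choices(params, weights=weights, k=len(params))
    # Log-normal jitter preserves positivity; magnitude = 0 means no rejuvenation.
    return [p * math.exp(magnitude * rng.gauss(0.0, 1.0)) for p in chosen]

rng = random.Random(3)
degenerate_weights = [0.97, 0.01, 0.01, 0.01]  # one particle dominates
ess = effective_sample_size(degenerate_weights)
rejuvenated = resample_and_rejuvenate([0.2, 0.3, 0.4, 0.5],
                                      degenerate_weights, 0.05, rng)
```

After such a step all particles carry equal weight again. The trade-off described above shows up directly in `magnitude`: a large value spreads the particles beyond what the data support, whereas a value near zero leaves many identical copies of the dominant particle.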
Sequential data processing is not only computationally efficient and convenient for IOV handling, but has the additional advantage that already assimilated experimental data need not be stored to assimilate future data points. Sampling approaches allow a simple extension for hierarchical models to include the uncertainties in the population parameters for even more holistic uncertainty quantification. This would enable a continuous learning process between clinical trials from drug development (e.g., phase III) and continue during the acquisition of real-world data after market authorization in quantifying the diverse population of patients who have taken a given drug. For a future patient, this “historic” diversity would transform into well-quantified uncertainty in a TDM setting. The absence of need to store “historic” experimental data can also be helpful for the exchange of information among clinics, health insurances, and pharmaceutical companies. The current knowledge, present in form of a sample of particles, can easily be exchanged without the need to exchange the experimental data. The “historic” data are implicitly present in the particles.
In view of new treatments and new mobile healthcare devices (e.g., wearables) gathering data from various sources, clinicians have to deal with new challenges and an increasing complexity of treatment decision making, which demands comprehensive approaches that integrate data efficiently and provide informative and reliable decision support. We illustrated that comprehensive uncertainty quantification can result in a more informative, reliable, and differentiated decision support, which is not limited to individualized chemotherapy but has the potential to improve patient care in various therapeutic areas in which TDM is indicated, such as oncology, infectious diseases, inflammatory diseases, psychiatry, and transplantation.
Acknowledgments
Fruitful discussions with Andrea Henrich (Idorsia Pharmaceuticals Ltd, Allschwil), Sven Mensing (AbbVie, Ludwigshafen), and Sebastian Reich (University of Potsdam, University of Reading) are kindly acknowledged.
Funding
Graduate Research Training Program PharMetrX: Pharmacometrics & Computational Disease Modelling, Berlin/Potsdam, Germany. Deutsche Forschungsgemeinschaft (DFG) through grant CRC 1294 Data Assimilation (associated project). Deutsche Forschungsgemeinschaft and Open Access Publishing Fund of University of Potsdam.
Conflict of Interest
C.K. and W.H. report research grants from an industry consortium (AbbVie Deutschland GmbH & Co. K.G., Boehringer Ingelheim Pharma GmbH & Co. K.G., Grünenthal GmbH, F. Hoffmann-La Roche Ltd., Merck KGaA, and Sanofi) for the PharMetrX program. In addition C.K. reports research grants from the Innovative Medicines Initiative-Joint Undertaking (“DDMoRe”) and Diurnal Ltd. All other authors declared no competing interests for this work.
Author Contributions
C.M., N.H., C.K., and W.H. wrote the manuscript. C.M., N.H., J.dW., C.K., and W.H. designed the research. C.M. performed the research. C.M., N.H., C.K., and W.H. analyzed the data.