Integration of SAS and NONMEM for Automation of Population Pharmacokinetic/Pharmacodynamic Modeling on UNIX Systems

Alan J Xiao, Cognigen Corporation, Buffalo NY
Jill B Fiedler-Kelly, Cognigen Corporation, Buffalo NY

ABSTRACT

In drug development and other pharmaceutical applications, SAS is a powerful tool for statistical analysis, graph creation, text file processing, and system command submission, while NONMEM is a powerful software tool for population pharmacokinetic/pharmacodynamic (PK/PD) modeling. The integration of SAS and NONMEM is therefore a natural solution for automation or semi-automation of population PK/PD modeling. Population PK/PD modeling, especially the process of covariate evaluation (including forward selection of the most significant covariate and backward elimination of the most non-significant covariate), can be automated using SAS programs as a recycling process of running NONMEM programs, comparing results, updating programs, and rerunning. For example, the results comparison step can be implemented via SAS macros that read NONMEM output files, list and sort the minimum objective function values, and aid in the identification of the next base model. The flowchart of population PK/PD modeling, the algorithm of the automation, and SAS and NONMEM code segments are demonstrated in this paper. The automation is especially efficient and time-saving for population PK/PD analyses with many covariates to be evaluated. An additional benefit of this automated solution is the potential reduction in opportunities for error and in the need for extensive quality assurance.

INTRODUCTION

Population pharmacokinetic/pharmacodynamic (PK/PD) modeling is becoming increasingly important in the healthcare/pharmaceutical industry because the information needed for decision-making, drug development optimization, and clinical dosage individualization can be obtained effectively through population PK/PD analysis.
Different methods for population PK/PD modeling have been proposed [1-4]. Whatever method is used, the population PK/PD modeling process follows a flowchart similar to the one illustrated in Figure 1. As the flowchart shows, one of the important processes in population PK/PD model development is covariate evaluation: identifying the relationships between the model parameters and covariates such as demographic factors, clinical laboratory biochemical measurements, organ functionality, genotype/phenotype, and drug-drug interactions. This process is time-consuming and could account for up to 90% of the workload and total time for the whole population PK/PD analysis, depending on the complexity of the population PK/PD model structure, the number of significant covariates, and the methods used for covariate evaluation. The complexity of the model structure and the number of covariates are generally determined by the complexity of the drug's metabolism, the population PK/PD modeling strategy (such as physiologically-based or mechanism-based population PK/PD modeling), and the availability of clinical trial data. For a given drug, the time required for covariate evaluation is mainly determined by the method used for this evaluation, which is commonly implemented with NONMEM programs [5]. However, some methods might be very difficult to implement in practice under certain circumstances. For example, the Likelihood Ratio Test (LRT) method with Wald's approximation proposed by Kowalski and Hutmacher [1] starts from a full model in which all relevant covariates are initially included, followed by backward elimination of the most non-significant covariates from the model. Population PK/PD modeling with this method may be unable to proceed if too many covariates are included in the full model or if the initial guesses for the model parameters are very different from the true values, since the runs might never converge.
Therefore, the selection of a methodology for covariate evaluation is critical in determining the computation workload and time. To avoid the problems described above, this paper discusses a commonly used method for covariate evaluation: stepwise forward selection followed by stepwise backward elimination.

    Data characterization
        SAS/NONMEM dataset creation
        Covariate distribution/statistics
        Drug concentration-time profiles
    Model development
        Structural model
        Covariate evaluation
            Forward selection
            Backward elimination
        Random effect evaluation
        Model refinement
    Model validation/test/simulation

Figure 1. Flowchart of population PK/PD modeling
The total workload and time for population PK/PD modeling can be split into two parts: machine (computation) workload/time and scientist workload/time. The description in the previous paragraph refers to machine workload and time. This paper focuses only on how to reduce the scientist's workload and time for population PK/PD modeling through automation using SAS programs. Specifically, automation of the process of covariate evaluation is illustrated.

FEASIBILITY OF AUTOMATING COVARIATE EVALUATION

Reducing a scientist's workload and time becomes important when the number of potentially significant covariates is large and/or the number of model parameters on which covariate effects will be evaluated is large. A typical process of covariate evaluation via forward selection and backward elimination can be expressed as a recycling process, as illustrated in Figure 2.

    Base model (base NONMEM program)
    -> Derived models (derived NONMEM programs)
    -> Running derived NONMEM programs
    -> Comparing running results of the derived programs
    -> More covariates?
         Yes: Update base model (updated base NONMEM program) and repeat
         No:  Covariate evaluation done

Figure 2. Flowchart of forward selection or backward elimination

Each derived/updated NONMEM program differs from the previous base program by only one covariate, which is either included or removed. Therefore, the whole process of covariate evaluation can be easily automated by SAS macros. The structure of a NONMEM program, which is composed of different functional blocks, also makes automation feasible.

Forward Selection

During forward selection, each derived NONMEM program differs from the base model by only one covariate term, which is added to a model parameter, and by one more line in the $THETA block giving the initial value of that covariate's coefficient, as illustrated below.
Segment of the base NONMEM program:

    $PK
      TVCL = THETA(1)+THETA(4)*(SGOT/20-1)
      CL   = TVCL*EXP(ETA(1))
      TV21 = THETA(2)+THETA(5)*(SEX-0.53)+THETA(6)*(AGE/52-1)
      TVV  = TV21+THETA(7)*(RAC1-0.69)
      V    = TVV*EXP(ETA(2))
      TVKA = THETA(3)+THETA(8)*(SGOT/20-1)
      KA   = TVKA*EXP(ETA(3))
    $THETA
      (0, 30)   ; THETA(1) CLEARANCE (L/H)
      (0, 150)  ; THETA(2) CENTRAL VOLUME (L)
      (0, 0.7)  ; THETA(3) KA (PER H)
      (-20)
      (20)
      (-0.02)

One derived NONMEM program:

    $PK
      TVCL = THETA(1)+THETA(4)*(SGOT/20-1)+THETA(9)*(RAC2-0.43)
      CL   = TVCL*EXP(ETA(1))
      TV21 = THETA(2)+THETA(5)*(SEX-0.53)+THETA(6)*(AGE/52-1)
      TVV  = TV21+THETA(7)*(RAC1-0.69)
      V    = TVV*EXP(ETA(2))
      TVKA = THETA(3)+THETA(8)*(SGOT/20-1)
      KA   = TVKA*EXP(ETA(3))
    $THETA
      (0, 30)   ; THETA(1) CLEARANCE (L/H)
      (0, 150)  ; THETA(2) CENTRAL VOLUME (L)
      (0, 0.7)  ; THETA(3) KA (PER H)
      (-20)
      (20)
      (-0.02)
      (2)
Compared to the base program, one term for the effect of a covariate has been added to a selected PK parameter, and the initial guess for the coefficient in the added covariate term has been appended at the end of the $THETA block. (The number of initial guesses appended to the $THETA block should equal the number of coefficients contained in the covariate term.) These updates can be implemented with a SAS macro such as:

    %trnsctlm(path, baseprogramfilename, newprogramfilename,
              covariatetobeevaluated, covariatetermtobeadded,
              ParameterOnWhichTheCovariateTermAdded,
              initialguessforthecovariatecoefficient);

The macro %trnsctlm performs the following tasks:

1. Read the variable Xline ($80.) from the data file &path&baseprogramfilename, line by line.
2. If index(Xline, "&ParameterOnWhichTheCovariateTermAdded") = 1 and length("&covariatetermtobeadded") < 79 - length(trim(Xline)), that is, the line defines the parameter and the extended line still fits within 80 columns, then append the term to the line:
       Xline = trim(Xline) || "&covariatetermtobeadded";
   If the line defines the parameter but the extended line would not fit, rename the left-hand side of the existing line to an intermediate variable:
       Xline = "newaccumulatetermforparameter" || substr(Xline, length("&ParameterOnWhichTheCovariateTermAdded") + 1);
   and insert a new line after it:
       &ParameterOnWhichTheCovariateTermAdded = newaccumulatetermforparameter &covariatetermtobeadded
3. Append a line containing &initialguessforthecovariatecoefficient at the end of the $THETA block.
4. Output the derived NONMEM program under the new file name &newprogramfilename.

The derived NONMEM program in the previous example can be produced by calling this macro with the macro variables replaced by appropriate values, such as:

    &covariatetobeevaluated = RAC2;
    &covariatetermtobeadded = +THETA(9)*(RAC2-0.43);
    &ParameterOnWhichTheCovariateTermAdded = TVCL;
    &initialguessforthecovariatecoefficient = (2);

and so on.
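As an illustration of the tasks listed above, the following is a minimal SAS sketch of a macro in the spirit of %trnsctlm. It is not the authors' macro: the name %addcov and its keyword parameters are hypothetical. The sketch assumes that the parameter is defined on a single line that begins with its name and carries no trailing comment, that the control stream has one contiguous $THETA block, and that line width is not a concern, so the 80-column check and the intermediate-variable branch of task 2 are omitted.

```sas
/* Hypothetical sketch: derive a NONMEM program with one added covariate term. */
/* Assumes one contiguous $THETA block and no 80-column line limit.            */
%macro addcov(path=, basefile=, newfile=, param=, covterm=, initguess=);
  data _null_;
    length xline $200;
    retain intheta 0 appended 0;
    infile "&path.&basefile" truncover end=eof;
    file "&path.&newfile";
    input xline $char200.;

    /* The first record after $THETA that starts a different block closes */
    /* the $THETA block: write the new initial guess just before it.      */
    if intheta and not appended
       and left(xline) =: '$'
       and not (upcase(left(xline)) =: '$THETA') then do;
      put "&initguess";
      appended = 1;
      intheta = 0;
    end;
    if upcase(left(xline)) =: '$THETA' then intheta = 1;

    /* Append the covariate term to the line that defines the parameter, */
    /* for example a line beginning with TVCL = ...                      */
    if index(compress(xline), "&param=") = 1 then
      xline = trim(xline) || "&covterm";

    /* Write the (possibly modified) line without trailing blanks. */
    len = length(xline);
    put xline $varying200. len;

    /* If $THETA is the last block in the file, append the guess at the end. */
    if eof and intheta and not appended then put "&initguess";
  run;
%mend addcov;

/* Example call using the placeholder names from the text above. */
%addcov(path=/dir0/dir1/dir2/dir3/dir4/dir5/,
        basefile=baseprogramfilename,
        newfile=derivednonmemprogram1,
        param=TVCL,
        covterm=+THETA(9)*(RAC2-0.43),
        initguess=(2));
```

Matching on the blank-compressed text "TVCL=" rather than on "TVCL" alone is a deliberate choice in this sketch: it keeps lines such as CL = TVCL*EXP(ETA(1)), which merely use the parameter, from being modified.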
Running Derived NONMEM Programs

Once a batch of new NONMEM programs has been created, the programs can be run in parallel and/or in a queue simply by issuing the following SAS statements:

    x 'cd /dir0/dir1/dir2/dir3/dir4/dir5/';
    x './nonmemcommand >& nonmemcommand.o &';

The script file nonmemcommand contains the commands that run the derived NONMEM programs, such as:

    #
    cd /dir0/dir1/dir2/dir3/dir4/dir5/dir6/
    nonmem derivednonmemprogram1
    nonmem derivednonmemprogram2
    nonmem derivednonmemprogram3
    nonmem derivednonmemprogram4
    nonmem derivednonmemprogram5
    #

Backward Elimination

It is easier to update NONMEM programs during backward elimination, since the term that expresses the effect of a covariate does not have to be physically removed from the program. A covariate term can be eliminated simply by fixing the corresponding coefficient(s) to zero, as expressed in this algorithm:

1. Read the variable Xline ($80.) from the data file &path&baseprogramfilename, line by line.
2. If a covariate is selected to be removed, set the corresponding initial guesses in the $THETA block to (0 FIXED).
3. Output the derived NONMEM program under the new file name &newprogramfilename.

Results Comparison

The purpose of the covariate evaluation process is to identify the most significant covariates for the PK of a drug. During forward selection, significant covariates are added to the PK model, while during backward elimination, insignificant covariates are removed from the model. To reach this goal, the minimum objective function values of the different NONMEM programs must be compared. This comparison can be implemented by a SAS macro that reads the objective function values from the output of each NONMEM program. The algorithm can be expressed as:

1. Read the minimum objective function value of the base NONMEM program from its NONMEM report file.
2. Read the minimum objective function value of each derived NONMEM program and find the difference in degrees of freedom between the derived program and the base program.
3. Calculate p-values from the difference in minimum objective function values and the difference in degrees of freedom, using the likelihood ratio test based on an asymptotic χ² distribution.
4. Identify the most significant covariate during forward selection (the one with the smallest p-value) or the most non-significant covariate during backward elimination (the one with the largest p-value).
5. Use the identified program as the new base program for further evaluation of covariate effects.

CASE STUDY

Case 1: The data were from a sparse-sampling Phase II clinical trial of drug XX1 with 331 patients enrolled. The drug was sequentially converted to two active/toxic metabolites in the human body. A five-compartment model was identified to best describe the drug's PK. Covariate effects were estimable on six PK parameters in this model. It took about 3 hours to run the structural model on a high-speed UNIX system, and longer when covariates were included in the PK model. According to the exploratory investigation and prior information, these six PK parameters were potentially associated with some or all of 15 covariates. Therefore, for a full evaluation of covariate main effects, there were 15*6 = 90 potential programs for the first round, not including potential interactions between covariates. Even after reduction based on physiological rationale, the total number of potential programs was still up to 50. Manual editing of the base program to generate these 50 programs would have been extremely time-consuming and prone to error.

Traditionally, a series of graphs showing the potential correlation between each PK parameter and each covariate is created to visually judge the significance of the covariate and the trend of the PK parameter with the covariate values, so as to further reduce the number of covariates on each PK parameter. However, this method has several disadvantages:

1) There are many graphs to be visually compared.
2) It is difficult to compare the significance of a covariate on a parent PK parameter with that on a metabolite PK parameter, or the significance of a covariate on two different metabolite PK parameters.
3) It is difficult to compare different covariates on the same PK parameter when they are at different orders of magnitude, such as bilirubin at a level of 10⁻¹ mg/dL while body weight is at a level of 10² kg.
4) It is almost impossible to visually decide which effect is more significant across covariates and species, such as bilirubin on the clearance of the parent drug versus the level of an enzyme on the metabolic rate constant.
5) Even with the total number of significant covariates largely reduced, whether or not the unselected covariates might be significant cannot be confirmed.

With the long run time for each NONMEM program, a scientist may have enough time to manually update NONMEM programs so that the computer does not have to wait for programs to run. That is to say, the rate-limiting process of the covariate evaluation is computation time. However, with manual updating of NONMEM programs, the scientist is tied to program updating and results comparison all the time. Furthermore, errors could frequently occur during the manual updating of programs and the retrieval/comparison of results. In contrast, with automation, the time that the scientist spends on updating programs and retrieving/comparing results is negligible. In addition, the occurrence of errors could be largely reduced. As a result, the scientist gains time to perform other tasks.

Case 2: The data were from three rich-data Phase I clinical trials of drug XX2 (oral administration in different formulations and age groups) with 100 young patients enrolled. One active metabolite was identified for this drug. A two-compartment model was identified to best describe the drug's PK. Altogether 21 covariates were potentially associated with 6 PK parameters.
Based on physiological rationale, the total number of possible evaluations of covariate effects on PK parameters could be reduced from 21*6 = 126 to 77. The computation time for the structural model was about 4 minutes. Clearly, program updating would be the rate-limiting process for the evaluation of covariate effects if manual updating were conducted. With automation, computation time becomes the rate-limiting process, while the scientist is freed from the burdensome workload.

Case 3: The data were from a sparse-sampling Phase II clinical trial of drug XX3 (oral administration) with 200 patients enrolled. No metabolite was of concern. A one-compartment model was identified to best describe the drug's PK. Altogether 14 covariates were potentially associated with 3 PK parameters. The total number of possible evaluations of covariate effects on the PK parameters of the structural model was 30. It took about 20 seconds to run the structural model. In this case, program updating and results comparison were the rate-limiting processes for the covariate evaluation. Manual updating of programs and comparison of results would require much more time than computation. That is, the computer would have to wait for the scientist to update programs for running. Even if the total number of programs could be further reduced via visual assessment, the total time for covariate evaluation would still be determined by the manual updating of programs. With automation, however, even though more programs were actually run, the total time for the covariate evaluation was still much less.

SUMMARY

Automation of program updating and results comparison can be implemented with SAS macros. It can save scientists a tremendous amount of time and dramatically decrease the opportunities for error. This automation does not change the total computation time for population PK/PD modeling, but largely reduces the
workload and the total time that a scientist spends on covariate evaluation, and therefore reduces the total time for a project. Together with future automation of structural model identification, the integration of SAS and NONMEM offers the possibility of completely automating the whole life cycle of a population PK/PD analysis.

REFERENCES

1. K. G. Kowalski and M. M. Hutmacher. Efficient screening of covariates in population models using Wald's approximation to the likelihood ratio test. J Pharmacokinet Pharmacodyn 28(3):253-275 (2001).
2. E. N. Jonsson and M. O. Karlsson. Automated covariate model building within NONMEM. Pharm Res 15(9):1463-1468 (1998).
3. J. W. Mandema, D. Verotta and L. B. Sheiner. Building population pharmacokinetic-pharmacodynamic models. I. Models for covariate effects. J Pharmacokinet Biopharm 20:511-528 (1992).
4. J. F. Lawless and K. Singhal. Efficient screening of nonnormal regression models. Biometrics 34:318-327 (1978).
5. S. L. Beal and L. B. Sheiner (eds.). NONMEM Users Guides. NONMEM Project Group, University of California, San Francisco, CA, 1992.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration. Other brand and product names are registered trademarks or trademarks of their respective companies.

CONTACT INFORMATION

The authors can be contacted at:

Alan J Xiao, Ph.D.
Population PK/PD Scientist
Cognigen Corporation
395 Youngs Road
Buffalo, NY 14221-5831
Phone: 716-633-3463 ext. 265
Fax: 716-633-7404
Email: alan.xiao@cognigencorp.com
Web: www.cognigencorp.com

Jill B Fiedler-Kelly, M.S.
Vice President, Population PK/PD
Cognigen Corporation
395 Youngs Road
Buffalo, NY 14221-5831
Phone: 716-633-3463 ext. 228
Fax: 716-633-7404
Email: jbf@cognigencorp.com
Web: www.cognigencorp.com