Issues in Non-Clinical Statistics

Stan Altan
Chemistry, Manufacturing & Control Statistical Applications Team
Department of Non-Clinical Statistics

Outline
- Introduction
- Regulatory Considerations Impacting Statistical Practices in Non-Clinical Development (Two Issues)
  - Stability Analysis: Issues and Controversies
  - Equivalence Approach to Bioassay Potency Testing
- Issues in the Statistical Analysis of a Nonstandard Design (The N-1 Design)
  - Excipient Compatibility Studies
- Wrap-up

Pharmaceutical Product Development: Discovery through Launch
- Discovery: Lead Optimization, Compound Selection
- Non-Clinical / Pre-Clinical: Formulation Development, Safety Assessment
- Clinical Research and Commercialization:
  - Phase I: first in humans
  - Phase IIa: proof of biological activity in humans
  - Phase IIb
  - Phase III
  - Registration: NDA/MAA submission
  - Approval & Launch

Information Needed for Formulation Development Project
- Active Pharmaceutical Ingredient (API): fundamental physical and chemical properties of the drug molecule and other derived properties (pKa, solubility, melting point, hygroscopicity)
- Chemical stability, assessed through degradation studies
- Oral absorption potential of the API, evaluated from its aqueous solubility throughout the pH range of the GI tract and the permeability of the compound in an in situ rat intestinal loop or Caco-2 model
- Polymorphism, particle size and surface characteristics

Biopharmaceutics Classification System (BCS)
- Class I: High Solubility & High Permeability (Solubility > 1 mg/ml; Permeability > 6 x 10^-5 cm/s)
- Class II: Low Solubility & High Permeability (Solubility < 1 mg/ml; Permeability > 6 x 10^-5 cm/s)
- Class III: High Solubility & Low Permeability (Solubility > 1 mg/ml; Permeability < 6 x 10^-5 cm/s)
- Class IV: Low Solubility & Low Permeability (Solubility < 1 mg/ml; Permeability < 6 x 10^-5 cm/s)
Class boundaries
- HIGHLY SOLUBLE when the highest dose strength is soluble in < 250 ml water over a pH range of 1 to 7.5.
- HIGHLY PERMEABLE when the extent of absorption in humans is determined to be > 90% of an administered dose, based on mass balance or in comparison to an intravenous reference dose.
- RAPIDLY DISSOLVING when > 85% of the labeled amount of drug substance dissolves within 30 minutes using USP apparatus I or II in a volume of < 900 ml buffer solutions.
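The four classes above amount to a two-way threshold rule. A minimal sketch, assuming the numeric cutoffs quoted on the slide (1 mg/ml and 6 x 10^-5 cm/s) and treating values exactly at a cutoff as "low", which the slide does not specify; the function name `bcs_class` is hypothetical:

```python
def bcs_class(solubility_mg_per_ml: float, permeability_cm_per_s: float) -> str:
    """Return the BCS class (I to IV) from the numeric cutoffs on the slide."""
    solubility_cutoff = 1.0      # mg/ml
    permeability_cutoff = 6e-5   # cm/s
    # Assumption: a value exactly at the cutoff counts as "low".
    high_solubility = solubility_mg_per_ml > solubility_cutoff
    high_permeability = permeability_cm_per_s > permeability_cutoff
    if high_solubility and high_permeability:
        return "I"
    if high_permeability:
        return "II"
    if high_solubility:
        return "III"
    return "IV"


# Illustrative (made-up) compound: poorly soluble but well absorbed.
print(bcs_class(0.1, 1e-4))
```

The regulatory class boundaries (dose volume, extent of absorption, dissolution) are richer than these two cutoffs; the sketch covers only the numeric rule.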

Stability Analysis: Issues and Controversies
- Introduction
- Objectives of a Stability Study
- Kinetic Models
- Design of Stability Studies
- Stability Models: Fixed and Mixed Effects
- Case Study
- Bayesian Approach

Introduction
Stability is defined as the capacity of a drug substance or a drug product to remain within specifications established to ensure its identity, strength, quality, and purity throughout the retest period or expiration dating period.

Introduction: Purpose of Stability Testing
- To provide evidence on how the quality of a drug substance or drug product varies with time under the influence of a variety of environmental factors (such as temperature, humidity, light, package)
- To establish a re-test period for the drug substance or an expiration date (shelf life) for the drug product
- To recommend storage conditions
- Control focused on lot mean

Kinetic Models (API) (Underlying Mechanism)
Orders 0, 1, 2, where $C_0$ is the assay value at time 0:
$C^{(0)}(t) = C_0 - k_0 t$
$C^{(1)}(t) = C_0 e^{-k_1 t}$
$\dfrac{1}{C^{(2)}(t)} = \dfrac{1}{C_0} + k_2 t$
When $k_1$ and $k_2$ are small,
$C^{(1)}(t) \approx C_0 - C_0 k_1 t$ and $C^{(2)}(t) \approx C_0 - C_0^2 k_2 t$
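The small-rate-constant statement can be checked numerically: for first- and second-order kinetics the exact curves stay close to a straight line in time when the rate constants are small. A minimal sketch, with made-up rate constants chosen only for illustration:

```python
import math


def first_order(c0, k1, t):
    """Exact first-order decay: C(t) = C0 * exp(-k1 * t)."""
    return c0 * math.exp(-k1 * t)


def second_order(c0, k2, t):
    """Exact second-order decay: 1/C(t) = 1/C0 + k2 * t."""
    return c0 / (1.0 + c0 * k2 * t)


def first_order_linear(c0, k1, t):
    """Small-k1 approximation: C0 - C0 * k1 * t."""
    return c0 - c0 * k1 * t


def second_order_linear(c0, k2, t):
    """Small-k2 approximation: C0 - C0^2 * k2 * t."""
    return c0 - c0 ** 2 * k2 * t


C0 = 100.0   # assay at time 0 (% label claim), illustrative
K1 = 1e-3    # hypothetical first-order rate constant, per month
K2 = 1e-5    # hypothetical second-order rate constant

for months in (0, 12, 24, 36):
    print(months,
          round(first_order(C0, K1, months), 3),
          round(first_order_linear(C0, K1, months), 3),
          round(second_order(C0, K2, months), 3),
          round(second_order_linear(C0, K2, months), 3))
```

With these values the exact and linearized curves differ by well under 0.2% label claim at 36 months, which is the practical justification for fitting straight lines to assay data.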

Basic Design
- Randomly select containers/dosage units at time of manufacture; minimum of 3 batches, stored at specified conditions related to zones I, II, III, IV requirements
- At specified times (0, 1, 3, 6, 9, 12, 18, 24, 36, 48, 60 months), randomly select dosage units and perform assay on composite samples
- Basic factors: Batch, Strength, Storage Condition, Time, Package
- Additional factors: Position, Drug Substance Lot, Supplier, Manufacturing Site, ...

Development Stability Study: Description of Data
- Assay measurements at 0, 1, 3, 6, 9, 12 months
- 3 batches held at 25C/60%RH, 30C/65%RH and 40C/75%RH storage conditions, 3 package configurations
- Specification limits: 90-110% Label Claim (w/w)

Expiration Date
Regression Model (Fixed Terms):
$y_{ij} = A + b_i t_{ij} + e_{ij}$
If $b_i < 0$, the expiration date $T_{ED}$ at condition $i$ is the solution to the equation (roots of a quadratic)
$LSL = \hat{A} + \hat{b}_i T_{ED} - t_{(\alpha, df)} \sqrt{\widehat{Var}(\hat{A} + \hat{b}_i T_{ED})}$
where LSL = lower specification limit and $t_{(\alpha, df)}$ is the $(1-\alpha)$th quantile of the t-distribution with df degrees of freedom.
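The expiration-date equation can be solved numerically for a single batch and condition. A minimal sketch, assuming a simple least-squares line, the batch B1 assay values at 25C/60%RH from the case-study data listing in this deck, a lower specification limit of 90, and a tabulated t quantile; the function names and the bisection search (used here in place of solving the quadratic directly) are illustrative choices, not the author's implementation:

```python
import math

# Batch B1 at 25C/60%RH, taken from the case-study data listing in this deck.
MONTHS = [0, 1, 3, 6, 9, 12]
ASSAY = [99.7, 100.0, 99.1, 98.8, 98.7, 98.7]   # % label claim
LSL = 90.0        # lower specification limit
T_CRIT = 2.1318   # assumed table value: one-sided 95% t quantile, 4 df


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x, plus the pieces the bound needs."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    sse = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return {"a": a, "b": b, "mse": sse / (n - 2), "n": n, "xbar": xbar, "sxx": sxx}


def lower_bound(t, fit, t_crit):
    """Lower one-sided confidence bound on the fitted mean at time t."""
    se = math.sqrt(fit["mse"] * (1.0 / fit["n"] + (t - fit["xbar"]) ** 2 / fit["sxx"]))
    return fit["a"] + fit["b"] * t - t_crit * se


def shelf_life(fit, lsl, t_crit, t_max=240.0):
    """Time at which the lower bound crosses lsl (bisection; assumes slope < 0)."""
    lo, hi = fit["xbar"], t_max
    if lower_bound(lo, fit, t_crit) <= lsl or lower_bound(hi, fit, t_crit) >= lsl:
        raise ValueError("no crossing between the mean sampling time and t_max")
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if lower_bound(mid, fit, t_crit) > lsl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


FIT = fit_line(MONTHS, ASSAY)
SHELF_LIFE = shelf_life(FIT, LSL, T_CRIT)
print(f"intercept={FIT['a']:.2f}, slope={12 * FIT['b']:.2f} per year, "
      f"shelf life={SHELF_LIFE:.1f} months")
```

The search starts at the mean sampling time because, for a negative slope, the lower bound decreases monotonically beyond that point, so the crossing is unique.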

Expiration Date: intersection of the specification limit with the lower one-sided 95% confidence bound on the batch mean.
[Figure: Scatterplot of Observed Assay, True Concentration, Lower CL vs Time. x-axis: Time (Months), 0 to 24; y-axis: Concentration (% Label), 90 to 104; series: Obs_Assay, True_Conc, LCLM, Pred; the shelf life is marked on the time axis.]

Regulatory Model, ICH Q1E ($n_b$ Batches, $n_c$ Conditions)
Models 1, 2, 3 (Fixed Terms):
1. Fit individually by Batch and Condition ($n_b \times n_c$ models)
2. Fit by Batch, include all Conditions (fit $n_b$ constrained intercept models)
3. Fit all Batches and Conditions (fit 1 model, constrained batch intercepts, with/without constrained slopes)

Regulatory Models (ICH Q1E): Model Specification
Index: i = batch, j = condition, k = time

Type   Model  Form                                             Fixed parameters   Variance parameters
Fixed  1      $y_{ijk} = A_{ij} + B_{ij} T_{ijk} + e_{ijk}$    $2 n_b n_c$        $n_b n_c$
Fixed  2      $y_{ijk} = A_i + B_{ij} T_{ijk} + e_{ijk}$       $n_b n_c + n_b$    $n_b$
Fixed  3a     $y_{ijk} = A_i + B_{ij} T_{ijk} + e_{ijk}$       $n_b n_c + n_b$    1
Fixed  3b     $y_{ijk} = A_i + B_j T_{ijk} + e_{ijk}$          $n_c + n_b$        1

Issues with Regulatory Models
- Pooling across batches of drug product: intercepts and slopes are tested at p = 0.25
- It is unrealistic to assume batch potencies are identical at release (time of manufacture); batches are going to be different, so why test for equality?
- The residual error term is used for pooling across intercepts. Why should this be the criterion for poolability?
- Multiple error terms are possible if models 1 or 2 are chosen
- p = 0.25 ignores levels of process and analytical variability
- A stability study design cannot be powered; the emphasis is on estimation of degradation rates
- Ignores the fact that the chemistry is independent of batch: same API, and the rate constant is a property of the molecule

Issues with Regulatory Models
- Are the regulatory guidelines reflective of current technology and statistical practice?
- This is the right time to question the pooling paradigm
- Equivalence approach is not a way out
- Are we stuck in a Hypothesis Testing / Equivalence Testing rut?

Pooling across batches
- A mechanistic basis exists to forego pooling tests (constrained models)
- Assume a fixed common temperature-condition specific slope based on kinetic considerations
- Assume different batch-specific intercepts
- Main requirement is to estimate the parameters and account for incipient variation in such a way that control over the lot mean is assured

Mixed Model
If one can assume that drug product batches arise from a fixed manufacturing process, then one can regard the batches as the primary independent statistical units.
The statistical model needs to estimate:
- Process mean at time of manufacture
- Rate parameter
- Variance structure:
  - Process (lot-to-lot) variation
  - Analytical variation: measurement error, extraneous sources

Mixed Effects Model
- The mixed effects model provides a coherent and compact modeling framework consistent with the manufacturing process
- Acknowledges all sources of variation; simple but flexible variance structure
- Consistent with the basic philosophy that batch is the conditionally independent primary statistical unit (subject-specific effects)
- A natural representation of a batch process; a direct lead-in to process simulations, bootstrapping, post-commercialization studies
- Easily extended to multiple fixed factors under study
- Main objection: small number of batches

Mixed Effects Model
Models 4, 5 (Mixed Models, with 1 or 2 Random Terms):
4. Random term in the intercept
5. Random terms in intercept and slopes
Correlation is not likely for API, but may be for others.

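Mixed Effects Model: Model Specification
Index: i = batch, j = condition, k = time

Type   Model  Form                                                                        Fixed parameters  Variance parameters
Mixed  4      $y_{ijk} = (\mu_0 + \alpha_i) + B_j T_{ijk} + \epsilon_{ijk}$                $n_c + 1$         2
Mixed  5      $y_{ijk} = (\mu_0 + \alpha_i) + (B_j + \gamma_i) T_{ijk} + \epsilon_{ijk}$   $n_c + 1$         3

Here $\alpha_i$ and $\gamma_i$ denote the random batch effects on the intercept and the slope.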

Case study: Data Listing

Condition  Month  Batch B1  Batch B2  Batch B3
25C-60RH   0      99.7      99.0      99.4
25C-60RH   1      100.0     99.4      100.5
25C-60RH   3      99.1      99.2      99.7
25C-60RH   6      98.8      99.5      99.7
25C-60RH   9      98.7      98.7      99.4
25C-60RH   12     98.7      98.6      99.3
30C-65RH   1      100.4     99.9      101.7
30C-65RH   3      99.6      99.4      100.0
30C-65RH   6      99.3      99.5      99.6
30C-65RH   9      98.2      98.2      98.9
30C-65RH   12     98.0      97.4      98.5

[Figure: Stability Profiles. Assay (%LC), 98 to 102, vs Time (Months), 0 to 12; one panel per batch (B1, B2, B3); series by condition: 25C-60RH, 30C-65RH.]

Results from 5 Models (25C = 25C-60RH, 30C = 30C-65RH; rates in %LC per year)

Type   Model  Fixed    Batch  Rate            Intercept            Residual variance  Random-effect variance  Res DF
              params          25C     30C     25C      30C         25C     30C        (intercept, slope)
Fixed  1      12       B1     -1.23   -2.21   99.7     100.2       0.10    0.14                               4
                       B2     -0.60   -1.97   99.3     99.8        0.10    0.34
                       B3     -0.61   -2.03   99.9     100.6       0.17    0.77
Fixed  2      9        B1     -1.62   -1.98   100.0                0.14                                       8
                       B2     -1.08   -1.87   99.7                 0.20
                       B3     -1.87   -1.90   100.5                0.42
Fixed  3a     9               same    same    same                 0.25                                       24
Fixed  3b     5               -1.35   -1.91   99.9, 99.7, 100.5    0.22                                       28
Mixed  4      3               -1.35   -1.91   100.0                0.22               0.12, -                 28
Mixed  5      3               -1.35   -1.91   100.0                0.22               0.12, 0                 26

Shelf Life Estimates (months)

Model                                                                                      25C/60RH  30C/65RH
Fixed Models
1. Batch and Condition specific Intercept, Slope, Error                                    60        32
2. Batch specific Intercept, Slopes, Error                                                 51        41
3a. Batch specific Intercept, Slopes, Common error across Batches                          49        43
3b. Batch specific Intercept, Condition specific Slopes, Common error across Batches       65        49
Mixed Models
4. Combined across Batches, Conditions (Mixed model with a single Random Coefficient)      67        51
5. Combined across Batches, Conditions (Mixed model with two Random Coefficients)          67        51

[Figure: Assay vs Time on Stability, Model 4 (M4). Assay (%LC), 95 to 102, vs Time (Months), 0 to 24; panels: 25C-60RH, 30C-65RH.]

Bayesian Approach
- Provides a mechanism to include prior information in the statistical analysis of current data and to update model parameter estimates as new data are collected
- A more natural way to approach CMC decision making, in terms of a posterior predictive distribution

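Hierarchical Model
Random Intercept Model:
$Y_{ijk} = m + \alpha_i + \beta_j X_{ijk} + \epsilon_{ijk}$, with $\alpha_i \sim N(0, \sigma_\alpha^2)$ and $\epsilon_{ijk} \sim N(0, \sigma^2)$
Equivalently, $Y_{ijk} \mid \alpha_i \sim N(m + \alpha_i + \beta_j X_{ijk}, \sigma^2)$.
Add (prior) distributions on the unknown parameters $m$, $\beta_j$, $\sigma_\alpha^2$, $\sigma^2$.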

Prior Distributions (Expert Opinion)
- Process mean is likely between 99% and 101%: $m \sim N(100, 0.1)$
- Lot-to-lot variance is likely between 0.1 and 0.5: $1/\sigma_\alpha^2 \sim \Gamma(10, 2)$
- Flat prior on the yearly degradation rates: $\beta_1, \beta_2 \sim I( , )$
- Analytical variance is likely between 0.1 and 1.0: $1/\sigma^2 \sim \Gamma(6, 2)$
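One way to see how the gamma priors encode the stated expert ranges is to simulate from them. A minimal sketch, assuming the two gamma priors are placed on the precisions (1/variance) with a (shape, rate) parameterization, which is an assumption about the slide's convention; the function name `variance_interval` is hypothetical:

```python
import random


def variance_interval(shape, rate, n_draws=100_000, seed=1):
    """Central 95% interval of the variance when 1/variance ~ Gamma(shape, rate)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale), so scale = 1 / rate.
    draws = sorted(1.0 / rng.gammavariate(shape, 1.0 / rate) for _ in range(n_draws))
    return draws[int(0.025 * n_draws)], draws[int(0.975 * n_draws)]


LOT_TO_LOT = variance_interval(10, 2)   # expert range quoted: 0.1 to 0.5
ANALYTICAL = variance_interval(6, 2)    # expert range quoted: 0.1 to 1.0
print("lot-to-lot variance, central 95%:", LOT_TO_LOT)
print("analytical variance, central 95%:", ANALYTICAL)
```

Under this reading the simulated 95% intervals sit inside the quoted expert ranges, which supports the (shape, rate) interpretation but does not prove it.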

Parameter Estimates

                      Frequentist                          Bayesian
Parameter             Estimate  95% Confidence Interval    Mean (Median)  95% Credible Interval
m                     100       99.2, 100.8                100            99.6, 100.5
Rate, 25C             -0.11     -0.16, -0.07               -0.11          -0.16, -0.06
Rate, 30C             -0.16     -0.21, -0.11               -0.16          -0.21, -0.11
Lot-to-lot variance   0.12                                 0.21 (0.20)    0.11, 0.38
Analytical variance   0.22                                 0.27 (0.26)    0.17, 0.42

Expiration Date by Storage Condition

Method        25C  30C
Frequentist   67   51
Bayesian      65   50

Summary
- Current regulatory guidelines are being challenged in view of current technologies and scientific understanding; this needs continued discussion
- A Bayesian framework is available; it needs further discussion in relation to reasonable priors and integrating scientific judgment

Equivalence Approach to Bioassay Potency Testing
- Introduction
- Modeling of Potency Curves
- Equivalence vs. Equality
- Issues

Definition of Bioassay
WHO/NIBSC, J. Immunol. Methods (1998), 216, 103-116; International consensus, Dev. Biol. Standard. (1999) vol 97:
"A bioassay is defined as an analytical procedure measuring a biological activity of a test substance based on a specific, functional, biological response of a test system."
Finney, Statistical Method in Biological Assay, 3rd Edition (1978)

[Figure: Examples of Assay Dose-Response Curves. Panels: Assay Type A, Assay Type B, Assay Type C; y-axis: product-induced response; curves for Day 1 and Day 2; horizontal scale marked in 1-unit steps.]

Parallelism of Dose-Response Curves
- The definition of relative potency requires that the test sample and standard have an identical type of response within the assay.
- Two parallel response lines should be observed (any displacement between the curves is related to their relative activities).
- Non-parallelism indicates that standard and test material are not acting similarly, and any definition of potency is not valid.

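Four Parameter Logistic
$f(x_i, \beta) = \beta_2 + \dfrac{\beta_1 - \beta_2}{1 + (x_i/\beta_3)^{\beta_4}} = \beta_2 + \dfrac{\beta_1 - \beta_2}{1 + \exp\{\beta_4 [\log(x_i) - \log(\beta_3)]\}}$    (1)
where $\beta = (\beta_1, \beta_2, \beta_3, \beta_4)$,
$\beta_1$ = asymptote as the concentration $x \to 0$ (for $\beta_4 > 0$),
$\beta_2$ = asymptote as $x \to \infty$,
$\beta_3$ = concentration corresponding to the response halfway between the asymptotes,
$\beta_4$ = slope parameter.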
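The roles of the four parameters are easy to confirm by evaluating the curve at a few concentrations. A minimal sketch of the four-parameter logistic in its exponential form, with made-up parameter values used purely for illustration:

```python
import math


def four_pl(x, b1, b2, b3, b4):
    """Four-parameter logistic: b1 = asymptote as x -> 0 (for b4 > 0),
    b2 = asymptote as x -> infinity, b3 = midpoint concentration, b4 = slope."""
    return b2 + (b1 - b2) / (1.0 + math.exp(b4 * (math.log(x) - math.log(b3))))


# Hypothetical parameter values for illustration only.
B1, B2, B3, B4 = 2.0, 0.1, 10.0, 1.5

print(four_pl(1e-9, B1, B2, B3, B4))   # close to B1
print(four_pl(B3, B1, B2, B3, B4))     # halfway between B1 and B2
print(four_pl(1e9, B1, B2, B3, B4))    # close to B2
```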

Variance Function
Variability is modeled as a function of the mean response to capture heteroscedasticity:
$Var(y_i) = \sigma^2 g^2(f(x_i, \beta), \theta)$
where $g^2(f(x_i, \beta), \theta)$ is the variance function with parameter $\theta$, and $\sigma$ is a scale parameter.
Correct specification is important for calculating standard errors of parameter estimates.
The power of the mean is commonly used: $g(f(x_i, \beta), \theta) = \mu_i^{\theta}$, where $\mu_i = E(y_i)$.
The generalized least squares (GLS) method is used to estimate the parameters (Giltinan and Ruppert).

Extend Model 1 to the comparison of multiple curves, say standard and test preparations. This is the context of potency testing.

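Constrained Four-Parameter Logistic
Conditions of similarity: $\beta_{s1} = \beta_{t1}$ (lower asymptote), $\beta_{s2} = \beta_{t2}$ (upper asymptote), $\beta_{s4} = \beta_{t4}$ (parallelism), and only $\beta_{s3}$, $\beta_{t3}$ vary.
Let $\beta^*_{s3} = \log \beta_{s3}$, $\beta^*_{t3} = \log \beta_{t3}$. Under these conditions, the constrained model is given by
$y_i = \beta_2 + \dfrac{\beta_1 - \beta_2}{1 + \exp\{\beta_4 [I_s \log x_i + I_t (\log x_i + \rho_t) - \beta^*_{s3}]\}} + \epsilon_i$    (2)
where $I_s$ and $I_t$ indicate standard and test observations, and $\rho_t = \beta^*_{s3} - \beta^*_{t3}$.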
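Under the similarity conditions the test curve is the standard curve shifted horizontally on the log-concentration scale, and the shift is the log of the ratio of the two midpoint concentrations. A minimal sketch of the constrained mean function, assuming a shift parameter called `rho` here (the symbol name is an assumption) equal to log(standard midpoint) minus log(test midpoint); all numeric values are made up:

```python
import math


def constrained_four_pl(x, is_test, b1, b2, log_b3_std, b4, rho):
    """Mean response under the constrained model: shared asymptotes (b1, b2)
    and slope (b4); the test curve differs only by the log-scale shift rho."""
    shift = rho if is_test else 0.0
    return b2 + (b1 - b2) / (1.0 + math.exp(b4 * (math.log(x) + shift - log_b3_std)))


# Hypothetical values for illustration only.
B1, B2, B4 = 2.0, 0.1, 1.5
LOG_B3_STD = math.log(10.0)   # standard midpoint at concentration 10
RHO = math.log(2.0)           # test midpoint at 10 / 2 = 5

# The test response at x equals the standard response at x * exp(rho).
x = 3.0
test_response = constrained_four_pl(x, True, B1, B2, LOG_B3_STD, B4, RHO)
std_response = constrained_four_pl(x * math.exp(RHO), False, B1, B2, LOG_B3_STD, B4, RHO)
print(test_response, std_response, "midpoint ratio:", math.exp(RHO))
```

Reading exp(rho) as a relative potency only makes sense when the shared-asymptote and shared-slope conditions hold, which is why similarity is tested first.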

Current EQUALITY Approach for Similarity Testing of Dose-Response Curves
For parallel-line assays, similarity is assessed using a statistical hypothesis test of equal slopes (test of parallelism):
$H_0: \beta_S - \beta_T = 0$ vs. $H_a: \beta_S - \beta_T \ne 0$
Test statistic: T = (difference in slopes) / (standard error of the difference)
Conclude equality if |T| falls below a critical value.
Problems with the current approach:
1. The greater the precision, the greater the sensitivity to declare even a very small difference significant; the t-test in this case penalizes a more precise assay.
2. For assays with poor precision, it is hard to declare even substantially different slopes as different.

Proposed Changes in USP Chapter <111> (Design and Analysis of Biological Assays)
- Focus on analysis; design and validation issues moved to other chapters
- Preferred method for combination of independent assays is unweighted averaging
- Non-linear mixed effects models (4- and 5-parameter logistic)
- Method of similarity testing of dose-response curves is shifted from an equality to an equivalence approach

Proposed EQUIVALENCE Approach for Similarity Testing
Equivalence limits are established that represent a measure of acceptable closeness in the slope estimates.
$H_0$: the two slopes are not equivalent: $(\beta_T - \beta_S) < D_L$ or $(\beta_T - \beta_S) > D_U$
$H_a$: the two slopes are equivalent: $D_L \le (\beta_T - \beta_S) \le D_U$
where $D_L$ and $D_U$ are the equivalence limits.
Confidence intervals (CIs) for the difference in slopes can be used to test equivalence: if the CI falls within the equivalence limits, one concludes similarity of the dose-response curves.
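The contrast between the equality and equivalence decisions can be made concrete with two invented assays, one precise and one imprecise. A minimal sketch; the slope differences, standard errors, critical value and equivalence limits below are all hypothetical, and the confidence level implied by `t_crit` is left to the caller:

```python
def equality_test_rejects(diff, se, t_crit):
    """Equality approach: declare the slopes different when |T| exceeds t_crit."""
    return abs(diff / se) > t_crit


def equivalence_by_ci(diff, se, t_crit, d_lower, d_upper):
    """Equivalence approach: similar when the whole CI lies inside the limits."""
    lo = diff - t_crit * se
    hi = diff + t_crit * se
    return d_lower <= lo and hi <= d_upper


T_CRIT = 2.0             # hypothetical critical value
D_L, D_U = -0.20, 0.20   # hypothetical equivalence limits

precise = {"diff": 0.05, "se": 0.01}     # small difference, very precise assay
imprecise = {"diff": 0.50, "se": 0.40}   # large difference, noisy assay

for name, assay in (("precise", precise), ("imprecise", imprecise)):
    print(name,
          "equality test rejects:", equality_test_rejects(assay["diff"], assay["se"], T_CRIT),
          "equivalent by CI:", equivalence_by_ci(assay["diff"], assay["se"], T_CRIT, D_L, D_U))
```

In this illustration the precise assay fails the equality test yet passes equivalence, while the noisy assay passes the equality test yet fails equivalence, which is the behavior the two slides describe.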

Challenges of the Equivalence Approach
- What to choose as the measure of non-parallelism: slope difference or ratio of slopes; 3 or 4 parameters for 4- and 5-parameter logistic curves
- How to determine the equivalence limits:
  - Knowledge/experience for product and assay?
  - Historical data that compare the standard to itself?
  - Provisional capability-based equivalence limits: tolerance intervals on slope differences of sample replicates?
- Lack of enough data during assay development
- Highly variable assays, such as in vivo assays, may not be able to satisfy a CI criterion: use an acceptance interval for the point estimate (rather than the CI)?

Bioassay Potency Testing Summary
- The equivalence approach replacing the traditional equality approach for similarity testing means that assay precision will have to be characterized very carefully during assay development
- Assay design will play a more important role, e.g. the number of plates
- Further discussion is needed on a coherent approach to establishing equivalence criteria for dose-response curve similarity testing

Excipient Compatibility Studies - N-1 Design
- Introduction
- Construction
- Analysis
- Issues

Introduction
Prior to development of any dosage form of a drug candidate, fundamental physical and chemical properties of the drug molecule and other derived properties are studied. This information suggests possible approaches to formulation development; this early phase is known as Preformulation.
Excipient Compatibility studies fall within this early phase, and the emphasis is on excipient effects on chemical and physical properties of the drug formulation (API stability in particular).

Introduction: Objective
- Carry out a stability study on drug-excipient combinations to identify the most stable formulation.
- Complete (or close to complete) combinations are preferred over binary formulations.
- Economic incentive to formulate early.

Balanced Incomplete Block Designs
Consider a BIBD; the simplest BIBD has 3 treatments, with parameters
t = number of treatments = 3
k = experimental units per block = 2
b = number of blocks = 3
r = number of replicates per treatment = 2
$\lambda$ = number of blocks in which each pair of treatments occurs = 1

Block  A  B  C
1      x  x
2      x     x
3         x  x

A BIBD satisfies $\lambda(t-1) = r(k-1)$; here 1(3-1) = 2(2-1) = 2.

Balanced Incomplete Blocked Factorial Designs (or N-1 Designs)
Suppose that instead of treatments A, B, C we have 3 classes of factors, say $F_1$, $F_2$, $F_3$, each with 2 levels, and we form the factorial design of each pair of factors, denoted $D_{ij}$, $i, j = 1, 2, 3$, $j > i$:
$D_{12}$ = 2x2 factorial design of factors $F_1$ and $F_2$
$D_{13}$ = 2x2 factorial design of factors $F_1$ and $F_3$
$D_{23}$ = 2x2 factorial design of factors $F_2$ and $F_3$
Then the N-1 design is the collection of these 3 factorial designs.
The N-1 design can be considered a factorial extension of the BIBD idea, but it is not subject to the restrictions on the BIBD parameters.

General Construction of an N-1 Design
Given $p$ factors $F_1, F_2, \ldots, F_p$, where factor $F_i$ has $l_i$ levels ($l_i \ge 1$), $i = 1, 2, \ldots, p$, construct $p$ factorial designs, the $i$-th denoted $D_{1,2,\ldots,i-1,i+1,\ldots,p}$ (the full factorial design of all factors except the $i$-th). The N-1 design is the collection of all such factorial designs.
The total number of observations is
$N = \sum_{i=1}^{p} l_1 l_2 \cdots l_{i-1} l_{i+1} \cdots l_p$
An N-1 design can have more points than the corresponding full factorial design (12 vs. 8 runs for three two-level factors). So why do an N-1 design?
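The construction and the run count can be sketched directly: drop one factor at a time and take the full factorial of the rest. A minimal sketch; the function names are hypothetical, levels are coded 0, 1, 2, ... and `None` marks the omitted (absent) factor:

```python
from itertools import product
from math import prod


def n_minus_1_size(levels):
    """Total runs: sum over omitted factors of the product of the other level counts."""
    return sum(
        prod(l for j, l in enumerate(levels) if j != omitted)
        for omitted in range(len(levels))
    )


def n_minus_1_runs(levels):
    """Enumerate the runs; None marks the factor left out of that sub-design."""
    runs = []
    for omitted in range(len(levels)):
        axes = [[None] if j == omitted else list(range(l)) for j, l in enumerate(levels)]
        runs.extend(product(*axes))
    return runs


print(n_minus_1_size([2, 2, 2]), "runs vs", prod([2, 2, 2]), "in the full 2^3 factorial")
print(n_minus_1_size([4, 2, 2, 1]), "runs for level counts 4, 2, 2, 1")
for run in n_minus_1_runs([2, 2, 2])[:4]:
    print(run)
```

The level counts 4, 2, 2, 1 correspond to the four-factor excipient example later in the deck, where the run counts of the four sub-designs are added the same way.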

Geometric 2^3 N-1 Design: Example Data
Y = 10*C + 10*A*C + N(0,16) (0 means Absent)

Group  A   B   C   Y
1      1   1   0   1
1      1   -1  0   2
1      -1  1   0   6
1      -1  -1  0   -6
2      1   0   1   22
2      1   0   -1  -14
2      -1  0   1   -5
2      -1  0   -1  6
3      0   1   1   5
3      0   1   -1  -7
3      0   -1  1   16
3      0   -1  -1  -10

2^3 N-1 Design
Response = Mean + main effects + 2-way interactions
Consider each factorial as a group; then an error term can come from the group differences and group interactions, and from replication if available.
Solve the normal equations for a non-orthogonal analysis.

Example 2^3 N-1 Design: SAS Output (Error MS = 14.68)

Source  DF  Type I SS  Mean Square
A       2   13.17      6.58
B       2   5.63       2.81
C       1   496.13     496.13
A*B     3   89.38      29.79
A*C     2   573.38     286.69
B*C     1   49.00      49.00

Source  DF  Type III SS  Mean Square
A       1   12.50        12.50
B       1   1.13         1.13
C       1   496.13       496.13
A*B     1   42.25        42.25
A*C     1   552.25       552.25
B*C     1   49.00        49.00
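The single-degree-of-freedom sums of squares and the error mean square in this output can be reproduced by hand from the example data. A minimal sketch, assuming the -1/0/+1 coding of the data listing with 0 for an absent factor; under that coding the six effect columns turn out to be mutually orthogonal for this layout (checked in the code), so each sum of squares is a squared contrast divided by its squared length. This is an illustration, not the SAS procedure itself:

```python
# (A, B, C, Y) rows from the example data; 0 means the factor is absent.
DATA = [
    (1, 1, 0, 1), (1, -1, 0, 2), (-1, 1, 0, 6), (-1, -1, 0, -6),
    (1, 0, 1, 22), (1, 0, -1, -14), (-1, 0, 1, -5), (-1, 0, -1, 6),
    (0, 1, 1, 5), (0, 1, -1, -7), (0, -1, 1, 16), (0, -1, -1, -10),
]


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))


Y = [row[3] for row in DATA]
COLS = {
    "A": [a for a, b, c, _ in DATA],
    "B": [b for a, b, c, _ in DATA],
    "C": [c for a, b, c, _ in DATA],
    "A*B": [a * b for a, b, c, _ in DATA],
    "A*C": [a * c for a, b, c, _ in DATA],
    "B*C": [b * c for a, b, c, _ in DATA],
}

# Check that every column sums to zero and that all pairs are orthogonal.
names = list(COLS)
for i, u in enumerate(names):
    assert sum(COLS[u]) == 0
    for v in names[i + 1:]:
        assert dot(COLS[u], COLS[v]) == 0

# One-df sum of squares per effect: (x . y)^2 / (x . x).
SS = {name: dot(col, Y) ** 2 / dot(col, col) for name, col in COLS.items()}

n = len(Y)
total_ss = dot(Y, Y) - sum(Y) ** 2 / n
error_df = n - 1 - len(SS)
ERROR_MS = (total_ss - sum(SS.values())) / error_df

for name, value in SS.items():
    print(f"{name:4s} {value:8.2f}")
print(f"Error MS ({error_df} df): {ERROR_MS:.2f}")
```

The large C and A*C terms agree with the generating model Y = 10*C + 10*A*C + N(0,16) stated with the example data.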

Non-geometric N-1 Design
Four factors, with number of levels:
Active + Filler (F = 4) + Disintegrant (D = 2) + Lubricant (L = 2) + Flow Enhancer (E = 1)
Ignore the mixture aspect of the design.

Non-geometric Design
- Can it be a 4x2x2x1 factorial design: 16 combinations?
- N-1 design: 4 + 8 + 8 + 16 = 36 combinations

Analysis Issues
- Can the group interactions form a meaningful error term?
- Should replication be designed into the study?
- For geometric designs, the usual non-orthogonal analysis is straightforward and probably acceptable.
- For non-geometric N-1 designs, what does one do when a factor has only 1 level (Flow Enhancer example)?
- Fractionate to get a main effects design from a combined N-0/N-1.

Wrap-up
- Many issues in non-clinical statistical applications
- We live in exciting times, with the advent of new technologies and methods leading to a continuous source of statistical problems, especially in Non-Clinical applications
- Rutgers students and staff are at the forefront of solving these problems
Thank you!

Acknowledgements
J&J PRD Non-Clinical Statistics, CM&C Statistical Applications Team:
Ray Buck, Hans Coppenolle, Oscar Go, Areti Manola, Yan Shen, Jyh-Ming Shoung
CM&C Client Groups at J&J PRD