What Does It Mean? A Review of Interpreting and Calculating Different Types of Means and Standard Deviations


Pharmaceutics (Review)

Marilyn N. Martinez 1,* and Mary J. Bartholomew 2

1 Office of New Animal Drug Evaluation, Center for Veterinary Medicine, US FDA, Rockville, MD 20855, USA
2 Office of Surveillance and Compliance, Center for Veterinary Medicine, US FDA, Rockville, MD 20855, USA; mary.bartholomew@fda.hhs.gov
* Correspondence: Marilyn.martinez@fda.hhs.gov

Academic Editors: Arlene McDowell and Neal Davies
Received: 17 January 2017; Accepted: 5 April 2017; Published: 13 April 2017

Abstract: Typically, investigations are conducted with the goal of generating inferences about a population (humans or animal). Since it is not feasible to evaluate the entire population, the study is conducted using a randomly selected subset of that population. With the goal of using the results generated from that sample to provide inferences about the true population, it is important to consider the properties of the population distribution and how well they are represented by the sample (the subset of values). Consistent with that study objective, it is necessary to identify and use the most appropriate set of summary statistics to describe the study results. Inherent in that choice is the need to identify the specific question being asked and the assumptions associated with the data analysis. The estimate of a mean value is an example of a summary statistic that is sometimes reported without adequate consideration as to its implications or the underlying assumptions associated with the data being evaluated. When ignoring these critical considerations, the method of calculating the variance may be inconsistent with the type of mean being reported. Furthermore, there can be confusion about why a single set of values may be represented by summary statistics that differ across published reports. In an effort to remedy some of this confusion, this manuscript describes the basis for selecting among various ways of representing the mean of a sample, their corresponding methods of calculation, and the appropriate methods for estimating their standard deviations.
Keywords: animal pharmacology; pharmacokinetics; data analysis

1. Introduction

In setting the foundation for any discussion of data analysis, it is important to recognize that most studies are conducted with the goal of providing inferences about some population. If it were possible to assess every element in a population, inference would be unnecessary. Since it is rarely feasible to evaluate the entire population, measurements are made on a sample of the population and those measurements are used to make inference about the distribution of the variable of interest (e.g., body weight, treatment outcome, drug pharmacokinetics, food intake) in the entire population. Embedded within the concept of using finite samples to represent the larger population is the requirement that the observations be derived from a randomly sampled subset of the population. The larger the sample size, the better will be the approximation of the population distribution.

An evaluation of sample data frequently includes calculation of the sample mean and variance. The sample mean is a measure of location. The standard deviation (stdev), i.e., the square root of the variance, provides a measure of the dispersion of the values about the mean [1]. In addition to their use in interpreting individual study outcomes, these values are often compared across studies.

Pharmaceutics 2017, 9, 14

Once a sample mean and variance (or stdev) are calculated, researchers often use these estimates to generate inferences about the population parameters, the true mean (which is the expected value of the population), and the variance. For example, the area under the concentration versus time curve (AUC) is one variable used as an indicator of the rate and extent of drug absorption. In a crossover bioequivalence (BE) trial, an AUC is computed for each study subject following the administration of a test product and a reference product, and the ratio of test AUC/reference AUC is of interest. The natural logarithm of the ratio of test AUC/reference AUC (i.e., the difference in natural log (Ln)-transformed AUC values) is subsequently calculated for each study subject. The mean and residual error (variance) of the within-subject test/reference product ratios are determined. On the basis of these statistics, we can generate conclusions as to whether or not two products will be bioequivalent when administered to the population of potential patients.

While the concept of the mean may appear straightforward, the mean needs to be appreciated from the perspective of the distribution of the underlying data that determined the method by which the mean was estimated. In this regard, depending upon their characteristics (i.e., what they represent), a set of numbers can be described by different mean values. For this reason, it is important to understand the reason for selecting one type of mean over another. Similarly, the distributional assumptions influence the calculation of the corresponding error (variance) estimates. Data transformations change the distributional assumptions that apply. Nevertheless, there are instances where investigators use transformations when estimating the means but revert back to arithmetic assumptions in the calculation of the variance. Such a method for calculating summary statistics is flawed, rendering the sample and population inferences derived from the data analysis likewise flawed. This leads to the question of the consequence of using (or not using) data transformations in the calculation of means and variances.
When data are derived from a normal (symmetric bell curve) distribution, we can infer that about 68% of the population values will be within one stdev of the mean, about 95% of the population values will be within two stdevs of the mean, and 99.7% of the values will be within three stdevs of the mean (the three-sigma rule [1]). We do not have this property to rely upon when using the arithmetic mean and associated stdev to describe data from any other type of distribution. Appropriately chosen data transformations can result in a distribution that is more nearly normal, that is, closer to a symmetric bell shape, than the distribution of the original data, and will permit the use of the three-sigma rule and other benefits that apply to normal data. In other words, when applying the three-sigma rule, it needs to be applied on the scale of the transformation that made the data distribution closer to a symmetric bell-shaped curve.

As an example of how the natural logarithm transformation can correct right-skewed data distributions (a larger proportion of low values) and result in a distribution that is more nearly normal (i.e., closer to a symmetric bell-shaped curve), Figure 1 shows the best fit distribution to the right-skewed "Number" values in the left graph and the best fit distribution, a normal distribution, to the "Ln number" values in the right graph. The natural logarithm transformation can only be calculated for original values that are positive (it is undefined at zero and for negative values). Similar results may be shown for the use of the reciprocal transformation (1/"Number"), which also applies only to positive values. While there are many data transformations in use, this paper focuses on the natural logarithm transformation and the reciprocal transformation, which are transformations commonly applied to the analysis of right-skewed positive-valued biomedical data.

Taking this example further, we compute the arithmetic mean and stdevs for both the original "Number" values and the natural logarithm transformed values, "Ln number". The original values are recorded in the units in which they were measured and are familiar to the researcher. However, the "Ln number" values are in units that may be less familiar.
As a result, the mean and the mean ± 1, 2 or 3 stdevs of the Ln-transformed values are transformed back into the original units by exponentiation. When the back-transformed values are compared to those calculated from the original values, we find that the location of the exponentiated mean of the Ln-transformed assessment (i.e., the geometric mean) is shifted to the left of the arithmetic mean. In addition, the distribution of the exponentiated Ln-transformed values is again right-skewed, with larger differences between the mean and the mean + stdev than between the mean and the mean − stdev. While the mean minus some multiple of the stdev may be calculated as a negative number for the untransformed values, the lower limit of the exponentiated Ln-transformed values is constrained to be positive. See Table 1.

Figure 1. Best fit distribution for "Number" and for "Ln number".

Table 1. Comparison of statistics based on original values and Ln-transformed values back-transformed to the original units. Stdev: standard deviation.

Transformation Used         Mean    Mean ± Stdev    Mean ± 2 Stdev    Mean ± 3 Stdev
Untransformed               11.6    5.7, 17.3       0.2, 23           −5.5, 28.7
Back-transformed from Ln    10.4    6.2, 17.3       3.7, 28.8         2.2, 47.9

Clearly, in the absence of an appropriate data representation (i.e., if and when a transformation of the original observations is appropriate), we will have flawed assessments that will bias our interpretation of the study results and the associated population inferences derived from our analysis. In other words, in the presence of inappropriate data transformations (or in the absence of necessary transformations), the analysis of the data will be biased. It should also be noted that the greater the skewing of the data being evaluated, the greater the magnitude of the difference between the two sets of predictions.

Given the vast differences in inference that might be made depending on the sample statistics used to describe a study's results, it is evident that authors should specify the nature of the mean being calculated and reported, and that the corresponding stdev needs to be calculated in a manner consistent with the distributional assumptions associated with the calculated mean value. It is with these points in mind that this educational article provides a review of the different kinds of means, examples of when each may be appropriately applied, and statistically sound methods for estimating the corresponding variances and stdevs. The types of means and their associated stdevs discussed in this article include the arithmetic, geometric, harmonic and least squares (see Appendix A for further discussion of these means). We also discuss how to determine the arithmetic mean and variance when data are pooled across published investigations and when only summary statistics are available.

Given the density of the information covered in this manuscript, we have divided the information into the following sections:

- Illustrating the problems: here we provide a summary of the kinds of data or situations where different methods of estimating summary statistics may be appropriate. We also provide a table of values to illustrate the differences in the values of the means and stdevs that are estimated (based upon a single set of initial values) depending upon the assumption made about the distribution of the data, and we describe how the types of means and stdevs are matched to the data distribution and to the type of question being addressed.

- Within study estimation: ungrouped data: This section applies to the evaluation of means and stdevs for a single set of observations (e.g., the average elimination half-life (T1/2) for a given drug product). The study design is not intended for a comparison across effects (e.g., no inter-treatment comparison). It covers the arithmetic mean and stdev, the harmonic mean and stdev, and the geometric mean and stdev.

- Within study estimation: grouped data: Here we provide an example to showcase the importance of adjusting for the unequal number of observations associated with the factors being compared. An example of a method for estimating the mean and stdev in the presence of study imbalance is provided for a simple situation of a parallel study design where data are assumed to be normally distributed [1,2].

- Between-study comparison: There are occasions when reported means and stdevs are combined to obtain a pooled estimate of the mean and stdev of a variable (e.g., AUC). This kind of analysis is frequently used by the generic drug working group within the Clinical and Laboratory Standards Institute (CLSI) Veterinary Antimicrobial Susceptibility Testing Subcommittee (VAST), VET01 document (Performance Standards for Antimicrobial Disk and Dilution Susceptibility Tests for Bacteria Isolated from Animals; Approved Standard), which can be found at the CLSI website, CLSI.org. When generating a cross-study pooling of data, it is important to consider differences in the number of observations included in each of the pooled study reports. Using a hypothetical example (intentionally different from situations encountered by the CLSI), we provide a method for generating an unbiased estimate of the means and stdevs in the presence of study imbalance. Of course, statistical considerations (as described earlier in this review) need to be evaluated prior to the inclusion of any dataset within a pooled analysis.
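The comparison summarized in Table 1 (limits formed on the original scale versus limits formed on the Ln scale and then exponentiated) can be sketched as follows. This is only an illustration: the `values` list is hypothetical right-skewed data, not the data set behind Table 1, and the helper name `k_sigma_limits` is introduced here for convenience.

```python
import math
import statistics

# Hypothetical right-skewed, positive values (NOT the data behind Table 1).
values = [3.0, 4.5, 5.0, 6.5, 8.0, 9.5, 12.0, 15.0, 21.0, 34.0]
ln_values = [math.log(v) for v in values]


def k_sigma_limits(data, k):
    """Return (mean - k*stdev, mean + k*stdev) using the n - 1 stdev."""
    centre = statistics.mean(data)
    spread = statistics.stdev(data)
    return centre - k * spread, centre + k * spread


arithmetic_mean = statistics.mean(values)
# Geometric mean: exponentiated arithmetic mean of the Ln values.
geometric_mean = math.exp(statistics.mean(ln_values))

raw_limits = {}
back_limits = {}
for k in (1, 2, 3):
    raw_limits[k] = k_sigma_limits(values, k)
    # Limits are formed on the Ln scale first, then exponentiated.
    ln_low, ln_high = k_sigma_limits(ln_values, k)
    back_limits[k] = (math.exp(ln_low), math.exp(ln_high))

print(f"arithmetic mean {arithmetic_mean:.2f}, geometric mean {geometric_mean:.2f}")
for k in (1, 2, 3):
    print(k, [round(x, 1) for x in raw_limits[k]], [round(x, 1) for x in back_limits[k]])
```

With skewed data such as these, the untransformed lower limits can fall below zero, whereas the back-transformed lower limits cannot, and the back-transformed limits are asymmetric about the geometric mean.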
It is important to recognize that there are many experimental and statistical considerations to be considered during study protocol development, data analysis, and study interpretation. Among these are estimator bias, independence of observations, distribution of the residuals, and homogeneity of variances. This information is discussed in great detail in basic statistics textbooks [1,2] and therefore will not be covered in this review.

2. Estimation of Means and Variances

2.1. Illustrating the Problem

Table 2 provides three types of means (averages) and stdevs calculated from one set of values. The issue here is not which is the most appropriate mean to use but rather to illustrate that a single set of values can have different means, depending upon the assumptions associated with the nature of the distribution from which the data originate. In this example, in addition to the use of untransformed data (i.e., estimation of the arithmetic mean shown in Table 1), the individual values were either transformed to their natural log (Ln) values (resulting in the calculation of a geometric mean, shown in Table 1 as the back-transformed value of the mean of "Ln number") or expressed as reciprocals of their original values (resulting in the calculation of a harmonic mean). For example, the value of the first "Number" is 11. When Ln-transformed, the value is 2.4, and when expressed as the reciprocal, the value is 1/11 (approximately 0.09). Within each column, the summary statistics were evaluated based upon the equations provided in this review. The mean estimate, based upon whether the data values in the column are presented on the Ln scale or as a reciprocal and back-transformed from the mean of the values in the column to the original scale, is provided in the corresponding column. That is, the geometric mean is the value obtained, as shown in Equation (11), by exponentiating the arithmetic mean of the Ln-transformed values (2.34, not shown). The harmonic mean is the value obtained by taking the reciprocal of the arithmetic mean of the reciprocal values (0.11, not shown).

Table 2. Example of differing summary values depending upon the underlying assumptions and estimation procedures for obtaining the mean.

Columns: Number; Ln number; 1/Number
Type of estimate: Arithmetic; Geometric; Harmonic
Rows: Mean; Stdev; %CV

The corresponding geometric and harmonic stdevs were calculated in accordance with the methods described later in this manuscript, justified by use of the Norris expansion method [3]. The percent coefficient of variation (%CV) (where %CV = 100 × stdev/mean), based upon the geometric and harmonic stdevs and the corresponding means, is justified by the method we used for calculating the variability estimates. Please note that, in contrast with the need to determine the mean ±1, ±2, or ±3 stdev within the scale of the transformation and then report them as their back-transformed values, the %CV is estimated using the means (Equations (1), (3), or (5)) and the formula for the associated stdev (Equations (2), (4), or (12)) described in the following sections (e.g., geometric stdev/geometric mean, harmonic stdev/harmonic mean).

Table 2 showcases the point that an estimate of the mean, stdev and %CV is dependent upon the assumptions made about the characteristics of the data distribution. Accordingly, in response to confusion that has sometimes been expressed, what appear to be inconsistencies among study reports may in fact be a reflection of how these basic statistical values have been calculated. It is also interesting to note that the arithmetic mean > geometric mean > harmonic mean (the inequalities are strict whenever the positive values are not all identical). A discussion of the mathematical underpinnings for this outcome is provided in Appendix B.

Identifying when each of the kinds of means should be applied is a first step in this analysis. Indeed, these same points could be important to consider when designing the study, as the choice can influence the number of subjects that need to be included. We leave the issue of study design to the reader, to be matched with the specific objective of the investigation.
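As a minimal sketch of the point made by Table 2, the three means can be computed from one set of positive values. The `numbers` list below is hypothetical (the entries of Table 2 are not reproduced), and the function names are chosen here for illustration.

```python
import math
import statistics


def arithmetic_mean(xs):
    """Sum of the values divided by their count."""
    return sum(xs) / len(xs)


def geometric_mean(xs):
    """Exponentiated arithmetic mean of the Ln-transformed values."""
    return math.exp(sum(math.log(x) for x in xs) / len(xs))


def harmonic_mean(xs):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    return len(xs) / sum(1.0 / x for x in xs)


# Hypothetical positive observations, for illustration only.
numbers = [11.0, 4.0, 7.5, 19.0, 9.0, 25.0, 6.0, 13.0]

am = arithmetic_mean(numbers)
gm = geometric_mean(numbers)
hm = harmonic_mean(numbers)
print(f"arithmetic {am:.2f} > geometric {gm:.2f} > harmonic {hm:.2f}")
```

The ordering holds for any positive values that are not all equal; when every value is identical, the three means coincide.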
Within the framework of this review article, a summary of considerations with regard to matching the type of mean with the distribution and study considerations is provided in Table 3.

2.2. Within Study Estimation: Ungrouped Data

Arithmetic Mean and Standard Deviation

The arithmetic mean reflects the sum of the individual values divided by the number of values in the sum. In mathematical notation, we show the summing function using a capital sigma (Σ), and we indicate that we are summing over all n observed values, x_i, by using the index, i, to indicate each integer from 1 to n, in turn.

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad (1)$$

Table 3. Matching the selection of the mean to the nature of the distribution and the question being addressed.

Type of mean and standard deviation: Arithmetic
  Nature of distribution (examples): Normal (i.e., additive).
  Considerations for its application: When the estimate of interest is based upon the sum of the individual observations and the data are best represented as a normal distribution.

Type of mean and standard deviation: Harmonic
  Nature of distribution (examples): Reciprocal transformation of positive real values.
  Considerations for its application: In general, the harmonic mean (HM) is useful for expressing average rates (e.g., miles per hour; widgets per day). In clinical pharmacology, an elimination rate constant, λz, is estimated for each subject based on his/her systemic drug concentration during the depletion portion of the Ln-concentration vs. time profile. The corresponding estimate of the elimination half-life (T1/2) is then derived on the basis of the elimination rate constant (see Appendix C for more details on calculating the terminal elimination rate, where we show T1/2 = Ln 2/λz or 0.693/λz). Because of the relationship of T1/2 to the reciprocal value of λz (the latter being a rate constant), harmonic means may be used when describing the average time to reduce the systemic drug concentrations by 1/2. It should be noted that if the mean T1/2 was generated by obtaining the arithmetic average of the T1/2 values, then that estimate should be referred to as an arithmetic mean and not as a HM. The mean T1/2 represents a HM only when the actual averaging was done on the basis of the λz estimates. It is only in that case (i.e., when the mean T1/2 was derived by transforming the mean of the λz values, as 0.693/(arithmetic mean λz)) that we have the HM T1/2.

Type of mean and standard deviation: Geometric
  Nature of distribution (examples): Log transformation of positive real values.
  Considerations for its application: Within the realm of pharmacokinetics, geometric means are typically used when describing means of variables such as the area under the curve (AUC) and maximum concentrations (Cmax). These variables are often transformed to the natural logarithm (Ln) prior to analysis and the geometric mean computed using the back-transformation shown in Equation (11). This type of assessment is also important for estimating the average performance of an investment where interest rates are compounded over time or when averaging changes in bacterial growth rates.

Type of mean and standard deviation: Least square means
  Nature of distribution (examples): Should be consistent with the distribution characteristics of the data collected and the model used to address the study assumptions of the investigation.
  Considerations for its application: The use of least square means is important when there are an unequal number of observations associated with any of the terms in the statistical model.

The stdev is the square root of the quantity obtained by dividing the sum of the squared differences between each observation and the arithmetic mean by n − 1 [1].

$$\text{Stdev} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}} \qquad (2)$$

One question that arises is why we divide by n − 1 rather than n. Assuming that we have a random selection of observations sampled from the population of interest, dividing by n − 1 versus n is necessary to obtain an unbiased estimation of the population variance. This is because the sample variance is itself a random variable (i.e., if the variances were estimated for several sets of samples from the same distribution, we would obtain a range of values). If we were then to take the average of all possible sets of sample variances, the average value should equal the true population variance. Using n − 1 is said to account for the fact that one estimate (the sample mean) was used in the calculation of the sample variance. That is also why, when one has measurements on the entire population of interest (e.g., the average height of all students in the fifth grade at a particular school in 2016), the population parameter values are obtained for the mean and variance, and the denominator for the variance is no longer n − 1 but n. In other words, we are no longer calculating sample statistics as population estimates but rather the population values themselves. Since small samples tend to underestimate the variance (stdev), a less biased estimate is achieved by dividing the value by (n − 1) rather than n.
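The statement that the average of all possible sample variances equals the population variance when dividing by n − 1 can be checked by brute force on a tiny example. The sketch below assumes a hypothetical five-value population and enumerates every ordered sample of size 3 drawn with replacement (i.e., independent random draws); the names `population` and `variance` are illustrative.

```python
import math
from itertools import product

# Hypothetical population, small enough that every sample can be listed.
population = [2.0, 4.0, 4.0, 7.0, 13.0]


def variance(xs, ddof):
    """Sum of squared deviations from the mean divided by (n - ddof)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - ddof)


# True population variance: the denominator is n because nothing is estimated.
population_variance = variance(population, ddof=0)

# Every ordered sample of size 3 drawn with replacement (5**3 = 125 samples).
samples = list(product(population, repeat=3))

mean_var_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)
mean_var_n = sum(variance(s, ddof=0) for s in samples) / len(samples)

print(f"population variance:        {population_variance:.4f}")
print(f"average of n - 1 variances: {mean_var_n_minus_1:.4f}")
print(f"average of n variances:     {mean_var_n:.4f}")
```

Averaged over all samples, the n − 1 version reproduces the population variance, while the n version falls short by the factor (n − 1)/n.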

Harmonic Mean and Standard Deviation

Harmonic Mean

The harmonic mean (HM) is the reciprocal of the arithmetic mean of the reciprocals of the observed values [4]. The reciprocal transformation applies to positive values only.

$$HM = \frac{1}{\left(\dfrac{\sum_{i=1}^{n} \frac{1}{x_i}}{n}\right)} = \frac{n}{\left(\dfrac{1}{x_1} + \dfrac{1}{x_2} + \cdots + \dfrac{1}{x_n}\right)} \qquad (3)$$

Table 3 indicates that the concept of the harmonic mean is useful in the evaluation of half-lives. Confusion has occasionally been expressed over the observation that although the arithmetic mean of the individual T1/2 values is always greater than the HM of the individual T1/2 values, there is no difference in the estimate of the T1/2 mean when estimated as 0.693/(arithmetic mean λz) or as the HM of the T1/2 values. The reason for this relationship is detailed in Appendix C.

Estimation of the Harmonic Standard Deviation

The formula for the stdev associated with the HM is written as follows [3]:

$$\text{Harmonic Stdev} = HM^{2} \sqrt{\frac{\sum_{i=1}^{n} \left(\dfrac{1}{x_i} - \dfrac{\sum_{i=1}^{n} \frac{1}{x_i}}{n}\right)^{2}}{n - 1}} \qquad (4)$$

where x_i is the original observation and n is the number of observations. Note again that when the stdev for T1/2 is computed on the basis of the reciprocal λz values, to return to an estimate of the stdev associated with T1/2, it is necessary to divide by the constant 0.693. Indeed, it is this estimate of the stdev which should be used when describing the variation of individual values about a HM. The use of stdev values estimated on the basis of the arithmetic T1/2 is appropriate only when used to describe variability about an arithmetic mean. Such values should not be reported when T1/2 is summarized as the HM. The estimation of the stdev associated with the HM, and the corresponding Excel codes, are provided in Table 4.

Table 4. Comparison of arithmetic vs. harmonic standard deviation (Stdev) values and Excel codes for T1/2.

Data Row                   Column C: T1/2    Column D: λz    Column E: T1/2 squared deviations    Column F: λz squared deviations
Rows 2 to 11 (row r)                                         (Cr − C12)^2                         (Dr − D12)^2
Row 12: Arithmetic mean
Row 13: Harmonic mean T1/2

Arithmetic Stdev: 5.70; Excel code: sqrt[(sum(E2:E11))/9]
Harmonic Stdev: 5.06; Excel code: [D13^2 × sqrt[(sum(F2:F11))/9]]/0.693
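The two routes described above (working from the reciprocals of T1/2 as in Equation (4), or from the λz values and dividing by 0.693 as in the Table 4 Excel code) can be compared numerically. In this sketch the half-lives are hypothetical, `LN2` stands in for the constant written as 0.693, and the helper names are introduced only for illustration.

```python
import math

LN2 = math.log(2)  # the constant written as 0.693 in the text

# Hypothetical half-lives (hours); lambda_z follows from T1/2 = Ln 2 / lambda_z.
half_lives = [4.2, 6.8, 9.5, 12.1, 7.7, 21.4, 5.3, 15.0]
lambda_z = [LN2 / t for t in half_lives]


def mean(xs):
    return sum(xs) / len(xs)


def sample_stdev(xs):
    """Equation (2): stdev with the n - 1 denominator."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))


reciprocals = [1.0 / t for t in half_lives]

arithmetic_t_half = mean(half_lives)
harmonic_t_half = 1.0 / mean(reciprocals)          # Equation (3)
t_half_from_mean_lambda = LN2 / mean(lambda_z)     # 0.693 / (arithmetic mean lambda_z)

# Equation (4): HM squared times the stdev of the reciprocals.
harmonic_stdev = harmonic_t_half ** 2 * sample_stdev(reciprocals)
# Table 4 route: HM squared times the stdev of lambda_z, divided by 0.693.
harmonic_stdev_via_lambda = harmonic_t_half ** 2 * sample_stdev(lambda_z) / LN2

print(f"arithmetic mean T1/2 {arithmetic_t_half:.2f}, harmonic mean T1/2 {harmonic_t_half:.2f}")
print(f"harmonic stdev {harmonic_stdev:.2f} (via lambda_z: {harmonic_stdev_via_lambda:.2f})")
```

Both routes give the same harmonic mean and the same harmonic stdev, because each λz is simply the reciprocal of T1/2 scaled by the constant.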

Geometric Means and Standard Deviations

Estimation of the Geometric Mean

Geometric means are typically reported when describing data that have been Ln-transformed prior to analysis (e.g., AUC or Cmax). The geometric mean is also used for the estimation of means when values are multiplied rather than added together. Both of these situations are described below.

Multiplicative Relationships

The geometric mean is calculated as the nth root of the product (denoted by the symbol Π) of n positive observations.

$$G = \sqrt[n]{\prod_{i=1}^{n} x_i} = (x_1 x_2 \cdots x_n)^{1/n} \qquad (5)$$

To illustrate the error that would have occurred if the arithmetic (additive) rather than the geometric (multiplicative) mean was estimated when dealing with multiplicative relationships, we can consider the growth rate values provided in Table 5 (includes Excel code). Here we see that by calculating the arithmetic rather than the geometric mean, we would have markedly over-estimated the proportion of the initial amount remaining (89.2% versus 76.9%). Examples where this kind of calculation may be applicable include bacterial growth (where we are dealing with a constant time for the change measurement) or a financial situation where we are concerned with changes as a function of month or year.

Table 5. Comparison of estimating average change when expressed as an arithmetic versus as a geometric mean when there are compounded (multiplicative) changes.

Columns: Data Row; Interest (percent); Column D: relative change
Rows 2 to 6: one row per period
Summary rows: Arithmetic mean; Geometric mean (Excel code: (D2 × D3 × D4 × D5 × D6)^(1/5)); Proportion remaining

A slightly different set of considerations would be needed if the time associated with the rates of change is not held constant. For example, consider the situation where the geometric bacterial growth rate is being traced in response to changing ecological conditions.
The numbers of bacteria after a period of change can be described as follows:

$$Y = X \, \Delta^{\tau/\lambda} \qquad (6)$$

where
X = starting bacterial number
Δ = kind of change (where Δ = 2 for doubling; Δ = 0.5 for halving)
λ = time associated with a doubling or halving of the bacterial number
τ = duration of time
Y = resulting bacterial number

In this hypothetical example, we are starting with 100 isolates. These bacteria have an initial doubling time of 10 min for a duration of 20 min, followed by a reduction in numbers (e.g., due to the presence of some stressors) where there is a halving of numbers every 35 min for a duration of 120 min, followed by a resumption of growth at the original growth rate for 60 min. Sequentially using the equation above, the estimated number of bacteria at the end of 200 min is found to be approximately 2378:

$$Y_1 = 100 \times 2^{20/10} = 400 \qquad (7)$$

$$Y_2 = 400 \times 0.5^{120/35} \approx 37.15 \qquad (8)$$

$$Y_3 = 37.15 \times 2^{60/10} \approx 2378 \qquad (9)$$

The results are also depicted in Table 6 (values) and Table 7 (Excel spreadsheet code):

Table 6. Estimation of bacterial growth.

Doubling time or halving time (λ)    Time duration (τ)    Value of X    Relative change (Δ^(τ/λ))    Resulting # bacteria (Y)
10                                   20                   100           4                            400
35                                   120                  400           ≈0.093                       ≈37.15
10                                   60                   ≈37.15        64                           ≈2378

Table 7. Excel spreadsheet code for estimating bacterial growth rate.

Data Row    Column A: Doubling time or halving time    Column B: Time duration    Column C: Value of X    Column D: Relative change    Column E: Resulting # bacteria (Y)
Row 1                                                                                                     2^(B1/A1)                    C1 × D1
Row 2                                                                                                     0.5^(B2/A2)                  C2 × D2
Row 3                                                                                                     2^(B3/A3)                    C3 × D3

What was the average rate of change that occurred over this 200 min duration? Can the formula for computing the geometric mean be applied to these three relative changes to obtain a meaningful geometric mean? The answer to that is NO, and here is why. As expressed in this example, the three relative changes are not comparable expressions of the rate of change. For the computation of the geometric mean to be meaningful, the relative changes must all be expressed for the same time interval. In the financial example, the rate of increase was given in annual increments. In the bacterial growth example, the doubling time was expressed as a 10-min interval while the halving time is expressed as a 35-min interval. Thus, one of the rates needs to be converted to the time interval of the other. As an illustration, the rate of decrease during a 10-min time interval is determined by taking τ to be 10. Therefore, the relative decrease in a 10-min interval is:

$$\Delta^{\tau/\lambda} = (0.5)^{10/35} = 0.820 \qquad (10)$$

Following the format of the financial example above, there are two 10-min intervals with a relative change of 2, twelve 10-min intervals with a relative change of 0.820, followed by six more 10-min intervals with a relative change of 2.
The geometric mean relative change per 10-min interval is (2^2 × 0.820^12 × 2^6)^(1/20) ≈ 1.17. Table 8 shows the calculation of the expected number of bacteria at the end of each 10-min interval and the calculation of the geometric mean of the 10-min interval rates. In addition, the table shows that replacing each 10-min interval rate with the geometric mean of the 10-min interval rates results in the same calculation of the number of bacteria at the end of 200 min. Note that if one wanted to estimate the mean number of colony forming units (CFUs) over the 200 min duration (i.e., mean CFU/min), this would be estimated by integrating the CFU vs. time profile (area under the curve, AUC), with the total area divided by 200 min (Figure 2). Concepts of AUC are familiar to scientists working with pharmacokinetic data. Since the AUC is ~57982, the mean CFU/min is ~290.
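The bacterial growth calculation can be reproduced in a few lines. The sketch applies Equation (6) phase by phase, converts every phase to 10-min rates of change, and checks that compounding the geometric mean of those rates over the twenty intervals returns the same final count. The variable and function names are illustrative; the inputs (100 isolates, the doubling and halving times, and the durations) are those stated in the text.

```python
import math


def grow(x, change, lam, tau):
    """Equation (6): Y = X * change ** (tau / lam)."""
    return x * change ** (tau / lam)


start = 100.0
# (kind of change, doubling or halving time in min, duration in min)
phases = [(2.0, 10.0, 20.0), (0.5, 35.0, 120.0), (2.0, 10.0, 60.0)]

# Sequential application of Equation (6), as in Equations (7) to (9).
counts = []
y = start
for change, lam, tau in phases:
    y = grow(y, change, lam, tau)
    counts.append(y)

# Express every phase as a relative change per 10-min interval.
interval = 10.0
rates = []
for change, lam, tau in phases:
    rates.extend([change ** (interval / lam)] * round(tau / interval))

# Geometric mean of the twenty 10-min rates, Equation (5).
gm_rate = math.prod(rates) ** (1.0 / len(rates))
final_from_gm = start * gm_rate ** len(rates)

print([round(c, 2) for c in counts])
print(f"10-min halving-phase rate: {rates[2]:.3f}")
print(f"geometric mean 10-min rate: {gm_rate:.4f}")
print(f"final count from geometric mean: {final_from_gm:.1f}")
```

Taking the geometric mean of the three unconverted relative changes instead would mix 20-, 120- and 60-min changes, which is why the rates are first placed on a common 10-min interval.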

Table 8. Calculating the number of bacteria at 10-min intervals. GM: geometric mean.

Columns: Row; Minutes; 10-min rate of change; Number of bacteria (Y) based on the 10-min change; Excel code estimating Y [from Equation (6), where τ = 10 and λ = 10]; Number of bacteria based on the geometric mean; Excel code for X × GM^(τ/λ).
Excel code pattern: in each row, the number of bacteria in the preceding row (D2, D3, ...) is multiplied by the current 10-min rate of change in Column C; in the geometric mean column, the preceding value (F3, F4, ...) is multiplied by the fixed cell holding the GM ($C$26).
GM: Product(C3:C24)^(1/20)

Figure 2. AUC for estimating average CFU over time.

Ln-Transformed Data

This discussion of geometric means from the example of rates, as shown in the bacterial example, segues into the estimation of means when using Ln-transformed parameters such as AUC and Cmax. Ln-transformation of these pharmacokinetic parameter values is based upon an assumption that they are better described in terms of a log-normal rather than a normal distribution. Because Ln(a) + Ln(b) = Ln(ab), the process of adding Ln-transformed values is in fact a multiplicative computation. Furthermore, because Ln[(ab)^(1/n)] = (1/n)Ln(ab) and exp{Ln[(ab)^(1/n)]} = (ab)^(1/n), the exponentiation of the averaged Ln-transformed values reverses the estimation procedure from one of addition (Ln-transformed values) to multiplication when expressed on the original scale.

In mathematical symbols, the geometric mean from Equation (5) may be obtained by:

$$\exp\left(\frac{\sum_{i=1}^{n} \ln(x_i)}{n}\right) = (x_1 x_2 \cdots x_n)^{1/n} \qquad (11)$$

In the situation where the original data are log-normal and log transformed, it is convenient and appropriate to obtain a value for the geometric mean by exponentiating the arithmetic average of the log-transformed values. Using the values provided in Table 2, let us this time assume that rather than T1/2, these reflect Cmax values (Table 9). The difference between the arithmetic mean of the untransformed data versus the geometric mean (based upon the Ln-transformed dataset) is provided below. Again, the arithmetic mean of the Ln-transformed data is exponentiated to obtain the geometric mean (i.e., 10.4 = exp(2.34)).
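Equation (11) can be checked numerically, together with the stdev that the Table 10 Excel code pairs with the geometric mean (the geometric mean multiplied by the n − 1 stdev of the Ln values, i.e., D13 × sqrt(D11/9)). The Cmax values below are hypothetical and are not the entries of Table 9; the variable names are illustrative.

```python
import math

# Hypothetical Cmax values (NOT the entries of Table 9).
c_max = [11.0, 6.2, 9.8, 24.5, 7.1, 13.4, 8.0, 17.9, 5.5, 10.6]
n = len(c_max)

ln_c_max = [math.log(x) for x in c_max]
mean_ln = sum(ln_c_max) / n

# Equation (11): both routes to the geometric mean should agree.
gm_from_logs = math.exp(mean_ln)
gm_from_product = math.prod(c_max) ** (1.0 / n)

# Sum of squared deviations on the Ln scale.
sum_sq_dev = sum((v - mean_ln) ** 2 for v in ln_c_max)

# Geometric stdev as in the Table 10 code: geometric mean * sqrt(sum_sq_dev / (n - 1)).
geometric_stdev = gm_from_logs * math.sqrt(sum_sq_dev / (n - 1))
percent_cv = 100.0 * geometric_stdev / gm_from_logs

print(f"geometric mean: {gm_from_logs:.2f} (product route: {gm_from_product:.2f})")
print(f"geometric stdev: {geometric_stdev:.2f}, %CV: {percent_cv:.1f}")
```

Because the geometric stdev is the geometric mean times the stdev of the Ln values, the resulting %CV depends only on the spread of the Ln-transformed data.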

Table 9. Estimation of geometric means (based upon Ln-transformation of individual observations).
Columns: Data Row (Rows 2 to 11); Cmax; Ln Cmax (Column D); Ln Cmax (Column E).
Arithmetic mean: sum(E2:E11)/10
Geometric mean (exponentiation of the arithmetic mean of the Ln values): exp[sum(E2:E11)/10]

Estimation of the geometric standard deviation (for Ln-transformed values)

The following equation describes a method for obtaining an estimate [3] of the stdev when the geometric mean, G, is used:

$$G\sqrt{\frac{\sum_{i=1}^{n}\left[\ln(x_i) - \frac{\sum_{i=1}^{n}\ln(x_i)}{n}\right]^2}{n-1}}$$ (12)

where x_i are the individual observations, with i ranging from 1 to n [4]. The calculation of the stdev for the Cmax example from Table 9 is provided in tabular form (values and Excel equations) in Table 10.

Table 10. Values and spreadsheet code for estimating geometric stdevs.
Columns: Row; Column A (Number); Column B (Ln number); Column C (Geometric); Column D (sq'd dev).
Rows 1 to 10: (B1 - D12)^2, (B2 - D12)^2, ..., (B10 - D12)^2
Row 11, Sum: 2.34; sum(D1:D10)
Row 12, Average: sum(B1:B10)/10
Row 13, Geometric mean: exp(D12)
Row 14, Arithmetic Stdev: -
Row 14, Geometric Stdev: 5.29; D13 × sqrt(D11/9)

2.3. Least Square (Marginal) Means and Grouped Data

This discussion focuses on the estimation of means and stdevs when there is an unequal number of observations associated with the factors being compared [5,6]. While the precise calculation of

marginal means needs to reflect the study design and the statistical model being used, we provide a simple case of two sets of independent observations associated with the effects being compared. We are also assuming that the data are normally distributed for the purpose of this example.

Estimation of the Mean

There are circumstances when the study design necessitates blocking of experimental units by a specified variable (e.g., gender, age, sequence of treatment administrations) due to expected differences among those experimental units that are attributable to the specified variable [6]. Under conditions where there is an equal number of observations within each block, the sample statistics may be based upon the arithmetic means and variances. However, a more complicated situation arises when there is study imbalance such that some blocks of experimental units are under-represented relative to other blocks, e.g., more males than females. Least square means (LSmeans, also known as estimated population marginal means) provide an opportunity to obtain an unbiased estimate of averages in the face of this kind of study imbalance.

For example, Table 11 provides a hypothetical example of body weights measured in two groups of men and women. In one group, all observations were made in individuals who were enrolled in a daily exercise routine. The other group comprised individuals who did not engage in any regular exercise program. Due to differences in the proportion of men and women enrolled in these two groups (a parallel study), males had a greater total influence on the arithmetic mean generated in the exercise group (more males than females), while females had a greater influence on the arithmetic mean calculated in the no exercise group (more females than males). For this reason, when looking at the arithmetic means (with and without exercise), the results suggest that body weights were actually lower in the no exercise group as compared to that associated with the people who exercised.
However, when averaging the means within each cell (i.e., the marginal means determined, for example, by taking the average of males in the exercise group and the average of females in the exercise group) and then basing our LSmean estimates for the exercise group and the no exercise group on the average of the corresponding marginal means, we now see that, on average, body weight was slightly lower in those individuals who exercised than in those who did not. That is, let $\bar{X}_1$ be the marginal mean for males in the no exercise group, let $\bar{X}_2$ be the marginal mean for females in the no exercise group, and similarly, let $\bar{X}_3$ be the marginal mean for males in the exercise group and $\bar{X}_4$ be the marginal mean for females in the exercise group. Then, the LSmean for the no exercise group, T0, is:

$$\bar{X}_{T0} = \frac{\bar{X}_1 + \bar{X}_2}{2}$$ (13)

Table 11. Arithmetic means versus LSmeans for the effect of exercise on body weights.
Columns: No exercise group (Male, Female); Exercise group (Male, Female).
Rows: Marginal means; Arithmetic mean; LSmean.
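The contrast between the arithmetic mean and the LSmean of Equation (13) can be sketched in Python. This is a minimal illustration, assuming only the no exercise marginal means quoted in the text (158 lbs for 10 females and 202 lbs for 5 males); the function names are hypothetical, and the simple average of cell means applies to this two-block case only, not to general ANOVA models.

```python
def arithmetic_mean(cells):
    """Mean over all individuals: each cell mean is weighted by its sample size."""
    total_n = sum(n for n, _ in cells)
    return sum(n * mean for n, mean in cells) / total_n


def ls_mean(cells):
    """LSmean for the simple two-block case: unweighted average of the cell means."""
    return sum(mean for _, mean in cells) / len(cells)


# (sample size, marginal mean in lbs) for the no exercise group, as quoted in the text.
no_exercise = [
    (5, 202.0),   # males
    (10, 158.0),  # females
]

arith = arithmetic_mean(no_exercise)
ls = ls_mean(no_exercise)

print(f"arithmetic mean = {arith:.1f} lbs")  # pulled toward the larger female block
print(f"LSmean          = {ls:.1f} lbs")     # each gender block counts equally
```

Because the females outnumber the males in this group, the arithmetic mean sits closer to the female marginal mean, whereas the LSmean gives each gender block the same influence.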

Estimation of the Stdev about the LSmean

Inherent in the use of LSmeans [6,7] is an assumption that there is some difference between the groups comprising the two treatments being compared (in our example, the difference between groups is reflected by the separation of participants by gender). Under that assumption, we need to consider whether the mean and stdevs reflect a normal distribution or some other type of distribution. With respect to the example above, we see that men and women clearly have unequal body weights, irrespective of whether or not they were on the exercise program. Thus, the means and, possibly, the variances will likewise be dissimilar. It is for this reason that the LSmeans for treatment effects, which in our example corrected for the imbalance of gender blocks within treatment, provide a different result than did the arithmetic treatment mean. For this reason, estimating the variability about the LSmean is far more complex than estimating that associated with harmonic, geometric, or arithmetic means. This is because the estimation of variability for the LSmean needs to consider both the variability within a block and the variability between blocks. Therefore, these estimates are typically generated by using an analysis of variance (ANOVA) program.

When running an ANOVA, the value that is reported with the LSmeans is the standard error (SE). Therefore, the question is "How does a statistical program estimate the LSmeans and the corresponding error about the mean estimate?" To begin, let us consider the equation for calculating the SE of an arithmetic mean. When dealing with a simple random sample taken from the population of interest, the SE of the mean is the square root of the variance divided by the sample size (which is equal to the stdev divided by the square root of the sample size).

$$SE_{mean} = \sqrt{\frac{variance}{n}} = \frac{Stdev}{\sqrt{n}}$$ (14)

In this case, if one is given the SE of the mean, simple multiplication of that value by the square root of the sample size yields the stdev estimate for the underlying population values.
In other words, if we repeatedly sampled and obtained means derived from the same population, the deviation of the estimates of the average of those means (i.e., the stdev about the mean of the means) can be described by the SE of the mean, and the variability among the estimates of the means will be less than the variability associated with the individual observations.

For example, the LSmean for the no exercise treatment (T0) values is the average of the marginal mean of 10 females (158 lbs) and the marginal mean of 5 males (202 lbs). This results in an LSmean of 180 lbs, with a computer-generated SE for this LSmean of 3.21 lbs. Upon multiplying 3.21 lbs by the square root of 15 (the total number of observations in the no exercise group, 5 males + 10 females), we estimate the stdev of the no exercise group to be 12.44 lbs. Based on this information, we would imagine the underlying population that generated this data summary to look as follows (Figure 3). However, given that we know there are differences between men and women, we need to ask whether such a variability estimate provides an accurate representation of the true population. Before reporting these error estimates, we need to ask how to provide a close approximation of the underlying population(s) from which we obtained our samples. In fact, it is with that question in mind that an ANOVA was performed with a model that accounted for both gender and treatment effects.

Revisiting the same dataset, we see that the distribution of weights in females who did not exercise is about normal (157.8, 11.73), as shown in Figure 4. Similarly, the distribution of weights in males who did not exercise is about normal (202.4, 11.72), as shown in Figure 5. In requesting an estimate of the least squares mean for treatment in the no exercise group, we are essentially requesting an estimate of the mean that would have been obtained if we had sampled equally from the male population of non-exercisers and from the female population of non-exercisers. Figure 6 provides the distribution of results when we have an equal probability of drawing a value from the male distribution and from the female distribution. Note that the population is no longer

represented as a single normal distribution but rather as a bimodal equal mixture of the male and female distributions.

Figure 3. The distribution of weight values for the no exercise group predicted using the LSmean and a stdev estimate based on the standard error (SE) of the LSmean when assuming that the sample was generated from a single normal population.

Figure 4. Distribution of female weights based upon the mean and variance of the sample values of body weights of females in the no exercise group. The X-axis represents body weight (lbs) and the Y-axis represents the fraction of the population of females at that body weight based upon the information contained within the subset of individuals included in this hypothetical study. The possible values of the Y-axis range from a value of zero (no individuals expected at that body weight) to 1 (all individuals in the population will have the identical body weight).

Figure 5. Distribution of male weights based upon the mean and variance of the sample values of body weights of males in the no exercise group.

Figure 6. Distribution of the population of male and female body weights based upon the mean and variance of the sample values from the no-exercise group, assuming an equal likelihood of sampling from either gender.

As evidenced by this figure, and as reflected in the corresponding estimation of the stdev, this bimodal distribution has a higher estimate of the stdev of T0 (25.10) as compared to that reflected by the previous assumption that the mean was associated with a single normally distributed population (stdev of T0 = 12.44). This underscores the importance of considering the underlying assumption that is the foundation for the data analysis.

Within the context of this example, having generated a statistical analysis using a model containing a term for gender effects and having discovered that gender is a statistically significant term, one needs to determine the information that should be conveyed about the LSmeans when combining the two genders and the stdev for the underlying joint population. In light of the identified gender effects, the question is whether it was more appropriate to compare treatments (with or without exercise) using data that are pooled across sexes or whether it would have been preferable to separate the assessment into one generated within a gender. The answer to that question depends upon the study objective which, in turn, should be reflected in the study protocol.

How does this relate to our typical pharmacokinetic or bioequivalence studies? This kind of question may have relevance when there is reason to believe that there may be gender-by-formulation interactions. Unless there are data to the contrary, however, we typically assume that the two subpopulations do not differ with respect to their overall mean values and variances. Another example of potential relevance is imbedded within the design of crossover trials, where there is the assumption of no statistically significant sequence effects (i.e., that the relationship between the rate and extent of absorption for the two treatments does not differ as a result of the order in which they are administered). Accordingly, even in the face of an imbalance in subjects nested within sequence, we allow for the use of the ratio of LSmeans and the corresponding SEs of the estimate when calculating the 90% confidence intervals for AUC and Cmax. Nevertheless, this exercise underscores why, from a statistical perspective, the calculation of the 90% confidence interval could be biased in the face of statistically significant sequence effects.

For each main effect LSmean, Table 12 provides a comparison of the corresponding stdev estimated by using the within/between calculation vs. that estimated by simulating the population as if there were an equal probability of drawing from the two groups comprising the LSmeans. On the outside chance that the stdev for the bimodal situation is desired, it can be approximated by using the within/between type of calculation used for the meta-analysis example described in the next section (estimating variance when utilizing data derived across several studies).
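The two stdev estimates for the no exercise group can be approximated with a short Python sketch. It assumes the values quoted in the text (SE = 3.21 lbs with n = 15; females about normal with mean 157.8 and stdev 11.73; males about normal with mean 202.4 and stdev 11.72). The function names are hypothetical, and the mixture stdev is obtained analytically from the law of total variance rather than by the simulation the authors describe, so it should be close to, but need not equal, the simulated value of 25.10.

```python
import math


def stdev_from_se(se, n):
    """Invert Equation (14): stdev = SE * sqrt(n), assuming one simple random sample."""
    return se * math.sqrt(n)


def equal_mixture_stdev(mean_a, sd_a, mean_b, sd_b):
    """Stdev of a 50:50 mixture of two distributions (law of total variance)."""
    within = 0.5 * (sd_a ** 2 + sd_b ** 2)    # average within-group variance
    between = 0.25 * (mean_a - mean_b) ** 2   # variance of the two group means
    return math.sqrt(within + between)


# Values quoted in the text for the no exercise group (T0).
single_normal_sd = stdev_from_se(3.21, 15)
bimodal_sd = equal_mixture_stdev(157.8, 11.73, 202.4, 11.72)

print(f"stdev assuming a single normal population = {single_normal_sd:.2f} lbs")
print(f"stdev of the equal male/female mixture    = {bimodal_sd:.2f} lbs")
```

The separation between the male and female means is what roughly doubles the stdev once the population is treated as a mixture rather than as a single normal distribution.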

Table 12. Comparison of stdev based upon an assumption of a single versus bimodal dataset comprising the treatment effect.
Columns: LSmean; Description; Within/between calculated Stdev; Simulated Stdev (equal probability of sampling from each group).
No exercise (T0): Average of T0 males and T0 females
Exercise (T1): Average of T1 males and T1 females
Males: Average of T0 males and T1 males
Females: Average of T0 females and T1 females

Cross Study Comparisons

Research based on synthesizing information from a number of independently conducted biomedical studies has been occurring since at least 1904, when Karl Pearson compared infection rates among inoculated soldiers and soldiers who had not been inoculated [8]. In the 1930s, as related in Hedges and Olkin [9], publications on how to combine data from agricultural experiments were written by groundbreakers in the field of statistics: L.H.C. Tippett, R.A. Fisher, Karl Pearson, and Yates and Cochran. From there, the synthesis method (which came to be known as meta-analysis) expanded into the social sciences and has experienced a burgeoning popularity in the areas of clinical medicine and epidemiology. Accompanying the increase in its application, there was a concomitant growth in the number of papers that provided complex methodologies for combining information from various studies.

When pooling information across studies to estimate an overall average and variance, consideration needs to be given to the number of observations in each of the respective studies, the within-study variance associated with each investigation, and the cross-investigational variance. In situations where this numerical solution would be employed, we may have access only to published sample statistics such as the sample size, mean and stdev from each of the studies evaluated. We may not have access to the underlying datasets. Therefore, there is a need to ascertain an appropriate method by which to estimate means and variances when conducting a cross study evaluation based solely on available summary statistics.
To address this question, let us consider a scenario where there were four trainers, each one having a different number of runners under their tutelage. A separate study was conducted for each of these trainers to ascertain the average miles per hour (MPH) and stdev among the runners covering a given course. Their corresponding metrics are summarized below (Table 13):

Table 13. Example of a cross study assessment of mean and stdev. MPH: miles per hour.
Columns: Trainer; # Runners; Average MPH; Stdev; Mean × n_i; LSmean.
Final row: Sum.

If the goal is to estimate a global stdev and mean (i.e., to describe the MPH for all runners, irrespective of their trainer), the data generated from each of these studies would need to be combined. The synthesis of information derived from each of the four trainers is contingent upon an assumption that the samples were drawn from the same underlying population (i.e., that the runners for the four trainers were derived from the identical population of potential runners). In other words, we assume that differences in statistics among studies reflect only sampling variation.

It is only on the basis of that assumption that the cross study mean and stdev can be calculated as described below.

Estimating the Population Mean

The index for the study number (the trainer in this example) is i, going from 1 to I, the number of studies. I is four in this example. For study i, we have the sample size n_i (i.e., the number of subjects included in study i), a sample average $\bar{X}_i$, and a standard deviation stdev_i. In many cases, published manuscripts provide a biased estimate of the cross study mean and stdev because differences in sample sizes are not considered. It is not uncommon to find the average of the study averages, or cross-study mean, used to estimate the population mean. In so doing, each observation does not receive an equal weight (influence) on the total population average. Accordingly, as shown in Equation (15), that mean is the unequally weighted (uneqwtd) mean.

$$\bar{X}_{..uneqwtd} = \frac{\sum_{i=1}^{I}\bar{X}_i}{I}$$ (15)

In this example, it is important to recognize that what is being described is the average of the individual study (trainer) averages and not the average across all individual observations. When the population mean estimate is calculated in this way, the individual observations from different studies exert an unequal influence on the overall estimate. For example, if one study contains data on 4 subjects while another study includes 20 subjects, then simple averaging of the two studies would result in each subject in Study 1 contributing 1/8th (1/4 × 1/2) of the overall influence on the means and variances, while each subject in Study 2 would contribute only 1/40th (1/20 × 1/2) of the overall influence. We need to adjust the values so that each subject in each study contributes only 1/24th to the estimate of the population mean.

To weight each subject equally in the population mean estimate, the LSmean should be used. In this example, N = 35 (the runners).

$$\bar{X}_{..} = \frac{\sum_{i=1}^{I} n_i \bar{X}_i}{N}, \quad \text{where } N = \sum_{i=1}^{I} n_i$$ (16)

So in our trainer example, the overall LSmean would equal:

[(10 × 6.2) + (5 × 5.5) + (8 × 6.1) + (12 × 6.8)]/35 (17)

The calculations are provided in the two right hand columns of Table 13.

Estimating the Population Variance

Recognizing that the stdev is the square root of the variance, we will first consider estimating the global stdev since stdevs are given for each study. Each standard deviation, SD_i, estimates (the square root of) the variation within the study from which it was calculated. Just as with the sample averages, one might think of estimating the population within study stdev as the average of the sample stdevs, deriving an average within-study standard deviation (SD_WA) as follows.

$$SD_{WA} = \frac{\sum_{i=1}^{I} SD_i}{I}$$ (18)

Similarly, one might think to estimate the between-study standard deviation (SD_BA) as the stdev among the study means as follows:

$$SD_{BA} = \sqrt{\frac{\sum_{i=1}^{I}\left(\bar{X}_i - \bar{X}_{..}\right)^2}{I-1}}$$ (19)
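The difference between the unequally weighted mean of Equation (15) and the sample-size-weighted mean of Equations (16) and (17) can be reproduced with a few lines of Python. This sketch uses the runner counts and average MPH values that appear in Equation (17); the variable names are illustrative assumptions.

```python
# Runner counts and average MPH per trainer, taken from Equation (17).
runners = [10, 5, 8, 12]
avg_mph = [6.2, 5.5, 6.1, 6.8]

# Equation (15): average of the study averages (each trainer counts equally).
unweighted_mean = sum(avg_mph) / len(avg_mph)

# Equations (16) and (17): each runner counts equally.
total_runners = sum(runners)
weighted_mean = sum(n * m for n, m in zip(runners, avg_mph)) / total_runners

print(f"N                         = {total_runners}")
print(f"mean of trainer averages  = {unweighted_mean:.3f} MPH")
print(f"sample-size-weighted mean = {weighted_mean:.3f} MPH")
```

The weighted value is higher here because the trainer with the most runners also has the highest average speed.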

The concept of partitioning the total variance into both within study and between study variations is familiar to anyone who has studied analysis of variance. Clearly, using the square of just one of SD_WA or SD_BA will seriously underestimate the total population variance. The most straightforward way to think about calculating an estimate of the population variance from the estimates of the within and between study stdevs would be to square the stdevs and to add them.

$$TotalVar_{unwtd} = (SD_{BA})^2 + (SD_{WA})^2$$ (20)

Doing so, however, results in a sum which, much like the unweighted mean in Equation (15), does not appropriately weight the contribution of the within and between study components of variance to the total, nor the magnitude of the contribution of each study within the two components. Therefore, to calculate the estimate of the population variance with the correct weights, we follow the steps described in Appendix D. Accordingly, the within study sum of squares, SS_W, is the weighted sum of the individual study variance estimates.

$$SS_W = \sum_{i=1}^{I}(n_i - 1)\,SD_i^2$$ (21)

The between-study sum of squares, SS_B, is the weighted sum of the squared deviations of the individual study means from the LSmean.

$$SS_B = \sum_{i=1}^{I} n_i\left(\bar{X}_i - \bar{X}_{..}\right)^2$$ (22)

Then the estimate of the population variance is the total sum of squares divided by the total sample size minus one.

$$Variance_{WB} = \frac{SS_W + SS_B}{N-1} = \frac{SS_T}{N-1}$$ (23)

The weighted total stdev is the square root of the weighted variance estimate (Table 14), and the corresponding Excel codes are provided in Table 15. A discussion of the numerical underpinnings of this approach is provided in Appendix D.

Table 14. Total weighted stdev across the four trainers.
Columns: Data Row (Rows 2 to 6); Trainer (Col B); # Runners (Col C); Average MPH (Col D); Stdev (Col E); Mean × n_i (Col F); LSmean (Col G); SS_W (Col H); SS_B (Col I); Variance_WB (Col J); Stdev_WB (Col K).

Table 15. Excel cell formulas.
Data Row | # Runners (Col C) | Mean × n_i (Col F) | LSmean (Col G) | SS_W (Col H) | SS_B (Col I) | Variance_WB (Col J) | Stdev_WB (Col K)
Row 2 | - | D2*C2 | - | (E2^2)*(C2-1) | C2*(D2-$G$6)^2 | - | -
Row 3 | - | D3*C3 | - | (E3^2)*(C3-1) | C3*(D3-$G$6)^2 | - | -
Row 4 | - | D4*C4 | - | (E4^2)*(C4-1) | C4*(D4-$G$6)^2 | - | -
Row 5 | - | D5*C5 | - | (E5^2)*(C5-1) | C5*(D5-$G$6)^2 | - | -
Row 6 | SUM(C2:C5) | SUM(F2:F5) | F6/C6 | SUM(H2:H5) | SUM(I2:I5) | (I6+H6)/(C6-1) | SQRT(J6)
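The spreadsheet logic of Table 15 translates directly into Python. The sketch below is an illustration under stated assumptions: the runner counts and average MPH values come from Equation (17), but the per-trainer stdevs are hypothetical placeholders, and the helper name `pooled_summary` is not from the original article. A second, fully hypothetical raw dataset is included only to check that the summary-based formula reproduces the stdev of the pooled observations.

```python
import math
import statistics


def pooled_summary(ns, means, sds):
    """Combine per-study n, mean and stdev following Equations (16) and (21)-(23)."""
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n             # Eq. (16)
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))             # Eq. (21)
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))   # Eq. (22)
    variance = (ss_within + ss_between) / (total_n - 1)                      # Eq. (23)
    return grand_mean, ss_within, ss_between, math.sqrt(variance)


# n and mean per trainer are from Equation (17); the stdevs are HYPOTHETICAL.
runners = [10, 5, 8, 12]
avg_mph = [6.2, 5.5, 6.1, 6.8]
sd_mph = [0.8, 0.6, 0.7, 0.9]

grand_mean, ss_w, ss_b, sd_wb = pooled_summary(runners, avg_mph, sd_mph)
print(f"weighted mean = {grand_mean:.3f}, SS_W = {ss_w:.3f}, "
      f"SS_B = {ss_b:.3f}, Stdev_WB = {sd_wb:.3f}")

# Sanity check on hypothetical raw data: the summary-based result should match
# the stdev computed directly from all observations pooled together.
groups = [[5.0, 6.0, 7.0], [4.0, 4.5, 6.5, 7.0], [6.0, 8.0]]
summary = pooled_summary(
    [len(g) for g in groups],
    [statistics.mean(g) for g in groups],
    [statistics.stdev(g) for g in groups],
)
direct_sd = statistics.stdev([x for g in groups for x in g])
print(f"summary-based stdev = {summary[3]:.6f}, direct stdev = {direct_sd:.6f}")
```

The agreement in the sanity check reflects the identity SS_T = SS_W + SS_B discussed in Appendix D, which is why only n, the mean and the stdev from each study are needed.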

Closing Comments

Our hope is that this educational article will serve both as a resource and as an instructional tool for the scientific community. Where possible, we provide samples of Excel coding so that an estimation of the respective means and stdevs can be readily transferred to an Excel spreadsheet. Importantly, through the discussions in this manuscript, our goal is to encourage an appreciation and identification of the underlying assumptions that influence our interpretation of any dataset.

Conflicts of Interest: The authors declare no conflict of interest.

Appendix A

Types of means discussed in this article (refer to references as defined in the body of the manuscript):

$$\text{Arithmetic mean} = \frac{\text{sum}}{n}$$ (A1)

$$\text{Geometric mean} = \sqrt[n]{\text{product}} \quad \text{(e.g., growth rate)}$$ (A2)

$$\text{Harmonic mean} = \frac{n}{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \cdots + \frac{1}{n}}$$ (A3)

Least square means are estimated in a manner comparable to the arithmetic mean. However, the fundamental difference is that the study design and potential imbalance (dissimilarity between the numbers of observations in the comparators) are adjusted for, such that each observation becomes equally weighted [7]. This is accomplished by the use of marginal means. For example, for two groups of size n_1 and n_2, with marginal means $\bar{X}_1$ and $\bar{X}_2$:

$$\text{Least squares mean} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$$ (A4)

For more information on the challenges associated with the use of a spreadsheet to estimate LSmeans and the lack of a simple formula to follow, please see the blog discussion link: http://onbiostatistics.blogspot.com/2009/04/least-squares-means-marginal-means-vs.html. With two or more factors in a model, the least squares mean calculation is depicted in matrix algebra, which is beyond the scope of this manuscript. For more information, refer to [10].

Appendix B

Proof that the arithmetic mean is always greater than or equal to the geometric mean, which is always greater than or equal to the harmonic mean:

$$\text{Arithmetic mean} = \frac{A + B}{2}$$ (A5)

$$\text{Geometric mean} = \sqrt{AB}$$ (A6)

$$\text{Harmonic mean} = \frac{2}{\frac{1}{A} + \frac{1}{B}}, \text{ which, if multiplied by } \frac{AB}{AB}, \text{ can be written as } \frac{2AB}{B + A}$$ (A7)

Demonstrating that arithmetic mean ≥ geometric mean:

$$(A - B)^2 \geq 0$$ (A8)

$$A^2 - 2AB + B^2 \geq 0$$ (A9)

By adding 4AB to each side of the inequality, $A^2 + 2AB + B^2 \geq 4AB$, which is

$$(A + B)^2 \geq 4AB$$ (A10)

Taking the square root of both sides, $A + B \geq 2\sqrt{AB}$, and then dividing both sides by 2,

$$\frac{A + B}{2} \geq \sqrt{AB}$$ (A11)

This shows that the arithmetic mean is greater than or equal to the geometric mean. Continuing, we multiply the line above by 2, returning to $A + B \geq 2\sqrt{AB}$. Then, we divide both sides by A + B, giving $1 \geq \frac{2\sqrt{AB}}{A + B}$, and next multiply both sides by $\sqrt{AB}$:

$$\sqrt{AB} \geq \frac{2AB}{A + B}$$ (A12)

This shows that the geometric mean is greater than or equal to the harmonic mean. Note that these relationships hold under the condition that A and B are non-negative values (and A + B must be greater than zero for the division step). These results can be extended to an arbitrary number of variables [11]. However, only two are included here to facilitate understanding.

Appendix C

Relationship of the Terminal Elimination Rate Constant and T1/2

Starting at 1 (for 100%) as the portion of the initial amount of a chemical, the decay curve decreases to nearly zero. Using 0 to 1 on the y-axis, the half-life is the time associated with the point at which the curve crosses 0.5. Typically, the amount on the y-axis is transformed to its natural log (Ln) values, since decay curves are curvilinear and the log transformation linearizes the relationship between the Ln of the fractional amount of original material remaining and time. Lambda z (λ_z) is the elimination rate constant at the z-th phase of the decay portion of the profile, estimated from the slope of the line (the slope equals -λ_z). For the sake of this discussion, assume that the drug in question follows a one-compartment open body model and therefore z is the terminal phase of decline. Assume that the fractional part of the original amount at time 0 is 1, the entire original amount, which is the intercept of a (log linear) regression. Let Y be the fractional amount of original material remaining at time t; then the linear relationship is:

$$\ln(Y(t)) = \ln(1) - \lambda_z t$$ (A13)

Note that since Ln(1) = 0, the regression equation is really:

$$\ln(Y(t)) = -\lambda_z t$$ (A14)

The half-life is the time at which the value of Y(t) would be $2^{-1}$, or 0.5. That is,

$$\ln(1/2) = -\lambda_z T_{1/2}$$ (A15)

Since Ln(1/2) = Ln(1) - Ln(2), then upon dividing both sides by -λ_z, we can solve for T1/2:

$$\frac{\ln(1) - \ln(2)}{-\lambda_z} = T_{1/2}$$ (A16)

Since Ln(1) = 0, the equation can be rewritten as:

$$\frac{\ln(2)}{\lambda_z} = T_{1/2} = \frac{0.693}{\lambda_z}$$ (A17)

The relationship between Y(t) vs. t and Ln(Y(t)) vs. t is illustrated in Figures A1 and A2 and in Table A1.

Figure A1. Depletion curve shown on the linear scale (Y versus time in h). The curve is best described by an exponential relationship between Y and X, where X = time (h).

Figure A2. Depletion curve shown for Ln-transformed values (Ln Y versus time in h). The line is best described by a linear equation relating Ln(Y) to X, where X = time (h).

Table A1. Comparison of arithmetic vs. HM for T1/2 estimates (including both the values and the codes for estimating these values based upon the columns and rows that might be found in an Excel spreadsheet).
Columns: Data Row (Rows 2 to 11); Column C, T1/2; Column D, λ_z; Column E, 1/T1/2.
Arithmetic mean: sum(C2:C11)/10
0.693/(Arith mean λ_z): 0.693/(sum(D2:D11)/10)
Harmonic mean T1/2: 1/(sum(E2:E11)/10)

Appendix D

Cross-Study Variability Estimation Procedure (Background)

It is demonstrated through the following algebra that a total variance estimate calculated by adding the squared unweighted within- and between-study stdevs is biased and gives incorrect weights to the within- and between-study components of variance.

In the general setting where $X_{ij}$ is the jth observation from group i, basic textbook introductions [1] to the analysis of variance show that $SS_T$, the total sum of squared deviations from the grand mean, $\bar{X}_{..}$, can be divided into two sums of squares, as illustrated below, because their cross products sum to zero.

$$SS_T = \sum_{i=1}^{I}\sum_{j=1}^{n_i}\left(X_{ij}-\bar{X}_{..}\right)^2 = \sum_{i=1}^{I}\sum_{j=1}^{n_i}\left[\left(X_{ij}-\bar{X}_{i.}\right)^2 + \left(\bar{X}_{i.}-\bar{X}_{..}\right)^2\right] \tag{A18}$$

The first component of the right hand side of this equation is $SS_W$. As discussed prior to Equation (21), by the definition of the sample Stdev for study i, the contribution of study i to $SS_W$ is equal to $(n_i - 1)S_i^2$, and $SS_W$ is the sum of the contributions from all studies.

$$SS_W = \sum_{i=1}^{I}\sum_{j=1}^{n_i}\left(X_{ij}-\bar{X}_{i.}\right)^2 = \sum_{i=1}^{I}(n_i-1)S_i^2 \tag{A19}$$

The second component of the right hand side of the equation describing $SS_T$ is $SS_B$. It can be simplified as displayed in Equation (22) because there are no terms involving the index j.

$$SS_B = \sum_{i=1}^{I}\sum_{j=1}^{n_i}\left(\bar{X}_{i.}-\bar{X}_{..}\right)^2 = \sum_{i=1}^{I} n_i\left(\bar{X}_{i.}-\bar{X}_{..}\right)^2 \tag{A20}$$

We examine the case where all studies have the same sample size, n, to facilitate further discussion ($nI = N$ total observations). When conducting a meta-analysis, we assume that all studies have drawn samples from the same population. Therefore, the average of the stdevs across studies ($S_{.}$) is an estimate of the stdev for the population, and squaring $S_{.}$ provides an estimate of the within group variance, which is the same for all groups. Under these two assumptions, $(n-1)S_{.}^2$ can be substituted for the sum of squared deviations for each group. Carrying this out, one finds that $SS_W$ is estimated by $(N-I)S_{.}^2$.

$$SS_W = \sum_{i=1}^{I}(n-1)S_{.}^2 = I(n-1)S_{.}^2 = (N-I)S_{.}^2 \tag{A21}$$

Next, consider the between group sums of squares $SS_B$ with equal n.

$$SS_B = n\sum_{i=1}^{I}\left(\bar{X}_{i.}-\bar{X}_{..}\right)^2 \tag{A22}$$

The definition of the sample variance among the group averages, $S^2_{\bar{X}_{i.}}$, is as follows.

$$S^2_{\bar{X}_{i.}} = \frac{\sum_{i=1}^{I}\left(\bar{X}_{i.}-\bar{X}_{..}\right)^2}{I-1} \tag{A23}$$

Using the same argument as the one used prior to Equation (21), the sum of squared deviations of the study means about their grand mean can be rewritten as $(I-1)S^2_{\bar{X}_{i.}}$.

$$SS_B = n\sum_{i=1}^{I}\left(\bar{X}_{i.}-\bar{X}_{..}\right)^2 = n(I-1)S^2_{\bar{X}_{i.}} = (N-n)S^2_{\bar{X}_{i.}} \tag{A24}$$

When $SS_W$ and $SS_B$ are put back over $N-1$ to estimate the total variance, the weights given to each component are seen more clearly.
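The partition in Equations (A18) to (A24) can be checked numerically. The sketch below is only an illustration under stated assumptions: the three studies and their values are hypothetical, the function names are invented for this example, and the mean of the study variances is used in place of the squared average stdev so that the equal-n identities hold exactly rather than approximately.

```python
from statistics import mean, variance

# Hypothetical data (not from the article): I = 3 studies, n = 4 observations each.
studies = [
    [4.1, 5.0, 4.6, 5.3],
    [6.2, 5.8, 6.9, 6.5],
    [5.1, 4.7, 5.6, 5.0],
]


def sums_of_squares(groups):
    """Return (SS_T, SS_W, SS_B) as defined in Equations (A18) to (A20)."""
    all_obs = [x for g in groups for x in g]
    grand_mean = mean(all_obs)
    ss_t = sum((x - grand_mean) ** 2 for x in all_obs)
    ss_w = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ss_b = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    return ss_t, ss_w, ss_b


def weighted_total_variance(groups):
    """Combine within and between variances with the equal-n weights
    (N - I)/(N - 1) and (N - n)/(N - 1)."""
    n_studies = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal sample sizes")
    total_n = n_studies * n
    # Assumption: mean of the study variances stands in for the squared average stdev.
    s2_within = mean([variance(g) for g in groups])
    # Sample variance among the study means, as in Equation (A23).
    s2_means = variance([mean(g) for g in groups])
    w_within = (total_n - n_studies) / (total_n - 1)
    w_between = (total_n - n) / (total_n - 1)
    return w_within * s2_within + w_between * s2_means


ss_t, ss_w, ss_b = sums_of_squares(studies)
all_values = [x for g in studies for x in g]
print("SS_T:", ss_t, "SS_W + SS_B:", ss_w + ss_b)
print("weighted total variance:", weighted_total_variance(studies))
print("variance of pooled observations:", variance(all_values))
```

With these inputs the two printed sums of squares agree, and the weighted combination matches the sample variance of all 12 pooled observations up to floating-point error. Simply adding the within and between variances without the weights would not reproduce that value.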

$$\frac{SS_T}{N-1} = \frac{(N-I)}{N-1}S_{.}^2 + \frac{(N-n)}{N-1}S^2_{\bar{X}_{i.}} \tag{A25}$$

This shows that the contribution of the two variance components to the total variance is not equal and depends on the number of studies and the size of the studies. For a fixed total size N, as the number of equal size studies increases (implying a smaller n per study), less weight is placed on the within component. Similarly, as the size of the studies increases (implying fewer studies, which results in smaller values of I), less weight is placed on the between component. While it is not possible to obtain a neat form for the weights when the sample sizes vary per study, as is typically the case, the same principles apply. The equations used in the development of the unweighted method of estimating the LSmean total variance use the observed sample sizes and therefore use the appropriate weights for the within and between components in calculating the total population variance.

References

1. Snedecor, G.W.; Cochran, W.G. Statistical Methods, 8th ed.; Iowa State University Press: Ames, IA, USA.
2. Petrie, A.; Watson, P. Statistics for Veterinary and Animal Science, 1st ed.; Blackwell Science: Malden, MA, USA, 1999.
3. Lam, F.C.; Hung, C.T.; Perrier, D.G. Estimation of variance for harmonic mean half-lives. J. Pharm. Sci. 1985, 74.
4. Norris, N. The standard errors of the geometric and harmonic means and their application to index numbers. Ann. Math. Statist. 1940, 11.
5. Searle, S.R.; Speed, F.M.; Milliken, G.A. Population marginal means in the linear model: an alternative to least squares means. Am. Stat. 1980, 34.
6. Milliken, G.A.; Johnson, D.E. Analysis of Messy Data Volume I: Designed Experiments, 1st ed.; Chapman & Hall/CRC Press: Boca Raton, FL, USA.
7. Goodnight, J.H.; Harvey, W.R. Least Squares Means in the Fixed-Effects General Linear Models, 1st ed.; SAS Institute Inc.: Cary, NC, USA.
8. O'Rourke, K. An historical perspective on meta-analysis: dealing quantitatively with varying study results. J. R. Soc. Med. 2007, 100.
9. Hedges, L.V.; Olkin, I. Statistical Methods for Meta-Analysis; Academic Press: Orlando, FL, USA.
10. Xia, D.F.; Xu, S.L.; Qi, F. A proof of the arithmetic mean-geometric mean-harmonic mean inequalities. RGMIA Res. Collect. 1999, 2.
11. SAS Institute Inc. SAS/STAT 9.3 User's Guide; SAS Institute Inc.: Cary, NC, USA.

By the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.


A Correlation of. To the. North Carolina Standard Course of Study for Mathematics High School Math 2 A Correlation of 2018 To the North Carolina Standard Course of Study for Mathematics High School Math 2 Table of Contents Standards for Mathematical Practice... 1 Number and Quantity... 8 Algebra... 9

More information

Common Core State Standards for Mathematics High School

Common Core State Standards for Mathematics High School A Correlation of To the Common Core State Standards for Mathematics Table of Contents Number and Quantity... 1 Algebra... 1 Functions... 4 Statistics and Probability... 10 Standards for Mathematical Practice...

More information

Discrete Structures Lecture Sequences and Summations

Discrete Structures Lecture Sequences and Summations Introduction Good morning. In this section we study sequences. A sequence is an ordered list of elements. Sequences are important to computing because of the iterative nature of computer programs. The

More information

CURRICULUM MAP. Course/Subject: Honors Math I Grade: 10 Teacher: Davis. Month: September (19 instructional days)

CURRICULUM MAP. Course/Subject: Honors Math I Grade: 10 Teacher: Davis. Month: September (19 instructional days) Month: September (19 instructional days) Numbers, Number Systems and Number Relationships Standard 2.1.11.A: Use operations (e.g., opposite, reciprocal, absolute value, raising to a power, finding roots,

More information

Module 1. Identify parts of an expression using vocabulary such as term, equation, inequality

Module 1. Identify parts of an expression using vocabulary such as term, equation, inequality Common Core Standards Major Topic Key Skills Chapters Key Vocabulary Essential Questions Module 1 Pre- Requisites Skills: Students need to know how to add, subtract, multiply and divide. Students need

More information

The Calculus Behind Generic Drug Equivalence

The Calculus Behind Generic Drug Equivalence The Calculus Behind Generic Drug Equivalence Stanley R. Huddy and Michael A. Jones Stanley R. Huddy srh@fdu.edu, MRID183534, ORCID -3-33-231) isanassistantprofessorat Fairleigh Dickinson University in

More information

Preliminary Statistics course. Lecture 1: Descriptive Statistics

Preliminary Statistics course. Lecture 1: Descriptive Statistics Preliminary Statistics course Lecture 1: Descriptive Statistics Rory Macqueen (rm43@soas.ac.uk), September 2015 Organisational Sessions: 16-21 Sep. 10.00-13.00, V111 22-23 Sep. 15.00-18.00, V111 24 Sep.

More information

Common Core State Standards for Mathematics

Common Core State Standards for Mathematics A Correlation of Pearson to the Standards for Mathematics Appendix A, Integrated Pathway High School A Correlation of Pearson, Appendix A Introduction This document demonstrates how Pearson,, meets the

More information

SOLUTIONS. Math 110 Quiz 5 (Sections )

SOLUTIONS. Math 110 Quiz 5 (Sections ) SOLUTIONS Name: Section (circle one): LSA (001) Engin (002) Math 110 Quiz 5 (Sections 5.1 5.3) Available: Wednesday, November 8 Friday, November 10 1. You have 30 minutes to complete this quiz. Keeping

More information

California: Algebra 1

California: Algebra 1 Algebra Discovering Algebra An Investigative Approach California: Algebra 1 Grades Eight Through Twelve Mathematics Content Standards 1.0 Students identify and use the arithmetic properties of subsets

More information

16.400/453J Human Factors Engineering. Design of Experiments II

16.400/453J Human Factors Engineering. Design of Experiments II J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential

More information

Psych Jan. 5, 2005

Psych Jan. 5, 2005 Psych 124 1 Wee 1: Introductory Notes on Variables and Probability Distributions (1/5/05) (Reading: Aron & Aron, Chaps. 1, 14, and this Handout.) All handouts are available outside Mija s office. Lecture

More information

Draft Proof - Do not copy, post, or distribute

Draft Proof - Do not copy, post, or distribute 1 LEARNING OBJECTIVES After reading this chapter, you should be able to: 1. Distinguish between descriptive and inferential statistics. Introduction to Statistics 2. Explain how samples and populations,

More information

CHAPTER 2 Estimating Probabilities

CHAPTER 2 Estimating Probabilities CHAPTER 2 Estimating Probabilities Machine Learning Copyright c 2017. Tom M. Mitchell. All rights reserved. *DRAFT OF September 16, 2017* *PLEASE DO NOT DISTRIBUTE WITHOUT AUTHOR S PERMISSION* This is

More information