International Journal of Clinical Medicine Research 2016; 3(5): 76-80
http://www.aascit.org/journal/ijcmr
ISSN: 2375-3838

Estimating Peak Bone Mineral Density in Osteoporosis Diagnosis by Maximum Distribution

Loc Nguyen
Sunflower Soft Company, Ho Chi Minh City, Vietnam

Email address
ng_phloc@yahoo.com

Keywords
Peak Bone Mineral Density, Osteoporosis Diagnosis, Gumbel Maximum Distribution

Received: September 13, 2016
Accepted: October 3, 2016
Published: October 18, 2016

Citation
Loc Nguyen. Estimating Peak Bone Mineral Density in Osteoporosis Diagnosis by Maximum Distribution. International Journal of Clinical Medicine Research. Vol. 3, No. 5, 2016, pp. 76-80.

Abstract
The T-score is very important to the diagnosis of osteoporosis, and it is calculated from two parameters: peak bone mineral density (pbmd) and the variance of pbmd. This research proposes a new method to estimate these parameters, recognizing that pbmd conforms to a maximum distribution, for instance, the Gumbel distribution. Firstly, my method models the pbmd sample as a series of maximum values, and such values are assumed to obey the Gumbel distribution. Secondly, I apply the moment technique to estimate the mean and variance of the Gumbel distribution with regard to the series of maximum values. This mean and variance are adjusted to become the best estimates of pbmd and pbmd variance. There is no normality assumption in this research because pbmd is essentially an extreme value with low frequency in the population, so the estimate of pbmd will be more accurate if we take full advantage of the specific characteristics of the Gumbel maximum distribution.

1. Introduction to pbmd Estimation

The T-score is the reference measure used by the DXA instrument to diagnose osteoporosis, but the authors of [1] indicate that the DXA instrument tends to over-diagnose osteoporosis when it is used in the Vietnamese population.
The issue is that the T-score is calculated from peak bone mineral density (pbmd), and pbmd in the Vietnamese population differs from pbmd in the populations where the DXA instrument is produced. The authors of [1] proposed a new approach to estimate Vietnamese pbmd by applying a polynomial regression model. Author [2] used a 4th-order polynomial fit model for estimating pbmd, and she/he also used the bootstrap method to obtain pbmd data, as I do. The polynomial regression model in [1, p. 2] is very effective because it is appropriate to cross-sectional statistical data and it is based on the normality assumption, but the research [1] meets difficulty when determining the confidence interval of pbmd. Originating from the research of the authors of [1], I propose another approach that estimates pbmd and the variance of pbmd without the normality assumption and makes it easy to calculate the confidence interval of pbmd. This approach assumes that pbmd values conform to a maximal distribution, for instance, the Gumbel distribution; thus, pbmd and the variance of pbmd are the mean and variance of such a maximal distribution, respectively. Before the way to estimate the maximal mean and variance is described in the next section, we should browse over some definitions of T-score and pbmd.

Bone mineral density (BMD) is the amount of bone mineral per square centimeter of bone [3]. There are many means to test BMD, but dual-energy X-ray absorptiometry (DXA) is very popular [4]. The DXA instrument projects two X-ray beams on the bone, and the BMD is measured by the absorption of each beam by the bone [4]. When BMD is
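determined, the relevant T-score is calculated according to the following formula [1]:

$$T\text{-score} = \frac{BMD - pbmd}{\text{standard deviation of } pbmd}$$

where pbmd is the peak bone mineral density, which is the main object of this research. The World Health Organization (http://www.who.int) defines criteria of osteoporosis according to the T-score. If the T-score is greater than or equal to $-1.0$, it is normal. If the T-score is in the interval $[-2.5, -1.0]$, that is osteopenia. If the T-score is less than $-2.5$, that is osteoporosis. The most important parameters in the T-score formula are the relevant pbmd and its variance. According to [5], peak bone mineral density (pbmd) is defined as the BMD at the end of skeletal maturation, and it lasts for a sufficient time interval. As aforementioned, pbmd varies from population to population, which causes a lack of accuracy in determining the T-score in different populations. This research focuses on how to estimate pbmd and its variance from a data sample collected in the area where the research is done, in order to improve the accuracy of the relevant pbmd.

2. A New Method to Estimate pbmd Based on Maximum Distribution

Given $m$ independent, identically distributed variables $\{X_1, X_2, \dots, X_m\}$ with the same cumulative distribution $F_1(X_1) = F_2(X_2) = \dots = F_m(X_m) = F(X)$, where $X$ is the notation representing all variables $X_i$. Suppose $Y$ is the maximum value among the variables $X_i$, and $Y$ is considered a random variable. Let $P(Y < y)$ be the cumulative distribution of $Y$; we have [6, p. 2]:

$$
\begin{aligned}
P(Y < y) &= P(X_1 < y \text{ and } X_2 < y \text{ and } \dots \text{ and } X_m < y) \\
&= P(X_1 < y)\,P(X_2 < y) \cdots P(X_m < y) \quad \text{(due to } X_1, X_2, \dots, X_m \text{ being mutually independent)} \\
&= F_1(y)\,F_2(y) \cdots F_m(y) = F(y)^m
\end{aligned}
$$

It is proved that $P(Y < y) = F(y)^m$ approaches the Gumbel maximum distribution, denoted $G(Y)$, when $m$ approaches infinity [6, p. 2]. So $Y$ conforms approximately to the Gumbel distribution $G(Y)$, and we have [7]:

$$G(y) = P(Y < y) \approx \exp\left(-\exp\left(-\frac{y - \alpha}{\beta}\right)\right)$$

where $\alpha$ and $\beta$ are the location parameter and the scale parameter, respectively. The notation $\exp$ denotes the exponential function. The Gumbel density function is [7]:

$$g(y) = \frac{dG(y)}{dy} = \frac{1}{\beta}\exp\left(-\frac{y - \alpha}{\beta}\right)\exp\left(-\exp\left(-\frac{y - \alpha}{\beta}\right)\right)$$

In general, equation (1) specifies the cumulative function $G(Y)$ and the density function $g(y)$ of the Gumbel distribution.

$$
\begin{aligned}
G(y) &= \exp\left(-\exp\left(-\frac{y - \alpha}{\beta}\right)\right) \\
g(y) &= \frac{1}{\beta}\exp\left(-\frac{y - \alpha}{\beta}\right)\exp\left(-\exp\left(-\frac{y - \alpha}{\beta}\right)\right)
\end{aligned}
\tag{1}
$$

Let $\mu$ and $\sigma^2$ be the theoretical mean and variance of $G(Y)$, respectively [7].

$$
\begin{aligned}
\mu &= \alpha + \gamma\beta \\
\sigma^2 &= \frac{\beta\pi}{\sqrt{6}}
\end{aligned}
\tag{2}
$$

where $\gamma$ is the Euler-Mascheroni constant, $\gamma \approx 0.5772$. According to equation (2), it is easy to recognize that $\mu$ and $\sigma^2$ represent pbmd and the variance of pbmd. What we do now is estimate the theoretical $\mu$ and $\sigma^2$ given sample observations $\{X_1, X_2, \dots, X_m\}$, where the $X_i$ are BMD measures collected in the research. In other words, the parameters $\alpha$ and $\beta$ will be estimated via the observations $\{X_1, X_2, \dots, X_m\}$ because $\mu$ and $\sigma^2$ are calculated from $\alpha$ and $\beta$.

Now the observations $D = \{X_1, X_2, \dots, X_m\}$ are considered as a population, and each time we re-sample $m$ values from $D$ with replacement according to the bootstrap technique. Suppose $\{X_{i1}, X_{i2}, \dots, X_{im}\}$ is the sample at time $i$, where $X_{ij}$ is the $j$-th observation taken from $D$ at time $i$. Let $Y_i = \max\{X_{i1}, X_{i2}, \dots, X_{im}\}$ be the maximum value at time $i$. After re-sampling $n$ times, we have $n$ new maximal observations $E = \{Y_1, Y_2, \dots, Y_n\}$. The parameters $\alpha$ and $\beta$ are estimated from these maximal observations. Let $E(Y)$ and $E(Y^2)$ be the theoretical first-order moment and second-order moment of $g(y)$, respectively; we have:

$$
\begin{aligned}
E(Y) &= \alpha + \gamma\beta \\
\frac{\beta\pi}{\sqrt{6}} = \sigma^2 &= E\left((Y - E(Y))^2\right) = E(Y^2) - E(Y)^2 = E(Y^2) - \mu^2 = E(Y^2) - (\alpha + \gamma\beta)^2 \\
\Rightarrow E(Y^2) &= (\alpha + \gamma\beta)^2 + \frac{\beta\pi}{\sqrt{6}}
\end{aligned}
$$

The sample first-order moment and second-order moment are $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$ and $\frac{1}{n}\sum_{i=1}^{n} Y_i^2$, respectively. According to the moment method, we estimate the parameters $\alpha$ and $\beta$ by equating the theoretical moments to the sample moments. Let $\hat{\alpha}$ and $\hat{\beta}$ be the estimated values of $\alpha$ and $\beta$, respectively; $\hat{\alpha}$ and $\hat{\beta}$ are the solutions of the following equations:

$$
\begin{cases}
\alpha + \gamma\beta = \bar{Y} \\
(\alpha + \gamma\beta)^2 + \dfrac{\beta\pi}{\sqrt{6}} = \dfrac{1}{n}\sum_{i=1}^{n} Y_i^2
\end{cases}
\;\Rightarrow\;
\begin{cases}
\hat{\alpha} = \bar{Y} - \gamma\dfrac{\sqrt{6}}{\pi}\left(\dfrac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right) \\
\hat{\beta} = \dfrac{\sqrt{6}}{\pi}\left(\dfrac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right)
\end{cases}
$$

The quality of statistical estimates such as $\hat{\alpha}$ and $\hat{\beta}$ depends on the expectation of such estimates. An estimate has good quality if its expectation is near the theoretical parameter. An estimate is the best one, called an unbiased estimate, if its expectation is equal to the theoretical parameter. Let $E(\hat{\alpha})$ and $E(\hat{\beta})$ be the expectations of $\hat{\alpha}$ and $\hat{\beta}$, respectively. We have:

$$
\begin{aligned}
E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right) &= \frac{1}{n}\sum_{i=1}^{n} E(Y_i^2) - E(\bar{Y}^2) = (\mu^2 + \sigma^2) - \left(\mu^2 + \frac{\sigma^2}{n}\right) = \frac{n-1}{n}\sigma^2 = \frac{n-1}{n}\cdot\frac{\beta\pi}{\sqrt{6}} \\
&\quad \left(\text{due to } E(Y_i^2) = \mu^2 + \sigma^2 \text{ and } E(\bar{Y}^2) = \mu^2 + \frac{\sigma^2}{n}\right) \\
E(\hat{\alpha}) &= E(\bar{Y}) - \gamma\frac{\sqrt{6}}{\pi} E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right) = \mu - \frac{n-1}{n}\gamma\beta = \alpha + \gamma\beta - \frac{n-1}{n}\gamma\beta \\
E(\hat{\beta}) &= \frac{\sqrt{6}}{\pi} E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right) = \frac{n-1}{n}\beta
\end{aligned}
$$

Both $\hat{\alpha}$ and $\hat{\beta}$ are biased estimates because their expectations shift away from the parameters $\alpha$ and $\beta$. Hence, $\hat{\alpha}$ and $\hat{\beta}$ are adjusted according to equation (3).

$$
\begin{aligned}
\hat{\alpha} &= \bar{Y} - \gamma\frac{\sqrt{6}}{(n-1)\pi}\left(\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2\right) = \bar{Y} - \gamma\hat{\beta} \\
\hat{\beta} &= \frac{\sqrt{6}}{(n-1)\pi}\left(\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2\right)
\end{aligned}
\tag{3}
$$

where $Y_i$ is the maximum value at time $i$ and $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$ is the sample mean of the $Y_i$. The expectations of $\hat{\alpha}$ and $\hat{\beta}$ are re-calculated with regard to equation (2) as follows:

$$
\begin{aligned}
E(\hat{\alpha}) &= E(\bar{Y}) - \gamma\frac{\sqrt{6}}{(n-1)\pi} E\left(\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2\right) = E(\bar{Y}) - \gamma\frac{\sqrt{6}\,n}{(n-1)\pi} E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right) \\
&= \alpha + \gamma\beta - \gamma\frac{\sqrt{6}\,n}{(n-1)\pi}\cdot\frac{n-1}{n}\cdot\frac{\beta\pi}{\sqrt{6}} = \alpha + \gamma\beta - \gamma\beta = \alpha \\
E(\hat{\beta}) &= \frac{\sqrt{6}}{(n-1)\pi} E\left(\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2\right) = \frac{\sqrt{6}\,n}{(n-1)\pi} E\left(\frac{1}{n}\sum_{i=1}^{n} Y_i^2 - \bar{Y}^2\right) = \frac{\sqrt{6}\,n}{(n-1)\pi}\cdot\frac{n-1}{n}\cdot\frac{\beta\pi}{\sqrt{6}} = \beta
\end{aligned}
$$

Now the adjusted $\hat{\alpha}$ and $\hat{\beta}$ are the best estimates because they are unbiased estimates. Note that the method of moment estimation for the Gumbel distribution is not new; please read [8, pp. 5-6], in which the authors mention both the moment method and the maximum likelihood method. The authors of [9] proposed an Expectation Maximization (EM) algorithm to estimate parameters of extreme value distributions from truncated data. However, my contribution to Gumbel parameter estimation is to adjust the moment method in order to produce an unbiased estimate of the parameter $\beta$ according to equation (3).

Let $\hat{\mu}$ and $\hat{\sigma}^2$ be the estimated values of the mean $\mu$ and the variance $\sigma^2$, respectively. From equation (2), we have equation (4) to determine these estimates:

$$
\begin{aligned}
\hat{\mu} &= \hat{\alpha} + \gamma\hat{\beta} = \bar{Y} \\
\hat{\sigma}^2 &= \frac{\hat{\beta}\pi}{\sqrt{6}} = \frac{1}{n-1}\left(\sum_{i=1}^{n} Y_i^2 - n\bar{Y}^2\right) = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2
\end{aligned}
\tag{4}
$$

where $\hat{\alpha}$ and $\hat{\beta}$ are specified by equation (3). In general, $\hat{\mu}$ and $\hat{\sigma}^2$ are the estimates of pbmd and its variance.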
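The adjusted moment estimates of equations (3) and (4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's own implementation: the function name `gumbel_moment_estimates` and the constant `EULER_GAMMA` are hypothetical names chosen here, and the code follows the relation $\sigma^2 = \beta\pi/\sqrt{6}$ exactly as equation (2) states it. Library routines for the Gumbel distribution commonly parameterize the variance as $\pi^2\beta^2/6$ instead, so the scale value returned here should not be assumed interchangeable with theirs.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (hypothetical name)


def gumbel_moment_estimates(maxima):
    """Return (alpha_hat, beta_hat, mu_hat, var_hat) following equations (3) and (4).

    `maxima` is the list of bootstrap maxima Y_1..Y_n.
    Assumption: the text's relation sigma^2 = beta * pi / sqrt(6) is used as written.
    """
    n = len(maxima)
    if n < 2:
        raise ValueError("at least two maximal observations are required")
    y_bar = sum(maxima) / n
    # Sum of squared deviations; algebraically equal to sum(Y_i^2) - n * y_bar^2,
    # but less prone to floating-point cancellation.
    ss = sum((y - y_bar) ** 2 for y in maxima)
    beta_hat = math.sqrt(6) / ((n - 1) * math.pi) * ss   # equation (3)
    alpha_hat = y_bar - EULER_GAMMA * beta_hat            # equation (3)
    mu_hat = alpha_hat + EULER_GAMMA * beta_hat           # equation (4), equals y_bar
    var_hat = beta_hat * math.pi / math.sqrt(6)           # equation (4), equals ss / (n - 1)
    return alpha_hat, beta_hat, mu_hat, var_hat
```

By construction, `mu_hat` coincides with the sample mean of the maxima and `var_hat` with their unbiased sample variance, which is what equation (4) asserts.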
Another facility of this method is that it allows us to determine the confidence interval of pbmd given a significance level. The Gumbel distribution is not symmetric, so we should calculate one-sided confidence intervals. Given significance level $s_0$, the percentile point of the Gumbel distribution, denoted $Y_0$, is the solution of the following equation:

$$
G(Y_0) = s_0 \;\Leftrightarrow\; \exp\left(-\exp\left(-\frac{Y_0 - \alpha}{\beta}\right)\right) = s_0 \;\Rightarrow\; Y_0 = \alpha - \beta\ln(-\ln s_0) \approx \hat{\alpha} - \hat{\beta}\ln(-\ln s_0)
$$

Note that $\ln$ denotes the natural logarithm function. Thus, the lower-tail confidence interval of pbmd given significance level $s_0$ is $\mu \ge Y_0$. The upper-tail confidence interval of pbmd is computed in a similar way as $\mu \le Y_0$, where $Y_0 = \hat{\alpha} - \hat{\beta}\ln(-\ln(1 - s_0))$ is the solution of $G(Y_0) = 1 - s_0$. In general, equation (5) determines the percentile point $Y_0$ given significance level $s_0$.

$$
Y_0 =
\begin{cases}
\hat{\alpha} - \hat{\beta}\ln(-\ln s_0) & \text{given lower tail} \\
\hat{\alpha} - \hat{\beta}\ln(-\ln(1 - s_0)) & \text{given upper tail}
\end{cases}
\tag{5}
$$

where $\hat{\alpha}$ and $\hat{\beta}$ are specified by equation (3). There is no pre-calculated table for looking up percentage points, as there is for the standard normal distribution, because accuracy is lost if the Gumbel distribution is approximated by a normal distribution. So it is slightly complicated to calculate the pbmd confidence interval, because a few arithmetic operations are needed to determine $Y_0$.

3. A Case Study of pbmd Estimation

For convenience, suppose the BMD population mean is approximately 1 g/cm² with deviation 0.1 g/cm². Suppose we have a simulated BMD sample $D = \{X_1 = 0.972234, X_2 = 1.06506, X_3 = 1.08428, X_4 = 1.00824, X_5 = 1.14306\}$ of size $m = 5$, which conforms to the normal distribution with mean 1 and variance $0.1^2$. After re-sampling $n = 10$ times with replacement according to the bootstrap technique, we have 10 new maximal observations $E = \{Y_1, Y_2, \dots, Y_{10}\}$, where $Y_i$ is the maximum value of the sample $\{X_{i1}, X_{i2}, \dots, X_{i5}\}$ at time $i$. It is easy to recognize that $E$ is a simulated pbmd sample. Note that author [2] also used the bootstrap method to obtain pbmd data, as I do. Table 1 shows these maximum values.

Table 1. Re-sampling and maximum values.

No   Re-sampling                                          Maximum
1    1.08428, 0.972234, 1.08428, 1.14306, 1.14306         Y_1 = 1.14306
2    1.08428, 0.972234, 1.14306, 0.972234, 0.972234       Y_2 = 1.14306
3    0.972234, 0.972234, 1.06506, 1.06506, 1.06506        Y_3 = 1.06506
4    1.08428, 0.972234, 1.08428, 1.14306, 1.06506         Y_4 = 1.14306
5    1.06506, 1.08428, 1.00824, 1.00824, 1.08428          Y_5 = 1.08428
6    1.14306, 1.00824, 0.972234, 0.972234, 1.06506        Y_6 = 1.14306
7    1.00824, 1.08428, 1.14306, 0.972234, 1.00824         Y_7 = 1.14306
8    1.00824, 0.972234, 1.06506, 1.00824, 1.00824         Y_8 = 1.06506
9    1.08428, 0.972234, 1.08428, 1.14306, 1.14306         Y_9 = 1.14306
10   1.14306, 1.14306, 1.06506, 0.972234, 1.14306         Y_10 = 1.14306

Suppose $Y_i$ conforms to the Gumbel distribution; we need to estimate its mean $\hat{\mu}$ and variance $\hat{\sigma}^2$, where $\hat{\mu}$ is the estimated pbmd. According to equation (4), the parametric estimates $\hat{\alpha}$ and $\hat{\beta}$ must be determined first with regard to equation (3).
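The re-sampling step and the percentile points of equation (5) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the paper's own code: the function names `bootstrap_maxima` and `gumbel_percentile_points` are hypothetical, and each bootstrap sample is drawn with replacement, as the bootstrap technique of Section 2 specifies.

```python
import math
import random


def bootstrap_maxima(sample, n_resamples, rng=None):
    """Draw n_resamples bootstrap samples (with replacement, each of the same
    size as `sample`) and return the maximum of each one."""
    rng = rng or random.Random()
    m = len(sample)
    return [max(rng.choices(sample, k=m)) for _ in range(n_resamples)]


def gumbel_percentile_points(alpha_hat, beta_hat, s0):
    """Return the (lower-tail, upper-tail) percentile points of equation (5)
    for significance level s0."""
    if not 0.0 < s0 < 1.0:
        raise ValueError("s0 must lie strictly between 0 and 1")
    lower = alpha_hat - beta_hat * math.log(-math.log(s0))
    upper = alpha_hat - beta_hat * math.log(-math.log(1.0 - s0))
    return lower, upper
```

Passing a seeded generator, for example `bootstrap_maxima(D, 10, random.Random(0))`, makes a run reproducible; because the resampling is random, the maxima obtained will generally differ from those listed in Table 1.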
We have:

$$
\begin{aligned}
\bar{Y} &= \frac{1}{10}\sum_{i=1}^{10} Y_i \approx 1.12158 \\
\hat{\alpha} &= \bar{Y} - \gamma\frac{\sqrt{6}}{(10-1)\pi}\left(\sum_{i=1}^{10} Y_i^2 - 10\bar{Y}^2\right) \approx 1.12103 \\
\hat{\beta} &= \frac{\sqrt{6}}{(10-1)\pi}\left(\sum_{i=1}^{10} Y_i^2 - 10\bar{Y}^2\right) \approx 0.000953833 \\
\hat{\mu} &= \hat{\alpha} + \gamma\hat{\beta} = \bar{Y} \approx 1.12158 \\
\hat{\sigma}^2 &= \frac{\hat{\beta}\pi}{\sqrt{6}} \approx 0.00122334
\end{aligned}
$$

Finally, the estimated pbmd is 1.12158 with deviation $\sqrt{0.00122334} \approx 0.0349763$. Moreover, the lower-tail percentile point, denoted $Y_0$, at significance level 0.025 is:

$$Y_0 = \hat{\alpha} - \hat{\beta}\ln(-\ln s_0) \approx 1.12103 - 0.000953833\,\ln(-\ln 0.025) \approx 1.11979$$

The upper-tail percentile point, denoted $Y'_0$, at significance level 0.025 is:

$$Y'_0 = \hat{\alpha} - \hat{\beta}\ln(-\ln(1 - s_0)) \approx 1.12103 - 0.000953833\,\ln(-\ln(1 - 0.025)) \approx 1.12454$$

According to equation (1), given the estimates $\hat{\alpha}$ and $\hat{\beta}$, the Gumbel density function $g(y)$ is:

$$g(y) \approx 1048.4\,\exp(1175.29 - 1048.4y)\,\exp\left(-\exp(1175.29 - 1048.4y)\right)$$

Figure 1 shows the graph of this Gumbel density function $g(y)$.

Figure 1. Graph of Gumbel density function.

As seen in figure 1, pbmd data concentrates highly on the peak of the density function $g(y)$ at its mean $\mu = 1.12158$. Although $g(y)$ is asymmetric, the lower-tail and upper-tail percentile points (1.11979 and 1.12454) at significance level 0.025 are very close to the peak. In other words, the 95% confidence interval containing the mean $\mu = 1.12158$ is very narrow (0.00475 = 1.12454 − 1.11979), which indicates that such a mean is the most likely pbmd.

Following is the code snippet for the pbmd estimation mentioned above, written in the Wolfram Language built into the Mathematica software [10].

    m = 5; n = 10; significant = 0.025;

    (* Simulated sample D with mean 1 and deviation 0.1 *)
    x = RandomVariate[NormalDistribution[1, 0.1], m]

    (* Producing maximal observations E by bootstrap technique *)
    yy = Table[RandomChoice[x, m], {i, n}]
    y = Map[Max, yy]

    (* Estimating parameters alpha and beta, equation (3) *)
    ymean = Mean[y]
    beta = Sqrt[6]/((n - 1) Pi) (Sum[(y[[i]])^2, {i, n}] - n*(ymean^2))
    alpha = ymean - EulerGamma*beta

    (* Estimating mean and variance, equation (4) *)
    mu = alpha + EulerGamma*beta
    var = beta*Pi/Sqrt[6]
    Sqrt[var]

    (* Percentile points at significance level 0.025, equation (5) *)
    lowerpercentile = alpha - beta*Log[-Log[significant]]
    upperpercentile = alpha - beta*Log[-Log[1 - significant]]

    (* Gumbel density function, equation (1) *)
    g[z_] := 1/beta*Exp[-(z - alpha)/beta]*Exp[-Exp[-(z - alpha)/beta]]
    g[z]

    (* Plotting the Gumbel density function around its peak; the range is
       scaled by beta because the density is far too narrow to show up on a
       fixed range such as {z, -10, 10} *)
    Image[Plot[g[z], {z, alpha - 5 beta, alpha + 10 beta}, PlotRange -> All]]

4. Conclusion

The idea of this research is to apply the Gumbel maximum distribution to estimating pbmd and its variance by the moment technique, instead of using a normal approximation to maximal values. This is my main contribution in the area of pbmd estimation. The Gumbel distribution ensures the accuracy of the estimation, and this is the strong point of my method. The drawback is the complexity of the estimation, which requires a lot of calculus and arithmetic operations. In the future, I will research how to approximate the Gumbel distribution so that the derived distribution is simpler than the Gumbel distribution.

Acknowledgement

I would like to acknowledge Doctor Ho-Pham, Lan T. (Pham Ngoc Thach University of Medicine, Ho Chi Minh City, Vietnam), who produced the excellent research [1] by which my research is inspired. Moreover, she gave me valuable comments and advice that helped me to improve this research.

References

[1] L. T. Ho-Pham, H. N. Pham, T. X. Lai, U. D. T. Nguyen, N. D. Nguyen and T. V. Nguyen, "Reference Ranges for Bone Mineral Density and Prevalence of Osteoporosis in Vietnamese Men and Women," BMC Musculoskeletal Disorders, vol. 12, no. 182, pp. 1-7, 2011.
[2] D. A. Banu, "Peak Bone Mineral Density Of Bangladeshi Men And Women," International Journal of Scientific & Technology Research, vol. 4, no. 11, pp. 8-13, November 2015.
[3] Wikipedia, "Bone density," Wikimedia Foundation, 15 February 2016. [Online]. Available: https://en.wikipedia.org/wiki/Bone_density.
[4] Wikipedia, "Dual-energy X-ray absorptiometry," Wikimedia Foundation, 7 March 2016. [Online]. Available: https://en.wikipedia.org/wiki/Dual-energy_X-ray_absorptiometry.
[5] J. P. Bonjour, G. Theintz, F. Law, D. Slosman and R. Rizzoli, "Peak bone mass," Osteoporosis International, vol. 4, no. 1 Supplement, pp. 7-13, January 1994.
[6] D. Veneziano, "Distribution of the maximum of independent identically-distributed variables," MIT OpenCourseWare, Cambridge, 2005.
[7] Wikipedia, "Gumbel distribution," Wikimedia Foundation, 29 March 2016. [Online]. Available: https://en.wikipedia.org/wiki/Gumbel_distribution.
[8] J. Kemwong, T. Boonyawiwat, T. Kriengkomol, J. Songsiri and P. Hoisungwan, "Parameter estimation of Gumbel distribution for," Jitkomut Songsiri Homepage, Bangkok, 2015.
[9] T. Kernane and Z. Raizah, "Estimation of the Parameters of Extreme Value Distributions from Truncated Data Via the EM Algorithm," The open archive HAL, 2010.
[10] Wolfram, Mathematica, vol. 10, Wolfram Research.