Variance Penalizing AdaBoost


Pannagadatta K. Shivaswamy, Department of Computer Science, Cornell University, Ithaca NY
Tony Jebara, Department of Computer Science, Columbia University, New York NY

Abstract

This paper proposes a novel boosting algorithm called VadaBoost which is motivated by recent empirical Bernstein bounds. VadaBoost iteratively minimizes a cost function that balances the sample mean and the sample variance of the exponential loss. Each step of the proposed algorithm minimizes the cost efficiently by providing weighted data to a weak learner rather than requiring a brute force evaluation of all possible weak learners. Thus, the proposed algorithm solves a key limitation of previous empirical Bernstein boosting methods which required brute force enumeration of all possible weak learners. Experimental results confirm that the new algorithm achieves the performance improvements of EBBoost yet goes beyond decision stumps to handle any weak learner. Significant performance gains are obtained over AdaBoost for arbitrary weak learners including decision trees (CART).

1 Introduction

Many machine learning algorithms implement empirical risk minimization or a regularized variant of it. For example, the popular AdaBoost [4] algorithm minimizes exponential loss on the training examples. Similarly, the support vector machine [11] minimizes hinge loss on the training examples. The convexity of these losses is helpful for computational as well as generalization reasons [2]. The goal of most learning problems, however, is not to obtain a function that performs well on training data, but rather to estimate a function (using training data) that performs well on future unseen test data. Therefore, empirical risk minimization on the training set is often performed while regularizing the complexity of the function classes being explored. The rationale behind this regularization approach is that it ensures that the empirical risk converges (uniformly) to the true unknown risk. Various concentration inequalities formalize the rate of convergence in terms of the function class complexity and the number of samples.

A key tool in obtaining such concentration inequalities is Hoeffding's inequality, which relates the empirical mean of a bounded random variable to its true mean. Bernstein's and Bennett's inequalities relate the true mean of a random variable to the empirical mean but also incorporate the true variance of the random variable. If the true variance of a random variable is small, these bounds can be significantly tighter than Hoeffding's bound. Recently, there have been empirical counterparts of Bernstein's inequality [1, 5]; these bounds incorporate the empirical variance of a random variable rather than its true variance. The advantage of these bounds is that the quantities they involve are empirical. Previously, these bounds have been applied in sampling procedures [6] and in multi-armed bandit problems [1]. An alternative to empirical risk minimization, called sample variance penalization [5], has been proposed and is motivated by empirical Bernstein bounds.

A new boosting algorithm is proposed in this paper which implements sample variance penalization. The algorithm minimizes the empirical risk on the training set as well as the empirical variance. The two quantities (the risk and the variance) are traded off through a scalar parameter.

Moreover, the algorithm proposed in this article does not require exhaustive enumeration of the weak learners (unlike an earlier algorithm by [10]).

Assume that a training set $(X_i, y_i)_{i=1}^n$ is provided, where $X_i \in \mathcal{X}$ and $y_i \in \{\pm 1\}$ are drawn independently and identically distributed (iid) from a fixed but unknown distribution $\mathcal{D}$. The goal is to learn a classifier or a function $f : \mathcal{X} \to \{\pm 1\}$ that performs well on test examples drawn from the same distribution $\mathcal{D}$. In the rest of this article, $G : \mathcal{X} \to \{\pm 1\}$ denotes the so-called weak learner. The notation $G^s$ denotes the weak learner in a particular iteration $s$. Further, the two index sets $I_s$ and $J_s$, respectively, denote examples that the weak learner $G^s$ correctly classified and misclassified, i.e., $I_s := \{i \mid G^s(X_i) = y_i\}$ and $J_s := \{j \mid G^s(X_j) \neq y_j\}$.

Algorithm 1 AdaBoost
Require: $(X_i, y_i)_{i=1}^n$ and weak learners $\mathcal{H}$
  Initialize the weights: $w_i \leftarrow 1/n$ for $i = 1, \ldots, n$; initialize $f$ to predict zero on all inputs.
  for $s \leftarrow 1$ to $S$ do
    Estimate a weak learner $G^s(\cdot)$ from training examples weighted by $(w_i)_{i=1}^n$.
    $\alpha_s = \frac{1}{2} \log \left( \sum_{i : G^s(X_i) = y_i} w_i \big/ \sum_{j : G^s(X_j) \neq y_j} w_j \right)$
    if $\alpha_s \leq 0$ then break end if
    $f(\cdot) \leftarrow f(\cdot) + \alpha_s G^s(\cdot)$
    $w_i \leftarrow w_i \exp(-y_i G^s(X_i) \alpha_s)/Z_s$ where $Z_s$ is such that $\sum_{i=1}^n w_i = 1$.
  end for

Algorithm 2 VadaBoost
Require: $(X_i, y_i)_{i=1}^n$, scalar parameter $0 \leq \lambda \leq 1$, and weak learners $\mathcal{H}$
  Initialize the weights: $w_i \leftarrow 1/n$ for $i = 1, \ldots, n$; initialize $f$ to predict zero on all inputs.
  for $s \leftarrow 1$ to $S$ do
    $u_i \leftarrow \lambda n w_i^2 + (1 - \lambda) w_i$
    Estimate a weak learner $G^s(\cdot)$ from training examples weighted by $(u_i)_{i=1}^n$.
    $\alpha_s = \frac{1}{4} \log \left( \sum_{i : G^s(X_i) = y_i} u_i \big/ \sum_{j : G^s(X_j) \neq y_j} u_j \right)$
    if $\alpha_s \leq 0$ then break end if
    $f(\cdot) \leftarrow f(\cdot) + \alpha_s G^s(\cdot)$
    $w_i \leftarrow w_i \exp(-y_i G^s(X_i) \alpha_s)/Z_s$ where $Z_s$ is such that $\sum_{i=1}^n w_i = 1$.
  end for

2 Algorithms

In this section, we briefly discuss AdaBoost [4] and then propose a new algorithm called VadaBoost. The derivation of VadaBoost will be provided in detail in the next section. AdaBoost (Algorithm 1) assigns a weight $w_i$ to each training example. In each step of AdaBoost, a weak learner $G^s(\cdot)$ is obtained on the weighted examples and a weight $\alpha_s$ is assigned to it. Thus, AdaBoost iteratively builds $\sum_{s=1}^S \alpha_s G^s(\cdot)$. If a training example is correctly classified, its weight is exponentially decreased; if it is misclassified, its weight is exponentially increased. The process is repeated until a stopping criterion is met. AdaBoost essentially performs empirical risk minimization:
$$\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^n e^{-y_i f(X_i)},$$
by greedily constructing the function $f(\cdot)$ via $\sum_{s=1}^S \alpha_s G^s(\cdot)$.

Recently an alternative to empirical risk minimization has been proposed. This new criterion, known as sample variance penalization [5], trades off the empirical risk with the empirical variance:
$$\arg\min_{f \in \mathcal{F}} \; \frac{1}{n} \sum_{i=1}^n l(f(X_i), y_i) + \tau \sqrt{\frac{\hat{V}[l(f(X), y)]}{n}}, \qquad (1)$$
where $\tau \geq 0$ explores the trade-off between the two quantities.
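To make Algorithm 2 concrete, the following is a minimal sketch of VadaBoost with decision stumps as the weak learners. It follows the pseudocode above but is an illustrative implementation rather than the authors' code; `fit_stump` is an assumed helper that fits a weighted threshold rule.

```python
import numpy as np

def fit_stump(X, y, u):
    """Weighted decision stump: brute-force search over (feature, threshold, sign)
    minimizing the u-weighted error (O(n^2 d); fine for a sketch)."""
    best_err, best_params = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for sgn in (1.0, -1.0):
                pred = sgn * np.where(X[:, j] <= t, 1.0, -1.0)
                err = u[pred != y].sum()
                if err < best_err:
                    best_err, best_params = err, (j, t, sgn)
    j, t, sgn = best_params
    return lambda Z: sgn * np.where(Z[:, j] <= t, 1.0, -1.0)

def vadaboost(X, y, lam=0.1, n_rounds=50):
    """Sketch of Algorithm 2; lam is the variance-penalization parameter, 0 <= lam <= 1."""
    n = len(y)
    w = np.full(n, 1.0 / n)                     # example weights, kept normalized
    ensemble = []                               # list of (alpha_s, G^s)
    eps = 1e-12                                 # guards the log against zero denominators
    for _ in range(n_rounds):
        u = lam * n * w**2 + (1.0 - lam) * w    # weights handed to the weak learner
        G = fit_stump(X, y, u)
        correct = (G(X) == y)
        alpha = 0.25 * np.log((u[correct].sum() + eps) / (u[~correct].sum() + eps))
        if alpha <= 0:                          # weak learner no better than chance: stop
            break
        ensemble.append((alpha, G))
        w *= np.exp(-y * alpha * G(X))
        w /= w.sum()                            # renormalize so that sum_i w_i = 1
    return ensemble

def predict(ensemble, X):
    """Sign of f(X) = sum_s alpha_s G^s(X)."""
    return np.sign(sum(a * G(X) for a, G in ensemble))
```

With $\lambda = 0$ the update gives $u_i = w_i$, so the only difference from Algorithm 1 is the factor $\frac{1}{4}$ (rather than $\frac{1}{2}$) in $\alpha_s$; with $\lambda = 1$ the weak learner is trained on weights proportional to $n w_i^2$.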

The motivation for sample variance penalization comes from the following theorem [5]:

Theorem 1 Let $(X_i, y_i)_{i=1}^n$ be drawn iid from a distribution $\mathcal{D}$. Let $\mathcal{F}$ be a class of functions $f : \mathcal{X} \to \mathbb{R}$. Then, for a loss $l : \mathbb{R} \times \mathcal{Y} \to [0, 1]$ and for any $\delta > 0$, with probability at least $1 - \delta$, for all $f \in \mathcal{F}$,
$$E[l(f(X), y)] \leq \frac{1}{n} \sum_{i=1}^n l(f(X_i), y_i) + \sqrt{\frac{18 \hat{V}[l(f(X), y)] \ln(M(n)/\delta)}{n}} + \frac{15 \ln(M(n)/\delta)}{n - 1}, \qquad (2)$$
where $M(n)$ is a complexity measure.

From the above uniform convergence result, it can be argued that future loss can be minimized by minimizing the right hand side of the bound on training examples. Since the variance $\hat{V}[l(f(X), y)]$ has a multiplicative factor involving $M(n)$, $\delta$ and $n$, for a given problem it is difficult to specify the relative importance between empirical risk and empirical variance a priori. Hence, sample variance penalization (1) necessarily involves a trade-off parameter $\tau$.

Empirical risk minimization or sample variance penalization on the 0-1 loss is a hard problem; this problem is often circumvented by minimizing a convex upper bound on the 0-1 loss. In this paper, we consider the exponential loss $l(f(X), y) := e^{-y f(X)}$. With the above loss, it was shown by [10] that sample variance penalization is equivalent to minimizing the following cost:
$$\left( \sum_{i=1}^n e^{-y_i f(X_i)} \right)^2 + \lambda \left( n \sum_{i=1}^n e^{-2 y_i f(X_i)} - \left( \sum_{i=1}^n e^{-y_i f(X_i)} \right)^2 \right). \qquad (3)$$
Theorem 1 requires that the loss function be bounded. Even though the exponential loss is unbounded, boosting is typically performed only for a finite number of iterations in most practical applications. Moreover, since weak learners typically perform only slightly better than random guessing, each $\alpha_s$ in AdaBoost (or in VadaBoost) is typically small, thus limiting the range of the function learned. Furthermore, experiments will confirm that sample variance penalization results in a significant empirical performance improvement over empirical risk minimization.

Our proposed algorithm is called VadaBoost (the V in VadaBoost emphasizes the fact that Algorithm 2 penalizes the empirical variance) and is described in Algorithm 2. VadaBoost iteratively performs sample variance penalization, i.e., it minimizes the cost (3) iteratively. Clearly, VadaBoost shares the simplicity and ease of implementation found in AdaBoost.

3 Derivation of VadaBoost

In the $s$th iteration, our objective is to choose a weak learner $G^s$ and a weight $\alpha_s$ such that $\sum_{t=1}^{s} \alpha_t G^t(\cdot)$ reduces the cost (3). Denote by $w_i$ the quantity $e^{-y_i \sum_{t=1}^{s-1} \alpha_t G^t(X_i)}/Z_s$. Given a candidate weak learner $G^s(\cdot)$, the cost (3) for the function $\sum_{t=1}^{s-1} \alpha_t G^t(\cdot) + \alpha G^s(\cdot)$ can be expressed, up to a multiplicative factor, as a function of $\alpha$:
$$V(\alpha; \mathbf{w}, \lambda, I, J) := \Big( \sum_{i \in I} w_i e^{-\alpha} + \sum_{j \in J} w_j e^{\alpha} \Big)^2 + \lambda \Big( n \sum_{i \in I} w_i^2 e^{-2\alpha} + n \sum_{j \in J} w_j^2 e^{2\alpha} - \Big( \sum_{i \in I} w_i e^{-\alpha} + \sum_{j \in J} w_j e^{\alpha} \Big)^2 \Big). \qquad (4)$$
In the quantity above, $I$ and $J$ are the two index sets (of correctly classified and incorrectly classified examples) over $G^s$. Let the vector $\mathbf{w}$ whose $i$th component is $w_i$ denote the current set of weights on the training examples. Here, we have dropped the subscripts/superscripts $s$ for brevity.
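For reference, the two costs just introduced can be evaluated directly. The helper below computes (3) from the margins $y_i f(X_i)$ and (4) from the current normalized weights; the function names are ours, introduced only for illustration.

```python
import numpy as np

def svp_exp_cost(margins, lam):
    """Cost (3) for margins m_i = y_i f(X_i):
    (sum_i e^{-m_i})^2 + lam * (n * sum_i e^{-2 m_i} - (sum_i e^{-m_i})^2)."""
    e = np.exp(-np.asarray(margins, dtype=float))
    n, s = len(e), e.sum()
    return s**2 + lam * (n * (e**2).sum() - s**2)

def V(alpha, w, lam, correct):
    """Cost (4), up to the same multiplicative factor: the value of (3) after the
    step alpha * G^s(.), written in terms of the normalized weights w and the
    boolean mask `correct` indicating G^s(X_i) == y_i."""
    w = np.asarray(w, dtype=float)
    n = len(w)
    scale = np.where(correct, np.exp(-alpha), np.exp(alpha))
    a = (w * scale).sum()            # sum_I w_i e^{-alpha} + sum_J w_j e^{alpha}
    b = (w**2 * scale**2).sum()      # sum_I w_i^2 e^{-2 alpha} + sum_J w_j^2 e^{2 alpha}
    return a**2 + lam * (n * b - a**2)
```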

Lemma 2 The update of $\alpha_s$ in Algorithm 2 minimizes the cost
$$U(\alpha; \mathbf{w}, \lambda, I, J) := \sum_{i \in I} \big( \lambda n w_i^2 + (1 - \lambda) w_i \big) e^{-2\alpha} + \sum_{j \in J} \big( \lambda n w_j^2 + (1 - \lambda) w_j \big) e^{2\alpha}. \qquad (5)$$

Proof By obtaining the second derivative of the above expression (with respect to $\alpha$), it is easy to see that it is convex in $\alpha$. Thus, setting the derivative with respect to $\alpha$ to zero gives the optimal choice of $\alpha$ as shown in Algorithm 2.

Theorem 3 Assume that $0 \leq \lambda \leq 1$ and $\sum_{i=1}^n w_i = 1$ (i.e. normalized weights). Then, $V(\alpha; \mathbf{w}, \lambda, I, J) \leq U(\alpha; \mathbf{w}, \lambda, I, J)$ and $V(0; \mathbf{w}, \lambda, I, J) = U(0; \mathbf{w}, \lambda, I, J)$. That is, $U$ is an upper bound on $V$ and the bound is exact at $\alpha = 0$.

Proof Denoting $1 - \lambda$ by $\bar{\lambda}$, we have:
$$\begin{aligned}
V(\alpha; \mathbf{w}, \lambda, I, J) &= \Big( \sum_{i \in I} w_i e^{-\alpha} + \sum_{j \in J} w_j e^{\alpha} \Big)^2 + \lambda \Big( n \sum_{i \in I} w_i^2 e^{-2\alpha} + n \sum_{j \in J} w_j^2 e^{2\alpha} - \Big( \sum_{i \in I} w_i e^{-\alpha} + \sum_{j \in J} w_j e^{\alpha} \Big)^2 \Big) \\
&= \bar{\lambda} \Big( \sum_{i \in I} w_i e^{-\alpha} + \sum_{j \in J} w_j e^{\alpha} \Big)^2 + \lambda n \Big( \sum_{i \in I} w_i^2 e^{-2\alpha} + \sum_{j \in J} w_j^2 e^{2\alpha} \Big) \\
&= \lambda n \Big( \sum_{i \in I} w_i^2 e^{-2\alpha} + \sum_{j \in J} w_j^2 e^{2\alpha} \Big) + \bar{\lambda} \Big( \big( \textstyle\sum_{i \in I} w_i \big)^2 e^{-2\alpha} + \big( \textstyle\sum_{j \in J} w_j \big)^2 e^{2\alpha} + 2 \textstyle\sum_{i \in I} w_i \textstyle\sum_{j \in J} w_j \Big) \\
&= \lambda n \Big( \sum_{i \in I} w_i^2 e^{-2\alpha} + \sum_{j \in J} w_j^2 e^{2\alpha} \Big) + \bar{\lambda} \Big( \sum_{i \in I} w_i \big( 1 - \textstyle\sum_{j \in J} w_j \big) e^{-2\alpha} + \sum_{j \in J} w_j \big( 1 - \textstyle\sum_{i \in I} w_i \big) e^{2\alpha} \Big) + 2 \bar{\lambda} \sum_{i \in I} w_i \sum_{j \in J} w_j \\
&= \sum_{i \in I} \big( \lambda n w_i^2 + \bar{\lambda} w_i \big) e^{-2\alpha} + \sum_{j \in J} \big( \lambda n w_j^2 + \bar{\lambda} w_j \big) e^{2\alpha} - \bar{\lambda} \sum_{i \in I} w_i \sum_{j \in J} w_j \big( e^{-2\alpha} + e^{2\alpha} - 2 \big) \\
&\leq \sum_{i \in I} \big( \lambda n w_i^2 + \bar{\lambda} w_i \big) e^{-2\alpha} + \sum_{j \in J} \big( \lambda n w_j^2 + \bar{\lambda} w_j \big) e^{2\alpha} = U(\alpha; \mathbf{w}, \lambda, I, J).
\end{aligned}$$
On line two, terms were simply regrouped. On line three, the square term from line two was expanded. On the next line, we used the fact that $\sum_{i \in I} w_i + \sum_{j \in J} w_j = \sum_{i=1}^n w_i = 1$. On the fifth line, we once again regrouped terms; the last term in this expression (which involves $e^{-2\alpha} + e^{2\alpha} - 2$) can be written as $\bar{\lambda} \sum_{i \in I} w_i \sum_{j \in J} w_j (e^{\alpha} - e^{-\alpha})^2$, which is non-negative and enters with a negative sign, giving the inequality. When $\alpha = 0$ this term vanishes. Hence the bound is exact at $\alpha = 0$.

Corollary 4 VadaBoost monotonically decreases the cost (3).

The above corollary follows from:
$$V(\alpha_s; \mathbf{w}, \lambda, I, J) \leq U(\alpha_s; \mathbf{w}, \lambda, I, J) < U(0; \mathbf{w}, \lambda, I, J) = V(0; \mathbf{w}, \lambda, I, J).$$
In the above, the first inequality follows from Theorem 3. The second strict inequality holds because $\alpha_s$ is a minimizer of $U$ from Lemma 2; it is not hard to show that $U(\alpha_s; \mathbf{w}, \lambda, I, J)$ is strictly less than $U(0; \mathbf{w}, \lambda, I, J)$ from the termination criterion of VadaBoost. The third equality again follows from Theorem 3. Finally, we notice that $V(0; \mathbf{w}, \lambda, I, J)$ merely corresponds to the cost (3) at $\sum_{t=1}^{s-1} \alpha_t G^t(\cdot)$. Thus, we have shown that taking a step $\alpha_s$ decreases the cost (3).
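Lemma 2 and Theorem 3 are easy to check numerically under the same assumptions (normalized weights, $0 \leq \lambda \leq 1$). The snippet below is a quick illustrative sanity check on randomly generated weights, not part of the derivation.

```python
import numpy as np

n, lam = 20, 0.5
rng = np.random.default_rng(0)
w = rng.random(n); w /= w.sum()            # normalized weights, sum_i w_i = 1
correct = (np.arange(n) % 3 != 0)          # pretend G^s misclassifies every third example

def V(alpha):                              # cost (4)
    s = np.where(correct, np.exp(-alpha), np.exp(alpha))
    a, b = (w * s).sum(), (w**2 * s**2).sum()
    return a**2 + lam * (n * b - a**2)

def U(alpha):                              # upper bound (5)
    u = lam * n * w**2 + (1 - lam) * w
    return u[correct].sum() * np.exp(-2 * alpha) + u[~correct].sum() * np.exp(2 * alpha)

u = lam * n * w**2 + (1 - lam) * w
alpha_star = 0.25 * np.log(u[correct].sum() / u[~correct].sum())   # Lemma 2

grid = np.linspace(-2.0, 2.0, 401)
Ug = np.array([U(a) for a in grid])
Vg = np.array([V(a) for a in grid])
assert np.all(Ug >= Vg - 1e-9)             # Theorem 3: U upper-bounds V for 0 <= lam <= 1
assert abs(U(0.0) - V(0.0)) < 1e-9         # and the bound is exact at alpha = 0
assert U(alpha_star) <= Ug.min() + 1e-12   # the closed-form alpha_star minimizes U
```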

Figure 1: Typical upper bound $U(\alpha; \mathbf{w}, \lambda, I, J)$ and actual cost function $V(\alpha; \mathbf{w}, \lambda, I, J)$ values under varying $\alpha$. The bound is exact at $\alpha = 0$. The bound gets closer to the actual function value as $\lambda$ grows. The left plot shows the bound for $\lambda = 0$ and the right plot shows it for $\lambda = 0.9$.

We point out that we use a different upper bound in each iteration since $V$ and $U$ are parameterized by the current weights in the VadaBoost algorithm. Also note that our upper bound holds only for $0 \leq \lambda \leq 1$. Although the choice $0 \leq \lambda \leq 1$ seems restrictive, intuitively, it is natural to have a higher penalization on the empirical mean rather than the empirical variance during minimization. Also, a closer look at the empirical Bernstein inequality in [5] shows that the empirical variance term is multiplied by $1/\sqrt{n}$ while the empirical mean is multiplied by one. Thus, for large values of $n$, the weight on the sample variance is small. Furthermore, our experiments suggest that restricting $\lambda$ to this range does not significantly change the results.

4 How good is the upper bound?

First, we observe that our upper bound is exact when $\lambda = 1$. Also, our upper bound is loosest for the case $\lambda = 0$. We visualize the upper bound and the true cost for two settings of $\lambda$ in Figure 1. Since the cost (4) is minimized via an upper bound (5), a natural question is: how good is this approximation? We evaluate the tightness of this upper bound by considering its impact on learning efficiency. As is clear from Figure 1, when $\lambda = 1$, the upper bound is exact and incurs no inefficiency. In the other extreme when $\lambda = 0$, the cost of VadaBoost coincides with AdaBoost and the bound is effectively at its loosest. Even in this extreme case, VadaBoost derived through an upper bound only requires at most twice the number of iterations as AdaBoost to achieve a particular cost. The following theorem shows that our algorithm remains efficient even in this worst-case scenario.

Theorem 5 Let $O_A$ denote the squared cost obtained by AdaBoost after $S$ iterations. For weak learners in any iteration achieving a fixed error rate $\epsilon < 0.5$, VadaBoost with the setting $\lambda = 0$ attains a cost at least as low as $O_A$ in no more than $2S$ iterations.

Proof Denote the weight on example $i$ in the $s$th iteration by $w_i^s$. The weighted error rate of the $s$th classifier is $\epsilon_s = \sum_{j \in J_s} w_j^s$. We have, for both algorithms,
$$w_i^{S+1} = \frac{w_i^S \exp(-y_i \alpha_S G^S(X_i))}{Z_S} = \frac{\exp\big( -y_i \sum_{s=1}^S \alpha_s G^s(X_i) \big)}{n \prod_{s=1}^S Z_s}. \qquad (6)$$
The value of the normalization factor in the case of AdaBoost is
$$Z_s^a = \sum_{j \in J_s} w_j^s e^{\alpha_s} + \sum_{i \in I_s} w_i^s e^{-\alpha_s} = 2\sqrt{\epsilon_s (1 - \epsilon_s)}. \qquad (7)$$
Similarly, the value of the normalization factor for VadaBoost is given by
$$Z_s^v = \sum_{j \in J_s} w_j^s e^{\alpha_s} + \sum_{i \in I_s} w_i^s e^{-\alpha_s} = \big( \epsilon_s (1 - \epsilon_s) \big)^{\frac{1}{4}} \big( \sqrt{\epsilon_s} + \sqrt{1 - \epsilon_s} \big). \qquad (8)$$

The squared cost function of AdaBoost after $S$ steps is given by
$$O_A = \Big( \sum_{i=1}^n \exp\Big( -y_i \sum_{s=1}^S \alpha_s G^s(X_i) \Big) \Big)^2 = \Big( n \prod_{s=1}^S Z_s^a \sum_{i=1}^n w_i^{S+1} \Big)^2 = n^2 \prod_{s=1}^S 4 \epsilon_s (1 - \epsilon_s).$$
We used (6), (7) and the fact that $\sum_{i=1}^n w_i^{S+1} = 1$ to derive the above expression. Similarly, for $\lambda = 0$ the cost of VadaBoost satisfies
$$O_V = \Big( \sum_{i=1}^n \exp\Big( -y_i \sum_{s=1}^S \alpha_s G^s(X_i) \Big) \Big)^2 = \Big( n \prod_{s=1}^S Z_s^v \sum_{i=1}^n w_i^{S+1} \Big)^2 = n^2 \prod_{s=1}^S \big( 2\epsilon_s (1 - \epsilon_s) + \sqrt{\epsilon_s (1 - \epsilon_s)} \big)$$
(the cost which VadaBoost minimizes at $\lambda = 0$ is the squared cost of AdaBoost, so we do not square it again). Now, suppose that $\epsilon_s = \epsilon$ for all $s$. Then, the squared cost achieved by AdaBoost is given by $n^2 (4\epsilon(1-\epsilon))^S$. To achieve the same cost value, VadaBoost, with weak learners with the same error rate, needs at most
$$S \, \frac{\log\big( 4\epsilon(1-\epsilon) \big)}{\log\big( 2\epsilon(1-\epsilon) + \sqrt{\epsilon(1-\epsilon)} \big)}$$
iterations. Within the range of interest for $\epsilon$, the term multiplying $S$ above is at most 2.

Although the above worst-case bound achieves a factor of two, for $\epsilon > 0.4$, VadaBoost requires only about 33% more iterations than AdaBoost. To summarize, even in the worst possible scenario where $\lambda = 0$ (when the variational bound is at its loosest), the VadaBoost algorithm takes no more than double (a small constant factor) the number of iterations of AdaBoost to achieve the same cost.

Algorithm 3 EBBoost
Require: $(X_i, y_i)_{i=1}^n$, scalar parameter $\lambda \geq 0$, and weak learners $\mathcal{H}$
  Initialize the weights: $w_i \leftarrow 1/n$ for $i = 1, \ldots, n$; initialize $f$ to predict zero on all inputs.
  for $s \leftarrow 1$ to $S$ do
    Get a weak learner $G^s(\cdot)$ that minimizes (3) with the following choice of $\alpha_s$:
    $\alpha_s = \frac{1}{4} \log \frac{ (1 - \lambda) \big( \sum_{i \in I_s} w_i \big)^2 + \lambda n \sum_{i \in I_s} w_i^2 }{ (1 - \lambda) \big( \sum_{j \in J_s} w_j \big)^2 + \lambda n \sum_{j \in J_s} w_j^2 }$
    if $\alpha_s < 0$ then break end if
    $f(\cdot) \leftarrow f(\cdot) + \alpha_s G^s(\cdot)$
    $w_i \leftarrow w_i \exp(-y_i G^s(X_i) \alpha_s)/Z_s$ where $Z_s$ is such that $\sum_{i=1}^n w_i = 1$.
  end for

5 A limitation of the EBBoost algorithm

A sample variance penalization algorithm known as EBBoost was previously explored [10]. While this algorithm was simple to implement and showed significant improvements over AdaBoost, it suffers from a severe limitation: it requires enumeration and evaluation of every possible weak learner per iteration. Recall the steps implementing EBBoost in Algorithm 3. An implementation of EBBoost requires exhaustive enumeration of weak learners in search of the one that minimizes cost (3). It is preferable, instead, to find the best weak learner by providing weights on the training examples and efficiently computing the rule whose performance on that weighted set of examples is guaranteed to be better than random guessing. However, with the EBBoost algorithm, the weight on all the misclassified examples is $\sum_{j \in J_s} w_j^2 + \big( \sum_{j \in J_s} w_j \big)^2$ and the weight on correctly classified examples is $\sum_{i \in I_s} w_i^2 + \big( \sum_{i \in I_s} w_i \big)^2$; these aggregate weights on misclassified examples and correctly classified examples do not translate into weights on the individual examples. Thus, it becomes necessary to exhaustively enumerate weak learners in Algorithm 3. While enumeration of weak learners is possible in the case of decision stumps, it poses serious difficulties in the case of weak learners such as decision trees, ridge regression, etc. Thus, VadaBoost is the more versatile boosting algorithm for sample variance penalization.
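Returning to the efficiency guarantee, the worst-case multiplier appearing in Theorem 5, $\log(4\epsilon(1-\epsilon)) / \log(2\epsilon(1-\epsilon) + \sqrt{\epsilon(1-\epsilon)})$, is easy to tabulate. The short script below is an illustrative check (not part of the derivation) confirming that the multiplier stays below 2 and is roughly 1.33 once $\epsilon > 0.4$.

```python
import numpy as np

def iteration_factor(eps):
    """Ratio of VadaBoost (lambda = 0) to AdaBoost iterations needed to reach the
    same squared exponential cost when every weak learner has weighted error eps."""
    za2 = 4.0 * eps * (1.0 - eps)                                 # (Z^a_s)^2 from (7)
    zv2 = 2.0 * eps * (1.0 - eps) + np.sqrt(eps * (1.0 - eps))    # (Z^v_s)^2 from (8)
    return np.log(za2) / np.log(zv2)

for eps in (0.05, 0.1, 0.2, 0.3, 0.4, 0.45, 0.49):
    print(f"eps = {eps:.2f}  iterations factor = {iteration_factor(eps):.3f}")
# The factor approaches 2 only as eps -> 0 and is roughly 1.33 for eps > 0.4.
```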

Table 1: Means and standard errors with decision stumps as the weak learner (columns: AdaBoost, EBBoost, VadaBoost, RLP-Boost, RQP-Boost; rows: a5a, abalone, image, mushrooms, musk, five MNIST digit-pair tasks, ringnorm, spambase, splice, twonorm, w4a, waveform, wine, wisc). [Table entries omitted.]

Table 2: Means and standard errors with CART as the weak learner (columns: AdaBoost, VadaBoost, RLP-Boost, RQP-Boost; rows as in Table 1). [Table entries omitted.]

6 Experiments

In this section, we evaluate the empirical performance of the VadaBoost algorithm with respect to several other algorithms. The primary purpose of our experiments is to compare sample variance penalization versus empirical risk minimization and to show that we can efficiently perform sample variance penalization for weak learners beyond decision stumps. We compared VadaBoost against EBBoost, AdaBoost, and regularized LP and QP boost algorithms [7]. All the algorithms except AdaBoost have one extra parameter to tune. Experiments were performed on benchmark datasets that have been previously used in [10]. These datasets include a variety of tasks including all digits from the MNIST dataset. Each dataset was divided into three parts: 50% for training, 25% for validation and 25% for test. The total number of examples was restricted to 5000 in the case of MNIST and musk datasets due to computational restrictions of solving LP/QP. The first set of experiments uses decision stumps as the weak learners. The second set of experiments used Classification and Regression Trees (CART) [3] as weak learners. A standard MATLAB implementation of CART was used without modification.
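The evaluation protocol (random 50/25/25 splits, parameter selection on the validation set, test error recorded for the chosen parameter, averaged over repetitions) can be outlined as follows. This is an illustrative Python sketch rather than the authors' original MATLAB setup; `train_boost` and `error_rate` are placeholder names for a boosting trainer and an error evaluator.

```python
import numpy as np

def evaluate(X, y, train_boost, error_rate, lambdas, n_repeats=50, seed=0):
    """Repeatedly split 50/25/25, pick the lambda with lowest validation error,
    and note the corresponding test error. Returns the mean test error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    test_errors = []
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        tr, va, te = np.split(idx, [n // 2, 3 * n // 4])   # 50% / 25% / 25%
        results = []
        for lam in lambdas:
            model = train_boost(X[tr], y[tr], lam)
            results.append((error_rate(model, X[va], y[va]),
                            error_rate(model, X[te], y[te])))
        results.sort()                      # lowest validation error first
        test_errors.append(results[0][1])   # test error at the selected lambda
    return float(np.mean(test_errors))
```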

For all the datasets, in both experiments, AdaBoost, VadaBoost and EBBoost (in the case of stumps) were run until there was no drop in the error rate on the validation set for 100 consecutive iterations. The values of the parameters for VadaBoost and EBBoost were chosen to minimize the validation error upon termination. RLP-Boost and RQP-Boost were given the predictions obtained by AdaBoost. Their regularization parameter was also chosen to minimize the error rate on the validation set. Once the parameter values were fixed via the validation set, we noted the test set error corresponding to that parameter value. The entire experiment was repeated 50 times by randomly selecting train, test and validation sets. The numbers reported here are averages from these runs.

The results for the decision stump and CART experiments are reported in Tables 1 and 2. For each dataset, the algorithm with the best percentage test error is represented by a dark shaded cell. All lightly shaded entries in a row denote results that are not significantly different from the minimum error (according to a paired t-test at a 1% significance level). With decision stumps, both EBBoost and VadaBoost have comparable performance and significantly outperform AdaBoost. With CART as the weak learner, VadaBoost is once again significantly better than AdaBoost.

We gave a guarantee on the number of iterations required in the worst case for VadaBoost (which approximately matches the AdaBoost cost (squared) in Theorem 5). An assumption in that theorem was that the error rate of each weak learner was fixed. However, in practice, the error rates of the weak learners are not constant over the iterations. To see this behavior in practice, we have shown the results of the MNIST 3 versus 8 classification experiment. In Figure 2 we show the cost (plus 1) for each algorithm (the AdaBoost cost has been squared) versus the number of iterations, using a logarithmic scale on the Y-axis. Since at $\lambda = 0$ EBBoost reduces to AdaBoost, we omit its plot at that setting. From the figure, it can be seen that the number of iterations required by VadaBoost is roughly twice the number of iterations required by AdaBoost. At $\lambda = 0.5$, there is only a minor difference in the number of iterations required by EBBoost and VadaBoost.

Figure 2: 1 + cost versus the number of iterations (curves for AdaBoost, EBBoost $\lambda = 0.5$, VadaBoost $\lambda = 0$, and VadaBoost $\lambda = 0.5$).

7 Conclusions

This paper identified a key weakness in the EBBoost algorithm and proposed a novel algorithm that efficiently overcomes the limitation to enumerable weak learners. VadaBoost reduces a well motivated cost by iteratively minimizing an upper bound which, unlike EBBoost, allows the boosting method to handle any weak learner by estimating weights on the data. The update rule of VadaBoost has a simplicity that is reminiscent of AdaBoost. Furthermore, despite the use of an upper bound, the novel boosting method remains efficient. Even when the bound is at its loosest, the number of iterations required by VadaBoost is a small constant factor more than the number of iterations required by AdaBoost. Experimental results showed that VadaBoost outperforms AdaBoost in terms of classification accuracy while efficiently applying to any family of weak learners. The effectiveness of boosting has been explained via margin theory [9], though it has taken a number of years to settle certain open questions [8]. Considering the simplicity and effectiveness of VadaBoost, one natural future research direction is to study the margin distributions it obtains. Another future research direction is to design efficient sample variance penalization algorithms for other problems such as multi-class classification, ranking, and so on.
Acknowledgements This material is based upon work supported by the National Science Foundation under Grant No. , by a Google Research Award, and by the Department of Homeland Security under Grant No. N C

References

[1] J.-Y. Audibert, R. Munos, and C. Szepesvári. Tuning bandit algorithms in stochastic environments. In ALT, 2007.
[2] P. L. Bartlett, M. I. Jordan, and J. D. McAuliffe. Convexity, classification, and risk bounds. Journal of the American Statistical Association, 101(473):138-156, 2006.
[3] L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Chapman and Hall, New York, 1984.
[4] Y. Freund and R. E. Schapire. A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1):119-139, 1997.
[5] A. Maurer and M. Pontil. Empirical Bernstein bounds and sample variance penalization. In COLT, 2009.
[6] V. Mnih, C. Szepesvári, and J.-Y. Audibert. Empirical Bernstein stopping. In COLT, 2008.
[7] G. Raetsch, T. Onoda, and K.-R. Müller. Soft margins for AdaBoost. Machine Learning, 42(3):287-320, 2001.
[8] L. Reyzin and R. Schapire. How boosting the margin can also boost classifier complexity. In ICML, 2006.
[9] R. E. Schapire, Y. Freund, P. L. Bartlett, and W. S. Lee. Boosting the margin: a new explanation for the effectiveness of voting methods. Annals of Statistics, 26(5):1651-1686, 1998.
[10] P. K. Shivaswamy and T. Jebara. Empirical Bernstein boosting. In AISTATS, 2010.
[11] V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, NY, 1995.
