Proceedings of the 17th World Congress, The International Federation of Automatic Control, Seoul, Korea, July 6-11, 2008

Principal Component Analysis Based Support Vector Machine for the End Point Detection of the Metal Etch Process

Kyounghoon Han, Seunghyok Kim, Kun Joo Park, En Sup Yoon, and Heeyeop Chae

School of Chemical & Biological Engineering, Seoul National University, Seoul, Korea (e-mail: mokdog@pslab.snu.ac.kr, hyok0@pslab.snu.ac.kr, esyoo@pslab.snu.ac.kr)
Semiconductor business part, DMS Co., Ltd., Suwon, Gyeonggi-do, Korea (e-mail: kjpark@dms.co.kr)
Department of Chemical Engineering, Sungkyunkwan University, Suwon, Gyeonggi-do, Korea (e-mail: hchae@skku.edu)

Abstract: An endpoint detection method using a principal component analysis based support vector machine was developed for the plasma etching process. Because many endpoint detection techniques use a few manually selected wavelengths, noise renders them ineffective and it is hard to select the important wavelengths. A principal component algorithm using the whole set of wavelengths was therefore developed for more effective monitoring of the end point. Support vector regression then follows for real-time end point detection with a reduced set of wavelengths to save processing time. This approach was applied and demonstrated for a metal etching process of Al and 0.5% Cu on an oxide stack with an inductively coupled BCl3/Cl2 plasma.

1. INTRODUCTION

In the semiconductor industry, plasma etching has been developed under a variety of names: plasma etching, plasma assisted etching, reactive sputter etching, reactive ion etching, etc. These all nevertheless rely on the same basic principle: removing unwanted materials by the formation of volatile reaction products in a glow discharge. When the target material or layer has been removed, the plasma etching process needs to be stopped exactly to avoid excessive over-etching, and detecting this event is called end point detection (EPD). Optical emission spectroscopy (OES) is the most widely used method for simple EPD monitoring. OES uses an optical emission spectrometer to trace the reactive species in the plasma reaction with the removable materials.
Most EPD methods using OES focus on identifying a single wavelength corresponding to a chemical species which shows a pronounced transition at the end point time. When the target layer is cleared by the etching process, the concentration of the reaction product from the target layer is reduced and that from the underlying layer is increased. Biolsi et al. [1999] demonstrated an advanced endpoint system for small open-area etching by applying threshold signal processing with a single wavelength. This single wavelength method cannot avoid the noise problem or the time delay associated with filtering. Furthermore, selection of an appropriate wavelength requires significant experience of process engineers. In addition, this single wavelength method shows its detection limitation when the open area (surface of target materials) is small or the signal is not strong enough. So this method usually works reliably only for large open area wafers (typically larger than 0%).

White et al. [2000] proposed $T^2$ and Q statistics for the endpoint detection of low open-area wafers using principal component analysis (PCA) in conjunction with $T^2$ detection and recursive mean update. This method also works reliably only for large open areas (>0%) because the endpoint feature is severely corrupted by drift in a small open area. Yue and coworkers [2001] also used PCA and extracted a reliable endpoint signal using the loading vectors. But the reduced data can decrease the sensitivity of the PCA algorithm. In this paper, we used PCA for data reduction to save processing time and for the models of EPD monitoring. Then we estimated the real-time EPD by support vector regression to increase its sensitivity.

2. THEORY

2.1 Principal Component Analysis

Because PCA [Jackson, 1991] and the Support Vector Machine (SVM) [Vapnik, 1998] are well-known tools in chemometrics nowadays, only brief introductions are given in this section. PCA decomposes the data matrix (X) as the sum of outer products of vectors $t_i$ and $p_i$ plus a residual matrix (E):

$$X = t_1 p_1^T + t_2 p_2^T + \cdots + t_k p_k^T + E \tag{1}$$

where k must be less than or equal to the smaller dimension of X. The $t_i$ vectors are defined as scores, and contain information on how the samples relate to each other.
The $p_i$ vectors are known as loadings and contain information on how the variables relate to each other. In the PCA decomposition, the $p_i$ vectors are the eigenvectors of the covariance matrix, i.e. for each $p_i$:

$$\mathrm{Cov}(X)\, p_i = \lambda_i p_i \tag{2}$$

where $\lambda_i$ is the eigenvalue associated with $p_i$. Note that for X and any $t_i$, $p_i$ pair:

$$X p_i = t_i \tag{3}$$

The score vector ($t_i$) is the linear combination of the original X variables defined by $p_i$. Because the score vector requires normalization of the data, the concept of a product was used for real-time monitoring of the score vector without normalization:

$$Y_i = X'_i P \tag{4}$$

$Y_i$ is the product value at the i-th sample time and $X'$ is the data matrix without normalization. The endpoint can be decided by monitoring the product estimated from the raw data of the real-time wafer and the loading vector of the model wafer. After comparison of all product values, the ratio of two products can be used for EPD to increase the sensitivity of monitoring.

2.2 Support Vector Machine for Regression

The SVM is a learning system that uses a hypothesis space of linear functions in a high dimensional feature space, which is trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory [Vapnik, 1995; Cherkassky and Mulier, 1998]. Suppose there is a set of training data $\{(x_1, y_1), \ldots, (x_n, y_n)\} \subset X \times R$, where X denotes the space of the input patterns. The SVM considers approximating functions of the form

$$f(x, w) = \sum_{i=1}^{n} w_i \phi_i(x) + b \tag{5}$$

where the functions $\phi_i(x)$ are features, as in nonlinear classification, $\phi(x)$ represents a mapping from the input space (x) to a feature space f, and b represents a bias term. A more generalized form of the SVM uses a kernel function $K(x_i, x_j)$, which is the inner product of the points $\phi(x_i)$, $\phi(x_j)$ mapped to feature space. The use of kernels makes it possible to map the data implicitly into a feature space and to train a linear machine in such a space, potentially side-stepping the computational problems inherent in evaluating the feature map. A list of popular kernels is shown in Table 1.

Table 1. Different types of kernel function

Name of kernel              Expression
Polynomial of degree p      $K(x_i, x_j) = [(x_i, x_j) + 1]^p$
Gaussian RBF                $K(x_i, x_j) = \exp\left(-\|x_i - x_j\|^2 / (2\sigma^2)\right)$
Multilayer perceptron       $K(x_i, x_j) = \tanh[(x_i, x_j) + b]$

Vapnik [1995] introduced a general type of error function, which is known as the linear loss function with ε-insensitivity zone:

$$|y - f(x, w)|_\varepsilon = \begin{cases} 0, & \text{if } |y - f(x, w)| \le \varepsilon \\ |y - f(x, w)| - \varepsilon, & \text{otherwise} \end{cases}$$

The loss is equal to zero if the difference between the predicted value $f(x, w)$ and the measured value is less than ε. The support vector regression needs to minimize the following risk function:

$$\text{minimize} \quad \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \tag{6}$$

$$\text{subject to} \quad y_i - w\phi(x_i) - b \le \varepsilon + \xi_i, \qquad w\phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0 \tag{7}$$

where $\xi_i$ and $\xi_i^*$ are slack variables, which take positive values in order to quantify the non-separable data in the defining condition of the hyperplane. The constant C is determined by the trade-off between the model complexity of f and its accuracy on the training data. The parameters used in support vector regression are shown in Fig. 1. This constrained optimization is solved by forming a primal variables Lagrangian, $L_p(w, \xi, \xi^*)$:

$$L_p(w, b, \xi, \xi^*, \alpha, \alpha^*, \beta, \beta^*) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) - \sum_{i=1}^{n} \alpha_i \left( \varepsilon + \xi_i - y_i + w\phi(x_i) + b \right) \tag{8}$$

$$\qquad - \sum_{i=1}^{n} \alpha_i^* \left( \varepsilon + \xi_i^* + y_i - w\phi(x_i) - b \right) - \sum_{i=1}^{n} \left( \beta_i \xi_i + \beta_i^* \xi_i^* \right) \tag{9}$$
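As a reading aid, the step from the primal Lagrangian to the dual problem can be sketched through its stationarity conditions. This is the standard textbook derivation for ε-insensitive support vector regression, written under the assumption that $\alpha_i$, $\beta_i$ are the multipliers attached to the constraints involving $\xi_i$ and $\alpha_i^*$, $\beta_i^*$ those involving $\xi_i^*$; it is not reproduced from the original text.

```latex
\begin{align*}
\frac{\partial L_p}{\partial w} = 0 &\;\Rightarrow\; w = \sum_{i=1}^{n} (\alpha_i - \alpha_i^{*})\,\phi(x_i) \\
\frac{\partial L_p}{\partial b} = 0 &\;\Rightarrow\; \sum_{i=1}^{n} \alpha_i = \sum_{i=1}^{n} \alpha_i^{*} \\
\frac{\partial L_p}{\partial \xi_i} = 0 &\;\Rightarrow\; C - \alpha_i - \beta_i = 0 \\
\frac{\partial L_p}{\partial \xi_i^{*}} = 0 &\;\Rightarrow\; C - \alpha_i^{*} - \beta_i^{*} = 0
\end{align*}
```

Because $\beta_i, \beta_i^* \ge 0$, the last two conditions bound the multipliers by $0 \le \alpha_i, \alpha_i^* \le C$, and substituting the first condition into $L_p$ turns every inner product of mapped points into a kernel evaluation $K(x_i, x_j)$.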
Fig. 1. The insensitive band for a non-linear regression function [Vapnik, 1995]

The Lagrangian $L_p(w, b, \xi, \xi^*, \alpha, \alpha^*, \beta, \beta^*)$ must be minimized with respect to the primal variables w, b, $\xi$ and $\xi^*$, and maximized with respect to the non-negative Lagrange multipliers $\alpha$, $\alpha^*$, $\beta$ and $\beta^*$. Again, this problem can be solved in either the primal or the dual space. By applying the Karush-Kuhn-Tucker (KKT) conditions for regression, the dual variables Lagrangian $L_d(\alpha, \alpha^*)$ is maximized:

$$L_d(\alpha, \alpha^*) = -\varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) - \frac{1}{2} \sum_{i,j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) \tag{10}$$

$$\text{subject to} \quad \sum_{i=1}^{n} \alpha_i = \sum_{i=1}^{n} \alpha_i^*, \qquad 0 \le \alpha_i, \alpha_i^* \le C, \quad i = 1, \ldots, n \tag{11}$$

With the assumption on the slack variables $\xi_i$ in the range $0 < \alpha_i, \alpha_i^* < C$, the regression function is represented using the previous kernel function as follows:

$$f(x) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) K(x_i, x) + b \tag{12}$$

3. PCA BASED SVM FOR EPD

3.1 Principal component analysis for the product with the whole wavelengths

Initially, the entire range of OES signals from the standard wafer was captured and normalized. The covariance of this normalized data was obtained and singular value decomposition (SVD) was performed. The loading vectors were obtained by solving the eigenvalue problem of the SVD result. Finally, the product value of the standard wafer was calculated by multiplying the raw OES data by the loading vectors, as shown in Fig. 2.

Fig. 2. PCA based SVM model for real-time EPD

3.2 PCA based wavelengths reduction

If the whole OES data set has to be used and saved, it can impose a considerable memory burden on real-time processes. A reduction technique that keeps the wavelengths with important information was introduced by Yue et al. [2001]. The wavelengths from the standard wafer can be sorted by their importance using the loading vectors:

$$l_i = \sum_{j} l_{ij} \tag{13}$$

where $l_{ij}$ denotes the loading of the i-th wavelength on the j-th PC. In general, two or three loading vectors were selected, which hold more than 80% of the main information. The reduced wavelengths of the real-time target wafer can then be selected by the above criterion.
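The steps of Sections 3.1 and 3.2 can be illustrated with a minimal sketch in plain Python (standard library only). The synthetic six-channel trace, the helper names (`autoscale`, `leading_loadings`, `rank_wavelengths`, `product`) and the use of a single principal component are illustrative assumptions, not the authors' implementation. Power iteration stands in for the SVD step, and the importance score sums squared loadings, which is only one plausible reading of (13).

```python
import math

def autoscale(X):
    """Mean-center each column (wavelength) and scale it to unit variance."""
    n, m = len(X), len(X[0])
    mean = [sum(r[j] for r in X) / n for j in range(m)]
    std = [math.sqrt(sum((r[j] - mean[j]) ** 2 for r in X) / (n - 1)) for j in range(m)]
    return [[(r[j] - mean[j]) / std[j] for j in range(m)] for r in X]

def covariance(Z):
    """Sample covariance of already centered data."""
    n, m = len(Z), len(Z[0])
    return [[sum(r[a] * r[b] for r in Z) / (n - 1) for b in range(m)] for a in range(m)]

def leading_loadings(C, k, iters=300):
    """First k eigenvectors of a symmetric matrix by power iteration with deflation."""
    m = len(C)
    C = [row[:] for row in C]
    P = []
    for _ in range(k):
        v = [float(j + 1) for j in range(m)]  # arbitrary, non-symmetric start vector
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(m)) for a in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[a] * C[a][b] * v[b] for a in range(m) for b in range(m))
        P.append(v)
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(m)] for a in range(m)]
    return P  # P[j][i] is the loading of wavelength i on PC j

def rank_wavelengths(P):
    """Sort wavelength indices by summed squared loadings (assumed form of the score)."""
    m = len(P[0])
    score = [sum(p[i] ** 2 for p in P) for i in range(m)]
    return sorted(range(m), key=lambda i: score[i], reverse=True)

def product(X_raw, p):
    """Product value per sample: raw (non-normalized) spectrum times a loading vector."""
    return [sum(x * w for x, w in zip(row, p)) for row in X_raw]

def synthetic_oes(n_samples=60, endpoint=40):
    """Hypothetical trace: channels 1 and 4 step at the end point, the rest only ripple."""
    rows = []
    for t in range(n_samples):
        after = t >= endpoint
        row = [0.5 + 0.02 * math.sin(1.7 * (j + 1) * t + j) for j in range(6)]
        row[1] = (0.4 if after else 1.0) + 0.02 * math.sin(0.9 * t)
        row[4] = (0.8 if after else 0.3) + 0.02 * math.sin(1.3 * t + 1.0)
        rows.append(row)
    return rows

X_raw = synthetic_oes()
P = leading_loadings(covariance(autoscale(X_raw)), k=1)
ranked = rank_wavelengths(P)
Y = product(X_raw, P[0])
print("ranked channels:", ranked)
```

On this toy data the two channels that step at the end point (indices 1 and 4) receive the highest scores, and the product trace `Y` shifts when the step occurs. A single PC is kept here because the synthetic data have only one source of variation.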
3.3 SVR with reduced wavelengths for EPD

The SVM was trained on the standard wafer with the reduced wavelengths as input variables and the selected product ratio of the whole wavelengths as output variables. The reduced wavelengths from the real-time target wafer can then be used to estimate the product ratios by support vector regression (SVR). EPD can be decided by monitoring the product ratio values, which change significantly. The standard wafer should be updated periodically to account for process drift, as shown in Fig. 3. The more frequently the standard wafer is updated, the more exact the EPD prediction can be. Because a vast change of data can cause a process memory failure, the period of model update should be optimized.
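The regression step can be sketched as follows, again in plain Python. This is a hedged illustration rather than the authors' solver: it trains the dual by coordinate ascent on $\beta_i = \alpha_i - \alpha_i^*$ and drops the explicit bias b, relying on the constant term inside the polynomial kernel $[(x_i, x_j) + 1]^p$, which removes the equality constraint of (11). The inputs are assumed to be rescaled to [-1, 1], and the function `ratio` is a hypothetical stand-in for the product ratio of the standard wafer. The values p = 2, C = 5000 and ε = 0 follow the settings quoted in the results section as extracted.

```python
import math

def poly_kernel(x, z, degree=2):
    """Polynomial kernel with a constant term: K(x, z) = [(x . z) + 1]^p."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** degree

def fit_svr(X, y, C=5000.0, eps=0.0, sweeps=200):
    """Coordinate ascent on the SVR dual in beta_i = alpha_i - alpha_i*, without bias."""
    n = len(X)
    K = [[poly_kernel(X[i], X[j]) for j in range(n)] for i in range(n)]
    beta = [0.0] * n
    f = [0.0] * n  # f[i] = sum_j K[i][j] * beta[j], kept up to date
    for _ in range(sweeps):
        for i in range(n):
            g = y[i] - (f[i] - K[i][i] * beta[i])      # residual without point i
            mag = max(abs(g) - eps, 0.0) / K[i][i]     # soft threshold by epsilon
            new = max(-C, min(C, math.copysign(mag, g)))  # box constraint |beta| <= C
            delta = new - beta[i]
            if delta != 0.0:
                for j in range(n):
                    f[j] += K[j][i] * delta
                beta[i] = new
    return beta

def predict(X_train, beta, x):
    """f(x) = sum_i beta_i K(x_i, x)."""
    return sum(b * poly_kernel(xi, x) for xi, b in zip(X_train, beta))

def ratio(x1, x2):
    """Hypothetical smooth product ratio as a function of two reduced channels."""
    return 0.8 + 0.5 * x1 - 0.3 * x2 + 0.2 * x1 * x2

# "Standard wafer": a 5 x 5 grid of scaled intensities of two reduced channels.
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
X_std = [(a, b) for a in grid for b in grid]
y_std = [ratio(a, b) for a, b in X_std]
beta = fit_svr(X_std, y_std)

# "Target wafer": unseen channel readings, product ratio estimated by the SVR.
X_new = [(0.25, -0.75), (-0.6, 0.1), (0.9, 0.9)]
errors = [abs(predict(X_std, beta, x) - ratio(*x)) for x in X_new]
print("max abs error:", max(errors))
```

With ε = 0 and noise-free targets the fit reduces to interpolation, so the estimated ratios for the unseen readings should agree closely with the hypothetical reference function. With measured spectra, a non-zero ε and a tuned C would normally be worth exploring.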
Fig. 3. Real-time EPD estimation model update

4. CASE STUDY

For the case study, an open data source for the monitoring problem in semiconductor processing was used (http://www.software.eigenvector.com/data/etch/, Eigenvector Research Inc.). The goal of this process was to etch a TiN/Al-0.5%Cu/TiN/oxide stack to form a metal line employing an inductively coupled BCl3/Cl2 plasma, as shown in Fig. 4. Our focus was only on the Al-stack etch process, which was performed on the commercially available Lam 9600 plasma etch tool. The collected OES data consist of 40 process set points and the measured variables sampled at 1 second intervals. The experiment was done at three different times to account for condition drift, and consisted of 126 wafers. The wafers were divided into 3 experiment groups, one for each of the 3 experiment times.

Fig. 4. Wafer composition for the metal etch [Wise, B. M., et al., 2004] (BCl3/Cl2 ICP over a TiN / Al-0.5%Cu / TiN / SiO2 stack)

In the first experiment, the 1st, 11th, 21st and 31st wafers were used as model wafers and the EPDs of the other wafers were predicted by our algorithm. The model was updated every 10 wafers. Similarly, there were four modelling groups in the second experiment and in the third experiment for prediction and model update. Fig. 5 shows the real-time model update for the SVM prediction for EPD.

Fig. 5. Real-time SVM model update

5. RESULTS

The optical emission signals of the 129 channels from 50 nm to 79.5 nm were measured. The data were normalized for PCA, and singular value decomposition (SVD) was performed on the covariance of the normalized data of each model wafer. Then the loading vectors were obtained by solving the eigenvalue problem of the SVD result. The entire range of OES signals was multiplied by the loading vector to draw the product line. Fig. 6 shows the 4-dimensional data set, which is composed of the signal intensity (arbitrary unit), samples (sec), channels (nm) and batch (wafer number). The 129 channels of each model wafer were ranked by the PCA based wavelengths reduction algorithm. The SVM was trained for each model with the highest ranked 20% of channels as input values and the previous product lines (first, second) as output values. The kernel function used for modelling was the polynomial of 2nd degree and the parameter C was set to 5,000. The ε-insensitivity zone of the loss function was set to zero for simple regression in this study. Then we monitored the ratio of the predicted first product value over the second value. Finally, the SVR was used for the real-time EPD prediction with the real-time 20% reduced channel data (26 channels). Because the first wafer of each of the three experiments cannot be monitored, 123 of the 126 wafers were monitored for EPD using both the single PCA and the PCA-based SVM methods.

Table 2. Wafer classification

Experiment   Role     Wafer numbers
1            Model    1        11       21        31
1            Target   2-11     12-21    22-31     32-42
2            Model    43       53       64        73
2            Target   44-53    54-63    64-73     74-85
3            Model    86       96       106       116
3            Target   87-96    97-106   107-116   117-126
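The text states that the end point is read from a significant change in the monitored product ratio but does not give a decision rule. The sketch below shows one simple possibility under stated assumptions: a baseline mean and standard deviation are taken from the first samples of the trace, and the end point is flagged at the first sample that starts a run of consecutive excursions. The function name `detect_endpoint`, its parameters (`baseline_len`, `n_sigma`, `hold`) and the synthetic trace are hypothetical and are not taken from the paper.

```python
import math
import statistics

def detect_endpoint(ratio, baseline_len=15, n_sigma=6.0, hold=3):
    """Return the first index of `hold` consecutive samples that deviate from the
    baseline mean by more than n_sigma baseline standard deviations, else None."""
    base = ratio[:baseline_len]
    mu = statistics.mean(base)
    sigma = statistics.stdev(base)
    run = 0
    for t in range(baseline_len, len(ratio)):
        run = run + 1 if abs(ratio[t] - mu) > n_sigma * sigma else 0
        if run == hold:
            return t - hold + 1
    return None

# Hypothetical ratio trace: flat with a small ripple, then a ramp down from sample 40.
trace = [max(0.6, 1.0 - 0.05 * max(0, t - 39)) + 0.005 * math.sin(0.8 * t)
         for t in range(60)]
print("end point flagged at sample:", detect_endpoint(trace))
```

Requiring several consecutive excursions is a design choice that trades a short confirmation delay for robustness against single-sample noise spikes; the thresholds would need tuning on real wafers.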
Fig. 6. Data stack of metal etch with plasma gas

Figs. 7, 8, 9 and 10 show the monitoring results of our algorithm (top) and of the single PCA estimation (bottom) for several important wafers. With our algorithm we could find the exact change of the prediction curve at about 3 to 0 seconds, whereas the single PCA prediction with the small data showed some fluctuations, as in Fig. 7 (9th wafer), Fig. 8 (33rd wafer), Fig. 9 (58th wafer) and Fig. 10 (88th wafer).

Fig. 7. EPD monitoring comparison of the 9th wafer
Fig. 8. EPD monitoring comparison of the 33rd wafer
Fig. 9. EPD monitoring comparison of the 58th wafer
Fig. 10. EPD monitoring comparison of the 88th wafer

The monitoring results of 120 wafers (98%) showed exact end point behaviours using the PCA-based SVM method; the 3 exceptions were due to faulty conditions. In contrast, only 59 wafers (48%) showed the end point behaviours using the single PCA method, with bad cases shown in Fig. 7 to Fig. 10.

6. CONCLUSIONS

In this paper, the PCA based SVM algorithm was developed for the real-time end point prediction of the plasma etching process. This algorithm was applied to the Al-0.5%Cu metal etch stack with BCl3/Cl2 plasma gas in an inductively coupled etcher. At first, 126 data sets were collected with 129 wavelengths per second. The PCA loading vector was then calculated from the model wafer for the wavelengths ranking and the product value. The SVM modelling was carried out with the reduced 20% high ranking data as input values and the product with the whole wavelengths as output values. Finally, the SVR was performed with the reduced wavelengths of every wafer for the real-time end point prediction. While the traditional single PCA method can show its sensitivity limitation with reduced wavelengths, the suggested PCA based SVM model shows better prediction figures with a small process burden.

NOMENCLATURES

X: data matrix (PCA), input space (SVM)
y: output space
n: dimension
t: score
p: loading
E: residual matrix
Cov(X): covariance of matrix X
R: real number domain
C: constant of support vector regression model
$K(x_i, x_j)$: kernel function
ε: error probability
Y: product value
f(x, w): feature space of data x with weight vector w
w: weight vector
φ: mapping
b: bias
ξ: margin slack variable
α, β: Lagrange multipliers
$L_p$: primal variables Lagrangian function
$L_d$: dual variables Lagrangian function
$|\xi|_\varepsilon$: ε-insensitivity loss function

REFERENCES

Biolsi, P., Morvay, D., Drachnik, and Elger, S., An advanced Endpoint Solution <1% Open Area Applications; Contact and Via, IEEE/SEMI Advanced Semiconductor Manufacturing Conference, 39 (1996)
Chiang, L. H., Russell, E. L. and Braatz, R. D., Fault Detection and Diagnosis in Industrial Systems, Springer-Verlag (2001)
Dreeskornfeld, L., et al., Reactive ion etching end point detection of microstructured Mo/Si multilayers by optical emission spectroscopy, Microelectronic Engineering, 54, 303 (2000)
Grill, A., Cold Plasma in Materials Fabrication, IEEE Press (1994)
Kim, H. S., et al., Etch end-point detection of GaN-based device using optical emission spectroscopy, Materials Science & Engineering, B8, 59 (00)
Kim, K. S. and Ko, J. W., Real-time risk monitoring system for chemical plants, Korean J. Chem. Eng. 22(), 6 (2005)
Pearton, S. J., et al., Optical emission end point detection for via hole etching in InP and GaAs power device structures, Materials Science Engineering, B3, 36 (1999)
Qin, S. J., et al., Semiconductor manufacturing process control and monitoring: A fab-wide framework, Journal of Process Control, 16, 179 (2006)
Seo, S. T., et al., Run-to-run control of inductively coupled C2F6 plasma of SiO2: Multivariable controller design and numerical application, Korean J. Chem. Eng. 23(), 99 (2006)
Vapnik, V. N., Statistical Learning Theory, John Wiley & Sons, Inc. (1998)
White, D., Goodlin, B. E., Gower, A. E., Boning, D. S., Chen, H., Sawin, H. H. and Dalton, T. J., Low open-area endpoint detection using a PCA-based T² statistic and Q statistic on optical emission spectroscopy measurements, IEEE Trans. Semiconductor Manufacturing, 13(2), 193 (2000)
Wise, B. M., et al., A comparison of principal component analysis, multiway principal component analysis, trilinear decomposition and parallel factor analysis for fault detection in a semiconductor etch process, Journal of Chemometrics, 13, 379 (1999)
Wise, B. M., et al., PLS_Toolbox 3.5 for use with MATLAB, Eigenvector Research (2004)
Yue, H. H., et al., Plasma etching endpoint detection using multiple wavelengths for small open-area wafers, Journal of Vacuum Science & Technology, A 19, 66 (2001)
http://www.software.eigenvector.com/data/etch/