Design of Experiments in Semiconductor Manufacturing

- Comparison of Treatments: which recipe works the best?
- Simple Factorial Experiments: to explore the impact of a few variables
- Fractional Factorial Experiments: to explore the impact of many variables
- Regression Analysis: to create analytical expressions that model process behavior
- Response Surface Methods: to visualize process performance over a range of input parameter values
Design of Experiments

Objectives:
- Compare Processing Recipes
- Find the Parameters that Matter
- Create Models to Predict Process Results

Problems:
- Experimental / Measurement Error
- Confusion of Correlation with Causation
- Complexity of the Effects we study
Problems Solved

- Compare recipes
- Choose the recipe that gives the best results
- Organize experiments to facilitate the analysis
- Use experimental results to build process models
- Use models to optimize the process
Comparison of Treatments

- Internal and External References
- The Importance of Independence
- Blocking and Randomization
- Analysis of Variance
Using an External Reference to Make a Decision

An external reference can be used to decide whether a new observation is different from a group of old observations.

Example: Create a comparison procedure for lot yield monitoring. Do it without "statistics". Use external reference data (historical data from the same process, but not from the same experiment).
Example: Using an External Reference

To compare the difference between the averages of successive groups of ten lots, I build the histogram from the reference data. Each new point can then be judged on the basis of the reference data. The only assumption here is that the reference data is relevant to my test!

[Figure: histogram of the reference data, marking the points generated by the good process that fail the test (type I errors).]
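The procedure above can be sketched in a few lines of Python. The lot-yield numbers here are invented for illustration (the slide's actual reference data is not given):

```python
import random

random.seed(0)

# Hypothetical historical lot yields (percent) from the in-control process;
# these stand in for the slide's reference data.
history = [random.gauss(85.0, 2.0) for _ in range(500)]

# Averages of successive, non-overlapping groups of ten lots
group_means = [sum(history[i:i + 10]) / 10 for i in range(0, len(history), 10)]

# Reference distribution: differences between consecutive group averages
diffs = sorted(group_means[i + 1] - group_means[i]
               for i in range(len(group_means) - 1))

# Empirical two-sided limits that a good process exceeds ~5% of the time;
# points beyond them are the type I errors noted on the slide.
lo = diffs[int(0.025 * len(diffs))]
hi = diffs[int(0.975 * len(diffs))]

def flag(new_mean, prev_mean):
    """Judge a new ten-lot average against the reference distribution."""
    d = new_mean - prev_mean
    return d < lo or d > hi
```

No distributional assumption is made; the decision limits come entirely from the empirical histogram of the reference data.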
Using an Internal Reference

We could generate an "internal" reference distribution from the very data we are comparing. Sampling must be random, so that the data is independently distributed. Independence would allow us to use statistics such as the arithmetic average or the sum of squares. Internal references are based on randomization.
Randomization Example

Is recipe A different from recipe B?

[Figure: etch rate (630-660) versus sample number (1-10) for recipes A and B, with dot plots of the two groups and their means marked.]
Randomization Example - cont.

There are many ways to decide this:
1. External reference distribution (based on old data).
2. Assumed, approximate external reference distribution (such as student-t, normal, etc.).
3. Internal reference distribution.
4. "Distribution free" tests.

Options 2, 3 and 4 depend on the assumption that the samples are independently distributed.
Randomization Example - cont.

If there were no difference between A and B, then I can assume that I just have one out of the 10!/(5!5!) = 252 possible arrangements of labels A and B. I use the data to calculate the differences in means for all the combinations.
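This enumeration is easy to carry out directly. The etch-rate values below are hypothetical (the slide's raw data appear only in its figure), but the procedure is exactly the one described: relabel, recompute, and build the internal reference distribution.

```python
from itertools import combinations

# Hypothetical etch rates for the ten runs
rates = [650, 644, 647, 641, 652, 649, 645, 651, 643, 648]
a_idx = {0, 2, 4, 6, 8}            # runs that actually received recipe A

def mean_diff(a_set):
    """Mean of the B-labeled runs minus mean of the A-labeled runs."""
    a = sum(rates[i] for i in a_set) / 5.0
    b = sum(rates[i] for i in range(10) if i not in a_set) / 5.0
    return b - a

obs = mean_diff(a_idx)             # observed B - A difference

# Under H0 the labels are arbitrary: all 10!/(5!5!) = 252 relabelings
# are equally likely, giving the randomized reference distribution.
diffs = [mean_diff(set(c)) for c in combinations(range(10), 5)]

# One-sided p-value: fraction of relabelings at least as extreme as observed
p = sum(1 for d in diffs if d >= obs) / len(diffs)
```

Note that the distribution is symmetric about zero by construction, since every relabeling is paired with its mirror image.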
The Origin of the student-t Distribution

The student-t distribution was, in fact, defined to approximate such randomized distributions, when the parent distribution is normal!

t_0 = [(ȳ_B - ȳ_A) - (μ_B - μ_A)] / [s √(1/n_A + 1/n_B)]

For the etch example, t_0 = 0.44 and Pr(t > t_0) = 0.34, while the randomized distribution gives 0.33.
Example in Blocking

Compare recipes A and B on five machines. If there are inherent differences from one machine to the other, what scheme would you use? (Each column below is one machine.)

Random:   A A   A B   A B   A B   B B
Blocked:  A B   B A   B A   A B   B A
Example in Blocking - cont.

With the blocked scheme, we could calculate the A-B difference for each machine. The machine-to-machine average of these differences could be randomized:

d̄ = (±d_1 ± d_2 ± d_3 ± d_4 ± d_5) / 5

(d̄ - δ) / (s_d / √n) ~ t_{n-1}

In general, randomize what you don't know and block what you do know.
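The sign-flip randomization above can be enumerated exhaustively: with five machines there are only 2^5 = 32 equally likely sign patterns under the null hypothesis. The per-machine differences below are hypothetical, for illustration only.

```python
from itertools import product

# Hypothetical A-minus-B etch differences, one per machine (blocked scheme)
d = [3.0, 1.5, 2.5, 0.5, 2.0]
n = len(d)
obs = sum(d) / n                   # observed mean difference

# Under H0 each difference is equally likely to be +d_i or -d_i,
# giving 2**5 = 32 equally likely values of the randomized mean.
means = [sum(s * x for s, x in zip(signs, d)) / n
         for signs in product((1, -1), repeat=n)]

# Two-sided p-value against the internal reference distribution
p = sum(1 for m in means if abs(m) >= abs(obs)) / len(means)
```

Here only the two all-same-sign patterns are as extreme as the observed mean, so p = 2/32; it is this randomized distribution that the paired t-test approximates.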
Analysis of Variance

[Figure: dot plot of etch rate (610-660) for recipe types A, B, C, D, with each group's mean marked.]

Your question: Are the four treatments the same or not?
The statistician's question: Are the discrepancies between the groups greater than the variation within each group?
Calculations for our Example

Treatment   i=1   i=2   i=3   i=4   i=5     Avg ȳ_t   S_t      ν_t   (ȳ_t - ȳ)²
1:          650   648   632   645   641     643.20    202.80    4      25.00
2:          645   650   638   643   640     643.20     86.80    4      25.00
3:          623   628   630   620   618     623.80    104.80    4     207.36
4:          645   640   648   642   638     642.60     63.20    4      19.36

s_R² = (202.80 + 86.80 + 104.80 + 63.20) / 16 = 28.60
s_T² = 5 (25.00 + 25.00 + 207.36 + 19.36) / 3 = 461.20
s_T² / s_R² = 16.1
Variation Within Treatment Groups

First, let's assume that all groups have the same spread, and that each group is normally distributed. The following is used to estimate their common σ²:

S_t = Σ_{j=1}^{n_t} (y_tj - ȳ_t)²,   s_t² = S_t / (n_t - 1)

s_R² = (ν_1 s_1² + ν_2 s_2² + ... + ν_k s_k²) / (ν_1 + ν_2 + ... + ν_k) = S_R / (N - k) = S_R / ν_R

This is an estimate of the unknown, within-group σ². It is called the within-treatment mean square.
Variation Between Treatment Groups

Let us now form H₀ by assuming that all the groups have the same mean. If there are no real differences between the groups, a second estimate of σ² would be:

s_T² = Σ_{t=1}^{k} n_t (ȳ_t - ȳ)² / (k - 1) = S_T / ν_T

This is the between-treatment mean square. If all the treatments are the same, then the within and between treatment mean squares are estimating the same number!
What if the Treatments are Different?

If the treatments are different, then s_T² estimates

σ² + Σ_{t=1}^{k} n_t τ_t² / (k - 1),   where τ_t = μ_t - μ

In other words, the between-treatment mean square is inflated by a factor proportional to the spread among the treatments!
Final Test for Treatment Significance

Therefore, the hypothesis of equivalence is rejected if s_T² / s_R² is significantly greater than 1.0. This can be formalized, since:

s_T² / s_R² ~ F_{k-1, N-k}
More Sums of Squares

A measure of the overall variation:

S_D = Σ_{t=1}^{k} Σ_{j=1}^{n_t} (y_tj - ȳ)²,   s_D² = S_D / (N - 1) = S_D / ν_D

Obviously (actually, this is not so obvious, but it can be proven):

S_D = S_T + S_R   and   ν_D = ν_T + ν_R
ANOVA Table

Source of Var   Sum of squares   DFs            Mean square
between         S_T              ν_T = k - 1    s_T²
within          S_R              ν_R = N - k    s_R²
total           S_D              ν_D = N - 1    s_D²
ANOVA Table (full)

Source of Var   Sum of sq   DFs            Mean sq
average         S_A         ν_A = 1        s_A²
between         S_T         ν_T = k - 1    s_T²
within          S_R         ν_R = N - k    s_R²
total           S           ν = N
ANOVA for our Example

Data File: CompEtch

Source           Sum of Squares   Deg. of Freedom   Mean Squares   F-Ratio   Prob > F
Between Recipe   1.38e+3          3                 4.61e+2        1.61e+1   4.9e-5
Error            4.58e+2          16                2.86e+1
Total            1.84e+3          19
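The whole table can be reproduced from the etch-rate data in the calculation slide above; a minimal sketch, with no statistics library needed for the sums of squares:

```python
# Etch-rate data from the four-recipe example
groups = [
    [650, 648, 632, 645, 641],   # recipe A
    [645, 650, 638, 643, 640],   # recipe B
    [623, 628, 630, 620, 618],   # recipe C
    [645, 640, 648, 642, 638],   # recipe D
]

k = len(groups)
N = sum(len(g) for g in groups)
grand = sum(sum(g) for g in groups) / N          # grand mean ȳ

# Between-, within-, and total sums of squares
S_T = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
S_R = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
S_D = sum((y - grand) ** 2 for g in groups for y in g)

s2_T = S_T / (k - 1)     # between-treatment mean square, 3 DFs
s2_R = S_R / (N - k)     # within-treatment mean square, 16 DFs
F = s2_T / s2_R          # F-ratio, compared against F_{3,16}
```

Running this gives S_T ≈ 1383.6, S_R ≈ 457.6 and F ≈ 16.1, matching the table, and confirms numerically that S_D = S_T + S_R.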
Decomposition of Observations

Y = A + T + R

In vector form:

y_ti = ȳ + (ȳ_t - ȳ) + (y_ti - ȳ_t)

where the vectors are free to move in spaces of dimension N, 1, k - 1 and N - k, respectively. The term "degrees of freedom" refers to the dimensionality of the space each vector is free to move in.
Geometric Interpretation of ANOVA

Y = A + D. Easy to prove that A ⊥ D.
D = R + T. Easy to prove that R ⊥ T and A ⊥ R.

[Figure: right-triangle decomposition of the vectors Y, A, D, R and T.]
ANOVA Model and Diagnostics

y_ti = μ_t + e_ti,   e_ti ~ N(0, σ²)

So, the "sufficient statistics" are s_R², ȳ_1, ȳ_2, ..., ȳ_k, as estimators of σ², μ_1, μ_2, ..., μ_k. This model describes the data sufficiently; its values are the sufficient statistics of the dataset.

For our example:  ȳ_A = 643.20, ȳ_B = 643.20, ȳ_C = 623.80, ȳ_D = 642.60, s_R² = 28.6

According to this model, the residuals are IIND (independently and identically normally distributed). How does one verify that?
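A first step in verifying the IIND assumption is simply to compute the residuals e_ti = y_ti - ȳ_t and examine them by group, by fitted value, and in time order. A sketch using the etch data from our example:

```python
# Four-recipe etch data from the example above
groups = {
    "A": [650, 648, 632, 645, 641],
    "B": [645, 650, 638, 643, 640],
    "C": [623, 628, 630, 620, 618],
    "D": [645, 640, 648, 642, 638],
}

# Residuals from the ANOVA model: e_ti = y_ti - ybar_t
residuals = {}
for name, g in groups.items():
    m = sum(g) / len(g)
    residuals[name] = [y - m for y in g]

# Quick checks: residuals must sum to zero within each group (by
# construction), and no group's spread should dominate the others
ranges = {name: max(r) - min(r) for name, r in residuals.items()}
```

In practice one would plot these residuals versus treatment, group mean, and run sequence (as the summary slide suggests); any trend or unequal spread in those plots contradicts the IIND assumption.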
ANOVA Example: Poly Deposition

Are these recipes significantly different?

[Figure: deposited thickness (50-200 nm) by recipe, for recipes E, B, C, D, A, F.]

Analysis of Variance
Source    DF    Sum of Squares   Mean Square   F Ratio   Prob > F
Model     5     26970            5394          20.96     0.0000
Error     227   58419            257
C Total   232   85388
Residual Plots:

[Figure: four panels of residual thickness, residuals spanning roughly -40 to +40 nm.]
Residual Plots (cont):

[Figure: residual thickness (-40 to +90 nm) plotted versus deposition recipe and versus wafer vendor (V1, V2).]
ANOVA Summary

- Plot originals
- Construct ANOVA table
- Are the treatment effects significant?
- Plot residuals versus: treatment, group mean, time sequence, other?

ANOVA is the basic tool behind most empirical modeling techniques.