AIRFOILS CLASSIFICATION USING PRINCIPAL COMPONENTS ANALYSIS (PCA)

Camila Becker, camila_becker_87@hotmail.com
Post Graduation in Industrial Systems and Processes, University of Santa Cruz do Sul
96815-900, Santa Cruz do Sul, RS, Brazil

Rubén Edgardo Panta Pazos, rpazos@unisc.br and rpp@impa.br
Department of Mathematics and Post Graduation in Industrial Systems and Processes, University of Santa Cruz do Sul
96815-900, Santa Cruz do Sul, RS, Brazil

Abstract. The importance of air transport has grown considerably in recent decades, and aircraft are consequently the subject of much research. Airfoils, or aerodynamic profiles, are one focus of study. An airfoil is essentially a two-dimensional section that makes flight possible through changes in the velocity of the flow around it. In aircraft, airfoils are present in the wings and the empennage: the former are generally asymmetric airfoils (generating lift and a greater moment, so the drag is lower), and the latter are symmetric airfoils. The objective of this study is to classify airfoils using principal components analysis (PCA), a statistical technique that seeks patterns representing the variation in many variables using a smaller number of factors. It works by building a new system of principal components to represent the samples, so that fewer dimensions need to be considered. The method therefore has lower computational complexity, and the effort needed to obtain results is reduced. The methodology is as follows. Initially, the data are digitized. In the next stage, pre-processing is carried out. Subsequently, the correlation matrix is estimated. In the fourth stage, the eigenvalues and eigenvectors of the correlation matrix are determined, and the eigenvectors are ordered by decreasing eigenvalue. Then the most representative eigenvalues and their associated eigenvectors are chosen in order to form the characteristic vector.
Subsequently, the sample is projected into a new vector subspace. Finally, the image is classified against the database formed. The results for classifying airfoils using principal components analysis were favorable; they were obtained with a computer algebra system.

Keywords: Airfoils, Principal Components Analysis, computer algebra system, correlation matrix.

1. INTRODUCTION

Currently, much research is carried out on aircraft components, owing to the importance of air transport. One focus of study is the airfoil, because of its remarkable role in the aerodynamics not only of aircraft but also of cars. In airplanes, airfoils are employed as sections of the wing. In racing cars, airfoils are of great importance because they give the vehicle greater stability by providing greater adherence at the rear wheels. In this work, aerodynamic profiles, or airfoils, are classified objectively using Principal Component Analysis (PCA), considering some key components.

This paper is organized as follows. The following section presents some considerations about airfoils. Section 3 outlines the main ideas of Principal Component Analysis (PCA). Section 4 presents some results. Finally, conclusions and possible extensions of this work are given.

2. AIRFOILS

Aerodynamics began to have industrial importance with the advent of airplanes and automobiles, which need to move with the least possible air resistance in order to travel faster and spend less fuel. The study of airfoils was an important step in aerodynamics: an airfoil is a section capable of generating lift (which allows the aircraft to rise into the air and remain there during flight) while producing the lowest possible drag (the aerodynamic force opposing the movement of an object).

Figure 1. The balancing forces on an airplane.

Basically, airfoils are classified as symmetrical or asymmetrical. Both have advantages: the former are simple to construct and adapt easily to the purposes of the flight, while the latter have greater aerodynamic efficiency. Figure 2 shows the components of an airfoil. The frontmost point of the airfoil is called the leading edge, and the rearmost point is called the trailing edge. The segment connecting these two points is called the chord. The top half of the airfoil is defined by a curve called the upper camber line, and the curve that defines the bottom half is called the lower camber line. The curve midway between these two is called the mean line; it is the arithmetic mean of the coordinates of both camber lines. The greatest distance between the chord and the mean line is called the curvature. The angle of attack is the angle between the chord and the direction of the air moving over the airfoil.

Figure 2. Components of the airfoil (example)

In this work, principal component analysis (PCA) was used to classify the airfoils. Three parameters are employed: the aspect ratio (i.e., the ratio between the length and the height of the airfoil), the curvature of the nose and the curvature of the back.

3. PRINCIPAL COMPONENTS ANALYSIS (PCA)

The principal components technique was first described by Karl Pearson (1901). He believed it was the correct solution to some problems of interest in biometrics, although he proposed a practical method of calculation for two or three variables only. A description of practical computational methods came later, but even then the calculations were daunting for more than a few variables because they had to be done by hand. Only after computers became widely available did the principal components technique reach widespread use (Manly, 2008).
Principal components analysis transforms a data matrix into a smaller number of factors that retain as much information as possible in order to represent its variation. It thus reduces the dimensionality of the original data set by means of new, mutually orthogonal variables called principal components.

Principal components analysis is a statistical approach that can be used to analyze inter-relationships among a large number of variables and to explain these variables in terms of their inherent common dimensions (factors). The goal is to find a way to condense the information from a number of original variables into a smaller set of statistical variables (factors) with a minimum loss of information (Hair et al, 2005).

Thus, principal components analysis is a way of identifying patterns in data and expressing the data in such a way as to highlight their similarities and differences. Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for data analysis. The other main advantage of PCA is that, once these patterns have been found, the data can be compressed, i.e., the number of dimensions reduced, without much loss of information (Smith, 2002).

The steps for applying PCA, following Smith (2002), are the following:

3.1. Acquire data

Initially, the data to which principal components analysis (PCA) is to be applied are acquired.

3.2. Subtract the mean

For PCA to work properly, the mean is subtracted from each of the data dimensions. The mean subtracted is the average across each dimension. So, all the x values have $\bar{x}$ (the mean of the x values of all the data points) subtracted, and all the y values have $\bar{y}$ subtracted from them. The new data set has zero mean.
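The mean-subtraction step can be sketched in Python with NumPy. This is an assumption made for illustration: the paper only states that a computer algebra system was used. The array holds the four airfoils excerpted in Table 1, with columns for aspect ratio, nose curvature and back curvature, and the name `subtract_mean` is hypothetical.

```python
import numpy as np

# Rows: the four airfoils excerpted in Table 1.
# Columns: aspect ratio, nose curvature, back curvature.
data = np.array([
    [13.60040805, 27.94827534, 0.1484127995],  # Drela_Ag45c03
    [11.08592143, 51.94352895, 0.4443226132],  # Boeing 707
    [11.44323644, 19.28333669, 0.3163567058],  # Eppler 379
    [8.330972891, 26.57038122, 0.2693973025],  # Naca 12
])

def subtract_mean(samples):
    """Return the mean-adjusted data and the per-dimension mean (illustrative helper)."""
    mean = samples.mean(axis=0)   # average across each dimension (column)
    return samples - mean, mean

centered, mean = subtract_mean(data)
```

After this step every column of `centered` averages to zero (up to rounding), which is the "zero mean" property the remaining steps rely on.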

3.3. Calculate the covariance matrix

The covariance is a measure of how strongly two random variables vary together, defined by:

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \qquad (1)$$

A useful way to collect the covariance values between all pairs of dimensions is to calculate them all and put them in a matrix. The covariance matrix for a set of data with n dimensions is defined as:

$$C^{n \times n} = (c_{i,j}), \quad c_{i,j} = \operatorname{cov}(\tilde{X}_i, \tilde{X}_j) \qquad (2)$$

where $C^{n \times n}$ is a matrix with n rows and n columns, and $\tilde{X}$ is the new (mean-adjusted) variable. Note that n in Eq. (2) counts the data dimensions, whereas n in Eq. (1) counts the data points.

3.4. Calculate the eigenvectors and eigenvalues of the covariance matrix

Eigenvalues and eigenvectors are a special set of scalars and vectors, respectively, associated with a linear system of equations (i.e., a matrix equation). Definition: Let A be an n x n matrix. A non-zero vector x in $R^n$ is called an eigenvector of A if Ax is a scalar multiple of x, i.e., Ax = λx for some scalar λ. The scalar λ is called an eigenvalue of A, and x is said to be an eigenvector associated with λ (Anton and Rorres, 2006).

3.5. Choosing the most representative eigenvalues and forming a feature vector

The eigenvector with the highest eigenvalue is the principal component of the data set. In the example of Smith (2002), the eigenvector with the largest eigenvalue was the one that pointed down the middle of the data; it is the most significant relationship between the data dimensions. In general, once the eigenvectors of the covariance matrix are found, the next step is to order them by eigenvalue, highest to lowest. This gives the components in order of significance. A feature vector (V), which is simply a matrix of vectors, is then formed. It is constructed by taking the eigenvectors chosen from the list (those for the dominant eigenvalues) and forming a matrix with these eigenvectors in the columns:

$$V = (\mathrm{vet}_1 \;\; \mathrm{vet}_2 \;\; \mathrm{vet}_3 \;\; \dots \;\; \mathrm{vet}_n) \qquad (3)$$

3.6. Deriving the new data set

Once the components (eigenvectors) to keep have been chosen and the feature vector formed, the transpose of the feature vector is multiplied on the left of the transposed original data set:

$$FD = V_C^{T} \, D_A^{T} \qquad (4)$$

where $V_C^{T}$ is the matrix with the eigenvectors in the columns, transposed so that the eigenvectors are now in the rows with the most significant eigenvector at the top, and $D_A^{T}$ is the mean-adjusted data transposed, i.e., the data items are in the columns, with each row holding a separate dimension.

3.7. Representing the PCA graphically

Finally, the PCA result is plotted, which allows a better understanding of the data, since the samples with greater similarity are grouped together.

3.8. Recovering the original data

$$OD = (V_C^{T})^{T} \, FD + \text{mean}_{\text{data}} \qquad (5)$$

where OD represents the original data. The recovery is exact only if all the eigenvectors were kept; otherwise OD is an approximation of the original data.
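The steps of this section can be put together in a short Python/NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `pca` and `recover` are hypothetical, the input data are randomly generated, and `numpy.linalg.eigh` is used because the covariance matrix is symmetric.

```python
import numpy as np

def pca(samples, n_components):
    """Covariance-based PCA; samples has one data point per row (illustrative helper)."""
    mean = samples.mean(axis=0)
    adjusted = samples - mean                   # subtract the mean
    cov = np.cov(adjusted, rowvar=False)        # covariance matrix, divisor n - 1
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]           # reorder: highest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    feature = eigvecs[:, :n_components]         # feature vector V, eigenvectors in columns
    final = feature.T @ adjusted.T              # new data set: V^T times transposed data
    return final, feature, mean, eigvals

def recover(final, feature, mean):
    """Map the new data set back; the result has one data point per row."""
    return (feature @ final).T + mean

# Hypothetical data: 30 points in 3 correlated dimensions.
rng = np.random.default_rng(0)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
samples = rng.normal(size=(30, 3)) @ mixing

# Keep two components: the recovered data are an approximation.
final, feature, mean, eigvals = pca(samples, n_components=2)
approx = recover(final, feature, mean)

# Keep all three components: the recovered data match the input up to rounding.
full, full_feature, full_mean, _ = pca(samples, n_components=3)
exact = recover(full, full_feature, full_mean)
```

Here `final` has one row per retained component and one column per data point, matching the transposed layout used in the projection step. The rows of `final` are uncorrelated, and their variances equal the retained eigenvalues.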

4. RESULTS

This section presents the results of classifying thirty kinds of airfoils using Principal Components Analysis (PCA). Three variables associated with the airfoils are considered: aspect ratio, curvature of the nose and curvature of the upper surface. It should be noted that the curvatures were calculated approximately. The following airfoil models were used: Althaus 93k132, Archer 18sm, Boeing 103, Boeing 707e, Boeing 737b, Clark ys, Drela ag45c03, Drela ag13, Eppler 379, Eppler 857, Fage&collins 1, Fage&collins 3, Goettingen 394, Goettingen 492, John Yost eh1070, John Yost eh2070, Martin Hepperle 23, Martin Hepperle 91, NACA 0006, NACA 0008, NACA 0010, NACA 0012, Raf 27, Raf 34, Selig s1046, Selig s2060, Quabeck hq209, Quabeck hq1511, Wortmann 05188 and Wortmann fx84w097. A table with the chosen parameters for some of these airfoils is presented as an appendix (Tab. 1). Figure 3 shows the original data set of the airfoils.

Figure 3. Original data.

Initially, the mean value of each variable was calculated and the data were centered on it (the corresponding mean was subtracted from each value). Next, the covariance matrix of the data was computed; it is represented in Fig. 4.

Figure 4. Covariance matrix of the data used.

The covariance matrix plot shows that the curvature of the nose is dominant, which prevents a finer analysis of the other components. Therefore, the data were made dimensionless, and the covariance is represented again in Fig. 5.
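The paper does not state how the data were made dimensionless. One plausible reading, assumed here purely for illustration, is to divide each mean-adjusted variable by its sample standard deviation. Under that assumption the covariance matrix of the rescaled data equals the correlation matrix of the original data, which would agree with the correlation matrix mentioned in the abstract. The sketch uses only the four airfoils excerpted in Table 1, and the name `dimensionless_covariance` is hypothetical.

```python
import numpy as np

# Rows: the four airfoils excerpted in Table 1.
# Columns: aspect ratio, nose curvature, back curvature.
data = np.array([
    [13.60040805, 27.94827534, 0.1484127995],
    [11.08592143, 51.94352895, 0.4443226132],
    [11.44323644, 19.28333669, 0.3163567058],
    [8.330972891, 26.57038122, 0.2693973025],
])

def dimensionless_covariance(samples):
    """Covariance of z-scored data (assumed rescaling; not specified in the paper)."""
    adjusted = samples - samples.mean(axis=0)
    scaled = adjusted / samples.std(axis=0, ddof=1)   # divide by sample standard deviation
    return np.cov(scaled, rowvar=False)

raw_cov = np.cov(data, rowvar=False)          # dominated by the nose-curvature variance
scaled_cov = dimensionless_covariance(data)   # unit diagonal, comparable off-diagonals
```

On these four rows the raw variance of the nose curvature is far larger than that of the other two variables, consistent with the dominance described above, while the rescaled matrix puts all three variables on the same footing.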

Figure 5. The dimensionless covariance matrix.

This representation makes it possible to check the covariance between each pair of variables, showing that the aspect ratio and the curvature of the upper surface have a significant covariance. The next step is to calculate the eigenvalues and associated eigenvectors of the covariance matrix and to choose the most representative eigenvalues to form the characteristic vector. In this case, the eigenvectors associated with the main eigenvalues of the covariance matrix are used to construct the characteristic vector. The product of the transposed matrix of eigenvectors and the transposed matrix of the dimensionless adjusted data is then computed. The next step is the graphical representation of the principal components analysis, presented in Fig. 6 alongside the original data.

Figure 6. Comparison of the original data with the results obtained by Principal Components Analysis, where X is a new aspect ratio and Y is a new curvature of the upper surface.

Two groups can be observed, formed according to the similarity of the samples. It can also be seen that the data obtained after the PCA are lined up.

5. CONCLUSION

The main conclusion of this work is that statistical methods of classification and data analysis give satisfactory results. The use of PCA allows airfoils to be classified with only three parameters. The data are grouped according to the similarity between them. This method can be used in diverse applications.

6. ACKNOWLEDGEMENTS

The authors thank the University of Santa Cruz do Sul (UNISC), especially the Post Graduate Program in Industrial Systems and Processes, for financial support. Furthermore, the first author is grateful to the Brazilian Commission for Personal Improvement of Higher Education - CAPES for the outstanding support.

7. REFERENCES

Anton, H.; Rorres, C., 2006. Elementary Linear Algebra with Applications. Wiley, 9th edition.
Hair, J. F. (et al.), 1998. Multivariate data analysis with readings. Prentice-Hall, New Jersey, 4th ed.
Manly, B. F. J., 1994. Multivariate statistical methods: a primer. Chapman and Hall, London, 215p.
Smith, L. I., 2002. A tutorial on principal component analysis, www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf

Table 1. Associated Parameters for the Airfoils (excerpt)

Airfoil          Aspect Ratio   Nose Curvature   Back Curvature
Drela_Ag45c03    13.60040805    27.94827534      0.1484127995
Boeing 707       11.08592143    51.94352895      0.4443226132
Eppler 379       11.44323644    19.28333669      0.3163567058
Naca 12          8.330972891    26.57038122      0.2693973025

8. RESPONSIBILITY NOTICE

The authors are solely responsible for the printed material included in this paper.