Efficiency Measurement Using Independent Component Analysis And Data Envelopment Analysis


Ling-Jing Kao a, Chi-Jie Lu b, Chih-Chou Chiu a

a Department of Business Management, National Taipei University of Technology, Taiwan, ROC
b Department of Industrial Engineering and Management, Ching Yun University, Taiwan, ROC

(This paper has been published in European Journal of Operational Research, DOI: /j.ejor )

ABSTRACT

Efficiency measurement is an important issue for any firm or organization. It allows organizations to compare their performance with that of their competitors and then develop corresponding plans to improve performance. Various efficiency measurement tools, such as conventional statistical methods and non-parametric methods, have been successfully developed in the literature. Among these tools, the data envelopment analysis (DEA) approach is one of the most widely discussed. However, problems of discrimination between efficient and inefficient decision-making units also exist in the DEA context (Adler and Yazhemsky, 2010). In this paper, a two-stage approach integrating independent component analysis (ICA) and data envelopment analysis (DEA) is proposed to overcome this issue. We suggest using ICA first on the input variables to generate independent components (ICs), then selecting the ICs that represent the independent sources of the input variables, and finally using the selected ICs as the new input variables in the DEA model. A simulated dataset and a hospital dataset provided by the Office of Statistics in Taiwan's Department of Health are used to demonstrate the validity of the proposed two-stage approach. The results show that the proposed

method can not only separate performance differences between the DMUs but also improve the discriminatory capability of the DEA's efficiency measurement.

Keywords: Independent component analysis, Data envelopment analysis, Efficiency measurement

1. Introduction

Efficiency measurement is an important issue for any type of business or organization. It allows businesses or organizations to compare their performance with that of their competitors and develop a corresponding strategy to improve performance. Among the various efficiency measurement tools developed in the literature, such as conventional statistical methods, non-parametric methods, and artificial intelligence methods, it has been affirmed that DEA can effectively measure the relative efficiencies of multiple decision-making units (DMUs) with similar goals and objectives. For example, DEA has been widely applied to different types of businesses and organizations, such as banks (Kao and Liu, 2009; Sahoo and Tone, 2009; Cooper et al., 2008), schools (Hu et al., 2009; Ray and Jeon, 2008; Mancebón and Muñiz, 2008) and hospitals (Hua et al., 2009; Ancarani et al., 2009; Kirigia et al., 2008). However, DEA has two shortcomings. First, when the inputs of the DMUs are strongly correlated, the efficiency estimates obtained from the slack-variable analysis of DEA can be biased. This is because DEA techniques mainly use weighting to calculate the ratio between the inputs and outputs of each DMU. Second, when the model is incorrectly specified or the number of units is too small, DEA may lose its discrimination capability. To solve these problems, Adler and Golany (2001, 2002) suggested using principal component analysis (PCA) to produce uncorrelated linear combinations of the original inputs and outputs. Adler and Yazhemsky (2010) concluded, by comparing discrimination performance in a simulation exercise, that PCA-DEA outperforms PCA-variable reduction. Independent component analysis (ICA) is another solution to the problem of input correlation.
Essentially, ICA is a novel statistical signal processing technique used to extract independent sources from observed multivariate data when no relevant data-mixture mechanism is available (Hyvärinen et al., 2001; Hyvärinen and Oja, 2000). It is a methodology that captures both second- and higher-order statistics and projects the input data onto basis vectors that are as statistically independent as possible (Bartlett et al., 2002; Draper et al., 2003). These characteristics distinguish ICA from PCA, which finds a set of the most representative projection vectors such that the projected samples retain the

most information about the original samples (Turk and Pentland, 1991). The literature has applied ICA to human face recognition on the FERET database (Bartlett et al., 2002; Liu and Wechsler, 1999) and the Olivetti and Yale databases (Yuen and Lai, 2002). The latter study, Liu and Wechsler (1999), and Bartlett et al. (2002) have shown that ICA outperforms PCA, whereas Moghaddam (2002) reports no significant difference between the performances of ICA and PCA. In this paper, we propose a new ICA-DEA approach to efficiency measurement that both addresses the problem of input correlation and improves discrimination capability. The proposed approach consists of two stages. In the first stage, we use ICA to generate independent components (ICs) to ensure statistical independence among the input variables. In the second stage, the estimated ICs containing the key factors that affect efficiency measurement are applied in DEA as the new input variables. To evaluate the performance of the proposed method, a simulated dataset and a hospital dataset provided by the Office of Statistics in Taiwan's Department of Health are used in this study. We also compare the discrimination capability of ICA-DEA with other approaches such as PCA-DEA and variable reduction (VR). The results show that the efficiency analysis performed by the proposed ICA-DEA approach can avoid efficiency misjudgment. Moreover, the optimal levels of input and output for each inefficient hospital can be estimated successfully. The rest of this paper is organized as follows: Sections 2 and 3 give brief introductions to independent component analysis and data envelopment analysis, respectively. In section 4, we develop the proposed two-stage model and compare it with PCA-DEA and VR on a simulated dataset. In section 5, an empirical application is provided: the hospital data from the Office of Statistics in Taiwan's Department of Health are analyzed with the proposed approach.
Concluding remarks are offered in section 6.

2. Independent component analysis

Let X = [x_1, x_2, ..., x_m]^T be a multivariate data matrix of size m x n, m <= n, consisting of observed random variables x_i of size 1 x n, i = 1, 2, ..., m. In the basic ICA model, the matrix X can be modeled as

X = AS = sum_{i=1}^{m} a_i s_i,    (1)

where a_i is the i-th column of the unknown mixing matrix A of size m x m, and s_i is the i-th row of the source matrix S of size m x n. The vectors s_i are unknown latent sources (variables) that cannot be directly observed from the observed variables x_i. The ICA model aims at finding an m x m de-mixing matrix W such that

B = [b_i] = WX = [w_i X],    (2)

where b_i is the i-th row of the matrix B, i = 1, 2, ..., m. The vectors b_i must be as statistically independent as possible, and are called independent components (ICs). The ICs are used to estimate the latent variables s_i. The vector w_i in Eq. (2) is the i-th row of the de-mixing matrix W, i = 1, 2, ..., m; it transforms the observed multivariate matrix X to generate the corresponding IC, i.e., b_i = w_i X, i = 1, 2, ..., m. ICA modeling is formulated as an optimization problem by setting up a measure of independence of the ICs as the objective function and using optimization techniques to solve for the de-mixing matrix W (Bell and Sejnowski, 1995; Sánchez, 2002). In general, the ICs are obtained by multiplying the matrix X by the de-mixing matrix W, which can be determined by an unsupervised learning algorithm that maximizes the statistical independence of the ICs. The statistical independence of the ICs can be measured in terms of their non-Gaussianity (Hyvärinen et al., 2001; Hyvärinen and Oja, 2000). Non-Gaussianity is commonly verified with two statistics: kurtosis and negentropy. The kurtosis of a random variable b, its fourth-order cumulant, is classically defined by

kurt(b) = E(b^4) - 3(E(b^2))^2.    (3)

If b is assumed to have zero mean and unit variance, the right-hand side simplifies to E(b^4) - 3, which shows that kurtosis is simply a normalized version of the fourth moment E(b^4).
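Eq. (3) is easy to check numerically. The sketch below (sample sizes and the chosen distributions are illustrative, not from the paper) standardizes a sample and evaluates the excess kurtosis:

```python
import numpy as np

def excess_kurtosis(b):
    """kurt(b) = E(b^4) - 3(E(b^2))^2 for a standardized sample, Eq. (3)."""
    b = np.asarray(b, dtype=float)
    b = (b - b.mean()) / b.std()                     # zero mean, unit variance
    return np.mean(b ** 4) - 3.0 * np.mean(b ** 2) ** 2

rng = np.random.default_rng(0)
k_gauss = excess_kurtosis(rng.normal(size=100_000))       # near 0 for Gaussian data
k_uniform = excess_kurtosis(rng.uniform(-1, 1, 100_000))  # negative (sub-Gaussian)
k_laplace = excess_kurtosis(rng.laplace(size=100_000))    # positive (super-Gaussian)
```

The uniform and Laplace cases illustrate the two kinds of non-Gaussianity that make an IC "interesting" to the algorithm.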
For a Gaussian b, the fourth moment equals 3(E(b^2))^2. Thus, kurtosis is zero for a Gaussian random variable and non-zero for most non-Gaussian random variables. Unlike kurtosis, negentropy is determined from the information quantity of

(differential) entropy. Entropy is a measure of the average uncertainty in a random variable. The differential entropy H of a random variable b with density p(b) is defined as

H(b) = -integral p(b) log p(b) db.

According to a fundamental result of information theory, a Gaussian variable has the highest entropy among all random variables with equal variance (Hyvärinen and Oja, 2000). To obtain a measure of non-Gaussianity, the negentropy J is defined as

J(b) = H(b_gauss) - H(b),    (4)

where b_gauss is a Gaussian random vector with the same covariance matrix as b. Negentropy is always non-negative and is zero if and only if b has a Gaussian distribution. Since negentropy is very difficult to compute, the following approximation has been proposed (Hyvärinen and Oja, 2000):

J(b) ~ [E{G(b)} - E{G(o)}]^2,    (5)

where o is a Gaussian variable with zero mean and unit variance, and b is a random variable with zero mean and unit variance. G is a nonquadratic function, given by G(b) = -exp(-b^2/2) in this study. The FastICA algorithm proposed by Hyvärinen et al. (2001) is adopted in this paper to solve for the de-mixing matrix W. ICA modeling usually involves two preprocessing steps: centering and whitening (Hyvärinen and Oja, 2000). In the centering step, the matrix X is centered by subtracting the row means, i.e., x_i <- x_i - E(x_i). The zero-mean matrix X is then passed through the whitening matrix Q to remove the second-order statistics of the input matrix, i.e., Z = QX. The whitening matrix Q is the inverse square root of the covariance matrix of the input matrix, i.e., Q = (C_x)^(-1/2), where C_x = E(XX^T) is the covariance matrix of X. The rows of the whitened matrix Z, denoted z_i, are uncorrelated and have unit variance, i.e., C_z = E(ZZ^T) = I. ICA can be viewed as an extension of principal component analysis (PCA) (Hyvärinen and Oja, 2000).
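The centering, whitening, and fixed-point estimation described above can be sketched in a few lines of NumPy. The sketch below is a simplified illustration, not the full FastICA algorithm: it uses the cubic (kurtosis) nonlinearity instead of G(b) = -exp(-b^2/2), deflates one component at a time, and finishes by sorting the ICs by the magnitude of their kurtosis. The sinusoid-plus-noise demo at the bottom mirrors the spirit of Figure 1; all signal choices are illustrative.

```python
import numpy as np

def fastica_sketch(X, n_iter=500, seed=0):
    """Simplified FastICA: centering, whitening Q = C^(-1/2), then a deflation
    fixed-point loop with the cubic (kurtosis) nonlinearity."""
    Xc = X - X.mean(axis=1, keepdims=True)            # centering step
    d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])    # eigendecomposition of C_x
    Q = E @ np.diag(d ** -0.5) @ E.T                  # whitening matrix C_x^(-1/2)
    Z = Q @ Xc                                        # now E(ZZ^T) = I
    m = Z.shape[0]
    rng = np.random.default_rng(seed)
    W = np.zeros((m, m))
    for i in range(m):                                # extract one IC at a time
        w = rng.normal(size=m)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            # kurtosis fixed-point update: w <- E[z (w.z)^3] - 3w
            w = (Z * (w @ Z) ** 3).mean(axis=1) - 3.0 * w
            w -= W[:i].T @ (W[:i] @ w)                # deflate against found rows
            w /= np.linalg.norm(w)
        W[i] = w
    return W @ Z, W @ Q                               # ICs B, de-mixing matrix for X

# mix two sinusoids and a noise source, in the spirit of Figure 1
t = np.linspace(0, 8 * np.pi, 4000)
rng = np.random.default_rng(1)
S = np.vstack([np.sin(t), np.sin(3 * t + 1), rng.uniform(-1, 1, t.size)])
X = rng.normal(size=(3, 3)) @ S                       # X = AS, with A unknown to ICA
B, W_hat = fastica_sketch(X)
order = np.argsort(-np.abs((B ** 4).mean(axis=1) - 3.0))  # sort ICs by |kurtosis|
```

Up to sign and ordering (the two ambiguities discussed below), each row of B should track one of the original sources.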
However, the objective of ICA is different from that of PCA. PCA is a dimensionality-reduction technique that reduces the data dimension by projecting the correlated

variables onto a smaller set of new variables that are uncorrelated and retain most of the original variance. The objective of PCA, however, is only to de-correlate the variables, not to make them independent. PCA can impose independence only up to second-order statistical information while constraining the direction vectors to be orthogonal, whereas ICA has no orthogonality constraint and involves higher-order statistics: it not only de-correlates the data (second-order statistics) but also reduces higher-order statistical dependencies (Lee, 1998). Hence, ICs reveal more useful information from observed data than principal components (PCs) do. Similar differences exist between ICA and principal axis factoring (PAF), a well-known exploratory factor-analysis model: PAF extracts factors based on second-order statistical information, and the factors are assumed to be uncorrelated and Gaussian distributed (Hyvärinen et al., 2001). A simple comparison between PCA and ICA is shown in Figure 1. Figure 1(a) shows three original variables: two different sinusoidal variables (s_1 and s_2) and a random variable (s_3). These original variables are transformed into observed mixed variables (x_1, x_2 and x_3) using the unknown mixing matrix A, i.e., X = AS. When PCA is applied to these three mixed variables, it yields the three principal components shown in Figure 1(c), which differ from the original variables in Figure 1(a). The ICA solution is depicted in Figure 1(d), which clearly reveals that the original variables can be reconstructed by ICA without any knowledge of the original variables or the mixing matrix. The ICA model in Eq. (1) and the simple example in Figure 1 show that the ICA model has two limitations (Hyvärinen and Oja, 2000). The first is that the variances (energies) of the independent components cannot be determined, because both S and A are unknown.
In other words, S and A are unidentifiable unless one of them is fixed. To alleviate this problem, most researchers assume that each IC has zero mean and unit variance in the ICA model (Hyvärinen and Oja, 2000). Even then, the sign remains ambiguous, because an IC could be multiplied by -1 without affecting the model. The other limitation of the ICA model is that the order of the ICs cannot be determined, again because S and A are simultaneously unknown. A number of methods have been suggested to

determine the component order (Cardoso and Souloumiac, 1993; Back and Weigend, 1997; Cheung and Xu, 2001). Hyvärinen (1999) suggested that ICs can be sorted according to their non-Gaussianity; in this paper, the ICs are sorted by their kurtosis values.

[Figure 1. A simple comparison between PCA and ICA: (a) original variables (S); (b) observed variables (X); (c) PCA results (PC 1, PC 2, PC 3); (d) ICA results (IC 1, IC 2, IC 3).]

3. Data envelopment analysis basics

DEA is also known as the efficient-frontier approach (Cooper et al., 2000; Cooper et al., 2004). The term envelopment refers to the idea that inefficient DMUs are located inside an area enveloped by the efficient DMUs. DEA is built on the concept of relative efficiency, defined as the ratio of the weighted sum of outputs to the weighted sum of inputs (Cooper et al., 2004). Solving the DEA model requires that the weights for the inputs and

outputs of each unit are selected to maximize its efficiency under certain constraints. Thus, the mathematical programming form of the CCR model is formulated as follows (Cooper et al., 2004):

Maximize    theta = (u_1 y_1p + u_2 y_2p + ... + u_d y_dp) / (v_1 x_1p + v_2 x_2p + ... + v_m x_mp)

Subject to  (u_1 y_1j + u_2 y_2j + ... + u_d y_dj) / (v_1 x_1j + v_2 x_2j + ... + v_m x_mj) <= 1,  j = 1, ..., n
            v_i > 0 for i = 1, 2, ..., m                                                           (6)
            u_r > 0 for r = 1, 2, ..., d

where x_1j, x_2j, ..., x_mj are the m inputs and y_1j, y_2j, ..., y_dj are the d outputs of unit j; v_1, v_2, ..., v_m are the weights for the inputs; u_1, u_2, ..., u_d are the weights for the outputs; p is the unit designated for an optimization run; and n is the total number of units in the study. Because each decision unit has its own preferences over input and output variables, implementing the model in Eq. (6) derives the values of v_i (i = 1, 2, ..., m) and u_r (r = 1, 2, ..., d) for each decision unit.

4. Research methodology and simulation study

The proposed ICA-DEA approach is illustrated in Figure 2. After deciding on the input variables, we use ICA to convert the observed input data into separate independent signals. These separated signals are then fed into the DEA approach in the second stage of our technique to construct the efficiency measurement model.

[Stage I: observations of inputs -> data conversion by ICA. Stage II: construction of the efficiency measurement model by DEA -> efficiency measurement.]

Figure 2. Research structure

The performance of the proposed ICA-DEA model relative to PCA-DEA and VR is investigated by simulation. To be consistent with other studies, we used a dataset generated by the simulated Cobb-Douglas production function y = x_1^{a_1} x_2^{a_2} (x_3 x_4 x_5 x_6 x_7 x_8)^{a_3} e^{-tau} with a half-normal inefficiency distribution (Adler and Yazhemsky, 2010) for comparison. There are 50 DMUs in total, and 8 inputs are included in the dataset. Following Adler and Yazhemsky (2010), we first generated 10,000 positive observations of the variables x from a normal distribution with mean 10 and variance 1. A single output was then chosen to permit the use of the standard production function to compute the output values. Consequently, over the entire simulated population of 10,000 observations and a subsample size of 50 DMUs, we can report the average percentage over 10,000/50 = 200 samples. In the data-generation process, we assumed no probability mass along the frontier, and a single inefficiency (tau) was simulated for each DMU, drawn independently from a half-normal distribution with mu = 0 and sigma = 1. The simulated data were then used to compare the results of the PCA-DEA model, the VR model, and our ICA-DEA model. To investigate the influence of the percentage of retained information on the number of inputs in the analysis, we followed Adler and Yazhemsky (2010) and reduced the percentage of retained information from 100% to 80% in 4-percent steps. As in Adler and Yazhemsky (2010), we used the percentage of retained information to decide the number of PCs or variables included in the DEA model; in other words, we determined the number of ICs, PCs, or variables to retain such that the percentage of information remaining is greater than or equal to the setup level.
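The retained-information rule described above can be sketched as follows: pick the smallest number of components whose cumulative explained variance reaches the setup level. The data below are synthetic, with column variances chosen so the answer is easy to verify by hand; the 80% threshold mirrors the lowest setup level used in the experiments.

```python
import numpy as np

def n_components_for(X, retain):
    """Smallest number of principal components whose cumulative explained
    variance reaches the retained-information level (e.g. retain=0.80)."""
    Xc = X - X.mean(axis=0)                     # observations in rows
    s = np.linalg.svd(Xc, compute_uv=False)     # singular values
    ratio = np.cumsum(s ** 2) / np.sum(s ** 2)  # cumulative explained variance
    return int(np.searchsorted(ratio, retain - 1e-12) + 1)

# synthetic inputs whose variances fall off sharply (100 : 9 : 1)
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3)) * np.array([10.0, 3.0, 1.0])
```

With these variances, the first component alone covers roughly 91% of the variance, so an 80% threshold keeps one component, 95% keeps two, and 99.9% keeps all three.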
Table 1 reports the incorrect efficiency classifications obtained with ICA-DEA, PCA-DEA and VR under different input dimensions. According to Table 1, the level of information reduction has an appreciable influence when the three approaches are used to classify inefficiency. Moreover, the ICA-DEA model gives the lowest incorrect-classification rate, and the classification results of the PCA-DEA model are slightly better than those of VR. These results are

consistent with the conclusion of Lee et al. (2004) that the ICA solution extracts the original source signals to a much greater extent than the PCA solution when the latent variables follow non-Gaussian distributions.

5. Empirical application

5.1 Data

Hospital data (from 557 hospitals in 2005) provided by the Office of Statistics in Taiwan's Department of Health are used in this study to illustrate the proposed ICA-DEA approach and to compare its performance with alternative approaches. These hospitals can be categorized into three classes according to their functional complexity: (1) medical centers and regional hospitals (highest level); (2) district hospitals (medium and low level); and (3) primary clinics (basic level). Hospitals of different sizes usually provide different levels of medical treatment and service. According to the government accreditation definition, a medical center has at least 500 beds and a regional hospital has at least 250 beds. To control for differences in size and make the data sample more homogeneous, we selected the 21 hospitals (i.e., 21 DMUs) with more than 500 beds for analysis. Because the results of efficiency measurement depend on the selection of input and output variables, we chose variables by reviewing the existing literature on hospital efficiency (Hu et al., 2009; Puig-Junoy, 2000; Sahin and Ozcan, 2000; Parkin and Hollingsworth, 1997; Hu and Huang, 2004). In our empirical study, five input and three output variables were chosen to calculate the efficiency of each DMU. The five input variables are four labor-related variables (doctors, nurses, paramedical persons, and administrative staffers) and one capital-related variable (beds). A patient's health improvement is the most commonly used output indicator for a hospital, but measuring the level of improved health is very difficult.
One possible solution is to adopt the immediate products of a hospital, such as its service volume, as a proxy for medical outputs. Therefore, in this empirical study, we selected outpatient visits, emergency visits, and operations as the three output variables. Table 2 gives the definitions

and explanations of the input and output variables. Table 3 provides summary statistics of the input and output variables.

Table 1(a). Incorrect efficiency classification under varying input reduction for the y = x_1^{a_1} x_2^{a_2} (x_3 x_4 x_5 x_6 x_7 x_8)^{a_3} e^{-tau} function (incorrect classification, incorrectly defined inefficient, and incorrectly defined efficient, for ICA-DEA*, PCA-DEA* and VR, by percentage of information retained).
* The DEA model is under the constant returns-to-scale (CRS) case.

Table 1(b). The number of samples containing m inputs as a function of the information retained, for ICA-DEA, PCA-DEA and VR.

Table 3 also shows that the variance of each input and output variable is large; the input and output variables with the largest standard deviations are beds and outpatient visits, respectively. As discussed above, correlation among the input and output variables will bias the DEA estimates. The correlation analysis is reported in Table 4, which shows significant positive correlations among all of the variables. The lowest correlation coefficient is between doctors and emergency visits; the highest, 0.97, is between nurses and doctors. In practice, subsets of the inputs or outputs are almost always correlated. High correlations between variables can distort the distribution of the weights, and dropping some variables from the assessment can reduce the efficiency ratings of some DMUs (Nunamaker, 1985) or even lead to significant changes in the efficiencies (Dyson et al., 2001). Therefore, the observed input data need to be converted into separate independent signals by the ICA approach before conducting DEA.

5.2 Efficiency score computations

After the ICs are determined, the efficiency scores are computed with the input-oriented CCR model, proposed by Charnes, Cooper and Rhodes (1978), in the DEA-PRO software. To demonstrate the validity of the proposed model, the performance of the proposed ICA-DEA method is compared with a single DEA model and the PCA-DEA model. The single DEA model simply applies DEA to the input variables to measure hospital efficiency, without using ICA or PCA as a preprocessing tool. The PCA-DEA method first applies PCA to the input variables to generate principal components (PCs) and then conducts the DEA analysis on the generated PCs.
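For readers without DEA software, the input-oriented CCR score can also be computed directly: the standard Charnes-Cooper transformation turns the ratio model of Eq. (6) into a linear program with the normalization v.x_p = 1. A sketch with SciPy follows; the three-DMU toy data are hypothetical, chosen so that DMU 3 uses exactly twice the inputs of DMU 2 and should score 0.5.

```python
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(X, Y, p, eps=1e-6):
    """Input-oriented CCR efficiency of DMU p via the Charnes-Cooper
    linearization of Eq. (6). X: m x n inputs, Y: d x n outputs
    (columns are DMUs). Returns theta in (0, 1]."""
    m, n = X.shape
    d = Y.shape[0]
    c = np.concatenate([-Y[:, p], np.zeros(m)])           # maximize u.y_p
    A_ub = np.hstack([Y.T, -X.T])                         # u.y_j - v.x_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(d), X[:, p]])[None]   # normalization v.x_p = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(eps, None)] * (d + m))         # weights strictly positive
    return -res.fun

# toy example: 3 DMUs, 2 inputs, 1 output; DMU 3 is a scaled copy of DMU 2
X = np.array([[2.0, 4.0, 8.0],
              [3.0, 2.0, 4.0]])
Y = np.array([[1.0, 1.0, 1.0]])
scores = [ccr_efficiency(X, Y, p) for p in range(3)]
```

The first two DMUs lie on the frontier and score 1, while the dominated third DMU scores 0.5, matching the hand calculation.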

Table 2. Definition and explanation of variables

Inputs:
- Beds: the total number of registered beds within the hospital, including acute, chronic, and special beds.
- Doctors: the total number of physicians who are full-time employees, including dentists and Chinese-medicine doctors.
- Nurses: the total number of nurses employed in the hospital (including midwives).
- Administrative persons: the total number of health service providers employed in the hospital, including pharmacists, dietitians, physiotherapists, occupational therapy technologists, and radiological technologists.
- Administrative staffers: the total number of full-time-equivalent personnel, including social workers, researchers, and non-professionals.

Outputs:
- Outpatient visits: the total number of patients seen in outpatient departments within a year.
- Emergency visits: the total number of patients seen in the emergency room within a year.
- Operations: the total number of inpatient and outpatient surgeries within a year.

Table 3. Summary statistics (mean, standard deviation, minimum, and maximum) of the variables Beds (x_1), Doctors (x_2), Nurses (x_3), Administrative persons (x_4), Administrative staffers (x_5), Outpatient visits (y_1), Emergency visits (y_2), and Operations (y_3).

Table 4. Correlation coefficients between the variables x_1-x_5 and y_1-y_3 (all pairwise coefficients are positive and significant at p < 0.05).

In the single DEA model, the overall efficiency scores of the 21 hospitals (DMUs) are summarized in Table 5. The single DEA model produces a high average efficiency score (0.948) and a small standard deviation. It classifies too many DMUs as efficient and cannot distinguish the performance differences between the DMUs well. In the PCA-DEA method, the component loadings and the variances of the components were first computed for the five input variables. The proportion of the total variance explained by the principal components is additive, with each new component contributing less than the preceding one. The components were then rotated to eliminate medium-range loadings and make their interpretation easier (Johnson and Wichern, 2007). The five rotated principal components in Table 6 are interpreted by examining the component loadings and noting their relationship to the original variables. The first component appears to measure capital and partial labor (nurse) investment, with beds and nurses loading positively. In the second component, doctors and administrative staffers are important; the effect of medical expertise is demonstrated by the positive loading of doctors. According to Table 6, we found that PC 1

can explain most of the variation in the data, and PC 2 explains most of the remainder. Because these two main PCs together explain essentially all of the data variation, they suffice to represent the features of the input data and are used as the new input variables for the DEA model. The results of the PCA-DEA model are also summarized in Table 5: its average efficiency score, standard deviation, and number of efficient DMUs are 0.48, 0.195, and 8, respectively. Compared with the single DEA model, the PCA-DEA method has a lower average efficiency score, fewer efficient DMUs, and a larger standard deviation.

In the proposed ICA-DEA model, we first applied the basic ICA approach to estimate a de-mixing matrix W and five independent components (b_1, b_2, ..., b_5). To select the more meaningful ICs, the statistical independence of the ICs is evaluated by computing their kurtosis values. The estimated de-mixing matrix W and the kurtosis value of each IC are summarized in Table 7. Because an IC with a larger kurtosis value can be considered more important (Hyvärinen, 1999), IC 1, IC 2 and IC 3 (i.e., b_1, b_2, b_3) are regarded as the key factors affecting the efficiency measurement results and are used as the three new input variables for the DEA model. Note that an extracted IC may contain negative values, which violates the semi-positivity assumption of the DEA model (all inputs and outputs are non-negative, and at least one input and one output are positive). To solve this problem, we simply subtract the corresponding minimum value, min(IC_i), from each IC_i.

Table 5. Summary of the results of the single DEA, PCA-DEA and ICA-DEA models (average score, standard deviation, maximum and minimum efficiency scores, number of efficient DMUs, total number of DMUs, and percentage of efficient DMUs).
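The shift that restores semi-positivity before the DEA stage is a one-liner per IC. A minimal sketch (the matrix values are illustrative only):

```python
import numpy as np

def shift_nonnegative(B):
    """Shift each IC (row of B) by its minimum so that every DEA input is >= 0,
    i.e. subtract min(IC_i) from IC_i before feeding the ICs into the DEA model."""
    return B - B.min(axis=1, keepdims=True)

B = np.array([[-1.5, 0.2, 3.0],
              [ 2.0, -0.5, 1.0]])
B_shifted = shift_nonnegative(B)   # each row minimum becomes exactly 0
```

After the shift, every row has minimum zero, so the semi-positivity assumption (non-negative inputs, at least one positive) holds.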

Table 6. Varimax-rotated loadings of the five components PC 1-PC 5 on x_1-x_5, together with the proportion of variance and the cumulative variance explained by each component.

Table 7. The kurtosis values and the de-mixing matrix (W) corresponding to the ICs b_1-b_5.

From Eq. (2), each b_i is obtained by multiplying X by the corresponding row w_i of the de-mixing matrix. Thus, the values in the de-mixing matrix can be used to explain the relationship between each selected b_i and the original input variables X:

b_1 = w_11 x_1 + w_12 x_2 + w_13 x_3 + w_14 x_4 + w_15 x_5    (7)
b_2 = w_21 x_1 + w_22 x_2 + w_23 x_3 + w_24 x_4 + w_25 x_5    (8)
b_3 = w_31 x_1 + w_32 x_2 + w_33 x_3 + w_34 x_4 + w_35 x_5    (9)

From Eq. (7), x_3 and x_5 have a relatively large impact on b_1. This implies that b_1 is mainly affected by nurses (x_3) and administrative staffers (x_5). Thus, b_1

can be used to represent the features of nurses and administrative staffers. As seen from Eqs. (8)-(9), b_2 is mainly influenced by beds (x_1), while b_3 contains more information about doctors (x_2) and administrative persons (x_4). Therefore, b_2 and b_3 can be used to represent the information about beds and the features of doctors and administrative persons, respectively. The efficiency measurement results of the proposed ICA-DEA model are also included in Table 5: the average efficiency score, standard deviation, and number of efficient DMUs of the proposed method are 0.16, 0.54 and 4, respectively. Compared with the single DEA and PCA-DEA models, the proposed method has the lowest average efficiency score, the fewest efficient DMUs, and the highest standard deviation. This indicates that the proposed method is better able to distinguish the performances of the DMUs; in other words, the ICA approach improves the discriminatory capability of the DEA model in performance measurement.

5.3 Slack analysis

DEA provides not only efficiency results but also slack analysis, which offers guidelines for deriving the optimal level of input and output resources for each DMU. As a result, each DMU could set its input and output resources at the optimal level, i.e., the original level minus the inefficiency and slack amounts from the DEA results (Luo and Donthu, 2001). Table 8 reports the slack analysis for DMU 1 under the single DEA model and the ICA-DEA approach. The slack entries of the single DEA model are all positive, implying that, compared with the efficient hospitals, the investment in the various inputs is excessive. The slack analysis suggests that many of the expenditures could have been reduced while maintaining the same outputs (outpatient visits, emergency visits, and operations in this case), thus improving efficiency.
For example, to be as efficient as the efficient hospitals, DMU 1 could maintain the same output levels while cutting 91 Beds, 38 Doctors, 3 Nurses, 5 Administrative persons, and 51 Administrative staffers. Although the slack analysis can investigate the utilization of input and output resources to improve efficiency scores, the results of the single DEA method could be either

underestimated or overestimated when the inputs of the DMUs are correlated. To solve this problem, we replaced the single DEA model's inputs with the transformed inputs produced by the ICA approach before running the DEA analysis. The resulting ICA-DEA slacks are also summarized in Table 8. Because the ICA technique is adopted, the slack entries generated by the ICA-DEA model must be re-transformed in order to understand exactly how the input and output resources should be utilized to improve the efficiency scores. In the re-transformation procedure, we first define the slack entries generated by the ICA-DEA model as delta_b_i. Then, based on Eqs. (7)-(9) and the correlation coefficients rho_ij in Table 4, the re-transformation can be formulated as the following optimization problem:

Minimize    e_1 + e_2 + e_3
Subject to  |delta_b_1 - (w_11 dx_1 + w_12 dx_2 + w_13 dx_3 + w_14 dx_4 + w_15 dx_5)| <= e_1
            |delta_b_2 - (w_21 dx_1 + w_22 dx_2 + w_23 dx_3 + w_24 dx_4 + w_25 dx_5)| <= e_2    (10)
            |delta_b_3 - (w_31 dx_1 + w_32 dx_2 + w_33 dx_3 + w_34 dx_4 + w_35 dx_5)| <= e_3
            x_i = rho_ij x_j,  i = 1, 2, ..., 5; j = 1, 2, ..., 5; j != i
            dx_i integer,  i = 1, 2, ..., 5

where x_i and x_j are the original inputs; rho_ij is the correlation coefficient between the input variables x_i and x_j; delta_b_i denotes the slack entries generated by the ICA-DEA model; and dx_i denotes the re-transformed slack entries of the ICA-DEA model. Because the above optimization problem has a simple linear programming format, solutions can be derived with the simplex method, as in the usual linear programming approach. The third column of Table 8 shows the final slack analysis results of the proposed ICA-DEA method for DMU 1. As with the single DEA model, the slack entries are all positive compared with the efficient hospitals, but the ICA-DEA approach suggests smaller slacks for Beds, Doctors, Nurses, and Administrative staffers, and a larger slack for Administrative persons.
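The core of the re-transformation in Eq. (10) can be sketched as a linear program: the absolute deviations are linearized with the epsilon variables, and the integrality requirement is relaxed here for simplicity. The de-mixing rows W_sel and the IC slacks delta_b below are hypothetical numbers, not the paper's estimates.

```python
import numpy as np
from scipy.optimize import linprog

def retransform_slacks(W_sel, delta_b):
    """Recover input-variable slacks dx from IC slacks delta_b by minimizing
    the sum of absolute residuals |delta_b_k - w_k . dx| (integrality relaxed).
    W_sel: k x m selected de-mixing rows; delta_b: length-k IC slacks."""
    k, m = W_sel.shape
    # decision variables: [dx (m), eps (k)]; objective: minimize sum(eps)
    c = np.concatenate([np.zeros(m), np.ones(k)])
    # linearized |delta_b - W dx| <= eps as two one-sided inequalities
    A_ub = np.vstack([np.hstack([W_sel, -np.eye(k)]),
                      np.hstack([-W_sel, -np.eye(k)])])
    b_ub = np.concatenate([delta_b, -delta_b])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, None)] * (m + k))   # slacks are non-negative
    return res.x[:m]

W_sel = np.array([[0.2, 0.1, 0.6, 0.1, 0.5],
                  [0.7, 0.1, 0.1, 0.1, 0.1],
                  [0.1, 0.5, 0.1, 0.5, 0.1]])    # hypothetical de-mixing rows
delta_b = W_sel @ np.array([10.0, 4.0, 2.0, 1.0, 3.0])  # consistent IC slacks
dx = retransform_slacks(W_sel, delta_b)
```

Because the toy delta_b is constructed from a feasible slack vector, the LP drives all residuals to zero; with three equations in five unknowns the recovered dx need not be unique, only consistent.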

Table 8. Slack analysis of the single DEA method and the ICA-DEA method, using DMU 1 as an example (columns: variable, original real value, slack entry by the single DEA model, slack entry by the ICA-DEA model; rows: Beds (x̃_1), Doctors (x̃_2), Nurses (x̃_3), Administrative persons (x̃_4), Administrative staffers (x̃_5), and the DMU 1 efficiency score).

Conclusions

One of the most important managerial issues for any type of institution is measuring the relative efficiency of its decision-making units. DEA is an efficiency comparison method based on linear programming and has been used in a variety of contexts. In this study, a two-stage ICA-DEA approach is proposed to improve the discriminatory capability of DEA results. The proposed method first applies ICA to the input variables to generate ICs representing the independent sources of the input variables. The important ICs are then identified and selected according to their kurtosis values. The selected ICs, regarded as the key factors affecting efficiency measurement, are used as new input variables in the DEA model. The proposed ICA-DEA approach is illustrated with Taiwanese hospital data from 85 hospitals (DMUs) in 2005. Compared to the single DEA model and the integrated PCA-DEA model, the proposed ICA-DEA method has the lowest average efficiency score, the fewest efficient DMUs, and the highest standard deviation. These results provide evidence that the proposed ICA-DEA approach has a superior ability to differentiate the performance of the DMUs and thus to overcome the discrimination shortcoming of DEA.

References

Adler, N., Golany, B., 2001. Evaluation of deregulated airline networks using data envelopment analysis combined with principal component analysis with an application to Western Europe. European Journal of Operational Research 132 (2).

Adler, N., Golany, B., 2002. Including principal component weights to improve discrimination in data envelopment analysis. Journal of the Operational Research Society 53 (9).
Adler, N., Yazhemsky, E., 2010. Improving discrimination in data envelopment analysis: PCA-DEA or variable reduction. European Journal of Operational Research 202 (1).
Ancarani, A., Di Mauro, C., Giammanco, M.D., 2009. The impact of managerial and organizational aspects on hospital wards' efficiency: Evidence from a case study. European Journal of Operational Research 194 (1).
Back, A., Weigend, A., 1997. A first application of independent component analysis to extracting structure from stock returns. International Journal of Neural Systems 8.
Bartlett, M.S., Movellan, J.R., Sejnowski, T.J., 2002. Face recognition by independent component analysis. IEEE Transactions on Neural Networks 13 (6).
Bell, A.J., Sejnowski, T.J., 1995. An information-maximization approach to blind separation and blind deconvolution. Neural Computation 7 (6).
Biørn, E., Hagen, T.P., Iversen, T., Magnussen, J., 2003. The effect of activity-based financing on hospital efficiency: A panel data analysis of DEA efficiency scores. Health Care Management Science 6 (4).
Cardoso, J.F., Souloumiac, A., 1993. Blind beamforming for non-Gaussian signals. IEE Proceedings, Part F: Radar and Signal Processing 140 (6).
Charnes, A., Cooper, W.W., Rhodes, E., 1978. Measuring the efficiency of decision making units. European Journal of Operational Research 2 (6).
Cheung, Y.-M., Xu, L., 2001. Independent component ordering in ICA time series analysis. Neurocomputing 41.
Cooper, W.W., Seiford, L.M., Zhu, J., 2004. Handbook on Data Envelopment Analysis. Kluwer Academic, Boston, MA.
Cooper, W.W., Seiford, L.M., Tone, K., 2000. Data Envelopment Analysis: A Comprehensive Text with Models, Applications, References and DEA-Solver Software. Kluwer Academic, Boston, MA.
Cooper, W.W., Timothy, W.R., Deng, H., Wu, J., Zhang, Z., 2008. Are state-owned banks less efficient? A long- vs. short-run data envelopment analysis of Chinese banks. International Journal of Operational Research 3 (5).
Draper, B., Baek, K., Bartlett, M.S., Beveridge, J.R., 2003. Recognizing faces with PCA and ICA. Computer Vision and Image Understanding 91 (1-2).
Dyson, R., Allen, R., Camanho, A.S., Podinovski, V.V., Sarrico, C.S., Shale, E.A., 2001. Pitfalls and protocols in DEA. European Journal of Operational Research 132 (2).

Hu, J.L., Huang, Y.F., 2004. Technical efficiencies in large hospitals: A managerial perspective. International Journal of Management 21.
Hu, Y., Zhang, Z., Liang, W., 2009. Efficiency of primary schools in Beijing, China: An evaluation by data envelopment analysis. International Journal of Educational Management 23 (1).
Hua, H., Li-xin, S., Xian-li, Z., Sheng-xin, C., 2009. Data envelopment analysis-based evaluation of pharmacy efficiencies of military hospitals. Academic Journal of Second Military Medical University 30 (5).
Hyvärinen, A., 1999. Fast and robust fixed-point algorithms for independent component analysis. IEEE Transactions on Neural Networks 10 (3).
Hyvärinen, A., Karhunen, J., Oja, E., 2001. Independent Component Analysis. John Wiley & Sons, New York.
Hyvärinen, A., Oja, E., 2000. Independent component analysis: Algorithms and applications. Neural Networks 13 (4-5).
Johnson, R.A., Wichern, D.W., 2007. Applied Multivariate Statistical Analysis. Prentice Hall, New Jersey.
Kao, C., Liu, S.-T., 2009. Stochastic data envelopment analysis in measuring the efficiency of Taiwan commercial banks. European Journal of Operational Research 196 (1).
Kirigia, J.M., Emrouznejad, A., Cassoma, B., Asbu, E.Z., Barry, S., 2008. A performance assessment method for hospitals: The case of municipal hospitals in Angola. Journal of Medical Systems 32 (6).
Lee, J.M., Yoo, C., Lee, I.B., 2004. Statistical process monitoring with independent component analysis. Journal of Process Control 14 (5).
Lee, T.W., 1998. Independent Component Analysis: Theory and Application. Kluwer Academic Publishers, Boston, MA.
Liu, C., Wechsler, H., 1999. Comparative assessment of independent component analysis (ICA) for face recognition. In: Proceedings of the Second International Conference on Audio- and Video-based Biometric Person Authentication (AVBPA'99), Washington, DC.
Luo, X., Donthu, N., 2001. Benchmarking advertising efficiency. Journal of Advertising Research 41 (6).
Mancebón, M.J., Muñiz, M.A., 2008. Private versus public high schools in Spain: Disentangling managerial and programme efficiencies. Journal of the Operational Research Society 59 (7).
Moghaddam, B., 2002. Principal manifolds and probabilistic subspaces for visual recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (6).

Nunamaker, T.R., 1985. Using data envelopment analysis to measure the efficiency of non-profit organizations: A critical evaluation. Managerial and Decision Economics 6 (1).
Parkin, D., Hollingsworth, B., 1997. Measuring production efficiency of acute hospitals in Scotland, 1991-94: Validity issues in data envelopment analysis. Applied Economics 29.
Puig-Junoy, J., 2000. Partitioning input cost efficiency into its allocative and technical components: An empirical DEA application to hospitals. Socio-Economic Planning Sciences 34 (3).
Ray, S.C., Jeon, Y., 2008. Reputation and efficiency: A non-parametric assessment of America's top-rated MBA programs. European Journal of Operational Research 189 (1).
Sahin, I., Ozcan, Y.A., 2000. Public sector hospital efficiency for provincial markets in Turkey. Journal of Medical Systems 24 (6).
Sahoo, B.K., Tone, K., 2009. Decomposing capacity utilization in data envelopment analysis: An application to banks in India. European Journal of Operational Research 195 (2).
Sánchez, V.D.A., 2002. Frontiers of research in BSS/ICA. Neurocomputing 49, 7-23.
Turk, M., Pentland, A., 1991. Eigenfaces for recognition. Journal of Cognitive Neuroscience 3 (1).
Yuen, P.C., Lai, J.H., 2002. Face representation using independent component analysis. Pattern Recognition 35 (6).
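The first stage of the two-stage procedure summarized in the conclusions can be outlined in code. The sketch below is illustrative, not the authors' implementation: it runs scikit-learn's FastICA on a hypothetical 85 x 5 input matrix, ranks the resulting ICs by absolute kurtosis, and keeps the top components (three, here, as an assumed cut-off) that would then replace the original inputs in the DEA model.

```python
import numpy as np
from scipy.stats import kurtosis
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Hypothetical input matrix: 85 DMUs x 5 correlated hospital inputs
# (Beds, Doctors, Nurses, Administrative persons, Administrative staffers).
base = rng.gamma(shape=5.0, scale=40.0, size=(85, 1))
X = base * rng.uniform(0.5, 1.5, size=(85, 5)) + rng.normal(0.0, 5.0, size=(85, 5))

# Stage 1: extract as many ICs as there are inputs.
ica = FastICA(n_components=5, whiten="unit-variance", random_state=0)
S = ica.fit_transform(X)                 # 85 x 5 matrix of independent components

# Rank ICs by absolute excess kurtosis: the most non-Gaussian components
# are taken to represent the independent sources of the input variables.
k = np.abs(kurtosis(S, axis=0))
order = np.argsort(k)[::-1]
selected = S[:, order[:3]]               # keep, say, the top 3 ICs

# Stage 2 (not shown): the selected ICs replace the original inputs in the
# DEA model; a shift/rescale to positive values may be needed beforehand,
# since DEA inputs are conventionally non-negative.
print(selected.shape)
```

The kurtosis ranking is the selection criterion named in the paper; the simulated data, the 3-IC cut-off, and the positivity adjustment noted in the final comment are assumptions of this sketch.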


More information

Data Envelopment Analysis and its aplications

Data Envelopment Analysis and its aplications Data Envelopment Analysis and its aplications VŠB-Technical University of Ostrava Czech Republic 13. Letní škola aplikované informatiky Bedřichov Content Classic Special - for the example The aim to calculate

More information

Machine Learning (BSMC-GA 4439) Wenke Liu

Machine Learning (BSMC-GA 4439) Wenke Liu Machine Learning (BSMC-GA 4439) Wenke Liu 02-01-2018 Biomedical data are usually high-dimensional Number of samples (n) is relatively small whereas number of features (p) can be large Sometimes p>>n Problems

More information

Independent Component Analysis. PhD Seminar Jörgen Ungh

Independent Component Analysis. PhD Seminar Jörgen Ungh Independent Component Analysis PhD Seminar Jörgen Ungh Agenda Background a motivater Independence ICA vs. PCA Gaussian data ICA theory Examples Background & motivation The cocktail party problem Bla bla

More information

Independent Component Analysis and Unsupervised Learning

Independent Component Analysis and Unsupervised Learning Independent Component Analysis and Unsupervised Learning Jen-Tzung Chien National Cheng Kung University TABLE OF CONTENTS 1. Independent Component Analysis 2. Case Study I: Speech Recognition Independent

More information

PRINCIPAL COMPONENT ANALYSIS TO RANKING TECHNICAL EFFICIENCIES THROUGH STOCHASTIC FRONTIER ANALYSIS AND DEA

PRINCIPAL COMPONENT ANALYSIS TO RANKING TECHNICAL EFFICIENCIES THROUGH STOCHASTIC FRONTIER ANALYSIS AND DEA PRINCIPAL COMPONENT ANALYSIS TO RANKING TECHNICAL EFFICIENCIES THROUGH STOCHASTIC FRONTIER ANALYSIS AND DEA Sergio SCIPPACERCOLA Associate Professor, Department of Economics, Management, Institutions University

More information

Determination of Economic Optimal Strategy for Increment of the Electricity Supply Industry in Iran by DEA

Determination of Economic Optimal Strategy for Increment of the Electricity Supply Industry in Iran by DEA International Mathematical Forum, 2, 2007, no. 64, 3181-3189 Determination of Economic Optimal Strategy for Increment of the Electricity Supply Industry in Iran by DEA KH. Azizi Department of Management,

More information

Linear Subspace Models

Linear Subspace Models Linear Subspace Models Goal: Explore linear models of a data set. Motivation: A central question in vision concerns how we represent a collection of data vectors. The data vectors may be rasterized images,

More information

The Comparison of Stochastic and Deterministic DEA Models

The Comparison of Stochastic and Deterministic DEA Models The International Scientific Conference INPROFORUM 2015, November 5-6, 2015, České Budějovice, 140-145, ISBN 978-80-7394-536-7. The Comparison of Stochastic and Deterministic DEA Models Michal Houda, Jana

More information

The Allocation of Talent and U.S. Economic Growth

The Allocation of Talent and U.S. Economic Growth The Allocation of Talent and U.S. Economic Growth Chang-Tai Hsieh Erik Hurst Chad Jones Pete Klenow October 2012 Large changes in the occupational distribution... White Men in 1960: 94% of Doctors, 96%

More information

Sensitivity and Stability Radius in Data Envelopment Analysis

Sensitivity and Stability Radius in Data Envelopment Analysis Available online at http://ijim.srbiau.ac.ir Int. J. Industrial Mathematics Vol. 1, No. 3 (2009) 227-234 Sensitivity and Stability Radius in Data Envelopment Analysis A. Gholam Abri a, N. Shoja a, M. Fallah

More information

Research on Feature Extraction Method for Handwritten Chinese Character Recognition Based on Kernel Independent Component Analysis

Research on Feature Extraction Method for Handwritten Chinese Character Recognition Based on Kernel Independent Component Analysis Research Journal of Applied Sciences, Engineering and echnology 6(7): 183-187, 013 ISSN: 040-7459; e-issn: 040-7467 Maxwell Scientific Organization, 013 Submitted: November 13, 01 Accepted: January 11,

More information

Independent Component Analysis (ICA) Bhaskar D Rao University of California, San Diego

Independent Component Analysis (ICA) Bhaskar D Rao University of California, San Diego Independent Component Analysis (ICA) Bhaskar D Rao University of California, San Diego Email: brao@ucsdedu References 1 Hyvarinen, A, Karhunen, J, & Oja, E (2004) Independent component analysis (Vol 46)

More information

Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra

Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE Artificial Intelligence Grad Project Dr. Debasis Mitra Factor Analysis (FA) Non-negative Matrix Factorization (NMF) CSE 5290 - Artificial Intelligence Grad Project Dr. Debasis Mitra Group 6 Taher Patanwala Zubin Kadva Factor Analysis (FA) 1. Introduction Factor

More information

Lecture 24: Principal Component Analysis. Aykut Erdem May 2016 Hacettepe University

Lecture 24: Principal Component Analysis. Aykut Erdem May 2016 Hacettepe University Lecture 4: Principal Component Analysis Aykut Erdem May 016 Hacettepe University This week Motivation PCA algorithms Applications PCA shortcomings Autoencoders Kernel PCA PCA Applications Data Visualization

More information

Dimension Reduction (PCA, ICA, CCA, FLD,

Dimension Reduction (PCA, ICA, CCA, FLD, Dimension Reduction (PCA, ICA, CCA, FLD, Topic Models) Yi Zhang 10-701, Machine Learning, Spring 2011 April 6 th, 2011 Parts of the PCA slides are from previous 10-701 lectures 1 Outline Dimension reduction

More information

A Modular NMF Matching Algorithm for Radiation Spectra

A Modular NMF Matching Algorithm for Radiation Spectra A Modular NMF Matching Algorithm for Radiation Spectra Melissa L. Koudelka Sensor Exploitation Applications Sandia National Laboratories mlkoude@sandia.gov Daniel J. Dorsey Systems Technologies Sandia

More information

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2018

CPSC 340: Machine Learning and Data Mining. Sparse Matrix Factorization Fall 2018 CPSC 340: Machine Learning and Data Mining Sparse Matrix Factorization Fall 2018 Last Time: PCA with Orthogonal/Sequential Basis When k = 1, PCA has a scaling problem. When k > 1, have scaling, rotation,

More information

Independent component analysis: algorithms and applications

Independent component analysis: algorithms and applications PERGAMON Neural Networks 13 (2000) 411 430 Invited article Independent component analysis: algorithms and applications A. Hyvärinen, E. Oja* Neural Networks Research Centre, Helsinki University of Technology,

More information

Independent Component Analysis

Independent Component Analysis A Short Introduction to Independent Component Analysis Aapo Hyvärinen Helsinki Institute for Information Technology and Depts of Computer Science and Psychology University of Helsinki Problem of blind

More information

Iterative face image feature extraction with Generalized Hebbian Algorithm and a Sanger-like BCM rule

Iterative face image feature extraction with Generalized Hebbian Algorithm and a Sanger-like BCM rule Iterative face image feature extraction with Generalized Hebbian Algorithm and a Sanger-like BCM rule Clayton Aldern (Clayton_Aldern@brown.edu) Tyler Benster (Tyler_Benster@brown.edu) Carl Olsson (Carl_Olsson@brown.edu)

More information

Principal Component Analysis

Principal Component Analysis Principal Component Analysis Yingyu Liang yliang@cs.wisc.edu Computer Sciences Department University of Wisconsin, Madison [based on slides from Nina Balcan] slide 1 Goals for the lecture you should understand

More information

Kazuhiro Fukui, University of Tsukuba

Kazuhiro Fukui, University of Tsukuba Subspace Methods Kazuhiro Fukui, University of Tsukuba Synonyms Multiple similarity method Related Concepts Principal component analysis (PCA) Subspace analysis Dimensionality reduction Definition Subspace

More information

Independent Component Analysis

Independent Component Analysis 1 Independent Component Analysis Background paper: http://www-stat.stanford.edu/ hastie/papers/ica.pdf 2 ICA Problem X = AS where X is a random p-vector representing multivariate input measurements. S

More information

Learning features by contrasting natural images with noise

Learning features by contrasting natural images with noise Learning features by contrasting natural images with noise Michael Gutmann 1 and Aapo Hyvärinen 12 1 Dept. of Computer Science and HIIT, University of Helsinki, P.O. Box 68, FIN-00014 University of Helsinki,

More information

Eigenface-based facial recognition

Eigenface-based facial recognition Eigenface-based facial recognition Dimitri PISSARENKO December 1, 2002 1 General This document is based upon Turk and Pentland (1991b), Turk and Pentland (1991a) and Smith (2002). 2 How does it work? The

More information

Two-Layered Face Detection System using Evolutionary Algorithm

Two-Layered Face Detection System using Evolutionary Algorithm Two-Layered Face Detection System using Evolutionary Algorithm Jun-Su Jang Jong-Hwan Kim Dept. of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology (KAIST),

More information

Feature Extraction with Weighted Samples Based on Independent Component Analysis

Feature Extraction with Weighted Samples Based on Independent Component Analysis Feature Extraction with Weighted Samples Based on Independent Component Analysis Nojun Kwak Samsung Electronics, Suwon P.O. Box 105, Suwon-Si, Gyeonggi-Do, KOREA 442-742, nojunk@ieee.org, WWW home page:

More information

Forecasting Wind Ramps

Forecasting Wind Ramps Forecasting Wind Ramps Erin Summers and Anand Subramanian Jan 5, 20 Introduction The recent increase in the number of wind power producers has necessitated changes in the methods power system operators

More information

Advanced Introduction to Machine Learning CMU-10715

Advanced Introduction to Machine Learning CMU-10715 Advanced Introduction to Machine Learning CMU-10715 Principal Component Analysis Barnabás Póczos Contents Motivation PCA algorithms Applications Some of these slides are taken from Karl Booksh Research

More information