Clustering VS Classification


MCQ Clustering VS Classification

1. What is the relation between the distance between clusters and the corresponding class discriminability?
   a. proportional   b. inversely proportional   c. no relation

2. To measure the density at a point, consider
   a. a sphere of any size   b. a sphere of unit volume   c. a hyper-cube of unit volume   d. both (b) and (c)
   Ans: (d)

3. Agglomerative clustering falls under which type of clustering method?
   a. partition   b. hierarchical   c. none of the above

4. Indicate which is/are a method of clustering.
   a. linkage method   b. split and merge   c. both a and b   d. neither a nor b
   Ans: (c)

5. K-means and K-medoids are examples of which type of clustering method?
   a. hierarchical   b. partition   c. probabilistic   d. none of the above

6. Unsupervised classification can be termed as
   a. distance measurement   b. dimensionality reduction   c. clustering   d. none of the above
   Ans: (c)

7. Indicate which one is a method of density estimation.
   a. histogram based   b. branch and bound procedure   c. neighborhood distance   d. all of the above
   Ans: (d)
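
Questions 2 and 7 both touch on density estimation. As a rough illustration, the sketch below estimates the density at a point by counting samples that fall inside a unit-volume hypercube, in the spirit of a Parzen-window estimate; the toy data and the helper name density_at_point are arbitrary choices, not anything prescribed by the questions.

```python
import numpy as np

def density_at_point(x, samples, h=1.0):
    """Estimate p(x) by counting samples inside a hypercube of side h
    (volume h**d) centred at x, as in Parzen-window estimation."""
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    inside = np.all(np.abs(samples - x) <= h / 2.0, axis=1)
    k = np.count_nonzero(inside)
    return k / (n * h ** d)

# Toy 2-D data; with h = 1 the cube has unit volume, so the estimate is
# simply the fraction of samples falling inside the cube.
rng = np.random.default_rng(0)
data = rng.standard_normal((1000, 2))
print(density_at_point(np.zeros(2), data, h=1.0))
```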

MCQ Linear Algebra

1. Which of the properties are true for matrix multiplication?
   a. distributive   b. commutative   c. both a and b   d. neither a nor b

2. Which of the operations can be valid with two matrices of different sizes?
   a. addition   b. subtraction   c. multiplication   d. division
   Ans: (c)

3. Which of the following statements are true?
   a. $\operatorname{trace}(A) = \operatorname{trace}(A^T)$   b. $\det(A) = \det(A^T)$   c. both a and b   d. neither a nor b
   Ans: (c)

4. Which property ensures that the inverse of a matrix exists?
   a. determinant is non-zero   b. determinant is zero   c. matrix is square   d. trace of the matrix is a positive value

5. Identify the correct order from general to specific matrix.
   a. square -> identity -> symmetric -> diagonal
   b. symmetric -> diagonal -> square -> identity
   c. square -> diagonal -> identity -> symmetric
   d. square -> symmetric -> diagonal -> identity
   Ans: (d)

6. Which of the statements are true?
   a. If $A$ is a symmetric matrix, $A^{-1}$ is also symmetric
   b. $\det(A^{-1}) = 1/\det(A)$
   c. If $A$ and $B$ are invertible matrices, $AB$ is an invertible matrix too
   d. all of the above
   Ans: (d)

7. Which of the following options hold true?
   a. $(A^{-1})^{-1} = A$   b. $(kA)^{-1} = A^{-1}/k$   c. $(A^T)^{-1} = (A^{-1})^T$   d. all of the above
   Ans: (d)
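
The identities in questions 1, 3, 6 and 7 are easy to check numerically. The snippet below is a minimal NumPy sketch using arbitrary invertible matrices; the particular values of A, B and k are assumptions made only for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])   # arbitrary invertible matrices
B = np.array([[0.0, 1.0], [4.0, 2.0]])
k = 3.0

print(np.allclose(A @ B, B @ A))                                # False: not commutative
print(np.allclose(A @ (B + B), A @ B + A @ B))                  # True: distributive
print(np.isclose(np.trace(A), np.trace(A.T)))                   # True
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))         # True
print(np.isclose(np.linalg.det(np.linalg.inv(A)), 1 / np.linalg.det(A)))  # True
print(np.allclose(np.linalg.inv(k * A), np.linalg.inv(A) / k))  # True
print(np.allclose(np.linalg.inv(A.T), np.linalg.inv(A).T))      # True
```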

MCQ Eigenvalues and Eigenvectors

1. The eigenvalues of the matrix $\begin{pmatrix} 2 & 7 \\ 1 & 6 \end{pmatrix}$ are
   a. 3 and 0   b. -2 and 7   c. -5 and 1   d. 3 and -5
   Ans: (c)

2. The eigenvalues of $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ are
   a. -1, 1 and 2   b. 1, 1 and -2   c. -1, -1 and 2   d. 1, 1 and 2
   Ans: (c)

3. The eigenvectors of $\begin{pmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{pmatrix}$ are
   a. (1 1 1), (1 0 1) and (1 1 0)
   b. (1 1 -1), (1 0 -1) and (1 1 0)
   c. (-1 1 -1), (1 0 1) and (1 1 0)
   d. (1 1 1), (-1 0 1) and (-1 1 0)
   Ans: (d)

4. Indicate which of the statements are true.
   a. A and A*A have the same eigenvectors
   b. If m is an eigenvalue of A, then m^2 is an eigenvalue of A*A
   c. both a and b   d. neither a nor b
   Ans: (c)

5. Indicate which of the statements are true.
   a. If m is an eigenvalue of $A$, then m is an eigenvalue of $A^T$
   b. If m is an eigenvalue of $A$, then 1/m is an eigenvalue of $A^{-1}$
   c. both a and b   d. neither a nor b
   Ans: (c)

6. Indicate which of the statements are true.
   a. A singular matrix must have a zero eigenvalue
   b. A singular matrix must have a negative eigenvalue
   c. A singular matrix must have a complex eigenvalue
   d. all of the above
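
The eigenvalues and eigenvectors asked for in questions 2 and 3 can be verified with a few lines of NumPy; the check below assumes nothing beyond the 3 x 3 matrix stated in the questions.

```python
import numpy as np

# Questions 2 and 3: eigen-decomposition of the 3x3 matrix with zeros on the
# diagonal and ones elsewhere.
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)

vals, vecs = np.linalg.eig(A)
print(np.sort(vals))                      # approximately [-1. -1.  2.]

# (1,1,1) is an eigenvector for 2; (-1,0,1) and (-1,1,0) for -1.
for v, lam in [((1, 1, 1), 2), ((-1, 0, 1), -1), ((-1, 1, 0), -1)]:
    v = np.array(v, dtype=float)
    print(np.allclose(A @ v, lam * v))    # True for all three
```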

MCQ Vector Spaces

1. Which of these is a vector space?
   a. $\{(x, y, z, w) \in \mathbb{R}^4 : x + y - z + w = 0\}$
   b. $\{(x, y, z) \in \mathbb{R}^3 : x + y + z = 0\}$
   c. $\{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}$
   d. $\left\{ \begin{pmatrix} a & 1 \\ b & c \end{pmatrix} : a, b, c \in \mathbb{R} \right\}$
   Ans: (d)

2. Under which of the following operations is $\{(x, y) : x, y \in \mathbb{R}\}$ a vector space?
   a. $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$ and $r \cdot (x, y) = (rx, y)$
   b. $(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$ and $r \cdot (x, y) = (rx, 0)$
   c. both a and b   d. neither a nor b
   Ans: (d)

3. Which of the following statements are true?
   a. $r \cdot v = 0$ if and only if $r = 0$
   b. $r_1 \cdot v = r_2 \cdot v$ if and only if $r_1 = r_2$
   c. the set of all matrices under the usual operations is not a vector space
   d. all of the above

4. What is the dimension of the subspace $H = \left\{ \begin{pmatrix} a - 3b + 6c \\ 5a + 4d \\ b - 2c - d \\ 5d \end{pmatrix} : a, b, c, d \in \mathbb{R} \right\}$?
   a. 1   b. 2   c. 3   d. 4
   Ans: (c)

5. What is the rank of the matrix
   2 1 7 4 1 2 8 5 1 4 10 7 6 8 3 3 2 10 0 4
   a. 2   b. 3   c. 4   d. 5

6. If $v_1, v_2, v_3, v_4$ are in $\mathbb{R}^4$ and $v_3$ is not a linear combination of $v_1, v_2, v_4$, then $\{v_1, v_2, v_3, v_4\}$ must be linearly independent.
   a. True   b. False
   Ans: (b); for example, if $v_4 = v_1 + v_2$, then $1\,v_1 + 1\,v_2 + 0\,v_3 - 1\,v_4 = 0$.

7. The vectors $x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $x_2 = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}$, $x_3 = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}$ are:
   a. linearly dependent   b. linearly independent
   Ans: (a), because $2x_1 + x_2 - x_3 = 0$.

8. The vectors $x_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, $x_2 = \begin{pmatrix} 5 \\ 3 \end{pmatrix}$ are:
   a. linearly dependent   b. linearly independent
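
For questions 4, 7 and 8 the dimension and (in)dependence claims reduce to rank computations. The sketch below is one way to check them in NumPy, with the spanning vectors of H read off from the coefficients of a, b, c and d.

```python
import numpy as np

# Question 4: H is spanned by the coefficient vectors of a, b, c, d (columns).
H_basis = np.array([[1, -3,  6,  0],
                    [5,  0,  0,  4],
                    [0,  1, -2, -1],
                    [0,  0,  0,  5]], dtype=float)
print(np.linalg.matrix_rank(H_basis))    # 3, so dim H = 3

# Question 7: the columns x1, x2, x3 are linearly dependent (2*x1 + x2 - x3 = 0).
X = np.array([[1,  1, 3],
              [1, -1, 1],
              [1,  2, 4]], dtype=float)
print(np.linalg.matrix_rank(X))          # 2 < 3

# Question 8: the two columns are linearly independent.
Y = np.array([[1, 5],
              [2, 3]], dtype=float)
print(np.linalg.matrix_rank(Y))          # 2
```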

Rank and SVD MCQ

1. The number of non-zero rows in an echelon form is called the
   a. reduced echelon form   b. rank of a matrix   c. conjugate of the matrix   d. cofactor of the matrix

2. Let A and B be arbitrary m x n matrices. Then which one of the following statements is true?
   a. $\operatorname{rank}(A + B) \le \operatorname{rank}(A) + \operatorname{rank}(B)$
   b. $\operatorname{rank}(A + B) < \operatorname{rank}(A) + \operatorname{rank}(B)$
   c. $\operatorname{rank}(A + B) \ge \operatorname{rank}(A) + \operatorname{rank}(B)$
   d. $\operatorname{rank}(A + B) > \operatorname{rank}(A) + \operatorname{rank}(B)$

3. The rank of the matrix $\begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$ is
   a. 0   b. 2   c. 1   d. 3

4. The rank of $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$ is
   a. 3   b. 2   c. 1   d. 0
   Ans: (c)

5. Consider the following two statements:
   I. The maximum number of linearly independent column vectors of a matrix A is called the rank of A.
   II. If A is an n x n square matrix, it will be nonsingular if rank A = n.
   With reference to the above statements, which of the following applies?
   a. Both the statements are false   b. Both the statements are true   c. I is true but II is false   d. I is false but II is true

6. The rank of a 3 x 3 matrix C (= AB), found by multiplying a non-zero column matrix A of size 3 x 1 and a non-zero row matrix B of size 1 x 3, is
   a. 0   b. 1   c. 2   d. 3

7. Find the singular values of the matrix $B = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$.
   a. 2 and 4   b. 3 and 4   c. 2 and 3   d. 3 and 1
   Ans: (d)

8. The Gram-Schmidt process involves factorizing a matrix as a product of two matrices, where
   a. one is orthogonal and the other one is upper-triangular
   b. both are symmetric
   c. one is symmetric and the other one is anti-symmetric
   d. one is diagonal and the other one is symmetric

9. SVD is defined as $A = U \Sigma V^T$, where U consists of eigenvectors of
   a. $A A^T$   b. $A^T A$   c. $A A^{-1}$   d. A*A

10. SVD is defined as $A = U \Sigma V^T$, where $\Sigma$ is
   a. a diagonal matrix having singular values
   b. a diagonal matrix having arbitrary values
   c. an identity matrix
   d. a non-diagonal matrix
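
The singular values in question 7 and the ranks in questions 3, 4 and 6 can be confirmed numerically; the snippet below is a small NumPy check, with the outer-product example for question 6 using arbitrary non-zero entries.

```python
import numpy as np

# Question 7: singular values of B (square roots of the eigenvalues of B^T B).
B = np.array([[1.0, 2.0], [2.0, 1.0]])
U, s, Vt = np.linalg.svd(B)
print(s)                                          # [3. 1.]

# Questions 3, 4 and 6: ranks of the zero matrix, the all-ones matrix,
# and an outer product of a non-zero column with a non-zero row.
print(np.linalg.matrix_rank(np.zeros((2, 3))))    # 0
print(np.linalg.matrix_rank(np.ones((3, 3))))     # 1
a = np.array([[1.0], [2.0], [3.0]])               # 3 x 1
b = np.array([[4.0, 5.0, 6.0]])                   # 1 x 3
print(np.linalg.matrix_rank(a @ b))               # 1
```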

MCQ Normal Distribution and Decision Boundary I

1. Three components of the Bayes decision rule are the class prior, the likelihood and the
   a. Evidence   b. Instance   c. Confidence   d. Salience

2. The Gaussian function is also called the ______ function.
   a. Bell   b. Signum   c. Fixed point   d. Quintic

3. The span of the Gaussian curve is determined by the ______ of the distribution.
   a. Mean   b. Mode   c. Median   d. Variance
   Ans: (d)

4. When the value of the data is equal to the mean of the distribution to which it belongs, the Gaussian function attains its ______ value.
   a. Minimum   b. Maximum   c. Zero   d. None of the above

5. The full width of the Gaussian function at half the maximum is
   a. 2.35σ   b. 1.5σ   c. 0.5σ   d. 0.355σ

6. A property of the correlation coefficient is
   a. $-1 \le \rho_{xy} \le 1$   b. $0.5 \le \rho_{xy} \le 1$   c. $-1 \le \rho_{xy} \le 1.5$   d. $-0.5 \le \rho_{xy} \le 0.5$

7. The correlation coefficient can be viewed as the ______ of the angle between two vectors in $\mathbb{R}^D$.
   a. Sin   b. Cos   c. Tan   d. Sec

8. For n-dimensional data, the number of correlation coefficients is equal to
   a. $^{n}C_{2}$   b. $n - 1$   c. $n^2$   d. $\log(n)$

9. Iso-contour lines of smaller radius depict a ______ value of the density function.
   a. Higher   b. Lower   c. Equal   d. None of the above
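
Question 5's constant and question 7's interpretation can both be checked directly. The sketch below evaluates the Gaussian full width at half maximum, 2*sqrt(2 ln 2)*sigma, and compares the sample correlation coefficient with the cosine of the angle between centred vectors; the data are made-up numbers used only for the comparison.

```python
import numpy as np

# Question 5: FWHM of a Gaussian is 2*sqrt(2*ln 2)*sigma.
print(2 * np.sqrt(2 * np.log(2)))        # ~2.3548, i.e. about 2.35*sigma

# Question 7: for centred data, the correlation coefficient equals the cosine
# of the angle between the two sample vectors.
rng = np.random.default_rng(0)
x = rng.standard_normal(500)
y = 0.6 * x + 0.8 * rng.standard_normal(500)
xc, yc = x - x.mean(), y - y.mean()
cos_angle = xc @ yc / (np.linalg.norm(xc) * np.linalg.norm(yc))
print(np.isclose(cos_angle, np.corrcoef(x, y)[0, 1]))   # True
```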

MCQ Normal Distribution and Decision Boundary II

1. If the covariance matrix is strictly diagonal with equal variances, then the iso-contour lines (data scatter) of the data resemble
   a. Concentric circles   b. Ellipses   c. Oriented ellipses   d. None of the above

2. The nature of the decision boundary is determined by the
   a. Decision rule   b. Decision boundary   c. Discriminant function   d. None of the above
   Ans: (c)

3. In supervised learning, class labels of the training samples are
   a. Known   b. Unknown   c. Doesn't matter   d. Partially known

4. If learning is online then it is called
   a. Supervised   b. Unsupervised   c. Semi-supervised   d. None of the above

5. In supervised learning, the process of learning is
   a. Online   b. Offline   c. Partially online and offline   d. Doesn't matter

6. For spiral data the decision boundary will be
   a. Linear   b. Non-linear   c. Does not exist

7. In a 2-class problem, if the discriminant function satisfies $g_1(x) = g_2(x)$, then the data point lies
   a. On the decision boundary   b. On class 1's side   c. On class 2's side   d. None of the above

Bayes Theorem MCQ

1. $P(X)\,P(\omega_i \mid X)$ = ?
   a. $P^{-1}(X)\,P(\omega_i \mid X)$   b. $P(X)\,P^{-1}(\omega_i \mid X)$   c. $P(X \mid \omega_i)\,P(\omega_i)$   d. $P(X \mid \omega_i)\,P(\omega_i \mid X)$
   Ans: (c)

2. In Bayes theorem, the unconditional probability is called the
   a. Evidence   b. Likelihood   c. Prior   d. Posterior

3. In Bayes theorem, the class-conditional probability is called the
   a. Evidence   b. Likelihood   c. Prior   d. Posterior

4. When the covariance term in the Mahalanobis distance becomes the identity, the distance is similar to the
   a. Euclidean distance   b. Manhattan distance   c. City block distance   d. Geodesic distance

5. The decision boundary for N-dimensional (N > 3) data will be a
   a. Point   b. Line   c. Plane   d. Hyperplane
   Ans: (d)

6. Bayes error is the ______ bound of the probability of classification error.
   a. Lower   b. Upper

7. The Bayes decision rule is the theoretically ______ classifier that minimizes the probability of classification error.
   a. Best   b. Worst   c. Average
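
Question 1 is the product form of Bayes theorem, $P(X)\,P(\omega_i \mid X) = P(X \mid \omega_i)\,P(\omega_i)$. The sketch below checks it numerically for a two-class case; the priors and class-conditional likelihoods are made-up numbers used only for the illustration.

```python
import numpy as np

priors = np.array([0.6, 0.4])                  # P(w_1), P(w_2)
likelihoods = np.array([0.2, 0.5])             # P(X | w_1), P(X | w_2)

evidence = np.sum(likelihoods * priors)        # P(X), the unconditional probability
posteriors = likelihoods * priors / evidence   # P(w_i | X)

print(posteriors)                              # [0.375 0.625]
print(np.allclose(evidence * posteriors, likelihoods * priors))   # True
```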

MCQ Linear Discriminant Function and Perceptron Learning

1. A perceptron is:
   a. a single McCulloch-Pitts neuron
   b. an autoassociative neural network
   c. a double layer autoassociative neural network
   d. all of the above

2. A perceptron is used as a classifier for
   a. Linearly separable data   b. Non-linearly separable data   c. Linearly non-separable data   d. Any data

3. A 4-input neuron has weights 1, 2, 3 and 4. The transfer function is linear with the constant of proportionality being equal to 2. The inputs are 4, 10, 5 and 20 respectively. The output will be:
   a. 238   b. 76   c. 119   d. 178

4. Consider a perceptron with training samples $u \in \mathbb{R}^2$ (inputs $u_1$, $u_2$ and output $f(a)$), with activation
   $f(a) = +1$ for $a > 0$, $f(a) = 0$ for $a = 0$, $f(a) = -1$ for $a < 0$.
   Let the desired output $y$ be $+1$ when an element of class A = {(1,2), (2,4), (3,3), (4,4)} is applied as input, and $-1$ for class B = {(0,0), (2,3), (3,0), (4,2)}. Let the initial connection weights be $w_0(0) = +1$, $w_1(0) = -2$, $w_2(0) = +1$ and the learning rate be $\eta = 0.5$.
   This perceptron is to be trained by the perceptron convergence procedure, for which the weight update formula is $w(t+1) = w(t) + \eta\,(y - f(a))\,u$, where $f(a)$ is the actual output.

   A. If u = (4,4) is applied as input, then w(1) = ?
      a. $[2, 2, 5]^T$   b. $[2, 1, 5]^T$   c. $[2, 1, 1]^T$   d. $[2, 0, 5]^T$

   B. If (4,2) is then applied, what will be w(2)?
      a. $[1, -2, 3]^T$   b. $[-1, -2, 3]^T$   c. $[1, -2, -3]^T$   d. $[1, 2, 3]^T$

5. The perceptron training rule converges if the data is
   a. Linearly separable   b. Non-linearly separable   c. Linearly non-separable   d. Any data

6. Is the XOR problem solvable using a single perceptron?
   a. Yes   b. No   c. Can't say

7. Consider a perceptron with training samples $u \in \mathbb{R}^2$ and actual output $x \in \{0, 1\}$. Let the desired output be 0 when an element of class A = {(2,4), (3,2), (3,4)} is applied as input, and 1 for class B = {(1,0), (1,2), (2,1)}. Let the learning rate $\eta$ be 0.5 and the initial connection weights be $w_0 = 0$, $w_1 = 1$, $w_2 = 1$. Answer the following questions:

   A. Shall the perceptron convergence procedure terminate if the input patterns from classes A and B are repeatedly applied with a very small learning rate?
      a. Yes   b. No   c. Can't say
      Ans: (a), since the classes are linearly separable.

   B. Now add the sample (5,2) to class B. Will the procedure converge now?
      a. Yes   b. No   c. Can't say
      Ans: (b); after adding the above sample, the classes become linearly non-separable.
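
The arithmetic behind questions 3 and 4 can be verified step by step. The sketch below uses the stated weights, inputs and learning rate, with the sign activation written out explicitly and the input augmented with a constant 1 for the bias weight $w_0$.

```python
import numpy as np

# Question 3: linear transfer function with proportionality constant 2.
weights = np.array([1, 2, 3, 4])
inputs = np.array([4, 10, 5, 20])
print(2 * weights @ inputs)                      # 238

# Question 4: perceptron update w(t+1) = w(t) + eta*(y - f(a))*u,
# with u augmented as (1, u1, u2) and f(a) the sign of w . u.
def f(a):
    return 0 if a == 0 else (1 if a > 0 else -1)

w = np.array([1.0, -2.0, 1.0])                   # (w0, w1, w2)
eta = 0.5

u = np.array([1.0, 4.0, 4.0])                    # sample (4,4) from class A, y = +1
w = w + eta * (1 - f(w @ u)) * u
print(w)                                         # [2. 2. 5.]

u = np.array([1.0, 4.0, 2.0])                    # sample (4,2) from class B, y = -1
w = w + eta * (-1 - f(w @ u)) * u
print(w)                                         # [ 1. -2.  3.]
```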

MCQ Linear and Non-Linear Decision Boundaries

1. The decision boundary in the case of the same covariance matrix for all classes, with identical diagonal elements, is:
   a. Linear   b. Non-linear   c. None of the above

2. The decision boundary in the case of a diagonal covariance matrix with identical diagonal elements is given by $W^T(X - X_0) = 0$, where $W$ is given by:
   a. $(\mu_k - \mu_l)/\sigma^2$   b. $(\mu_k + \mu_l)/\sigma^2$   c. $(\mu_k^2 + \mu_l^2)/\sigma^2$   d. $(\mu_k + \mu_l)/\sigma$

3. The decision boundary in the case of an arbitrary covariance matrix that is identical for all classes is:
   a. Linear   b. Non-linear   c. None of the above

4. The decision boundary in the case of an arbitrary covariance matrix that is identical for all classes is given by $W^T(X - X_0) = 0$, where $W$ is given by:
   a. $(\mu_k - \mu_l)/\sigma^2$   b. $\Sigma^{-1}(\mu_k - \mu_l)$   c. $(\mu_k^2 + \mu_l^2)/\sigma^2$   d. $\Sigma^{-1}(\mu_k^2 - \mu_l^2)$

5. The decision boundary in the case of arbitrary and unequal covariance matrices is:
   a. Linear   b. Non-linear   c. None of the above

6. The discriminant function in the case of an arbitrary covariance matrix, with all parameters class dependent, is given by $X^T W_i X + w_i^T X + w_{i0} = 0$, where $W_i$ is given by:
   a. $-\frac{1}{2}\Sigma_i^{-1}$   b. $\Sigma_i^{-1}\mu_i$   c. $\frac{1}{2}\Sigma_i^{-1}\mu_i$   d. $\frac{1}{4}\Sigma_i^{-1}$
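
For the shared-covariance case in questions 3 and 4, the boundary is linear with $W = \Sigma^{-1}(\mu_k - \mu_l)$. The sketch below computes it for made-up means and covariance, assuming equal class priors so that $X_0$ is the midpoint of the two means.

```python
import numpy as np

# Two classes with a shared covariance matrix and equal priors: the boundary
# W^T (x - x0) = 0 is linear, with W = Sigma^{-1}(mu_k - mu_l) and
# x0 = (mu_k + mu_l)/2.  All numbers below are illustrative only.
mu_k = np.array([2.0, 0.0])
mu_l = np.array([0.0, 1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

W = np.linalg.solve(Sigma, mu_k - mu_l)
x0 = (mu_k + mu_l) / 2.0

# The two class means fall on opposite sides of the boundary, as expected.
print(W @ (mu_k - x0) > 0, W @ (mu_l - x0) < 0)   # True True
```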

MCQ PCA

1. The tool used to obtain a PCA is
   a. LU Decomposition   b. QR Decomposition   c. SVD   d. Cholesky Decomposition
   Ans: (c)

2. PCA is used for
   a. Dimensionality Enhancement   b. Dimensionality Reduction   c. Both   d. None

3. The scatter matrix of the transformed feature vectors is given by
   a. $\sum_{k=1}^{N}(x_k - \mu)(x_k - \mu)^T$
   b. $\sum_{k=1}^{N}(x_k - \mu)^T(x_k - \mu)$
   c. $\sum_{k=1}^{N}(\mu - x_k)(\mu - x_k)^T$
   d. $\sum_{k=1}^{N}(\mu - x_k)^T(\mu - x_k)$

4. PCA is used for
   a. Supervised Classification   b. Unsupervised Classification   c. Semi-supervised Classification   d. Cannot be used for classification

5. The vectors which correspond to the vanishing singular values of a matrix, and which span the null space of the matrix, are:
   a. Right singular vectors   b. Left singular vectors   c. All the singular vectors   d. None

6. If S is the scatter of the data in the original domain, then the scatter of the transformed feature vectors is given by
   a. $S^T$   b. $S$   c. $W S W^T$   d. $W^T S W$
   Ans: (d)

7. The largest eigenvector gives the direction of the
   a. Maximum scatter of the data
   b. Minimum scatter of the data
   c. No such information can be interpreted
   d. Second largest eigenvector, which is in the same direction

8. The following linear transform does not have a fixed set of basis vectors:
   a. DCT   b. DFT   c. DWT   d. PCA
   Ans: (d)

9. The within-class scatter matrix is given by:
   a. $\sum_{i=1}^{c}\sum_{k=1}^{N}(x_k - \mu_i)(x_k - \mu_i)^T$
   b. $\sum_{i=1}^{c}\sum_{k=1}^{N}(x_k - \mu_i)^T(x_k - \mu_i)$
   c. $\sum_{i=1}^{c}\sum_{k=1}^{N}(x_i - \mu_k)(x_i - \mu_k)^T$
   d. $\sum_{i=1}^{c}\sum_{k=1}^{N}(x_i - \mu_k)^T(x_i - \mu_k)$

10. The between-class scatter matrix is given by:
   a. $\sum_{i=1}^{c} N_i(\mu_i - \mu)(\mu_i - \mu)^T$
   b. $\sum_{i=1}^{c} N_i(\mu_i - \mu)^T(\mu_i - \mu)$
   c. $\sum_{i=1}^{c} N_i(\mu - \mu_i)(\mu - \mu_i)^T$
   d. $\sum_{i=1}^{c} N_i(\mu - \mu_i)^T(\mu - \mu_i)$

11. Which of the following is an unsupervised technique?
   a. PCA   b. LDA   c. Bayes   d. None of the above
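
Question 1's link between PCA and the SVD can be seen on toy data: the right singular vectors of the centred data matrix are the principal directions, and the squared singular values are the eigenvalues of the scatter matrix. The data and mixing matrix below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 2)) @ np.array([[3.0, 0.0], [1.0, 0.5]])  # toy data

Xc = X - X.mean(axis=0)                      # centre the data
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

scatter = Xc.T @ Xc                          # sum_k (x_k - mu)(x_k - mu)^T
evals, evecs = np.linalg.eigh(scatter)

print(np.allclose(np.sort(s**2), np.sort(evals)))   # True: s^2 are the scatter eigenvalues
print(Vt[0])                                         # direction of maximum scatter (question 7)
```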

MCQ Linear Discriminant Analysis

1. Linear Discriminant Analysis is
   a. Unsupervised Learning   b. Supervised Learning   c. Semi-supervised Learning   d. None of the above

2. The following property of the within-class scatter matrix is a must for LDA:
   a. Singular   b. Non-singular   c. Does not matter   d. Problem-specific

3. In supervised learning, class labels of the training samples are
   a. Known   b. Unknown   c. Doesn't matter   d. Partially known

4. The upper bound of the number of non-zero eigenvalues of $S_W^{-1} S_B$ is (C = number of classes)
   a. C - 1   b. C + 1   c. C   d. None of the above

5. If $S_W$ is singular and N < D, its rank is at most (N is the total number of samples, D the dimension of the data, C the number of classes)
   a. N + C   b. N   c. C   d. N - C
   Ans: (d)

6. If $S_W$ is singular and N < D, the alternative solution is to use (N is the total number of samples, D the dimension of the data)
   a. EM   b. PCA   c. ML   d. Any one of the above
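
Question 4 comes down to the rank of $S_B$, which is built from the C class means and so has rank at most C - 1. The sketch below builds $S_W$ and $S_B$ for a made-up two-class, three-dimensional data set and counts the non-zero eigenvalues of $S_W^{-1} S_B$.

```python
import numpy as np

# Toy data: C = 2 classes in D = 3 dimensions, 20 samples per class, so S_W is
# non-singular here.  The class means are arbitrary.
rng = np.random.default_rng(0)
classes = [rng.standard_normal((20, 3)) + mean
           for mean in (np.array([0.0, 0.0, 0.0]), np.array([3.0, 1.0, -2.0]))]

mu = np.vstack(classes).mean(axis=0)
S_W = sum((X - X.mean(axis=0)).T @ (X - X.mean(axis=0)) for X in classes)
S_B = sum(len(X) * np.outer(X.mean(axis=0) - mu, X.mean(axis=0) - mu) for X in classes)

evals = np.linalg.eigvals(np.linalg.solve(S_W, S_B))
print(np.sum(np.abs(evals) > 1e-8))      # 1, i.e. at most C - 1 non-zero eigenvalues
```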

MCQ GMM

1. A method to estimate the parameters of a distribution is
   a. Maximum Likelihood   b. Linear Programming   c. Dynamic Programming   d. Convex Optimization

2. Gaussian mixtures are also known as
   a. Gaussian multiplication   b. Non-linear superposition of Gaussians   c. Linear superposition of Gaussians   d. None of the above
   Ans: (c)

3. The mixture coefficients of the GMM add up to
   a. 1   b. 0   c. Any value greater than 0   d. Any value less than 0

4. The mixture coefficients are
   a. Strictly positive   b. Positive   c. Strictly negative   d. Negative

5. The mixture coefficients can take a value
   a. Greater than zero   b. Greater than 1   c. Less than zero   d. Between zero and 1
   Ans: (d)

6. For Gaussian mixture models, parameters are estimated using a closed-form solution by
   a. Expectation Minimization   b. Expectation Maximization   c. Maximum Likelihood   d. None of the above

7. The latent variable in a GMM is also known as the:
   a. Prior Probability   b. Posterior Probability   c. Responsibility   d. None of the above
   Ans: (b, c)

8. A GMM with K Gaussian mixture components has K covariance matrices, with dimension:
   a. Arbitrary   b. K x K   c. D x D (dimension of the data)   d. N x N (number of samples in the dataset)
   Ans: (c)
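
Questions 3-7 are illustrated by one E-step of a GMM: the mixture weights sum to 1 and lie between 0 and 1, and the responsibility is the posterior probability of each component given x. The two-component parameters below are made-up numbers, and the Gaussian density is written out in plain NumPy.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate normal, written out with plain NumPy."""
    d = len(mean)
    diff = x - mean
    norm = np.sqrt(((2 * np.pi) ** d) * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

weights = np.array([0.3, 0.7])                       # mixture coefficients, sum to 1
means = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
covs = [np.eye(2), 2.0 * np.eye(2)]                  # K covariance matrices, each D x D

x = np.array([1.0, 1.0])
joint = np.array([w * gaussian_pdf(x, m, c) for w, m, c in zip(weights, means, covs)])
responsibilities = joint / joint.sum()               # posterior over components

print(weights.sum())        # 1.0
print(responsibilities)     # non-negative, sums to 1
```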

References:
1. Pattern Recognition and Machine Learning, Christopher M. Bishop, ISBN-13: 978-0387310732, Springer, 2007.
2. Linear Algebra and Its Applications, David C. Lay, ISBN-13: 978-0321780720, Pearson, 2011.
3. Pattern Classification, Richard O. Duda, Peter E. Hart, David G. Stork, ISBN: 9814-12-602-0, Wiley, 2004.
4. http://home.scarlet.be/math/pvect.htm
5. http://en.wikibooks.org/wiki/linear_algebra/combining_subspaces/solutions
6. http://www.eee.metu.edu.tr/~halici/courses/543lecturenotes/questions/qch6/index.html