Descriptive Statistics
1 Descriptive Statistics DS GA 1002 Probability and Statistics for Data Science Carlos Fernandez-Granda
2 Descriptive statistics Techniques to visualize and summarize data They can often be interpreted within a probabilistic framework Even when probabilistic assumptions do not hold, the techniques remain useful Here we describe them from a deterministic point of view
3 Histogram Empirical mean and variance Order statistics Empirical covariance Empirical covariance matrix
4 Histogram Technique to visualize one-dimensional data Bin the range of the data, then count the number of instances in each bin The width of the bins can be adjusted to yield higher or lower resolution If the data are iid, the histogram approximates their pmf or pdf
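The binning-and-counting procedure can be sketched in a few lines of NumPy. The data here are hypothetical standard-normal samples, not the Oxford or GDP data from the slides; normalizing the counts by n times the bin width turns the histogram into a pdf approximation.

```python
import numpy as np

# Hypothetical one-dimensional data set (iid standard normal samples).
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# Bin the range of the data, then count the instances in each bin.
counts, edges = np.histogram(data, bins=20)

# Normalizing by n * bin_width yields an approximation to the pdf
# when the data are iid (the rescaled bars integrate to 1).
density = counts / (len(data) * np.diff(edges))
```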
5 Histograms of temperature in Oxford in January and August (degrees Celsius)
6 Histogram of GDP per capita of different countries (thousands of dollars)
7 Histogram Empirical mean and variance Order statistics Empirical covariance Empirical covariance matrix
8 Empirical mean Let {x_1, x_2, ..., x_n} be a set of real-valued data The empirical mean is defined as av(x_1, x_2, ..., x_n) := (1/n) ∑_{i=1}^n x_i Temperature data: 6.73 °C in January and 21.3 °C in August GDP per capita: $16 500
9 Empirical mean Let {x_1, x_2, ..., x_n} be a set of d-dimensional real-valued data The empirical mean is defined as av(x_1, x_2, ..., x_n) := (1/n) ∑_{i=1}^n x_i
10 Centering Let {x_1, x_2, ..., x_n} be a set of d-dimensional real-valued data To center the data set we: 1. Compute the empirical mean 2. Subtract it from each vector: y_i := x_i − av(x_1, ..., x_n), 1 ≤ i ≤ n The vectors y_1, ..., y_n are centered at the origin
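The two centering steps can be sketched directly in NumPy; the data matrix below is hypothetical, with one d-dimensional point per row.

```python
import numpy as np

# Hypothetical data set: 5 points in R^2, one per row.
X = np.array([[1.0, 2.0], [3.0, 0.0], [5.0, 4.0], [7.0, 6.0], [4.0, 3.0]])

# 1. Compute the empirical mean (one entry per feature).
mean = X.mean(axis=0)

# 2. Subtract it from each vector.
Y = X - mean

# The centered vectors y_1, ..., y_n average to the origin.
```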
11 Centering Uncentered data Centered data
12 Empirical variance Let {x_1, x_2, ..., x_n} be a set of real-valued data The empirical variance is defined as var(x_1, x_2, ..., x_n) := (1/(n−1)) ∑_{i=1}^n (x_i − av(x_1, x_2, ..., x_n))² The empirical standard deviation is the square root of the empirical variance Temperature data: 1.99 °C in January and 1.73 °C in August GDP per capita: $25 300
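The definitions above translate to a short computation; note the n − 1 in the variance denominator, which corresponds to NumPy's ddof=1 option. The readings below are hypothetical.

```python
import numpy as np

x = np.array([6.1, 7.2, 5.9, 8.0, 6.5])  # hypothetical temperature readings

n = len(x)
av = x.sum() / n                       # empirical mean
var = ((x - av) ** 2).sum() / (n - 1)  # empirical variance (note n - 1)
std = np.sqrt(var)                     # empirical standard deviation

# NumPy reproduces these with ddof=1:
assert np.isclose(var, x.var(ddof=1))
```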
13 Histogram Empirical mean and variance Order statistics Empirical covariance Empirical covariance matrix
14 Temperature dataset In January the temperature in Oxford is around 6.73 °C, give or take 2 °C
15 GDP dataset Countries typically have a GDP per capita of about $16 500, give or take $25 300
16 Quantiles and percentiles Let x_(1) ≤ x_(2) ≤ ... ≤ x_(n) denote the ordered elements of a data set {x_1, x_2, ..., x_n} The q quantile of the data, for 0 < q < 1, is x_([q(n+1)]), where [q(n+1)] is the closest integer to q(n+1) The q quantile is also known as the 100q percentile
17 Quartiles and median The 0.25 and 0.75 quantiles are the first and third quartiles The 0.5 quantile is the empirical median If n is even, the empirical median is usually set to (x_(n/2) + x_(n/2+1)) / 2 The difference between the third and first quartiles is the interquartile range (IQR)
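The order-statistic definition of the q quantile can be sketched as follows. The helper function and the data set are illustrative (they are not the GDP data), and the closest-integer index is clipped to the valid range 1, ..., n.

```python
import numpy as np

def quantile(data, q):
    """q quantile (0 < q < 1) via order statistics: x_([q(n+1)]),
    where [.] denotes rounding to the closest integer."""
    x = np.sort(data)
    n = len(x)
    k = int(round(q * (n + 1)))
    k = min(max(k, 1), n)   # guard against k = 0 or k = n + 1
    return x[k - 1]         # order statistics are 1-indexed

# Hypothetical skewed data (11 values).
data = [130, 1960, 6350, 20100, 90000, 400, 3000, 15000, 800, 50000, 7000]
q1, med, q3 = quantile(data, 0.25), quantile(data, 0.5), quantile(data, 0.75)
iqr = q3 - q1
```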
18 Quartiles and median Temperature data (January): Sample mean: 6.73 C Median: 6.80 C Interquartile range: 2.9 C Temperature data (August): Sample mean: 21.3 C Median: 21.2 C Interquartile range: 2.1 C
19 Quartiles and median GDP per capita: Sample mean: $16 500 (71% of the countries have lower GDP per capita!) Median: $6 350 Interquartile range: $18 140 Five-number summary: $130, $1 960, $6 350, $20 100, $
20 Boxplots of temperature data (degrees Celsius) for January, April, August and November
21 Boxplot of GDP data (thousands of dollars)
22 Histogram Empirical mean and variance Order statistics Empirical covariance Empirical covariance matrix
23 Multidimensional data Each dimension represents a feature We can visualize two-dimensional data using scatter plots
24 Scatter plot of April vs. August temperatures
25 Scatter plot of minimum vs. maximum temperatures
26 Empirical covariance Data: {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} The empirical covariance is defined as cov((x_1, y_1), ..., (x_n, y_n)) := (1/(n−1)) ∑_{i=1}^n (x_i − av(x_1, ..., x_n)) (y_i − av(y_1, ..., y_n))
27 Empirical correlation coefficient Data: {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} The empirical correlation coefficient is defined as ρ((x_1, y_1), ..., (x_n, y_n)) := cov((x_1, y_1), ..., (x_n, y_n)) / (std(x_1, ..., x_n) std(y_1, ..., y_n)) Cauchy–Schwarz inequality: for any vectors a, b, −1 ≤ aᵀb / (‖a‖₂ ‖b‖₂) ≤ 1 Consequence: −1 ≤ ρ((x_1, y_1), ..., (x_n, y_n)) ≤ 1
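The covariance and correlation formulas can be sketched on hypothetical paired measurements; the final assertion checks the Cauchy–Schwarz consequence stated above.

```python
import numpy as np

# Hypothetical paired measurements (e.g. min/max daily temperatures).
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([5.0, 9.0, 11.0, 15.0, 21.0])

n = len(x)
cov = ((x - x.mean()) * (y - y.mean())).sum() / (n - 1)
rho = cov / (x.std(ddof=1) * y.std(ddof=1))

# Cauchy-Schwarz guarantees the correlation lies in [-1, 1].
assert -1.0 <= rho <= 1.0
```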
28 Scatter plot of April vs. August temperatures, with empirical correlation coefficient ρ
29 Scatter plot of minimum vs. maximum temperatures, with empirical correlation coefficient ρ
30 Histogram Empirical mean and variance Order statistics Empirical covariance Empirical covariance matrix
31 Empirical covariance matrix Data: {x_1, x_2, ..., x_n} (d features) The empirical covariance matrix is defined as Σ(x_1, ..., x_n) := (1/(n−1)) ∑_{i=1}^n (x_i − av(x_1, ..., x_n)) (x_i − av(x_1, ..., x_n))ᵀ The (i, j) entry, 1 ≤ i, j ≤ d, is given by Σ(x_1, ..., x_n)_{ij} = var((x_1)_i, ..., (x_n)_i) if i = j, and cov(((x_1)_i, (x_1)_j), ..., ((x_n)_i, (x_n)_j)) if i ≠ j
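The outer-product definition can be sketched on hypothetical data and checked against NumPy's built-in np.cov, which uses the same n − 1 normalization.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((100, 3))   # 100 observations, d = 3 features

mean = X.mean(axis=0)
Xc = X - mean
Sigma = Xc.T @ Xc / (len(X) - 1)    # d x d empirical covariance matrix

# Diagonal entries are the empirical variances of the features;
# off-diagonal entries are the pairwise empirical covariances.
assert np.allclose(Sigma, np.cov(X, rowvar=False))
```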
32–36 Empirical variance in a certain direction Let v be a unit-norm vector aligned with a direction of interest
var(vᵀx_1, ..., vᵀx_n)
= (1/(n−1)) ∑_{i=1}^n (vᵀx_i − av(vᵀx_1, ..., vᵀx_n))²
= (1/(n−1)) ∑_{i=1}^n (vᵀ(x_i − av(x_1, ..., x_n)))²
= vᵀ ((1/(n−1)) ∑_{i=1}^n (x_i − av(x_1, ..., x_n)) (x_i − av(x_1, ..., x_n))ᵀ) v
= vᵀ Σ(x_1, ..., x_n) v
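The identity var(vᵀx_1, ..., vᵀx_n) = vᵀ Σ v can be verified numerically on hypothetical correlated 2D data: the empirical variance of the projected samples matches the quadratic form exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical correlated 2D data.
X = rng.standard_normal((200, 2)) @ np.array([[2.0, 0.5], [0.5, 1.0]])

Sigma = np.cov(X, rowvar=False)           # empirical covariance matrix

v = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit-norm direction of interest

# Empirical variance of the projections equals v^T Sigma v.
proj = X @ v
assert np.isclose(proj.var(ddof=1), v @ Sigma @ v)
```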
37 Eigendecomposition of the covariance matrix Σ(x_1, ..., x_n) = UΛUᵀ = [u_1 u_2 ... u_d] diag(λ_1, λ_2, ..., λ_d) [u_1 u_2 ... u_d]ᵀ
38 Eigendecomposition of the covariance matrix For any symmetric matrix A ∈ R^{n×n} with normalized eigenvectors u_1, u_2, ..., u_n and corresponding eigenvalues λ_1 ≥ λ_2 ≥ ... ≥ λ_n:
λ_1 = max_{‖v‖₂=1} vᵀAv, u_1 = arg max_{‖v‖₂=1} vᵀAv
λ_k = max_{‖v‖₂=1, v ⊥ u_1, ..., u_{k−1}} vᵀAv, u_k = arg max_{‖v‖₂=1, v ⊥ u_1, ..., u_{k−1}} vᵀAv
39 Principal component analysis Compute eigenvectors of empirical covariance matrix to determine directions of maximum variation
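A minimal PCA sketch, assuming hypothetical data stretched along the first coordinate axis: compute the empirical covariance matrix, take its eigendecomposition, and sort the eigenvectors by decreasing eigenvalue so that u_1 is the direction of maximum empirical variance.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical anisotropic 2D data: much more variance along the x-axis.
X = rng.standard_normal((500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])

Xc = X - X.mean(axis=0)                   # centering is important!
Sigma = Xc.T @ Xc / (len(X) - 1)

# Eigendecomposition of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(Sigma)  # returned in ascending order
order = np.argsort(eigvals)[::-1]         # re-sort in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

u1 = eigvecs[:, 0]   # principal direction: maximum empirical variance
```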
40–42 Example: 2D data Scatter plots showing the principal directions u_1 and u_2, scaled by σ_1/√n and σ_2/√n, for data sets of increasing size
43–44 Centering is important! Scatter plots of the same data before and after centering, with the directions u_1 and u_2 scaled by σ_1/√n and σ_2/√n
45 Dimensionality reduction Projection of data onto a lower-dimensional space Applications: Visualization / computational efficiency / denoising Example: Seeds from 3 varieties of wheat (Kama, Rosa and Canadian) 7 features: area, perimeter, compactness, length of kernel, width of kernel, asymmetry coefficient and length of kernel groove
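Projecting onto the top principal components can be sketched as follows. The data matrix is a hypothetical stand-in for the wheat-seed data set (210 seeds, 7 features); the projection maps each centered point to its coordinates in the first k principal directions.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((210, 7))    # hypothetical: 210 seeds, 7 features

Xc = X - X.mean(axis=0)
Sigma = np.cov(Xc, rowvar=False)
eigvals, U = np.linalg.eigh(Sigma)
U = U[:, np.argsort(eigvals)[::-1]]  # columns sorted by decreasing variance

k = 2
Z = Xc @ U[:, :k]   # project each centered point onto the top-k PCs
# Z can now be scatter-plotted to visualize the 7-dimensional data in 2D.
```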
46 PCA dimensionality reduction Projection onto second PC Projection onto first PC
47 PCA dimensionality reduction Projection onto dth PC Projection onto (d-1)th PC
48 Whitening Preprocessing procedure Linear transformation that removes linear correlations and equalizes the variance in every direction Can make nonlinear structure easier to detect After whitening, the data are uncorrelated
49 Whitening Let x_1, ..., x_n be a set of d-dimensional centered data with a full-rank covariance matrix. To whiten the data we: 1. Compute the eigendecomposition of the empirical covariance matrix Σ(x_1, ..., x_n) = UΛUᵀ 2. For i = 1, ..., n set y_i := Λ^{−1/2} Uᵀ x_i, where Λ^{−1/2} := diag(1/√λ_1, 1/√λ_2, ..., 1/√λ_d)
50–56 Whitening The empirical covariance matrix of the whitened data is the identity:
Σ(y_1, ..., y_n) := (1/(n−1)) ∑_{i=1}^n y_i y_iᵀ
= (1/(n−1)) ∑_{i=1}^n (Λ^{−1/2} Uᵀ x_i) (Λ^{−1/2} Uᵀ x_i)ᵀ
= Λ^{−1/2} Uᵀ ((1/(n−1)) ∑_{i=1}^n x_i x_iᵀ) U Λ^{−1/2}
= Λ^{−1/2} Uᵀ Σ(x_1, ..., x_n) U Λ^{−1/2}
= Λ^{−1/2} Uᵀ U Λ Uᵀ U Λ^{−1/2}
= Λ^{−1/2} Λ Λ^{−1/2}
= I
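The whitening transformation can be sketched on hypothetical correlated data and checked numerically: after applying Λ^{−1/2}Uᵀ, the empirical covariance matrix is the identity.

```python
import numpy as np

rng = np.random.default_rng(5)
# Hypothetical correlated 2D data.
X = rng.standard_normal((1000, 2)) @ np.array([[2.0, 1.0], [0.0, 1.0]])
X = X - X.mean(axis=0)               # whitening assumes centered data

Sigma = X.T @ X / (len(X) - 1)
eigvals, U = np.linalg.eigh(Sigma)   # Sigma = U diag(eigvals) U^T

# y_i := Lambda^{-1/2} U^T x_i, applied to all points at once.
W = np.diag(1.0 / np.sqrt(eigvals)) @ U.T
Y = X @ W.T

# The whitened data have identity empirical covariance.
assert np.allclose(Y.T @ Y / (len(Y) - 1), np.eye(2))
```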
57–59 Scatter plots of the data at each stage of whitening: x, Uᵀx, and Λ^{−1/2}Uᵀx
More informationLecture Slides. Elementary Statistics Tenth Edition. by Mario F. Triola. and the Triola Statistics Series. Slide 1
Lecture Slides Elementary Statistics Tenth Edition and the Triola Statistics Series by Mario F. Triola Slide 1 Chapter 3 Statistics for Describing, Exploring, and Comparing Data 3-1 Overview 3-2 Measures
More informationObjective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode.
Chapter 3 Numerically Summarizing Data Chapter 3.1 Measures of Central Tendency Objective A: Mean, Median and Mode Three measures of central of tendency: the mean, the median, and the mode. A1. Mean The
More information15 Singular Value Decomposition
15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing
More information1.3: Describing Quantitative Data with Numbers
1.3: Describing Quantitative Data with Numbers Section 1.3 Describing Quantitative Data with Numbers After this section, you should be able to MEASURE center with the mean and median MEASURE spread with
More informationP8130: Biostatistical Methods I
P8130: Biostatistical Methods I Lecture 2: Descriptive Statistics Cody Chiuzan, PhD Department of Biostatistics Mailman School of Public Health (MSPH) Lecture 1: Recap Intro to Biostatistics Types of Data
More informationPrincipal Component Analysis. Applied Multivariate Statistics Spring 2012
Principal Component Analysis Applied Multivariate Statistics Spring 2012 Overview Intuition Four definitions Practical examples Mathematical example Case study 2 PCA: Goals Goal 1: Dimension reduction
More information1. Exploratory Data Analysis
1. Exploratory Data Analysis 1.1 Methods of Displaying Data A visual display aids understanding and can highlight features which may be worth exploring more formally. Displays should have impact and be
More informationAfter completing this chapter, you should be able to:
Chapter 2 Descriptive Statistics Chapter Goals After completing this chapter, you should be able to: Compute and interpret the mean, median, and mode for a set of data Find the range, variance, standard
More information