Data Mining and Analysis: Fundamental Concepts and Algorithms
Data Mining and Analysis: Fundamental Concepts and Algorithms
dataminingbook.info
Mohammed J. Zaki (1), Wagner Meira Jr. (2)
(1) Department of Computer Science, Rensselaer Polytechnic Institute, Troy, NY, USA
(2) Department of Computer Science, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
Chapter 1: Data Mining and Analysis
Zaki & Meira Jr. (RPI and UFMG)
Data Matrix
Data can often be represented or abstracted as an $n \times d$ data matrix, with $n$ rows and $d$ columns, given as
\[
D = \begin{array}{c|cccc}
 & X_1 & X_2 & \cdots & X_d \\ \hline
x_1 & x_{11} & x_{12} & \cdots & x_{1d} \\
x_2 & x_{21} & x_{22} & \cdots & x_{2d} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
x_n & x_{n1} & x_{n2} & \cdots & x_{nd}
\end{array}
\]
Rows: Also called instances, examples, records, transactions, objects, points, feature-vectors, etc. Given as a $d$-tuple $x_i = (x_{i1}, x_{i2}, \ldots, x_{id})$.
Columns: Also called attributes, properties, features, dimensions, variables, fields, etc. Given as an $n$-tuple $X_j = (x_{1j}, x_{2j}, \ldots, x_{nj})$.
Iris Dataset Extract
Columns: $X_1$ = sepal length, $X_2$ = sepal width, $X_3$ = petal length, $X_4$ = petal width, $X_5$ = class.
Class labels of the ten rows shown, in order: Iris-versicolor, Iris-versicolor, Iris-versicolor, Iris-setosa, Iris-versicolor, Iris-setosa, Iris-virginica, Iris-virginica, Iris-virginica, Iris-setosa.
Attributes
Attributes may be classified into two main types:
- Numeric Attributes: real-valued or integer-valued domain
  - Interval-scaled: only differences are meaningful, e.g., temperature
  - Ratio-scaled: differences and ratios are meaningful, e.g., age
- Categorical Attributes: set-valued domain composed of a set of symbols
  - Nominal: only equality is meaningful, e.g., domain(sex) = {M, F}
  - Ordinal: both equality (are two values the same?) and inequality (is one value less than another?) are meaningful, e.g., domain(education) = {High School, BS, MS, PhD}
Data: Algebraic and Geometric View
For a numeric data matrix $D$, each row or point is a $d$-dimensional column vector:
\[
x_i = \begin{pmatrix} x_{i1} \\ x_{i2} \\ \vdots \\ x_{id} \end{pmatrix} = (x_{i1}, x_{i2}, \ldots, x_{id})^T \in \mathbb{R}^d
\]
whereas each column or attribute is an $n$-dimensional column vector:
\[
X_j = (x_{1j}, x_{2j}, \ldots, x_{nj})^T \in \mathbb{R}^n
\]
Figure: Projections of $x_1 = (5.9, 3.0, 4.2, 1.5)^T$ in 2D and 3D. (a) $\mathbb{R}^2$: $x_1 = (5.9, 3.0)^T$. (b) $\mathbb{R}^3$: $x_1 = (5.9, 3.0, 4.2)^T$.
Scatterplot: 2D Iris Dataset
Sepal length versus sepal width: visualizing the Iris dataset as points/vectors in 2D. The solid circle shows the mean point.
Figure axes: $X_1$: sepal length, $X_2$: sepal width.
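Numeric Data Matrix
If all attributes are numeric, then the data matrix $D$ is an $n \times d$ matrix, or equivalently a set of $n$ row vectors $x_i^T \in \mathbb{R}^d$ or a set of $d$ column vectors $X_j \in \mathbb{R}^n$:
\[
D = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1d} \\
x_{21} & x_{22} & \cdots & x_{2d} \\
\vdots & \vdots & \ddots & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nd}
\end{pmatrix}
= \begin{pmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_n^T \end{pmatrix}
= \begin{pmatrix} X_1 & X_2 & \cdots & X_d \end{pmatrix}
\]
The mean of the data matrix $D$ is the average of all the points:
\[
\mathrm{mean}(D) = \mu = \frac{1}{n} \sum_{i=1}^{n} x_i
\]
The centered data matrix is obtained by subtracting the mean from all the points:
\[
Z = D - \mathbf{1} \cdot \mu^T
= \begin{pmatrix} x_1^T \\ x_2^T \\ \vdots \\ x_n^T \end{pmatrix}
- \begin{pmatrix} \mu^T \\ \mu^T \\ \vdots \\ \mu^T \end{pmatrix}
= \begin{pmatrix} x_1^T - \mu^T \\ x_2^T - \mu^T \\ \vdots \\ x_n^T - \mu^T \end{pmatrix}
= \begin{pmatrix} z_1^T \\ z_2^T \\ \vdots \\ z_n^T \end{pmatrix}
\tag{1}
\]
where $z_i = x_i - \mu$ is a centered point, and $\mathbf{1} \in \mathbb{R}^n$ is the vector of ones.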
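The mean and centering steps can be sketched in a few lines of plain Python. The function names `mean_point` and `center` and the 3 x 2 matrix `D` are illustrative assumptions, not values from the Iris data.

```python
def mean_point(D):
    """Return the mean of the rows of D (a list of equal-length lists)."""
    n = len(D)
    d = len(D[0])
    return [sum(row[j] for row in D) / n for j in range(d)]


def center(D):
    """Return Z = D - 1 * mu^T, i.e. every row with the mean subtracted."""
    mu = mean_point(D)
    return [[x - m for x, m in zip(row, mu)] for row in D]


# Hypothetical data matrix with n = 3 points and d = 2 attributes.
D = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
mu = mean_point(D)
Z = center(D)
print(mu)  # [3.0, 5.0]
print(Z)   # [[-2.0, -3.0], [0.0, -1.0], [2.0, 4.0]]
```

After centering, the mean of the rows of `Z` is the zero vector, which is a quick sanity check on any implementation.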
Norm, Distance and Angle
Given two points $a, b \in \mathbb{R}^m$, their dot product is defined as the scalar
\[
a^T b = a_1 b_1 + a_2 b_2 + \cdots + a_m b_m = \sum_{i=1}^{m} a_i b_i
\]
The Euclidean norm or length of a vector $a$ is defined as
\[
\|a\| = \sqrt{a^T a} = \sqrt{\sum_{i=1}^{m} a_i^2}
\]
The unit vector in the direction of $a$ is $u = \dfrac{a}{\|a\|}$, with $\|u\| = 1$.
Distance between $a$ and $b$ is given as
\[
\|a - b\| = \sqrt{\sum_{i=1}^{m} (a_i - b_i)^2}
\]
Angle between $a$ and $b$ is given as
\[
\cos\theta = \frac{a^T b}{\|a\| \, \|b\|} = \left( \frac{a}{\|a\|} \right)^T \left( \frac{b}{\|b\|} \right)
\]
Figure: vectors $a = (5, 3)$ and $b = (1, 4)$ in the $(X_1, X_2)$ plane, with the angle $\theta$ between them and the difference $a - b$.
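As a small worked sketch of these definitions, the following uses the two points from the figure, taken here to be a = (5, 3) and b = (1, 4). The helper names are illustrative choices.

```python
import math


def dot(a, b):
    """Dot product a^T b."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm ||a|| = sqrt(a^T a)."""
    return math.sqrt(dot(a, a))


def distance(a, b):
    """Euclidean distance ||a - b||."""
    return norm([x - y for x, y in zip(a, b)])


def angle_degrees(a, b):
    """Angle between a and b, from cos(theta) = a^T b / (||a|| ||b||)."""
    cos_theta = dot(a, b) / (norm(a) * norm(b))
    cos_theta = max(-1.0, min(1.0, cos_theta))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


a = [5.0, 3.0]
b = [1.0, 4.0]
print(dot(a, b))            # 17.0
print(distance(a, b))       # sqrt(17), about 4.123
print(angle_degrees(a, b))  # about 45 degrees
```

Here $a^T b = 17$, $\|a\| = \sqrt{34}$ and $\|b\| = \sqrt{17}$, so $\cos\theta = 1/\sqrt{2}$ and the angle is 45 degrees.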
Orthogonal Projection
Two vectors $a$ and $b$ are orthogonal iff $a^T b = 0$, i.e., the angle between them is $90^\circ$.
Orthogonal projection of $b$ on $a$ comprises the vector $p = b_\parallel$ parallel to $a$, and $r = b_\perp$ perpendicular or orthogonal to $a$, given as
\[
b = b_\parallel + b_\perp = p + r
\]
where
\[
p = b_\parallel = \left( \frac{a^T b}{a^T a} \right) a, \qquad r = b_\perp = b - p
\]
Figure: orthogonal projection of $b$ on $a$ in the $(X_1, X_2)$ plane, showing $p = b_\parallel$ and $r = b_\perp$.
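The decomposition b = p + r can be checked numerically. This sketch assumes the same illustrative vectors a = (5, 3) and b = (1, 4) used for the angle example; `project` is a hypothetical helper name.

```python
def dot(a, b):
    """Dot product a^T b."""
    return sum(x * y for x, y in zip(a, b))


def project(b, a):
    """Split b into p (parallel to a) and r (orthogonal to a)."""
    scale = dot(a, b) / dot(a, a)          # (a^T b) / (a^T a)
    p = [scale * x for x in a]             # p = scale * a
    r = [y - q for y, q in zip(b, p)]      # r = b - p
    return p, r


a = [5.0, 3.0]
b = [1.0, 4.0]
p, r = project(b, a)
print(p)          # [2.5, 1.5]
print(r)          # [-1.5, 2.5]
print(dot(a, r))  # 0.0, so r is orthogonal to a
```

Checking that `dot(a, r)` is zero (up to rounding) is a cheap way to validate a projection routine.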
Projection of Centered Iris Data Onto a Line $\ell$
Figure: centered Iris points projected onto a line $\ell$ in the $(X_1, X_2)$ plane.
Data: Probabilistic View
A random variable $X$ is a function $X\colon \mathcal{O} \to \mathbb{R}$, where $\mathcal{O}$ is the set of all possible outcomes of the experiment, also called the sample space.
A discrete random variable takes on only a finite or countably infinite number of values, whereas a continuous random variable can take on any value in $\mathbb{R}$.
By default, a numeric attribute $X_j$ is considered as the identity random variable given as $X(v) = v$ for all $v \in \mathcal{O}$. Here $\mathcal{O} = \mathbb{R}$.
Discrete Variable: Long Sepal Length
Define random variable $A$, denoting long sepal length (7 cm or more), as follows:
\[
A(v) = \begin{cases} 0 & \text{if } v < 7 \\ 1 & \text{if } v \ge 7 \end{cases}
\]
The sample space of $A$ is $\mathcal{O} = [4.3, 7.9]$, and its range is $\{0, 1\}$. Thus, $A$ is discrete.
Probability Mass Function
If $X$ is discrete, the probability mass function of $X$ is defined as
\[
f(x) = P(X = x) \quad \text{for all } x \in \mathbb{R}
\]
$f$ must obey the basic rules of probability. That is, $f$ must be non-negative:
\[
f(x) \ge 0
\]
and the sum of all probabilities should add to 1:
\[
\sum_{x} f(x) = 1
\]
Intuitively, for a discrete variable $X$, the probability is concentrated or massed at only discrete values in the range of $X$, and is zero for all other values.
Sepal Length: Bernoulli Distribution
The data are the sepal lengths (in centimeters) from the Iris dataset. Define random variable $A$ as follows:
\[
A(v) = \begin{cases} 0 & \text{if } v < 7 \\ 1 & \text{if } v \ge 7 \end{cases}
\]
We find that only 13 of the 150 Irises have sepal length of at least 7 cm. Thus, the probability mass function of $A$ can be estimated as:
\[
f(1) = P(A = 1) = \frac{13}{150} = 0.087 = p
\]
and
\[
f(0) = P(A = 0) = \frac{137}{150} = 0.913 = 1 - p
\]
$A$ has a Bernoulli distribution with parameter $p \in [0, 1]$, which denotes the probability of a success, that is, the probability of picking an Iris with a long sepal length at random from the set of all points.
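The estimate of p is simple arithmetic on the two counts quoted in the slide (13 long-sepal Irises out of 150). A minimal sketch, with `long_sepal` and `bernoulli_pmf` as illustrative names:

```python
def long_sepal(v, threshold=7.0):
    """The indicator A(v): 1 if sepal length v is at least 7 cm, else 0."""
    return 1 if v >= threshold else 0


def bernoulli_pmf(x, p):
    """f(1) = p, f(0) = 1 - p, and zero everywhere else."""
    if x == 1:
        return p
    if x == 0:
        return 1.0 - p
    return 0.0


n_long, n_total = 13, 150   # counts quoted in the slide
p_hat = n_long / n_total
print(round(p_hat, 3))                    # 0.087
print(round(bernoulli_pmf(0, p_hat), 3))  # 0.913
```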
Sepal Length: Binomial Distribution
Define discrete random variable $B$, denoting the number of Irises with long sepal length in $m$ independent Bernoulli trials with probability of success $p$.
In this case, $B$ takes on the integer values $k \in \{0, 1, \ldots, m\}$, and its probability mass function is given by the Binomial distribution
\[
f(k) = P(B = k) = \binom{m}{k} p^k (1 - p)^{m - k}
\]
Figure: Binomial distribution for long sepal length ($p = 0.087$) for $m = 10$ trials. Axes: $k$ (horizontal), $P(B = k)$ (vertical).
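The Binomial formula translates directly into code. This sketch uses the parameters from the figure caption (p = 0.087, m = 10) and assumes Python 3.8 or later for `math.comb`; `binomial_pmf` is an illustrative name.

```python
import math


def binomial_pmf(k, m, p):
    """P(B = k) = C(m, k) * p^k * (1 - p)^(m - k) for k in 0..m."""
    if k < 0 or k > m:
        return 0.0
    return math.comb(m, k) * p ** k * (1.0 - p) ** (m - k)


m, p = 10, 0.087
pmf = [binomial_pmf(k, m, p) for k in range(m + 1)]
for k, prob in enumerate(pmf):
    print(k, round(prob, 4))
```

With such a small p, most of the mass sits at k = 0 and k = 1, and the eleven probabilities sum to 1.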
Probability Density Function
If $X$ is continuous, the probability density function $f$ of $X$ is defined by
\[
P\big( X \in [a, b] \big) = \int_a^b f(x) \, dx
\]
$f$ must obey the basic rules of probability. That is, $f$ must be non-negative:
\[
f(x) \ge 0
\]
and the total probability must integrate to 1:
\[
\int_{-\infty}^{\infty} f(x) \, dx = 1
\]
Note that $P(X = v) = 0$ for all $v \in \mathbb{R}$, since there are infinitely many possible values in the sample space. What it means is that the probability mass is spread so thinly over the range of values that it can be measured only over intervals $[a, b] \subset \mathbb{R}$, rather than at specific points.
Sepal Length: Normal Distribution
We model sepal length via the Gaussian or normal density function, given as
\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ \frac{-(x - \mu)^2}{2\sigma^2} \right\}
\]
where $\mu = \frac{1}{n} \sum_{i=1}^{n} x_i$ is the mean value, and $\sigma^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2$ is the variance.
Figure: Normal distribution for sepal length: $\mu = 5.84$, $\sigma^2 = 0.681$. Axes: $x$ (horizontal), $f(x)$ (vertical), with the interval $\mu \pm \epsilon$ marked.
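A direct transcription of the normal density is sketched below. The mean 5.84 comes from the slide; the value 0.681 is taken from the parameters quoted with the CDF plot and is assumed here to be the variance $\sigma^2$ (not the standard deviation). `normal_pdf` is an illustrative name.

```python
import math


def normal_pdf(x, mu, var):
    """Gaussian density with mean mu and variance var (sigma squared)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


mu, var = 5.84, 0.681   # var is assumed to be sigma^2
peak = normal_pdf(mu, mu, var)
print(round(peak, 3))                         # density at the mean
print(round(normal_pdf(7.0, mu, var), 3))     # density at the 7 cm threshold
```

The density peaks at the mean with height $1/\sqrt{2\pi\sigma^2}$, which is a little under 0.5 for these parameters.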
Cumulative Distribution Function
For random variable $X$, its cumulative distribution function (CDF) $F\colon \mathbb{R} \to [0, 1]$ gives the probability of observing a value at most some given value $x$:
\[
F(x) = P(X \le x) \quad \text{for all } -\infty < x < \infty
\]
When $X$ is discrete, $F$ is given as
\[
F(x) = P(X \le x) = \sum_{u \le x} f(u)
\]
When $X$ is continuous, $F$ is given as
\[
F(x) = P(X \le x) = \int_{-\infty}^{x} f(u) \, du
\]
Figure: CDF for the binomial distribution ($p = 0.087$, $m = 10$). Axes: $x$, $F(x)$.
Figure: CDF for the normal distribution ($\mu = 5.84$, $\sigma^2 = 0.681$), with the point $(\mu, F(\mu)) = (5.84, 0.5)$ marked. Axes: $x$, $F(x)$.
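Both forms of the CDF can be sketched with the standard library. The discrete case sums a PMF stored as a dictionary; the continuous case uses the closed form of the normal CDF in terms of `math.erf`. The value 0.681 is assumed to be the variance, and the Bernoulli PMF used as the discrete example is built from the 13-of-150 count.

```python
import math


def discrete_cdf(x, pmf):
    """F(x) = sum of f(u) over all u <= x, with pmf given as {value: prob}."""
    return sum(prob for u, prob in pmf.items() if u <= x)


def normal_cdf(x, mu, var):
    """F(x) for a normal distribution with mean mu and variance var."""
    return 0.5 * (1.0 + math.erf((x - mu) / math.sqrt(2.0 * var)))


bernoulli = {0: 137 / 150, 1: 13 / 150}
print(discrete_cdf(-1, bernoulli))   # 0, nothing at or below -1
print(discrete_cdf(0.5, bernoulli))  # 137/150
print(discrete_cdf(1, bernoulli))    # 1 (up to rounding)

mu, var = 5.84, 0.681                # var is assumed to be sigma^2
print(normal_cdf(mu, mu, var))       # 0.5
```

The last line reproduces the marked point $(\mu, F(\mu)) = (5.84, 0.5)$: half the mass of a normal distribution lies below its mean.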
Bivariate Random Variable: Joint Probability Mass Function
Define discrete random variables
\[
\text{long sepal length: } X_1(v) = \begin{cases} 1 & \text{if } v \ge 7 \\ 0 & \text{otherwise} \end{cases}
\qquad
\text{long sepal width: } X_2(v) = \begin{cases} 1 & \text{if } v \ge 3.5 \\ 0 & \text{otherwise} \end{cases}
\]
The bivariate random variable
\[
X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
\]
has the joint probability mass function
\[
f(x) = P(X = x), \quad \text{i.e.,} \quad f(x_1, x_2) = P(X_1 = x_1, X_2 = x_2)
\]
Iris: joint PMF for long sepal length and sepal width
\[
\begin{aligned}
f(0, 0) &= P(X_1 = 0, X_2 = 0) = 116/150 = 0.773 \\
f(0, 1) &= P(X_1 = 0, X_2 = 1) = 21/150 = 0.140 \\
f(1, 0) &= P(X_1 = 1, X_2 = 0) = 10/150 = 0.067 \\
f(1, 1) &= P(X_1 = 1, X_2 = 1) = 3/150 = 0.020
\end{aligned}
\]
Figure: joint PMF $f(x)$ plotted over $(X_1, X_2)$.
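The joint PMF and its marginals can be sketched from the four counts. The count for (0, 1) is taken to be 21, the value that makes the four cells sum to 150; treat that as an assumption. `marginal` is an illustrative helper.

```python
# Counts of Irises by (long sepal length, long sepal width).
# The (0, 1) count of 21 is an assumption chosen so the cells total 150.
counts = {(0, 0): 116, (0, 1): 21, (1, 0): 10, (1, 1): 3}
n = sum(counts.values())
joint = {x: c / n for x, c in counts.items()}


def marginal(joint, axis):
    """Sum the joint PMF over the other variable."""
    out = {}
    for x, prob in joint.items():
        out[x[axis]] = out.get(x[axis], 0.0) + prob
    return out


f1 = marginal(joint, 0)   # PMF of X1 (long sepal length)
f2 = marginal(joint, 1)   # PMF of X2 (long sepal width)
print(n)                  # 150
print(round(f1[1], 3))    # 0.087, matching the Bernoulli estimate 13/150
print(round(f2[1], 3))    # 0.16
```

The marginal for long sepal length recovers 13/150, which agrees with the Bernoulli estimate used earlier.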
Bivariate Random Variable: Probability Density Function
Bivariate Normal: modeling the joint distribution for sepal length ($X_1$) and sepal width ($X_2$)
\[
f(x \mid \mu, \Sigma) = \frac{1}{2\pi \sqrt{|\Sigma|}} \exp\left\{ -\frac{(x - \mu)^T \Sigma^{-1} (x - \mu)}{2} \right\}
\]
where $\mu$ and $\Sigma$ specify the 2D mean and covariance matrix:
\[
\mu = (\mu_1, \mu_2)^T, \qquad
\Sigma = \begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{21} & \sigma_2^2 \end{pmatrix}
\]
with mean $\mu_i = \frac{1}{n} \sum_{k=1}^{n} x_{ki}$ and covariance $\sigma_{ij} = \frac{1}{n} \sum_{k=1}^{n} (x_{ki} - \mu_i)(x_{kj} - \mu_j)$. Also, $\sigma_i^2 = \sigma_{ii}$.
Figure: bivariate normal density $f(x)$ over $(X_1, X_2)$, with $\mu = (5.843, 3.054)^T$.
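For the 2 x 2 case the determinant and inverse of the covariance matrix have closed forms, so the density can be sketched without a linear algebra library. The mean below is the one quoted in the slide; the covariance matrix is a placeholder identity matrix chosen only for illustration, not the Iris covariance.

```python
import math


def bivariate_normal_pdf(x, mu, sigma):
    """Density of a 2D normal; sigma is [[s11, s12], [s21, s22]]."""
    (s11, s12), (s21, s22) = sigma
    det = s11 * s22 - s12 * s21
    if det <= 0:
        raise ValueError("covariance matrix must have a positive determinant")
    d1, d2 = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu), using the closed-form 2x2 inverse.
    quad = (d1 * (s22 * d1 - s12 * d2) + d2 * (-s21 * d1 + s11 * d2)) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


mu = [5.843, 3.054]                   # mean from the slide
sigma = [[1.0, 0.0], [0.0, 1.0]]      # placeholder covariance (assumption)
at_mean = bivariate_normal_pdf(mu, mu, sigma)
print(at_mean)                        # 1 / (2 * pi) for the identity covariance
```

With the identity covariance the density at the mean is $1/(2\pi)$; substituting an estimated covariance matrix changes only the `sigma` argument.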
Random Sample and Statistics
Given a random variable $X$, a random sample of size $n$ from $X$ is defined as a set of $n$ independent and identically distributed (IID) random variables
\[
S_1, S_2, \ldots, S_n
\]
The $S_i$'s have the same probability distribution as $X$, and are statistically independent.
Two random variables $X_1$ and $X_2$ are (statistically) independent if, for every $W_1 \subset \mathbb{R}$ and $W_2 \subset \mathbb{R}$, we have
\[
P(X_1 \in W_1 \text{ and } X_2 \in W_2) = P(X_1 \in W_1) \cdot P(X_2 \in W_2)
\]
which also implies that
\[
F(x) = F(x_1, x_2) = F_1(x_1) \cdot F_2(x_2)
\]
\[
f(x) = f(x_1, x_2) = f_1(x_1) \cdot f_2(x_2)
\]
where $F_i$ is the cumulative distribution function, and $f_i$ is the probability mass or density function for random variable $X_i$.
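For discrete variables, the factorization $f(x_1, x_2) = f_1(x_1) \cdot f_2(x_2)$ gives a simple numerical independence check. The sketch below applies it to a hypothetical joint PMF built as a product of marginals, and to the Iris long-sepal counts (with the (0, 1) cell assumed to be 21 so the counts total 150). `is_independent` is an illustrative name.

```python
def is_independent(joint, tol=1e-9):
    """Check f(x1, x2) == f1(x1) * f2(x2) for every pair of values."""
    f1, f2 = {}, {}
    for (x1, x2), prob in joint.items():
        f1[x1] = f1.get(x1, 0.0) + prob
        f2[x2] = f2.get(x2, 0.0) + prob
    return all(
        abs(joint.get((x1, x2), 0.0) - f1[x1] * f2[x2]) <= tol
        for x1 in f1
        for x2 in f2
    )


# Hypothetical joint PMF that factorizes by construction.
pa = {0: 0.9, 1: 0.1}
pb = {0: 0.8, 1: 0.2}
product = {(a, b): pa[a] * pb[b] for a in pa for b in pb}

# Iris indicator variables; the 21 is an assumed count (cells sum to 150).
iris = {(0, 0): 116 / 150, (0, 1): 21 / 150, (1, 0): 10 / 150, (1, 1): 3 / 150}

print(is_independent(product))  # True
print(is_independent(iris))     # False
```

For the Iris indicators, $f(1, 1) = 0.020$ while $f_1(1) \cdot f_2(1) = (13/150)(24/150) \approx 0.014$, so under these counts the two indicators are not independent.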
Multivariate Sample
Given dataset $D$, the $n$ data points $x_i$ (with $1 \le i \le n$) constitute a $d$-dimensional multivariate random sample drawn from the vector random variable $X = (X_1, X_2, \ldots, X_d)$.
Since the $x_i$ are assumed to be independent and identically distributed, their joint distribution is given as
\[
f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f_X(x_i)
\]
where $f_X$ is the probability mass or density function for $X$.
Assuming that the $d$ attributes $X_1, X_2, \ldots, X_d$ are statistically independent, the joint distribution for the entire dataset is given as:
\[
f(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} f(x_i) = \prod_{i=1}^{n} \prod_{j=1}^{d} f_{X_j}(x_{ij})
\]
Sample Statistics
Let $\{S_i\}_{i=1}^{m}$ be a random sample of size $m$ drawn from a (multivariate) random variable $X$. A statistic $\hat{\theta}$ is a function
\[
\hat{\theta}\colon (S_1, S_2, \ldots, S_m) \to \mathbb{R}
\]
The statistic is an estimate of the corresponding population parameter $\theta$, where the population refers to the entire universe of entities under study. The statistic is itself a random variable.
The sample mean is a statistic, defined as the average
\[
\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} x_i
\]
For sepal length, we have $\hat{\mu} = 5.84$, which is an estimator for the (unknown) true population mean sepal length.
Sample Statistics: Variance
The sample variance is a statistic
\[
\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^2
\]
For sepal length, we have $\hat{\sigma}^2 = 0.681$.
The total variance is a multivariate statistic
\[
\mathrm{var}(D) = \frac{1}{n} \sum_{i=1}^{n} \|x_i - \mu\|^2
\]
For the Iris data (with 4 attributes: sepal length and width, petal length and width), we have $\mathrm{var}(D) = \ldots$
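The variance and total variance can be sketched as follows, using the 1/n normalization from the formulas above. The small data matrix is hypothetical and the function names are illustrative.

```python
def variance(xs):
    """Variance with 1/n normalization: (1/n) * sum (x_i - mu)^2."""
    n = len(xs)
    mu = sum(xs) / n
    return sum((x - mu) ** 2 for x in xs) / n


def total_variance(D):
    """var(D) = (1/n) * sum of squared distances of the rows from the mean."""
    n = len(D)
    d = len(D[0])
    mu = [sum(row[j] for row in D) / n for j in range(d)]
    return sum(sum((x - m) ** 2 for x, m in zip(row, mu)) for row in D) / n


# Hypothetical data matrix with n = 3 points and d = 2 attributes.
D = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
columns = [[row[j] for row in D] for j in range(2)]
print([variance(col) for col in columns])  # per-attribute variances 8/3 and 26/3
print(total_variance(D))                   # their sum, 34/3
```

Because the squared Euclidean distance splits into a sum over attributes, the total variance equals the sum of the per-attribute variances, which the example confirms.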
More informationSample Spaces, Random Variables
Sample Spaces, Random Variables Moulinath Banerjee University of Michigan August 3, 22 Probabilities In talking about probabilities, the fundamental object is Ω, the sample space. (elements) in Ω are denoted
More information15 Discrete Distributions
Lecture Note 6 Special Distributions (Discrete and Continuous) MIT 4.30 Spring 006 Herman Bennett 5 Discrete Distributions We have already seen the binomial distribution and the uniform distribution. 5.
More informationLINEAR ALGEBRA - CHAPTER 1: VECTORS
LINEAR ALGEBRA - CHAPTER 1: VECTORS A game to introduce Linear Algebra In measurement, there are many quantities whose description entirely rely on magnitude, i.e., length, area, volume, mass and temperature.
More informationBayesian Classification Methods
Bayesian Classification Methods Suchit Mehrotra North Carolina State University smehrot@ncsu.edu October 24, 2014 Suchit Mehrotra (NCSU) Bayesian Classification October 24, 2014 1 / 33 How do you define
More informationMath Review Sheet, Fall 2008
1 Descriptive Statistics Math 3070-5 Review Sheet, Fall 2008 First we need to know about the relationship among Population Samples Objects The distribution of the population can be given in one of the
More informationLecture Notes 6: Linear Models
Optimization-based data analysis Fall 17 Lecture Notes 6: Linear Models 1 Linear regression 1.1 The regression problem In statistics, regression is the problem of characterizing the relation between a
More informationIntroduction to Machine Learning
Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More informationLecture 2: Review of Basic Probability Theory
ECE 830 Fall 2010 Statistical Signal Processing instructor: R. Nowak, scribe: R. Nowak Lecture 2: Review of Basic Probability Theory Probabilistic models will be used throughout the course to represent
More informationIntroduction to Machine Learning
What does this mean? Outline Contents Introduction to Machine Learning Introduction to Probabilistic Methods Varun Chandola December 26, 2017 1 Introduction to Probability 1 2 Random Variables 3 3 Bayes
More informationDefinitions and Properties of R N
Definitions and Properties of R N R N as a set As a set R n is simply the set of all ordered n-tuples (x 1,, x N ), called vectors. We usually denote the vector (x 1,, x N ), (y 1,, y N ), by x, y, or
More informationStatistical Pattern Recognition
Statistical Pattern Recognition A Brief Mathematical Review Hamid R. Rabiee Jafar Muhammadi, Ali Jalali, Alireza Ghasemi Spring 2012 http://ce.sharif.edu/courses/90-91/2/ce725-1/ Agenda Probability theory
More informationconditional cdf, conditional pdf, total probability theorem?
6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random
More informationNorthwestern University Department of Electrical Engineering and Computer Science
Northwestern University Department of Electrical Engineering and Computer Science EECS 454: Modeling and Analysis of Communication Networks Spring 2008 Probability Review As discussed in Lecture 1, probability
More informationSTAT 730 Chapter 1 Background
STAT 730 Chapter 1 Background Timothy Hanson Department of Statistics, University of South Carolina Stat 730: Multivariate Analysis 1 / 27 Logistics Course notes hopefully posted evening before lecture,
More informationMachine learning - HT Maximum Likelihood
Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce
More informationToday. Probability and Statistics. Linear Algebra. Calculus. Naïve Bayes Classification. Matrix Multiplication Matrix Inversion
Today Probability and Statistics Naïve Bayes Classification Linear Algebra Matrix Multiplication Matrix Inversion Calculus Vector Calculus Optimization Lagrange Multipliers 1 Classical Artificial Intelligence
More informationIn most cases, a plot of d (j) = {d (j) 2 } against {χ p 2 (1-q j )} is preferable since there is less piling up
THE UNIVERSITY OF MINNESOTA Statistics 5401 September 17, 2005 Chi-Squared Q-Q plots to Assess Multivariate Normality Suppose x 1, x 2,..., x n is a random sample from a p-dimensional multivariate distribution
More informationMethods for sparse analysis of high-dimensional data, II
Methods for sparse analysis of high-dimensional data, II Rachel Ward May 23, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 47 High dimensional
More information2. The Sample Mean and the Law of Large Numbers
1 of 11 7/16/2009 6:05 AM Virtual Laboratories > 6. Random Samples > 1 2 3 4 5 6 7 2. The Sample Mean and the Law of Large Numbers The Sample Mean Suppose that we have a basic random experiment and that
More informationStatistical Data Analysis
DS-GA 0 Lecture notes 8 Fall 016 1 Descriptive statistics Statistical Data Analysis In this section we consider the problem of analyzing a set of data. We describe several techniques for visualizing the
More informationIntro Vectors 2D implicit curves 2D parametric curves. Graphics 2012/2013, 4th quarter. Lecture 2: vectors, curves, and surfaces
Lecture 2, curves, and surfaces Organizational remarks Tutorials: TA sessions for tutorial 1 start today Tutorial 2 will go online after lecture 3 Practicals: Make sure to find a team partner very soon
More informationIntro Vectors 2D implicit curves 2D parametric curves. Graphics 2011/2012, 4th quarter. Lecture 2: vectors, curves, and surfaces
Lecture 2, curves, and surfaces Organizational remarks Tutorials: Tutorial 1 will be online later today TA sessions for questions start next week Practicals: Exams: Make sure to find a team partner very
More informationContinuous Distributions
Continuous Distributions 1.8-1.9: Continuous Random Variables 1.10.1: Uniform Distribution (Continuous) 1.10.4-5 Exponential and Gamma Distributions: Distance between crossovers Prof. Tesler Math 283 Fall
More information