Lecture Notes Special: Using Neural Networks for Mean-Squared Estimation. So: How can we evaluate E(X Y )? What is a neural network? How is it useful?
|
|
- Vincent Myles Gibbs
- 5 years ago
- Views:
Transcription
1 Lecture Notes Special: Using Neural Networks for Mean-Squared Estimation The Story So Far... So: How can we evaluate E(X Y )? What is a neural network? How is it useful? EE 278: Using Neural Networks for Mean-Squared Estimation age 1
2 The Story So Far... Consider the AWGN Channel: Y = X + Z X N (0, ), Z N (0, N), X and Z are independent. MMSE roblem: Find ˆX = g(y ) such that E(X ˆX) 2 is minimized. Solution: ˆX = g(y ) = E(X Y ). The best linear estimator ˆX lin = ay + b = + N Y also turns out to be the best MMSE estimator for the AWGN channel. That is, E(X Y ) = + N Y. EE 278: Using Neural Networks for Mean-Squared Estimation age 2
3 In general, estimation problems are formulated as follows: Given Y = h(x, Z), where X is signal and Z is noise, find ˆX = g(y ) such that E(X ˆX) 2 is minimized. We have seen that the best MMSE estimate is ˆX = g(y ) = E(X Y ). However, E(X Y ) can be very hard to compute. For example, consider some example h(, ) and X, Z distributions Linear channel: Y = X + Z, but X, Z are not Gaussian already very hard. Nonlinear channels: Y = XZ X 2 Z X 2 + Z X, Z arbitrarily distributed (including Gaussian) also very hard! Vector channels (linear or nonlinear) X, Z arbitrarily distributed. Y = h(x, Z) EE 278: Using Neural Networks for Mean-Squared Estimation age 3
4 These channels arise in many practical applications, including Communications (mostly additive or multiplicative channels with Gaussian variables) Biological systems (arbitrary channels and X, Z distributions) E-commerce/retail data (arbitrary channels and X, Z distributions) Networking (arbitrary channels and X, Z distributions) etc. EE 278: Using Neural Networks for Mean-Squared Estimation age 4
5 So: How can we evaluate E(X Y )? Two ways to proceed: Approximate E(X Y ) = g(y ) with polynomials. Approximate g( ) using neural networks and DATA. Let s look at polynomial approximations first. ( ( 1 Take a simple case Y = X + Z, X Laplace ), Z Laplace X and Z are independent. Additive Laplacian Channel: Y = X + Z f X (x) = 1 e x 2, f Z (z) = 1 e z 2 N, X and Z are independent. N 1 N ), EE 278: Using Neural Networks for Mean-Squared Estimation age 5
6 Note: Similarly E(X k ) = E(X) = 0 = E(Z) E(X 2 ) = 2 ; E(Z 2 ) = 2N 0 k odd xk e x E(Z k ) = dx = 1 k! { 0 k odd k! N k 2 k even ( 1 ) k+1 = k! k 2 k even First, let s see if we can determine E(X Y ): E(X Y ) = where we use Bayes rule to get f X Y (x y). x f X Y (x y)dx, EE 278: Using Neural Networks for Mean-Squared Estimation age 6
7 f X Y (x y) = f Y X(y x)f X (x) f Y (y) = f Z (y x)f X (x) f X(v)f Z (y v)dv Highly complicated! = 1 e y x 2 N 1 e x N 2 1 e v 2 1 e y v 2 N N dv So let s look for polynomial approximations to g( ) (where E(X Y ) = g(y )). Linear: ˆX = ay + b Using the orthogonality principle, we get a = ˆX lin = ay + b = +N Y. Quadratic: ˆX = ay 2 + by + c By the orthogonality principle, (X ˆX) 1, Y, Y 2. +N, b = 0 so (X ˆX) 1 E [ (X (ay 2 + by + c)) 1 ] = 0 E(X) a E(Y 2 ) b E(Y ) c = 0 c = a E(Y 2 ) EE 278: Using Neural Networks for Mean-Squared Estimation age 7
8 (X ˆX) Y E [ (X (ay 2 + by + c)) Y ] = 0 E(XY ) a E(Y 3 ) b E(Y 2 ) c E(Y ) = 0 E(X 2 + XN) 0 b E(X 2 + Z 2 + 2XZ) 0 = 0 E(X 2 ) = b((e(x 2 ) + E(Z 2 ) + 0) b = E(X 2 ) E(X 2 ) + E(Z 2 ) = N = + N (X ˆX) Y 2 E [ (X (ay 2 + by + c)) Y 2] = 0 E(X(X 2 + Z 2 + 2XZ)) a E(Y 4 ) b E(Y 3 ) c E(Y 2 ) = 0 E(X 3 + XZ 2 + 2X 2 Z) a E(Y 4 ) 0 c E(Y 2 ) = 0 Substituting c = a E(Y 2 ), 0 a E(Y 4 ) + a[e(y 2 )] 2 = 0 a(e(y 4 ) [E(Y 2 )] 2 ) = avar(y 2 ) = 0 Since Var(Y 2 ) > 0, ( Y 2 > 0 is a random variable and (Y 2 > 0) = 1) It must be that a = 0; Hence c = 0. ˆX quadratic = by = +N Y = ˆX lin!!! EE 278: Using Neural Networks for Mean-Squared Estimation age 8
9 Cubic: ˆX = ay 3 + by 2 + cy + d Again, by orthogonality (X ˆX) 1, Y, Y 2, Y 3. Solving the equations (and after a lot of algebra), we get where b = d = 0 a = 2( + N)[E(X4 ) + 24 N] 2 E(Y 6 ) 2( + N) E(Y 6 ) [E(Y 4 )] 2 c = [2 a E(Y 4 )], 2( + N) E(X 4 ) = 24 2 E(Y 4 ) = 24( 2 + N + N 2 ) E(Y 6 ) = 6!( N + N 2 + N 3 ) Highly messy! Now let s consider using neural networks and data EE 278: Using Neural Networks for Mean-Squared Estimation age 9
10 Take Y = h(x, Z); X = g(y) = E(X Y) EE 278: Using Neural Networks for Mean-Squared Estimation age 10
11 What is a neural network? EE 278: Using Neural Networks for Mean-Squared Estimation age 11
12 h 1,before = W Y + b = W 1 W 2. W k Y 1 Y 2. Y n b 1 + b 2. = W 1, Y + b 1. W k, Y + b 2 where W R k n is the weight matrix and b R k 1 is the bias vector. Apply ReLU (Rectified Linear Unit) type of nonlinearity h 1,after = max{ W 1, Y + b 1, 0}. max{ W k, Y + b 2, 0} h 2,after is an input to the 2 nd hidden layer and so on... Geometrically, the h 2,after vector takes Y and projects it onto the affine places defined by (W i, b i ) and takes the projection only if Y is on the positive side of the affine hyperplane. And so on.. Thus layer 1 classifies Y into regions defined by the projections onto (W i, b i ) hyperplanes. It then passes the positive projections to layer 2, setting negative projections to zero. b k EE 278: Using Neural Networks for Mean-Squared Estimation age 12
13 And so on for subsequent layers. Eventually, ˆX is a piecewise linear combination of the affine transformations in the neural network. We train the neural network with data (Y i, X i ) (Recall that X i ˆX i ) to get the weights W ij and the biases b i,k. Then, we use the trained neural network to estimate a new ˆX given a new Y. EE 278: Using Neural Networks for Mean-Squared Estimation age 13
UCSD ECE153 Handout #27 Prof. Young-Han Kim Tuesday, May 6, Solutions to Homework Set #5 (Prepared by TA Fatemeh Arbabjolfaei)
UCSD ECE53 Handout #7 Prof. Young-Han Kim Tuesday, May 6, 4 Solutions to Homework Set #5 (Prepared by TA Fatemeh Arbabjolfaei). Neural net. Let Y = X + Z, where the signal X U[,] and noise Z N(,) are independent.
More informationCSC321 Lecture 5: Multilayer Perceptrons
CSC321 Lecture 5: Multilayer Perceptrons Roger Grosse Roger Grosse CSC321 Lecture 5: Multilayer Perceptrons 1 / 21 Overview Recall the simple neuron-like unit: y output output bias i'th weight w 1 w2 w3
More informationEEL 5544 Noise in Linear Systems Lecture 30. X (s) = E [ e sx] f X (x)e sx dx. Moments can be found from the Laplace transform as
L30-1 EEL 5544 Noise in Linear Systems Lecture 30 OTHER TRANSFORMS For a continuous, nonnegative RV X, the Laplace transform of X is X (s) = E [ e sx] = 0 f X (x)e sx dx. For a nonnegative RV, the Laplace
More informationCSC 411 Lecture 10: Neural Networks
CSC 411 Lecture 10: Neural Networks Roger Grosse, Amir-massoud Farahmand, and Juan Carrasquilla University of Toronto UofT CSC 411: 10-Neural Networks 1 / 35 Inspiration: The Brain Our brain has 10 11
More informationESTIMATION THEORY. Chapter Estimation of Random Variables
Chapter ESTIMATION THEORY. Estimation of Random Variables Suppose X,Y,Y 2,...,Y n are random variables defined on the same probability space (Ω, S,P). We consider Y,...,Y n to be the observed random variables
More informationSolutions to Homework Set #5 (Prepared by Lele Wang) MSE = E [ (sgn(x) g(y)) 2],, where f X (x) = 1 2 2π e. e (x y)2 2 dx 2π
Solutions to Homework Set #5 (Prepared by Lele Wang). Neural net. Let Y X + Z, where the signal X U[,] and noise Z N(,) are independent. (a) Find the function g(y) that minimizes MSE E [ (sgn(x) g(y))
More informationFinal Examination Solutions (Total: 100 points)
Final Examination Solutions (Total: points) There are 4 problems, each problem with multiple parts, each worth 5 points. Make sure you answer all questions. Your answer should be as clear and readable
More informationLecture 4: Least Squares (LS) Estimation
ME 233, UC Berkeley, Spring 2014 Xu Chen Lecture 4: Least Squares (LS) Estimation Background and general solution Solution in the Gaussian case Properties Example Big picture general least squares estimation:
More informationa b = a T b = a i b i (1) i=1 (Geometric definition) The dot product of two Euclidean vectors a and b is defined by a b = a b cos(θ a,b ) (2)
This is my preperation notes for teaching in sections during the winter 2018 quarter for course CSE 446. Useful for myself to review the concepts as well. More Linear Algebra Definition 1.1 (Dot Product).
More information2 (Statistics) Random variables
2 (Statistics) Random variables References: DeGroot and Schervish, chapters 3, 4 and 5; Stirzaker, chapters 4, 5 and 6 We will now study the main tools use for modeling experiments with unknown outcomes
More informationUCSD ECE250 Handout #27 Prof. Young-Han Kim Friday, June 8, Practice Final Examination (Winter 2017)
UCSD ECE250 Handout #27 Prof. Young-Han Kim Friday, June 8, 208 Practice Final Examination (Winter 207) There are 6 problems, each problem with multiple parts. Your answer should be as clear and readable
More informationNumerical Methods. Equations and Partial Fractions. Jaesung Lee
Numerical Methods Equations and Partial Fractions Jaesung Lee Solving linear equations Solving linear equations Introduction Many problems in engineering reduce to the solution of an equation or a set
More informationSummary of EE226a: MMSE and LLSE
1 Summary of EE226a: MMSE and LLSE Jean Walrand Here are the key ideas and results: I. SUMMARY The MMSE of given Y is E[ Y]. The LLSE of given Y is L[ Y] = E() + K,Y K 1 Y (Y E(Y)) if K Y is nonsingular.
More informationPerhaps the simplest way of modeling two (discrete) random variables is by means of a joint PMF, defined as follows.
Chapter 5 Two Random Variables In a practical engineering problem, there is almost always causal relationship between different events. Some relationships are determined by physical laws, e.g., voltage
More informationDiscrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 26. Estimation: Regression and Least Squares
CS 70 Discrete Mathematics and Probability Theory Spring 2016 Rao and Walrand Note 26 Estimation: Regression and Least Squares This note explains how to use observations to estimate unobserved random variables.
More informationMachine Learning (CSE 446): Neural Networks
Machine Learning (CSE 446): Neural Networks Noah Smith c 2017 University of Washington nasmith@cs.washington.edu November 6, 2017 1 / 22 Admin No Wednesday office hours for Noah; no lecture Friday. 2 /
More informationWhy is Deep Learning so effective?
Ma191b Winter 2017 Geometry of Neuroscience The unreasonable effectiveness of deep learning This lecture is based entirely on the paper: Reference: Henry W. Lin and Max Tegmark, Why does deep and cheap
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More information4 Derivations of the Discrete-Time Kalman Filter
Technion Israel Institute of Technology, Department of Electrical Engineering Estimation and Identification in Dynamical Systems (048825) Lecture Notes, Fall 2009, Prof N Shimkin 4 Derivations of the Discrete-Time
More informationMA 575 Linear Models: Cedric E. Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2
MA 575 Linear Models: Cedric E Ginestet, Boston University Revision: Probability and Linear Algebra Week 1, Lecture 2 1 Revision: Probability Theory 11 Random Variables A real-valued random variable is
More informationCS70: Jean Walrand: Lecture 22.
CS70: Jean Walrand: Lecture 22. Confidence Intervals; Linear Regression 1. Review 2. Confidence Intervals 3. Motivation for LR 4. History of LR 5. Linear Regression 6. Derivation 7. More examples Review:
More informationLinear Algebra. Chapter 8: Eigenvalues: Further Applications and Computations Section 8.2. Applications to Geometry Proofs of Theorems.
Linear Algebra Chapter 8: Eigenvalues: Further Applications and Computations Section 8.2. Applications to Geometry Proofs of Theorems May 1, 2018 () Linear Algebra May 1, 2018 1 / 8 Table of contents 1
More informationEcon 2120: Section 2
Econ 2120: Section 2 Part I - Linear Predictor Loose Ends Ashesh Rambachan Fall 2018 Outline Big Picture Matrix Version of the Linear Predictor and Least Squares Fit Linear Predictor Least Squares Omitted
More informationIntroduction to Machine Learning Fall 2017 Note 5. 1 Overview. 2 Metric
CS 189 Introduction to Machine Learning Fall 2017 Note 5 1 Overview Recall from our previous note that for a fixed input x, our measurement Y is a noisy measurement of the true underlying response f x):
More informationKalman Filter. Predict: Update: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q
Kalman Filter Kalman Filter Predict: x k k 1 = F k x k 1 k 1 + B k u k P k k 1 = F k P k 1 k 1 F T k + Q Update: K = P k k 1 Hk T (H k P k k 1 Hk T + R) 1 x k k = x k k 1 + K(z k H k x k k 1 ) P k k =(I
More informationpolynomial function polynomial function of degree n leading coefficient leading-term test quartic function turning point
polynomial function polynomial function of degree n leading coefficient leading-term test quartic function turning point quadratic form repeated zero multiplicity Graph Transformations of Monomial Functions
More informationLecture Semidefinite Programming and Graph Partitioning
Approximation Algorithms and Hardness of Approximation April 16, 013 Lecture 14 Lecturer: Alantha Newman Scribes: Marwa El Halabi 1 Semidefinite Programming and Graph Partitioning In previous lectures,
More informationUCSD ECE153 Handout #30 Prof. Young-Han Kim Thursday, May 15, Homework Set #6 Due: Thursday, May 22, 2011
UCSD ECE153 Handout #30 Prof. Young-Han Kim Thursday, May 15, 2014 Homework Set #6 Due: Thursday, May 22, 2011 1. Linear estimator. Consider a channel with the observation Y = XZ, where the signal X and
More informationA D VA N C E D P R O B A B I L - I T Y
A N D R E W T U L L O C H A D VA N C E D P R O B A B I L - I T Y T R I N I T Y C O L L E G E T H E U N I V E R S I T Y O F C A M B R I D G E Contents 1 Conditional Expectation 5 1.1 Discrete Case 6 1.2
More informationLECTURE 16 GAUSS QUADRATURE In general for Newton-Cotes (equispaced interpolation points/ data points/ integration points/ nodes).
CE 025 - Lecture 6 LECTURE 6 GAUSS QUADRATURE In general for ewton-cotes (equispaced interpolation points/ data points/ integration points/ nodes). x E x S fx dx hw' o f o + w' f + + w' f + E 84 f 0 f
More informationNotes on Discriminant Functions and Optimal Classification
Notes on Discriminant Functions and Optimal Classification Padhraic Smyth, Department of Computer Science University of California, Irvine c 2017 1 Discriminant Functions Consider a classification problem
More informationLectures 22-23: Conditional Expectations
Lectures 22-23: Conditional Expectations 1.) Definitions Let X be an integrable random variable defined on a probability space (Ω, F 0, P ) and let F be a sub-σ-algebra of F 0. Then the conditional expectation
More informationMATH 304 Linear Algebra Lecture 19: Least squares problems (continued). Norms and inner products.
MATH 304 Linear Algebra Lecture 19: Least squares problems (continued). Norms and inner products. Orthogonal projection Theorem 1 Let V be a subspace of R n. Then any vector x R n is uniquely represented
More informationECE 650 Lecture 4. Intro to Estimation Theory Random Vectors. ECE 650 D. Van Alphen 1
EE 650 Lecture 4 Intro to Estimation Theory Random Vectors EE 650 D. Van Alphen 1 Lecture Overview: Random Variables & Estimation Theory Functions of RV s (5.9) Introduction to Estimation Theory MMSE Estimation
More informationJoint Distributions. (a) Scalar multiplication: k = c d. (b) Product of two matrices: c d. (c) The transpose of a matrix:
Joint Distributions Joint Distributions A bivariate normal distribution generalizes the concept of normal distribution to bivariate random variables It requires a matrix formulation of quadratic forms,
More informationRecap from previous lecture
Recap from previous lecture Learning is using past experience to improve future performance. Different types of learning: supervised unsupervised reinforcement active online... For a machine, experience
More information1 What a Neural Network Computes
Neural Networks 1 What a Neural Network Computes To begin with, we will discuss fully connected feed-forward neural networks, also known as multilayer perceptrons. A feedforward neural network consists
More informationUnderstanding Generalization Error: Bounds and Decompositions
CIS 520: Machine Learning Spring 2018: Lecture 11 Understanding Generalization Error: Bounds and Decompositions Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the
More informationENGG2430A-Homework 2
ENGG3A-Homework Due on Feb 9th,. Independence vs correlation a For each of the following cases, compute the marginal pmfs from the joint pmfs. Explain whether the random variables X and Y are independent,
More informationIntelligent Systems Discriminative Learning, Neural Networks
Intelligent Systems Discriminative Learning, Neural Networks Carsten Rother, Dmitrij Schlesinger WS2014/2015, Outline 1. Discriminative learning 2. Neurons and linear classifiers: 1) Perceptron-Algorithm
More informationModule 2: First-Order Partial Differential Equations
Module 2: First-Order Partial Differential Equations The mathematical formulations of many problems in science and engineering reduce to study of first-order PDEs. For instance, the study of first-order
More informationECE534, Spring 2018: Solutions for Problem Set #5
ECE534, Spring 08: s for Problem Set #5 Mean Value and Autocorrelation Functions Consider a random process X(t) such that (i) X(t) ± (ii) The number of zero crossings, N(t), in the interval (0, t) is described
More information5 Operations on Multiple Random Variables
EE360 Random Signal analysis Chapter 5: Operations on Multiple Random Variables 5 Operations on Multiple Random Variables Expected value of a function of r.v. s Two r.v. s: ḡ = E[g(X, Y )] = g(x, y)f X,Y
More informationEstimation Theory Fredrik Rusek. Chapters
Estimation Theory Fredrik Rusek Chapters 3.5-3.10 Recap We deal with unbiased estimators of deterministic parameters Performance of an estimator is measured by the variance of the estimate (due to the
More informationCovariance and Correlation
Covariance and Correlation ST 370 The probability distribution of a random variable gives complete information about its behavior, but its mean and variance are useful summaries. Similarly, the joint probability
More informationLecture 6. Notes on Linear Algebra. Perceptron
Lecture 6. Notes on Linear Algebra. Perceptron COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Andrey Kan Copyright: University of Melbourne This lecture Notes on linear algebra Vectors
More information14 EE 2402 Engineering Mathematics III Solutions to Tutorial 3 1. For n =0; 1; 2; 3; 4; 5 verify that P n (x) is a solution of Legendre's equation wit
EE 0 Engineering Mathematics III Solutions to Tutorial. For n =0; ; ; ; ; verify that P n (x) is a solution of Legendre's equation with ff = n. Solution: Recall the Legendre's equation from your text or
More informationALGEBRAIC GEOMETRY HOMEWORK 3
ALGEBRAIC GEOMETRY HOMEWORK 3 (1) Consider the curve Y 2 = X 2 (X + 1). (a) Sketch the curve. (b) Determine the singular point P on C. (c) For all lines through P, determine the intersection multiplicity
More informationARCS IN FINITE PROJECTIVE SPACES. Basic objects and definitions
ARCS IN FINITE PROJECTIVE SPACES SIMEON BALL Abstract. These notes are an outline of a course on arcs given at the Finite Geometry Summer School, University of Sussex, June 26-30, 2017. Let K denote an
More informationProblem Set 7 Due March, 22
EE16: Probability and Random Processes SP 07 Problem Set 7 Due March, Lecturer: Jean C. Walrand GSI: Daniel Preda, Assane Gueye Problem 7.1. Let u and v be independent, standard normal random variables
More informationMulti-layer Neural Networks
Multi-layer Neural Networks Steve Renals Informatics 2B Learning and Data Lecture 13 8 March 2011 Informatics 2B: Learning and Data Lecture 13 Multi-layer Neural Networks 1 Overview Multi-layer neural
More informationReminders. Thought questions should be submitted on eclass. Please list the section related to the thought question
Linear regression Reminders Thought questions should be submitted on eclass Please list the section related to the thought question If it is a more general, open-ended question not exactly related to a
More informationconditional cdf, conditional pdf, total probability theorem?
6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random
More informationSolutions to Homework Set #6 (Prepared by Lele Wang)
Solutions to Homework Set #6 (Prepared by Lele Wang) Gaussian random vector Given a Gaussian random vector X N (µ, Σ), where µ ( 5 ) T and 0 Σ 4 0 0 0 9 (a) Find the pdfs of i X, ii X + X 3, iii X + X
More informationEECS 275 Matrix Computation
EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 9 1 / 23 Overview
More informationLinear regression COMS 4771
Linear regression COMS 4771 1. Old Faithful and prediction functions Prediction problem: Old Faithful geyser (Yellowstone) Task: Predict time of next eruption. 1 / 40 Statistical model for time between
More informationDeep Neural Networks (1) Hidden layers; Back-propagation
Deep Neural Networs (1) Hidden layers; Bac-propagation Steve Renals Machine Learning Practical MLP Lecture 3 2 October 2018 http://www.inf.ed.ac.u/teaching/courses/mlp/ MLP Lecture 3 / 2 October 2018 Deep
More informationMATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation)
MATHEMATICS 154, SPRING 2009 PROBABILITY THEORY Outline #11 (Tail-Sum Theorem, Conditional distribution and expectation) Last modified: March 7, 2009 Reference: PRP, Sections 3.6 and 3.7. 1. Tail-Sum Theorem
More informationDeep Neural Networks (1) Hidden layers; Back-propagation
Deep Neural Networs (1) Hidden layers; Bac-propagation Steve Renals Machine Learning Practical MLP Lecture 3 4 October 2017 / 9 October 2017 MLP Lecture 3 Deep Neural Networs (1) 1 Recap: Softmax single
More informationLecture 4: Types of errors. Bayesian regression models. Logistic regression
Lecture 4: Types of errors. Bayesian regression models. Logistic regression A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting more generally COMP-652 and ECSE-68, Lecture
More informationUCSD ECE153 Handout #34 Prof. Young-Han Kim Tuesday, May 27, Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei)
UCSD ECE53 Handout #34 Prof Young-Han Kim Tuesday, May 7, 04 Solutions to Homework Set #6 (Prepared by TA Fatemeh Arbabjolfaei) Linear estimator Consider a channel with the observation Y XZ, where the
More informationUNIT-2: MULTIPLE RANDOM VARIABLES & OPERATIONS
UNIT-2: MULTIPLE RANDOM VARIABLES & OPERATIONS In many practical situations, multiple random variables are required for analysis than a single random variable. The analysis of two random variables especially
More informationLeast Squares. Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 175A Winter UCSD
Least Squares Ken Kreutz-Delgado (Nuno Vasconcelos) ECE 75A Winter 0 - UCSD (Unweighted) Least Squares Assume linearity in the unnown, deterministic model parameters Scalar, additive noise model: y f (
More informationMultiple Regression Model: I
Multiple Regression Model: I Suppose the data are generated according to y i 1 x i1 2 x i2 K x ik u i i 1...n Define y 1 x 11 x 1K 1 u 1 y y n X x n1 x nk K u u n So y n, X nxk, K, u n Rks: In many applications,
More informationThe Multivariate Gaussian Distribution
The Multivariate Gaussian Distribution Chuong B. Do October, 8 A vector-valued random variable X = T X X n is said to have a multivariate normal or Gaussian) distribution with mean µ R n and covariance
More informationIntroduction to Machine Learning Spring 2018 Note Neural Networks
CS 189 Introduction to Machine Learning Spring 2018 Note 14 1 Neural Networks Neural networks are a class of compositional function approximators. They come in a variety of shapes and sizes. In this class,
More informationCS489/698: Intro to ML
CS489/698: Intro to ML Lecture 03: Multi-layer Perceptron Outline Failure of Perceptron Neural Network Backpropagation Universal Approximator 2 Outline Failure of Perceptron Neural Network Backpropagation
More informationRecall that if X 1,...,X n are random variables with finite expectations, then. The X i can be continuous or discrete or of any other type.
Expectations of Sums of Random Variables STAT/MTHE 353: 4 - More on Expectations and Variances T. Linder Queen s University Winter 017 Recall that if X 1,...,X n are random variables with finite expectations,
More informationMath Circles: Number Theory III
Math Circles: Number Theory III Centre for Education in Mathematics and Computing University of Waterloo March 9, 2011 A prime-generating polynomial The polynomial f (n) = n 2 n + 41 generates a lot of
More informationWeek 5: Logistic Regression & Neural Networks
Week 5: Logistic Regression & Neural Networks Instructor: Sergey Levine 1 Summary: Logistic Regression In the previous lecture, we covered logistic regression. To recap, logistic regression models and
More informationMachine Learning Lecture 5
Machine Learning Lecture 5 Linear Discriminant Functions 26.10.2017 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de leibe@vision.rwth-aachen.de Course Outline Fundamentals Bayes Decision Theory
More informationLecture 1: Systems of linear equations and their solutions
Lecture 1: Systems of linear equations and their solutions Course overview Topics to be covered this semester: Systems of linear equations and Gaussian elimination: Solving linear equations and applications
More informationLinear Algebra review Powers of a diagonalizable matrix Spectral decomposition
Linear Algebra review Powers of a diagonalizable matrix Spectral decomposition Prof. Tesler Math 283 Fall 2016 Also see the separate version of this with Matlab and R commands. Prof. Tesler Diagonalizing
More informationMachine Learning. Lecture 4: Regularization and Bayesian Statistics. Feng Li. https://funglee.github.io
Machine Learning Lecture 4: Regularization and Bayesian Statistics Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 207 Overfitting Problem
More informationECE 541 Stochastic Signals and Systems Problem Set 11 Solution
ECE 54 Stochastic Signals and Systems Problem Set Solution Problem Solutions : Yates and Goodman,..4..7.3.3.4.3.8.3 and.8.0 Problem..4 Solution Since E[Y (t] R Y (0, we use Theorem.(a to evaluate R Y (τ
More informationAppendix A : Introduction to Probability and stochastic processes
A-1 Mathematical methods in communication July 5th, 2009 Appendix A : Introduction to Probability and stochastic processes Lecturer: Haim Permuter Scribe: Shai Shapira and Uri Livnat The probability of
More informationSDS 321: Introduction to Probability and Statistics
SDS 321: Introduction to Probability and Statistics Lecture 17: Continuous random variables: conditional PDF Purnamrita Sarkar Department of Statistics and Data Science The University of Texas at Austin
More informationEE/Stat 376B Handout #5 Network Information Theory October, 14, Homework Set #2 Solutions
EE/Stat 376B Handout #5 Network Information Theory October, 14, 014 1. Problem.4 parts (b) and (c). Homework Set # Solutions (b) Consider h(x + Y ) h(x + Y Y ) = h(x Y ) = h(x). (c) Let ay = Y 1 + Y, where
More informationMidterm Review CS 6375: Machine Learning. Vibhav Gogate The University of Texas at Dallas
Midterm Review CS 6375: Machine Learning Vibhav Gogate The University of Texas at Dallas Machine Learning Supervised Learning Unsupervised Learning Reinforcement Learning Parametric Y Continuous Non-parametric
More informationProbabilistic Graphical Models
Probabilistic Graphical Models Lecture 12 Dynamical Models CS/CNS/EE 155 Andreas Krause Homework 3 out tonight Start early!! Announcements Project milestones due today Please email to TAs 2 Parameter learning
More informationDefinition and basic properties of heat kernels I, An introduction
Definition and basic properties of heat kernels I, An introduction Zhiqin Lu, Department of Mathematics, UC Irvine, Irvine CA 92697 April 23, 2010 In this lecture, we will answer the following questions:
More informationEngineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I
Engineering Part IIB: Module 4F10 Statistical Pattern Processing Lecture 6: Multi-Layer Perceptrons I Phil Woodland: pcw@eng.cam.ac.uk Michaelmas 2012 Engineering Part IIB: Module 4F10 Introduction In
More informationORDINARY DIFFERENTIAL EQUATIONS
ORDINARY DIFFERENTIAL EQUATIONS Basic concepts: Find y(x) where x is the independent and y the dependent varible, based on an equation involving x, y(x), y 0 (x),...e.g.: y 00 (x) = 1+y(x) y0 (x) 1+x or,
More informationBANA 7046 Data Mining I Lecture 6. Other Data Mining Algorithms 1
BANA 7046 Data Mining I Lecture 6. Other Data Mining Algorithms 1 Shaobo Li University of Cincinnati 1 Partially based on Hastie, et al. (2009) ESL, and James, et al. (2013) ISLR Data Mining I Lecture
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationLecture 2: Repetition of probability theory and statistics
Algorithms for Uncertainty Quantification SS8, IN2345 Tobias Neckel Scientific Computing in Computer Science TUM Lecture 2: Repetition of probability theory and statistics Concept of Building Block: Prerequisites:
More informationQuantum Mechanics for Scientists and Engineers. David Miller
Quantum Mechanics for Scientists and Engineers David Miller Vector spaces, operators and matrices Vector spaces, operators and matrices Vector space Vector space We need a space in which our vectors exist
More information12 Statistical Justifications; the Bias-Variance Decomposition
Statistical Justifications; the Bias-Variance Decomposition 65 12 Statistical Justifications; the Bias-Variance Decomposition STATISTICAL JUSTIFICATIONS FOR REGRESSION [So far, I ve talked about regression
More informationSupport Vector Regression (SVR) Descriptions of SVR in this discussion follow that in Refs. (2, 6, 7, 8, 9). The literature
Support Vector Regression (SVR) Descriptions of SVR in this discussion follow that in Refs. (2, 6, 7, 8, 9). The literature suggests the design variables should be normalized to a range of [-1,1] or [0,1].
More informationLecture Notes 1 Probability and Random Variables. Conditional Probability and Independence. Functions of a Random Variable
Lecture Notes 1 Probability and Random Variables Probability Spaces Conditional Probability and Independence Random Variables Functions of a Random Variable Generation of a Random Variable Jointly Distributed
More informationCMU-Q Lecture 24:
CMU-Q 15-381 Lecture 24: Supervised Learning 2 Teacher: Gianni A. Di Caro SUPERVISED LEARNING Hypotheses space Hypothesis function Labeled Given Errors Performance criteria Given a collection of input
More informationInput: A set (x i -yy i ) data. Output: Function value at arbitrary point x. What for x = 1.2?
Applied Numerical Analysis Interpolation Lecturer: Emad Fatemizadeh Interpolation Input: A set (x i -yy i ) data. Output: Function value at arbitrary point x. 0 1 4 1-3 3 9 What for x = 1.? Interpolation
More informationVectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1. x 2. x =
Linear Algebra Review Vectors To begin, let us describe an element of the state space as a point with numerical coordinates, that is x 1 x x = 2. x n Vectors of up to three dimensions are easy to diagram.
More informationLecture 7: Kernels for Classification and Regression
Lecture 7: Kernels for Classification and Regression CS 194-10, Fall 2011 Laurent El Ghaoui EECS Department UC Berkeley September 15, 2011 Outline Outline A linear regression problem Linear auto-regressive
More informationLearning with kernels and SVM
Learning with kernels and SVM Šámalova chata, 23. května, 2006 Petra Kudová Outline Introduction Binary classification Learning with Kernels Support Vector Machines Demo Conclusion Learning from data find
More informationCS 6501: Deep Learning for Computer Graphics. Basics of Neural Networks. Connelly Barnes
CS 6501: Deep Learning for Computer Graphics Basics of Neural Networks Connelly Barnes Overview Simple neural networks Perceptron Feedforward neural networks Multilayer perceptron and properties Autoencoders
More informationENGR352 Problem Set 02
engr352/engr352p02 September 13, 2018) ENGR352 Problem Set 02 Transfer function of an estimator 1. Using Eq. (1.1.4-27) from the text, find the correct value of r ss (the result given in the text is incorrect).
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 (Many figures from C. M. Bishop, "Pattern Recognition and ") 1of 89 Part II
More informationNeural Networks. Nicholas Ruozzi University of Texas at Dallas
Neural Networks Nicholas Ruozzi University of Texas at Dallas Handwritten Digit Recognition Given a collection of handwritten digits and their corresponding labels, we d like to be able to correctly classify
More information