PATTERN CLASSIFICATION


PATTERN CLASSIFICATION
Second Edition
Richard O. Duda, Peter E. Hart, David G. Stork
A Wiley-Interscience Publication
John Wiley & Sons, Inc.
New York / Chichester / Weinheim / Brisbane / Singapore / Toronto
CONTENTS

PREFACE, xvii

1 INTRODUCTION
Machine Perception; An Example; Related Fields; Pattern Recognition Systems; Sensing; Segmentation and Grouping; Feature Extraction; Classification; Postprocessing; The Design Cycle; Data Collection; Feature Choice; Model Choice; Training; Evaluation; Computational Complexity; Learning and Adaptation; Supervised Learning; Unsupervised Learning; Reinforcement Learning; Conclusion, 17; Summary by Chapters, 17; Bibliographical and Historical Remarks, 18; Bibliography, 19

2 BAYESIAN DECISION THEORY
Introduction; Bayesian Decision Theory: Continuous Features; Two-Category Classification; Minimum-Error-Rate Classification, 26; *2.3.1 Minimax Criterion, 27
*2.3.2 Neyman-Pearson Criterion; Classifiers, Discriminant Functions, and Decision Surfaces; The Multicategory Case; The Two-Category Case; The Normal Density; Univariate Density; Multivariate Density; Discriminant Functions for the Normal Density; Case 1: Σi = σ²I; Case 2: Σi = Σ; Case 3: Σi = arbitrary, 41; Example 1: Decision Regions for Two-Dimensional Gaussian Data, 41; *2.7 Error Probabilities and Integrals, 45; *2.8 Error Bounds for Normal Densities; Chernoff Bound; Bhattacharyya Bound, 47; Example 2: Error Bounds for Gaussian Distributions; Signal Detection Theory and Operating Characteristics; Bayes Decision Theory: Discrete Features; Independent Binary Features, 52; Example 3: Bayesian Decisions for Three-Dimensional Binary Data, 53; *2.10 Missing and Noisy Features; Missing Features; Noisy Features, 55; *2.11 Bayesian Belief Networks, 56; Example 4: Belief Network for Fish, 59; *2.12 Compound Bayesian Decision Theory and Context, 62; Summary, 63; Bibliographical and Historical Remarks, 64; Problems, 65; Computer exercises, 80; Bibliography, 82

3 MAXIMUM-LIKELIHOOD AND BAYESIAN PARAMETER ESTIMATION
Introduction; Maximum-Likelihood Estimation; The General Principle; The Gaussian Case: Unknown μ; The Gaussian Case: Unknown μ and Σ; Bias; Bayesian Estimation; The Class-Conditional Densities; The Parameter Distribution; Bayesian Parameter Estimation: Gaussian Case; The Univariate Case: p(μ|D); The Univariate Case: p(x|D); The Multivariate Case, 95
3.5 Bayesian Parameter Estimation: General Theory, 97; Example 1: Recursive Bayes Learning; When Do Maximum-Likelihood and Bayes Methods Differ?; Noninformative Priors and Invariance; Gibbs Algorithm, 102; *3.6 Sufficient Statistics; Sufficient Statistics and the Exponential Family; Problems of Dimensionality; Accuracy, Dimension, and Training Sample Size; Computational Complexity; Overfitting, 113; *3.8 Component Analysis and Discriminants; Principal Component Analysis (PCA); Fisher Linear Discriminant; Multiple Discriminant Analysis, 121; *3.9 Expectation-Maximization (EM), 124; Example 2: Expectation-Maximization for a 2D Normal Model; Hidden Markov Models; First-Order Markov Models; First-Order Hidden Markov Models; Hidden Markov Model Computation; Evaluation, 131; Example 3: Hidden Markov Model; Decoding, 135; Example 4: HMM Decoding; Learning, 137; Summary, 139; Bibliographical and Historical Remarks, 139; Problems, 140; Computer exercises, 155; Bibliography, 159

4 NONPARAMETRIC TECHNIQUES
Introduction; Density Estimation; Parzen Windows; Convergence of the Mean; Convergence of the Variance; Illustrations; Classification Example; Probabilistic Neural Networks (PNNs); Choosing the Window Function; k-Nearest-Neighbor Estimation; k-Nearest-Neighbor and Parzen-Window Estimation; Estimation of A Posteriori Probabilities; The Nearest-Neighbor Rule; Convergence of the Nearest Neighbor; Error Rate for the Nearest-Neighbor Rule; Error Bounds; The k-Nearest-Neighbor Rule, 182
4.5.5 Computational Complexity of the k-Nearest-Neighbor Rule; Metrics and Nearest-Neighbor Classification; Properties of Metrics; Tangent Distance, 188; *4.7 Fuzzy Classification, 192; *4.8 Reduced Coulomb Energy Networks; Approximations by Series Expansions, 197; Summary, 199; Bibliographical and Historical Remarks, 200; Problems, 201; Computer exercises, 209; Bibliography, 213

5 LINEAR DISCRIMINANT FUNCTIONS
Introduction; Linear Discriminant Functions and Decision Surfaces; The Two-Category Case; The Multicategory Case; Generalized Linear Discriminant Functions; The Two-Category Linearly Separable Case; Geometry and Terminology; Gradient Descent Procedures; Minimizing the Perceptron Criterion Function; The Perceptron Criterion Function; Convergence Proof for Single-Sample Correction; Some Direct Generalizations; Relaxation Procedures; The Descent Algorithm; Convergence Proof; Nonseparable Behavior; Minimum Squared-Error Procedures; Minimum Squared-Error and the Pseudoinverse, 240; Example 1: Constructing a Linear Classifier by Matrix Pseudoinverse; Relation to Fisher's Linear Discriminant; Asymptotic Approximation to an Optimal Discriminant; The Widrow-Hoff or LMS Procedure; Stochastic Approximation Methods; The Ho-Kashyap Procedures; The Descent Procedure; Convergence Proof; Nonseparable Behavior; Some Related Procedures, 253; *5.10 Linear Programming Algorithms; Linear Programming; The Linearly Separable Case; Minimizing the Perceptron Criterion Function, 258; *5.11 Support Vector Machines; SVM Training, 263; Example 2: SVM for the XOR Problem, 264
5.12 Multicategory Generalizations; Kesler's Construction; Convergence of the Fixed-Increment Rule; Generalizations for MSE Procedures, 268; Summary, 269; Bibliographical and Historical Remarks, 270; Problems, 271; Computer exercises, 278; Bibliography, 281

6 MULTILAYER NEURAL NETWORKS
Introduction; Feedforward Operation and Classification; General Feedforward Operation; Expressive Power of Multilayer Networks; Backpropagation Algorithm; Network Learning; Training Protocols; Learning Curves; Error Surfaces; Some Small Networks; The Exclusive-OR (XOR); Larger Networks; How Important Are Multiple Minima?; Backpropagation as Feature Mapping; Representations at the Hidden Layer: Weights; Backpropagation, Bayes Theory and Probability; Bayes Discriminants and Neural Networks; Outputs as Probabilities, 304; *6.7 Related Statistical Techniques; Practical Techniques for Improving Backpropagation; Activation Function; Parameters for the Sigmoid; Scaling Input; Target Values; Training with Noise; Manufacturing Data; Number of Hidden Units; Initializing Weights; Learning Rates; Momentum; Weight Decay; Hints; On-Line, Stochastic or Batch Training?; Stopped Training; Number of Hidden Layers; Criterion Function, 318; *6.9 Second-Order Methods; Hessian Matrix; Newton's Method, 319
6.9.3 Quickprop; Conjugate Gradient Descent, 321; Example 1: Conjugate Gradient Descent, 322; *6.10 Additional Networks and Training Methods; Radial Basis Function Networks (RBFs); Special Bases; Matched Filters; Convolutional Networks; Recurrent Networks; Cascade-Correlation; Regularization, Complexity Adjustment and Pruning, 330; Summary, 333; Bibliographical and Historical Remarks, 333; Problems, 335; Computer exercises, 343; Bibliography, 347

7 STOCHASTIC METHODS
Introduction; Stochastic Search; Simulated Annealing; The Boltzmann Factor; Deterministic Simulated Annealing; Boltzmann Learning; Stochastic Boltzmann Learning of Visible States; Missing Features and Category Constraints; Deterministic Boltzmann Learning; Initialization and Setting Parameters, 367; *7.4 Boltzmann Networks and Graphical Models; Other Graphical Models, 372; *7.5 Evolutionary Methods; Genetic Algorithms; Further Heuristics; Why Do They Work?, 378; *7.6 Genetic Programming, 378; Summary, 381; Bibliographical and Historical Remarks, 381; Problems, 383; Computer exercises, 388; Bibliography

8 NONMETRIC METHODS
Introduction; Decision Trees; CART; Number of Splits; Query Selection and Node Impurity; When to Stop Splitting; Pruning, 403
Assignment of Leaf Node Labels, 404; Example 1: A Simple Tree; Computational Complexity; Feature Choice; Multivariate Decision Trees; Priors and Costs; Missing Attributes, 409; Example 2: Surrogate Splits and Missing Attributes; Other Tree Methods; ID3; C4.5; Which Tree Classifier Is Best?, 412; *8.5 Recognition with Strings; String Matching; Edit Distance; Computational Complexity; String Matching with Errors; String Matching with the "Don't-Care" Symbol; Grammatical Methods; Grammars; Types of String Grammars, 424; Example 3: A Grammar for Pronouncing Numbers; Recognition Using Grammars; Grammatical Inference, 429; Example 4: Grammatical Inference, 431; *8.8 Rule-Based Methods; Learning Rules, 433; Summary, 434; Bibliographical and Historical Remarks, 435; Problems, 437; Computer exercises, 446; Bibliography, 450

9 ALGORITHM-INDEPENDENT MACHINE LEARNING
Introduction; Lack of Inherent Superiority of Any Classifier; No Free Lunch Theorem, 454; Example 1: No Free Lunch for Binary Data, 457; *9.2.2 Ugly Duckling Theorem; Minimum Description Length (MDL); Minimum Description Length Principle; Overfitting Avoidance and Occam's Razor; Bias and Variance; Bias and Variance for Regression; Bias and Variance for Classification; Resampling for Estimating Statistics; Jackknife, 472; Example 2: Jackknife Estimate of Bias and Variance of the Mode; Bootstrap; Resampling for Classifier Design, 475
9.5.1 Bagging; Boosting; Learning with Queries; Arcing, Learning with Queries, Bias and Variance; Estimating and Comparing Classifiers; Parametric Models; Cross-Validation; Jackknife and Bootstrap Estimation of Classification Accuracy; Maximum-Likelihood Model Comparison; Bayesian Model Comparison; The Problem-Average Error Rate; Predicting Final Performance from Learning Curves; The Capacity of a Separating Plane; Combining Classifiers; Component Classifiers with Discriminant Functions; Component Classifiers without Discriminant Functions, 498; Summary, 499; Bibliographical and Historical Remarks, 500; Problems, 502; Computer exercises, 508; Bibliography

10 UNSUPERVISED LEARNING AND CLUSTERING
Introduction; Mixture Densities and Identifiability; Maximum-Likelihood Estimates; Application to Normal Mixtures; Case 1: Unknown Mean Vectors; Case 2: All Parameters Unknown; k-means Clustering, 526; *Fuzzy k-means Clustering; Unsupervised Bayesian Learning; The Bayes Classifier; Learning the Parameter Vector, 531; Example 1: Unsupervised Learning of Gaussian Data; Decision-Directed Approximation; Data Description and Clustering; Similarity Measures; Criterion Functions for Clustering; The Sum-of-Squared-Error Criterion; Related Minimum Variance Criteria; Scatter Criteria, 544; Example 2: Clustering Criteria, 546; *10.8 Iterative Optimization; Hierarchical Clustering; Definitions; Agglomerative Hierarchical Clustering; Stepwise-Optimal Hierarchical Clustering; Hierarchical Clustering and Induced Metrics, 556; *The Problem of Validity, 557
*10.11 On-line Clustering; Unknown Number of Clusters; Adaptive Resonance; Learning with a Critic, 565; *10.12 Graph-Theoretic Methods; Component Analysis; Principal Component Analysis (PCA); Nonlinear Component Analysis (NLCA), 569; *Independent Component Analysis (ICA); Low-Dimensional Representations and Multidimensional Scaling (MDS); Self-Organizing Feature Maps; Clustering and Dimensionality Reduction, 580; Summary, 581; Bibliographical and Historical Remarks, 582; Problems, 583; Computer exercises, 593; Bibliography, 598

A MATHEMATICAL FOUNDATIONS, 601
A.1 Notation, 601; A.2 Linear Algebra, 604; A.2.1 Notation and Preliminaries, 604; A.2.2 Inner Product, 605; A.2.3 Outer Product, 606; A.2.4 Derivatives of Matrices, 606; A.2.5 Determinant and Trace, 608; A.2.6 Matrix Inversion, 609; A.2.7 Eigenvectors and Eigenvalues, 609; A.3 Lagrange Optimization, 610; A.4 Probability Theory, 611; A.4.1 Discrete Random Variables, 611; A.4.2 Expected Values, 611; A.4.3 Pairs of Discrete Random Variables, 612; A.4.4 Statistical Independence, 613; A.4.5 Expected Values of Functions of Two Variables, 613; A.4.6 Conditional Probability, 614; A.4.7 The Law of Total Probability and Bayes Rule, 615; A.4.8 Vector Random Variables, 616; A.4.9 Expectations, Mean Vectors and Covariance Matrices, 617; A.4.10 Continuous Random Variables, 618; A.4.11 Distributions of Sums of Independent Random Variables, 620; A.4.12 Normal Distributions, 621; A.5 Gaussian Derivatives and Integrals, 623; A.5.1 Multivariate Normal Densities, 624; A.5.2 Bivariate Normal Densities, 626; A.6 Hypothesis Testing, 628; A.6.1 Chi-Squared Test, 629; A.7 Information Theory, 630; A.7.1 Entropy and Information, 630
A.7.2 Relative Entropy, 632; A.7.3 Mutual Information, 632; A.8 Computational Complexity, 633; Bibliography, 635

INDEX, 637
Pattern Recognition and Machine Learning
Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer. Contents: Preface; Mathematical notation; Contents (vii, xi, xiii); 1 Introduction, 1; 1.1 Example: Polynomial Curve Fitting, 4; 1.2 Probability…
Learning From Data, Lecture 15: Reflecting on Our Path (Epilogue to Part I)
What We Did; The Machine Learning Zoo; Moving Forward. M. Magdon-Ismail, CSCI 4100/6100. Recap: Three Learning Principles…

Deep Learning / Ian Goodfellow, Yoshua Bengio and Aaron Courville
Cambridge, MA; London, 2017. Contents: Website; Acknowledgments; Notation (xiii, xv, xix); 1 Introduction, 1; 1.1 Who Should Read This Book?…

Midterm Review, CS 6375: Machine Learning
Vibhav Gogate, The University of Texas at Dallas. Machine learning: supervised learning, unsupervised learning, reinforcement learning; parametric vs. nonparametric…

CSE 417T: Introduction to Machine Learning, Final Review
Henry Chai, 12/4/18. Overfitting: overfitting is fitting the training data more than is warranted, i.e., fitting noise rather than signal. Estimating…
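The overfitting definition above can be made concrete with a small sketch (the data, noise level, and polynomial degrees are invented for illustration and are not from the review slides): a high-capacity polynomial drives training error toward zero while error on fresh data from the same curve gets worse.

```python
# Illustrative overfitting demo: fit polynomials of increasing degree to
# 10 noisy samples of a sine wave and compare training error with error
# on noise-free test points. Degree 9 interpolates the noise exactly.
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.2, 10)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)          # the underlying signal

def mse(degree):
    """Train/test mean squared error of a least-squares polynomial fit."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train, test

train3, test3 = mse(3)   # moderate capacity
train9, test9 = mse(9)   # high capacity: fits the noise
```

With the fixed seed, the degree-9 fit has (near-)zero training error but larger test error than the degree-3 fit, which is exactly the "fitting noise rather than signal" point.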
EEL 851: Biometrics, An Overview of Statistical Pattern Recognition
Outline: Introduction; Pattern; Feature; Noise; Example Problem; Analysis; Segmentation; Feature Extraction; Classification; Design Cycle…

Condensed Table of Contents for Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control, by J. C. Spall
John Wiley and Sons, Inc., 2003. Preface, xiii; 1. Stochastic Search…

Algorithm-Independent Learning Issues
Selim Aksoy, Department of Computer Engineering, Bilkent University (saksoy@cs.bilkent.edu.tr), CS 551, Spring 2007. Introduction: we have seen many learning…

Machine Learning 2017
Volker Roth, Department of Mathematics & Computer Science, University of Basel, 21 March 2017…

Independent Component Analysis, Contents
Preface, xvii; 1 Introduction, 1; 1.1 Linear representation of multivariate data, 1; 1.1.1 The general statistical setting, 1; 1.1.2 Dimension reduction methods, 2; 1.1.3 Independence as a guiding principle…

Lecture 5: Logistic Regression. Neural Networks
Logistic regression; comparison with generative models; feedforward neural networks; backpropagation; tricks for training neural networks. COMP-652, Lecture…

Midterm Review, CS 7301: Advanced Machine Learning
Vibhav Gogate, The University of Texas at Dallas. Supervised learning: issues in supervised learning; what makes learning hard; point estimation: MLE vs. Bayesian…

Machine Learning: A Bayesian and Optimization Perspective
Sergios Theodoridis. Academic Press (Amsterdam, Boston, Heidelberg, London, New York, Oxford, Paris, San Diego, San Francisco, Singapore, Sydney, Tokyo)…

Bayesian learning
Machine learning comes from Bayesian decision theory in statistics, where we want to minimize the expected value of the loss function. Let y be the true label and ŷ be the predicted…
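The idea in the snippet above, choosing so as to minimize the expected value of the loss, can be sketched in a few lines. The posterior probabilities and loss values here are invented for illustration.

```python
# Minimal sketch of the Bayes decision rule: given a posterior over states
# and a loss for each (action, true state) pair, pick the action whose
# expected loss under the posterior is smallest.
posterior = {"healthy": 0.8, "sick": 0.2}     # P(state | observation)
loss = {                                       # loss[action][true state]
    "treat":   {"healthy": 1.0, "sick": 0.0},
    "dismiss": {"healthy": 0.0, "sick": 10.0},
}

def expected_loss(action):
    return sum(loss[action][state] * p for state, p in posterior.items())

best = min(loss, key=expected_loss)            # Bayes-optimal action
```

Here "treat" has expected loss 0.8 versus 2.0 for "dismiss", so the rule treats even though "healthy" is the more probable state: the asymmetric loss, not just the posterior, drives the decision.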
[Figure residue: plots of p(D|θ) and the log-likelihood l(θ) against θ, with the estimate θ̂ marked.] FIGURE 3.1. The top graph shows several training points in one dimension, known or assumed to…
CS 6375 Machine Learning
Nicholas Ruozzi, University of Texas at Dallas. Slides adapted from David Sontag and Vibhav Gogate. Course info: Instructor: Nicholas Ruozzi; Office: ECSS 3.409; Office hours: Tues.…

Ch 4. Linear Models for Classification
Pattern Recognition and Machine Learning, C. M. Bishop, 2006. Department of Computer Science and Engineering, Pohang University of Science and Technology, 77 Cheongam-ro…

Mathematical Formulation of Our Example
We define two binary random variables: open and …, where … is light on or light off. Our question is: What is …? Combining evidence: suppose our robot…

Information Theory in Computer Vision and Pattern Recognition
Francisco Escolano, Pablo Suau, Boyan Bonev; foreword by Alan Yuille. Springer. Contents: 1 Introduction…

STA 4273H: Statistical Machine Learning
Russ Salakhutdinov, Department of Statistics (rsalakhu@utstat.toronto.edu, http://www.utstat.utoronto.ca/~rsalakhu/), Sidney Smith Hall, Room 6002. Lecture 3: Linear…

UNIVERSITY of PENNSYLVANIA, CIS 520: Machine Learning, Final, Fall 2013
Exam policy: this exam allows two one-page, two-sided cheat sheets; no other materials. Time: 2 hours. Be sure to write your name and…

L11: Pattern recognition principles
Bayesian decision theory; statistical classifiers; dimensionality reduction; clustering. This lecture is partly based on [Huang, Acero and Hon, 2001, ch. 4]. Introduction…

Types of learning / Modeling data (9/12/17)
Supervised learning (classification, regression): we know input and targets; the goal is to learn a model that, given input data, accurately predicts target data. Unsupervised learning (clustering): we know the input only and want to make…

ECE471-571 Pattern Recognition, Lecture 13: Decision Tree
Hairong Qi, Gonzalez Family Professor, Electrical Engineering and Computer Science, University of Tennessee, Knoxville. http://www.eecs.utk.edu/faculty/qi

Linear Discrimination Functions
Laurea Magistrale in Informatica, Nicola Fanizzi, Dipartimento di Informatica, Università degli Studi di Bari, November 4, 2009. Outline: linear models; gradient descent; perceptron; minimum square error approach…
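The perceptron rule named in the outline above can be sketched as follows: cycle through the data and, on each misclassified point, add (or subtract) that point to the weight vector; for linearly separable data this single-sample correction converges. The toy data below are invented for illustration.

```python
# Single-sample perceptron correction on a tiny linearly separable set.
import numpy as np

X = np.array([[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])                 # labels in {+1, -1}
Xa = np.hstack([X, np.ones((4, 1))])         # augment with a bias component

w = np.zeros(3)
for _ in range(100):                         # passes over the data
    errors = 0
    for xi, yi in zip(Xa, y):
        if yi * (w @ xi) <= 0:               # misclassified (or on boundary)
            w += yi * xi                     # single-sample correction
            errors += 1
    if errors == 0:                          # a full clean pass: converged
        break

predictions = np.sign(Xa @ w)
```

On this separable set the loop terminates after a couple of passes with all four points classified correctly.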
OBJECT DETECTION AND RECOGNITION IN DIGITAL IMAGES: THEORY AND PRACTICE
Bogusław Cyganek, AGH University of Science and Technology, Poland. A John Wiley & Sons, Ltd., Publication. Contents: Preface; Acknowledgements…

Contents
Preface; Preface to the First Edition (xi, xiii). 1 Basic Probability Theory, 1: 1.1 Introduction, 1; 1.2 Sample Spaces and Events, 3; 1.3 The Axioms of Probability, 7; 1.4 Finite Sample Spaces and Combinatorics, 15…

Basic Principles of Unsupervised and Supervised Learning: Toward Deep Learning
Shun-ichi Amari (RIKEN Brain Science Institute); collaborators: R. Karakida, M. Okada (U. Tokyo). Deep learning; self-organization…

Reading Group on Deep Learning, Session 1
Stephane Lathuiliere & Pablo Mesejo, 2 June 2016. Contents: Introduction to Artificial Neural Networks: to understand, and to be able to efficiently use, the popular…

Course in Data Science
About the course: in this course you will get an introduction to the main tools and ideas required for a data scientist / business analyst / data analyst. The course gives an…

CS 189, Spring 2013: Introduction to Machine Learning, Final
You have 3 hours for the exam. The exam is closed book, closed notes except your one-page (two sides) or two-page (one side) crib sheet. Please…

ECE521, week 3: 23/26 January 2017
Outline: probabilistic interpretation of linear regression; maximum likelihood estimation (MLE); maximum a posteriori (MAP) estimation; bias-variance tradeoff; linear…

HANDBOOK OF APPLICABLE MATHEMATICS
Chief Editor: Walter Ledermann. Volume VI: Statistics, Part A, edited by Emlyn Lloyd, University of Lancaster. A Wiley-Interscience Publication, John Wiley & Sons, Chichester…

FINAL: CS 6375 (Machine Learning), Fall 2014
The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run out of room for…

Machine Learning, Lecture 5: Linear Discriminant Functions (26.10.2017)
Bastian Leibe, RWTH Aachen (http://www.vision.rwth-aachen.de, leibe@vision.rwth-aachen.de). Course outline: fundamentals; Bayes decision theory…

ECE662: Pattern Recognition and Decision Making Processes, HW TWO
Purdue University, Department of Electrical and Computer Engineering, West Lafayette, Indiana, USA. Abstract: in this report experiments are…

Introduction to Neural Networks
What are (artificial) neural networks? Models of the brain and nervous system; highly parallel; process information much more like the brain than a serial computer. Learning…

Lessons in Estimation Theory for Signal Processing, Communications, and Control
Jerry M. Mendel, Department of Electrical Engineering, University of Southern California, Los Angeles, California. Prentice Hall…

Neural Networks
Nethra Sambamoorthi, Ph.D., Jan 2003, CRMportals Inc. (phone: 732-972-8969, Nethra@crmportals.com). What? Saying it again in different ways: artificial neural network…

Continuous Univariate Distributions, Volume 1, Second Edition
Norman L. Johnson (University of North Carolina, Chapel Hill, North Carolina), Samuel Kotz (University of Maryland, College Park, Maryland), N. Balakrishnan…

Logistic Regression
Machine Learning, Fall 2018. Where are we? We have seen the following ideas: linear models; learning as loss minimization; Bayesian learning criteria (MAP and MLE estimation); the Naïve Bayes…
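The ideas listed in that snippet (a linear model, learning as loss minimization, MLE) meet in logistic regression: maximizing the likelihood is minimizing the negative log-likelihood by gradient descent. The toy data, step size, and iteration count below are assumptions made for illustration.

```python
# Logistic regression on a 1-D toy set via batch gradient descent on the
# negative log-likelihood. The gradient per example is (p - y) * x.
import math

data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]   # (feature, label)
w, b, lr = 0.0, 0.0, 0.5

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for _ in range(200):                     # batch gradient descent
    gw = gb = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)           # predicted P(y = 1 | x)
        gw += (p - y) * x                # d(NLL)/dw for this example
        gb += (p - y)                    # d(NLL)/db
    w -= lr * gw
    b -= lr * gb

probs = [sigmoid(w * x + b) for x, _ in data]
```

Because the loss is convex, the descent settles on a positive slope w that pushes the negative examples toward probability 0 and the positive ones toward 1.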
MODULE 4: BAYESIAN LEARNING
Content: introduction; Bayes theorem; Bayes theorem and concept learning; maximum likelihood and least squared error hypothesis; maximum likelihood hypotheses for predicting probabilities…

Artificial Neural Networks Examination, June 2005
Instructions: there are SIXTY questions (the pass mark is 30 out of 60). For each question, please select a maximum of ONE of the given answers (either…

Generalized, Linear, and Mixed Models
Charles E. McCulloch, Shayle R. Searle, Departments of Statistical Science and Biometrics, Cornell University. A Wiley-Interscience Publication, John Wiley & Sons, Inc., New…

UNIVERSITY of PENNSYLVANIA, CIS 520: Machine Learning, Final, Fall 2014
Exam policy: this exam allows two one-page, two-sided cheat sheets (i.e., 4 sides); no other materials. Time: 2 hours. Be sure to write…

Holdout and Cross-Validation Methods; Overfitting Avoidance
Decision trees: reduced-error pruning, cost-complexity pruning. Neural networks: early stopping, adjusting regularizers via cross-validation. Nearest…
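One ingredient common to the overfitting-avoidance methods listed above is the k-fold split itself. A minimal sketch (contiguous folds, no shuffling; the function name and sizes are illustrative, not from any of the listed slides):

```python
# Generate k-fold cross-validation index splits for n examples: each fold
# serves once as the held-out test set while the rest form the training set.
def kfold_indices(n, k):
    """Partition range(n) into k contiguous folds; yield (train, test) pairs."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size

splits = list(kfold_indices(10, 3))
```

Each of the 10 indices appears in exactly one test fold, and every (train, test) pair covers all of the data; averaging a model's test error over the k folds gives the cross-validation estimate.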
Lecture 24: Other (Non-linear) Classifiers: Decision Tree Learning, Boosting, and Support Vector Classification
Instructor: Prof. Ganesh Ramakrishnan, October 20, 2016. Decision trees: cascade of step…

Clustering VS Classification (MCQ)
1. What is the relation between the distance between clusters and the corresponding class discriminability? (a) proportional; (b) inversely proportional; (c) no relation. Ans:…

MIDTERM: CS 6375
Instructor: Vibhav Gogate, October 23, 2013. The exam is closed book. You are allowed a one-page cheat sheet. Answer the questions in the spaces provided on the question sheets. If you run…

Neural Networks: Single-layer neural network
CSE 446: Machine Learning, Emily Fox, University of Washington, March 10, 2017. Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer). Perceptron as a neural…

Experimental Design and Data Analysis for Biologists
Gerry P. Quinn (Monash University), Michael J. Keough (University of Melbourne). Cambridge University Press. Contents: Preface, page xv; I Introduction, 1; 1.1…

Introduction to Graphical Models
The 15th Winter School of Statistical Physics, POSCO International Center & POSTECH, Pohang, 2018.1.9 (Tue.). Yung-Kyun Noh. Generalization for prediction; probabilistic…

Algorithmisches Lernen / Machine Learning
Part 1 (Stefan Wermter): Introduction; Connectionist Learning (e.g., Neural Networks); Decision Trees, Genetic Algorithms. Part 2 (Norman Hendrich): Support Vector Machines…

Pattern Classification
All materials in these slides were taken from Pattern Classification (2nd ed.) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons, 2000, with the permission of the authors…

Cheng Soon Ong & Christian Walder
Research Group and College of Engineering and Computer Science, Canberra, February-June 2018. Outlines: Overview; Introduction; Linear Algebra; Probability; Linear Regression…

10701/… Machine Learning, Fall 2003, Homework 2 Solution
If you have questions, please contact Jiayong Zhang. 1. (Error Function) The sum-of-squares error is the most common training…
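The sum-of-squares error mentioned in that homework snippet has a closed-form minimizer for linear models, via the normal equations w = (XᵀX)⁻¹Xᵀy. A minimal sketch with an invented, exactly-fittable dataset:

```python
# Minimize the sum-of-squares error ||Xw - y||^2 in closed form by solving
# the normal equations (X^T X) w = X^T y.
import numpy as np

X = np.array([[1.0, 0.0],      # first column is the bias feature
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])  # exactly y = 1 + 2x, so the fit is perfect

w = np.linalg.solve(X.T @ X, X.T @ y)     # least-squares weights
residual = np.sum((X @ w - y) ** 2)       # sum-of-squares error at w
```

Since the data lie exactly on a line, the recovered weights are the intercept 1 and slope 2 and the residual error is (numerically) zero; with noisy data the same formula gives the least-squares compromise instead.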
BOLTZMANN MACHINES
Generative models; graphical models. A graph contains a set of nodes (vertices) connected by links (edges or arcs). In a probabilistic graphical model, each node represents a random variable,…

Machine Learning
CUNY Graduate Center, Spring 2013. Lectures 11-12: Unsupervised Learning 1 (clustering: k-means, EM, mixture models). Professor Liang Huang (huang@cs.qc.cuny.edu, http://acl.cs.qc.edu/~lhuang/teaching/machine-learning)…
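Of the clustering methods named in that lecture listing, k-means is the simplest to sketch: alternate between assigning each point to its nearest center and moving each center to the mean of its assigned points (Lloyd's algorithm). The 1-D data and initial centers below are made up for illustration.

```python
# Lloyd's algorithm for k-means with k = 2 on a tiny 1-D dataset.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centers = [0.0, 10.0]                       # initial guesses for the centers

for _ in range(10):                          # alternate assign / update steps
    clusters = [[], []]
    for p in points:
        # assignment step: nearest center by squared distance
        j = min(range(2), key=lambda c: (p - centers[c]) ** 2)
        clusters[j].append(p)
    # update step: each center moves to the mean of its cluster
    centers = [sum(c) / len(c) if c else centers[i]
               for i, c in enumerate(clusters)]
```

On this well-separated data the centers converge in one iteration to the two cluster means, 1.0 and 8.0; with harder data the loop would keep refining until assignments stop changing.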
Final Overview: Introduction to ML
Marek Petrik, 4/25/2017. This course: Introduction to Machine Learning; build a foundation for practice and research in ML. Basic machine learning concepts: max likelihood,…

Multivariate statistical methods and data mining in particle physics
RHUL Physics (www.pp.rhul.ac.uk/~cowan). Academic Training Lectures, CERN, 16-19 June 2008. Outline: statement of the problem; some general…

Unsupervised Learning: Bayesian Model Comparison
Zoubin Ghahramani (zoubin@gatsby.ucl.ac.uk), Gatsby Computational Neuroscience Unit, and MSc in Intelligent Systems, Dept. Computer Science, University College…

Classification
CE-717: Machine Learning, Sharif University of Technology. M. Soleymani, Fall 2012. Topics: discriminant functions; logistic regression; perceptron; generative models; generative vs. discriminative…

Bayesian Networks: Construction, Inference, Learning and Causal Interpretation
Volker Tresp, Summer 2016. Introduction: so far we were mostly concerned with supervised learning; we predicted one or several…

Final Exam, 10-701 Machine Learning, Spring 2009
The exam is open book, open notes, no electronics other than calculators. The maximum possible score on this exam is 100. You have 3…

Discriminant Analysis and Statistical Pattern Recognition
Geoffrey J. McLachlan, Department of Mathematics, The University of Queensland, St. Lucia, Queensland, Australia. A Wiley-Interscience Publication…

CS 189, Fall 2015: Introduction to Machine Learning, Final
The exam is closed book, closed notes except your one-page cheat sheet. Please do not turn over the page before you are instructed to do so. You have 2 hours and 50 minutes. Please write your initials on the top-right…

10-601 Machine Learning, Midterm Exam
Instructors: Tom Mitchell, Ziv Bar-Joseph. Wednesday 12th December, 2012. There are 9 questions, for a total of 100 points. This exam has 20 pages; make sure you have…

Machine Learning, Lecture 12: Neural Networks (30.11.2017)
Bastian Leibe, RWTH Aachen (http://www.vision.rwth-aachen.de, leibe@vision.rwth-aachen.de). Course outline: fundamentals; Bayes decision theory; probability…

ADAPTIVE FILTER THEORY, Fourth Edition
Simon Haykin, Communications Research Laboratory, McMaster University, Hamilton, Ontario, Canada. Prentice Hall, Upper Saddle River, New Jersey 07458. Preface…

NONLINEAR CLASSIFICATION AND REGRESSION
J. Elder, CSE 4404/5327 Introduction to Machine Learning and Pattern Recognition. Outline: multi-layer perceptrons; the back-propagation learning algorithm; generalized linear models; radial basis function…

Information Dynamics: Foundations and Applications
Gustavo Deco, Bernd Schürmann. With 89 illustrations. Springer. Preface, vii; Chapter 1: Introduction, 1; Chapter 2: Dynamical Systems: An Overview, 7; 2.1 Deterministic…

Final Exam, Fall 2002 (15-781)
1. Write your name and your Andrew email address below. Name: … Andrew ID: … 2. There should be 17 pages in this exam (excluding this cover sheet). 3. If you need more room to work…

Artificial Neural Networks
鮑興國, Ph.D., National Taiwan University of Science and Technology. Outline: perceptrons; gradient descent; multilayer networks; backpropagation; hidden layer representations; examples…
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Nonlinearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Nonlinearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationSTA414/2104. Lecture 11: Gaussian Processes. Department of Statistics
STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations
More informationNeed for Deep Networks Perceptron. Can only model linear functions. Kernel Machines. Nonlinearity provided by kernels
Need for Deep Networks Perceptron Can only model linear functions Kernel Machines Nonlinearity provided by kernels Need to design appropriate kernels (possibly selecting from a set, i.e. kernel learning)
More informationBayesian Networks: Construction, Inference, Learning and Causal Interpretation. Volker Tresp Summer 2014
Bayesian Networks: Construction, Inference, Learning and Causal Interpretation Volker Tresp Summer 2014 1 Introduction So far we were mostly concerned with supervised learning: we predicted one or several
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More information6.036 midterm review. Wednesday, March 18, 15
6.036 midterm review 1 Topics covered supervised learning labels available unsupervised learning no labels available semisupervised learning some labels available  what algorithms have you learned that
More informationIntroduction to Machine Learning
Introduction to Machine Learning Thomas G. Dietterich tgd@eecs.oregonstate.edu 1 Outline What is Machine Learning? Introduction to Supervised Learning: Linear Methods Overfitting, Regularization, and the
More informationStatistical and Inductive Inference by Minimum Message Length
C.S. Wallace Statistical and Inductive Inference by Minimum Message Length With 22 Figures Springer Contents Preface 1. Inductive Inference 1 1.1 Introduction 1 1.2 Inductive Inference 5 1.3 The Demise
More informationFINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE
FINAL EXAM: FALL 2013 CS 6375 INSTRUCTOR: VIBHAV GOGATE You are allowed a twopage cheat sheet. You are also allowed to use a calculator. Answer the questions in the spaces provided on the question sheets.
More informationSTA 4273H: Statistical Machine Learning
STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 11 Project
More informationTesting Statistical Hypotheses
E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I SmallSample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions
More informationEMalgorithm for Training of Statespace Models with Application to Time Series Prediction
EMalgorithm for Training of Statespace Models with Application to Time Series Prediction Elia Liitiäinen, Nima Reyhani and Amaury Lendasse Helsinki University of Technology  Neural Networks Research
More informationStatistical Learning Reading Assignments
Statistical Learning Reading Assignments S. Gong et al. Dynamic Vision: From Images to Face Recognition, Imperial College Press, 2001 (Chapt. 3, hard copy). T. Evgeniou, M. Pontil, and T. Poggio, "Statistical
More informationECE 521. Lecture 11 (not on midterm material) 13 February Kmeans clustering, Dimensionality reduction
ECE 521 Lecture 11 (not on midterm material) 13 February 2017 Kmeans clustering, Dimensionality reduction With thanks to Ruslan Salakhutdinov for an earlier version of the slides Overview Kmeans clustering
More informationDeep Neural Networks
Deep Neural Networks DT2118 Speech and Speaker Recognition Giampiero Salvi KTH/CSC/TMH giampi@kth.se VT 2015 1 / 45 Outline StatetoOutput Probability Model Artificial Neural Networks Perceptron Multi
More informationStatistical learning. Chapter 20, Sections 1 4 1
Statistical learning Chapter 20, Sections 1 4 Chapter 20, Sections 1 4 1 Outline Bayesian learning Maximum a posteriori and maximum likelihood learning Bayes net learning ML parameter learning with complete
More informationIndex. Santanu Pattanayak 2017 S. Pattanayak, Pro Deep Learning with TensorFlow,
Index A Activation functions, neuron/perceptron binary threshold activation function, 102 103 linear activation function, 102 rectified linear unit, 106 sigmoid activation function, 103 104 SoftMax activation
More informationMachine Learning Lecture 7
Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant
More informationCSC411: Final Review. James Lucas & David Madras. December 3, 2018
CSC411: Final Review James Lucas & David Madras December 3, 2018 Agenda 1. A brief overview 2. Some sample questions Basic ML Terminology The final exam will be on the entire course; however, it will be
More informationClassification: The rest of the story
U NIVERSITY OF ILLINOIS AT URBANACHAMPAIGN CS598 Machine Learning for Signal Processing Classification: The rest of the story 3 October 2017 Today s lecture Important things we haven t covered yet Fisher
More informationContinuous Univariate Distributions
Continuous Univariate Distributions Volume 2 Second Edition NORMAN L. JOHNSON University of North Carolina Chapel Hill, North Carolina SAMUEL KOTZ University of Maryland College Park, Maryland N. BALAKRISHNAN
More informationNeural Networks (Part 1) Goals for the lecture
Neural Networks (Part ) Mark Craven and David Page Computer Sciences 760 Spring 208 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed
More informationClassification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees
Classification for High Dimensional Problems Using Bayesian Neural Networks and Dirichlet Diffusion Trees Rafdord M. Neal and Jianguo Zhang Presented by Jiwen Li Feb 2, 2006 Outline Bayesian view of feature
More information