A Comparison of Probabilistic, Neural, and Fuzzy Modeling in Ecotoxicity

Giuseppina Gini, Marco Giumelli (DEI, Politecnico di Milano, piazza L. da Vinci 32, Milan, Italy)
Emilio Benfenati, Nadège Piclin (Istituto Mario Negri, via Eritrea 62, Milan, Italy)
Jacques Chrétien, Marco Pintore (Lab Chemometrics & Bioinformatics, University of Orléans, France)

Abstract. The common practice in inducing toxicity models from data is regression analysis. The predictive power of such models is usually poor when the molecules and toxicity end-points are very different. In the present work we study toxicity classification of pesticides: we discuss knowledge representation, and we test probabilistic and soft-computing techniques. We conclude with the interpretability of the induced models.

1. Introduction

The study of the consequences of chemicals on the health of human beings and wildlife requires new computer-based approaches to analyse complex information and to automatically discover knowledge implicitly contained in data [4]. The goal of toxicity prediction is to describe the relationship between chemical properties and biological and toxicological processes. So far, most of the studies are Quantitative Structure-Activity Relationship (QSAR) models, which predict a continuous value [5]. However, due to the variability of the toxicity phenomenon [1], classification may present advantages: it determines intervals of the toxic effect, and it has direct application in the regulation of chemicals. In the present study we report on different methods to build models for aquatic toxicity prediction: linear and non-linear regression, Bayesian networks [7], Self-Organizing Maps (SOM) [8], and Support Vector Machines (SVM) [2]. Finally, fuzzy logic [12] is used. Some major points are discussed:
- Choice of the chemical class before studying toxic activity: it is difficult to obtain good predictive models for heterogeneous sets of compounds;
- Reliability of prediction: neither the mean square error nor the correlation is a good measure of the predictive power of a model. A commonly used validation method is the leave-one-out (LOO) cross-validated coefficient $R^2_{cv} = 1 - \mathrm{PRESS}/SS_Y$, where PRESS is the prediction error sum of squares of the model and $SS_Y$ is the sum of squares for the Y variable. With this method, results are optimistic compared with validation on an external set; however, a major problem in toxicity prediction is that the experimental data are too few to split into two sets (training and test);
- Molecule description: our hypothesis is to consider the whole molecule and not some of its fragments.

The road proposed here is to see how to use the available knowledge, not to produce a symbolic system to integrate the induced models.

2. Knowledge Representation: Eco-Toxicity and Molecules

In QSAR models, log P is commonly used to model the bioaccumulation potential of chemicals, but alone it is not enough to model toxicity. Two of the open problems in QSAR are the definition of the relevant molecular descriptors and the availability of general models. We faced these problems by studying pesticides. Structures and toxicity data of pesticides have been obtained from "The Pesticide Manual" (1997). The lethal concentration for 50% of the animals (LC50) is the chosen end-point. A set of 235 pesticides (Table 1) has been obtained and checked for rat and aquatic toxicity. Modelling has been done on the whole data set or on chemically homogeneous subsets, which share the same biological mechanism.
The correlation analysis of the toxicity data is in Table 2. As for chemical knowledge, the practice in AI has been to represent molecules as graphs: nodes represent atoms, arcs the bonds. This view of chemical structures is too weak: graphs capture only the planar topology of the molecule and cannot account for the 3D structure or the formation energy. Our model building uses more chemical information, as shown in Figure 1.

Table 1: Number of compounds for each chemical class (anilines, aromatic halogenated, carbamates, heterocycles, organophosphorus, ureas, other classes), with total, training set, and test set counts; 235 compounds overall, 165 in the training set and 70 in the test set.

Table 2: Correlation matrix of LC50 (correlation coefficient r) among quail, trout, daphnia, and duck toxicities.

Figure 1: Construction of the models. Training set (165 structures) and test set (70 structures); 3D representation and PM3 energy minimization in HyperChem 5.0; log D from Pallas 2.1; quantum-chemical, geometrical, topological, and electrostatic CODESSA descriptors; selection of descriptors by PCA, GA, etc.; model construction and prediction by Adaptive Fuzzy Partition (AFP).
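Before moving to the regression results, a minimal sketch of the validation statistic may help. The Python function below computes the LOO cross-validated coefficient R²cv = 1 − PRESS/SS_Y defined in the Introduction; it is illustrative only, not the authors' code, and the scikit-learn-style `model` argument and array names are assumptions.

```python
import numpy as np

def q2_loo(model, X, y):
    """LOO cross-validated coefficient R2_cv = 1 - PRESS/SS_Y.

    model: any regressor with fit/predict (scikit-learn style, assumed);
    X: (n, d) matrix of molecular descriptors; y: log LC50 values.
    """
    n = len(y)
    press = 0.0
    for i in range(n):
        mask = np.arange(n) != i              # leave molecule i out
        model.fit(X[mask], y[mask])
        y_hat = model.predict(X[i:i + 1])[0]  # predict the held-out molecule
        press += (y[i] - y_hat) ** 2          # prediction error sum of squares
    ss_y = float(np.sum((y - np.mean(y)) ** 2))  # sum of squares for Y
    return 1.0 - press / ss_y
```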

3. Regression studies

3.1. Multivariate Analysis

Data have been analysed using principal component regression (PCR) and partial least squares (PLS). Table 3 shows that, even in internal validation, no results were obtained when modelling all pesticides together. PLS models on single chemical classes showed variable results: no toxicological end-point was predicted for all chemical classes, and no chemical class gave good results for all end-points (Table 3).

Table 3: Linear regression for LC50 using PLS (end-points: rainbow trout, daphnia, rat, duck, quail); figures indicate R²cv when > 0.5. Carbamates gave no results for any end-point; organophosphorus compounds reached R²cv = 0.69 for daphnia and nothing elsewhere; heterocyclic compounds reached 0.56 for daphnia and 0.55 for duck; most other class/end-point combinations gave no results.
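As an illustration of the internal validation just described, the sketch below fits a PLS model on one chemical class and computes R²cv in LOO. It is a sketch under assumptions: scikit-learn stands in for the chemometrics software actually used, and the data are random placeholders.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict

# Placeholder data standing in for one chemical class: a descriptor
# matrix X and the corresponding log LC50 values y.
rng = np.random.default_rng(0)
X, y = rng.random((30, 20)), rng.random(30)

# LOO predictions, then PRESS and R2_cv = 1 - PRESS/SS_Y (reported when > 0.5).
y_hat = cross_val_predict(PLSRegression(n_components=3), X, y, cv=LeaveOneOut())
press = float(np.sum((y - y_hat.ravel()) ** 2))
r2_cv = 1.0 - press / float(np.sum((y - y.mean()) ** 2))
print(f"R2_cv = {r2_cv:.2f}")
```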

3.2. NN

We used NNs for non-linear regression. Descriptor selection was necessary. To study the input variables statistically, Principal Component Analysis (PCA) has been used, since it permits evaluating the correlation and relevance of the variables, viewing the multivariate information characterizing the objects in a two-dimensional orthogonal space, synthesising the data, and eliminating noise. The loading plot identifies the role of each variable (molecular descriptor) with respect to the two principal components considered, and their direct or inverse correlations. Considering the loading plots on components I and II, the 15 variables with the greatest weight have been selected. The same approach on components III and IV yielded a selection of six further descriptors. The total variance expressed by the four components is 90%. With the selected variables we built a predictor with a back-propagation NN. Results were good in LOO, but not on an external validation set of 13 molecules. A problem may be the coupling of PCA and NN: PCA compresses data on the basis of linear behaviour and does not take into account the non-linearity assumed for aquatic toxicity. So we switched from regression to classification, a matter addressed also in other studies [11].

4. Classifiers

We used the log values of LC50, scaled between 0 and 1, and divided this interval into four classes (ClassA, ClassB, ClassC, ClassD).

4.1. Bayesian

Bayes' theorem expresses a way to update beliefs in a hypothesis given additional evidence and the background context. Bayesian networks are a graphical formalism for reasoning about probability distributions: they use a directed acyclic graph (DAG) to encode conditional independence assumptions about the domain. Each variable is represented as a node, and an arc between two nodes denotes a direct probabilistic dependency between the two variables. A Bayesian network classifier contains a node C for the class variable and a node X for each domain feature. Given an instance vector x, the network computes the probability P(C = c_k | X = x) for each class c_k. If the true distribution P(C | X) were known, we would achieve the optimal classification by selecting the class c_k for which this probability is maximum. Unfortunately, the true distribution can only be approximated from the training set, a process that is NP-hard for general networks; we therefore chose the Naïve Bayes classifier, in which all feature nodes are directly connected to the class node. Table 4 shows the results obtained in LOO with the Bayda software.

Table 4: Correct classification (%) by Bayda using LOO, with all or with selected descriptors, for trout, daphnia, and rat, on anilines, carbamates, heterocycles, halogenated aromatics, organophosphorus compounds, and ureas.
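A minimal sketch of the Naïve Bayes step: scikit-learn's GaussianNB is used here as a stand-in for Bayda (an assumption), with random placeholder data; the four class labels correspond to the intervals of scaled log LC50 defined above.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import LeaveOneOut

# Placeholder data: X holds (selected) molecular descriptors, y the four
# toxicity classes (0-3 for ClassA-ClassD) from scaled log LC50 values.
rng = np.random.default_rng(1)
X, y = rng.random((40, 6)), rng.integers(0, 4, 40)

correct = 0
for tr, te in LeaveOneOut().split(X):
    clf = GaussianNB().fit(X[tr], y[tr])   # every feature node -> class node
    correct += int(clf.predict(X[te])[0] == y[te][0])
print(f"LOO correct classification: {100 * correct / len(y):.0f}%")
```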
4.2. Self-Organizing Maps (SOM)

Vector Quantization (VQ) networks are unsupervised density estimators. Each competitive unit corresponds to a cluster, whose centre is called a "codebook vector". Kohonen's learning law finds the codebook vector closest to each training case and moves this "winning" codebook vector closer to the training case, according to a learning rate. In a SOM, the neurons (clusters) are organized into a grid. The grid exists in a space separate from the input space; any number of inputs may be used, as long as the number of inputs is greater than the dimensionality of the grid space. A SOM tries to find clusters such that any two clusters that are close to each other in the grid space have codebook vectors close to each other in the input space. After the SOMs had been trained, the quality of prediction was evaluated on the mapping precision (Table 5). With an external validation set, however, no results could be obtained; for this reason, subsets of chemical classes were considered for daphnia (Table 6). The results are correct (about 70%) but difficult to generalize to an external validation set.

Table 5: Correct classification (%) obtained with SOM using LOO for all the compounds on five species (daphnia, duck, quail, rat, trout).

Table 6: Comparison between the percentages obtained with LOO and with the validation set for daphnia (SOM), on anilines, aromatic halogenated compounds, carbamates, heterocycles, organophosphorus compounds, and ureas.
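One common way to turn a trained SOM into a classifier is to label each grid cell by the majority class of the training cases it wins, then classify new molecules by their winning cell. The sketch below uses the third-party MiniSom package as an assumption (the paper does not name its SOM tool) and random placeholder data.

```python
import numpy as np
from collections import Counter, defaultdict
from minisom import MiniSom  # third-party package, assumed: pip install minisom

# Placeholder descriptor data mirroring the 165/70 training/test split.
rng = np.random.default_rng(2)
X_train, y_train = rng.random((165, 6)), rng.integers(0, 4, 165)
X_test = rng.random((70, 6))

som = MiniSom(8, 8, input_len=6, sigma=1.5, learning_rate=0.5, random_seed=0)
som.train_random(X_train, num_iteration=5000)

# Majority class of the training cases mapped to each grid cell.
by_cell = defaultdict(list)
for x, c in zip(X_train, y_train):
    by_cell[som.winner(x)].append(c)
cell_label = {cell: Counter(cs).most_common(1)[0][0] for cell, cs in by_cell.items()}

# A test molecule gets the label of its winning cell (None if the cell is empty).
predictions = [cell_label.get(som.winner(x)) for x in X_test]
```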

4.3. Support Vector Machines (SVM)

A common problem of neural networks is overfitting. SVMs try to avoid it by finding the separating hyperplane with maximal distance (margin) from the data in the feature space. The hyperplane is represented by an expansion of a subset of the training data, the support vectors. Support Vector Machines non-linearly map their n-dimensional input space into a high-dimensional feature space, where a linear classifier is constructed. Two features make this approach successful: its generalisation ability, and its simplicity (constructing the classifier only requires evaluating an inner product between two vectors of the training data). For classification, SVMs operate by finding a hypersurface in the space of possible inputs that splits the positive examples from the negative ones. The split is chosen so as to have the largest distance from the hypersurface to the nearest positive and negative examples. Although the separating hyperplane is linear, it lies in a feature space induced by a kernel, and hence can separate data which are not linearly separable in the input space. For our experiments (Table 7) we applied the MatLab OSU_SVM toolbox.

Table 7: Comparison between the percentages obtained with LOO and with the validation set for daphnia (SVM), on anilines, aromatic halogenated compounds, carbamates, heterocycles, organophosphorus compounds, and ureas.
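A minimal classification sketch in the same spirit, with scikit-learn's SVC standing in for the OSU_SVM toolbox (an assumption) and random placeholder data; the RBF kernel supplies the non-linear feature map discussed above.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

# Placeholder data standing in for one chemical class of pesticides.
rng = np.random.default_rng(3)
X, y = rng.random((35, 6)), rng.integers(0, 4, 35)

# The RBF kernel induces the feature space; C trades margin width
# against training errors. SVC handles multi-class via one-vs-one.
clf = SVC(kernel="rbf", C=10.0, gamma="scale")
loo_accuracy = cross_val_score(clf, X, y, cv=LeaveOneOut()).mean()
print(f"LOO correct classification: {100 * loo_accuracy:.0f}%")
```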
4.4. The new Adaptive Fuzzy Partition (AFP) classifier

AFP is a supervised classification method [9]. It models the relations between molecular descriptors and chemical activities by dynamically dividing the descriptor space into a set of fuzzy partitioned subspaces. In a first phase, the global descriptor hyperspace is cut into two subspaces, in which the fuzzy rules are derived. These subspaces are divided step by step into smaller subspaces, until certain conditions are satisfied, namely: the number of molecular vectors within a subspace reaches a minimum threshold; the difference between two generated subspaces is negligible in terms of chemical activities; or the number of subspaces exceeds a maximum threshold. The aim of the algorithm is to select the descriptor and the cut position which give the maximal difference between the two fuzzy rule scores generated by the new subspaces. The score is determined by the weighted average of the chemical activity values in an active subspace A and in its neighbouring subspaces. If the number of trial cuts per descriptor is N_cut and the number of descriptors is N, the number of trial partitions equals (N_cut + 1) × N. Only the best cut is selected to subdivide the original subspace.

All the rules created during the fuzzy procedure are considered to establish the model between the descriptor hyperspace and the chemical activities. Denoting by P(x_1, x_2, ..., x_N) a molecular vector, a rule for a subspace S_k is defined by: if x_1 is associated with µ_1k(x_1), and x_2 with µ_2k(x_2), ..., and x_N with µ_Nk(x_N), then the score of the activity O for P is O_kP, where x_i is the value of the i-th descriptor for the molecule P, µ_ik is the membership function of subspace k for descriptor i, and O_kP is the activity value related to the subspace S_k. The "and" is represented by the Min operator, and the membership functions have trapezoidal shapes. If w_i is the width of a subspace S_k along the i-th dimension after each cut, the parameters p and q defining the shape of the trapezoid are calculated as p = λ_i w_i and q = ν_i w_i, where the two parameters λ_i and ν_i vary so that p ≤ 1 and q ≤ 1.

After establishing the AFP model, a centroid defuzzification procedure [5] determines the chemical activity of a new test molecule. All the subspaces k are considered, and the membership degree for the activity O of a molecule P_j is:

$$O(P_j) = \frac{\sum_{k=1}^{N_{subsp}} \Big( \min_{i=1,\dots,N} \mu_{ik}(x_i)\big|_{P_j} \Big)\, O_k}{\sum_{k=1}^{N_{subsp}} \Big( \min_{i=1,\dots,N} \mu_{ik}(x_i)\big|_{P_j} \Big)}$$

where µ_ik(x_i)|P_j is the membership function for descriptor i evaluated on molecule P_j, N is the total number of molecular descriptors, N_subsp the total number of subspaces, and O_k the global score of the activity in the subspace S_k.
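The defuzzification formula translates directly into code. The sketch below is one plausible reading of it, not the authors' implementation: the rule encoding is hypothetical, with each subspace carrying, per descriptor, trapezoid parameters and a global activity score O_k.

```python
def trapezoid(x, a, p, q, b):
    """Trapezoidal membership over [a, b]: rises over width p, plateau at 1,
    falls over width q (p and q as in p = lambda_i * w_i, q = nu_i * w_i)."""
    if x <= a or x >= b:
        return 0.0
    if x < a + p:
        return (x - a) / p
    if x > b - q:
        return (b - x) / q
    return 1.0

def afp_activity(x, rules):
    """Centroid defuzzification: O(P) = sum_k(min_i mu_ik(x_i) * O_k)
    / sum_k(min_i mu_ik(x_i)). `rules` is a list of (bounds, O_k) pairs,
    where bounds[i] = (a, p, q, b) for descriptor i (hypothetical encoding)."""
    num = den = 0.0
    for bounds, o_k in rules:
        mu = min(trapezoid(xi, *bd) for xi, bd in zip(x, bounds))  # Min = "and"
        num += mu * o_k
        den += mu
    return num / den if den > 0 else None
```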

The full set of the pesticides was used for the prediction of toxicity against the trout, considering two different classifications. The first includes the four toxicity classes established by EU Directive 92/32/EEC; the second defines three toxicity classes, so as to include a similar number of compounds in each training-set class. The relevant descriptors were selected by an innovative procedure based on genetic algorithms [10]. After establishing the AFP model on the training set, the toxicity classes for the test set were predicted. The statistical results for the validation tests are reported in Table 8. For the homogeneous intervals, the AFP model predicts the correct activity for 71% of the test set compounds. Similar results are obtained for the training set, confirming the goodness of the model. Moreover, class 3, which includes the high-toxicity compounds, was predicted best (86% correct). In the case of the EU intervals, AFP established a model able to predict correctly 60% of the test set compounds and 78% of the training set compounds (Table 8). Again, the most toxic class was better predicted (69%). AFP builds up a scheme of the rules used for each toxicity class, for example: if 0 < x(log D-pH5) < 0.26 and 0 < x(Balaban index) < 0.51 and x(Randic index) > …, then the membership degree of class 1 for compound 34 is 0.5.

Table 8: Statistical values (%) for the classification established on the homogeneous and the EU intervals: LC50 ranges (mg/l) and percentages of correct classification in the training and test sets for each class (classes 1-4 for the EU intervals, classes 1-3 for the homogeneous intervals).

5. Conclusions

Specific problems for ecotoxicity are the incomplete knowledge of toxicity mechanisms, the quality of the experimental data, which are affected by high variability, and their limited number. Here we show results with LOO and with a validation set, and observe that LOO is quite optimistic, probably because of the high discontinuity of the function to be learned. AFP shows the best capability to generalize, and allows deriving rules which could give insights into the biochemical processes.

Acknowledgments

We acknowledge the financial contribution of the European Commission, project HPRN-CT.

References
[1] Benfenati, E., Grasso, P., Pelagatti, S. and Gini, G. On variables and variability in predictive toxicology. IV Girona Seminar on Molecular Similarity, July 5-7, 1999, Girona, Spain.
[2] Cristianini, N. and Shawe-Taylor, J. An Introduction to Support Vector Machines. Cambridge Univ. Press.
[3] Gini, G., Lorenzini, M., Benfenati, E., Grasso, P. and Bruschi, M. Predictive Carcinogenicity: a Model for Aromatic Compounds, with Nitrogen-containing Substituents, Based on Molecular Descriptors Using an Artificial Neural Network. J. Chem. Inf. Comput. Sci., 39 (1999).
[4] Gini, G. and Katritzky, A. (eds). Predictive Toxicology of Chemicals: Experiences and Impact of AI Tools. SS-99-01, AAAI Press, Menlo Park, California.
[5] Gupta, M. M. and Qi, J. Theory of T-norms and fuzzy inference methods. Fuzzy Sets and Systems, 40 (1991).
[6] Hansch, C., Hoekman, D., Leo, A., Zhang, L. and Li, P. The expanding role of quantitative structure-activity relationships (QSAR) in toxicology. Toxicology Letters, 79 (1995).
[7] Heckerman, D., Geiger, D. and Chickering, D.
Learning Bayesian Networks: the combination of knowledge and statistical data. Machine Learning, 20 (1995) 3.
[8] Kohonen, T. The self-organizing map. Neurocomputing, 21 (1998) 1-6.
[9] Pintore, M., Ros, F., Audouze, K. and Chrétien, J.R. Adaptive Fuzzy Partition in Data Base Mining.
[10] Ros, F., Pintore, M. and Chrétien, J.R. Molecular Descriptor Selection Combining Genetic Algorithms and Fuzzy Logic. Submitted.
[11] Torgo, L. and Gama, J. Regression Using Classification Algorithms. Intelligent Data Analysis.
[12] Zadeh, L.A. Fuzzy sets and their applications to classification and clustering. In Van Ryzin, J. (ed.), Classification and Clustering. Academic Press, New York, 1977.
