Inductive Bias: How to generalize on novel data


1 Inductive Bias: How to generalize on novel data

2 Overfitting: Noise vs. Exceptions

3 Non-Linear Tasks
- Linear regression will not generalize well to the task below; it needs a non-linear surface
- Could do a feature pre-process as with the quadric machine
- For example, we could use an arbitrary polynomial in x. It is still linear in the coefficients, and can be solved with the delta rule, etc.
  Y = β_0 + β_1 X + β_2 X² + … + β_n Xⁿ
- What order polynomial should we use? Overfit issues can occur
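
As an illustration of the feature pre-process idea above, here is a minimal sketch (assuming NumPy is available; the toy data and the polynomial orders are made up for the example, not taken from the slides) that expands x into polynomial features and then solves the still-linear-in-the-coefficients problem with ordinary least squares:

```python
import numpy as np

def polynomial_features(x, order):
    """Expand a 1-D input into columns [1, x, x^2, ..., x^order]."""
    return np.vander(x, N=order + 1, increasing=True)

# Made-up noisy non-linear data for illustration
rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 30)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.shape)

for order in (1, 3, 9):
    X = polynomial_features(x, order)
    # Least-squares fit of the coefficients beta_0 ... beta_order
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    train_sse = np.sum((X @ beta - y) ** 2)
    print(f"order {order}: training SSE = {train_sse:.3f}")
```

Higher orders always lower the training SSE, which is exactly the overfit risk the slide points out.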

4 Regression Regularization
- How to avoid overfit: keep the model simple; for regression, keep the function smooth
- Inductive bias is that f(x) ≈ f(x ± ε)
- Regularization approach: F(h) = Error(h) + λ Complexity(h); tradeoff accuracy vs. complexity
- Ridge Regression minimizes F(w) = TSS(w) + λ‖w‖² = Σ (predicted_i - actual_i)² + λ Σ w_i²
- Gradient of F(w) gives the update Δw_i = c(t - net)x_i - λw_i (weight decay)
- Especially useful when the features are a non-linear transform of the initial features (e.g. polynomials in x), and also when the number of initial features is greater than the number of examples
- Lasso regression uses an L1 rather than an L2 weight penalty: TSS(w) + λ Σ |w_i|
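
The ridge objective and the weight-decay update above can be turned into a few lines of batch gradient descent. This is a minimal sketch assuming NumPy; the learning rate c, the λ value, and the toy data are illustrative choices, not values from the slides:

```python
import numpy as np

def ridge_gd(X, t, lam=0.1, c=0.01, epochs=500):
    """Gradient descent on F(w) = TSS(w) + lam * ||w||^2.

    Each epoch applies the batch form of the slide's update
    delta_w_i = c * (t - net) * x_i - lam * w_i  (delta rule plus weight decay).
    """
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        net = X @ w                             # current predictions
        w += c * (X.T @ (t - net)) - lam * w    # decay shrinks the weights toward 0
    return w

# Illustrative use: more (e.g. polynomial) features than examples
rng = np.random.default_rng(1)
X = rng.standard_normal((10, 25))
t = rng.standard_normal(10)
print("max |w| with ridge penalty:", np.abs(ridge_gd(X, t)).max())
```

The decay term keeps the weight magnitudes small even though the number of features exceeds the number of examples, which is the setting the slide highlights.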

5 Hypothesis Space
- The hypothesis space H is the set of all the possible models h which can be learned by the current learning algorithm, e.g. the set of possible weight settings for a perceptron
- Restricted hypothesis space: can be easier to search; may avoid overfit since the models are usually simpler (e.g. linear or low-order decision surface); often will underfit
- Unrestricted hypothesis space: can represent any possible function and thus can fit the training set well; mechanisms must be used to avoid overfit

6 Avoiding Overfit - Regularization
- Regularization: any modification we make to a learning algorithm that is intended to reduce its generalization error but not its training error
- Occam's Razor (William of Ockham): simplest accurate model, an accuracy vs. complexity trade-off. Find h ∈ H which minimizes an objective function of the form F(h) = Error(h) + λ Complexity(h). Complexity could be number of nodes, size of tree, magnitude of weights, order of decision surface, etc. L2 and L1 penalties are common.
- More training data (vs. overtraining on the same data). Also data set augmentation (fake data, jitter); can be very effective, but take care
- Denoising: adding random noise to inputs during training can act as a regularizer. Adding noise to nodes, weights, outputs, etc., e.g. Dropout (discussed with ensembles)
- Most common regularization approach: Early Stopping. Start with a simple model (small parameters/weights) and stop training as soon as we attain good generalization accuracy (before parameters get large). Use a validation set (next slide; requires a separate test set)
- Will discuss other approaches with specific models
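
One of the regularizers listed above, adding random noise to the inputs during training (jitter/denoising), is easy to sketch. The helper below assumes NumPy, and the add_jitter name and the train_step call in the usage comment are hypothetical placeholders rather than anything defined in the slides:

```python
import numpy as np

def add_jitter(X, sigma=0.05, rng=None):
    """Return a copy of the inputs with small Gaussian noise added.

    Training on jittered copies encodes the bias f(x) ≈ f(x ± ε):
    the learned function should not change much for tiny input changes.
    """
    rng = np.random.default_rng() if rng is None else rng
    return X + sigma * rng.standard_normal(X.shape)

# Illustrative use inside a training loop (train_step stands in for
# whatever update rule the model uses; it is not defined here):
# for epoch in range(num_epochs):
#     X_noisy = add_jitter(X_train, sigma=0.05)
#     train_step(model, X_noisy, y_train)
```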

7 Stopping/Model Selection with Validation Set
[Plot: SSE vs. epochs (a new h at each epoch), with training-set and validation-set curves]
- There is a different model h after each epoch
- Select a model in the area where the validation set accuracy flattens, i.e. when no improvement occurs over m epochs
- The validation set comes out of the training set data; still need a separate test set to use after selecting model h to predict future accuracy
- Simple and unobtrusive; does not change the objective function, etc.
- Can be done in parallel on a separate processor
- Can be used alone or in conjunction with other regularizers
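
A minimal sketch of the stopping rule described above: keep the snapshot from the epoch with the lowest validation error and stop once no improvement occurs over m epochs. The train_one_epoch and validation_error callables are placeholders for whatever model is being trained, not functions from the slides:

```python
import copy

def train_with_early_stopping(model, train_one_epoch, validation_error,
                              m=10, max_epochs=1000):
    """Return the model snapshot with the lowest validation error.

    Stops once validation error has not improved for m consecutive epochs.
    A separate test set is still needed afterwards to estimate accuracy.
    """
    best_err = float("inf")
    best_model = copy.deepcopy(model)
    epochs_since_improvement = 0
    for epoch in range(max_epochs):
        train_one_epoch(model)                 # produces a new hypothesis h
        err = validation_error(model)
        if err < best_err:
            best_err = err
            best_model = copy.deepcopy(model)  # remember the best h so far
            epochs_since_improvement = 0
        else:
            epochs_since_improvement += 1
            if epochs_since_improvement >= m:
                break                          # validation error has flattened
    return best_model, best_err
```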

8 Inductive Bias
- The approach used to decide how to generalize novel cases
- One common approach is Occam's Razor: the simplest hypothesis which explains/fits the data is usually the best
- Many other rationale biases and variations
[Table of training instances over A, B, C (with various negations) and their outputs Z or its negation, ending with a query row]
- When you get the new input Ā B C, what is your output?

9 One Definition for Inductive Bias
- Inductive Bias: any basis for choosing one generalization over another, other than strict consistency with the observed training instances
- Sometimes just called the bias of the algorithm (don't confuse with the bias weight in a neural network)
- Bias-Variance trade-off: will discuss in more detail when we discuss ensembles

10 Some Inductive Bias Approaches
- Restricted hypothesis space: can just try to minimize error since hypotheses are already simple. Examples: linear or low-order threshold function; k-DNF, k-CNF, etc.; low-order polynomial
- Preference bias: prefer one hypothesis over another even though they have similar training accuracy (Occam's Razor). Examples: smallest DNF representation which matches well; shallow decision tree with high information gain; neural network with low validation error and small-magnitude weights

11 Need for Bias
- There are 2^(2^n) Boolean functions of n inputs
[Table: truth table over x1, x2, x3 with class labels and columns of possible consistent function hypotheses; the class of the final row is unknown (?)]

12 Need for Bias
- (The 2^(2^n) Boolean functions table from the previous slide, filled in incrementally)

13 Need for Bias
- (Table continued; more of the consistent hypotheses are filled in)

14 Need for Bias
- (Table continued)
- Without an inductive bias we have no rationale to choose one hypothesis over another, and thus a random guess would be as good as any other option.

15 Need for Bias
- (Table continued)
- Inductive bias guides which hypothesis we should prefer
- What happens in this case if we use simplicity (Occam's Razor) as our inductive bias? (A small enumeration of this setting is sketched below.)
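
The counting argument on these slides can be made concrete with a short enumeration. The sketch below (the tiny training set is made up for illustration) lists every Boolean function of n = 3 inputs that is consistent with a few observed rows and tallies what those consistent hypotheses predict for a novel input; without a bias, both labels are equally represented:

```python
from itertools import product

n = 3
inputs = list(product((0, 1), repeat=n))          # the 2^n possible rows
# Illustrative training data: class labels for a few observed rows
observed = {(0, 0, 0): 1, (0, 1, 1): 0, (1, 0, 1): 1}
novel = (1, 1, 1)                                  # the row we must generalize to

consistent_predictions = {0: 0, 1: 0}
# A Boolean function of n inputs is just an assignment of 0/1 to each row,
# so there are 2^(2^n) of them in total.
for outputs in product((0, 1), repeat=len(inputs)):
    f = dict(zip(inputs, outputs))
    if all(f[x] == y for x, y in observed.items()):   # consistent with training set
        consistent_predictions[f[novel]] += 1

print("total functions:", 2 ** (2 ** n))
print("consistent with the training set:", sum(consistent_predictions.values()))
print("of those, predictions for the novel row:", consistent_predictions)
```

Flipping the novel row of any consistent function yields another consistent function, so the consistent hypotheses always split evenly between the two possible outputs; only a bias can break the tie.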

16 Learnable Problems
- Raster Screen Problem
- Pattern Theory: regularity in a task; compressibility; don't-care features and impossible states
- Interesting/Learnable Problems: what we actually deal with; can we formally characterize them?
- Learning a training set vs. generalizing
- A function where each output is set randomly (coin-flip): the output class is independent of all other instances in the data set
- Computability vs. Learnability (optional)

17 Computability and Learnability - Finite Problems
- Finite problems assume a finite number of mappings (a finite table), e.g. fixed-input-size arithmetic, random memory in a RAM
- Learnable: can do better than random on novel examples

18 Computability and Learnability - Finite Problems
- Finite problems assume a finite number of mappings (a finite table), e.g. fixed-input-size arithmetic, random memory in a RAM
- Learnable: can do better than random on novel examples
- Finite problems: all are computable; learnable problems are those with regularity

19 Computability and Learnability - Infinite Problems
- Infinite number of mappings (an infinite table), e.g. arbitrary-input-size arithmetic, the Halting Problem (no limit on input size), do two arbitrary strings match

20 Computability and Learnability - Infinite Problems
- Infinite number of mappings (an infinite table), e.g. arbitrary-input-size arithmetic, the Halting Problem (no limit on input size), do two arbitrary strings match
- Infinite problems: learnable problems are those where a reasonably queried infinite subset has regularity; computable problems are only those where all but a finite set of mappings have regularity

21 No Free Lunch
- Any inductive bias chosen will have equal accuracy compared to any other bias over all possible functions/tasks, assuming all functions are equally likely. If a bias is correct on some cases, it must be incorrect on equally many cases.
- Is this a problem?
- Random vs. Regular
- Anti-Bias? (even though regular)
- The Interesting Problems: a subset of learnable?
- Are all functions equally likely in the real world?

22 Interesting Problems and Biases
[Figure: nested sets of All Problems, Structured Problems, and Interesting Problems, with several inductive biases each covering part of the space]

23 More on Inductive Bias
- Inductive bias requires some set of prior assumptions about the tasks being considered and the learning approaches available
- Tom Mitchell's definition: the inductive bias of a learner is the set of additional assumptions sufficient to justify its inductive inferences as deductive inferences
- We consider standard ML algorithms/hypothesis spaces to be different inductive biases: C4.5 (greedy best attributes), Backpropagation (simple to complex), etc.

24 Which Bias is Best?
- There is not one bias that is best on all problems
- Our experiments: over 50 real-world problems; over 400 inductive biases, mostly variations on critical-variable biases vs. similarity biases; different biases were a better fit for different problems
- Given a data set, which learning model (inductive bias) should be chosen? (A simple validation-based selection sketch follows below.)
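
One common practical answer to the question above is to let held-out data pick the bias. This is a minimal sketch, not the lab's actual procedure; the candidate models and the fit and validation_error callables in the usage comment are hypothetical placeholders:

```python
def select_bias(candidates, fit, validation_error):
    """Pick the learning model (inductive bias) with the lowest validation error.

    candidates: dict mapping a name to an untrained model/bias
    fit: callable that trains a model on the training data
    validation_error: callable returning error on held-out data
    """
    scores = {}
    for name, model in candidates.items():
        fit(model)
        scores[name] = validation_error(model)
    best = min(scores, key=scores.get)
    return best, scores

# Illustrative use (the models themselves are not specified in the slides):
# best, scores = select_bias(
#     {"low-order polynomial": poly_model, "decision tree": tree_model},
#     fit=lambda m: m.train(X_train, y_train),
#     validation_error=lambda m: m.error(X_val, y_val),
# )
```

As with early stopping, a separate test set is still needed to estimate the accuracy of whichever bias is selected.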

25 Automatic Discovery of Inductive Bias
- Defining and characterizing the set of interesting/learnable problems
- To what extent do current biases cover the set of interesting problems?
- Automatic feature selection
- Automatic selection of bias (before and/or during learning), including all learning parameters
- Dynamic inductive biases (in time and space)
- Combinations of biases: ensembles, Oracle Learning

26 Dynamic Inductive Bias in Time
- Can be discovered as you learn
- May want to learn general rules first, followed by true exceptions
- Can be based on ease of learning the problem
- Example: SoftProp, from lazy learning to Backprop

27 Dynamic Inductive Bias in Space

28 ML Holy Grail: We want all aspects of the learning mechanism automated, including the inductive bias
[Diagram: inputs (just a data set, with input features and outputs, or just an explanation of the problem) feed an Automated Learner, which produces a Hypothesis]

29 BYU Neural Network and Machine Learning Laboratory Work on Automatic Discovery of Inductive Bias
- Proposing new learning algorithms (inductive biases)
- Theoretical issues: defining the set of interesting/learnable problems; analytical/empirical studies of differences between biases
- Ensembles: Wagging, Mimicking, Oracle Learning, etc.
- Meta-Learning: a priori decision regarding which learning model to use; features of the data set/application; learning from model experience; automatic selection of parameters
- Constructive algorithms: ASOCS, DMPx, etc.
- Learning parameters: windowed momentum, automatically improved distance functions (IVDM)
- Automatic bias in time: SoftProp
- Automatic bias in space: overfitting, sensitivity to complex portions of the space: DMP, higher-order features
