Inductive Bias: How to generalize on novel data. CS Inductive Bias 1
- Beverly Simmons
- 6 years ago
Transcription
1 Inductive Bias: How to generalize on novel data
2 Overfitting: Noise vs. Exceptions
3 Non-Linear Tasks
- Linear regression will not generalize well to the task below; it needs a non-linear surface.
- Could do a feature pre-process as with the quadric machine. For example, we could use an arbitrary polynomial in x. The model is then still linear in the coefficients, and can be solved with the delta rule, etc.
- Y = β0 + β1X + β2X^2 + ... + βnX^n
- What order polynomial should we use? Overfit issues can occur.
CS 478 Inductive Bias 3
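To illustrate that a polynomial feature expansion is still linear in the coefficients - and that higher orders fit the training data ever more closely - here is a minimal sketch (the target function, noise level, and helper names are our own assumptions, not from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=x.shape)

def fit_poly(x, y, order):
    """Least-squares fit of y = b0 + b1*x + ... + bn*x^n.

    The model is non-linear in x but linear in the coefficients,
    so ordinary linear least squares solves it directly."""
    X = np.vander(x, order + 1, increasing=True)  # columns 1, x, x^2, ...
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def training_error(x, y, coeffs):
    X = np.vander(x, len(coeffs), increasing=True)
    return float(np.mean((X @ coeffs - y) ** 2))

# Higher order never raises training error -- but may overfit the noise.
errs = [training_error(x, y, fit_poly(x, y, k)) for k in (1, 3, 9)]
assert errs[0] >= errs[1] >= errs[2]
```

Training error alone cannot pick the order: the order-9 fit looks best on these 20 points even if it generalizes worse.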
4 Regression Regularization
- How to avoid overfit: keep the model simple. For regression, keep the function smooth; the inductive bias is that f(x) ≈ f(x ± ε).
- Regularization approach: F(h) = Error(h) + λ·Complexity(h). Trade off accuracy vs. complexity.
- Ridge Regression minimizes F(w) = TSS(w) + λ‖w‖² = Σ(predicted_i − actual_i)² + λΣw_i²
- Gradient of F(w): Δw_i = c(t − net)x_i − λw_i (weight decay)
- Especially useful when the features are all non-linear transforms of the initial features (e.g. polynomials in x), and also when the number of initial features is greater than the number of examples.
- Lasso regression uses an L1 rather than an L2 weight penalty: TSS(w) + λΣ|w_i|
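The weight-decay update Δw_i = c(t − net)x_i − λw_i can be sketched as per-example gradient descent on a linear unit (a minimal sketch; the data, learning rate c, and λ are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
t = X @ true_w + rng.normal(0, 0.1, size=100)

c, lam = 0.01, 0.1        # learning rate and ridge penalty (illustrative)
w = np.zeros(3)
for _ in range(200):      # epochs of per-example, delta-rule-style updates
    for xi, ti in zip(X, t):
        net = w @ xi                              # linear unit output
        w += c * (ti - net) * xi - c * lam * w    # error term + weight decay

# The penalty shrinks the weights relative to the unregularized solution.
w_ols, *_ = np.linalg.lstsq(X, t, rcond=None)
assert np.linalg.norm(w) < np.linalg.norm(w_ols)
```

The decay term pulls every weight toward zero each step, which is why the penalty is especially useful when there are more features than examples.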
5 Hypothesis Space
- The hypothesis space H is the set of all the possible models h which can be learned by the current learning algorithm, e.g. the set of possible weight settings for a perceptron.
- Restricted hypothesis space: can be easier to search, and may avoid overfit since its hypotheses are usually simpler (e.g. linear or low-order decision surface), but often will underfit.
- Unrestricted hypothesis space: can represent any possible function and thus can fit the training set well, but mechanisms must be used to avoid overfit.
6 Avoiding Overfit - Regularization
- Regularization: any modification we make to a learning algorithm that is intended to reduce its generalization error but not its training error.
- Occam's Razor (William of Ockham): the simplest accurate model - an accuracy vs. complexity trade-off. Find the h ∈ H which minimizes an objective function of the form F(h) = Error(h) + λ·Complexity(h). Complexity could be number of nodes, size of tree, magnitude of weights, order of decision surface, etc. L2 and L1 penalties are common.
- More training data (vs. overtraining on the same data). Also data set augmentation: fake data (e.g. jitter) can be very effective, but take care.
- Denoising: adding random noise to inputs during training can act as a regularizer. Noise can also be added to nodes, weights, outputs, etc., e.g. Dropout (discussed with ensembles).
- Most common regularization approach: Early Stopping. Start with a simple model (small parameters/weights) and stop training as soon as we attain good generalization accuracy (before the parameters get large). Use a validation set (next slide; requires a separate test set).
- Will discuss other approaches with specific models.
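The jitter idea mentioned above - augmenting with slightly perturbed copies of the inputs - can be sketched as follows (the noise scale and function name are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 4))  # stand-in training inputs

def jitter(X, sigma=0.05, rng=rng):
    """Return a copy of X with small Gaussian noise added to each input.

    Training on a fresh jittered copy every epoch acts as a regularizer:
    it enforces the smoothness bias that f(x) should be close to f(x +/- eps)."""
    return X + rng.normal(0, sigma, size=X.shape)

X_aug = jitter(X)
assert X_aug.shape == X.shape
```

Each epoch sees a different perturbed copy, so the model cannot memorize exact input coordinates.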
7 Stopping/Model Selection with Validation Set
- [Figure: SSE vs. epochs (a new h at each), with curves for the validation set and the training set.]
- There is a different model h after each epoch. Select a model in the area where the validation set accuracy flattens - when no improvement occurs over m epochs.
- The validation set comes out of the training set data. A separate test set is still needed, after selecting model h, to predict future accuracy.
- Simple and unobtrusive: does not change the objective function, can be done in parallel on a separate processor, and can be used alone or in conjunction with other regularizers.
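The stopping rule above (keep the hypothesis whose validation error has not improved for m epochs) can be sketched like this (a minimal sketch on synthetic data; the model, m, and learning rate are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 5))
y = X @ rng.normal(size=5) + rng.normal(0, 0.5, size=200)

# The validation set comes out of the training data; a separate test set
# would still be needed to estimate the selected model's future accuracy.
X_tr, y_tr, X_va, y_va = X[:150], y[:150], X[150:], y[150:]

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

w = np.zeros(5)
best_w, best_err, since_best, m = w.copy(), np.inf, 0, 10
for epoch in range(500):
    grad = 2 * X_tr.T @ (X_tr @ w - y_tr) / len(y_tr)
    w -= 0.01 * grad                      # one epoch -> a new hypothesis h
    err = mse(w, X_va, y_va)
    if err < best_err:                    # track the best h seen so far
        best_w, best_err, since_best = w.copy(), err, 0
    else:
        since_best += 1
    if since_best >= m:                   # no improvement over m epochs
        break

assert best_err <= mse(np.zeros(5), X_va, y_va)
```

Note that the selected model is `best_w`, not the final `w`: training may have continued past the flattening point before the rule triggered.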
8 Inductive Bias
- The approach used to decide how to generalize novel cases.
- One common approach is Occam's Razor: the simplest hypothesis which explains/fits the data is usually the best. Many other rational biases and variations exist.
- Training examples of the form ABC → Z, with some literals negated (the negation bars were lost in transcription). When you get the new input Ā B C, what is your output?
9 One Definition for Inductive Bias
- Inductive Bias: any basis for choosing one generalization over another, other than strict consistency with the observed training instances.
- Sometimes just called the bias of the algorithm (don't confuse this with the bias weight in a neural network).
- Bias-variance trade-off: will discuss in more detail when we discuss ensembles.
10 Some Inductive Bias Approaches
- Restricted hypothesis space - can just try to minimize error, since the hypotheses are already simple: linear or low-order threshold function; k-DNF, k-CNF, etc.; low-order polynomial.
- Preference bias - prefer one hypothesis over another even though they have similar training accuracy (Occam's Razor): smallest DNF representation which matches well; shallow decision tree with high information gain; neural network with low validation error and small-magnitude weights.
11 Need for Bias
- There are 2^(2^n) Boolean functions of n inputs.
- [Table: inputs x1 x2 x3, the Class column, and the possible consistent function hypotheses; the table values were lost in transcription.]
12 Need for Bias
- There are 2^(2^n) Boolean functions of n inputs.
- [Same table, with the first training example's class (0) filled in; the remaining values were lost in transcription.]
13 Need for Bias
- There are 2^(2^n) Boolean functions of n inputs.
- [Same table, with the first two training classes (0, 1) filled in; the remaining values were lost in transcription.]
14 Need for Bias
- There are 2^(2^n) Boolean functions of n inputs. [Table as on the previous slides.]
- Without an inductive bias we have no rationale to choose one hypothesis over another, and thus a random guess would be as good as any other option.
15 Need for Bias
- There are 2^(2^n) Boolean functions of n inputs. [Table as on the previous slides.]
- Inductive bias guides which hypothesis we should prefer. What happens in this case if we use simplicity (Occam's Razor) as our inductive bias?
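The counting argument behind these slides can be checked directly: with n = 3 inputs there are 2^(2^3) = 256 Boolean functions, and fixing the labels of k training rows leaves 2^(2^n − k) consistent hypotheses. A small sketch (the enumeration and the example labels are ours):

```python
from itertools import product

n = 3
inputs = list(product([0, 1], repeat=n))       # the 2^n = 8 input rows
# Every Boolean function of n inputs is one way of filling the 8-row
# output column, so there are 2^(2^n) = 256 functions in total.
all_fns = list(product([0, 1], repeat=len(inputs)))
assert len(all_fns) == 2 ** (2 ** n) == 256

# Fix the outputs of k training rows; the other rows are unconstrained,
# so 2^(2^n - k) functions remain consistent -- and without a bias, any
# of them is as good a guess as any other on the unseen rows.
k = 5
training = dict(zip(inputs[:k], [0, 1, 1, 0, 1]))  # illustrative labels
consistent = [f for f in all_fns
              if all(f[inputs.index(x)] == c for x, c in training.items())]
assert len(consistent) == 2 ** (2 ** n - k) == 8
```

The eight surviving hypotheses disagree on every untrained row, which is exactly why strict consistency alone gives no basis for generalization.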
16 Learnable Problems
- Raster Screen Problem
- Pattern Theory: regularity in a task; compressibility; don't-care features and impossible states.
- Interesting/Learnable Problems: what we actually deal with. Can we formally characterize them?
- Learning a training set vs. generalizing: consider a function where each output is set randomly (coin-flip), so the output class is independent of all other instances in the data set.
- Computability vs. Learnability (optional)
17 Computability and Learnability: Finite Problems
- Finite problems assume a finite number of mappings (a finite table): fixed-input-size arithmetic; random memory in a RAM.
- Learnable: can do better than random on novel examples.
18 Computability and Learnability: Finite Problems
- Finite problems assume a finite number of mappings (a finite table): fixed-input-size arithmetic; random memory in a RAM.
- Learnable: can do better than random on novel examples.
- Finite problems are all computable; the learnable ones are those with regularity.
19 Computability and Learnability: Infinite Problems
- Infinite number of mappings (an infinite table): arbitrary-input-size arithmetic; the Halting Problem (no limit on input size); "do two arbitrary strings match?"
20 Computability and Learnability: Infinite Problems
- Infinite number of mappings (an infinite table): arbitrary-input-size arithmetic; the Halting Problem (no limit on input size); "do two arbitrary strings match?"
- Among infinite problems, the learnable ones are those where any reasonably queried infinite subset has regularity; the computable ones are only those where all but a finite set of mappings have regularity.
21 No Free Lunch
- Any inductive bias chosen will have equal accuracy compared to any other bias over all possible functions/tasks, assuming all functions are equally likely. If a bias is correct on some cases, it must be incorrect on equally many cases.
- Is this a problem? Random vs. regular tasks. Anti-bias? (even though regular). Are the interesting problems a subset of the learnable ones? Are all functions equally likely in the real world?
22 Interesting Problems and Biases
- [Diagram: nested sets - All Problems, Structured Problems, Interesting Problems - with several inductive-bias regions P_I covering parts of the interesting problems.]
23 More on Inductive Bias
- Inductive bias requires some set of prior assumptions about the tasks being considered and the learning approaches available.
- Tom Mitchell's definition: the inductive bias of a learner is the set of additional assumptions sufficient to justify its inductive inferences as deductive inferences.
- We consider standard ML algorithms/hypothesis spaces to be different inductive biases: C4.5 (greedy best attributes), backpropagation (simple to complex), etc.
24 Which Bias is Best?
- There is no one bias that is best on all problems.
- Our experiments: over 50 real-world problems and over 400 inductive biases - mostly variations on critical-variable biases vs. similarity biases. Different biases were a better fit for different problems.
- Given a data set, which learning model (inductive bias) should be chosen?
25 Automatic Discovery of Inductive Bias
- Defining and characterizing the set of interesting/learnable problems.
- To what extent do current biases cover the set of interesting problems?
- Automatic feature selection.
- Automatic selection of bias (before and/or during learning), including all learning parameters.
- Dynamic inductive biases (in time and space).
- Combinations of biases: ensembles, oracle learning.
26 Dynamic Inductive Bias in Time
- Can be discovered as you learn.
- May want to learn general rules first, followed by true exceptions.
- Can be based on ease of learning the problem.
- Example: SoftProp - from lazy learning to backprop.
27 Dynamic Inductive Bias in Space
28 ML Holy Grail
- We want all aspects of the learning mechanism automated, including the inductive bias.
- [Diagram: input features and just a data set (or just an explanation of the problem) feed an Automated Learner, which outputs a hypothesis.]
29 BYU Neural Network and Machine Learning Laboratory
- Work on automatic discovery of inductive bias.
- Proposing new learning algorithms (inductive biases).
- Theoretical issues: defining the set of interesting/learnable problems; analytical/empirical studies of differences between biases.
- Ensembles: wagging, mimicking, oracle learning, etc.
- Meta-learning: a priori decision regarding which learning model to use; features of the data set/application; learning from model experience; automatic selection of parameters.
- Constructive algorithms: ASOCS, DMP, etc.
- Learning parameters: windowed momentum, automatic improved distance functions (IVDM).
- Automatic bias in time: SoftProp. Automatic bias in space: overfitting and sensitivity to complex portions of the space - DMP, higher-order features.
Bayesian Learning
- A powerful and growing approach in machine learning.
- We use it in our own decision making all the time. You hear a word which could equally be "Thanks" or "Tanks" - which would you go with?
- Combine
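The "Thanks" vs. "Tanks" example can be sketched with Bayes' rule: when the acoustic likelihoods are equal, the prior decides. (The numbers below are hypothetical, chosen only to illustrate combining likelihood and prior):

```python
# Hypothetical numbers: the acoustic evidence is ambiguous (equal
# likelihoods), so the prior probability of each word decides.
likelihood = {"Thanks": 0.5, "Tanks": 0.5}   # P(sound | word), equal
prior = {"Thanks": 0.98, "Tanks": 0.02}      # P(word) in everyday speech

# Bayes' rule: P(word | sound) is proportional to P(sound | word) * P(word)
posterior = {w: likelihood[w] * prior[w] for w in prior}
Z = sum(posterior.values())                  # normalize over both words
posterior = {w: p / Z for w, p in posterior.items()}

best = max(posterior, key=posterior.get)
assert best == "Thanks"
```

With equal likelihoods the posterior simply reproduces the prior, which is why you hear "Thanks" in ordinary conversation even when the sound is ambiguous.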
Optima-Depth Threshod Circuits for Mutipication and Reated Probems Chi-Hsiang Yeh Dept. of Eectrica & Computer Engineering Queen s University Kingston, Ontario, Canada, K7K 3N6 E.A. Varvarigos, B. Parhami,
More informationPhysicsAndMathsTutor.com
. Two points A and B ie on a smooth horizonta tabe with AB = a. One end of a ight eastic spring, of natura ength a and moduus of easticity mg, is attached to A. The other end of the spring is attached
More informationSteepest Descent Adaptation of Min-Max Fuzzy If-Then Rules 1
Steepest Descent Adaptation of Min-Max Fuzzy If-Then Rues 1 R.J. Marks II, S. Oh, P. Arabshahi Λ, T.P. Caude, J.J. Choi, B.G. Song Λ Λ Dept. of Eectrica Engineering Boeing Computer Services University
More informationCopyright information to be inserted by the Publishers. Unsplitting BGK-type Schemes for the Shallow. Water Equations KUN XU
Copyright information to be inserted by the Pubishers Unspitting BGK-type Schemes for the Shaow Water Equations KUN XU Mathematics Department, Hong Kong University of Science and Technoogy, Cear Water
More informationFast Blind Recognition of Channel Codes
Fast Bind Recognition of Channe Codes Reza Moosavi and Erik G. Larsson Linköping University Post Print N.B.: When citing this work, cite the origina artice. 213 IEEE. Persona use of this materia is permitted.
More informationGauss Law. 2. Gauss s Law: connects charge and field 3. Applications of Gauss s Law
Gauss Law 1. Review on 1) Couomb s Law (charge and force) 2) Eectric Fied (fied and force) 2. Gauss s Law: connects charge and fied 3. Appications of Gauss s Law Couomb s Law and Eectric Fied Couomb s
More informationDIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM
DIGITAL FILTER DESIGN OF IIR FILTERS USING REAL VALUED GENETIC ALGORITHM MIKAEL NILSSON, MATTIAS DAHL AND INGVAR CLAESSON Bekinge Institute of Technoogy Department of Teecommunications and Signa Processing
More informationExpectation-Maximization for Estimating Parameters for a Mixture of Poissons
Expectation-Maximization for Estimating Parameters for a Mixture of Poissons Brandon Maone Department of Computer Science University of Hesini February 18, 2014 Abstract This document derives, in excrutiating
More informationNotes on Backpropagation with Cross Entropy
Notes on Backpropagation with Cross Entropy I-Ta ee, Dan Gowasser, Bruno Ribeiro Purue University October 3, 07. Overview This note introuces backpropagation for a common neura network muti-cass cassifier.
More informationRecursive Constructions of Parallel FIFO and LIFO Queues with Switched Delay Lines
Recursive Constructions of Parae FIFO and LIFO Queues with Switched Deay Lines Po-Kai Huang, Cheng-Shang Chang, Feow, IEEE, Jay Cheng, Member, IEEE, and Duan-Shin Lee, Senior Member, IEEE Abstract One
More informationCryptanalysis of PKP: A New Approach
Cryptanaysis of PKP: A New Approach Éiane Jaumes and Antoine Joux DCSSI 18, rue du Dr. Zamenhoff F-92131 Issy-es-Mx Cedex France eiane.jaumes@wanadoo.fr Antoine.Joux@ens.fr Abstract. Quite recenty, in
More informationComputational Learning Theory
CS 446 Machine Learning Fall 2016 OCT 11, 2016 Computational Learning Theory Professor: Dan Roth Scribe: Ben Zhou, C. Cervantes 1 PAC Learning We want to develop a theory to relate the probability of successful
More informationMONTE CARLO SIMULATIONS
MONTE CARLO SIMULATIONS Current physics research 1) Theoretica 2) Experimenta 3) Computationa Monte Caro (MC) Method (1953) used to study 1) Discrete spin systems 2) Fuids 3) Poymers, membranes, soft matter
More informationIE 361 Exam 1. b) Give *&% confidence limits for the bias of this viscometer. (No need to simplify.)
October 9, 00 IE 6 Exam Prof. Vardeman. The viscosity of paint is measured with a "viscometer" in units of "Krebs." First, a standard iquid of "known" viscosity *# Krebs is tested with a company viscometer
More informationII. PROBLEM. A. Description. For the space of audio signals
CS229 - Fina Report Speech Recording based Language Recognition (Natura Language) Leopod Cambier - cambier; Matan Leibovich - matane; Cindy Orozco Bohorquez - orozcocc ABSTRACT We construct a rea time
More information
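The ridge regression update rule from the regularization slide, Δw_i = c(t − net)x_i − λw_i, can be sketched as a small gradient-descent loop. This is a minimal illustration on synthetic data; the learning rate, λ, and epoch count are assumed values, not ones given in the slides.

```python
import numpy as np

# Ridge regression trained by stochastic gradient descent with weight decay,
# following the slide's update rule: dw_i = c*(t - net)*x_i - lambda*w_i.
# Data, learning rate c, and penalty lambda are illustrative assumptions.

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + rng.normal(scale=0.1, size=100)

w = np.zeros(3)
c, lam = 0.01, 0.1  # learning rate and regularization strength (assumed)
for _ in range(2000):
    for x_i, t in zip(X, y):
        net = w @ x_i                         # linear prediction
        w += c * (t - net) * x_i - c * lam * w  # error term plus weight decay

print(w)  # weights shrunk toward zero relative to the unregularized fit
```

The weight-decay term λw_i is what trades accuracy against complexity: each update pulls every weight toward zero, so only weights that consistently reduce the squared error stay large. Lasso's L1 penalty would instead subtract a constant-magnitude step, driving small weights exactly to zero.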