- Loren Gordon
- 5 years ago
Class #03: Information Theory
Machine Learning (CS 49/59): M. Allen, 0 Sept. 8
Monday, 0 Sep. 08

Uncertainty and Learning
- Often, when learning, we deal with uncertainty:
  - Incomplete data sets, with missing information
  - Noisy data sets, with unreliable information
  - Stochasticity: causes and effects related non-deterministically
  - And many more
- Probability theory gives us mathematics for such cases
  - A precise mathematical theory of chance and causality

Basic Elements of Probability
- Suppose we have some event, $e$: some fact about the world that may be true or false
- We write $P(e)$ for the probability that $e$ occurs: $0 \le P(e) \le 1$
- We can understand this value as:
  1. $P(e) = 1$: $e$ will certainly happen
  2. $P(e) = 0$: $e$ will certainly not happen
  3. $P(e) = k$, $0 < k < 1$: over an arbitrarily long stretch of time, we will observe the fraction $\frac{\text{Event } e \text{ occurs}}{\text{Total \# of events}} = k$

Properties of Probability
- Every event must either occur, or not occur: $P(e \lor \lnot e) = 1$, so $P(e) = 1 - P(\lnot e)$
- Furthermore, suppose that we have a set of all possible events, each with its own probability: $E = \{e_1, e_2, \ldots, e_k\}$, where event $e_i$ has probability $p_i$
- This set of probabilities is called a probability distribution, and it must have the following property: $\sum_i p_i = 1$
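The frequency reading of $P(e)$ and the sum-to-one property can be sketched in a few lines of Python. This is only an illustration: the function names `is_distribution` and `empirical_distribution` and the list of coin flips are made up for this example and do not come from the slides.

```python
import math
from collections import Counter


def is_distribution(probs, tol=1e-9):
    """Return True if every value lies in [0, 1] and the values sum to 1."""
    probs = list(probs)  # allow any iterable, including dict views
    in_range = all(0.0 <= p <= 1.0 for p in probs)
    return in_range and math.isclose(sum(probs), 1.0, abs_tol=tol)


def empirical_distribution(observations):
    """Estimate P(e) as (# of times e occurs) / (total # of events)."""
    counts = Counter(observations)
    total = len(observations)
    return {event: n / total for event, n in counts.items()}


# Hypothetical record of eight coin flips (illustrative data only).
flips = ["Heads", "Tails", "Tails", "Heads", "Tails", "Tails", "Tails", "Heads"]
estimate = empirical_distribution(flips)
print(estimate)
print(is_distribution(estimate.values()))
```

With a longer record of flips the estimated fractions would be expected to settle near the underlying probabilities, which is the "arbitrarily long stretch of time" reading of $P(e) = k$.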
Probability Distributions
- A uniform distribution is one in which every event occurs with equal probability, which means that we have: $\forall i,\; p_i = \frac{1}{k}$
- Such distributions are common in games of chance, e.g. where we have a fair coin-toss: $E = \{\text{Heads}, \text{Tails}\}$, $P_1 = \{0.5, 0.5\}$
- Not every distribution is uniform, and we might have a coin that comes up tails more often than heads (or even always!): $P_2 = \{0.25, 0.75\}$, $P_3 = \{0.0, 1.0\}$

Information Theory
- Claude Shannon created information theory in his 1948 paper, "A mathematical theory of communication"
- A theory of the amount of information that can be carried by communication channels
- Has implications in networks, encryption, compression, and many other areas
- Also the source of the term "bit" (credited to John Tukey)
Photo source: Konrad Jacobs

Information Carried by Events
- Information is relative to our uncertainty about an event
  - If we do not know whether an event has happened or not, then learning that fact is a gain in information
  - If we already know this fact, then there is no information gained when we see the outcome
- Thus, if we have a fixed coin that always comes up tails, actually flipping it tells us nothing we don't already know
- Flipping a fair coin does tell us something, on the other hand, since we can't predict the outcome ahead of time

Amount of Information
- From N. Abramson (1963): If an event $e_i$ occurs with probability $p_i$, the amount of information carried is: $I(e_i) = \log_2 \frac{1}{p_i}$
- (The base of the logarithm doesn't really matter, but if we use base-2, we are measuring information in bits)
- Thus, if we flip a fair coin, and it comes up tails, we have gained information equal to: $I(\text{Tails}) = \log_2 \frac{1}{P(\text{Tails})} = \log_2 \frac{1}{0.5} = \log_2 2 = 1.0$
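As a small sketch of the formula $I(e_i) = \log_2 \frac{1}{p_i}$, the function below computes the information carried by an event in bits. The name `information_bits` is an assumption made for this example; only `math.log2` from the Python standard library is used.

```python
import math


def information_bits(p):
    """Information I(e) = log2(1 / p), in bits, for an event of probability p."""
    if not 0.0 < p <= 1.0:
        # log2(1 / 0) is undefined; an impossible event never gets observed.
        raise ValueError("p must satisfy 0 < p <= 1")
    return math.log2(1.0 / p)


# Fair coin: either outcome carries exactly one bit.
print(information_bits(0.5))
# An outcome with probability 0.25 is more surprising, so it carries more bits.
print(information_bits(0.25))
# A certain outcome carries no information.
print(information_bits(1.0))
```

Rarer events give larger values and a certain event gives zero, matching the intuition that flipping a fixed coin tells us nothing new.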
Biased Data Carries Less Information
- While flipping a fair coin yields 1.0 bit of information, flipping one that is biased gives us less
- If we have a somewhat biased coin, then we get: $E = \{\text{Heads}, \text{Tails}\}$, $P_2 = \{0.25, 0.75\}$
  $I(\text{Tails}) = \log_2 \frac{1}{P(\text{Tails})} = \log_2 \frac{1}{0.75} = \log_2 1.33 \approx 0.415$
- If we have a totally biased coin, then we get: $P_3 = \{0.0, 1.0\}$
  $I(\text{Tails}) = \log_2 \frac{1}{P(\text{Tails})} = \log_2 \frac{1}{1.0} = \log_2 1.0 = 0.0$

Entropy: Total Average Information
- Shannon defined the entropy of a probability distribution as the average amount of information carried by events: $H(P) = \sum_i p_i \log_2 \frac{1}{p_i} = -\sum_i p_i \log_2 p_i$
- This can be thought of in a variety of ways, including:
  - How much uncertainty we have about the average event
  - How much information we get when an average event occurs
  - How many bits on average are needed to communicate about the events (Shannon was interested in finding the most efficient overall encodings to use in transmitting information)
- For a coin, $C$, the formula for entropy becomes: $H(C) = -\left(P(\text{Heads}) \log_2 P(\text{Heads}) + P(\text{Tails}) \log_2 P(\text{Tails})\right)$
- A fair coin, $\{0.5, 0.5\}$, has maximum entropy: $H(C) = -(0.5 \log_2 0.5 + 0.5 \log_2 0.5) = 1.0$
- A somewhat biased coin, $\{0.25, 0.75\}$, has less: $H(C) = -(0.25 \log_2 0.25 + 0.75 \log_2 0.75) \approx 0.81$
- And a fixed coin, $\{0.0, 1.0\}$, has none: $H(C) = -(1.0 \log_2 1.0 + 0.0 \log_2 0.0) = 0.0$ (taking $0 \log_2 0 = 0$ by convention)

A Mathematical Definition
$H(P) = -\sum_i p_i \log_2 p_i$
- It is easy to show that for any distribution, entropy is always greater than or equal to 0 (never negative)
- Maximum entropy occurs with a uniform distribution
  - In such cases, entropy is $\log_2 k$, where $k$ is the number of different probabilistic outcomes
- Thus, for any distribution possible, we have: $0 \le H(P) \le \log_2 k$
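The entropy values for the three coins can be reproduced with a short Python sketch. The function name `entropy` is an assumption for this example, and zero-probability outcomes are skipped, which is one common way to apply the convention $0 \log_2 0 = 0$.

```python
import math


def entropy(probs):
    """Shannon entropy in bits: H(P) = sum of p * log2(1 / p) over all outcomes.

    Outcomes with p == 0 are skipped, i.e. 0 * log2(0) is treated as 0.
    """
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)


fair = [0.5, 0.5]
biased = [0.25, 0.75]
fixed = [0.0, 1.0]

for name, dist in [("fair", fair), ("biased", biased), ("fixed", fixed)]:
    print(name, round(entropy(dist), 2))

# A uniform distribution over k outcomes reaches the upper bound log2(k).
k = 4
uniform = [1.0 / k] * k
print(entropy(uniform), math.log2(k))
```

The fair coin sits at the upper bound $\log_2 2 = 1$, the fixed coin at the lower bound 0, and the biased coin falls in between.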
Joint Probability & Independence
- If we have two events $e_1$ and $e_2$, the probability that both events occur, called the joint probability, is written: $P(e_1 \land e_2) = P(e_1, e_2)$
- We say that two events are independent if and only if: $P(e_1, e_2) = P(e_1)\,P(e_2)$
- Independent events tell us nothing about each other:
  - For example, suppose rainy weather is uniformly distributed
  - Suppose further that we choose a day of the week, uniformly at random: that day is either on a weekend or not, giving us:
    $W = \{\text{Rain}, \lnot\text{Rain}\}$, $P_W = \{0.5, 0.5\}$
    $D = \{\text{Weekend}, \lnot\text{Weekend}\}$, $P_D = \{2/7, 5/7\}$
- If the weather on any day is independent of whether or not that day is a weekend, then we will have the following:
  $P(\text{Rain}, \text{Weekend}) = P(\text{Rain})\,P(\text{Weekend}) = 0.5 \times 2/7 = 2/14$
  $P(\lnot\text{Rain}, \text{Weekend}) = P(\lnot\text{Rain})\,P(\text{Weekend}) = 0.5 \times 2/7 = 2/14$
  $P(\text{Rain}, \lnot\text{Weekend}) = P(\text{Rain})\,P(\lnot\text{Weekend}) = 0.5 \times 5/7 = 5/14$
  $P(\lnot\text{Rain}, \lnot\text{Weekend}) = P(\lnot\text{Rain})\,P(\lnot\text{Weekend}) = 0.5 \times 5/7 = 5/14$

Lack of Independence
- Suppose we compare the probability that it rains to the probability that I bring an umbrella to work:
  $W = \{\text{Rain}, \lnot\text{Rain}\}$, $P_W = \{0.5, 0.5\}$
  $U = \{\text{Umbrella}, \lnot\text{Umbrella}\}$, $P_U = \{0.2, 0.8\}$
- Note: presumably, neither of these is really purely random; we can still treat them as random variables based upon observing how frequently they occur (this is sometimes called the empirical probability)
- Now, if these were independent events, then the probability, e.g., that I am carrying an umbrella and it is raining is: $P(\text{Rain}, \text{Umbrella}) = P(\text{Rain})\,P(\text{Umbrella}) = 0.5 \times 0.2 = 0.1$
- Obviously, however, these are not independent; and the actual probability of seeing me with my umbrella on rainy days could be much higher than just calculated
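The weather and weekend table can be rebuilt with exact fractions as a quick check on the product rule for independent events. This is a sketch only: the helper names `independent_joint` and `is_independent` and the labels `NoRain` and `Weekday` are choices made for this example.

```python
from fractions import Fraction
from itertools import product


def independent_joint(dist_a, dist_b):
    """Build the joint table P(a, b) = P(a) * P(b) for two independent variables."""
    return {
        (a, b): pa * pb
        for (a, pa), (b, pb) in product(dist_a.items(), dist_b.items())
    }


def is_independent(joint):
    """Check that P(a, b) == P(a) * P(b) holds for every cell of a joint table."""
    p_a, p_b = {}, {}
    for (a, b), p in joint.items():
        p_a[a] = p_a.get(a, 0) + p  # marginal over b
        p_b[b] = p_b.get(b, 0) + p  # marginal over a
    return all(p == p_a[a] * p_b[b] for (a, b), p in joint.items())


weather = {"Rain": Fraction(1, 2), "NoRain": Fraction(1, 2)}
day = {"Weekend": Fraction(2, 7), "Weekday": Fraction(5, 7)}

joint = independent_joint(weather, day)
for cell, p in joint.items():
    print(cell, p)  # Fraction prints in lowest terms, so 2/14 appears as 1/7
print(sum(joint.values()), is_independent(joint))
```

Using `Fraction` keeps the arithmetic exact, so the independence test can compare values with `==` instead of a floating-point tolerance.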
Conditional Probability
- Given two events $e_1$ and $e_2$, the probability that $e_1$ occurs, given that $e_2$ also occurs, called the conditional probability of $e_1$ given $e_2$, is written: $P(e_1 \mid e_2)$
- In general, the conditional probability of an event can be quite different from the basic probability that it occurs
- Thus, for our weather example, we might have:
  $W = \{R, \lnot R\}$, $P_W = \{0.5, 0.5\}$
  $U = \{U, \lnot U\}$, $P_U = \{0.2, 0.8\}$
  $P(U \mid R) = 0.8$, $P(U \mid \lnot R) = 0.1$
  $P(\lnot U \mid R) = 0.2$, $P(\lnot U \mid \lnot R) = 0.9$
- $P(e_1 \mid e_2) + P(\lnot e_1 \mid e_2) = 1.0$
- $P(e_1 \mid e_2) + P(e_1 \mid \lnot e_2) \ne 1.0$ (can be equal, but not necessarily)
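The two closing identities can be checked numerically against the umbrella values. The nested dictionary layout and the labels below are assumptions made for this sketch; the four numbers are the conditional probabilities from the weather example.

```python
import math

# P(umbrella state | weather state), keyed first by the condition.
p_given = {
    "Rain": {"Umbrella": 0.8, "NoUmbrella": 0.2},
    "NoRain": {"Umbrella": 0.1, "NoUmbrella": 0.9},
}

# Same condition, complementary events: P(U | R) + P(not U | R) must be 1.
same_condition_sums = {cond: sum(row.values()) for cond, row in p_given.items()}

# Same event, complementary conditions: P(U | R) + P(U | not R) need not be 1.
across_conditions = p_given["Rain"]["Umbrella"] + p_given["NoRain"]["Umbrella"]

for cond, total in same_condition_sums.items():
    print(cond, math.isclose(total, 1.0))
print(math.isclose(across_conditions, 1.0))
```

Each row is a probability distribution over the umbrella outcomes for one fixed weather condition, which is why only the row sums are constrained.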
Properties of Conditional Probability
- Conditional probability can be defined using joint probability:
  $P(e_1 \mid e_2) = \frac{P(e_1, e_2)}{P(e_2)}$
  $P(e_1, e_2) = P(e_1 \mid e_2)\,P(e_2)$
- Thus, if the events are actually independent, we get:
  $P(e_1 \mid e_2) = \frac{P(e_1, e_2)}{P(e_2)}$
  $P(e_1 \mid e_2) = \frac{P(e_1)\,P(e_2)}{P(e_2)}$ (by definition of independence)
  $P(e_1 \mid e_2) = P(e_1)$

This Week
- Information Theory & Decision Trees
- Readings:
  - Blog post on Information Theory (linked from class schedule)
  - Section 8.3 from Russell & Norvig
- Office Hours: Wing 0
  - Monday/Wednesday/Friday, :00 PM - :00 PM
  - Tuesday/Thursday, :30 PM - 3:00 PM
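The definition $P(e_1 \mid e_2) = P(e_1, e_2) / P(e_2)$ can be tried out on the independent weather and weekend table from the earlier slides. This is a minimal sketch: the function name `conditional` and the labels `NoRain` and `Weekday` are assumptions for this example, and $P(e_2)$ is obtained by summing the joint table.

```python
from fractions import Fraction


def conditional(joint, a, b):
    """Return P(a | b) = P(a, b) / P(b) from a joint table keyed by (a, b)."""
    p_b = sum(p for (_, b_key), p in joint.items() if b_key == b)
    if p_b == 0:
        raise ZeroDivisionError("cannot condition on an event of probability 0")
    return joint[(a, b)] / p_b


# Joint table for independent weather and day type (values from the slides).
joint = {
    ("Rain", "Weekend"): Fraction(2, 14),
    ("NoRain", "Weekend"): Fraction(2, 14),
    ("Rain", "Weekday"): Fraction(5, 14),
    ("NoRain", "Weekday"): Fraction(5, 14),
}

# Marginal probability of rain, summed over both day types.
p_rain = sum(p for (w, _), p in joint.items() if w == "Rain")

print(conditional(joint, "Rain", "Weekend"), p_rain)
print(conditional(joint, "Rain", "Weekday"), p_rain)
```

Because the table was built from independent variables, conditioning on the day type leaves the probability of rain unchanged, which is the result $P(e_1 \mid e_2) = P(e_1)$ derived above.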