Class #03: Information Theory
Machine Learning (CS 49/59): M. Allen, 10 Sept. 18

Uncertainty and Learning
- Often, when learning, we deal with uncertainty:
  - Incomplete data sets, with missing information
  - Noisy data sets, with unreliable information
  - Stochasticity: causes and effects related non-deterministically
  - And many more
- Probability theory gives us mathematics for such cases
- A precise mathematical theory of chance and causality

Basic Elements of Probability
- Suppose we have some event, e: some fact about the world that may be true or false
- We write P(e) for the probability that e occurs: 0 ≤ P(e) ≤ 1
- We can understand this value as:
  1. P(e) = 1: e will certainly happen
  2. P(e) = 0: e will certainly not happen
  3. P(e) = k, 0 < k < 1: over an arbitrarily long stretch of time, we will observe the fraction (# of events in which e occurs) / (total # of events) = k

Properties of Probability
- Every event must either occur, or not occur: P(e ∨ ¬e) = 1, and so P(e) = 1 − P(¬e)
- Furthermore, suppose that we have a set of all possible events, each with its own probability: E = {e_1, e_2, ..., e_k}
- This set of probabilities is called a probability distribution, and it must have the following property: Σ_i p_i = 1
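The long-run-frequency reading of P(e) = k is easy to demonstrate in simulation. Below is a minimal sketch (mine, not from the slides) using Python's standard random module: the observed fraction of occurrences approaches k as the number of trials grows, and a quick check confirms that a distribution's probabilities sum to 1.

```python
import random

def empirical_probability(k, num_trials):
    """Fraction of trials in which an event of true probability k occurs."""
    occurrences = sum(1 for _ in range(num_trials) if random.random() < k)
    return occurrences / num_trials

# The fraction (# of events in which e occurs) / (total # of events)
# approaches k over an arbitrarily long stretch of trials:
for n in (100, 10_000, 1_000_000):
    print(n, empirical_probability(0.5, n))

# Any probability distribution must satisfy sum_i p_i = 1:
P = [0.25, 0.75]
assert abs(sum(P) - 1.0) < 1e-12
```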

Probability Distributions
- A uniform distribution is one in which every event occurs with equal probability, which means that we have: ∀i, p_i = 1/k
- Such distributions are common in games of chance, e.g. where we have a fair coin-toss: E = {Heads, Tails}, P_1 = {0.5, 0.5}
- Not every distribution is uniform, and we might have a coin that comes up tails more often than heads (or even always!): P_2 = {0.25, 0.75}, P_3 = {0.0, 1.0}

Information Theory
- Claude Shannon created information theory in his 1948 paper, "A Mathematical Theory of Communication"
- A theory of the amount of information that can be carried by communication channels
- Has implications in networks, encryption, compression, and many other areas
- Also the source of the term "bit" (credited to John Tukey)
[Photo of Claude Shannon; source: Konrad Jacobs]

Information Carried by Events
- Information is relative to our uncertainty about an event
- If we do not know whether an event has happened or not, then learning that fact is a gain in information
- If we already know this fact, then there is no information gained when we see the outcome
- Thus, if we have a fixed coin that always comes up tails, actually flipping it tells us nothing we don't already know
- Flipping a fair coin does tell us something, on the other hand, since we can't predict the outcome ahead of time

Amount of Information
- From N. Abramson (1963): If an event e_i occurs with probability p_i, the amount of information carried is: I(e_i) = log(1 / p_i)
- (The base of the logarithm doesn't really matter, but if we use base-2, we are measuring information in bits)
- Thus, if we flip a fair coin, and it comes up tails, we have gained information equal to: I(Tails) = log_2 (1 / P(Tails)) = log_2 (1 / 0.5) = log_2 2 = 1.0
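As a quick check of the definition, the values on this slide and the next follow directly from I(e) = log_2(1 / p). This is a sketch of mine, not part of the original slides:

```python
import math

def information(p):
    """Information, in bits, carried by observing an event of probability p."""
    return math.log2(1 / p)

print(information(0.5))   # fair-coin tails: 1.0 bit
print(information(0.75))  # biased-coin tails: ~0.415 bits (see next slide)
print(information(1.0))   # fixed-coin tails: 0.0 bits, nothing learned
```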

Biased Data Carries Less Information
- While flipping a fair coin yields 1.0 bit of information, flipping one that is biased gives us less
- If we have a somewhat biased coin, then we get: E = {Heads, Tails}, P_2 = {0.25, 0.75}
  I(Tails) = log_2 (1 / P(Tails)) = log_2 (1 / 0.75) = log_2 (4/3) ≈ 0.415
- If we have a totally biased coin, then we get: P_3 = {0.0, 1.0}
  I(Tails) = log_2 (1 / P(Tails)) = log_2 (1 / 1.0) = log_2 1.0 = 0.0

Entropy: Total Average Information
- Shannon defined the entropy of a probability distribution as the average amount of information carried by events: H(P) = Σ_i p_i log_2 (1 / p_i) = −Σ_i p_i log_2 p_i
- This can be thought of in a variety of ways, including:
  - How much uncertainty we have about the average event
  - How much information we get when an average event occurs
  - How many bits on average are needed to communicate about the events (Shannon was interested in finding the most efficient overall encodings to use in transmitting information)

Entropy: Total Average Information (continued)
- For a coin, C, the formula for entropy becomes: H(C) = −(P(Heads) log_2 P(Heads) + P(Tails) log_2 P(Tails))
- A fair coin, {0.5, 0.5}, has maximum entropy: H(C) = −(0.5 log_2 0.5 + 0.5 log_2 0.5) = 1.0
- A somewhat biased coin, {0.25, 0.75}, has less: H(C) = −(0.25 log_2 0.25 + 0.75 log_2 0.75) ≈ 0.81
- And a fixed coin, {0.0, 1.0}, has none (taking 0 log_2 0 = 0, its limiting value): H(C) = −(1.0 log_2 1.0 + 0.0 log_2 0.0) = 0.0

A Mathematical Definition
H(P) = −Σ_i p_i log_2 p_i
- It is easy to show that for any distribution, entropy is always greater than or equal to 0 (never negative)
- Maximum entropy occurs with a uniform distribution
- In such cases, entropy is log_2 k, where k is the number of different probabilistic outcomes
- Thus, for any distribution possible, we have: 0 ≤ H(P) ≤ log_2 k
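These entropy values, and the 0 ≤ H(P) ≤ log_2 k bound, can be verified with a short function. This is a sketch of mine, not from the slides, using the convention that 0 log_2 0 = 0:

```python
import math

def entropy(dist):
    """Shannon entropy H(P) = -sum_i p_i * log2(p_i), in bits.
    Zero-probability events contribute nothing (0 * log2(0) is taken as 0)."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

print(entropy([0.5, 0.5]))    # fair coin: 1.0 (the maximum for k = 2)
print(entropy([0.25, 0.75]))  # biased coin: ~0.81
print(entropy([0.0, 1.0]))    # fixed coin: 0.0

# Maximum entropy is attained by the uniform distribution over k outcomes:
k = 8
print(entropy([1 / k] * k), math.log2(k))  # both print 3.0
```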

Joint Probability & Independence
- If we have two events e_1 and e_2, the probability that both events occur, called the joint probability, is written: P(e_1 ∧ e_2) = P(e_1, e_2)
- We say that two events are independent if and only if: P(e_1, e_2) = P(e_1) · P(e_2)
- Independent events tell us nothing about each other

Joint Probability & Independence (continued)
- Independent events tell us nothing about each other:
- For example, suppose rainy weather is uniformly distributed
- Suppose further that we choose a day of the week, uniformly at random: that day is either on a weekend or not, giving us:
  W = {Rain, ¬Rain}, P_W = {0.5, 0.5}
  D = {Weekend, ¬Weekend}, P_D = {2/7, 5/7}
- If the weather on any day is independent of whether or not that day is a weekend, then we will have the following:
  P(Rain, Weekend) = P(Rain) P(Weekend) = 0.5 × 2/7 = 1/7
  P(¬Rain, Weekend) = P(¬Rain) P(Weekend) = 0.5 × 2/7 = 1/7
  P(Rain, ¬Weekend) = P(Rain) P(¬Weekend) = 0.5 × 5/7 = 5/14
  P(¬Rain, ¬Weekend) = P(¬Rain) P(¬Weekend) = 0.5 × 5/7 = 5/14

Lack of Independence
- Suppose we compare the probability that it rains to the probability that I bring an umbrella to work:
  W = {Rain, ¬Rain}, P_W = {0.5, 0.5}
  U = {Umbrella, ¬Umbrella}, P_U = {0.2, 0.8}
- Note: presumably, neither of these is really purely random; we can still treat them as random variables based upon observing how frequently they occur (this is sometimes called the empirical probability)
- Now, if these were independent events, then the probability, e.g., that I am carrying an umbrella and it is raining would be: P(Rain, Umbrella) = P(Rain) P(Umbrella) = 0.5 × 0.2 = 0.1
- Obviously, however, these are not independent, and the actual probability of seeing me with my umbrella on rainy days could be much higher than just calculated

Conditional Probability
- Given two events e_1 and e_2, the probability that e_1 occurs, given that e_2 also occurs, called the conditional probability of e_1 given e_2, is written: P(e_1 | e_2)
- In general, the conditional probability of an event can be quite different from the basic probability that it occurs
- Thus, for our weather example, we might have:
  W = {R, ¬R}, P_W = {0.5, 0.5}
  U = {U, ¬U}, P_U = {0.2, 0.8}
  P(U | R) = 0.8, P(U | ¬R) = 0.1
  P(¬U | R) = 0.2, P(¬U | ¬R) = 0.9
- Note that P(e_1 | e_2) + P(¬e_1 | e_2) = 1.0, but P(e_1 | e_2) + P(e_1 | ¬e_2) ≠ 1.0 in general (they can be equal, but not necessarily)
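To see marginal, joint, and conditional probabilities side by side, here is an illustrative sketch of mine (the joint values are made up for the example, chosen to agree with the conditionals above; they are not from the slides). It derives P(Umbrella | Rain) from the joint via the definition on the next slide, P(e_1 | e_2) = P(e_1, e_2) / P(e_2), and confirms the two variables are not independent:

```python
# Hypothetical joint distribution over (rain, umbrella), as might be
# estimated empirically from day counts; the numbers are illustrative.
joint = {
    (True, True):   0.40,  # rain and umbrella
    (True, False):  0.10,  # rain, no umbrella
    (False, True):  0.05,  # no rain, but umbrella anyway
    (False, False): 0.45,  # no rain, no umbrella
}

p_rain = sum(p for (rain, _), p in joint.items() if rain)
p_umbrella = sum(p for (_, umb), p in joint.items() if umb)

# Conditional probability from the definition: P(U | R) = P(U, R) / P(R)
p_umb_given_rain = joint[(True, True)] / p_rain

print(p_umbrella)        # marginal: 0.45
print(p_umb_given_rain)  # conditional: 0.8, far above the marginal
# Independence would require P(U, R) = P(U) * P(R):
print(joint[(True, True)], p_rain * p_umbrella)  # 0.40 vs 0.225: dependent
```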

Properties of Conditional Probability
- Conditional probability can be defined using joint probability:
  P(e_1 | e_2) = P(e_1, e_2) / P(e_2), or equivalently P(e_1, e_2) = P(e_1 | e_2) P(e_2)
- Thus, if the events are actually independent, we get:
  P(e_1 | e_2) = P(e_1, e_2) / P(e_2) = P(e_1) P(e_2) / P(e_2) = P(e_1)   (by the definition of independence)

This Week
- Information Theory & Decision Trees
- Readings:
  - Blog post on Information Theory (linked from class schedule)
  - Section 18.3 from Russell & Norvig
- Office Hours: Wing 0
  - Monday/Wednesday/Friday, :00 PM–:00 PM
  - Tuesday/Thursday, :30 PM–3:00 PM
