CSE 312 Final Review: Section AA


CSE 312 TAs December 8, 2011

General Information
- Comprehensive (midterm material is included)
- Heavily weighted toward material after the midterm

Pre-Midterm Material
- Basic Counting Principles
- Pigeonhole Principle
- Inclusion-Exclusion
- Counting the Complement
- Using symmetry
- Conditional Probability: P(A | B) = P(AB) / P(B)
- Law of Total Probability: P(A) = P(A | B) P(B) + P(A | B^c) P(B^c)
- Bayes' Theorem: P(A | B) = P(B | A) P(A) / P(B)
- Network Failure Questions
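A quick numeric sanity check of the total-probability and Bayes formulas above; the numbers below (a test's accuracy and a 1% prevalence) are made up purely for illustration.

```python
# Hypothetical numbers, just to exercise the two formulas above.
p_B_given_A = 0.99     # P(positive test | disease)
p_B_given_Ac = 0.05    # P(positive test | no disease)
p_A = 0.01             # P(disease)

# Law of Total Probability: P(B) = P(B | A) P(A) + P(B | A^c) P(A^c)
p_B = p_B_given_A * p_A + p_B_given_Ac * (1 - p_A)

# Bayes' Theorem: P(A | B) = P(B | A) P(A) / P(B)
p_A_given_B = p_B_given_A * p_A / p_B

print(f"P(positive) = {p_B:.4f}")                      # ~0.0594
print(f"P(disease | positive) = {p_A_given_B:.4f}")    # ~0.1667
```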

Pre-Midterm Material
Independence
- E and F are independent if P(EF) = P(E) P(F)
- Equivalently, E and F are independent if P(E | F) = P(E) and P(F | E) = P(F)
- Events E_1, ..., E_n are (mutually) independent if for every subset S of the events, P(∩_{i ∈ S} E_i) = ∏_{i ∈ S} P(E_i)
- Biased coin example from Lecture 5, slide 7
- If E and F are independent and G is an arbitrary event, then in general P(EF | G) ≠ P(E | G) P(F | G)
- For any given G, equality in the statement above means that E and F are Conditionally Independent given G
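To make the pairwise definition concrete, the sketch below enumerates the 36 outcomes of two fair dice and checks that E = "first die is even" and F = "sum is 7" satisfy P(EF) = P(E)P(F); the dice example is an illustration, not the biased-coin example from lecture.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

E = lambda o: o[0] % 2 == 0         # first die is even
F = lambda o: o[0] + o[1] == 7      # sum is 7
EF = lambda o: E(o) and F(o)

print(prob(E), prob(F), prob(EF))        # 1/2, 1/6, 1/12
print(prob(EF) == prob(E) * prob(F))     # True: E and F are independent
```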

Distributions
Know the mean and variance for:
- Uniform distribution
- Normal distribution
- Geometric distribution
- Binomial distribution
- Poisson distribution
- Hypergeometric distribution
Remember Linearity of Expectation and other useful facts (e.g. Var[aX + b] = a² Var[X]; in general Var[X + Y] ≠ Var[X] + Var[Y]).
Remember: for a continuous R.V. X and any single point a, P(X = a) = 0 (the probability that a continuous R.V. falls at a specific point is 0!)
Expectation is now an integral: E[X] = ∫ x f(x) dx
Use the normal approximation when applicable.
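For reference, one common set of answers, under the usual parametrizations: continuous Uniform(a, b); Geometric counting the number of trials up to and including the first success; Hypergeometric with N items, K of them successes, and n draws. Double-check against the course's conventions, since e.g. the geometric distribution is sometimes defined by the number of failures instead.

```latex
\begin{align*}
X \sim \mathrm{Unif}(a,b) &: & E[X] &= \tfrac{a+b}{2}, & \mathrm{Var}[X] &= \tfrac{(b-a)^2}{12}\\
X \sim \mathcal{N}(\mu,\sigma^2) &: & E[X] &= \mu, & \mathrm{Var}[X] &= \sigma^2\\
X \sim \mathrm{Geo}(p) &: & E[X] &= \tfrac{1}{p}, & \mathrm{Var}[X] &= \tfrac{1-p}{p^2}\\
X \sim \mathrm{Bin}(n,p) &: & E[X] &= np, & \mathrm{Var}[X] &= np(1-p)\\
X \sim \mathrm{Poi}(\lambda) &: & E[X] &= \lambda, & \mathrm{Var}[X] &= \lambda\\
X \sim \mathrm{HypGeo}(N,K,n) &: & E[X] &= n\tfrac{K}{N}, & \mathrm{Var}[X] &= n\tfrac{K}{N}\tfrac{N-K}{N}\tfrac{N-n}{N-1}
\end{align*}
```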

Central Limit Theorem
Consider i.i.d. (independent, identically distributed) random variables X_1, X_2, ..., where each X_i has mean μ = E[X_i] and variance σ² = Var[X_i]. Then, as n → ∞,
(X_1 + ... + X_n − nμ) / (σ√n) → N(0, 1)
Alternatively, the sample mean X̄ = (1/n) ∑_{i=1}^n X_i is approximately N(μ, σ²/n).
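A rough simulation of the statement, using sums of i.i.d. Uniform(0, 1) variables (any i.i.d. distribution with finite variance would work just as well):

```python
import random
import statistics

# Standardized sums of i.i.d. Uniform(0,1) variables should look roughly N(0,1).
# Uniform(0,1) has mu = 1/2 and sigma^2 = 1/12.
mu, sigma = 0.5, (1 / 12) ** 0.5
n, trials = 30, 20000

standardized = [
    (sum(random.random() for _ in range(n)) - n * mu) / (sigma * n ** 0.5)
    for _ in range(trials)
]

# For a standard normal, about 68.3% of the mass lies within one standard deviation.
within_one = sum(abs(z) <= 1 for z in standardized) / trials
print(f"mean ~ {statistics.mean(standardized):.3f}, P(|Z| <= 1) ~ {within_one:.3f}")
# expect roughly 0.000 and 0.683
```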

Tail Bounds
Markov's Inequality: If X is a non-negative random variable, then for every α > 0, we have
P(X ≥ α) ≤ E[X] / α
Corollary: P(X ≥ α E[X]) ≤ 1/α
Chebyshev's Inequality: If Y is an arbitrary random variable with E[Y] = μ, then, for any α > 0,
P(|Y − μ| ≥ α) ≤ Var[Y] / α²
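Both inequalities are worst-case guarantees and are often loose. The sketch below compares them with the exact tails of a Bin(20, 1/2) random variable (an arbitrary choice, purely for illustration).

```python
from math import comb

# Exact tail probabilities for X ~ Bin(20, 1/2) next to the Markov and Chebyshev bounds.
n, p = 20, 0.5
mean, var = n * p, n * p * (1 - p)          # 10 and 5
pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

exact_upper = sum(pmf[15:])                                                # P(X >= 15)
exact_two_sided = sum(pmf[k] for k in range(n + 1) if abs(k - mean) >= 5)  # P(|X - 10| >= 5)

print(f"P(X >= 15)       = {exact_upper:.4f}  vs Markov bound    {mean / 15:.4f}")
print(f"P(|X - 10| >= 5) = {exact_two_sided:.4f}  vs Chebyshev bound {var / 25:.4f}")
```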

Tail Bounds
Chernoff Bounds: Suppose X is drawn from Bin(n, p) and μ = E[X] = pn. Then, for any 0 < δ < 1,
P(X > (1 + δ)μ) ≤ e^(−δ²μ/3)
P(X < (1 − δ)μ) ≤ e^(−δ²μ/2)
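The same kind of comparison for the Chernoff bound, again on an arbitrary Bin(100, 1/2) example; the exact tail sits well below the guarantee.

```python
from math import comb, exp

# Chernoff bound vs. the exact upper tail for X ~ Bin(100, 1/2); mu = 50, delta = 0.2.
n, p, delta = 100, 0.5, 0.2
mu = n * p
threshold = int((1 + delta) * mu)            # 60

exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(threshold + 1, n + 1))
chernoff = exp(-delta**2 * mu / 3)

print(f"P(X > 60) exactly = {exact:.4f}")      # roughly 0.018
print(f"Chernoff bound    = {chernoff:.4f}")   # roughly 0.513 -- valid, but loose here
```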

Law of Large Numbers
Weak Law of Large Numbers: Let X̄ be the empirical mean of i.i.d. random variables X_1, ..., X_n with mean μ. For any ε > 0, as n → ∞,
P(|X̄ − μ| > ε) → 0
Strong Law of Large Numbers: Under the same hypotheses,
P(lim_{n→∞} (X_1 + ... + X_n) / n = μ) = 1
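A quick illustration of the weak law with fair-die rolls (μ = 3.5): the sample mean settles near μ as n grows.

```python
import random

# Sample means of i.i.d. fair-die rolls; the deviation from mu = 3.5 shrinks with n.
random.seed(0)
for n in (10, 100, 1000, 10000):
    xbar = sum(random.randint(1, 6) for _ in range(n)) / n
    print(f"n = {n:>6}: sample mean = {xbar:.3f}, deviation = {abs(xbar - 3.5):.3f}")
```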

Random Facts
- If X and Y are R.V.s from the same distribution, then Z = X + Y isn't necessarily from that same distribution (for example, the sum of two independent uniform R.V.s is not uniform).
- If X and Y are independent normals, then so is X + Y.

MLEs
- Write an expression for the likelihood.
- Convert this into the log-likelihood.
- Take derivatives to find the maximum.
- Verify that this is indeed a maximum.
- See Lecture 11 for worked examples.
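As a quick illustration of the four steps (a generic example, not necessarily one from Lecture 11), take X_1, ..., X_n i.i.d. Exponential(λ) with density f(x) = λe^(−λx):

```latex
L(\lambda) = \prod_{i=1}^{n} \lambda e^{-\lambda x_i} = \lambda^{n} e^{-\lambda \sum_i x_i},
\qquad
\ln L(\lambda) = n \ln \lambda - \lambda \sum_{i=1}^{n} x_i
```

```latex
\frac{d}{d\lambda} \ln L(\lambda) = \frac{n}{\lambda} - \sum_{i=1}^{n} x_i = 0
\;\Longrightarrow\;
\hat{\lambda} = \frac{n}{\sum_{i=1}^{n} x_i},
\qquad
\frac{d^2}{d\lambda^2} \ln L(\lambda) = -\frac{n}{\lambda^2} < 0 \ \text{(so it is a maximum).}
```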

Expectation-Maximization
- E-step: Compute the expected log-likelihood using the current parameters.
- M-step: Maximize the expected log-likelihood by updating the parameters.
- Iterate until convergence is achieved.
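A minimal sketch of the alternation on a made-up two-coin mixture: each session of flips comes from one of two coins with unknown biases, and the mixing weight is held fixed at 1/2 to keep the M-step short. This is only meant to show the E/M loop, not the lecture's exact formulation.

```python
from math import comb

# Hypothetical data: (heads, flips) for five sessions, each done with one of two coins.
data = [(9, 10), (4, 10), (8, 10), (3, 10), (7, 10)]

def binom_lik(heads, flips, theta):
    return comb(flips, heads) * theta**heads * (1 - theta)**(flips - heads)

theta_A, theta_B = 0.6, 0.5                  # initial guesses for the two biases
for _ in range(50):                          # iterate until (approximate) convergence
    # E-step: responsibility that each session came from coin A, under current parameters.
    gammas = []
    for h, n in data:
        la, lb = binom_lik(h, n, theta_A), binom_lik(h, n, theta_B)
        gammas.append(la / (la + lb))
    # M-step: re-estimate each bias from its fractionally assigned heads and flips.
    theta_A = (sum(g * h for g, (h, n) in zip(gammas, data))
               / sum(g * n for g, (h, n) in zip(gammas, data)))
    theta_B = (sum((1 - g) * h for g, (h, n) in zip(gammas, data))
               / sum((1 - g) * n for g, (h, n) in zip(gammas, data)))

print(f"theta_A ~ {theta_A:.3f}, theta_B ~ {theta_B:.3f}")
```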

Hypothesis Testing
- H_0 is the null hypothesis
- H_1 is the alternative hypothesis
- Likelihood Ratio = L_1 / L_0, where L_0 is the likelihood of H_0 and L_1 is the likelihood of H_1.
- Saying that the alternative hypothesis is 5 times more likely than the null hypothesis means that L_1 / L_0 ≥ 5.
- Decision rule: when to reject H_0.
- α = P(reject H_0 but H_0 was true)
- β = P(accept H_0 but H_1 was true)
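A small worked instance tying these pieces together (the hypotheses and threshold are hypothetical, not from lecture): H_0: p = 0.5 vs. H_1: p = 0.6 for a coin flipped n = 100 times, with the decision rule "reject H_0 when L_1 / L_0 ≥ 5".

```python
from math import comb

n, p0, p1, c = 100, 0.5, 0.6, 5.0

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def likelihood_ratio(k):                     # L1 / L0 after observing k heads
    return binom_pmf(k, n, p1) / binom_pmf(k, n, p0)

reject = [k for k in range(n + 1) if likelihood_ratio(k) >= c]

alpha = sum(binom_pmf(k, n, p0) for k in reject)                          # P(reject H0 | H0)
beta = sum(binom_pmf(k, n, p1) for k in range(n + 1) if k not in reject)  # P(accept H0 | H1)
print(f"reject H0 when #heads >= {min(reject)}; alpha = {alpha:.3f}, beta = {beta:.3f}")
```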

Algorithms
- Better algorithms usually trump better hardware, but both are needed for progress.
- Some problems cannot be solved at all (e.g. the Halting Problem).
- Other (intractable) problems cannot (yet?) be solved in a reasonable amount of time (e.g. Integer Factorization).

Sequence Alignment
- A brute-force solution takes at least (2n choose n) computations of the sequence score, which is exponential time.
- Dynamic Programming decreases the computation to O(n²).
- IDEA: Store prior computations so that future computations can do table look-ups.
- Backtrace Algorithm: Start at the bottom right of the matrix and find which neighboring cells could have transitioned to the current cell under the cost function σ. Time bound: O(n²).
- See Lecture 15 for a worked example.
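A minimal sketch of the table-filling and backtrace ideas. The cost function here (0 for a match, 1 for a mismatch, 1 for a gap, minimized) is an assumption for illustration; the course's σ and scoring direction may differ.

```python
def align(x, y, mismatch=1, gap=1):
    m, n = len(x), len(y)
    # opt[i][j] = minimum cost of aligning x[:i] with y[:j]  -- O(n^2) table fill.
    opt = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        opt[i][0] = i * gap
    for j in range(1, n + 1):
        opt[0][j] = j * gap
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if x[i - 1] == y[j - 1] else mismatch
            opt[i][j] = min(opt[i - 1][j - 1] + sub,    # (mis)match
                            opt[i - 1][j] + gap,        # gap in y
                            opt[i][j - 1] + gap)        # gap in x
    # Backtrace: start at the bottom right and follow any neighbor that could
    # have produced the current cell under the cost function.
    ax, ay, i, j = [], [], m, n
    while i > 0 or j > 0:
        sub = 0 if i > 0 and j > 0 and x[i - 1] == y[j - 1] else mismatch
        if i > 0 and j > 0 and opt[i][j] == opt[i - 1][j - 1] + sub:
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif i > 0 and opt[i][j] == opt[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append('-'); i -= 1
        else:
            ax.append('-'); ay.append(y[j - 1]); j -= 1
    return opt[m][n], ''.join(reversed(ax)), ''.join(reversed(ay))

print(align("ACACACTA", "AGCACACA"))   # cost 2 (the alignment strings depend on tie-breaking)
```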

P vs. NP
- Some problems cannot be computed at all (e.g. the Halting Problem).
- Some problems can be computed but take a long time (e.g. SAT, 3-SAT, 3-coloring).
- P: a solution is computable in polynomial time.
- NP: a solution can be verified in polynomial time, given a hint that is polynomial in the input length.
- A ≤_p B means that if you have a fast algorithm for B, you have a fast algorithm for A.
- NP-complete: in NP and as hard as the hardest problem in NP.
- Any fast solution to an NP-complete problem would yield a fast solution to all problems in NP.
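To make "verified in polynomial time, given a hint" concrete: checking a proposed 3-coloring is a single pass over the edges, even though finding one is believed to be hard. The graph and colorings below are made up for illustration.

```python
def is_valid_3coloring(edges, coloring):
    # The "hint" is a color (0, 1, or 2) per vertex; verification is linear in the input.
    return (all(c in (0, 1, 2) for c in coloring.values())
            and all(coloring[u] != coloring[v] for u, v in edges))

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]                      # a triangle plus one extra vertex
print(is_valid_3coloring(edges, {0: 0, 1: 1, 2: 2, 3: 0}))    # True
print(is_valid_3coloring(edges, {0: 0, 1: 1, 2: 0, 3: 2}))    # False: edge (2, 0) is monochromatic
```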