STAT2201 Slava Vaisman

Similar documents
Revision : Thermodynamics

Modern Navigation. Thomas Herring

Preparing for the HNC Electrical Maths Components. online learning. Page 1 of 15

Section 2: Equations and Inequalities

Systems of Linear Equations

1 Maintaining a Dictionary

Lesson 24: True and False Number Sentences

Lesson 12: Solving Equations

PHY103A: Lecture # 1

Secondary 3H Unit = 1 = 7. Lesson 3.3 Worksheet. Simplify: Lesson 3.6 Worksheet

COMP 551 Applied Machine Learning Lecture 20: Gaussian processes

Worksheets for GCSE Mathematics. Quadratics. mr-mathematics.com Maths Resources for Teachers. Algebra

Physics 2D Lecture Slides Lecture 1: Jan

2.1 Intro to Algebra

Dan Roth 461C, 3401 Walnut

Math, Stats, and Mathstats Review ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

CSC 578 Neural Networks and Deep Learning

Teacher Road Map for Lesson 10: True and False Equations

Algorithms and Programming I. Lecture#1 Spring 2015

Section 5: Quadratic Equations and Functions Part 1

course overview 18.06: Linear Algebra

Physics 141H. University of Arizona Spring 2004 Prof. Erich W. Varnes

SECTION 7: FAULT ANALYSIS. ESE 470 Energy Distribution Systems

Unit Calendar. Date Sect. Topic Homework HW On-Time Apr , 2, 3 Quadratic Equations & Page 638: 3-11 Page 647: 3-29, odd

Required Textbook. Grade Determined by

Product and Inventory Management (35E00300) Forecasting Models Trend analysis

General Strong Polarization

Math 3 Unit 4: Rational Functions

Lesson 18: Recognizing Equations of Circles

HIGLEY UNIFIED SCHOOL DISTRICT INSTRUCTIONAL ALIGNMENT

Physics 2020 Lab 5 Intro to Circuits

Physics 273 (Fall 2013) (4 Credit Hours) Fundamentals of Physics II

Eureka Lessons for 6th Grade Unit FIVE ~ Equations & Inequalities

Multiple Regression Analysis: Inference ECONOMETRICS (ECON 360) BEN VAN KAMMEN, PHD

Lecture 6. Notes on Linear Algebra. Perceptron

Physics 2D Lecture Slides Lecture 1: Jan

Physics Fall Semester. Sections 1 5. Please find a seat. Keep all walkways free for safety reasons and to comply with the fire code.

Lesson 24: Using the Quadratic Formula,

MATH 18.01, FALL PROBLEM SET # 3A

Property Testing and Affine Invariance Part I Madhu Sudan Harvard University

Lecture 3. Linear Regression

Standard circuit diagram symbols Content Key opportunities for skills development

The Elasticity of Quantum Spacetime Fabric. Viorel Laurentiu Cartas

STAT 350 Final (new Material) Review Problems Key Spring 2016

Chapter 27 Summary Inferences for Regression

Lecture 3. STAT161/261 Introduction to Pattern Recognition and Machine Learning Spring 2018 Prof. Allie Fletcher

PHYS 272 (Spring 2018): Introductory Physics: Fields Homeworks

CHAPTER 5 Wave Properties of Matter and Quantum Mechanics I

Lesson 1: Exponential Notation

Physics 2D Lecture Slides Sep 26. Vivek Sharma UCSD Physics

Mathematics Algebra. It is used to describe the world around us.

Circuits Gustav Robert Kirchhoff 12 March October 1887

Lecture Outline. I. Fundamental Measurements II. Unit Conversion Level I: Motion (Treat the levels as practice exams)

A Review/Intro to some Principles of Astronomy

ECE 4800 Fall 2011: Electromagnetic Fields and Waves. Credits: 4 Office Hours: M 6-7:30PM, Th 2-3:30, and by appointment

Eureka Math. Grade 8, Module 4. Student File_B. Contains Exit Ticket, and Assessment Materials

ECS 20: Discrete Mathematics for Computer Science UC Davis Phillip Rogaway June 12, Final Exam

Course: Math 111 Pre-Calculus Summer 2016

Lesson 25: Using the Quadratic Formula,

Review for Exam Hyunse Yoon, Ph.D. Assistant Research Scientist IIHR-Hydroscience & Engineering University of Iowa

Of all the physical properties, it is temperature, which is being measured most often.

Geography 1103: Spatial Thinking

Lecture 29: Forced Convection II

Introduction to Electrical Theory and DC Circuits

6.01: Introduction to EECS I. Discrete Probability and State Estimation

7.3 The Jacobi and Gauss-Seidel Iterative Methods

MATH 1080: Calculus of One Variable II Fall 2018 Textbook: Single Variable Calculus: Early Transcendentals, 7e, by James Stewart.

Econ 1123: Section 2. Review. Binary Regressors. Bivariate. Regression. Omitted Variable Bias

Physics 162a Quantum Mechanics

Terms of Use. Copyright Embark on the Journey

Lecture 5: Hashing. David Woodruff Carnegie Mellon University

Warm-up Using the given data Create a scatterplot Find the regression line

6.080 / Great Ideas in Theoretical Computer Science Spring 2008

Loss Functions and Optimization. Lecture 3-1

Finite Automata Part One

Mini-project in scientific computing

Social networks and scripts. Zvi Lotker IRIF-France

Coulomb s Law and Coulomb s Constant

Homework #3 RELEASE DATE: 10/28/2013 DUE DATE: extended to 11/18/2013, BEFORE NOON QUESTIONS ABOUT HOMEWORK MATERIALS ARE WELCOMED ON THE FORUM.

Autonomous Mobile Robot Design

I will post Homework 1 soon, probably over the weekend, due Friday, September 30.

A new procedure for sensitivity testing with two stress factors

Lecture 10: Logistic Regression

1 EM Primer. CS4786/5786: Machine Learning for Data Science, Spring /24/2015: Assignment 3: EM, graphical models

3. Mathematical Modelling

Math 171 Spring 2017 Final Exam. Problem Worth

Mathematical Foundations of Finite State Discrete Time Markov Chains

Lecture 11. Kernel Methods

Teacher Road Map for Lesson 23: Solving Complicated Radical Equations

Designing Information Devices and Systems I Fall 2018 Homework 7

MAS156: Mathematics (Electrical and Aerospace)

Worksheets for GCSE Mathematics. Algebraic Expressions. Mr Black 's Maths Resources for Teachers GCSE 1-9. Algebra

Principles of the Global Positioning System Lecture 11

Work, Energy, and Power. Chapter 6 of Essential University Physics, Richard Wolfson, 3 rd Edition

Probability and Statistics

Nonregular Languages

Worksheet Week 4 Solutions

Low-Degree Polynomials

CSE 241 Class 1. Jeremy Buhler. August 24,

Astronomy 1010: Survey of Astronomy. University of Toledo Department of Physics and Astronomy

Transcription:

STAT2201 Slava Vaisman

This course is for engineering (civil, mechanical, software ) Electronic Course Profile http://www.courses.uq.edu.au/student_section_loader.php?section=1&profileid=9 2234

If you are doing STAT2201 1 unit course If you are doing CIVL2530 (2 units), then STAT2201 is about half of the CIVL2530 course. The Black Board site is joined for both courses CIVL2530 STAT2201 50% 50%

10 lectures 2 streams (same material): Yoni Nazarathy (coordinator) Slava Vaisman 6 tutorial meetings during the semester, each mapping to a homework assignment 2 hour final exam during the examination period, June 2017 CIVL2530 students should attend both civil and stat (details below) activities. See the CIVL2530 course profile for the time table of the CIVL2530 activities. https://courses.smp.uq.edu.au/stat2201/2018a/2018_weekbyweek.pdf

Assignments: 40% Best 5 out of 6, 8% each Final Exam: 60% Must get at least 40/100 on the exam to pass the course (2 hours) Exam examples : https://courses.smp.uq.edu.au/stat2201/20 18a/

Tutors will not provide consultations (use the lecturers) Yoni - Thursdays12pm 1pm, 67-753. Slava Tuesday 2pm-3pm 67-450 For technical questions, please come to consultation hours.

Study units are (mostly) mapped to Applied Statistics and Probability for Engineers, by D. C. Montgomery and G. C. Run

Condensed Notes (can take to the exam!!!) Get it from https://courses.smp.uq.edu.au/stat2201/2018a/

In this course, we use Julia Languages that worth noting R major statistical language Matlab/Octave Python

Probability vs Statistics and Data Science Deterministic vs Stochastic systems Inference Mechanistic and Empirical models

Probability deals with predicting the likelihood of future events. Probability is about creating models (learning complex relationships). What is the likelihood of a rainy day (tomorrow)? Statistics involves the analysis of the frequency of past events. Statistics is about collecting data. What is the average number of rainy days in Brisbane? Note that: Probability can help us to collect data in a better way, and, Statistics can be used for creating probabilistic models. Data Science is an emerging field, combining statistics, big-data, machine learning and computational techniques.

Deterministic systems Consider the (deterministic) function yy = xx 2. For any xx RR, the outcome yy is determined exactly. Example Ohm's law: II is the current VV is the voltage RR is the resistance II = VV RR

Have you ever used ampere-meter? Do you always get the same measurement? Is there a noise involved?

GPS navigation; Voice and video transmission systems; Communication over unreliable channels; Compression of signals; System reliability (system components fail randomly); Resource-sharing systems (random demand); Machine learning (visit https://www.kaggle.com/competitions);

Inference is the process of collecting data and say something about the world. Data analysis is the process of curating, organizing and analyzing data sets to make inferences. Statistical Inference is the process of making inferences about population parameters (often never fully observed) based on observations collected as part of samples. Example: A certain smartphone manufacturer claims that The battery lasts 2.3 days (on average).

Buy all smartphones (say NN phones were produced) For each smartphone, measure and record its battery life bb 1, bb 2,, bb NN Calculate the average battery life via bb 1 + bb 2 + + bb NN NN

Buy some small number of smartphones (say n NN) For each smartphone, measure and record its battery life bb 1, bb 2,, bb nn Estimate the average battery life via This number is called a statistic. bb 1 + bb 2 + + bb nn nn Statistic is just a quantity (a number) that we calculate from our sample. Sounds reasonable. However, we want our estimations to be reliable. How large nn should be? That is, what is the sample size?

Mechanistic model is a model for which we understand the basic physical mechanism (like Ohm's law): II = VV RR + εε Here, εε is a random term added to the model to account for the fact that the observed values of current flow do not perfectly conform to the mechanistic model. Empirical models are used by engineers where were is no simple or well understood mechanistic model that explains the phenomenon.

Consider the smartphone battery life example. We know that the battery life (LL) depends on the phone usage (UU). That is, there exists a function LL = ff UU. However, ff is unknown. We can try the first-order Taylor series expansion to achieve a (maybe) reasonable approximation. Namely, LL = ββ 0 + ββ 1 UU. Here, ββ 0 and ββ 1 are unknown parameters. In addition, we should account for other sources of variability (like a measurement error) by adding a random parameter εε, and hence: LL = ββ 0 + ββ 1 UU + εε.

We can now make an inference about the intercept (ββ 0 ) and slope (ββ 1 ), respectively.

Stochastic Simulation is about generating random numbers. It might be not very clear now why one would like to do so, but you will have to trust me (for now). Suppose for example, that I would like to play the best paying slot machine. I can try to observe several machines and perform some statistics. Alternatively, suppose that I am slot machine designer. I am building probabilistic model that specifies the likelihood of getting all kinds of winning combinations. Then, I can ask a computer to generate many spins based on my model. As soon as these spins are available, I can calculate many quantities of interest, such as What is the machine payout How often a player wins

Natural phenomena like atmospheric and white noise, or temperature, can be used for a generation of random numbers. These are expensive. We would like to get random numbers using a computer. However, we have a few requirements. It should be robust and reliable. It should be fast. It should be reproducible. That is, one should be able to recover the stream without storing it in the memory. This property is important for testing. The period of the generator is the smallest number of steps taken before entering the previously visited state. A good generator should have a large period. It should be application dependent. For example, in cryptography, it is crucial that the generated sequence will be hard to predict.

Pseudorandom number generators is an important field of study. We will only care about it during this lecture. All modern pseudorandom number generators are capable of producing a sequence UU 1, UU 2, of random numbers such that 1. 0 UU ii 1, and 2. UU 1, UU 2, have a sort of uniform spread on the unit interval. Such a uniform spread is called uniform distribution and is denoted by UU(0,1)

A general pseudorandom number generator will be of the following form: In order to create such a generator, we need the following. Specify an initial number (seed) for reproducibility; (this is XX 0 ). Define some appropriate functions ff and gg.

Define ff XX = aaaa + cc mod mm, and gg XX = XX/mm for some constants aa, cc and mm. Let us set for example aa = 3, cc = 1, and mm = 10,000. Finally, set the seed XX 0 = 1. We can show that: XX 1 = 4 UU 1 = 4 10,000 XX 2 = 13 UU 2 = 13 10,000

Suppose that we would like to use such a generator to plot two-dimensional random uniform points. The algorithm is simple, plot pairs UU 1, UU 2, UU 3, UU 4, We expect to get:

However, using aa = 3, cc = 1, and mm = 10,000, we get: Conclusion: aa = 3, cc = 1, and mm = 10,000 is a very bad choice!

Nevertheless, by using aa = 69069, cc = 1, and mm = 2 32, we get a nice spread. Conclusion: except of this lecture, we do not implement random generators! We use the ones that passed appropriate statistical tests!

https://juliabox.com/up/uq/ajuf5nq Comfortable web interface

Cell types: Markdown and Code

Useful command: mark a cell and press "x" to delete it.

Check JuliaReferenceSheet.pdf

Suppose that we have some random numbers XX 1,, XX nn. Let us sum them to get a new random number SS nn = XX 1 + XX 2 + + XX nn. Then, SS nn has a special spread (distribution), called a Gaussian distribution.