Info 2950 Intro to Data Science

Similar documents
Finite Mathematics

College Algebra To learn more about all our offerings Visit Knewton.com

ECE285/SIO209, Machine learning for physical applications, Spring 2017

Fundamentals of Probability Theory and Mathematical Statistics

Machine Learning: Assignment 1

MATH-3150H-A: Partial Differential Equation 2018WI - Peterborough Campus

Homework 1 Solutions Probability, Maximum Likelihood Estimation (MLE), Bayes Rule, knn

Two years of high school algebra and ACT math score of at least 19; or DSPM0850 or equivalent math placement score.

HIRES 2017 Syllabus. Instructors:

MATH 135 PRE-CALCULUS: ELEMENTARY FUNCTIONS COURSE SYLLABUS FALL 2012

Value Function Methods. CS : Deep Reinforcement Learning Sergey Levine

DRAFT. Algebra I Honors Course Syllabus

EECS 1028 M: Discrete Mathematics for Engineers

Prerequisite: Qualification by assessment process or completion of Mathematics 1050 or one year of high school algebra with a grade of "C" or higher.

Computational Genomics

Date Credits 3 Course Title College Algebra II Course Number MA Pre-requisite (s) MAC 1105 Co-requisite (s) None Hours 45

Mathematics Prince George s County Public Schools SY

Lectures about Python, useful both for beginners and experts, can be found at (

Course ID May 2017 COURSE OUTLINE. Mathematics 130 Elementary & Intermediate Algebra for Statistics

Utah Secondary Mathematics Core Curriculum Precalculus

Algebra and Trigonometry

Saint Patrick High School

CS 2750: Machine Learning. Bayesian Networks. Prof. Adriana Kovashka University of Pittsburgh March 14, 2016


ECE 3800 Probabilistic Methods of Signal and System Analysis

Lecture 2: Probability and Distributions

COWLEY COLLEGE & Area Vocational Technical School

INFO 2950 Intro to Data Science. Lecture 18: Power Laws and Big Data

The classifier. Theorem. where the min is over all possible classifiers. To calculate the Bayes classifier/bayes risk, we need to know

The classifier. Linear discriminant analysis (LDA) Example. Challenges for LDA

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

Astro 32 - Galactic and Extragalactic Astrophysics/Spring 2016

COWLEY COLLEGE & Area Vocational Technical School

Math.3336: Discrete Mathematics. Combinatorics: Basics of Counting

North Toronto Christian School MATH DEPARTMENT

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

Machine Learning. Instructor: Pranjal Awasthi

COWLEY COLLEGE & Area Vocational Technical School

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Subject CS1 Actuarial Statistics 1 Core Principles

CMPSCI 240: Reasoning about Uncertainty

PAC-learning, VC Dimension and Margin-based Bounds

CPSC 540: Machine Learning

CSC411 Fall 2018 Homework 5

Machine Learning CISC 5800 Dr Daniel Leeds

CURRICULUM GUIDE Algebra II College-Prep Level

PELLISSIPPI STATE COMMUNITY COLLEGE MASTER SYLLABUS

Math , Fall 2014 TuTh 12:30pm - 1:45pm MTH 0303 Dr. M. Machedon. Office: Math Office Hour: Tuesdays and

Requisite Approval must be attached

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014

Honors Algebra II / Trigonometry

MATH College Algebra 3:3:1

Bayesian Classifiers, Conditional Independence and Naïve Bayes. Required reading: Naïve Bayes and Logistic Regression (available on class website)

Machine Learning & Data Mining Caltech CS/CNS/EE 155 Set 2 January 12 th, 2018

MATH 137 : Calculus 1 for Honours Mathematics. Online Assignment #2. Introduction to Sequences

Sets and Functions. MATH 464/506, Real Analysis. J. Robert Buchanan. Summer Department of Mathematics. J. Robert Buchanan Sets and Functions

Machine Learning CISC 5800 Dr Daniel Leeds

4. List of program goals/learning outcomes to be met. Goal-1: Students will have the ability to analyze and graph polynomial functions.

CMPT 310 Artificial Intelligence Survey. Simon Fraser University Summer Instructor: Oliver Schulte

MTH 163, Sections 40 & 41 Precalculus I FALL 2015

Algebra & Trigonometry for College Readiness Media Update, 2016

Intro to Probability Instructor: Alexandre Bouchard

Mr. Kelly s Algebra II Syllabus

Data Mining. 3.6 Regression Analysis. Fall Instructor: Dr. Masoud Yaghini. Numeric Prediction

HOSTOS COMMUNITY COLLEGE DEPARTMENT OF MATHEMATICS

HOSTOS COMMUNITY COLLEGE DEPARTMENT OF MATHEMATICS. MAT 010 or placement on the COMPASS/CMAT

MATH 137 : Calculus 1 for Honours Mathematics Instructor: Barbara Forrest Self Check #1: Sequences and Convergence

Chemistry 55 Lecture and Lab Quantitative Analysis Fall 2003

COURSE OUTLINE. Course Number Course Title Credits MAT251 Calculus III 4

Introduction to Machine Learning

Information Science 2

Portable Assisted Study Sequence ALGEBRA IIB

BN Semantics 3 Now it s personal! Parameter Learning 1

Utah Core State Standards for Mathematics - Precalculus

STAT 302 Introduction to Probability Learning Outcomes. Textbook: A First Course in Probability by Sheldon Ross, 8 th ed.

Secondary Honors Algebra II Objectives

Boosting. CAP5610: Machine Learning Instructor: Guo-Jun Qi

STA 4273H: Statistical Machine Learning

SPRING 2014 Department of Physics & Astronomy, UGA PHYS 4202/6202 Electricity and Magnetism II (as of Jan. 07/2014)

download from

Bayesian Networks Representation

Introduction to Engineering Analysis - ENGR1100 Course Description and Syllabus Monday / Thursday Sections. Fall '17.

cse303 ELEMENTS OF THE THEORY OF COMPUTATION Professor Anita Wasilewska

Machine Learning: Homework 5

Sets. Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry. Fall 2007

Course Staff. Textbook

Chapter 2 Class Notes

CPSC 340: Machine Learning and Data Mining. MLE and MAP Fall 2017

Algebra 1A is a prerequisite course for Algebra 1B. Before beginning this course, you should be able to do the following:

MATH-0955: BEGINNING ALGEBRA

Generative Clustering, Topic Modeling, & Bayesian Inference

Problem Set 2. MAS 622J/1.126J: Pattern Recognition and Analysis. Due: 5:00 p.m. on September 30

MA : Introductory Probability

Today s Menu. Administrativia Two Problems Cutting a Pizza Lighting Rooms

CS 208: Automata Theory and Logic

JEFFERSON COLLEGE COURSE SYLLABUS MTH 110 INTRODUCTORY ALGEBRA. 3 Credit Hours. Prepared by: Skyler Ross & Connie Kuchar September 2014

Sets. Slides by Christopher M. Bourke Instructor: Berthe Y. Choueiry. Spring 2006

Sets. Introduction to Set Theory ( 2.1) Basic notations for sets. Basic properties of sets CMSC 302. Vojislav Kecman

Sampling for Frequent Itemset Mining The Power of PAC learning

Transcription:

Info 2950 Intro to Data Science Instructor: Paul Ginsparg (242 Gates Hall) Tue/Thu 10:10-11:25 Kennedy Hall 116 (Call Auditorium) Only permitted to use middle section of auditorium https://courses.cit.cornell.edu/info2950_2017sp/

https://courses.cit.cornell.edu/info2950_2017sp/

0. Review of basic python / jupyter notebook 1. Counting and probability (factorial, binomial coefficients, conditional probability, Bayes Theorem Real Data: text classifier, etc. [baby machine learning] 2. Statistics: mean, variance; binomial, Gaussian, Poisson distributions 3. Graph theory (nodes, edges), networks (c.f. Info 2040), graph algorithms 4. Power Law data (need exponential and logarithms ) 5. Linear and Logistic regression, Pearson and Spearman correlators 6. Markov and other correlated data Rosen chapters 2,6,7,10,11 Easley Kleinberg chpts 3,18 + many other on-line resources [mentioned, e.g. Berkeley Foundations of Data Science https://data-8.appspot.com/sp16/course ]

Problem sets will involve both programming and non-programming problems. Problem sets are not group projects. You are expected to abide by the Cornell University Code of Academic Integrity. It is your responsibility to understand and follow these policies. (In particular, the work you submit for course assignments must be your own. You may discuss homework assignments with other students at a high level, by for example discussing general methods or strategies to solve a problem, but you must cite the other student in your submission. Any work you submit must be your own understanding of the solution, the details of which you personally and individually worked out, and written in your own words.) You ll be penalized if you copy an ipython notebook, OR if yours is copied.

Problem Set 0, due in one week to be posted later today: will include instructions for installing anaconda, we'll standardize on python 3 due to minor python 2.7/3.5 compatibility issues (though welcome to use python 2) known problem with python installations: cs 1110 unfortunately recommends misconfigured software that violates standard practice by adding environment variables to ~/.bashrc file. (link to instructions for removing)

https://www.continuum.io/downloads

Definition. A set is a collection of objects. The objects of a set are called elements of the set. X = {1, 2, 3, 4, 5} x 2 S, or x 62 S C = {Ithaca, Boston, Chicago} Stu = {1,snow,Cornell,y} empty set = ; Cardinality S is the number of elements of S Can also be defined by rule or equation: X = 5, C = 3, Stu = 4, ; =0 Example: E is the set of even numbers. E = {x x is even}

X = {1, 2, 3, 4, 5} C = {Ithaca, Boston, Chicago} A subset T of a set S is a set of elements all of which are contained in S. T S (proper subset) or T S empty set ;2S for all S Stu = {1,snow,Cornell,y} empty set = ; Can also be defined by rule or equation: Example: E is the set of even numbers. E = {x x is even} C 0 = {Ithaca, Chicago} C 0 C X 0 = {x x is a whole number between 2 and 5} is a subset of X

For two sets to be the same, must have the same elements. A = B means that 8x we have x 2 A i x 2 B (Equivalently A = B means that A B and B A)

The power set P(S) of a set S is the set of all subsets of S. Example: For the set A = {1, 2, 3}, P(A) ={;, {1}, {2}, {3}, {1, 2}, {2, 3}, {1, 3}, {1, 2, 3}} For a set S with n elements, what is P(S)? Cartesian product of two sets A B = {(x, y) x 2 A and y 2 B} Example: A A = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)}

Set Operations union of two sets A [ B = {x x 2 A or x 2 B} X = {1, 2, 3, 4, 5} C = {Ithaca, Boston, Chicago} X [ Stu = {1, 2, 3, 4, 5, snow, Cornell,y} C [;= {Ithaca, Boston, Chicago} A [ X = {1, 2, 3, 4, 5} (In this case, A [ X = X). Stu = {1,snow,Cornell,y} empty set = ; Can also be defined by rule or equation: Example: E is the set of even numbers. E = {x x is even} A = {1, 2, 3} intersection of two sets A \ B = {x x 2 A and x 2 B} X \ Stu = {1} C \;= ; X \ E = {2, 4} A \ X = {1, 2, 3} (In this case A \ X = A)

di erence of two sets A B = {x x 2 A and x 62 B} X = {1, 2, 3, 4, 5} X Stu = {2, 3, 4, 5} Stu X = {snow, Cornell, y} C = {Ithaca, Boston, Chicago} C ;= {Ithaca, Boston, Chicago} Stu = {1,snow,Cornell,y} X E = {1, 3, 5} empty set = ; symmetric di erence A4B = {x x 2 A or x 2 B, and x 62 A \ B} Can also be defined by rule or equation: Example: E is the set of even numbers. E = {x x is even} A = {1, 2, 3} X4Stu = {2, 3, 4, 5, snow, Cornell, y} C4; = {Ithaca, Boston, Chicago} X4A = {4, 5}