CSE 241 Class 1. Jeremy Buhler. August 24, 2015


Before class, write URL on board: http://classes.engineering.wustl.edu/cse241/. Also: Jeremy Buhler, Office: Jolley 506, 314-935-6180.

1 Welcome and Introduction

Welcome to CSE 241: Algorithms and Data Structures! I'm your instructor, Dr. Jeremy Buhler. I study computational biology: we deal with lots of classic algorithmic problems, like searching for things and finding similar objects in a large set. The objects are sequences of DNA and protein, and they occur in sets millions of bases in size. Specialized algos deal with them, though many rely on basic 241 stuff (you'll see an example!). I'm a theorist some days (an algorithm producer) and an informed consumer too: basic searching, sorting, etc. must run fast and not take much space, or biologists get grumpy! I want to get you excited, or at the very least informed, about algos, so that you too can be a producer and/or informed consumer and make your colleagues happy!

Who are you? Ask for show of hands: CSE students? Engineering? Arts and Sciences? Sophomores? CS minors? Second majors?

2 First, Some Administrivia (Course Policies)

Let's get this out of the way so we can proceed to the fun stuff!

1. First, everything I'm going to say and more is on the web site.

2. Office hours: Mon 10:30-12:00, Wed 10:30-12:00.

3. TAs: lots of them; they will also hold office hours.

4. We will use Piazza for discussions and help requests. Please create an account if you have not done so and enroll in 241 to see the discussion boards.

5. TAs and I may not answer help requests sent via email, or we may ask you to repost via Piazza. With almost 200 people, it will be much easier to keep track if we keep all discussions centralized.

6. Evaluation: 5 homeworks worth 20%, 4 labs worth 20%, three equally weighted exams worth 60%. No cumulative final!

7. Turn-ins: no late homeworks accepted! (I'll try to hand out solutions immediately.) Late labs are accepted up to three days, at 10% off per day. If you have no late labs, you can redo one and turn it in at the end of the semester for a regrade (at a cost of 10%).

8. We've implemented electronic turn-in for labs using personalized SVN repositories, similar to 131, and for homeworks using Blackboard. See the web site for instructions. Lab 0 and Homework 0 are out today, due 9/2, to make sure everyone can successfully use the infrastructure. (Yes, they are for credit.)

9. Collaboration: I try to balance fun with fairness. No written sharing; Iron Chef rule. Everyone must read the policy on the class web site and sign a statement of compliance with each homework. Also, we plan to do electronic comparisons among turned-in labs to check for code copying.

10. We will try to treat you with respect; please reciprocate.

3 Goals, and a First Problem

What do we want you to learn?

A way to think about computation. What is a good algorithm? Which of two algorithms is better? What does "fast" / "faster" mean?

Abstractions: ways of thinking about how to organize data. You already know some: lists, queues, stacks. We'll introduce more abstractions (sparse tables, trees, graphs) that can be realized efficiently.

A "bag of tricks": basic algorithmic techniques (sorting, searching, data management) and analytical techniques (recursion trees, asymptotic notation, lower bounds). Both practical and basic to computer science culture (yes, it's more than just writing code).
Let's start on the first of these three points with an example problem.

Air traffic control center: n airplanes appear as points on a 2D screen. At any time, you must quickly be able to identify the closest pair of planes (points) on the screen! (E.g., to sound a warning if they're too close.) The airport is busy: many planes are on screen at any time.

The closest pair problem has other uses as well. [Draw the following picture here]

What is the obvious solution to this problem? For each point p, compute its distance to all other points q. (Write it:)

    distance(p, q) = sqrt((p.x − q.x)² + (p.y − q.y)²)

Return the pair at minimum distance.

A little more formally... Let P[0 ... n − 1] be an array of points.

CheckAll(P)
    mindist ← ∞
    j ← 0
    while j ≤ n − 2 do
        k ← j + 1
        while k ≤ n − 1 do
            d ← distance(P[j], P[k])
            if d < mindist
                mindist ← d
                p1 ← P[j]
                p2 ← P[k]
            k++
        j++
    return p1, p2, mindist

Does anyone have a better idea or improvement? [Solicit, use as example of alternative algorithms.]

4 Is This a Good Algorithm?

Is it correct? Despite what you may have heard, correct algorithms are always better than incorrect ones.
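As an aside, the brute-force CheckAll routine can be rendered as runnable Python. This is a sketch, not the course's official code: it assumes points are (x, y) tuples and uses math.hypot for the distance formula.

```python
import math

def check_all(points):
    """Brute-force closest pair: compare every unordered pair of points.

    points: list of (x, y) tuples; assumes at least two points.
    Returns (p1, p2, mindist).
    """
    n = len(points)
    mindist = math.inf
    p1 = p2 = None
    for j in range(n - 1):            # j = 0 .. n-2
        for k in range(j + 1, n):     # k = j+1 .. n-1
            p, q = points[j], points[k]
            d = math.hypot(p[0] - q[0], p[1] - q[1])
            if d < mindist:
                mindist = d
                p1, p2 = p, q
    return p1, p2, mindist
```

The nested ranges mirror the pseudocode's loop bounds exactly, so the same statement-counting argument applies to this version.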

(Yes, it's correct: it checks every pair and stores the pair at minimum distance.) Correctness means correct for all possible inputs!

Is it fast? In particular, what if I gave you a different algorithm for closest pair: is CheckAll faster or slower? How do we usefully measure the speed of an algorithm?

Code, debug, time execution? On what machine? With what inputs? Will the planes line up the same way as our test points? (If not, will we be sued for a crash?!)

We desire machine independence: say something useful about speed that doesn't depend on PC versus supercomputer versus smartphone.

Estimate statements executed in an ideal language, or on an ideal machine? Don't laugh: Turing machines, recursive function theory, Knuth's MIX architecture. We'll try this approach.

Speed should be expressed as a function of input size. Most interesting algorithms produce answers as a function of their inputs. The bigger the input, the more time to read it (and, in general, the more time to process it). Example: more points n means CheckAll computes more distances.

We need a worst-case time estimate for each input size: what if Dr. Evil designed the flight plan? What if we're unlucky? Make clear that choosing to consider the worst case is a fundamental assumption. Compare the average case: it can be useful if the distribution of inputs is known, but only if the worst case is not catastrophic!

5 Let's Count Statements for CheckAll

Assume an input array P of size n.

mindist ← ∞
j ← 0
while j ≤ n − 2 do
    k ← j + 1
    while k ≤ n − 1 do
        d ← distance(P[j], P[k])
        if d < mindist
            mindist ← d
        k++
    j++
return mindist

Now let's add up all statements S(n) (3 statements outside the loops, n outer-loop tests, 2(n − 1) statements in the outer loop body, then the inner loop's tests and worst-case body):

S(n) = 3 + 3n − 2 + Σ_{j=0}^{n−2} (n − j) + 4 Σ_{j=0}^{n−2} (n − j − 1)
     = 3n + 1 + (n + (n−1) + ... + 2) + 4((n−1) + (n−2) + ... + 1)
     = 3n + 1 + [n(n+1)/2 − 1] + 4[(n−1)n/2]
     = (5/2)n² + (3/2)n

Some hints for doing this yourself:

For each loop in a nested set, write down the number of times its body statements execute as a function of its explicit bounds. (That would be high bound minus low bound plus one, if the interval is closed.) For instance, the inner loop shown runs from k = j + 1 to k = n − 1, so its body executes n − j − 1 times. Wait until you've labeled every statement in this way before trying to sum over the different values of loop counters like j.

If a loop iterates m times, its test is executed m + 1 times (including the last failure).
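One way to check a hand count like this is to instrument the loops with a counter. The sketch below (Python, my own instrumentation, not from the notes) charges one unit per assignment, test, increment, and return, and assumes the worst case where the if-test always succeeds; it reproduces S(n) = (5/2)n² + (3/2)n.

```python
def count_statements(n):
    """Count 'statements' executed by CheckAll on a worst-case input of size n,
    charging 1 unit per assignment, loop test, if-test, increment, and return."""
    count = 2              # mindist <- infinity, j <- 0
    j = 0
    while True:
        count += 1         # outer loop test (also counts the final failing test)
        if not (j <= n - 2):
            break
        count += 1         # k <- j + 1
        k = j + 1
        while True:
            count += 1     # inner loop test (also counts the final failing test)
            if not (k <= n - 1):
                break
            count += 3     # d <- distance(...), if-test, mindist <- d (worst case)
            count += 1     # k++
            k += 1
        count += 1         # j++
        j += 1
    count += 1             # return
    return count

# Agrees with S(n) = (5/2)n^2 + (3/2)n for every n checked:
for n in (10, 50, 100, 500):
    assert count_statements(n) == (5 * n * n + 3 * n) // 2
```

Instrumented counts like this are a handy sanity check before (not instead of) doing the summation by hand.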

Remember your sequence sums, e.g.

    Σ_{i=1}^{n} i = n(n+1)/2

6 Why It's OK to Approximate

Whew! That was a lot of work! The above approach gives a useful answer, but it is often more precise than necessary.

n      (5/2)n²    (5/2)n² + (3/2)n    % difference
10     250        265                 5.7%
50     6250       6325                1.2%
100    25000      25150               0.6%
500    625000     625750              0.1%

As you can see, the highest-order polynomial term of the running time eventually dominates. We say that this term determines the worst-case asymptotic time complexity of the algorithm. Hence, we say that the running time is roughly a constant times n², or, for short, Θ(n²)! More on the meaning of Θ next week!

Eventually? For large enough input size n, or, as computer scientists like to say, "in asymptopia" (a magical land where only asymptotic complexity matters).

What good is a simplified running time estimate?

Easier to compute. Example: looking at CheckAll, it is obvious that the inner loop computes (n choose 2) distances in the worst case for every n. This observation is enough to say that the algorithm's cost is at least a constant times n² (more on this next week).

Asymptotic comparisons are frequently meaningful in practice. Suppose we had another closest-pair algo that runs in time Θ(n log n)? Even if the constant is moderately worse... [Show graphs of T(n) versus n here. First graph shows quadratic algo winning, but point out small n. Now show other two graphs.] If you weren't sure how many planes might show up at the airport, which algorithm would you rather use?

Finally, a caveat about estimating running times by counting statements: in your ideal language of choice, be careful that statements take only constant time, regardless of input size. Example: in many languages, list.length() is a single statement, but its execution time depends on the size of the list.

Example: Multiplying two matrices of size n × n is written as one line in Matlab but takes time Θ(n³)! If you're careful, statement counts and their asymptotic estimates can be a valuable notion of what it means to have a fast algorithm. [Show graphs of merge-sort versus insertion sort]
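In that spirit, the quadratic-versus-n-log-n comparison can be made concrete with a toy cost model in Python. The constant factor 100 on the n log n algorithm is invented purely for illustration; the point is only that the crossover exists.

```python
import math

# Toy cost models: CheckAll's statement count versus a hypothetical
# Theta(n log n) closest-pair algorithm with a much larger (made-up) constant.
def cost_quadratic(n):
    return 2.5 * n * n + 1.5 * n      # S(n) = (5/2)n^2 + (3/2)n

def cost_nlogn(n):
    return 100 * n * math.log2(n)     # hypothetical constant factor of 100

# For small n, the quadratic algorithm is cheaper...
assert cost_quadratic(10) < cost_nlogn(10)
# ...but for large n, the n log n algorithm wins despite its constant.
assert cost_nlogn(10_000) < cost_quadratic(10_000)
```

If you don't know how many planes might appear on the screen, the asymptotically better algorithm is the safer bet.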