Bioeng 3070/5070 App Math/Stats for Bioengineer Lecture 3
Five-Number Summary
The five-number summary of a data set consists of:
- the minimum (smallest observation)
- the first quartile (the 25th percentile, which cuts off the lowest 25% of the data)
- the median (the middle value)
- the third quartile (the 75th percentile, which cuts off the highest 25% of the data)
- the maximum (largest observation)
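The summary above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own method: the function name `five_number_summary` and the sample values are hypothetical, and the quartiles use the "inclusive" interpolation convention of the standard library, which is only one of several conventions in common use.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a sequence of at least two numbers."""
    values = sorted(data)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # method="inclusive" is an assumption; other conventions (for example the
    # default "exclusive") can give slightly different quartiles.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return values[0], q1, median, q3, values[-1]


# Hypothetical measurements, invented for illustration only.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]
print(five_number_summary(sample))  # (2, 4.0, 7.0, 9.0, 12)
```

Different software packages may report slightly different quartiles for the same data, so it is worth checking which convention a tool uses before comparing results.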
The Box Plot
1. Calculate the first quartile (25th percentile), the median (50th percentile), and the third quartile (75th percentile).
2. Calculate the interquartile range (IQR) by subtracting the first quartile from the third quartile.
3. Construct a box above the number line bounded on the left by the first quartile and on the right by the third quartile.
4. Indicate where the median lies inside the box with a symbol or a line dividing the box at the median value.
5. Optionally, label the mean of the data with a point (not always included).
6. Any observation that lies more than 1.5 × IQR below the first quartile or more than 1.5 × IQR above the third quartile is considered an outlier.
7. Connect the smallest value that is not an outlier to the box with a horizontal line, or "whisker". Optionally, mark the position of this value more clearly with a small vertical line.
8. Likewise, connect the largest value that is not an outlier to the box with a whisker (and optionally mark it with another small vertical line).
9. Indicate outliers by open and closed dots. "Extreme" outliers, those lying more than 3 × IQR below the first quartile or above the third quartile, are drawn as open dots. "Mild" outliers, those lying more than 1.5 × IQR from the first or third quartile but not extreme, are drawn as closed dots. (Sometimes no distinction is made between mild and extreme outliers.)
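The box plot construction above reduces to a handful of numbers: the quartiles, the IQR, the whisker endpoints, and the two classes of outliers. The sketch below computes them for a small data set. The function name `box_plot_stats`, the dictionary keys, and the data are hypothetical, and the "inclusive" quartile convention is an assumption.

```python
import statistics


def box_plot_stats(data):
    """Compute the quantities needed to draw a box plot with the 1.5 x IQR rule."""
    values = sorted(data)
    # Quartile convention is an assumption; see statistics.quantiles for alternatives.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    # Inner fences separate ordinary points from outliers,
    # outer fences separate mild outliers from extreme ones.
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    inside = [v for v in values if inner_low <= v <= inner_high]
    extreme = [v for v in values if v < outer_low or v > outer_high]
    mild = [
        v for v in values
        if (v < inner_low or v > inner_high) and outer_low <= v <= outer_high
    ]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "whisker_low": inside[0],    # smallest value that is not an outlier
        "whisker_high": inside[-1],  # largest value that is not an outlier
        "mild": mild, "extreme": extreme,
    }


# Hypothetical data with one mild and one extreme outlier, for illustration only.
stats = box_plot_stats([10, 12, 13, 14, 15, 16, 17, 18, 19, 27, 40])
print(stats["q1"], stats["median"], stats["q3"], stats["iqr"])  # 13.5 16.0 18.5 5.0
print(stats["whisker_low"], stats["whisker_high"])              # 10 19
print(stats["mild"], stats["extreme"])                          # [27] [40]
```

Here the inner fences are 6.0 and 26.0 and the outer fences are -1.5 and 33.5, so 27 is a mild outlier and 40 is an extreme one.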
Example: Age Data
Statistic              Value
Minimum                234
Q1                     253
Median                 270
Mean                   268
Q3                     282
Maximum                303
Upper Adjacent Value   303
Lower Adjacent Value   234
Count                  17
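As a check on the age data summary, the 1.5 × IQR rule can be applied to the reported quartiles. The raw observations are not given, so this worked calculation uses only the summary values; the term "fence" for the outlier cutoffs is introduced here for convenience.

```latex
\begin{aligned}
\mathrm{IQR} &= Q_3 - Q_1 = 282 - 253 = 29 \\
1.5 \times \mathrm{IQR} &= 43.5 \\
\text{lower fence} &= Q_1 - 1.5 \times \mathrm{IQR} = 253 - 43.5 = 209.5 \\
\text{upper fence} &= Q_3 + 1.5 \times \mathrm{IQR} = 282 + 43.5 = 325.5
\end{aligned}
```

The minimum (234) lies above the lower fence and the maximum (303) lies below the upper fence, so there are no outliers. The whiskers therefore extend to the minimum and maximum, which agrees with the reported adjacent values of 234 and 303.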
Probability Theory
Basic notions:
- Experiment: any process of observation or measurement.
- Event or outcome: the result one obtains from an experiment.
- Sample space: the set of all possible outcomes.
An event is always a subset of the sample space.
Example of Sample Spaces
Rolling a pair of dice: S = {(x, y) | x = 1, 2, ..., 6; y = 1, 2, ..., 6}
Events are not just points in the sample space but subsets of it.
Example: let B be the event that the total number of points rolled is 7.
B = {(1,6), (2,5), (3,4), (4,3), (5,2), (6,1)}
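The dice example can be reproduced by enumerating the sample space directly. The sketch below builds the set of ordered pairs and picks out the event that the total is 7; the variable names are illustrative.

```python
from itertools import product

# Sample space for a pair of dice: every ordered pair (x, y) with x, y in 1..6.
faces = range(1, 7)
sample_space = set(product(faces, repeat=2))

# Event B: the total number of points rolled is 7.
event_b = {(x, y) for (x, y) in sample_space if x + y == 7}

print(len(sample_space))        # 36
print(sorted(event_b))          # [(1, 6), (2, 5), (3, 4), (4, 3), (5, 2), (6, 1)]
print(event_b <= sample_space)  # True: an event is a subset of the sample space
```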
Probability Theory
Sample spaces can be finite, countably infinite, or continuous. A sample space that is finite or countably infinite is called discrete. Continuous sample spaces are a good mathematical model for outcomes of measurements of physical properties such as temperature and length.
Probabilistic Notions
A ∪ B is the event that A or B or both occur.
A ∩ B is the event that both A and B occur at the same time.
Events A and B are mutually exclusive if they cannot happen at the same time: A ∩ B = ∅.
Axioms of Probability
1. The probability of an event is a nonnegative real number: for any event A ⊆ S, P(A) ≥ 0.
2. P(S) = 1.
3. If A₁, A₂, A₃, ... is a finite or infinite sequence of mutually exclusive events (Aᵢ ∩ Aⱼ = ∅ for i ≠ j), then P(A₁ ∪ A₂ ∪ A₃ ∪ ...) = P(A₁) + P(A₂) + P(A₃) + ...
Some Rules of Probability
1. If A and A′ are complementary events in a sample space S, then P(A′) = 1 - P(A).
Proof:
A ∪ A′ = S
=> P(A ∪ A′) = 1 (by Axiom 2)
=> P(A) + P(A′) = 1 (by Axiom 3, since A ∩ A′ = ∅)
Some Rules of Probability
If A and B are any two events in S, then P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
Proof:
A ∪ B = (A ∩ B) ∪ (A ∩ B′) ∪ (A′ ∩ B), and these three events are mutually exclusive, so by Axiom 3
P(A ∪ B) = P(A ∩ B) + P(A ∩ B′) + P(A′ ∩ B)
         = [P(A ∩ B) + P(A ∩ B′)] + [P(A′ ∩ B) + P(A ∩ B)] - P(A ∩ B)
         = P(A) + P(B) - P(A ∩ B)
The rule can be extended to more than two events (homework).
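Both rules can be checked numerically on the two-dice sample space. The sketch below assumes all 36 outcomes are equally likely, which the slides do not state explicitly, and uses two events chosen only for illustration. Exact fractions avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import product

sample_space = set(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event, ASSUMING all 36 outcomes are equally likely."""
    return Fraction(len(event & sample_space), len(sample_space))


# Hypothetical events chosen for illustration.
event_a = {(x, y) for (x, y) in sample_space if x == 6}      # first die shows 6
event_b = {(x, y) for (x, y) in sample_space if x + y == 7}  # total is 7

# Complement rule: P(A') = 1 - P(A)
complement_a = sample_space - event_a
print(prob(complement_a), 1 - prob(event_a))  # 5/6 5/6

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
lhs = prob(event_a | event_b)
rhs = prob(event_a) + prob(event_b) - prob(event_a & event_b)
print(lhs, rhs)  # 11/36 11/36
```

The events overlap in the single outcome (6, 1), which is why simply adding P(A) and P(B) would count that outcome twice.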
Example
The probability that an insulin pump shipped by a manufacturer has a manufacturing defect, either a bad power supply or a blocked delivery system, is 0.03.
1. The probability that an insulin pump has a bad power supply is 0.023.
2. The probability that an insulin pump has a blocked delivery system is 0.024.
What is the probability that a pump has both a bad power supply and a blocked delivery system?
Example
Event A: the insulin pump has a bad power supply. P(A) = 0.023
Event B: the insulin pump has a blocked delivery system. P(B) = 0.024
P(A ∪ B) = 0.03
A ∩ B: the pump has both a bad power supply and a blocked delivery system.
Example
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
P(A ∩ B) = P(A) + P(B) - P(A ∪ B)
Probability that the pump has both a bad power supply and a blocked delivery system:
P(A ∩ B) = 0.023 + 0.024 - 0.03 = 0.017
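The arithmetic in the insulin pump example can be reproduced as follows. This is a small sketch using the probabilities from the example; the variable names are illustrative, and `Decimal` is used so that the result prints exactly rather than with binary floating-point rounding.

```python
from decimal import Decimal

p_a = Decimal("0.023")      # P(A): bad power supply
p_b = Decimal("0.024")      # P(B): blocked delivery system
p_a_or_b = Decimal("0.03")  # P(A or B): at least one of the two defects

# Rearranged addition rule: P(A and B) = P(A) + P(B) - P(A or B)
p_a_and_b = p_a + p_b - p_a_or_b
print(p_a_and_b)  # 0.017

# Sanity check: the intersection cannot be more probable than either event alone.
consistent = Decimal("0") <= p_a_and_b <= min(p_a, p_b)
print(consistent)  # True
```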