Clustering: K-Means. Machine Learning 10-601, Fall 2014. Bhavana Dalvi Mishra, PhD student, LTI, CMU
Transcription
1 Clustering: K-Means. Machine Learning 10-601, Fall 2014. Bhavana Dalvi Mishra, PhD student, LTI, CMU. Slides are based on materials from Prof. Eric Xing, Prof. William Cohen and Prof. Andrew Ng.
2 Outline. What is clustering? How are similarity measures defined? Different clustering algorithms: K-Means, Gaussian Mixture Models, Expectation Maximization. Advanced topics: How to seed clustering? How to choose #clusters? Application: Gloss finding for a Knowledge Base.
3 Clustering
4 Classification vs. Clustering. Supervision available vs. unsupervised. Learning from supervised data: example classifications are given. Unsupervised learning: learning from raw unlabeled data.
5 Clustering. The process of grouping a set of objects into clusters with high intra-cluster similarity and low inter-cluster similarity. How many clusters? How to identify them?
6 Applications of Clustering. Google News: clusters news stories from different sources about the same event. Computational biology: group genes that perform the same functions. Social media analysis: group individuals that have similar political views. Computer graphics: identify similar objects from pictures.
7 Examples: People, Images, Species
8 What is a natural grouping among these objects?
9 Similarity Measures
10 What is Similarity? Hard to define! But we know it when we see it. The real meaning of similarity is a philosophical question. It depends on representation and algorithm. For many representations and algorithms, it is easier to think in terms of a distance (rather than a similarity) between vectors.
11 Intuitions behind desirable distance measure properties.
D(A,B) = D(B,A) (Symmetry). Otherwise you could claim "Alex looks like Bob, but Bob looks nothing like Alex".
D(A,A) = 0 (Constancy of Self-Similarity). Otherwise you could claim "Alex looks more like Bob, than Bob does".
D(A,B) = 0 iff A = B (Identity of indiscernibles). Otherwise there are objects in your world that are different, but you cannot tell apart.
D(A,B) ≤ D(A,C) + D(B,C) (Triangular Inequality). Otherwise you could claim "Alex is very like Bob, and Alex is very like Carl, but Bob is very unlike Carl".
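13 Distance Measures: Minkowski Metric. Suppose two objects x and y both have p features: $x = (x_1, x_2, \ldots, x_p)$ and $y = (y_1, y_2, \ldots, y_p)$. The Minkowski metric is defined by
$$d(x, y) = \left( \sum_{i=1}^{p} |x_i - y_i|^r \right)^{1/r}$$
Most common Minkowski metrics:
r = 2 (Euclidean distance): $d(x, y) = \sqrt{\sum_{i=1}^{p} |x_i - y_i|^2}$
r = 1 (Manhattan distance): $d(x, y) = \sum_{i=1}^{p} |x_i - y_i|$
r → ∞ ("sup" distance): $d(x, y) = \max_{1 \le i \le p} |x_i - y_i|$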
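The three common cases above can be checked numerically. The following is a minimal sketch in plain Python; the function name `minkowski` and the two sample points are illustrative assumptions, not part of the original slides.

```python
import math

def minkowski(x, y, r):
    """Minkowski distance of order r between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(r):
        # r -> infinity: the "sup" distance is the largest coordinate difference
        return max(diffs)
    return sum(d ** r for d in diffs) ** (1.0 / r)

# Hypothetical points whose coordinates differ by 4 and 3
x, y = (0.0, 0.0), (4.0, 3.0)
euclidean = minkowski(x, y, 2)
manhattan = minkowski(x, y, 1)
sup = minkowski(x, y, math.inf)
```

Passing `math.inf` for `r` is a design choice of this sketch so that one function covers all three metrics.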
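14 An Example. Two points x and y whose coordinates differ by 4 and 3.
1: Euclidean distance: $\sqrt{4^2 + 3^2} = 5$.
2: Manhattan distance: $4 + 3 = 7$.
3: "sup" distance: $\max\{4, 3\} = 4$.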
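15 Hamming distance. Manhattan distance is called Hamming distance when all features are binary. Example: gene expression levels under 17 conditions (1 = High, 0 = Low) for GeneA and GeneB. Hamming distance: #(01) + #(10), the number of conditions in which the two binary profiles differ.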
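A minimal sketch of the Hamming distance on binary profiles. The two gene profiles below are made-up values for illustration only (the table from the slide is not reproduced here), and the name `hamming` is an assumption.

```python
def hamming(a, b):
    """Number of positions at which two equal-length binary vectors differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(1 for u, v in zip(a, b) if u != v)

# Hypothetical expression profiles (1 = High, 0 = Low), not the slide's data
gene_a = [1, 0, 1, 1, 0, 0, 1]
gene_b = [1, 1, 1, 0, 0, 0, 0]

distance = hamming(gene_a, gene_b)
# For binary features this matches the Manhattan distance
manhattan = sum(abs(u - v) for u, v in zip(gene_a, gene_b))
```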
16 Similarity Measures: Correlation Coefficient. [Figure: three plots of expression level over time for Gene A and Gene B, showing positively correlated, uncorrelated and negatively correlated profiles.]
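17 Similarity Measures: Correlation Coefficient. Pearson correlation coefficient:
$$s(x, y) = \frac{\sum_{i=1}^{p} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{p} (x_i - \bar{x})^2 \; \sum_{i=1}^{p} (y_i - \bar{y})^2}}, \quad \text{where } \bar{x} = \frac{1}{p}\sum_{i=1}^{p} x_i \text{ and } \bar{y} = \frac{1}{p}\sum_{i=1}^{p} y_i$$
$$|s(x, y)| \le 1$$
Special case: cosine distance.
$$s(x, y) = \frac{x \cdot y}{|x| \, |y|}$$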
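The relationship between the two measures can be seen in a short sketch: the Pearson coefficient is the cosine similarity of the mean-centered vectors. Function names and sample vectors are illustrative assumptions.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between x and y: (x . y) / (|x| |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    centered_x = [a - mean_x for a in x]
    centered_y = [b - mean_y for b in y]
    return cosine_similarity(centered_x, centered_y)

# Hypothetical profiles: one rises with x, the other falls
x = [1.0, 2.0, 3.0, 4.0]
rising = [2.0, 4.0, 6.0, 8.0]
falling = [8.0, 6.0, 4.0, 2.0]

s_pos = pearson(x, rising)
s_neg = pearson(x, falling)
```

Both functions divide by a norm, so a constant vector (zero variance) would raise `ZeroDivisionError`; the sketch leaves that case unhandled.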
18 Clustering Algorithm: K-Means
19 K-means Clustering: Step 1
20 K-means Clustering: Step 2
21 K-means Clustering: Step 3
22 K-means Clustering: Step 4
23 K-means Clustering: Step 5
24 K-Means: Algorithm.
1. Decide on a value for k.
2. Initialize the k cluster centers randomly if necessary.
3. Repeat until no object changes its cluster assignment:
Decide the cluster memberships of the N objects by assigning them to the nearest cluster centroid: $\text{cluster}(x_i) = \arg\min_j d(x_i, \mu_j)$.
Re-estimate the k cluster centers, by assuming the memberships found above are correct.
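The loop above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming squared Euclidean distance for d and random datapoints as initial centers; the names `kmeans` and `squared_distance`, the `max_iter` cap and the sample data are assumptions of this sketch.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans(points, k, max_iter=100, seed=0):
    rng = random.Random(seed)
    # Step 2: initialize the k centers with k distinct random datapoints
    centers = [list(p) for p in rng.sample(points, k)]
    assignment = [None] * len(points)
    for _ in range(max_iter):
        # Step 3a: assign every point to its nearest center
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Stop as soon as no object changes its cluster assignment
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Step 3b: re-estimate each center as the mean of its members
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return centers, assignment

# Hypothetical data: two well-separated groups of three points
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, k=2)
```

Keeping the old center for an empty cluster is one of several possible conventions; re-seeding the empty cluster is another.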
25 K-Means is widely used in practice. Extremely fast and scalable: used in a variety of applications. Can be easily parallelized, with an easy Map-Reduce implementation. Mapper: assigns each datapoint to the nearest cluster. Reducer: takes all points assigned to a cluster, and re-computes the centroids. Sensitive to starting points or random seed initialization (similar to neural networks). There are extensions like K-Means++ that try to solve this problem.
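The mapper and reducer roles described above can be imitated on a single machine. The sketch below does not use any real Map-Reduce framework; `mapper`, `reducer` and `kmeans_iteration` are hypothetical names, and the in-memory dictionary stands in for the shuffle phase.

```python
from collections import defaultdict

def mapper(point, centers):
    """Emit (index of the nearest center, point)."""
    nearest = min(
        range(len(centers)),
        key=lambda j: sum((u - v) ** 2 for u, v in zip(point, centers[j])),
    )
    return nearest, point

def reducer(cluster_points):
    """Re-compute a centroid as the mean of all points sent to one cluster."""
    n = len(cluster_points)
    return [sum(coord) / n for coord in zip(*cluster_points)]

def kmeans_iteration(points, centers):
    """One K-Means iteration: map, group by key, reduce."""
    groups = defaultdict(list)
    for p in points:
        key, value = mapper(p, centers)
        groups[key].append(value)
    # A center that received no points is left unchanged
    return [
        reducer(groups[j]) if groups[j] else list(centers[j])
        for j in range(len(centers))
    ]

# Hypothetical data and starting centers
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
new_centers = kmeans_iteration(points, [(0.0, 0.0), (10.0, 10.0)])
```

Each mapper call only needs the current centers, which is why the assignment step parallelizes well across datapoints.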
26 Outliers
27 Clustering Algorithm: Gaussian Mixture Model
28 Density estimation. Estimate the density function P(x) given unlabeled datapoints $X_1$ to $X_n$. Example: an aircraft testing facility measures Heat and Vibration parameters for every newly built aircraft.
29 Mixture of Gaussians
30 Mixture Models. A density model p(x) may be multi-modal. We may be able to model it as a mixture of uni-modal distributions (e.g., Gaussians). Each mode may correspond to a different sub-population (e.g., male and female).
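31 Gaussian Mixture Models (GMMs). Consider a mixture of K Gaussian components:
$$p(x_n) = \sum_{k=1}^{K} \pi_k \, N(x_n \mid \mu_k, \Sigma_k)$$
where $\pi_k$ is the mixture proportion and $N(x_n \mid \mu_k, \Sigma_k)$ is the mixture component. This model can be used for unsupervised clustering. This model (fit by AutoClass) has been used to discover new kinds of stars in astronomical data, etc.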
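32 Learning mixture models. In fully observed i.i.d. settings, the log likelihood decomposes into a sum of local terms:
$$\ell_c(\theta; D) = \log p(x, z \mid \theta) = \log p(z \mid \theta_z) + \log p(x \mid z, \theta_x)$$
With latent variables, all the parameters become coupled together via marginalization:
$$\ell(\theta; D) = \log \sum_z p(x, z \mid \theta) = \log \sum_z p(z \mid \theta_z) \, p(x \mid z, \theta_x)$$
33 MLE for GMM. If we are doing MLE for completely observed data, the data log-likelihood is
$$\ell(\theta; D) = \log \prod_n p(z_n, x_n) = \log \prod_n p(z_n \mid \pi) \, p(x_n \mid z_n, \mu, \sigma)$$
$$= \sum_n \log \prod_k \pi_k^{z_n^k} + \sum_n \log \prod_k N(x_n; \mu_k, \sigma)^{z_n^k}$$
$$= \sum_n \sum_k z_n^k \log \pi_k - \sum_n \sum_k z_n^k \frac{1}{2\sigma^2} (x_n - \mu_k)^2 + C$$
MLE:
$$\hat{\pi}_{k,MLE} = \arg\max_{\pi} \ell(\theta; D), \quad \hat{\mu}_{k,MLE} = \arg\max_{\mu} \ell(\theta; D), \quad \hat{\sigma}_{k,MLE} = \arg\max_{\sigma} \ell(\theta; D)$$
$$\hat{\mu}_{k,MLE} = \frac{\sum_n z_n^k x_n}{\sum_n z_n^k}, \qquad \hat{\pi}_{k,MLE} = \frac{\sum_n z_n^k}{\text{Number of datapoints}}$$
This is the same as Gaussian Naïve Bayes.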
34 Learning GMM (z's are unknown)
35 Expectation Maximization (EM)
36 Expectation-Maximization (EM). Start: "Guess" the mean and covariance of each of the K Gaussians. Loop.
38 Expectation-Maximization (EM). Start: "Guess" the centroid and covariance of each of the K clusters. Loop.
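39 The Expectation-Maximization (EM) Algorithm. E Step: Guess values of the Z's.
$$w_{ij}^{(t)} = P(Z_i = j \mid x_i) = \frac{p(x_i \mid Z_i = j) \, P(Z_i = j)}{\sum_{l} p(x_i \mid Z_i = l) \, P(Z_i = l)}$$
where $p(x_i \mid Z_i = j) = N(x_i; \mu_j^{(t)}, \Sigma_j^{(t)})$ and $P(Z_i = j) = \pi_j^{(t)}$.
40 The Expectation-Maximization (EM) Algorithm. M Step: Update parameter estimates.
$$\pi_j^{(t+1)} = P(Z = j) = \frac{\sum_i w_{ij}^{(t)}}{\#\,\text{datapoints}}$$
$$\mu_j^{(t+1)} = \frac{\sum_i w_{ij}^{(t)} x_i}{\sum_i w_{ij}^{(t)}}$$
$$\Sigma_j^{(t+1)} = \frac{\sum_i w_{ij}^{(t)} (x_i - \mu_j^{(t+1)})(x_i - \mu_j^{(t+1)})^T}{\sum_i w_{ij}^{(t)}}$$
41 EM Algorithm for GMM. E Step: guess values of the Z's by computing the weights $w_{ij}^{(t)}$ from the current $\pi_j^{(t)}, \mu_j^{(t)}, \Sigma_j^{(t)}$. M Step: update the parameter estimates $\pi_j^{(t+1)}, \mu_j^{(t+1)}, \Sigma_j^{(t+1)}$ from those weights.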
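To make the two steps concrete, here is a minimal sketch of EM for a one-dimensional Gaussian mixture in plain Python. It is an illustration under stated assumptions, not the lecture's implementation: scalar variances replace covariance matrices, the number of iterations is fixed, and a small variance floor (`1e-6`) is added to avoid division by zero. All names and the sample data are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_gmm_1d(data, means, variances, weights, n_iter=50):
    k = len(means)
    means, variances, weights = list(means), list(variances), list(weights)
    for _ in range(n_iter):
        # E step: soft assignment w[i][j] = P(Z_i = j | x_i)
        w = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            w.append([v / total for v in joint])
        # M step: re-estimate proportions, means and variances from the weights
        for j in range(k):
            nj = sum(row[j] for row in w)
            weights[j] = nj / len(data)
            means[j] = sum(row[j] * x for row, x in zip(w, data)) / nj
            var = sum(row[j] * (x - means[j]) ** 2 for row, x in zip(w, data)) / nj
            variances[j] = max(var, 1e-6)  # assumed floor, not from the slides
    return means, variances, weights

# Hypothetical data drawn around 1.0 and 5.0, with a rough initial guess
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = em_gmm_1d(data, [0.0, 6.0], [1.0, 1.0], [0.5, 0.5])
```

In practice one would also monitor the log-likelihood and stop when it no longer improves, rather than run a fixed number of iterations.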
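42 K-means is a hard version of EM. In the K-means "E-step" we do hard assignment:
$$z_i^{(t)} = \arg\min_j \, (x_i - \mu_j^{(t)})^T (\Sigma_j^{(t)})^{-1} (x_i - \mu_j^{(t)})$$
In the K-means "M-step" we update the means as the weighted sum of the data, but now the weights are 0 or 1:
$$\mu_j^{(t+1)} = \frac{\sum_i \delta(z_i^{(t)}, j) \, x_i}{\sum_i \delta(z_i^{(t)}, j)}$$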
43 Soft vs. Hard EM assignments: GMM (soft) vs. K-Means (hard)
44 Theory underlying EM. What are we doing? Recall that according to MLE, we intend to learn the model parameters that would maximize the likelihood of the data. But we do not observe z, so computing
$$\ell(\theta; D) = \log \sum_z p(x, z \mid \theta) = \log \sum_z p(z \mid \theta_z) \, p(x \mid z, \theta_x)$$
is difficult! What shall we do?
45 Intuition behind the EM algorithm
46 Jensen's Inequality. For a convex function f(x): $f(E[x]) \le E[f(x)]$. Similarly, for a concave function f(x): $f(E[x]) \ge E[f(x)]$.
47 Jensen's Inequality: concave f(x). $f(E[x]) \ge E[f(x)]$
48 EM and Jensen's Inequality. $f(E[x]) \ge E[f(x)]$
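The link between EM and Jensen's inequality can be written out in one line. The following is a standard derivation sketch; the auxiliary distribution $q(z)$ is notation introduced here and does not appear in the slides. Since log is concave, applying $f(E[x]) \ge E[f(x)]$ with the expectation taken over $q(z)$ gives a lower bound on the log-likelihood:

```latex
\ell(\theta; D)
  = \log \sum_z p(x, z \mid \theta)
  = \log \sum_z q(z) \, \frac{p(x, z \mid \theta)}{q(z)}
  \;\ge\; \sum_z q(z) \log \frac{p(x, z \mid \theta)}{q(z)}
```

The bound is tight when $q(z) = p(z \mid x, \theta)$, which is the choice made in the E step; the M step then maximizes the right-hand side over $\theta$ with $q$ held fixed.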
49 Advanced Topics
50 How Many Clusters? Number of clusters K is given: partition documents into a predetermined number of topics. Solve an optimization problem: penalize #clusters. Information theoretic approaches: AIC, BIC criteria for model selection. Tradeoff between having clearly separable clusters and having too many clusters.
51 Seed Choice: K-Means++. K-Means results can vary based on random seed selection. K-Means++:
1. Choose one center uniformly at random among the given datapoints.
2. For each data point x, compute D(x) = distance(x, nearest center).
3. Choose one new data point at random as a new center, with probability $P(x) \propto D(x)^2$.
4. Repeat Steps 2 and 3 until k centers have been chosen.
Run standard K-Means with this centroid initialization.
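The seeding steps above can be sketched with the standard library alone. This is a minimal illustration assuming Euclidean distance; `kmeans_pp_init` and the sample data are hypothetical names and values.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((u - v) ** 2 for u, v in zip(a, b))

def kmeans_pp_init(points, k, seed=0):
    rng = random.Random(seed)
    # Step 1: first center chosen uniformly at random among the datapoints
    centers = [rng.choice(points)]
    while len(centers) < k:
        # Step 2: D(x)^2 = squared distance from x to its nearest chosen center
        d2 = [min(squared_distance(p, c) for c in centers) for p in points]
        # Step 3: sample the next center with probability proportional to D(x)^2
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

# Hypothetical data: two groups of three points
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
seeds = kmeans_pp_init(points, k=3)
```

An already chosen point has D(x) = 0 and so cannot be picked again. If every remaining point coincides with a chosen center, all weights are zero and `random.choices` raises `ValueError`; the sketch does not handle that case.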
52 Semi-supervised K-Means
53 Supervised Learning, Unsupervised Learning, Semi-supervised Learning
54 Automatic Gloss Finding for a Knowledge Base. Glosses: natural language definitions of named entities. E.g. "Microsoft" is an American multinational corporation headquartered in Redmond that develops, manufactures, licenses, supports and sells computer software, consumer electronics and personal computers and services... Input: a Knowledge Base, i.e. a set of concepts (e.g. company) and entities belonging to those concepts (e.g. Microsoft), and a set of potential glosses. Output: candidate glosses matched to relevant entities in the KB. E.g. "Microsoft is an American multinational corporation headquartered in Redmond ..." is mapped to entity Microsoft of type Company. [Automatic Gloss Finding for a Knowledge Base using Ontological Constraints, Bhavana Dalvi Mishra, Einat Minkov, Partha Pratim Talukdar, and William W. Cohen, 2014, Under submission]
55 Example: Gloss finding
56 Example: Gloss finding
57 Example: Gloss finding
58 Example: Gloss finding
59 Training a clustering model. Train: unambiguous glosses. Test: ambiguous glosses. [Figure: glosses clustered under the concepts Fruit and Company.]
60 GLOFIN: Clustering glosses
61 GLOFIN: Clustering glosses
62 GLOFIN: Clustering glosses
63 GLOFIN: Clustering glosses
64 GLOFIN: Clustering glosses
65 GLOFIN: Clustering glosses
66 GLOFIN on NELL Dataset. [Chart: Precision, Recall and F1 for SVM, Label Propagation and GLOFIN.] 247K candidate glosses over the NELL categories, #train=20K, #test=227K
67 GLOFIN on Freebase Dataset. [Chart: Precision, Recall and F1 for SVM, Label Propagation and GLOFIN.] 285K candidate glosses over the Freebase categories, #train=25K, #test=260K
68 Summary. What is clustering? What are similarity measures? K-Means clustering algorithm. Mixture of Gaussians (GMM). Expectation Maximization. Advanced Topics: How to seed clustering, How to decide #clusters. Application: Gloss finding for Knowledge Bases.
69 Thank You. Questions?
More information8.1 Hashing Algorithms
CS787: Advaced Algorthms Scrbe: Mayak Maheshwar, Chrs Hrchs Lecturer: Shuch Chawla Topc: Hashg ad NP-Completeess Date: September 21 2007 Prevously we looked at applcatos of radomzed algorthms, ad bega
More informationSupport vector machines II
CS 75 Mache Learg Lecture Support vector maches II Mlos Hauskrecht mlos@cs.ptt.edu 539 Seott Square Learl separable classes Learl separable classes: here s a hperplae that separates trag staces th o error
More informationFor combinatorial problems we might need to generate all permutations, combinations, or subsets of a set.
Addtoal Decrease ad Coquer Algorthms For combatoral problems we mght eed to geerate all permutatos, combatos, or subsets of a set. Geeratg Permutatos If we have a set f elemets: { a 1, a 2, a 3, a } the
More informationCIS 800/002 The Algorithmic Foundations of Data Privacy October 13, Lecture 9. Database Update Algorithms: Multiplicative Weights
CIS 800/002 The Algorthmc Foudatos of Data Prvacy October 13, 2011 Lecturer: Aaro Roth Lecture 9 Scrbe: Aaro Roth Database Update Algorthms: Multplcatve Weghts We ll recall aga) some deftos from last tme:
More informationSummary tables and charts
Data Aalyss Summary tables ad charts. Orgazg umercal data: Hstograms ad frequecy tables I ths lecture, we wll study descrptve statstcs. By descrptve statstcs, we refer to methods volvg the collecto, presetato,
More informationChapter 5 Properties of a Random Sample
Lecture 6 o BST 63: Statstcal Theory I Ku Zhag, /0/008 Revew for the prevous lecture Cocepts: t-dstrbuto, F-dstrbuto Theorems: Dstrbutos of sample mea ad sample varace, relatoshp betwee sample mea ad sample
More informationCSE 5526: Introduction to Neural Networks Linear Regression
CSE 556: Itroducto to Neural Netorks Lear Regresso Part II 1 Problem statemet Part II Problem statemet Part II 3 Lear regresso th oe varable Gve a set of N pars of data , appromate d by a lear fucto
More informationQuantitative analysis requires : sound knowledge of chemistry : possibility of interferences WHY do we need to use STATISTICS in Anal. Chem.?
Ch 4. Statstcs 4.1 Quattatve aalyss requres : soud kowledge of chemstry : possblty of terfereces WHY do we eed to use STATISTICS Aal. Chem.? ucertaty ests. wll we accept ucertaty always? f ot, from how
More informationPoint Estimation: definition of estimators
Pot Estmato: defto of estmators Pot estmator: ay fucto W (X,..., X ) of a data sample. The exercse of pot estmato s to use partcular fuctos of the data order to estmate certa ukow populato parameters.
More informationESS Line Fitting
ESS 5 014 17. Le Fttg A very commo problem data aalyss s lookg for relatoshpetwee dfferet parameters ad fttg les or surfaces to data. The smplest example s fttg a straght le ad we wll dscuss that here
More informationTema 5: Aprendizaje NO Supervisado: CLUSTERING Unsupervised Learning: CLUSTERING. Febrero-Mayo 2005
Tema 5: Apredzae NO Supervsado: CLUSTERING Usupervsed Learg: CLUSTERING Febrero-Mayo 2005 SUPERVISED METHODS: LABELED Data Base Labeled Data Base Dvded to Tra ad Test Choose Algorthm: MAP, ML, K-Nearest
More informationSPECIAL CONSIDERATIONS FOR VOLUMETRIC Z-TEST FOR PROPORTIONS
SPECIAL CONSIDERAIONS FOR VOLUMERIC Z-ES FOR PROPORIONS Oe s stctve reacto to the questo of whether two percetages are sgfcatly dfferet from each other s to treat them as f they were proportos whch the
More information