CS 347 Parallel and Distributed Data Processing
|
|
- Dustin O’Neal’
- 5 years ago
- Views:
Transcription
1 CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 4: Query Optimization
2 Query Optimization Cost estimation Strategies for exploring plans Q min CS 347 Notes 4 2
3 Cost Estimation Based on estimating result sizes Like in centralized databases CS 347 Notes 4 3
4 Cost Estimation But # of IOs may not be the best metric E.g., transmission time may dominate work at site 1 T 1 work at site 2 T 2 answer time/$ CS 347 Notes 4 4
5 Cost Estimation Another reason why # of IOs is not enough: parallelism Plan A Plan B 100 IOs site 1 site 2 50 IOs 70 IOs site 3 50 IOs CS 347 Notes 4 5
6 Cost Estimation Cost metrics E.g., IOs, bytes transmitted, $, Additive Task 1 Response time metric Not additive Need scheduling and dependency info Task 2 Task 3 Skew is important CS 347 Notes 4 6
7 Cost Estimation Also take into account Start up cost Data distribution cost/time Resource contention (for memory, disk, network) Cost of assembling results CS 347 Notes 4 7
8 Cost Estimation Response time example site 1 site 2 site 3 site 4 start up distribution searching + sending results final processing CS 347 Notes 4 8
9 Search Strategies Exhaustive (with pruning) Hill climbing (greedy) Query separation CS 347 Notes 4 9
10 Exhaustive Search Consider all query plans (given a set of techniques for operators) Prune some plans Heuristics CS 347 Notes 4 10
11 Exhaustive Search Example R S T A B R > S > T R S T R S R T S R S T T S T R (S R) T (T S) R ship S to R semi join ship T to S semi join 1 2 Prune because cross-product not necessary Prune because larger relation first CS 347 Notes 4 11
12 Search Strategies In generating plans, keep goal in mind E.g., if goal is parallelism (in system with fast network) Consider partitioning relations first E.g., if goal is reduction of network traffic Consider semi-joins CS 347 Notes 4 12
13 Hill Climbing Better plans 1 x initial plan Worse plans CS 347 Notes 4 13
14 Hill Climbing 2 Better plans 1 x initial plan Worse plans CS 347 Notes 4 14
15 Hill Climbing Example R S T V A B C relation site size R 1 10 S 2 20 T 3 30 V 4 40 tuple size = 1 Goal: minimize data transmission CS 347 Notes 4 15
16 Hill Climbing Initial plan Send relations to one site What site do we send all relations to? To site 1: cost = = 90 To site 2: cost = = 80 To site 3: cost = = 70 To site 4: cost = = 60 CS 347 Notes 4 16
17 Hill Climbing P 0 R (1 4) S (2 4) T (3 4) Compute R S T V at site 4 CS 347 Notes 4 17
18 Hill Climbing Local search Consider sending each relation to neighbor 4 R S R S 1 2 R CS 347 Notes 4 18
19 Hill Climbing Assume size R S = 20 size S T = 5 size T V = 1 Option A R S 2 4 R S R cost = 30 cost = 30 No savings CS 347 Notes 4 19
20 Hill Climbing Option B R S S R S 2 cost = 30 cost = 40 Worse off CS 347 Notes 4 20
21 Hill Climbing Option C S T T S T 3 cost = 50 cost = 35 Win CS 347 Notes 4 21
22 Hill Climbing Option D S T S T S cost = 50 cost = 25 Bigger win CS 347 Notes 4 22
23 Hill Climbing P 1 S (2 3) α = S T R (1 4) T (3 4) Compute answer at site 4 CS 347 Notes 4 23
24 Hill Climbing Repeat local search Treat α = S T as relation α R α R 4 α vs R α 1 3 R CS 347 Notes 4 24
25 Hill Climbing Hill climbing may miss best plan E.g., best plan could be P best T (3 4) β = T V β (4 2) β' = β S β' (2 1) β'' = β' R β'' (1 4) (optional) Compute answer V 4 β'' β T 1 β' 2 3 R S T CS 347 Notes 4 25
26 Hill Climbing Hill climbing may miss best plan E.g., best plan could be P best T (3 4) β = T V β (4 2) β' = β S β' (2 1) β'' = β' R β'' (1 4) (optional) Compute answer = 30 = 1 = 1 = 1 = 33 total CS 347 Notes 4 26 β'' β' R V S β T Costs could be low because β is very selective T
27 Search Strategies Exhaustive (with pruning) Hill climbing (greedy) Query separation CS 347 Notes 4 27
28 Query Separation Separate query into 2 or more steps Optimize each step independently CS 347 Notes 4 28
29 Query Separation Example Simple queries technique σ c1 A σ c2 σ c3 R S CS 347 Notes 4 29
30 Query Separation 1. Compute R = A [ σ c2 R ] S = A [ σ c3 S ] 2. Compute J = R S 3. Compute answer σ c1 { [ J σ c2 R ] [ J σ c3 S ] } σ c2 R σ c1 A σ c3 S CS 347 Notes 4 30
31 Query Separation 1. Compute R = A [ σ c2 R ] S = A [ σ c3 S ] 2. Compute J = R S Compute the A values in the answer first 3. Compute answer σ c1 { [ J σ c2 R ] [ J σ c3 S ] } Get tuples from sites matching A and compute answer next CS 347 Notes 4 31
32 Query Separation Simple query Relations have a single attribute Output has a single attribute E.g., J = R S CS 347 Notes 4 32
33 Query Separation Idea 1. Decompose query Local processing Simple query (or queries) Final processing 2. Optimize simple query CS 347 Notes 4 33
34 Query Separation Philosophy Hard part is distributed join Do this part with only keys; get the rest of the data later Simpler to optimize simple queries CS 347 Notes 4 34
35 Summary Cost estimation Optimization strategies Exhaustive (with pruning) Hill climbing (greedy) Query separation CS 347 Notes 4 35
36 Words of Wisdom Optimization is like chess playing May have to make sacrifices for later gains Move data, partition relations, build indexes CS 347 Notes 4 36
CS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 3: Query Processing Query Processing Decomposition Localization Optimization CS 347 Notes 3 2 Decomposition Same as in centralized system
More informationCS 347 Distributed Databases and Transaction Processing Notes03: Query Processing
CS 347 Distributed Databases and Transaction Processing Notes03: Query Processing Hector Garcia-Molina Zoltan Gyongyi CS 347 Notes 03 1 Query Processing! Decomposition! Localization! Optimization CS 347
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu 10/24/2012 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
More informationMultiple-Site Distributed Spatial Query Optimization using Spatial Semijoins
11 Multiple-Site Distributed Spatial Query Optimization using Spatial Semijoins Wendy OSBORN a, 1 and Saad ZAAMOUT a a Department of Mathematics and Computer Science, University of Lethbridge, Lethbridge,
More informationP Q1 Q2 Q3 Q4 Q5 Tot (60) (20) (20) (20) (60) (20) (200) You are allotted a maximum of 4 hours to complete this exam.
Exam INFO-H-417 Database System Architecture 13 January 2014 Name: ULB Student ID: P Q1 Q2 Q3 Q4 Q5 Tot (60 (20 (20 (20 (60 (20 (200 Exam modalities You are allotted a maximum of 4 hours to complete this
More informationA.I.: Beyond Classical Search
A.I.: Beyond Classical Search Random Sampling Trivial Algorithms Generate a state randomly Random Walk Randomly pick a neighbor of the current state Both algorithms asymptotically complete. Overview Previously
More information2.6 Complexity Theory for Map-Reduce. Star Joins 2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51
2.6. COMPLEXITY THEORY FOR MAP-REDUCE 51 Star Joins A common structure for data mining of commercial data is the star join. For example, a chain store like Walmart keeps a fact table whose tuples each
More informationQuiz 2. Due November 26th, CS525 - Advanced Database Organization Solutions
Name CWID Quiz 2 Due November 26th, 2015 CS525 - Advanced Database Organization s Please leave this empty! 1 2 3 4 5 6 7 Sum Instructions Multiple choice questions are graded in the following way: You
More informationLecture 7: DecisionTrees
Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:
More informationMario A. Nascimento. Univ. of Alberta, Canada http: //
DATA CACHING IN W Mario A. Nascimento Univ. of Alberta, Canada http: //www.cs.ualberta.ca/~mn With R. Alencar and A. Brayner. Work partially supported by NSERC and CBIE (Canada) and CAPES (Brazil) Outline
More informationLecture and notes by: Alessio Guerrieri and Wei Jin Bloom filters and Hashing
Bloom filters and Hashing 1 Introduction The Bloom filter, conceived by Burton H. Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of
More informationImpression Store: Compressive Sensing-based Storage for. Big Data Analytics
Impression Store: Compressive Sensing-based Storage for Big Data Analytics Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda & Zheng Zhang Microsoft Research The Curse of O(N) in
More informationDecoding Revisited: Easy-Part-First & MERT. February 26, 2015
Decoding Revisited: Easy-Part-First & MERT February 26, 2015 Translating the Easy Part First? the tourism initiative addresses this for the first time the die tm:-0.19,lm:-0.4, d:0, all:-0.65 tourism touristische
More informationCS425: Algorithms for Web Scale Data
CS425: Algorithms for Web Scale Data Most of the slides are from the Mining of Massive Datasets book. These slides have been modified for CS425. The original slides can be accessed at: www.mmds.org Challenges
More informationDATA MINING LECTURE 3. Frequent Itemsets Association Rules
DATA MINING LECTURE 3 Frequent Itemsets Association Rules This is how it all started Rakesh Agrawal, Tomasz Imielinski, Arun N. Swami: Mining Association Rules between Sets of Items in Large Databases.
More informationWolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig
Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de 13 Indexes for Multimedia Data 13 Indexes for Multimedia
More information6.830 Lecture 11. Recap 10/15/2018
6.830 Lecture 11 Recap 10/15/2018 Celebration of Knowledge 1.5h No phones, No laptops Bring your Student-ID The 5 things allowed on your desk Calculator allowed 4 pages (2 pages double sided) of your liking
More informationLocal Search & Optimization
Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2018 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Some
More informationCS 347 Parallel and Distributed Data Processing
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 2: Distributed Database Design Logistics Gradiance No action items for now Detailed instructions coming shortly First quiz to be released
More informationDecision Trees. Machine Learning CSEP546 Carlos Guestrin University of Washington. February 3, 2014
Decision Trees Machine Learning CSEP546 Carlos Guestrin University of Washington February 3, 2014 17 Linear separability n A dataset is linearly separable iff there exists a separating hyperplane: Exists
More informationScalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data
VLDB Journal manuscript No. (will be inserted by the editor) Scalable Processing of Snapshot and Continuous Nearest-Neighbor Queries over One-Dimensional Uncertain Data Jinchuan Chen Reynold Cheng Mohamed
More informationMultimedia Databases 1/29/ Indexes for Multimedia Data Indexes for Multimedia Data Indexes for Multimedia Data
1/29/2010 13 Indexes for Multimedia Data 13 Indexes for Multimedia Data 13.1 R-Trees Multimedia Databases Wolf-Tilo Balke Silviu Homoceanu Institut für Informationssysteme Technische Universität Braunschweig
More informationMethods for finding optimal configurations
CS 1571 Introduction to AI Lecture 9 Methods for finding optimal configurations Milos Hauskrecht milos@cs.pitt.edu 5329 Sennott Square Search for the optimal configuration Optimal configuration search:
More informationRESEARCH ON THE DISTRIBUTED PARALLEL SPATIAL INDEXING SCHEMA BASED ON R-TREE
RESEARCH ON THE DISTRIBUTED PARALLEL SPATIAL INDEXING SCHEMA BASED ON R-TREE Yuan-chun Zhao a, b, Cheng-ming Li b a. Shandong University of Science and Technology, Qingdao 266510 b. Chinese Academy of
More informationClassification Based on Logical Concept Analysis
Classification Based on Logical Concept Analysis Yan Zhao and Yiyu Yao Department of Computer Science, University of Regina, Regina, Saskatchewan, Canada S4S 0A2 E-mail: {yanzhao, yyao}@cs.uregina.ca Abstract.
More information' Characteristics of DSS Queries: read-mostly, complex, adhoc, Introduction Tremendous growth in Decision Support Systems (DSS). with large foundsets
' & Bitmap Index Design and Evaluation Chan Chee-Yong of Wisconsin-Madison University Ioannidis Yannis of Wisconsin-Madison University University of Athens 1 ' Characteristics of DSS Queries: read-mostly,
More informationReductionist View: A Priori Algorithm and Vector-Space Text Retrieval. Sargur Srihari University at Buffalo The State University of New York
Reductionist View: A Priori Algorithm and Vector-Space Text Retrieval Sargur Srihari University at Buffalo The State University of New York 1 A Priori Algorithm for Association Rule Learning Association
More informationYour Project Proposals
Your Project Proposals l See description in Learning Suite Remember your example instance! l Examples Look at Irvine Data Set to get a feel of what data sets look like l Stick with supervised classification
More informationClustering. Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein. Some slides adapted from Jacques van Helden
Clustering Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein Some slides adapted from Jacques van Helden Small vs. large parsimony A quick review Fitch s algorithm:
More informationLearning and Transferring Relational Instance-Based Policies
Learning and Transferring Relational Instance-Based Policies Rocío García Durán, Fernando Fernández and Daniel Borrajo Planning and Learning Group (PLG) Departamento de Informática Escuela Politécnica
More informationPrinciples of AI Planning
Principles of 5. Planning as search: progression and regression Malte Helmert and Bernhard Nebel Albert-Ludwigs-Universität Freiburg May 4th, 2010 Planning as (classical) search Introduction Classification
More informationPrinciples of AI Planning
Principles of AI Planning 5. Planning as search: progression and regression Albert-Ludwigs-Universität Freiburg Bernhard Nebel and Robert Mattmüller October 30th, 2013 Introduction Classification Planning
More informationYou are here! Query Processor. Recovery. Discussed here: DBMS. Task 3 is often called algebraic (or re-write) query optimization, while
Module 10: Query Optimization Module Outline 10.1 Outline of Query Optimization 10.2 Motivating Example 10.3 Equivalences in the relational algebra 10.4 Heuristic optimization 10.5 Explosion of search
More informationScaling Up. So far, we have considered methods that systematically explore the full search space, possibly using principled pruning (A* etc.).
Local Search Scaling Up So far, we have considered methods that systematically explore the full search space, possibly using principled pruning (A* etc.). The current best such algorithms (RBFS / SMA*)
More informationSummarizing Measured Data
Summarizing Measured Data 12-1 Overview Basic Probability and Statistics Concepts: CDF, PDF, PMF, Mean, Variance, CoV, Normal Distribution Summarizing Data by a Single Number: Mean, Median, and Mode, Arithmetic,
More informationInitial and Boundary Conditions
Initial and Boundary Conditions Initial- and boundary conditions are needed For a steady problems correct initial conditions is important to reduce computational time and reach convergence Boundary conditions
More informationCS 347. Parallel and Distributed Data Processing. Spring Notes 11: MapReduce
CS 347 Parallel and Distributed Data Processing Spring 2016 Notes 11: MapReduce Motivation Distribution makes simple computations complex Communication Load balancing Fault tolerance Not all applications
More informationCS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash
CS 781 Lecture 9 March 10, 2011 Topics: Local Search and Optimization Metropolis Algorithm Greedy Optimization Hopfield Networks Max Cut Problem Nash Equilibrium Price of Stability Coping With NP-Hardness
More informationModule 10: Query Optimization
Module 10: Query Optimization Module Outline 10.1 Outline of Query Optimization 10.2 Motivating Example 10.3 Equivalences in the relational algebra 10.4 Heuristic optimization 10.5 Explosion of search
More informationCrowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons
2015 The University of Texas at Arlington. All Rights Reserved. Crowdsourcing Pareto-Optimal Object Finding by Pairwise Comparisons Abolfazl Asudeh, Gensheng Zhang, Naeemul Hassan, Chengkai Li, Gergely
More information*** SAMPLE ONLY! The actual NASUNI-FILER-MIB file can be accessed through the Filer address>/site_media/snmp/nasuni-filer-mib ***
*** SAMPLE ONLY! The actual NASUNI-FILER-MIB file can be accessed through the https:///site_media/snmp/nasuni-filer-mib *** NASUNI-FILER-MIB DEFINITIONS ::= BEGIN -- Nasuni Filer
More informationReducing storage requirements for biological sequence comparison
Bioinformatics Advance Access published July 15, 2004 Bioinfor matics Oxford University Press 2004; all rights reserved. Reducing storage requirements for biological sequence comparison Michael Roberts,
More informationStaple Here. Student Name: End-of-Course Assessment. Algebra I. Pre-Test Performance Event
Staple Here Student Name: End-of-Course Assessment Algebra I Pre-Test Performance Event Summer 2014 A shipping company charges $1.20 times the sum, s, of the length, width, and height of a package to be
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu March 16, 2016 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More informationMethods that transform the original network into a tighter and tighter representations
hapter pproximation of inference: rc, path and i-consistecy Methods that transform the original network into a tighter and tighter representations all 00 IS 75 - onstraint Networks rc-consistency X, Y,
More informationMinimax strategies, alpha beta pruning. Lirong Xia
Minimax strategies, alpha beta pruning Lirong Xia Reminder ØProject 1 due tonight Makes sure you DO NOT SEE ERROR: Summation of parsed points does not match ØProject 2 due in two weeks 2 How to find good
More informationCS224W: Social and Information Network Analysis Jure Leskovec, Stanford University
CS224W: Social and Information Network Analysis Jure Leskovec, Stanford University http://cs224w.stanford.edu Find most influential set S of size k: largest expected cascade size f(s) if set S is activated
More informationFinding optimal configurations ( combinatorial optimization)
CS 1571 Introduction to AI Lecture 10 Finding optimal configurations ( combinatorial optimization) Milos Hauskrecht milos@cs.pitt.edu 539 Sennott Square Constraint satisfaction problem (CSP) Constraint
More informationFinding Frequent Items in Probabilistic Data
Finding Frequent Items in Probabilistic Data Qin Zhang, Hong Kong University of Science & Technology Feifei Li, Florida State University Ke Yi, Hong Kong University of Science & Technology SIGMOD 2008
More informationDot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics
Dot-Product Join: Scalable In-Database Linear Algebra for Big Model Analytics Chengjie Qin 1 and Florin Rusu 2 1 GraphSQL, Inc. 2 University of California Merced June 29, 2017 Machine Learning (ML) Is
More informationSequential Logic Optimization. Optimization in Context. Algorithmic Approach to State Minimization. Finite State Machine Optimization
Sequential Logic Optimization! State Minimization " Algorithms for State Minimization! State, Input, and Output Encodings " Minimize the Next State and Output logic Optimization in Context! Understand
More informationDecision Tree Learning
Decision Tree Learning Berlin Chen Department of Computer Science & Information Engineering National Taiwan Normal University References: 1. Machine Learning, Chapter 3 2. Data Mining: Concepts, Models,
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu 2/7/2012 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 Web pages are not equally important www.joe-schmoe.com
More informationArtificial Intelligence Heuristic Search Methods
Artificial Intelligence Heuristic Search Methods Chung-Ang University, Jaesung Lee The original version of this content is created by School of Mathematics, University of Birmingham professor Sandor Zoltan
More informationFrequent Itemsets and Association Rule Mining. Vinay Setty Slides credit:
Frequent Itemsets and Association Rule Mining Vinay Setty vinay.j.setty@uis.no Slides credit: http://www.mmds.org/ Association Rule Discovery Supermarket shelf management Market-basket model: Goal: Identify
More informationDecision Tree Learning
Topics Decision Tree Learning Sattiraju Prabhakar CS898O: DTL Wichita State University What are decision trees? How do we use them? New Learning Task ID3 Algorithm Weka Demo C4.5 Algorithm Weka Demo Implementation
More informationLocal Search and Optimization
Local Search and Optimization Outline Local search techniques and optimization Hill-climbing Gradient methods Simulated annealing Genetic algorithms Issues with local search Local search and optimization
More informationSPATIAL DATA MINING. Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM
SPATIAL DATA MINING Ms. S. Malathi, Lecturer in Computer Applications, KGiSL - IIM INTRODUCTION The main difference between data mining in relational DBS and in spatial DBS is that attributes of the neighbors
More informationHandout 1: Introduction to Dynamic Programming. 1 Dynamic Programming: Introduction and Examples
SEEM 3470: Dynamic Optimization and Applications 2013 14 Second Term Handout 1: Introduction to Dynamic Programming Instructor: Shiqian Ma January 6, 2014 Suggested Reading: Sections 1.1 1.5 of Chapter
More informationMatrix Assembly in FEA
Matrix Assembly in FEA 1 In Chapter 2, we spoke about how the global matrix equations are assembled in the finite element method. We now want to revisit that discussion and add some details. For example,
More informationEECS 349:Machine Learning Bryan Pardo
EECS 349:Machine Learning Bryan Pardo Topic 2: Decision Trees (Includes content provided by: Russel & Norvig, D. Downie, P. Domingos) 1 General Learning Task There is a set of possible examples Each example
More informationLecture 3: Decision Trees
Lecture 3: Decision Trees Cognitive Systems II - Machine Learning SS 2005 Part I: Basic Approaches of Concept Learning ID3, Information Gain, Overfitting, Pruning Lecture 3: Decision Trees p. Decision
More informationCausal Belief Decomposition for Planning with Sensing: Completeness Results and Practical Approximation
Causal Belief Decomposition for Planning with Sensing: Completeness Results and Practical Approximation Blai Bonet 1 and Hector Geffner 2 1 Universidad Simón Boĺıvar 2 ICREA & Universitat Pompeu Fabra
More informationLocal Search & Optimization
Local Search & Optimization CE417: Introduction to Artificial Intelligence Sharif University of Technology Spring 2017 Soleymani Artificial Intelligence: A Modern Approach, 3 rd Edition, Chapter 4 Outline
More informationData Mining. CS57300 Purdue University. Bruno Ribeiro. February 8, 2018
Data Mining CS57300 Purdue University Bruno Ribeiro February 8, 2018 Decision trees Why Trees? interpretable/intuitive, popular in medical applications because they mimic the way a doctor thinks model
More informationSummary. Local Search and cycle cut set. Local Search Strategies for CSP. Greedy Local Search. Local Search. and Cycle Cut Set. Strategies for CSP
Summary Greedy and cycle cut set Greedy... in short G.B. Shaw A life spent doing mistakes is not only more honorable, but more useful than a life spent doing nothing Reminder: Hill-climbing Algorithm function
More informationProcessing Aggregate Queries over Continuous Data Streams
Processing Aggregate Queries over Continuous Data Streams Alin Dobra Computer Science Department Cornell University April 15, 2003 Relational Database Systems did dname 15 Legal 17 Marketing 3 Development
More informationCS 188: Artificial Intelligence Spring Announcements
CS 188: Artificial Intelligence Spring 2011 Lecture 16: Bayes Nets IV Inference 3/28/2011 Pieter Abbeel UC Berkeley Many slides over this course adapted from Dan Klein, Stuart Russell, Andrew Moore Announcements
More informationLecture 6: Greedy Algorithms I
COMPSCI 330: Design and Analysis of Algorithms September 14 Lecturer: Rong Ge Lecture 6: Greedy Algorithms I Scribe: Fred Zhang 1 Overview In this lecture, we introduce a new algorithm design technique
More informationOptimisation and Operations Research
Optimisation and Operations Research Lecture 11: Integer Programming Matthew Roughan http://www.maths.adelaide.edu.au/matthew.roughan/ Lecture_notes/OORII/ School of Mathematical
More information2.4 Conditional Probability
2.4 Conditional Probability The probabilities assigned to various events depend on what is known about the experimental situation when the assignment is made. Example: Suppose a pair of dice is tossed.
More informationCPU Scheduling. CPU Scheduler
CPU Scheduling These slides are created by Dr. Huang of George Mason University. Students registered in Dr. Huang s courses at GMU can make a single machine readable copy and print a single copy of each
More informationWebsite What's Happening Residents Visitors Business Marina Town Government
Page 1 of 6 Website What's Happening Residents Visitors Business Marina Town Government MARINA BUZZ January 2017 "Go confidently in the direction of your dreams. Live the life you have imagined." Greetings!
More informationLarge-Scale Behavioral Targeting
Large-Scale Behavioral Targeting Ye Chen, Dmitry Pavlov, John Canny ebay, Yandex, UC Berkeley (This work was conducted at Yahoo! Labs.) June 30, 2009 Chen et al. (KDD 09) Large-Scale Behavioral Targeting
More informationRepetition Prepar Pr a epar t a ion for fo Ex am Ex
Repetition Preparation for Exam Written Exam Date: 2008 10 20 Time: 9:30 14:30 No book or other aids Maximal 40 points Grades: 0 19 U (did not pass) 20 27 3 (pass) 28 32 4 (verygood) 33 40 5 (excellent)
More informationUsing Geographical Information System Techniques for Finding Appropriate Location for opening up a new retail site
Using Geographical Information System Techniques for Finding Appropriate Location for opening up a new retail site Harpreet Rai Assistant Professor Department of computer applications SASIIT&R, Mohali
More informationFactorized Relational Databases Olteanu and Závodný, University of Oxford
November 8, 2013 Database Seminar, U Washington Factorized Relational Databases http://www.cs.ox.ac.uk/projects/fd/ Olteanu and Závodný, University of Oxford Factorized Representations of Relations Cust
More informationProf. S. Boyd. Final exam
EE263 Dec. 8 9 or Dec. 9 10, 2006. Prof. S. Boyd Final exam This is a 24 hour take-home final exam. Please turn it in at Bytes Cafe in the Packard building, 24 hours after you pick it up. Please read the
More informationCS145: INTRODUCTION TO DATA MINING
CS145: INTRODUCTION TO DATA MINING 4: Vector Data: Decision Tree Instructor: Yizhou Sun yzsun@cs.ucla.edu October 10, 2017 Methods to Learn Vector Data Set Data Sequence Data Text Data Classification Clustering
More informationIncremental Construction of Complex Aggregates: Counting over a Secondary Table
Incremental Construction of Complex Aggregates: Counting over a Secondary Table Clément Charnay 1, Nicolas Lachiche 1, and Agnès Braud 1 ICube, Université de Strasbourg, CNRS 300 Bd Sébastien Brant - CS
More informationResponse Time in Data Broadcast Systems: Mean, Variance and Trade-O. Shu Jiang Nitin H. Vaidya. Department of Computer Science
Response Time in Data Broadcast Systems: Mean, Variance and Trade-O Shu Jiang Nitin H. Vaidya Department of Computer Science Texas A&M University College Station, TX 7784-11, USA Email: fjiangs,vaidyag@cs.tamu.edu
More informationThe Market-Basket Model. Association Rules. Example. Support. Applications --- (1) Applications --- (2)
The Market-Basket Model Association Rules Market Baskets Frequent sets A-priori Algorithm A large set of items, e.g., things sold in a supermarket. A large set of baskets, each of which is a small set
More informationDatabases 2011 The Relational Algebra
Databases 2011 Christian S. Jensen Computer Science, Aarhus University What is an Algebra? An algebra consists of values operators rules Closure: operations yield values Examples integers with +,, sets
More informationMetric Embedding of Task-Specific Similarity. joint work with Trevor Darrell (MIT)
Metric Embedding of Task-Specific Similarity Greg Shakhnarovich Brown University joint work with Trevor Darrell (MIT) August 9, 2006 Task-specific similarity A toy example: Task-specific similarity A toy
More informationDecision Tree Learning
Decision Tree Learning Goals for the lecture you should understand the following concepts the decision tree representation the standard top-down approach to learning a tree Occam s razor entropy and information
More informationFinal. Introduction to Artificial Intelligence. CS 188 Spring You have approximately 2 hours and 50 minutes.
CS 188 Spring 2014 Introduction to Artificial Intelligence Final You have approximately 2 hours and 50 minutes. The exam is closed book, closed notes except your two-page crib sheet. Mark your answers
More informationP, NP, NP-Complete. Ruth Anderson
P, NP, NP-Complete Ruth Anderson A Few Problems: Euler Circuits Hamiltonian Circuits Intractability: P and NP NP-Complete What now? Today s Agenda 2 Try it! Which of these can you draw (trace all edges)
More informationDecision Trees. CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore
Decision Trees Claude Monet, The Mulberry Tree Slides from Pedro Domingos, CSC411/2515: Machine Learning and Data Mining, Winter 2018 Luke Zettlemoyer, Carlos Guestrin, and Andrew Moore Michael Guerzhoy
More information5 Spatial Access Methods
5 Spatial Access Methods 5.1 Quadtree 5.2 R-tree 5.3 K-d tree 5.4 BSP tree 3 27 A 17 5 G E 4 7 13 28 B 26 11 J 29 18 K 5.5 Grid file 9 31 8 17 24 28 5.6 Summary D C 21 22 F 15 23 H Spatial Databases and
More informationDecision Tree. Decision Tree Learning. c4.5. Example
Decision ree Decision ree Learning s of systems that learn decision trees: c4., CLS, IDR, ASSISA, ID, CAR, ID. Suitable problems: Instances are described by attribute-value couples he target function has
More informationData Mining and Machine Learning (Machine Learning: Symbolische Ansätze)
Data Mining and Machine Learning (Machine Learning: Symbolische Ansätze) Learning Individual Rules and Subgroup Discovery Introduction Batch Learning Terminology Coverage Spaces Descriptive vs. Predictive
More informationEasySDM: A Spatial Data Mining Platform
EasySDM: A Spatial Data Mining Platform (User Manual) Authors: Amine Abdaoui and Mohamed Ala Al Chikha, Students at the National Computing Engineering School. Algiers. June 2013. 1. Overview EasySDM is
More informationCSE 562 Database Systems
Outline Query Optimization CSE 562 Database Systems Query Processing: Algebraic Optimization Some slides are based or modified from originals by Database Systems: The Complete Book, Pearson Prentice Hall
More informationSemantics of Ranking Queries for Probabilistic Data and Expected Ranks
Semantics of Ranking Queries for Probabilistic Data and Expected Ranks Graham Cormode AT&T Labs Feifei Li FSU Ke Yi HKUST 1-1 Uncertain, uncertain, uncertain... (Probabilistic, probabilistic, probabilistic...)
More information(tree searching technique) (Boolean formulas) satisfying assignment: (X 1, X 2 )
Algorithms Chapter 5: The Tree Searching Strategy - Examples 1 / 11 Chapter 5: The Tree Searching Strategy 1. Ex 5.1Determine the satisfiability of the following Boolean formulas by depth-first search
More informationCS6220: DATA MINING TECHNIQUES
CS6220: DATA MINING TECHNIQUES Mining Graph/Network Data Instructor: Yizhou Sun yzsun@ccs.neu.edu November 16, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining Matrix Data Decision
More information5 Spatial Access Methods
5 Spatial Access Methods 5.1 Quadtree 5.2 R-tree 5.3 K-d tree 5.4 BSP tree 3 27 A 17 5 G E 4 7 13 28 B 26 11 J 29 18 K 5.5 Grid file 9 31 8 17 24 28 5.6 Summary D C 21 22 F 15 23 H Spatial Databases and
More informationScalable Asynchronous Gradient Descent Optimization for Out-of-Core Models
Scalable Asynchronous Gradient Descent Optimization for Out-of-Core Models Chengjie Qin 1, Martin Torres 2, and Florin Rusu 2 1 GraphSQL, Inc. 2 University of California Merced August 31, 2017 Machine
More informationHow hard is it to find a good solution?
How hard is it to find a good solution? Simons Institute Open Lecture November 4, 2013 Research Area: Complexity Theory Given a computational problem, find an efficient algorithm that solves it. Goal of
More informationEssentials of Large Volume Data Management - from Practical Experience. George Purvis MASS Data Manager Met Office
Essentials of Large Volume Data Management - from Practical Experience George Purvis MASS Data Manager Met Office There lies trouble ahead Once upon a time a Project Manager was tasked to go forth and
More information