QuickScorer: a fast algorithm to rank documents with additive ensembles of regression trees
1 QuickScorer: a fast algorithm to rank documents with additive ensembles of regression trees. Claudio Lucchese, Franco Maria Nardini, Raffaele Perego, Nicola Tonellotto (HPC Lab, ISTI-CNR, Pisa, Italy & Tiscali SpA); Salvatore Orlando (Università Ca' Foscari, Venice, Italy); Rossano Venturini (Università di Pisa, Pisa, Italy)
2 Ranking (in web search) is computationally expensive and requires trade-offs between efficiency and efficacy to be devised
3 Additive ensembles of regression trees [figure: two-step ranking pipeline. Query, then First step: Base Ranker over the Document Index (N docs), then Second step: Top Ranker using Features and a Learning to Rank Algorithm (K docs), then Results Page(s)] QuickScore: from X to 6.5X faster scoring
4 Additive ensembles of regression trees
5 Additive ensembles of regression trees. Yahoo! Learning to Rank Challenge: the winning proposal used a linear combination of 12 ranking models, 8 of which were LambdaMART boosted tree models, each with up to 3,000 trees. About 24,000 regression trees in total! [C. Burges, K. Svore, O. Dekel, Q. Wu, P. Bennett, A. Pastusiak and J. Platt, Microsoft Research]
6 Query-Document feature set Process of Query-Document Scoring
14 Query-Document feature set Exit leaf Process of Query-Document Scoring
15 Query-Document feature set Exit leaf Score += Process of Query-Document Scoring
16 Additive ensembles of regression trees: number of trees = 1K-20K; number of leaves = 4-64; number of docs = 3K-10K; number of features = [value missing]
17 SoA: Struct+ (naïve baseline). Each tree node is represented by a C++ object containing the feature id, the associated threshold, and the left and right pointers. [figure: Query-Document feature sets traversing a tree down to the exit leaf]
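The Struct+ baseline described on this slide can be sketched as follows. This is a minimal illustration, not the authors' code: the names `TreeNode`, `score_tree`, and `score_ensemble` are hypothetical, and storing the leaf output in a `value` field of the same struct is an assumption made to keep the example short.

```cpp
#include <vector>

// Hypothetical node layout: feature id, threshold, and left/right pointers,
// as listed on the slide. The `value` field (leaf output) is an assumption.
struct TreeNode {
    int fid = -1;                    // feature tested at this node (unused in leaves)
    float theta = 0.0f;              // threshold of the test x[fid] <= theta
    const TreeNode* left = nullptr;  // followed when the test is true
    const TreeNode* right = nullptr; // followed when the test is false
    double value = 0.0;              // output, meaningful only at leaves

    bool is_leaf() const { return left == nullptr && right == nullptr; }
};

// Root-to-leaf visit of a single tree: one pointer dereference and one
// data-dependent branch per level.
double score_tree(const TreeNode* node, const std::vector<float>& x) {
    while (!node->is_leaf()) {
        node = (x[node->fid] <= node->theta) ? node->left : node->right;
    }
    return node->value;
}

// Additive ensemble: the document score is the sum of the exit-leaf values.
double score_ensemble(const std::vector<const TreeNode*>& roots,
                      const std::vector<float>& x) {
    double score = 0.0;
    for (const TreeNode* root : roots) {
        score += score_tree(root, x);
    }
    return score;
}
```

Every level of the visit follows a pointer chosen by a comparison on the document's features, which is why this layout tends to suffer from branch mispredictions and cache misses on large ensembles.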
18 SoA: If-then-else. Each tree is compiled into nested conditionals evaluated on the Query-Document feature set:
    if (x[4] <= 50.1) {
        // recurses on the left subtree
    } else {
        // recurses on the right subtree
        if (x[3] <= -3.0)
            result = 0.4;
        else
            result = -1.4;
    }
19 SoA: If-then-else (same code as slide 18). Drawbacks: need to store the structure of the tree; high branch misprediction rate; low cache hit ratio.
20 SoA: VPred [Asadi et al., TKDE 2014]. Query-Document feature sets are processed 16 docs at a time [figure: feature vectors F1 ... F8 and a tree with nodes labeled by threshold:feature]. Branch-free traversal of a tree of depth 4:
    double depth4(float* x, Node* nodes) {
        int nodeid = 0;
        nodeid = nodes->children[x[nodes[nodeid].fid] > nodes[nodeid].theta];
        nodeid = nodes->children[x[nodes[nodeid].fid] > nodes[nodeid].theta];
        nodeid = nodes->children[x[nodes[nodeid].fid] > nodes[nodeid].theta];
        nodeid = nodes->children[x[nodes[nodeid].fid] > nodes[nodeid].theta];
        return scores[nodeid];
    }
21 QuickScore, a new efficient algorithm for the interleaved traversal of additive ensembles of regression trees by means of simple logical bitwise operations
22 QuickScore: false and true nodes
23 QuickScore: false and true nodes. Given the feature set, each node of a tree can be classified as True or False. [figure legend: True Node, False Node]
24-31 QuickScore: Single Tree Traversal [figure, eight animation frames: a single tree and its leaf BITVECTOR, updated step by step; bit values not recoverable]
32-34 QuickScore: use of false nodes masks [figure, three animation frames: BITVECTOR AND (mask) AND (mask) = (result); bit values not recoverable]. Few operations, insensitive to nodes processing order!
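The single-tree traversal with false-node masks can be sketched as below. This is an illustrative reading of the slides, not the authors' code. It assumes that a node is "false" when its test `x[fid] <= theta` fails, that leaves are numbered left to right with leaf 0 stored in the least significant bit, and that a tree has at most 64 leaves. The names `MaskedNode`, `lowest_set_bit`, and `exit_leaf` are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// One internal node of a tree. The mask has 0-bits for the leaves of the
// node's left subtree: if the node is false, those leaves cannot be the exit leaf.
struct MaskedNode {
    int fid;             // feature id tested by the node
    float theta;         // the node is true when x[fid] <= theta
    std::uint64_t mask;  // bit i == 0 means "leaf i is unreachable if this node is false"
};

// Index of the lowest set bit; v must be non-zero.
inline int lowest_set_bit(std::uint64_t v) {
    int i = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        ++i;
    }
    return i;
}

// Returns the index of the exit leaf. The nodes may be visited in any order:
// bitwise AND is commutative and associative, so the result does not change.
int exit_leaf(const std::vector<MaskedNode>& nodes, int num_leaves,
              const std::vector<float>& x) {
    // Start with every leaf marked as a candidate exit leaf.
    std::uint64_t v = (num_leaves == 64) ? ~0ULL : ((1ULL << num_leaves) - 1);
    for (const MaskedNode& n : nodes) {
        if (x[n.fid] > n.theta) {  // false node: discard its left-subtree leaves
            v &= n.mask;
        }
    }
    // The exit leaf is the leftmost surviving candidate.
    return lowest_set_bit(v);
}
```

For a tree with root `x[0] <= 5`, left child `x[1] <= 2`, and right child `x[2] <= 7`, the masks under these conventions would be `0xC`, `0xE`, and `0xB`. The exit leaf is never cleared, because a false node on the root-to-leaf path always sends the traversal into its right subtree.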
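35 QuickScore: data structures [figure: an offsets array with one entry per feature (|F| entries); per-feature blocks f_0, f_1, ..., f_{|F|-1} of node thresholds in increasing values with their bitmasks (num. leaves bits each); the result bitvectors v (num. trees entries of num. leaves bits); the leaves array (num. trees x num. leaves values). Access patterns labeled in the figure: read-only, sequential data access; R/W, random access; read-only, sequential access]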
36-39 QuickScore: interleaved tree traversals [figure, four animation frames: the Query-Document feature sets F_0, F_1, ... are scanned against the per-feature blocks f_0 ... f_{|F|-1} (offsets, thresholds in increasing values, bitvectors), updating v before the leaves are read]. Low branch misprediction rate. High cache hit ratio.
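The interleaved traversal over these data structures can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. The array names `offsets`, `v`, and `leaves` follow the slides, while `QsModel`, `thresholds`, `tree_ids`, `masks`, and `qs_score` are hypothetical names. The sketch also assumes at most 64 leaves per tree, leaf 0 stored in the least significant bit, and a node test of the form `x[fid] <= theta`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct QsModel {
    int num_trees = 0;
    int num_leaves = 0;                // leaves per tree (at most 64 in this sketch)
    std::vector<std::size_t> offsets;  // size |F| + 1; feature k owns [offsets[k], offsets[k+1])
    std::vector<float> thresholds;     // grouped by feature, increasing within each group
    std::vector<int> tree_ids;         // tree that owns each node
    std::vector<std::uint64_t> masks;  // false-node mask of each node
    std::vector<double> leaves;        // num_trees * num_leaves output values
};

// Index of the lowest set bit; v must be non-zero.
inline int lowest_set_bit(std::uint64_t v) {
    int i = 0;
    while ((v & 1u) == 0) {
        v >>= 1;
        ++i;
    }
    return i;
}

double qs_score(const QsModel& m, const std::vector<float>& x) {
    const std::uint64_t all =
        (m.num_leaves == 64) ? ~0ULL : ((1ULL << m.num_leaves) - 1);
    std::vector<std::uint64_t> v(m.num_trees, all);  // one result bitvector per tree

    // Step 1: feature by feature, scan the nodes in increasing threshold order.
    // Once a node is true, all later nodes on that feature are true as well,
    // so the scan stops early.
    for (std::size_t k = 0; k + 1 < m.offsets.size(); ++k) {
        for (std::size_t i = m.offsets[k]; i < m.offsets[k + 1]; ++i) {
            if (x[k] <= m.thresholds[i]) break;  // first true node on feature k
            v[m.tree_ids[i]] &= m.masks[i];      // false node: apply its mask
        }
    }

    // Step 2: the exit leaf of each tree is the leftmost surviving bit.
    double score = 0.0;
    for (int h = 0; h < m.num_trees; ++h) {
        const int leaf = lowest_set_bit(v[h]);
        score += m.leaves[static_cast<std::size_t>(h) * m.num_leaves + leaf];
    }
    return score;
}
```

The threshold, tree id, and mask arrays are only read, and in sequential order. The only data-dependent accesses are the updates to `v` and the final lookups in `leaves`, which matches the access patterns labeled on the data-structures slide.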
40 Experimental Settings. Lambda-MART ranking models optimizing [...], learned with RankLib from the MSN and Yahoo LETOR datasets. Ensembles with 1K, 5K, 10K, or 20K regression trees, each with up to 8, 16, 32, or 64 leaves. Intel Core CPU at 3.50 GHz, with 32 GB RAM, Ubuntu Linux.
41 Experimental Results: per-document scoring time in microseconds and speedups [slide shows an excerpt of the paper]

Table 2: Per-document scoring time in µs of QS, VPred, If-Then-Else and Struct+ on MSN-1 and Y!S1 datasets. Gain factors are reported in parentheses. ("-" marks a value lost in the transcription; "*" marks a value that appears truncated.)

Number of trees/dataset: 1,000 and 5,000
Leaves  Method        1,000 MSN-1   1,000 Y!S1    5,000 MSN-1   5,000 Y!S1
8       QS            2.2           4.3           10.5          14.3
        VPred         7.9 (3.6x)    8.5 (-)       40.2 (3.8x)   41.6 (2.9x)
        If-Then-Else  8.2 (3.7x)    10.3 (2.4x)   81.0 (7.7x)   85.8 (6.0x)
        Struct+       - (9.6x)      23.1 (5.4x)   - (10.3x)     - (7.9x)
16      QS            2.9           6.1           16.2          22.2
        VPred         16.0 (5.5x)   16.5 (2.7x)   82.4 (5.0x)   82.8 (3.7x)
        If-Then-Else  18.0 (6.2x)   21.8 (3.6x)   - (7.8x)      - (5.8x)
        Struct+       - (14.7x)     41.0 (6.7x)   - (26.2x)     - (18.2x)
32      QS            5.2           9.7           2*            34.3
        VPred         31.9 (6.1x)   31.6 (3.2x)   - (6.0x)      - (4.7x)
        If-Then-Else  34.5 (6.6x)   36.2 (3.7x)   - (11.1x)     - (8.0x)
        Struct+       - (13.3x)     67.4 (6.9x)   - (34.2x)     - (24.3x)
64      QS            9.5           15.1          56.3          66.9
        VPred         62.2 (6.5x)   57.6 (3.8x)   - (6.3x)      - (5.0x)
        If-Then-Else  55.9 (5.9x)   55.1 (3.6x)   - (16.6x)     - (14.0x)
        Struct+       - (11.6x)     - (7.7x)      - (29.5x)     - (23.2x)

Number of trees/dataset: 10,000 and 20,000
Leaves  Method        10,000 MSN-1  10,000 Y!S1   20,000 MSN-1  20,000 Y!S1
8       QS            20.0          -             40.5          48.1
        VPred         80.5 (4.0x)   82.7 (3.3x)   - (4.0x)      - (3.4x)
        If-Then-Else  - (9.3x)      - (7.3x)      - (17.5x)     - (16.0x)
        Struct+       - (18.7x)     - (15.4x)     - (28.4x)     - (23.7x)
16      QS            32.4          41.2          67.8          81.0
        VPred         - (5.1x)      - (4.0x)      - (4.9x)      - (4.1x)
        If-Then-Else  - (19.0x)     - (9.9x)      - (26.0x)     - (21.1x)
        Struct+       - (37.6x)     - (28.9x)     - (38.2x)     - (32.4x)
32      QS            59.6          70.3          -             -
        VPred         - (5.7x)      - (4.8x)      - (4.5x)      - (4.3x)
        If-Then-Else  - (23.4x)     - (19.8x)     - (20.4x)     - (19.4x)
        Struct+       - (30.3x)     - (25.2x)     - (29.6x)     - (27.0x)
64      QS            -             -             -             -
        VPred         - (4.7x)      - (4.4x)      - (3.0x)      - (4.1x)
        If-Then-Else  - (15.9x)     - (15.2x)     466* (11.0x)  - (14.0x)
        Struct+       - (19.3x)     - (18.4x)     - (12.8x)     - (15.9x)

Paper text visible on the slide: [...] same trivially holds for Struct+. This means that the interleaved traversal strategy of QS needs to process fewer nodes than a traditional root-to-leaf visit. This mostly explains the results achieved by QS. As far as the number of branches is concerned, we note that, not surprisingly, QS and VPred are much more efficient than If-Then-Else and Struct+ in this respect. QS has a larger total number of branches than VPred, which uses scoring functions that are branch-free. However, those [...]

[...] number of L3 cache misses starts increasing when dealing with 9,000 trees on MSN and 8,000 trees on Y!S1.

BWQS: a block-wise variant of QS. The previous experiments suggest that improving the cache efficiency of QS may result in significant benefits. As in Tang et al. [12], we can split the tree ensemble in disjoint blocks of size [...] that are processed separately in order to let the corresponding data structures fit into the faster levels of [...]
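42 Per-tree per-document low-level statistics on MSN-1 with 64-leaves λ-MART models.
Columns: five ensemble sizes (Number of Trees), starting at 1,000 and 5,000; the remaining column headers are garbled in the transcription.
Instruction Count (QS, VPred, If-Then-Else, Struct+): values lost in the transcription.
Num. Visited Nodes (above) and Visited Nodes/Total Nodes (below); only the percentages survive ("-" marks a lost value):
Method                  col 1  col 2  col 3  col 4  col 5
QS                      -      21%    25%    26%    29%
VPred                   -      89%    89%    88%    77%
Struct+ / If-Then-Else  64%    62%    59%    57%    50%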
43 MSN-1: Scoring Time and Cache Misses [plot: QS Scoring Time and QS Cache Misses; x-axis: Number of Trees (64 leaves); left y-axis: Scoring time per document per tree (µs); right y-axis: Cache Misses (%)]
44 MSN-1: Scoring Time and Cache Misses [same plot as slide 43]. We can split the tree ensemble in disjoint blocks processed separately in order to let the corresponding data structures fit into the faster levels of the memory hierarchy.
45 MSN-1: Scoring Time and Cache Misses [plot: QS Scoring Time, BWQS Scoring Time, QS Cache Misses, BWQS Cache Misses; x-axis: Number of Trees (64 leaves); left y-axis: Scoring time per document per tree (µs); right y-axis: Cache Misses (%)]. The block-wise version outperforms QS by up to 55% (MSN, 20K trees, 64 leaves).
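The block-wise idea can be illustrated with the loop order alone. The sketch below is not BWQS itself: it assumes that each tree is available as an opaque callable (`TreeScorer`, a hypothetical stand-in for a per-block QS model), and `blockwise_score` and the block size `tau` are likewise hypothetical names. It shows only how a batch of documents is scored against one block of trees before moving to the next block, so that the data of the current block can stay in the faster cache levels.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

using Doc = std::vector<float>;
// Hypothetical stand-in for one tree; in a block-wise QS, each block of trees
// would instead be encoded as its own set of QS data structures.
using TreeScorer = std::function<double(const Doc&)>;

// Scores all documents, one block of `tau` trees at a time (tau must be > 0).
// The result equals the plain sum over all trees; only the visiting order changes.
std::vector<double> blockwise_score(const std::vector<TreeScorer>& trees,
                                    const std::vector<Doc>& docs,
                                    std::size_t tau) {
    std::vector<double> scores(docs.size(), 0.0);
    for (std::size_t begin = 0; begin < trees.size(); begin += tau) {
        const std::size_t end = std::min(begin + tau, trees.size());
        // All documents are processed against the current block before the
        // next block is touched.
        for (std::size_t d = 0; d < docs.size(); ++d) {
            for (std::size_t t = begin; t < end; ++t) {
                scores[d] += trees[t](docs[d]);
            }
        }
    }
    return scores;
}
```

Choosing `tau` is a tuning decision: the block should be small enough for its data structures to fit in cache, yet large enough to amortize the repeated passes over the documents.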
46 Thank you!
More informationML techniques. symbolic techniques different types of representation value attribute representation representation of the first order
MACHINE LEARNING Definition 1: Learning is constructing or modifying representations of what is being experienced [Michalski 1986], p. 10 Definition 2: Learning denotes changes in the system That are adaptive
More informationAnomaly Detection for the CERN Large Hadron Collider injection magnets
Anomaly Detection for the CERN Large Hadron Collider injection magnets Armin Halilovic KU Leuven - Department of Computer Science In cooperation with CERN 2018-07-27 0 Outline 1 Context 2 Data 3 Preprocessing
More informationTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest Neighbors Frank Nielsen 1 Paolo Piro 2 Michel Barlaud 2 1 Ecole Polytechnique, LIX, Palaiseau, France 2 CNRS / University of Nice-Sophia Antipolis, Sophia
More informationNeighborhood-Based Local Sensitivity
Neighborhood-Based Local Sensitivity Paul N. Bennett Microsoft Research, One Microsoft Way, Redmond WA 98052, USA paul.n.bennett@microsoft.com Abstract. We introduce a nonparametric model for sensitivity
More informationWeight-balanced Binary Search Trees
Weight-balanced Binary Search Trees Weight-balanced BSTs are one way to achieve O(lg n) tree height. For this purpose, weight means size+1: weight(x) = size(x) + 1. A weight-balanced BST: is a binary search
More informationCS 243 Lecture 11 Binary Decision Diagrams (BDDs) in Pointer Analysis
CS 243 Lecture 11 Binary Decision Diagrams (BDDs) in Pointer Analysis 1. Relations in BDDs 2. Datalog -> Relational Algebra 3. Relational Algebra -> BDDs 4. Context-Sensitive Pointer Analysis 5. Performance
More informationDictionary: an abstract data type
2-3 Trees 1 Dictionary: an abstract data type A container that maps keys to values Dictionary operations Insert Search Delete Several possible implementations Balanced search trees Hash tables 2 2-3 trees
More informationLecture 7: DecisionTrees
Lecture 7: DecisionTrees What are decision trees? Brief interlude on information theory Decision tree construction Overfitting avoidance Regression trees COMP-652, Lecture 7 - September 28, 2009 1 Recall:
More informationBayesian Multiple Target Localization
Bayesian Multiple Target Localization Purnima Rajan (Johns Hopkins) Weidong Han (Princeton) Raphael Sznitman (Bern) Peter Frazier (Cornell, Uber) Bruno Jedynak (Johns Hopkins, Portland State) Thursday
More informationarxiv: v2 [cs.ir] 13 Jun 2018
Ranking Robustness Under Adversarial Document Manipulations arxiv:1806.04425v2 [cs.ir] 13 Jun 2018 ABSTRACT Gregory Goren gregory.goren@campus.techion.ac.il Technion Israel Institute of Technology Moshe
More informationClick Prediction and Preference Ranking of RSS Feeds
Click Prediction and Preference Ranking of RSS Feeds 1 Introduction December 11, 2009 Steven Wu RSS (Really Simple Syndication) is a family of data formats used to publish frequently updated works. RSS
More informationInformation Retrieval. Lecture 6
Information Retrieval Lecture 6 Recap of the last lecture Parametric and field searches Zones in documents Scoring documents: zone weighting Index support for scoring tf idf and vector spaces This lecture
More information