Introduction of Recruit
- Josephine Barbra Jennings
- 6 years ago
Transcription
1 Apr. 11, 2018
2 Introduction of Recruit. We provide various kinds of online services across the world, from job search to hotel reservations. Service areas: Housing, Beauty, Travel, Life & Local, O2O, Education, Automobile, Bridal & Baby, Human Resources, IT & Trends, Media, Dining.
3 Introduction of Recruit. We help users find the best clients through our services. Data science plays an important role in the business. [Diagram labels: Internet, Users, Clients.]
4 Data Science at Recruit. Recruit has hosted two data mining competitions on Kaggle: Recruit Restaurant Visitor Forecasting (2018) and Coupon Purchase Prediction (2015). Kaggle and KDD Cup are international data mining competitions. We are passionate about data science: some of us came in 1st and 2nd place in KDD Cup 2015. [Figure: engineers at Recruit, as of March 2018.] © Recruit Communications Co., Ltd.
5 Feature Selection: A Key Technique. Feature selection is a key technique for winning data mining competitions: find the most relevant features and balance the bias-variance trade-off. Benefits: improved prediction and reduced computational cost. [Figure: data matrix with rows User 1 to User n and feature columns.] Reference: Beating Kaggle the easy way, studien/2015/dong_ying.pdf
6 Types of Feature Selection (FS) Algorithms. Wrapper methods: iteratively evaluate a feature subset with a black-box learning algorithm. Embedded methods: train a model and select features at the same time. Filter methods: select features by a criterion such as Mutual Information; they are independent of the learning algorithm and can be used as a pre-processing step.
7 What is Mutual Information (MI)? Mutual Information I(X;Y) is a measure of the mutual dependence between two random variables X and Y, defined in terms of Shannon entropy. High I(X;Y): Y can be predicted given X. Low I(X;Y): Y is hard to predict given X. Unlike Pearson's correlation coefficient, MI can capture non-linear relationships. [Figure: three scatter plots labeled Pearson r = 0.8, MI = 0.5; Pearson r = 0.0, MI = 0.7; Pearson r = 0.0, MI = ] Figures are retrieved from
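The contrast between MI and Pearson's r on this slide can be illustrated with a small sketch. It is not taken from the slides: the helper `mutual_information` is a hypothetical plug-in estimator for discrete data, and the toy relationship Y = X^2 is chosen only because it is non-linear and symmetric.

```python
import numpy as np

def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for two discrete 1-D arrays."""
    x = np.asarray(x).ravel()
    y = np.asarray(y).ravel()
    # Map raw values to integer codes 0..n-1 so they can index a count table.
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # joint counts
    joint /= joint.sum()                     # joint probabilities p(x, y)
    px = joint.sum(axis=1, keepdims=True)    # marginal p(x), shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)    # marginal p(y), shape (1, ny)
    nz = joint > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))

# Toy data (an assumption for illustration): Y is a deterministic,
# non-linear function of X on a symmetric grid.
x = np.array([-2, -1, 0, 1, 2] * 20)
y = x ** 2

pearson_r = float(np.corrcoef(x, y)[0, 1])
mi = mutual_information(x, y)
print(f"Pearson r = {pearson_r:.3f}, MI = {mi:.3f} nats")
```

Because Y is fully determined by X here, the MI equals the entropy of Y, while the linear correlation vanishes by symmetry.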
8 Mutual Information based Feature Selection (MIFS). MIFS uses Mutual Information as the criterion in a filter method. General formulation of MIFS: select the feature subset of size k that maximizes the Mutual Information (MI) between the selected features and the target variable.
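The general formulation described in words on this slide can be written as follows. The notation is an assumption, since the slide's own formula is not preserved in the transcript: F denotes the full feature set, X_S the random vector of the features in S, and Y the target variable.

```latex
S^{*} = \operatorname*{arg\,max}_{S \subseteq F,\; |S| = k} \; I(X_S ; Y)
```

Evaluating I(X_S; Y) exactly requires the joint distribution of all k selected features, which is why the following slides turn to heuristics and approximations.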
9 Heuristic MIFS Algorithms. Max Relevance method: select the most relevant feature iteratively (repeat k times). Min Redundancy & Max Relevance method (MRMR) [1]: select the most relevant and least redundant feature iteratively (repeat k times). References: [1] H. Peng et al., 2005; [2] J. R. Vergara & P. A. Estévez, 2015.
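The two greedy heuristics can be sketched as below. This is a minimal illustration, not the presenters' code: it assumes the relevance values I(X_i;Y) and pairwise redundancy values I(X_i;X_j) are already computed, uses the common difference form of the MRMR score (relevance minus mean redundancy with the already selected features), and the toy numbers are invented.

```python
import numpy as np

def max_relevance(rel, k):
    """Pick the k features with the largest relevance I(X_i;Y)."""
    order = np.argsort(-np.asarray(rel, dtype=float), kind="stable")
    return [int(i) for i in order[:k]]

def mrmr(rel, red, k):
    """Greedy MRMR: each step adds the feature with the best
    relevance minus mean redundancy w.r.t. the selected set."""
    rel = np.asarray(rel, dtype=float)
    red = np.asarray(red, dtype=float)
    selected = [int(np.argmax(rel))]          # start with the most relevant
    while len(selected) < k:
        candidates = [j for j in range(len(rel)) if j not in selected]
        scores = [rel[j] - red[selected, j].mean() for j in candidates]
        selected.append(candidates[int(np.argmax(scores))])
    return selected

# Toy values (assumed): features 0 and 1 are near-duplicates.
rel = [0.90, 0.85, 0.30]
red = np.array([[0.00, 0.80, 0.05],
                [0.80, 0.00, 0.05],
                [0.05, 0.05, 0.00]])

print(max_relevance(rel, 2))  # ignores redundancy
print(mrmr(rel, red, 2))      # penalizes the redundant feature
```

On this toy input Max Relevance keeps both near-duplicate features, whereas MRMR swaps the second one for the less relevant but non-redundant feature.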
10 Our Contributions. (1) We reformulate the MIFS optimization as a QUBO (Quadratic Unconstrained Binary Optimization) formulation of MIFS. (2) We confirmed that optimization by D-Wave does well in MIFS. [Figure: MI increase (%) w.r.t. Linear versus #features; higher is better.] HOW? Image is retrieved from
11 Reformulation of MIFS by QUBO (1): Expand the MI term. Proof. Theorem 1.1: chain rule for Conditional Mutual Information. Using Theorem 1.1, an equation holds for all i ∈ S. Averaging that equation over all i ∈ S leads to the expanded MI term.
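The slide's equations are not preserved in the transcript. A plausible reading, based only on the standard chain rule for mutual information, is sketched below. The notation (X_{S \setminus i} for the selected features other than i, and k = |S|) is an assumption.

```latex
% Chain rule, valid for every i \in S:
I(X_S ; Y) = I(X_i ; Y) + I(X_{S \setminus i} ; Y \mid X_i)

% Averaging over the k elements of S:
I(X_S ; Y) = \frac{1}{k} \sum_{i \in S}
  \Bigl[ I(X_i ; Y) + I(X_{S \setminus i} ; Y \mid X_i) \Bigr]
```

The averaged form is still exact; it only rewrites the objective so that each selected feature contributes one relevance term and one conditional term.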
12 Reformulation of MIFS by QUBO (2): Approximate under the assumption of Conditional Independence (CI). Proof. If we assume conditional independence, we can obtain the approximated form.
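The exact assumption used on the slide is not preserved. One way the approximation can arise, stated here as an assumption rather than as the presenters' proof, is to suppose that the features X_j with j in S \setminus i are conditionally independent given X_i and also given (X_i, Y). The conditional entropies then split into sums, and so does the conditional MI:

```latex
I(X_{S \setminus i} ; Y \mid X_i)
  = H(X_{S \setminus i} \mid X_i) - H(X_{S \setminus i} \mid X_i, Y)
  \approx \sum_{j \in S \setminus i}
      \bigl[ H(X_j \mid X_i) - H(X_j \mid X_i, Y) \bigr]
  = \sum_{j \in S \setminus i} I(X_j ; Y \mid X_i)
```

Under this reading the objective involves only single-feature and pairwise quantities, which is what makes a quadratic formulation possible.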
13 Reformulation of MIFS by QUBO (3): The optimization of MIFS is rewritten as the QUBO formulation of MIFS, which consists of an MI term and a penalty term for selecting only k features, where α is the penalty strength.
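A QUBO with an MI term and a cardinality penalty can be sketched as follows. This is an illustration under stated assumptions, not the presenters' implementation: the objective is taken to be minimizing -x^T Q x + α(Σ x_i - k)^2 over binary x, the matrix Q (relevance on the diagonal, pairwise conditional MI off the diagonal) holds invented toy values, and `mifs_qubo` and `brute_force` are hypothetical helper names. Exhaustive search stands in for an annealer on this tiny instance.

```python
import itertools
import numpy as np

def mifs_qubo(Q, k, alpha):
    """Return A such that x^T A x = -x^T Q x + alpha*(sum(x) - k)^2 - alpha*k^2
    for binary x (uses x_i^2 = x_i; the constant alpha*k^2 is dropped)."""
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    A = -Q + alpha * np.ones((n, n))                       # off-diagonal: -Q_ij + alpha
    A[np.diag_indices(n)] = -np.diag(Q) + alpha * (1 - 2 * k)  # diagonal terms
    return A

def brute_force(A):
    """Exhaustively minimize x^T A x over x in {0,1}^n (tiny n only)."""
    n = A.shape[0]
    best_x, best_e = None, np.inf
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        e = float(x @ A @ x)
        if e < best_e:
            best_x, best_e = x, e
    return best_x, best_e

# Toy MI matrix (assumed values, not measured from data).
Q = np.array([[0.50, 0.10, 0.40, 0.20],
              [0.10, 0.45, 0.30, 0.20],
              [0.40, 0.30, 0.20, 0.10],
              [0.20, 0.20, 0.10, 0.10]])

A = mifs_qubo(Q, k=2, alpha=10.0)
x_best, energy = brute_force(A)
selected = [int(i) for i in np.flatnonzero(x_best)]
print(selected, energy)
```

The penalty strength α has to be large enough that violating the size-k constraint never pays off; here 10.0 exceeds the sum of all entries of Q, so the minimizer selects exactly k features.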
14 Interpretation of the Derived Formulation. Expanding the derived formulation gives three terms: Relevance, Redundancy, and Complementarity. The objective increases relevance and complementarity and reduces redundancy.
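The three named terms match a standard identity for conditional mutual information, shown here as a likely reading of the slide (its own expansion is not preserved). The labels under the braces follow the slide's vocabulary.

```latex
I(X_j ; Y \mid X_i)
  = \underbrace{I(X_j ; Y)}_{\text{relevance}}
  - \underbrace{I(X_i ; X_j)}_{\text{redundancy}}
  + \underbrace{I(X_i ; X_j \mid Y)}_{\text{complementarity}}
```

A pair of features therefore scores well when each is informative about Y, they share little information with each other, and they become more dependent once Y is known.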
15 Comparison of Optimization Methods. Problem formulations and their optimization methods: for the Binary Quadratic Problem (BQP), Linear Relaxation [1] (Linear) and Truncated Power [1,2] (TPower); for QUBO, Tabu Search by qbsolv [3] and D-Wave 2000Q. References: [1] H. Venkateswara, et al., 2015; [2] X. T. Yuan & T. Zhang, 2013; [3]
16 Linear Relaxation Method (Linear). Linearize the quadratic term by introducing new variables. One of the optimality conditions leads to a linear problem. Since Q_ij ≥ 0, the solution of this problem is given by the k largest column sums of Q. This solution is tightly bounded [1]. Time complexity is O(nk). Reference: [1] H. Venkateswara, et al., 2015.
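The closing rule of this slide, taking the k largest column sums of Q, can be sketched in a few lines. The function name `linear_select` and the toy matrix are assumptions for illustration.

```python
import numpy as np

def linear_select(Q, k):
    """Select the k features whose columns of Q have the largest sums."""
    col_sums = np.asarray(Q, dtype=float).sum(axis=0)
    top = np.argsort(-col_sums, kind="stable")[:k]
    return sorted(int(i) for i in top)

# Toy non-negative MI matrix (assumed values).
Q = np.array([[0.50, 0.10, 0.40, 0.20],
              [0.10, 0.45, 0.30, 0.20],
              [0.40, 0.30, 0.20, 0.10],
              [0.20, 0.20, 0.10, 0.10]])

linear_choice = linear_select(Q, k=2)
print(linear_choice)
```

The same routine could serve as the pre-processing step that narrows n candidate features down to the hardware size h before annealing.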
17 Truncated Power Method (TPower). The problem is defined as finding the largest k-sparse eigenvector of Q. We select the i-th feature if x_i > 0. The solution is calculated by an iterative procedure [1] that is repeated T times. This method is confirmed to be the best-performing method for BQP problems with a non-negative matrix [2]. Time complexity of the algorithm is O(Tn^2). References: [1] X. T. Yuan & T. Zhang, 2013; [2] H. Venkateswara, et al., 2015.
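The iterative procedure is not preserved in the transcript. A minimal sketch of a truncated power iteration is given below, assuming the usual form: multiply by Q, keep the k largest-magnitude entries, normalize, and repeat T times. The uniform starting vector, the function names, and the toy matrix are assumptions.

```python
import numpy as np

def truncate(v, k):
    """Zero out all but the k largest-magnitude entries of v."""
    out = np.zeros_like(v)
    keep = np.argsort(-np.abs(v), kind="stable")[:k]
    out[keep] = v[keep]
    return out

def tpower(Q, k, T=100):
    """Approximate the largest k-sparse eigenvector of Q and
    return the indices i with x_i > 0."""
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0]
    x = np.ones(n) / np.sqrt(n)        # assumed initial vector
    for _ in range(T):
        y = truncate(Q @ x, k)         # power step, then sparsify
        x = y / np.linalg.norm(y)      # renormalize to unit length
    return sorted(int(i) for i in np.flatnonzero(x > 0))

# Toy non-negative MI matrix (assumed values).
Q = np.array([[0.50, 0.10, 0.40, 0.20],
              [0.10, 0.45, 0.30, 0.20],
              [0.40, 0.30, 0.20, 0.10],
              [0.20, 0.20, 0.10, 0.10]])

tpower_choice = tpower(Q, k=2)
print(tpower_choice)
```

Each iteration costs one matrix-vector product, which is consistent with the O(Tn^2) complexity stated on the slide.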
18 Optimization by D-Wave Machine. We used the D-Wave machine with the following settings. Machine: D-Wave 2000Q. Embedding: 64-bit full connection. Annealing time: 20 µs. Annealing repetitions: 10. When the feature size n is larger than the hardware size h (= 64), we use Linear as a pre-processing step to narrow the candidate features down to h. [Figure: full-connection embedding for C(4,4,4).]
19 Comparison of Mutual Information Score. We compared the MI scores of each optimization method on a public dataset. The graph shows the increase in MI score with regard to Linear. [Figure: MI increase (%) w.r.t. Linear versus #features; higher is better.] Dataset: a1a, #features: 122, #data points: 8000.
20 Classification Accuracy. We calculated the classification accuracy for different #features. Accuracy is a good measure of the quality of a selected feature subset. Procedure: select k features from the original features, then measure the classification accuracy with random forest classifiers.
21 Classification Accuracy. We evaluated each method by classification accuracy for different #features. [Figure: classification accuracy versus #features for D-Wave, TPower, Tabu(qbsolv), and Linear; higher is better.] Dataset: a1a, #features: 122, #data points: 8000.
22 Summary. We derived the QUBO formulation of MIFS so that the problem can be embedded in Ising machines. We used the D-Wave quantum annealing machine as a solver for MIFS. Optimization by D-Wave outperformed TPower, which is the state-of-the-art optimization method for BQP. We are planning to use MIFS by D-Wave in Kaggle!
23 Thank you for listening.
24 Runtime of Optimization Methods.
Method:           Linear     TPower      Tabu(qbsolv)   D-Wave
Average runtime:  9.0 msec   26.1 msec   14.3 sec       9.0 msec (Linear), μsec (annealing)
Dataset: a1a, #features: 122, #data points:
25 Comparison to MRMR, Max Rel. [Figure: accuracy versus #features for D-Wave, MRMR, and Max Rel.] Dataset: a1a, #features: 122, #data points: