Impression Store: Compressive Sensing-based Storage for Big Data Analytics
Transcription
1 Impression Store: Compressive Sensing-based Storage for Big Data Analytics Jiaxing Zhang, Ying Yan, Liang Jeff Chen, Minjie Wang, Thomas Moscibroda & Zheng Zhang Microsoft Research
2 The Curse of O(N) in the Big Data Era
In the old days, an O(N) algorithm was efficient. But what if N is increasing fast?
Parallelism is only a partial solution: it reduces O(N) to O(N/k), where k is the number of machines.
It is an illusion that we can compute against all the data. What we collect is always a sample, and by the time we finish computing, new data has been generated.
Approximate results suffice!
3 Impression Store
Provides an abstraction of big data vectors.
Supports retrieval of big data components.
Stores impression information rather than raw data.
Improves performance: saves storage capacity, saves IO bandwidth, and scales well.
4 System Design
[Diagram: Node 1, Node 2, ..., Node L-1, Node L. Node $i$ receives updates $\Delta x_i$ and holds $f(\Delta x_i)$. Nodes answer Top-K/Outlier-K queries and exchange state through update synchronization, which is eventually consistent.]
Key technique: Compressive Sensing
1. High scalability: any node can process any update or query.
2. Efficient in storage, memory, IO and communication cost.
3. High throughput for impression queries: top, outlier and mode.
Compression: incremental updates in the compression domain; only big components are uncompressed.
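5 Introduction to Compressive Sensing
Compression: a sparse data vector $x = (x_1, x_2, \dots, x_N)$ of length $N$ is mapped to a measurement $y = (y_1, y_2, \dots, y_M)$ of length $M$ by $y = \Phi x$.
Random projection: $\Phi$ is an $M \times N$ random matrix with rows $\phi_1, \dots, \phi_M$, and $M \ll N$.
Decompression: a recovery algorithm such as Orthogonal Matching Pursuit (OMP) reconstructs the data vector $x = (x_1, x_2, \dots, x_N)$ from $y$.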
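The compress-then-recover cycle on this slide can be sketched in a few lines of NumPy. This is a minimal, textbook-style OMP under stated assumptions, not the authors' implementation: the function name `omp`, the Gaussian choice of $\Phi$, its $1/\sqrt{M}$ scaling, and the sizes `N`, `M`, `k` are all illustrative choices that do not come from the slides.

```python
import numpy as np

def omp(Phi, y, k):
    """Greedy Orthogonal Matching Pursuit: estimate a k-sparse x from y = Phi @ x."""
    n_cols = Phi.shape[1]
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the column most correlated with the current residual.
        corr = np.abs(Phi.T @ residual)
        if support:
            corr[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(corr)))
        # Least-squares fit of y on the columns chosen so far.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(n_cols)
    x_hat[support] = coef
    return x_hat

# Hypothetical sizes: a length-256 vector with 4 big components, 80 measurements.
rng = np.random.default_rng(0)
N, M, k = 256, 80, 4
x = np.zeros(N)
x[rng.choice(N, size=k, replace=False)] = rng.uniform(5.0, 10.0, size=k)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)  # random projection matrix
y = Phi @ x                                     # compression: length N -> length M
x_hat = omp(Phi, y, k)                          # decompression
```

Only the length-`M` vector `y` needs to be stored or shipped; the full vector is reconstructed on demand.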
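6 Compressive Sensing vs. Compression
Decomposable compression: if $x = x_1 + x_2$, then $y = \Phi x = \Phi(x_1 + x_2) = \Phi x_1 + \Phi x_2 = y_1 + y_2$.
Distributed aggregation: Node 1 holds $x_1$ and Node 2 holds $x_2$. With $y_1 = \Phi x_1$ and $y_2 = \Phi x_2$, the measurement of the data $x_1 + x_2$ is $y = y_1 + y_2$.
Continuously updated data: for base data $x_0$ and update $\Delta x$, with $y_1 = \Phi x_0$ and $y_2 = \Phi \Delta x$, the measurement of the data $x_0 + \Delta x$ is $y = y_1 + y_2$.
Recovery algorithm: OMP, or our BOMP. Big components are recovered with more precision.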
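Both uses of linearity on this slide can be checked numerically. The sketch below is illustrative only: the sizes, the random data, and the helper `apply_update` (which folds a single-entry update into the measurement by adding a scaled column of $\Phi$) are assumptions introduced here, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 1000, 50
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Distributed aggregation: two nodes compress independently, then measurements are added.
x1 = rng.standard_normal(N)
x2 = rng.standard_normal(N)
y1, y2 = Phi @ x1, Phi @ x2
y_merged = y1 + y2  # equals Phi @ (x1 + x2) up to rounding

def apply_update(y, Phi, i, delta):
    """Fold the update x[i] += delta into measurement y without touching x."""
    return y + delta * Phi[:, i]

# Continuously updated data: base data x0 plus a single-entry update.
x0 = x1.copy()
y = Phi @ x0
y = apply_update(y, Phi, 7, 3.5)
x0[7] += 3.5  # kept here only to verify the result
```

The update costs O(M) work on the measurement and never requires decompressing the base data.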
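7 Architecture
Clients issue updates and queries through the Impression Store API, to any node.
Each node accumulates the updates it receives (e.g. $\Delta x_1$ on Node 1) and computes the measurement $\Delta y_1 = \Phi \Delta x_1$.
Update synchronization passes partial sums of the $\Delta y_i$ between Node 1, Node 2, ..., Node L-1, Node L, so that each local measurement $y_1, y_2, \dots, y_L$ approaches the oracle measurement $y = \sum_{i=1}^{L} \Delta y_i$.
Top-K/Outlier-K queries are answered by recovery from the local measurement (e.g. from $y_L$ on Node L).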
8 Client
1. Maps a table into data vectors.
2. Translation: SQL -> operations on vectors.
(Architecture diagram as on slide 7.)
9 Architecture
Impression Store API:
Update(i, x)
Top(K): returns top-k
Outlier(K): returns outlier-k and mode
(Architecture diagram as on slide 7.)
10 Query Processing
Each node continuously works on three tasks:
1. Aggregate and compress data updates.
2. Update synchronization: O(M).
3. Top/outlier-K recovery: matrix computation -> GPU.
(Architecture diagram as on slide 7, with a Top/outlier-K recovery step on each node.)
11 Update Synchronization
Goal: each node's $y_i$ converges to $y$ quickly.
Updates are issued to any node, chosen at random.
(Architecture diagram as on slide 7.)
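12 Update Synchronization
Synchronization policy, on a loop-free topology. Node $l$ keeps its local update $\Delta y$ and the latest message $\psi_{q \to l}$ received from each neighbor $q \in N(l)$:
$y = \Delta y + \sum_{q \in N(l)} \psi_{q \to l}$
Send to $p$: $\psi_{l \to p} = \Delta y + \sum_{q \in N(l)} \psi_{q \to l} - \psi_{p \to l} = y - \psi_{p \to l}$
Topology choices:
Master-slave tree structure: small latency, but load is not balanced.
Line structure: long latency, but load is balanced.
Topologies in between: a trade-off.
The send and receive copies of a message may not be the same at all times! The policy is proved to achieve eventual consistency.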
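The message rule on this slide can be simulated on a small line topology to see every node reach the same measurement. This is a sketch under simplifying assumptions that are not in the slides: rounds are synchronous, the local updates $\Delta y$ are frozen while messages propagate, and the names `synchronize`, `neighbors` and `delta_y` are hypothetical.

```python
import numpy as np

def synchronize(neighbors, delta_y, rounds):
    """Run synchronous rounds of psi[l->p] = y_l - psi[p->l] on a loop-free topology."""
    m = len(next(iter(delta_y.values())))
    # psi[(q, l)] is the latest message node l has received from neighbor q.
    psi = {(q, l): np.zeros(m) for l in neighbors for q in neighbors[l]}
    for _ in range(rounds):
        new_psi = {}
        for l in neighbors:
            # Local view: own update plus everything heard from neighbors.
            y_l = delta_y[l] + sum(psi[(q, l)] for q in neighbors[l])
            for p in neighbors[l]:
                # Send p everything except what p itself contributed.
                new_psi[(l, p)] = y_l - psi[(p, l)]
        psi = new_psi
    return {l: delta_y[l] + sum(psi[(q, l)] for q in neighbors[l]) for l in neighbors}

# Line structure 0 - 1 - 2 - 3 (diameter 3), each node holding a random Delta y.
rng = np.random.default_rng(3)
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
delta_y = {l: rng.standard_normal(8) for l in neighbors}
y_nodes = synchronize(neighbors, delta_y, rounds=3)
total = sum(delta_y.values())
```

On this line, three rounds (the diameter) are enough for every node's view to equal the sum of all local updates; a shallower tree would need fewer rounds but concentrates traffic on the root, which is the latency versus load trade-off listed on the slide.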
13 Optimizations
Speed up the recovery, which costs $O(M^2 N)$. GPU: 30~40X speed-up.
For continuous updates: optimize the recovery algorithm by keeping the positions found in the last recovery. This reduces the complexity to $O(M^3)$.
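One way to read "keeping the positions of the last recovery" is to re-fit only the values on the previously recovered support and fall back to a full recovery when the fit no longer explains the measurement. The sketch below illustrates that reading; it is an assumption about the optimization, not the authors' algorithm, and `refit_on_support`, the tolerance `tol` and the example data are hypothetical.

```python
import numpy as np

def refit_on_support(Phi, y, support, tol=1e-8):
    """Re-estimate values on a known support by least squares.

    Returns (x_hat, ok). ok is False when the residual is too large,
    i.e. the big components have probably moved and a full recovery is needed.
    """
    sub = Phi[:, support]                      # M x |support|, independent of N
    coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
    residual = y - sub @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    ok = np.linalg.norm(residual) <= tol * max(1.0, np.linalg.norm(y))
    return x_hat, bool(ok)

rng = np.random.default_rng(2)
M, N = 32, 128
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
support = [3, 17, 42]                          # positions found by the last recovery
x = np.zeros(N)
x[support] = [5.0, 7.0, 9.0]

x[17] += 2.0                                   # update on a known position
x_hat, ok = refit_on_support(Phi, Phi @ x, support)

x[99] += 4.0                                   # update on a new position
_, ok_after_new_position = refit_on_support(Phi, Phi @ x, support)
```

Because the least-squares problem involves at most M columns of length M, its cost does not depend on N, which is consistent with the $O(M^3)$ bound quoted on the slide.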
14 Experiment Setup
Error metrics: $E_p$ and $E_v$.
Example: [tables: ground truth (Key, Value) and approximation (Key, Value)] $E_p$ = 20%, $E_v$ = 3.88%.
Workload: revenue on ads entries in the Bing search engine, grouped by 6 attributes (including Market, Vertical and QueryClass).
In total, N = 12,891 user-interested entries in the vector.
15 Preliminary Results (1)
Effect of M and N on recovery quality. [Plots: $E_p$ and $E_v$]
16 Preliminary Results (2)
Bigger values can be recovered much more accurately with a smaller M.
17 Preliminary Results (3)
Comparison with the traditional Top-K-only approach (K = M).
Traditional Top-K approach: each node keeps only the top-K of its own data ($x_1$, $x_2$); the two lists are merged into an approximated top-k of $x_1 + x_2$.
Compressive sensing approach: each node keeps $y_1 = \Phi x_1$ or $y_2 = \Phi x_2$; the recovery algorithm runs on $y = y_1 + y_2$ and returns an approximated top-k of $x_1 + x_2$.
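A toy example shows why merging per-node top-K lists can miss the true answer, while the sum of measurements stays exact by linearity. The data and the helper names `top_k` and `merge_top_k` below are hypothetical and chosen only to make the failure visible; they are not the experiment from the slides.

```python
import numpy as np

def top_k(x, k):
    """Return {index: value} for the k largest entries of x."""
    idx = np.argsort(x)[::-1][:k]
    return {int(i): float(x[i]) for i in idx}

def merge_top_k(a, b, k):
    """Sum two per-node top-k summaries and keep the k largest keys."""
    merged = dict(a)
    for i, v in b.items():
        merged[i] = merged.get(i, 0.0) + v
    ranked = sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])

# Key 2 is second on each node but the largest overall.
x1 = np.array([10.0, 0.0, 6.0, 0.0])
x2 = np.array([0.0, 10.0, 6.0, 0.0])

approx = merge_top_k(top_k(x1, 1), top_k(x2, 1), 1)  # traditional approach
exact = top_k(x1 + x2, 1)                             # ground truth
```

Each node discards key 2 before the merge, so no merge rule can bring it back. A linear measurement discards nothing per key: every entry contributes to $y_1$ and $y_2$, and the recovery step runs once on the combined $y$.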
18 Ongoing and Future Work
Support more sophisticated queries.
Explore CS and other techniques.
Work together with sampling.
Multiple parallel queries to different nodes can improve confidence.
19 Thanks!