Progressive & Algorithms & Systems

1 University of California Merced, Lawrence Berkeley National Laboratory

2 Progressive Computation for Data Exploration

3 Progressive Computation
Online Aggregation (OLA) in DB
[Figure: estimate and confidence bounds (ε) around the query result, plotted against time]
How to derive confidence bounds that shrink progressively?
How to generate a meaningful estimate and confidence bounds as early as possible?
How to minimize the estimation overhead?

4 Outline
  1. PF-OLA: Parallel Framework for Online Aggregation
  2. OLA-GD: Online Aggregation for Gradient Descent Optimization
  3. OLA-RAW: Online Aggregation over Raw Data

5 GLADE
[Diagram: GLA execution on nodes 1 to n. On each node, every chunk is processed by a GLA through Begin Chunk, Accumulate, and End Chunk. The per-chunk GLAs go through Local Merge and Local Term on the node. The per-node GLAs go through Remote Merge, and Term produces the result.]

6 GLADE Architecture
[Diagram]
Coordinator: Comm Manager, Query Manager, Code Generator, GLA Manager, Catalog
Node 1 ... Node n (each): Comm Manager, Query Manager, GLA Manager, Storage Manager, DataPath Exec. Engine, Code Loader

7 PF-OLA API
Method                                              Usage
Init()                                              Basic interface
Accumulate(Item d)                                  Basic interface
Merge(UDA input1, UDA input2, UDA output)           Basic interface (local and remote)
Terminate()                                         Basic interface (local and final)
BeginChunk()                                        Chunk processing
EndChunk()                                          Chunk processing
Serialize()                                         Transfer UDA across processes
Deserialize()                                       Transfer UDA across processes
EstimatorTerminate()                                Progressive computation
EstimatorMerge(UDA input1, UDA input2, UDA output)  Progressive computation
Estimate(estimator, lower, upper, confidence)       OLA estimation
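The basic and transfer parts of this interface can be illustrated with a minimal Python sketch. The class `AverageUDA`, the helper `run_node`, and the use of `pickle` for serialization are assumptions made for illustration only; they are not the PF-OLA implementation, and the estimator methods are left out.

```python
import pickle


class AverageUDA:
    """Hypothetical UDA computing an average; method names mirror the API slide."""

    def init(self):
        self.total = 0.0
        self.count = 0

    def accumulate(self, item):
        self.total += item
        self.count += 1

    @staticmethod
    def merge(input1, input2, output):
        # Compute both fields first so that output may alias an input.
        total = input1.total + input2.total
        count = input1.count + input2.count
        output.total, output.count = total, count

    def terminate(self):
        return self.total / self.count if self.count else None

    def serialize(self):
        return pickle.dumps((self.total, self.count))

    def deserialize(self, payload):
        self.total, self.count = pickle.loads(payload)


def run_node(chunks):
    """Accumulate each chunk into its own state, then merge the states locally."""
    local = AverageUDA()
    local.init()
    for chunk in chunks:
        state = AverageUDA()
        state.init()
        for item in chunk:
            state.accumulate(item)
        AverageUDA.merge(local, state, local)
    return local


node1 = run_node([[1.0, 2.0], [3.0]])
node2 = run_node([[4.0, 5.0, 6.0]])

# Ship node2's state as bytes, as a merge across processes would require.
received = AverageUDA()
received.init()
received.deserialize(node2.serialize())

final = AverageUDA()
final.init()
AverageUDA.merge(node1, received, final)
result = final.terminate()
```

The same merge function serves both the local step (inside `run_node`) and the remote step (after deserialization), which is the reason the state has to be mergeable and serializable.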

8 Partial Aggregation
[Diagram: execution engine processing. A waypoint issues WorkUnit 1 and WorkUnit 2 to a thread pool. WorkUnit 1 (Accumulate) takes a GLA from the GLA list and accumulates chunks into it. WorkUnit 2 (Merge) merges the GLAs that are done. When all GLAs have been merged, one copy is put back on the GLA list and the other is returned as the partial result.]

9 Parallel Sampling
Centralized random shuffling
  Permute data randomly at loading
  Scan produces larger samples
Stratified sampling
  Permute data randomly in each partition
  Direct extension of random shuffling to partitioned data
Global data randomization at loading
  Split data randomly at each node
  Permute all received data randomly
  Standard hash-based data partitioning
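A small Python sketch of the last two strategies, under the assumption that partitions fit in memory as lists. The function names `shuffle_partition` and `global_randomization` are hypothetical, and sending an item to a uniformly random node stands in for whatever partitioning the system actually uses.

```python
import random


def shuffle_partition(partition, rng):
    """Stratified sampling: permute one partition so any scan prefix is a sample of it."""
    permuted = list(partition)
    rng.shuffle(permuted)
    return permuted


def global_randomization(partitions, rng):
    """Each node splits its data randomly across all nodes; receivers permute what arrives."""
    num_nodes = len(partitions)
    received = [[] for _ in range(num_nodes)]
    for partition in partitions:
        for item in partition:
            received[rng.randrange(num_nodes)].append(item)
    return [shuffle_partition(r, rng) for r in received]


rng = random.Random(7)
partitions = [list(range(0, 10)), list(range(10, 20)), list(range(20, 30))]
stratified = [shuffle_partition(p, rng) for p in partitions]
randomized = global_randomization(partitions, rng)
```

With `stratified`, a prefix of partition i is a sample of partition i only. With `randomized`, every item could have landed on any node, so a prefix on any node is a sample of the whole data set.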

10 Sample Aggregation
[Diagram: two ways to combine per-chunk aggregates (AGG) and partial aggregates (PartAgg) into an estimate. Centralized: the partial aggregates feed a single estimator (EST). Distributed tree: local estimators (LocEst) feed EST.]

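11 Generic Sampling Estimator
AGG = SELECT SUM(f(d)) FROM D WHERE P(d)
S is a simple random sample without replacement from D.
Estimator:
$$X = \frac{|D|}{|S|} \sum_{s \in S,\, P(s)} f(s)$$
$$E[X] = \mathrm{AGG}$$
$$\mathrm{Var}[X] = \frac{|D| - |S|}{(|D| - 1)\,|S|} \left[ |D| \sum_{d \in D,\, P(d)} f^2(d) - \Big( \sum_{d \in D,\, P(d)} f(d) \Big)^2 \right]$$
$$\mathrm{EstVar}[X] = \frac{|D|\,(|D| - |S|)}{|S|^2\,(|S| - 1)} \left[ |S| \sum_{s \in S,\, P(s)} f^2(s) - \Big( \sum_{s \in S,\, P(s)} f(s) \Big)^2 \right]$$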
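The estimator and its estimated variance can be computed from two running sums over the sample. The sketch below is one way to do this in Python; the function name `estimate_sum` is hypothetical, and turning the variance into bounds with a normal approximation is an assumption, since the slide does not say how the bounds are derived. It requires a sample of at least two items.

```python
import math
import random
from statistics import NormalDist


def estimate_sum(sample, population_size, f, predicate, confidence=0.95):
    """Return (X, lower, upper) for SUM(f(d)) WHERE P(d) from a sample without replacement."""
    n = len(sample)
    N = population_size
    values = [f(s) for s in sample if predicate(s)]
    s1 = sum(values)                 # sum of f(s) over qualifying sample items
    s2 = sum(v * v for v in values)  # sum of f(s)^2 over qualifying sample items
    x = N / n * s1
    est_var = N * (N - n) / (n * n * (n - 1)) * (n * s2 - s1 * s1)
    # Assumption: normal approximation for the confidence bounds.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(max(est_var, 0.0))
    return x, x - half_width, x + half_width


rng = random.Random(0)
data = list(range(1, 1001))
sample = rng.sample(data, 100)
x, lower, upper = estimate_sum(sample, len(data), lambda d: d, lambda d: d % 2 == 0)
exact = estimate_sum(data, len(data), lambda d: d, lambda d: d % 2 == 0)
```

When the sample covers all of D, the factor |D| - |S| is zero, so the bounds collapse onto the exact answer. This is the progressive shrinking that online aggregation relies on.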

12 Parallel Sampling Estimators
Data are partitioned across N nodes: $D = D_1 \cup D_2 \cup \dots \cup D_N$
Take samples $S_i$, $1 \le i \le N$, independently at each node.
Single estimator
  Guarantee that $S = S_1 \cup S_2 \cup \dots \cup S_N$ is a sample from D
  Synchronized estimator: $\frac{|S_i|}{|D_i|} = k$ (const), $1 \le i \le N$
  Asynchronous estimator: global data randomization
Multiple estimators
  Stratified sampling
  Build an estimator $X_i$ for each partition $D_i$, $1 \le i \le N$:
  $$X_i = \frac{|D_i|}{|S_i|} \sum_{s \in S_i,\, P(s)} f(s)$$
  $X = \sum_{i=1}^{N} X_i$ is unbiased
  $$\mathrm{Var}\Big[ \sum_{i=1}^{N} X_i \Big] = \sum_{i=1}^{N} \mathrm{Var}[X_i]$$
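The multiple-estimators case can be sketched as follows: each partition produces its own estimate and estimated variance, and both are summed. The function names are hypothetical, and the per-partition variance reuses the estimated-variance formula of the generic sampling estimator, which is an assumption about how the per-node variance would be obtained in practice.

```python
def partition_estimate(sample, partition_size, f, predicate):
    """Estimate and estimated variance for one partition D_i from its sample S_i."""
    n, N = len(sample), partition_size
    values = [f(s) for s in sample if predicate(s)]
    s1 = sum(values)
    s2 = sum(v * v for v in values)
    x = N / n * s1
    var = N * (N - n) / (n * n * (n - 1)) * (n * s2 - s1 * s1)
    return x, var


def stratified_estimate(samples, partition_sizes, f, predicate):
    """Sum the per-partition estimators; variances add because samples are independent."""
    parts = [
        partition_estimate(sample, size, f, predicate)
        for sample, size in zip(samples, partition_sizes)
    ]
    return sum(x for x, _ in parts), sum(v for _, v in parts)


# Two partitions of sizes 6 and 4, sampled with 3 and 2 items respectively.
total, variance = stratified_estimate(
    samples=[[1, 2, 3], [10, 20]],
    partition_sizes=[6, 4],
    f=lambda d: d,
    predicate=lambda d: True,
)
```

Nodes do not need to sample at the same rate here, which is the practical advantage over the synchronized single estimator.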

13 Time to Convergence
SELECT n_name, SUM(l_extendedprice*(1-l_discount)*(1+l_tax))
FROM lineitem, supplier, nation
WHERE l_shipdate =
  AND l_quantity = 1
  AND l_discount between [0.02,0.03]
  AND l_suppkey = s_suppkey
  AND s_nationkey = n_nationkey
GROUP BY n_name
[Figure: relative error (%) vs. time (seconds); single estimator, high selectivity; 1, 2, 4, and 8 nodes]
TPC-H scale 8,000 (8TB)
Single node: 16 2GHz; 16GB RAM; 4 110MB/s throughput/disk
Cluster: 8 X worker + coordinator (9 nodes); Gigabit Ethernet; same rack

14 Estimation Overhead
[Figure: query execution time (seconds), no estimation vs. OLA, for four queries: Aggregate, Group small, Group large, Join]

15 References
Chengjie Qin and Florin Rusu. Sampling Estimators for Parallel Online Aggregation. BNCOD 2013.
Chengjie Qin and Florin Rusu. Parallel Online Aggregation in Action. SSDBM 2013. [Demo]
Chengjie Qin and Florin Rusu. PF-OLA: A High-Performance Framework for Parallel Online Aggregation. Distributed and Parallel Databases (DAPD), August 2013.

16 Outline
  1. PF-OLA: Parallel Framework for Online Aggregation
  2. OLA-GD: Online Aggregation for Gradient Descent Optimization
  3. OLA-RAW: Online Aggregation over Raw Data

17 Gradient Descent Optimization

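18 Gradient Descent as GLADE GLA
[Diagram: gradient descent expressed as a GLA on nodes 1 to n]
Node 1, chunks 1 to r (labeled η_11, ..., η_1r), partial models w_1i and w_1j updated during chunk processing:
$$w_{1i} = w_{1i} - \beta_k \nabla f_{\eta_1}(w_{1i}), \qquad w_{1j} = w_{1j} - \beta_k \nabla f_{\eta_1}(w_{1j})$$
End Chunk yields w_1i and w_1j; merge: $w_{1i} + w_{1j}$; Local Term yields $w_1$
Node n, chunks 1 to s (labeled η_n1, ..., η_ns), partial models w_ni and w_nj:
$$w_{ni} = w_{ni} - \beta_k \nabla f_{\eta_n}(w_{ni}), \qquad w_{nj} = w_{nj} - \beta_k \nabla f_{\eta_n}(w_{nj})$$
End Chunk yields w_ni and w_nj; merge: $w_{ni} + w_{nj}$; Local Term yields $w_n$
Remote merge: $w_1 + w_n$; Term yields $w$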
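The accumulate, merge, and terminate pattern in this diagram can be sketched in plain Python. The class `GradientDescentGLA`, the least-squares loss, and the choice to average the summed partial models in `terminate` are assumptions for illustration; the slide only shows that partial models are added when merged.

```python
class GradientDescentGLA:
    """Hypothetical GLA: each instance updates its own model copy while scanning a chunk."""

    def __init__(self, model, step_size):
        self.model = list(model)
        self.step_size = step_size
        self.copies = 1  # number of partial models summed into self.model

    def accumulate(self, example):
        # One gradient step on the least-squares loss 0.5 * (w.x - y)^2.
        x, y = example
        error = sum(w * xi for w, xi in zip(self.model, x)) - y
        self.model = [w - self.step_size * error * xi for w, xi in zip(self.model, x)]

    def merge(self, other):
        # Partial models are added, as in the w_1i + w_1j step of the diagram.
        self.model = [a + b for a, b in zip(self.model, other.model)]
        self.copies += other.copies

    def terminate(self):
        # Assumption: the summed models are averaged to obtain the next model.
        return [w / self.copies for w in self.model]


def run_iteration(chunks, model, step_size):
    """One pass: a fresh GLA per chunk, then merge all of them and terminate."""
    states = []
    for chunk in chunks:
        gla = GradientDescentGLA(model, step_size)
        for example in chunk:
            gla.accumulate(example)
        states.append(gla)
    merged = states[0]
    for other in states[1:]:
        merged.merge(other)
    return merged.terminate()


# Toy data generated by y = 2x, split into two chunks.
chunks = [[([1.0], 2.0), ([2.0], 4.0)], [([1.0], 2.0)]]
model = [0.0]
for _ in range(60):
    model = run_iteration(chunks, model, step_size=0.1)
```

In this sketch `accumulate` is only called before any merge, since a merged state holds a sum of models rather than a single model.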

19 Approximate Gradient Descent
Main idea: apply online aggregation (OLA) sampling to speed up the execution of a speculative iteration.
Speculative gradient descent (flowchart): start from w and Λ(w); select s steps α_1, α_2, ..., α_s, giving candidate models w_1, ..., w_s; compute gradient and loss; update the model; repeat until the model converges.
Approximate gradient descent (flowchart): start from w and Λ(w); select s steps α_1, α_2, ..., α_s, giving candidate models w_1, ..., w_s; estimate gradient and loss, and keep estimating until the estimators converge; update the model; repeat until the model converges.
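A minimal Python sketch of one approximate speculative iteration on a one-dimensional least-squares problem. Everything specific here is an assumption for illustration: the function names, the batch size, the normal-approximation bounds on the mean loss, and the stopping rule (stop sampling once the best candidate's upper bound is no higher than every other candidate's lower bound, or when the data is exhausted).

```python
import math
import random


def loss(w, example):
    x, y = example
    return 0.5 * (w * x - y) ** 2


def gradient(w, example):
    x, y = example
    return (w * x - y) * x


def mean_and_halfwidth(values, z=1.96):
    """Sample mean and a normal-approximation half-width for it."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, z * math.sqrt(var / n)


def speculative_step(w, data, step_sizes, batch=50):
    """Try all step sizes on a growing sample; stop once a winner is clear."""
    n = 0
    while True:
        n = min(n + batch, len(data))
        sample = data[:n]  # data is assumed to be randomly permuted already
        grad = sum(gradient(w, e) for e in sample) / n
        candidates = [w - a * grad for a in step_sizes]
        bounds = [mean_and_halfwidth([loss(c, e) for e in sample]) for c in candidates]
        best = min(range(len(candidates)), key=lambda i: bounds[i][0])
        upper = bounds[best][0] + bounds[best][1]
        separated = all(
            upper <= m - h for i, (m, h) in enumerate(bounds) if i != best
        )
        if separated or n == len(data):
            return candidates[best], n


rng = random.Random(1)
data = [(x, 3.0 * x) for x in (rng.random() for _ in range(2000))]
w, used = 0.0, []
for _ in range(200):
    w, n = speculative_step(w, data, step_sizes=[0.01, 0.1, 0.5])
    used.append(n)
```

The list `used` records how many examples each iteration consumed before its estimators were considered converged, which is the quantity a sample-percentage plot would report.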

20 Approximate BGD

23 Train SVM Model
[Figure: loss per example vs. time (seconds) for BGD and for OLA with different numbers of step sizes]
50M examples, 200 dimensions, 136GB

24 Sample Percentage
[Figure: sample percentage (%) vs. iteration #]
50M examples, 200 dimensions, 136GB

25 References
Chengjie Qin and Florin Rusu. Speculative Approximations for Terascale Analytics. CoRR abs/ , December.
Chengjie Qin and Florin Rusu. Speculative Approximations for Terascale Distributed Gradient Descent Optimization.
Florin Rusu, Chengjie Qin, and Martin Torres. Scalable Analytics Model Calibration with Online Aggregation. IEEE Data Engineering Bulletin, Vol. 38, No. 3, September 2015.

26 Outline
  1. PF-OLA: Parallel Framework for Online Aggregation
  2. OLA-GD: Online Aggregation for Gradient Descent Optimization
  3. OLA-RAW: Online Aggregation over Raw Data

27 Motivation & Approach
[Figure: timeline of queries Q1 to Q4 under DBMS, OLA, RAW, and OLA-RAW, with a load phase for DBMS and a shuffle phase for OLA]
[Figure: error ratio vs. elapsed time [sec] for DBMS, OLA, RAW, and OLA-RAW]

28 OLA-RAW Architecture
[Diagram components: Read (raw file to text chunks); four parallel code-generated Shuffle and Extract workers (text chunks to binary chunk samples); OLA Sample Estimator (samples to estimate and bounds); in-memory Sample Synopsis]

29 Parallel Bi-Level Sampling
[Figure with labels A, B, T comparing exact computation, chunk-level sampling, and bi-level sampling]
Inspection paradox: result order ≠ random chunk order
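Bi-level sampling can be illustrated with a standard two-stage estimator for a SUM: first sample chunks, then sample tuples inside each selected chunk, and scale up at both levels. The sketch below is an illustration under stated assumptions, not the OLA-RAW estimator: the function name `bi_level_estimate` is hypothetical, chunks are assumed non-empty, and the inspection paradox is ignored because the code runs sequentially.

```python
import random


def bi_level_estimate(chunks, num_chunks, tuples_per_chunk, f, rng):
    """Two-stage estimate of SUM(f(t)) over all tuples in all chunks."""
    sampled_chunks = rng.sample(chunks, num_chunks)
    total = 0.0
    for chunk in sampled_chunks:
        k = min(tuples_per_chunk, len(chunk))
        sampled_tuples = rng.sample(chunk, k)
        # Scale the tuple-level sample up to the full chunk.
        total += len(chunk) / k * sum(f(t) for t in sampled_tuples)
    # Scale the chunk-level sample up to the full file.
    return len(chunks) / num_chunks * total


rng = random.Random(3)
chunks = [[1, 2, 3], [4, 5], [6]]

# Chunk-level sampling only: take 2 of 3 chunks, but every tuple in them.
chunk_level = bi_level_estimate(chunks, 2, 3, lambda t: t, rng)

# Bi-level sampling: 2 of 3 chunks, at most 1 tuple from each.
bi_level = bi_level_estimate(chunks, 2, 1, lambda t: t, rng)

# Sampling everything at both levels reproduces the exact sum.
exact = bi_level_estimate(chunks, 3, 3, lambda t: t, rng)
```

Chunk-level sampling is the special case in which `tuples_per_chunk` is at least the chunk size, so only the outer scaling factor applies.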

30 Experimental Results
[Figures, for selectivity = 100%, 50%, and 10%: error ratio vs. elapsed time [sec] for series C-1, C-4, C-16, BI-1, BI-4, BI-16, EXT-1, EXT-4, EXT; ratio of sampled chunks vs. # threads for C and BI; ratio of sampled tuples vs. # threads for C and BI]
Florin Rusu, Progressive & Algorithms & Systems

31 References
Yu Cheng, Weijie Zhao, and Florin Rusu. OLA-RAW: Scalable Exploration over Raw Data. CoRR abs/ , February.
Yu Cheng, Weijie Zhao, and Florin Rusu. Bi-Level Online Aggregation on Raw Data. SSDBM 2017.

32 Thank you! Questions?
