PetaBricks: Variable Accuracy and Online Learning

1 PetaBricks: Variable Accuracy and Online Learning Jason Ansel MIT - CSAIL May 4, 2011 Jason Ansel (MIT) PetaBricks May 4, / 40

2 Outline
  1. Motivating Example
  2. PetaBricks Language Overview
  3. Variable Accuracy
  4. Offline Autotuner
  5. Results: Variable Accuracy
  6. Online Autotuner: SiblingRivalry
  7. Results: Sibling Rivalry
  8. Conclusions

6 A motivating example
How would you write a fast sorting algorithm?
  Insertion sort
  Quick sort
  Merge sort
  Radix sort
  Binary tree sort, Bitonic sort, Bubble sort, Bucket sort, Burstsort, Cocktail sort, Comb sort, Counting sort, Distribution sort, Flashsort, Heapsort, Introsort, Library sort, Odd-even sort, Postman sort, Samplesort, Selection sort, Shell sort, Stooge sort, Strand sort, Timsort?
  Poly-algorithms

7 std::stable_sort (/usr/include/c++/4.5.2/bits/stl_algo.h)

11 std::sort (/usr/include/c++/4.5.2/bits/stl_algo.h)
Why 16? Why 15?
  Dates back to at least 2000 (June 2000 SGI release)
  Still in the current C++ STL shipped with GCC
  10+ years of _S_threshold = 16

13 Is 15 the right number?
The best cutoff (CO) changes. It depends on competing costs:
  Cost of computation (< operator, call overhead, etc.)
  Cost of communication (swaps)
  Cache behavior (misses, prefetcher, locality)
Sorting doubles with std::stable_sort:
  CO 200 optimal on a Phenom 905e (15% speedup over CO = 15)
  CO 400 optimal on an Opteron 6168 (15% speedup over CO = 15)
  CO 500 optimal on a Xeon E5320 (34% speedup over CO = 15)
  CO 700 optimal on a Xeon X5460 (25% speedup over CO = 15)
The compiler's hands are tied: it is stuck with 15.
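The cutoff discussed on this slide can be treated as a measured parameter instead of a constant. The sketch below is an illustration in Python, not the libstdc++ implementation: `hybrid_merge_sort`, `best_cutoff`, the candidate list, and the input size are all hypothetical names and values chosen for the example. It times a merge sort that switches to insertion sort below a cutoff and reports whichever candidate was fastest on the current machine.

```python
import random
import time


def insertion_sort(a, lo, hi):
    """Sort a[lo:hi] in place."""
    for i in range(lo + 1, hi):
        x = a[i]
        j = i - 1
        while j >= lo and a[j] > x:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = x


def merge(left, right):
    """Stable merge of two sorted lists (ties are taken from the left)."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            out.append(right[j])
            j += 1
        else:
            out.append(left[i])
            i += 1
    return out + left[i:] + right[j:]


def hybrid_merge_sort(a, cutoff, lo=0, hi=None):
    """Merge sort that hands ranges shorter than `cutoff` to insertion sort."""
    if hi is None:
        hi = len(a)
    if hi - lo < max(cutoff, 2):
        insertion_sort(a, lo, hi)
        return
    mid = (lo + hi) // 2
    hybrid_merge_sort(a, cutoff, lo, mid)
    hybrid_merge_sort(a, cutoff, mid, hi)
    a[lo:hi] = merge(a[lo:mid], a[mid:hi])


def best_cutoff(candidates, n=2000, trials=3, seed=0):
    """Time each candidate cutoff on the same random input; return the fastest."""
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n)]
    timings = {}
    for cutoff in candidates:
        best = float("inf")
        for _ in range(trials):
            work = list(data)
            start = time.perf_counter()
            hybrid_merge_sort(work, cutoff)
            best = min(best, time.perf_counter() - start)
        timings[cutoff] = best
    return min(timings, key=timings.get), timings
```

Calling `best_cutoff([4, 16, 64, 256])` on two different machines may return different winners, which is the point the slide makes about a hard-coded 15.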

15 Back to our motivating example
How would you write a fast sorting algorithm?
  Insertion sort
  Quick sort
  Merge sort
  Radix sort
  Binary tree sort, Bitonic sort, Bubble sort, Bucket sort, Burstsort, Cocktail sort, Comb sort, Counting sort, Distribution sort, Flashsort, Heapsort, Introsort, Library sort, Odd-even sort, Postman sort, Samplesort, Selection sort, Shell sort, Stooge sort, Strand sort, Timsort?
  Poly-algorithms
Answer: It depends!

18 Autotuned parallel sorting algorithms
On a Xeon E7340 (2 x 4 cores):
  1. Insertion sort below [...]
  2. Quick sort below [...]
  3. [...]-way parallel merge sort
On a Sun Fire T200 Niagara (8 cores):
  1. 16-way merge sort below [...]
  2. [...]-way merge sort below [...]
  3. [...]-way merge sort below [...]
  4. [...]-way parallel merge sort
235% slowdown running the Niagara algorithm on the Xeon.
8% slowdown running the Xeon algorithm on the Niagara.
We need a way to express algorithmic choices to enable autotuning.

21 Algorithmic choices
Language:
  either {
    InsertionSort(out, in);
  } or {
    QuickSort(out, in);
  } or {
    MergeSort(out, in);
  } or {
    RadixSort(out, in);
  }
Representation: decision tree synthesized by our evolutionary algorithm (EA)
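One way to picture the decision-tree representation is as a list of size cutoffs that selects an algorithm at every level of the recursion. The Python sketch below is an illustration only, not the PetaBricks compiler's representation: `make_selector`, `poly_sort`, the two algorithms, and the cutoff of 16 used in the usage note are hypothetical.

```python
import bisect
import heapq


def make_selector(cutoffs, algorithms):
    """Build a one-dimensional decision tree over the input size.

    algorithms[i] handles sizes below cutoffs[i]; the last algorithm
    handles everything at or above the largest cutoff.
    """
    if len(algorithms) != len(cutoffs) + 1 or list(cutoffs) != sorted(cutoffs):
        raise ValueError("need ascending cutoffs and one more algorithm than cutoffs")

    def select(n):
        return algorithms[bisect.bisect_right(cutoffs, n)]

    return select


def insertion_sort(xs, recurse):
    """Binary insertion sort; ignores `recurse` because it never subdivides."""
    out = []
    for x in xs:
        bisect.insort(out, x)
    return out


def merge_sort(xs, recurse):
    """Split in half, sort each half with whatever the selector picks, merge."""
    mid = len(xs) // 2
    return list(heapq.merge(recurse(xs[:mid]), recurse(xs[mid:])))


def poly_sort(xs, select):
    """Poly-algorithm: the choice is made again for every sub-problem."""
    if len(xs) < 2:
        return list(xs)
    return select(len(xs))(xs, lambda sub: poly_sort(sub, select))
```

With `select = make_selector([16], [insertion_sort, merge_sort])`, `poly_sort(data, select)` uses merge sort at the top and falls back to insertion sort once sub-problems drop below 16 elements. An autotuner would search over the cutoff list and the algorithm order instead of fixing them by hand.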

22 The PetaBricks language
Choices expressed in the language:
  High-level algorithmic choices
  Dependency-based synthesized outer control flow
  Parallelization strategy
Programs automatically adapt to their environment:
  Tuned using our bottom-up evolutionary algorithm
  Offline autotuner or always-on online autotuner

25 Variable accuracy algorithms
Many problems don't have a single correct answer:
  Soft computing
  Approximation algorithms for NP-hard problems
  DSP algorithms
  Different grid resolutions
  Data precisions
  Iterative algorithms
  Choosing convergence criteria

27 Variable accuracy example
Example:
  ...
  for (int i = 0; i < 100; ++i) {
    SORIteration(tmp);
  }
  ...
Competing objectives of performance and accuracy.
Must maximize performance while meeting accuracy targets.

30 Accuracy metrics and for_enough loops
Language:
  accuracy_metric MyRMSError
  ...
  for_enough {
    SORIteration(tmp);
  }
Representation: function from problem size to number of iterations, synthesized by our EA
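The idea of a function from problem size to iteration count can be illustrated on a small model problem. The Python sketch below is an assumption-heavy stand-in, not the PetaBricks autotuner: it uses a 1D Laplace problem with a known exact solution, a fixed relaxation factor of 1.5, and an exhaustive search for the smallest iteration count that meets an RMS error target. The names `sor_iteration`, `rms_error`, `iterations_needed`, and `iteration_table` are hypothetical.

```python
import math


def sor_iteration(u, omega=1.5):
    """One successive over-relaxation sweep for u'' = 0 with fixed end values."""
    for i in range(1, len(u) - 1):
        u[i] += omega * (0.5 * (u[i - 1] + u[i + 1]) - u[i])


def rms_error(u):
    """Accuracy metric: RMS distance from the exact solution u(x) = x."""
    n = len(u) - 1
    return math.sqrt(sum((u[i] - i / n) ** 2 for i in range(n + 1)) / (n + 1))


def iterations_needed(n, target, max_iters=100000):
    """Smallest number of sweeps after which the RMS error is at most `target`."""
    u = [0.0] * (n + 1)
    u[n] = 1.0  # boundary conditions u(0) = 0, u(1) = 1
    for k in range(max_iters + 1):
        if rms_error(u) <= target:
            return k
        sor_iteration(u)
    raise RuntimeError("accuracy target not reached")


def iteration_table(sizes, target):
    """The tuned artifact: a map from problem size to iteration count."""
    return {n: iterations_needed(n, target) for n in sizes}
```

`iteration_table([4, 8, 16, 32], 1e-3)` yields one iteration count per grid size. A loop of that many sweeps then plays the role of `for_enough` for the given accuracy target.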

32 Accuracy variables
Language:
  accuracy_metric MyRMSError
  accuracy_variable k
  ...
  for (int i = 0; i < k; ++i) {
    SORIteration(tmp);
  }
Representation: function from problem size to k, synthesized by our EA

33 Variable accuracy and algorithmic choices
Language:
  accuracy_metric MyRMSError
  ...
  either {
    for_enough {
      SORIteration(tmp);
    }
  } or {
    Multigrid(tmp);
  } or {
    DirectSolve(tmp);
  }

47 Traditional evolution algorithm
  Initial population   72.7s  10.5s   4.1s  31.2s   Cost = 118.5
  Generation 2          4.2s   5.1s   2.6s  13.2s   Cost = 25.1
  Generation 3          2.8s   0.1s   3.8s   2.3s   Cost = 9.0
  Generation 4          0.3s   0.1s   0.4s   2.4s   Cost = 3.2
The cost of autotuning is front-loaded in the initial (unfit) population.
We could speed up tuning if we started with a faster initial population.
Key insight: smaller input sizes can be used to form a better initial population.

54 Bottom-up evolutionary algorithm
  Train on input size 1, to form initial population for:
  Train on input size 2, to form initial population for:
  Train on input size 8, to form initial population for:
  Train on input size 16, to form initial population for:
  Train on input size 32, to form initial population for:
  Train on input size 64
Naturally exploits optimal substructure of problems.
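The bottom-up schedule can be sketched as a loop that doubles the input size and carries the surviving population forward. The Python code below is a toy under stated assumptions, not the PetaBricks tuner: a configuration is a single integer, mutation is a small random offset, and `toy_cost` is a hypothetical stand-in for timing a real run. `evolve` and `bottom_up_tune` are illustrative names.

```python
import random


def toy_cost(config, n):
    """Hypothetical stand-in for a timed run: cheapest at config == 16, grows with n."""
    return n * (1 + abs(config - 16))


def evolve(population, cost, n, generations, rng):
    """A few generations at input size n: mutate, evaluate, keep the fittest."""
    size = len(population)
    for _ in range(generations):
        children = [max(1, c + rng.randint(-4, 4)) for c in population]
        # Elitist selection: parents compete with their children.
        population = sorted(population + children, key=lambda c: cost(c, n))[:size]
    return population


def bottom_up_tune(cost, max_size, population_size=4, generations=5, seed=0):
    """Train on sizes 1, 2, 4, ... so each size starts from the previous winners."""
    rng = random.Random(seed)
    population = [rng.randint(1, 64) for _ in range(population_size)]
    n = 1
    while n <= max_size:
        population = evolve(population, cost, n, generations, rng)
        n *= 2
    return population[0]
```

`bottom_up_tune(toy_cost, 64)` spends its early, least-fit generations on inputs of size 1 and 2, where each evaluation is cheap, and only evaluates already-selected candidates at size 64.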

61 Variable accuracy autotuner
[Figure: population at generation i and at generation i + 1]
  Partition accuracy space into discrete levels
  Prune population to have a fixed number of representatives from each level
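The pruning step might look like the following Python sketch. It rests on two assumptions that the slide does not state: a candidate can represent a level when its measured accuracy meets or exceeds that level (higher is better), and the representatives kept are the fastest such candidates. The function name `prune` and the tuple layout are hypothetical.

```python
def prune(population, levels, keep):
    """Keep up to `keep` of the fastest candidates for each accuracy level.

    population: list of (name, time, accuracy) tuples.
    levels: accuracy targets that partition the accuracy space.
    """
    survivors = []
    for level in levels:
        eligible = sorted(
            (c for c in population if c[2] >= level),
            key=lambda c: c[1],
        )
        for candidate in eligible[:keep]:
            # One candidate may be the representative for several levels.
            if candidate not in survivors:
                survivors.append(candidate)
    return survivors
```

A candidate that is very fast but meets no accuracy level is dropped, while a slow candidate survives if it is the only one reaching the highest level.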

64 Changing accuracy requirements
[Figure: speedup (x) versus input size, one curve per accuracy level; panels: Clustering, Bin Packing, Image Compression, 3D Helmholtz, 2D Poisson, Preconditioner]

69 Multigrid choice space
Pseudo code:
  accuracy_metric MyRMSError
  ...
  either {
    for_enough {
      SORIteration(tmp);
    }
  } or {
    Multigrid(tmp);
  } or {
    DirectSolve(tmp);
  }
[Figure: axes Grid Size and Time; labels SOR Iteration and Direct Solve]

73 Autotuned V-cycle shapes
[Figure: V-cycle shapes; axis label Grid Size]

74 Autotuned bin packing algorithms

76 Challenges for online learning
Search space is difficult to model:
  High-dimensional
  Non-linear
  Irregular
  Complex dependencies
Dangerous configurations exist:
  Exponential algorithms
  Infinite loops
  Poor quality of service

78 SiblingRivalry (online autotuner)
[Diagram labels: Processor, Safe Configuration, Experimental Configuration]

79 SiblingRivalry (online autotuner)
  Split available resources in half
  Process identical requests on both halves
  Race two candidate configurations (safe and experimental) and terminate the slower algorithm
  Initial slowdown (from duplicating the request) can be overcome by the autotuner
  Surprisingly, reduces average power consumption per request
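The race can be imitated in a few lines of Python. This is a simplified sketch, not the SiblingRivalry runtime: it uses two threads instead of two halves of a machine, and it stops the loser through a shared event (cooperative cancellation) where the slide describes terminating the slower algorithm. `race`, `quick_config`, and `runaway_config` are hypothetical names. A configuration is any function taking the request and the stop event and returning a result, or `None` if it was cancelled.

```python
import threading


def race(request, safe, experimental):
    """Run both configurations on the same request; return (winner, result)."""
    stop = threading.Event()
    lock = threading.Lock()
    outcome = []

    def run(name, config):
        result = config(request, stop)
        with lock:
            # Only the first finished configuration is recorded as the winner.
            if result is not None and not outcome:
                outcome.append((name, result))
                stop.set()  # ask the slower configuration to give up

    threads = [
        threading.Thread(target=run, args=("safe", safe)),
        threading.Thread(target=run, args=("experimental", experimental)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return outcome[0]


def quick_config(request, stop):
    """A well-behaved configuration."""
    return sorted(request)


def runaway_config(request, stop):
    """Models a dangerous configuration: never finishes unless cancelled."""
    while not stop.wait(0.001):
        pass
    return None
```

Because the safe configuration always runs alongside the experiment, a runaway experimental configuration costs at most the duplicated work and never delays the answer past the safe configuration's running time.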

80 Learning technique
  Maintain a population of candidate algorithms
  Each candidate must be Pareto-optimal in a 3D objective space:
    Performance
    Quality of service
    Confidence
  Pick safe and experimental configurations from the population
  Mutate the experimental configuration
  Add the new configuration to the population if it wins the race
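Pareto-optimality over three objectives can be checked with a direct dominance test. The Python sketch below assumes each candidate is scored as a tuple (performance, quality of service, confidence) with higher values being better in every position. That encoding, and the names `dominates` and `pareto_front`, are assumptions made for the example.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]
```

A candidate such as `(2, 2, 2)` stays in the population next to `(3, 1, 1)` because neither beats the other on all three objectives, while `(1, 1, 1)` is dominated and would be discarded.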

81 Adaptive operator selection
  Extension of bandit-based differential evolution [DaCosta et al.]
  Deterministically chooses mutation operators
  Requires only relative performance information
  Considers the trade-off between exploitation and exploration
  $\arg\max_i \left( AUC_i + C \sqrt{\frac{2 \log \sum_k n_k}{n_i}} \right)$
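Reading the expression on this slide as the usual upper-confidence-bound rule, arg max over i of AUC_i + C * sqrt(2 log(sum_k n_k) / n_i), the selection step can be sketched in Python as follows. The interpretation of the symbols is an assumption: `auc[i]` is taken to be the credit score of operator i, `uses[i]` the number of times it has been applied, and `c` the exploration weight. Trying unused operators first is also an assumption of this sketch, and `choose_operator` is a hypothetical name.

```python
import math


def choose_operator(auc, uses, c=1.0):
    """Return the index of the operator with the highest bandit score."""
    total = sum(uses)

    def score(i):
        if uses[i] == 0:
            return math.inf  # assumption: untried operators are explored first
        exploration = math.sqrt(2 * math.log(total) / uses[i])
        return auc[i] + c * exploration

    return max(range(len(auc)), key=score)
```

With equal usage counts the operator with the higher credit wins (exploitation). A rarely used operator gains a large exploration bonus and is eventually retried even when its credit is low.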

84 Experimental setup
Baseline: offline training on a development machine (N cores), deploy the tuned application with no change, run on the production machine (M cores) using M threads.
SiblingRivalry: online training on the production machine (M cores), racing M/2 threads vs. M/2 threads.
Compare throughput of both techniques.
Timeline: offline training on machine 1 (1 hour), deploy the tuned application, then production machine 1 (10 minutes), migrate to production machine 2 (10 minutes), migrate to production machine 3 (10 minutes).

85 SiblingRivalry: throughput
[Figure: normalized throughput per benchmark (Sort, Poisson, Matrix Multiply, LU Factorization, Image Compression, Helmholtz, Clustering, Bin Packing, GeoMean) for two setups, offline:xeon8/online:amd48 and offline:amd48/online:xeon32; surviving annotations include 6.9x and 9.7x]

86 SiblingRivalry: energy usage (on AMD48)
[Figure: energy per request (joules), Baseline vs. SiblingRivalry, per benchmark (Sort, Poisson, Matrix Multiply, LU Factorization, Image Compression, Helmholtz, Clustering, Bin Packing, Mean); one surviving annotation reads 3.9]

88 Conclusions
Motivating goal of PetaBricks: make programs future-proof by allowing them to adapt to their environment.
We can do better than hard-coded constants!

89 Thanks! Questions?
