CSE Introduction to Parallel Processing. Chapter 2. A Taste of Parallel Algorithms

Size: px
Start display at page:

Download "CSE Introduction to Parallel Processing. Chapter 2. A Taste of Parallel Algorithms"

Transcription

1 Dr.. Izadi CSE-0 Introduction to Parallel Processing Chapter 2 A Taste of Parallel Algorithms Consider five basic building-block parallel operations Implement them on four simple parallel architectures Learn about the nature of parallel computations and interplay between algorithm and architecture

2 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page Some Simple Computations x 0 identity element x x 2 t = 0 t = t = 2 t = Fig x n 2. x n t = n t = n s Semigroup computation on a uniprocessor. x 0 x x 2 x x x x x x x x 0 s Semigroup computation viewed as a tree or fan-in computation.

3 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 x 0 identity element s 0 x s x 2 s 2... x n 2 x n t = 0 t = t = 2 t = t = n t = n s n 2 s n Prefix computation on a uniprocessor.. Packet routing one processor sending a packet of data to another. Broadcasting one processor sending a packet of data to all others. Sorting processors cooperating in rearranging their data into desired order

4 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page Some Simple Architectures P 0 P P 2 P P P P P P P 0 P P 2 P P P P P P Fig A linear array of nine processors and its ring variant. Diameter of linear array: D = p (Max) Node degree: d = 2 P 0 P P P 2 P P P P P Fig. 2.. A balanced (but incomplete) binary tree of nine processors. Diameter of balanced binary tree: D = 2 log 2 p ; or one less (Max) Node degree: d = We almost always deal with complete binary trees: p one less than a power of 2 D = 2 log 2 (p + ) 2

5 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 P 0 P P 2 P 0 P P 2 P P P P P P P P P P P P Fig. 2.. A 2D mesh of nine processors and its torus variant. Diameter of r (p/r) mesh: D = r + p/r 2 (Max) Node degree: d = Square meshes preferred; they minimize D (= 2 p 2) P 0 P P P P 2 P P P P Fig. 2.. A shared-variable architecture modeled as a complete graph. Diameter of complete graph: D = (Max) Node degree: d = p

6 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 2. Algorithms for a Linear Array P 0 P P 2 P P P P P P 2 Initial values Maximum identified Fig. 2.. Maximum-finding on a linear array of nine processors. P 0 P P 2 P P P P P P Initial values Final results Fig. 2.. Computing prefix sums on a linear array of nine processors. Diminished prefix computation: the ith result excludes the ith element (e.g., sum of the first i elements)

7 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 P 0 P P 2 P P P P P P = Initial values Local prefixes Linear-array diminished prefix sums Final results Fig. 2.. Computing prefix sums on a linear array with two items per processor. Packet routing or broadcasting: right- and left-moving packets have no conflict

8 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page Fig. 2.. Sorting on a linear array with the keys input sequentially from the left.

9 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 P 0 P P 2 P P P P P P In odd steps,,,, etc., odd-numbered processors exchange values with their right neighbors Fig Odd-even transposition sort on a linear array. For odd-even transposition sort: Speed-up = O(p log p) / p = O(log p) Efficinecy = O((log p) / p) Redundancy = O(p / (log p)) Utilization = /2

10 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 2. Algorithms for a Binary Tree x 0x x 2x x x 0x x0 x x2 x 2x x x x Upward Propagation x x Identity Identity x 0x Identity x 0 x 0x x 0x x 2 Downward Propagation x x x 0 2 x 0x x 2 x x x 0x x 2x x 0 x x 0 x 0x x 2 x 0x x 2x x 0x x 2x x Results Fig. 2.. Parallel prefix computation on a binary tree of processors.

11 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 Some applications of the parallel prefix computation Finding the rank of each in a list of 0s and s: Data : Prefix sums : Ranks of s : 2 Priority circuit: Data : Diminished prefix ORs : Complement : AND with data : Carry computation in fast adders Let g, p, and a denote the event that a particular digit position in the adder generates, propagates, or annihilates a carry. The input data for the carry circuit consists of a vector of three-valued elements such as: p g a g g p p p g a c in g or a direction of indexing Parallel prefix computation using the carry operator p x = x x propagates over p, for all x {g, p, a} a x = a x is annihilated or absorbed by a g x = g x is immaterial; a carry is generated

12 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 0 Packet routing on a tree P 0 P P P 2 P P P P P A balanced binary tree with preorder node indices. maxl (maxr) = largest node number in the left (right) subtree if dest = self then remove the packet {done} else if dest < self or dest > maxr then route upward else if dest maxl then route leftward else route rightward endif endif endif

13 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page Other indexing schemes might lead to simpler routing algorithms XXX LXX RXX LLX LRX RLX RRL RRX RRR Broadcasting is done via the root node

14 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2 Sorting: let the root see all data in nondescending order 2 2 (a) (b) 2 2 (c) (d) Fig The first few steps of the sorting algorithm on a binary tree. Bisection Width = Fig. 2.. The bisection width of a binary tree architecture.

15 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2. Algorithms for a 2D Mesh 2 Row maximums Column maximums Finding the max value on a 2D mesh Row prefix sums 0 Diminished prefix sums in last column 0 Broadcast in rows and combine Computing prefix sums on a 2D mesh Row-major order required if the operator is not commutative Routing and broadcasting done via row/column operations Initial values Snake-like row sort Top-to-bottom column sort Snake-like row sort Top-to-bottom column sort Left-to-right row sort Fig. 2.. Phase Phase 2 Phase The shearsort algorithm on a mesh.

16 Introduction to Parallel Processing: Algorithms and Architectures Instructor s Manual, Vol. 2 (/00), Page 2. Algorithms with Shared Variables P 0 P P P P 2 P P P P Fig. 2.. A shared-variable architecture modeled as a complete graph. Semigroup computation: each processor read all values in turn and combine Parallel prefix: processor i read/combine values 0 to i Both of the above are quite inefficient, given the high cost Packet routing and broadcasting: one step, assuming allport communication Sorting: rank each element by comparing it to all others, then permute according to ranks Figure for Problem 2..

R ij = 2. Using all of these facts together, you can solve problem number 9.

R ij = 2. Using all of these facts together, you can solve problem number 9. Help for Homework Problem #9 Let G(V,E) be any undirected graph We want to calculate the travel time across the graph. Think of each edge as one resistor of 1 Ohm. Say we have two nodes: i and j Let the

More information

All of the above algorithms are such that the total work done by themisω(n 2 m 2 ). (The work done by a parallel algorithm that uses p processors and

All of the above algorithms are such that the total work done by themisω(n 2 m 2 ). (The work done by a parallel algorithm that uses p processors and Efficient Parallel Algorithms for Template Matching Sanguthevar Rajasekaran Department of CISE, University of Florida Abstract. The parallel complexity of template matching has been well studied. In this

More information

EE/CSCI 451: Parallel and Distributed Computation

EE/CSCI 451: Parallel and Distributed Computation EE/CSCI 451: Parallel and Distributed Computation Lecture #19 3/28/2017 Xuehai Qian Xuehai.qian@usc.edu http://alchem.usc.edu/portal/xuehaiq.html University of Southern California 1 From last class PRAM

More information

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26

Problem. Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26 Binary Search Introduction Problem Problem Given a dictionary and a word. Which page (if any) contains the given word? 3 / 26 Strategy 1: Random Search Randomly select a page until the page containing

More information

On Detecting Multiple Faults in Baseline Interconnection Networks

On Detecting Multiple Faults in Baseline Interconnection Networks On Detecting Multiple Faults in Baseline Interconnection Networks SHUN-SHII LIN 1 AND SHAN-TAI CHEN 2 1 National Taiwan Normal University, Taipei, Taiwan, ROC 2 Chung Cheng Institute of Technology, Tao-Yuan,

More information

Hardware Design I Chap. 4 Representative combinational logic

Hardware Design I Chap. 4 Representative combinational logic Hardware Design I Chap. 4 Representative combinational logic E-mail: shimada@is.naist.jp Already optimized circuits There are many optimized circuits which are well used You can reduce your design workload

More information

HYPERCUBE ALGORITHMS FOR IMAGE PROCESSING AND PATTERN RECOGNITION SANJAY RANKA SARTAJ SAHNI Sanjay Ranka and Sartaj Sahni

HYPERCUBE ALGORITHMS FOR IMAGE PROCESSING AND PATTERN RECOGNITION SANJAY RANKA SARTAJ SAHNI Sanjay Ranka and Sartaj Sahni HYPERCUBE ALGORITHMS FOR IMAGE PROCESSING AND PATTERN RECOGNITION SANJAY RANKA SARTAJ SAHNI 1989 Sanjay Ranka and Sartaj Sahni 1 2 Chapter 1 Introduction 1.1 Parallel Architectures Parallel computers may

More information

Logic BIST. Sungho Kang Yonsei University

Logic BIST. Sungho Kang Yonsei University Logic BIST Sungho Kang Yonsei University Outline Introduction Basics Issues Weighted Random Pattern Generation BIST Architectures Deterministic BIST Conclusion 2 Built In Self Test Test/ Normal Input Pattern

More information

Program Performance Metrics

Program Performance Metrics Program Performance Metrics he parallel run time (par) is the time from the moment when computation starts to the moment when the last processor finished his execution he speedup (S) is defined as the

More information

2 Introduction to Network Theory

2 Introduction to Network Theory Introduction to Network Theory In this section we will introduce some popular families of networks and will investigate how well they allow to support routing.. Basic network topologies The most basic

More information

Tight Bounds on the Ratio of Network Diameter to Average Internode Distance. Behrooz Parhami University of California, Santa Barbara

Tight Bounds on the Ratio of Network Diameter to Average Internode Distance. Behrooz Parhami University of California, Santa Barbara Tight Bounds on the Ratio of Network Diameter to Average Internode Distance Behrooz Parhami University of California, Santa Barbara About This Presentation This slide show was first developed in fall of

More information

Week-I. Combinational Logic & Circuits

Week-I. Combinational Logic & Circuits Week-I Combinational Logic & Circuits Overview Binary logic operations and gates Switching algebra Algebraic Minimization Standard forms Karnaugh Map Minimization Other logic operators IC families and

More information

Efficient Distributed Quantum Computing

Efficient Distributed Quantum Computing Efficient Distributed Quantum Computing Steve Brierley Heilbronn Institute, Dept. Mathematics, University of Bristol October 2013 Work with Robert Beals, Oliver Gray, Aram Harrow, Samuel Kutin, Noah Linden,

More information

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1>

Chapter 5. Digital Design and Computer Architecture, 2 nd Edition. David Money Harris and Sarah L. Harris. Chapter 5 <1> Chapter 5 Digital Design and Computer Architecture, 2 nd Edition David Money Harris and Sarah L. Harris Chapter 5 Chapter 5 :: Topics Introduction Arithmetic Circuits umber Systems Sequential Building

More information

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits

Logic and Computer Design Fundamentals. Chapter 5 Arithmetic Functions and Circuits Logic and Computer Design Fundamentals Chapter 5 Arithmetic Functions and Circuits Arithmetic functions Operate on binary vectors Use the same subfunction in each bit position Can design functional block

More information

COL 730: Parallel Programming

COL 730: Parallel Programming COL 730: Parallel Programming PARALLEL SORTING Bitonic Merge and Sort Bitonic sequence: {a 0, a 1,, a n-1 }: A sequence with a monotonically increasing part and a monotonically decreasing part For some

More information

Algorithms for Collective Communication. Design and Analysis of Parallel Algorithms

Algorithms for Collective Communication. Design and Analysis of Parallel Algorithms Algorithms for Collective Communication Design and Analysis of Parallel Algorithms Source A. Grama, A. Gupta, G. Karypis, and V. Kumar. Introduction to Parallel Computing, Chapter 4, 2003. Outline One-to-all

More information

CSE 202: Design and Analysis of Algorithms Lecture 3

CSE 202: Design and Analysis of Algorithms Lecture 3 CSE 202: Design and Analysis of Algorithms Lecture 3 Instructor: Kamalika Chaudhuri Announcement Homework 1 out Due on Tue Jan 24 in class No late homeworks will be accepted Greedy Algorithms Direct argument

More information

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication.

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication. CME342 Parallel Methods in Numerical Analysis Matrix Computation: Iterative Methods II Outline: CG & its parallelization. Sparse Matrix-vector Multiplication. 1 Basic iterative methods: Ax = b r = b Ax

More information

Boolean Algebra and Digital Logic 2009, University of Colombo School of Computing

Boolean Algebra and Digital Logic 2009, University of Colombo School of Computing IT 204 Section 3.0 Boolean Algebra and Digital Logic Boolean Algebra 2 Logic Equations to Truth Tables X = A. B + A. B + AB A B X 0 0 0 0 3 Sum of Products The OR operation performed on the products of

More information

CS361 Homework #3 Solutions

CS361 Homework #3 Solutions CS6 Homework # Solutions. Suppose I have a hash table with 5 locations. I would like to know how many items I can store in it before it becomes fairly likely that I have a collision, i.e., that two items

More information

Data selection. Lower complexity bound for sorting

Data selection. Lower complexity bound for sorting Data selection. Lower complexity bound for sorting Lecturer: Georgy Gimel farb COMPSCI 220 Algorithms and Data Structures 1 / 12 1 Data selection: Quickselect 2 Lower complexity bound for sorting 3 The

More information

1 Boolean Algebra Simplification

1 Boolean Algebra Simplification cs281: Computer Organization Lab3 Prelab Our objective in this prelab is to lay the groundwork for simplifying boolean expressions in order to minimize the complexity of the resultant digital logic circuit.

More information

Binary Decision Diagrams

Binary Decision Diagrams Binary Decision Diagrams Logic Circuits Design Seminars WS2010/2011, Lecture 2 Ing. Petr Fišer, Ph.D. Department of Digital Design Faculty of Information Technology Czech Technical University in Prague

More information

Tree and Array Multipliers Ivor Page 1

Tree and Array Multipliers Ivor Page 1 Tree and Array Multipliers 1 Tree and Array Multipliers Ivor Page 1 11.1 Tree Multipliers In Figure 1 seven input operands are combined by a tree of CSAs. The final level of the tree is a carry-completion

More information

searching algorithms

searching algorithms searching algorithms learning objectives algorithms your software system software hardware learn what the searching problem is about learn two algorithms for solving this problem learn the importance of

More information

VLSI Design I; A. Milenkovic 1

VLSI Design I; A. Milenkovic 1 The -bit inary dder CPE/EE 427, CPE 527 VLI Design I L2: dder Design Department of Electrical and Computer Engineering University of labama in Huntsville leksandar Milenkovic ( www. ece.uah.edu/~milenka

More information

Total Time = 90 Minutes, Total Marks = 50. Total /50 /10 /18

Total Time = 90 Minutes, Total Marks = 50. Total /50 /10 /18 University of Waterloo Department of Electrical & Computer Engineering E&CE 223 Digital Circuits and Systems Midterm Examination Instructor: M. Sachdev October 23rd, 2007 Total Time = 90 Minutes, Total

More information

Cost-Constrained Matchings and Disjoint Paths

Cost-Constrained Matchings and Disjoint Paths Cost-Constrained Matchings and Disjoint Paths Kenneth A. Berman 1 Department of ECE & Computer Science University of Cincinnati, Cincinnati, OH Abstract Let G = (V, E) be a graph, where the edges are weighted

More information

Arithmetic in Integer Rings and Prime Fields

Arithmetic in Integer Rings and Prime Fields Arithmetic in Integer Rings and Prime Fields A 3 B 3 A 2 B 2 A 1 B 1 A 0 B 0 FA C 3 FA C 2 FA C 1 FA C 0 C 4 S 3 S 2 S 1 S 0 http://koclab.org Çetin Kaya Koç Spring 2018 1 / 71 Contents Arithmetic in Integer

More information

CSE477 VLSI Digital Circuits Fall Lecture 20: Adder Design

CSE477 VLSI Digital Circuits Fall Lecture 20: Adder Design CSE477 VLSI Digital Circuits Fall 22 Lecture 2: Adder Design Mary Jane Irwin ( www.cse.psu.edu/~mji ) www.cse.psu.edu/~cg477 [Adapted from Rabaey s Digital Integrated Circuits, 22, J. Rabaey et al.] CSE477

More information

Overview: Parallelisation via Pipelining

Overview: Parallelisation via Pipelining Overview: Parallelisation via Pipelining three type of pipelines adding numbers (type ) performance analysis of pipelines insertion sort (type ) linear system back substitution (type ) Ref: chapter : Wilkinson

More information

Lecture 8: Sequential Multipliers

Lecture 8: Sequential Multipliers Lecture 8: Sequential Multipliers ECE 645 Computer Arithmetic 3/25/08 ECE 645 Computer Arithmetic Lecture Roadmap Sequential Multipliers Unsigned Signed Radix-2 Booth Recoding High-Radix Multiplication

More information

CSE 4502/5717 Big Data Analytics Spring 2018; Homework 1 Solutions

CSE 4502/5717 Big Data Analytics Spring 2018; Homework 1 Solutions CSE 502/5717 Big Data Analytics Spring 2018; Homework 1 Solutions 1. Consider the following algorithm: for i := 1 to α n log e n do Pick a random j [1, n]; If a[j] = a[j + 1] or a[j] = a[j 1] then output:

More information

CSE613: Parallel Programming, Spring 2012 Date: May 11. Final Exam. ( 11:15 AM 1:45 PM : 150 Minutes )

CSE613: Parallel Programming, Spring 2012 Date: May 11. Final Exam. ( 11:15 AM 1:45 PM : 150 Minutes ) CSE613: Parallel Programming, Spring 2012 Date: May 11 Final Exam ( 11:15 AM 1:45 PM : 150 Minutes ) This exam will account for either 10% or 20% of your overall grade depending on your relative performance

More information

VLSI Signal Processing

VLSI Signal Processing VLSI Signal Processing Lecture 1 Pipelining & Retiming ADSP Lecture1 - Pipelining & Retiming (cwliu@twins.ee.nctu.edu.tw) 1-1 Introduction DSP System Real time requirement Data driven synchronized by data

More information

13 Dynamic Programming (3) Optimal Binary Search Trees Subset Sums & Knapsacks

13 Dynamic Programming (3) Optimal Binary Search Trees Subset Sums & Knapsacks 13 Dynamic Programming (3) Optimal Binary Search Trees Subset Sums & Knapsacks Average-case analysis Average-case analysis of algorithms and data structures: Input is generated according to a known probability

More information

Ma/CS 6a Class 28: Latin Squares

Ma/CS 6a Class 28: Latin Squares Ma/CS 6a Class 28: Latin Squares By Adam Sheffer Latin Squares A Latin square is an n n array filled with n different symbols, each occurring exactly once in each row and exactly once in each column. 1

More information

ARITHMETIC COMBINATIONAL MODULES AND NETWORKS

ARITHMETIC COMBINATIONAL MODULES AND NETWORKS ARITHMETIC COMBINATIONAL MODULES AND NETWORKS 1 SPECIFICATION OF ADDER MODULES FOR POSITIVE INTEGERS HALF-ADDER AND FULL-ADDER MODULES CARRY-RIPPLE AND CARRY-LOOKAHEAD ADDER MODULES NETWORKS OF ADDER MODULES

More information

EECS150 - Digital Design Lecture 11 - Shifters & Counters. Register Summary

EECS150 - Digital Design Lecture 11 - Shifters & Counters. Register Summary EECS50 - Digital Design Lecture - Shifters & Counters February 24, 2003 John Wawrzynek Spring 2005 EECS50 - Lec-counters Page Register Summary All registers (this semester) based on Flip-flops: q 3 q 2

More information

Boolean Algebra, Gates and Circuits

Boolean Algebra, Gates and Circuits Boolean Algebra, Gates and Circuits Kasper Brink November 21, 2017 (Images taken from Tanenbaum, Structured Computer Organization, Fifth Edition, (c) 2006 Pearson Education, Inc.) Outline Last week: Von

More information

Part 1: Digital Logic and Gates. Analog vs. Digital waveforms. The digital advantage. In real life...

Part 1: Digital Logic and Gates. Analog vs. Digital waveforms. The digital advantage. In real life... Part 1: Digital Logic and Gates Analog vs Digital waveforms An analog signal assumes a continuous range of values: v(t) ANALOG A digital signal assumes discrete (isolated, separate) values Usually there

More information

EXHAUSTIVELY generating a class of combinatorial objects

EXHAUSTIVELY generating a class of combinatorial objects International Conference on Mechanical, Production and Materials Engineering (ICMPME') June 6-7,, Bangkok An Improving Scheme of Unranking t-ary Trees in Gray-code Order Ro Yu Wu, Jou Ming Chang, An Hang

More information

Binary addition by hand. Adding two bits

Binary addition by hand. Adding two bits Chapter 3 Arithmetic is the most basic thing you can do with a computer We focus on addition, subtraction, multiplication and arithmetic-logic units, or ALUs, which are the heart of CPUs. ALU design Bit

More information

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1

VLSI Design. [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1 VLSI Design Adder Design [Adapted from Rabaey s Digital Integrated Circuits, 2002, J. Rabaey et al.] ECE 4121 VLSI DEsign.1 Major Components of a Computer Processor Devices Control Memory Input Datapath

More information

Area-Time Optimal Adder with Relative Placement Generator

Area-Time Optimal Adder with Relative Placement Generator Area-Time Optimal Adder with Relative Placement Generator Abstract: This paper presents the design of a generator, for the production of area-time-optimal adders. A unique feature of this generator is

More information

Gray Codes for Torus and Edge Disjoint Hamiltonian Cycles Λ

Gray Codes for Torus and Edge Disjoint Hamiltonian Cycles Λ Gray Codes for Torus and Edge Disjoint Hamiltonian Cycles Λ Myung M. Bae Scalable POWERparallel Systems, MS/P963, IBM Corp. Poughkeepsie, NY 6 myungbae@us.ibm.com Bella Bose Dept. of Computer Science Oregon

More information

PRAM lower bounds. 1 Overview. 2 Definitions. 3 Monotone Circuit Value Problem

PRAM lower bounds. 1 Overview. 2 Definitions. 3 Monotone Circuit Value Problem U.C. Berkeley CS273: Parallel and Distributed Theory PRAM lower bounds. Professor Satish Rao October 16, 2006 Lecturer: Satish Rao Last revised Scribe so far: Satish Rao cribbing from previous years lectures

More information

UC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics. EECS 281A / STAT 241A Statistical Learning Theory

UC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics. EECS 281A / STAT 241A Statistical Learning Theory UC Berkeley Department of Electrical Engineering and Computer Science Department of Statistics EECS 281A / STAT 241A Statistical Learning Theory Solutions to Problem Set 2 Fall 2011 Issued: Wednesday,

More information

Residue Number Systems Ivor Page 1

Residue Number Systems Ivor Page 1 Residue Number Systems 1 Residue Number Systems Ivor Page 1 7.1 Arithmetic in a modulus system The great speed of arithmetic in Residue Number Systems (RNS) comes from a simple theorem from number theory:

More information

A Complexity Theory for Parallel Computation

A Complexity Theory for Parallel Computation A Complexity Theory for Parallel Computation The questions: What is the computing power of reasonable parallel computation models? Is it possible to make technology independent statements? Is it possible

More information

EECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters

EECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters EECS150 - Digital Design Lecture 24 - Arithmetic Blocks, Part 2 + Shifters April 15, 2010 John Wawrzynek 1 Multiplication a 3 a 2 a 1 a 0 Multiplicand b 3 b 2 b 1 b 0 Multiplier X a 3 b 0 a 2 b 0 a 1 b

More information

EECS 358 Introduction to Parallel Computing Final Assignment

EECS 358 Introduction to Parallel Computing Final Assignment EECS 358 Introduction to Parallel Computing Final Assignment Jiangtao Gou Zhenyu Zhao March 19, 2013 1 Problem 1 1.1 Matrix-vector Multiplication on Hypercube and Torus As shown in slide 15.11, we assumed

More information

Learning Outcomes. Spiral 2 4. Function Synthesis Techniques SHANNON'S THEOREM

Learning Outcomes. Spiral 2 4. Function Synthesis Techniques SHANNON'S THEOREM -. -. Learning Outcomes piral unction synthesis with: Mues (hannon's Theorem) Memories I can implement logic for any truth table by using hannon's theorem to decompose the function to create two smaller

More information

MATH 433 Applied Algebra Lecture 22: Review for Exam 2.

MATH 433 Applied Algebra Lecture 22: Review for Exam 2. MATH 433 Applied Algebra Lecture 22: Review for Exam 2. Topics for Exam 2 Permutations Cycles, transpositions Cycle decomposition of a permutation Order of a permutation Sign of a permutation Symmetric

More information

Lecture 6 September 21, 2016

Lecture 6 September 21, 2016 ICS 643: Advanced Parallel Algorithms Fall 2016 Lecture 6 September 21, 2016 Prof. Nodari Sitchinava Scribe: Tiffany Eulalio 1 Overview In the last lecture, we wrote a non-recursive summation program and

More information

Tight Bounds on the Diameter of Gaussian Cubes

Tight Bounds on the Diameter of Gaussian Cubes Tight Bounds on the Diameter of Gaussian Cubes DING-MING KWAI AND BEHROOZ PARHAMI Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106 9560, USA Email: parhami@ece.ucsb.edu

More information

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2)

INF2270 Spring Philipp Häfliger. Lecture 8: Superscalar CPUs, Course Summary/Repetition (1/2) INF2270 Spring 2010 Philipp Häfliger Summary/Repetition (1/2) content From Scalar to Superscalar Lecture Summary and Brief Repetition Binary numbers Boolean Algebra Combinational Logic Circuits Encoder/Decoder

More information

COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions

COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions COMP 633: Parallel Computing Fall 2018 Written Assignment 1: Sample Solutions September 12, 2018 I. The Work-Time W-T presentation of EREW sequence reduction Algorithm 2 in the PRAM handout has work complexity

More information

Slides for Lecture 10

Slides for Lecture 10 Slides for Lecture 10 ENEL 353: Digital Circuits Fall 2013 Term Steve Norman, PhD, PEng Electrical & Computer Engineering Schulich School of Engineering University of Calgary 30 September, 2013 ENEL 353

More information

Description / Instructions: Covers 9th edition chapters and

Description / Instructions: Covers 9th edition chapters and Description / Instructions: Covers 9th edition chapters 21-1 -- 21-6 and 22-1 -- 22-4. Question 1 Figure 21-14 shows four situations in which charged particles are fixed in place on an axis. In which situations

More information

On Embeddings of Hamiltonian Paths and Cycles in Extended Fibonacci Cubes

On Embeddings of Hamiltonian Paths and Cycles in Extended Fibonacci Cubes American Journal of Applied Sciences 5(11): 1605-1610, 2008 ISSN 1546-9239 2008 Science Publications On Embeddings of Hamiltonian Paths and Cycles in Extended Fibonacci Cubes 1 Ioana Zelina, 2 Grigor Moldovan

More information

XI STANDARD [ COMPUTER SCIENCE ] 5 MARKS STUDY MATERIAL.

XI STANDARD [ COMPUTER SCIENCE ] 5 MARKS STUDY MATERIAL. 2017-18 XI STANDARD [ COMPUTER SCIENCE ] 5 MARKS STUDY MATERIAL HALF ADDER 1. The circuit that performs addition within the Arithmetic and Logic Unit of the CPU are called adders. 2. A unit that adds two

More information

INF2220: algorithms and data structures Series 1

INF2220: algorithms and data structures Series 1 Universitetet i Oslo Institutt for Informatikk I. Yu, D. Karabeg INF2220: algorithms and data structures Series 1 Topic Function growth & estimation of running time, trees (Exercises with hints for solution)

More information

Communication-avoiding LU and QR factorizations for multicore architectures

Communication-avoiding LU and QR factorizations for multicore architectures Communication-avoiding LU and QR factorizations for multicore architectures DONFACK Simplice INRIA Saclay Joint work with Laura Grigori INRIA Saclay Alok Kumar Gupta BCCS,Norway-5075 16th April 2010 Communication-avoiding

More information

DIGITAL TECHNICS. Dr. Bálint Pődör. Óbuda University, Microelectronics and Technology Institute

DIGITAL TECHNICS. Dr. Bálint Pődör. Óbuda University, Microelectronics and Technology Institute DIGITAL TECHNICS Dr. Bálint Pődör Óbuda University, Microelectronics and Technology Institute 4. LECTURE: COMBINATIONAL LOGIC DESIGN: ARITHMETICS (THROUGH EXAMPLES) 2016/2017 COMBINATIONAL LOGIC DESIGN:

More information

Department of Electrical & Electronics EE-333 DIGITAL SYSTEMS

Department of Electrical & Electronics EE-333 DIGITAL SYSTEMS Department of Electrical & Electronics EE-333 DIGITAL SYSTEMS 1) Given the two binary numbers X = 1010100 and Y = 1000011, perform the subtraction (a) X -Y and (b) Y - X using 2's complements. a) X = 1010100

More information

Sorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort

Sorting Algorithms. We have already seen: Selection-sort Insertion-sort Heap-sort. We will see: Bubble-sort Merge-sort Quick-sort Sorting Algorithms We have already seen: Selection-sort Insertion-sort Heap-sort We will see: Bubble-sort Merge-sort Quick-sort We will show that: O(n log n) is optimal for comparison based sorting. Bubble-Sort

More information

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design

CMPEN 411 VLSI Digital Circuits Spring Lecture 19: Adder Design CMPEN 411 VLSI Digital Circuits Spring 2011 Lecture 19: Adder Design [Adapted from Rabaey s Digital Integrated Circuits, Second Edition, 2003 J. Rabaey, A. Chandrakasan, B. Nikolic] Sp11 CMPEN 411 L19

More information

STUDY OF PERMUTATION MATRICES BASED LDPC CODE CONSTRUCTION

STUDY OF PERMUTATION MATRICES BASED LDPC CODE CONSTRUCTION EE229B PROJECT REPORT STUDY OF PERMUTATION MATRICES BASED LDPC CODE CONSTRUCTION Zhengya Zhang SID: 16827455 zyzhang@eecs.berkeley.edu 1 MOTIVATION Permutation matrices refer to the square matrices with

More information

Notes on Logarithmic Lower Bounds in the Cell Probe Model

Notes on Logarithmic Lower Bounds in the Cell Probe Model Notes on Logarithmic Lower Bounds in the Cell Probe Model Kevin Zatloukal November 10, 2010 1 Overview Paper is by Mihai Pâtraşcu and Erik Demaine. Both were at MIT at the time. (Mihai is now at AT&T Labs.)

More information

Analysis of Algorithms. Unit 5 - Intractable Problems

Analysis of Algorithms. Unit 5 - Intractable Problems Analysis of Algorithms Unit 5 - Intractable Problems 1 Intractable Problems Tractable Problems vs. Intractable Problems Polynomial Problems NP Problems NP Complete and NP Hard Problems 2 In this unit we

More information

Functional Data Structures

Functional Data Structures Functional Data Structures with Isabelle/HOL Tobias Nipkow Fakultät für Informatik Technische Universität München 2017-2-3 1 Part II Functional Data Structures 2 Chapter 1 Binary Trees 3 1 Binary Trees

More information

What s the Deal? MULTIPLICATION. Time to multiply

What s the Deal? MULTIPLICATION. Time to multiply What s the Deal? MULTIPLICATION Time to multiply Multiplying two numbers requires a multiply Luckily, in binary that s just an AND gate! 0*0=0, 0*1=0, 1*0=0, 1*1=1 Generate a bunch of partial products

More information

15.1 Elimination of Redundant States

15.1 Elimination of Redundant States 15.1 Elimination of Redundant States In Ch. 14 we tried not to have unnecessary states What if we have extra states in the state graph/table? Complete the table then eliminate the redundant states Chapter

More information

Louisiana State University Physics 2102, Exam 1, February 5, 2009.

Louisiana State University Physics 2102, Exam 1, February 5, 2009. Name: Instructor: Louisiana State University Physics 2102, Exam 1, February 5, 2009. Please be sure to write your name and class instructor above. The test consists of 4 questions (multiple choice, no

More information

Part I. Fundamental Concepts. Winter 2016 Parallel Processing, Fundamental Concepts Slide 1

Part I. Fundamental Concepts. Winter 2016 Parallel Processing, Fundamental Concepts Slide 1 Part I Fundamental Concepts Winter 2016 Parallel Processing, Fundamental Concepts Slide 1 About This Presentation This presentation is intended to support the use of the textbook Introduction to Parallel

More information

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEMORY INPUT-OUTPUT CONTROL DATAPATH

More information

Sample Test Paper - I

Sample Test Paper - I Scheme G Sample Test Paper - I Course Name : Computer Engineering Group Marks : 25 Hours: 1 Hrs. Q.1) Attempt any THREE: 09 Marks a) Define i) Propagation delay ii) Fan-in iii) Fan-out b) Convert the following:

More information

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26. Homework #2. ( Due: Nov 8 )

CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26. Homework #2. ( Due: Nov 8 ) CSE548, AMS542: Analysis of Algorithms, Fall 2017 Date: Oct 26 Homework #2 ( Due: Nov 8 ) Task 1. [ 80 Points ] Average Case Analysis of Median-of-3 Quicksort Consider the median-of-3 quicksort algorithm

More information

A High-Speed Realization of Chinese Remainder Theorem

A High-Speed Realization of Chinese Remainder Theorem Proceedings of the 2007 WSEAS Int. Conference on Circuits, Systems, Signal and Telecommunications, Gold Coast, Australia, January 17-19, 2007 97 A High-Speed Realization of Chinese Remainder Theorem Shuangching

More information

CS470: Computer Architecture. AMD Quad Core

CS470: Computer Architecture. AMD Quad Core CS470: Computer Architecture Yashwant K. Malaiya, Professor malaiya@cs.colostate.edu AMD Quad Core 1 Architecture Layers Building blocks Gates, flip-flops Functional bocks: Combinational, Sequential Instruction

More information

Partitions and Covers

Partitions and Covers University of California, Los Angeles CS 289A Communication Complexity Instructor: Alexander Sherstov Scribe: Dong Wang Date: January 2, 2012 LECTURE 4 Partitions and Covers In previous lectures, we saw

More information

Problem Set 2 Solutions

Problem Set 2 Solutions Introduction to Algorithms October 1, 2004 Massachusetts Institute of Technology 6.046J/18.410J Professors Piotr Indyk and Charles E. Leiserson Handout 9 Problem Set 2 Solutions Reading: Chapters 5-9,

More information

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th

Randomized Sorting Algorithms Quick sort can be converted to a randomized algorithm by picking the pivot element randomly. In this case we can show th CSE 3500 Algorithms and Complexity Fall 2016 Lecture 10: September 29, 2016 Quick sort: Average Run Time In the last lecture we started analyzing the expected run time of quick sort. Let X = k 1, k 2,...,

More information

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd

4.8 Huffman Codes. These lecture slides are supplied by Mathijs de Weerd 4.8 Huffman Codes These lecture slides are supplied by Mathijs de Weerd Data Compression Q. Given a text that uses 32 symbols (26 different letters, space, and some punctuation characters), how can we

More information

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic.

Digital Integrated Circuits A Design Perspective. Arithmetic Circuits. Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic. Digital Integrated Circuits A Design Perspective Jan M. Rabaey Anantha Chandrakasan Borivoje Nikolic Arithmetic Circuits January, 2003 1 A Generic Digital Processor MEM ORY INPUT-OUTPUT CONTROL DATAPATH

More information

Regular Sparse Crossbar Concentrators

Regular Sparse Crossbar Concentrators Regular Sparse Crossbar Concentrators Weiming Guo and A. Yavuz Oruç Electrical Engineering Department College Park, MD 20742 ABSTRACT A bipartite concentrator is a single stage sparse crossbar switching

More information

CPE 310: Numerical Analysis for Engineers

CPE 310: Numerical Analysis for Engineers CPE 310: Numerical Analysis for Engineers Chapter 2: Solving Sets of Equations Ahmed Tamrawi Copyright notice: care has been taken to use only those web images deemed by the instructor to be in the public

More information

1. Prove: A full m- ary tree with i internal vertices contains n = mi + 1 vertices.

1. Prove: A full m- ary tree with i internal vertices contains n = mi + 1 vertices. 1. Prove: A full m- ary tree with i internal vertices contains n = mi + 1 vertices. Proof: Every vertex, except the root, is the child of an internal vertex. Since there are i internal vertices, each of

More information

Topic 17. Analysis of Algorithms

Topic 17. Analysis of Algorithms Topic 17 Analysis of Algorithms Analysis of Algorithms- Review Efficiency of an algorithm can be measured in terms of : Time complexity: a measure of the amount of time required to execute an algorithm

More information

CPE100: Digital Logic Design I

CPE100: Digital Logic Design I Chapter 2 Professor Brendan Morris, SEB 3216, brendan.morris@unlv.edu http://www.ee.unlv.edu/~b1morris/cpe100/ CPE100: Digital Logic Design I Section 1004: Dr. Morris Combinational Logic Design Chapter

More information

Secure Multiparty Computation from Graph Colouring

Secure Multiparty Computation from Graph Colouring Secure Multiparty Computation from Graph Colouring Ron Steinfeld Monash University July 2012 Ron Steinfeld Secure Multiparty Computation from Graph Colouring July 2012 1/34 Acknowledgements Based on joint

More information

Distributed Systems Byzantine Agreement

Distributed Systems Byzantine Agreement Distributed Systems Byzantine Agreement He Sun School of Informatics University of Edinburgh Outline Finish EIG algorithm for Byzantine agreement. Number-of-processors lower bound for Byzantine agreement.

More information

ELE2120 Digital Circuits and Systems. Tutorial Note 9

ELE2120 Digital Circuits and Systems. Tutorial Note 9 ELE2120 Digital Circuits and Systems Tutorial Note 9 Outline 1. Exercise(1) Sequential Circuit Analysis 2. Exercise (2) Sequential Circuit Analysis 3. Exercise (3) Sequential Circuit Analysis 4. Ref. Construction

More information

CSE 140 Lecture 11 Standard Combinational Modules. CK Cheng and Diba Mirza CSE Dept. UC San Diego

CSE 140 Lecture 11 Standard Combinational Modules. CK Cheng and Diba Mirza CSE Dept. UC San Diego CSE 4 Lecture Standard Combinational Modules CK Cheng and Diba Mirza CSE Dept. UC San Diego Part III - Standard Combinational Modules (Harris: 2.8, 5) Signal Transport Decoder: Decode address Encoder:

More information

CS1800 Discrete Structures Final Version A

CS1800 Discrete Structures Final Version A CS1800 Discrete Structures Fall 2017 Profs. Aslam, Gold, & Pavlu December 11, 2017 CS1800 Discrete Structures Final Version A Instructions: 1. The exam is closed book and closed notes. You may not use

More information

Probability Propagation in Singly Connected Networks

Probability Propagation in Singly Connected Networks Lecture 4 Probability Propagation in Singly Connected Networks Intelligent Data Analysis and Probabilistic Inference Lecture 4 Slide 1 Probability Propagation We will now develop a general probability

More information

Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015

Search Trees. EECS 2011 Prof. J. Elder Last Updated: 24 March 2015 Search Trees < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Learning Outcomes Ø From this lecture, you should be able to: q Define the properties of a binary search

More information

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM

Search Trees. Chapter 10. CSE 2011 Prof. J. Elder Last Updated: :52 AM Search Trees Chapter 1 < 6 2 > 1 4 = 8 9-1 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 2 - Outline Ø Binary Search Trees Ø AVL Trees Ø Splay Trees - 3 - Binary Search Trees Ø A binary search

More information