Three hours UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE. Date: Wednesday 16th January 2013 Time: 09:45-12:45


Three hours

UNIVERSITY OF MANCHESTER
SCHOOL OF COMPUTER SCIENCE

Parallel Programs and their Performance

Date: Wednesday 16th January 2013
Time: 09:45-12:45

Please answer any TWO Questions from the FOUR Questions provided.

This is an OPEN book examination.

The use of electronic calculators is permitted provided they are not programmable and do not store text.

Question 1

Consider the following fragments of code that perform some simple numerical linear algebra computations (a vector axpy operation (linked vector addition and vector scaling), and the multiplication of two lower triangular matrices):

$a = \alpha x + y$, $B = LM$,

where $\alpha$ is a scalar, $a$, $x$, $y$ are vectors of length $n$ ($n$ can be assumed to be large) and $B$, $L$, $M$ are $n \times n$, lower triangular, matrices. (A lower triangular matrix is one in which all the elements above the diagonal are zero, $L_{ij} = 0$, $i < j$.)

a) The following FORTRAN code initialises the vectors x, y and implements the vector axpy operation.

C     vector initialisation
      DO i=1,n
        x(i) = rand()
        y(i) = rand()
      END DO
C     vector axpy
      DO i=1,n
        a(i) = alpha*x(i) + y(i)
      END DO

Identify, without reference to any particular parallel architecture, the nature of any parallel work in the loops above. (3 marks)

b) One implementation (implementation A) parallelises the second loop (the axpy operation) by including the OMP pragma

C$omp parallel do schedule(static)

immediately before the second DO statement, and a second implementation (implementation B) parallelises both loops by including the same pragma before each of the DO statements. The execution time (in seconds) of these two implementations on 1 to 8 cores of chronos (a 16-core AMD Opteron-based server) is as follows (these timings exclude the initialisation loop in each case):

No of Cores   Implementation A   Implementation B
     1             0.4594             0.4597
     2             0.5907             0.2308
     3             0.4054             0.1554
     4             0.3638             0.1182
     6             0.2724             0.08048
     8             0.2451             0.06411

Explain these results in terms of parallel overheads. (7 marks)

c) The following FORTRAN code fragment calculates the lower triangular matrix product:

C     Matrix multiplication
      DO j = 1,n
        DO i = j,n
          B(i,j) = 0.0
          DO k = j,i
            B(i,j) = B(i,j)+L(i,k)*M(k,j)
          END DO
        END DO
      END DO

Identify the nature (and limitations) of any parallel work in the above calculation as it is written. Suggest a parallel implementation of the above calculation suitable for a quad quad core platform such as chronos. Clearly identify all the potential overheads and include a consideration of the initialisation of the arrays L and M. (10 marks)
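As an illustration of the loop structure in part c), the following is a minimal Python sketch (not part of the examination paper) of the same triple loop with 0-based indices. The helper `inner_iterations_per_column` is a hypothetical name introduced here; it counts how many innermost iterations each value of `j` performs, which makes the triangular, unevenly sized work per column visible.

```python
def lower_tri_product(L, M):
    """Product B = L*M of two lower triangular matrices (lists of lists).

    Mirrors the j, i, k loop nest of the fragment, shifted to 0-based indices.
    """
    n = len(L)
    B = [[0.0] * n for _ in range(n)]
    for j in range(n):                 # one column of B per outer iteration
        for i in range(j, n):          # only rows on or below the diagonal
            s = 0.0
            for k in range(j, i + 1):  # L[i][k] and M[k][j] are non-zero only for j <= k <= i
                s += L[i][k] * M[k][j]
            B[i][j] = s
    return B


def inner_iterations_per_column(n):
    """Number of innermost (k) iterations executed for each column j."""
    return [(n - j) * (n - j + 1) // 2 for j in range(n)]


if __name__ == "__main__":
    L = [[1.0, 0.0], [2.0, 3.0]]
    M = [[4.0, 0.0], [5.0, 6.0]]
    print(lower_tri_product(L, M))
    print(inner_iterations_per_column(4))
```

The per-column counts shrink as `j` grows, so a plain static division of the `j` loop gives the threads unequal amounts of work; this is one of the limitations the question asks about.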

Question 2

a) Explain what is meant by the execution time overheads of a parallel program (you should clearly identify each different kind of overhead you might expect to occur). Describe how these overheads affect execution of the parallel program. (5 marks)

The following OpenMP/Fortran-like pseudocode (for emphasis, the OpenMP directives are on the left and the Fortran code on the right) implements a parallel divide-and-conquer algorithm using a shared stack to hold the outstanding jobs that need to be computed. The subroutine POP returns the special value NULL if it is executed when the stack is empty.

DO PARALLEL
SHARED STACK, OUTPUT, TERMINATED
PRIVATE JOB, JOB1, JOB2, RESULT
                            DO WHILE (.NOT. TERMINATED)
CRITICAL (STACK)
                              POP(TOP OF STACK INTO JOB)
END CRITICAL
                              IF (JOB .NE. NULL)
                                IF (JOB IS LARGE)
                                  CREATE 2 SUBJOBS, JOB1 & JOB2
CRITICAL (STACK)
                                  PUSH(JOB1 ONTO STACK)
                                  PUSH(JOB2 ONTO STACK)
END CRITICAL
                                ELSE
                                  COMPUTE RESULT (OF JOB)
CRITICAL (OUTPUT)
                                  ADD RESULT TO OUTPUT
END CRITICAL
                                END IF
                              ELSE
                              END IF
CRITICAL (TERMINATED)
                              COMPUTE TERMINATED
END CRITICAL
                            END DO WHILE

b) Explain what needs to be done when the shared termination condition TERMINATED is computed. Briefly describe a strategy for implementing this. (2 marks)

c) Explain clearly what you expect to be the main source of parallel execution time overhead for this code. State your assumptions about the behaviour of each part of the algorithm, and make it clear what you expect to happen as the time to compute RESULT increases from being relatively short to being relatively long, compared with the rest of the necessary work. (5 marks)

d) A programmer on your team suggests the following change to the above pseudocode: instead of pushing both new subjobs onto the stack, push only one of them and then execute the other in the existing thread. Give new pseudocode (in the same style as above) that achieves this. What effect do you expect this change to have on the execution time overheads you identified earlier? (4 marks)

e) Discuss the difficulties that would need to be addressed if P stacks (as opposed to a single shared stack) were used in a P-fold parallel implementation of this algorithm. What effect do you expect such a change to have on the execution time overheads you identified earlier? (4 marks)
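For readers who want to experiment with the shared-stack scheme of Question 2, here is a minimal Python sketch (not part of the examination paper and not a model answer). It assumes a toy job, summing a slice of a list, and uses a single lock in place of the separately named critical sections. The names `parallel_sum`, `threshold` and the `busy` counter are hypothetical; counting the threads that currently hold a job is just one possible way to decide termination, since an empty stack alone does not mean the work is finished.

```python
import threading
import time


def parallel_sum(data, num_threads=4, threshold=8):
    """Sum `data` by divide and conquer over a shared stack of (lo, hi) jobs."""
    stack = [(0, len(data))]          # shared stack of outstanding jobs
    lock = threading.Lock()           # one lock stands in for all critical sections
    state = {"busy": 0, "output": 0, "terminated": False}

    def worker():
        while True:
            with lock:
                if state["terminated"]:
                    return
                job = stack.pop() if stack else None   # POP returns None on empty stack
                if job is not None:
                    state["busy"] += 1                 # this thread now holds a job
                elif state["busy"] == 0:
                    # Stack empty and nobody can push more work: finished.
                    state["terminated"] = True
                    return
            if job is None:
                time.sleep(0)          # stack empty but others are busy: yield and retry
                continue
            lo, hi = job
            if hi - lo > threshold:    # job is large: split into two subjobs
                mid = (lo + hi) // 2
                with lock:
                    stack.append((lo, mid))
                    stack.append((mid, hi))
                    state["busy"] -= 1   # released together with the pushes
            else:                      # job is small: compute and accumulate
                result = sum(data[lo:hi])
                with lock:
                    state["output"] += result
                    state["busy"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["output"]


if __name__ == "__main__":
    print(parallel_sum(list(range(1000))))
```

Decrementing `busy` inside the same locked region as the pushes avoids a window in which the stack looks empty and no thread looks busy while subjobs are still about to appear.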

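Question 3

a) Consider the first order linear recurrence

\[
x_1 = d_1, \qquad x_i = a_i x_{i-1} + d_i, \quad i = 2, 3, \ldots, \mathrm{NMAX}.
\]

Show that, by iterating the recurrence twice, one can obtain the recurrence

\[
\begin{aligned}
x_1 &= d_1, \\
x_2 &= a_2 x_1 + d_2, \\
x_3 &= a_3 x_2 + d_3, \\
x_4 &= a_4 x_3 + d_4, \\
x_i &= \hat{a}_i x_{i-4} + \hat{d}_i, \quad i = 5, 6, \ldots, \mathrm{NMAX},
\end{aligned}
\]

and thereby expose 4-fold parallelism in this computation. You should clearly derive expressions for $\hat{a}_i$ and $\hat{d}_i$. (6 marks)

b) Consider now the tridiagonal system

\[
A x = y, \qquad (4.1)
\]

where $A$ is the (symmetric) tridiagonal matrix

\[
A = \begin{pmatrix}
a_1 & b_2 & & & \\
b_2 & a_2 & b_3 & & \\
 & b_3 & a_3 & \ddots & \\
 & & \ddots & \ddots & b_n \\
 & & & b_n & a_n
\end{pmatrix}.
\]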

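i) A cyclic reduction algorithm results from the following: using equations $i-1$, $i+1$ of (4.1) to eliminate $x_{i-1}$, $x_{i+1}$, respectively, from the $i$th equation of (4.1), show that the tridiagonal system (4.1) can be replaced by

\[
A^{(1)} x = y^{(1)}, \qquad (4.2)
\]

where

\[
A^{(1)} = \begin{pmatrix}
a_1^{(1)} & 0 & b_3^{(1)} & & & \\
0 & a_2^{(1)} & 0 & b_4^{(1)} & & \\
b_3^{(1)} & 0 & a_3^{(1)} & \ddots & \ddots & \\
 & b_4^{(1)} & \ddots & \ddots & 0 & b_n^{(1)} \\
 & & \ddots & 0 & a_{n-1}^{(1)} & 0 \\
 & & & b_n^{(1)} & 0 & a_n^{(1)}
\end{pmatrix},
\]

and obtain expressions for the elements of $A^{(1)}$ and $y^{(1)}$. (4 marks)

ii) In a similar way, show that equations $i-2$, $i+2$ of (4.2) can be used to eliminate $x_{i-2}$, $x_{i+2}$ respectively, from the $i$th equation of (4.2) to obtain $A^{(2)} x = y^{(2)}$, where the elements of $A^{(2)}$ and $y^{(2)}$ are suitably defined. (3 marks)

iii) Indicate how this procedure may be continued and show that $N = \log_2 n$ stages will be required to reduce the system of equations to diagonal form. (4 marks)

iv) "This cyclic reduction algorithm is uncompetitive on serial computers, but has become popular for implementation on parallel computers." Explain this statement. (3 marks)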
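The stride-4 form of the recurrence in part a) of this question can be checked numerically. The sketch below is illustrative only and not part of the examination paper. It assumes the recurrence coefficients are held in lists `a` and `d` (the coefficient name `a` is an assumption of this sketch) with 0-based indexing, so `a[0]` is unused, and it uses the expressions obtained by substituting the recurrence into itself: `a_hat = a[i]*a[i-1]*a[i-2]*a[i-3]` and `d_hat = a[i]*a[i-1]*(a[i-2]*d[i-3] + d[i-2]) + a[i]*d[i-1] + d[i]`.

```python
def sequential(a, d):
    """x[0] = d[0]; x[i] = a[i]*x[i-1] + d[i] for i >= 1 (a[0] is unused)."""
    x = [d[0]] if d else []
    for i in range(1, len(d)):
        x.append(a[i] * x[i - 1] + d[i])
    return x


def stride4(a, d):
    """Same values, but each x[i] for i >= 4 depends only on x[i-4].

    The four chains i = 4, 8, ...; 5, 9, ...; 6, 10, ...; 7, 11, ... are
    independent of each other, which is the 4-fold parallelism.
    """
    n = len(d)
    x = sequential(a[:4], d[:4])       # the first four values start the chains
    for i in range(4, n):
        a_hat = a[i] * a[i - 1] * a[i - 2] * a[i - 3]
        d_hat = (a[i] * a[i - 1] * (a[i - 2] * d[i - 3] + d[i - 2])
                 + a[i] * d[i - 1] + d[i])
        x.append(a_hat * x[i - 4] + d_hat)
    return x


if __name__ == "__main__":
    a = [0, 2, 3, 1, 2, 1, 3, 2, 1, 2]   # integer data keeps the comparison exact
    d = list(range(1, 11))
    print(sequential(a, d) == stride4(a, d))
```

The loop in `stride4` is written sequentially for clarity; the point is only that iteration `i` reads `x[i-4]` and nothing more recent, so four such chains could proceed concurrently once `a_hat` and `d_hat` are available.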

Question 4

A stellar system is to be modelled using a 3-dimensional, N-body, iterative time-stepping simulation of the effects of gravitational attraction (ignoring collisions). The gravitational force $F_i$ acting on star $s_i$ due to star $s_j$ ($i \neq j$) is given by:

\[
F_i = G \, m_i \, m_j / r_{ij}^2,
\]

where $G$ is a constant, $m_i$ is the mass of star $s_i$, and $r_{ij}$ is the distance between $s_i$ and $s_j$. Also, the acceleration of star $s_i$, under force $F_i$ is:

\[
a_i = F_i / m_i.
\]

The overall nature of the simulation is described in the following pseudo-code in which the type TRIPLE REAL ARRAY is an array of records of three REAL values. In array positions, the three values represent the x, y, z coordinates of the corresponding star during the current time-step. Similarly, array velocities represents the current u, v, w velocities of the corresponding star, in the x, y, z directions, respectively, and array forces represents the $F_x$, $F_y$, $F_z$ force components currently acting on the corresponding star, in the x, y, z directions, respectively. Subroutine parameters that are updated as a result of a call are underlined in the pseudo-code below; otherwise subroutine parameters are read-only.

PROGRAM gravitational_n-body_calculation
  REAL ARRAY masses (1:10000)
  TRIPLE REAL ARRAY positions (1:10000)
  TRIPLE REAL ARRAY velocities (1:10000)
  TRIPLE REAL ARRAY forces (1:10000)
  INTEGER step
  REAL t, delta_t
  t=0.0
  READ(delta_t)
C   delta_t, the time step size, is a program input
  CALL initialise (masses,positions,velocities)
C   initialise places all 10000 stars in initial positions and gives them
C   velocities and masses chosen at random from appropriate distributions
C   repeat time stepping loop 1000000 times
  FOR step=1 TO 1000000 DO
    CALL calculate_forces (masses,positions,forces)
C     calculate_forces determines the forces on each star on the basis of
C     the current positions of the stars
    CALL move_stars (masses,forces,positions,velocities)
C     move_stars calculates new positions and velocities, at time t+delta_t,
C     for each of the stars, under the calculated forces
    t=t+delta_t
C     update the time and repeat
  END FOR
END PROGRAM

Pseudo-code for a straightforward implementation of the subroutine calculate_forces is given below:

SUBROUTINE calculate_forces(m,p,f)
  REAL ARRAY m(1:10000)
  TRIPLE REAL ARRAY p(1:10000)
  TRIPLE REAL ARRAY f(1:10000)
  INTEGER i, j
  FOR i=1 TO 10000 DO
    Zero_The_3_Components_Of_f(i)
    FOR j=1 TO 10000 DO
      IF j .NE. i THEN
        Calculate_The_3_Force_Components_ &
          At_Star_s(i)_Due_To_Star_s(j)
        Add_The_Calculated_Components_Into_f(i)
      END IF
    END FOR
  END FOR
END SUBROUTINE

a) Give pseudo-code for the subroutines initialise and move_stars. (6 marks)

b) Comment on the nature of potential parallel executions of the three main subroutines, initialise, calculate_forces and move_stars, stating any assumptions you make. Hence, suggest a general strategy for parallelising the whole program. (6 marks)

c) The given subroutine for calculate_forces is inefficient. Since gravity is a symmetrical force, the forces(i) components, acting on star $s_i$, that are due to star $s_j$ are equal in value, but opposite in sense, to the forces(j) components, acting on star $s_j$, that are due to star $s_i$. Hence, these values, which are computed twice during each cycle in the given code, could, in principle, be calculated just once per cycle. Give alternative pseudo-code for the subroutine calculate_forces that implements this optimisation. (4 marks)

d) Comment on the nature of potential parallel executions of your code for part c), and explain how you would attempt to organise any actual parallel execution so as to achieve high performance. (4 marks)

END OF EXAMINATION
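The symmetry described in part c) can be tried out on a small example. The following Python sketch is illustrative only and not part of the examination paper. It assumes `G = 1.0`, a handful of stars rather than 10000, and that each pairwise force is resolved into components along the line joining the two stars; the function names are hypothetical. It compares the all-pairs loop with a version that visits each pair once and applies equal and opposite contributions.

```python
import math

G = 1.0  # assumed value of the gravitational constant, for illustration only


def pair_force(m, p, i, j):
    """The 3 force components acting on star i due to star j."""
    dx = [p[j][c] - p[i][c] for c in range(3)]
    r2 = sum(v * v for v in dx)
    r = math.sqrt(r2)
    mag = G * m[i] * m[j] / r2          # magnitude G*m_i*m_j / r_ij^2
    return [mag * v / r for v in dx]    # directed from star i towards star j


def forces_all_pairs(m, p):
    """Straightforward version: every ordered pair (i, j), j != i."""
    n = len(m)
    f = [[0.0] * 3 for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if j != i:
                fij = pair_force(m, p, i, j)
                for c in range(3):
                    f[i][c] += fij[c]
    return f


def forces_symmetric(m, p):
    """Each unordered pair once; star j receives the opposite force."""
    n = len(m)
    f = [[0.0] * 3 for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            fij = pair_force(m, p, i, j)
            for c in range(3):
                f[i][c] += fij[c]
                f[j][c] -= fij[c]       # this write is to another star's entry
    return f


if __name__ == "__main__":
    masses = [1.0, 2.0, 3.0, 4.0]
    positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 3.0]]
    print(forces_all_pairs(masses, positions))
    print(forces_symmetric(masses, positions))
```

In the symmetric version an iteration of the outer loop updates `f[j]` for other stars as well as `f[i]`, so iterations of `i` are no longer independent, and the inner loop length shrinks with `i`; both points bear on part d).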