Fragment Processor. Textures

Similar documents
Geometric Predicates P r og r a m s need t o t es t r ela t ive p os it ions of p oint s b a s ed on t heir coor d ina t es. S im p le exa m p les ( i

Agenda Rationale for ETG S eek ing I d eas ETG fram ew ork and res u lts 2

Patterns of soiling in the Old Library Trinity College Dublin. Allyson Smith, Robbie Goodhue, Susie Bioletti

Worksheets for GCSE Mathematics. Quadratics. mr-mathematics.com Maths Resources for Teachers. Algebra

1. The graph of a function f is given above. Answer the question: a. Find the value(s) of x where f is not differentiable. Ans: x = 4, x = 3, x = 2,

Terms of Use. Copyright Embark on the Journey

SECTION 7: STEADY-STATE ERROR. ESE 499 Feedback Control Systems

6 Lowercase Letter a Number Puzzles

Secondary 3H Unit = 1 = 7. Lesson 3.3 Worksheet. Simplify: Lesson 3.6 Worksheet

Quantitative Screening of 46 Illicit Drugs in Urine using Exactive Ultrahigh Resolution and Accurate Mass system

Classwork. Example 1 S.35

A L A BA M A L A W R E V IE W

" = Y(#,$) % R(r) = 1 4& % " = Y(#,$) % R(r) = Recitation Problems: Week 4. a. 5 B, b. 6. , Ne Mg + 15 P 2+ c. 23 V,

Support Vector Machines. CSE 4309 Machine Learning Vassilis Athitsos Computer Science and Engineering Department University of Texas at Arlington

A A A A A A A A A A A A. a a a a a a a a a a a a a a a. Apples taste amazingly good.

KENNESAW STATE UNIVERSITY ATHLETICS VISUAL IDENTITY

176 5 t h Fl oo r. 337 P o ly me r Ma te ri al s

ARC 202L. Not e s : I n s t r u c t o r s : D e J a r n e t t, L i n, O r t e n b e r g, P a n g, P r i t c h a r d - S c h m i t z b e r g e r

Lesson 24: True and False Number Sentences

Uncertain Compression & Graph Coloring. Madhu Sudan Harvard

Worksheets for GCSE Mathematics. Algebraic Expressions. Mr Black 's Maths Resources for Teachers GCSE 1-9. Algebra

Classification of Components

i.ea IE !e e sv?f 'il i+x3p \r= v * 5,?: S i- co i, ==:= SOrq) Xgs'iY # oo .9 9 PE * v E=S s->'d =ar4lq 5,n =.9 '{nl a':1 t F #l *r C\ t-e

Lecture No. 1 Introduction to Method of Weighted Residuals. Solve the differential equation L (u) = p(x) in V where L is a differential operator

Work, Energy, and Power. Chapter 6 of Essential University Physics, Richard Wolfson, 3 rd Edition

Vr Vr

Lecture 11. Kernel Methods


Crash course Verification of Finite Automata Binary Decision Diagrams

P a g e 3 6 of R e p o r t P B 4 / 0 9

TSOKOS READING ACTIVITY Section 7-2: The Greenhouse Effect and Global Warming (8 points)


Lesson 7: Linear Transformations Applied to Cubes

T h e C S E T I P r o j e c t

database marketing Database Marketing Defined Loyalty as Competitive Advantage DBM is Incremental in Nature DBM is a complete framework for Marketing

Humanistic, and Particularly Classical, Studies as a Preparation for the Law

I zm ir I nstiute of Technology CS Lecture Notes are based on the CS 101 notes at the University of I llinois at Urbana-Cham paign

CS60020: Foundations of Algorithm Design and Machine Learning. Sourangshu Bhattacharya

Big Bang Planck Era. This theory: cosmological model of the universe that is best supported by several aspects of scientific evidence and observation

10.4 The Cross Product

22 t b r 2, 20 h r, th xp t d bl n nd t fr th b rd r t t. f r r z r t l n l th h r t rl T l t n b rd n n l h d, nd n nh rd f pp t t f r n. H v v d n f

P a g e 5 1 of R e p o r t P B 4 / 0 9

Specialist Mathematics 2019 v1.2

Introduction to Algorithms

Property Testing and Affine Invariance Part I Madhu Sudan Harvard University

Introduction to Algorithms

General Strong Polarization

Vero Beach Elks Lodge # th Street Vero Beach, FL /

Math 171 Spring 2017 Final Exam. Problem Worth

PR D NT N n TR T F R 6 pr l 8 Th Pr d nt Th h t H h n t n, D D r r. Pr d nt: n J n r f th r d t r v th tr t d rn z t n pr r f th n t d t t. n

NCSS Statistical Software. Harmonic Regression. This section provides the technical details of the model that is fit by this procedure.

P.4 Composition of Functions and Graphs of Basic Polynomial Functions

Charge carrier density in metals and semiconductors

N V R T F L F RN P BL T N B ll t n f th D p rt nt f l V l., N., pp NDR. L N, d t r T N P F F L T RTL FR R N. B. P. H. Th t t d n t r n h r d r

Module 7 (Lecture 27) RETAINING WALLS

Model answers for the 2012 Electricity Revision booklet:

Trade Patterns, Production networks, and Trade and employment in the Asia-US region

Systems of Linear Equations

Implementations of 3 Types of the Schreier-Sims Algorithm

Math 30-1 Trigonometry Prac ce Exam 4. There are two op ons for PP( 5, mm), it can be drawn in SOLUTIONS

4 4 N v b r t, 20 xpr n f th ll f th p p l t n p pr d. H ndr d nd th nd f t v L th n n f th pr v n f V ln, r dn nd l r thr n nt pr n, h r th ff r d nd

Rotational Motion. Chapter 10 of Essential University Physics, Richard Wolfson, 3 rd Edition

Lecture 6. Notes on Linear Algebra. Perceptron

Introduction to Algorithms

G.6 Function Notation and Evaluating Functions

46 D b r 4, 20 : p t n f r n b P l h tr p, pl t z r f r n. nd n th t n t d f t n th tr ht r t b f l n t, nd th ff r n b ttl t th r p rf l pp n nt n th

,. *â â > V>V. â ND * 828.

GG7: Beyond 2016 FROM PARIS TO ISE-SHIMA. The State of Climate Negotiations: What to Expect after COP 21. g20g7.com

Teacher Road Map for Lesson 10: True and False Equations

Module 7 (Lecture 25) RETAINING WALLS

Chapter Learning Objectives. Random Experiments Dfiii Definition: Dfiii Definition:

Review for Exam Hyunse Yoon, Ph.D. Assistant Research Scientist IIHR-Hydroscience & Engineering University of Iowa

Heat, Work, and the First Law of Thermodynamics. Chapter 18 of Essential University Physics, Richard Wolfson, 3 rd Edition

General Strong Polarization

I N A C O M P L E X W O R L D

Jasmin Smajic1, Christian Hafner2, Jürg Leuthold2, March 23, 2015

0 t b r 6, 20 t l nf r nt f th l t th t v t f th th lv, ntr t n t th l l l nd d p rt nt th t f ttr t n th p nt t th r f l nd d tr b t n. R v n n th r

OH BOY! Story. N a r r a t iv e a n d o bj e c t s th ea t e r Fo r a l l a g e s, fr o m th e a ge of 9

Diffusion modeling for Dip-pen Nanolithography Apoorv Kulkarni Graduate student, Michigan Technological University

- Why aren t there more quantum algorithms? - Quantum Programming Languages. By : Amanda Cieslak and Ahmana Tarin

A A. an at and April fat Ana black orange gray cage. A a A a a f A a g A N a A n N p A G a A g A n a A a a A g a. Name. Copy the letter Aa.

F.4 Solving Polynomial Equations and Applications of Factoring

Lesson 12: Solving Equations

QJ) Zz LI Zz. Dd Jj. Jj Ww J' J Ww. Jj Ww. Jj I\~~ SOUN,DS AND LETTERS

Computer Science Introductory Course MSc - Introduction to Java

CS Lecture 8 & 9. Lagrange Multipliers & Varitional Bounds

Lesson 15: Rearranging Formulas

n r t d n :4 T P bl D n, l d t z d th tr t. r pd l

NVLAP Proficiency Test Round 14 Results. Rolf Bergman CORM 16 May 2016

B629 project - StreamIt MPI Backend. Nilesh Mahajan


Beechwood Music Department Staff

Dan Roth 461C, 3401 Walnut

4 8 N v btr 20, 20 th r l f ff nt f l t. r t pl n f r th n tr t n f h h v lr d b n r d t, rd n t h h th t b t f l rd n t f th rld ll b n tr t d n R th

3. 4 I n t e r C l i e n t. J B u i I d e r J D B C J D B C - O D B C J B u i l d e r. D B M S A P I O D B C ( Interface, CLI) J D B C O D B C

Chapter y. 8. n cd (x y) 14. (2a b) 15. (a) 3(x 2y) = 3x 3(2y) = 3x 6y. 16. (a)

MATH 1080: Calculus of One Variable II Fall 2018 Textbook: Single Variable Calculus: Early Transcendentals, 7e, by James Stewart.

Colby College Catalogue

Free energy dependence along the coexistence curve

Hardware Acceleration of the Pair HMM Algorithm for DNA Variant Calling

Transcription:

Cg: A system for programming graph ic s hh ard ww are in a C-l ik e ll angu age U n i v e r s i t y o f T e x a s a t A u s t i n S t e v e n G v N V I D I A K u A y N V I D I A a n d S t a n f o r d U n i v e r s i t y K N V I D I A William R. Mark R. lan ille rt ke le Mark ilg ard

Background C g i s b o t h a l a n g u a g e a n d a s y s t e m C g l a n g u a g e i s f o r w r i t i n g s t r e a m k e r n e l s C g s y s t e m t a r g e t s g r a p h i c s h a r d w a r e D e v e l o p e d b y N V I D I A I n c o l l a b o r a t i o n w i t h M i c r o s o f t R u n s o n l o t s o f h a r d w a r e ( n o t j u s t N V I D I A s ) W i d e l y d e p l o y e d S h i p p i n g f o r o v e r a y e a r A n y o n e c a n d o w n l o a d i t L o t s o f i n f o r m a t i o n a v a i l a b l e : C g l a n g u a g e s p e c i f i c a t i o n v i a d o w n l o a d C g t u t o r i a l b u y o n a m a z o n. c o m P a p e r i n S I G G R A P H 2 0 0 3

GPUs aa rr e e nn oo w w pp rr oo gg rr aa mm mm aa bb ll ee Vertex Processor Triangle Assembly & Rasterization Fragment Processor Framebuffer Operations Framebuffer Texture Decompression & Filtering Textures MOV R4.y, R2.y; ADD R4.x, -R4.y, C[3].w; MAD R3.xy, R2, R3.xyww, C[2].z; ADD R3.xy, R3.xyww, C11.z; TEX H5, R3, TEX2, 2D; TEX H6, R3, TEX2, 2D;

GG pp Programmable units in PU s are stream roc essors kernel state Input stream I1 I2 In Stream Processor Output stream O1 O2 On The programmable unit executes a computational k ernel f or each input element S treams consist of ord ered elements

VV Example: er tt ex pr oo cc es ss oo rr Vertex Array V1 V2 Vn uniform values Vertex Processor Transformed Vertexes V1 V2 Vn Fragment processor can do more: I t can read f rom tex tu re memory

Previous work E a r l i e r p r o g r a m m a b l e g r a p h i c s H W I k o n a s P i x a r F L A P e v i h U N C P i x e l F l o w 9 M u l t i p a s s S I M D : S G I I S L e e r c y 0 0 [England 86] [L nt al 87 ] [O lano 8] [P ] R e n d e r M a n [H anr ah an9 0 ] S t a n f o r d R T S L [P r o u df o o t 0 1 ]

Design goals E a s i e r p r o g r a m m i n g o f G P U s E a s e o f a d o p t i o n P o r t a b i l i t y H W, A P I, O S C o m p l e t e s u p p o r t f o r H W c a p a b i l i t i e s P e r f o r m a n c e - s i m i l a r t o a s s e m b l y M i n i m a l i n t e r f e r e n c e w i t h a p p l i c a t i o n d a t a L o n g e v i t y -- u s e f u l f o r f u t u r e h a r d w a r e S u p p o r t f o r n o n -s h a d i n g c o m p u t a t i o n s N o n -g o a l : B a c k w a r d c o m p a t i b i l i t y

Major Design Decisions

e Modular or monolithic archite cture?? C g s y s t e m i s m o d u l a r P r o v i d e s a c c e s s t o a s s e m b l y -l e v e l i n t e r f a c e s O t h e r s y s t e m s c h o s e d i f f e r e n t l y C o m p i l e o f f -l i n e, o r a t r u n t i m Cg OpenGL runtime OpenGL API GPU Application Cg Direct3D runtime Direct3D API Cg common runtime (compiler) The Cg system manages Cg programs AND the flow of data on which they operate

Specialize for shading? R e n d e r M a n l a n g u a g e i s d o m a i n -s p e c i f i c D o m a i n -s p e c i f i c t y p e s ( e. g. c o l o r ) D o m a i n -s p e c i f i c c o n s t r u c t s ( e. g. i l l u m i n a n c e ) I m p o s e s a p a r t i c u l a r m o d e l o n t h e u s e r C i s g e n e r a l p u r p o s e A H W o r i e n t e d k e r n e l l a n g u a g e A v o i d s a s s u m p t i o n s a b o u t p r o b l e m d o m a i n C g f o l l o w s C s p h i l o s o p h y L a n g u a g e f o l l o w s s y n t a x a n d p h i l o s o p h y o f C T a r g e t s G P U H W p e r f o r m a n c e t r a n s p a r e n c y S o m e e x c e p t i o n s w e w e r e n o t d o g m a t i c

Cg language example void simpletransform(float4 objectposition : POSITION, float4 color : COLOR, float4 decalcoord : TEXCOORD0, out float4 clipposition : POSITION, out float4 Color : COLOR, out float4 odecalcoord : TEXCOORD0, uniform float brightness, uniform float4x4 modelviewprojection) { clipposition = mul(modelviewprojection, objectposition); ocolor = brightness * color; odecalcoord = decalcoord; }

One program or two? Vertex Processor Triangle Assembly & Rasterization Fragment Processor Framebuffer Operations Framebuffer Texture Decompression & Filtering Textures void vprog(float4 objp : POSITION, float4 color : COLOR, Vertex program void fprog(float4 t0 : TEXCOORD0, float4 t1 : TEXCOORD1, Fragment program

How should stream HW be ex pp osed ii n n the lan gg uag e? B a s i c o r g a n i z a t i o n : S e p a r a t e p r o g r a m f o r e a c h k e r n e l K e r n e l s m a y i n c l u d e c o n t r o l f l o w ( S P M D ) I n p u t s & o u t p u t s : O n e i n p u t s t r e a m, o n e o u t p u t s t r e a m M u l t i p l e v a r i a b l e s i n e a c h r e c o r d N o c o n d i t i o n a l i n p u t s o r o u t p u t s M u l t i p l e n o n -s t r e a m i n p u t s ( u n i f o r m v a r s ) M e m o r y : M e m o r y -r e a d o p e r a t i o n s O K w i t h i n k e r n e l

How should system support di ff ff eren t lev els of HW?? NV20 R200 R300 NV30 NV40 R400 NV50? H W c a p a b i l i t i e s c h a n g e e a c h g e n e r a t i o n D a t a t y p e s S u p p o r t f o r b r a n c h i n s t r u c t i o n s, W e e x p e c t t h i s p r o b l e m t o p e r s i s t F u t u r e G P U s w i l l h a v e n e w f e a t u r e s M a n d a t e e x a c t l y o n e f e a t u r e s e t? M u s t s t r a n d o l d e r H W o r l i m i t n e w e r H W

Two options for handling HH W W diffe re nc ee s E m u l a t e m i s s i n g f e a t u r e s? T o o s l o w o n G P U T o o s l o w o n C P U, e s p e c i a l l y f o r f r a g m e n t H W E x p o s e d i f f e r e n c e s t o p r o g r a m m e r? W e c h o s e t h i s o p t i o n D i f f e r e n c e s e x p o s e d v i a subsetting A p r o f il e i s a n a m e d s u b s e t C g s u p p o r t s f u n c t i o n o v e r l o a d i n g b y p r o f i l e

Cg is closely related to oth er recen t lan gu ages M i c r o s o f t H L S L L a r g e l y c o m p a t i b l e w i t h C g N V I D I A a n d M i c r o s o f t c o l l a b o r a t e d O p e n G L A R B s h a d i n g l a n g u a g e A l l t h r e e l a n g u a g e s a r e s i m i l a r O v e r l a p p i n g d e v e l o p m e n t E x t e n s i v e c r o s s -p o l l i n a t i o n o f i d e a s D e s i g n e r s m o s t l y a g r e e d o n r i g h t a p p r o a c h S y s t e m s a r e d i f f e r e n t

Summary C g s y s t e m : A s y s t e m f o r p r o g r a m m i n g G P U s C g l a n g u a g e : E x t e n d s a n d r e s t r i c t s C a s n e e d e d f o r G P U s E x p r e s s e s s t r e a m k e r n e l s H W o r i e n t e d l a n g u a g e D e s i g n e d t o a g e w e l l B y r e i n t r o d u c i n g m i s s i n g C f e a t u r e s

Acknowledgements L a n g u a g e c o -d e s i g n e r s a t M i c r o s o f t : C r a i g P e e p e r a n d L o r e n M c Q u a d e I n t e r f a c e f u n c t i o n a l i t y : C r a i g K o l b, M a t t P h a r r I n i t i a l l a n g u a g e d i r e c t i o n : C a s s E v e r i t t S t a n d a r d l i b r a r y : C h r i s W y n n D e s i g n a n d i m p l e m e n t a t i o n t e a m : G e o f f B e r r y, M i k e B u n n e l l, C h r i s D o d d, W e s H u n t, J a y a n t K o l h e, R e v L e b a r e d i a n, N a t h a n P a y m e r, M a t t P h a r r, D o u g R o g e r s D i r e c t o r : N i c k T r i a n t o s M a n y o t h e r s a t N V I D I A a n d M i c r o s o f t C o d e a n d e x p e r i e n c e f r o m S t a n f o r d R T S L : K e k o a P r o u d f o o t, P a t H a n r a h a n, a n d o t h e r s

Questions

Backup slides

Two detailed des ig n n dec is ion ss

How to specify a returnable function param eter? C u s e s p o i n t e r s B u t n o p o i n t e r s o n c u r r e n t G P U s void foobar(float *y) { *y = 2*x; C + + u s e s p a s s -b y -r e f e r e n c e : f l o a t & y P r e f e r r e d i m p l e m e n t a t i o n s t i l l u s e s p o i n t e r s C g u s e s p a s s -b y -v a l u e -r e s u l t I m p l e m e n t a t i o n d o e s n t n e e d p o i n t e r s void foobar(out float y) { y = 2*x;. By using new syntax, we preserve C and C+ + syntax f o r th e f uture

General mechanism for su rface/ lig ht sep arab ilit yy SURFACE 1 SURFACE 2 LIGHT 1 LIGHT 3 LIGHT 2 C o n v e n i e n t t o p u t surface a n d l i g h t d i f f e r e n t m o d u l e s c o d e i n D e c o u p l e s s u r f a c e s e l e c t i o n f r o m l i g h t s e l e c t i o n P r o v e n u s e f u l n e s s : C l a s s i c O p e n G L ; R e n d e r M a n ; e t c. W e l o s t i t f o r a w h i l e! R e n d e r M a n u s e s a s p e c i a l i z e d m e c h a n i s m C g u s e s a g e n e r a l -p u r p o s e m e c h a n i s m M o r e f l e x i b l e F o l l o w s C -l i k e p h i l o s o p h y

Light/surface example FF irst: DD eclare an in terface // Declare interface to lights interface Light { float3 direction(float3 from); float4 illuminate(float3 p, out float3 lv); }; Mechanism adapted from Java, C#

Second: Declare a light that im pp lem ents interf ace // Declare object type for point lights struct PointLight : Light { float3 pos, color; float3 direction(float3 p) { return pos - p; } float3 illuminate(float3 p, out float3 lv) { lv = normalize(direction(p)); return color; } }; Declare an object that i m p lem ents the i nterf ace

Third: Define surface tt hat cal ll s int erface // Main program (surface shader) float4 main(appin IN, out float4 COUT, uniform Light lights[ ]) {... for (int i=0; i < lights.length; i++) { // for each light Cl = lights[i].illuminate(in.pos, L); // get dir/color color += Cl * Plastic(texcolor, L, Nn, In, 30); // apply } COUT = color; } Call o b j e c t ( s ) v i a t h e i n t e r f ac e t y p e