Encoding-Assisted Temporal Direct Mode Decision for B Pictures in H.264/AVC

Ya-Neng Fang and Yinyi Lin
Department of Communication Engineering, National Central University, Taiwan

Hui-Jane Hsieh
Department of Information Management, Chien Hsin University of Science and Technology, Taiwan

Email: yilin@ce.ncu.edu.tw

Abstract

This paper proposes an encoding-assisted temporal DIRECT mode decision algorithm for H.264/AVC inter bipredictive (B) frame video sequences to improve the coding efficiency. In the proposed algorithm, we employ the motion vectors (MVs) of the co-located block and its eight neighboring blocks for DIRECT mode decision. In addition, the weight selection for bidirectional prediction is also considered. The MV and weight that minimize the sum of absolute prediction errors are selected for DIRECT mode decision. The experimental results reveal that the proposed algorithm achieves an average 0. dB PSNR gain, or equivalently an average .6% bit-rate reduction, compared to the conventional DIRECT mode coding that uses only the MV of the co-located MB for DIRECT mode decision.

The temporal DIRECT mode decision suggested in H.264/AVC is simple but not effective, since in many cases the MV of the co-located MB or block does not represent the true motion of the current MB or block [][3]. This can cause severe prediction errors, resulting in heavy redundant coding bits. In this paper, we propose an efficient temporal DIRECT mode decision algorithm for H.264/AVC B frame video sequences to improve its coding efficiency. In the suggested technique, in addition to the MV of the co-located block, we also employ the MVs of its neighboring blocks for DIRECT mode decision.

Keywords: Direct mode decision, H.264/AVC, Prediction motion vector, Rate-distortion optimization (RDO)

I. INTRODUCTION

The latest H.264/AVC achieves better performance in both PSNR and visual quality at the same bit rate, compared to prior video coding standards.
This is because H.264/AVC features many advanced techniques, such as variable block-size mode decision and multiple reference frame motion estimation, and also because it supports generalized bi-predictive (B) frame video coding [2]. Another important technique is the use of Lagrangian rate-distortion optimization (RDO). In the H.264/AVC encoder, both inter and intra mode predictions are provided in both predictive (P) and bipredictive (B) frames. The inter mode prediction provides seven modes for inter-frame motion estimation, with block sizes of 16x16, 16x8, 8x16, 8x8, 8x4, 4x8, and 4x4. They are performed in each MB to achieve the best coding efficiency. The intra mode prediction offers the I4x4MB and I16x16MB prediction modes.

The inter B frame coding can use backward as well as forward frames for multiple predictions. As a result, a high percentage of bits is required to encode motion information such as prediction mode, motion vector and reference frame. To alleviate this high overhead, in addition to the inter and intra modes, the SKIP mode and the DIRECT mode are introduced in P and B frames, respectively. In SKIP or DIRECT mode coding, the motion information is obtained directly from previously encoded MBs or blocks and need not be transmitted, leading to a great overhead reduction within the bit stream.

Fig. 1 Bidirectional prediction in DIRECT mode

II. DIRECT MODE DECISION USED IN H.264/AVC

In the H.264/AVC encoder, the inter mode prediction for a B frame can use forward as well as backward reference frames (namely the List 0 reference and the List 1 reference) for multiple predictions. The temporal DIRECT mode decision uses bidirectional predictions, and the forward and backward motion vectors are derived from the motion vector $MV_{co\text{-}located}$ used in the co-located block in the subsequent reference frame, i.e., the first List 1 reference frame. As illustrated in Fig. 1, the motion vectors $MV_{L0}$ and $MV_{L1}$ for temporal DIRECT mode blocks are calculated as

$$MV_{L0} = \frac{T_1}{T_1 + T_2}\, MV_{co\text{-}located} = (r_0, s_0) \tag{1}$$

and

$$MV_{L1} = \frac{T_2}{T_1 + T_2}\, MV_{co\text{-}located} = (u_0, v_0) \tag{2}$$

where $T_1$ and $T_2$ are the distances between the current frame and the forward/backward reference frames in the List 0 reference and the List 1 reference, respectively. The bidirectional prediction for the DIRECT mode is obtained by averaging the associated blocks in these references:

$$\hat{B}(i, j) = \tfrac{1}{2} B_{L0}(i + r_0, j + s_0) + \tfrac{1}{2} B_{L1}(i - u_0, j - v_0) \tag{3}$$

The DIRECT mode decision allows residual coding of the prediction error between the current block $B(i, j)$ and the prediction block $\hat{B}(i, j)$. There are three types of DIRECT mode used in H.264/AVC, based upon the residual information and the block size: DIRECT 16x16, DIRECT 8x8 and B SKIP 16x16. The residual information is transmitted in the bit-stream for both DIRECT 16x16 and DIRECT 8x8, while no residual information is transmitted for B SKIP 16x16.

Fig. 2 Accurate prediction for DIRECT mode decision

III. ENCODING-ASSISTED DIRECT MODE DECISION FOR B PICTURES

The bidirectional prediction using (3), taking the MV of the co-located block in the first List 1 reference frame as the estimated MV, is a simple yet efficient approach for DIRECT mode decision. The prediction error, however, becomes highly critical in occlusion regions or when the MV of the co-located block does not represent the true motion of the current block. As illustrated in Fig. 2, when an object is moving from block $b_7$ in the List 0 reference frame to block $b_3$ in the List 1 reference frame with a constant MV, a new object comes into sight in the uncovered region and an existing object goes out of sight in the covered region in the current B frame. Another object is also covered in block $b_3$ in the List 1 reference frame. These areas are referred to as occlusion areas. In addition, the MV of the co-located block $b_5$ in the List 1 reference frame (i.e., inter coded with zero MV in this case) does not represent the true MV of the current block $b_5$ in the current B frame.
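For reference, the conventional derivation in (1)-(3) can be sketched in a few lines of Python. This is a minimal illustration and not the JM implementation: the function names, the (row, column) array layout, the restriction to integer-pel MVs, the absence of picture-boundary handling and the use of NumPy are all assumptions made for the sketch.

```python
import numpy as np


def temporal_direct_mvs(mv_col, t1, t2):
    """Split the co-located MV into forward and backward parts, as in (1)-(2).

    t1, t2: distances from the current B frame to the List 0 and List 1
    reference frames (hypothetical parameter names).
    """
    mv_col = np.asarray(mv_col, dtype=float)
    fwd = mv_col * t1 / (t1 + t2)  # (r0, s0)
    bwd = mv_col * t2 / (t1 + t2)  # (u0, v0)
    return fwd, bwd


def direct_prediction(ref_l0, ref_l1, i, j, fwd, bwd, size=16):
    """Average the two displaced blocks, as in (3).

    Assumes integer-pel MVs and blocks that stay inside both pictures.
    """
    r0, s0 = (int(round(float(c))) for c in fwd)
    u0, v0 = (int(round(float(c))) for c in bwd)
    blk_l0 = ref_l0[i + r0:i + r0 + size, j + s0:j + s0 + size]
    blk_l1 = ref_l1[i - u0:i - u0 + size, j - v0:j - v0 + size]
    return 0.5 * blk_l0 + 0.5 * blk_l1


# Toy usage: the B frame sits midway between its two references.
ref_l0 = np.arange(64 * 64, dtype=float).reshape(64, 64)
ref_l1 = np.full((64, 64), 10.0)
fwd, bwd = temporal_direct_mvs((4, -8), t1=1, t2=1)
pred = direct_prediction(ref_l0, ref_l1, 16, 16, fwd, bwd, size=4)
```

With equal distances the co-located MV is simply halved in each direction, so any error in that single MV propagates to both displaced blocks, which is the weakness the occlusion example points at.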
As a result, the DIRECT mode decision proposed in H.264/AVC, using equal weights and the MV of the co-located block $b_5$ in the first List 1 reference frame, cannot produce a good prediction for these video occlusions, and it leads to serious prediction errors. As illustrated in Fig. 2, $B_{L0}(\cdot)$ and $B_{L1}(\cdot)$ should be respectively used to predict the blocks $b_3$ and $b_7$ in the current B frame for DIRECT mode. Similarly, the MV of the block $b_3$ in the List 1 reference frame should be used to predict the current block $b_5$:

$$\hat{B}(i, j) = \tfrac{1}{2} B_{L0}(i + r_3, j + s_3) + \tfrac{1}{2} B_{L1}(i - u_3, j - v_3) \tag{4}$$

where $(r_3, s_3)$ and $(u_3, v_3)$ are the MVs for the forward and backward prediction blocks derived from the MV of the block $b_3$ (denoted as $MV_3$) in the List 1 reference frame:

$$MV_{L0} = \frac{T_1}{T_1 + T_2}\, MV_3 = (r_3, s_3) \tag{5}$$

and

$$MV_{L1} = \frac{T_2}{T_1 + T_2}\, MV_3 = (u_3, v_3) \tag{6}$$

To achieve more accurate prediction in DIRECT mode decision, we propose a general bidirectional prediction for DIRECT mode, which is expressed as

$$\hat{B}(i, j) = w_{L0}\, B_{L0}(i + r_k, j + s_k) + w_{L1}\, B_{L1}(i - u_k, j - v_k) \tag{7}$$

where $w_{L0}$ and $w_{L1}$ are the weights for the forward and backward prediction blocks respectively in DIRECT mode, with

$$w_{L0} + w_{L1} = 1 \quad \text{and} \quad w_{L0}, w_{L1} \in \{0, \tfrac{1}{2}, 1\} \tag{8}$$

where $w = (w_{L0}, w_{L1}) = (0, 1)$ is suitable for appearing objects (i.e., block $b_7$ in Fig. 2), while $w = (w_{L0}, w_{L1}) = (1, 0)$ is suitable for disappearing objects (i.e., block $b_3$) in occlusion areas of the current B frame. For non-occlusion areas, the weight $w = (w_{L0}, w_{L1}) = (\tfrac{1}{2}, \tfrac{1}{2})$ is the best choice.

In the proposed DIRECT mode decision, the estimates of the weight $w = (w_f, w_b)$ and the motion vector $v = (v_x, v_y)$ are obtained in the encoder by minimizing

$$(\hat{v}, \hat{w}) = \arg\min_{v_k \in V_s} \min_{w \in W_s} \sum_{i} \sum_{j} \left| B(i, j) - \hat{B}(i, j) \right|^{\gamma} \tag{9}$$

where $\gamma$ designates the selected norm, and $\gamma = 1$ is used in the experiment to measure prediction errors between the current block $B(i, j)$ and the bidirectional prediction block $\hat{B}(i, j)$ given in (7). The parameter $V_s$ represents the set of all candidate MVs $v_k$ in the co-located block and its neighboring blocks, and $W_s$ is the set of candidate weights. The prediction error can be reduced when the best MV or weight is used for DIRECT mode decision, leading to fewer coding bits for the redundant information.

TABLE I Exp-Golomb code for extra overhead

Case            Prob.    Code_num   Codeword
Co-located      0.40     0          1
w=(1,0)         0.04     10         0001011
w=(0,1)         0.067    2          011
Left-Top        0.093    1          010
Left            0.066    3          00100
Left-Bottom     0.058    5          00110
Bottom          0.06     4          00101
Right-Bottom    0.058    6          00111
Right           0.054    7          0001000
Right-Top       0.047    8          0001001
Top             0.044    9          0001010

Although the redundant coding bits can be lowered in DIRECT mode decision when the best MV or weight selected from the co-located block and its neighboring blocks is used, extra overhead indicating which block's MV or which weight values are used for bidirectional prediction must be transmitted. The more neighboring blocks are employed for MV or weight selection, the heavier the extra overhead required for transmission in DIRECT mode. The extra overhead degrades the coding performance. To reduce the performance degradation introduced by the extra overhead, in this paper we only consider the weights $w = (w_{L0}, w_{L1}) = (0, 1)$ and $w = (w_{L0}, w_{L1}) = (1, 0)$ for the co-located block. In addition, we take into account the MVs of the co-located block as well as its eight neighboring blocks for MV selection. As a result, we need to send extra overhead describing the eleven cases when the DIRECT mode is finally determined as the best mode through RDO mode decision.

To comply with the H.264/AVC encoder, we employ the Exp-Golomb code as the entropy encoder to encode the extra overhead.
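The selection rule in (9) and the Exp-Golomb signalling can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the authors' encoder code: the function names, the representation of each candidate as a pair of integer-pel (forward, backward) MVs, the (row, column) array layout and the lack of picture-boundary handling are all hypothetical. The case list follows the text: every candidate MV is tried with equal weights, and only the co-located MV is additionally tried with the one-sided weights.

```python
import numpy as np

HALF = (0.5, 0.5)


def exp_golomb_ue(code_num):
    """Unsigned Exp-Golomb codeword (as a bit string) for code_num >= 0."""
    value = code_num + 1
    return "0" * (value.bit_length() - 1) + format(value, "b")


def weighted_prediction(ref_l0, ref_l1, i, j, fwd, bwd, w, size):
    """General bidirectional prediction of (7); integer-pel MVs only."""
    (r, s), (u, v) = fwd, bwd
    blk_l0 = ref_l0[i + r:i + r + size, j + s:j + s + size]
    blk_l1 = ref_l1[i - u:i - u + size, j - v:j - v + size]
    return w[0] * blk_l0 + w[1] * blk_l1


def select_direct_case(cur, ref_l0, ref_l1, i, j, candidates, size=16):
    """Return (sad, candidate index, weight) minimizing (9) with gamma = 1.

    candidates[0] is the (forward, backward) MV pair of the co-located block;
    the remaining entries come from its neighbors.
    """
    cases = [(k, HALF) for k in range(len(candidates))]
    cases += [(0, (1.0, 0.0)), (0, (0.0, 1.0))]  # one-sided weights, co-located only
    cur_blk = cur[i:i + size, j:j + size]
    best = None
    for k, w in cases:
        fwd, bwd = candidates[k]
        pred = weighted_prediction(ref_l0, ref_l1, i, j, fwd, bwd, w, size)
        sad = float(np.abs(cur_blk - pred).sum())
        if best is None or sad < best[0]:
            best = (sad, k, w)
    return best
```

With nine candidate MVs this enumerates the eleven cases described above. The winning case would then be mapped to a code_num by its probability rank, as in TABLE I, and written with `exp_golomb_ue`; for example, code_num 10 yields the codeword `0001011`.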
The extra overhead is inserted after mb_type, which describes the best mode for the encoding block. If the best mode is the DIRECT mode, the extra overhead describing the MV and weight information is then encoded using the Exp-Golomb entropy encoder, based on the probability distribution of the best MV and weight. The case with higher probability is mapped to the shorter codeword, and vice versa. An intensive experiment was conducted on many video sequences to investigate the average probability distribution of the MV and weight information. The probability distribution and the associated codewords for the extra overhead are documented in TABLE I.

IV. EXPERIMENTAL RESULTS

In this section, we compare the performance of the proposed temporal DIRECT mode decision algorithm (denoted as proposed TDMD) with the temporal DIRECT mode decision method proposed in the H.264/AVC encoder (denoted as original TDMD). In the proposed TDMD, the best MV or weight is selected from the co-located block and its eight neighboring blocks for DIRECT mode decision based on the criterion given in (9). The proposed TDMD is only applied to DIRECT 16x16 and B SKIP 16x16. To reduce extra overhead, the DIRECT 8x8 mode uses the original DIRECT mode decision.

TABLE II Simulation conditions

Test sequences (QCIF)   Foreman, Claire, Trevor, Mobile, Coastguard
Test sequences (CIF)    Waterfall, Stefan, News, Bus, Dancer
Test sequences (4CIF)   City, Crew, Harbour, Ice, Soccer
Code version            JM.
Profile                 Main
GOP structure           IBPBP...
Encoding frames         99
Frame rate              30
Reference frames        N_P, N_{B,List0}, N_{B,List1}, N_{P,List0}
QP                      QP_B = QP_P + 3
RDO                     On
Entropy coding          CAVLC

We implement these algorithms in the JM encoder JM. to evaluate their performance. The simulation uses fifteen test sequences, covering a wide range of motion activities and various formats (QCIF: 176x144, CIF: 352x288, and 4CIF: 704x576). In the experimental setting, each sequence has 00 frames and is coded with the IBPBP structure. The frame rate is 30 frames per second and the quantization parameter for B frames is set as $QP_B = QP_P + 3$ []. The experimental setting is summarized in TABLE II.

Fig. 3 Bilinear interpolation for forward/backward MVs

For simplicity the number of reference frames for motion estimation is $N_P = 2$, i.e., with two reference frame buffers. The numbers of reference frames for B frames are $N_{B,List0} =$ and $N_{B,List1} =$ respectively, while $N_{P,List0} =$ for P frames. Note that bilinear interpolation is employed to procure the desired MV in the List 0 reference frame when the reference number of the List 1 reference P frame is , as illustrated in Fig. 3.

The performance is compared based upon the Bjontegaard Delta PSNR (BDPSNR) and the Bjontegaard Delta Bit Rate (BDBR) [4] for $QP_P$ = 20, 24, 28 and 32. TABLE III displays the BDPSNR and BDBR results relative to the original TDMD, for both cases with and without extra overhead. As demonstrated, the proposed TDMD achieves an average 0. dB BDPSNR gain and a 4.% BDBR bit-rate saving when the extra overhead is not considered. When the extra overhead is taken into account, the BDPSNR gain lessens from 0. dB to 0. dB and the BDBR reduction lessens from 4.% to .6%. The proposed TDMD still outperforms the original TDMD.

As shown in TABLE III, the superiority of the proposed algorithm is evident for fast-motion video sequences such as foreman, mobile and bus. In these sequences with high motion activities, occlusion occurs often and the MV of the co-located block usually cannot represent the true motion of the current encoding block, leading to serious prediction errors. To obtain further insight, Fig. 4 compares the RD performance for various QPs on mobile to show its superiority over the original TDMD algorithm. However, the advantage of the proposed TDMD is lost for video sequences with slow motion activities like claire. This is because most areas in the claire sequence are homogeneous and stationary, and the bidirectional prediction using the MV of the co-located block for DIRECT mode usually gives very good coding efficiency, compared to its neighboring blocks. As a result, the extra overhead for MV and weight selection degrades its coding performance severely.

TABLE III BDPSNR and BDBR comparison

                      BDPSNR (dB)               BDBR (%)
Format  Sequence      No Overhead   Overhead    No Overhead   Overhead
QCIF    Foreman       0.396         0.65        -8.480        -5.67
QCIF    Claire        0.68          -0.5        -4.965        .84
QCIF    Trevor        0.60          0.097       -4.90         -.59
QCIF    Mobile        0.365         0.33        -4.48         -3.783
QCIF    Coastguard    0.03          0.          -3.583        -.43
CIF     Waterfall     0.46          0.39        -0.859        -7.734
CIF     Stefan        0.9           0.34        -.75          -.90
CIF     News          0.44          -0.00       -4.39         0.07
CIF     Bus           0.347         0.7         -5.           -4.05
CIF     Dancer        0.0           -0.039      -.059         0.779
4CIF    City          0.09          0.049       -.888         -.76
4CIF    Crew          0.049         -0.05       -.40          0.390
4CIF    Harbour       0.06          0.008       -.4           -0.4
4CIF    Ice           0.4           -0.08       -3.955        0.494
4CIF    Soccer        0.08          -0.03       -0.65         0.477
        Average       0.5           0.09        -4.07         -.577

Fig. 4 Rate-distortion curve on mobile (Mobile QCIF, $QP_P$ = 20, 24, 28, 32; PSNR (dB) versus bit rate (kbps); curves: Orig. TDMD and Proposed TDMD)

V. CONCLUSION

In this paper, we suggest a temporal DIRECT mode decision algorithm for H.264/AVC inter B frame video coding to enhance the coding performance. The proposed algorithm uses the MVs of the co-located block as well as its eight neighboring blocks for DIRECT mode decision. In addition, weight selection is also considered for occlusion areas. The experimental results reveal that an average PSNR gain of 0. dB, corresponding to an average bit-rate reduction of .6%, can be achieved compared to the temporal DIRECT mode decision proposed in H.264/AVC.

REFERENCES

[1] A. Vetro, C. Christopoulos, and H. Sun, "Video transcoding architectures and techniques: An overview," IEEE Signal Processing Magazine, vol. 20, pp. 18-29, March 2003.
[2] M. Flierl and B. Girod, "Generalized B pictures and the draft H.264/AVC video-compression standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 587-597, July 2003.
[3] A. M. Tourapis, F. Wu and S. Li, "Direct mode coding for bipredictive slices in the H.264 standard," IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 1, pp. 119-126, January 2005.
[4] G. Bjontegaard, "Calculation of average PSNR differences between RD curves," ITU-T Q.6/16, Doc. VCEG-M33, April 2001.