Additional Pointers. Introduction to Computer Vision. Convolution. Area operations: Linear filtering


Additional Pointers (Introduction to Computer Vision, CS / ECE 181B)
Handout #4: available this afternoon. Midterm: May 6, 2004. HW #2 due tomorrow.
Ack: Prof. Matthew Turk for the lecture slides.
See my ECE 178 class web page: http://www.ece.ucsb.edu/faculty/manjunath/ece178
See the review chapters from Gonzalez and Woods (available on the 181b web page).
A good understanding of linear filtering and convolution is essential in developing computer vision algorithms. Topics I recommend for additional study (that I will not be able to discuss in detail during lectures): sampling of signals, the Fourier transform, and quantization of signals.

Area operations: Linear filtering
Point, local, and global operations: each kind has its purposes. Much of computer vision analysis starts with local area operations and then builds from there: texture, edges, contours, shape, etc., perhaps at multiple scales. Linear filtering is an important class of local operators: convolution, correlation, Fourier (and other) transforms, and sampling/aliasing issues.

Convolution
The response of a linear shift-invariant system can be described by the convolution operation

    R_{ij} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} H_{mn} F_{i-m, j-n}

where R is the output image, H is the M x N convolution filter kernel, and F is the input image. Convolution is commonly written R = H * F.
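
The double sum above can be sketched directly in numpy. This is a minimal illustration of the definition, not an efficient implementation; `conv2d_full` is a hypothetical helper name, and the "full" zero-padded output convention is an assumption for the sketch:

```python
import numpy as np

def conv2d_full(F, H):
    """Direct 2D convolution R_ij = sum_{m,n} H[m,n] * F[i-m, j-n]
    ('full' output size, treating values outside the input as zero)."""
    Fi, Fj = F.shape
    M, N = H.shape
    R = np.zeros((Fi + M - 1, Fj + N - 1))
    for i in range(R.shape[0]):
        for j in range(R.shape[1]):
            s = 0.0
            for m in range(M):
                for n in range(N):
                    # only accumulate where F[i-m, j-n] is inside the image
                    if 0 <= i - m < Fi and 0 <= j - n < Fj:
                        s += H[m, n] * F[i - m, j - n]
            R[i, j] = s
    return R

# A 1x2 difference kernel applied to a small ramp image:
F = np.array([[1., 2., 4.]])
H = np.array([[1., -1.]])
print(conv2d_full(F, H))  # first row: 1, 1, 2, -4
```

The 1D check against np.convolve confirms the indexing (including the implicit kernel flip in the definition).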

Convolution
Think of 2D convolution as the following procedure. For every pixel (i,j): line up the image at (i,j) with the filter kernel; flip the kernel in both directions (vertical and horizontal); multiply and sum (dot product) to get the output value R(i,j). For every (i,j) location in the output image R, there is a summation over the local area:
    R_{ij} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} H_{mn} F_{i-m, j-n}

Convolution: example
With the 3x3 kernel H = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]:
    R_{4,4} = H_{0,0} F_{4,4} + H_{0,1} F_{4,3} + H_{0,2} F_{4,2}
            + H_{1,0} F_{3,4} + H_{1,1} F_{3,3} + H_{1,2} F_{3,2}
            + H_{2,0} F_{2,4} + H_{2,1} F_{2,3} + H_{2,2} F_{2,2}
            = -1*222 + 0*170 + 1*149 - 2*173 + 0*147 + 2*205 - 1*149 + 0*198 + 1*221
            = 63
[Slide figure: a small input x(m,n) is convolved with a kernel h(m,n) by flipping it to h(-m,-n), shifting it to h(1-m,-n), and summing: y(1,0) = \sum_{k,l} x(k,l) h(1-k,-l) = 3. Verify!]

Spatial frequency and Fourier transforms
A discrete image can be thought of as a regular sampling of a 2D continuous function. The basis function used in sampling is, conceptually, an impulse function shifted to various image locations. This can be implemented as a convolution.
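
The R_{4,4} = 63 worked example can be checked in a few lines of numpy. The image-patch values are read off from the slide's products; flipping the patch is equivalent to flipping the kernel:

```python
import numpy as np

# Kernel H and the 3x3 image patch F[2:5, 2:5] from the slide's example
H = np.array([[-1, 0, 1],
              [-2, 0, 2],
              [-1, 0, 1]])
patch = np.array([[221, 198, 149],    # F[2,2], F[2,3], F[2,4]
                  [205, 147, 173],    # F[3,2], F[3,3], F[3,4]
                  [149, 170, 222]])   # F[4,2], F[4,3], F[4,4]

# Convolution at (4,4): flip the patch (equivalently, flip the kernel),
# multiply elementwise, and sum
R_44 = np.sum(H * patch[::-1, ::-1])
print(R_44)  # 63
```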

Spatial frequency and Fourier transforms
We could use a different basis function (or basis set) to sample the image. Let's instead use 2D sinusoid functions at various frequencies (scales) and orientations. This can also be thought of as a convolution (or dot product).

Fourier transform
    F(u, v) = \int\int_{R^2} g(x, y) e^{-i 2\pi (ux + vy)} dx dy
For a given (u, v), this is a dot product between the whole image g(x,y) and the complex sinusoid exp(-i 2\pi (ux + vy)), where exp(i\theta) = cos\theta + i sin\theta. F(u,v) is a complete description of the image g(x,y). The spatial frequency components (u, v) define the scale and orientation of the sinusoidal basis filters:
    Frequency of the sinusoid: (u^2 + v^2)^{1/2}
    Orientation of the sinusoid: \theta = tan^{-1}(v / u)
[Slide figures: examples of lower- and higher-frequency sinusoids, and the (u,v) frequency plane, where distance from the origin (0,0) gives increasing spatial frequency and the angle gives orientation.]
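
The "dot product with a sinusoid" view can be demonstrated with the discrete FT: an image that is itself a pure 2D sinusoid at frequency (u0, v0) has all of its Fourier energy at that one (u,v) location (plus its conjugate mirror). A small sketch, with the grid size and frequencies chosen arbitrarily:

```python
import numpy as np

# Sample cos(2*pi*(u0*x + v0*y)) on an N x N unit grid
N, u0, v0 = 64, 5, 3
y, x = np.mgrid[0:N, 0:N] / N
g = np.cos(2 * np.pi * (u0 * x + v0 * y))

mag = np.abs(np.fft.fft2(g))
# All energy lands at (row v0, col u0) and its mirror, each with magnitude N*N/2
print(round(mag[v0, u0]))                 # 2048
print(round(np.sqrt(u0**2 + v0**2), 2))  # frequency of the sinusoid, ~5.83
```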

Fourier transform
The output F(u,v) is a complex image (real and imaginary components):
    F(u,v) = F_R(u,v) + i F_I(u,v)
It can also be considered to comprise a magnitude and a phase:
    Magnitude: |F(u,v)| = [F_R(u,v)^2 + F_I(u,v)^2]^{1/2}
    Phase: \phi(F(u,v)) = tan^{-1}(F_I(u,v) / F_R(u,v))
The (u,v) location indicates frequency and orientation; the F(u,v) values indicate magnitude and phase.
[Slide figures: an original image with its magnitude and phase images; low-pass and high-pass filtering via the FT, shown as absolute values with grey = zero.]
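
Low-pass filtering via the FT, as in the slide figures, amounts to zeroing frequency components far from the origin and transforming back. A minimal numpy sketch with a hard "ideal" low-pass mask (the helper name and radius are illustrative choices):

```python
import numpy as np

def lowpass_ft(img, radius):
    """Low-pass filter via the FT: zero every frequency component farther
    than `radius` from the origin, then invert the transform."""
    G = np.fft.fft2(img)
    mag, phase = np.abs(G), np.angle(G)   # the magnitude and phase images
    h, w = img.shape
    # frequency distance from the origin, accounting for FFT wraparound
    u = np.minimum(np.arange(w), w - np.arange(w))
    v = np.minimum(np.arange(h), h - np.arange(h))
    mask = (v[:, None] ** 2 + u[None, :] ** 2) <= radius ** 2
    return np.real(np.fft.ifft2(G * mask))

rng = np.random.default_rng(0)
img = rng.random((32, 32))
smooth = lowpass_ft(img, radius=4)
print(smooth.std() < img.std())  # True: high-frequency variation is removed
```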

Fourier transform facts
The FT is linear and invertible (inverse FT). A fast method for computing the FT exists (the FFT). The FT of a Gaussian is a Gaussian. Also (see Table 7.1):
    F(f * g) = F(f) F(g)
    F(f g) = k F(f) * F(g)
    F(\delta(x,y)) = 1

Sampling and aliasing
Analog signals (images) can be represented accurately and perfectly reconstructed if the sampling rate is high enough: at least 2 samples per cycle of the highest frequency component in the signal (image). If the sampling rate is not high enough (i.e., the image has components over the Nyquist frequency), bad things happen! This is called aliasing:
    Smooth things can look jagged
    Patterns can look very different
    Colors can go astray
    Wagon wheels can move backwards (temporal sampling)
[Slide figures: aliasing examples on an original image.]
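
Two of the facts above can be verified numerically with the discrete FT (where the convolution theorem holds for circular convolution). A quick sketch, with signal length and seed chosen arbitrarily:

```python
import numpy as np

# F(delta) = 1: the DFT of a unit impulse is 1 at every frequency
delta = np.zeros(8)
delta[0] = 1.0
print(np.allclose(np.fft.fft(delta), 1.0))  # True

# F(f * g) = F(f) F(g), for circular convolution on a length-8 grid
rng = np.random.default_rng(1)
f, g = rng.random(8), rng.random(8)
# circular convolution: (f * g)[k] = sum_m f[m] g[(k - m) mod 8]
circ = np.array([np.sum(f * np.roll(g[::-1], k + 1)) for k in range(8)])
print(np.allclose(np.fft.fft(circ), np.fft.fft(f) * np.fft.fft(g)))  # True
```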

Filtering and subsampling
[Slide figures: an image subsampled directly versus filtered and then subsampled.]

Sampling in 1-D
Multiplying a signal x(t) by an impulse train s(t) with spacing T gives the sampled signal
    x_s(t) = x(t) s(t) = \sum_k x(kT) \delta(t - kT)
In the frequency domain, the spectrum X(u) of x(t) is replicated at intervals of 1/T in X_s(f).

The bottom line
High frequencies lead to trouble with sampling. Solution: suppress high frequencies before sampling. Either multiply the FT of the image with a mask that filters out high frequencies, or convolve with a low-pass filter (commonly a Gaussian).
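
Aliasing is easy to demonstrate in 1-D: a sinusoid above the Nyquist frequency is indistinguishable, at the sample points, from a lower-frequency alias. A sketch with arbitrarily chosen frequencies (9 Hz sampled at 8 samples/sec aliases to 9 - 8 = 1 Hz):

```python
import numpy as np

Fs = 8.0                        # sampling rate: Nyquist frequency is 4 Hz
t = np.arange(16) / Fs          # 2 seconds of sample times
hi = np.sin(2 * np.pi * 9 * t)  # 9 Hz: well above Nyquist
lo = np.sin(2 * np.pi * 1 * t)  # 1 Hz: its alias
print(np.allclose(hi, lo))      # True: the samples are identical
```

This is exactly why the image must be low-pass filtered before subsampling: once the samples are taken, the 9 Hz and 1 Hz interpretations cannot be told apart.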

Filter and subsample
So if you want to sample an image at a certain rate (e.g., resample a 640x480 image to make it 160x120), but the image has high frequency components over the Nyquist frequency, what can you do? Get rid of those high frequencies by low-pass filtering! This is a common operation in imaging and graphics: filter and subsample.

Image pyramid
An image pyramid shows an image at multiple scales (Level 1, Level 2, Level 3, ...), each one a filtered and subsampled version of the previous. A complete pyramid has (1 + log2 N) levels (where N is the image height or width).

Image pyramids
Image pyramids are useful in object detection/recognition, image compression, signal processing, etc.
    Gaussian pyramid: filter with a Gaussian (a low-pass pyramid)
    Laplacian pyramid: filter with the difference of Gaussians at different scales (a band-pass pyramid)
    Wavelet pyramid: filter with wavelets
[Slide figure: a Gaussian pyramid.]
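
The filter-and-subsample step can be sketched as a tiny pyramid builder. This is a rough Gaussian-pyramid approximation, not the slides' exact filter: the [1 2 1]/4 binomial kernel stands in for a Gaussian, and the helper names are illustrative:

```python
import numpy as np

def blur_subsample(img):
    """One pyramid step: smooth with a separable [1 2 1]/4 binomial filter
    (a cheap Gaussian approximation), then keep every other pixel."""
    k = np.array([0.25, 0.5, 0.25])
    pad = np.pad(img, 1, mode="edge")          # replicate borders
    rows = k[0]*pad[1:-1, :-2] + k[1]*pad[1:-1, 1:-1] + k[2]*pad[1:-1, 2:]
    pad2 = np.pad(rows, 1, mode="edge")
    both = k[0]*pad2[:-2, 1:-1] + k[1]*pad2[1:-1, 1:-1] + k[2]*pad2[2:, 1:-1]
    return both[::2, ::2]

def pyramid(img):
    levels = [img]
    while levels[-1].shape[0] > 1:
        levels.append(blur_subsample(levels[-1]))
    return levels

# An 8x8 image gives 1 + log2(8) = 4 levels, matching the slide's count
p = pyramid(np.random.default_rng(2).random((8, 8)))
print([lvl.shape for lvl in p])  # [(8, 8), (4, 4), (2, 2), (1, 1)]
```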

Wavelet transform example
[Slide figures: an original image with its Gaussian and Laplacian pyramids, and its wavelet decomposition into low-pass, high-pass horizontal, high-pass vertical, and high-pass both subbands.]

Pyramid filters (1D view)
    Gaussian pyramid: G(x)
    Laplacian pyramid: G_1(x) - G_2(x)
    Wavelet pyramid: G(x) sin(x)

Spatial frequency
The Fourier transform gives us a precise way to define, represent, and measure spatial frequency in images. Other transforms give similar descriptions: the Discrete Cosine Transform (DCT), used in JPEG, and wavelet transforms, which are very popular. Because of the FT/convolution relationship F(f * g) = F(f) F(g), convolutions can be implemented via Fourier transforms:
    f * g = F^{-1} { F(f) F(g) }
For large kernels, this can be much more efficient.
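
The FT-based convolution identity can be spot-checked in 2D with the discrete transforms (where it computes circular convolution). A sketch with arbitrary random inputs:

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.random((16, 16))
g = rng.random((16, 16))

# f * g = F^{-1}{ F(f) F(g) } via FFTs
fft_conv = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(g)))

# Direct circular convolution at one output location, as a spot check
i, j = 5, 7
direct = sum(f[m, n] * g[(i - m) % 16, (j - n) % 16]
             for m in range(16) for n in range(16))
print(np.isclose(fft_conv[i, j], direct))  # True
```

For an NxN image and an NxN kernel, the direct sum costs O(N^4) while the FFT route costs O(N^2 log N), which is the efficiency claim above.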

Back to convolution/correlation
Convolution (or its FT/IFT pair) is equivalent to linear filtering. Think of the filter kernel as a pattern; convolution checks the response of the pattern at every point in the image. At each point, it is a dot product of the local image area with the filter kernel:
    R_{ij} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} H_{mn} F_{i-m, j-n}
Conceptually, the image responds best to the pattern of the filter kernel (similarity): an edge kernel will produce high responses at edges, a face kernel will produce high responses at faces, etc.

Convolution and correlation
For a given filter kernel, what image values really do give the largest output value? All white (maximum pixel values). What image values will give a zero output? All zeros, or any local vector of values that is perpendicular to the kernel vector.

Correlation as a dot product
An m by n image (or image patch) can be reorganized as an mn by 1 vector, or as a point in mn-dimensional space. For example, a 2x3 image with values (a, b, c, d, e, f) becomes a 6x1 vector, i.e., a 6-dimensional point. At each location, correlating a 3x3 kernel h_1..h_9 with the local image values f_1..f_9 equals the dot product of two 9-dimensional vectors:
    f . h = f^T h = \sum_i f_i h_i = |f| |h| cos\theta
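
The reshape-to-vector view is a one-liner in numpy: flattening the patch and the kernel and taking an ordinary vector dot product gives exactly the same number as the elementwise multiply-and-sum. A small sketch with random values:

```python
import numpy as np

rng = np.random.default_rng(4)
patch = rng.random((3, 3))     # local image area, f_1..f_9
kernel = rng.random((3, 3))    # filter, h_1..h_9

as_sum = np.sum(patch * kernel)                   # sum_i f_i h_i
as_dot = patch.reshape(-1) @ kernel.reshape(-1)   # f^T h, 9-D vectors
print(np.isclose(as_sum, as_dot))  # True
```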

Finding patterns in images via correlation
Correlation gives us a way to find patterns in images. Task: find the pattern H in the image F. Approach: convolve (correlate) H with F and find the maximum value of the output image; that location is the best match. H is called a matched filter.

Minimize d^2
Another way: calculate the distance d between the image patch F and the pattern H:
    d^2 = \sum_i (F_i - H_i)^2
The location with minimum d^2 defines the best match, but this is quite expensive. Expanding,
    d^2 = \sum_i F_i^2 - 2 \sum_i F_i H_i + \sum_i H_i^2
The first term is assumed fixed (more or less), the last term is fixed, and the middle term is the correlation, so minimizing d^2 is approximately equivalent to maximizing the correlation.

Normalized correlation
Problems with these two approaches: correlation responds best to an all-white patch (maximum pixel values), and both techniques are sensitive to scaling of the image. Normalized correlation solves these problems.
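
The matched-filter idea can be sketched end to end: embed a pattern in a larger image, correlate the pattern with every window, and take the maximum response. All values here are hypothetical; the pattern is made bright and distinctive so that the correlation peak is unambiguous:

```python
import numpy as np

rng = np.random.default_rng(5)
img = rng.random((12, 12)) * 0.1        # dim random background
pattern = rng.random((3, 3)) + 2.0      # bright, distinctive template
img[4:7, 6:9] = pattern                 # embed it at (row 4, col 6)

# Correlate the pattern with every 3x3 window and keep the maximum
best, best_loc = -np.inf, None
for i in range(10):
    for j in range(10):
        r = np.sum(img[i:i+3, j:j+3] * pattern)   # correlation score
        if r > best:
            best, best_loc = r, (i, j)
print(best_loc)  # (4, 6): the correlation peak sits on the embedded pattern
```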

Normalized correlation
We don't really want white to give the maximum output; we want the maximum output when F = H, i.e., when the angle \theta between them is zero. Normalized correlation measures that angle:
    R = cos\theta = (F . H) / (|F| |H|)
What if the image values are doubled? Halved? What if the template values are doubled? Halved? The normalized correlation output is unchanged: R is independent of the magnitude (brightness) of the image. Drawback: it is more expensive than plain correlation, though specialized hardware implementations exist.
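
The brightness-invariance claims are easy to check numerically. A minimal sketch (`ncc` is a hypothetical helper name) showing that doubling the image or halving the template leaves R unchanged, and that a perfect match gives cos(0) = 1:

```python
import numpy as np

def ncc(f, h):
    """Normalized correlation R = cos(theta) = (f . h) / (|f| |h|)."""
    f, h = f.reshape(-1), h.reshape(-1)
    return float(f @ h / (np.linalg.norm(f) * np.linalg.norm(h)))

rng = np.random.default_rng(6)
f = rng.random((3, 3))          # image patch
h = rng.random((3, 3))          # template

r = ncc(f, h)
print(np.isclose(ncc(2.0 * f, h), r))   # True: doubling the image, no change
print(np.isclose(ncc(f, 0.5 * h), r))   # True: halving the template, no change
print(np.isclose(ncc(h, h), 1.0))       # True: perfect match, theta = 0
```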