Learning Spherical Convolution for Fast Features from 360° Imagery

Anonymous Author(s)

In this file we provide additional details to supplement the main paper submission. In particular, this document contains:

1. Figure illustration of the spherical convolution network structure
2. Implementation details, in particular the learning process
3. Data preparation process of each dataset
4. Complete experiment results
5. Additional object detection results on Pascal, including both success and failure cases
6. Complete visualization of the AlexNet conv1 kernels in spherical convolution

1 Spherical Convolution Network Structure

Fig. 1 shows how the proposed spherical convolutional network differs from an ordinary convolutional neural network (CNN). In a CNN, each kernel convolves over the entire 2D map to generate a 2D output. Alternatively, it can be considered as a neural network with a tied weight constraint, where the weights are shared across all rows and columns. In contrast, spherical convolution only ties the weights along each row. It learns a kernel for each row, and the kernel only convolves along the row to generate a 1D output. Also, the kernel size may differ at different rows and layers, and it expands near the top and bottom of the image.

2 Additional Implementation Details

We train the network using ADAM [1]. For pre-training, we use a batch size of 256 and initialize the learning rate to […]. For layers without batch normalization, we train the kernel for 6,[…] iterations and decrease the learning rate by […] every 4,[…] iterations. For layers with batch normalization, we train for 4,[…] iterations and decrease the learning rate every […] iterations.

For fine-tuning, we first fine-tune the network on conv3_3 for […] iterations with a batch size of […]. The learning rate is set to 1e-5 and is divided by […] after 6,[…] iterations. We then fine-tune the network on conv5_3 for 2,048 iterations. The learning rate is initialized to 1e-4 and is divided by […] after 1,024 iterations. We do not insert batch normalization in conv[…] to conv3_3 because we empirically find that it increases the training error.
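The fine-tuning schedule above amounts to a step decay: the learning rate stays constant and is divided by a fixed factor after a set number of iterations. The following is a minimal sketch of that idea, not the authors' training code. The function name `step_decay_lr` is hypothetical, and the divisor of 10 is an assumption, since the divisor is not legible in this transcript.

```python
def step_decay_lr(base_lr, iteration, step_size, divisor=10.0):
    """Learning rate after dividing base_lr by `divisor` once for every
    `step_size` completed iterations. The default divisor is an assumption."""
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return base_lr / (divisor ** (iteration // step_size))


# Illustration with the conv5_3 fine-tuning numbers: start at 1e-4 and
# drop once after 1,024 of the 2,048 iterations (divisor assumed to be 10).
checkpoints = (0, 1023, 1024, 2047)
schedule = [step_decay_lr(1e-4, it, step_size=1024) for it in checkpoints]
```

With these inputs the rate is unchanged for the first 1,024 iterations and one divisor smaller for the rest.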
3 Data Preparation

This section provides more details about the dataset splits and sampling procedures.

Pano2Vid. For the Pano2Vid dataset, we discard videos with resolution W ≠ 2H and sample frames at […].5 fps. We use Mountain Climbing for testing because it contains the smallest number of frames. Note that the training data contains no instances of Mountain Climbing, so our network is forced to generalize across semantic content. We sample at a low frame rate in order to reduce temporal redundancy in both the training and testing splits. For kernel-wise pre-training and testing, we sample the output on 4[…] pixels per row uniformly to reduce spatial redundancy. Our preliminary experiments show that a denser sample for training does not improve the performance.

Submitted to 31st Conference on Neural Information Processing Systems (NIPS 2017). Do not distribute.
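The row-wise weight tying of the spherical convolution network structure can be sketched in a few lines of NumPy. This is a single-channel illustration under stated assumptions, not the authors' implementation. The function name `spherical_conv_rows` is hypothetical, kernels are assumed to have odd height and width, horizontal wrap-around padding is an assumption motivated by longitude being periodic, and zero padding at the top and bottom is also an assumption.

```python
import numpy as np


def spherical_conv_rows(feature_map, row_kernels):
    """Apply one untied kernel per output row of an equirectangular map.

    feature_map: (H, W) array, single channel.
    row_kernels: sequence of H 2D arrays with odd height and width; the
                 size may differ from row to row.
    Returns an (H, W) array. Like most CNN layers, this computes a
    cross-correlation (the kernel is not flipped).
    """
    feature_map = np.asarray(feature_map, dtype=float)
    height, width = feature_map.shape
    if len(row_kernels) != height:
        raise ValueError("need exactly one kernel per row")
    output = np.zeros((height, width))
    for y, kernel in enumerate(row_kernels):
        kernel = np.asarray(kernel, dtype=float)
        kh, kw = kernel.shape
        ph, pw = kh // 2, kw // 2
        # Gather the band of rows the kernel covers; rows beyond the
        # top or bottom of the image stay zero (assumed padding).
        band = np.zeros((kh, width))
        for i in range(kh):
            src = y - ph + i
            if 0 <= src < height:
                band[i] = feature_map[src]
        # Wrap horizontally so the kernel sees across the left/right seam.
        band = np.pad(band, ((0, 0), (pw, pw)), mode="wrap")
        # The kernel slides only along this row, producing a 1D output.
        for x in range(width):
            output[y, x] = np.sum(band[:, x:x + kw] * kernel)
    return output


demo = np.arange(8, dtype=float).reshape(2, 4)
kernels = [np.ones((1, 3)), np.array([[1.0]])]  # a wider kernel on row 0
result = spherical_conv_rows(demo, kernels)
```

In the demo, row 0 uses a 1×3 box kernel while row 1 uses a 1×1 identity kernel, which mirrors the statement that the kernel size may differ from row to row.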

Figure 1: Spherical convolution illustration (axes θ and φ). The kernel weights at different rows of the image are untied, and each kernel convolves over one row to generate a 1D output. The kernel size also differs at different rows and layers.

PASCAL VOC 2007. As discussed in the main paper, we transform the 2D PASCAL images into equirectangular projected 360° data in order to test object detection in omnidirectional data while still being able to rely on an existing ground-truthed dataset. For each bounding box, we resize the image so the short side of the bounding box matches the target scale. The image is backprojected to the unit sphere using P, where the center of the bounding box lies on n̂. The unit sphere is unwrapped into an equirectangular projection as the test data. We resize the bounding box to three target scales {112, 224, 336} corresponding to {0.5R, 1.0R, 1.5R}, where R is the receptive field (Rf) of N_p. Each bounding box is projected to 5 tangent planes with φ = 180° and θ ∈ {36°, 72°, 108°, 144°, 180°}. By sampling the boxes across a range of scales and tangent plane angles, we systematically test the approach in these varying conditions.

4 Complete Experimental Results

This section contains additional experimental results that do not fit in the main paper.

[Figure 2 plots: RMSE per VGG meta layer (panels include conv3_3, conv4_3, and conv5_3); x-axis ticks 18, 36, 54, 72, 90; methods compared: Direct, Interp, Perspective, Exact, OptSphConv, SphConv-Pre, SphConv.]
Figure 2: Network output error.

Fig. 2 shows the error of each meta layer in the VGG architecture. This is the complete version of Fig. 4a in the main paper. It shows more clearly to what extent the error of SPHCONV increases as we go deeper in the network, as well as how the error of INTERP decreases.
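As a rough illustration of how a per-row output error like the one plotted in Figure 2 could be computed, the sketch below takes the RMSE between a reference feature map and an approximated one, aggregated over channels and columns for each row. The function name `rmse_per_row` and the (channels, rows, columns) layout are assumptions for illustration. The mapping from rows to polar angles is left out because it depends on details not given here.

```python
import numpy as np


def rmse_per_row(exact, approx):
    """Root-mean-square error per image row.

    exact, approx: arrays of shape (C, H, W) holding the reference and
    approximated feature maps (layout is an assumption).
    Returns an array of shape (H,).
    """
    exact = np.asarray(exact, dtype=float)
    approx = np.asarray(approx, dtype=float)
    if exact.shape != approx.shape:
        raise ValueError("feature maps must have the same shape")
    squared_error = (exact - approx) ** 2
    # Average over channels (axis 0) and columns (axis 2), keep rows.
    return np.sqrt(squared_error.mean(axis=(0, 2)))


reference = np.zeros((2, 3, 4))
estimate = np.zeros((2, 3, 4))
estimate[:, 1, :] = 2.0  # only the middle row deviates
row_errors = rmse_per_row(reference, estimate)
```

Only the middle row carries a nonzero error in this toy input, so differences between rows near the poles and rows near the equator would show up directly in the returned vector.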

[Figure 3 plots: IoU at object scales 0.5R, 1.0R, and 1.5R; x-axis ticks 18, 36, 54, 72, 90; methods compared: Direct, Interp, Perspective, Exact, OptConv, SphConv-Pre, SphConv.]
Figure 3: Proposal network accuracy (IoU).

Fig. 3 shows the proposal network accuracy for all three object scales. This is the complete version of Fig. 6b in the main paper. The performance of all methods improves at larger object scales, but PERSPECTIVE still performs poorly near the equator.

5 Additional Object Detection Examples

Figures 4, 5 and 6 show example detection results for SPHCONV-PRE on the 360° version of PASCAL VOC 2007. Note that the large black areas are undefined pixels; they exist because the original PASCAL test images are not 360° data, and the content occupies only a portion of the viewing sphere.

Fig. 7 shows examples where the proposal network generates a tight bounding box while the detector network fails to predict the correct object category. While the distortion is not as severe as in some of the success cases, it makes the confusing cases more difficult. Fig. 8 shows examples where the proposal network fails to generate a tight bounding box. The bounding box shown is the one with the best intersection over union (IoU), which is less than 0.5 in both examples.
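For reference, the intersection over union of two axis-aligned boxes can be computed as in the sketch below. The function name `box_iou` and the (x_min, y_min, x_max, y_max) box format are assumptions for illustration. The sketch ignores boxes that wrap around the left/right seam of an equirectangular image.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as
    (x_min, y_min, x_max, y_max). Returns a value in [0, 1]."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1).
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))
```

A proposal like the ones in Fig. 8 would score below the usual acceptance level under this measure even when it covers part of the object.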

Figure 4: Object detection results on PASCAL VOC 2007 test images transformed to equirectangular projected inputs at different polar angles θ. Black areas indicate regions outside of the narrow field of view (FOV) PASCAL images, i.e., undefined pixels. The polar angle θ = 18°, 36°, 54°, 72° from top to bottom. Our approach successfully learns to translate a 2D object detector trained on perspective images to 360° inputs.

Figure 5: Object detection results on PASCAL VOC 2007 test images transformed to equirectangular projected inputs at θ = 36°.

Figure 6: Object detection results on PASCAL VOC 2007 test images transformed to equirectangular projected inputs at θ = […]8°.

Figure 7: Failure cases of the detector network.

Figure 8: Failure cases of the proposal network.

6 Visualizing Kernels in Spherical Convolution

Fig. 9 shows the target kernels in the AlexNet [2] model and the corresponding kernels learned by our approach at different polar angles θ ∈ {9°, 18°, 36°, 72°}. This is the complete list for Fig. 5 in the main paper. Here we see how each kernel stretches according to the polar angle, and it is clear that some of the kernels in spherical convolution have larger weights than the original kernels. As discussed in the main paper, these examples are for visualization only. As we show, the first layer is amenable to an analytic solution, and only layers l > 1 are learned by our method.

Figure 9: Learned conv1 kernels in AlexNet (full). Each square patch is an AlexNet kernel in perspective projection. The four rectangular kernels beside it are the kernels learned in our network to achieve the same features when applied to an equirectangular projection of the 360° viewing sphere.

References

[1] D. Kingma and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[2] A. Krizhevsky, I. Sutskever, and G. Hinton. ImageNet classification with deep convolutional neural networks. In NIPS, 2012.