Generalised density function estimation using moments and the characteristic function

Gerhard Esterhuizen

Thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Electronic Engineering at the University of Stellenbosch

Supervisor: Prof. J.A. du Preez
April 2003

Declaration

I, the undersigned, hereby declare that the work contained in this thesis is my own original work and that I have not previously in its entirety or in part submitted it at any university for a degree.

Signature:
March 2003

Abstract

Probability density functions (PDFs) and cumulative distribution functions (CDFs) play a central role in statistical pattern recognition and verification systems. They allow observations that do not occur according to deterministic rules to be quantified and modelled. An example of such observations would be the voice patterns of a person that are used as input to a biometric security device. In order to model such non-deterministic observations, a density function estimator is employed to estimate a PDF or CDF from sample data. Although numerous density function estimation techniques exist, all of them can be classified into one of two groups, parametric and non-parametric, each with its own characteristic advantages and disadvantages. In this research, we introduce a novel approach to density function estimation that attempts to combine some of the advantages of both the parametric and non-parametric estimators. This is done by considering density estimation using an abstract approach in which the density function is modelled entirely in terms of its moments or characteristic function. New density function estimation techniques are first developed in theory, after which a number of practical density function estimators are presented. Experiments are performed in which the performance of the new estimators is compared to that of two established estimators, namely the Parzen estimator and the Gaussian mixture model (GMM). The comparison is performed in terms of the accuracy, computational requirements and ease of use of the estimators, and it is found that the new estimators do combine some of the advantages of the established estimators without the corresponding disadvantages.

Opsomming (Summary)

Probability density functions (PDFs) and cumulative distribution functions (CDFs) play a central role in statistical pattern recognition and verification systems. They make it possible to quantify and model non-deterministic observations. The voice patterns of a speaker that are given as input to a biometric security system are an example of such an observation. In order to model such observations, a density function estimator is used to estimate the PDF or CDF from data samples. Although numerous density function estimators exist, all of them can be placed in one of two categories, parametric and non-parametric, each with its own characteristic advantages and disadvantages. This work presents a new approach to density function estimation that attempts to combine the advantages of both the parametric and the non-parametric techniques. This is done by approaching density function estimation from an abstract point of view in which the density function is modelled exclusively in terms of its moments and characteristic function. New methods are first investigated and developed in theory, after which practical techniques are presented. These estimators are able to estimate a wide variety of density functions and are not designed to represent only certain families of density functions optimally. Experiments were carried out to compare the performance of the new techniques with that of two established techniques, namely the Parzen estimator and the Gaussian mixture model (GMM). Performance is measured in terms of accuracy, required computational power and ease of use. It is found that the new estimators do combine advantages of the established estimators without the accompanying disadvantages.

Acknowledgements

I would like to thank the following people:
Prof. J.A. Du Preez, my supervisor, for his patient guidance.
Zelda Weitz for her unfailing love and encouragement.
My parents and family for their education and support.
My grandmother and Mr and Mrs Weitz for providing a home away from home.
Dr Dave Weber and the DSP Lab for providing a creative learning environment.
Chari Botha for his technical advice.
Pieter Nel and Koos Hugo for False Bay sailing.

Contents

1 Introduction
  1.1 Motivation and topicality
    1.1.1 Density function estimation
    1.1.2 Our research
  1.2 Background
    1.2.1 Random variables
    1.2.2 Pattern classification
    1.2.3 Hypothesis tests
  1.3 Existing techniques
    1.3.1 Non-parametric estimators
    1.3.2 Parametric estimators
    1.3.3 Other techniques
    1.3.4 Requirements for a new estimator
  1.4 Objectives
  1.5 Contributions
  1.6 Overview of the document
2 Novel probability density function estimators
  2.1 Introduction
  2.2 Motivation
  2.3 Definitions and Background
    2.3.1 Moments
    2.3.2 Characteristic function
    2.3.3 Fourier series
  2.4 Estimators based on moments
    2.4.1 Motivation
    2.4.2 A PDF in terms of moments
    2.4.3 The anti-derivative
    2.4.4 Numerical integration techniques
    2.4.5 Fourier series approximation
    2.4.6 Estimating moments from sample data
  2.5 Estimators based on the characteristic function
    2.5.1 Motivation
    2.5.2 A PDF in terms of a characteristic function
    2.5.3 Fourier series
    2.5.4 Limiting case where N_x → ∞
  2.6 Conclusions
    2.6.1 Comparison between characteristic function and moments techniques
    2.6.2 Comparison with the Parzen estimator and Gaussian mixture model (GMM)
3 Novel cumulative distribution function estimators
  3.1 Introduction
  3.2 Motivation
  3.3 Definitions and background
  3.4 Estimators based on moments
    3.4.1 A CDF in terms of moments
    3.4.2 Numerical integration techniques
    3.4.3 Fourier series approximation
  3.5 Estimators based on the characteristic function
    3.5.1 A CDF in terms of a characteristic function
    3.5.2 Fourier series
  3.6 Conclusions
4 Experimental results
  4.1 Introduction
  4.2 Experimental setup
    4.2.1 Input data
    4.2.2 Estimation error measure
    4.2.3 PDF and CDF estimate using moments
    4.2.4 PDF and CDF estimate using characteristic function
    4.2.5 Parzen estimator
    4.2.6 Gaussian Mixture Model
  4.3 Mean estimation error
    4.3.1 PDF estimators
    4.3.2 CDF estimators
  4.4 Computational requirements
    4.4.1 PDF Estimators
    4.4.2 CDF Estimators
  4.5 Training requirements
  4.6 Application to speaker verification
  4.7 Conclusions
5 Conclusions and recommendations
  5.1 Conclusions
  5.2 Recommendations
A A review of the Fourier transform
B Summary of algorithms

List of Tables

4.1 PDF estimators: minimum mean estimation errors, minimum combined mean estimation errors and corresponding combined standard deviations (in brackets) from 100 samples (100 × Kullback-Leibler divergence is indicated).
4.2 PDF estimators: minimum mean estimation errors, minimum combined mean estimation errors and corresponding combined standard deviations (in brackets) from 1000 samples (100 × Kullback-Leibler divergence is indicated).
4.3 PDF estimators: the effect of selecting a sub-optimal working point.
4.4 CDF estimators: minimum mean estimation errors, minimum combined mean estimation errors and corresponding combined standard deviation (in brackets) from 100 samples (10 × integral absolute difference is indicated).
4.5 CDF estimators: minimum mean estimation errors, minimum combined mean estimation errors and corresponding combined standard deviation (in brackets) from 1000 samples (10 × integral absolute difference is indicated).
4.6 Gradient (×100) characterising the relationship between the computation time and the parameter value.
4.7 Comparison between GMM and CF technique in a speaker verification application.
4.8 Feature matrix for all the estimators.
A.1 Selected Fourier transform properties.
B.1 PDF estimate from moments using Fourier series.
B.2 PDF estimate from samples using Fourier series.
B.3 CDF estimate from moments using Fourier series.
B.4 CDF estimate from samples using Fourier series.

List of Figures

2.1 Successive Taylor series approximations to characteristic function of Gaussian PDF using 5, 14, and 32 terms.
2.2 The smoothing effect of a Hamming window on the characteristic function.
2.3 Leakage introduced by rectangular windowing of the characteristic function.
2.4 Discretisation of the characteristic function.
2.5 Reconstructing a PDF directly from data samples, using a triangular windowing function.
3.1 Relationship between f(x), f'(x) and f''(x).
4.1 Mean estimation error: PDF estimate using GMM from 100 samples.
4.2 Mean estimation error: PDF estimate using Parzen estimator from 100 samples.
4.3 Mean estimation error: PDF estimate using characteristic function from 100 samples.
4.4 Mean estimation error: PDF estimate using moments from 100 samples.
4.5 Typical PDF estimates.
4.6 Examples of over-fitted PDF estimates (obtained from 100 samples).
4.7 Mean estimation error: CDF estimate using GMM from 100 samples.
4.8 Mean estimation error: CDF estimate using Parzen estimator from 100 samples.
4.9 Mean estimation error: CDF estimate using characteristic function from 100 samples.
4.10 Mean estimation error: CDF estimate using moments from 100 samples.
4.11 Typical CDF estimates.
4.12 Comparison of normalised execution times of PDF estimators: Pentium III 700 MHz.