AN EFFICIENT ESTIMATOR FOR GIBBS RANDOM FIELDS

KYBERNETIKA, VOLUME 50 (2014), NUMBER 6, PAGES 883–895

AN EFFICIENT ESTIMATOR FOR GIBBS RANDOM FIELDS

Martin Janžura

An efficient estimator for the expectation $\int f \, dP$ is constructed, where $P$ is a Gibbs random field and $f$ is a local statistic, i.e. a functional depending on a finite number of coordinates. The estimator coincides with the empirical estimator under the conditions stated in Greenwood and Wefelmeyer [6], and covers the known special cases, namely the von Mises statistic for i.i.d. underlying fields and the case of one-dimensional Markov chains.

Keywords: Gibbs random field, efficient estimator, empirical estimator

Classification: 62F12, 62M40

1. INTRODUCTION

The expectation of a local statistic defined on a stationary random field can be estimated from observations in a large window with the aid of the empirical estimator, which is given as the average of the function values over all shifts inside the window. Under appropriate conditions on the underlying random field, the estimator is consistent and asymptotically normal. But, in general, it need not be efficient.

For exponential-type distributions of random processes and fields, which coincide with the Gibbs random fields studied both within statistical physics and in image processing, the problem of efficiency was addressed by Greenwood and Wefelmeyer [6]. Under the regularity condition, which alone makes the efficiency study meaningful and which for Gibbs fields amounts to the restriction to the Dobrushin uniqueness region, they showed that the empirical estimator is efficient if and only if the considered local statistic belongs (in a special wide sense) to the same space as the potential of the underlying Gibbs random field. In the present paper we adopt most of the results obtained by Greenwood and Wefelmeyer [6], but we extend them substantially by a general construction of the efficient estimator.
Roughly speaking, the empirical estimate $\int f \, d\hat P$ must be replaced by an estimate of the form $\int f \, dP_{\hat g}$, where $\hat g$ is the (efficient) estimate of the unknown potential. We shall also prove that these two estimates coincide under the conditions stated in Greenwood and Wefelmeyer [6]. In addition, we shall show that the known special results concerning the i.i.d. and Markov cases are also covered by our general result.

DOI: 10.14736/kyb-2014-6-0883

Besides the main reference, the efficiency issues are based on the pioneering results of Hájek [8] and the comprehensive book by Bickel et al. [1]. The regularity (LAN) condition is due to Janžura [10]. The results on Gibbs random fields are partly from Künsch's paper [14] and Georgii's book [4], the statistical analysis results from Janžura [12]. For the sake of simplicity, within this paper we consider only a finite state space and the finite-dimensional (parametric) case. With an appropriate effort, a generalization is possible.

2. GIBBS RANDOM FIELDS

Let $X_0$ be a finite set, and, for some $d \ge 1$, let $Z^d$ be the $d$-dimensional integer lattice and $X = X_0^{Z^d}$ the corresponding product space. For $V \subset Z^d$ we denote by $\mathcal F_V$ the $\sigma$-algebra generated by the projection $\mathrm{Proj}_V : X \to X_0^V$, and by $B_V$ the set of (bounded) $\mathcal F_V$-measurable functions. Further, we shall write $x_V = \mathrm{Proj}_V(x)$ for $x \in X$ and $V \subset Z^d$, and we shall denote by $\mathcal V = \{W \subset Z^d;\ 0 < |W| < \infty\}$ the system of finite non-void subsets of $Z^d$.

We denote by $\mathcal P$ the set of all probability measures on $X$ with the product $\sigma$-algebra $\mathcal F$. Further, by $\mathcal P_S$ we denote the set of stationary (translation invariant) probability measures, i.e. $\mathcal P_S = \{P \in \mathcal P;\ P = P \circ \tau_j^{-1} \text{ for every } j \in Z^d\}$, where $\tau_j : X \to X$ is, for every $j \in Z^d$, the shift operator defined by $[\tau_j(x)]_t = x_{j+t}$ for every $t \in Z^d$, $x \in X$. Finally, by $\mathcal P_E$ we denote the set of ergodic probability measures, i.e. $\mathcal P_E = \{P \in \mathcal P_S;\ P(F) \in \{0,1\} \text{ for every } F \in \mathcal E\}$, where $\mathcal E = \{F \in \mathcal F;\ F = \tau_j^{-1}(F) \text{ for every } j \in Z^d\}$.

The functions from $B = \bigcup_{V \in \mathcal V} B_V$ will be quoted as (finite-range) potentials. For a potential $\Phi \in B_V$, $V \in \mathcal V$, we define the Gibbs specification as the family of probability kernels $\Pi^\Phi = \{\Pi^\Phi_\Lambda\}_{\Lambda \in \mathcal V}$, where
$$\Pi^\Phi_\Lambda(x_\Lambda \mid x_{\Lambda^c}) = Z^\Phi_\Lambda(x_{\Lambda^c})^{-1} \exp\Big[ \sum_{j \in \Lambda_V} \Phi \circ \tau_j(x) \Big]$$
with the normalizing constant
$$Z^\Phi_\Lambda(x_{\Lambda^c}) = \sum_{y_\Lambda \in X_0^\Lambda} \exp\Big[ \sum_{j \in \Lambda_V} \Phi \circ \tau_j(y_\Lambda, x_{\Lambda^c}) \Big]$$
for every $\Lambda \in \mathcal V$. Note that $\Lambda_V = \{j \in Z^d;\ (j + V) \cap \Lambda \neq \emptyset\}$. We do not emphasize the range $V$ in the notation.
It is an inseparable part of the potential $\Phi$, and it is only important that, due to the finiteness of $V$, the set $\Lambda_V$ is also finite. In fact, the set $\Lambda_V$ is the minimal set of indices for which the term $\Phi \circ \tau_j(x)$ actually depends on $x_\Lambda$, and we could sum over any larger set.

A probability measure $P \in \mathcal P$ is a Gibbs distribution (Gibbs random field) with the potential $\Phi \in B$ if
$$P_{\Lambda \mid \Lambda^c}(x_\Lambda \mid x_{\Lambda^c}) = \Pi^\Phi_\Lambda(x_\Lambda \mid x_{\Lambda^c}) \quad \text{a.s. } [P]$$
for every $\Lambda \in \mathcal V$. The set of such $P$'s will be denoted by $G(\Phi)$, while the set of stationary (resp. ergodic) Gibbs distributions will be denoted by $G_S(\Phi) = G(\Phi) \cap \mathcal P_S$ (resp.

$G_E(\Phi) = G(\Phi) \cap \mathcal P_E$). In general $G_E(\Phi) \neq \emptyset$. We may say that a (first-order) phase transition occurs if $|G(\Phi)| > 1$. Then some elements are non-ergodic, and some may even fail to be translation invariant (stationary), although the specification is so. For a detailed treatment and examples see, e.g., Georgii [4, Chapter 6.2]. Unfortunately, for the efficiency study the phase transitions mean non-smoothness and non-regularity, and therefore must be excluded; see Section 5 below.

Let us end this section with the observation that, since for $\Phi \in B_V$ we have $\Pi^\Phi_\Lambda \in B_{\Lambda + V - V}$ for every $\Lambda \in \mathcal V$, the above-defined Gibbs random fields obey the (spatial) Markov property.

3. EQUIVALENCE OF POTENTIALS

Besides the phase transitions, there is another kind of non-uniqueness that can complicate the treatment, namely the possible equivalence of potentials. Two potentials $\Phi, \Psi \in B$ are equivalent, written $\Phi \equiv \Psi$, if $G(\Phi) = G(\Psi)$. There are several equivalent characterizations (see Georgii [4] or Janžura [11]); e.g., $\Phi \equiv \Psi$ iff $\Pi^\Phi_{\{0\}} = \Pi^\Psi_{\{0\}}$. For our purposes the following one will be important: $\Phi \equiv \Psi$ iff $\int \Phi \, dP = \int \Psi \, dP + c$ for every $P \in \mathcal P_S$ with some fixed constant $c$.

The equivalence can appear very easily, e.g., $\Phi \equiv \Phi + g - g \circ \tau_j + c$ for some $g \in B$, $j \in Z^d$, and a constant $c$. From the statistical analysis point of view the equivalence of potentials means breaking the basic identifiability condition, and therefore it must be unambiguously avoided. A rather standard way consists in dealing with equivalence classes instead of the particular potentials. But we prefer to restrict our considerations to a rich enough subclass of mutually non-equivalent potentials. Such a subclass should contain representatives of all equivalence classes. That can be arranged by dealing with the so-called vacuum potentials. Let us fix some state $b \in X_0$, quoted as the vacuum state. For any $V \in \mathcal V$, let us denote $B^b_V = \{\Phi \in B_V;\ \Phi(x_V) = 0 \text{ if } x_t = b \text{ for some } t \in V\}$.
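As an aside, the specification of Section 2 can be evaluated numerically on a finite window. Below is a minimal sketch under illustrative assumptions (binary state space $X_0 = \{0,1\}$, $d = 1$, a single pair potential with vacuum state $b = 0$; all names are invented):

```python
import itertools, math

# Illustrative vacuum pair potential: Phi(x_0, x_1) = beta * x_0 * x_1,
# which vanishes whenever a coordinate equals the vacuum state b = 0.
beta = 0.8
V = (0, 1)  # range of the potential

def phi(a, b):
    return beta * a * b

def specification(lam, x, sites):
    """Pi^Phi_Lam(x_Lam | x_{Lam^c}) on a finite 1D window `sites`."""
    lam_V = [j for j in sites if any((j + v) in lam for v in V)]  # Lam_V

    def energy(cfg):  # sum over j in Lam_V of Phi(tau_j cfg)
        return sum(phi(cfg[j], cfg[j + 1]) for j in lam_V)

    z = sum(math.exp(energy({**x, **dict(zip(lam, y))}))          # Z^Phi_Lam
            for y in itertools.product((0, 1), repeat=len(lam)))
    return math.exp(energy(x)) / z

sites = range(0, 5)
x = {0: 1, 1: 1, 2: 1, 3: 0, 4: 1}    # a full configuration on the window
p = specification((2, 3), x, sites)   # conditional prob. of x_{2,3} given the rest
```

The kernel sums to one over all configurations on $\Lambda$ with the boundary fixed, as the definition requires.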
Further, in order to avoid equivalence by shifting, let us introduce $\mathcal V_0 = \{V \in \mathcal V;\ \min_{t \in V} t = 0\}$, where the minimum is with respect to some fixed complete (e.g. the lexicographical) ordering. Now, for a finite subsystem $\mathcal A \subset \mathcal V_0$ we set $B^b_{\mathcal A} = \{\Phi \in B;\ \Phi = \sum_{A \in \mathcal A} \Phi^A,\ \Phi^A \in B^b_A \text{ for every } A \in \mathcal A\}$, and, consequently, $B^b = \bigcup_{\mathcal A \subset \mathcal V_0,\, |\mathcal A| < \infty} B^b_{\mathcal A}$ will be our class of vacuum potentials.

Proposition 3.1.
(i) For $\Phi, \Psi \in B^b$ it holds: $\Phi \equiv \Psi$ iff $\Phi = \Psi$.
(ii) For every $\Phi \in B$ there exists $\Psi \in B^b$ such that $\Phi \equiv \Psi$.

Proof. (i) We may observe that for $\Phi, \Psi \in B^b$ we have $\Phi - \Psi \in B^b$, and $\Phi \equiv \Psi$ iff $\Phi - \Psi \equiv 0$.

That simplifies the tedious calculation a bit. For some $\Phi \in B^b_{\mathcal A}$ with $A = \bigcup_{A' \in \mathcal A} A'$ we must deduce $\Phi = 0$ from the condition $\sum_{j \in A} \Phi \circ \tau_j \in B_{(A - A) \setminus \{0\}}$ (which is equivalent to $\Phi \equiv 0$) by a proper sequence of substitutions. Let $A^1 \in \mathcal A$ be minimal in the sense that $A^1 \not\subset (A' + t)$ for every $A' \in \mathcal A \setminus \{A^1\}$ and $t \in Z^d$. Then $\sum_{j \in A} \Phi \circ \tau_j(x_{A^1}, b_{A \setminus A^1}) = \Phi^{A^1}(x_{A^1})$, and since by the assumption $\Phi^{A^1}$ must not depend on $x_0$, we may always substitute $x_0 = b$ and obtain $\Phi^{A^1} = 0$. Then we repeat the consideration with $\mathcal A \setminus \{A^1\}$, etc., and finally we obtain $\Phi^A = 0$ for every $A \in \mathcal A$.

(ii) The statement follows from Theorem 2.35 b) in Georgii [4], or by direct calculations with the aid of the Möbius formula for constructing the vacuum potentials.

4. EMPIRICAL RANDOM FIELDS

For a fixed configuration $x \in X$ and some $\Lambda \in \mathcal V$ we define a probability measure $\hat P^x_\Lambda$ by
$$\int \Phi \, d\hat P^x_\Lambda = |\Lambda|^{-1} \sum_{t \in \Lambda} \Phi \circ \tau_t(x) \quad \text{for every } \Phi \in B.$$
Such probability distributions will be quoted as empirical random fields. On the other hand, for fixed $\Phi \in B_V$ we have $\int \Phi \, d\hat P_\Lambda \in B_{V + \Lambda}$, which means that for specifying the marginal distribution $\hat P^x_\Lambda / \mathcal F_V$ we actually need to know only $x_{V+\Lambda} \in X_0^{V+\Lambda}$.

Now, let us consider a sequence of properly growing subsets $\{V_n\}_{n=1}^\infty$ of $Z^d$, e.g., the cubes $V_n = [-n, n]^d$ for simplicity. Then, for a fixed potential $\Phi \in B$, let us denote by
$$q(\Phi) = \lim_{n \to \infty} |V_n|^{-1} \log Z^\Phi_{V_n}(x_{V_n^c})$$
the pressure corresponding to the potential $\Phi$. The limit exists uniformly in $x \in X$, e.g., by Theorem 15.30 in Georgii [4]. In general, the pressure $q$ is a convex continuous function on $B$, strictly convex on $B^b$, and even strongly convex on every compact subset of $B^b$ (see Dobrushin and Nahapetian [3]). Further, for every $P \in \mathcal P_S$ we may define the entropy rate
$$H(P) = \lim_{n \to \infty} |V_n|^{-1} \int \big[ -\log P_{V_n}(x_{V_n}) \big] \, dP(x),$$
where the limit exists again by Theorem 15.12 in Georgii [4].

Proposition 4.1. For $P \in \mathcal P_S$, the following statements are equivalent:
(i) $P \in G_S(\Phi)$;
(ii) $|V_n|^{-1} \log P_{V_n}(x_{V_n}) - \int \Phi \, d\hat P^x_{V_n} + q(\Phi) \to 0$ for $n \to \infty$;
(iii) $H(P) = q(\Phi) - \int \Phi \, dP$.
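The empirical random field defined above is just an average of a local statistic over all shifts inside the window. A minimal sketch for $d = 1$ (illustrative names; the 0/1 data below are stand-in input, not the paper's example):

```python
import random

# int f d hat-P^x_Lam = |Lam|^{-1} * sum_{t in Lam} f(tau_t x),
# for a local statistic f reading the coordinates t + support.
def empirical_mean(f, x, lam, support):
    """Average of f over all shifts t in Lam."""
    return sum(f(*(x[t + s] for s in support)) for t in lam) / len(lam)

random.seed(0)
n = 10_000
lam = range(n)
x = {t: random.randint(0, 1) for t in range(n + 1)}  # observed on Lam + support

# f(x_0, x_1) = x_0 * x_1 depends on the two coordinates {0, 1}:
# the empirical mean is the frequency of adjacent 1-1 pairs.
est = empirical_mean(lambda a, b: a * b, x, lam, (0, 1))
```

Note that, exactly as remarked above, the data must be observed on the enlarged region $\Lambda + \mathrm{support}$, here $\{0, \dots, n\}$ rather than $\{0, \dots, n-1\}$.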

Proof. While (i) implies (ii) and (ii) implies (iii) are rather straightforward, the proof of (iii) implies (i) needs a rather sophisticated construction (see, e.g., Georgii [4, Theorem 15.37] or Janžura [13]).

5. UNIQUENESS REGION

For every $t \in Z^d \setminus \{0\}$ let us denote
$$\gamma_t(\Phi) = \tfrac12 \sup\Big\{ \sum_{x_0 \in X_0} \big| \Pi^\Phi_{\{0\}}(x_0 \mid y_{Z^d \setminus \{0\}}) - \Pi^\Phi_{\{0\}}(x_0 \mid z_{Z^d \setminus \{0\}}) \big| ;\ y_s = z_s \text{ for } s \neq t \Big\}.$$
If $\gamma(\Phi) = \sum_{t \in Z^d \setminus \{0\}} \gamma_t(\Phi) < 1$, the potential is said to satisfy the Dobrushin uniqueness condition. Let us denote by $D = \{\Phi \in B;\ \gamma(\Phi) < 1\}$ the Dobrushin uniqueness region. For every $\Phi \in D$ it holds that $G(\Phi) = G_S(\Phi) = G_E(\Phi) = \{P_\Phi\}$. For more details see Dobrushin [2], Künsch [14], or Georgii [4].

6. FINITE-DIMENSIONAL (PARAMETRIC) SPACE OF POTENTIALS

From now on, we shall consider the linear subspace $H \subset B^b$ of potentials, where $H = \mathrm{Lin}(\Phi^1, \dots, \Phi^K)$ and $\Phi^1, \dots, \Phi^K \in B^b$ are vacuum potentials. Then the potentials are mutually non-equivalent, i.e.
$$\Phi_\theta = \sum_{i=1}^K \theta_i \Phi^i \equiv 0 \quad \text{iff} \quad \theta = 0.$$
Let us denote $\Theta = \{\theta \in R^K;\ \Phi_\theta \in D\}$ and $\mathcal P_\Theta = \{P_\theta\}_{\theta \in \Theta}$, where $P_\theta = P_{\Phi_\theta}$. Similarly, we shall write $q(\theta)$ for $q(\Phi_\theta)$, or $\mathrm{cov}_\theta$ for $\mathrm{cov}_{P_\theta}$, and denote
$$B_\theta(f_1, f_2) = \sum_{t \in Z^d} \mathrm{cov}_\theta(f_1, f_2 \circ \tau_t) \quad \text{for every } f_1, f_2 \in B.$$
Moreover, let us introduce the transform $U : \Theta \to R^K$ defined by $U(\theta) = \int \Phi \, dP_\theta$, where $\Phi = (\Phi^1, \dots, \Phi^K)$.

Proposition 6.1. For $q : \Theta \to R$ and $U : \Theta \to R^K$ it holds:
(i) $q$ is a smooth, strongly convex function.
(ii) $B_\theta(f_1, f_2)$ exists for any $f_1, f_2 \in B$; $B_\theta(f, f) > 0$ iff $f \not\equiv 0$.
(iii) $\nabla q(\theta) = U(\theta)$, $\nabla^2 q(\theta) = \nabla U(\theta) = B_\theta(\Phi, \Phi)$.

(iv) $U$ is a regular mapping, and $B_\theta(f_1, f_2)$ is continuous in $\theta$ for fixed $f_1, f_2 \in B$.

Proof. The results follow by Künsch [14], together with Dobrushin and Nahapetian [3].

Theorem 6.2. For $\theta \in \Theta$, $f \in B$ it holds:
(i) $\int f \, d\hat P_{V_n} \to \int f \, dP_\theta$ for $n \to \infty$ a.s. $[P_\theta]$.
(ii) $|V_n|^{1/2} \big[ \int f \, d\hat P_{V_n} - \int f \, dP_\theta \big] \to N(0, B_\theta(f, f))$ for $n \to \infty$ in distribution $[P_\theta]$.
(iii) $|V_n|^{1/2} \big[ \int \Phi \, d\hat P_{V_n} - \int \Phi \, dP_\theta \big] - \ell_n(\theta) \to 0$ for $n \to \infty$ in probability $[P_\theta]$, where $\ell_n(\theta) = |V_n|^{-1/2} \nabla_\theta \log P_{\theta, V_n}(x_{V_n})$ is the (normalized) score function.
(iv) $\log \big[ P_{\theta + |V_n|^{-1/2}\alpha, V_n} / P_{\theta, V_n} \big] - |V_n|^{1/2} \big[ \int \Phi_\alpha \, d\hat P_{V_n} - \int \Phi_\alpha \, dP_\theta \big] + \tfrac12 B_\theta(\Phi_\alpha, \Phi_\alpha) \to 0$ for $n \to \infty$ in probability $[P_\theta]$.

Proof. While (i) is just the ergodic theorem, (ii) is the CLT due to Künsch [14]. For (iii) and (iv) (local asymptotic normality) see Janžura [10].

7. EFFICIENT ESTIMATOR

For a fixed local statistic $f \in B_W$, $W \in \mathcal V$, let us set
$$T_n = \int f \, dP_{\hat\theta_n} \quad \text{with} \quad \hat\theta_n = U^{-1}\Big( \int \Phi \, d\hat P_{V_n} \Big)$$
whenever $\int \Phi \, d\hat P_{V_n} \in U(\Theta)$. We claim that $T_n$ is an efficient estimator for $\int f \, dP_\theta$, $\theta \in \Theta$. Since $U$ is regular, and therefore $U(\Theta) \subset R^K$ is open, and $\int \Phi \, d\hat P_{V_n} \to \int \Phi \, dP_\theta$ a.s. $[P_\theta]$ by the ergodic theorem, we have $\hat\theta_n \to \theta$, and, consequently, $T_n$ is defined with probability tending to 1. For $\int \Phi \, d\hat P_{V_n} \notin U(\Theta)$ we may set $T_n$ arbitrarily. In fact, due to Proposition 6.1, we have
$$\hat\theta_n = \arg\min_{\theta \in R^K} \Big\{ q(\theta) - \Big\langle \theta, \int \Phi \, d\hat P_{V_n} \Big\rangle \Big\},$$
which is well defined whenever $\int \Phi \, d\hat P_{V_n} \in \bigcup_{\alpha \in R^K} \big\{ \int \Phi \, dP;\ P \in G_S(\Phi_\alpha) \big\}$, and we need the formal definition only outside that set.
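The inverse transform $U^{-1}$ rarely has a closed form, but in the simplest single-site (i.i.d.) case it does. The following hedged sketch, under that assumed model (all names invented), illustrates the argmin characterization of $\hat\theta_n$:

```python
import math

# Assumed i.i.d. single-site model with Phi^i(x) = delta(x, x^i):
# per site, q(theta) = log(1 + sum_j e^{theta_j}) and
# U(theta)_i = e^{theta_i} / (1 + sum_j e^{theta_j}).
def q(theta):
    return math.log(1.0 + sum(math.exp(t) for t in theta))

def U(theta):
    z = 1.0 + sum(math.exp(t) for t in theta)
    return [math.exp(t) / z for t in theta]

def U_inverse(m):
    """Closed-form inverse: theta_i = log(m_i / (1 - sum_j m_j))."""
    m0 = 1.0 - sum(m)
    return [math.log(mi / m0) for mi in m]

m_hat = [0.2, 0.5]            # stands in for the empirical moments int Phi d hat-P
theta_hat = U_inverse(m_hat)

# hat-theta minimizes q(theta) - <theta, m_hat>, so the gradient of q
# (which equals U by Proposition 6.1) must reproduce m_hat at hat-theta.
grad_matches = all(abs(a - b) < 1e-12 for a, b in zip(U(theta_hat), m_hat))
```

By strict convexity of $q$ the minimizer is unique, which is exactly why the moment-matching definition and the argmin definition coincide.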

Remark 7.1. For every $i = 1, \dots, K$ we have $\Phi^i \in B^b_{\mathcal A_i}$ with $A_i = \bigcup_{A \in \mathcal A_i} A$. Thus $\Phi_\alpha \in B_A$ with $A = \bigcup_{i=1}^K A_i$, and, in order to have $\int \Phi \, d\hat P_{V_n}$ well defined, we need the data to be observed from the region $V_n + A \supset V_n$.

Theorem 7.2. For every $\theta \in \Theta$ we have
$$|V_n|^{1/2} \Big[ T_n - \int f \, dP_\theta - \int g^\theta_f \, d\hat P_{V_n} \Big] \to 0 \quad \text{for } n \to \infty \text{ in probability } [P_\theta],$$
where $g^\theta_f = B_\theta(f, \Phi) \, B_\theta(\Phi, \Phi)^{-1} \big( \Phi - \int \Phi \, dP_\theta \big)$.

Proof. Since $T_n = \int f \, dP_{U^{-1}(\int \Phi \, d\hat P_{V_n})}$ and $\int f \, dP_\theta = \int f \, dP_{U^{-1}(\int \Phi \, dP_\theta)}$, we obtain by the first-order expansion, with some (random) $\varepsilon_n \in [0, 1]$:
$$T_n - \int f \, dP_\theta = B_{\tilde\theta_n}(f, \Phi) \, B_{\tilde\theta_n}(\Phi, \Phi)^{-1} \Big( \int \Phi \, d\hat P_{V_n} - \int \Phi \, dP_\theta \Big)$$
with $\tilde\theta_n = U^{-1}\big( \varepsilon_n \int \Phi \, d\hat P_{V_n} + (1 - \varepsilon_n) \int \Phi \, dP_\theta \big)$. Thus
$$|V_n|^{1/2} \Big[ T_n - \int f \, dP_\theta - \int g^\theta_f \, d\hat P_{V_n} \Big] = \big[ B_{\tilde\theta_n}(f, \Phi) B_{\tilde\theta_n}(\Phi, \Phi)^{-1} - B_\theta(f, \Phi) B_\theta(\Phi, \Phi)^{-1} \big] \, |V_n|^{1/2} \Big( \int \Phi \, d\hat P_{V_n} - \int \Phi \, dP_\theta \Big),$$
where, with the aid of the results of the preceding section, namely Proposition 4.1 and Theorem 6.2, the first term tends to zero a.s. and in probability $[P_\theta]$ due to the ergodic theorem and the continuity of $B$, while the second term tends to $N(0, B_\theta(\Phi, \Phi))$ in distribution by the CLT, which completes the proof.

Remark 7.3. We may observe
$$B_\theta(f - g^\theta_f, \Phi) = B_\theta(f, \Phi) - B_\theta(f, \Phi) B_\theta(\Phi, \Phi)^{-1} B_\theta(\Phi, \Phi) = 0.$$
Thus, $g^\theta_f$ is the canonical gradient, i.e. the projection of the local statistic $f$ onto the space $H$ equipped with the scalar product $B_\theta(\cdot, \cdot)$ (cf. Greenwood and Wefelmeyer [6] or Bickel et al. [1, Section 3.3]).
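The orthogonality in Remark 7.3 can be checked numerically. A sketch with assumed toy values for the inner products $B_\theta(\cdot,\cdot)$ (the matrices below are invented, standing in for quantities the paper leaves abstract):

```python
# Treat B_theta(.,.) as given numbers and verify that the coefficients
# c = B(f, Phi) B(Phi, Phi)^{-1} of the canonical gradient g_f make
# f - g_f orthogonal to every Phi^k: B(f - g_f, Phi^k) = 0.
B_phi_phi = [[2.0, 0.5],
             [0.5, 1.0]]   # B_theta(Phi^i, Phi^k) for K = 2 (assumed values)
B_f_phi = [1.0, 0.7]       # B_theta(f, Phi^k) (assumed values)

# Explicit 2x2 inverse of B(Phi, Phi).
det = B_phi_phi[0][0] * B_phi_phi[1][1] - B_phi_phi[0][1] * B_phi_phi[1][0]
inv = [[ B_phi_phi[1][1] / det, -B_phi_phi[0][1] / det],
       [-B_phi_phi[1][0] / det,  B_phi_phi[0][0] / det]]
c = [sum(B_f_phi[j] * inv[j][k] for j in range(2)) for k in range(2)]

# B(f - g_f, Phi^k) = B(f, Phi^k) - sum_i c_i B(Phi^i, Phi^k).
residual = [B_f_phi[k] - sum(c[i] * B_phi_phi[i][k] for i in range(2))
            for k in range(2)]
```

The residual vanishes identically, which is the defining property of an orthogonal projection onto $H$.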

Remark 7.4. Due to Theorem 6.2 (iii) we can obtain the same limit theorem with the efficient influence function $B_\theta(f, \Phi) B_\theta(\Phi, \Phi)^{-1} \ell_n(\theta)$ (as required, e.g., in Bickel et al. [1, Section 2.3]) instead of $\int g^\theta_f \, d\hat P_{V_n}$.

Before proving the efficiency of the estimator, let us introduce the relevant definitions.

Definition 7.5. The estimator $T_n$ is called regular if
$$|V_n|^{1/2} \Big( T_n - \int f \, dP_{\theta + |V_n|^{-1/2}\alpha} \Big) \to L(\theta) \quad \text{for } n \to \infty \text{ in distribution } \big[ P_{\theta + |V_n|^{-1/2}\alpha} \big],$$
where $L(\theta)$ is some limiting distribution.

Proposition 7.6. The estimator $T_n = \int f \, dP_{\hat\theta_n}$ with $\hat\theta_n = U^{-1}\big( \int \Phi \, d\hat P_{V_n} \big)$ is regular with $L(\theta) = N\big( 0, B_\theta(f, \Phi) B_\theta(\Phi, \Phi)^{-1} B_\theta(\Phi, f) \big)$.

Proof. For $\theta \in \Theta$ and $\alpha \in R^K$ let us denote
$$L_n = \log \big[ P_{\theta + |V_n|^{-1/2}\alpha, V_n} / P_{\theta, V_n} \big] \quad \text{and} \quad T^*_n = |V_n|^{1/2} \Big[ T_n - \int f \, dP_\theta \Big].$$
Then, due to Theorem 6.2 (iv) and Theorem 7.2, we obtain
$$(T^*_n, L_n) \to N_2(\mu, \Sigma) \quad \text{in distribution } [P_\theta],$$
where
$$\mu = \Big( 0,\ -\tfrac12 B_\theta(\Phi_\alpha, \Phi_\alpha) \Big) \quad \text{and} \quad \Sigma = \begin{pmatrix} B_\theta(g^\theta_f, g^\theta_f) & B_\theta(g^\theta_f, \Phi_\alpha) \\ B_\theta(\Phi_\alpha, g^\theta_f) & B_\theta(\Phi_\alpha, \Phi_\alpha) \end{pmatrix}.$$
And by the third LeCam lemma (see, e.g., Bickel et al. [1, Lemma A.9.3]) it also holds that
$$(T^*_n, L_n) \to N_2(\mu_1, \Sigma) \quad \text{in distribution } \big[ P_{\theta + |V_n|^{-1/2}\alpha, V_n} \big],$$
where
$$\mu_1 = \Big( B_\theta(\Phi_\alpha, g^\theta_f),\ \tfrac12 B_\theta(\Phi_\alpha, \Phi_\alpha) \Big).$$
Now, we observe $B_\theta(\Phi_\alpha, g^\theta_f) = B_\theta(\Phi_\alpha, f)$, and
$$|V_n|^{1/2} \Big( T_n - \int f \, dP_{\theta + |V_n|^{-1/2}\alpha} \Big) = T^*_n + |V_n|^{1/2} \Big[ \int f \, dP_\theta - \int f \, dP_{\theta + |V_n|^{-1/2}\alpha} \Big] = T^*_n - B_{\theta + \epsilon_n |V_n|^{-1/2}\alpha}(\Phi_\alpha, g^\theta_f) \to N\big( B_\theta(\Phi_\alpha, f), B_\theta(g^\theta_f, g^\theta_f) \big) - B_\theta(\Phi_\alpha, f) = N\big( 0, B_\theta(g^\theta_f, g^\theta_f) \big),$$
where the limit is in distribution $\big[ P_{\theta + |V_n|^{-1/2}\alpha, V_n} \big]$ and $\epsilon_n \in [0, 1]$. Finally, since $B_\theta(g^\theta_f, g^\theta_f) = B_\theta(f, \Phi) B_\theta(\Phi, \Phi)^{-1} B_\theta(\Phi, f)$, we have the claimed result.

Definition 7.7. A regular estimator $T_n$ will be called efficient if
$$|V_n|^{1/2} \Big( T_n - \int f \, dP_\theta \Big) \to N\big( 0, B_\theta(f, \Phi) B_\theta(\Phi, \Phi)^{-1} B_\theta(\Phi, f) \big) \quad \text{for } n \to \infty \text{ in distribution } [P_\theta]$$
(see also Definition 1 in Section 2.3 of Bickel et al. [1]).

Remark 7.8. The above definition is fully justified by the Hájek convolution theorem (cf., e.g., Theorem 2.3.1 in Bickel et al. [1]), which guarantees the maximum possible concentration of the estimate in the vicinity of the true value. In fact, the asymptotic linearity of the proposed estimator, as proved in Theorem 7.2, is crucial for both the regularity and the efficiency of the estimator (see also Proposition 7.6 in Greenwood and Wefelmeyer [6]).

Theorem 7.9. The estimator $T_n$ is efficient for $\int f \, dP_\theta$, $\theta \in \Theta$.

Proof. Directly from Theorem 7.2, Proposition 3.1, and the above definitions.

8. EXAMPLES

8.1. Efficiency of the empirical estimator

By the definition of the transform $U$ we have $\int \Phi \, dP_{\hat\theta_n} = \int \Phi \, d\hat P_{V_n}$ for every $\Phi \in \{\Phi^1, \dots, \Phi^K\}$ and, consequently, for every $\Phi \in H$. Therefore, whenever $f \in H$, the empirical estimator $\int f \, d\hat P_{V_n}$ is efficient, which fully agrees with the main theorem in Greenwood and Wefelmeyer [6].

But, moreover, suppose only $f \equiv \Phi^0$ for some $\Phi^0 \in H$. Then, by Section 3, we have $\int (f - \Phi^0) \, dP = c$ for every $P \in \mathcal P_S$. Moreover, since $B_\theta(f - \Phi^0, f - \Phi^0) = 0$, by the CLT (Theorem 6.2 (ii)) we now have
$$|V_n|^{1/2} \int (f - \Phi^0 - c) \, d\hat P_{V_n} \to 0 \quad \text{for } n \to \infty \text{ in probability } [P_\theta].$$
Then we may write
$$\int f \, d\hat P_{V_n} = \int (f - \Phi^0 - c) \, d\hat P_{V_n} + c + \int \Phi^0 \, d\hat P_{V_n} = \int (f - \Phi^0 - c) \, d\hat P_{V_n} + c + \int \Phi^0 \, dP_{\hat\theta_n} = \int (f - \Phi^0 - c) \, d\hat P_{V_n} + \int f \, dP_{\hat\theta_n},$$
and since $\int (f - \Phi^0 - c) \, d\hat P_{V_n} = o_P\big( |V_n|^{-1/2} \big)$, the empirical estimator $\int f \, d\hat P_{V_n}$ still remains efficient. The correction term occurs due to the non-stationarity of the empirical distribution $\hat P_{V_n}$. By a slight modification it could be made stationary, but we would lose the unbiasedness
$$\int \Big[ \int \Phi \, d\hat P_{V_n} \Big] \, dP_\theta = \int \Phi \, dP_\theta,$$
which is crucial for the limit theorems.
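The coincidence of the plug-in and empirical estimators for $f \in H$ can be observed numerically. A minimal sketch under an assumed i.i.d. model ($X_0 = \{0, 1, 2\}$, vacuum $b = 0$, $\Phi^i(x) = \delta(x, i)$; the data are illustrative):

```python
import math, random

# For f in H = Lin(Phi^1, Phi^2), the plug-in estimate int f dP_{hat-theta}
# equals the empirical estimate int f d hat-P exactly, because U^{-1}
# matches the model's moments of Phi^1, Phi^2 to the empirical ones.
random.seed(1)
data = [random.choice((0, 1, 2)) for _ in range(5000)]

m_hat = [data.count(1) / len(data), data.count(2) / len(data)]
theta_hat = [math.log(m / (1.0 - sum(m_hat))) for m in m_hat]  # U^{-1}(m_hat)

z = 1.0 + sum(math.exp(t) for t in theta_hat)
p_hat = [1.0 / z, math.exp(theta_hat[0]) / z, math.exp(theta_hat[1]) / z]

f = lambda v: 2.0 * (v == 1) + 3.0 * (v == 2)   # f = 2*Phi^1 + 3*Phi^2 in H
plug_in = sum(f(v) * p for v, p in zip((0, 1, 2), p_hat))
empirical = sum(f(v) for v in data) / len(data)
```

Here $p_{\hat\theta}(i)$ reproduces the empirical frequency of state $i$ exactly, so the two estimates agree up to rounding.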

Remark 8.1. Thanks to the above result we also directly obtain the efficiency of the parameter estimate $\hat\theta_n = U^{-1}\big( \int \Phi \, d\hat P_{V_n} \big)$, which is a regular transform of the efficient empirical estimate $\int \Phi \, d\hat P_{V_n}$. We may observe that the estimate $\hat\theta_n$ is secondary, derived from the primary estimate $\int \Phi \, d\hat P_{V_n}$. The definition of the estimate is natural, it corresponds to the relation between the theoretical terms, and it also agrees with the (approximate) maximum likelihood estimate (see, e.g., Janžura [12]). Anyhow, the efficiency study for the estimate $\hat\theta_n$ would hardly be possible without studying the efficiency of the estimate $\int \Phi \, d\hat P_{V_n}$.

8.2. von Mises statistic for i.i.d. variables

For $X_0 = \{x^0, \dots, x^K\}$ with $b = x^0$ we consider $\Phi^i \in B^b_{\{0\}}$ for $i = 1, \dots, K$, defined by $\Phi^i(x) = \delta(x, x^i)$ for $x \in X_0$. Then $P_\theta \in G(\Phi_\theta)$ is the product distribution with
$$P_{\theta, \{t\}}(x^i) = \frac{e^{\theta_i}}{1 + \sum_{j=1}^K e^{\theta_j}} \quad \text{for } i = 1, \dots, K, \qquad P_{\theta, \{t\}}(x^0) = \frac{1}{1 + \sum_{j=1}^K e^{\theta_j}},$$
for every $t \in Z^d$. With a given collection of data $x_{V_n} \in X_0^{V_n}$ we have
$$\int \Phi^i \, d\hat P_{V_n} = |V_n|^{-1} \sum_{t \in V_n} \delta(x^i, x_t) \quad \text{for } i = 1, \dots, K.$$
Providing $\int \Phi^i \, d\hat P_{V_n} > 0$ for $i = 1, \dots, K$ and $\sum_{i=1}^K \int \Phi^i \, d\hat P_{V_n} < 1$, we obtain
$$\hat\theta^n_i = \log \frac{\int \Phi^i \, d\hat P_{V_n}}{1 - \sum_{j=1}^K \int \Phi^j \, d\hat P_{V_n}} \quad \text{for } i = 1, \dots, K.$$
Now, for a local statistic $f \in B_\Lambda$, $\Lambda \in \mathcal V$, we obtain by Theorem 6.2 the efficient estimate of $\int f \, dP_\theta$ in the form
$$\int f \, dP_{\hat\theta_n} = \sum_{y_\Lambda \in X_0^\Lambda} f(y_\Lambda) \prod_{t \in \Lambda} \Big[ |V_n|^{-1} \sum_{l \in V_n} \delta(y_t, x_l) \Big] = |V_n|^{-|\Lambda|} \sum_{l : \Lambda \to V_n} f\big( (x_{l(t)})_{t \in \Lambda} \big),$$
which is nothing else but the von Mises statistic, i.e. the average over all data permutations.

8.3. Markov chains

Now, suppose $d = 1$, $X_0 = \{x^0, x^1, \dots, x^L\}$, $K = L(L+1)$, and
$$\Phi = \big\{ \Phi^i, \Phi^{ik};\ i, k = 1, \dots, L \big\},$$

where $\Phi^i(x_0) = \delta(x_0, x^i)$ and $\Phi^{ik}(x_0, x_1) = \delta(x_0, x^i)\,\delta(x_1, x^k)$ for $x_0, x_1 \in X_0$. Then, providing the empirical distribution is again positive, we have $P_{\hat\theta_n}$ given as the distribution of an ergodic Markov chain with the initial distribution
$$\hat P_{n, \{0\}}(x^i) = \int \Phi^i \, d\hat P_{V_n} \quad \text{for } i = 1, \dots, L,$$
and the transition probabilities
$$\hat P_{n, \{1 \mid 0\}}(x^j \mid x^i) = \frac{\int \Phi^{ij} \, d\hat P_{V_n}}{\int \Phi^i \, d\hat P_{V_n}} \quad \text{for } i, j = 1, \dots, L.$$
Thus, the result agrees with that of Greenwood and Wefelmeyer [5].

8.4. Reversible Markov chains

Let us suppose, moreover, that there is a finite group of transforms $\Gamma = \{\gamma : H \to H\}$, where again $H = \mathrm{Lin}(\Phi^i, \Phi^{ik};\ i, k = 1, \dots, L)$. Let us denote by $H^\Gamma = \{\Phi \in H;\ \gamma\Phi = \Phi \text{ for every } \gamma \in \Gamma\}$ the set of potentials invariant with respect to the group $\Gamma$. For all $P_\Phi$, $\Phi \in H^\Gamma$, suppose, in addition,
$$\int \Psi \, dP_\Phi = \int \gamma\Psi \, dP_\Phi \quad \text{for every } \gamma \in \Gamma,\ \Psi \in H.$$
(This is fulfilled, e.g., if $\gamma\Phi = \Phi \circ \tilde\gamma$ with some $\tilde\gamma : X \to X$.) Then the efficient estimator for $\int \Psi \, dP_\Phi$, $\Phi, \Psi \in H^\Gamma$, is given by
$$\frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} \int \gamma\Psi \, d\hat P_{V_n}.$$
Namely, we may write
$$\int \Psi \, dP_{\hat\Phi_n} = \frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} \int \gamma\Psi \, dP_{\hat\Phi_n} = \frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} \int \gamma\Psi \, d\hat P_{V_n}$$
since $\frac{1}{|\Gamma|} \sum_{\gamma \in \Gamma} \gamma\Psi \in H^\Gamma$. Here $\hat\Phi_n \in H^\Gamma$ is given by
$$\int \Phi \, dP_{\hat\Phi_n} = \int \Phi \, d\hat P_{V_n} \quad \text{for every } \Phi \in H^\Gamma.$$
Now, for the above Markov chain example, let us set $\Gamma = \{\gamma^0, \gamma^1\}$, where $\gamma^0$ is the identity, i.e. $\gamma^0\Phi = \Phi$ for every $\Phi \in H$, and $\gamma^1$ is the reversion: $\gamma^1\Phi(x_0) = \Phi(x_0)$ for $\Phi \in B_{\{0\}}$, and $\gamma^1\Phi(x_0, x_1) = \Phi(x_1, x_0)$ for $\Phi \in B_{\{0,1\}}$. Then all the assumptions are satisfied, and we obtain the efficient estimator for $f \in B_{\{0,1\}}$ in the form
$$\frac{1}{2(n-1)} \sum_{i=1}^{n-1} \big[ f(x_i, x_{i+1}) + f(x_{i+1}, x_i) \big],$$
just as in Greenwood and Wefelmeyer [5].
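The symmetrized estimator above can be sketched as follows (toy observed chain and an assumed local statistic $f \in B_{\{0,1\}}$; names invented):

```python
# Average f over both orientations of every observed transition, as in
# the display above: (1 / (2(n-1))) * sum_i [f(x_i, x_{i+1}) + f(x_{i+1}, x_i)].
def symmetrized_estimate(f, xs):
    """Symmetrized (reversibility-adapted) empirical estimate of int f dP."""
    pairs = list(zip(xs, xs[1:]))
    return sum(f(a, b) + f(b, a) for a, b in pairs) / (2.0 * len(pairs))

xs = [0, 1, 1, 0, 1, 0, 0, 1]                       # an observed chain (toy data)
f = lambda a, b: 1.0 if (a, b) == (0, 1) else 0.0   # indicator of a 0 -> 1 step

est = symmetrized_estimate(f, xs)
rev = symmetrized_estimate(f, xs[::-1])             # same value on the reversed chain
```

Symmetrization makes the estimate invariant under time reversal of the data, which is exactly the structure the group $\Gamma$ imposes.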

9. A NOTE ON THE PERFORMANCE

Unfortunately, for Gibbs random fields there is a lack of analytic formulas for calculating the expectations $\int f \, dP_\Phi$. The only feasible way consists of substituting the theoretical terms by their simulated counterparts. The problem occurs both for calculating the estimate $\hat\theta_n = U^{-1}\big( \int \Phi \, d\hat P_{V_n} \big)$ and for the final evaluation of the expectation $\int f \, dP_{\hat\theta_n}$. The estimate $\hat\theta_n$ can be calculated by an iterative procedure
$$\theta_n(k+1) = \theta_n(k) - C_k \Big[ \int \Phi \, dP_{\theta_n(k)} - \int \Phi \, d\hat P_{V_n} \Big],$$
where $C_k \to 0$ for $k \to \infty$, but for every actual $k$ the term $\int \Phi \, dP_{\theta_n(k)}$ must be substituted by $\int \Phi \, d\hat P^{x_M(k)}_M$, where $M \in \mathcal V$ is as large as possible and $x_M(k) \in X_0^M$ is simulated with the distribution $P_{\theta_n(k)}$, e.g., by a Markov Chain Monte Carlo procedure (cf., e.g., Younes [15]). At the stopping time $k^*$, the procedure returns both the estimate $\theta_n(k^*)$ and the simulated data $x_M(k^*)$. Thus, the expectation $\int f \, dP_{\hat\theta_n}$ is finally substituted by $\int f \, d\hat P^{x_M(k^*)}_M$.

The whole procedure remains optimal from the asymptotic point of view, but for fixed-size data $x_{V_n}$ it is an open question whether it really gives a better estimate than the empirical estimator $\int f \, d\hat P_{V_n}$. Apparently, if $f \in B_V$ with $V \in \mathcal V$ much larger in comparison with $A$, where $\Phi_\theta \in B_A$, or at least with a much larger diameter, our approach may be strongly recommended, since the empirical mean $\int f \, d\hat P^{x_M(k^*)}_M$ is an average of a much larger number of values than $\int f \, d\hat P^{x_{V_n}}_{V_n}$, and would very likely be more precise. And the whole approach can be understood as a bootstrap-like method.

ACKNOWLEDGEMENT

Supported by grant No. P402/12/G097.

(Received January 14, 2014)

REFERENCES

[1] P. J. Bickel, C. A. J. Klaassen, Y. Ritov, and J. A. Wellner: Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins University Press, Baltimore 1993.

[2] R. L. Dobrushin: Prescribing a system of random variables by conditional distributions. Theor. Probab. Appl. 15 (1970), 458–486.

[3] R. L. Dobrushin and B. S.
Nahapetian: Strong convexity of the pressure for lattice systems of classical physics (in Russian). Teoret. Mat. Fiz. 20 (1974), 223–234.

[4] H. O. Georgii: Gibbs Measures and Phase Transitions. De Gruyter Studies in Mathematics 9, De Gruyter, Berlin 1988.

[5] P. E. Greenwood and W. Wefelmeyer: Efficiency of empirical estimators for Markov chains. Ann. Statist. 23 (1995), 132–143.

[6] P. E. Greenwood and W. Wefelmeyer: Characterizing efficient empirical estimators for local interaction Gibbs fields. Stat. Inference Stoch. Process. 2 (1999), 119–134.

[7] K. Gross: Absence of second-order phase transitions in the Dobrushin uniqueness region. J. Statist. Phys. 25 (1981), 57–72.

[8] J. Hájek: A characterization of limiting distributions of regular estimates. Z. Wahrsch. Verw. Gebiete 14 (1970), 323–330.

[9] M. Janžura: Statistical analysis of Gibbs random fields. In: Trans. 10th Prague Conference on Inform. Theory, Stat. Dec. Functions, Random Processes, Prague 1984, pp. 429–438.

[10] M. Janžura: Local asymptotic normality for Gibbs random fields. In: Proc. Fourth Prague Symposium on Asymptotic Statistics (P. Mandl and M. Hušková, eds.), Charles University, Prague 1989, pp. 275–284.

[11] M. Janžura: Asymptotic behaviour of the error probabilities in the pseudo-likelihood ratio test for Gibbs–Markov distributions. In: Proc. Asymptotic Statistics (P. Mandl and M. Hušková, eds.), Physica Verlag, Heidelberg 1994, pp. 285–296.

[12] M. Janžura: Asymptotic results in parameter estimation for Gibbs random fields. Kybernetika 33 (1997), 2, 133–159.

[13] M. Janžura: On the concept of the asymptotic Rényi distances for random fields. Kybernetika 35 (1999), 3, 353–366.

[14] H. Künsch: Decay of correlations under Dobrushin's uniqueness condition and its applications. Comm. Math. Phys. 84 (1982), 207–222.

[15] L. Younes: Parameter inference for imperfectly observed Gibbsian fields. Probab. Theory Rel. Fields 82 (1989), 625–645.

Martin Janžura, Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 4, 182 08 Praha 8, Czech Republic.
e-mail: janzura@utia.cas.cz