These reference chapters have been taken from Volume I, and use the same chapter and section numbering as the printed version.
The Descriptive Statistics entry on the Model menu involves the formal calculation of statistics on database variables. Model-related statistics are considered in Chapters 17 and 18. This chapter provides the formulae underlying the computations. PcGive will use the largest available sample by default, here denoted by t=1,...,T. It is always possible to graph or compute the statistics over a shorter sample period.
This reports sample means and standard deviations of the selected variables:
$$\bar{x}=\frac{1}{T}\sum_{t=1}^{T}x_{t},\qquad s=\left(\frac{1}{T-1}\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)^{2}\right)^{1/2}.$$
The correlation coefficient rxy between x and y is:
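$$r_{xy}=\frac{\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)\left(y_{t}-\bar{y}\right)}{\left(\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)^{2}\sum_{t=1}^{T}\left(y_{t}-\bar{y}\right)^{2}\right)^{1/2}}.$$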
The correlation matrix of the selected variables is reported as a symmetric matrix with the diagonal equal to one. Each cell records the simple correlation between the two relevant variables. The same sample is used for each variable; observations with missing values are dropped.
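The calculations in this section can be sketched in a few lines of Python. This is a minimal illustration, not PcGive's code: the function names are hypothetical, the standard deviation is assumed to use the divisor T-1, and missing values are represented by `None`.

```python
import math

def mean(xs):
    """Sample mean over the supplied observations."""
    return sum(xs) / len(xs)

def std_dev(xs):
    """Sample standard deviation; the divisor T-1 is an assumption."""
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / (len(xs) - 1))

def correlation(xs, ys):
    """Simple correlation on the common sample: pairs with a missing value are dropped."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    mx = mean([p[0] for p in pairs])
    my = mean([p[1] for p in pairs])
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

x = [1.0, 2.0, 3.0, None, 5.0]   # one missing observation
y = [2.0, 4.1, 5.9, 8.0, 10.2]
r_xy = correlation(x, y)          # uses the four complete pairs only
print(round(r_xy, 4))
```

Dropping incomplete pairs before computing any moment is what keeps the sample identical for both variables, so the diagonal of the resulting matrix is exactly one.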
This is the test statistic described in §18.4.4, which amounts to testing whether the skewness and kurtosis of the variable correspond to those of a normal distribution. Missing values are dropped from each variable, so the sample size may differ between variables.
This prints the sample autocorrelation function of the selected variables, as described in §18.4.2. The same sample is used for each variable; observations with missing values are dropped.
A crucial property of any economic variable influencing the behaviour of statistics in econometric models is the extent to which that variable is stationary. If the autoregressive description
|
|
has a root on the unit circle, then conventional distributional results are not applicable to coefficient estimates. As the simplest example, consider:
| xt=α+βxt-1+εt where β=1 and εt~IN( 0,σε 2) , |
which generates a random walk (with drift if α≠0). Here, the autoregressive coefficient is unity and stationarity is violated. A process with no unit or explosive roots is said to be I(0); a process is I( d) if it needs to be differenced d times to become I(0) and is not I(0) if only differenced d-1 times. Many economic time series behave like I(1), though some appear to be I(0) and others I(2).
The Durbin--Watson statistic for the level of a variable offers one simple characterization of this integrated property:
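$$\mathrm{DW}(x)=\frac{\sum_{t=2}^{T}\left(x_{t}-x_{t-1}\right)^{2}}{\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)^{2}}. \tag{16.3}$$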
If xt is a random walk, DW will be very small. If xt is white noise, DW will be around 2. Very low DW values thus indicate that a transformed model may be desirable, perhaps including a mixture of differenced and disequilibrium variables.
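The contrast between a random walk and white noise can be checked by simulation. The sketch below assumes the usual form of the statistic for a level: the sum of squared first differences divided by the sum of squared deviations from the mean. The function name and the simulated series are illustrative only.

```python
import random

def durbin_watson(x):
    """DW for the level of a series (assumed form: squared differences over squared deviations)."""
    xbar = sum(x) / len(x)
    num = sum((x[t] - x[t - 1]) ** 2 for t in range(1, len(x)))
    den = sum((v - xbar) ** 2 for v in x)
    return num / den

rng = random.Random(0)
T = 500
noise = [rng.gauss(0.0, 1.0) for _ in range(T)]   # white noise

walk = []                                          # random walk: cumulated noise
level = 0.0
for e in noise:
    level += e
    walk.append(level)

dw_noise = durbin_watson(noise)   # expected to be close to 2
dw_walk = durbin_watson(walk)     # expected to be close to 0
print(round(dw_noise, 3), round(dw_walk, 3))
```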
An augmented Dickey--Fuller (ADF) test for I(1) against I(0) (see Dickey and Fuller, 1981) is provided by the t-statistic on β̂ in:
$$\Delta x_{t}=\alpha+\mu t+\beta x_{t-1}+\sum_{i=1}^{n}\gamma_{i}\Delta x_{t-i}+u_{t}. \tag{16.4}$$
The constant or trend can optionally be excluded from (eq:16.4); the specification of the lag length n assumes that ut is white noise. The null hypothesis is H0: β=0; rejection of this hypothesis implies that xt is I(0). A failure to reject implies that Δxt is stationary, so xt is I(1). This is a second useful description of the degree of integratedness of xt. The Dickey--Fuller (DF) test has no lagged first differences on the right-hand side ( n=0) . On this topic, see the Oxford Bulletin of Economics and Statistics (Hendry, 1986a, Banerjee and Hendry, 1992a), and Banerjee, Dolado, Galbraith and Hendry (1993). To test whether xt is I(1), commence with the next higher difference:
|
|
Output of the ADF(n) test of (eq:16.4) consists of:
| coefficients | α̂ and μ̂ (if included), β̂, γ̂1,...,γ̂n, |
| standard errors | SE(α̂), SE(μ̂), SE(β̂), SE(γ̂i), |
| t-values | tα , tμ , tβ , tγi, |
| σ̂ | as (eq:17.10), |
| DW | (eq:16.3) applied to ût, |
| DW(x) | (eq:16.3) applied to xt, |
| ADF(x) | tβ , |
| Critical values | |
| RSS | as (eq:17.11). |
Most of the formulae for the computed statistics are more conveniently presented in the next section on simple dynamic regressions, but the t-statistic is defined (e.g., for α̂) as tα = α̂/SE(α̂), using the formula in (eq:17.5). Critical values are derived from the response surfaces in MacKinnon (1991), and depend on whether a constant, or constant and trend, are included (seasonals are ignored). Under the null (β=0), α≠0 entails a trend in {xt} and μ≠0 implies a quadratic trend. However, under the stationary alternative, α=0 would impose a zero trend. Thus the test ceases to be similar if the polynomial in time (1, t, t² etc.) in the model is not at least as large as that in the data generating process (see, for example, Kiviet and Phillips, 1992). This problem suggests allowing for a trend in the model unless the data are anticipated to have a zero mean in differences. The so-called Engle-Granger two-step method amounts to applying the ADF test to residuals from a prior static regression (the first step). The response surfaces need to be adjusted for the number of variables involved in the first step: see MacKinnon (1991).
The default of PcGive is to report a summary test output for the sequence of ADF(n)...ADF(0) tests. The summary table lists, for j=n,...,0:
| D-lag | j (the number of lagged differences), |
| t-adf | the t-value on the lagged level: tβ , |
| beta Y_1 | the coefficient on the lagged level: β, |
| σ̂ | as (eq:17.10), |
| t-DY_lag | t-value of the longest lag: tγj, |
| t-prob | significance of the longest lag: 1-P( | τ| ≤| tγj| ) , |
| AIC | Akaike criterion, see §17.2.12 |
| F-prob | significance level of the F-test on the lags dropped up to that point, |
Critical values are given, and significance of the ADF test is marked by asterisks: * indicates significance at 5%, ** at 1%.
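As an illustration of the simplest case, the DF test (n=0) with a constant and no trend reduces to a bivariate regression of Δxt on a constant and xt-1. The sketch below computes the t-value on the lagged level for a simulated stationary series. The function name and the simulated process are assumptions for the example, and no critical values are computed: those require the MacKinnon response surfaces.

```python
import math
import random

def df_t_statistic(x):
    """t-value on beta in: dx_t = alpha + beta * x_{t-1} + u_t (DF test, constant, no trend)."""
    dx = [x[t] - x[t - 1] for t in range(1, len(x))]
    lag = x[:-1]
    n = len(dx)
    m_lag = sum(lag) / n
    m_dx = sum(dx) / n
    sxx = sum((v - m_lag) ** 2 for v in lag)
    sxy = sum((v - m_lag) * (d - m_dx) for v, d in zip(lag, dx))
    beta = sxy / sxx
    alpha = m_dx - beta * m_lag
    rss = sum((d - alpha - beta * v) ** 2 for v, d in zip(lag, dx))
    se_beta = math.sqrt(rss / (n - 2) / sxx)   # two estimated parameters
    return beta / se_beta

# Stationary AR(1) with coefficient 0.5, so beta = 0.5 - 1 = -0.5 in the DF regression.
rng = random.Random(1)
x = [0.0]
for _ in range(500):
    x.append(0.5 * x[-1] + rng.gauss(0.0, 1.0))

t_beta = df_t_statistic(x)   # strongly negative for a stationary series
print(round(t_beta, 2))
```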
Principal components analysis (PCA) amounts to an eigenvalue analysis of the correlation matrix. Because the correlation matrix has ones on the diagonal, its trace equals k when k variables are involved. Therefore, the sum of the eigenvalues also equals k. Moreover, all eigenvalues are non-negative.
The eigenvalue decomposition of the k ×k correlation matrix C is:
| C=HΛ H', |
where Λ is the diagonal matrix with the ordered eigenvalues λ1 ≥...≥λk ≥0 on the diagonal, and H=(h1, ..., hk) the matrix with the corresponding eigenvectors in the columns, H'H=Ik. The matrix of eigenvectors diagonalizes the correlation matrix:
| H' C H =Λ. |
Let (x1, ..., xk) denote the variables selected for principal components analysis (a T ×k matrix), and Z=(z1, ..., zk) the standardized data (i.e. in deviation from their mean, and scaled by the standard deviation). Then Z'Z/T = C. The jth principal component is defined as:
| pj = Zhj = z1 h1j+...+zk hkj, |
and accounts for 100 λj/k % of the variation. The largest m principal components together account for 100∑j=1m λj/k % of the variation.
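For two variables the eigenvalue analysis has a closed form, which makes the properties above easy to verify by hand. The sketch below is illustrative only (the function name is hypothetical): with correlation r, the eigenvalues of C are 1+r and 1-r, they sum to k=2, and the shares of variation follow directly.

```python
import math

def pca_two_variables(r):
    """Closed-form eigen-analysis of the 2 x 2 correlation matrix C = [[1, r], [r, 1]]."""
    s = 1.0 / math.sqrt(2.0)
    h_plus = (s, s)     # eigenvector belonging to eigenvalue 1 + r
    h_minus = (s, -s)   # eigenvector belonging to eigenvalue 1 - r
    if r >= 0:
        eigenvalues, eigenvectors = [1 + r, 1 - r], [h_plus, h_minus]
    else:
        eigenvalues, eigenvectors = [1 - r, 1 + r], [h_minus, h_plus]
    k = 2
    shares = [100.0 * lam / k for lam in eigenvalues]   # percent of variation
    return eigenvalues, eigenvectors, shares

eigenvalues, eigenvectors, shares = pca_two_variables(0.6)
print(eigenvalues, shares)
```

With r=0.6 the first principal component is proportional to the sum of the two standardized variables and accounts for about 80% of the variation.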
Principal components analysis is used to capture the variability of the data in a small number of factors. Using the correlation matrix enforces a common scale on the data (analysis in terms of the variance matrix is not invariant to scaling). Some examples of the use of PCA in financial applications are given in Alexander (2001, Ch.6).
PCA is sometimes used to reconstruct missing data on y in combination with data condensation. Assume that T observations are available on y, but T+H on the remaining data, then two methods could be considered:
|
More recently, PCA has become a popular tool for forecasting.
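Define the sample autocovariances {ĉj} of a stationary series xt, t=1,...,T:
$$\hat{c}_{j}=\frac{1}{T}\sum_{t=j+1}^{T}\left(x_{t}-\bar{x}\right)\left(x_{t-j}-\bar{x}\right),\qquad j=0,\ldots,T-1,$$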
using the full sample mean x= 1/T ∑t=1Txt. The variance σ̂2x corresponds to ĉ0.
The autocorrelation function (ACF) plots the series {r̂j} where r̂j is the sample correlation coefficient between xt and xt-j. The length of the ACF is specified by the user, leading to a figure which shows ( r̂1,r̂2,...,r̂s) plotted against ( 1,2,...,s) where for any j when x is any chosen variable:
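$$\hat{r}_{j}=\frac{\hat{c}_{j}}{\hat{c}_{0}}=\frac{\sum_{t=j+1}^{T}\left(x_{t}-\bar{x}\right)\left(x_{t-j}-\bar{x}\right)}{\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)^{2}}. \tag{16.7}$$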
The first autocorrelation, {r̂0}, is equal to one, and omitted from the graphs.
The asymptotic variance of the autocorrelations is 1/T, so approximate 95% error bars are indicated at ±2/√T (see e.g. Harvey, 1993, p.42).
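A minimal sketch of the ACF calculation follows, assuming autocovariances based on the full-sample mean with divisor T, as described above. The function name and the test series are illustrative.

```python
import math

def acf(x, s):
    """Sample autocorrelations r_1..r_s using the full-sample mean and divisor T."""
    T = len(x)
    xbar = sum(x) / T
    c0 = sum((v - xbar) ** 2 for v in x) / T
    result = []
    for j in range(1, s + 1):
        cj = sum((x[t] - xbar) * (x[t - j] - xbar) for t in range(j, T)) / T
        result.append(cj / c0)
    return result

x = [float(t % 4) for t in range(40)]   # deterministic series with period 4
r = acf(x, 4)
band = 2.0 / math.sqrt(len(x))          # approximate 95% error bar
print([round(v, 3) for v in r], round(band, 3))
```

For this series the lag-4 autocorrelation is 0.9 rather than 1: the sum in the numerator runs over T-j terms while the denominator uses all T, which is the sense in which higher-order autocorrelations rest on fewer observations.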
If a series is non-stationary, the usual definition of a correlation between successive lags is required: see Nielsen (2006a). This comment also applies to the partial autocorrelation function described in the next section.
Given the sample autocorrelation function {r̂j}, the partial autocorrelations are computed using Durbin's method as described in Golub and Van Loan (1989, §4.7.2). This corresponds to recursively solving the Yule--Walker equations. For example, with autocorrelations, r̂0, r̂1, r̂2, ..., the first partial correlation is α̂0=1 (omitted from the graphs). The second, α̂1, is the solution from
| ( |
| ) = ( |
| ) ( |
| ), |
et cetera.
The periodogram is defined as:
|
|
Note that p(0)=0.
When the periodogram is plotted, only frequencies greater than zero and up to π are used. Moreover, the x-axis, with values 0,...,π, is represented as 0,...,1. So, when T=4 the x coordinates are 0.5,1 corresponding to π/2, π. When T=5, the x coordinates are 0.4,0.8 corresponding to 2π/5, 4π/5.
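The two examples are consistent with evaluating the periodogram at the frequencies 2πj/T for j=1,...,⌊T/2⌋ and dividing by π for the axis. That reading is an inference from the examples rather than a documented rule, and the function name is hypothetical:

```python
def periodogram_x_coordinates(T):
    """Axis positions 2j/T for j = 1..floor(T/2), i.e. the frequencies 2*pi*j/T in units of pi."""
    return [2 * j / T for j in range(1, T // 2 + 1)]

print(periodogram_x_coordinates(4))
print(periodogram_x_coordinates(5))
```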
The estimated spectral density is a smoothed function of the sample autocorrelations {r̂j}, defined as in (eq:16.7). The sample spectral density is then defined as:
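$$\hat{s}(\omega)=\frac{1}{2\pi}\sum_{j=-(T-1)}^{T-1}K(j)\,\hat{r}_{|j|}\cos(j\omega),\qquad 0\leq\omega\leq\pi,$$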
where | .| takes the absolute value, so that, for example, r̂| -1| =r̂1. The K( .) function is called the lag window. OxMetrics uses the Parzen window:
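$$K(j)=\begin{cases}1-6\left(\dfrac{j}{m}\right)^{2}+6\left|\dfrac{j}{m}\right|^{3}, & \left|\dfrac{j}{m}\right|\leq 0.5,\\[2ex] 2\left(1-\left|\dfrac{j}{m}\right|\right)^{3}, & 0.5\leq\left|\dfrac{j}{m}\right|\leq 1,\\[2ex] 0, & \left|\dfrac{j}{m}\right|>1.\end{cases}$$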
We have that K(-j)=K(j), so that the sign of j does not matter (cos(x)=cos(-x)). The r̂js are based on fewer observations as j increases. The window function attaches decreasing weights to the autocorrelations, with zero weight for j>m. The parameter m is called the lag truncation parameter. In OxMetrics, this is taken to be the same as the chosen length of the correlogram. For example, selecting s=12 (the length setting in the dialog) results in m=12. The larger m, the less smooth the spectrum becomes, but the lower the bias. The spectrum is evaluated at 128 points between 0 and π. For more information see Priestley (1981) and Granger and Newbold (1986).
Consider a data set {xt}=( x1...xT) of observations on a random variable X. The range of {xt} is divided into N intervals of length h, with h defined below. The proportion of xt in each interval then constitutes the histogram; the sum of the proportions is unity on the scaling that is used. The density can be estimated as a smoothed function of the histogram using a normal or Gaussian kernel. This can then be summed (`integrated') to obtain the estimated cumulative distribution function (CDF).
Denote the actual density of X at x by fx( x) . A non-parametric estimate of the density is obtained from the sample by:
$$\hat{f}_{x}(x)=\frac{1}{Th}\sum_{t=1}^{T}K\left(\frac{x-x_{t}}{h}\right),$$
where h is the window width or smoothing parameter, and K( .) is a kernel such that:
| ∫-∞∞ K( z) dz=1. |
PcGive sets:
$$h=1.06\,\hat{\sigma}_{x}/T^{0.2}$$
as a default, and uses the standard normal density for K( .) :
$$K(z)=\frac{1}{\sqrt{2\pi}}\exp\left(-\tfrac{1}{2}z^{2}\right).$$
The estimate f̂x(x) is usually calculated for 128 values of x, using a fast Fourier transform. An excellent reference on density function estimation is Silverman (1986).
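The estimator can be sketched directly from its definition. The code below evaluates the kernel sum by brute force at each grid point rather than with a fast Fourier transform, so it illustrates the formula but not the computational method. The function names, the data, and the T-1 divisor inside the standard deviation are assumptions.

```python
import math

def gaussian_kernel(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def default_bandwidth(x):
    """h = 1.06 * sigma / T^0.2 (sigma computed with divisor T-1, an assumption)."""
    T = len(x)
    xbar = sum(x) / T
    sigma = math.sqrt(sum((v - xbar) ** 2 for v in x) / (T - 1))
    return 1.06 * sigma / T ** 0.2

def kernel_density(x, grid, h):
    """Density estimate at each grid point: (1 / (T h)) * sum of K((g - x_t) / h)."""
    T = len(x)
    return [sum(gaussian_kernel((g - v) / h) for v in x) / (T * h) for g in grid]

data = [-1.2, -0.5, 0.0, 0.3, 0.4, 1.1, 1.9, 2.5]   # illustrative observations
h = default_bandwidth(data)
n_points = 128
lo, hi = min(data) - 4 * h, max(data) + 4 * h
step = (hi - lo) / (n_points - 1)
grid = [lo + i * step for i in range(n_points)]
density = kernel_density(data, grid, h)
area = sum(density) * step   # summing ('integrating') the density should give about 1
print(round(h, 4), round(area, 4))
```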
The variable in a QQ plot would normally hold critical values which are hypothesized to come from a certain distribution. The QQ plot function then draws a cross plot of these observed values (sorted), against the theoretical quantiles. The 45° line is drawn for reference (the closer the cross plot to this line, the better the match).
The normal QQ plot includes the pointwise asymptotic 95% standard error bands, as derived in Engler and Nielsen (2009) for residuals of regression models (possibly autoregressive) with an intercept.
Single equation estimation is allowed by:
| OLS-CS | ordinary least squares (cross-section modelling) |
| IVE-CS | instrumental variables estimation (cross-section modelling) |
| OLS | ordinary least squares |
| IVE | instrumental variables estimation |
| RALS | rth order autoregressive least squares |
| NLS | non-linear least squares |
| ML | maximum likelihood estimation |
Once a model has been specified, a sample period selected, and an estimation method chosen, the equation can be estimated. OLS-CS/IVE-CS and OLS/IVE only differ in the way the sample period is selected. In the first, cross section, case, all observations with missing values are omitted. Therefore, `holes' in the database are simply skipped. In cross-section mode it is also possible to specify a variable Sel by which to select the sample. In that case, observations where Sel has a 0 or missing values are omitted from the estimation sample (but, if data is available, included in the prediction set). In dynamic regression, the observations must be consecutive in time, and the maximum available sample is the leading contiguous sample. The following table illustrates the default sample when regressing y on a constant:
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
For ease of notation, the sample period is denoted t=1,...,T+H, after allowing for any lagged variables created where H is the forecast horizon. The data used for estimation are X=( x1...xT) . The H retained observations XH=( xT+1...xT+H) are used for static (1-step) forecasting and evaluating parameter constancy.
This chapter discusses the statistics reported by PcGive following model estimation. The next chapter presents the wide range of evaluation tools available following successful estimation. Sections marked with * denote information that can be shown or omitted on request. In the remainder there is no distinction between OLS/IVE and OLS-CS/IVE-CS.
In most cases, recursive estimation is available:
| RLS | recursive OLS |
| RIVE | recursive IVE |
| RNLS | recursive NLS |
| RML | recursive ML |
Recursive OLS and IV estimation methods are initialized by a direct estimation over t=1,...,M-1, followed by recursive estimation over t=M,...,T. RLS and RIVE update inverse moment matrices. This is inherently somewhat numerically unstable but, because recursive estimation is primarily a graphical tool, this is not so important.
Recursive estimation of non-linear models is achieved by the brute-force method: first estimate for the full sample, then shrink the sample by one observation at a time. At each step the estimated parameters of the previous step are used as starting values, resulting in a considerably faster algorithm.
The final estimation results are always based on direct full-sample estimation, so unaffected whether recursive or non-recursive estimation is used. The recursive output can be plotted from the recursive graphics dialog.
The algebra of OLS estimation is well established from previous chapters. The model is:
| yt=β'xt+ut, with ut~IN( 0,σ2) t=1,...,T, |
|
|
The vectors β and xt are k×1. The OLS estimates of β are:
|
|
|
|
and estimated residual variance
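$$\hat{\sigma}_{u}^{2}=\frac{1}{T-k}\sum_{t=1}^{T}\hat{u}_{t}^{2},\qquad \hat{u}_{t}=y_{t}-x_{t}'\hat{\beta}. \tag{17.4}$$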
Forecast statistics are provided for the H retained observations (only if H≠0). For OLS, these are comprehensive 1-step ahead forecasts and tests, described below.
The estimation output is presented in columnar format, where each row lists information pertaining to each variable (its coefficient, standard error, t-value, etc.). Optionally, the estimation results can be printed in equation format, which is of the form coefficient × variable with standard errors in parentheses underneath.
The first column of these results records the names of the variables and the second, the estimated regression coefficients β̂=( X'X) -1X'y. PcGive does not actually use this expression to estimate β̂. Instead it uses the QR decomposition with partial pivoting, which analytically gives the same result, but in practice is a bit more reliable (i.e. numerically more stable). The QR decomposition of X is X=QR, where Q is T ×T and orthogonal (that is, Q'Q=I), and R is T ×k and upper triangular. Then X'X=R'R.
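The idea can be illustrated with a small pure-Python sketch. It is not PcGive's algorithm: it uses modified Gram-Schmidt without pivoting and the thin form of the decomposition (Q is T×k, R is k×k), and the function names are hypothetical. Given X=QR, the coefficients solve the triangular system Rβ̂=Q'y, so X'X is never formed or inverted.

```python
import math

def thin_qr(X):
    """Thin QR of a T x k matrix (list of rows) by modified Gram-Schmidt, no pivoting."""
    T, k = len(X), len(X[0])
    Q = [[X[t][j] for t in range(T)] for j in range(k)]   # stored as k columns
    R = [[0.0] * k for _ in range(k)]
    for j in range(k):
        for i in range(j):
            R[i][j] = sum(Q[i][t] * Q[j][t] for t in range(T))
            Q[j] = [Q[j][t] - R[i][j] * Q[i][t] for t in range(T)]
        R[j][j] = math.sqrt(sum(v * v for v in Q[j]))
        Q[j] = [v / R[j][j] for v in Q[j]]
    return Q, R

def ols_via_qr(X, y):
    """Solve R beta = Q'y by back-substitution."""
    Q, R = thin_qr(X)
    k = len(R)
    qty = [sum(Q[j][t] * y[t] for t in range(len(y))) for j in range(k)]
    beta = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(R[i][j] * beta[j] for j in range(i + 1, k))
        beta[i] = (qty[i] - tail) / R[i][i]
    return beta

X = [[1.0, float(t)] for t in range(1, 7)]   # constant and a trend
y = [1.0 + 2.0 * row[1] for row in X]        # exact linear relation
beta_hat = ols_via_qr(X, y)
print([round(b, 6) for b in beta_hat])
```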
The following five columns give further information about each of the magnitudes described below in §17.2.2 to §17.2.11.
These are obtained from the variance-covariance matrix:
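$$\mathsf{SE}\left[\hat{\beta}_{i}\right]=\sqrt{\widehat{\mathsf{V}}\left[\hat{\beta}_{i}\right]}=\hat{\sigma}_{u}\sqrt{d_{ii}}, \tag{17.5}$$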
where dii is the ith diagonal element of ( X'X) -1 and σ̂u is the standard error of the regression, defined in (eq:17.4).
These statistics are conventionally calculated to determine whether individual coefficients are significantly different from zero:
$$t_{i}=\frac{\hat{\beta}_{i}}{\mathsf{SE}\left[\hat{\beta}_{i}\right]},$$
where the null hypothesis H0 is βi=0. The null hypothesis is rejected if the probability of getting a t-value at least as large is less than 5% (or any other chosen significance level). This probability is given as:
$$\text{t-prob}=1-P\left(|\tau|\leq|t_{i}|\right),$$
in which τ has a Student t-distribution with T-k degrees of freedom. The t-probabilities do not appear when all other options are switched on.
When H0 is true (and the model is otherwise correctly specified in a stationary process), a Student t-distribution is used since the sample size is often small, and we only have an estimate of the parameter's standard error: however, as the sample size increases, τ tends to a standard normal distribution under H0. Large t-values reject H0; but, in many situations, H0 may be of little interest to test. Also, selecting variables in a model according to their t-values implies that the usual (Neyman--Pearson) justification for testing is not valid (see, for example, Judge, Griffiths, Hill, Lütkepohl and Lee, 1985).
The final column lists the squared partial correlations under the header Part.R^2. The jth entry in this column records the correlation of the jth explanatory variable with the dependent variable, given the other k-1 variables. Adding further explanatory variables to the model may either increase or lower the squared partial correlation, and the former may occur even if the added variables are correlated with the already included variables. If the squared partial correlations fall on adding a variable, then that is suggestive of collinearity for the given equation parametrization: that is, the new variable is a substitute for, rather than a complement to, those already included.
Beneath the columnar presentation an array of summary statistics is also provided as follows:
The residual variance is defined as:
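$$\hat{\sigma}_{u}^{2}=\frac{1}{T-k}\sum_{t=1}^{T}\hat{u}_{t}^{2}, \tag{17.10}$$
where the residuals are defined as:
$$\hat{u}_{t}=y_{t}-\hat{y}_{t}=y_{t}-x_{t}'\hat{\beta},\qquad t=1,\ldots,T.$$
The equation standard error (ESE) is the square root of (eq:17.10):
$$\hat{\sigma}_{u}=\left(\frac{1}{T-k}\sum_{t=1}^{T}\hat{u}_{t}^{2}\right)^{1/2}.$$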
This is labelled sigma in the regression output.
|
|
The variation in the dependent variable, or the total sum of squares (TSS), can be broken up into two parts: the explained sum of squares (ESS) and the residual sum of squares (RSS). In symbols, TSS=ESS+RSS, or:
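$$\sum_{t=1}^{T}\left(y_{t}-\bar{y}\right)^{2}=\sum_{t=1}^{T}\left(\hat{y}_{t}-\bar{y}\right)^{2}+\sum_{t=1}^{T}\hat{u}_{t}^{2},$$
and hence:
$$R^{2}=\frac{\mathrm{ESS}}{\mathrm{TSS}}=\frac{\sum_{t=1}^{T}\left(\hat{y}_{t}-\bar{y}\right)^{2}}{\sum_{t=1}^{T}\left(y_{t}-\bar{y}\right)^{2}}=1-\frac{\sum_{t=1}^{T}\hat{u}_{t}^{2}}{\sum_{t=1}^{T}\left(y_{t}-\bar{y}\right)^{2}}=1-\frac{\mathrm{RSS}}{\mathrm{TSS}},$$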
assuming a constant is included. Thus, R2 is the proportion of the variance of the dependent variable which is explained by the variables in the regression. By adding more variables to a regression, R2 will never decrease, and it may increase even if nonsense variables are added. Hence, R2 may be misleading. Also, R2 is dependent on the choice of transformation of the dependent variable (for example, y versus Δy) -- as is the F-statistic below. The equation standard error, σ̂u, however, provides a better comparative statistic because it is adjusted by the degrees of freedom. Generally, σ̂ can be standardized as a percentage of the mean of the original level of the dependent variable (except if the initial mean is zero) for comparisons across specifications. Since many economic magnitudes are inherently positive, that standardization is often feasible. If y is in logs, 100σ̂ is the percentage standard error.
R2 is not reported if the regression does not have an intercept.
The formula was already given:
$$F=\frac{R^{2}/(k-1)}{\left(1-R^{2}\right)/(T-k)}\sim\mathsf{F}(k-1,T-k).$$
Here, the null hypothesis is that the population R2 is zero, or that all the regression coefficients are zero (excluding the intercept). The value for the F-statistic is followed by its probability value between square brackets.
The adjusted R2 incorporates a penalty for the number of regressors:
$$\bar{R}^{2}=R^{2}-\frac{k-1}{T-k}\left(1-R^{2}\right),$$
assuming a constant is included. The adjusted R-squared can go down when the number of variables increases. Nonetheless, there is no rationale to use it as a model selection criterion.
An alternative way to express it uses (eq:17.8) and (eq:17.13):
$$\bar{R}^{2}=1-\frac{\hat{\sigma}_{u}^{2}}{\hat{\sigma}_{y}^{2}},$$
where σ̂2y is the variance of the dependent variable computed with T-1 in the denominator, so maximizing the adjusted R2 corresponds to minimizing σ̂2u.
The adjusted R2 is not reported if the regression does not have an intercept.
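The two expressions for the adjusted R2 can be checked numerically. In the sketch below the fitted values are made up for illustration and the function names are hypothetical. The identity only requires R2=1-RSS/TSS, a residual variance with divisor T-k, and a variance of the dependent variable with divisor T-1.

```python
def fit_summary(y, fitted):
    """Return (R^2, RSS, TSS) with R^2 = 1 - RSS/TSS."""
    T = len(y)
    ybar = sum(y) / T
    rss = sum((a - f) ** 2 for a, f in zip(y, fitted))
    tss = sum((a - ybar) ** 2 for a in y)
    return 1.0 - rss / tss, rss, tss

def adjusted_r2_penalty_form(r2, T, k):
    """R^2 minus the penalty (k-1)/(T-k) * (1 - R^2)."""
    return r2 - (k - 1) / (T - k) * (1.0 - r2)

def adjusted_r2_variance_form(rss, tss, T, k):
    """1 - sigma_u^2 / sigma_y^2, with divisors T-k and T-1 respectively."""
    return 1.0 - (rss / (T - k)) / (tss / (T - 1))

y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.2, 1.9, 3.1, 3.8, 5.0]   # hypothetical fitted values from a k = 2 model
T, k = len(y), 2
r2, rss, tss = fit_summary(y, fitted)
adj_a = adjusted_r2_penalty_form(r2, T, k)
adj_b = adjusted_r2_variance_form(rss, tss, T, k)
print(round(r2, 4), round(adj_a, 4), round(adj_b, 4))
```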
The log-likelihood for model (eq:17.1) is:
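$$\ell\left(\beta,\sigma^{2}\mid y,X\right)=-\frac{T}{2}\log 2\pi-\frac{T}{2}\log\sigma^{2}-\frac{1}{2}\frac{\sum_{t=1}^{T}\left(y_{t}-\beta'x_{t}\right)^{2}}{\sigma^{2}}.$$
Next, we can concentrate σ2 out of the log-likelihood to obtain:
$$\ell_{c}\left(\beta\mid y,X\right)=K_{c}-\frac{T}{2}\log\frac{\sum_{t=1}^{T}\left(y_{t}-\beta'x_{t}\right)^{2}}{T},$$
where
$$K_{c}=-\frac{T}{2}\left(1+\log 2\pi\right).$$
The reported log-likelihood includes the constant, so corresponds to:
$$\ell_{c}\left(\hat{\beta}\mid y,X\right)=K_{c}-\frac{T}{2}\log\frac{\mathrm{RSS}}{T}.$$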
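The concentration step can be verified numerically: evaluating the full log-likelihood at σ2=RSS/T reproduces the concentrated value, and any other σ2 gives a lower value. The residuals below are made up for illustration and the function names are hypothetical.

```python
import math

def log_likelihood(residuals, sigma2):
    """Gaussian log-likelihood for given residuals and error variance."""
    T = len(residuals)
    rss = sum(u * u for u in residuals)
    return (-T / 2 * math.log(2 * math.pi)
            - T / 2 * math.log(sigma2)
            - 0.5 * rss / sigma2)

def concentrated_log_likelihood(residuals):
    """K_c - (T/2) log(RSS/T), with K_c = -(T/2)(1 + log 2 pi)."""
    T = len(residuals)
    rss = sum(u * u for u in residuals)
    k_c = -T / 2 * (1 + math.log(2 * math.pi))
    return k_c - T / 2 * math.log(rss / T)

residuals = [0.2, -0.1, 0.1, -0.2, 0.0]          # illustrative residuals
sigma2_ml = sum(u * u for u in residuals) / len(residuals)
full_at_ml = log_likelihood(residuals, sigma2_ml)
concentrated = concentrated_log_likelihood(residuals)
print(round(full_at_ml, 6), round(concentrated, 6))
```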
The final entries list the number of observations used in the regression (so after allowing for lags), and the number of estimated parameters. This is followed by the mean and standard error of the dependent variable:
$$\bar{y}=\frac{1}{T}\sum_{t=1}^{T}y_{t},\qquad \hat{\sigma}_{y}^{2}=\frac{1}{T-1}\sum_{t=1}^{T}\left(y_{t}-\bar{y}\right)^{2}.$$
Note that we use T-1 in the denominator of σ̂2y, so this is what would be reported as the equation standard error (eq:17.10) when regressing the dependent variable on just a constant.
The four statistics reported are the Schwarz criterion (SC), the Hannan--Quinn (HQ) criterion, the Final Prediction Error (FPE), and the Akaike criterion (AIC). Here:
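$$\begin{aligned}\mathrm{SC}&=\log\tilde{\sigma}^{2}+k\,\frac{\log T}{T},\\ \mathrm{HQ}&=\log\tilde{\sigma}^{2}+2k\,\frac{\log(\log T)}{T},\\ \mathrm{FPE}&=\frac{(T+k)\,\tilde{\sigma}^{2}}{T-k},\\ \mathrm{AIC}&=\log\tilde{\sigma}^{2}+\frac{2k}{T},\end{aligned}$$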
using the maximum likelihood estimate of σ2:
$$\tilde{\sigma}^{2}=\frac{T-k}{T}\,\hat{\sigma}^{2}=\frac{1}{T}\sum_{t=1}^{T}\hat{u}_{t}^{2}.$$
For a discussion of the use of these and related scalar measures to choose between alternative models in a class, see Judge, Griffiths, Hill, Lütkepohl and Lee (1985) and §18.8 below.
These provide consistent estimates of the regression coefficients' standard errors even if the residuals are heteroscedastic in an unknown way. Large differences between the corresponding values in §17.2.2 and §17.2.13 are indicative of the presence of heteroscedasticity, in which case §17.2.13 provides the more useful measure of the standard errors (see White, 1980). PcGive contains two methods of computing heteroscedastic-consistent standard errors: as described in White (1980) (labelled HCSE), or the Jack-knife estimator from MacKinnon and White (1985) (labelled JHCSE; for which the code was initially provided by James MacKinnon).
The heteroscedasticity and autocorrelation consistent standard errors are reported in the column labelled HACSE. This follows Newey and West (1987), also see Andrews (1991).
The R2 is preceded by the seasonal means s of the first difference of the dependent variable (the mean of Δy for annual data, four quarterly means for quarterly data, twelve monthly means for monthly data etc.).
The R2 relative to difference and seasonals is a measure of the goodness of fit relative to ∑(Δyt-s)2 instead of ∑(yt-y)2 in the denominator of R2 (keeping ∑ût2 in the numerator), where s denotes the relevant seasonal mean. Despite its label, such a measure can be negative: if it is, the fitted model does less well than a regression of Δyt on seasonal dummies.
This reports the sample means and sample standard deviations of the selected variables:
$$\bar{x}=\frac{1}{T}\sum_{t=1}^{T}x_{t},\qquad s=\left(\frac{1}{T-1}\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)^{2}\right)^{1/2}.$$
The correlation matrix of the selected variables is reported as a lower-triangular matrix with the diagonal equal to one. Each cell records the simple correlation between the two relevant variables. The calculation of the correlation coefficient rxy between x and y is:
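$$r_{xy}=\frac{\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)\left(y_{t}-\bar{y}\right)}{\left(\sum_{t=1}^{T}\left(x_{t}-\bar{x}\right)^{2}\sum_{t=1}^{T}\left(y_{t}-\bar{y}\right)^{2}\right)^{1/2}}.$$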
The matrix of the estimated parameters' variances is reported as lower triangular. Along the diagonal, we have the variance of each estimated coefficient, and off the diagonal, the covariances. The k×k variance matrix of β̂ is estimated by:
$$\widehat{\mathsf{V}}\left[\hat{\beta}\right]=\hat{\sigma}^{2}\left(X'X\right)^{-1},$$
where σ̂2 is the full-sample equation error variance. The variance-covariance matrix is only shown when requested, in which case it is reported before the equation output.
The remaining statistics only appear if observations were withheld for forecasting purposes:
Following estimation over t=1,...,T, 1-step forecasts are given by:
$$\hat{y}_{t}=x_{t}'\hat{\beta},\qquad t=T+1,\ldots,T+H,$$
which requires the observations XH'=(xT+1,...,xT+H). The 1-step forecast error is the mistake made each period:
$$e_{t}=y_{t}-\hat{y}_{t}=y_{t}-x_{t}'\hat{\beta}=x_{t}'\left(\beta-\hat{\beta}\right)+u_{t}.$$
Assuming that E[β̂]=β, then E[et]=0 and:
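$$\mathsf{V}\left[e_{t}\right]=\sigma_{u}^{2}+x_{t}'\mathsf{V}\left[\hat{\beta}\right]x_{t}=\sigma_{u}^{2}\left(1+x_{t}'\left(X'X\right)^{-1}x_{t}\right). \tag{17.20}$$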
This corresponds to the results given for the innovations in recursive estimation. The whole vector of forecast errors is e=( eT+1,...,eT+H) '. V[e] is derived in a similar way:
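$$\mathsf{V}\left[e\right]=\sigma_{u}^{2}I_{H}+X_{H}\mathsf{V}\left[\hat{\beta}\right]X_{H}'=\sigma_{u}^{2}\left(I_{H}+X_{H}\left(X'X\right)^{-1}X_{H}'\right). \tag{17.21}$$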
Estimated variances are obtained after replacing σu2 by σ̂u2.
The columns respectively report the date for which the forecast is made, the realized outcome (yt), the forecast (ŷt), the forecast error (et=yt-ŷt), the standard error of the 1-step forecast (SE( et) =√V[ et] ̂), and a t-value (that is, the standardized forecast error et/SE( et) ).
A χ2 statistic follows the 1-step analysis, comparing within and post-sample residual variances. Neither this statistic nor η3 below measure absolute forecast accuracy. The statistic is calculated as follows:
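$$\xi_{1}=\sum_{t=T+1}^{T+H}\frac{e_{t}^{2}}{\hat{\sigma}_{u}^{2}}\;\underset{\text{app}}{\sim}\;\chi^{2}(H)\ \text{on}\ \mathsf{H}_{0}. \tag{17.22}$$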
The null hypothesis is `no structural change in any parameter between the sample and the forecast periods' (denoted 1 and 2 respectively), H0: β1=β2; σ12=σ22. A rejection of the null hypothesis of constancy by ξ3 below implies a rejection of the model used over the sample period -- so that is a model specification test -- whereas the use of ξ1 is more as a measure of numerical parameter constancy, and it should not be used as a model-selection device (see Kiviet, 1986). However, persistently large values for this statistic imply that the equation under study will not provide very accurate ex ante predictions, even one step ahead. An approximate F-equivalent is given by:
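$$\eta_{1}=\frac{1}{H}\,\xi_{1}\;\underset{\text{app}}{\sim}\;\mathsf{F}(H,T-k)\ \text{on}\ \mathsf{H}_{0}.$$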
A second statistic takes parameter uncertainty into account, taking the denominator from (eq:17.20):
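$$\xi_{2}=\sum_{t=T+1}^{T+H}\frac{e_{t}^{2}}{\widehat{\mathsf{V}}\left[e_{t}\right]}\;\underset{\text{app}}{\sim}\;\chi^{2}(H)\ \text{on}\ \mathsf{H}_{0}.$$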
This test is not reported in single-equation modelling, but individual terms of the summation can be plotted in the graphical analysis.
This is the main test of parameter constancy and has the form:
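$$\eta_{3}=\frac{\left(\mathrm{RSS}_{T+H}-\mathrm{RSS}_{T}\right)/H}{\mathrm{RSS}_{T}/(T-k)}\;\underset{\text{app}}{\sim}\;\mathsf{F}(H,T-k)\ \text{on}\ \mathsf{H}_{0}, \tag{17.26}$$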
where H0 is as for ξ1. For fixed regressors, the Chow (1960) test is exactly distributed as an F, but is only approximately (or asymptotically) so in dynamic models.
Alternatively expressed, the Chow test is:
|
|
We can now see the relation between ξ3 and ξ1: the latter uses V[ e] ̂=σ̂u2I, obtained by dropping the (asymptotically negligible) term V[β̂] in (eq:17.21). In small samples, the dropped term is often not negligible, so ξ1 should not be taken as a test. The numerical value of ξ1 always exceeds that of ξ3: the difference indicates the relative increase in prediction uncertainty arising from estimating, rather than knowing, the parameters.
PcGive computes the Chow test efficiently, by noting that:
|
|
The recursive formulae are applicable over the sample T+1,...,T+H, and under the null of correct specification and H0 of ξ1 above, then the standardized innovations {νt/(ωt)1/2} are distributed as IN(0,σ2u). Thus:
|
|
This tests for a different facet of forecast inaccuracy in which the forecast errors have a small but systematic bias. This test is the same as an endpoint CUSUM test of recursive residuals, but using only the forecast sample period (see Harvey and Collier, 1977).
|
|
in which we have n-1 endogenous variables yt* and q1 non-modelled variables wt on the right-hand side (the latter may include lagged endogenous variables). We assume that we have q2 additional instruments, labelled wt*. Write yt=(yt:yt*')' for the n×1 vector of endogenous variables. Let zt denote the set of all instrumental variables (non-endogenous included regressors, plus additional instruments): zt=(wt':wt*')', which is a vector of length q=q1+q2.
The reduced form (RF) estimates are only printed on request. If Z'=( z1...zT) , and yt denotes all the n endogenous variables including yt at t with Y'=(y1,...,yT), then the RF estimates are:
$$\hat{\Pi}'=\left(Z'Z\right)^{-1}Z'Y, \tag{17.30}$$
which is q×n. The elements of Π̂' relevant to each endogenous variable are written:
$$\hat{\pi}_{i}=\left(Z'Z\right)^{-1}Z'Y_{i},\qquad i=1,\ldots,n,$$
with Yi'=(yi1,...,yiT) the vector of observations on the ith endogenous variable. Standard errors etc. all follow as for OLS above (using Z, Yi for X,y in the relevant equations there).
Generalized instrumental variables estimates for the k=n-1+q1 coefficients of interest β=(β0':β1')' are:
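$$\tilde{\beta}=\left(X'Z\left(Z'Z\right)^{-1}Z'X\right)^{-1}X'Z\left(Z'Z\right)^{-1}Z'y, \tag{17.32}$$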
using xt=(yt*':wt')', X'=( x1...xT) , y=(y1...yT)', which is the left-hand side of (eq:17.29), and Z is as in (eq:17.30). This allows for the case of more instruments than explanatory variables (q>k), and requires rank(X'Z)=k and rank(Z'Z)=q. If q=k the equation simplifies to:
$$\tilde{\beta}=\left(Z'X\right)^{-1}Z'y.$$
As for OLS, PcGive does not use expression (eq:17.32) directly, but instead uses the QR decomposition for numerically more stable computation. The error variance is given by
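$$\tilde{\sigma}_{\epsilon}^{2}=\frac{\tilde{\epsilon}'\tilde{\epsilon}}{T-k},\qquad \tilde{\epsilon}=y-X\tilde{\beta}.$$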
The variance of β̃ is estimated by:
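$$\widehat{\mathsf{V}}\left[\tilde{\beta}\right]=\tilde{\sigma}_{\epsilon}^{2}\left(X'Z\left(Z'Z\right)^{-1}Z'X\right)^{-1}.$$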
Again the output is closely related to that reported for least squares except that the columns for HCSE, partial r2 and instability statistics are omitted. However, RSS, σ̃ and DW are recorded, as is the reduced form σ̂ (from regressing yt on zt, already reported with the RF equation for yt). Additional statistics reported are:
This tests for the validity of the choice of the instrumental variables as discussed by Sargan (1964). It is asymptotically distributed as χ2(q2-n+1) when the q2-n+1 over-identifying instruments are independent of the equation error. It is also interpretable as a test of whether the restricted reduced form (RRF) of the structural model (yt on xt plus xt on zt) parsimoniously encompasses the unrestricted reduced form (URF: yt on zt directly):
|
|
with π̂=( Z'Z) -1Z'y being the unrestricted reduced form estimates.
Reported is the χ2 test of β=0 (other than the intercept) which has a crude correspondence to the earlier F-test. On H0: β=0, the reported statistic behaves asymptotically as a χ2( k-1) . First define
|
|
Then ξβ /σ̃ε app ̃ χ2(k) would test whether all k coefficients are zero. To keep the intercept separate, we compute:
|
|
This amounts to using the formula for β̃ (eq. (eq:17.32)) in ξβ with y-yι instead of y.
A forecast test is provided if H observations are retained for forecasting. For IVE there are endogenous regressor variables: the only interesting issue is that of parameter constancy and correspondingly the output is merely ξ1 of (eq:17.22) using σ̃ε and:
|
|
Dynamic forecasts (which require forecasts of the successive xT+1,...,xT+H) could be obtained from multiple equation dynamic modelling, where the system as a whole is analyzed.
As discussed in the typology, if a dynamic model has common factors in its lag polynomials, then it can be re-expressed as having lower-order systematic dynamics combined with an autoregressive error process (called COMFAC). If the autoregressive error is of rth order, the estimator is called rth-order Autoregressive Least Squares or RALS, and it takes the form:
|
|
|
|
|
|
with εt~IN( 0,σε 2) .
|
|
as a function of the ( β,α) parameters yields a non-linear least squares problem necessitating iterative solution. However, conditional on values of either set of parameters, f( .) is linear in the other set, so analytical first and second derivatives are easy to obtain. There is an estimator-generating equation for this whole class (see Hendry, 1976, Section 7), but as it has almost no efficient non-iterative solutions, little is gained by its exploitation. Letting θ denote all of the unrestricted parameters in β0( .) , {βi( .) } and α( .) , then the algorithm programmed in PcGive for maximizing f( .) as a function of θ is a variant of the Gauss--Newton class. Let:
|
|
so that negligible cross-products are eliminated, then at the ith iteration:
|
|
where si is a scalar chosen by a line search procedure to maximize f(θi+1|θi). The convergence criterion depends on qi'Qi-1qi and on changes in θi between iterations. The bi-linearity of f( .) is exploited in computing Q.
Before estimating by RALS, OLS estimates of {βi} are calculated, as are LM-test values of {αi}, where the prespecified autocorrelation order is `data frequency+1' (for example, 5 for quarterly data). These estimates are then used to initialize θ. However, the {αi} can be reset by users. Specifically, for single-order processes, ut=αrut-r+εt, then αr can be selected by a prior grid search. The user can specify the maximum number of iterations, the convergence tolerance, both the starting and ending orders of the polynomial α( L) in the form:
| ut=∑i=srαiut-i+εt, |
and whether to minimize f( .) sequentially over s, s+1,...,r or merely the highest order, r.
On convergence, the variances of the θs are calculated (from Q-1), as are the roots of α( L) =0. The usual statistics for σ̂, RSS (this can be used in likelihood-ratio tests between alternative nested versions of a model), t-values etc. are reported, as is T-1∑( yt-y) 2 in case a pseudo-R2 statistic is desired.
|
|
|
|
where β̂ and {α̂i} are obtained over 1,...,T. The forecast error is:
|
|
|
|
|
|
|
|
|
|
where we define xt+'=xt-Σsrαixt-i, ûr'=( ût-s...ût-r) , wt'=( xt+':ûr') , and θ'=( β':α') when α'=( αs...αr) . E[ et] ~=0 for a correctly-specified model. Finally, therefore (neglecting the second-order dependence of the variance of wt'(θ-θ̂) on θ̂ acting through wt):
|
|
V[θ̂] is the RALS variance-covariance matrix, and from the forecast-error covariance matrix, the 1-step analysis is calculated, as are parameter-constancy tests.
The output is as for OLS: the columns respectively report the date for which the forecast is made, the realized outcome (yt), the forecast (ŷt), the forecast error (et=yt-ŷt), the standard error of the 1-step forecast (SE( et) =√V[ et] ̂), and a t-value (that is, the standardized forecast error et/SE( et) ).
The RALS analogues of the forecast test ξ1 of (eq:17.22), and of the Chow test η3 in (eq:17.26), are reported. The formulae follow directly from (eq:17.48) and (eq:17.53).
The non-linear regression model is written as
| y_t = f(x_t, \theta) + u_t, \quad t = 1, \ldots, T. |
We take θ to be a k×1 vector. For example:
| yt=θ0+θ1xtθ2+θ3zt1-θ2+ut. |
Note that for fixed θ2 this last model becomes linear; for example, for θ2= 1/2 :
| yt=θ0+θ1xt*+θ3zt*+ut, xt*=(xt)½, zt*=(zt)½, |
which is linear in the transformed variables xt*, zt*. As for OLS, estimation proceeds by minimizing the sum of squared residuals:
| \min_{\theta} \sum_{t=1}^{T} \left( y_t - f(x_t, \theta) \right)^2. |
In linear models, this problem has an explicit solution; for non-linear models the minimum has to be found using iterative optimization methods.
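Because the example model is linear once θ2 is fixed, one simple way to see the problem at work is to search over θ2 and solve the remaining linear problem by OLS at each value. The sketch below is an illustration only, not PcGive's algorithm (that is described in §17.5.3); the simulated data, the grid, and the helper names `ols` and `rss_given_theta2` are assumptions made for the example.

```python
import random

def ols(X, y):
    """Solve the normal equations X'X b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yt for r, yt in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def rss_given_theta2(theta2, x, z, y):
    """RSS of y on (1, x^theta2, z^(1-theta2)) for a fixed theta2."""
    X = [[1.0, xt ** theta2, zt ** (1.0 - theta2)] for xt, zt in zip(x, z)]
    b = ols(X, y)
    rss = sum((yt - sum(bi * xi for bi, xi in zip(b, row))) ** 2
              for row, yt in zip(X, y))
    return rss, b

# Simulated data (assumption): theta = (1, 2, 0.5, 3) with small noise.
rng = random.Random(0)
T = 200
x = [rng.uniform(1.0, 100.0) for _ in range(T)]
z = [rng.uniform(1.0, 100.0) for _ in range(T)]
y = [1.0 + 2.0 * xt ** 0.5 + 3.0 * zt ** 0.5 + rng.gauss(0.0, 0.01)
     for xt, zt in zip(x, z)]

grid = [0.1 + 0.05 * i for i in range(17)]          # 0.10, 0.15, ..., 0.90
best_theta2 = min(grid, key=lambda th: rss_given_theta2(th, x, z, y)[0])
best_rss, best_b = rss_given_theta2(best_theta2, x, z, y)
print(best_theta2, best_b, best_rss)
```

A grid search of this kind only handles one non-linear parameter at a time, which is why a general iterative optimizer is used in practice.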
Instead of minimizing the sum of squares, PcGive maximizes the sum of squares divided by -T:
| g(\theta) = -\frac{1}{T} \sum_{t=1}^{T} \left( y_t - f(x_t, \theta) \right)^2. |
As for RALS, an iterative procedure is used to locate the maximum:
| \theta_{i+1} = \theta_i + s_i \, Q(\theta_i)^{-1} q(\theta_i), |
with q( .) the derivatives of g(.) with respect to θj (this is determined numerically), and Q( .) -1 a symmetric, positive definite matrix (determined by the BFGS method after some initial Gauss-Newton steps). Practical details of the algorithm are provided in §17.5.3; Volume II gives a more thorough discussion of the subject of numerical optimization. Before using NLS you are advised to study the examples given in the tutorial Chapter, to learn about the potential problems.
Output is as for OLS, except for the instability tests and HCSEs which are not computed. The variance of the estimated coefficients is determined numerically, other statistics follow directly, for example:
|
|
Forecasts are computed and graphed, but the only statistic reported is the ξ1 test of (eq:17.22), using 1-step forecast errors:
|
|
We saw that for an independent sample of T observations and k parameters θ:
|
|
This type of model can be estimated with PcGive, which solves the problem:
|
|
Models falling in this class are, for example, binary logit and probit, ARCH, GARCH, Tobit, Poisson regression. As an example, consider the linear regression model. PcGive gives three ways of solving this:
Clearly, the first method is to be preferred when available.
Estimation of (eq:17.61) uses the same technique as NLS. The output is more concise, consisting of coefficients, standard errors (based on the numerical second derivative), t-values, t-probabilities, and `loglik' which is ∑t=1Tl(θ̂|xt). Forecasts are computed and graphed, but no statistics are reported.
Non-linear models are formulated in algebra code. NLS requires the definition of a variable called actual, and one called fitted. It uses these to maximize minus the residual sum of squares divided by T:
| - 1/T ∑t=1T (actualt - fittedt)2. |
An example for NLS is:
actual = CONS;
fitted = &0 + &1 * INC + &2 * lag(INC,1);
&0 = 400; &1 = 0.8; &2 = 0.2;
This is just a linear model, and much more efficiently done using the normal options.
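For readers who want to check what is being maximized, the objective for this example can be written out in a few lines of Python. This is a sketch only: the data values are made up, the function names are hypothetical, and the first observation is dropped because of the lag.

```python
def nls_objective(actual, fitted):
    """Return -(1/T) * sum((actual_t - fitted_t)^2), the quantity maximized."""
    T = len(actual)
    return -sum((a - f) ** 2 for a, f in zip(actual, fitted)) / T

def fitted_values(params, inc):
    """fitted_t = p0 + p1*INC_t + p2*INC_{t-1}; observation 0 is lost to the lag."""
    p0, p1, p2 = params
    return [p0 + p1 * inc[t] + p2 * inc[t - 1] for t in range(1, len(inc))]

# Made-up data standing in for the INC and CONS variables.
inc = [1000.0, 1020.0, 1045.0, 1030.0, 1060.0]
cons = [1400.0, 1418.0, 1440.0, 1431.0, 1454.0]

start = (400.0, 0.8, 0.2)                      # the starting values &0, &1, &2
value_at_start = nls_objective(cons[1:], fitted_values(start, inc))
print(value_at_start)
```

An optimizer would then adjust the three parameters to push this value towards zero from below.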
Models can be estimated by maximum likelihood if they can be written as a sum over the observations (note that the previous concentrated log-likelihood cannot be written that way!). An additional algebra line is required, to define a variable called loglik. PcGive maximizes:
| ∑t=1T loglikt . |
Consider, for example, a binary logit model:
actual = vaso;
xbeta = &0 + &1 * Lrate + &2 * Lvolume;
fitted = 1 / (1 + exp(-xbeta));
loglik = actual * log(fitted) + (1-actual) * log(1-fitted);
&0 = 0.74; &1 = 1.3; &2 = 2.3;
Here actual and fitted are not really that, but these variables define what is being graphed in the graphic analysis.
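The same log-likelihood can be evaluated outside PcGive as a check. The following Python sketch mirrors the algebra code line by line; the data are invented for illustration and `logit_loglik` is a hypothetical helper name.

```python
import math

def logit_loglik(params, y, lrate, lvolume):
    """Sum over observations of y*log(p) + (1-y)*log(1-p), with p the logit probability."""
    b0, b1, b2 = params
    total = 0.0
    for yt, r, v in zip(y, lrate, lvolume):
        xbeta = b0 + b1 * r + b2 * v
        fitted = 1.0 / (1.0 + math.exp(-xbeta))
        total += yt * math.log(fitted) + (1 - yt) * math.log(1.0 - fitted)
    return total

# Invented observations standing in for vaso, Lrate and Lvolume.
vaso = [1, 0, 1, 0]
lrate = [0.3, -0.5, 0.8, -0.2]
lvolume = [0.1, -0.4, 0.5, -0.9]

ll_zero = logit_loglik((0.0, 0.0, 0.0), vaso, lrate, lvolume)
ll_start = logit_loglik((0.74, 1.3, 2.3), vaso, lrate, lvolume)
print(ll_zero, ll_start)
```

With all coefficients at zero every fitted probability is one half, so the log-likelihood is T log(0.5), which gives a convenient sanity check.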
Note that algebra is a vector language without temporary variables, restricting the class of models that can be estimated. Non-linear models are not stored for recall and progress reports.
After correct model specification, the method is automatically set to Non-linear model (using ML if loglik is defined, NLS/RNLS otherwise); in addition, the following information needs to be specified:
NLS and ML estimation (and their recursive variants RNLS and RML) require numerical optimization to maximize the likelihood log L( φ( θ) ) = l( φ( θ) ) as a non-linear function of θ. PcGive maximization algorithms are based on a Newton scheme:
|
|
with
PcGive uses the quasi-Newton method developed by Broyden, Fletcher, Goldfarb, Shanno (BFGS) to update K = Q-1 directly. It uses numerical derivatives to compute ∂l( φ( θ) ) / ∂θi. However, for NLS, PcGive will try Gauss-Newton before starting BFGS. In this hybrid method, Gauss-Newton is used while the relative progress in the function value is 20%, then the program switches to BFGS.
Starting values must be supplied. The starting value for K consists of 0s off-diagonal. The diagonal is the minimum of one and the inverse of the corresponding diagonal element in the matrix consisting of the sums of the outer-products of the gradient at the parameter starting values (numerically evaluated).
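The rule for the starting value of K can be sketched as follows. This is one reading of the description, assuming that the gradient is available observation by observation so that the outer products can be summed over t; the function name `initial_K` is hypothetical and the gradient values are made up.

```python
def initial_K(gradients):
    """Diagonal starting matrix: K_ii = min(1, 1 / sum_t g_ti^2), zeros elsewhere.

    gradients: list of per-observation gradient vectors at the starting values
    (an assumption about how the outer products are formed).
    """
    k = len(gradients[0])
    K = [[0.0] * k for _ in range(k)]
    for i in range(k):
        opg_ii = sum(g[i] ** 2 for g in gradients)   # i-th diagonal of sum of g g'
        K[i][i] = min(1.0, 1.0 / opg_ii) if opg_ii > 0.0 else 1.0
    return K

K0 = initial_K([[2.0, 0.1], [1.0, 0.2]])
print(K0)
```

Capping the diagonal at one keeps the first step from being very long in directions where the gradient is small.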
RNLS works as follows: starting values for θ and K for the first estimation (T-1 observations) are the full sample values (T observations); then the sample size is reduced by one observation; the previous values at convergence are used to start with.
Owing to numerical problems it is possible (especially close to the maximum) that the calculated δi does not yield a higher likelihood. Then an si∈[0,1] yielding a higher function value is determined by a line search. Theoretically, since the direction is upward, such an si should exist; however, numerically it might be impossible to find one. When using BFGS with numerical derivatives, it often pays to scale the data so that the initial gradients are of the same order of magnitude.
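A line search of the kind described can be illustrated with simple step halving. The sketch below is an assumption about the general idea, not PcGive's actual search rule: it tries s=1 first and halves the step until the function value improves, giving up after a fixed number of halvings.

```python
def line_search(f, theta, delta, max_halvings=30):
    """Find s in (0, 1] with f(theta + s*delta) > f(theta) by repeated halving."""
    f0 = f(theta)
    s = 1.0
    for _ in range(max_halvings):
        candidate = [t + s * d for t, d in zip(theta, delta)]
        if f(candidate) > f0:
            return s, candidate
        s *= 0.5
    return 0.0, list(theta)   # numerically no improving step was found

# Concave example: the full step overshoots the maximum at 1.
f = lambda th: -(th[0] - 1.0) ** 2
step, new_theta = line_search(f, [0.0], [4.0])
print(step, new_theta)
```

The final return corresponds to the situation noted in the text where the step length has become too small to make progress.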
The convergence decision is based on two tests. The first uses likelihood elasticities (∂l/∂ log θ):
|
|
The second is based on the one-step-ahead relative change in the parameter values:
|
|
The status of the iterative process is given by the following messages:
The step length si has become too small. The convergence test (eq:17.63) was not passed, using tolerance ε=ε2.
The step length si has become too small. The convergence test (eq:17.63) was passed, using tolerance ε=ε2.
Both convergence tests (eq:17.63) and (eq:17.64) were passed, using tolerance ε=ε1.
The chosen default values for the tolerances are:
|
|
You can:
Graphic analysis focuses on graphical inspection of individual equations. Let yt, ŷt denote respectively the actual (that is, observed) values and the fitted values of the selected equation, with residuals ût=yt-ŷt, t=1,...,T. When H observations are retained for forecasting, then ŷT+1,...,ŷT+H are the 1-step forecasts. NLS/RNLS/ML use the variables labelled `actual' and `fitted' for yt, ŷt.
Fourteen different graphs are available:
(yt,ŷt) over t. This is a graph showing the fitted (ŷt) and actual values (yt) of the dependent variable over time, including the forecast period.
ŷt against yt, also including the forecast period.
( ût/σ̂) over t, where σ̂2=(T-k)-1RSS is the full-sample equation error variance. As indicated, this graph shows the scaled residuals given by ût/σ̂ over time.
The 1-step forecasts can be plotted in a graph over time: yt and ŷt are shown with error bars of ±2SE( et) centered on ŷt (that is, an approximate 95% confidence interval for the 1-step forecast); et are the forecast errors.
Plots the histogram of the standardized residuals ût/√(T-1RSS), t=1,...,T, the estimated density fu(.)̂ and a normal distribution with the same mean and variance (more details are in §16.10).
This plots the residual autocorrelations using ût as the xt variable in (eq:18.13).
This plots the partial autocorrelation function (see §16.6)--the same graph is used if the ACF is selected.
If available, the individual Chow χ2(1) tests (see (eq:17.24)) are plotted.
( ût) over t;
This plots the estimated spectral density (see §16.9) using ût as the xt variable.
Shows a QQ plot of the residuals, see §16.11.
The non-parametrically estimated density fu(.)̂ of the standardized residuals ût/√(T-1RSS), t=1,...,T is graphed using the settings described in the OxMetrics book.
This plots the histogram of the standardized residuals ût/√(T-1RSS), t=1,...,T--the same graph is used if the density is selected.
Plots the distribution based on the non-parametrically estimated density.
The residuals can be saved to the database for further inspection.
Recursive methods estimate the model at each t for t=M-1,...,T. The output generated by the recursive procedures is most easily studied graphically, possibly using the facility to view multiple graphs together on screen. The dialog has a facility to write the output to the editor, instead of graphing it. The recursive estimation aims to throw light on the relative future information aspect (that is, parameter constancy).
Let β̂t denote the k parameters estimated from a sample of size t, and yj-xj'β̂t the residuals at time j evaluated at the parameter estimates based on the sample 1,...,t (for RNLS the residuals are yj-f(xj,β̂t)).
We now consider the generated output:
The graph shows β̂it±2SE(β̂it) for each selected coefficient i ( i=1,...,k) over t=M,...,T.
β̂it/SE(β̂it) for each selected coefficient i ( i=1,...,k) over t=M,...,T.
The residual sum of squares at each t is RSSt=∑j=1t(yj-xj'β̂t)2 for t=M,...,T.
The 1-step residuals yt-xt'β̂t are shown bordered by 0±2σ̂t over M,...,T. Points outside the 2 standard-error region are either outliers or are associated with coefficient changes.
The standardized innovations (or standardized recursive residuals) for RLS are:
νt=(yt-xt'β̂t-1)/(ωt) 1/2 where ωt=1+xt'( Xt-1'Xt-1) -1xt for t=M,...,T.
σ2ωt is the 1-step forecast error variance of (eq:17.20), and β̂M-1 are the coefficient estimates from the initializing OLS estimation.
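The definition of the standardized innovations is easiest to follow in the special case of a single regressor without intercept, where X't-1Xt-1 reduces to the scalar sum of squares of x. The Python sketch below covers only that case; the function name and the data are assumptions made for illustration.

```python
import math

def standardized_innovations(x, y, M):
    """nu_t for t = M,...,T (1-based) with one regressor and no intercept."""
    sxx = sum(xj * xj for xj in x[:M - 1])                 # X'X over 1..M-1
    sxy = sum(xj * yj for xj, yj in zip(x[:M - 1], y[:M - 1]))
    out = []
    for t in range(M - 1, len(x)):                         # 0-based index of obs t
        beta_prev = sxy / sxx                              # estimate from 1..t-1
        omega = 1.0 + x[t] * x[t] / sxx
        out.append((y[t] - x[t] * beta_prev) / math.sqrt(omega))
        sxx += x[t] * x[t]                                 # add observation t
        sxy += x[t] * y[t]
    return out

nu = standardized_innovations([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 9.0], M=3)
print(nu)
```

In the example the first three observations lie exactly on y=2x, so the first innovation is zero, and only the last observation produces a non-zero value.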
1-step forecast tests are F( 1,t-k-1) under the null of constant parameters, for t=M,...,T. A typical statistic is calculated as:
|
|
Normality of yt is needed for this statistic to be distributed as an F.
Break-point F-tests are F( T-t+1,t-k-1) for t=M,...,T. These are, therefore, sequences of Chow tests and are also called N↓ because the number of forecasts goes from N=T-M+1 to 1. When the forecast period exceeds the estimation period, this test is not necessarily optimal relative to the covariance test based on fitting the model separately to the split samples. A typical statistic is calculated as:
|
|
This test is closely related to the CUSUMSQ statistic in Brown, Durbin and Evans (1975).
Forecast F-tests are F( t-M+1,M-k-1) for t=M,...,T, and are called N↑ as the forecast horizon increases from M to T. This tests the model over 1 to M-1 against an alternative which allows any form of change over M to T. A typical statistic is calculated as:
|
|
The statistics in (eq:18.1)--(eq:18.3) are variants of Chow (1960) tests: they are scaled by 1-off critical values from the F-distribution at any selected probability level as an adjustment for changing degrees of freedom, so that the significant critical values become a straight line at unity. Note that the first and last values of (eq:18.1) respectively equal the first value of (eq:18.3) and the last value of (eq:18.2).
The Chow test statistics are not calculated for RIVE/RML; the recursive RSS is not available for RML.
The general class of models estimable in PcGive can be written in the form:
|
|
where b0( L) and the bi( L) are polynomials in the lag operator L. Now q+1 is the number of distinct variables (one of which is yt), whereas k remains the number of estimated coefficients. For simplicity we take all polynomials to be of length m:
| bi( L) =∑j=0mbijLj, i=0,...,q. |
With b00=1 and using a(L)=-∑j=1mb0jLj-1 we can write (eq:18.4) as:
|
|
Finally, we use a=(b01,...,b0m)' and bi=(bi0,...,bim), i=1,...,q.
In its unrestricted mode of operation, PcGive can be visualized as analyzing the polynomials involved, and it computes such functions as their roots and sums. This option is available if a general model was initially formulated, and provided OLS or IVE was selected.
When working with dynamic models, concepts such as equilibrium solutions, steady-state growth paths, mean lags of response etc. are generally of interest. In the simple model:
| y_t = \alpha_1 y_{t-1} + \beta_0 z_t + \beta_1 z_{t-1} + \epsilon_t, |
where all the variables are stationary, a static equilibrium is defined by:
| E[ zt] =z* for all t |
in which case, E[ yt] =y* will also be constant if |α1|<1, and yt will converge to:
| y^* = \frac{\beta_0 + \beta_1}{1 - \alpha_1} \, z^* = K z^*. |
For non-stationary but cointegrated data, reinterpret expression (eq:18.7) as E[ yt-Kzt] =0.
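The convergence to a static equilibrium can be checked numerically. The sketch below assumes the simple model has one lag of y and the current and first lag of z, with coefficients α1, β0 and β1 (the form implied by the lag-weight polynomial w(L) used later in this chapter); it sets the error to zero and holds z at z*. The parameter values and function names are made up.

```python
def long_run_K(alpha1, beta0, beta1):
    """Static long-run coefficient K = (beta0 + beta1) / (1 - alpha1)."""
    if abs(alpha1) >= 1.0:
        raise ValueError("a static solution requires |alpha1| < 1")
    return (beta0 + beta1) / (1.0 - alpha1)

def simulate_path(alpha1, beta0, beta1, z_star, y0, periods):
    """Iterate y_t = alpha1*y_{t-1} + beta0*z* + beta1*z* with the error set to zero."""
    y, path = y0, []
    for _ in range(periods):
        y = alpha1 * y + (beta0 + beta1) * z_star
        path.append(y)
    return path

K = long_run_K(0.5, 0.4, 0.2)
path = simulate_path(0.5, 0.4, 0.2, z_star=10.0, y0=0.0, periods=100)
print(K, path[-1])
```

Whatever the starting value y0, the path settles at K z*, which is the sense in which K is a long-run parameter.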
PcGive computes estimates of K and associated standard errors. These are called static long-run parameters. If b0( 1) ≠0, the general long-run solution of (eq:18.4) is given by:
|
|
The expression yt-ΣKizit is called the equilibrium-correction mechanism (ECM) and can be stored in the data set. If common-factor restrictions of the form bj( L) =α( L) γj( L) , j=0,...,q are imposed, then α( 1) will cancel, hence enforced autoregressive error representations have no impact on derived long-run solutions.
The standard errors of K̂=( K̂1...K̂q) ' are calculated from:
|
|
PcGive calculates J analytically using the algorithm proposed by Bårdsen (1989).
PcGive outputs the solved static long-run equation, with standard errors of the coefficients. This is followed by a Wald test of the null that all of the long-run coefficients are zero (except the constant term). The V[K̂]̂ matrix is printed when `covariance matrix of estimated coefficients' is checked under the model options.
The b̂i( L) , i=0,...,q of (eq:18.4) and their standard errors are reported in tabular form with the b̂i( 1) (their row sums) and associated standard errors.
The first column contains F-tests of each of the q+1 hypotheses:
| Hv0:a=0; Hvi:bi=0 for i=1,...,q. |
These test the significance of each basic variable in turn. The final column gives the PcGive unit-root tests:
| Hui:bi( 1) =0 for i=0,...,q. |
If Hui: bi( 1) =0 cannot be rejected, there is no significant long-run level effect from zit; if Hvi: bi=0 cannot be rejected, there is no significant effect from zit at any (included) lag. Significance is marked by * for 5% and ** for 1%. Critical values for the PcGive unit-root test (Hu0: b0( 1) =0) are based on Ericsson and MacKinnon (2002). For the unit-root test, only significance of the dependent variable is reported (not the remaining variables!).
Conflicts between the tests' outcomes are possible in small samples.
Note that bi( 1) =0 and bi=0 are not equivalent; testing Ki=0 is different again. Using (eq:18.6) we can show the relevant hypotheses:
|
F-tests of each lag length are shown, beginning at the longest ( m) and continuing down to 1. The test of the longest lag is conditional on keeping lags ( 1,...,m-1) , that of ( m-1) is conditional on ( 1,...,m-2,m) etc.
Finally, F-tests of all lags up to m are shown, beginning at the longest ( 1,...,m) and continuing further from ( 2,...,m) down to ( m,...,m) . These tests are conditional on keeping no lags, keeping lag 1, down to keeping ( 1,...,m-1) . Thus, they show the marginal significance of all longer lags.
COMFAC tests for the legitimacy of common-factor restrictions of the form:
|
|
where α( L) is of order r and * denotes polynomials of the original order minus r. The degrees of freedom for the Wald tests for COMFAC are equal to the number of restrictions imposed by α( L) and the Wald statistics are asymptotically χ2 with these degrees of freedom if the COMFAC restrictions are valid. It is preferable to use the incremental values obtained by subtracting successive values of the Wald tests. These are χ2 also, with degrees of freedom given by the number of additional criteria. Failure to reject common-factor restrictions does not entail that such restrictions must be imposed. For a discussion of the theory of COMFAC, see Hendry and Mizon (1978); for some finite-sample Monte Carlo evidence, see Mizon and Hendry (1980). COMFAC is not available for RALS.
When the minimum order of lag length in the bi( L) is unity or larger (m say), the Wald test sequence for 1,2,...,m common factors is calculated. Variables that are redundant when lagged (Constant, Seasonals, Trend) are excluded in conducting the Wald test sequence since they always sustain a common-factor interpretation.
|
|
With |α1|<1 this can be written as:
| yt=w( L) zt+vt, |
when:
| w( L) =( β0+β1L) /( 1-α1L) =( β0+β1L) ( 1+α1L+α12L2+...) . |
Starting from an equilibrium z* at t=0, a one-off increment of δ to z* has an impact on y* at t=0,1,2,... of w0δ, w1δ, w2δ, w3δ,... with the ws defined by equating coefficients of powers of L as:
| w0=β0, w1=β1+β0α1, w2=α1w1, w3=α1w2,... |
PcGive can graph the normalized lag weights w0/w( 1) , w1/w( 1) ,..., ws/w( 1) and the cumulative normalized lag weights w0/w( 1) , ( w0+w1) /w( 1) ,..., ( w0+...+ws) /w( 1) .
Lag weights are available for models estimated by OLS or IVE.
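The recursion for the lag weights and their normalization by w(1) can be sketched in Python as follows. The parameter values are made up and the function names are hypothetical; the formulas are those given above, with w(1)=(β0+β1)/(1-α1).

```python
def lag_weights(alpha1, beta0, beta1, s):
    """w_0 = beta0, w_1 = beta1 + beta0*alpha1, w_j = alpha1*w_{j-1} for j >= 2."""
    w = [beta0, beta1 + beta0 * alpha1]
    while len(w) <= s:
        w.append(alpha1 * w[-1])
    return w[:s + 1]

def normalized_lag_weights(alpha1, beta0, beta1, s):
    """Return (w_j / w(1)) and its cumulative sums for j = 0,...,s."""
    w_one = (beta0 + beta1) / (1.0 - alpha1)       # w(1), the sum of all weights
    norm = [wj / w_one for wj in lag_weights(alpha1, beta0, beta1, s)]
    cum, total = [], 0.0
    for v in norm:
        total += v
        cum.append(total)
    return norm, cum

w = lag_weights(0.5, 0.4, 0.2, 3)
norm, cum = normalized_lag_weights(0.5, 0.4, 0.2, 50)
print(w, cum[-1])
```

The cumulative normalized weights approach one as s grows, so the graph shows how quickly the long-run response is reached.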
Irrespective of the estimator selected, a wide range of diagnostic tests is offered. Tests are available for residual autocorrelation, conditional heteroscedasticity, normality, unconditional heteroscedasticity/functional form mis-specification and omitted variables. Recursive residuals can be used if these are available. Tests for common factors and linear restrictions are discussed in §18.3.4 and §18.5 below, encompassing tests in §18.9. Thus, relating this section to the earlier information taxonomy, the diagnostic tests of this section concern the past (checking that the errors are a homoscedastic, normal, innovation process relative to the information available), whereas the forecast statistics discussed in Chapter 17 concern the future and encompassing tests concern information specific to rival models.
Many test statistics in PcGive have either a χ2 distribution or an F distribution. F-tests are usually reported as:
F(num,denom) = Value [Probability] /*/**
for example:
F(1, 155) = 5.0088 [0.0266] *
where the test statistic has an F-distribution with one degree of freedom in the numerator, and 155 in the denominator. The observed value is 5.0088, and the probability of getting a value of 5.0088 or larger under this distribution is 0.0266. This is less than 5% but more than 1%, hence the star. Significant outcomes at a 1% level are shown by two stars. χ2 tests are also reported with probabilities, as for example:
Normality Chi^2(2) = 2.1867 [0.3351]
The 5% χ2 critical value with two degrees of freedom is 5.99, so here normality is not rejected (alternatively, Prob(χ2≥ 2.1867) = 0.3351, which is more than 5%). Details on the computation of probability values and quantiles for the F and χ2 tests are given under the probf, probchi, quanf and quanchi functions in the Ox reference manual (Doornik, 2007b).
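The numbers in the normality example are easy to verify by hand, because the χ2 distribution with two degrees of freedom has the closed-form tail probability exp(-x/2). The Python sketch below uses that special case (it does not cover other degrees of freedom or the F-distribution) and also mimics the star convention; the function names are hypothetical.

```python
import math

def chi2_2df_pvalue(x):
    """P(chi^2(2) >= x) = exp(-x/2); valid for two degrees of freedom only."""
    return math.exp(-x / 2.0)

def chi2_2df_critical(level):
    """Value c such that P(chi^2(2) >= c) = level."""
    return -2.0 * math.log(level)

def stars(p):
    """Significance marker: ** below 1%, * below 5%, otherwise empty."""
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

p = chi2_2df_pvalue(2.1867)
print("Normality Chi^2(2) = %.4f [%.4f] %s" % (2.1867, p, stars(p)))
print("5%% critical value: %.2f" % chi2_2df_critical(0.05))
```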
Some tests take the form of a likelihood ratio (LR) test. If l is the unrestricted, and l0 the restricted log-likelihood, then -2(l0-l) has a χ2(s) distribution, with s the number of restrictions imposed (so model l0 is nested in l).
Many diagnostic tests are calculated through an auxiliary regression. For single-equation tests, they take the form of TR2 for the auxiliary regression so that they are asymptotically distributed as χ2( s) under their nulls, and hence have the usual additive property for independent χ2s. In addition, following Harvey (1990) and Kiviet (1986), F-approximations are calculated because they may be better behaved in small samples:
|
|
When the covariance matrix is block diagonal between regression and heteroscedasticity (or ARCH) function parameters, tests can take the regression parameters as given, see Davidson and MacKinnon (1993, Ch. 11):
| . |
| ~F( s,T-s). |
This may be slightly different if not all parameters are included in the test, or when observations are lost in the construction of the test.
The sample autocorrelation function (ACF) of a variable xt is the series {rj} where rj is the correlation coefficient between xt and xt-j for j = 1,...,s:
|
|
Here x= 1/T ∑t=jTxt is the sample mean of xt.
The residual correlogram is defined as above, but using the residuals from the econometric regression, rather than the data. Thus, this reports the series {rj} of correlations between the residuals ût and ût-j. In addition, PcGive prints the partial autocorrelation function (PACF) (see the OxMetrics book).
It is possible to calculate a statistic based on `T*(sum of s squared autocorrelations)', with s the length of the correlogram, called the Portmanteau statistic:
|
|
This corresponds to Box and Pierce (1970), but with a degrees of freedom correction as suggested by Ljung and Box (1978). It is designed as a goodness-of-fit test in stationary, autoregressive moving-average models. Under the assumptions of the test, LB(s) is asymptotically distributed as χ2(s-n) after fitting an AR(n) model. A value such that LB( s) ≥2s is taken as indicative of mis-specification for large s. However, small values of such a statistic should be treated with caution since residual autocorrelations are biased towards zero (like DW) when lagged dependent variables are included in econometric equations. An appropriate test for residual autocorrelation is provided by the LM test in §18.4.3 below.
This is a test for autocorrelated residuals and is calculated as:
| DW = \frac{\sum_{t=2}^{T} (\hat u_t - \hat u_{t-1})^2}{\sum_{t=1}^{T} \hat u_t^2}. |
DW is most powerful as a test of {ut} being white noise against:
| ut=ρut-1+εt where εt~IID( 0,σε 2) . |
If 0<DW<2, then the null hypothesis is H0: ρ=0, that is, zero autocorrelation (so DW=2) and the alternative is H1: ρ>0, that is, positive first-order autocorrelation.
If 2<DW<4, then H0: ρ=0 and H1: ρ<0, in which case DW*=4-DW should be computed.
The significance values of DW are widely recorded in econometrics textbooks. However, DW is a valid statistic only if all the xt variables are non-stochastic, or at least strongly exogenous. If the model includes a lagged dependent variable, then DW is biased towards 2, that is, towards not detecting autocorrelation, and Durbin's h-test (see Durbin, 1970) or the equivalent LM-test for autocorrelation in §18.4.3 should be used instead. Also see §16.4.
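As an illustration, the statistic can be computed directly from a residual series. The sketch assumes the standard textbook definition of DW (sum of squared first differences of the residuals over their sum of squares); the function name and the residual values are made up.

```python
def durbin_watson(resid):
    """Sum of squared first differences of the residuals over their sum of squares."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(u * u for u in resid)
    return num / den

dw_negative = durbin_watson([1.0, -1.0, 1.0, -1.0])   # alternating signs
dw_positive = durbin_watson([1.0, 1.0, 1.0, 1.0])     # no sign changes
print(dw_negative, dw_positive)
```

Residuals that alternate in sign push DW above 2 (negative autocorrelation), whereas residuals that persist push it towards 0 (positive autocorrelation).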
This is the Lagrange-multiplier test for rth order residual autocorrelation, distributed as χ2( r) in large samples, under the null hypothesis that there is no autocorrelation (that is, that the errors are white noise). In standard usage, r~= 1/2 s for s in §18.4.2 above, so this provides a type of Portmanteau test (see Godfrey, 1978). However, any orders from 1 up to 12 can be selected to test against:
| ut=∑i=prαiut-i+εt where 0≤p≤r. |
As noted above, the F-form suggested by Harvey (1981, see Harvey, 1990) is the recommended diagnostic test. Following the outcome of the F-test (and its p-value), the error autocorrelation coefficients are recorded. For an autoregressive error of order r to be estimated by RALS, these LM coefficients provide good initial values, from which the iterative optimization can be commenced. The LM test is calculated by regressing the residuals on all the regressors of the original model and the lagged residuals for lags p to r (missing residuals are set to zero). The LM test χ2(r-p+1) is TR2 from this regression (or the F-equivalent), and the error autocorrelation coefficients are the coefficients of the lagged residuals. For an excellent exposition, see Pagan (1984).
Let μ, σx2 denote the mean and variance of {xt}, and write μi=E[ xt-μ] i, so that σx2=μ2. The skewness and kurtosis are defined as:
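| \sqrt{\beta_1} = \frac{\mu_3}{\mu_2^{3/2}} \quad \text{and} \quad \beta_2 = \frac{\mu_4}{\mu_2^{2}}. |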
Sample counterparts are defined by
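| \bar x = \frac{1}{T} \sum_{t=1}^{T} x_t, \quad m_i = \frac{1}{T} \sum_{t=1}^{T} (x_t - \bar x)^i, \quad \sqrt{b_1} = \frac{m_3}{m_2^{3/2}} \quad \text{and} \quad b_2 = \frac{m_4}{m_2^{2}}. |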
A normal variate will have √β1=0 and β2=3. Bowman and Shenton (1975) consider the test:
| e_1 = \frac{T (\sqrt{b_1})^2}{6} + \frac{T (b_2 - 3)^2}{24} \; \overset{a}{\sim} \; \chi^2(2), |
which subsequently was derived as an LM test by Jarque and Bera (1987). Unfortunately e1 has rather poor small sample properties: √b1 and b2 are not independently distributed, and the sample kurtosis especially approaches normality very slowly. The test reported by PcGive is based on Doornik and Hansen (1994), who employ a small sample correction, and adapt the test for the multivariate case. It derives from Shenton and Bowman (1977), who give b2 (conditional on b2>1+b1) a gamma distribution, and D'Agostino (1970), who approximates the distribution of √b1 by the Johnson Su system. Let z1 and z2 denote the transformed skewness and kurtosis, where the transformation creates statistics which are much closer to standard normal. The test statistic is:
| e_2 = z_1^2 + z_2^2 \; \overset{app}{\sim} \; \chi^2(2). |
Table 18.1 compares (eq:18.19) with its asymptotic form (eq:18.18). It gives the rejection frequencies under the null of normality, using χ2(2) critical values. The experiments are based on 10000 replications and common random numbers.
| | nominal probabilities of e2 | | | | nominal probabilities of (eq:18.18) | | | |
| T | 20% | 10% | 5% | 1% | 20% | 10% | 5% | 1% |
| 50 | 0.1734 | 0.0869 | 0.0450 | 0.0113 | 0.0939 | 0.0547 | 0.0346 | 0.0175 |
| 100 | 0.1771 | 0.0922 | 0.0484 | 0.0111 | 0.1258 | 0.0637 | 0.0391 | 0.0183 |
| 150 | 0.1845 | 0.0937 | 0.0495 | 0.0131 | 0.1456 | 0.0703 | 0.0449 | 0.0188 |
| 250 | 0.1889 | 0.0948 | 0.0498 | 0.0133 | 0.1583 | 0.0788 | 0.0460 | 0.0180 |
PcGive reports the following statistics under the normality test option, replacing xt by the residuals ut:
| mean | x̄ |
| standard deviation | σx=(m2)½ |
| skewness | √b1 |
| excess kurtosis | b2-3 |
| minimum | |
| maximum | |
| asymptotic test | e1 |
| normality test χ2( 2) | e2 [ P( χ2( 2) ≥e2) ] |
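Most entries in the table above are simple functions of the sample moments. The Python sketch below computes them together with the asymptotic statistic e1, assuming the usual definitions (central moments mi divided by T, skewness m3/m2^(3/2), kurtosis m4/m2^2, and e1 = T·skewness²/6 + T·(kurtosis-3)²/24). The small-sample statistic e2 is not reproduced because the transformations z1 and z2 are not given here; the function name and data are illustrative.

```python
def moment_summary(x):
    """Mean, standard deviation, skewness, excess kurtosis, range and e1."""
    T = len(x)
    mean = sum(x) / T
    m2, m3, m4 = (sum((xt - mean) ** i for xt in x) / T for i in (2, 3, 4))
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    e1 = T * skewness ** 2 / 6.0 + T * (kurtosis - 3.0) ** 2 / 24.0
    return {
        "mean": mean,
        "std_dev": m2 ** 0.5,
        "skewness": skewness,
        "excess_kurtosis": kurtosis - 3.0,
        "minimum": min(x),
        "maximum": max(x),
        "e1": e1,
    }

summary = moment_summary([-2.0, -1.0, 0.0, 1.0, 2.0])
print(summary)
```

For the symmetric example the skewness is zero, so e1 is driven entirely by the kurtosis term.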
This test is based on White (1980), and involves an auxiliary regression of {ût2} on the original regressors ( xit) and all their squares (xit2). The null is unconditional homoscedasticity, and the alternative is that the variance of the {ut} process depends on xt and on the xit2. The output comprises TR2, the F-test equivalent, the coefficients of the auxiliary regression, and their individual t-statistics, to help highlight problem variables. Variables that are redundant when squared are automatically removed, as are observations that have a residual that is (almost) zero. Some additional information can be found in Volume II.
This test is that of White (1980), and only calculated if there is a large number of observations relative to the number of variables in the regression. It is based on an auxiliary regression of the squared residuals ( ût2) on all squares and cross-products of the original regressors (that is, on r=½k( k+1) variables). That is, if T>>k( k+1) , the test is calculated; redundant variables are automatically removed, as are observations that have a residual that is (almost) zero. The usual χ2 and F-values are reported; coefficients of the auxiliary regression are also shown with their t-statistics to help with model respecification. This is a general test for heteroscedastic errors: H0 is that the errors are homoscedastic or, if heteroscedasticity is present, it is unrelated to the xs.
In previous versions of PcGive this test used to be called a test for functional form mis-specification. That terminology was criticized by Godfrey and Orme (1994), who show that the test does not have power against omitted variables.
This is the ARCH (AutoRegressive Conditional Heteroscedasticity) test (see Engle, 1982), which in the present form tests the hypothesis γ=0 in the model:
| E[ ut2|ut-1,...,ut-r] =c0+∑i=1rγiut-i2 |
where γ=( γ1,...,γr) '. Again, we have TR2 as the χ2 test from the regression of ût2 on a constant and ût-12 to ût-r2 (called the ARCH test) which is asymptotically distributed as χ2( r) on H0: γ=0. The F-form is also reported. Both first-order and higher-order lag forms are easily calculated (see Engle, 1982, and Engle, Hendry and Trumbull, 1985).
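For r=1 the auxiliary regression has a constant and one regressor, so its R2 is the squared correlation between ût2 and ût-12, and the test can be sketched without matrix algebra. In the Python illustration below, T is taken as the number of observations in the auxiliary regression (an assumption; PcGive's exact sample convention is not stated here), and the χ2(1) tail probability uses the identity P(χ2(1)≥x)=erfc(√(x/2)). The residuals are made up.

```python
import math

def arch1_test(resid):
    """TR^2 from regressing u_t^2 on a constant and u_{t-1}^2, with chi^2(1) p-value."""
    u2 = [u * u for u in resid]
    y, x = u2[1:], u2[:-1]                      # dependent variable and its first lag
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r2 = sxy * sxy / (sxx * syy)                # R^2 of a simple regression
    stat = n * r2
    pval = math.erfc(math.sqrt(stat / 2.0))     # P(chi^2(1) >= stat)
    return stat, r2, pval

stat, r2, pval = arch1_test([1.0, 2.0, 1.0, 2.0, 1.0, 2.0])
print(stat, r2, pval)
```

Higher-order versions need a multiple regression on r lags of the squared residuals, which this sketch does not attempt.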
The RESET test (Regression Specification Test) due to Ramsey (1969) tests the null of correct specification of the original model against the alternative that powers of ŷt such as (ŷt2, ŷt3...) have been omitted (PcGive only allows squares). This tests to see if the original functional form is incorrect, by adding powers of linear combinations of xs since by construction, ŷt=xt'β̂t.
We use RESET23 for the test that uses squares and cubes, while RESET refers to the test just using squares.
Parameter instability statistics are reported for σ2, followed by the joint statistic for all the parameters in the model (also see §18.4.9), based on the approach in Hansen (1992). Next, the instability statistic is printed for each parameter ( β1,...,βk,σ2).
Large values reveal non-constancy (marked by * or **), and indicate a fragile model. Note that this measures within-sample parameter constancy, and is computed if numerically feasible (it may fail owing to dummy variables), so no observations need be reserved. The indicated significance is only valid in the absence of non-stationary regressors.
The LM tests for autocorrelation, heteroscedasticity and functional form require an auxiliary regression involving the original regressors xit. NLS uses ∂f(xt,θ)/∂θi (evaluated at θ̂) instead. The auxiliary regression for the autocorrelation test is:
|
|
These three tests are not computed for models estimated using ML.
Writing the model in matrix form as y=Xβ+u, the null hypothesis of p linear restrictions can be expressed as H0 : Rβ=r, with R a (p×k) matrix and r a p×1 vector. This test is well explained in most econometrics textbooks, and uses the unrestricted estimates (that is, it is a Wald test).
The subset form of the linear restrictions tests is: H0: βi=...=βj=0: any choice of coefficients can be made, so a wide range of specification hypotheses can be tested.
Writing θ̂ =β̂, with corresponding variance-covariance matrix V[ θ̂ ] , we can test for (non-) linear restrictions of the form:
| f( θ) =0. |
The null hypothesis H0:f(θ)=0 will be tested against H1:f(θ)≠0 through a Wald test:
| w=f( θ̂ ) '( ĴV[ θ̂ ] ̃Ĵ') -1f( θ̂ ) |
where J is the Jacobian matrix of the transformation: J=∂f(θ)/∂θ'. PcGive computes Ĵ by numerical differentiation. The statistic w has a χ2(s) distribution, where s is the number of restrictions (that is, equations in f(.)). The null hypothesis is rejected if we observe a significant test statistic.
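For a single restriction (s=1) the Wald statistic reduces to f(θ̂)² divided by the scalar ĴV̂Ĵ', which makes the calculation easy to follow. The Python sketch below uses central differences for Ĵ; the step size, the example restriction θ0θ1=1, and the estimates and covariance matrix are all assumptions chosen for illustration, not PcGive's settings.

```python
def numerical_jacobian_row(f, theta, h=1e-6):
    """Central-difference derivatives of a scalar restriction f with respect to theta."""
    row = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += h
        down[i] -= h
        row.append((f(up) - f(down)) / (2.0 * h))
    return row

def wald_single_restriction(f, theta_hat, V):
    """w = f(theta)^2 / (J V J') for one restriction; asymptotically chi^2(1) under H0."""
    J = numerical_jacobian_row(f, theta_hat)
    k = len(theta_hat)
    jvj = sum(J[i] * V[i][j] * J[j] for i in range(k) for j in range(k))
    return f(theta_hat) ** 2 / jvj

restriction = lambda th: th[0] * th[1] - 1.0      # H0: theta_0 * theta_1 = 1
theta_hat = [2.0, 0.6]                            # made-up estimates
V = [[0.04, 0.0], [0.0, 0.01]]                    # made-up covariance matrix
w = wald_single_restriction(restriction, theta_hat, V)
print(w)
```

With several restrictions the scalar division becomes the inverse of the s×s matrix ĴV̂Ĵ', as in the formula above.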
Lag polynomials of any variable in the database can be tested for omission. Variables that would change the sample or are already in the model are automatically deleted. The model itself remains unchanged. If the model is written in matrix form as y=Xβ+Zγ+u, then H0: γ=0 is being tested. The test exploits the fact that on H0:
|
|
|
|
for p added variables.
Since ( X'X) -1 is precalculated, the F-statistic is easily computed by partitioned inversion. Computations for IVE are more involved.
Finally, PcGive has specific procedures programmed to operate when a general-to-specific mode is adopted.
In PcGive, when a model is specified and estimated by least squares or instrumental variables, then the general dynamic analysis is offered: see §18.3.
However, while the tests offered are a comprehensive set of Wald statistics on variables, lags and long-run outcomes, a reduction sequence can involve many linear transformations (differencing, creating differentials etc.) as well as eliminations. Consequently, as the reduction proceeds, PcGive monitors its progress, which can be reviewed at the progress menu. The main statistics reported comprise:
Once appropriate data representations have been selected, it is of interest
to see whether the chosen model can explain (that is, account for) results
reported by other investigators. Often attention has focused on the ability
of chosen models to explain each other's residual variances (variance
encompassing), and PcGive provides the facility for doing so using test
statistics based on Cox (1961) as suggested by Pesaran (1974). Full
details of those computed by PcGive for OLS and IVE are provided in Ericsson (1983). Note that a badly-fitting model should be rejected against
well-fitting models on such tests, and that care is required in interpreting
any outcome in which a well-fitting model (which satisfies all of the other
criteria) is rejected against a
badly-fitting, or silly, model (see Mizon, 1984; Mizon and Richard, 1986;
and Hendry and Richard, 1989). The Sargan test is for the restricted reduced form
parsimoniously encompassing the unrestricted reduced form, which is implicitly defined by
projecting yt on all of the non-modelled variables. The F-test
is for each model parsimoniously encompassing
the union of the two models. This is the only one of these tests which is invariant to the
choice of common regressors in the two models.
Thus, the F-test yields the same numerical outcome for the first model parsimoniously encompassing either the union of the two models under consideration, or the orthogonal complement to the first model relative to the union. In PcGive, tests of both models encompassing the other are reported.
Ahumada, H. (1985). "An encompassing test of two models of the balance of trade for Argentina" Oxford Bulletin of Economics and Statistics, 47, 51--70.
Alexander, C. (2001). Market Models: A Guide to Financial Data Analysis. Chichester: John Wiley and Sons.
Amemiya, T. (1981). "Qualitative response models: A survey" Journal of Economic Literature, 19, 1483--1536.
Amemiya, T. (1985). Advanced Econometrics. Oxford: Basil Blackwell.
Anderson, T. W. (1971). The Statistical Analysis of Time Series. New York: John Wiley & Sons.
Andrews, D. W. K. (1991). "Heteroskedasticity and autocorrelation consistent covariance matrix estimation" Econometrica, 59, 817--858.
Baba, Y., Hendry, D. F., and Starr, R. M. (1992). "The demand for M1 in the U.S.A., 1960--1988" Review of Economic Studies, 59, 25--61.
Banerjee, A., Dolado, J. J., Galbraith, J. W., and Hendry, D. F. (1993). Co-integration, Error Correction and the Econometric Analysis of Non-Stationary Data. Oxford: Oxford University Press.
Banerjee, A., Dolado, J. J., Hendry, D. F., and Smith, G. W. (1986). "Exploring equilibrium relationships in econometrics through static models: Some Monte Carlo evidence" Oxford Bulletin of Economics and Statistics, 48, 253--277.
Banerjee, A., Dolado, J. J., and Mestre, R. (1998). "Error-correction mechanism tests for cointegration in a single equation framework" Journal of Time Series Analysis, 19, 267--283.
Banerjee, A., and Hendry, D. F.(eds.)(1992a). Testing Integration and Cointegration. Oxford Bulletin of Economics and Statistics: 54.
Banerjee, A., and Hendry, D. F. (1992b). "Testing integration and cointegration: An overview" Oxford Bulletin of Economics and Statistics, 54, 225--255.
Bårdsen, G. (1989). "The estimation of long run coefficients from error correction models" Oxford Bulletin of Economics and Statistics, 50.
Bentzel, R., and Hansen, B. (1955). "On recursiveness and interdependency in economic models" Review of Economic Studies, 22, 153--168.
Bollerslev, T., Chou, R. S., and Kroner, K. F. (1992). "ARCH modelling in finance -- A review of the theory and empirical evidence" Journal of Econometrics, 52, 5--59.
Bontemps, C., and Mizon, G. E. (2003). "Congruence and encompassing" In Stigum, B. P.(ed.), Econometrics and the Philosophy of Economics, pp. 354--378. Princeton: Princeton University Press.
Bowman, K. O., and Shenton, L. R. (1975). "Omnibus test contours for departures from normality based on √b1 and b2" Biometrika, 62, 243--250.
Box, G. E. P., and Jenkins, G. M. (1976). Time Series Analysis, Forecasting and Control. San Francisco: Holden-Day. First published, 1970.
Box, G. E. P., and Pierce, D. A. (1970). "Distribution of residual autocorrelations in autoregressive-integrated moving average time series models" Journal of the American Statistical Association, 65, 1509--1526.
Breusch, T. S., and Pagan, A. R. (1980). "The Lagrange multiplier test and its applications to model specification in econometrics" Review of Economic Studies, 47, 239--253.
Brown, R. L., Durbin, J., and Evans, J. M. (1975). "Techniques for testing the constancy of regression relationships over time (with discussion)" Journal of the Royal Statistical Society B, 37, 149--192.
Campos, J., Ericsson, N. R., and Hendry, D. F. (1996). "Cointegration tests in the presence of structural breaks" Journal of Econometrics, 70, 187--220.
Chambers, E. A., and Cox, D. R. (1967). "Discrimination between alternative binary response models" Biometrika, 54, 573--578.
Chow, G. C. (1960). "Tests of equality between sets of coefficients in two linear regressions" Econometrica, 28, 591--605.
Clements, M. P., and Hendry, D. F. (1998). Forecasting Economic Time Series. Cambridge: Cambridge University Press.
Clements, M. P., and Hendry, D. F. (1999). Forecasting Non-stationary Economic Time Series. Cambridge, Mass.: MIT Press.
Cochrane, D., and Orcutt, G. H. (1949). "Application of least squares regression to relationships containing auto-correlated error terms" Journal of the American Statistical Association, 44, 32--61.
Cox, D. R. (1961). "Tests of separate families of hypotheses" In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, pp. 105--123 Berkeley: University of California Press.
Cramer, J. S. (1986). Econometric Applications of Maximum Likelihood Methods. Cambridge: Cambridge University Press.
D'Agostino, R. B. (1970). "Transformation to normality of the null distribution of g1" Biometrika, 57, 679--681.
Davidson, J. E. H., Hendry, D. F., Srba, F., and Yeo, J. S. (1978). "Econometric modelling of the aggregate time-series relationship between consumers' expenditure and income in the United Kingdom" Economic Journal, 88, 661--692. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Davidson, R., and MacKinnon, J. G. (1993). Estimation and Inference in Econometrics. New York: Oxford University Press.
Dickey, D. A., and Fuller, W. A. (1981). "Likelihood ratio statistics for autoregressive time series with a unit root" Econometrica, 49, 1057--1072.
Doan, T., Litterman, R., and Sims, C. A. (1984). "Forecasting and conditional projection using realistic prior distributions" Econometric Reviews, 3, 1--100.
Doornik, J. A. (2007a). "Autometrics" Mimeo, Department of Economics, University of Oxford.
Doornik, J. A. (2007b). Object-Oriented Matrix Programming using Ox 6th ed. London: Timberlake Consultants Press.
Doornik, J. A. (2008). "Encompassing and automatic model selection" Oxford Bulletin of Economics and Statistics, 70, 915--925.
Doornik, J. A. (2009). "Autometrics" In Castle, J. L., and Shephard, N.(eds.), The Methodology and Practice of Econometrics: Festschrift in Honour of David F. Hendry. Forthcoming, Oxford: Oxford University Press.
Doornik, J. A., and Hansen, H. (1994). "A practical test for univariate and multivariate normality" Discussion paper, Nuffield College.
Doornik, J. A., and Hendry, D. F. (1992). PCGIVE 7: An Interactive Econometric Modelling System. Oxford: Institute of Economics and Statistics, University of Oxford.
Doornik, J. A., and Hendry, D. F. (1994). PcGive 8: An Interactive Econometric Modelling System. London: International Thomson Publishing, and Belmont, CA: Duxbury Press.
Doornik, J. A., and Hendry, D. F. (2006). Interactive Monte Carlo Experimentation in Econometrics Using PcNaive 2nd ed. London: Timberlake Consultants Press.
Doornik, J. A., and Hendry, D. F. (2009a). Econometric Modelling using PcGive: Volume III 3rd ed. London: Timberlake Consultants Press.
Doornik, J. A., and Hendry, D. F. (2009b). Modelling Dynamic Systems using PcGive: Volume II 4th ed. London: Timberlake Consultants Press.
Doornik, J. A., and Hendry, D. F. (2009c). OxMetrics: An Interface to Empirical Modelling 6th ed. London: Timberlake Consultants Press.
Doornik, J. A., and Ooms, M. (2006). Introduction to Ox 2nd ed. London: Timberlake Consultants Press.
Durbin, J. (1970). "Testing for serial correlation in least squares regression when some of the regressors are lagged dependent variables" Econometrica, 38, 410--421.
Eicker, F. (1967). "Limit theorems for regressions with unequal and dependent errors" In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, pp. 59--82 Berkeley: University of California.
Eisner, R., and Strotz, R. H. (1963). Determinants of Business Investment. Englewood Cliffs, N.J.: Prentice-Hall.
Emerson, R. A., and Hendry, D. F. (1996). "An evaluation of forecasting using leading indicators" Journal of Forecasting, 15, 271--291. Reprinted in T.C. Mills (ed.), Economic Forecasting. Edward Elgar, 1999.
Engle, R. F. (1982). "Autoregressive conditional heteroscedasticity, with estimates of the variance of United Kingdom inflation" Econometrica, 50, 987--1007.
Engle, R. F. (1984). "Wald, likelihood ratio, and Lagrange multiplier tests in econometrics" in Griliches, and Intriligator 1984, Ch. 13.
Engle, R. F., and Granger, C. W. J. (1987). "Cointegration and error correction: Representation, estimation and testing" Econometrica, 55, 251--276.
Engle, R. F., Hendry, D. F., and Richard, J.-F. (1983). "Exogeneity" Econometrica, 51, 277--304. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000; in Ericsson, N. R. and Irons, J. S. (eds.) Testing Exogeneity, Oxford: Oxford University Press, 1994; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Engle, R. F., Hendry, D. F., and Trumbull, D. (1985). "Small sample properties of ARCH estimators and tests" Canadian Journal of Economics, 43, 66--93.
Engle, R. F., and White, H.(eds.)(1999). Cointegration, Causality and Forecasting. Oxford: Oxford University Press.
Engler, E., and Nielsen, B. (2009). "The empirical process of autoregressive residuals" Econometrics Journal, forthcoming.
Ericsson, N. R. (1983). "Asymptotic properties of instrumental variables statistics for testing non-nested hypotheses" Review of Economic Studies, 50, 287--303.
Ericsson, N. R. (1992). "Cointegration, exogeneity and policy analysis: An overview" Journal of Policy Modeling, 14, 251--280.
Ericsson, N. R., and Hendry, D. F. (1999). "Encompassing and rational expectations: How sequential corroboration can imply refutation" Empirical Economics, 24, 1--21.
Ericsson, N. R., and Irons, J. S. (1995). "The Lucas critique in practice: Theory without measurement" In Hoover, K. D.(ed.), Macroeconometrics: Developments, Tensions and Prospects, pp. 263--312. Dordrecht: Kluwer Academic Press.
Ericsson, N. R., and MacKinnon, J. G. (2002). "Distributions of error correction tests for cointegration" Econometrics Journal, 5, 285--318.
Escribano, A. (1985). "Non-linear error correction: The case of money demand in the UK (1878--1970)" Mimeo, University of California at San Diego.
Favero, C., and Hendry, D. F. (1992). "Testing the Lucas critique: A review" Econometric Reviews, 11, 265--306.
Finney, D. J. (1947). "The estimation from individual records of the relationship between dose and quantal response" Biometrika, 34, 320--334.
Fletcher, R. (1987). Practical Methods of Optimization, 2nd ed. New York: John Wiley & Sons.
Friedman, M., and Schwartz, A. J. (1982). Monetary Trends in the United States and the United Kingdom: Their Relation to Income, Prices, and Interest Rates, 1867--1975. Chicago: University of Chicago Press.
Frisch, R. (1934). Statistical Confluence Analysis by means of Complete Regression Systems. Oslo: University Institute of Economics.
Frisch, R. (1938). "Statistical versus theoretical relations in economic macrodynamics" Mimeograph dated 17 July 1938, League of Nations Memorandum. Reproduced by University of Oslo in 1948 with Tinbergen's comments. Contained in Memorandum `Autonomy of Economic Relations', 6 November 1948, Oslo, Universitets Økonomiske Institutt. Reprinted in Hendry D. F. and Morgan M. S. (1995), The Foundations of Econometric Analysis. Cambridge: Cambridge University Press.
Frisch, R., and Waugh, F. V. (1933). "Partial time regression as compared with individual trends" Econometrica, 1, 221--223.
Gilbert, C. L. (1986). "Professor Hendry's econometric methodology" Oxford Bulletin of Economics and Statistics, 48, 283--307. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.
Gilbert, C. L. (1989). "LSE and the British approach to time-series econometrics" Oxford Review of Economic Policy, 41, 108--128.
Godfrey, L. G. (1978). "Testing for higher order serial correlation in regression equations when the regressors include lagged dependent variables" Econometrica, 46, 1303--1313.
Godfrey, L. G. (1988). Misspecification Tests in Econometrics. Cambridge: Cambridge University Press.
Godfrey, L. G., and Orme, C. D. (1994). "The sensitivity of some general checks to omitted variables in the linear model" International Economic Review, 35, 489--506.
Golub, G. H., and Van Loan, C. F. (1989). Matrix Computations. Baltimore: The Johns Hopkins University Press.
Granger, C. W. J. (1969). "Investigating causal relations by econometric models and cross-spectral methods" Econometrica, 37, 424--438.
Granger, C. W. J. (1986). "Developments in the study of cointegrated economic variables" Oxford Bulletin of Economics and Statistics, 48, 213--228.
Granger, C. W. J., and Newbold, P. (1974). "Spurious regressions in econometrics" Journal of Econometrics, 2, 111--120.
Granger, C. W. J., and Newbold, P. (1977). "The time series approach to econometric model building" In Sims, C. A.(ed.), New Methods in Business Cycle Research, pp. 7--21. Minneapolis: Federal Reserve Bank of Minneapolis.
Granger, C. W. J., and Newbold, P. (1986). Forecasting Economic Time Series, 2nd ed. New York: Academic Press.
Gregory, A. W., and Veale, M. R. (1985). "Formulating Wald tests of non-linear restrictions" Econometrica, 53, 1465--1468.
Griliches, Z., and Intriligator, M. D.(eds.)(1984). Handbook of Econometrics, Vol. 2. Amsterdam: North-Holland.
Hansen, B. E. (1992). "Testing for parameter instability in linear models" Journal of Policy Modeling, 14, 517--533.
Harvey, A. C. (1981). The Econometric Analysis of Time Series. Deddington: Philip Allan.
Harvey, A. C. (1990). The Econometric Analysis of Time Series, 2nd ed. Hemel Hempstead: Philip Allan.
Harvey, A. C. (1993). Time Series Models, 2nd ed. Hemel Hempstead: Harvester Wheatsheaf.
Harvey, A. C., and Collier, P. (1977). "Testing for functional misspecification in regression analysis" Journal of Econometrics, 6, 103--119.
Harvey, A. C., and Shephard, N. G. (1992). "Structural time series models" In Maddala, G. S., Rao, C. R., and Vinod, H. D.(eds.), Handbook of Statistics, Vol. 11. Amsterdam: North-Holland.
Hendry, D. F. (1976). "The structure of simultaneous equations estimators" Journal of Econometrics, 4, 51--88. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Hendry, D. F. (1979). "Predictive failure and econometric modelling in macro-economics: The transactions demand for money" In Ormerod, P.(ed.), Economic Modelling, pp. 217--242. London: Heinemann. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F. (1980). "Econometrics: Alchemy or science?" Economica, 47, 387--406. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Hendry, D. F.(ed.)(1986a). Econometric Modelling with Cointegrated Variables. Oxford Bulletin of Economics and Statistics: 48.
Hendry, D. F. (1986b). "Econometric modelling with cointegrated variables: An overview" Oxford Bulletin of Economics and Statistics, 48, 201--212. Reprinted in R.F. Engle and C.W.J. Granger (eds), Long-Run Economic Relationships, Oxford: Oxford University Press, 1991, 51--63.
Hendry, D. F. (1986c). "Using PC-GIVE in econometrics teaching" Oxford Bulletin of Economics and Statistics, 48, 87--98.
Hendry, D. F. (1987). "Econometric methodology: A personal perspective" In Bewley, T. F.(ed.), Advances in Econometrics, pp. 29--48. Cambridge: Cambridge University Press. Reprinted in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F. (1988). "The encompassing implications of feedback versus feedforward mechanisms in econometrics" Oxford Economic Papers, 40, 132--149. Reprinted in Ericsson, N. R. and Irons, J. S. (eds.) Testing Exogeneity, Oxford: Oxford University Press, 1994; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F. (1989). "Comment on intertemporal consumer behaviour under structural changes in income" Econometric Reviews, 8, 111--121.
Hendry, D. F. (1993). Econometrics: Alchemy or Science? Oxford: Blackwell Publishers.
Hendry, D. F. (1995a). Dynamic Econometrics. Oxford: Oxford University Press.
Hendry, D. F. (1995b). "On the interactions of unit roots and exogeneity" Econometric Reviews, 14, 383--419.
Hendry, D. F. (1995c). "A theory of co-breaking" Mimeo, Nuffield College, University of Oxford.
Hendry, D. F. (1996). "On the constancy of time-series econometric equations" Economic and Social Review, 27, 401--422.
Hendry, D. F. (1997). "On congruent econometric relations: A comment" Carnegie--Rochester Conference Series on Public Policy, 47, 163--190.
Hendry, D. F. (2000a). Econometrics: Alchemy or Science? Oxford: Oxford University Press. New Edition.
Hendry, D. F. (2000b). "Epilogue: The success of general-to-specific model selection" in Econometrics: Alchemy or Science? 2000a, pp. 467--490. New Edition.
Hendry, D. F., and Anderson, G. J. (1977). "Testing dynamic specification in small simultaneous systems: An application to a model of building society behaviour in the United Kingdom" In Intriligator, M. D.(ed.), Frontiers in Quantitative Economics, Vol. 3, pp. 361--383. Amsterdam: North Holland Publishing Company. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Hendry, D. F., and Doornik, J. A. (1994). "Modelling linear dynamic econometric systems" Scottish Journal of Political Economy, 41, 1--33.
Hendry, D. F., and Doornik, J. A. (1997). "The implications for econometric modelling of forecast failure" Scottish Journal of Political Economy, 44, 437--461. Special Issue.
Hendry, D. F., and Ericsson, N. R. (1991). "Modeling the demand for narrow money in the United Kingdom and the United States" European Economic Review, 35, 833--886.
Hendry, D. F., Johansen, S., and Santos, C. (2004). "Selecting a regression saturated by indicators" Unpublished paper, Economics Department, University of Oxford.
Hendry, D. F., and Juselius, K. (2000). "Explaining cointegration analysis: Part I" Energy Journal, 21, 1--42.
Hendry, D. F., and Krolzig, H.-M. (1999). "Improving on `Data mining reconsidered' by K.D. Hoover and S.J. Perez" Econometrics Journal, 2, 202--219. Reprinted in J. Campos, N.R. Ericsson and D.F. Hendry (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F., and Krolzig, H.-M. (2001). Automatic Econometric Model Selection. London: Timberlake Consultants Press.
Hendry, D. F., and Krolzig, H.-M. (2005). "The properties of automatic Gets modelling" Economic Journal, 115, C32--C61.
Hendry, D. F., and Massmann, M. (2007). "Co-breaking: Recent advances and a synopsis of the literature" Journal of Business and Economic Statistics, 25, 33--51.
Hendry, D. F., and Mizon, G. E. (1978). "Serial correlation as a convenient simplification, not a nuisance: A comment on a study of the demand for money by the Bank of England" Economic Journal, 88, 549--563. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F., and Mizon, G. E. (1993). "Evaluating dynamic econometric models by encompassing the VAR" In Phillips, P. C. B.(ed.), Models, Methods and Applications of Econometrics, pp. 272--300. Oxford: Basil Blackwell. Reprinted in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F., and Mizon, G. E. (1999). "The pervasiveness of Granger causality in econometrics" in Engle, and White 1999, pp. 102--134.
Hendry, D. F., and Morgan, M. S. (1989). "A re-analysis of confluence analysis" Oxford Economic Papers, 41, 35--52.
Hendry, D. F., and Morgan, M. S. (1995). The Foundations of Econometric Analysis. Cambridge: Cambridge University Press.
Hendry, D. F., and Neale, A. J. (1987). "Monte Carlo experimentation using PC-NAIVE" In Fomby, T., and Rhodes, G. F.(eds.), Advances in Econometrics, Vol. 6, pp. 91--125. Greenwich, Connecticut: Jai Press Inc.
Hendry, D. F., and Neale, A. J. (1988). "Interpreting long-run equilibrium solutions in conventional macro models: A comment" Economic Journal, 98, 808--817.
Hendry, D. F., and Neale, A. J. (1991). "A Monte Carlo study of the effects of structural breaks on tests for unit roots" In Hackl, P., and Westlund, A. H.(eds.), Economic Structural Change, Analysis and Forecasting, pp. 95--119. Berlin: Springer-Verlag.
Hendry, D. F., Neale, A. J., and Ericsson, N. R. (1991). PC-NAIVE, An Interactive Program for Monte Carlo Experimentation in Econometrics. Version 6.0. Oxford: Institute of Economics and Statistics, University of Oxford.
Hendry, D. F., Neale, A. J., and Srba, F. (1988). "Econometric analysis of small linear systems using Pc-Fiml" Journal of Econometrics, 38, 203--226.
Hendry, D. F., and Nielsen, B. (2007). Econometric Modeling: A Likelihood Approach. Princeton: Princeton University Press.
Hendry, D. F., Pagan, A. R., and Sargan, J. D. (1984). "Dynamic specification" in Griliches, and Intriligator 1984, pp. 1023--1100. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F., and Richard, J.-F. (1982). "On the formulation of empirical models in dynamic econometrics" Journal of Econometrics, 20, 3--33. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press and in Hendry D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers 1993, and Oxford University Press, 2000; and in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F., and Richard, J.-F. (1983). "The econometric analysis of economic time series (with discussion)" International Statistical Review, 51, 111--163. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Hendry, D. F., and Richard, J.-F. (1989). "Recent developments in the theory of encompassing" In Cornet, B., and Tulkens, H.(eds.), Contributions to Operations Research and Economics. The XXth Anniversary of CORE, pp. 393--440. Cambridge, MA: MIT Press. Reprinted in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Hendry, D. F., and Srba, F. (1980). "AUTOREG: A computer program library for dynamic econometric models with autoregressive errors" Journal of Econometrics, 12, 85--102. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Hendry, D. F., and von Ungern-Sternberg, T. (1981). "Liquidity and inflation effects on consumers' expenditure" In Deaton, A. S.(ed.), Essays in the Theory and Measurement of Consumers' Behaviour, pp. 237--261. Cambridge: Cambridge University Press. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Hendry, D. F., and Wallis, K. F.(eds.)(1984). Econometrics and Quantitative Economics. Oxford: Basil Blackwell.
Hooker, R. H. (1901). "Correlation of the marriage rate with trade" Journal of the Royal Statistical Society, 64, 485--492. Reprinted in Hendry, D. F. and Morgan, M. S. (1995), The Foundations of Econometric Analysis. Cambridge: Cambridge University Press.
Hoover, K. D., and Perez, S. J. (1999). "Data mining reconsidered: Encompassing and the general-to-specific approach to specification search" Econometrics Journal, 2, 167--191.
Jarque, C. M., and Bera, A. K. (1987). "A test for normality of observations and regression residuals" International Statistical Review, 55, 163--172.
Johansen, S. (1988). "Statistical analysis of cointegration vectors" Journal of Economic Dynamics and Control, 12, 231--254. Reprinted in R.F. Engle and C.W.J. Granger (eds), Long-Run Economic Relationships, Oxford: Oxford University Press, 1991, 131--52.
Johansen, S. (1995). Likelihood-based Inference in Cointegrated Vector Autoregressive Models. Oxford: Oxford University Press.
Johansen, S., and Juselius, K. (1990). "Maximum likelihood estimation and inference on cointegration -- With application to the demand for money" Oxford Bulletin of Economics and Statistics, 52, 169--210.
Judd, J., and Scadding, J. (1982). "The search for a stable money demand function: A survey of the post-1973 literature" Journal of Economic Literature, 20, 993--1023.
Judge, G. G., Griffiths, W. E., Hill, R. C., Lütkepohl, H., and Lee, T.-C. (1985). The Theory and Practice of Econometrics, 2nd ed. New York: John Wiley.
Kiviet, J. F. (1986). "On the rigor of some mis-specification tests for modelling dynamic relationships" Review of Economic Studies, 53, 241--261.
Kiviet, J. F. (1987). Testing Linear Econometric Models. Amsterdam: University of Amsterdam.
Kiviet, J. F., and Phillips, G. D. A. (1992). "Exact similar tests for unit roots and cointegration" Oxford Bulletin of Economics and Statistics, 54, 349--367.
Kohn, A. (1987). False Prophets. Oxford: Basil Blackwell.
Koopmans, T. C.(ed.)(1950). Statistical Inference in Dynamic Economic Models. No. 10 in Cowles Commission Monograph. New York: John Wiley & Sons.
Koopmans, T. C., Rubin, H., and Leipnik, R. B. (1950). "Measuring the equation systems of dynamic economics" in Koopmans 1950, Ch. 2.
Kremers, J. J. M., Ericsson, N. R., and Dolado, J. J. (1992). "The power of cointegration tests" Oxford Bulletin of Economics and Statistics, 54, 325--348.
Kuh, E., Belsley, D. A., and Welsh, R. E. (1980). Regression Diagnostics. Identifying Influential Data and Sources of Collinearity. Wiley Series in Probability and Mathematical Statistics. New York: John Wiley.
Leamer, E. E. (1978). Specification Searches. Ad-Hoc Inference with Non-Experimental Data. New York: John Wiley.
Leamer, E. E. (1983). "Let's take the con out of econometrics" American Economic Review, 73, 31--43. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.
Ljung, G. M., and Box, G. E. P. (1978). "On a measure of lack of fit in time series models" Biometrika, 65, 297--303.
Lovell, M. C. (1983). "Data mining" Review of Economics and Statistics, 65, 1--12.
Lucas, R. E. (1976). "Econometric policy evaluation: A critique" In Brunner, K., and Meltzer, A.(eds.), The Phillips Curve and Labor Markets, Vol. 1 of Carnegie-Rochester Conferences on Public Policy, pp. 19--46. Amsterdam: North-Holland Publishing Company.
MacKinnon, J. G. (1991). "Critical values for cointegration tests" In Engle, R. F., and Granger, C. W. J.(eds.), Long-Run Economic Relationships, pp. 267--276. Oxford: Oxford University Press.
MacKinnon, J. G., and White, H. (1985). "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties" Journal of Econometrics, 29, 305--325.
Makridakis, S., Wheelwright, S. C., and Hyndman, R. C. (1998). Forecasting: Methods and Applications 3rd ed. New York: John Wiley and Sons.
Marschak, J. (1953). "Economic measurements for policy and prediction" In Hood, W. C., and Koopmans, T. C.(eds.), Studies in Econometric Method, No. 14 in Cowles Commission Monograph. New York: John Wiley & Sons.
Mizon, G. E. (1977). "Model selection procedures" In Artis, M. J., and Nobay, A. R.(eds.), Studies in Modern Economic Analysis, pp. 97--120. Oxford: Basil Blackwell.
Mizon, G. E. (1984). "The encompassing approach in econometrics" in Hendry, and Wallis 1984, pp. 135--172.
Mizon, G. E. (1995). "A simple message for autocorrelation correctors: Don't" Journal of Econometrics, 69, 267--288.
Mizon, G. E., and Hendry, D. F. (1980). "An empirical application and Monte Carlo analysis of tests of dynamic specification" Review of Economic Studies, 49, 21--45. Reprinted in Hendry, D. F., Econometrics: Alchemy or Science? Oxford: Blackwell Publishers, 1993, and Oxford University Press, 2000.
Mizon, G. E., and Richard, J.-F. (1986). "The encompassing principle and its application to non-nested hypothesis tests" Econometrica, 54, 657--678.
Nelson, C. R. (1972). "The prediction performance of the FRB-MIT-PENN model of the US economy" American Economic Review, 62, 902--917. Reprinted in T.C. Mills (ed.), Economic Forecasting. Edward Elgar, 1999.
Newey, W. K., and West, K. D. (1987). "A simple positive semi-definite heteroskedasticity and autocorrelation-consistent covariance matrix" Econometrica, 55, 703--708.
Nickell, S. J. (1985). "Error correction, partial adjustment and all that: An expository note" Oxford Bulletin of Economics and Statistics, 47, 119--130.
Nielsen, B. (2006a). "Correlograms for non-stationary autoregressions" Journal of the Royal Statistical Society B, 68, 707--720.
Nielsen, B. (2006b). "Correlograms for non-stationary autoregressions" Journal of the Royal Statistical Society B, 68, 707--720.
Pagan, A. R. (1984). "Model evaluation by variable addition" in Hendry, and Wallis 1984, pp. 103--135.
Perron, P. (1989). "The Great Crash, the oil price shock and the unit root hypothesis" Econometrica, 57, 1361--1401.
Pesaran, M. H. (1974). "On the general problem of model selection" Review of Economic Studies, 41, 153--171. Reprinted in Campos, J., Ericsson, N.R. and Hendry, D.F. (eds.), General to Specific Modelling. Edward Elgar, 2005.
Phillips, P. C. B. (1986). "Understanding spurious regressions in econometrics" Journal of Econometrics, 33, 311--340.
Phillips, P. C. B. (1987). "Time series regression with a unit root" Econometrica, 55, 277--301.
Phillips, P. C. B. (1991). "Optimal inference in cointegrated systems" Econometrica, 59, 283--306.
Poincaré, H. (1905). Science and Hypothesis. New York: Science Press.
Priestley, M. B. (1981). Spectral Analysis and Time Series. London: Academic Press.
Ramsey, J. B. (1969). "Tests for specification errors in classical linear least squares regression analysis" Journal of the Royal Statistical Society B, 31, 350--371.
Richard, J.-F. (1980). "Models with several regimes and changes in exogeneity" Review of Economic Studies, 47, 1--20.
Sargan, J. D. (1958). "The estimation of economic relationships using instrumental variables" Econometrica, 26, 393--415.
Sargan, J. D. (1959). "The estimation of relationships with autocorrelated residuals by the use of instrumental variables" Journal of the Royal Statistical Society B, 21, 91--105. Reprinted as pp. 87--104 in Sargan J. D. (1988), Contributions to Econometrics, Vol. 1, Cambridge: Cambridge University Press.
Sargan, J. D. (1964). "Wages and prices in the United Kingdom: A study in econometric methodology (with discussion)" In Hart, P. E., Mills, G., and Whitaker, J. K. (eds.), Econometric Analysis for National Economic Planning, Vol. 16 of Colston Papers, pp. 25--63. London: Butterworth Co. Reprinted as pp. 275--314 in Hendry D. F. and Wallis K. F. (eds.) (1984). Econometrics and Quantitative Economics. Oxford: Basil Blackwell, and as pp. 124--169 in Sargan J. D. (1988), Contributions to Econometrics, Vol. 1, Cambridge: Cambridge University Press.
Sargan, J. D. (1980a). "The consumer price equation in the post-war British economy. An exercise in equation specification testing" Review of Economic Studies, 47, 113--135.
Sargan, J. D. (1980b). "Some tests of dynamic specification for a single equation" Econometrica, 48, 879--897. Reprinted as pp. 191--212 in Sargan J. D. (1988), Contributions to Econometrics, Vol. 1, Cambridge: Cambridge University Press.
Shenton, L. R., and Bowman, K. O. (1977). "A bivariate model for the distribution of √b1 and b2" Journal of the American Statistical Association, 72, 206--211.
Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. London: Chapman and Hall.
Sims, C. A. (1980). "Macroeconomics and reality" Econometrica, 48, 1--48. Reprinted in Granger, C. W. J. (ed.) (1990), Modelling Economic Series. Oxford: Clarendon Press.
Sims, C. A., Stock, J. H., and Watson, M. W. (1990). "Inference in linear time series models with some unit roots" Econometrica, 58, 113--144.
Spanos, A. (1986). Statistical Foundations of Econometric Modelling. Cambridge: Cambridge University Press.
White, H. (1980). "A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity" Econometrica, 48, 817--838.
White, H. (1984). Asymptotic Theory for Econometricians. London: Academic Press.
White, H. (1990). "A consistent model selection procedure based on m-testing" In Granger, C. W. J. (ed.), Modelling Economic Series, pp. 369--383. Oxford: Clarendon Press.
Wooldridge, J. M. (1999). "Asymptotic properties of some specification tests in linear models with integrated processes" In Engle and White (1999), pp. 366--384.
Working, E. J. (1927). "What do statistical demand curves show?" Quarterly Journal of Economics, 41, 212--235.
Yule, G. U. (1926). "Why do we sometimes get nonsense-correlations between time-series? A study in sampling and the nature of time series (with discussion)" Journal of the Royal Statistical Society, 89, 1--64. Reprinted in Hendry, D. F. and Morgan, M. S. (1995), The Foundations of Econometric Analysis. Cambridge: Cambridge University Press.