An application of a modified wave equation to forecasting

The objective is to set out a quantitative framework relating the cyclical behaviour revealed by factor analysis to a wave motion, and hence to show that the cycle forecasts better than either naive momentum or value.

1 Waves and prices

  • The wave equation has wide applicability in science and engineering.

\(\frac{\partial^2y}{\partial t^2} = v^2\frac{\partial^2y}{\partial x^2}\) - (1) classical wave equation

  • In the equation \(t\) is time, but the other symbols depend upon the problem at hand. In general \(y\) is ‘what the wave disturbs’, \(x\) is ‘extension along the path of the wave’, and \(v\) is a constant with units extension/time, normally ‘velocity’. A simple example: in an ocean wave \(y\) is vertical displacement, \(x\) is distance perpendicular to the wavefront, and \(v\) is phase velocity.

  • In its application to property prices we will consider one factor derived from factor analysis on the covariance of log price/m2, in levels. This analysis is described in detail elsewhere. For present purposes, log price/m2 for area \(i\), written \(p(i,t)\), has four components:

    • a : ‘average’, mean log price level since 1995
    • m : factor 1, ‘the market’, a weighted average across all indices
    • c : factors 2,3 ‘cyclical’ or ‘relative value’, two factors which capture cyclical behaviour
    • b : factors 4:k ‘balancing item’ which makes the equation exact

\(p(i,t) = a_i + a_{im}.m(t) + a_{ic}.c_i(t) + b_i(t)\) - (2) price factor decomposition

  • The generalised wave equation will be applied to cyclical component c, and the other components are not considered in this discussion. The symbol \(y\) in (1) therefore refers to \(c\) in (2), and these symbols are used interchangeably henceforth.

  • The dimension \(x\) in (1) here corresponds to the counter \(i\) on the log price indices, meaning that the indices are ranked from 1 for prime central London to a high integer somewhere in the north-east. For the numerical example we consider fewer than 10 postcode areas, either close to central London or nationally.

  • To make the wave equation practical for sampled data we write it thus, the subscript denoting the domain for differencing:

\(\Delta_t^2(y) =v^2\Delta_i^2(y)\) - (3) finite difference form

  • Stepping back just for a moment, consider the intuition of the wave equation.
    • It says that if y is not linear with respect to x and so has a non-zero 2nd derivative, y then accelerates toward equilibrium to correct the deviation, then overshoots, resulting in oscillation
    • Perhaps the simplest analogy is a plucked string which mean-reverts, overshoots, and hence vibrates
    • In our application, if an area is priced above the normal relation to its neighbours in price/m2 then it will underperform, and vice-versa - this is logical and intuitive
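The overshoot can be seen in a minimal numerical sketch of equation (3), solved for the next value of \(y\). Everything specific here is an assumption made for illustration rather than taken from the analysis: the value \(v^2=0.25\), the edge areas being held fixed, and the starting shape, in which one area out of nine sits 0.10 above its neighbours.

```python
import numpy as np

def wave_step(y_prev, y_curr, v2):
    """Advance one period: second time difference = v2 * cross-sectional curvature."""
    y_next = y_curr.copy()  # edge areas are held fixed (an assumption)
    curvature = y_curr[2:] - 2.0 * y_curr[1:-1] + y_curr[:-2]
    y_next[1:-1] = 2.0 * y_curr[1:-1] - y_prev[1:-1] + v2 * curvature
    return y_next

# Hypothetical start: 9 ranked areas at rest, area 4 priced 0.10 above its neighbours
y_prev = np.zeros(9)
y_prev[4] = 0.10
y_curr = y_prev.copy()

path = [y_curr[4]]  # track the overpriced area through time
for _ in range(6):
    y_prev, y_curr = y_curr, wave_step(y_prev, y_curr, v2=0.25)
    path.append(y_curr[4])

print(np.round(path, 4))
```

In this sketch the overpriced area first falls back towards its neighbours and then passes below them, which is the mean-reversion followed by overshoot described in the bullets above.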

Armed with these definitions we are now ready to do some econometrics and hence to see whether this plausible formulation finds support in the data.

2 Generalised wave equation

Equation (3) can be rewritten in the following more useful form, which places the future index \(t+1\) on the left and the current values on the right:

\(\Delta_ty(i,t+1) = \Delta_ty(i,t)+v^2[\Delta_{i}y(i+1,t)- \Delta_{i}y(i,t)]\) - (4)

This could be considered a restricted version of the following general form which encompasses systems with ‘damping’ or ‘loss’:

\(\Delta_ty(i,t+1) = a_1+a_2.\Delta_ty(i,t)+a_3.y(i,t)+a_4.\Delta_iy(i,t) + a_5\Delta_i^2y(i,t)\) - (5) generalised difference form

This form encompasses some important special cases:

\(a_3=a_4=a_5=0\) - autoregressive or ‘continuation’ or ‘momentum’ form, with \(0<a_2<1\)

\(a_2=a_4=a_5=0\) - mean-reversion or ‘value-driven’ form, with \(-1<a_3<0\)

In addition, setting \(a_2=0\) removes the second difference in time and gives the corresponding finite difference heat equation.

Terms 3, 4 and 5 are a level, a first difference and a second difference in cross-section; they could be rewritten as two first differences or three levels without changing the equation. However, this form is probably better for estimation, since it almost certainly has less multicollinearity and also gives the restrictions meaningful interpretations.
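As an illustration of the ‘three levels’ rewriting, and assuming the backward first difference and centred second difference used in equation (4), the cross-sectional terms of (5) expand as follows (time index suppressed, \(y_i\) standing for \(y(i,t)\)):

\(a_3y_i + a_4(y_i - y_{i-1}) + a_5(y_{i+1} - 2y_i + y_{i-1}) = a_5y_{i+1} + (a_3 + a_4 - 2a_5)y_i + (a_5 - a_4)y_{i-1}\)

The two sides are algebraically identical, but on the right each coefficient mixes several of the \(a\) terms, so a restriction such as \(a_4=a_5=0\) no longer corresponds to dropping a single regressor. The levels of neighbouring areas would also be expected to be more highly correlated with one another than a level, a slope and a curvature.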

3 Results

We estimate equation (5) by OLS, then consider the special cases.

For these purposes the entire dataset of 104 postcode areas at monthly frequency is resampled in time to 24 calendar years and in cross-section to 9 representative national areas, or alternatively to 9 taken from the Greater London area. For London these are WC, W, NW, N, KT, E, UB, CR and RM; nationally they are WC, NW, SE, CR, CO, B, M, BL and TS. In both cases they are reasonably evenly spaced in log price/m2.
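The construction of the regressors can be sketched as follows. This is a minimal illustration rather than the code used for the results: the function name `build_panel` is hypothetical, the input is synthetic random data standing in for the cyclical component, and the curvature is assumed to be the centred second difference, which is consistent with both edge areas dropping out.

```python
import numpy as np

def build_panel(c):
    """c: float array (areas, years) of the cyclical component.
    Returns (X, y) for the unrestricted form of equation (5)."""
    d_c = np.full_like(c, np.nan)      # d.c(i,t) = c(i,t) - c(i,t-1)
    d_c[:, 1:] = c[:, 1:] - c[:, :-1]
    D_c = np.full_like(c, np.nan)      # D.c(i,t) = c(i,t) - c(i-1,t)
    D_c[1:, :] = c[1:, :] - c[:-1, :]
    DD_c = np.full_like(c, np.nan)     # centred curvature (assumed), undefined at both edges
    DD_c[1:-1, :] = c[2:, :] - 2.0 * c[1:-1, :] + c[:-2, :]
    target = np.full_like(c, np.nan)   # d.c(i,t+1), the next-year change
    target[:, :-1] = d_c[:, 1:]
    cols = [np.ones_like(c), d_c, c, D_c, DD_c]  # ordered a_1 .. a_5
    X = np.stack([col.ravel() for col in cols], axis=1)
    y = target.ravel()
    keep = ~np.isnan(X).any(axis=1) & ~np.isnan(y)  # drop edges and end years
    return X[keep], y[keep]

# Synthetic placeholder data: 9 areas by 24 years (not the actual dataset)
rng = np.random.default_rng(0)
c = rng.normal(scale=0.05, size=(9, 24)).cumsum(axis=1)

X, y = build_panel(c)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)  # OLS estimates of a_1 .. a_5
print(X.shape)
```

With 9 areas and 24 years, losing the two edge areas and two years to differencing leaves 7 × 22 = 154 rows, which matches the observation count reported for the unrestricted regression.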

Regression results for London universe
Dependent variable: d.c(t+1) = next year log return

| | unrestricted | heat equation | momentum | mean-reversion |
|---|---|---|---|---|
| c | -0.279*** | -0.350*** | | -0.113*** |
| | (0.025) | (0.030) | | (0.032) |
| d.c | 0.608*** | | 0.728*** | |
| | (0.063) | | (0.050) | |
| D.c | -0.665*** | -1.596*** | | |
| | (0.131) | (0.113) | | |
| D.D.c | -0.339*** | -0.812*** | | |
| | (0.085) | (0.088) | | |
| Constant | -0.0003 | 0.002 | -0.001 | 0.003 |
| | (0.002) | (0.002) | (0.002) | (0.003) |
| Observations | 154 | 161 | 198 | 207 |
| R2 | 0.747 | 0.588 | 0.520 | 0.056 |
| Adjusted R2 | 0.740 | 0.580 | 0.517 | 0.051 |
| Residual Std. Error | 0.019 (df = 149) | 0.024 (df = 157) | 0.028 (df = 196) | 0.038 (df = 205) |
| F Statistic | 109.766*** (df = 4; 149) | 74.603*** (df = 3; 157) | 212.060*** (df = 1; 196) | 12.073*** (df = 1; 205) |

Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
D operator is difference in cross-section: D.x(i,t)=x(i,t)-x(i-1,t)
d operator is difference in time: d.x(i,t)=x(i,t)-x(i,t-1)

Regression results for national universe
Dependent variable: d.c(t+1) = next year log return

| | unrestricted | heat equation | momentum | mean-reversion |
|---|---|---|---|---|
| c | -0.183*** | -0.080** | | -0.114*** |
| | (0.024) | (0.032) | | (0.032) |
| d.c | 0.719*** | | 0.729*** | |
| | (0.056) | | (0.049) | |
| D.c | -0.169*** | -0.555*** | | |
| | (0.049) | (0.055) | | |
| D.D.c | -0.061 | -0.201* | | |
| | (0.078) | (0.109) | | |
| Constant | 0.001 | 0.001 | 0.0002 | -0.0005 |
| | (0.002) | (0.002) | (0.002) | (0.003) |
| Observations | 154 | 161 | 198 | 207 |
| R2 | 0.728 | 0.432 | 0.527 | 0.057 |
| Adjusted R2 | 0.721 | 0.421 | 0.525 | 0.052 |
| Residual Std. Error | 0.020 (df = 149) | 0.029 (df = 157) | 0.031 (df = 196) | 0.044 (df = 205) |
| F Statistic | 99.630*** (df = 4; 149) | 39.738*** (df = 3; 157) | 218.395*** (df = 1; 196) | 12.409*** (df = 1; 205) |

Note: *p<0.1; **p<0.05; ***p<0.01. Standard errors in parentheses.
D operator is difference in cross-section: D.x(i,t)=x(i,t)-x(i-1,t)
d operator is difference in time: d.x(i,t)=x(i,t)-x(i,t-1)

The results show that the generalised wave equation strongly dominates the restricted versions of the regression, with adjusted \(R^2\) of 0.72 and 0.74 for the National and London universes respectively. The nearest contender nationally is momentum, with adjusted \(R^2\) of 0.525; in London it is the heat equation at 0.580, ahead of momentum at 0.517.

The signs of the coefficients are as expected. For example, d.c takes the value 0.7 in the national model; in a ‘pure wave equation’ model it would take the value 1, because moving d.c to the left-hand side then gives the second difference in time of c. The restrictions have highly significant F statistics, so we reject them.

4 Current forecasts

Using the unrestricted specification we can feed in the current values for the level \(c\), \(\Delta_t(c)\) (d.c or momentum), \(\Delta_i(c)\) (D.c or cross-sectional price slope) and \(\Delta_i^2(c)\) (D.D.c or cross-sectional price curvature).
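The arithmetic of a single forecast can be sketched as below. The coefficients are the unrestricted London estimates from the regression table; the input values for one area are hypothetical numbers chosen only to show the calculation, and the function name `cyclical_forecast` is likewise an illustrative assumption.

```python
# Unrestricted London coefficients, copied from the regression table
LONDON = {"const": -0.0003, "d_c": 0.608, "c": -0.279, "D_c": -0.665, "DD_c": -0.339}

def cyclical_forecast(coef, c, d_c, D_c, DD_c):
    """Next-year change in the cyclical component implied by equation (5)."""
    return (coef["const"]
            + coef["d_c"] * d_c     # momentum
            + coef["c"] * c         # level (mean-reversion)
            + coef["D_c"] * D_c     # cross-sectional slope
            + coef["DD_c"] * DD_c)  # cross-sectional curvature

# Hypothetical current values for one area (not taken from the data)
forecast = cyclical_forecast(LONDON, c=0.02, d_c=-0.01, D_c=0.01, DD_c=0.005)
print(round(forecast, 4))
```

For an edge area the curvature input is undefined, so no forecast can be formed from this specification.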

The results are tabulated below. They are 1-year forecasts for the cyclical part of log return, driven by the ‘ripple effect’ and captured in the generalised wave equation. Being 1-year forecasts they exclude seasonal effects; seasonals are addressed in another white paper.

London model

| Area | price/m2 | log(price/m2) | 12m cyclical forecast |
|---|---|---|---|
| WC: London | 14900 | 9.6 | NA |
| W: London | 11900 | 9.4 | -0.031 |
| NW: London | 9000 | 9.1 | -0.027 |
| N: London | 7200 | 8.9 | -0.037 |
| E: London | 6300 | 8.8 | -0.030 |
| KT: Kingston | 5900 | 8.7 | -0.020 |
| UB: Southall | 5000 | 8.5 | -0.034 |
| CR: Croydon | 4600 | 8.4 | -0.027 |
| RM: Romford | 4200 | 8.4 | NA |

National model

| Area | price/m2 | log(price/m2) | 12m cyclical forecast |
|---|---|---|---|
| WC: London | 14900 | 9.6 | NA |
| NW: London | 9000 | 9.1 | -0.025 |
| SE: London | 6700 | 8.8 | -0.027 |
| CR: Croydon | 4600 | 8.4 | -0.018 |
| CO: Colchester | 3100 | 8.0 | 0.003 |
| B: Birmingham | 2600 | 7.9 | 0.013 |
| M: Manchester | 2400 | 7.8 | 0.025 |
| BL: Bolton | 1900 | 7.5 | 0.036 |
| TS: Cleveland | 1500 | 7.3 | NA |

  • The ‘edge’ areas do not have forecasts in this specification because the price curvature is undefined.

  • The forecasts for NW and CR differ across the two models because neither the coefficients nor the cross-sectional slope and curvature data are the same for these areas. Slope and curvature are measured per step in the ranking \(i\), so the units of ‘x’ are arbitrary and depend on which areas are sampled. Nevertheless, the differences are less than 1%.

  • In general we can see that the negative cyclical return which started in London in early 2016 is rippling slowly outwards, and the positive return which preceded it has reached the northern centres.

5 Conclusions

  • A finite difference version of the wave equation can be fitted to price data, with significant coefficients on all components in the London universe and on all but the curvature term nationally

  • The equation dominates three alternative specifications

  • It has significant forecast power both nationally and in the London region, with \(R^2\) exceeding 0.7

  • There is a simple and powerful intuition behind the quantified ‘ripple effect’: quite simply that when prices are ‘out of line’ in the 2nd derivative, they mean-revert

  • The forecasts are plausible and economically significant in magnitude

Author: Giles Heywood