Dispersion produces winners and losers. Returns commonly vary widely even within a locality, and the variation is larger still at the national level. In early 2022 the effect is very strong, with a range of 23% over 12 months: investors in the most affordable districts have experienced a 21% return versus -2% in Prime London. The rapid reappearance of significant return dispersion starting in 2020 confirms that the cycle, which had been somewhat ‘on pause’ since 2018, is now continuing, with the so-called ‘ripple effect’ propagating through ever-lower-priced regions. Return dispersion is the focus of this document. Several analytical insights in conjunction reveal what is happening. The parameters selected in the first steps are guided by insights only clarified in later steps, but let us start at the beginning.
Indices summarise homeowners’ holding period return in the most accurate way possible by minimising the deviation of the fitted returns across the entire dataset in each locality. Because the entire set of transactions is available from the Land Registry, statements about accuracy can be justified on the full relevant dataset rather than on a sample. Furthermore, two key parameters of the indices - spatial and temporal sampling - can be customised once the systematic drivers have been revealed via factor analysis. The reference methodology for repeat sales index (RSI) construction is S&P Case-Shiller (Shiller, R.J., 1994. Macro Markets. OUP Oxford). The benchmark for accuracy comparisons is the Land Registry UK HPI, which also uses a repeat sales method.
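The repeat sales idea can be illustrated with a minimal sketch. It assumes a plain, unweighted least-squares fit on log returns; the interval weighting used in the Case-Shiller methodology is omitted, and the function name and toy data are hypothetical rather than taken from the indices described here.

```python
import numpy as np

def repeat_sales_index(pairs, n_periods):
    """Estimate a log price index from repeat sales by ordinary least squares.

    pairs: list of (buy_period, sell_period, log_return), 0 <= buy < sell < n_periods.
    Period 0 is the base, so its log index level is fixed at 0.
    """
    X = np.zeros((len(pairs), n_periods - 1))
    y = np.zeros(len(pairs))
    for row, (buy, sell, log_ret) in enumerate(pairs):
        if buy > 0:
            X[row, buy - 1] = -1.0   # index level at purchase enters negatively
        X[row, sell - 1] = 1.0       # index level at sale enters positively
        y[row] = log_ret
    levels, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.concatenate([[0.0], levels])

# Hypothetical toy data: every holding period earns exactly the index return.
true_index = np.array([0.00, 0.05, 0.08, 0.15])
toy_pairs = [(b, s, true_index[s] - true_index[b])
             for b in range(4) for s in range(b + 1, 4)]
fitted_index = repeat_sales_index(toy_pairs, n_periods=4)
```

With noise-free pairs the fit recovers the underlying log index; with real transactions the fitted levels are those that minimise the squared deviation of fitted from achieved holding period returns.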
UK HPI is less accurate. Accuracy is quantified across multiple major cities using the standard deviation of the difference between the index-derived return over a homeowner’s holding period and the return they actually achieved. The same metric is used for both our index and the benchmark UK HPI. The error rates are very similar, but our RSI methodology is consistently slightly lower, averaging 5.9% versus 6.3% for UK HPI. UK HPI is not seasonally adjusted whereas our indices are, so errors are measured in a way that neutralises seasonality and minimises its impact on the accuracy metrics. One credible reason for the higher accuracy may be differences in the definition of city boundaries.
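The accuracy metric described above can be sketched as follows. This is a minimal illustration that assumes returns are measured in logs and uses the population standard deviation; the function name, index levels and sale prices are hypothetical.

```python
import math
import statistics

def holding_period_error(index_levels, sales):
    """Standard deviation of (index-implied return - achieved return).

    index_levels: dict mapping period -> index level (not logs).
    sales: list of (buy_period, sell_period, buy_price, sell_price).
    Returns are measured in logs (an assumption of this sketch).
    """
    diffs = []
    for buy, sell, p_buy, p_sell in sales:
        implied = math.log(index_levels[sell] / index_levels[buy])
        achieved = math.log(p_sell / p_buy)
        diffs.append(implied - achieved)
    return statistics.pstdev(diffs)

# Hypothetical data: two sales track the index, one outperforms it.
toy_index = {0: 100.0, 1: 110.0, 2: 121.0}
toy_sales = [(0, 1, 200_000, 220_000),
             (0, 1, 100_000, 121_000),
             (1, 2, 300_000, 330_000)]
toy_error = holding_period_error(toy_index, toy_sales)
```

Applying the same function to two competing indices over the same set of sales gives directly comparable error rates.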
UK HPI is more volatile. Comparing our monthly RSI against UK HPI confirms that the trend is extremely similar, but UK HPI has both a strong seasonal cycle and high-frequency noise, so our monthly return volatility is 0.8% versus UK HPI of 1.8% for the example shown. The noise component is the result of over-fitting the data, and normally increases in-sample accuracy to the detriment of out-of-sample ‘true’ accuracy - hence the term over-fitting. A second reason for our higher accuracy is that both for estimation and appraisal we use all relevant holding periods: those prior to a given index date, those spanning it, and those subsequent. For an index provider or government body there are practical reasons for not including subsequent periods, otherwise index revisions never cease. In this document our primary objective is to understand the drivers of return, therefore accuracy trumps timeliness.
Adaptive binning calibrates the whole range of behaviour. For index accuracy evaluation purposes two key parameters were chosen to match the UK HPI benchmark: a monthly time increment and city-level indices. Free now to select the parameters more appropriately, it will be shown that a set of 10 indices with affordability-ranked geographical sampling can accurately span the range of behaviour across all cities. Combined with adaptive time-binning to sample the cyclical risk regularly, the result is increased accuracy of the factor model and forecasts derived from it. A quick visual inspection of cumulative log return for the 10 indices shows that high-priced indices (n=10,9,…) outperformed long-term, that they lead the low-priced indices (n=1,2,…) in a cycle, and that the 2008 bear market affected all simultaneously.
Conventional regional segmentation of the country can be greatly improved upon to get a smaller number of indices with high accuracy in replicating holding period returns. Factor model insights in the next section will strongly suggest that a region-based approach can be bettered by classifying into just a few affordability bins. For now we show the map of 10 spatial bins defined at the level of over 2,000 postcode districts, the exact motivation being clarified in the next section. It should be clear that the mid-range blue/green affordability bins are extensive with plentiful data, while at either end of the spectrum the geographic locus is deliberately much more tightly defined. The motivation is to achieve even risk-sampling whilst preserving sufficient data-points at the edges of the distribution.
Time-bins do not have to be regular. Following conventions borrowed from physical science and signal processing it is commonly assumed a priori that time should have equal bin lengths, for example a month, quarter or year. However this is not required in this application, and given the importance of the cyclical components of return it is more useful to divide time into bins of approximately equal cyclical risk without compromising signal/noise. A suitable increment is approximately 5-6% of spread return per time-bin, which accrues over intervals ranging from 62 to 891 days and averaging 224 days. One notable exception is the special 425-day time-bin 2007-12-31 to 2009-02-28 - the Global Financial Crisis (GFC). In that bin, there was insignificant cyclical risk despite being a relatively long and tumultuous period in all markets, including property.
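One way to picture equal-risk time-binning is the sketch below. It assumes that risk is accumulated as the sum of squared cyclical returns and that a bin closes once the square root of that sum reaches a target; the 0.055 target is simply a hypothetical midpoint of the 5-6% range, and the actual rule used for the indices may differ.

```python
import math

def adaptive_bins(spread_returns, target=0.055):
    """Split a return series into bins of roughly equal accumulated risk.

    spread_returns: list of (r2, r3) cyclical log returns per base period.
    target: risk per bin (hypothetical midpoint of the 5-6% range).
    Returns a list of (start, end) index pairs, end exclusive.
    """
    bins, start, accumulated = [], 0, 0.0
    for i, (r2, r3) in enumerate(spread_returns):
        accumulated += r2 * r2 + r3 * r3       # realised variance adds over time
        if math.sqrt(accumulated) >= target:   # close the bin once risk is reached
            bins.append((start, i + 1))
            start, accumulated = i + 1, 0.0
    if start < len(spread_returns):            # trailing, still-incomplete bin
        bins.append((start, len(spread_returns)))
    return bins

quiet = [(0.01, 0.0)] * 62   # low cyclical risk: many periods per bin
busy = [(0.04, 0.04)] * 3    # high cyclical risk: one period per bin
example_bins = adaptive_bins(quiet + busy)
```

In this toy series the quiet stretch is covered by two long bins while each busy period forms a bin of its own, which is the behaviour that produces bins of very unequal calendar length.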
A factor model for asset returns recognises that each asset’s returns are largely driven by systematic forces (Connor, G., Goldberg, L.R. and Korajczyk, R.A., 2010. Portfolio Risk Analysis. Princeton University Press), and that the diversity of observed behaviour can be largely replicated by linear combinations of a much smaller number of factor time-series. This focus on a small number of independent drivers is sometimes called dimension reduction.
There are three main forms of factor model in use by portfolio managers (Heywood, G., Marsland, J. and Morrison, G., 2003. Practical Risk Management for Equity Portfolio Managers. British Actuarial Journal, 9(5), pp.1061-1123. doi:10.1017/S1357321700004463), and for our purposes the third ‘statistical’ kind is the focus of attention. Here the factors and the individual asset sensitivities – also known as loadings or betas (\(\beta\)) – are jointly derived from the return covariance matrix. This category of model has the maximum achievable explanatory power and the factors are independent of one another. The principal disadvantage is the difficulty of explaining it, but it will be seen that in this application to residential property the factors have a direct interpretation based on firm economic foundations.
Factors are ranked in order of descending explanatory power k=1,2,…, such that each in turn explains the maximum possible of what remains unexplained. In our application the first 3 factors are dominant, explaining 97% of the variance of return across our 10 indices. From the perspective of an early-stage homeowner seeking a leg-up on the property ladder it is less important what the index does, and attention should rather be focused on relative returns. These return spreads between pairs of indices, unlike the indices themselves, are no longer dominated by the first factor, instead having 74% explained by cyclical risk factors 2 and 3 and only 6.7% explained by factor 1.
Holdings are the weights of indices in the factors, where factors are weighted (linear) combinations of indices. The first factor is a positive-weighted average with higher weights assigned to the extremes of affordability, the second is a spread between the low-price and high-price indices, while the third is a spread between mid-price and extreme indices. The scaling of each of these is such that the positive weights sum to 1. The second has a weight of 76% on the negative side. A +5% return to this factor indicates a 5% return to the spread between affordable indices and unaffordable ones. For convenience later on, we call the high priced bins ‘A’, the mid-priced ‘B’ and the low-priced ‘C’. In this terminology factor 2 is ‘C-A’, which – it should be obvious – is a spread. The same is true of factor 3 onward, so as a general statement, factors after the first are all spreads.
\(z=h^\intercal.x\), where \(x\): index; \(z\): factor; \(h\): factor holdings.
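A statistical factor model of this kind can be sketched with an eigen-decomposition of the return covariance matrix. The code below is an illustration under stated assumptions: plain principal components on an equally weighted sample covariance, holdings rescaled so that positive weights sum to 1, and simulated toy returns. The function name and data are hypothetical.

```python
import numpy as np

def statistical_factors(returns, n_factors=3):
    """Principal-component factors from a (periods x indices) return matrix.

    Returns (holdings, factors, explained):
      holdings  - (indices x n_factors); each column's positive weights sum to 1
      factors   - (periods x n_factors); z = h^T x for every period
      explained - share of total variance captured by each factor
    """
    cov = np.cov(returns, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_factors]   # largest first
    explained = eigvals[order] / eigvals.sum()
    holdings = eigvecs[:, order].copy()
    for k in range(n_factors):
        col = holdings[:, k]
        if col.sum() < 0:                           # resolve eigenvector sign ambiguity
            col = -col
        holdings[:, k] = col / col[col > 0].sum()   # positive weights sum to 1
    factors = returns @ holdings
    return holdings, factors, explained

# Hypothetical toy data: 10 indices driven by a market move and a price-ranked spread.
rng = np.random.default_rng(0)
market = rng.normal(0.0, 0.03, size=200)
spread = rng.normal(0.0, 0.02, size=200)
tilt = np.linspace(-1.0, 1.0, 10)
toy_returns = (market[:, None] + spread[:, None] * tilt
               + rng.normal(0.0, 0.002, size=(200, 10)))
holdings, factors, explained = statistical_factors(toy_returns)
```

Because the columns of `holdings` are rescaled eigenvectors of the covariance matrix, the resulting factor series are uncorrelated in sample, and `explained` is in descending order by construction.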
The 3 factors are uncorrelated. Factor 1 shows a strong and fairly steady up-trend with mean 6.1% p.a. and volatility 6.4% p.a., punctuated by a fall of 17.4% in the 2008 crisis (GFC) and a further 2.7% in the 96 days thereafter. It took in total 6.2 years to recover the heights of December 2007. Factor 2 shows a cycle with three turning points dated October 2001, March 2006 and March 2016, so on that metric the trough-trough cycle period was 14.4 years with an upswing period of 4.4 years and a downswing of 10 years. The peak-trough amplitude is 0.62 in logs (86%), and the current upswing has so far returned 0.31 in logs (36%). Factor 3 shows a more irregular cycle which leads factor 2 and has the same period. The amplitude of the last cycle was 0.32 in logs (37%) which is 0.74 times the factor 2 amplitude. Its last trough was in June 2013 and the current upswing has so far returned 0.22. Factors 2 and 3 are analogous to a sine and cosine wave and are not on average rewarded – they represent pure risk, of significant magnitude. By contrast the risk in factor 1 is rewarded quite strongly and reasonably persistently. These series, and in particular \(z_2\), are more regularly sampled by using equal risk increments whereby factors 2 and 3 in conjunction generate a target amount of variance, which we can loosely refer to as ‘spread return’ or ‘spread vol’. This criterion defines our time-bins.
Beta is the sensitivity of each index to each factor in turn. Focusing for now on the cyclical factors \(\beta_2\) and \(\beta_3\), we see that the indices are all some distance from (0,0) so they all have cyclical risk. 9 representative postcode areas along the price spectrum are labelled, and all 104 postcode areas’ indices are plotted with a single point each. They form an arc with high-priced indices at low \(\beta_2\), curving around to lowest-priced indices at high \(\beta_2\), the beta values broadly mimicking the factor holdings \(h\) which were shown earlier. It is now clear that the indices 1:10 are designed to be approximately regularly-spaced in \(\beta_{2,3}\) space, and the extreme indices can precisely target the pricing extremes from below £1,000\(/m^2\) including e.g. TS1 to over £15,000\(/m^2\) in e.g. W1.
Linear Factor Model for attribution: \(x=\alpha+\beta.h^\intercal.x+u\), where \(x\): index; \(\alpha\): abnormal return; \(\beta\): factor loadings; \(h\): factor holdings; \(u\): residual.
Indices can be exactly attributed into factor components using the Linear Factor Model equation. It is convenient to consider aggregate contributions from factors 2 and 3 as a single cyclical component, thus emphasising that all indices are in a single cycle, each at a different phase. Currently index n=10 (Prime London) is in a cyclical downswing, whilst index n=1 (e.g. Teesside) is exactly phase-reversed, as seen on their betas. The residual plus alpha shown in grey not only has low volatility – this is guaranteed by the way the residual is derived from low-ranked factors – but also has low mean alpha.
This finding is useful since it justifies our focus on the cycle as the main driver of index spreads, being the key risk driver for the holding period of a typical young homeowner.
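The attribution can be sketched in a few lines. This is a minimal illustration that assumes the loadings are estimated by least squares of each index on the factor series with an intercept; the function name, the toy returns and the two-column holdings matrix are hypothetical.

```python
import numpy as np

def attribute(returns, holdings):
    """Split index returns into alpha, factor contributions and residual.

    returns:  (periods x indices) matrix x
    holdings: (indices x factors) matrix h, so that z = h^T x
    Returns (alpha, beta, contributions, residual), where contributions[k]
    is the (periods x indices) part of return due to factor k.
    """
    z = returns @ holdings                            # factor returns
    design = np.column_stack([np.ones(len(z)), z])    # intercept + factors
    coef, *_ = np.linalg.lstsq(design, returns, rcond=None)
    alpha, beta = coef[0], coef[1:]                   # beta: (factors x indices)
    contributions = [np.outer(z[:, k], beta[k]) for k in range(z.shape[1])]
    residual = returns - alpha - sum(contributions)
    return alpha, beta, contributions, residual

# Hypothetical toy data: 4 indices, a market average and a C-A style spread.
rng = np.random.default_rng(0)
toy_x = rng.normal(0.0, 0.02, size=(60, 4))
toy_h = np.array([[0.25, -0.5],
                  [0.25, -0.5],
                  [0.25,  0.5],
                  [0.25,  0.5]])
alpha, beta, contributions, residual = attribute(toy_x, toy_h)
```

The pieces add back to the original returns exactly. With three factors, summing `contributions[1]` and `contributions[2]` would give the single cyclical component discussed above.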
Spreads are important to homeowners looking to outperform. Over the last 12 months ending December 2021 the spread of returns relative to Prime London has been large, and it is almost entirely due to the cycle component. This reflects the strong positive returns on factor 2. The same is true over 5 years, with spreads as high as 0.4 in logs (49%). Over the 18-year cycle however the cycle by definition becomes irrelevant and return spreads are dominated by factor 1. Since \(\beta_1\) is 1.18 for Prime London versus 0.91 at the other extreme, it is the top-tier indices that are most sensitive to the long-term upswing and are rewarded accordingly. For anyone with an 18-year investment horizon and no possibility of trading, London was the best choice, although the same effect can be achieved with higher leverage, provided that appropriate lines of credit are available.
For further insight into the cycle it is useful to borrow from the physical sciences and convert the linear factor model from rectangular to polar coordinates using basic geometry. Both the cycle loadings (\(\beta_2\) ,\(\beta_3\)) and factors (\(z_2\),\(z_3\)) are converted in the same way to their corresponding values (\(r_{\beta},\theta_{\beta}\)) and (\(r_{z},\theta_{z}\)).
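The conversion is the standard rectangular-to-polar one. The sketch below assumes the angle is measured anticlockwise from the factor 2 axis and expressed in cycle units (1.0 is a full turn); the sign and origin conventions actually used for the charts are not stated here, so treat them as assumptions.

```python
import math

def to_polar(c2, c3):
    """Convert a (factor 2, factor 3) pair to (radius, phase in cycle units)."""
    radius = math.hypot(c2, c3)
    phase = math.atan2(c3, c2) / (2 * math.pi)   # cycle units: 1.0 = full turn
    return radius, phase

# The same function applies to loadings (beta_2, beta_3) and factors (z_2, z_3).
r_beta, theta_beta = to_polar(0.0, 2.0)    # pure factor-3 exposure
r_z, theta_z = to_polar(-1.0, 0.0)         # pure negative factor-2 position
```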
Factor phase \(\theta_z\) varies slowly. Having converted to polar coordinates the phase angle \(\theta_{z}\) of the driver \(z\) evolves linearly through time, and this linear trend relation gives several insights. The slope implies a period of 15.5 years since 1994, and so if we followed the trend exactly the current trend value of -0.14 would correspond to August 2006. However we are currently somewhat behind trend, tarrying near zero with the latest phase angle in cycle units being 0.01, corresponding to a date one cycle back of March 2004. The points do tend in this way to cluster around the half-cycle increments, reflecting the fact that factor 3 is generally dominated by factor 2, and that factor 2 is closer to a series of linear trends in a sawtooth ‘flip-flop’ pattern rather than a sinusoid.
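The implied period follows from the slope of the phase trend. The sketch below assumes the phase has already been unwrapped and expressed in cycle units, and fits an ordinary least-squares line; the toy series is hypothetical and constructed to advance exactly one cycle every 15.5 years.

```python
def cycle_period(times, phases):
    """Least-squares line through unwrapped phase (cycle units) against time.

    Returns (period, intercept); period = 1 / slope, in the units of `times`.
    """
    n = len(times)
    t_bar = sum(times) / n
    p_bar = sum(phases) / n
    slope = (sum((t - t_bar) * (p - p_bar) for t, p in zip(times, phases))
             / sum((t - t_bar) ** 2 for t in times))
    return 1.0 / slope, p_bar - slope * t_bar

# Hypothetical phases: half-yearly observations, time in years since 1994.
toy_times = [0.5 * i for i in range(57)]
toy_phases = [t / 15.5 - 1.0 for t in toy_times]
period, intercept = cycle_period(toy_times, toy_phases)
```

Deviations of the observed phase from the fitted line then measure how far ahead of or behind trend the cycle currently is.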
Phase and price are one. In the beta scatter it was clear that both the area and affordability-quantile indices are arranged around an arc, with price rising monotonically with the polar angle \(\theta_{\beta}\). The relation \(\theta\sim(£/m^2)\) is in fact quite linear and it should now be clear how the spatial bins were designed: to sample \(\theta_{\beta}\) approximately evenly, whilst maintaining adequate datapoints in all bins. As the factor vector \(z\) rotates through the affordability spectrum, each price-point is favoured in turn, ending with the lowest. The two linear relations in conjunction are the most unambiguous representation possible of a ‘ripple effect’.
The ripple effect is driven by affordability in a cycle which starts with high demand for the most desirable, highest priced properties following a market correction. The correction serves to reset all prices, the highest priced having experienced a slightly greater price fall due to their higher \(\beta_1\). As prices rise at the premium end and the price-spectrum expands, budget-constrained buyers seek out the more affordable districts, with the progressive catch-up leading to a ripple which propagates downward along the price spectrum. The process propagates progressively across the country as an ever-increasing proportion becomes hard to afford.
Centred on each postcode area we partition districts from the local region into price quintiles, illustrated here with East London (postcode area E). Estimating the indices on the time periods locally appropriate and thence deriving the local factor model, we find factors 1 and 2 are completely dominant. Factor 1 is again a weighted average of the local indices, but to understand the local cycle we focus on factor 2 which once again is the difference between the prime quintiles 4,5 and the more affordable 1,2. By direct analogy with the national model we label the top-price quintiles A and the most-affordable quintiles C. Factor 2 - the C-A spread - could in this example be dubbed the Plaistow-Wapping spread.
The second factor is the local low-high price (C-A) spread, illustrated here with 3 examples. The local cycle shows a large-amplitude 15-20 year cycle of log amplitude 0.22 for Peterborough to 0.34 for London. London has shown this 0.34 amplitude outperformance of C versus A since June 2014, a trend which may soon reverse. For homeowners the magnitude of this cycle is both very significant and actionable, because it is local. The GFC did not impact on these cycles; it put them ‘on pause’. The cycles are not so much sinusoids as a flip-flop between high price outperformance and low price outperformance. The late-cycle buyer of cheap districts in a cheap region is the most unfortunate: they suffered first a nationwide crash, then a 12-year high-amplitude cyclical headwind whilst also having low \(\beta_1\) during the recovery that followed. As we are late-cycle in 2022, there are warnings here.
Local and national cycles can be understood in a unified framework. The ripple effect that starts in central London passes through the commuter zone and eventually beyond. Each local region’s quintile betas \(\beta_{2,3}\)(q=1:5) lie on a slightly concave line, forming a very flat triangle. Each has a ‘local market factor’ which broadly coincides with the centroid of the triangle, plus two more which are the triangle base vector C-A and the perpendicular height. The radial coordinate \(r_{\beta}\) is smaller for desirable districts within a more affordable area, so for example a desirable district around Peterborough towards Stevenage is in sync with (having similar \(\theta_{\beta}\)) but less volatile than (having lower \(r_{\beta}\)) a similar-priced district in East London towards Ilford. Viewed in this context there is a simple link between local and national cycles which can be accurately interpreted with simple geometry.
Local cycles flip according to a simple insight from the \(\beta_{2,3}\) scatter in the previous graphic. The local C-A line provides a prediction: when the national factor-vector passes the perpendicular to C-A, the local market should flip from favouring high-priced A to low-priced C, and the local cycle reverses. Unusually, and by total coincidence, at the time of writing in early 2022 all areas nationally have entered the same state: that which favours C over A (low price over high). The most recent wave of ‘flip to C’ started around Slough (the most \(\beta_3\)-dominated C-A at the high-price end) in November 2014 and ended recently with the final ‘flip’ in Lancaster (the most \(\beta_3\)-dominated C-A at the low-price end) in December 2021. For a series of areas we compare the local flip date with the national theta-z, and they have occurred broadly in sequence and on time. This confirms the surprising finding that local and national cycles can be understood together in a single geometrical framework.
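The geometry of the flip can be sketched with a dot product. This reading assumes the national factor-vector is summarised by a direction \(\theta_z\) in cycle units and that the favoured side is given by the sign of its projection onto the local C-A line; the function name and beta values are hypothetical, and the exact convention used for the flip dates above may differ.

```python
import math

def favoured_side(theta_z, beta_c, beta_a):
    """Return 'C' or 'A' for a factor direction theta_z (cycle units).

    beta_c, beta_a: (beta_2, beta_3) pairs for the local low- and high-price bins.
    The projection of the factor direction onto the C-A line changes sign
    exactly when the direction is perpendicular to that line.
    """
    angle = 2 * math.pi * theta_z
    direction = (math.cos(angle), math.sin(angle))
    ca = (beta_c[0] - beta_a[0], beta_c[1] - beta_a[1])
    projection = direction[0] * ca[0] + direction[1] * ca[1]
    return "C" if projection > 0 else "A"

# Hypothetical local betas whose C-A line lies along the factor-2 axis,
# so the perpendicular (and hence the flip) is at theta_z = -0.25.
toy_c, toy_a = (0.5, 0.1), (-0.5, 0.1)
before_flip = favoured_side(-0.30, toy_c, toy_a)
after_flip = favoured_side(-0.20, toy_c, toy_a)
```

Areas whose C-A line points in a different direction have a different perpendicular, which is why the flips occur in sequence as the national phase advances.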
We have seen that the cyclical factors 2 and 3 explain most of the relative return over the average holding period, that factor 1 drives indices’ long-term return but the small spread of \(\beta_1\) reduces its importance, and that the residual factors 4-10 can mostly be ignored. The two cyclical factors are forecast in a multivariate model (Vector Autoregression, ‘VAR’), borrowing a simplified method from the celebrated work of Sims (Sims, C.A., 1980. Macroeconomics and reality. Econometrica: Journal of the Econometric Society, pp.1-48. doi:10.2307/1912017).
The tests for stationarity, stability, and serial independence are passed (Pfaff, B., 2008. Analysis of Integrated and Cointegrated Time Series with R. Springer Science & Business Media), and with this hurdle cleared we are able to bootstrap the two cyclical factors, taking random draws from past ‘shocks’ to simulate future scenarios.
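The bootstrap can be sketched as follows. The lag order and specification of the model are not given here, so this illustration assumes a first-order VAR fitted by least squares, with shocks resampled jointly so that their cross-correlation is preserved; the function names and the simulated input series are hypothetical.

```python
import numpy as np

def fit_var1(z):
    """Least-squares VAR(1): z[t] = c + A @ z[t-1] + shock[t]."""
    lagged = np.column_stack([np.ones(len(z) - 1), z[:-1]])
    coef, *_ = np.linalg.lstsq(lagged, z[1:], rcond=None)
    c, A = coef[0], coef[1:].T
    shocks = z[1:] - lagged @ coef
    return c, A, shocks

def bootstrap_paths(z, horizon, n_paths, seed=0):
    """Simulate future paths by resampling fitted shocks with replacement."""
    c, A, shocks = fit_var1(z)
    rng = np.random.default_rng(seed)
    paths = np.empty((n_paths, horizon, z.shape[1]))
    for p in range(n_paths):
        state = z[-1]
        for h in range(horizon):
            draw = shocks[rng.integers(len(shocks))]   # joint draw of both shocks
            state = c + A @ state + draw
            paths[p, h] = state
    return paths

# Hypothetical input: a damped, rotating two-factor process.
rng = np.random.default_rng(1)
rot = 0.9 * np.array([[np.cos(0.2), -np.sin(0.2)],
                      [np.sin(0.2),  np.cos(0.2)]])
toy_z = np.zeros((300, 2))
for t in range(1, 300):
    toy_z[t] = rot @ toy_z[t - 1] + rng.normal(0.0, 0.01, size=2)
paths = bootstrap_paths(toy_z, horizon=12, n_paths=200)
bands = np.percentile(paths, [5, 50, 95], axis=0)   # fan-chart style bands
```

Percentiles taken across the simulated paths at each horizon give confidence bands of the kind drawn in a fan-chart.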
The fan-chart shows the cyclical component alone, with shades of grey representing confidence bands. It can be seen that the model identifies a cycle of period 29 time-bins and projects it forward with relatively tight confidence intervals. Note the confident forecast of continued underperformance at the prime end, in contrast to the bottom-ranked index n=1.