The Cycle

Tracking winners and losers in residential property

Giles Heywood

2022-04-05

Dispersion

Dispersion produces winners and losers. The dispersion of returns is often wide even locally, and at the national level it is wider still. Currently, in early 2022, this effect is very strong, with a range of 23% over 12 months: investors in the most affordable districts have experienced a 21% return versus -2% in Prime London. The rapid re-appearance of significant return dispersion starting in 2020 confirms that the cycle, which had been somewhat ‘on pause’ since 2018, is now continuing, with the so-called ‘ripple effect’ propagating through ever-lower-priced regions. Return dispersion is the focus of attention in this document. Several analytical insights in conjunction reveal exactly what is happening, and the parameters selected in the first steps are guided by insights only clarified in later steps, but let us start at the beginning.

Repeat Sales Index

Indices

Indices summarise homeowners’ holding period return in the most accurate way possible by minimising the deviation of the fitted returns across the entire dataset in each locality. Because the entire set of transactions is available from the Land Registry, we can make sweeping statements about accuracy with full justification based on the entire relevant set of data, not just a sample. Furthermore, two key parameters of the indices - spatial and temporal sampling - can be customised once the systematic drivers have been revealed via factor analysis. The reference methodology for repeat sales index (RSI) construction is S&P Case-Shiller [1: Shiller, R.J., 1994. Macro markets. OUP Oxford]. The benchmark for accuracy comparisons is the Land Registry UK HPI, which also uses a repeat sales method.

Accuracy

UK HPI is less accurate. Accuracy is quantified across multiple major cities using the standard deviation of the difference between an index-derived return over a homeowner’s holding period and the return they actually achieved. The same metric is used for both our index and the benchmark UK HPI index. Using our RSI methodology we find a very similar but consistently slightly lower error, averaging 5.9% versus 6.3% for UK HPI. UK HPI is not seasonally adjusted whereas our indices are, so the errors are measured in a way that neutralises seasonality and minimises its impact on the accuracy metrics. One credible reason for the higher accuracy may be differences in the definition of city boundaries.
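As a minimal sketch of this accuracy metric, the Python fragment below computes the standard deviation of the gap between index-implied and achieved holding-period returns. It assumes log returns and a cumulative log index; the function names and the toy numbers are hypothetical and are not drawn from the Land Registry data.

```python
from statistics import pstdev

def index_return(index, buy, sell):
    """Log return implied by the index between two period positions."""
    return index[sell] - index[buy]

def tracking_error(index, sales):
    """Standard deviation of (index-implied return - achieved return).

    index: cumulative log index levels, one per period
    sales: (buy_period, sell_period, achieved_log_return) tuples
    """
    diffs = [index_return(index, b, s) - r for b, s, r in sales]
    return pstdev(diffs)

# Hypothetical toy data for illustration only
index = [0.00, 0.02, 0.05, 0.04, 0.09]
sales = [(0, 2, 0.07), (1, 4, 0.05), (0, 4, 0.10), (2, 3, -0.02)]
error = tracking_error(index, sales)
```

Running the same function against two candidate indices on the same set of sales gives a like-for-like comparison of the kind described above.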

Signal and noise

UK HPI is more volatile. Comparing our monthly RSI against UK HPI confirms that the trend is extremely similar, but UK HPI has both a strong seasonal cycle and high-frequency noise, so our monthly return volatility is 0.8% versus UK HPI of 1.8% for the example shown. The noise component is the result of over-fitting the data, and normally increases in-sample accuracy to the detriment of out-of-sample ‘true’ accuracy - hence the term over-fitting. A second reason for our higher accuracy is that both for estimation and appraisal we use all relevant holding periods: those prior to a given index date, those spanning it, and those subsequent. For an index provider or government body there are practical reasons for not including subsequent periods, otherwise index revisions never cease. In this document our primary objective is to understand the drivers of return, therefore accuracy trumps timeliness.

Binning

Adaptive binning calibrates the whole range of behaviour. For index accuracy evaluation purposes two key parameters were chosen to match the UK HPI benchmark: a monthly time increment and city-level indices. Now free to select the parameters more appropriately, we will show that a set of 10 indices with affordability-ranked geographical sampling can accurately span the range of behaviour across all cities. Combined with adaptive time-binning to sample the cyclical risk regularly, the result is increased accuracy of the factor model and forecasts derived from it. A quick visual inspection of cumulative log return for the 10 indices shows that high-priced indices (n=10,9,…) outperformed long-term, that they lead the low-priced indices (n=1,2,…) in a cycle, and that the 2008 bear market affected all simultaneously.

Spatial bins

Conventional regional segmentation of the country can be greatly improved upon to get a smaller number of indices with high accuracy in replicating holding period returns. Factor model insights in the next section will strongly suggest that a region-based approach can be bettered by classifying into just a few affordability bins. For now we show the map of 10 spatial bins defined at the level of over 2,000 postcode districts, the exact motivation being clarified in the next section. It should be clear that the mid-range blue/green affordability bins are extensive with plentiful data, while at either end of the spectrum the geographic locus is deliberately much more tightly defined. The motivation is to achieve even risk-sampling whilst preserving sufficient data-points at the edges of the distribution.

Time bins

Time-bins do not have to be regular. Following conventions borrowed from physical science and signal processing it is commonly assumed a priori that time should have equal bin lengths, for example a month, quarter or year. However this is not required in this application, and given the importance of the cyclical components of return it is more useful to divide time into bins of approximately equal cyclical risk without compromising signal/noise. A suitable increment is approximately 5-6% of spread return per time-bin, which accrues over intervals ranging from 62 to 891 days and averaging 224 days. One notable exception is the special 425-day time-bin 2007-12-31 to 2009-02-28 - the Global Financial Crisis (GFC). In that bin, there was insignificant cyclical risk despite being a relatively long and tumultuous period in all markets, including property.
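One way to picture equal-risk time-binning is sketched below in Python. It closes a bin once the accumulated move in the two cyclical factors reaches a target. This is an illustrative assumption: the exact criterion used for the indices is not spelled out here, and the function name, the 5.5% target and the toy increments are all hypothetical.

```python
from math import hypot

def adaptive_bins(dz2, dz3, target=0.055):
    """Return the end positions of bins, each closed once the cumulative
    cyclical move hypot(sum dz2, sum dz3) reaches the target."""
    ends, c2, c3 = [], 0.0, 0.0
    for i, (a, b) in enumerate(zip(dz2, dz3)):
        c2 += a
        c3 += b
        if hypot(c2, c3) >= target:
            ends.append(i)      # close the bin at this observation
            c2 = c3 = 0.0       # start accumulating the next bin
    return ends

# Hypothetical increments: an active start, a quiet stretch, then activity
dz2 = [0.03, 0.03, 0.0, 0.0, 0.0, 0.01, 0.02, 0.03]
dz3 = [0.0] * 8
ends = adaptive_bins(dz2, dz3)
```

In this toy series the first bin closes after two observations, while the second spans six because the quiet stretch contributes no cyclical movement, which mirrors how a long calendar interval can still form a single bin.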

Linear Factor Model

A factor model for asset returns recognises that each asset’s returns are largely driven by systematic forces [2: Connor, G., Goldberg, L.R. and Korajczyk, R.A., 2010. Portfolio risk analysis. Princeton University Press], and that the diversity of observed behaviour can be largely replicated by linear combinations of a much smaller number of factor time-series. This focus on a small number of independent drivers is sometimes called dimension reduction.

There are three main forms of factor model in use by portfolio managers [3: Heywood, G., Marsland, J. and Morrison, G., 2003. Practical risk management for equity portfolio managers. British Actuarial Journal, 9(5), pp.1061-1123. doi:10.1017/S1357321700004463], and for our purposes the third ‘statistical’ kind is the focus of attention. Here the factors and the individual asset sensitivities – also known as loadings or betas (\(\beta\)) – are jointly derived from the return covariance matrix. This category of model has the maximum achievable explanatory power and the factors are independent of one another. The principal disadvantage is the difficulty of explaining it, but it will be seen that in this application to residential property the factors have a direct interpretation based on firm economic foundations.

Variance and spreads

Factors are ranked in order of descending explanatory power k=1,2,…, such that each in turn explains the maximum possible of what remains unexplained. In our application the first 3 factors are dominant, explaining 97% of the variance of return across our 10 indices. From the perspective of an early-stage homeowner seeking a leg-up on the property ladder it is less important what the index does, and attention should rather be focused on relative returns. These return spreads between pairs of indices, unlike the indices themselves, are no longer dominated by the first factor, instead having 74% explained by cyclical risk factors 2 and 3 and only 6.7% explained by factor 1.
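To illustrate how a leading factor and its share of variance can be extracted from a covariance matrix, the sketch below uses power iteration on a tiny symmetric matrix. Power iteration is swapped in here purely to keep the example dependency-free; a full eigendecomposition would normally be used, and the 2x2 matrix is a hypothetical stand-in for the 10x10 index covariance.

```python
from math import sqrt

def power_iteration(cov, steps=200):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix."""
    n = len(cov)
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(steps):
        w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    # Rayleigh quotient gives the eigenvalue for the converged vector
    lam = sum(v[i] * cov[i][j] * v[j] for i in range(n) for j in range(n))
    return lam, v

# Hypothetical covariance of two positively correlated indices
cov = [[2.0, 1.0], [1.0, 2.0]]
lam, v = power_iteration(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
share = lam / total_variance   # fraction of variance explained by factor 1
```

Here the first factor is an equal-weighted average of the two indices and explains three quarters of the total variance; the remaining quarter belongs to the spread between them.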

Holdings h

Holdings are the weights of indices in the factors, where factors are weighted (linear) combinations of indices. The first factor is a positive-weighted average with higher weights assigned to the extremes of affordability, the second is a spread between the low-price and high-price indices, while the third is a spread between mid-price and extreme indices. The scaling of each of these is such that the positive weights sum to 1. The second has a weight of 76% on the negative side. A +5% return to this factor indicates a 5% return to the spread between affordable indices and unaffordable ones. For convenience later on, we call the high priced bins ‘A’, the mid-priced ‘B’ and the low-priced ‘C’. In this terminology factor 2 is ‘C-A’, which – it should be obvious – is a spread. The same is true of factor 3 onward, so as a general statement, factors after the first are all spreads.
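The scaling convention for holdings can be sketched as follows. The raw weights and index returns are hypothetical; only the rule that positive weights sum to 1, and the factor return as a weighted sum of index returns, come from the text.

```python
def scale_holdings(raw):
    """Rescale weights so that the positive weights sum to 1."""
    positive_total = sum(w for w in raw if w > 0)
    return [w / positive_total for w in raw]

def factor_return(h, x):
    """Factor return as the holdings-weighted sum of index returns."""
    return sum(w * xi for w, xi in zip(h, x))

# Hypothetical spread-like weights, low-priced bins first
raw = [2.0, 1.0, 1.0, -1.0, -2.0]
h = scale_holdings(raw)
negative_side = -sum(w for w in h if w < 0)

# Hypothetical index returns over one time-bin
x = [0.10, 0.08, 0.06, 0.02, 0.00]
z = factor_return(h, x)
```

In this toy case the negative side carries 75% weight, so the holdings are a spread that is not fully self-financing, comparable in spirit to the 76% quoted for the second factor.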

Factors z

\(z=h^\intercal.x\), where \(x\): index, \(z\): factor

The 3 factors are uncorrelated. Factor 1 shows a strong and fairly steady up-trend with mean 6.1% p.a. and volatility 6.4% p.a., punctuated by a fall of -17.4% in the 2008 crisis (GFC) and -2.7% in the 96 days thereafter. It took in total 6.2 years to recover the heights of December 2007. Factor 2 shows a cycle with three turning points dated October 2001, March 2006 and March 2016, so on that metric the trough-trough cycle period was 14.4 years with an upswing period of 4.4 years and a downswing of 10 years. The peak-trough amplitude is 0.62 in logs (86%), and the current upswing has so far returned 0.31 in logs (36%). Factor 3 shows a more irregular cycle which leads factor 2 and has the same period. The amplitude of the last cycle was 0.32 in logs (37%) which is 0.74 times the factor 2 amplitude. Its last trough was in June 2013 and the current upswing has so far returned 0.22. Factors 2 and 3 are analogous to a sine and cosine wave and are not on average rewarded – they represent pure risk, of significant magnitude. By contrast the risk in factor 1 is rewarded quite strongly and reasonably persistently. These series, and in particular \(z_2\), are more regularly sampled by using equal risk increments whereby factors 2 and 3 in conjunction generate a target amount of variance, which we can loosely refer to as ‘spread return’ or ‘spread vol’. This criterion defines our time-bins.
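The percentages quoted alongside log amplitudes follow from the standard conversion exp(r) - 1. A short Python check, using the factor 2 figures from the paragraph above (the helper name is hypothetical):

```python
from math import expm1

def log_to_pct(log_return):
    """Convert a log return to a simple percentage return."""
    return 100.0 * expm1(log_return)

amplitude_pct = log_to_pct(0.62)   # factor 2 peak-trough amplitude
upswing_pct = log_to_pct(0.31)     # factor 2 current upswing so far

# Trough-to-trough period from the quoted upswing and downswing lengths
period_years = 4.4 + 10.0
```

The conversion matters most for large moves: a log amplitude of 0.62 is a simple return of roughly 86%, far from the 62% a naive reading would suggest.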

Betas \(\beta\)

Beta is the sensitivity of each index to each factor in turn. Focusing for now on the cyclical factors \(\beta_2\) and \(\beta_3\), we see that the indices are all some distance from (0,0) so they all have cyclical risk. Nine representative postcode areas along the price spectrum are labelled, and all 104 postcode areas’ indices are plotted with a single point each. They form an arc with high-priced indices at low \(\beta_2\), curving around to lowest-priced indices at high \(\beta_2\), the beta values broadly mimicking the factor holdings \(h\) which were shown earlier. It is now clear that the indices 1:10 are designed to be approximately regularly-spaced in \(\beta_{2,3}\) space, and the extreme indices can precisely target the pricing extremes, from below £1,000\(/m^2\) in e.g. TS1 to over £15,000\(/m^2\) in e.g. W1.

Attribution

Linear Factor Model for attribution: \(x=\alpha+\beta.h^\intercal.x+u\), where \(x\): index, \(\alpha\): abnormal return, \(\beta\): factor loadings, \(h\): factor holdings, \(u\): residual

Indices can be exactly attributed into factor components using the Linear Factor Model equation. It is convenient to consider aggregate contributions from factors 2 and 3 as a single cyclical component, thus emphasising that all indices are in a single cycle, each at a different phase. Currently index n=10 (Prime London) is in a cyclical downswing, whilst index n=1 (e.g. Teesside) is exactly phase-reversed, as seen on their betas. The residual plus alpha shown in grey not only has low volatility – this is guaranteed by the way the residual is derived from low-ranked factors – but also has low mean alpha.
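A minimal sketch of this attribution for a single index and a single period is given below. The \(\beta_1\) of 1.18 and \(\beta_2\) of -0.73 for Prime London are taken from figures quoted elsewhere in this document; the \(\beta_3\) value, the factor returns and the index return are hypothetical, as is the function name.

```python
def attribute(x, betas, z):
    """Split an index return x into market, cycle and remainder.

    betas, z: loadings and factor returns for factors 1..3.
    The remainder corresponds to alpha plus residual.
    """
    contributions = [b * f for b, f in zip(betas, z)]
    market = contributions[0]
    cycle = contributions[1] + contributions[2]   # factors 2 and 3 combined
    remainder = x - market - cycle
    return market, cycle, remainder

betas = (1.18, -0.73, 0.10)   # beta_3 is a hypothetical value
z = (0.06, 0.05, 0.02)        # hypothetical factor returns for one time-bin
x = 0.04                      # hypothetical index return for the same bin
market, cycle, remainder = attribute(x, betas, z)
```

By construction the three components sum back to the index return, which is what makes the attribution exact; in this toy period the negative cycle component offsets about half of the market contribution.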

This finding is useful since it justifies our focus on the cycle as the main driver of index spreads, being the key risk driver for the holding period of a typical young homeowner.

Spreads

Spreads are important to homeowners looking to outperform. Over the last 12 months ending December 2021 the spread of returns relative to Prime London has been large, and it is almost entirely due to the cycle component. This reflects the strong positive returns on factor 2. The same is true over 5 years, with spreads as high as 0.4 in logs (49%). Over the 18-year cycle however the cycle by definition becomes irrelevant and return spreads are dominated by factor 1. Since \(\beta_1\) is 1.18 for Prime London versus 0.91 at the other extreme, it is the top-tier indices that are most sensitive to the long-term upswing and are rewarded accordingly. For anyone with an 18-year investment horizon and no possibility of trading, London was the best choice, although the same effect can be achieved with higher leverage, provided that appropriate lines of credit are available.

Polar coordinates \(r,\theta\)

For further insight into the cycle it is useful to borrow from the physical sciences and convert the linear factor model from rectangular to polar coordinates using basic geometry. Both the cycle loadings (\(\beta_2\) ,\(\beta_3\)) and factors (\(z_2\),\(z_3\)) are converted in the same way to their corresponding values (\(r_{\beta},\theta_{\beta}\)) and (\(r_{z},\theta_{z}\)).
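The conversion can be sketched in a few lines of Python. The choice of which axis marks zero phase and the direction of positive rotation are assumptions made for illustration, as is the function name; phase is expressed in cycle units so that a full revolution equals 1.

```python
from math import atan2, hypot, tau

def to_polar(c2, c3):
    """Convert a (factor 2, factor 3) pair to radius and phase.

    Phase is returned in cycle units, in the range (-0.5, 0.5].
    """
    r = hypot(c2, c3)
    theta = atan2(c3, c2) / tau
    return r, theta

# Two loadings of equal size and opposite sign on factor 2
r_low, theta_low = to_polar(0.73, 0.0)
r_high, theta_high = to_polar(-0.73, 0.0)
```

The two example points share the same radius but sit half a cycle apart, which is the polar-coordinate expression of two indices being in anti-phase.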

Factor phase \(\theta_z\)

Factor phase \(\theta_z\) varies slowly. Having converted to polar coordinates the phase angle \(\theta_{z}\) of the driver \(z\) evolves linearly through time, and this linear trend relation gives several insights. The slope implies a period of 15.5 years since 1994, and so if we followed the trend exactly the current trend value of -0.14 would correspond to August 2006. However we are currently somewhat behind trend, tarrying near zero with the latest phase angle in cycle units being 0.01, corresponding to a date one cycle back of March 2004. The points do tend in this way to cluster around the half-cycle increments, reflecting the fact that factor 3 is generally dominated by factor 2, and that factor 2 is closer to a series of linear trends in a sawtooth ‘flip-flop’ pattern rather than a sinusoid.

Index phase \(\theta_{\beta}\)

Phase and price are one. In the beta scatter it was clear that both the area and affordability-quantile indices are arranged around an arc, with price rising monotonically with the polar angle \(\theta_{\beta}\). The relation \(\theta\sim(£/m^2)\) is in fact quite linear and it should now be clear how the spatial bins were designed: to sample \(\theta_{\beta}\) approximately evenly, whilst maintaining adequate datapoints in all bins. As the factor vector \(z\) rotates through the affordability spectrum, each price-point is favoured in turn, ending with the lowest. The two linear relations in conjunction are the most unambiguous representation possible of a ‘ripple effect’.

Economic rationale

The ripple effect is driven by affordability in a cycle which starts with high demand for the most desirable, highest priced properties following a market correction. The correction serves to reset all prices, the highest priced having experienced a slightly greater price fall due to their higher \(\beta_1\). As prices rise at the premium end and the price-spectrum expands, budget-constrained buyers seek out the more affordable districts, with the progressive catch-up leading to a ripple which propagates downward along the price spectrum. The process propagates progressively across the country as an ever-increasing proportion becomes hard to afford.

Local index and factor

Centred on each postcode area we partition districts from the local region into price quintiles, illustrated here with East London (postcode area E). Estimating the indices on the time periods locally appropriate and thence deriving the local factor model, we find factors 1 and 2 are completely dominant. Factor 1 is again a weighted average of the local indices, but to understand the local cycle we focus on factor 2 which once again is the difference between the prime quintiles 4,5 and the more affordable 1,2. By direct analogy with the national model we label the top-price quintiles A and the most-affordable quintiles C. Factor 2 - the C-A spread - could in this example be dubbed the Plaistow-Wapping spread.

Local cycle

The second factor is the local low-high price (C-A) spread, illustrated here with 3 examples. The local cycle shows a large-amplitude 15-20 year cycle of log amplitude 0.22 for Peterborough to 0.34 for London. London has shown this 0.34 amplitude outperformance of C versus A since June 2014, a trend which may soon reverse. The magnitude of this cycle is for homeowners both very significant and also actionable because it is local. The GFC did not impact these cycles; it put them ‘on pause’. The cycles are not so much sinusoids as a flip-flop between high price outperformance and low price outperformance. The late-cycle buyer of cheap districts in a cheap region is the most unfortunate: they suffered first a nationwide crash, then a 12-year high-amplitude cyclical headwind whilst also having low \(\beta_1\) during the recovery that followed. As we are late-cycle in 2022, there are warnings here.

Radial dimension \(r\)

Local and national cycles can be understood in a unified framework. The ripple effect that starts in central London passes through the commuter zone and eventually beyond. Each local region’s quintile betas \(\beta_{2,3}\)(q=1:5) lie on a slightly concave line, forming a very flat triangle. Each has a ‘local market factor’ which broadly coincides with the centroid of the triangle, plus two more which are the triangle base vector C-A and the perpendicular height. The radial coordinate \(r_{\beta}\) is smaller for desirable districts within a more affordable area, so for example a desirable district around Peterborough towards Stevenage is in sync with (having similar \(\theta_{\beta}\)) but less volatile than (having lower \(r_{\beta}\)) a similar-priced district in East London towards Ilford. Viewed in this context there is a simple link between local and national cycles which can be accurately interpreted with simple geometry.

Local turning point

Local cycles flip according to a simple insight from the \(\beta_{2,3}\) scatter in the previous graphic. The local C-A line provides a prediction: when the national factor-vector passes the perpendicular to C-A, this is the moment that the local market should flip from favouring high-priced A to low-priced C, and the local cycle reverses. Unusually and by total coincidence it so happens that at the time of writing in early 2022 all areas nationally have entered the same state: that which favours C over A (low price over high). The most recent wave of ‘flip to C’ started around Slough (the most \(\beta_3\)-dominated C-A at the high-price end) in November 2014 and ended recently with the final ‘flip’ in Lancaster (the most \(\beta_3\)-dominated C-A at the low-price end) in December 2021. For a series of areas we compare the local flip date with the national theta-z, and they have occurred broadly in sequence and on time. This confirms the surprising finding that local and national cycles can be understood together in a single geometrical framework.
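The perpendicularity rule follows from the linear factor model: the cyclical return of the local C-A spread is the dot product of the beta difference with the factor vector, so it changes sign exactly when the two are at right angles. A minimal sketch, with hypothetical betas and function names:

```python
def spread_return(beta_c, beta_a, z):
    """Cyclical return of the C-A spread: (beta_C - beta_A) . z

    Each argument is a (factor 2, factor 3) pair.
    """
    d2 = beta_c[0] - beta_a[0]
    d3 = beta_c[1] - beta_a[1]
    return d2 * z[0] + d3 * z[1]

def favoured(beta_c, beta_a, z):
    """Which end of the local price range the current factor move favours."""
    return "C" if spread_return(beta_c, beta_a, z) > 0 else "A"

# Hypothetical local betas for the affordable (C) and prime (A) quintiles
beta_c = (0.5, 0.1)
beta_a = (-0.3, 0.3)

state_early = favoured(beta_c, beta_a, (1.0, 0.0))    # factor 2 rising
state_late = favoured(beta_c, beta_a, (-1.0, 0.0))    # factor 2 falling
at_flip = spread_return(beta_c, beta_a, (0.2, 0.8))   # z perpendicular to C-A
```

Because each local area has its own C-A direction, the same rotating national factor vector crosses each area's perpendicular at a different time, which is what produces the sequence of local flip dates.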

Forecasts

We have seen that the cyclical factors 2 and 3 explain most of relative return over the average holding period, that factor 1 drives indices’ long-term return but the small spread of \(\beta_1\) reduces its importance, and the residual factors 4-10 can mostly be ignored. The two cyclical factors are forecast in a multivariate model (Vector Autoregression, ‘VAR’), borrowing a simplified method from the celebrated work of Sims [4: Sims, C.A., 1980. Macroeconomics and reality. Econometrica: journal of the Econometric Society, pp.1-48. doi:10.2307/1912017].

The tests for stationarity, stability, and serial independence are passed [5: Pfaff, B., 2008. Analysis of integrated and cointegrated time series with R. Springer Science & Business Media], and with this hurdle cleared we are able to bootstrap the two cyclical factors, taking random draws from past ‘shocks’ to simulate future scenarios.

The fan-chart shows the cyclical component alone, with the shades of grey representing confidence bands. It can be seen that the model identifies a cycle of period 29 time-bins and projects it forward with relatively tight confidence intervals. Note the confident forecast of continued underperformance at the prime end, in contrast to the bottom-ranked index n=1.

Repricing

Repricing at the horizon requires tracking along the scenario path two further risk drivers for which we don’t yet have a model: (i) time-bin length and (ii) factor 1, the overall market movement. Using the standard financial engineering method, the four risk drivers (these two, plus two sets of cyclical shocks) are simultaneously resampled (drawn with replacement) from the 45 past periods over which the model was trained, generating an unlimited set of random scenarios, for simplicity here assumed each to be equally probable. These are used to reprice the assets at each period-end, interpolated to year-ends, and here presented as spreads relative to Prime London (n=10). The box and whisker plot shows the inter-quartile range as a box and the whiskers contain the rest of the distribution. It shows clearly the aggressive forecasts for the spread of the bottom-price index n=1 versus Prime London n=10, reflecting the large and opposite-phase impact of factor 2.
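The joint resampling step can be sketched as below. Whole past periods are drawn with replacement so that the four drivers remain paired as they occurred together. The history values here are synthetic placeholders, the function name is hypothetical, and the sketch stops at the draw itself: feeding the cyclical shocks through the VAR and repricing the indices are not shown.

```python
import random

def bootstrap_paths(history, horizon, n_paths, seed=0):
    """Draw scenario paths by resampling whole past periods with replacement.

    history: one tuple per past period,
             (bin_length_days, factor_1_return, shock_2, shock_3)
    """
    rng = random.Random(seed)
    return [[rng.choice(history) for _ in range(horizon)]
            for _ in range(n_paths)]

# Synthetic placeholder history of 45 periods (not real data)
history = [
    (100 + 10 * i, 0.01 * (i % 5), 0.002 * (i % 3 - 1), 0.001 * (i % 4 - 2))
    for i in range(45)
]

paths = bootstrap_paths(history, horizon=8, n_paths=500)

# Calendar length of each scenario, from the resampled bin lengths
path_days = [sum(row[0] for row in path) for path in paths]
```

Drawing complete rows rather than each driver separately preserves any dependence between bin length, market return and cyclical shocks within a period, and each path is treated as equally probable.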

The table underlines the magnitude of the factor-2 cycle, and hence the importance of tracking and projecting the cycle phase through time.

| price bin n | 1 | 1-5 | 6-10 | 10 |
|---|---|---|---|---|
| Tracked units (k) | 58 | 4,993 | 4,568 | 57 |
| Present Value (£B) | 6 | 1,114 | 2,220 | 98 |
| Peak-trough delta Present Value (£B) | -3 | -237 | 269 | 36 |
| \(\beta_2\) | 0.73 | 0.34 | -0.16 | -0.73 |
| Peak-trough log cyclical return | 0.57 | 0.21 | -0.12 | -0.36 |
| Average floor area (\(m^2\)) | 89 | 93 | 95 | 99 |
| Average Present Value (£k) | 100 | 223 | 486 | 1,723 |

The peak-trough change in value is approximately symmetrical, so we can refer to this as a transfer of wealth to the value of approximately £250B. This calculation uses the present value of the tracked properties, which at £3.4T are estimated to be slightly over one-third of the total value of the market, so the numbers can be factored up accordingly to reflect the country-wide aggregate.

Horizon at cycle-end

Cycle end is a sensible horizon to end the scenarios, especially as now in late-cycle it is quite close. The last cycle terminated end-2007 shortly after the point where (a) factor 2 had turned negative and the factor vector was decisively in the third quadrant (i.e. negative on both factors), making Prime London once again top-performer; (b) all areas had by August 2007 experienced uncannily similar 10-year log returns, with Mayfair at 11.65% p.a. and Teesside at 11.74% p.a., the entire range of 10-year index returns being just 1.07% p.a.; and (c) momentum in factor 1 was weakening. In each bootstrap scenario a similar point can be identified where the most affordable areas have had a rapid ramp-up and factor 2 toggles to negative, causing Prime London to start its catch-up. Due to its huge economic significance, the tide-turn of factor 2 is taken as probably the single clearest indicator, one which must be associated with some fairly profound shift in national sentiment. The graphic shows a cloud of scenario outcomes, one point for each of 500 random bootstrap scenarios when factor 2 turns negative. Last time this toggle gave 6-9 months’ notice of entering a period of high risk of a correction. In these scenarios this date has a slightly skewed normal distribution, with a 1.4 year standard deviation around a central estimate of October 2027. As we approach the date, the confidence interval will become more precise.

Investment strategy

Tactical timing has been the emphasis so far – the same framework offers a strategic perspective on the asset class. Like other investment returns, the indices show negative skew and fat tails on an annual period, but over 5 years they are close to normal. Prime London has shown high log returns of 7% p.a., exceeding gilts and equities over this period. A 50/50 portfolio of high and low-priced indices perfectly hedges cyclical risk and reduces vol from 8.3% to 6.4% p.a. The returns on Prime London show insignificant annual autocorrelation of 0.12. Correlation with gilts is -0.21 and with equities 0.47. The gilt and equity indices include reinvested income, but the property indices do not include rent, nor do they include costs such as management, maintenance, insurance, improvements, service charge or lease renewal for leasehold – however these expenses might reasonably be covered by the rent. If the ‘imputed rent net of costs’ which the homeowner receives exceeds the risk-free rate – which is normally the case – then the Sharpe Ratio exceeds 1, a value higher than London-listed equities or a broad government bond index. In summary the asset class is highly attractive and diversification can reduce volatility significantly.
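The hedging claim can be checked with the betas quoted in this document: \(\beta_2\) of 0.73 and -0.73 for the bottom and top price bins, and \(\beta_1\) of 0.91 and 1.18 at the two extremes. The sketch below assumes portfolio betas are the weighted average of index betas, which holds for a linear factor model; the function name is hypothetical.

```python
def portfolio_beta(weights, betas):
    """Portfolio loading as the weighted average of constituent loadings."""
    return sum(w * b for w, b in zip(weights, betas))

weights = [0.5, 0.5]   # 50/50 in the lowest-priced and highest-priced index

beta2 = portfolio_beta(weights, [0.73, -0.73])   # cyclical exposure
beta1 = portfolio_beta(weights, [0.91, 1.18])    # market exposure
```

The factor 2 exposures cancel while the market exposure averages to just above 1, so the portfolio keeps the rewarded risk and sheds the unrewarded cyclical risk.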

Conclusions

  1. Index accuracy is high compared with the benchmark Land Registry UK HPI, having higher accuracy on the entire population of transactions in all cities tested.

  2. A parsimonious set of indices can be constructed with just 10 geographical bins and 45 time-bins which still spans the entire spectrum of behaviour and samples it evenly in coordinates of the factor model.

  3. A linear factor model recovers 97% of index variance with just 3 independent factors derived from the covariance matrix, of which only the first is systematically rewarded in the long run with the next two following a cycle of 15-20 years.

  4. The sensitivities (betas) to these factors are determined by affordability (the polar angle theta) and desirability (the radial dimension r), and the factors sweep through descending prices to generate a wave, cycle, or ripple.

  5. Nowhere is exempt from cyclical risk, every index has an exposure, so cyclical risk can only be eliminated in a portfolio. Just two suitably chosen assets are sufficient to entirely eliminate cyclical risk in a portfolio.

  6. This well-documented ‘ripple effect’ [6: Meen, G., 1999. Regional house prices and the ripple effect: a new interpretation. Housing studies, 14(6), pp.733-753. doi:10.1080/02673039982524] is driven by affordability, as each price-band in turn becomes less affordable and demand migrates to progressively lower levels.

  7. The national cycle is composed of a series of local cycles, each local cycle having higher volatility at the low-price end driven by relatively ‘manic’ buying and a lack of supply.

  8. The price extremes – prime central London and deprived areas – are in anti-phase and have the highest cyclical risk exposures.

  9. Last time the cycle ended with a crash at a time when the ripple effect had reached and entirely passed through the very cheapest areas, having already started again at the top of the price-spectrum - a point in the cycle when all areas have experienced unsustainable and almost equal 10-year returns.

  10. A bivariate Vector Autoregression on the two factors fits a stable cycle and passes the test for serial independence of the shocks, so we are able to project it forward in a bootstrap method with rigour.

  11. Bootstrapping in addition the two further stochastic variables in our model (overall market direction factor 1 and the time-bin increment in days) gives repriced indices along a large number of random scenario paths.

  12. From the scenario distribution we derive index levels, spreads, repriced assets, which in turn allows repricing of financial instruments driven by asset prices.

  13. Tentatively we assume that the factor-2 sign change heralds the onset of significant crash risk, a phase which is currently still a quarter-cycle away, implying possibly a few years’ wait.

  14. Because the most important driver of relative performance is the cycle, an accurate model for the cycle can guide significantly better decisions when choosing region, area or district.

FAQ

  1. Are the indices averages? Unlike most indices which are indeed averages, repeat sales indices are based on regression which is a form of statistical inference or ‘inversion’, so they are better thought of as ‘de-averages’ which seek to remove the smoothing process which is caused by long holding-periods and infer the hidden, regular, more frequently time-sampled index. A physical analogy is found in optics, where a blurred image can be filtered or ‘deconvolved’ to make it sharp.

  2. Are the indices ‘smoothed’? It is often asserted that all property indices are artificially smoothed. The assertion that there is intrinsic smoothing needs to be critically evaluated in the light of question 1 and the following observations:

  1. I want to sample time regularly and do regional indices in the conventional way – is this possible? This is completely possible in the same framework, at the geographical level by bottom-up aggregating district-level indices to form bigger spatial bins, or in time by interpolating indices for example to quarter-ends. An alternative is to estimate custom indices directly from the raw data.

  2. What about houses vs flats, new vs used, positive and negative influence from infrastructure projects…? To investigate the effect of attributes not captured in an aggregate index we do similar analysis on the residuals, being that part of each holding period return not explained by the index. This can cover building type, freehold/leasehold, new/used, or any other attribute, and again the phenomenon can be accrued into time-bins and grouped into zones. The new build premium and its fade is especially interesting and will be described shortly.

  3. Where do factors come from? From the covariance matrix of the index returns, that is, the correlation matrix before each row and column is divided by its respective volatility. This provides the factor holdings \(h\) and betas \(\beta\). In the example shown here with 10 indices the covariance matrix is simply 55 distinct numbers, since the 10x10 matrix is real and symmetric.

  4. What are the units of the factors and betas? In terms of physics we would say that just like returns they are dimensionless, but it is certainly true that there are certain freedoms in defining factor portfolios h and betas. In particular they can be rotated, reflected and scaled in the most helpful way for a given application without changing any observable data. Econometricians use the technical term ‘identification problem’ for this type of issue. The scaling chosen here, as described, is that each factor is a spread or portfolio h where the positive weights or holdings sum to 1. Other scalings are more useful for other applications – for example when the linear factor model is to be used for risk management the factors are often scaled to unit annual variance, so the betas are then in units of volatility, which is a familiar unit. However caution is needed here: the annualization is usually based on a premise of uncorrelated returns and this may not be appropriate (see point 2).

  5. Why not forecast factor 1 (the overall market direction)? Forecasting the overall performance of an asset class is notoriously difficult and almost doomed to failure. If proof were needed, no-one expected rampant house price inflation during and after a global pandemic and recession, followed by war in Europe against a superpower. Secondly it is relatively hard to manage because accessing high \(\beta_1\) entails investing in London, but we cannot be certain how much of London’s high \(\beta_1\) is due to one-off ‘gentrification’ effects with attendant heavy investment in improving older properties to high spec. The third reason is the one repeatedly stressed throughout this document: because spreads of \(\beta_1\) are considerably smaller than those of \(\beta_2\) and \(\beta_3\), the impact of different \(\beta_1\) is of secondary importance for anyone already committed to the asset class.

  6. Is this the Kuznets Swing? Slightly less famous than the long Kondratieff Wave of innovation or the Juglar Investment Cycle, the Kuznets Swing [7: Kuznets, S.S., 1930. Secular movement in production and prices: Their nature and their bearing upon cyclical fluctuations. Houghton Mifflin, Boston] is said to have roughly an 18-year period and is related to infrastructure investment. This hypothesis is not proven but for example Barras [8: Barras, R., 2009. Building cycles: growth and instability. John Wiley & Sons] has gathered an impressive array of long-term evidence of construction cycles going back up to 2 centuries, drawn from around the English-speaking world. This reference completes - in extra time - a hat-trick of citations of Nobel Laureate economists.

  7. Is this related to ‘Statistical Arbitrage’ security trading strategies? It is related but differs in an important respect. The similarity is that we identify mean-reverting factors that create predictable oscillations around a common trend. The difference is that we are using the most significant factors rather than the least significant ones which are usually the focus of attention in ‘statarb’. The low-ranked factors tend to be period-specific and generally fail to mean-revert as expected, their risk being significantly higher than in-sample. In our case the returns are very large even without leverage, there is a clear rationale, and the model is performing as expected out-of-sample. In fact the main concern is that they might fail to show the same amplitude of excursions seen in the past.

About

Giles Heywood graduated from Cambridge University in Physics and Theoretical Physics, received his MBA from Bayes Business School and MSc in Econometrics from London Guildhall University. Senior positions on the buy side and sell side include Head of Quant Research at Gartmore Investment Management and Head of Single Strategy Hedge Funds at ABN AMRO Asset Management. He is Senior Consultant at SilverStreet Capital and Chief Data Scientist at Seven Dials Fund Management. This is an independent project developed in the R language, now seeking partners.