hpfilter is a package that implements the one-sided and two-sided versions of the Hodrick-Prescott filter in R.
The two-sided implementation uses sparse matrices to enhance the efficiency of calculations when using large datasets. Its traditional use has been in macroeconomics, to smooth out the time series of variables such as the GDP. The one-sided version is based on the Kalman filter. An important use of the one-sided HP filter is for the calculation of the credit gap, according to the Basel III methodology for setting the Countercyclical Capital Buffer (CCyB) guide.
The Hodrick-Prescott filter is a technique commonly used to smooth macroeconomic data such as the GDP. It consists in separating short-term, cyclical movements in the data from the long-term trend. Although the HP filter has received numerous critiques, recent studies show that alternative methods do not necessarily perform better. Furthermore, the Basel committee, an international body responsible for setting international guidelines for prudential regulation, has recommended the usage of the one-sided HP filter for calculations pertaining to the analysis of financial cycles.
Filtering techniques, such as the HP filter are based on the concept of time series decomposition. This means that an original series \(y\) is separated into its trend and cyclical components, with the remainder of the variability being classified as noise (or the error term \(ε_t\)).
\(y_t = ytrend_t + ycycle_t + \varepsilon_t\)
where:
\(t = [0, T]\)
\(T\) - the total number of observations
\(ytrend_t\) - the trend component
\(ycycle_t\) - the cyclical component
\(\varepsilon_t\) - the error term.
The HP filter finds a trend series that solves the minimization problem shown in the equation below:
\(\min\limits_{ytrend} ( \sum\limits_{t=1}^T(y_t-ytrend_t)^2+\lambda*\sum\limits_{t=2}^{T-1}[(ytrend_{t+1}-ytrend_{t})-(ytrend_{t}-ytrend_{t-1})]^2)\)
From the first-order conditions, we derive the following expression to be solved:
\(A*ytrend=y\)
where:
\[ A = \begin{bmatrix}1+\lambda &-2*\lambda &\lambda &0&\cdots&\cdots&\cdots&\cdots&0\\-2*\lambda&1+5*\lambda&-4*\lambda&\lambda&0&\cdots&\cdots&\cdots&0\\\lambda&-4*\lambda&1+6*\lambda&-4*\lambda&\lambda&0&\cdots&\cdots&0\\0 &\lambda &-4*\lambda & 1+6*\lambda & -4*\lambda &\lambda &0 &\cdots&0 \\\vdots&&&&&&&&\vdots\\0&\cdots&\cdots&0 &\lambda &-4*\lambda &1+6*\lambda &-4*\lambda &\lambda\\0&\cdots&\cdots&\cdots&0 & \lambda & -4*\lambda & 1+5*\lambda & -2*\lambda\\0&\cdots&\cdots&\cdots&\cdots&0 & \lambda & -2*\lambda & 1+\lambda\end{bmatrix} \]
\(y\) - the vector containing the data to be processed
\(\lambda\) - the smoothing parameter.
For more details on the method, its advantages, disadvantages and use cases, see the following papers:
This function applies the one-sided Hodrick-Prescott filter to the selected data, for the given smoothing parameter value.
The one-sided HP filter is obtained by using the Kalman filter to find solutions of the local linear trend model. The approximation produced by the filter requires that a steady-state solution of the Kalman filter can be found. Since, for positive smoothing parameters, this is the case - the method can be used.
In R, simply call the function:
hp1(y, lambda = 1600, x_user, P_user, discard)The function takes as input:
| y | a dataframe of size Txn, where “T” is the number of observations for each variable (number of rows) and “n” - the number of variables in the dataframe (number of columns). | 
| lambda | the smoothing parameter; a numeric scalar which takes the default value of 1600, if unspecified by the user. | 
| x_user | user defined initial values of the state estimate for each variable in y. Takes the form of a 2xn matrix. Since the underlying state vector is 2x1, two values are needed for each variable in y. Default: if no values are provided, by default, backwards extrapolations based on the first two observations are used. | 
| P_user | a structural array with n elements, each of which being a 2x2 matrix of initial MSE estimates for each variable in y. Default: If no values are provided, the default matrix with large variances is used. | 
| discard | the number of discard periods, expressed as a numeric scalar. The user specified amount of values will be discarded from the start of the sample, resulting in output matrices of size (T-discard)xn. Default: If no values are provided, the value of 0 is used. | 
The function outputs the following series:
| ytrend | the trend series with cyclical components removed as a (T-discard)xn dataframe, where each column houses the filtered series corresponding to the original data. | 
This function applies the two-sided Hodrick-Prescott filter to the selected data, for the given smoothing parameter value.
The two-sided HP filter uses a sparse-matrix implementation to enable faster calculation with large datasets.
hp2(y, lambda = 1600)The function takes as input:
| y | a dataframe of size Txn, where “T” is the number of observations for each variable (number of rows) and “n” - the number of variables in the dataframe (number of columns). | 
| lambda | the smoothing parameter; a numeric scalar which takes the default value of 1600, if unspecified by the user. | 
The function outputs the following series:
| ytrend | the trend series with cyclical components removed as a Txn dataframe, where each column houses the filtered series corresponding to the original data. | 
In this example, we will apply the HP filter to GDP data for the EU
from Eurostat. More details about the series can be found in its help
file, which you can access by typing: ?GDPEU. Below, we can
preview the first observations of the dataset:
#> Warning in data(GDPEU): data set 'GDPEU' not found
#> Error in head(GDPEU, 10): object 'GDPEU' not foundThe next code snippet shows the workflow associated with estimating the HP Filter - it involves loading the package, calculating the trend and deriving the cycle.
# R CODE - Smoothing GDP data with the two-sided HP filter
# Load library
library(hpfilter)
# Keep only the y series and store in df object
y = data.frame(gdp=c(2402903.9, 2416576.3, 2428126.1, 2437407.9, 2443883.8, 2459908.3, 2474557.8, 2487734.5, 2502900.4, 2531872, 2550913.6, 2579582.2, 2598307.2, 2611524.5, 2628335, 2636377, 2662108.5, 2673788.6, 2706474.2, 2739615.4, 2771013.4, 2796237.5, 2810427.4, 2827455.4, 2854217.7, 2860625.6, 2868091.8, 2873472.3, 2879547.6, 2895157.8, 2908087.9, 2918434.9, 2917344.7, 2925034.7, 2945193.3, 2968681, 2985760.1, 3003359.6, 3011430.9, 3024493.3, 3034608.1, 3057625.7, 3082932, 3107995.9, 3134685.3, 3167958.6, 3184861.1, 3215225.8, 3239777.4, 3262175.7, 3280780.3, 3300040, 3318443, 3308233.2, 3285991.3, 3224235.3, 3137585.4, 3134888.1, 3145754.4, 3158823.5, 3173245.5, 3205158.6, 3222254.3, 3239486.8, 3266120.1, 3268551.9, 3275857.1, 3266151.1, 3264311.3, 3257251.5, 3260049.6, 3247444.9, 3243545.5, 3261329.2, 3275659.2, 3288417.7, 3304747.7, 3315578, 3334340.8, 3348942.3, 3372453.9, 3390320.1, 3407279, 3424800.8, 3440244.9, 3451983.1, 3467597.7, 3494562.1, 3519862.5, 3546513.9, 3572479.2, 3598925.7, 3603964.3, 3624883.5, 3630997.8, 3652886.3, 3676573.4, 3689104.4, 3699971.7, 3702073.3))
# Run the two-sided HP filter with the default value of 1600
# for lambda
ytrend = hp2(y)
# Calculate cyclical component
ycycle = y - ytrendNow, we can create graphs to show the results. We first plot the HP filtered trend and the actual series.
# Plot
plot(y$gdp, type="l", col="black", lty=1)
lines(ytrend$gdp, col="#066462")
legend("bottom", horiz=TRUE, cex=0.75, c("y", "ytrend"), lty = 1, col = c("black", "#066462"))Then, we plot the cyclical component.
plot(ycycle$gdp, type = "l")
abline(h=0)In this example, we generate random time series data drawn from a normal distribution and apply the one-sided HP filter. After calculating the cyclical component from the trend series generated by the hp1 function, we plot the output alongside the original data.
# Generate the data and plot it
set.seed(10)
y <- as.data.frame(rev(diffinv(rnorm(100)))[1:100])+30
colnames(y) <- "gdp"
plot(y$gdp, type="l")
# Apply the HP filter to the data
ytrend = hp1(y)
ycycle = y - ytrend
# Plot the three resulting series
plot(y$gdp, type="l", col="black", lty=1, ylim=c(-10,30))
lines(ytrend$gdp, col="#066462")
polygon(c(1, seq(ycycle$gdp), length(ycycle$gdp)), c(0, ycycle$gdp, 0), col = "#E0F2F1")
legend("bottom", horiz=TRUE, cex=0.75, c("y", "ytrend", "ycycle"), lty = 1,
       col = c("black", "#066462", "#75bfbd"))