hicp-package
The Harmonised Index of Consumer Prices (HICP) is the key
economic figure to measure inflation in the euro area. The methodology
underlying the HICP is documented in the HICP Methodological Manual
(European Commission 2024). Based on this
manual, the hicp-package provides functions for data users
to work with publicly available HICP price indices and weights
(upper-level aggregation).
This vignette highlights the main package features. It contains four sections on global package options, data access, the classification of individual consumption by purpose (COICOP) underlying the HICP, as well as index aggregation, change rates and contributions of lower-level indices to the overall inflation rate. It also shows how the package functions can be similarly applied to quarterly index series like the owner-occupied housing price index (OOHPI).
The package works with several global options controlling the
function behavior. Most importantly,
options("hicp.coicop.version") defines the COICOP version
to be used. Several versions are supported. The HICP uses the European
COICOP version 2, which is the package’s default. Since none of the
COICOP versions contains a code for the all-items index,
options("hicp.all.items.code") allows users to define this code.
If the COICOP codes include a certain prefix, this prefix can be set by
options("hicp.coicop.prefix"). At package start-up, the
following options are set as default.
# load package:
library(hicp)
# set global options:
options(hicp.coicop.version="ecoicop2.hicp") # COICOP version to be used
options(hicp.coicop.prefix="CP") # prefix of COICOP codes
options(hicp.all.items.code="TOTAL") # internal code for the all-items index
options(hicp.chatty=TRUE) # print package-specific messages and warnings
The hicp-package offers easy access to HICP data from
Eurostat’s public database. For
that purpose, it uses the download functionality provided by Eurostat’s
restatapi-package.
This section shows how to list, filter and retrieve HICP data using the
functions datasets(), datafilters(), and
data().
Eurostat’s database contains various data sets of different
statistics. All data sets are classified by topic and can be accessed
via a navigation tree. HICP data can be found under “Economy and finance
/ Prices”. An even simpler solution that does not require visiting
Eurostat’s database is provided by the function datasets(),
which lists all available HICP data sets with corresponding metadata
(e.g., number of observations, last update).
The function output shows the first five HICP data sets. As can be
seen, a short description of each data set and some metadata are
provided. The variable code is the data set identifier,
which is needed to filter and download data.
# list available HICP data sets:
dtd <- datasets()
dtd[1:5, list(title, code, lastUpdate, values)]
#> title
#> <char>
#> 1: Harmonised index of consumer prices (HICP) - ECOICOP ver.2 - indices and rates of change at constant tax rates, monthly data
#> 2: Harmonised index of consumer prices (HICP) - ECOICOP ver.2 - contributions to euro area annual inflation
#> 3: Harmonised index of consumer prices (HICP) - ECOICOP ver.2 - administered prices composition
#> 4: Harmonised index of consumer prices (HICP) - ECOICOP ver.2 - country weights
#> 5: Harmonised index of consumer prices (HICP) - ECOICOP ver.2 - item weights
#> code lastUpdate values
#> <char> <char> <num>
#> 1: prc_hicp_ct 2026.01.28 15363359
#> 2: prc_hicp_ctr 2026.01.30 83220
#> 3: prc_hicp_admp 2026.01.27 583425
#> 4: prc_hicp_cw 2026.01.27 3280
#> 5: prc_hicp_iw 2026.01.27 462624
The HICP is compiled each month in each member state of the European
Union (EU) for various items. Its compilation started in 1996.
Therefore, the data set of price indices is relatively large. Sometimes,
however, data users only need the price indices of certain years or
specific countries. Eurostat’s API and, thus, the
restatapi-package allow filters to be applied to each data
request, e.g., to download only the price indices of the euro area for
the all-items HICP. The filtering options can differ for each data set.
The function datafilters() returns the allowed filtering
options for a given data set.
The function output shows that the data set prc_hicp_iw
for the HICP item weights can be filtered with respect to the frequency
(freq), the COICOP code (coicop18), the
statistical unit (statinfo) and the geographical area
(geo). The table dtf contains for each filter
the allowed values, e.g., CP011 for coicop18
and A for freq. These filters can be
integrated in the data download as explained in the following
subsection.
# get filtering options for the item weights:
dtf <- datafilters("prc_hicp_iw")
# allowed filters:
unique(dtf$concept)
#> [1] "freq" "coicop18" "statinfo" "geo"
# allowed filter values:
dtf[1:5,]
#> concept code name
#> <char> <char> <char>
#> 1: freq A Annual
#> 2: coicop18 TOTAL Total
#> 3: coicop18 CP01 Food and non-alcoholic beverages
#> 4: coicop18 CP011 Food
#> 5: coicop18 CP0111 Cereals and cereal products (ND)
Applying a filter to a data request can noticeably reduce the
downloading time, particularly for bigger data sets. The function
data() can be used to download a specific data set without
any filters or with filters on the time dimension and other filtering options:
# download item weights with filters:
item.weights <- hicp::data(id="prc_hicp_iw",
filters=list("geo"=c("EA","DE","FR")),
date.range=c("2019","2025"),
flags=TRUE)
The downloaded object item.weights contains 11572 HICP
item weights for the euro area, Germany, and France from 2019 to
2025.
item.weights[1:5, ]
#> Key: <coicop18, statinfo, geo, time>
#> coicop18 statinfo geo time values flags conf_status
#> <char> <char> <char> <char> <num> <char> <char>
#> 1: AP IW DE 2019 123.29 <NA>
#> 2: AP IW DE 2020 115.02 <NA>
#> 3: AP IW DE 2021 146.58 <NA>
#> 4: AP IW DE 2022 116.73 <NA>
#> 5: AP IW DE 2023 166.36 <NA>
HICP item weights and price indices are classified according to the European COICOP version 2 (ECOICOP2-HICP). The lowest level, the subclasses (5-digit codes), offers the finest differentiation of items by consumption purpose, e.g., cereals (01111) or bread and bakery products (01113). Both subclasses belong to the same class, cereals and cereal products (0111), and, at higher levels, to the same group food (011) and division food and non-alcoholic beverages (01). Hence, COICOP and thus the aggregation of the HICP follow a pre-defined hierarchical tree. This section shows how to work with the COICOP codes and the HICP special aggregates whose definition is based on COICOP codes.
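The hierarchy described above can be illustrated with a few lines of base R. This is only a minimal sketch: the helper naive_parent() is hypothetical and not part of the hicp-package, and it assumes that the direct parent of a code is obtained by dropping its last digit, without checking the code against any COICOP dictionary.

```r
# Hypothetical helper (not part of the hicp-package): direct parent of a
# prefixed COICOP code, assumed to be found by dropping the last digit.
naive_parent <- function(code, prefix = "CP", all.items = "TOTAL") {
  digits <- sub(paste0("^", prefix), "", code)  # strip the prefix
  # divisions (2 digits) sit directly below the all-items code
  ifelse(nchar(digits) <= 2,
         all.items,
         paste0(prefix, substr(digits, 1, nchar(digits) - 1)))
}

# subclass -> class -> group -> division -> all-items
codes <- c("CP01", "CP011", "CP0111", "CP01111", "CP01113")
data.frame(code = codes, parent = naive_parent(codes))
```

Both subclasses CP01111 and CP01113 map to the same class CP0111, which mirrors the example in the text. Unlike this sketch, the package functions validate codes against the selected COICOP version.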
In general, COICOP codes consist of numbers. Using the function
is.coicop(), it can be easily checked if a code is a valid
COICOP code or not. This validation is based on the selected COICOP
version in options("hicp.coicop.version"). It further
considers any prefix of the COICOP codes defined in
options("hicp.coicop.prefix"). For the COICOP codes from
Eurostat’s database, the prefix CP is expected. The code
TOTAL is used in this package for the all-items HICP
although it is not considered a valid COICOP code.
# all-items code and codes without prefix "CP" are no valid ECOICOP codes:
is.coicop(id=c("TOTAL","CP01","CP011","CP012","012"))
#> [1] FALSE TRUE TRUE TRUE FALSE
# games of chance are not valid in ECOICOP-HICP ver. 1:
is.coicop("CP0943", settings=list(coicop.version="ecoicop1.hicp"))
#> [1] FALSE
# but in ECOICOP-HICP ver. 2:
is.coicop("CP0943", settings=list(coicop.version="ecoicop2.hicp"))
#> [1] TRUE
For the aggregation of HICP data from bottom to top, the children and
parents of each COICOP code must be properly derived. Children are those
codes that belong to the same higher-level code (or parent). Such
relations can be direct (e.g., 01->011) or indirect
(e.g., 01->0111). The functions child() and
parent() derive all relatives of a COICOP
code.
# get parents:
parent(id=c("CP01","CP011","CP01111","CP01112"), usedict=TRUE)
#> [[1]]
#> [1] "TOTAL"
#>
#> [[2]]
#> [1] "CP01"
#>
#> [[3]]
#> [1] "CP0111"
#>
#> [[4]]
#> [1] "CP0111"
# get children:
child(id=c("CP01","CP011","CP01111","CP01112"), usedict=TRUE)
#> [[1]]
#> [1] "CP011" "CP012" "CP013"
#>
#> [[2]]
#> [1] "CP0111" "CP0112" "CP0113" "CP0114" "CP0115" "CP0116" "CP0117" "CP0118"
#> [9] "CP0119"
#>
#> [[3]]
#> NULL
#>
#> [[4]]
#> NULL
If the 4-digit or 5-digit level is not available for some divisions
in the HICP data, it is not possible to derive the all-items HICP only
from the 5-digit level. In this case, the item weights would not add up
to 1000. Instead, the missing 4-digit and 5-digit codes must be replaced
with their higher-level parents. The function tree() derives
this composition of COICOP codes at the lowest possible level.
This can be particularly useful if one wants to aggregate the price
indices at the lowest level in a single step into the all-items index
(see also next section).
# example codes:
ids <- c("CP01","CP011","CP012","CP0111","CP0112")
# derive COICOP tree from top to bottom:
tree(ids)
#> [[1]]
#> [1] "CP01"
#>
#> [[2]]
#> [1] "CP011" "CP012"
#>
#> [[3]]
#> [1] "CP0111" "CP0112" "CP012"
# still same tree because weights add up:
tree(id=ids, w=c(0.2,0.08,0.12,0.05,0.03))
#> [[1]]
#> [1] "CP01"
#>
#> [[2]]
#> [1] "CP011" "CP012"
#>
#> [[3]]
#> [1] "CP0111" "CP0112" "CP012"
# now (CP011,CP012) because weights do not correctly add up at lower levels:
tree(id=ids, w=c(0.2,0.08,0.12,0.05,0.01))
#> [[1]]
#> [1] "CP01"
#>
#> [[2]]
#> [1] "CP011" "CP012"
#>
#> [[3]]
#> [1] "CP011" "CP012"
For the HICP, various special aggregates like food and energy are
calculated. Each special aggregate is composed of a selection of COICOP
codes. This composition is fixed over time but depends on the COICOP
version. The function spec.agg() provides the definitions
of all HICP special aggregates, while the function
is.spec.agg() validates the codes of special
aggregates.
# validate codes:
is.spec.agg(id=c("TOTAL","CP01","FOOD","NRG"))
#> [1] FALSE FALSE TRUE TRUE
# get compositions of non-processed food and energy:
spec.agg(id=c("FOOD_NP","NRG"))
#> $FOOD_NP
#> [1] "CP01121" "CP01122" "CP01124" "CP01131" "CP01134" "CP01137" "CP01148"
#> [8] "CP01161" "CP01162" "CP01163" "CP01164" "CP01165" "CP01168" "CP01171"
#> [15] "CP01172" "CP01173" "CP01174" "CP01175" "CP01178" "CP01194" "CP01250"
#>
#> $NRG
#> [1] "CP04510" "CP04521" "CP04522" "CP04530" "CP04541" "CP04542" "CP04543"
#> [8] "CP04549" "CP04550" "CP07221" "CP07222" "CP07223"
The HICP is a chain-linked Laspeyres-type index (European Union 2016). The (unchained) price indices in each calendar year refer to December of the previous year, which is the price reference period. These price indices are chain-linked to the existing index using December to obtain the HICP. The HICP indices currently refer to the index reference period 2025=100. Monthly and annual change rates can be derived from the price indices. The contributions of the price changes of individual items to the annual rate of change can be computed by the “Ribe method”. More details can be found in European Commission (2024, chap. 8).
The all-items index is a weighted average of the items’ subindices. However, because the HICP is a chain index, the subindices cannot simply be aggregated. They first need to be unchained, i.e., expressed relative to December of the previous year. These unchained indices can then be aggregated as a weighted average. Since the Laspeyres-type index is consistent in aggregation, the aggregation can be done gradually from the bottom level to the top or directly in one step.
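The three steps described above (unchaining, aggregation, chaining) can be traced by hand on a toy example in base R. The sketch below does not use the package functions; the index values, the weights and the helper lasp() are made up for illustration, and the series is assumed to start in a December that serves as the first link period.

```r
# toy chained subindices (December 2023 = 100), made-up values
time <- as.Date(c("2023-12-01", "2024-06-01", "2024-12-01", "2025-06-01"))
year <- as.integer(format(time, "%Y"))
is_dec <- format(time, "%m") == "12"
idx <- cbind(a = c(100, 102, 104, 105),
             b = c(100, 110, 120, 126),
             c = c(100,  98,  96,  99))
# annual weights, one row per period (none needed for the first December)
wgt <- rbind(c(NA, NA, NA), c(0.5, 0.3, 0.2), c(0.5, 0.3, 0.2), c(0.4, 0.4, 0.2))
colnames(wgt) <- colnames(idx)

# row position of December of the previous year for each period
link <- which(is_dec)[match(year - 1L, year[is_dec])]

# 1) unchain: express each index relative to December of the previous year
ratio <- idx / idx[link, ]

# 2) aggregate: weighted arithmetic mean of the unchained indices
lasp <- function(r, w) rowSums(w * r) / rowSums(w)
direct <- lasp(ratio, wgt)

# same aggregate in two steps: first (a, b), then the result with c
ab <- lasp(ratio[, c("a", "b")], wgt[, c("a", "b")])
stepwise <- lasp(cbind(ab, ratio[, "c"]),
                 cbind(rowSums(wgt[, c("a", "b")]), wgt[, "c"]))
all.equal(direct, stepwise)  # consistency in aggregation

# 3) chain: multiply by the chained value in December of the previous year
chained <- rep(NA_real_, length(direct))
chained[1] <- 100
for (i in 2:length(direct)) chained[i] <- direct[i] * chained[link[i]]
chained
```

The direct and the stepwise aggregate coincide because the weight of the intermediate aggregate is the sum of the weights of its components.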
In the following example, the euro area HICP is computed directly in
one step and also gradually through all higher-level indices. First, the
monthly price indices are downloaded from Eurostat’s database for the
index reference period 2025=100 (unit) and the period from
December 2019 to December 2025.
# download monthly price indices:
dtp <- hicp::data(id="prc_hicp_minr",
filters=list(unit="I25", geo="EA"),
date.range=c("2019-12", "2025-12"))
# convert into proper dates:
dtp[, "time":=as.Date(paste0(time, "-01"))]
dtp[, "year":=as.integer(format(time, "%Y"))]
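setnames(x=dtp, old="values", new="index")
Second, the price indices are unchained separately for each ECOICOP
code using the function unchain().
# unchain price indices using December of the previous year:
dtp[, "dec_ratio" := unchain(x=index, t=time, by=12), by="coicop18"]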
Next, the price indices dtp and item weights
dtw are merged into one data set.
# manipulate item weights:
dtw <- item.weights[geo=="EA", list(coicop18,geo,time,values)]
dtw[, "time":=as.integer(time)]
setnames(x=dtw, old=c("time","values"), new=c("year","weight"))
# merge price indices and item weights:
dtall <- merge(x=dtp, y=dtw, by=c("geo","coicop18","year"), all.x=TRUE)
For aggregating the unchained price indices in one step into the
all-items index, the lowest level of the COICOP tree must be derived.
Based on the derived COICOP tree, the unchained price indices are
aggregated using the function laspeyres(), chained into a
long-term index series using the function chain(), and
finally re-referenced to the index reference period 2025 using the
function rebase(). The resulting index is plotted
below.
# derive COICOP tree for index aggregation:
dtall[weight>0 & !is.na(dec_ratio),
"tree" := tree(id=coicop18, w=weight, flag=TRUE, settings=list(w.tol=0.1)),
by="time"]
# compute all-items HICP in one aggregation step:
hicp.own <- dtall[tree==TRUE,
list("laspey"=laspeyres(x=dec_ratio, w0=weight)),
by="time"]
setorderv(x=hicp.own, cols="time")
# chain the resulting index:
hicp.own[, "chain_laspey" := chain(x=laspey, t=time, by=12)]
# rebase the index to 2025:
hicp.own[, "chain_laspey_25" := rebase(x=chain_laspey, t=time, t.ref="2025")]
# plot all-items index:
plot(chain_laspey_25~time, data=hicp.own, type="l", xlab="Time", ylab="Index")
title("Euro area HICP")
abline(h=0, lty="dashed")
Similarly, the (unchained) price indices are aggregated gradually following the COICOP tree, which produces in addition to the all-items index all lower-level indices.
# compute all-items HICP gradually from bottom to top:
hicp.own.all <- dtall[weight>0 & !is.na(dec_ratio),
aggregate.tree(x=dec_ratio, w0=weight, id=coicop18, formula=laspeyres),
by="time"]
setorderv(x=hicp.own.all, cols="time")
hicp.own.all[, "chain_laspey" := chain(x=laspeyres, t=time, by=12), by="id"]
hicp.own.all[, "chain_laspey_25" := rebase(x=chain_laspey, t=time, t.ref="2025"), by="id"]
A comparison to the all-items index that has been computed in one step shows no differences, which highlights the consistency in aggregation of the Laspeyres-type index.
# all-items HICP from direct and gradual aggregation identical:
all(abs(hicp.own.all[id=="TOTAL", chain_laspey_25]-hicp.own$chain_laspey_25)<0.1)
#> [1] TRUE
User-defined aggregates can be easily calculated with the functions
aggregate() and disaggregate(). This is
particularly useful for the calculation of the HICP special aggregates
like food, energy or the overall index excluding the two as shown
below.
# compute food and energy by aggregation:
dtall[time>="2019-12-01",
aggregate(x=dec_ratio, w0=weight, id=coicop18,
agg=spec.agg(id=c("FOOD","NRG")),
settings=list(exact=FALSE, names=c("FOOD","NRG"))),
by="time"]
#> time id w0 laspeyres
#> <Date> <char> <num> <num>
#> 1: 2019-12-01 FOOD NA NA
#> 2: 2019-12-01 NRG NA NA
#> 3: 2020-01-01 FOOD 190.71 1.0068145
#> 4: 2020-01-01 NRG 98.48 1.0080618
#> 5: 2020-02-01 FOOD 190.71 1.0103345
#> ---
#> 142: 2025-10-01 NRG 93.97 0.9797652
#> 143: 2025-11-01 FOOD 193.16 1.0251769
#> 144: 2025-11-01 NRG 93.97 0.9890975
#> 145: 2025-12-01 FOOD 193.16 1.0250906
#> 146: 2025-12-01 NRG 93.97 0.9806562
# compute overall index excluding food and energy by disaggregation:
dtall[time>="2019-12-01",
disaggregate(x=dec_ratio, w0=weight, id=coicop18,
agg=list("TOTAL"=c("FOOD","NRG")),
settings=list(names="TOT_X_FOOD_NRG")),
by="time"]
#> time id w0 laspeyres
#> <Date> <char> <num> <num>
#> 1: 2019-12-01 TOT_X_FOOD_NRG NA NA
#> 2: 2020-01-01 TOT_X_FOOD_NRG 710.78 0.9829889
#> 3: 2020-02-01 TOT_X_FOOD_NRG 710.78 0.9866878
#> 4: 2020-03-01 TOT_X_FOOD_NRG 710.78 0.9979379
#> 5: 2020-04-01 TOT_X_FOOD_NRG 710.78 1.0053489
#> ---
#> 69: 2025-08-01 TOT_X_FOOD_NRG 712.80 1.0207492
#> 70: 2025-09-01 TOT_X_FOOD_NRG 712.80 1.0222860
#> 71: 2025-10-01 TOT_X_FOOD_NRG 712.80 1.0251807
#> 72: 2025-11-01 TOT_X_FOOD_NRG 712.80 1.0196345
#> 73: 2025-12-01 TOT_X_FOOD_NRG 712.80 1.0233334
The resulting aggregates can finally be chained and rebased as shown before.
User-defined functions can be passed to aggregate() as
well, which allows aggregation using various weighted or unweighted
bilateral index formulas. By contrast, the function
disaggregate() requires the underlying data to be
aggregated as a Laspeyres-type index.
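The reason why disaggregation relies on the Laspeyres-type formula can be seen from the arithmetic: because the total is a weighted arithmetic mean, a component can be removed by subtracting its weighted index from the weighted total. The following base R sketch uses made-up unchained indices and weights and is not the package's implementation.

```r
# made-up unchained indices (relative to December of the previous year) and weights
r <- c(TOTAL = 1.020, FOOD = 1.030, NRG = 0.980)
w <- c(TOTAL = 1000,  FOOD = 190,   NRG = 100)

# remove food and energy from the total
w_rest <- w[["TOTAL"]] - w[["FOOD"]] - w[["NRG"]]
r_rest <- (w[["TOTAL"]] * r[["TOTAL"]] -
           w[["FOOD"]] * r[["FOOD"]] -
           w[["NRG"]] * r[["NRG"]]) / w_rest

# check: re-aggregating the three parts gives back the total
back <- (w[["FOOD"]] * r[["FOOD"]] + w[["NRG"]] * r[["NRG"]] +
         w_rest * r_rest) / w[["TOTAL"]]
all.equal(back, r[["TOTAL"]])
```

This subtraction is only valid for a weighted arithmetic mean, which is why it would not carry over to other index formulas.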
Monthly change rates are computed by dividing the HICP index in the
current period by the index one month before. Annual change rates are
derived by comparing the index in the current month to the index in the
same month one year before. Both rates can be easily derived using the
function rates(). Contributions of the price changes of
individual items to the overall annual rate of change can be computed by
the Ribe method as implemented in the function
contrib().
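The two rate definitions can be written down in a few lines of base R. The sketch below uses a made-up monthly index and a hypothetical helper change_rate(); it assumes a regular series without gaps and expresses the rates in percent.

```r
# toy monthly index: 25 months growing by 0.5% per month
index <- 100 * 1.005^(0:24)

# rate of change over `lag` periods in percent
# (NA where no comparison period exists)
change_rate <- function(x, lag) {
  n <- length(x)
  c(rep(NA_real_, lag), 100 * (x[(lag + 1):n] / x[1:(n - lag)] - 1))
}

monthly <- change_rate(index, lag = 1)   # current month vs. previous month
annual  <- change_rate(index, lag = 12)  # current month vs. same month one year before
tail(cbind(monthly, annual), 3)
```

The first twelve annual rates are missing because no observation one year earlier exists for them.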
# compute annual rates of change for each index series:
dtall[, "ar" := rates(x=index, t=time, type="year"), by=c("geo","coicop18")]
# add all-items HICP:
dtall <- merge(x=dtall,
y=dtall[coicop18=="TOTAL", list(geo,time,index,weight)],
by=c("geo","time"), all.x=TRUE, suffixes=c("","_all"))
# Ribe decomposition:
dtall[, "ribe" := contrib(x=index, w=weight, t=time,
x.all=index_all, w.all=weight_all, type="year"),
by="coicop18"]
# annual change rates and contributions over time:
plot(ar~time, data=dtall[coicop18=="TOTAL",], type="l", xlab="Time", ylab="", ylim=c(-1,13))
lines(ribe~time, data=dtall[coicop18=="CP011"], col="red")
title("Contributions of food to overall inflation")
legend("topleft", col=c("red","black"), lty=1, bty="n",
legend=c("Contributions of food (in pp)", "Overall inflation (in %)"))
Most of the calculations shown in the previous two sections can be similarly applied to quarterly (or annual) index series. The owner-occupied housing price index (OOHPI) is a prominent example of a chained quarterly Laspeyres-type price index. The OOHPI indices and weights can be obtained from Eurostat’s database. Below, they are downloaded for the period from 2014 to 2024 for the euro area.
# download quarterly OOHPI for euro area:
dtp <- hicp::data(id="prc_hpi_ooq",
filters=list(unit="I15_Q", geo="EA"),
date.range=c("2014-10","2024-12"))
# download annual OOH weights for euro area:
dtw <- hicp::data(id="prc_hpi_ooinw",
filters=list(geo="EA"),
date.range=c("2014","2024"))
Before calculations can start, any time variables in the data must first be converted into proper dates. Afterwards, the indices and weights can be merged into a single data set.
# manipulate indices:
dtp[, c("year","quarter") := tstrsplit(x=time, split="-Q", fixed=TRUE)]
dtp[, "year":=as.integer(year)]
dtp[, "quarter":=as.integer(quarter)]
dtp[, "time":=as.Date(paste(year, quarter*3, "01", sep="-"), format="%Y-%m-%d")]
dtp[, c("unit","quarter"):=NULL]
setnames(x=dtp, old="values", new="index")
# manipulate item weights:
dtw[, "year":=as.integer(time)]
dtw[, c("unit","time"):=NULL]
setnames(x=dtw, old="values", new="weight")
# merge indices and item weights:
dtooh <- merge(x=dtp, y=dtw, by=c("geo","expend","year"), all.x=TRUE)
setcolorder(x=dtooh, neworder=c("geo","expend","year","time"))
setkeyv(x=dtooh, cols=c("geo","expend","time"))
The OOHPI is chained using the fourth quarter of the previous year.
Hence, for the aggregation of the OOHPI subcomponents, the indices must
first be unchained using the function unchain(). The
argument by of this function should now correspond to one month
of the relevant quarter. Hence, for the fourth quarter, by
should be set to 10, 11 or 12.
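The unchaining then works as usual.
# unchain indices using the fourth quarter of the previous year:
dtooh[, "ratio" := unchain(x=index, t=time, by=12), by="expend"]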
The subcomponents of the OOHPI do not follow the COICOP system.
Instead, they are classified into expenditure categories
(expend). These must be (manually) selected for index
aggregation. For example, the total OOHPI is an aggregate of the two
categories ‘acquisition of dwellings’ (DW_ACQ) and
‘ownership of dwellings’ (DW_OWN). These two expenditure
categories are further broken down into finer ones. In the following,
they are used to compute the overall OOHPI, which is finally chained and
rebased to the year 2015.
# aggregate, chain and rebase:
dtagg <- dtooh[expend%in%c("DW_ACQ","DW_OWN"), list("oohpi"=laspeyres(x=ratio, w0=weight)), by="time"]
dtagg[, "oohpi" := chain(x=oohpi, t=time)]
dtagg[, "oohpi" := rebase(x=oohpi, t=time, t.ref="2015")]
It is important to note that the functions unchain(),
chain() and rebase() auto-detect the frequency
of the time series. If users prefer to manually define the frequency,
the function settings can be changed to
settings=list(freq="quarter"). The same is true for the
derivation of annual (or quarterly) change rates:
# derive annual change rates:
dtagg[, "ar" := rates(x=oohpi, t=time, type="year", settings=list(freq="quarter"))]
The annual change rates ar show the percentage change of
the overall OOHPI in the current quarter compared to the same quarter
one year before. These change rates could be further decomposed into the
individual contributions of each expenditure category using the function
contrib().
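For a quarterly series, the comparison period of the annual rate lies four observations back. The base R sketch below uses made-up index values and assumes quarter labels of the form "YYYY-Qq" as in the code above; each quarter is dated by the first day of its last month, so that the fourth quarter falls into December.

```r
# quarterly labels and a made-up index
lab <- c("2023-Q4", "2024-Q1", "2024-Q2", "2024-Q3", "2024-Q4")
index <- c(100, 101, 102, 103, 104)

# split labels into year and quarter
parts <- do.call(rbind, strsplit(lab, "-Q", fixed = TRUE))
year <- as.integer(parts[, 1])
quarter <- as.integer(parts[, 2])

# date each quarter by the first day of its last month (Q4 -> December)
time <- as.Date(sprintf("%d-%02d-01", year, quarter * 3L))

# annual rate in percent: same quarter one year (four quarters) before
n <- length(index)
ar <- c(rep(NA_real_, 4), 100 * (index[5:n] / index[1:(n - 4)] - 1))
data.frame(time, index, ar)
```

With this dating convention, a link month of 12 identifies the fourth quarter, which is consistent with the choice of by described above.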