Data and Documentation

  • How are the data structured and identified?
  • A LIS (or LWS) dataset refers to harmonised microdata for one country (identified by its two-letter ISO country code) and one year (identified by the last two digits of the reference year).
    It consists of two datafiles: one household-level file (identified by the suffix h) and one individual-level file (identified by the suffix p) covering the members of those households.
    In addition, some LWS datasets include a household-level replicate weights file (identified by the suffix r).

    As an example, the United States 2010 LWS replicate weights file is identified by us10r. Note that because you cannot access LIS and LWS datasets simultaneously within the same job, the Luxembourg 2004 household-level LIS datafile and the Luxembourg 2004 household-level LWS datafile are identified by the same alias: lu04h.
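
    The naming scheme can be illustrated with a minimal Python sketch that splits an alias such as us10r into its parts. The function and class names (parse_alias, DatasetAlias) are hypothetical and not part of any LIS tool; the sketch also leaves the two-digit year as-is, because the alias alone does not reveal the century or whether the file belongs to LIS or LWS.

```python
import re
from typing import NamedTuple

# Suffixes described in the text: h = household, p = individual,
# r = replicate weights (some LWS datasets only).
FILE_TYPES = {"h": "household", "p": "individual", "r": "replicate weights"}

# Two letters (country), two digits (year), one suffix letter.
_ALIAS = re.compile(r"([a-z]{2})(\d{2})([hpr])")


class DatasetAlias(NamedTuple):
    """Hypothetical container for the parts of a dataset alias."""
    country: str      # two-letter country code, lower case
    year_suffix: str  # last two digits of the reference year
    file_type: str    # human-readable meaning of the suffix


def parse_alias(alias: str) -> DatasetAlias:
    """Split an alias like 'us10r' into country, year digits and file type."""
    match = _ALIAS.fullmatch(alias.strip().lower())
    if match is None:
        raise ValueError(f"not a recognised dataset alias: {alias!r}")
    country, year_suffix, suffix = match.groups()
    return DatasetAlias(country, year_suffix, FILE_TYPES[suffix])


if __name__ == "__main__":
    print(parse_alias("us10r"))
    print(parse_alias("lu04h"))
```

    Because lu04h is valid in both databases, any code built on such a parser would still need to know separately which database the job is running against.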

  • What does the year in the dataset name refer to?
  • The reference year included in the name of harmonised datasets refers to the year to which the income data pertain (LIS database) and to the year to which the wealth data pertain (LWS database).

    For example, a survey fielded in 2008 that reports income from the prior calendar year is treated as a 2007 dataset.

    Note: Because LIS and LWS define the reference year differently, the same survey data point can be named differently in the two databases (LIS DE11 vs. LWS DE12).

  • Where can I find country-specific information and documentation?
  • A comprehensive and detailed set of documentation about the LIS and LWS Databases is available through METIS, the LIS metadata information system.

    For both databases, METIS includes general information on variable definitions and dataset availability, as well as dataset-specific information on, among many other things, the characteristics of the original surveys, institutional information, variable content and availability, and basic statistics.

  • Can the data be used for longitudinal analysis?
  • LIS and LWS microdata are cross-sectional only. They cannot be used for household- or person-level longitudinal analysis. There are no identifiers that link households or persons across waves of data.

    Nevertheless, comparing results across multiple time points is possible in most cases, especially for LIS datasets, since the data span several decades.

  • Can the data be used for regional analysis?
  • The geographic location indicator (variable “REGION_C”) is filled in many datasets, which makes it possible to compare outcomes across sub-national regions within countries.
    It is also possible to compare sub-national regions from more than one country: for instance, economic outcomes in eastern Canada, the eastern United States and eastern Mexico. Finally, you may combine those three into a new entity such as eastern North America.
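
    Combining regions from several countries into one entity can be sketched as follows. This is a rough Python illustration only: the region codes ("east", "west"), the record layout, and the field names dhi and weight are assumptions made for the example, and real REGION_C codes differ by dataset and should be looked up in the documentation.

```python
from collections import defaultdict

# Hypothetical mapping from (country, region code) to a user-defined entity.
# The codes below are placeholders, not actual REGION_C values.
GROUPS = {
    ("ca", "east"): "eastern North America",
    ("us", "east"): "eastern North America",
    ("mx", "east"): "eastern North America",
}


def weighted_mean_by_entity(rows, groups, value="dhi", weight="weight"):
    """Weighted mean of `value` for each entity defined in `groups`.

    Rows whose (country, region_c) pair is not in `groups` are skipped.
    """
    totals = defaultdict(lambda: [0.0, 0.0])  # entity -> [weighted sum, weight sum]
    for row in rows:
        entity = groups.get((row["country"], row["region_c"]))
        if entity is None:
            continue
        totals[entity][0] += row[value] * row[weight]
        totals[entity][1] += row[weight]
    return {e: s / w for e, (s, w) in totals.items() if w > 0}


if __name__ == "__main__":
    # Made-up records purely to exercise the function.
    sample = [
        {"country": "ca", "region_c": "east", "dhi": 100.0, "weight": 1.0},
        {"country": "us", "region_c": "east", "dhi": 200.0, "weight": 3.0},
        {"country": "mx", "region_c": "west", "dhi": 999.0, "weight": 1.0},
    ]
    print(weighted_mean_by_entity(sample, GROUPS))
```

    In a real cross-country comparison the monetary amounts would first have to be expressed in a common currency and price year before being pooled.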

  • What does a value of . mean?
  • Observations coded as . mean that the information is not available. Consult the LIS and LWS guidelines for further information.

  • What is the difference between gross income datasets and net income datasets? How should I treat them in my analysis?
  • In most datasets in the LIS Database, the detailed income variables are filled with gross values, which is the ideal case: values before taxes and mandatory employee social contributions are deducted. In these datasets, the sum of these gross values is equal to total household income (reported in variable hi). We subtract taxes and contributions from hi to arrive at household disposable income (reported in variable dhi), which is often used as the basis of poverty and inequality analysis.

    However, in some cases, the original datasets report only net income, that is, values after taxes and mandatory employee social contributions have been deducted. You can reliably compare dhi across LIS datasets, regardless of whether we classify a dataset as net or gross.

  • What are PPPs and deflators? How can I use them with the microdata?
  • PPPs refer to Purchasing Power Parities, which are often used to adjust exchange rates, to account for cross-national differences in price levels. While exchange rates can be used for comparative research on income, most researchers prefer to use PPP-adjusted exchange rates, because PPP-adjusted incomes hold roughly equal purchasing power measured in international prices. Price deflators render currencies equivalent across years within countries.

    The income, wealth and consumption variables in the LIS and/or LWS microdata (available through LISSY) are reported in national currencies – as are the mean and median income values in the Key Figures. To compare monetary amounts across countries and over time, researchers have to convert these values into a common currency and a common year’s prices. LIS leaves it up to researchers to choose exchange rates and/or deflators, as they wish, when using the LIS and LWS microdata.

    LIS has applied a set of PPPs and deflators (taken from the World Development Indicators) to the currencies in the Web Tabulator. We converted all income variables from nominal local currency units to 2011 international dollars. The conversion was done by applying first a national price deflator to the nominal amounts to express them all in terms of year 2011 national prices. Those amounts were then converted to international dollars using PPPs.
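
    The two-step conversion described above (deflate to 2011 national prices, then apply the PPP) can be written out as a short Python sketch. It assumes the deflator is a price index for the same country with an arbitrary base year, and that the PPP is expressed as local currency units per international dollar in 2011; the function name and all numbers are hypothetical and are not actual World Development Indicators values.

```python
def to_2011_international_dollars(
    nominal_lcu: float,
    price_index_year: float,
    price_index_2011: float,
    ppp_2011: float,
) -> float:
    """Convert a nominal local-currency amount to 2011 international dollars.

    Step 1: rescale by the ratio of the 2011 price index to the price index
            of the income year, giving the amount in 2011 national prices.
    Step 2: divide by the 2011 PPP (local currency units per international
            dollar) to obtain international dollars.
    """
    if price_index_year <= 0 or price_index_2011 <= 0 or ppp_2011 <= 0:
        raise ValueError("price indices and PPP must be positive")
    real_2011_lcu = nominal_lcu * price_index_2011 / price_index_year
    return real_2011_lcu / ppp_2011


if __name__ == "__main__":
    # Made-up inputs: 18,000 units of local currency earned in a year whose
    # price index is 90, with a 2011 index of 100 and a 2011 PPP of 0.8.
    amount = to_2011_international_dollars(18_000, 90.0, 100.0, 0.8)
    print(round(amount, 2))  # about 25,000 international dollars
```

    Researchers working with the microdata directly are free to substitute other deflators or PPP series; only the order of the two steps matters for the result to be expressed in 2011 prices.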

  • Why are the values in the LIS Key Figures different from similar indicators distributed by other organisations, such as the OECD?
  • There are several reasons why indicators – such as poverty rates, inequality measures, and labour force outcomes – vary across sources. The main reason is that the underlying microdata from which these measures are constructed may be different, which would imply different sampling techniques, imputation techniques, income concepts, and so on. Furthermore, even when using the same microdata, each organisation makes its own decisions regarding, for example, defining countable income, equivalising income, weighting, top- and bottom-coding, PPPs, and so on.

    We make available extensive information about how we constructed our Key Figures (see the Key Figures programs), so that you can always understand how we arrived at our indicators.