Data and Documentation

  • How are the data structured and identified?
  • A LIS (or LWS) dataset refers to harmonised microdata for one country (identified by the two-letter ISO country code) and one year (identified by the last two digits of the reference year).
    It consists of two datafiles: one household-level file (identified by the suffix h) and one individual-level file (identified by the suffix p) covering the members of those households.
    In addition, some LWS datasets include a household-level replicate weights file (identified by the suffix r).

    As an example, the United States 2010 LWS replicate weights file is identified by us10r. Note that because you cannot access LIS and LWS datasets simultaneously within the same job, the Luxembourg 2004 household-level LIS datafile and the Luxembourg 2004 household-level LWS datafile are identified by the same alias: lu04h.
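
    The naming rule above can be sketched in a few lines of Python. This is only an illustration: the function name and its checks are hypothetical and are not part of LISSY.

```python
def dataset_alias(country_code: str, year: int, suffix: str) -> str:
    """Build an alias such as 'us10r' (hypothetical helper, not a LISSY function)."""
    if len(country_code) != 2 or not country_code.isalpha():
        raise ValueError("country_code must be a two-letter code")
    if suffix not in {"h", "p", "r"}:  # household, person, replicate weights
        raise ValueError("suffix must be 'h', 'p' or 'r'")
    # Keep only the last two digits of the reference year, zero-padded.
    return f"{country_code.lower()}{year % 100:02d}{suffix}"


print(dataset_alias("US", 2010, "r"))
print(dataset_alias("LU", 2004, "h"))
```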

  • What does the year in the dataset name refer to?
  • The reference year included in the name of harmonised datasets refers to the year to which the income data pertain (LIS database) and to the year to which the wealth data pertain (LWS database).

    For example, a survey fielded in 2008 that reports income from the prior calendar year is considered a 2007 dataset.

    Note: Because the reference year is defined differently for LIS and LWS datasets, the same survey data point can be named differently in the two databases (LIS DE11 vs. LWS DE12).

  • Where can I find country-specific information and documentation?
  • A comprehensive and detailed set of documentation about the LIS and LWS Databases is available through METIS, the LIS metadata information system.

    METIS includes, for both databases, general information on variable definitions and dataset availability, as well as dataset-specific information on, among many others, the characteristics of the original surveys, institutional information, variable content and availability, and basic statistics.

  • Can the data be used for longitudinal analysis?
  • LIS and LWS microdata are cross-sectional only. They cannot be used for household- or person-level longitudinal analysis. There are no identifiers that link households or persons across waves of data.

    Nevertheless, comparing results across multiple time points is possible in most cases because the data, especially the LIS datasets, span several decades.

  • Can the data be used for regional analysis?
  • The geographic location indicator (variable “REGION_C”) is filled in many datasets, which makes it possible to compare outcomes across sub-national regions within countries.
    It is also possible to compare sub-national regions from more than one country: for instance, economic outcomes in eastern Canada, the eastern United States and eastern Mexico. You may also combine those three into a new entity such as eastern North America.

  • What does a value of . mean?
  • Observations coded as . mean that the information is not available. Consult the LIS and LWS guidelines for further information.

  • What is the difference between gross income datasets and net income datasets? How should I treat them in my analysis?
  • In the LIS Database, the detailed income variables in most datasets are filled – as is ideal – with gross values, that is, values before taxes and mandatory employee social contributions are deducted. In these datasets, the sum of these gross values is equal to total household income (reported in variable hi). We subtract taxes and contributions from hi to arrive at household disposable income, often used as the basis of poverty and inequality analysis (reported in variable dhi).

    However, in some cases, the original datasets report only net income, that is, values after taxes and mandatory employee social contributions have been deducted. You can reliably compare dhi across LIS datasets, regardless of whether the dataset is classified by us as a net versus a gross dataset.
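
    The relationship between hi and dhi in a gross dataset can be illustrated with a small arithmetic sketch. The component names and amounts below are invented for illustration and are not actual LIS variable names; only hi and dhi come from the text above.

```python
def disposable_income(gross_components, taxes_and_contributions):
    """Return (hi, dhi) for one household in a gross dataset (illustrative only)."""
    # Total household income: the sum of the gross income components.
    hi = sum(gross_components.values())
    # Disposable income: total income minus taxes and social contributions.
    dhi = hi - taxes_and_contributions
    return hi, dhi


# Hypothetical household with invented amounts in national currency.
components = {"labour": 40000.0, "capital": 2000.0, "transfers": 3000.0}
hi, dhi = disposable_income(components, taxes_and_contributions=9000.0)
print(hi, dhi)
```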

  • What are PPPs and deflators? How can I use them with the microdata?
  • PPPs refer to Purchasing Power Parities, which are often used to adjust exchange rates, to account for cross-national differences in price levels. While exchange rates can be used for comparative research on income, most researchers prefer to use PPP-adjusted exchange rates, because PPP-adjusted incomes hold roughly equal purchasing power measured in international prices. Price deflators render currencies equivalent across years within countries.

    The income, wealth and consumption variables in the LIS and/or LWS microdata (available through LISSY) are reported in national currencies – as are the mean and median income values in the Key Figures. To compare monetary amounts across countries and over time, researchers have to convert these values into a common currency and a common year’s prices. LIS leaves it up to researchers to choose exchange rates and/or deflators, as they wish, when using the LIS and LWS microdata.

    LIS has applied a set of PPPs and deflators (taken from the World Development Indicators) to the currencies in the Web Tabulator. We converted all income variables from nominal local currency units to 2011 international dollars. The conversion was done by applying first a national price deflator to the nominal amounts to express them all in terms of year 2011 national prices. Those amounts were then converted to international dollars using PPPs.
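
    The two-step conversion described above can be sketched as follows. This is a minimal illustration, not the program LIS uses: the function name, the parameter names and the example numbers are all invented, and it assumes that the deflator is a price index and that the PPP is expressed as local currency units per international dollar.

```python
def to_2011_international_dollars(nominal_lcu, deflator_year, deflator_2011, ppp_2011):
    """Convert a nominal local-currency amount to 2011 international dollars."""
    # Step 1: express the nominal amount in 2011 national prices.
    real_lcu_2011 = nominal_lcu * (deflator_2011 / deflator_year)
    # Step 2: convert 2011 local currency units to international dollars.
    return real_lcu_2011 / ppp_2011


# Invented inputs: a price index of 90 in the survey year and 100 in 2011,
# and a PPP of 0.8 local currency units per international dollar.
amount = to_2011_international_dollars(
    nominal_lcu=36000.0, deflator_year=90.0, deflator_2011=100.0, ppp_2011=0.8
)
print(round(amount, 2))
```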

  • Why are the values in the LIS Key Figures different from similar indicators distributed by other organisations, such as the OECD?
  • There are several reasons that indicators – such as poverty rates, inequality measures, and labour force outcomes – vary across sources. The main reason is that the underlying microdata from which these measures are constructed may be different, which would imply different sampling techniques, imputation techniques, income concepts, and so on. Furthermore, even if using the same microdata, each organization makes its own decisions regarding, for example, defining countable income, equivalising income, weighting, top- and bottom-coding, PPPs, and so on.

    We make available extensive information about how we constructed our Key Figures (see the Key Figures programs), so that you can always understand how we arrived at our indicators.

Managing LISSY Jobs and Listings

  • What statistical packages can I use with LISSY?
  • R (3.4.2), SAS (9.4), SPSS (22), and Stata (16.1) programs all work with LISSY.

  • May I use external files with the LIS databases?
  • If you wish to use external files with the microdata, send your request, along with the attached file, to usersupport@lisdatacenter.org.
    Your request will be reviewed and, if it meets our security standards, you will receive an email with instructions on how to access your file.

  • I get the error message wrong header. How can I avoid this?
  • This error message is received – while submitting a job via email – when LISSY cannot properly read the first four-line header of the job. The easiest – and recommended – way to solve this problem is to submit jobs via the web-based Job Submission Interface.

    If you prefer to submit jobs via email, ensure that the following requirements are met so that LISSY can process your jobs properly, regardless of the programming language used:

    • All emails must be sent in ASCII/plain text format. Ensure your email software is properly configured.
    • All job instructions must be written inside the body of the email and not as an attachment.
    • Each job must start exactly with a specific four-line header at the very beginning of the email body:
    *user = <your userid>
    *password = <your password> (case-sensitive)
    *package = <statistical package chosen> (SAS, SPSS, Stata or R)
    *project = <project to access> (LIS or LWS)

    If the header contains an error, LISSY returns an error message by email to the address from which the job request was submitted.
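
    As a rough aid for email submissions, the header can be assembled and checked locally before sending. The helper below is a hypothetical sketch: it reproduces the four header lines exactly as shown above and checks that the text is plain ASCII, but it does not replicate LISSY's own validation, and the credentials are placeholders.

```python
def build_job(user, password, package, project, program):
    """Return an email body made of the four-line header followed by the program."""
    if package not in {"SAS", "SPSS", "Stata", "R"}:
        raise ValueError("package must be SAS, SPSS, Stata or R")
    if project not in {"LIS", "LWS"}:
        raise ValueError("project must be LIS or LWS")
    header = [
        f"*user = {user}",
        f"*password = {password}",
        f"*package = {package}",
        f"*project = {project}",
    ]
    body = "\n".join(header + [program])
    # Raises UnicodeEncodeError (a ValueError) if any character is not ASCII.
    body.encode("ascii")
    return body


# Placeholder credentials and program text.
job = build_job("myuser", "mypassword", "Stata", "LIS", "<program code>")
print(job)
```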

  • What does set for review mean, and how can I avoid this?
  • LISSY automatically applies pre-processing checks on received jobs to authenticate the user and to ensure that the confidentiality of our data is never breached.

    As a result, some program syntaxes and commands may trigger security alerts. This includes, for example, syntax that displays frequencies on continuous variables or that would allow users to print individual records.
    As soon as LISSY detects that a job is potentially suspicious, it stores the job in a security area for review and sends the error message set for review.

    As a consequence:

    • Adjust your code to avoid syntax that could allow variables to be printed, regardless of the programming language used
    • Do not add the following commands in your job:


     print, NOXWAIT, NOXSYNC

     list, erase, rm, pwd, cd, rmdir, type, dir, ls
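
    Before submitting, a program can be scanned locally for the commands listed above. The sketch below is a crude, hypothetical pre-check: it matches whole words without regard to case, so it will also flag these words inside comments or used as variable names, and it says nothing about how LISSY itself screens jobs.

```python
import re
from typing import List, Tuple

# Commands taken from the list above, compared in lower case.
FLAGGED = {
    "print", "noxwait", "noxsync",
    "list", "erase", "rm", "pwd", "cd", "rmdir", "type", "dir", "ls",
}


def find_flagged_commands(program: str) -> List[Tuple[int, str]]:
    """Return (line number, word) pairs for every flagged word in the program."""
    hits = []
    for lineno, line in enumerate(program.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z_]+", line):
            if word.lower() in FLAGGED:
                hits.append((lineno, word))
    return hits


# Invented two-line program used only to exercise the check.
example = "summarize dhi\nlist hid dhi in 1/5\n"
print(find_flagged_commands(example))
```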

  • What does handling listing mean, and how can I avoid this?
  • The most common reason LISSY holds listings in the secure area for review is that they are excessively long.

    As a consequence:

    • Split your program code into smaller parts and send several shorter job submissions
    • As much as possible, limit the number of datasets combined in a single run
    • Include in your code statistical commands to shorten outputs such as:


     OPTIONS nosource nonotes

     nolog, quietly or noisily

  • I get an error message in my output. Is there a way to debug my program?
  • To debug a program, test your code on your own computer using the downloadable LIS or LWS sample files before submitting it to LISSY.

    Be aware that you cannot draw any conclusions from these artificial samples, which contain only a small random sub-sample of households and their respective household members.
    They have been created for instructional and debugging purposes only.

  • Can I receive non-ASCII output/listings from LISSY such as worksheets, HTML-pages, etc.?
  • So far, LISSY only sends back plain ASCII output/listings from SAS, SPSS, R, and Stata.

  • Why do I receive the error message Could not connect to the server?
  • The Job Submission Interface freezes, or the error message Could not connect to the server is displayed, when LISSY is temporarily down.
    Please contact user support at usersupport@lisdatacenter.org if you face this problem.

If you still experience an issue not addressed above, please contact usersupport@lisdatacenter.org. For issues related to LISSY, provide user support with the following information whenever possible:

  1. The type of access used (web-based Job Submission Interface versus email)
  2. The network and the operating system from which you attempted to access LISSY (e.g. at Princeton University on 64-bit Windows 7)
  3. A description of the exact stage at which the issue occurs. Screenshots are welcome
  4. The error message you received, if any
  5. The date and time when the problem occurred
  6. If the problem is related to a specific job, please mention the job number