/*
The Inequality Key Figures disseminated on the LIS website are computed using the
R programming language. When attempting to replicate these figures in Stata, small
discrepancies may occasionally arise. These differences are due to variations in
numerical precision between the two software environments, as well as
dataset-specific characteristics. For example, in datasets with very small
subsamples (such as poorsm – rs19), results may be sensitive to whether a few
households are classified as poor or not. In other datasets with many identical
values, even minor precision differences can affect poverty classification and
lead to mismatches. Although such discrepancies are very rare, users seeking full
replication of the published figures are advised to use R.
For further questions, please contact the LIS User Support team at usersupport@lisdatacenter.org
*/

// package = stata
// project = lis

// To select specific datasets, other than an entire country series, use option 'ccyy()' instead of 'iso2()' --> example: ccyy(de22 it16)

// 1) Load data
lissyuse, iso2(lu) ///
    hvars(hid hwgt nhhmem dhi dname year nhhmem65 nhhmem17) // (data is at household-level)

levelsof dname, local(levels)
foreach ccyy of local levels {
    preserve // keep the full country series so that the next dataset can be selected after restore
    di "`ccyy'"
    keep if dname == "`ccyy'"

    // 2) Data preparation

    *===================================
    * Filter out missing observations
    *===================================
    drop if missing(dhi)

    *===================================
    * Create person\children\elderly weights (data is at household-level)
    *===================================
    generate double pwt = hwgt * nhhmem
    generate double cwt = hwgt * nhhmem17
    generate double ewt = hwgt * nhhmem65

    *===================================
    * Bottom and top coding / outlier detection
    *===================================
    generate double dhi_log = log(dhi)
    replace dhi_log = 0 if dhi_log == . & dhi != . // keep negatives and 0
    sort dhi_log hwgt
    qui percentils dhi_log [aw = hwgt], p(25 75)
    local p75 = e(Perc_75)
    local p25 = e(Perc_25)
    gen double iqr = `p75' - `p25' // interquartile range
    gen double upper_bound = `p75' + (iqr * 3) // upper bound for extreme values
    gen double lower_bound = `p25' - (iqr * 3) // lower bound for extreme values
    replace dhi=exp(upper_bound) if dhi>exp(upper_bound) // top code income at upper bound for extreme values
    replace dhi=exp(lower_bound) if dhi= poverty_line & ey!=.
        cap drop poverty_line

        qui sum pov`x' [w=pwt]
        display "Relative Poverty Rates - Total Population (`x'%) = " r(mean)
        qui sum pov`x' [w=cwt]
        display "Relative Poverty Rates - Children (`x'%) = " r(mean)
        qui sum pov`x' [w=ewt]
        display "Relative Poverty Rates - Elderly (`x'%) = " r(mean)
    }
    restore // bring the full country series back for the next iteration
}
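
*===================================
* Illustration only (not part of the LIS routine): storage precision and poverty classification
*===================================
* A minimal sketch of the precision effect described in the header note. It assumes the only
* difference is single versus double storage; the values 0.7 and 0.1 are hypothetical and stand
* in for an income that coincides with the poverty line. It is not a diagnosis of any specific
* R/Stata mismatch. It uses only display and the built-in float() function, so the data in
* memory are left unchanged.
display float(0.7) < 0.7    // 1: rounded to float, 0.7 falls just below a double threshold of 0.7
display 0.7 < 0.7           // 0: in double precision the same value is not below the threshold
display float(0.1) == 0.1   // 0: the float and double representations of 0.1 differ
* This is one reason the routine above creates its variables with "generate double": a household
* whose income ties with the poverty line could otherwise switch between poor and non-poor.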