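* user = your-id
* password = your-password
* package = stata
* project = lis

***Variable selection and data preparation***
local ccyy "" // enter the country-year identifier here, e.g. "us19"
local hhvars "did hid hwgt nhhmem nhhmem65 nhhmem17 dhi"
use `hhvars' using $`ccyy'h, clear
* keep only records with non-missing dhi
drop if dhi==.

***Bottom and top coding / outlier detection***
* create disposable household income in logs
gen dhi_log=log(dhi)
* keep negatives and 0 in the overall distribution of non-missing dhi
replace dhi_log=0 if dhi_log==. & dhi!=.
* compute the interquartile range of log income
qui sum dhi_log [w=hwgt], de
gen iqr=r(p75)-r(p25)
* bounds for extreme values: 3 x IQR beyond the quartiles
gen upper_bound=r(p75) + (iqr * 3)
gen lower_bound=r(p25) - (iqr * 3)
* top code income at upper bound for extreme values
replace dhi=exp(upper_bound) if dhi>exp(upper_bound)
* bottom code income at lower bound for extreme values
replace dhi=exp(lower_bound) if dhi<exp(lower_bound)

* NOTE: the lines between the bottom-coding step and the poverty loop are
* missing from the source. The definitions of ey, pwt, cwt and ewt, the
* thresholds looped over, and the start of the poverty-indicator line are
* ASSUMPTIONS inferred from the surviving variable names and display labels.

***Equivalisation (assumed: square-root scale)***
gen ey=dhi/(nhhmem^0.5)

***Weights (assumed: household weight times number of members in each group)***
gen pwt=hwgt*nhhmem
gen cwt=hwgt*nhhmem17
gen ewt=hwgt*nhhmem65

***Relative poverty rates***
* assumed thresholds: 40, 50 and 60 percent of median equivalised income
foreach r in 40 50 60 {
    quietly sum ey [w=pwt], de
    // poverty indicator (assumed form): 1 if below the threshold, 0 otherwise
    gen byte pov`r'=1 if ey!=.
    replace pov`r'=0 if ey>=r(p50)*.`r' & ey!=.
    quietly sum pov`r' [w=pwt]
    display "Relative Poverty Rates - Total Population (`r'%) = " r(mean)
    quietly sum pov`r' [w=cwt]
    display "Relative Poverty Rates - Children (`r'%) = " r(mean)
    quietly sum pov`r' [w=ewt]
    display "Relative Poverty Rates - Elderly (`r'%) = " r(mean)
}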
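
***Optional check: poverty lines behind the rates (sketch)***
* A minimal sketch, not part of the original job. It prints the poverty
* line in national currency units for each threshold, which can help
* sanity-check the rates reported above.
* Assumptions: ey (equivalised income) and pwt (person weight) already
* exist in memory, and the thresholds 40, 50 and 60 are illustrative
* values that should match the ones used in the poverty loop.
foreach r in 40 50 60 {
    quietly sum ey [w=pwt], de
    display "Poverty line at `r'% of median equivalised income = " r(p50)*.`r'
}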