ZAF_1993-2019_PALMS_v01_M
Post Apartheid Labour Market Series 1993-2019
Name | Country code |
---|---|
South Africa | ZAF |
Labor Force Survey [hh/lfs]
Kerr, A. Lam, D. and M. Wittenberg. Post-Apartheid Labour Market Series 1993-2019 [dataset]. Version 3.3. Cape Town: DataFirst [producer and distributor], 2019. DOI: https://doi.org/10.25828/gtr1-8r20
The Post-Apartheid Labour Market Series (PALMS) version 3.3 is a stacked cross-sectional dataset created by DataFirst at the University of Cape Town. The data consist of microdata from 69 household surveys conducted by Statistics South Africa between 1994 and 2019, as well as the 1993 Project for Statistics on Living Standards and Development conducted by SALDRU at UCT. The Statistics South Africa surveys include the October Household Surveys from 1994 to 1999, the bi-annual Labour Force Surveys from 2000 to 2007 (including the smaller LFS pilot survey from February 2000), and the Quarterly Labour Force Surveys from 2008 to 2019. The data are at the individual level, but household-level variables may be created using the household id variable uqnr. No attempt has been made to link individuals or households across waves, although there was a panel element to the earlier rounds of the LFS, as well as the QLFS.
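As a minimal sketch of how household-level variables might be derived from the individual-level records, the Python example below groups toy person rows by household. Only uqnr is a variable name taken from this description; the `wave` and `employed` fields, and the values shown, are hypothetical and used purely for illustration. Because households are not linked across waves, the sketch keys each household on the wave as well as on uqnr.

```python
from collections import defaultdict

# Toy person-level records. `uqnr` is the household id named in the text;
# `wave` and `employed` are hypothetical field names used for illustration.
persons = [
    {"wave": "2019q1", "uqnr": "A1", "employed": 1},
    {"wave": "2019q1", "uqnr": "A1", "employed": 0},
    {"wave": "2019q1", "uqnr": "B7", "employed": 0},
    {"wave": "2019q2", "uqnr": "A1", "employed": 1},
]


def household_summary(rows):
    """Aggregate person rows to one record per (wave, uqnr) pair."""
    summary = defaultdict(lambda: {"hhsize": 0, "n_employed": 0})
    for row in rows:
        # Key on the wave as well: households are not linked across waves,
        # so the same uqnr in two waves is treated as two households.
        key = (row["wave"], row["uqnr"])
        summary[key]["hhsize"] += 1
        summary[key]["n_employed"] += row["employed"]
    return dict(summary)


households = household_summary(persons)
for key, values in sorted(households.items()):
    print(key, values)
```

With real data the same grouping would be applied after loading the PALMS file into whatever tabular structure the analysis uses.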
Sample survey data [ssd]
Households and individuals
v3.3: Edited, anonymised dataset for public distribution
2019-09-02
The current version of the Post-Apartheid Labour Market Series (PALMS) is version 3.3, covering 1993-2019. Data from Statistics SA's Quarterly Labour Force Surveys 2018 and 2019 have been added to this dataset. This version also contains the new calibrated weights for the data, produced on 4 March 2020. Providing the weights in this way is a temporary solution; they will be incorporated into PALMS version 3.4, currently being prepared by DataFirst.
Earlier versions of this dataset were:
Version 1.0.1:
The following derived variables were added to this version of the dataset:
enrollment3, enrolled: enrolment variables
hrslstwk: Hours worked in last week
employer1, businesstype1, businesstype2: questions about wage/self employment. More detailed in LFSs.
publicemp: a dummy for whether the individual is employed in the public sector
Changes in versions 1.0.2 and 1.0.3 were not recorded.
Version 1.0.4
The following derived variables were added to this version of the dataset:
Improved psu LFS 03:1 variable.
personnum in OHSs and LFSs to allow easier merging, on advice from Nicola Branson.
Extra EA/PSU variables in some years where these were missing.
numlabels were also added where these were missing.
Version 1.0.5
The following derived variables were added to this version of the dataset:
EA/PSU variables for both 2000 waves, both 2006 waves and both 2007 waves. EA variables may not always be consistent ACROSS waves but are correct WITHIN waves to allow for checks on the number of hh per ea, which look right in ALL LFS waves now.
Version 1.0.6.
In this version the 1994 income data from the OHS 1994 has been corrected, and the 1994 data no longer has the imputations and fixes from Statistics South Africa.
The OHS 1994 wage employment income is included as wageempincome2 and wageempincome3 (different for gross and net responses).
Version 1.0.7
This version has label changes (data signature is the same as version 1.0.6)
Version 1.0.8
In this version the variable uqnr_orig has been added, which is a string version of the household id variable exactly as it appeared in each survey. This will assist those researchers who wish to merge in extra data from the OHSs or LFSs.
Version 1.0.9, October 2012
The following variables have been added to this version of the data:
(i) The LFS variable for hours worked in last week
(ii) A homogenised EA variable
(iii) A variable indicating formality of firm an individual owns or works for
(iv) Variables for number of workers for self-employed (worker numbers for OHS only): selfformalreg selfvatreg selfpaidemp selfunpaidemp wageformalreg formalreg
In this version the OHS 1997 industry variable has been corrected to include industry of the self-employed.
Version 2.0, August 2013
(i) Included the QLFS up until March 2012 (inclusive)
(ii) Reworked approach to labour income variable creation (see documentation for more information)
Version 2.1, September 2013 (modified September 2015)
(i) Included a separate datafile with multiple imputations for labour income. This datafile is called "palmsv2.1miincomes" and can be used for analysis of trends in labour income over time. Details of the imputation process can be found in the document titled "Multiply Imputed Labour Income Data, PALMS v2.1 (1994-2012)"
(ii) Also included with this version of the PALMS are the cross-entropy weights
Not all the files in this dataset are version 2.1. DataFirst versions at file level, so only the files which have been updated will be re-versioned in a new release. This prevents users having to download the files which have not been changed.
Some of the data files in this version of the dataset - the ones that remain unchanged - will therefore still be version 2. The dataset will receive the version number of the latest versioned file.
The PALMS 1994-2012 version 2.1 dataset consists of the following files:
The data files (unchanged version 2 files)
The cross-entropy weights (unchanged version 1.2 files)
The incomes data (unchanged version 2)
The imputed incomes data version 2.1
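The multiply imputed labour income file listed above holds more than one imputed value per observation, so an analysis is normally run once per imputation and the results are then combined. The Python sketch below shows one standard way of combining them, Rubin's rules. This is named here as a common technique, not as the procedure documented by DataFirst, and the numbers are invented for illustration.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation results with Rubin's rules.

    estimates: one point estimate per imputed dataset
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists with at least two imputations")
    pooled = mean(estimates)            # average of the point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Invented example: mean log earnings estimated on three imputed datasets.
pooled_estimate, total_variance = pool_rubin([10.0, 12.0, 11.0], [1.0, 1.0, 1.0])
print(pooled_estimate, total_variance)
```

The total variance exceeds the average within-imputation variance whenever the estimates differ across imputations, which is how the uncertainty due to imputation enters the standard error.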
Version 3.1 includes data from the QLFS 2013-2015 and data from the 1993 survey "Project for Statistics on Living Standards and Development", conducted by SALDRU at UCT.
Version 3.2 added data from the QLFS 2016-2017 to version 3.1.
There are currently over 120 variables in the dataset and over 5.7 million observations, including observations on children and the elderly. The variables included are mainly those to do with the labour market, although some household variables, such as dwelling type and access to services, as well as access to government social grants, are also included.
Topic | Vocabulary | URI |
---|---|---|
employment [3.1] | CESSDA | http://www.nesstar.org/rdf/common |
unemployment [3.5] | CESSDA | http://www.nesstar.org/rdf/common |
housing [10.1] | CESSDA | http://www.nesstar.org/rdf/common |
specific social services: use and provision [15.3] | CESSDA | http://www.nesstar.org/rdf/common |
income, property and investment/saving [1.5] | CESSDA | http://www.nesstar.org/rdf/common |
education [6] | CESSDA | http://www.nesstar.org/rdf/common |
National coverage
The lowest level of geographic aggregation in PALMS is province.
The target population is all households. Coverage of workers' hostels, convents/monasteries, as well as institutions such as old age homes, hospitals, prisons and military barracks varied across the surveys. Data users will need to consult the individual OHS, LFS and QLFS datasets for information on the universe for each survey.
Name | Affiliation |
---|---|
Andrew Kerr | DataFirst, University of Cape Town |
David Lam | University of Michigan |
Martin Wittenberg | DataFirst, University of Cape Town |
Name | Role |
---|---|
Statistics South Africa | Producer of original datasets used to create the PALMS dataset |
PALMS v3+ includes several weight variables: the person weights released by Statistics South Africa/SALDRU, cross entropy weights created by Nicola Branson from SALDRU at the University of Cape Town, and cross entropy weights created by Takwanisa Machemedze at DataFirst, UCT, who uses Branson’s method but with the 2008 model for the South African population from the Actuarial Society of South Africa (ASSA). Branson used the ASSA 2003 model. The cross entropy weights are the weighting variables preferred by DataFirst because the Stats SA weights presented in the data are problematic for analyses over time, for two main reasons. First, the auxiliary data used as a benchmark in the post-stratification adjustment are unreliable and inconsistent over time, and hence result in temporal inconsistencies even at the aggregate level. Second, since the adjustments were made at the person level until 2003, there is no hierarchical consistency between the person and household weighted series until 2003. Thus estimates at the household and person level may disagree. Branson’s weights, created using entropy estimation, result in consistent demographic and geographic trends and can be used at both the person and household level. Machemedze’s weights should be used, as these update Branson’s method using the newest ASSA model. For further details on the cross entropy approach see Branson and Wittenberg (2014).
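To illustrate how a person weight enters an estimate over time, the Python sketch below computes a weighted unemployment rate per year from toy records. The field names `year`, `status` and `ceweight`, and all the values, are hypothetical; the actual names of the weight variables should be taken from the PALMS documentation.

```python
from collections import defaultdict

# Toy person records. `year`, `status` and `ceweight` are hypothetical
# field names; the weights are invented for illustration.
rows = [
    {"year": 2018, "status": "employed", "ceweight": 1200.0},
    {"year": 2018, "status": "unemployed", "ceweight": 400.0},
    {"year": 2018, "status": "inactive", "ceweight": 900.0},
    {"year": 2019, "status": "employed", "ceweight": 1000.0},
    {"year": 2019, "status": "unemployed", "ceweight": 1000.0},
]


def weighted_unemployment_rate(records, weight="ceweight"):
    """Weighted unemployed total divided by weighted labour force, per year."""
    unemployed = defaultdict(float)
    labour_force = defaultdict(float)
    for r in records:
        # Only the employed and the unemployed are in the labour force.
        if r["status"] in ("employed", "unemployed"):
            labour_force[r["year"]] += r[weight]
            if r["status"] == "unemployed":
                unemployed[r["year"]] += r[weight]
    return {year: unemployed[year] / labour_force[year] for year in labour_force}


rates = weighted_unemployment_rate(rows)
print(rates)
```

Passing a different `weight` name makes it straightforward to compare the series produced by two sets of weights on the same records.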
Version 3.3 of PALMS includes a Stata data file with new weights for the data, “palms_v1_cewgt_care_model_20200304”, which will eventually be added to version 3.4 of PALMS.
The weights are calibrated to mid-year population estimates from the CARe model discussed in Dorrington, Machemedze and Kerr 2020. The CARe model population estimates are an improvement on the previously used ASSA model estimates, as the new model is based on updated assumptions and is consistent with recent data. The CARe model population estimates are also available for a longer period (1991-2022) than the mid-year population estimates from Statistics South Africa (2002-2019).
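Because the CARe-calibrated weights are distributed as a separate file, they have to be attached to the main data before use. The Python sketch below shows one cautious way to do this on toy records. The variables uqnr and personnum appear in this description, but the assumption that both files share them as keys, the `wave` field and the weight column name `ceweight_care` are all hypothetical; check the actual files before merging.

```python
# Toy stand-ins for the main data and the separate weights file.
# Key fields and the `ceweight_care` column name are assumptions.
main_rows = [
    {"wave": "2019q1", "uqnr": "A1", "personnum": 1},
    {"wave": "2019q1", "uqnr": "A1", "personnum": 2},
    {"wave": "2019q1", "uqnr": "B7", "personnum": 1},
]
weight_rows = [
    {"wave": "2019q1", "uqnr": "A1", "personnum": 1, "ceweight_care": 950.5},
    {"wave": "2019q1", "uqnr": "A1", "personnum": 2, "ceweight_care": 950.5},
]
KEYS = ("wave", "uqnr", "personnum")


def attach_weights(main, weights, keys=KEYS, column="ceweight_care"):
    """Left-join weights onto the main rows and count unmatched persons."""
    lookup = {}
    for w in weights:
        key = tuple(w[c] for c in keys)
        if key in lookup:
            # A duplicate key would make the merge ambiguous.
            raise ValueError(f"duplicate key in weights file: {key}")
        lookup[key] = w[column]
    merged = [
        {**row, column: lookup.get(tuple(row[c] for c in keys))} for row in main
    ]
    unmatched = sum(1 for row in merged if row[column] is None)
    return merged, unmatched


merged, unmatched = attach_weights(main_rows, weight_rows)
print(unmatched, "person record(s) without a weight")
```

Counting unmatched records after the join is a simple check that the chosen keys really identify the same persons in both files.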
Start | End |
---|---|
1993 | 2019 |
Name | Affiliation | URL | Email |
---|---|---|---|
DataFirst | University of Cape Town | http://support.data1st.org | support@data1st.org |
Public use files, available to all
Kerr, A. Lam, D. and M. Wittenberg. Post-Apartheid Labour Market Series 1993-2019 [dataset]. Version 3.3. Cape Town: DataFirst [producer and distributor], 2019. DOI: https://doi.org/10.25828/gtr1-8r20
Name | Affiliation | Email | URL |
---|---|---|---|
DataFirst Helpdesk | University of Cape Town | support@data1st.org | http://support.data1st.org/ |
World Bank Microdata Library | The World Bank | | microdata.worldbank.org |
DDI_ZAF_1993-2019_PALMS_v01_M
Name | Affiliation | Role |
---|---|---|
DataFirst | University of Cape Town | Metadata Producer |
2020-05-03
Version 01: This survey metadata is identical to the metadata for the same survey (zaf-datafirst-palms-1993-2019-v3.3) available on the DataFirst website, except for the Document ID and Study ID.