ZAF_2011_QLFS-Q1_v02_M
Quarterly Labour Force Survey 2011
First Quarter
Name | Country code |
---|---|
South Africa | ZAF |
Labor Force Survey [hh/lfs]
The Quarterly Labour Force Survey (QLFS) is a household-based sample survey conducted by Statistics South Africa (Stats SA). It collects data on the labour market activity of individuals aged 15 years or older who live in South Africa.
Starting in 2005, Stats SA undertook a major revision of the Labour Force Survey (LFS). This revision resulted in changes to the survey methodology, the survey questionnaire, the frequency of data collection and data releases, and the survey data capture and processing systems. The redesigned labour market survey is the QLFS, which was launched in 2008.
Sample survey data [ssd]
Individuals, households
v02: Edited, anonymous dataset for public distribution.
This version of the QLFS 2011 Q1 was downloaded from the Statistics South Africa (Stats SA) website in April 2014 as a revision to the version previously downloaded in January 2012.
The two versions have different weights. Stats SA updated the QLFS results (2008-2013) to reflect the new population benchmarks from Census 2011. Although the weighting changes are not clearly documented by Stats SA, users are advised to remain aware of these slight calibration differences when employing weights.
Household characteristics, household listing, demographics, education, economic activity, work for pay, business ownership, unemployment, employers, main work activity in the past week, wages, salary, employment, migration
National Coverage
The QLFS sample covers the non-institutional population except for workers’ hostels. However, persons living in private dwelling units within institutions are also enumerated. For example, within a school compound, one would enumerate the schoolmaster’s house and teachers’ accommodation because these are private dwellings. Students living in a dormitory on the school compound would, however, be excluded.
Name |
---|
Statistics South Africa |
Survey requirements and design :
The Labour Force Survey frame has been developed as a general purpose household survey frame that can be used by all other household surveys irrespective of the sample size requirement of the survey. The sample size for the QLFS is roughly 30 000 dwellings and these are divided equally into four rotation groups, i.e. 7 500 dwellings per rotation group.
The sample is based on information collected during the 2001 Population Census conducted by Stats SA. In preparation for the 2001 census, the country was divided into 80 787 enumeration areas (EAs). Some of these EAs are small in terms of the number of households that were enumerated in them at the time of Census 2001. Stats SA's household-based surveys use a Master Sample which comprises EAs that are drawn from across the country. For the purposes of the Master Sample, the EAs that contained fewer than 25 households were excluded from the sampling frame, and those that contained between 25 and 99 households were combined with other EAs to form Primary Sampling Units (PSUs). The number of EAs per PSU ranges between one and four. On the other hand, very large EAs represent two or more PSUs.
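The frame-building rule above can be sketched as a small classifier. This is only an illustration of the two documented household thresholds (25 and 99); the function name and category labels are hypothetical, and the cut-off at which a "very large" EA is split into several PSUs is not stated in the source, so it is not modelled here.

```python
def classify_ea(households: int) -> str:
    """Return how an EA is treated when the Master Sample frame is built."""
    if households < 0:
        raise ValueError("household count cannot be negative")
    if households < 25:
        return "excluded"            # dropped from the sampling frame
    if households <= 99:
        return "combine"             # pooled with other EAs to form one PSU
    # Forms a PSU on its own; very large EAs are split into two or more PSUs,
    # but the size at which that happens is not documented here.
    return "own_psu_or_split"


# Hypothetical EA household counts, for illustration only.
counts = {"EA-A": 12, "EA-B": 25, "EA-C": 99, "EA-D": 100, "EA-E": 640}
treatment = {ea: classify_ea(n) for ea, n in counts.items()}
```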
The sample is designed to be representative at the provincial level and within provinces at the metro/non-metro level. Within the metros, the sample is further distributed by geography type. The four geography types are urban formal, urban informal, farms and tribal. This implies, for example, that within a metropolitan area the sample is designed to be representative of the different geography types that may exist within that metro.
The current sample size is 3 080 PSUs. It is equally divided into four sub-groups or panels called rotation groups. The rotation groups are designed in such a way that each of these groups has the same distribution pattern as that which is observed in the whole sample. They are numbered from one to four and these numbers also correspond to the quarters of the year in which the sample will be rotated for the particular group.
The sample for the redesigned Labour Force Survey is based on a stratified two-stage design with probability proportional to size (PPS) sampling of primary sampling units (PSUs) in the first stage, and sampling of dwelling units (DUs) with systematic sampling in the second stage.
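A minimal sketch of a two-stage selection of this kind follows. The source states only that PSUs are drawn with PPS and dwellings with systematic sampling; the use of systematic PPS at the first stage, the function names, the PSU sizes and the assumption that a PSU's size equals its number of dwellings are all illustrative choices, not Stats SA's documented procedure.

```python
import random


def pps_systematic(sizes, n, rng):
    """Stage 1: pick n PSU indices with probability proportional to size."""
    total = sum(sizes)
    step = total / n                      # sampling interval on the size scale
    start = rng.random() * step           # random start in [0, step)
    chosen, cumulative, idx = [], 0.0, 0
    for i in range(n):
        point = start + i * step
        # Advance to the PSU whose cumulative size range contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen


def systematic_sample(units, n, rng):
    """Stage 2: pick n units at a fixed interval after a random start."""
    if not 0 < n <= len(units):
        raise ValueError("n must be between 1 and the number of units")
    step = len(units) / n
    start = rng.random() * step
    picks = (min(int(start + i * step), len(units) - 1) for i in range(n))
    return [units[p] for p in picks]


rng = random.Random(2011)                 # fixed seed so the sketch is repeatable
psu_sizes = [120, 80, 200, 150, 90, 60, 300, 100]   # made-up size measures
selected_psus = pps_systematic(psu_sizes, 3, rng)

# List the dwellings in the first selected PSU and draw 10 of them.
first_psu = selected_psus[0]
dwellings = [f"PSU{first_psu}-DU{j:03d}" for j in range(psu_sizes[first_psu])]
selected_dwellings = systematic_sample(dwellings, 10, rng)
```

In this scheme a PSU larger than the sampling interval could be hit more than once; the example sizes are all below the interval, so the three selected PSUs are distinct.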
Sample rotation :
The sampled PSUs have been assigned to four rotation groups, and dwellings selected from the PSUs assigned to rotation group "1" are rotated in the first quarter. Similarly, the dwellings selected from the PSUs assigned to rotation group "2" are rotated in the second quarter, and so on. Thus, each sampled dwelling remains in the sample for four consecutive quarters. Note that the sampling unit is the dwelling, and the unit of observation is the household. Therefore, if a household moves out of a dwelling after being in the sample for, say, two quarters and a new household moves in, the new household will be enumerated for the next two quarters. If no household moves into the sampled dwelling, the dwelling will be classified as vacant (unoccupied).
Each quarter, ¼ of the sampled dwellings rotate out of the sample and are replaced by new dwellings from the same PSU or the next PSU on the list. A total of 3 080 PSUs were selected for the redesigned LFS, and 770 have been assigned to each of the four rotation groups.
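The rotation arithmetic can be sketched as follows. The constants come from the text (3 080 PSUs, four rotation groups, four consecutive quarters in sample); the function names are hypothetical and the sketch ignores the replacement rule for exhausted PSUs.

```python
TOTAL_PSUS = 3080
ROTATION_GROUPS = 4
QUARTERS_IN_SAMPLE = 4

psus_per_group = TOTAL_PSUS // ROTATION_GROUPS   # 770, as stated in the text


def rotation_group_for(quarter):
    """Rotation group whose dwellings are replaced in the given quarter."""
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    return quarter                                # group g rotates in quarter g


def quarters_in_sample(entry_year, entry_quarter):
    """Return the (year, quarter) pairs in which a dwelling is enumerated."""
    if entry_quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3 or 4")
    result = []
    for offset in range(QUARTERS_IN_SAMPLE):
        index = (entry_quarter - 1) + offset
        result.append((entry_year + index // 4, index % 4 + 1))
    return result


# A dwelling that enters in 2011 Q3 stays until 2012 Q2 and is replaced in 2012 Q3.
schedule = quarters_in_sample(2011, 3)
```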
Western Cape - 81.9%
Eastern Cape - 98.9%
Northern Cape - 91.5%
Free State - 95.9%
KwaZulu-Natal - 97.4%
North West - 94.9%
Gauteng - 82.0%
Mpumalanga - 96.2%
Limpopo - 99.5%
South Africa - 92.9%
Stats SA updated the QLFS results (2008-2013) to reflect the new population benchmarks from Census 2011. Although the weighting changes are not clearly documented by Stats SA, users are advised to remain aware of these slight calibration differences between the previous version and the current (revised) data version when employing weights.
The sampling weights for the data collected from the sampled households are constructed so that the responses can be properly expanded to represent the entire civilian population of South Africa. The weights are the result of calculations involving several factors, including original selection probabilities, adjustment for non-response, and benchmarking to known population estimates from the Demographic division of Stats SA. The base weight is defined as the product of the provincial Inverse Sampling Rate (ISR) and three adjustment factors, namely an adjustment factor for informal PSUs, an adjustment factor for subsampling of growth PSUs, and an adjustment factor to account for small EAs excluded from the sampling frame (i.e. EAs with fewer than 25 households).
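The base weight definition amounts to a simple product, sketched below. The parameter names and the numeric values are hypothetical; the source does not publish the actual ISRs or adjustment factors.

```python
def base_weight(isr, informal_adj=1.0, growth_adj=1.0, small_ea_adj=1.0):
    """Base weight = provincial ISR x three adjustment factors."""
    factors = (isr, informal_adj, growth_adj, small_ea_adj)
    if any(f <= 0 for f in factors):
        raise ValueError("all factors must be positive")
    return isr * informal_adj * growth_adj * small_ea_adj


# Made-up inputs: ISR of 400, a growth PSU subsampled at 1 in 2,
# and a 2% inflation for small EAs left out of the frame.
example = base_weight(isr=400, informal_adj=1.0, growth_adj=2.0, small_ea_adj=1.02)
```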
Non-response adjustment:
In general, imputation is used for item non-response (i.e. blanks within the questionnaire) and edit failure (i.e. invalid or inconsistent responses). The eligible households in the sampled dwellings can be divided into two response categories, respondents and non-respondents, and a weight adjustment is applied to account for the non-respondent households (e.g. refusal, no contact, etc.). Sampled dwellings with no eligible households (e.g. foreigners only) or no households (i.e. vacant dwellings) do not contribute to the survey. The non-response adjusted weight is the product of the base weight and the non-response adjustment factor. If the PSU-level non-response rate is too high, the non-response adjustment is applied at the VARUNIT level, where two VARUNITs have been created by grouping PSUs within strata. PSU-level non-response adjustment is applied only if the corresponding adjustment factor is less than 1.5.
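The fallback rule can be sketched as below. The adjustment factor itself is not reproduced in this document, so the sketch assumes the conventional form (weighted eligible households divided by weighted responding households within a cell); the record layout, function names and data are hypothetical, and ineligible or vacant dwellings are assumed to have been dropped beforehand.

```python
from collections import defaultdict

THRESHOLD = 1.5   # PSU-level factor must be below this to be used


def adjustment_factors(households, key):
    """Weighted eligible / weighted responding, per PSU or per VARUNIT."""
    eligible = defaultdict(float)
    responding = defaultdict(float)
    for h in households:
        eligible[h[key]] += h["base_weight"]
        if h["responded"]:
            responding[h[key]] += h["base_weight"]
    return {g: (eligible[g] / responding[g]) if responding[g] else float("inf")
            for g in eligible}


def nonresponse_adjusted_weights(households, threshold=THRESHOLD):
    """Return {household id: adjusted weight} for responding households."""
    psu_factor = adjustment_factors(households, "psu")
    varunit_factor = adjustment_factors(households, "varunit")
    adjusted = {}
    for h in households:
        if not h["responded"]:
            continue
        factor = psu_factor[h["psu"]]
        if factor >= threshold:               # PSU non-response too high
            factor = varunit_factor[h["varunit"]]
        adjusted[h["id"]] = h["base_weight"] * factor
    return adjusted


def make_psu(psu, varunit, responded_flags):
    """Build made-up household records with a base weight of 100 each."""
    return [{"id": f"{psu}{i}", "psu": psu, "varunit": varunit,
             "base_weight": 100.0, "responded": r}
            for i, r in enumerate(responded_flags, start=1)]


households = (make_psu("A", 1, [True, True, True, False])
              + make_psu("B", 1, [True, True, False, False]))
weights = nonresponse_adjusted_weights(households)
```

In this example PSU A has a factor of about 1.33 and is adjusted at PSU level, while PSU B has a factor of 2.0 and falls back to the VARUNIT factor of 1.6.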
Final survey weights:
The final survey weights are constructed using regression estimation to calibrate to known population counts. National-level population estimates (supplied by the Demography division), cross-classified by 5-year age group, gender and race, and provincial population estimates by broad age group are used for calibration weighting. The 5-year age groups are: 0–4, 5–9, 10–14, …, 55–59, 60–64, and 65 and over. The provincial-level age groups are: 0–14, 15–34, 35–64, and 65 years and over. The final weights are constructed in such a manner that all persons within a household have the same weight.
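One way to obtain a single weight per household while still hitting person-level benchmarks is linear (regression) calibration on household-level person counts. The sketch below is an assumption-laden illustration: it calibrates only to four broad age-group totals, uses made-up households, weights and benchmarks, and is not Stats SA's production procedure. Each household weight becomes d(1 + x'λ), with λ chosen so that the weighted person counts equal the benchmarks.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular calibration system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def calibrate(design_weights, aux, totals):
    """Linear calibration: w = d * (1 + x'lam) so that sum(w * x) = totals."""
    p = len(totals)
    pairs = list(zip(design_weights, aux))
    m = [[sum(d * x[i] * x[j] for d, x in pairs) for j in range(p)]
         for i in range(p)]
    gap = [totals[i] - sum(d * x[i] for d, x in pairs) for i in range(p)]
    lam = solve(m, gap)
    return [d * (1 + sum(l * xi for l, xi in zip(lam, x))) for d, x in pairs]


# Persons per household in age groups 0-14, 15-34, 35-64, 65+ (made up).
persons = [[2, 2, 0, 0], [0, 1, 1, 0], [1, 0, 2, 0],
           [0, 0, 1, 1], [0, 0, 0, 2], [3, 1, 1, 0]]
adjusted = [310.0, 295.0, 300.0, 305.0, 290.0, 300.0]   # pre-calibration weights
benchmarks = [1900.0, 1250.0, 1500.0, 950.0]            # hypothetical totals

final = calibrate(adjusted, persons, benchmarks)
calibrated_totals = [sum(w * x[i] for w, x in zip(final, persons))
                     for i in range(4)]
```

Because the weight is attached to the household, every person in it inherits the same value. Linear calibration can in principle produce negative weights, which a production system would need to guard against.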
The questionnaire consists of the following sections:
Section 1 - Biographical information (marital status, language, migration, education, training, literacy, etc.)
Section 2 - Economic activities
Section 3 - Unemployment and economic inactivity
Section 4 - Main work activities in the last week
Section 5 - Earnings in the main job
All sections - Comprehensive coverage of all aspects of the labour market
Start | End |
---|---|
2011-01-11 | 2011-03-15 |
Name |
---|
Statistics South Africa |
Data Processing
Introduction :
The purpose of data processing is to ensure that the information collected from the sampled primary sampling units, dwelling units and households (i.e. the boxes containing QLFS questionnaires) is physically received, stored and processed. The aim is to produce a clean dataset that has all the information contained in the questionnaires. Except for the scanning system, all other elements of the data processing system were developed in-house. One important innovation that is central to the smooth operation of the entire system is the development of barcodes that are linked to a unique number on each questionnaire. This information provides the link between the information recorded in the Master Sample database and other processes such as editing and imputation as well as weighting and variance estimation.
Processing phases :
QLFS data processing is continuous, starting in the second week of every month. Data processing for each quarter must be completed by the first Friday of the subsequent month to ensure that the four-week deadline for publication of the QLFS results is met.
The phases listed below occur sequentially.
Receiving of questionnaires :
The contents of the boxes containing questionnaires sent from the regional offices are verified when received at the DPC. The questionnaire barcodes captured in the provinces are captured again at the DPC to ensure that all questionnaires have been received.
Primary preparation :
The purpose of primary preparation is to ensure that all questionnaires are correctly stacked and positioned prior to being guillotined.
Guillotining:
The purpose of the guillotine process is to cut off the spines of the questionnaires in order to have pages separated for scanning.
Secondary preparation :
The purpose of secondary preparation is to ensure that the questionnaires are correctly stacked and positioned for scanning. At the same time, quality assurance takes place on the work done during the primary preparation and guillotining processes.
Scanning :
The purpose of scanning and recognition is to convert the questionnaires into an electronic format and Tagged Image File Format (TIFF) images.
Verification :
The purpose of scanning verification is to manually correct un-interpretable characters, missing data and errors detected by validation rules.
Electronic coding:
Industry and occupation codes are assigned using the electronic coding system which converts the respondents' industry and occupation descriptions into numeric codes based on Standard Industry Classification (SIC) and South African Standard Occupation Classification (SASCO). If the system fails to assign a code for either industry or occupation, the coding is assigned manually.
Automated editing and imputation :
QLFS uses the editing and imputation module to ensure that output data is both clean and complete. There are three basic components, called functions, in the Edit and Imputation Module:
Function A: Record acceptance
Function B: Edit and imputation
Function C: Clean up, derived variables and preparation for weighting
Function A: Record acceptance
This function is divided into three phases:
First phase: Pre-function A :
The first phase ensures that the records contain valid information in selected Cover Page questions required during edit and imputation and during the subsequent weighting and variance estimation. Any blanks or other errors that need to be corrected are done here before processing of the record can proceed.
Second phase: Function A record acceptance :
The second phase ensures that there is enough demographic and labour market activity information to ensure that editing and imputation can be successfully completed.
Third phase: Post Function A clean up :
This phase ensures that certain data are present where there is evidence that they should be. This involves, for example:
• Ensuring that if there is written material in the job description questions then there are corresponding industry and occupation codes for them.
• Ensuring that partial blanks or non-numeric characters that appear in questions where the Survey Officer is required to enter numbers are validated.
• Ensuring that where there is written material in the space provided for "Other - specify" that the corresponding option is marked.
Function B: Edit and imputation :
Having determined in Function A that the content of the record would support extensive editing and imputation, this function carries out those activities. Editing is the detection of errors in the captured questionnaire. Imputation is the correction of the detected errors.
Function C: Clean up, derived variables and preparation for weighting :
Function C includes all of the "post E&I clean up" functions such as "Off-path cleaning", "Result Code validation", verification of the presence of industry and occupation codes, and the generation of all derived variables.
Electronic data processing systems have been developed to ensure that the key QLFS results are published four weeks after the end of data collection each quarter. The system is fully automated and includes the seven sub-systems discussed in detail in the subsections that follow (9.3.1 to 9.3.6):
Real Time Management System (RTMS) :
RTMS serves two important functions. Firstly, it is a management tool for personnel involved with field operations to monitor progress. Secondly, it provides an important link between field operations and data processing as follows:
• Ensures that publicity information at the PSU and dwelling unit level can be rapidly assessed, thus allowing for speedy intervention should the need arise.
• Enables the tracking and monitoring of PSU listing books and questionnaires from the provinces to HO as well as through the processing phase.
Variance estimation:
The most commonly used methods for estimating variances of survey estimates from complex surveys, such as the QLFS, are the Taylor Series Linearization, Jackknife Replication, Balanced Repeated Replication (BRR), and Bootstrap methods (Wolter, 2007). We implemented the replication method for the QLFS mainly because of its simplicity. The QLFS sampled 3 080 PSUs by selecting an even number of 4 or more PSUs from within strata. The Jackknife method would be applicable for the sample design with more than two PSUs per stratum, but this would result in 3 080 replicates, which would be computationally very intensive. Fay's BRR method, on the other hand, is applicable when two primary sampling units (PSUs) are sampled from each stratum. Therefore we decided to use Fay's BRR method by collapsing PSUs into two groups of PSUs within each stratum.
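A minimal sketch of Fay's BRR for a design with two variance units per stratum follows. The Fay coefficient k = 0.5, the Sylvester construction of the Hadamard matrix, the record layout and the data are assumptions made for illustration; the source does not state which coefficient or replicate count Stats SA uses. In each replicate, one variance unit per stratum has its weights multiplied by 2 - k and the other by k, and the variance is the mean squared deviation of the replicate estimates divided by (1 - k)².

```python
def hadamard(order):
    """Sylvester construction of a +1/-1 matrix; order is a power of two."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def weighted_total(pairs):
    """Estimator used in the example: sum of weight * value."""
    return sum(w * y for w, y in pairs)


def fay_brr_variance(records, estimator, k=0.5):
    """records: (stratum, varunit, weight, y) tuples with varunit in {1, 2}."""
    strata = sorted({r[0] for r in records})
    order = 1
    while order <= len(strata):          # need more rows than strata
        order *= 2
    h = hadamard(order)
    column = {s: i + 1 for i, s in enumerate(strata)}   # skip the all-ones column
    full = estimator([(w, y) for _, _, w, y in records])
    squared = 0.0
    for row in h:
        replicate = []
        for stratum, unit, w, y in records:
            boosted = (row[column[stratum]] == 1) == (unit == 1)
            replicate.append((w * ((2 - k) if boosted else k), y))
        squared += (estimator(replicate) - full) ** 2
    return squared / (order * (1 - k) ** 2)


# Made-up data: three strata, two variance units each.
records = [
    (1, 1, 10.0, 3.0), (1, 1, 10.0, 5.0), (1, 2, 10.0, 4.0),
    (2, 1, 20.0, 2.0), (2, 2, 20.0, 1.0), (2, 2, 20.0, 2.0),
    (3, 1, 5.0, 10.0), (3, 2, 5.0, 4.0),
]
variance = fay_brr_variance(records, weighted_total)
```

For a weighted total, the result matches the textbook two-units-per-stratum estimate, the sum over strata of the squared difference between the two unit totals (here 40² + 20² + 30² = 2 900), which is a useful sanity check on any replicate-weight implementation.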
Other measures of precision:
In practice, the sampling variance itself is hardly ever reported. Instead, users find it more useful to rely on one of the derivatives of the sampling variance, such as the standard error, the coefficient of variation, the margin of error, or the confidence interval. These are all related expressions, and it is quite easy to go from one to the other using simple mathematical operations.
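The relationships between these measures can be sketched as follows. The function name is hypothetical, the example figures are made up, and the multiplier z = 1.96 assumes a 95% interval under a normal approximation, which the source does not specify.

```python
import math


def precision_measures(estimate, variance, z=1.96):
    """Derive the usual precision measures from a sampling variance."""
    if variance < 0:
        raise ValueError("variance cannot be negative")
    se = math.sqrt(variance)                     # standard error
    cv = se / abs(estimate) if estimate else float("inf")   # coefficient of variation
    moe = z * se                                 # margin of error
    return {
        "standard_error": se,
        "cv": cv,
        "margin_of_error": moe,
        "confidence_interval": (estimate - moe, estimate + moe),
    }


# Made-up estimate of 1 000 with a sampling variance of 2 500.
measures = precision_measures(1000.0, 2500.0)
```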
See the metadata and QLFS guide for more detailed information.
Name | URL | Email |
---|---|---|
Statistics South Africa | http://www.statssa.gov.za | distribution@statssa.gov.za |
Users may apply or process this data, provided Statistics South Africa (Stats SA) is acknowledged as the original source of the data; that it is specified that the application and/or analysis is the result of the user's independent processing of the data; and that neither the basic data nor any reprocessed version or application thereof may be sold or offered for sale in any form whatsoever without prior permission from Stats SA.
Statistics South Africa. Quarterly Labour Force Survey 2011: Q1 [dataset]. Version 02. Pretoria: Statistics South Africa [producer], 2011. Cape Town: DataFirst [distributor], 2012.
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Name | Affiliation | Email | URL |
---|---|---|---|
User Information Services | Statistics South Africa | info@statsa.gov.sa | http://www.statssa.gov.za |
DataFirst | University of Cape Town | info@data1st.org | http://www.datafirst.uct.ac.za |
Statistics South Africa - Printing and Distribution | | distribution@statssa.gov.za | http://www.statssa.gov.za |
DDI_ZAF_2011_QLFS-Q1_v02_M_WB
Name | Affiliation | Role |
---|---|---|
Development Economics Data Group | The World Bank | Documentation of the DDI |
2011-11-20
Version 02 (April 2014)
This version is identical to Version 01 except for revisions to the data.