Ajay Goel’s Updates

Week 2 community assignment

1. Describe the primary sampling units (PSU) and secondary sampling units (SSU) used in country C and in Kinshasa province in the study by Burnett et al.

PSU – 15 neighborhoods per zone, sampled using systematic probability proportional to estimated size (PPES) methods from a sampling frame listing all neighborhoods, with their estimated populations, in each zone.

SSU – 16% of households, geographically randomly selected in each of the 15 neighborhoods using GIS software. A listing of HHs by neighborhood was not available, so the selected neighborhoods were grouped by health facility (HF) catchment area, and the geographically random selection of HHs was done within HF catchment areas.
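The systematic PPES selection of neighborhoods described above can be sketched as follows. This is a minimal illustration of the general technique, not the survey team's actual procedure: the function name `systematic_pps` and the population figures are hypothetical.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def systematic_pps(sizes, n_clusters, rng=None):
    """Return indices of units chosen by systematic PPS sampling.

    sizes      -- estimated population of each unit in the frame
    n_clusters -- number of selections to make (15 per zone in the survey)
    A unit larger than the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))      # running total of sizes
    interval = cumulative[-1] / n_clusters    # sampling interval
    start = (1.0 - rng.random()) * interval   # random start in (0, interval]
    picks = []
    for k in range(n_clusters):
        point = start + k * interval
        # First unit whose cumulative size reaches the selection point.
        idx = bisect_left(cumulative, point)
        picks.append(min(idx, len(sizes) - 1))  # guard against float rounding
    return picks


# Hypothetical frame: estimated populations of six neighborhoods in one zone.
estimated_pop = [500, 1200, 300, 2500, 800, 700]
print(systematic_pps(estimated_pop, 3, random.Random(2024)))
```

Because selection probability follows the estimated size, any error in the population estimates carries straight into the selection probabilities, which is why the quality of the frame matters.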

2. The Kinshasa province survey described in Burnett et al. had 3 sampling stages, what was the third sampling stage (hint: look under “Survey Objectives and Sample Size”?)

First stage – Cluster selection: 15 neighborhoods were selected in each of the 12 zones.

Second stage – Household selection: households were selected randomly within HF catchment areas using GIS software.

Third stage – Child selection. For the HH survey: in HHs with >1 child aged 12–23 months, one of the children was randomly selected.

For the linked HH and HF survey: in HHs with >1 infant aged 6–11 months, one of the infants was randomly selected to participate.
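A minimal sketch of this third stage is shown below. The function name `select_child` and the example data are hypothetical. The returned factor reflects the fact that a child chosen from k eligible children has a 1/k chance of selection; whether and how the authors applied such a factor in their weights is not stated here, so treat that part as an assumption.

```python
import random


def select_child(eligible_children, rng=None):
    """Pick one eligible child at random from a household.

    Returns (child, k), where k is the number of eligible children.
    The selected child was chosen with probability 1/k, so k is the
    factor a weighted analysis could use for this stage (assumption).
    """
    rng = rng or random.Random()
    if not eligible_children:
        return None, 0          # no eligible child in this household
    child = rng.choice(eligible_children)
    return child, len(eligible_children)


# Hypothetical household with two children aged 12-23 months.
print(select_child(["child_A", "child_B"], random.Random(1)))
```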

3. Based on the definitions of probability sampling and sampling frame found in the 2015 Vaccination Coverage Survey Reference Manual (section 3.6, section 6.2 and annex A), what do you think of the sampling frames used in country C vs. the sampling frame used in the survey described by Burnett et al.? Describe potential limitations of the frames used and how they may relate to sampling bias

Country C: At the first stage, for each of the six provinces, the list of enumeration areas (EAs) with their number of households is the sampling frame for selection of primary sampling units (clusters). At the second stage, the list of households in each selected cluster (EA) is the sampling frame from which the required 22 households per cluster are selected using systematic random sampling.
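The second-stage systematic random selection of 22 households from a cluster's household list can be sketched as below. This is an illustration of the general method only; the function name `systematic_sample` and the 180-household list are hypothetical.

```python
import random


def systematic_sample(frame, n, rng=None):
    """Select n items from an ordered list using systematic random sampling."""
    rng = rng or random.Random()
    if n > len(frame):
        raise ValueError("sample size exceeds the number of listed households")
    interval = len(frame) / n            # fractional sampling interval
    start = rng.random() * interval      # random start in [0, interval)
    sample = []
    for k in range(n):
        idx = int(start + k * interval)  # floor to a list position
        sample.append(frame[min(idx, len(frame) - 1)])  # float-rounding guard
    return sample


# Hypothetical cluster with 180 listed households, numbered 1..180.
households = list(range(1, 181))
print(systematic_sample(households, 22, random.Random(42)))
```

This only works when a complete household list exists for the cluster, which is the key difference from the Kinshasa survey, where no such listing was available.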

Burnett et al.: At the first stage, the sampling frame is the list of all neighborhoods, with their estimated populations, in each of the 12 zones, provided by DRC EPI. At the second stage, after selection of the 15 neighborhoods in each zone, the neighborhoods were regrouped by HF catchment area. The sampling frame at this stage is all households visible in the satellite image or digital map, from which households were randomly selected within HF catchment areas using GIS software.

The sampling frame for Country C consists of well-defined enumeration areas used for census purposes. Population estimates for these EAs are likely to be more accurate than the EPI-provided population estimates for neighborhoods in the Burnett et al. survey. EAs in Country C also have well-defined administrative geographic boundaries, unlike the neighborhoods described in the Burnett paper.

Limitation of the Burnett et al. sampling frame: the absence of well-defined administrative geographic boundaries for neighborhoods can lead to coverage estimation problems at the neighborhood level.

In this survey, spatial sampling was not originally planned. Spatial sampling has its own sampling biases, which may require geostatistical methods to reduce their impact.

HF catchment areas are never strictly defined, and populations often overlap between catchment areas. Selecting the sample within catchment areas introduces proximity bias: HHs closer to an HF have a higher probability of being vaccinated than HHs farther from it.

Limitation of the Country C sampling frame: clusters tend to have homogeneous populations, so the same population characteristics are recorded multiple times, which may lead to overrepresentation of one group.

4. In the Kinshasa province survey described in Burnett et al., the expected sample size was not reached. The authors describe two potential factors that may have contributed to this. How could have this been prevented? What are the main consequences of not reaching the expected sample size?

The two factors mentioned are:

a) the use of the DHS HH line list, which only includes HHs with an adult female
b) a lower response rate than anticipated, because many HHs were not available during the survey period
 

Prevention

The following steps can be taken to prevent a high non-response rate:

Visiting HH in the evening

Visiting HH during weekend

Staying longer in the field

Checking internal migration patterns and increasing the number of HHs sampled if internal migration is an issue
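Increasing the number of HHs sampled can be planned in advance by inflating the target for expected eligibility and response. The sketch below uses simple arithmetic only; the function name `households_to_visit` and all input values are hypothetical and are not taken from the survey.

```python
import math


def households_to_visit(target_children, eligible_share, response_rate):
    """Households to visit so that the target number of children is reached.

    target_children -- completed interviews needed
    eligible_share  -- expected share of HHs with an eligible child
    response_rate   -- expected share of eligible HHs that respond
    """
    if not (0 < eligible_share <= 1 and 0 < response_rate <= 1):
        raise ValueError("shares must be in the interval (0, 1]")
    return math.ceil(target_children / (eligible_share * response_rate))


# Hypothetical planning values: a lower response rate means more HHs to visit.
print(households_to_visit(300, 0.25, 0.9))
print(households_to_visit(300, 0.25, 0.7))
```

If the assumed response rate or eligibility share is too optimistic, as appears to have happened in this survey, the fieldwork falls short of the target even when every sampled HH is visited.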

Consequences

Non-response bias: if the HHs that did not respond differ from those that did, the results will not truly represent the population being surveyed.

Wider confidence intervals, meaning less precise coverage estimates

Waste of money, time, and resources, and a missed opportunity
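The effect of a sample size shortfall on confidence interval width can be illustrated with a simple Wald approximation, inflated by a design effect for cluster sampling. This is an illustration only: the coverage, design effect, and sample sizes are hypothetical, and the function name `wald_half_width` is not from the paper.

```python
import math


def wald_half_width(p, n, deff=1.0, z=1.96):
    """Approximate 95% CI half-width for an estimated coverage proportion.

    p    -- estimated coverage (proportion between 0 and 1)
    n    -- number of children with completed interviews
    deff -- design effect from cluster sampling (assumed value)
    """
    return z * math.sqrt(deff * p * (1 - p) / n)


# Hypothetical values: 70% coverage, design effect of 2.
planned = wald_half_width(0.70, 600, deff=2.0)
achieved = wald_half_width(0.70, 400, deff=2.0)
print(f"planned n=600:  +/- {planned:.3f}")
print(f"achieved n=400: +/- {achieved:.3f}")
```

The half-width scales with one over the square root of n, so losing a third of the planned sample widens the interval by about 22%.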