
Module A3 (2018) Survey ANALYST Creator project

Project Overview

Project Description

IMPORTANT: THIS PROJECT IS ONLY FOR SURVEY ANALYSTS.

Your Creator assignment is to draft an analysis plan that contains the following sections and tasks.

  1. Describe data cleaning checks
  2. Describe your plan for weighting
  3. Prepare table shells for five indicators
  4. Calculate the results for one of your tables
  5. Generate a graphical summary of one vaccination coverage indicator for one dose across all 13 strata
  6. Summarize methods
  7. Summarize results
  8. Identify caveats or concerns
  9. Identify strengths and limitations

DO NOT START THIS PROJECT IF YOU ARE A SURVEY MANAGER.

 


Analyst Creator Project: Module A3

I. Data Cleaning Checks:

Data cleaning is an essential step before any analysis. It removes invalid data and leaves a quality dataset for analysis. To clean the given dataset from Nigeria (combined Multiple Indicator Cluster Survey and National Immunization Coverage Survey data), the following checks were performed using Stata 15.0.

1. No missing ID variables (stratum ID number, cluster ID number, household ID and individual number for child) or respondent ID

2. No duplication of the unique ID variable

3. No missing values for mandatory variables such as sex, wealth index and vaccination status

4. Age must be at least 12 months and less than 24 months

5. All vaccination dates should fall between the date of birth and the date of interview

6. No vaccine dose should be given before its minimum valid age

7. The minimum interval between doses of multi-dose vaccines (Pentavalent, OPV and PCV) should be 28 days

8. BCG vaccination date should not be before the birth date

9. No date should contain an invalid value, such as a day greater than 31, 30 February or 31 June

10. If the card was seen, there must be at least one tick or one date recorded on it

11. If the card was not seen, there must be no tick-on-card or date-on-card value for any vaccine

12. If the child received at least one vaccine, a place of immunization must be recorded

13. If the child received no vaccine, a reason must be recorded

Data were recorded on tablets, and photographs of the home-based vaccination cards were taken, so missing values could be checked and records traced back through ID or key variables. Invalid dates were checked against the cards and, if no record was found, were treated as missing data. For multi-dose vaccines, doses dated before the minimum interval had elapsed were considered invalid. Records with duplicate ID variables were checked and deleted from the analysis. Once appropriate and justified values for missing data were obtained, they were entered using the appropriate Stata commands. Data cleaning took 4 weeks.
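A few of the checks listed above (duplicate IDs, invalid dates, age range and the 28-day interval) could be sketched in Stata as follows. This is only an illustration on a toy data set: every variable name and value in it is hypothetical and is not taken from the survey files.

```stata
* Toy data: all variable names and values are hypothetical
clear
input stratumid clusterid hhid childid str10 dob_s str10 intv_s str10 penta1_s str10 penta2_s
1 1 1 1 "2016-01-10" "2017-03-01" "2016-02-25" "2016-03-30"
1 1 2 1 "2015-12-05" "2017-03-02" "2016-01-20" "2016-02-05"
1 2 1 1 "2014-11-01" "2017-03-03" "2015-01-01" "2015-02-15"
1 2 1 1 "2016-02-30" "2017-03-03" "" ""
end

* Convert text dates to Stata dates; impossible dates become missing
foreach v in dob intv penta1 penta2 {
    gen `v' = date(`v'_s, "YMD")
    format `v' %td
}

* Check 1: count records with any missing ID variable
count if missing(stratumid, clusterid, hhid, childid)

* Check 2: tag records that share the same unique ID
duplicates tag stratumid clusterid hhid childid, generate(dup)

* Check 9: a date was entered but is not a real calendar date
gen byte bad_dob = missing(dob) & dob_s != ""

* Check 4: age at interview must be 12-23 completed months
gen age_m = floor((intv - dob) / 30.4375)
gen byte bad_age = !inrange(age_m, 12, 23) if !missing(age_m)

* Check 5: dose date must lie between birth and interview
gen byte bad_range = !inrange(penta1, dob, intv) if !missing(penta1, dob, intv)

* Check 7: at least 28 days between consecutive doses
gen byte short_gap = (penta2 - penta1) < 28 if !missing(penta1, penta2)

* List every record that fails at least one check
list stratumid clusterid hhid childid dup bad_dob bad_age bad_range short_gap ///
    if dup > 0 | bad_dob == 1 | bad_age == 1 | bad_range == 1 | short_gap == 1
```

Flag variables are compared with 1 explicitly because Stata treats a missing value as true in an if condition.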

II. Plan for weighting:

Because the survey used multistage probability sampling, selection probabilities differ between respondents, so weighted analysis was done for the outcome indicators. The sampling weight was calculated in three steps: 1) calculating the design weight, 2) adjusting for non-response, 3) post-stratification weights for aggregation.

1) Calculation of design weight – This is the reciprocal of each respondent's probability of selection in the survey. It requires the probabilities of selection of the clusters, the households and the respondents. The probability of selection of a cluster was calculated as the number of clusters selected divided by the total number of clusters in the state. Similarly, the probability of selection of a household was calculated as the number of households selected divided by the total number of households in that cluster. The probability of selection of each child was calculated as the number of children included in the survey from a household divided by the total number of children in that household. The probabilities at all stages of selection are multiplied together, and the reciprocal of this product is the design weight.
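The design weight described above can be written compactly. The symbols are introduced here only for illustration and are not part of the original plan: a of the A clusters in the state are selected, b of the B households in the cluster, and c of the C children in the household.

```latex
\[
p_{\text{cluster}} = \frac{a}{A}, \qquad
p_{\text{household}} = \frac{b}{B}, \qquad
p_{\text{child}} = \frac{c}{C}
\]
\[
w_{\text{design}}
= \frac{1}{p_{\text{cluster}} \, p_{\text{household}} \, p_{\text{child}}}
= \frac{A}{a} \cdot \frac{B}{b} \cdot \frac{C}{c}
\]
```

With made-up numbers, selecting 30 of 600 clusters, 10 of 100 households and 1 of 2 children gives a design weight of 20 x 10 x 2 = 400.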

2) Adjustment for non-response – Because absence or refusal (non-response) made it impossible to get information from all selected respondents, the design weight was adjusted to transfer weight from non-participants to participants, so that the survey results remain representative of the target population. The non-response adjustment factor is the ratio of the number of selected respondents to the number of respondents who completed the questionnaire. This factor is multiplied by the design weight to give the final weight after non-response adjustment. Non-response information from each stage (household stage and respondent stage) can be used for this.
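The non-response adjustment can be expressed in the same way. The symbols are again illustrative assumptions; when non-response is adjusted for at both the household and the respondent stage, one factor per stage is multiplied in.

```latex
\[
f_{\text{NR}} = \frac{n_{\text{selected}}}{n_{\text{completed}}}, \qquad
w_{\text{final}} = w_{\text{design}} \times f_{\text{NR}}^{\text{household}} \times f_{\text{NR}}^{\text{respondent}}
\]
```

With made-up numbers, if 10 respondents were selected and 8 completed the questionnaire, the factor is 10 / 8 = 1.25, and a design weight of 400 becomes 400 x 1.25 = 500 when only that one stage is adjusted.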

3) Post-stratification weights for aggregation – Because the data were collected by a Multiple Indicator Cluster Survey (MICS) team, the household listing is thorough and of high quality, and the survey data should correctly estimate the relative proportion of respondents from each stratum. Post-stratification is therefore not required for aggregating weights in this case.

For further explanation of the weights, refer to the WHO vaccination coverage cluster survey manual (2018).

III. Table Shells for five indicators and IV. Completed Table, with their explanations and interpretations, are given in the attached Word file below

updated_20tables.docx

IV. Completed Table

Table 1: Percentage of fully immunized children aged 12-23 months, by gender, as per history or card, Nigeria 2016-17 (N=1728)

WN - Weighted Number, CI - Confidence Interval

Weighted coverage of full immunization as per card or history, with confidence intervals, was calculated for each state and zone using Stata, and the Stata results were pasted into this table. The total number of children (unweighted and weighted) contributing to the outcome for each state is stated in the table. Gender-specific coverage is also given for each state in view of concerns about equity. No group has fewer than 25 children (unweighted).

“19.6% (17.1-22.3) of the children aged 12-23 months in the North East Zone who were eligible for the survey are estimated to be fully vaccinated, having received <BCG, OPV1,2,3, PENTA1,2,3, MCV1>.” The 95% confidence interval gives the range within which the target population coverage is likely to lie. By contrast, this indicator is 42.5% (39.1-45.9) in the South South Zone. The 95% CI can be used to calculate the DEFF and ICC, which can serve as parameters for sample size calculation in future surveys.
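For reference, the usual relationship between the design effect (DEFF) and the intracluster correlation coefficient (ICC) is sketched below. The symbol m, the average number of respondents per cluster, is an assumption introduced here; the approximation holds most closely when cluster sizes are similar.

```latex
\[
\mathrm{DEFF} =
\frac{\widehat{\mathrm{Var}}_{\text{design}}(\hat{p})}{\widehat{\mathrm{Var}}_{\text{SRS}}(\hat{p})},
\qquad
\mathrm{DEFF} \approx 1 + (m - 1)\,\mathrm{ICC}
\quad\Longrightarrow\quad
\mathrm{ICC} \approx \frac{\mathrm{DEFF} - 1}{m - 1}
\]
```

A future survey would then inflate the sample size needed under simple random sampling by the expected DEFF.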

V. Table Filling Syntax:

The Stata commands used to fill the table are given here as commented syntax, and the do-file is attached. The code is commented so that the results can be replicated.

* Declare the survey design for the data set
svyset clusterid [pweight = psweight], strata(stratumid)
* Weighted crude fully vaccinated coverage across all states, with confidence intervals
svy: proportion fully_vaccinated_crude, percent citype(wilson) over(MICS_5_hh7)
* Gender-specific crude fully vaccinated coverage for each state
svy: proportion fully_vaccinated_crude if MICS_5_hl4==1, percent citype(wilson) over(MICS_5_hh7)
svy: proportion fully_vaccinated_crude if MICS_5_hl4==2, percent citype(wilson) over(MICS_5_hh7)
* Weighted crude fully vaccinated coverage across the 2 zones, with confidence intervals
svy: proportion fully_vaccinated_crude, percent citype(wilson) over(MICS_5_zone)
* Gender-specific crude fully vaccinated coverage for the zones
svy: proportion fully_vaccinated_crude if MICS_5_hl4==1, percent citype(wilson) over(MICS_5_zone)
svy: proportion fully_vaccinated_crude if MICS_5_hl4==2, percent citype(wilson) over(MICS_5_zone)
* Data for the graphical presentation of crude Penta 3 coverage across states and zones
svy: proportion got_crude_penta3_c_or_h, percent citype(wilson) over(MICS_5_hh7)
svy: proportion got_crude_penta3_c_or_h, percent citype(wilson) over(MICS_5_zone)
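
A possible variant of the gender-specific commands is sketched below; it is not part of the submitted do-file. Stata's survey documentation recommends subpop() over an if qualifier for subgroup estimates, because if drops the other observations before the variance is estimated. The sketch assumes the same data set, variable names and svyset declaration as the syntax above are in memory.

```stata
* Assumes the survey data and the svyset declaration above are already in memory
* Same estimate as the "if MICS_5_hl4==2" command, with the subgroup kept as a subpopulation
svy, subpop(if MICS_5_hl4 == 2): proportion fully_vaccinated_crude, percent citype(wilson) over(MICS_5_zone)
* Report design effects for the estimates just computed
estat effects
```

The estat effects output is one way to obtain the DEFF values mentioned in the interpretation of Table 1.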

Syntax file

commented_20syntax.do

VI. Graphical presentation

Graph 1: Bar chart with error bars showing crude coverage and 95% CI of the Pentavalent 3 dose among children aged 12-23 months across 12 states of Nigeria

Weighted crude coverage and the 95% confidence interval for Pentavalent 3 were calculated for each state separately in Stata, and the results were exported to Excel. The bar chart was drawn in Excel, with the 95% CI shown as error bars. The syntax for this is given along with the table-filling syntax.
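
The chart itself was drawn in Excel. As an alternative, a similar bar chart with error bars could be sketched directly in Stata. The state labels and coverage values below are made up for illustration and are not survey results.

```stata
* Toy estimates: state labels and percentages are hypothetical
clear
input str12 state coverage lb ub
"State A" 20 15 26
"State B" 35 30 41
"State C" 48 42 54
end
* Numeric position of each bar on the x axis
gen order = _n

* Bars for the point estimates, capped spikes for the 95% CI
twoway (bar coverage order, barwidth(0.6)) ///
       (rcap lb ub order), ///
       xlabel(1 "State A" 2 "State B" 3 "State C") ///
       ytitle("Crude Pentavalent 3 coverage (%)") xtitle("") legend(off)
```

The /// line continuations work only when the commands are run from a do-file.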

VII. Method Summary

Stata 15.0 was used for the analysis and the chart was made in Excel. Children with missing data were treated as not immunized: they were excluded from the numerator but included in the denominator when calculating coverage. Indicators for each state and zone were calculated separately. As indicated in the table shells, weighted and unweighted indicators were calculated as appropriate. For crude coverage, evidence from any source was considered; for valid coverage, only doses given at an eligible age and according to the proper schedule were counted. For each indicator, coverage for boys and girls was calculated separately to assess gender equity. Wilson 95% confidence intervals were calculated to generalize the results to the target population. Dropout for multi-dose vaccines was calculated with the children who received the first dose as the denominator and those who did not receive the last dose after starting the series as the numerator. The dropout percentage was unweighted. Timeliness of vaccination was assessed against the recommended minimum age for administering each vaccine. Card retention was also calculated as the share of the population eligible for the survey estimated to have a home-based record (card) with one or more vaccination dates on it. The number of respondents (weighted and unweighted) contributing to each row is given in the last column.
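
The dropout calculation and the Wilson interval mentioned above can be written as follows. The symbols are introduced here for illustration: n_1 is the number of children who received the first dose, n_{1,L} the number of those who also received the last dose, p-hat the estimated proportion, n the sample size and z = 1.96 for a 95% interval. For survey data, n is commonly replaced by an effective sample size (n divided by DEFF); the exact adjustment used by the software in this analysis is not stated here.

```latex
\[
\text{Dropout (\%)} = \frac{n_{1} - n_{1,L}}{n_{1}} \times 100
\]
\[
\mathrm{CI}_{\text{Wilson}} =
\frac{\hat{p} + \dfrac{z^{2}}{2n}}{1 + \dfrac{z^{2}}{n}}
\;\pm\;
\frac{z}{1 + \dfrac{z^{2}}{n}}
\sqrt{\frac{\hat{p}\,(1 - \hat{p})}{n} + \frac{z^{2}}{4n^{2}}}
\]
```

With made-up numbers, if 80 children received the first dose and 60 of them also received the last dose, dropout is (80 - 60) / 80 x 100 = 25%.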

VIII. Result Summary

Table 1 shows crude full immunization coverage across the 12 states and 2 zones. There was a gap in full immunization coverage between the North East Zone [19.6% (14.8-25.5)] and the South South Zone [42.5% (39.1-45.9)]. Weighted vaccination coverage in the South South Zone states was higher than in the North East Zone states. Within the South South Zone, Bayelsa and Delta had full immunization coverage below the regional average. Yobe, Taraba and Bauchi lagged behind all other states. The graph showed that crude Pentavalent 3 coverage was lower in the North East Zone states than in the South South Zone states, revealing a regional disparity in Pentavalent 3 coverage in Nigeria.

IX. Concerns, Strengths & Limitations

Strengths: 1) Probability sampling was used. 2) MICS had high-quality household listing data. 3) Weighted indicators were calculated appropriately, so they represent the target population more accurately. 4) Wilson 95% confidence intervals were calculated for the point estimates. 5) DEFF or ICC can be calculated from the results and used for sample size calculation in future surveys.

Limitations: 1) Post-stratification was not done for aggregation; this may introduce a source of sampling variability. 2) Health facility records were not checked. 3) Inchworm plots were not drawn, although they are considered better than bar charts with error bars for presenting confidence intervals.