Assessment for Learning MOOC’s Updates

Educational Evaluation: Kirkpatrick Evaluation Model

The Kirkpatrick Evaluation Model was developed by Donald Kirkpatrick in the 1950s. The model assesses an educational product or program against four levels of criteria, as shown below:

Four levels in the Kirkpatrick Evaluation Model:

  Level 1 – Reaction: The degree to which the experience was valuable (satisfaction assessment)
  Level 2 – Learning: The degree to which participants acquired the intended knowledge, skills, and attitudes as a result of the training
  Level 3 – Behavior: The degree to which participants' behaviors change as a result of the training
  Level 4 – Results: The tangible results of the training

Below, each level is analyzed for strengths and weaknesses:

Level 1 – Reaction

Guidelines for Evaluating Reaction:

  1. Determine what you want to find out.
  2. Design a form that will quantify reactions.
  3. Encourage written comments and suggestions.
  4. Get 100 percent immediate response.
  5. Get honest responses.
  6. Develop acceptable standards.
  7. Measure reactions against standards and take appropriate action.
  8. Communicate reactions as appropriate.

(Kirkpatrick and Kirkpatrick, 2006)

Data collection method:

Based on point number 2 above, we can conclude that Kirkpatrick and Kirkpatrick referred to written surveys or questionnaires.

Strengths:

  1. The assessment flow is more organized because it uses written surveys.
  2. A 5-point Likert scale is easy to measure, especially if the data analyst employs technology such as Microsoft Excel or an online survey platform (see the sketch after this list).
  3. Point number 3, written comments and suggestions, enables stakeholders to collect perception-based insights.
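
A minimal sketch of how such a Likert tabulation might look, using Python in place of Excel, including the "measure reactions against standards" step from guidelines 6 and 7. The survey items, ratings, and the 4.0 benchmark below are hypothetical examples, not values from Kirkpatrick and Kirkpatrick:

```python
# Score 5-point Likert reactions and compare each item against an
# acceptable standard (guidelines 6-7). All data here is hypothetical.
from statistics import mean

ACCEPTABLE_STANDARD = 4.0  # hypothetical benchmark on a 1-5 scale

# Each list holds one respondent's rating (1-5) for a survey item.
responses = {
    "The material was relevant to my work": [5, 4, 4, 3, 5],
    "The instructor communicated clearly":  [4, 4, 5, 5, 4],
    "The pace of the course was right":     [3, 2, 4, 3, 3],
}

for item, ratings in responses.items():
    score = mean(ratings)
    verdict = "meets standard" if score >= ACCEPTABLE_STANDARD else "needs action"
    print(f"{item}: {score:.2f} ({verdict})")
```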

Weaknesses:

  1. To benchmark scores meaningfully over time, the institution should keep the same survey rubric, which is hard to maintain: advances in teaching and learning, especially emerging technology, can shift society's paradigm of ideal education and, with it, the assessment criteria.
  2. Points number 6 and 7 could be perceived as challenges because they depend on how appropriately the rubric criteria and rubric scoring are designed.

Level 2 – Learning

Guidelines for Evaluating Learning

  1. Use a control group if practical.
  2. Evaluate knowledge, skills, and/or attitudes both before and after the program.
  3. Use a paper-and-pencil test to measure knowledge and attitudes.
  4. Use a performance test to measure skills.
  5. Get a 100 percent response.
  6. Use the results of the evaluation to take appropriate action.

(Kirkpatrick and Kirkpatrick, 2006)

Data collection methods:

Based on points number 1 and 2 above, we can conclude that Kirkpatrick and Kirkpatrick referred to observation and to formative and summative evaluation, respectively.
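
As a rough illustration of guidelines 1 and 2 taken together, the Python sketch below compares pre- and post-test gains for a trained (experimental) group against a control group; the group names and scores are hypothetical examples:

```python
# Compare pre/post knowledge scores for a trained group against a
# control group (guidelines 1-2). All scores are hypothetical.
from statistics import mean

def average_gain(pre_scores, post_scores):
    """Mean per-participant improvement from pre-test to post-test."""
    return mean(post - pre for pre, post in zip(pre_scores, post_scores))

# Hypothetical test scores out of 100.
trained_pre, trained_post = [55, 60, 48, 70], [78, 82, 66, 88]
control_pre, control_post = [57, 62, 50, 69], [60, 63, 52, 71]

trained_gain = average_gain(trained_pre, trained_post)
control_gain = average_gain(control_pre, control_post)

# Subtracting the control gain isolates improvement attributable to the
# program rather than to simply retaking the test.
print(f"Trained gain: {trained_gain:.1f}, control gain: {control_gain:.1f}")
print(f"Gain attributable to training: {trained_gain - control_gain:.1f}")
```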

Strengths:

  1. Observation-based assessments are able to balance formative and summative assessments.
  2. Control-group and experimental-group treatment enables institutions to assess differences between teaching methods.

Weakness:

  1. Point number 4 is not applicable to short-term training programs, e.g., a 5-day training course, which commonly uses only a one-time formative assessment (usually known as a training needs analysis) and a one-time closed-question exam at the end of the training.

Level 3 – Behavior

Guidelines for Evaluating Behavior

  1. Use a control group if practical.
  2. Allow time for behavior change to take place.
  3. Evaluate both before and after the program if practical.
  4. Survey and/or interview one or more of the following: trainees, their immediate supervisor, their subordinates, and others who often observe their behavior.
  5. Get 100 percent response or a sampling.
  6. Repeat the evaluation at appropriate times.
  7. Consider cost versus benefits.

Data collection methods:

  1. Surveys and questionnaires
  2. Observation and checklists
  3. Work review
  4. Interviews and focus groups

(Kirkpatrick and Kirkpatrick, 2007)

Strength:

  1. The assessed subjects extend beyond the students: point number 4 involves more stakeholders, such as supervisors and subordinates. This multi-source feedback supports multi-perspective analysis (see the sketch after this list).
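
As a rough illustration, the Python sketch below aggregates behavior-checklist ratings by stakeholder group so each perspective can be compared side by side; the rater groups and ratings are hypothetical examples:

```python
# Aggregate behavior ratings from several stakeholder groups
# (guideline 4) so each perspective can be compared. Hypothetical data.
from statistics import mean

# Behavior-checklist ratings (1-5) of one trainee, grouped by rater role.
ratings_by_group = {
    "self":        [4, 4, 5],
    "supervisor":  [3, 4, 3],
    "subordinate": [4, 3, 4],
    "peer":        [3, 3, 4],
}

for group, ratings in ratings_by_group.items():
    print(f"{group:<12} mean rating: {mean(ratings):.2f}")

# A large gap between self-ratings and others' ratings flags a perception
# mismatch worth probing in follow-up interviews.
overall = mean(r for ratings in ratings_by_group.values() for r in ratings)
print(f"overall mean: {overall:.2f}")
```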

Weakness:

  1. Internal assessors might produce less objective assessments. The institution may look for external assessors in order to obtain fair and sound judgment.

Level 4 – Results

Guidelines for Evaluating Results

  1. Use a control group if practical.
  2. Allow time for results to be achieved.
  3. Measure both before and after the program if practical.
  4. Repeat the measurement at appropriate times.
  5. Consider cost versus benefits.
  6. Be satisfied with evidence if proof is not possible.

Data collection methods:

In Implementing the Four Levels, Kirkpatrick and Kirkpatrick state that Level 4 data can be obtained by two major methods: borrowing it from your internal partners, or gathering it yourself (p. 114).

The results here refer to performance indicators that should be set prior to the program. Setting them in advance shapes the data collection method; for instance, to measure the percentage of student attendance, institutions should calculate it consistently every day, perhaps with the help of technology (see the sketch below).
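
A minimal Python sketch of such a daily attendance-percentage indicator; the roster and daily attendance logs are hypothetical examples:

```python
# Compute a daily attendance-percentage indicator: the share of
# enrolled students present each day. All data here is hypothetical.
enrolled = {"ani", "budi", "citra", "dewi", "eko"}

daily_attendance = {
    "2023-01-09": {"ani", "budi", "citra", "dewi"},
    "2023-01-10": {"ani", "citra", "dewi", "eko"},
    "2023-01-11": {"ani", "budi", "citra", "dewi", "eko"},
}

for day, present in sorted(daily_attendance.items()):
    pct = 100 * len(present & enrolled) / len(enrolled)
    print(f"{day}: {pct:.0f}% attendance")
```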

Strength:

  1. Performance indicators are reliable monitoring tools because they express the situation numerically.

Weakness:

  1. A lack of capability in setting the right metrics and targets could lead to an unfocused orientation, unrealistic expectations, and wasted time and money in pursuing institutional goals.

Overall evaluation:

  1. Kirkpatrick's model is a good evaluation model because it leaves many areas open to modification. However, if an institution fails to adjust the measurement metrics, rubrics, and tools to its operational conditions, the model cannot be used optimally.
  2. Jennifer Greene states in the video lectures that to evaluate programs we should gather both internal and external stakeholders' perspectives, whereas Kirkpatrick's model is more focused on the progress of students; this reading is drawn from Levels 2, 3, and 4, in which students are the main subjects being assessed.
  3. Kirkpatrick's model is more a workflow of assessment than a standard if we consider the point Jennifer Greene emphasizes, "Evaluating the Evaluator": the model does not provide standardized rubrics or scores for this aspect.
  4. Kirkpatrick's model is adaptable to an agile working environment (e.g., a learning development program inside a software company) as long as follow-up action is taken immediately after each assessment, because the four levels are not strictly linear; they form an iterative concept that strongly supports change in fast-paced circumstances.

References:

Kirkpatrick, D. L. and Kirkpatrick, J. D. (2006), Evaluating Training Programs: The Four Levels, available at: https://www.worldcat.org/title/evaluating-training-programs-the-four-levels/oclc/318612381

Kirkpatrick, D. L. and Kirkpatrick, J. D. (2007), Implementing the Four Levels, available at: https://www.amazon.com/Implementing-Four-Levels-Practical-Evaluation/dp/1576754545