Assessment Practice Analysis

Project Overview

Project Description

Analyze an assessment practice. This could be a description of a practice in which you are or have been involved, or plans you have to implement an assessment practice, or a case study of an interesting assessment practice someone else has applied and that you would find beneficial to research and analyze. Use as many of the theory concepts defined by members of the group in their published Work 1 as you can, with references and links to the published works of the other course participants.

Reading Assessment Through Individual Comprehension Conversations

According to the 2013 National Assessment of Educational Progress (NAEP) results, only 35% of the nation's fourth graders scored at or above the proficient level in reading (NAEP, 2013). This lack of reading proficiency has ramifications for all subject areas and has the potential to put students on a path to failure in school and beyond.

While there are various assessments on the market that can provide an abundance of empirical data about students, these rarely, if ever, prove to be useful instructionally. For example, a widely used assessment for the purpose of instructional improvement is MAP (Measures of Academic Progress). This is a computer-adaptive assessment that is meant to be administered at multiple points throughout the year in order to help teachers monitor the progress of students and provide differentiated instruction based on each student's skill level. However, the US Department of Education's National Center for Education Evaluation and Regional Assistance (NCEE) recently conducted a two-year study on the impact of the MAP assessment on reading achievement for students in grades 4 and 5 and found: "MAP teachers were not more likely than control teachers to have applied differentiated instruction in their classes. Overall, the MAP program did not have a statistically significant impact on reading achievement in either grade 4 or grade 5" (Cordray et al., 2012, p. xii). So although the teachers were able to get a great deal of data on their students from the assessment, that data was not actually meaningful for instruction.

Two major challenges with such item-based reading assessments are that 1) they do not allow for additional probes to determine the thinking process behind why a student responded in a certain way, and 2) the readings students do for this style of assessment are most often a series of short passages followed by a few questions each (see sample passage lengths for the new PARCC and Smarter Balanced computer-based reading assessments below; in third grade, for example, the maximum length is 800 words, which is only a little over three pages of traditional print).

In actual classroom instruction, this is often not the case. A teacher will often ask follow-up probes to determine whether a student is on the right track, and by the time students are in third grade, most of the books they read are chapter books, where they must be able to hold onto knowledge for sustained periods of time (see the results from the top 10 books for third graders below, nearly all of which are chapter books, though sadly, none are nonfiction). Assessments should be a reflection of day-to-day instructional practices.

This is referred to as the ecological validity of an assessment. Afflerbach summarizes the need for ecological validity in assessment and overall instruction through the form of having to answer two basic questions:

  1. Does the student work on an assessment generalize to what is normally done in the classroom?
  2. Does student work in the classroom generalize to important tasks and accomplishments in the world outside the classroom? (Afflerbach, 2012, p. 22)

In essence, the focus is not only on having an assessment that reflects what is done in the classroom, but also making sure that what is done in the classroom helps provide students with skills that are applicable to life outside the classroom.

The pilot assessment described in this Work attempts to address the problems of standardized item-based assessment and ecological validity through the use of a formative reading assessment that is based on having individual comprehension conversations with students around a set of nonfiction readings that are not part of their regular curriculum.

PARCC Reading Passage Lengths
Smarter Balanced Passage Lengths
What Third Graders Are Reading (from Farr, 2013)

Description of the Assessment

Summary of Pilot Study for Individual Comprehension Conversations
Purpose: Create an assessment that will help teachers better support intermediate-grade students in their reading of informational texts. Assessors/teachers will be able to provide students with up to three prompts per question to probe for student understanding, using the following seven prompts:
  1. Tell me more. [TMM]
  2. Why is that important? [WI]
  3. What in the text makes you think that? [WIT]
  4. Why do you think that? [WTT]
  5. (Repeat the Question) [RQ]
  6. Can you provide an additional perspective or point of view? [APV]
  7. Can you clarify about whom (or what) you are talking? [CW]
Participants: Individual students in grades 3-6. The total sample size will eventually be 600 students, but the current total (as of September) is approximately 200 students. Students are selected from various public schools in the Chicagoland area who are reading at grade level, or one grade level above or below, as determined by classroom teacher judgment and reading test scores (e.g., MAP, Lexile, BAS, DRA). For the purposes of the pilot, the assessors will all be trained literacy specialists who are unfamiliar to the students.
Texts: 24 original, developmentally leveled nonfiction books of tradebook quality, created by well-respected authors in the field of children's publishing and written on various high-interest topics such as the life cycle of butterflies, Pluto not being a planet, and the life of Nelson Mandela; 18 secondary articles on related topics
Components:
  • Student and assessor/teacher preview a nonfiction anchor text together
  • Student reads anchor text in book form (800-1200 words)
  • Student fills out the first half of a graphic organizer as s/he reads in order to prepare for a comprehension conversation
  • Assessor/teacher and student have a one-on-one comprehension conversation about the text (usually 10-15 minutes per student)
  • Student reads a second nonfiction article on a related topic (400-600 words), fills out the second half of a graphic organizer, and answers a written short-answer response question, synthesizing information gained from both texts
Length of Study: Three months (September 1, 2014-November 25, 2014)
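The seven prompt codes above are used to annotate the sample transcripts later in this Work. As a convenience for anyone coding transcripts, they can be kept as a simple lookup; the sketch below is hypothetical, and only the abbreviations and prompt wording come from the list above.

```python
# Prompt-code lookup for annotating comprehension-conversation transcripts.
# The codes and wording are taken from the pilot's list of seven prompts.
PROMPTS = {
    "TMM": "Tell me more.",
    "WI":  "Why is that important?",
    "WIT": "What in the text makes you think that?",
    "WTT": "Why do you think that?",
    "RQ":  "(Repeat the question)",
    "APV": "Can you provide an additional perspective or point of view?",
    "CW":  "Can you clarify about whom (or what) you are talking?",
}

def expand(code):
    """Return the full prompt wording for a transcript code like '(TMM)'."""
    return PROMPTS[code.strip("() ")]

print(expand("(WTT)"))  # Why do you think that?
```

A lookup like this also makes it easy to count how often each probe was needed across a set of transcripts, which may be useful during the item analysis.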

Underlying Theoretical Background

Design Rationale

It is well documented that students are often capable of reading texts that are "beyond" their reading level if the topic is of interest to them. Teachers should therefore try to give students the opportunity to select readings that are of particular interest to them. Beyond that, however, it is important that students understand the purpose of reading not just for pleasure, but for gaining information for concrete purposes. Here, the student is reading with an assessor to learn how talking about texts helps everyone involved gain a better understanding of the text. Reading expert Roger Farr explains: "Reading with students as they read, making sure thoughtful discussions take place after reading, and providing opportunities for students to use what they have gained from reading can all foster both a feeling of success and learning the value of reading to meet goals" (Farr, 2013, p. 2). In this manner, reading with a student is not just about the specific information that is gained from a text, but about the affective aspect of having the student see him/herself as a successful reader.

The approach to reading used here is based on a developmental continuum of reading. The assessment is meant to provide teachers with a roadmap for differentiating instruction based on individual student needs. The goal is to take a holistic approach and find each student's "instructional reading level," where the student is challenged by the reading and assessment tasks, but not to the point of frustration. If a student provides logical responses to all given questions, this demonstrates that the testing level was too easy, and the student should be given a higher-level assessment (there are 12 total levels for grades 3-6, with two parallel forms per level, for a total of 24 anchor texts plus 18 secondary texts).

Having open-ended prompts provides teachers with a set of tools they can use in their everyday instruction to better evaluate whether, with additional scaffolding (i.e., the probes), a student can come up with a logical response to a text-based question using text evidence.

Although each item has a "target" response created by the research and development team, these are meant to serve as guidelines, not the only correct responses. This allows students to demonstrate understanding of the text from a perspective different from the original design, with that response being just as valid as the target. For the purposes of the pilot, the assessors are all part of the research team, but for future use, the assessors will be the students' own teachers, and the expectation is that teachers will hold periodic assessment reliability meetings to compare select student responses and determine whether or not they are acceptable. In this respect, the assessment will follow a limited recursive feedback system in which assessors and students have a dialogue about the items during the assessment, and the assessors later discuss the responses and come to a consensus about how they, as a school, will evaluate student responses. The feedback system is limited in the sense that students may ask the teachers/assessors questions during the comprehension conversation, but the teachers are not really allowed to respond to any text-based questions, other than perhaps turning the question back on the student and saying, "What do you think?" The assessor/teacher is not allowed to respond because doing so might lead students to a different way of thinking, and the assessment is meant to evaluate the student's thinking, not the thinking of the assessor/teacher.

The questions center on three types of thinking: literal, inferential, and critical. The literal questions target factual knowledge drawn from the text that enables the reader to understand the information being presented. For example, if the text is on the life cycle of butterflies, the student must be able to understand what each of the stages of the life cycle is. The inferential questions have to do with being able to pull information from a text that is not explicitly stated, such as inferring that painting a subject requires observation skills. The critical thinking questions require the student to think beyond the text to evaluate and critique the information presented or the way it is presented. As teachers hold the reliability meetings, they should evaluate each sample response not only as it relates to the target response, but also from the perspective of the type of question being asked. For example, if for a critical thinking question the student cites text evidence but provides no evaluation of the text, the response should be coded as something like "incorrect; limited response."
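The coding expectation described above can be sketched as a small decision rule. This is a hypothetical illustration, not the pilot's actual rubric: the function name, arguments, and labels other than "incorrect; limited response" are invented for the example.

```python
# Hypothetical sketch of the response-coding logic for the three question
# types; only the "incorrect; limited response" label comes from the text.

def code_response(question_type, cites_text_evidence, provides_evaluation=False):
    """Code a student response against the expectations for its question type."""
    if question_type == "critical":
        # A critical-thinking item requires evaluation or critique of the text,
        # not just a citation of text evidence.
        if not provides_evaluation:
            return "incorrect; limited response" if cites_text_evidence else "incorrect"
        return "correct"
    if question_type in ("literal", "inferential"):
        # Literal and inferential items are satisfied by grounded text evidence.
        return "correct" if cites_text_evidence else "incorrect"
    raise ValueError(f"unknown question type: {question_type}")

# The example from the text: a critical item where the student cites
# evidence but offers no evaluation.
print(code_response("critical", cites_text_evidence=True))  # incorrect; limited response
```

In practice, the consensus reached at the reliability meetings would refine rules like these, rather than the rules replacing assessor judgment.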

The texts are not part of the regular curriculum because each text should be new to the student, so that s/he will not necessarily have a lot of built-up prior knowledge about the topic and can see the text with fresh eyes. However, the point is not so much the information the student has learned from the text as it is seeing which strategies the student is able to apply to these unfamiliar texts.

Sample Data

Sample Text from Assessment

In this sample, third and fourth grade students were asked to preview and then read and talk about a book on the real-life story of a young girl in the 1600s (Maria Merian) who, through scientific observation, discovered the life cycles of butterflies. The title of the book is Butterflies Don't Grow from Mud: How One Curious Girl Revealed the Mysterious Lives of Insects. The tables below show a sampling of questions and actual student responses based on this book. The text in parentheses represents the additional prompts assessors gave the student (see the coding in the table above) and the student's subsequent responses.

Sample Questions and Student Responses on Previewing a Text
Question Sample Student Responses
3. Please take a few minutes to preview the text and think about what questions you have about the topic of this book. [Student A] One of my questions are they show a lot of plants and trees, so I'm wondering if that's important to butterflies. (TMM) They show other kinds of insects. I wonder why they show other different insects and not just butterflies.
5. Before reading this text, you said that you had questions about [refer to response from question #2]; did reading the text answer this question? Please explain. [Student A] The plants, I got an answer to that. They're important because silkworms, butterflies and things like that they mostly eat plants. She seen insects eating leaves and different other kinds of plants in her garden

In the example above, you can see in question #3 that the student is beginning to think about text structure, why the author would include certain pictures, and how these pictures might relate to the text. In question #5, the student is asked to revisit the questions from the initial previewing of the text, and we see that he did find answers to his questions. By modeling this concept of previewing a text before beginning to read it and then returning to those initial wonderings, the assessor/teacher is showing the value of taking time to orient oneself to a text before reading in order to set a purpose/goal for reading. By probing with "Tell me more," the student has to provide more details and think more about what he is about to read. In essence, this is meant to help students become more active readers and take a more metacognitive approach when reading.

Sample Inferential Thinking Question on Text Details
Question Sample Student Response
9. How does painting help Maria become a scientist? Target Response: When she painted she had to watch closely and look for details; this is how she learned so much and could write a book. [Student B] Painting helped her be a scientist because she studies them outside and she draws exactly what she sees. (WTT) because in the text it says in the garden she watched insects on the plants and painted exactly what she saw.

With the student's initial response, we can see that the student was on the right track, but with the additional prompt of "What in the text makes you think that?", we are able to see the student provide text evidence for his answer with the additional details of Maria having to pay close attention to the insects on the plants in their natural environment in order to paint them accurately.

Sample Critical Thinking Question
Question Sample Student Response
8. What was so different or unusual about Maria? Target Response: She painted live insects not models; she was a girl studying insects and in the past only boys did those kinds of things. [Student C] That she could paint and do research on butterflies and paint at the same time. And she worked hard to do that and multitask so she can get that done. (TMM) And... Maria is also a... was a good parent because she taught her daughters about butterflies and moths and she was still able to do research and paint at the same time.

In this sample, you can see that the student is on the right track, in that the student realized the importance of observation in being a good artist and scientist, but didn't quite get the uniqueness of Maria's occupation as a female entomologist at a time when there were few scientists, and even fewer female scientists. However, the "Tell Me More" prompt did illustrate that the student saw Maria as a positive role model for her children. Perhaps an additional prompt of "Why is that important?" might have revealed more about why this might have been particularly significant to Maria's daughters at the time, and might have recentered the discussion on the text and time period instead of starting to tread into the student's personal experience with different parenting styles.

Critical Reflection

Strengths

Building student/teacher relationship

To begin with, meeting with students individually to talk about a text sends a positive message that the teacher cares about each of them as readers. It honors each student as a unique reader with a unique and valuable perspective, and can help build rapport with students when administered at regular times throughout the year.

Additional probes

Although not all of the assessment is oral, the comprehension conversation does allow for select probes that try to tap into the student's zone of proximal development, in which the student is able to access and express a deeper level of understanding when given a certain degree of scaffolding.

More open-ended design for formative assessment

Since the comprehension questions are open-ended items, students may come up with logical responses that are not listed in the target response and still be counted as correct. Additionally, since the assessment is meant to be used at multiple points throughout the year, teachers can better monitor individual and collective student progress and use this information to make instructional decisions. For example, if a teacher notices that students are struggling with the item involving the concept of previewing a text, s/he may create a lesson on ways to preview, including looking through the title and chapter headings, looking at graphics, reading captions, and jotting down notes about questions. Then, the teacher may compare how students perform on this task after the lesson on previewing.
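The before-and-after comparison described above amounts to tracking the share of correct responses on an item across two administrations. A minimal sketch, with invented data (the response lists and variable names are hypothetical, not pilot results):

```python
# Illustrative progress-monitoring sketch: compare class performance on the
# same item (e.g., previewing a text) before and after a targeted lesson.
# The True/False response data below is invented for the example.

def proportion_correct(responses):
    """Share of responses coded correct (responses are True/False)."""
    return sum(responses) / len(responses)

before = [False, False, True, False, True, False]  # item results in September
after  = [True, True, True, False, True, True]     # same item, after the lesson

gain = proportion_correct(after) - proportion_correct(before)
print(f"before: {proportion_correct(before):.0%}, "
      f"after: {proportion_correct(after):.0%}, gain: {gain:+.0%}")
```

With open-ended items, of course, "correct" here stands in for whatever consensus coding the assessors settle on at their reliability meetings.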

Models good instructional practice

In a regular classroom setting, as Farr points out, it is important for teachers to take the time to read with students and discuss the text with them. Moreover, the assessment is meant to use forms of questions that help students think more metacognitively about reading and use text evidence to support their answers, such as the items that ask about the author's purpose in writing a text and the prompt "What in the text makes you think that?"

Weaknesses

  • Time consuming, especially if there are many students in the class. (On average, the assessment takes about one hour to administer, even though the comprehension conversation itself typically takes only 15 minutes. However, once we've collected samples from around 600 students, we will be able to complete the item analysis study to see which items were a misfit for the assessment and shorten it overall.) Additionally, it takes time for the assessor/teacher to read all the texts and become familiar enough with each one to carry on a conversation with students about it.
  • Some students may be unfamiliar with the format of having a comprehension conversation and may feel self-conscious having to talk one-on-one with an assessor/teacher. In fact, out of roughly 200 students, there have been at least two cases where students were moved to near tears when they didn't understand what was being asked of them (at that point, assessors stopped administering the assessment).
  • Since we do not have authorization to take audio or video recordings in most school settings, it has been difficult for assessors to transcribe the students' responses verbatim in real time, so assessors may miss part of a student's response. This typing also adds to the overall length of the assessment. On top of that, students will often begin talking off topic, which can be distracting for assessors to record and, again, time consuming.
  • As most items are constructed-response questions, the assessment is not easy to score, answers can vary widely, and reliability can be difficult to establish. So far, we have just begun the scoring process with an internal team of six scorers (assessors in the field did not give numeric scores).

Conclusion

Overall, we want students to be able to apply the knowledge they gain through reading to their lives outside the assessment setting and to see value in reading. The point of this assessment is to try to pinpoint areas where students are struggling as readers and give teachers a roadmap of where they might provide additional support based on individual student needs through differentiated instruction. The assessment may be long and a bit arduous for both the assessor and the students, but we are still working to streamline it and to create professional development to help better prepare teachers. Moreover, the data we are getting (or at least the glimpses we've seen so far) is so valuable precisely because it is difficult to get; sometimes you need that one-on-one talking time with students to get to know their perspectives better, probe deeper, and challenge their way of thinking so that they base their responses on what they've gained from reading instead of what they've learned from personal experience. We've still got a long road ahead of us with this pilot project, but we are confident that what we've learned from the successes and pitfalls of this assessment will help others teach reading in a way that allows students to apply their reading skills across subject areas as well as to tasks in the world outside the classroom.

References

Afflerbach, P. (2012). Understanding and using reading assessment, K-12 (2nd ed.). Newark, DE: International Reading Association.

Cordray, D., Pion, G., Brandt, C., Molefe, A., & Toby, M. (2012). The impact of the Measures of Academic Progress (MAP) program on student reading achievement (NCEE 2013-4000). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.

Farr, R. (2013). What Kids Are Reading: The Book-Reading Habits of Students in American Schools.

PARCC Passage Selection Guidelines for Assessing CCSS ELA http://parcconline.org/sites/parcc/files/Combined%20Passage%20Selection%20Guidelines%20and%20Worksheets_0.pdf

Smarter Balanced ELA Stimulus Specifications, http://www.smarterbalanced.org/wordpress/wp-content/uploads/2012/05/TaskItemSpecifications/EnglishLanguageArtsLiteracy/ELAStimulusSpecifications.pdf