Spotlight MacArthur Foundation

Measuring Classroom Progress: 21st Century Assessment Project Wants Your Input

Filed by Daniel Hickey and Brian Nelson at 9:59 am on February 8, 2010 in Assessment, Policy, Schools (23 comments)

Guest authors Daniel Hickey and Brian Nelson argue that the time to institute true reform in assessment practices is now, and that the Race to the Top Assessment Initiative should think more broadly about how we measure progress in the classroom. They welcome comments on findings from the MacArthur 21st Century Assessment Project.

Photo by Extra Ketchup

Education Secretary Arne Duncan has set aside up to $350 million of Race to the Top funds that could support states in developing a next generation of assessments of student learning.

The competitive grant program is called Race to the Top-Assessment (RTT-Assessment). Members of the MacArthur Foundation’s 21st Century Assessment Project are scrutinizing this initiative. Our investigation reflects the project’s continuing analysis of assessment practices that reveal the reasoning, communication and learning needed for economic, social, creative and civic success in a networked 21st century world. We encourage feedback on our findings.

Grant Evaluation Criteria May Lock in Old Problems

Members of the MacArthur Foundation’s 21st Century Assessment Project have reviewed expert testimony and recent reports from the National Academy of Sciences, and interviewed experts and education program officers at major foundations. This investigation has revealed both obstacles to and opportunities for developing the assessments needed for real reform of educational policies and practices.

Like many others, we are concerned that the evaluation criteria for broader state-level RTT proposals may well lock in some of the problematic testing practices of No Child Left Behind. There is a danger that many states will continue to rely on the same narrow tests of basic math and reading skills that have thus far failed to lead to instruction that enhances deep conceptual understanding and innovative problem-solving.

In addition, the politics of mid-term elections may cause RTT resources to be distributed evenly to all states, with little regard for proposal quality or innovation. Stimulus funding limits RTT efforts to two years, but it reportedly takes at least three years to incorporate new assessment items into the relatively straightforward tests required by NCLB.   

Potential Opportunities in RTT Initiatives Are Also Evident

On the other hand, five recent developments point to the tremendous opportunities that could arise with the RTT and RTT-A initiatives. 

1. Good Examples Abound

The first is the progress that other countries and leading technology firms have made in overhauling their educational testing systems. Many other English-speaking countries (especially Canada, Singapore and Australia) have been working on more balanced accountability systems that use multiple indicators and include mechanisms for schools to tell the government how students are doing.

Likewise, the Cisco Networking Academy (which must certify thousands of technicians and administrators every year) has used Mislevy’s Evidence Centered Design to leverage the assessment efforts of numerous independent vocational centers around the world. 

These integrated systems use evidence of actual learner performance on authentic challenging tasks. The problems used in these assessments represent the broader curriculum and reflect content-driven understanding of the way that learning actually progresses. These accountability systems are organized around partnerships involving teachers and technology in ways that support real continuous improvement, instead of debatable bumps on tests of increasingly trivial skills.

2. Balanced and Integrated Approach Has Support

The second opportunity is the emergence of a large consortium of states endorsing the principles behind a balanced and integrated approach for their RTT-Assessment efforts.

These principles emerged from efforts associated with the Council of Chief State School Officers and call for an integrated system of assessments based on student performances on meaningful tasks, designed to continuously improve the quality of learning and teaching. They advocate for involving teachers in creating and scoring assessments and using technology to support innovative assessments. 

3. Bottom-Up Approach Promises Innovation

The third opportunity is the proposed $1.4 billion RTT competition in 2011 for innovative reform proposals initiated by school systems themselves. Announced Jan. 19, this “bottom up” approach promises far more innovation than the state-level competition.

We are particularly enthusiastic about the possibility of dynamic new partnerships between districts and assessment researchers working at universities and nonprofit organizations. 

4. Alliances Are Forming among Technology Leaders for Integrated Approaches to Accountability

The fourth opportunity centers on the emerging alliances among the world’s leading (and otherwise competing) technology firms to support the networking infrastructure and partnerships necessary for truly integrated approaches to accountability. 

One such alliance is between Intel, Microsoft and Cisco, and it has a strong international focus. Others appear likely. 

5. Philanthropies Are Banding Together

A fifth opportunity concerns new alliances across some of the nation’s most influential philanthropies to support more balanced and integrated assessment approaches at all levels. 


Digital Technologies Should Be Used to Assess 21st Century Skills

We believe it is crucial that RTT funds, and other funds for innovation, support new thinking about assessment. Digital technologies hold out the promise of assessments centered on problem-solving, innovation and critical thinking. They can be used to gather meaningful data that tell us about the development of proficiencies that will be useful now and in the future.

We cannot let punitive criteria on narrowly focused standardized tests stop us, as a country, from leading the way into a new age of assessment. At the least, RTT must support alternative forms of assessment and compare and contrast them with our current forms of large-scale assessments.  RTT can become an incubator for new forms of 21st century learning and assessment for all learners.

The members of the 21st Century Assessment Project encourage concerted efforts to both overcome the problems described here and take full advantage of the opportunities for real reform so that Race to the Top can achieve its stated goals. If not, the existing achievement gap between the United States and other countries may well remain, while dangerous new gaps emerge around “21st century” proficiencies. 

Depending on the nature and scope of reform efforts supported by RTT and RTT-A and on the progress made in other countries, the United States could become either a “nation at the top” or a “nation left behind” as other countries widen the “assessment gap” that certain aspects of NCLB have left us with.

We encourage comments and input on this statement, and ideas for moving forward. Add your thoughts and ideas below.

Daniel Hickey is an associate professor in the School of Education at Indiana University at Bloomington. Brian Nelson is an assistant professor of educational technology in the Mary Lou Fulton College of Education at Arizona State University.

The MacArthur Foundation’s 21st Century Assessment Project is investigating assessment practices that reveal the reasoning, communication and learning needed for economic, social, creative and civic success in a networked 21st century world. This investigation of RTT-Assessment also reflects the broader goals of the MacArthur Foundation’s $80 million Digital Media and Learning (DML) initiative.

Tags: 21st century assessment project, arne duncan, brian nelson, daniel hickey, no child left behind, race to the top, race to the top-assessment

Comments (23)

1: Mechelle De Craene from Florida at 8:38 am on Wednesday, February 10, 2010

I am interested to learn more about the “bottom up approach” for assessments. Teachers make assessments every day. I’ve used various technologies with my students with special needs as well as my gifted students to measure various skills and insights. I’m curious when they say “districts and academia” does it include teachers? Or is it the folks at the district level? I’ve worked on both sides and as a liaison between academia and teachers while in grad school. Please elaborate. Thanks!!

2: Cathy Davidson from John Hope Franklin Center, Duke University at 10:11 am on Wednesday, February 10, 2010

Thank you very much for your excellent work in this area.  In case it might be useful, I’m reposting here a list of “Twenty-First Century Literacies” (the term is Howard Rheingold’s) that I pass on to my students.  We do not have ways, in NCLB and in most metrics, of assessing these: 

“21st Century Literacies” compiled by Cathy N. Davidson

Media theorist and practitioner Howard Rheingold has talked about four “Twenty-first Century Literacies” (attention, participation, collaboration, and network awareness) that must be addressed, understood and cultivated in the digital age (see http://www.sfgate.com/cgi-bin/blogs/rheingold/category?blogid=108&cat=2538). Futurist Alvin Toffler argues that, in the 21st century, we need to know not only the three R’s, but also how to learn, unlearn, and relearn.  Expanding on these, here are ten “literacies” that seem crucial for our discussion of “This Is Your Brain on the Internet.”

•  Attention:  What are the new ways that we pay attention in a digital era?  How do we need to change our concepts and practices of attention for a new era?  How do we learn and practice new forms of attention in a digital age?
•  Participation:  Only a small percentage of those who use new “participatory” media really contribute.  How do we encourage meaningful interaction and participation?  What is its purpose on a cultural, social, or civic level?
•  Collaboration:  How do we encourage meaningful and innovative forms of collaboration?  Studies show that collaboration can simply reconfirm consensus, acting more as peer pressure than a lever to truly original thinking.  HASTAC has cultivated the methodology of “collaboration by difference” to address the most meaningful and effective way that disparate groups can contribute.
•  Network awareness:  What can we do to understand how we both thrive as creative individuals and understand our contribution within a network of others?  How do you gain a sense of what that extended network is and what it can do?
•  Design:  How is information conveyed differently in diverse digital forms?  How do we understand and practice the elements of good design as part of our communication and interactive practices?
•  Narrative, Storytelling:  How do narrative elements shape the information we wish to convey, helping it to have force in a world of competing information?
•  Critical consumption of information:  Without a filter (such as editors, experts, and professionals), much information on the Internet can be inaccurate, deceptive, or inadequate.  Old media, of course, share these faults that are exacerbated by digital dissemination.  How do we learn to be critical?  What are the standards of credibility?
•  Digital Divides, Digital Participation:  What divisions still remain in digital culture?  Who is included and who is excluded and how do basic aspects of economics, culture, and literacy levels dictate not only who participates in the digital age but how we participate?
•  Ethics and Advocacy:  What responsibilities and possibilities exist to move from participation, interchange, collaboration, and communication to actually working towards the greater good of society by digital means in an ethical and responsible manner?
•  Learning, Unlearning, and Relearning:  Alvin Toffler has said that, in the rapidly changing world of the twenty-first century, the most important skill anyone can have is the ability to stop in one’s tracks, see what isn’t working, and then find ways to unlearn old patterns and relearn how to learn.  This requires all of the other skills in this program but is perhaps the most important single skill we will teach.  It means that, whenever one thinks nostalgically, wondering if the “good old days” will ever return, one’s “unlearning” reflex kicks in to force us to think about what we really mean with such a comparison, what good it does us, and what good it does to reverse it.  What can the “good new days” bring?  Even as a thought experiment (a gedanken experiment), trying to unlearn one’s reflexive responses to a changing situation is the only way to become reflective about one’s habits of resistance.

3: bob bradley from nashville at 2:36 pm on Wednesday, February 10, 2010

we are also working on this opportunity with the DMSC Governors Challenge, found at the site listed. We are exploring an open-source, curatorial crowdsourcing form of continuous assessment. We see a way to ultimately standardize personal development for life-long learning. Please join us!! Find our MacArthur grant at Curatorial Crowdsourcing on the archive. All best to all….bob bradley 615.579.7446

4: Susan Paley from Stamford Public Schools, Stamford, CT at 3:06 pm on Wednesday, February 10, 2010

Yes, I agree that computer generated assessments, and particularly those that seek to measure high level skills and creative thinking, are necessary and the way to go.  I think that perhaps the real challenge will be making the connection between the results of the assessments and improving student learning. I’ve seen too many teachers struggle to utilize the data they have now to improve students’ learning. Frequently they don’t have the time during the school day, the educational resources, or the knowledge of how to reteach the information.  Without helping teachers to utilize the data, change won’t happen.

5: Katie at 3:50 pm on Wednesday, February 10, 2010

One of the consequences of RTT is that it is requiring states that want funding to adopt a merit-pay system for teachers.  While I’m not absolutely opposed to merit pay, merit, of course, has to be measured.  And so far, it looks like it would be how well your students performed on the ACT, or whatever high-stakes test is the flavor of your region.

So what I’m saying is this narrow vision of assessments will affect not only student achievement but what teachers value and teach as well.

6: Richard Thall from Stow, Massachusetts, USA at 9:35 pm on Wednesday, February 10, 2010

Using existing Internet technology, every test a student takes can be a standardized test.  Students can build their standardized assessment portfolios little-by-little every time they take a quiz or examination.  Teachers can construct standardized tests customized for each class or each student.  Bob Bradley seems to be on the right track.

7: Mechelle De Craene from Florida at 9:47 am on Thursday, February 11, 2010

It would be great if assessment development were open-source. In effect, teachers already do this every day. We like to share. However, while I was a grad student, and even as an undergrad in pre-med psychobiology, I noticed that a lot of the psych profs marketed their assessments. As a teacher, I see that some assessments are correlated with textbook companies, etc. This can cost schools lots of money to buy the assessments and assessment preps.

Also, I believe that linking teacher pay with high stakes assessments is not a good idea. I could write a lot about that. However, it is a point that has been debated and argued over way too much…and teachers really have little say in the matter.

In reading about Rheingold’s 21C Literacies (thank you Cathy for posting them) and Toffler, I am reminded of Cattell’s Fluid Intelligence. I like that the 21C Literacies are focused more on the Fluid, because that’s what we really need. Unfortunately, most schools, from Pre-K through college, are focused on Crystallized Intelligence. When I teach Pre-K we already have required assessments on Crystallized Intelligence we must take data on, and my students at the college in the night class I teach ask me what is on the test.

Many folks are aware that the arts which promote Fluid Intelligence are being cut. Also, what I see that has not been mentioned in school reform efforts is that many of the Voc programs have been cut. Many voc programs did encourage Fluid intelligence. Eighty-five percent of students do not attend college. It would be great to have a skill.

Also, many of the labs are being cut in schools. I’ve taught self-contained g & t science and we were able to do labs. However, when teaching inclusion science this is very difficult. Due to the inclusion movement, many gen ed teachers have as much as half of their class. Further, many special ed teachers may not be STEM savvy. Hence, labs are cut. It is sad, because chem labs are a great place to glean fluid intelligence.

I guess my point is that I’m really excited to see that folks are considering assessing fluid intelligence. It would be nice if schools provided more opportunities in the classrooms as well.

8: Jeff Kupperman at 9:47 am on Thursday, February 11, 2010

From Cathy’s comments: “Alvin Toffler has said that, in the rapidly changing world of the twenty-first century, the most important skill anyone can have is the ability to stop in one’s tracks, see what isn’t working, and then find ways to unlearn old patterns and relearn how to learn.”

If one takes this idea seriously (and I agree very much that we should), it implies not just “better assessments,” but an essentially different way of thinking about assessments, and of the role of educators in general.  A teacher can’t ONLY be a skilled practitioner; one must be first and foremost a scholar of learning and a designer of learning experiences. In our schools and teacher education programs, that is far too seldom expected, and even more seldom supported or rewarded.

9: James Paul Gee from Sedona, Arizona at 12:42 pm on Thursday, February 11, 2010

Key aspects of NCLB are well intentioned, but some of its implementation mechanisms have driven change in the wrong direction.  These mechanisms have driven change away from what we need in a global world, which are: greater teacher professionalization, problem solving rather than fact-centered learning, and the creation of learners who can innovate and who can produce and not just consume with digital media.  These are all things we need if we are to compete as a nation in the 21st century global world. 

True transformation in assessment is what can drive these changes and undo the unintended consequences of NCLB.  We do know, at least in a general way, how to engage in better assessments and how to assess 21st Century skills.  MacArthur’s 21st Century Learning and Assessment project, as well as many other projects across the world, have developed key new ideas here.  We know that the future of assessment is in integrating learning and assessment more closely, using digital media to collect and represent copious information on growth across time, and introducing new paradigms for learning that teach learners to use facts and formulas as tools to solve problems and not just as fodder for tests. 

The problem is that, like all new inventions, many of these ideas are in need of testing for and eventually at scale and are not yet economically practical at scale (but can be and will be if we support them).  The way forward in my view is for the nation to do what any smart high tech company would do: start a “skunk works” national laboratory/incubator that would trial new learning/assessment ideas and incubate (as businesses) and widely disseminate the best of them nationally and globally.  The national laboratory/incubator would use a selected group of innovation friendly schools, for example some state charter school associations and other schools genuinely open to innovations.  These would be schools not co-opted by fears of standardized test scores or schools purposely allowed to pit new assessments against current tests by taking the risk not to teach to the current tests.  This would prepare the U.S. to flourish in future global education and 24/7 learning markets.

This national “skunk works” laboratory/incubator would also give us something we have long needed: a single integrated process that combines research, design, implementation, evaluation, incubation, and dissemination.  This will ensure that funding agencies finally see projects they fund flourish in a sustainable way and not die the day the government or foundation funding dies.  We do have good promising ideas.  Do we have the will to make them work?  Or is innovation going to be another victim of political polarization in America?

10: Mechelle De Craene from Florida at 3:35 pm on Thursday, February 11, 2010

Jeff, you make an excellent point regarding, “In our schools and teacher education programs, that is far too seldom expected, and even more seldom supported or rewarded.” Like creative students, innovative teachers are sometimes actually punished in schools where they just want to fly under the radar.

Also, Dr. Gee, you make an excellent point about Skunk Works. Perhaps we need to have another Woods Hole Conference too.

I think folks are forgetting that with the inclusion movement we are having such heterogeneous classrooms that teachers are not only having to do more assessing, they are learning how to manage their classrooms in different ways. There are so many students with medical needs and we don’t often have the hands to teach the way we’d like. It is very challenging. I am both a special ed and gifted ed teacher. I think the thing people forget is that we are “racing to the top” with countries that are not serving students with severe disabilities in their classrooms. Therefore, the classroom dynamics are different from those American teachers face.

Please note, I am for inclusion ideologically. However, in reality often teachers don’t have the support. Hence, it is often challenging to teach the way we know and would love to teach.

P.S. Please forgive previous grammar mistakes. I have been in multitask mode. On that note as well, I’m thinking of Rheingold’s attention as a 21C Literacy…I think that teachers’ attention is so scattered nowadays. We have so much to juggle, and it is getting harder and harder to filter. In some regards, technology does help to filter it though because it is easier to put it into patterns. Just curious, is there any research on 21C teachers as related to these 21C Literacies?

11: R. S. Webster from Worthington, OH at 5:37 pm on Thursday, February 11, 2010

CHANGE behavior by “learning leaders” in the classroom (aka “teachers, instructors, faculty members”) and “apprentice leaders,” learners (aka “students”) soon to be knowledge workers in our global networked information society.
HOW? (1) Learning leaders make specific their chosen “teaching methods,” i.e., their “learning process elements–LPEs,” the “how” they use to present their course content “what.”
(2) Learning leaders task their learners to use these LPEs as instructed for doing their learning work. Learners “keep score” of time on task and any other learning process metrics their learning leaders direct. Use of LPEs, and their mastery, (more and more IT-connected) are skills that transfer from learning work in school and higher education to knowledge work on the job.
  Students, being smart, will soon see the useful connection between using LPEs for content mastery and mastering the learning skills that employers want their new hires to possess.
  Employers will soon see the benefit of hiring students with proven skills at using LPEs.
  Employers, as tax payers, will “encourage” education to focus on both learning (course) content and learning process.
WHO SAYS SO? The logic and process of LPEs are based on work done by Dr. William D. Hitt in the Center for Improved Education at the Battelle Memorial Institute in Columbus, OH, ca. 1970-1975. Dr. Hitt added the powerful tool of self-assessment for self-renewal, described in his book “Education As A Human Enterprise” (1973). With modern computer equipment, the self-assessment tools Dr. Hitt presents can be used in any classroom (physical or on-line) where learning leaders believe that learners’ self-generated data about their learning process is likely to motivate increased learning behavior and improved learning results. Dr. Hitt’s “Self-renewing Educational System” (SES) has been updated as the “Self-renewing Learning and Leadership Improvement Process™” (SLIP™), implemented using the “Hitt/Likert Assessment Process™.”
WHAT NEXT? These tools for improving performance, results, and readiness for knowledge work in the information society await only resources to develop the necessary computer-based self-assessment and report delivery capability. Inquiries (and RFPs) are welcome.

12: bob bradley from nashville at 9:30 pm on Thursday, February 11, 2010

simply put, standardizing personal development is where it’s at. the vaunted personal learning environment is a pipedream without well-designed technology enhancements. snapshots of personal and comparative growth should be standard. dashboards could easily entertain personalized benchmarks and be upgraded daily. keystrokes count. so this thread is very intriguing and packs a wallop so far.

13: Daniel Bassill from Chicago, Il at 10:16 am on Friday, February 12, 2010

The addition of new technologies makes assessment opportunities much broader than ever before. The recommendations posted in the article and some of the comments seem very good to me.

However, as the evaluations of what kids learn are developed, I encourage the assessment model to use visual technologies to help understand the context of these results, and to be better able to mobilize community resources to help underperforming schools improve from year to year.

One way to do this is to overlay the assessment data that is collected on Geographic Information Systems maps, along with poverty demographics, to show where learning is or is not happening to the same degree.  Such maps could be available on public web sites, enabling a larger number of stakeholders to become more engaged in strategies that lead to community well-being and economic vitality.

Such assessment tools also might have layers of information to show the availability of non-school and school based learning and mentoring opportunities, where business and professional people connect with youth in community based organizations, business sites, or at community schools, to mentor arts, technology, communications, teamwork and problem solving.

A school could be evaluated on how well it makes a wide range of learning supports available for kids who live in lower income areas where such resources don’t occur naturally in the community, as well as how well this combination of school and non-school resources help kids learn and prepare for 21st century adult responsibilities.

An example of using GIS mapping to understand the distribution of non-school tutor/mentor programs in Chicago can be seen at http://www.tutormentorprogramlocator.net

Understanding results based on barriers to learning, such as high poverty, can lead to better use of the analysis to overcome those barriers and assure that all kids are succeeding in school and in life.

14: Jerrie Bascome McGill from Dayton, Ohio at 9:38 pm on Friday, February 12, 2010

These ideas and this information are exciting.  As these activities move forward, it’s critical to include poor and minority children, schools and communities.  The gaps between wealthy and poor communities that currently exist must be taken into account so that all children and young people will benefit from the ideas herein contained.

15: Mechelle De Craene from Florida at 2:27 pm on Saturday, February 13, 2010

I think Daniel’s idea of maps and assessments is intriguing. I think it would be interesting to see how many schools partner with resources in their communities.

Also, in reading his comment about poverty I wanted to share what I am seeing as a teacher that is not mentioned in school reform efforts. My experiences teaching have been at Title 1 schools and there is an issue with families/guardians abusing the disability system. Parents can receive between $400-$800 a month and there is no accountability.  This is working against education because if the student is failing then parents/guardians receive money…and teachers are often blamed.

16: Daniel Hickey from Bloomington, IN at 5:34 pm on Sunday, February 14, 2010

Well, this is quite something. We seem to have generated some nice conversation here, so thanks to everybody for reading and responding.  The comments and hundreds of hits alone make the effort worth our while.  Reading these great comments brings to mind lots of things we learned in our inquiry that did not end up in our post. Plus, since we completed our inquiry I have had time to look at the Race to the Top proposal that my own state of Indiana has prepared.  Let me just read back over all of them and make some responses.

To #1 Mechelle’s first comment, teacher-developed assessments are a huge issue that is not being addressed.  In many of the other countries that have not been hamstrung by NCLB, teachers are much more involved in the assessments that are used in accountability measures.  Mostly teachers are involved in scoring, but in some cases designing and refining.  When worthwhile assessments are involved, this can be tremendously helpful.  Of course, this raises a lot of issues about common curriculum as well.  I have some mixed emotions about this issue.  Personally, I think that we need to give a lot more consideration to helping teachers (both as individuals and in larger communities) learn to align their classroom assessments (both on-demand performance assessments as well as scoring of artifacts) with the accountability systems.  The standards-oriented measures used in accountability systems should directly and clearly let teachers know whether grades on curriculum-oriented classroom assessments are valid estimates of learning.  Put differently, accountability systems should reveal an “echo” where teachers can directly see that students who do well on their classroom assessments also do well on the accountability measures.  But I am afraid this won’t happen.  NCLB has already overstretched the infrastructure for designing, administering, and using standards.  I am worried that the combination of value-added and pay for performance will completely overwhelm the system.

As for #2 Cathy’s post, Howard’s is yet another good list to work from.  This is also an area where I have some mixed emotions.  Given that the constraints on RTT and RTTA are going to lead to a pretty narrow focus on mathematics and reading, I am pretty sure that these practices won’t end up in the mix.  But Cathy’s note reminds me that I am overdue with a post to my own blog on this issue.  I have one up already at Remediating Assessment (http://bit.ly/58gVxl) about the validity issues that emerge when we frame these things as “skills.” I worry that all of the debates about which of them (literacies, skills, proficiencies, etc.) is most important are obscuring the more fundamental issues that emerge when we take these essentially social practices and treat them as properties of individuals that can be measured in a meaningful way.  My next post to RMA should be a review of the various efforts underway by smart folks around the world.  So far I have not had time to look over the individual RTT proposals from the various states, but from what I have seen it looks like there will be very little of this sort of knowledge in RTT. I am hoping that some will emerge from the consortia forming for the RTTA competition, but it is pretty hard for me to imagine how assessment of any such practices can co-exist with pay for performance.  I can see how you could have pay-for-performance on the very distal measures of basic skills and then make sure that measures of these more proximal constructs are also ensuring that basic skills are improving, but that seems well beyond the testing and assessment infrastructure already.  If anybody knows of a link to all of the 41 RttT proposals that are being made public, I would love to get it.  And if somebody did an analysis of those proposals from this perspective I would love to see and comment on it.

More generally, there is so much great new stuff coming out to read in this regard that I feel like I just keep getting further behind every time I start to write.  Case in point is Cathy’s new report The Future of Thinking (http://bit.ly/4xeJwV).  As a cognitive psychologist, just the title alone intrigues the heck out of me.

To #3, Bob Bradley, I checked out your competition and it looks quite interesting (I also like the TN connection as I studied at Vandy).  I think that others should take note of the way you appear to be merging the stability of a university course with a broader social media competition.  I have been trying to figure out how to get something like that started around my classroom assessment course, but so far to no avail.  But there are some smart folks in the ISSOTL world making some headway, and the incoming president Randy Bass looks to have some pretty good examples.  I did not write about it in this post, but in my other project we are really working hard to use innovations in learning assessment to advance open sourcing learning (and importantly, vice versa).  As part of our inquiry I talked with officers at the foundations that are supporting the open-education resources (OER) movement about assessment, and launched a pretty comprehensive review of what has been done so far.  It turns out that “assessment” in OER mostly refers to assessing the content of OER courses, and “testing” is mostly about pilot testing those courses.  I think Henry Jenkins’s ideas about spreadable media practices have a lot to offer here (we wrote about such “spreadable educational practices” at Project NML at http://bit.ly/nmlSEP1 and http://bit.ly/nmlSEP3).  (And Bob, I also share your grief over Vic Chesnutt’s passing.  He was one of several Athens artists I regret not seeing live during my five years at UGA.  My copy of Ghetto Bells is damaged and I really miss it!)

To #4 Susan Paley, the issue of technology looms VERY large in this set of issues.  The concern is that the lead in technology right now is really going to growth modeling across K-12 with basic math and content-free reading tests.  The testing companies have a massive pool of existing items from ten years of NCLB, and it is pretty straightforward to use technology to track gains on the existing tests.  Certainly Sanders’ work in TN was convincing.  He showed that when you find the teachers associated with the bottom third of gains, and then find students who had three of those teachers in a row, those students have terrible levels of growth that put them seriously at risk.  But that does not automatically mean that using the data from those systems is going to improve schools.  Quite to the contrary, as detailed in Richard Rothstein’s excellent book Grading Education, there are lots of reasons and ways for that data to be used to undermine the very scores they aim to increase.  Each of the growth models has a student id number and a teacher id number.  We are going to end up with 21st century testing technology that is stuck with early 20th century conceptions of learning and teaching.  What we need instead are shorter learning progressions with more worthwhile assessments based on current understanding of the way knowledge develops in particular domains.  This was one of the biggest concerns raised by many of the experts I spoke with, and one that I am thrilled that the foundations appear interested in addressing by hooking up the tech folks with the progressive assessment folks.  Keep your fingers crossed.

More generally, anybody who is interested in assessment at RTT should read the new report from the National Academy of Sciences panel chaired by Henry Braun (http://bit.ly/NAPva) as well as the letter from the National Academy Board on Testing and Assessment led by Ed Haertel (http://bit.ly/BOTAva).  (Braun and Haertel were RttT invited experts.)  Both raise pretty serious concerns about things like what my state is proposing (http://bit.ly/InGrowth). As Jim Popham reminds us, schools had already done most of what they could do with basic achievement data by 2000, a decade after newspapers began regularly publishing them.  Teachers in my state have never been able to do much with our CTB-made ISTEP test; adding in pay for performance is not going to help.  As Lorrie Shepard (who was on the National Academy Panel) said, “you just can’t keep shocking the chicken!”  You just can’t keep pointing at the schools and the teachers and telling them they need to somehow raise scores.  What I find worrisome is that this is being touted as a “progressive” alternative to the status quo of NCLB.  Even more worrisome is that there will be no room left for more innovative computer-based assessment like what Edys Quellmalz, Brian Nelson, Val Shute, Bob Mislevy, Dan Schwartz and others overseas are doing.

To #5 Katie, yes the merit pay issue is huge, and is really locked in under the RTT requirements. 

To #6 Richard Thall…. REALLY?  Sorry, but this is exactly what I am afraid of.  I am afraid that you missed the entire point of our inquiry.  A score on a standardized test is NOT an artifact that belongs in a student portfolio.  This is a terrible idea and I find it worrisome that you would refer to such a thing as a portfolio.  The entire point about a portfolio is that it contains artifacts that are personally meaningful to the student.  The only meaningfulness of scores on basic math and reading tests is the consequences that get attached to them.  I put up a post in October about participatory approaches to portfolio assessment that takes a very different position (http://bit.ly/RMAPortfolios).  I do think we should be able to show that our portfolio assessment practices ultimately lead to gains on external tests, but only very indirectly, and only by looking at achievement scores in the aggregate and over longer scales of time.  (This is exactly the sort of thing that growth modeling would be ideal for.)  That is just never going to happen if people think of test scores as relevant artifacts to include in a student’s portfolio, alongside (or in place of) essays, reports, etc.  If you put the scores in the “portfolio” then you undermine the validity of those very scores.

To #7 Mechelle’s second comment, I actually find notions like Fluid Intelligence pretty worrisome.  I started grad school studying differential psychology and my very first publication was in the journal Learning and Individual Differences.  I got out of it because those constructs seemed useless (or worse) for reforming education.  I am a huge believer in Arts Ed, and hands-on instruction in labs, and I just don’t see how linking to constructs like that helps.  The point is that our schools should engage students in the forms of discourse that create a trajectory towards various real world communities of expert practice.

To #8 Jeff, yes an entirely different way of thinking about assessment IS what we need.  My worry is that if each state’s entire accountability infrastructure is overwhelmed by pay-for-performance based on the math and reading test scores in each of their students’ “portfolios,” there won’t be any room.  But Jim’s next point certainly is encouraging, and I suspect he is just the sort of person who can make it happen.

#9 Jim, count me in!  It seems to me that NCLB has given the rest of the world a ten year head start, and I am worried we are going to wake up one day and discover that our “flat” earth is actually tilted sharply in favor of other countries who are doing just what you describe.  Ford, Hewlett, Nellie Mae, and Sandler funded a workshop for the US DOE last year where they brought in folks from other (mostly English-speaking) countries to show them the kinds of innovations that were underway.  As part of this, Linda Darling-Hammond produced a lovely paper summarizing these innovations.  But many of my overseas colleagues tell me that things are not as progressive in those other countries as some of us think.  Certainly we are talking about pockets of innovation here, but when I see how many forces are aligned against innovation in this country, I get worried.  But in the end, I think what we really need is a dynamic networked laboratory where TEACHERS work with assessment innovators to use and share high-quality assessments and the associated accounts and artifacts that result.  We are doing this right now in Southern Indiana with two school districts that got netbooks with stimulus funds, and we have made some pretty good progress in our first year.  We are using proxy items from our state End of Course Assessments, but in my prior implementations I have found that you have to make gains over one standard deviation on curriculum-oriented classroom assessments before you can find significant gains on standards-oriented tests.  We were getting close, but we just found out that new budget cuts are closing the alternative school where we are having the greatest success.

#12, Bob’s second point:  By standards, I think you mean what Jim Gee has been reminding us for years:  “standards” used to mean “high standards”.  But that is very different than “standardized.”  In our participatory assessment design model, we set very high standards for students in terms of the quality of the artifacts that they produce and their ability to reproduce those insights in formal assessments.  We warn students and teachers that performance will be well below that standard in our initial implementations.  But we don’t stop there.  We embed reflections into the enactment of the activities and the design of the activities and artifacts trying to ensure that all students enlist all of the relevant big ideas and concepts.  But we don’t stop there either.  We then do extensive iterative refinements to make sure that discourse happens, and then that they can use that knowledge on classroom assessments AND on external achievement tests.  It is hard but it works.

Finally to #13, Daniel, absolutely we should look at visual engagement.  There are lots of folks doing interesting stuff with GIS these days.  Your efforts to match tutors with learners look like a great example.  It does look like a pretty ambitious undertaking, so even a small success is noteworthy.  More generally, there are some interesting opportunities and challenges doing assessment with it, but “civic skills” are really being squeezed out in favor of math and reading.  I will say that I hope that efforts like the ones you describe have students engaging in writing about what they are seeing.  I agree with Deb Brandt and others who say that writing is the new mass literacy (I think multi-modal written discourse is THE “21st Century Skill”).  I am prepared to go so far as to say that I would have a hard time supporting school-based activities like the ones you describe if they did not include a component where students are writing about what they are seeing.  I will write Kurt Squire and ask him to chime in on this and see if he has any examples of fostering and assessing student writing in his GIS-based stuff.

Well, that was fun but took a while.  I think I caught most of my typos.  Thanks for all the great comments!

17: Richard Thall from Stow MA USA at 9:22 pm on Monday, February 15, 2010

Reply to Prof Hickey’s comment: 

Prof Hickey has significantly misunderstood my comment.  This is entirely understandable since I was attempting to explain a complex and, possibly, novel idea as a brief blog entry using loaded terminology to save space. So let me start over from the beginning in an organized way. 

At the outset, I wish to make it entirely clear that I do not believe that monolithic testing, such as is typical with state NCLB tests performed once every year or two, is a good way to assess the performance of individual students.  These methods are grossly unfair to students, are unable to adapt to differentiated instruction or differing curricula, are unable to fairly cover the content of one or two years of learning, and are too infrequent to guide teachers in correcting students’ instruction on a week-to-week basis. 

On the other hand, I believe that teachers need a more objective yardstick than their own subjective judgment.  Using computer and Internet technology it is now possible to create extremely flexible and fair, yet objective assessments.  (In this context, an assessment implies a traditional format where test items are presented to students who then respond with answers in a short time.  Authentic assessments, projects, portfolios, etc., and their attendant subjectivity are beyond what I have in mind.  However, item formats much richer than multiple choice, e.g., graphical questions, can be easily used in the method proposed here.)

If an educator wishes to assess student performance, then at some point it becomes necessary to objectively compare the performance of students to other students in the same classroom, in the next classroom, in another school, in another district, in another state, or another country.  Absent such comparisons, what objective yardsticks exist?  How can “schools ... tell the government how students are doing” (from item 1, above)?  Absent comparison, how can grades from different teachers have any comparable meaning?  Can such comparisons be made for students following different curricula, differentiated instruction, and even different tests created by different teachers?  Yes, it is possible.  I call this approach objectively comparable assessment.  If I correctly interpret the meaning of “Curatorial Crowdsourcing”,  Bob Bradley may have the kernel of this idea in mind.  Crowdsourcing and curating are key parts of this concept, in any case. 

My contribution to this discussion is the following nuts-and-bolts description of how objectively comparable assessment can be obtained.  This is treated purely as a technology item showing how ‘Digital Technologies Could Be Used to Assess 21st Century Skills’.  Although obviously important, the organizational and accountability issues mentioned at the beginning of this blog are beyond the scope here. 

Proposed approach

An item bank (database of test questions) is created by starting with one or more existing item banks and adding open (crowdsourced) contributions.  Items are reviewed by curators prior to publication.  Teachers compose assessments by selecting appropriate items from the database.  Appropriateness is judged by relevance to the curriculum, the ability of the student population, and type of assessment, e.g., quiz or major examination.  Different items can be selected for specific student groupings or even for specific students.
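As a rough illustration of the item bank described above, here is a minimal Python sketch. The class names, fields, and the topic-based selection rule are all assumptions made for the example; the proposal does not specify a data model.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One test question. All field names here are hypothetical."""
    item_id: str
    topic: str
    prompt: str
    published: bool = False  # becomes True only after curator review


@dataclass
class ItemBank:
    """Holds contributed items; only curated ones are offered to teachers."""
    items: dict = field(default_factory=dict)

    def contribute(self, item):
        # Crowdsourced contributions always start out unpublished.
        item.published = False
        self.items[item.item_id] = item

    def approve(self, item_id):
        # A curator has reviewed the item and released it.
        self.items[item_id].published = True

    def select(self, topic):
        # A teacher composes an assessment from published items on a topic.
        return [i for i in self.items.values()
                if i.published and i.topic == topic]
```

A teacher-facing tool would presumably filter on more than topic (student ability, quiz versus major examination), but the review-before-publication step is the part the sketch is meant to show.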

Each student sits at a computer and is presented with the appropriate version of the assessment.  The computer manages the presentation of items and not only records the students’ responses but also the time spent on each item.  Demographic information about each school and each student is also associated with each response.  This is all kept as a permanent growing record of responses for each item.  This may seem like an enormous body of information, but it is well within the capability of modern computer systems.  From these data, many statistics are generated including the mean and median times spent on the item and the difficulty (the fraction of students providing correct responses).  The results of the items on each student’s assessment are aggregated, producing a grade for the assessment taken as a whole.  Many grading methods are possible.  An aggregate difficulty rating for the test can be computed from the statistics, taking the time available into account if desired.  Various methods exist for weighting the final grade by these or other factors. 
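The per-item statistics mentioned above could be computed along the following lines. This is only a sketch: the `Response` record and its field names are assumptions, and "difficulty" follows the definition given in the text (the fraction of students providing correct responses).

```python
from dataclasses import dataclass
from statistics import mean, median


@dataclass(frozen=True)
class Response:
    """One student's answer to one item (hypothetical record layout)."""
    item_id: str
    student_id: str
    correct: bool
    seconds: float  # time spent on the item


def item_statistics(responses):
    """Group responses by item and summarize each item's history."""
    by_item = {}
    for r in responses:
        by_item.setdefault(r.item_id, []).append(r)

    stats = {}
    for item_id, rs in by_item.items():
        times = [r.seconds for r in rs]
        stats[item_id] = {
            "n": len(rs),
            "mean_seconds": mean(times),
            "median_seconds": median(times),
            # Fraction of recorded responses that were correct.
            "difficulty": sum(r.correct for r in rs) / len(rs),
        }
    return stats
```

Demographic fields could be added to `Response` in the same way and used to compute the same statistics for subgroups.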

Since the aggregate grade on each assessment is obtained by comparing the student’s performance on each item to the accumulated statistics for each item, the aggregate grade is based on a broad statistical sample of all students who have ever answered the items presented.  The grade is not dependent upon or biased by the student population of one classroom or teacher.  This makes the answer to the question ‘how are my students doing?’ much more robust and free from local bias factors.  The associated demographic data make it possible to study the effects of geography, school, school district, socio-economics, etc. even at the level of a single item or single assessment. 

Using this method, it is possible to develop a fair grade for each student, even when each student in a class took a different test!  It is hard to see how teachers could be given more flexibility in tailoring assessments to challenge and probe bright students while still providing fair tests for average students.  Of course, when used in this way, the difficulty rating must be folded into the grades. 
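One way to fold difficulty into a grade, so that students who took different tests can still be compared, is to score each answer against the item's historical fraction correct. The sketch below is an assumption for illustration, not the grading method of the proposal (which notes that many methods are possible); the function name and the item ids are hypothetical.

```python
def relative_score(answers, fraction_correct):
    """Average of (observed - expected) over the items a student answered.

    answers: dict mapping item_id -> True/False for this student.
    fraction_correct: dict mapping item_id -> historical fraction of
        students who answered that item correctly.
    A positive result means the student did better than the accumulated
    record would predict for those particular items.
    """
    if not answers:
        raise ValueError("assessment has no items")
    surplus = sum(
        (1.0 if ok else 0.0) - fraction_correct[item_id]
        for item_id, ok in answers.items()
    )
    return surplus / len(answers)


# Hypothetical history: two easy items and two hard ones.
history = {"easy1": 0.9, "easy2": 0.8, "hard1": 0.3, "hard2": 0.2}

# Two students in the same class take different tests.
on_easy_test = relative_score({"easy1": True, "easy2": True}, history)
on_hard_test = relative_score({"hard1": True, "hard2": False}, history)
```

Under these made-up numbers, the student who got one of two hard items right ends up with a higher relative score than the student who got both easy items right, which is the kind of adjustment the paragraph above calls for.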

Examination of the assessments used in each classroom can reveal many things about the quality of the teaching in that classroom.  The choice of items can reveal the extent to which the standard curriculum was covered.  The difficulty of items chosen can reveal the extent to which the teacher is challenging the students.  And the outcomes of the assessments can be used to measure the effectiveness of the instruction. 

There is much more to the proposed assessment system than can be described here.  Many aspects of this proposal require investigation and refinement, from curatorial methods and test presentation design to security, privacy, computer systems engineering, software engineering, and human factors engineering just to name a few.  However, all of these are well within the capabilities of current web application technology. 


18: Gregory Louie from Durham, NC at 12:22 am on Tuesday, February 16, 2010

Great discussion:  I would love to join in a deeper discussion with all of you.  I just don’t have time.  I have lesson plans to write, student papers to read, laboratories to set up and faculty meetings to attend.
So I intend to post once and get some sleep.

On Good Examples Abound:  I agree with Daniel and Brian that Mislevy’s work at Cisco points the way.  Yet Cisco has a worldwide database on service calls to define networking problems at all grain sizes.  I imagine that they also deployed teams of instructors, educational psychologists, statisticians and programmers to work on Mislevy’s assessment project.  I also assume it was a well-coordinated process.
It would be great if the MacArthur Foundation would fund a similar process.  As a working teacher, I would love to have access to a Mislevy-type assessment system that is tied to challenge-based instructional materials that foster the mastery of real-world problem-solving skills.  But I’m not holding my breath.  I have classes to teach.

On the way forward and the idea of lab schools by James Paul Gee:  I teach in a former lab school of Duke University that is very innovation friendly, but no longer has any connection with a university.  Although I would love to have fallen into a school with a pre-existing university connection, I am not waiting; I am making my own university education research connections.

My goal is to create a set of formative assessments that tie the results to suggested sets of specific challenge-based instructional materials similar to the Cisco system, but designed for the 7th grade science curriculum.

To get there, I need to start with defining the range of student misconceptions using open-ended formative assessments that will help me sketch out a simple learning progression for each concept from simple to complex.  Then I need to develop or find resources for each “level” or “path” in the learning progression.  My first attempts will be directed towards defining what “mastery” of concepts, skills and thinking practices looks like in 7th grade.  My hope is that with the right open-ended formative assessments, I can get an idea of the various pathways that “naïve” students take towards mastery.

I have to do this for each of the subject areas I teach in science:  The Nature of Science and Technology, Human body systems, Forces and motion, Biodiversity and climate change.  Wow!  What a challenge! 
Fortunately, I’ve made some useful connections at last summer’s Learning Progressions in Science Conference in Iowa City.  I have an ongoing conversation with Nancy Songer’s group at the University of Michigan to help define a learning progression for Climate Change. 

Wish us well.

Aside:  It would be interesting to dialogue map this entire discussion and other Hastac discussions on debategraph, which would allow the various threads to be visually displayed.  http://debategraph.org/

19: Caro from Madison WI at 4:10 pm on Tuesday, February 16, 2010

I successfully defended my master’s thesis mere hours after this was posted, and quoted directly from it.  My thesis was “Algebra and No Child Left Behind:  Standardized tests and algebraic complexity,” and focused on the recent history of NCLB assessment; your post helped me answer questions about the future.  Thanks much :)

20: Charles Youngs from Pennsylvania at 9:26 pm on Tuesday, February 16, 2010

Education Put to the Test, Students Be Damned

Here’s a story problem for you:  “If the federal government establishes a 500-point system and awards 138 points to states that increase a measure of teacher effectiveness by using student performance as the criteria and pay-incentive, how long will it take teachers to teach the test and not the student?”

My guess is not long. Speaking about the $4 billion “Race to the Top” deal, U.S. Secretary of Education Arne Duncan puts it crassly. “All of this money is voluntary,” he says. “If states don’t want to apply or compete they have every right not to do that. But I will tell you that when we put billions of dollars on the table, you’ll see people more than step up” (National Public Radio, 1/19/10, bold mine).

It would be laughable if it weren’t so damnable to quality in our schools, damnable to the educators in their halls, and damnable to the students most of all.  Oh, don’t get me wrong. Scores will increase; that much is all but certain. Learning—except how to whiz the test—will not. It will be reduced to skills and facts delivered by clerks and online programs, not educators. In fact, I predict a lot will be lost. If not school systems altogether, then we will lose the following: 

Creativity, 
productivity,
industry,
innovation,
performance in the arts,
entertainment, and
sports arenas, and
curiosity in the maths and sciences. (all non-out-sourceable talents, by the way)

Teachers can inspire those talents in students . . . but not with standardized tests.  These are the attributes of America’s success and prosperity.  And they are not standardizable; they are as revolutionary as the spirit of 1776. Standardized tests by their very essence are antithetical to creativity, productivity, industry, artistic performance, entertainment, and sports, and curiosity in mathematics and the sciences. Our citizens have led the world in these pursuits for the past century, despite the fact that our standardized scores have lagged.

As more and more standardized tests are added to the school year, students learn less and less that will be meaningful to their lives, liberty, or happiness, much less to our country’s success.  Before a student is graduated from high school, he will have spent more than 180 days devoted to standardized testing; that is, more than a whole year, gone. A year of teachable moments that might have expanded his world with hope and curiosity.  How can we forfeit so much for these tests, now promised in multifolds, and bearing such damage?

According to the bibliographic information giant Bowker, which tracks reading trends, a quarter of our population did not read a book in 2008. And less than half did not read more than one. Do we imagine that standardized tests with their overworked passages and hackneyed, insipid prompts will inspire a love rather than a fear and dread of reading and writing? Last year’s statistics will seem halcyon when viewed from 2022.

Why do our legislators yearn to standardize and hold our students and teachers accountable to international systems that are arguably inferior to ours?  Ease. Scores are easy. Not terribly meaningful, but easy.  You can publish the results and say “there.”  The politician says, “See, now reelect me.” Though low on meaning, they are high on stakes. If you don’t believe it, we’ll bribe you. That’s another easy answer. Confuse the issue with funding.

Pay teachers more for so-called effectiveness of so-called student achievement and you provide the meanest incentive to the basest gain. Funding for scores that are so limited in their meaningfulness (that is, save the injury of demeaning communities which don’t make the grade and then the insult of not funding them so they go defunct) insults the very professionalism of the discipline.

As for me personally, I see that, as I head toward retirement pensions based on my salary, I could fatten my wallet by teaching less.  Rather than teaching students, I could teach the test. Rather than working on concepts and skills that will prepare students for their futures yet unimagined, I can work to the bubble test defined for the here and now.  Rather than a career professional who has strived for decades to appeal to the hearts and minds of the next generation, I’ll become a clerk and time-keeper. The whole deal is one to be made with the devil.

States are signing on to “Race to the Top” with Faustian panic and expectation. It’s a race after all, not a thoughtful, meaningful process of learning. Governors seek funding in exchange for doing the devil’s work. Signing on the dotted line of a moral blank check with the testing companies (and their lobbyists) playing banker. Districts be shamed. Teachers be pressed. Students be damned.

21: Melissa Malkin-Weber at 10:01 am on Thursday, February 18, 2010

I would like to echo the comment earlier about the value of open source assessment standards. Assessment of something as subjective as “learning” and “performance” of students must be iterative and evolving. Open source allows for innovation and experimentation around a common platform.

22: Mechelle De Craene from Florida at 8:32 am on Tuesday, February 23, 2010

Congratulations Caro!! Are you going to post your thesis anywhere? I’d be interested to read it. I think it is such an important topic that a lot of folks aren’t addressing. I used to teach high school inclusion Algebra and it was very challenging in the fact that we had students who couldn’t multiply in our class, yet they were expected to know how to do quadratic equations, etc. Since I was the special ed teacher I was the case manager of these students. I made the suggestion that they needed intensive small group instruction to bring their skills up. When I teach math I use a lot of art and technology. I learned geometry through Logo back in the 80’s, so I don’t really teach in traditional ways. However, I was told by school admin that they needed to be exposed to the reg-ed curriculum. I was going to adapt the materials and still meet the standards. However, they said the kids needed to be in with the reg-ed kids and they weren’t going to graduate anyway (they get a special diploma). It was all about NCLB and student placement dollars and it was very challenging to try to teach my students Algebra that year. I did learn a lot though.

I would love to read your thesis. I’m a part time professor at night and teach special ed. Congratulations again Caro!! Rock on!!  : )

23: Caro from Madison WI at 6:06 pm on Tuesday, March 2, 2010

Hey, Mechelle!  I’ll be posting my thesis publicly once it’s all official and such, and I’ll come right back here and post the link up.  And it’s always nice to know that others are interested in the topic that’s so recently consumed my life :)

Robust discussion/debate is encouraged. Comments are reviewed before posting to ensure they are on topic and do not promote commercial products or services.


About Spotlight

Spotlight magazine showcases the projects and people funded by the MacArthur Foundation’s Digital Media and Learning Initiative and covers the intersections of technology and learning.  We go beyond the research to show how digital media is being used in classrooms and programs around the world.

Spotlight welcomes guest posts and reader suggestions and comments. Learn more and meet the Spotlight team.

View Spotlight videos and interviews on Vimeo.
