From Common Measures to “Measures in Common” Convening

January 31st, 2020

the commons: resources belonging to or affecting the whole of a community

common: found frequently in many places or among many people

in common: 1. in joint use; shared 2. of joint interest (Oxford)

A convening of 72 practitioners, researchers and evaluators, and other stakeholders in Washington, DC from December 11 to 13, 2019, explored the current state of evaluation and measurement tools in afterschool STEM programs. The goal was to address the need to monitor the quality and outcomes of a wide range of programs.

Formed by the Afterschool STEM Hub and led by the University of Washington in partnership with the Afterschool Alliance and the National Girls Collaborative, the convening aimed to produce:

a detailed, visual representation of the range of outcomes that afterschool programs are seeking to achieve;
a taxonomy of current evaluation instruments aligned to these outcomes; and
an identification of overlaps, gaps, and needs in order to guide practitioner choices about which tools to use and when, and to point to contributions that researchers can make by adapting existing instruments or creating new ones.

The convening was funded by the National Science Foundation’s (NSF’s) Advancing Informal STEM Learning (AISL) program (award #1811487). It provided a structure for practitioners from national organizations and leading researchers to do intra-group work with peers, joined by “disseminators” and funders. The gathering featured overlapping sessions to allow the communities to work together. The organizers and many in attendance identify with and function in more than one role.

The content of the convening was informed by a literature review by the University of Washington that identified some of the tools that out-of-school time (OST) STEM programs are using to measure a range of constructs. The researchers focused the review on two lines of argument in the literature as they pertain to the value and potential of OST outcomes and quality: issues of equity in, and ecological perspectives on, STEM learning (Barron, 2015; Philip & Azevedo, 2017). The review found a variety of measurements related to achievement, interest, and engagement, but few measures addressed communities’ everyday STEM practices or STEM as a tool for social justice and community development. A public version of the literature review, which became a working document at the convening, will be available on the InformalScience.org website in spring 2020.

The convening also built on previous convenings and events where professionals from across the informal STEM learning field have grappled with similar themes in assessment, evaluation capacity building, and the opportunities and challenges presented by the notion of common measures. CAISE has led and tracked the trajectory of some of these discussions, which have resulted in new thinking, sharing, instrument development, and knowledge-building about approaches to measuring and understanding the impact of designed programs, activities, and experiences.

The agenda was structured around panels that presented the perspectives of practitioners, researchers, disseminators, and funders, interspersed between whole- and small-group discussions and exercises designed to surface and address issues, opportunities, and challenges. Throughout the convening, participants contributed sticky notes with their ideas and priorities for constructs, outcomes, and measures to a map that grew and evolved as a result of the discussions and interactions.

Common Measures STEM Practices

Working together during the first day and a half, practitioners surfaced common challenges, such as the complications of relying on youth self-reports, understanding long-term outcomes, measuring staff practices, having measures that reflect and are informed by the community context, programs’ access to data for improvement efforts, and assessing equity and social justice and the role of family in youth outcomes. As researchers and evaluators joined the conversation, group research questions emerged, such as: How does STEM capital influence pathways? How do measures of STEM program quality relate to youth outcomes? How can we conduct longitudinal studies with the qualities that we admire in shorter-term studies?

During the cumulative activities over 2.5 days, participants identified, shared, and wrestled with definitions of terms and constructs, tensions between the potential political power of common measures and the richness of a variety of approaches, the competing priorities and timelines of practice and research, and the opportunities and challenges of attending to equity in design and measurement. One strong area of agreement was around the current context for the convening and the centrality of equity and social justice concerns among practitioners and researchers alike. Those who had been involved in the previous convenings and meetings with similar or related goals were particularly heartened by this emphasis, which was notably lacking from the field six years ago.

Existential questions such as the “why” of doing design and measurement work were raised by both communities as well. Practitioners spoke of primarily wanting to learn more about and improve their own programs’ outcomes but feeling constant pressure to be in the competitive “horse race” for funding. Meanwhile, researchers reflected on the challenge of wanting to do research in the service of practice, which is at odds with the reputational schemes to which they are held accountable in academia, where the dominant narrative relates to “achieving status.” Both groups agreed on the need to work together to move policy so that more quality STEM learning experiences are available to more children in the OST space.

Common Measures board

A recurrent theme throughout the convening was the question of the unit of measure in this work. In other words, should the focus be on particular psychological traits in individuals or on trying to situate individuals in interactions within the cultural context, families, and communities of which they are a part, and what does that mean for the approaches that we use? One researcher suggested that the field might be at a point of crisis with regard to this tension and called for more systems thinking to address it. Another suggested that perhaps we need a “hierarchical or nested, shared measure suite” (i.e., a shared common instrument that can capture the big picture, within which are nested tools that would measure nuances at the community, family, and individual levels).

While private funders were well represented among the participants, there was general agreement that the agenda of this convening was less driven by their interests than were similar discussions in the past. One funder noted that over time “a depth to the messiness and focus to the angst” has emerged in this work. Going forward, funders in attendance recommended that both practitioners and researchers continue to challenge their assumptions about constructs, measures, and program design, and that they regularly check in with program participants about what they think is important to measure and assess about their experiences and learning.

The convening ended with reflections on how the sessions, discussions, and map-making exercise informed participants’ thinking about the opportunities and challenges ahead and the actions they intended to take as a result. Some tensions remained unresolved, such as the gains and losses of “letting a thousand [measurement] flowers bloom” vs. the potential scientific and political power of replication and comparisons that common measures allow. The voices for these points of view were represented compellingly, particularly with regard to what each offers as a social justice agenda. In the end, there was some consensus around the idea that there is a time and place for both approaches in the larger project of creating a compelling narrative around research and measurement, centered on the contributions that OST programs can make for equity. Researchers noted the need in their community for additional capacity-building to design studies and questions that address equity goals.

The group also acknowledged that this community doesn’t have to try to take on every issue at once, and that sometimes relatively “small” developments have led to widespread change and influence in the field. Examples given were the impact categories identified in the 2008 Framework for Evaluating Impacts of Informal Science Education Projects that resulted from an NSF-funded evaluation workshop, and the six learning strands that were outlined in the 2009 National Research Council/National Academy of Sciences consensus report, Learning Science in Informal Environments: People, Places, and Pursuits. Both frameworks have gained a great deal of traction in the field since they were released. One researcher suggested that it may be time to add current relevant constructs and outcomes, such as community engagement, social empowerment, or opportunity to learn, to these categories.

Overall, the convening made progress on all three of its original goals, while pointing the way for more in-depth work on constructs and measures. It produced a document that the wider field can learn from and use to advance practice, research, and equity, and that can inform enlightened philanthropy in OST STEM learning.

Some of the specific actions and resources that participants suggested to build on the work of the convening included:

Leverage existing networks (e.g., the NSF Division of Research on Learning in Formal and Informal Settings (DRL)-funded resource centers, such as STELAR and CAISE) and propose related sessions at DRL Principal Investigator (PI) meetings. With regard to CAISE, those who have completed evaluation reports should upload them to the InformalScience.org website.
Connect with or join professional associations, such as the STEM Education Topical Interest Group at the American Evaluation Association or the Research and Evaluation Community of Practice at the Association of Science and Technology Centers (ASTC).
Read the Shared Measures for Evaluating Common Outcomes of Informal STEM Education Experiences chapter by Amy Grack Nelson et al. in the American Evaluation Association New Directions in Evaluation journal special issue on Evaluation in Informal Science, Technology, Engineering and Math Education (2019).
Explore existing tools that are being applied in a variety of OST STEM learning settings, such as Assessment Tools in Informal STEM (ATIS) (a 2019 update of the database includes 25 new tools that can assess engagement, attitudes, content knowledge, competence, and other domains within the fields of computer science, technology, science, and mathematics), the Weikart Center’s Youth Program Quality Assessment, the Assessing Women and Men in Engineering STEM Assessment Tools, the STEM Next Opportunity Fund’s STEM in Afterschool System-Building Toolkit, and the National Assessment of Educational Progress (NAEP) Data Explorer.
Explore the CAISE Evaluation and Measurement Task Force’s interview series on STEM identity, interest, and engagement.
Explore the Twitter feed from the convening for additional perspectives and images from participants.
Review the Building Blocks of STEM Act signed into law in December 2019, which instructs NSF to prioritize research on STEM learning in early childhood education, including promising practices for effecting educator training and professional development.