Evaluation Tools and Instruments

Most evaluations require a data collection tool, such as a survey or other data collection instrument. Evaluators must either adopt or adapt existing tools or create new ones. Either approach can pose challenges: tools developed for one evaluation may not prove suitable for another, at least not without careful modification. At the same time, creating new tools requires expertise in measurement and instrument design.

How do you know if an existing instrument is appropriate for your needs?

Good question! When considering the use of an instrument, keep in mind the following:

What is the instrument measuring? Review how the instrument developers define what they are measuring. Does it match exactly what you want to measure? Also look for validity evidence that the instrument measures what it proposes to measure. Validity evidence can come from expert reviews, think-aloud interviews, factor analysis, and other validation techniques.
What audience was the instrument created for and tested with? Instruments are created for a particular audience. If your audience matches the one that an instrument was designed for, great. If not, you’ll need to do some testing to see if the instrument works for your audience before you use it for an evaluation. For instance, a survey created for adults may or may not be appropriate for children. You won’t know until you test it.
What context or setting is the instrument meant for? An instrument meant for one setting may not work well in a different one. For instance, a survey developed to measure an experience kids have in a school classroom may not be valid for evaluating an experience they have within a museum. Again, testing is required if an instrument is to be used in a new setting.
Do I have the expertise to be able to judge the appropriateness and quality of the instrument? Experience with evaluation and instrument design is necessary to successfully choose and use an “off-the-shelf” instrument. If you don’t have this experience, be sure to call on someone who does.

The following websites provide tools and instruments that can be used for evaluating the wide range of outcomes addressed by informal STEM education projects, or that can serve as starting points for modification.

Activation Lab (ActLab): ActLab is a national effort to learn and demonstrate how to activate children in ways that ignite persistent engagement in science, technology, engineering, art, and mathematics learning and innovation. Visit the website to find a variety of instruments developed and tested by ActLab to measure constructs such as science learning activation, engagement, and scientific sensemaking.
Building Informal Science Education (BISE): The BISE project coded and synthesized 520 evaluation reports on InformalScience.org to see what could be learned about evaluation in the informal science education field. Search the BISE NVivo Database or download the Excel File of BISE Report Level Codes to locate reports with attached surveys, observation instruments, focus group and interview protocols, timing and tracking forms, and much more.
Collaboration for Ongoing Visitor Experience Studies (COVES): COVES provides a set of common tools and measures for museums across the country to systematically collect, analyze, and report on visitor experience data. Members pay an annual fee.
DEVISE: Developing, Validating, and Implementing Situated Evaluation Instruments (DEVISE) is an initiative from the Cornell Lab of Ornithology that developed a set of constructs and associated instruments designed to measure individual learning outcomes from participation in citizen science such as interest, motivation, self-efficacy, and skills. Each instrument comes with instructions for use and scoring. Start with the User’s Guide for Evaluating Learning Outcomes from Citizen Science.
EvalFest: Originally formed to support the science festival community, EvalFest has built a suite of resources that are useful to all kinds of informal learning professionals. They have gathered several instruments that can help you collect data at public events, and developed videos and other training tools around data collection, management, and analysis. Here’s an overview of the resources.
From Soft Skills to Hard Data: Measuring Youth Program Outcomes: In this document, the Forum for Youth Investment reviews ten outcome measurement tools appropriate for use in afterschool and other settings to measure soft skills such as communication, collaboration, critical thinking, decision making, initiative, and self-direction. For each tool, it provides sample items and information about usability, cost, and evidence of reliability and validity.
InformalScience.org: Browse evaluation instruments and tools submitted by our community. Use the search filters to find tools specific to your educational setting, audience, content area, and more!
Online Evaluation Resource Library (OERL): OERL is an NSF-funded project developed for professional evaluators and program developers. Although targeted at those who work in school environments, it provides extensive evaluation resources and samples (instruments, plans, and reports) that can be modeled, adapted, or used as is.
STEM Evaluation Repository: This crowdsourced and curated collection of resources consolidates new knowledge from STEM evaluation practitioners, while also identifying and linking to existing repositories to avoid duplication of effort. It was created through a partnership among the American Evaluation Association’s Topical Interest Group (TIG) for STEM Education and Workforce Training, Oak Ridge Associated Universities, the Google-originated Computer Science (CS) Impact Network, and other collaborators.
STEM Learning and Research Center (STELAR): STELAR supports the NSF’s Innovative Technology Experiences for Students and Teachers (ITEST) program, and provides a searchable database of data collection instruments that have been used by various ITEST projects.