
Annual Meeting of the American Evaluation Association

"Values and Valuing in Evaluation"

November 1-6, 2011—Hilton Anaheim—Anaheim, CA

Workshop: Systems Thinking for Evaluation Practice
Janice Noga (Pathfinder Evaluation and Consulting) and Margaret Hargreaves (Mathematica)

Systems thinking can help evaluators understand the world, in all its diversity, in ways that are practical, comprehensive, and wise. For those interested in making sense of the complex and sometimes messy situations we often encounter in practice, this workshop provides an overview of systems thinking and how it can be used in evaluation. A systems approach is particularly useful in situations that require rigorous rethinking, reframing, and unpacking of complex realities and assumptions. Systems approaches provide a broader perspective, helping the evaluator see how component parts interconnect in a coordinated way that emphasizes balance and fit. Evaluations based on systems concepts generate rich, multi-perspective descriptions of complex, interconnected situations that help participants build deeper understanding and inform choices for subsequent action.

Through mini-lectures, group activities, and hands-on practice, this workshop teaches fundamental concepts of systems thinking and provides opportunities to apply learnings to everyday practice—making sense of the world, using systems to understand things better, and orienting ourselves towards the world in a way that embraces complexity and ambiguity.

You will learn:

  • Basic concepts underlying systems thinking
  • How to use systems perspectives in evaluation practice and evaluative thinking
  • How to apply systems thinking to evaluation practice

Assessing Coalition Building and Relationships Through Social Network Analysis
Todd Honeycutt, Marykate Zukiewicz, and Debra Strong (Mathematica)

Among the many objectives of funding a program, funders and participants want to build relationships among those involved that can last beyond the initial project funding. The Consumer Voices for Coverage program, funded by the Robert Wood Johnson Foundation initially for three years, sought to help 12 state-level consumer advocacy coalitions address health policy in their states as well as to strengthen relationships among participating organizations. As part of a larger multi-mode evaluation, we used social network analysis (SNA) methods to assess the extent to which participating organizations in each coalition had worked together before the grant and how organizations communicated and collaborated with each other in the first and third grant years. This presentation will describe how we used SNA for the evaluation and compare our findings on coalition building and relationships with other results from the evaluation.

System Boundaries: Separators and Filters
Presenter: Michael Lieber (University of Illinois Chicago)
Discussants: Eve Pinsker, Geoffrey Downie, and Michael Lieber (University of Illinois Chicago); and Margaret Hargreaves (Mathematica)

If all a system boundary did was to separate the system from its environment, there would be little point in dwelling on it. But, boundaries do much more than that. In closed systems, boundaries are rigid; in open, dynamic systems, boundaries are more porous, acting as an entrance point for inputs to and an exit point for outputs from the system to its environment. Like white blood cells, boundaries filter environmental inputs, selecting which inputs get to other system components and which do not. As in biological systems, filtering can create a barrier in social systems, including evaluations. Managing the filtering process in evaluation is the focus of this Think Tank. We shall give brief presentations sketching the boundary concept and introduce an evaluation case in which boundary/filtering issues are challenging the evaluation. Then, participants will work in small groups, presenting their findings to the whole group for discussion.

Tanzania Energy Sector Impact Evaluation: Findings from the Zanzibar Baseline Study
Denzel Hankinson and Lauren Pierce (DH Infrastructure); Duncan Chaplin and Arif Mamun (Mathematica); Minki Chatterji (Abt Associates, Inc.); Shawn Powers (Princeton University); and Elana Safran (Harvard University)

The Millennium Challenge Corporation is funding an electricity project in Tanzania that includes construction of a cable connecting the electricity grid on the mainland of Tanzania to Zanzibar. This report describes findings from a baseline study regarding the potential impacts of the new cable. Our results suggest that in recent years the quality and reliability of electricity in Zanzibar have deteriorated. In addition, Zanzibar has experienced two major blackouts, the most recent of which lasted from December 2009 to March 2010. That blackout appears to have had large negative impacts on the hotel industry in Zanzibar, suggesting that the new cable could have important economic benefits for the island. The cable is scheduled to be built in 2012. We will conduct a follow-up study at that time to assess the degree of improvement in electricity services and associated changes in the hotel industry.

Reviewing Systematic Reviews: Meta-Analysis of What Works Clearinghouse Computer-Assisted Interventions
Andrei Streke (Mathematica) and Tsze Chan (American Institutes for Research)

The What Works Clearinghouse (WWC) offers reviews of evidence on broad topics in education, identifies interventions shown by rigorous research to be effective, and develops targeted reviews of interventions. This paper systematically reviews research on the achievement outcomes of computer-assisted interventions that have met WWC evidence standards (with or without reservations). Computer-assisted learning programs have become increasingly popular as an alternative to traditional teacher-led instruction for improving student performance across a variety of topics. The paper focuses on computer-assisted programs featured in the WWC's reading topic areas. This work updates the authors' previous work, includes new and updated WWC intervention reports released since September 2010, and investigates which program and student characteristics are associated with the most positive outcomes.

Three's Company: Results and Lessons Learned Through a Collaboration Among Funder, Grantee, and Evaluator to Establish Targets and Measure Child Progress and Parent Engagement
Emily Moiduddin (Mathematica), Chair and Discussant

Through a collaborative process, a funder, a grantee, and a team of evaluators developed a set of targets for child progress in multiple domains of school readiness and for parent engagement. These performance targets are part of an accountability framework between the funder, First 5 Los Angeles (First 5 LA), and the grantee, Los Angeles Universal Preschool (LAUP). LAUP maintains a network of more than 300 preschools serving more than 10,000 4-year-olds throughout Los Angeles County. In the years before targets were set, First 5 LA commissioned Mathematica to conduct the Universal Preschool Child Outcomes Study (UPCOS). The session papers will describe how data from UPCOS were used to identify metrics and establish targets, how outcomes were measured, and how LAUP performed in the first year. Papers also will describe how the results are being used for program improvement and offer reflections on the collaborative dynamic. Lessons learned will be highlighted throughout the presentation.

Using Evaluation to Inform the Process of Setting and Meeting Shared Goals
Yange Xue, Emily Moiduddin, Sally Atkins-Burnett, Elisha Smith, and Cay Bradley (Mathematica); and Ama Atiedu (Los Angeles Universal Preschool)

As part of the Universal Preschool Child Outcomes Study, Mathematica has been working with First 5 LA and Los Angeles Universal Preschool to conduct outcomes evaluations in the areas of child progress (since 2007) and family engagement (since 2009). Data from these studies are being used to set targets in the context of the performance-based contract between the two organizations. In this paper, we describe the data that were collected (direct child assessments and self-administered questionnaires for parents and providers), report how findings from those studies were used to develop targets for child progress and engagement, and discuss how data are being used to determine whether the targets have been met (including key findings). This paper illustrates how data from descriptive evaluations can both inform program improvement efforts and enable a funder and program to work together to determine whether shared goals are met.

Impact of the DRA Citizenship and Identity Documentation Requirement on Enrollment and Retention in Medi-Cal
Margaret Colby and Brittany English (Mathematica)

Between June 2007 and September 2008, California's 58 counties began implementing the Deficit Reduction Act of 2005 (DRA) citizenship and identity documentation requirement for beneficiaries seeking Medi-Cal enrollment or renewal. Using enrollment data from the Medi-Cal Eligibility Data System from May 2007 through March 2009, we conducted multivariate regression analyses to estimate average county-level monthly changes in retention, full scope enrollment, and restricted scope enrollment. Models included county and month fixed effects and an indicator for DRA implementation. Separate regressions were run for populations subject to and exempt from the DRA (e.g., current Medicare beneficiaries) and for subgroups defined by age and primary household language. Estimates suggest that DRA implementation did not affect Medi-Cal retention or restricted scope enrollment. However, enrollment among full scope beneficiaries subject to the DRA decreased by 3.8 percent (p=0.019), with larger effects for children. This estimate translates into about 60,000 fewer enrollments than expected in the year following DRA implementation.
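For readers less familiar with this design, the county-and-month fixed effects specification described above can be sketched in a few lines. The code below is a minimal illustration, not the authors' analysis: the data file and variable names (enrollment, dra_implemented, county, month) are hypothetical, and clustering standard errors by county is a common choice for panel data rather than a detail reported in the abstract.

```python
# Minimal sketch of a two-way fixed effects model with a policy indicator.
# All file and variable names below are hypothetical placeholders.
import pandas as pd
import statsmodels.formula.api as smf

# One row per county per month: an enrollment outcome and a 0/1 flag for
# whether the DRA documentation requirement was in effect that month.
df = pd.read_csv("medi_cal_panel.csv")

model = smf.ols(
    "enrollment ~ dra_implemented + C(county) + C(month)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["county"]})

# The coefficient on the policy indicator is the estimated DRA effect.
print(model.params["dra_implemented"], model.pvalues["dra_implemented"])
```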

Replicating Innovative Program Models: What Evidence Do We Need to Make It Work?
Margaret Hargreaves and Beth Stevens (Mathematica)

"Scaling up" is partially a result of replication. Can an organization adopt or replicate a new model? If not, could scaling up be achieved? How can evaluation contribute to answers to this question? What elements of knowledge, strategy, and local conditions need to be present in both the original organization and the organization replicating the model for successful replication to occur? The evaluation of the RWJF Local Funding Partnerships Program included case studies of four pairs of programs—the organizations that had developed innovative program models and the organizations that replicated them. These case studies reveal that the goal of most evaluations—evidence of effectiveness, is only one of the elements that further the chances of successful replication. Diffusion of knowledge of the innovation, identification of appropriate candidates for replication, and the provision of technical assistance to transplant the innovation are also part of the process.

Social Network Analysis TIG Business Meeting and Presentation: The Application of Multiple Measures in SNA Evaluations
TIG Leaders: Maryann Durland (Durland Consulting); Stacey Friedman (Foundation for Advancement of International Medical Education & Research); Irina Agoulnik (Brigham and Women's Hospital); and Todd Honeycutt (Mathematica)
Presenter: Maryann Durland (Durland Consulting)

This expert lecture will illustrate the application of multiple social network analysis (SNA) measures. Multiple measures allow evaluators to explore and explain the complexity of networks and to move analysis beyond a single "statistically significant" measure, such as density, when comparing networks. Programs create networks, and defining those networks for evaluation purposes is critical: the definitions form the basis for determining which measures to use. Some program-related networks are small and bounded by program specifics (e.g., the participants in a small group who interact over time). Others are more loosely defined and bounded by a relationship theory (e.g., support networks or communication networks). Some programs have one specified network; others have multiple parallel networks. In each case, multiple measures provide a means to understand the complexity of networks and to evaluate multiple networks against specific criteria, which can include more traditional statistical significance testing.
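As a concrete illustration of moving beyond a single measure, the sketch below computes several whole-network measures side by side using the open-source networkx library. The two toy edge lists and the particular measures shown (density, clustering, degree) are illustrative choices, not the lecture's prescribed set.

```python
# Sketch: describing two program networks with several measures at once,
# rather than comparing them on density alone. Edge lists are made up.
import networkx as nx

wave1 = nx.Graph([("A", "B"), ("B", "C"), ("A", "C")])
wave2 = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A"), ("B", "D")])

def describe(g: nx.Graph) -> dict:
    """Return a handful of complementary whole-network measures."""
    degrees = [d for _, d in g.degree()]
    return {
        "nodes": g.number_of_nodes(),
        "edges": g.number_of_edges(),
        "density": round(nx.density(g), 3),
        "avg_clustering": round(nx.average_clustering(g), 3),
        "max_degree": max(degrees),
    }

for label, g in [("wave 1", wave1), ("wave 2", wave2)]:
    print(label, describe(g))
```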

Non-Profit and Foundations Evaluation TIG Business Meeting
TIG Leaders: Charles Gasper (Missouri Foundation for Health); Beth Stevens (Mathematica); Helen Davis Picher (William Penn Foundation); and Joanne G. Carman (The University of North Carolina, Charlotte)

Using a Developmental Evaluation Approach to Create an Evaluation Partnership for the Healthy Weight Collaborative
Margaret Hargreaves (Mathematica) and Amanda Cash (U.S. Department of Health and Human Services)

Over the last year, Mathematica has been working with the Health Resources and Services Administration (HRSA) and the National Initiative for Children's Healthcare Quality (NICHQ) to evaluate the Healthy Weight Collaborative (HWC), an innovative quality improvement (QI) initiative designed to spread clinical and community-based interventions that prevent and treat obesity among children and families. The HWC is adapting the Institute for Healthcare Improvement's (IHI) Breakthrough Series healthcare QI model for use in community-based learning collaboratives. Mathematica is using a developmental evaluation approach, working closely with HRSA and NICHQ evaluation and program staff to provide ongoing evaluation support as the IHI QI model is adapted for this community-based setting. In this session, Mathematica and HRSA staff discuss the challenges and opportunities of this fascinating and rapidly evolving project and evaluation. We will provide examples of how the evaluation's interim products have been used in the adaptation process.

Health Care Public Reporting: A High Stakes Evaluative Tool
Sasigant O'Neil and John Schurrer (Mathematica); and Christy Olenik (National Quality Forum)

Public reporting of health care quality performance measures has become a high stakes game. However, the diversity in purposes, audiences, and data sources among public reporting initiatives can make it difficult to identify opportunities for coordination in pursuit of a national agenda for assessing, evaluating, and promoting health care quality improvement. To help identify such opportunities, we conducted an environmental scan of public reporting initiatives and their measures. Initiative characteristics included audience, geographic level, report dates, payer type, sponsor, organization type, and when public reporting began. Measures were mapped to a framework of national priorities and goals, as well as other conceptual areas of importance, such as cost and health condition. Measure characteristics, such as data source, endorsement by the National Quality Forum, target population, and unit of analysis, were also collected. A group of national leaders used the scan results to begin identifying a community dashboard of standardized measures.

A Short History of the WWC and Its Review Standards
Jill Constantine (Mathematica)

Using a set of standards based on scientifically valid criteria, the What Works Clearinghouse (WWC) evaluates education research. To be effective, WWC reviews must be based not only on rigorous standards but also on consistent procedures for determining which research to include, rating the literature, and synthesizing findings. Recognizing that randomized controlled trials (RCTs) are not feasible in all contexts, the WWC has developed standards for reviewing both RCTs and quasi-experimental designs. The standards adhere to three principles: they are (1) exhaustive, (2) inclusive, and (3) well documented. The presentation will describe the considerations and process involved in developing the standards, provide an overview of the WWC standards and what it means to meet them, and explain how the attrition standards were developed. We will also describe how reviews of research against the standards are implemented consistently.

Using Standards in Systematic Reviews
Neil Seftor (Mathematica)

At the core of the What Works Clearinghouse (WWC) are research design standards that identify studies with the strongest causal validity. This presentation will cover the design and reporting requirements studies must satisfy to meet WWC standards and discuss how to use the standards to conduct a systematic review. It will also examine common challenges that arise in applying the standards consistently and describe statistical adjustments the WWC makes to ensure findings can be compared across studies.
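One common adjustment of the kind mentioned above is converting each study's findings to a standardized effect size, such as Hedges' g with a small-sample correction, so that results measured on different scales can be compared. The sketch below shows that general computation with made-up summary statistics; it illustrates the technique, not the WWC's actual code.

```python
# Sketch: standardizing a mean difference as Hedges' g so findings can be
# compared across studies. The inputs are hypothetical summary statistics.
import math

def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with a small-sample correction."""
    # Pooled standard deviation across treatment and comparison groups.
    pooled_sd = math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )
    d = (mean_t - mean_c) / pooled_sd
    # Small-sample correction, 1 - 3 / (4 * df - 1) with df = n_t + n_c - 2.
    correction = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    return d * correction

print(hedges_g(mean_t=105, mean_c=100, sd_t=15, sd_c=14, n_t=60, n_c=55))
```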

CMO Impacts on Student Achievement and Promising Practices
Joshua Haimson, Brian Gill, Josh Fergeson, Bing-ru Teh, Moira McCullough, Ira Nichols-Barrer, Alexandra Killewald, and Natalya Verbitsky Savitz (Mathematica)

What are the effects of charter management organizations (CMOs) on student achievement, and which CMO practices are most promising? This presentation will summarize the study's impact findings, including the average impacts of CMOs on student achievement, the variation in impacts across CMOs, and the practices that are positively associated with impacts. We will discuss the extent to which impacts are associated with a variety of CMO practices and structures, including teacher coaching and evaluation approaches, use of CMO instructional models, student behavior strategies, use of formative student assessments to inform instruction, expansion of the school day or year, and CMO size and growth. The quasi-experimental and experimental impact methods will be briefly described.

Integrated Monitoring, Evaluation, and Planning (IMEP): An Approach to Evaluating International Research and Systems Change
Margaret Hargreaves (Mathematica), Discussant

In 2008, the McKnight Foundation began a new phase of its Collaborative Crop Research Program (CCRP). The evaluation design was intended to encourage cross-program coherence among 65 independent projects; to build local and regional capacity; to support communities of practice (Andes, Western Africa, East/Horn of Africa, Southern Africa); to encourage systems thinking in the areas of gender equality, agroecological intensification, and sustainability; and to evaluate systemic improvements in nutrition and livelihoods arising from basic crop research. IMEP was the result. Three panel members provide perspectives on the design, development, and implementation of this approach. The funder (McKnight Foundation) discusses the drivers and challenges of an integrated M&E approach. The designers (Human Systems Dynamics Institute) discuss the principles and practices of a dialogue-based, systemic evaluation. The implementers in the field (Regional M&E Support) describe the process of introducing IMEP to project teams that differ in culture, locale, scientific discipline, and readiness.

Systems in Evaluation TIG Business Meeting and Think Tank: International Perspectives on Systems Evaluation
TIG Leaders: Janice Noga (Pathfinder Evaluation and Consulting); Margaret Hargreaves (Mathematica); and Mary McEathron (University of Minnesota)
Presenter: Janice Noga (Pathfinder Evaluation and Consulting)

As the Systems in Evaluation TIG continues to grow and bring in new members, the TIG is becoming more international, welcoming members from across the globe, including Asia (Japan, South Korea, and Indonesia); Africa (South Africa, Tanzania, and Uganda); Central Asia (Azerbaijan, India, and Pakistan); Europe (Austria, France, Germany, Greece, Italy, the Netherlands, Spain, Sweden, Switzerland, and the U.K.); the Middle East (Egypt and Saudi Arabia); North America (Canada and the U.S.); South America (Brazil, Peru, and Venezuela); and the South Pacific (Australia and New Zealand). This diversity has enriched the TIG through lively workshop discussions, enlightening panel presentations, and culturally enriched evaluation music! We have assembled a panel of evaluators from New Zealand, South Africa, Brazil, the Netherlands, North America, and the U.K. to talk about their systems evaluation approaches and perspectives. We invite you to join us for what promises to be a fascinating discussion.

Leadership, Bipartisanship, and Dual Strategies: Lessons Learned from an Evaluation of a California Governance Reform Initiative
Jacqueline Berman (Mathematica) and Hannah Betesh (Berkeley Policy Associates)

Reform of public policy processes, particularly in the current fiscal climate, is a fundamental challenge requiring flexibility, strategic action, and responsive reform proposals. Such proposals must appeal to the broadest cross-section of constituents while remaining bold enough to engender change. How can reform efforts reconcile these often competing demands to produce change in a crisis? A recent evaluation of a governance reform initiative charged with addressing California's troubled policy environment suggested areas of both significant challenge and real possibility in setting, enacting, and implementing a reform agenda while building broad support. Lessons and strategies that emerged included the need to engage grass-roots and 'grass-tips' leaders and the advantages of pursuing ballot-box activism and legislative action simultaneously. We will also discuss methodologies that can be used to identify and strengthen key strategies of governance reform.

Large-Scale Comparative Effectiveness Study of Four Elementary School Math Curricula
Roberto Agodini and Barbara Harris (Mathematica)

This large-scale evaluation examines the relative effectiveness of four elementary school math curricula that use varying approaches to math instruction: (1) Investigations in Number, Data, and Space, (2) Math Expressions, (3) Saxon Math, and (4) Scott Foresman-Addison Wesley Mathematics. The evaluation uses an experimental design based on 110 schools from 12 districts, where all four curricula were randomly assigned to schools within each participating district. The study compares average student math achievement gains to determine the relative effects of the curricula. This session presents causal evidence of the relative curriculum effects on first- and second-grade math achievement during the first year of curriculum implementation. At the first-grade level, the results favored Math Expressions; at the second-grade level, they favored Math Expressions and Saxon. Correlational (mediational) analyses also were conducted to examine whether instructional practices explain the differences in curriculum effects.

Evaluating Research-to-Practice in Disability: A Knowledge Value Mapping Approach
Frank Martin (Mathematica) and Juan Rogers (Georgia Institute of Technology)

This presentation will describe the use of knowledge value mapping (KVM) to evaluate knowledge translation (KT) initiatives in the disability arena. KT has recently emerged in the health science community as a means to address perceived gaps in the application of the best research to the treatment of disease. In the area of disability and rehabilitation research specifically, federal policymakers have identified KT as an area for critical evaluation and outcome achievement. This presentation analyzes some of the issues raised by the notion of KT. First, it puts KT in the broader context of evaluating knowledge flow problems. Second, it introduces the KVM framework as an avenue for addressing the fundamental issues that KT raises for research-to-practice evaluation. Third, it illustrates the application of the framework with a KVM case study of accessible currency.

The Effectiveness of Mandatory-Random Student Drug Testing
Susanne James-Burdumy, Brian Goesling, and John Deke (Mathematica); and Eric Einspruch (RMC Research)

The Mandatory-Random Student Drug Testing (MRSDT) Impact Evaluation tested the effectiveness of MRSDT in 7 school districts and 36 high schools in the United States. The study is based on a rigorous experimental design in which schools were randomly assigned either to a treatment group that implemented MRSDT or to a control group that delayed implementation. To assess the effects of MRSDT on students, we administered student surveys at baseline and follow-up, collected school records data, conducted interviews with school and district staff, and collected data on drug test results. More than 4,000 students were included in the study. The presentation will focus on the study's findings after the MRSDT programs had been implemented for one school year.

Workshop: Integrating Systems Concepts with Mixed Research Methods to Evaluate Systems Change
Margaret Hargreaves, Heather Koball, and Todd Honeycutt (Mathematica)

Increased interest in large-scale social change has led to increased focus on the evaluation of system change initiatives. But foundations, government agencies, and social entrepreneurs face a wide array of competing approaches and methods for evaluating change, and no one method or approach is best. The quandary is how best to move forward: which method or methods to use, and why.

Through lecture, discussion, and small group activities, this workshop offers a pragmatic mixed methods approach for evaluating four basic types of systems: (1) systems with unknown dynamics; (2) networked systems; (3) nested, multi-level systems; and (4) systems with complex, non-linear dynamics. For each system type, the workshop suggests specific combinations of mixed research methods, demonstrating how they have been applied successfully in real-life evaluations. This workshop goes beyond basic lists of system concepts and evaluation approaches to provide useful, ready-made packages of mixed method evaluation designs for several common types of systems change initiatives.

You will learn:

  • How to identify specific system dynamics
  • How to design a system change evaluation
  • How to use mixed methods to evaluate networked system change
  • How to use mixed methods to evaluate multi-level, multi-sector system change
  • How to use mixed methods to evaluate non-linear, complex adaptive system change