Examining Multidisciplinary Research at the Intersection of Human and Machine Learning

by Alex Gurn and Camellia Sanford-Dolly

Rockman et al (REA) recently completed an external evaluation of the National Science Foundation–funded project, Situating Big Data: Assessing Game-Based STEM Learning in Context. The project was a collaboration among three research teams (two at the University of Wisconsin–Madison and one at Arizona State University) organized around a shared dataset focused on middle school youths’ experiences playing Virulent, an online game designed to teach systems biology concepts. The game puts users in the role of a virus so they can experience and understand how viruses operate and interact within the body. Situating Big Data sought to integrate theories of situated cognition with analytic techniques derived from the big data movement, collecting and analyzing multi-modal data generated through online and in-person interactions while youth played Virulent.

The purpose of the external evaluation was to examine the implementation of a collaborative research design and offer an assessment of the project’s methodological approaches and deliverables. To address these goals, REA used qualitative case study methods, including direct observations of the research team’s data collection processes, in-depth interviews with project stakeholders, and a review of articles, presentations, and other documentation generated by the project.

The research project directly responded to a current limitation of “big data” analytics as applied to game-based learning. Typically, learning researchers collect in-game analytics (or clickstream data) as their primary measure of the effectiveness of game-based experiences. Studies of learning through online games have not merged young people’s talk, social interactions, and pre/post content assessments with analyses of the clickstream “data exhaust” passively collected by the game system.

The Situating Big Data project explored how to curate a blended gaming environment that combined multiple qualitative and quantitative data streams, and how to make effective use of those data streams to better understand how youth learn through games.

REA concluded that while assembling and structuring the mixed dataset proved extremely challenging for the research teams, these efforts were ultimately successful. Much of the research activity focused on developing strategies to overcome methodological problems that arose in trying to capture and connect the diverse forms of information in a single dataset. Tracking and organizing individual youths’ data streams proved to be a complex and expensive process because of the quantity of data available and the amount of time required to validate the qualitative corpus.
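
To give a sense of what linking such streams can involve at a mechanical level, the sketch below is a minimal, hypothetical illustration rather than the project's actual pipeline: it joins simulated clickstream events, coded observations, and pre/post assessment scores on a shared participant identifier using pandas. All field names and values are invented for the example.

```python
import pandas as pd

# Hypothetical clickstream events logged by the game (invented data).
clickstream = pd.DataFrame({
    "participant_id": ["p01", "p01", "p02"],
    "timestamp": pd.to_datetime(
        ["2016-03-01 10:02", "2016-03-01 10:05", "2016-03-01 10:03"]
    ),
    "game_event": ["infect_cell", "evade_antibody", "infect_cell"],
})

# Hypothetical qualitative codes from in-person observations of the same youth.
observations = pd.DataFrame({
    "participant_id": ["p01", "p02"],
    "observed_talk": ["asks partner about antibodies", "narrates strategy aloud"],
})

# Hypothetical pre/post content assessment scores.
assessments = pd.DataFrame({
    "participant_id": ["p01", "p02"],
    "pre_score": [4, 6],
    "post_score": [7, 8],
})

# Join the streams on the shared participant key. In practice, aligning
# timestamps and validating the qualitative codes is the labor-intensive step.
merged = (
    clickstream
    .merge(observations, on="participant_id", how="left")
    .merge(assessments, on="participant_id", how="left")
)
print(merged)
```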

In addition, whereas strong collaborative relationships developed between two of the three teams involved in the project, one team did not play a direct role in certain key stages of the research process and decision-making. The geographic distribution of team members, a lack of regular communication, and other external factors limited the development of an in-depth collaboration across all parties. Project team members recognized the ongoing work and commitment needed to build and maintain active relationships, particularly when separated by institutional and disciplinary boundaries.

Based on the teams’ successes and challenges in collaborating, REA identified key actions that may support future interdisciplinary teams:

  • Identify specific stakeholders’ needs and a driving purpose for the collaboration,
  • Recognize and plan for inherent costs and risks of collaboration,
  • Leverage existing professional relationships,
  • Develop group norms and protocols,
  • Invest in relationship building, and
  • Assess progress, identify challenges, and celebrate successes.

To learn more about REA’s evaluation of the Situating Big Data project, please read our full evaluation report here.