Recommendation 4
Rob Gould, Robin Lock, Ambika Silva (chair), Jennifer Broatch, Jamie Perrett
Integrate real data with a context and purpose throughout the course. Select data that are meaningful and engaging to the students.
Real data refers to datasets from genuine, practical situations, showing the complexity and variety of data we encounter daily. Integrating real data into a statistics course bridges the gap between abstract concepts and practical applications, making learning more engaging and relevant. By selecting datasets related to topics students care about—such as social issues, sports, health, or the environment—students can explore meaningful questions while developing statistical skills (Willett and Singer 1992).
What makes data “real”? To some extent, this depends on the instructional purpose of the data. Students need to engage in authentic problem-solving and learn to extract meaning from data. This can, in part, be achieved by providing clear context for each dataset, including its origins and purpose. Students also need experience working with data with different variable types, and with different sources such as survey data, experimental results, observational data, or business metrics. Real data sets can provide authentic opportunities for students to connect their analysis to real-world problem-solving. This approach encourages critical thinking, creativity, and engagement, preparing students to use statistics as a tool for understanding complex, real-world challenges that also relate to their personal pursuits, past experiences, and cultural traditions (Louie et al. 2021).
Using real data serves both curricular and pedagogical goals. Curricularly, it emphasizes problem-solving over rote application or theoretical study, ensuring students learn to evaluate statistical claims, pose investigative questions, and connect statistics to their field of study. Pedagogically, working with real or realistic data exposes students to challenges they will encounter in authentic analyses, such as missing values or non-standard distributions. Realistic data can be simulated datasets designed to mimic patterns and challenges of authentic data, providing controlled yet realistic learning opportunities. While simplified and/or simulated datasets can be useful for focused teaching moments, the integration of real data provides the depth and complexity needed to prepare students for future work in data analysis. Additionally, real data can enhance student engagement, as it provides opportunities for students to examine contexts that they’re interested in (Neumann, Hood, and Neumann 2013).
To access real data, instructors can explore online repositories, textbook datasets, or classroom-generated data from activities or surveys. Many modern textbooks include relevant, well-formatted datasets, and collaborative efforts with colleagues across disciplines can provide additional sources. Data sources and webpages specifically geared toward education can also simplify the process of finding and using meaningful datasets. By integrating these resources, educators can help students build statistical expertise while solving real-world problems. Integrating real data enables students to develop critical thinking and problem-solving skills while applying statistical methods in meaningful, applied contexts.
Real data may include some or all of these key characteristics:
Authenticity: Collected for actual purposes, such as research, decision-making, or operational processes, rather than being artificially created solely for instructional use. Authenticity can be enhanced by asking meaningful investigative questions, such as those questions for which the data were initially collected.
Relevance: Connected to topics or issues that are meaningful and relatable for students, such as public health, climate change, sports analytics, social trends, or themselves.
Complexity (when necessary): May include features commonly encountered in practical analysis, such as missing values, outliers, non-linear relationships, and varied distributions.
Contextual Depth: Accompanied by clear explanations of its origins, purpose, and the questions it seeks to address, helping learners understand its significance and practical application.
Additional Resources
Annotated bibliography
Assessments