Search Results
Data & AI
Recognize that data “fuels” AI, that AI can be compared to a function machine (math), an algorithm (CS), or a prediction model (statistics) that relies on data to both operate and improve itself, and that AI tools can also be used to analyze complex data in research.
Apply context
Recognize that the context surrounding the data and the investigation shapes interpretation. Many fields (biology vs. psychology; economics vs. sociology) have created very different frameworks to organize problems. Considering multiple approaches may reveal useful insights from the same data.
Frequency tables
Organize data into frequency tables based on shared characteristics. Summarize data using counts, fractions, relative frequencies, or proportions to enable comparisons and generalizations. Understand the implications of choices made when creating and interpreting frequency tables.
Comparing variability
Examine differences between groups by analyzing measures of spread, such as range and standard deviation. Utilize visualizations like box plots and apply statistical methods, including mean, median, and standard deviation, to compare datasets, assess variability, and uncover patterns in data distributions and models.
Explaining significance
Clearly describe the basic logic of statistical significance to others, differentiating between significance, the size of an effect, and the statistical power of an analysis. Recognize what statistical significance can and cannot reveal about a phenomenon.
Sense-making with visualizations
Practice creating visualizations to summarize many things at once, relationships between things in one place, or exceedingly complex ideas in one place. Recognize that visuals can be more efficient or compelling than other forms of communication.
All standards below apply to grades 11-12.
Recognize that multiple types of data can provide valuable insights into the same inquiry.
Explore the origins of some standardized unit measurements (e.g., horsepower, mole, scores on AP exams).
Identify the risks and tradeoffs of using traditional measurements (e.g., IQ, BMI).
Explore different types of variability for making inferences (e.g., confidence intervals, various tests, classification models).
Design and compare alternative data representations, justifying choices to address inherent uncertainty.
Identify and label a simple prediction algorithm or equation for a very basic AI prediction model. e.g., a basic model’s equation might be Prediction = (weight₁ × input₁) + (weight₂ × input₂); a college admissions model might weight GPA (input₁) and test scores (input₂) to output an acceptance likelihood, and training involves automatically adjusting the weights to match historical data.
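A minimal sketch of such a model in Python; the weights, the input scaling, and the admissions framing are illustrative assumptions, not a real admissions model.

```python
# Hypothetical two-input model: Prediction = (w1 * input1) + (w2 * input2).
# The weights are made-up stand-ins for values learned from historical data.

def predict(gpa: float, test_score: float) -> float:
    w1, w2 = 0.6, 0.4               # illustrative learned weights
    gpa_scaled = gpa / 4.0          # put both inputs on a 0-1 scale
    test_scaled = test_score / 1600
    return w1 * gpa_scaled + w2 * test_scaled   # acceptance likelihood (0-1)

print(predict(gpa=3.7, test_score=1400))        # ≈ 0.905
```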
Understand that algorithms use cost functions to measure errors and adjust predictions.
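A sketch of one common cost function, mean squared error, with invented numbers; a training loop adjusts the weights to push this value down.

```python
# Mean squared error: the average of squared gaps between predictions
# and actual outcomes; smaller values mean a better fit.
def mse(predictions, actuals):
    return sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)

preds   = [0.90, 0.40, 0.75]
actuals = [1.0,  0.0,  1.0]        # invented "ground truth" outcomes
print(mse(preds, actuals))         # ≈ 0.0775
```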
Recognize that some AI tools can be used to explore complex data with many variables.
Recognize the types of problems that are ideal for using an AI tool to analyze complex data.
Recognize that data risk can change based on time, circumstance, and purpose.
Identify data benefits that can appear well into the future and in unexpected ways.
Propose multiple perspectives on data to mitigate inherent biases.
Understand the difference between implicit and explicit bias.
Use data to support arguments, design solutions, or challenge inequities.
Investigate case studies where data advanced scientific, economic, or social progress.
Identify when data alone is insufficient and complementary methods are needed. e.g., data may quantify the number of people affected by a policy, but personal testimonies are needed to illustrate its human impact.
Conduct independent investigations to inform decisions, leveraging advanced tools and addressing uncertainty.
Compare investigative approaches across fields to critique strengths and limitations.
Propose new approaches for leveraging the investigative process to strengthen inferences and arguments.
Critically evaluate and update inferences as data scales or methods advance.
Interpret data drawn from different fields and topics based on accepted norms within those fields.
Compare multiple problem-solving approaches, and identify how those differences may compound over time and when repeated.
Establish accountability by basing claims and decisions on relevant data.
Explore career fields and their intersection with data collection, curation, storytelling, and societal impact.
Develop comprehensive data validation procedures, including automated checks.
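One possible shape for such automated checks, sketched with pandas; the columns and rules here are hypothetical and would be adapted to a real schema.

```python
import pandas as pd

# Invented dataset with deliberate problems, to show the checks firing.
df = pd.DataFrame({"age": [17, 18, -3], "gpa": [3.2, 4.7, 3.9]})

checks = {
    "age in plausible range (0-120)": df["age"].between(0, 120).all(),
    "gpa on a 0-4 scale":             df["gpa"].between(0.0, 4.0).all(),
    "no missing values":              df.notna().all().all(),
}
for name, passed in checks.items():
    print(f"{'PASS' if passed else 'FAIL'}: {name}")
```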
Implement verification protocols for complex datasets with multiple dependencies.
Develop and implement data organization systems that accommodate both structured and unstructured data types.
Create scalable data organization strategies that maintain data integrity while handling missing values, irregular structures, and evolving data requirements.
Design and implement metadata documentation systems to track data lineage, transformations, and organizational structures.
Use an identifying variable (e.g., index, case ID) to merge two separate datasets that contain the same observations but different variables.
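A sketch of this kind of merge in pandas; the two datasets and the student_id identifier are invented.

```python
import pandas as pd

# Same observations (students 1-3), different variables in each dataset.
grades     = pd.DataFrame({"student_id": [1, 2, 3], "gpa": [3.1, 3.8, 2.9]})
attendance = pd.DataFrame({"student_id": [1, 2, 3], "days_absent": [4, 0, 7]})

# Merging on the shared identifier lines up each student's variables.
merged = grades.merge(attendance, on="student_id")
print(merged)
```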
Use appropriate procedures to join two datasets that have different observations but the same variables measured.
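A sketch of the companion operation, stacking datasets that measure the same variables on different observations; the data are again invented.

```python
import pandas as pd

# Same variables, different observations (e.g., one survey in two schools).
school_a = pd.DataFrame({"student_id": [1, 2], "score": [88, 92]})
school_b = pd.DataFrame({"student_id": [3, 4], "score": [75, 81]})

# Appending rows; ignore_index renumbers so row labels do not collide.
combined = pd.concat([school_a, school_b], ignore_index=True)
print(combined)
```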
Use datasets with derived variables, based on other variables in the dataset.
Construct data-based questions that address complex systems with multiple interacting variables, including consideration of confounding factors and effect modifiers.
Design research questions that incorporate multiple levels of analysis and account for both direct and indirect relationships between variables.
Formulate questions that address the validity and reliability of data collection methods, including considerations of systematic bias and measurement error.
Design simulations (e.g., using an RNG or computer software) and underlying models to generate data specific to a problem of interest.
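A minimal sketch of such a simulation; the dice question is a stand-in for whatever problem the model targets.

```python
import random

# Estimate P(at least one six in four rolls) with a random number generator.
random.seed(1)                # fixed seed makes the run reproducible
trials = 100_000
hits = sum(
    any(random.randint(1, 6) == 6 for _ in range(4))
    for _ in range(trials)
)
print(hits / trials)          # close to the exact value 1 - (5/6)**4 ≈ 0.5177
```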
Identify optimal sensors or automated data collection methods for answering a data-based question or designing an experiment.
Distinguish between surveys, observational studies, and experiments.
Recognize how concerns about data privacy and human subjects may affect the collection and distribution of data.
Identify (and know you can request access to) non-publicly available datasets by contacting researchers, reading scientific literature, or communicating with public officials.
Develop strategies for finding and accessing datasets that require special permissions, logins, or formal data requests.
Evaluate and navigate licensing and citation requirements when using secondary data sources for research.
Combine multiple secondary datasets to create more comprehensive or useful data for specific investigations.
Evaluate and critique measurement validity, reliability, and bias in data collection methods, and design comprehensive datafication strategies that address ethical considerations and potential sources of measurement error.
Work with data collected over time and consider how to aggregate appropriately.
Work with data collected over space and consider how to aggregate appropriately.
Create strategies for dealing with data that is constantly updated.
Develop data collection protocols that prevent bias, protect privacy, and ensure ethical representation across diverse populations.
Apply validation techniques to prevent bias and ensure ethical use of secondary data, including AI tools.
Apply advanced data cleaning techniques to handle complex data quality issues such as outliers, inconsistencies, and systematic errors.
Develop and document reproducible data cleaning workflows that maintain data integrity.
Evaluate and validate cleaned datasets using statistical methods and domain knowledge.
Create and use expected value models to support data-based decision making.
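A minimal sketch of an expected value model; the game and payoffs are invented.

```python
# Invented decision: pay $5 to play; win $20 with probability 0.2.
outcomes = [(20 - 5, 0.2), (-5, 0.8)]      # (net payoff, probability)
ev = sum(payoff * prob for payoff, prob in outcomes)
print(round(ev, 2))   # -1.0: the player loses $1 per play on average
```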
Work with multiple datasets that contain different types of data, and combine and transform those types.
Work with complex derived variables and understand their calculation methods.
Work with very large datasets containing many thousands of observations.
Use selection, sampling, and transformation tools to navigate very large datasets.
Design and implement data structures that can accommodate longitudinal data and multiple levels of aggregation.
Handle data aggregation across different observation structures and time scales.
Create flexible organizational systems that can handle both structured and unstructured data sources.
Develop documentation systems for complex data structures that track relationships and dependencies between variables.
Explore the sensitivity of the mean to outliers compared to the median.
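A quick demonstration of that sensitivity, with invented data:

```python
from statistics import mean, median

data = [12, 13, 14, 15, 16]
with_outlier = data + [90]

print(mean(data), median(data))                  # 14 14
print(mean(with_outlier), median(with_outlier))  # mean jumps to ~26.7; median only 14.5
```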
Discuss when to use the mean or the median based on the context and data distribution (e.g., skewed vs. symmetric distributions).
Numerically operationalize the meaning of an "outlier" using standard deviation as a measure of variability and a modified boxplot.
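A sketch of both operationalizations using Python's statistics module; the sample and the two-standard-deviation cutoff are illustrative choices.

```python
from statistics import mean, stdev, quantiles

data = [12, 13, 14, 15, 16, 90]

# Modified-boxplot rule: flag points beyond 1.5 * IQR outside the quartiles.
q1, _, q3 = quantiles(data, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
print([x for x in data if x < low or x > high])   # [90]

# Standard-deviation rule: flag points more than 2 SDs from the mean.
m, s = mean(data), stdev(data)
print([x for x in data if abs(x - m) > 2 * s])    # [90]
```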
Explain how the shape of a distribution influences the relationship between measures of center. e.g., in a symmetric distribution the mean and median are close; in a right-skewed distribution the mean is greater than the median; in a left-skewed distribution the mean is less than the median.
Discuss implications of choices made when generating a frequency table.
Describe how missing data affects analysis and resulting relationships, patterns, or models.
Reasonably ideate on some potential modeling approaches when given the metadata (e.g., date and time, text, continuous, geolocation) for a dataset.
Use simulations to investigate associations between two categorical variables and to compare groups.
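One way to do this is a permutation (label-shuffling) simulation, sketched here with invented success/failure data; the specific design is an illustrative choice.

```python
import random

# Invented outcomes: 1 = success, 0 = failure, in two groups of 30.
treatment = [1] * 18 + [0] * 12
control   = [1] * 11 + [0] * 19
observed = sum(treatment) / 30 - sum(control) / 30   # observed difference

# Shuffle group labels to simulate a world with no association.
random.seed(0)
pooled = treatment + control
reps, extreme = 10_000, 0
for _ in range(reps):
    random.shuffle(pooled)
    diff = sum(pooled[:30]) / 30 - sum(pooled[30:]) / 30
    if diff >= observed:
        extreme += 1
print(observed, extreme / reps)   # p-value: share of shuffles at least as extreme
```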
Use variability in distributions to engage in statistical reasoning.
Understand and interpret variability in sampling distributions and how it impacts population estimates.
Conduct linear regression analysis to find the line of best fit.
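A minimal sketch with NumPy's least-squares fit; the hours-versus-score data are invented.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6])            # hours studied (invented)
y = np.array([52, 60, 61, 70, 74, 81])      # exam scores (invented)

slope, intercept = np.polyfit(x, y, deg=1)  # least-squares line of best fit
print(f"score ≈ {slope:.2f} * hours + {intercept:.2f}")   # ≈ 5.60x + 46.73
```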
Construct prediction intervals and confidence intervals to determine plausible values of a predicted observation or a population characteristic.
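One approach is a bootstrap percentile interval, sketched below with invented data; classical t-based intervals are an equally valid route.

```python
import random
from statistics import mean

random.seed(0)
sample = [4.1, 5.3, 6.0, 4.8, 5.5, 6.2, 5.0, 4.6, 5.8, 5.1]

# Resample with replacement many times; the middle 95% of the resampled
# means is a range of plausible values for the population mean.
boot_means = sorted(
    mean(random.choices(sample, k=len(sample))) for _ in range(10_000)
)
low, high = boot_means[250], boot_means[9750]   # 2.5th and 97.5th percentiles
print(f"95% bootstrap CI for the mean: ({low:.2f}, {high:.2f})")
```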
Generate a word cloud of a given text after standardizing (e.g., all lower case), stemming, and removing stop words.
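A sketch of that preprocessing; the crude suffix-stripping below stands in for a real stemmer, and a word-cloud tool would size words by the resulting counts.

```python
from collections import Counter

text = "The models were trained and the trained models predicted well."
stop_words = {"the", "and", "were", "well"}

def crude_stem(word: str) -> str:
    # Toy stand-in for a real stemmer: strip a few common suffixes.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

tokens = [w.strip(".,") for w in text.lower().split()]   # standardize case
counts = Counter(crude_stem(w) for w in tokens if w not in stop_words)
print(counts.most_common())   # [('model', 2), ('train', 2), ('predict', 1)]
```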
Explore how gradient descent optimizes loss functions and powers machine learning applications like neural networks.
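A minimal sketch of gradient descent on a one-parameter loss; the same update rule, applied to many weights at once, trains neural networks.

```python
# Minimize L(w) = (w - 3)**2, whose gradient is 2 * (w - 3).
w = 0.0
learning_rate = 0.1
for step in range(50):
    grad = 2 * (w - 3)           # slope of the loss at the current w
    w -= learning_rate * grad    # step downhill, against the gradient
print(w)                         # ≈ 2.99996, approaching the minimum at 3
```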
Apply statistical or simulation methods to model variability to explore uncertainty in real-world situations.
Explore variability through statistical methods, such as analyzing residuals or variance in linear models.
Estimate and describe errors between predictions and actual outcomes. e.g., residuals, misclassification rates
Analyze error patterns to assess model performance. e.g., residual plot, confusion matrix
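A sketch of both kinds of error analysis with invented predictions: residuals for a regression, and a misclassification rate plus confusion matrix for a classifier.

```python
# Regression: residual = actual - predicted; inspect the signs for patterns.
actual    = [10, 12, 15, 11]
predicted = [ 9, 13, 14, 14]
print([a - p for a, p in zip(actual, predicted)])   # [1, -1, 1, -3]

# Classification: misclassification rate and a 2x2 confusion matrix.
true_labels = [1, 0, 1, 1, 0, 0]
pred_labels = [1, 0, 0, 1, 1, 0]
errors = sum(t != p for t, p in zip(true_labels, pred_labels))
print(errors / len(true_labels))                    # 2/6 ≈ 0.33

matrix = {(t, p): 0 for t in (0, 1) for p in (0, 1)}
for t, p in zip(true_labels, pred_labels):
    matrix[(t, p)] += 1
print(matrix)   # keys are (actual, predicted); (1, 0) counts missed positives
```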
Use insights from error analysis to improve the model. e.g., in linear regression, add a variable or use a curve; in classification, balance the groups or adjust the cutoff
Appreciate that many AI tools are pre-trained with large quantities of data so that inferences can be drawn on smaller sample sizes.
Create models to perform simulations using a digital tool.
Perform data analysis using a digital tool.
Critique the societal effect of AI by exploring issues surrounding bias, accountability, and transparency in decision-making using AI tools, as well as the effects on privacy, jobs, and policy.
Identify differences between no-code, low-code, and high-code digital tools.
Select multiple digital tools suited for different tasks throughout the data investigation process.
Describe how digital tools are used in the workforce.
Recognize how computer code can be used to produce reproducible data analysis processes.
Recognize the advantages and limitations of using computer code compared to no-code or low-code tools.
Design data visualizations that include accessible features such as alt-text and text descriptions.
Examine how policies, limitations, and technological advancements impact the development of accessible digital tools.
Discern that different models, such as decision trees and neural networks, analyze patterns and relationships in data to make predictions.
Assess relationships in the context of uncertainty, bias, and reliability of the data.
Investigate how assumptions and bias influence a model's results.
Develop models that incorporate multiple variables and explicitly consider interactions between them.
Use computational methods, coding, or machine learning techniques to build and refine models.
Assess assumptions, limitations, and biases in models to evaluate their impact on predictions in real-world scenarios.
Clearly state the result or finding and indicate the level of certainty regarding the statistical analysis and the quality of the evidence (e.g., dataset or source characteristics, similar findings in alternative data) as justification.
Summarize previous assumptions and potential updates in written conclusions from a data analysis, and identify any known contradictory findings to mitigate confirmation bias (psychology).
Describe Bayes' Theorem by explaining how it relates conditional probabilities: the prior probability of an event, the probability of the evidence given that event, and the probability of the evidence itself combine to give the probability of the event given the evidence.
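A worked example of the theorem with hypothetical medical-test numbers, chosen only for illustration.

```python
# Bayes' Theorem: P(event | evidence) =
#     P(evidence | event) * P(event) / P(evidence)
p_disease = 0.01              # prior: 1% of people have the condition
p_pos_given_disease = 0.95    # test sensitivity (invented)
p_pos_given_healthy = 0.05    # false-positive rate (invented)

# P(evidence): total probability of a positive test.
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

posterior = p_pos_given_disease * p_disease / p_pos
print(posterior)   # ≈ 0.16: a positive test lifts the 1% prior to about 16%
```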
