Making machines see

4. Term Definition
5. Analyzing data to determine why something happened.
7. A human-centered approach to problem-solving emphasizing empathy and creativity.
13. Measures such as accuracy, recall, precision, and F1-score used to evaluate models.
15. Summarizing past data to understand what happened.
16. The portion of data used to evaluate the performance of a model.
18. A model too simple to capture patterns in the data, performing poorly on both training and unseen data.
19. Clearly defining the real-world problem before solving it with data science methods.
21. Proportion of true positive results among all predicted positives.
26. Exploring data patterns and anomalies through visualization and statistics.
27. An interdisciplinary field combining statistics, computer science, and domain knowledge to extract insights from data.
29. Constructing AI/ML models based on algorithms trained with data.
30. Specification of the type, format, and source of data needed for analysis.
31. Cleaning, transforming, and structuring raw data for analysis.
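
Worked example for clues 13 and 21: the evaluation measures named there can all be computed from four counts of a binary classifier's results. The sketch below is illustrative only. The function names and the sample counts (tp, fp, fn, tn) are assumptions made for this example and do not come from the worksheet.

```python
# Illustrative sketch: classification metrics from confusion-matrix counts.
# tp = true positives, fp = false positives,
# fn = false negatives, tn = true negatives (all hypothetical sample values).

def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Share of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0

def precision(tp: int, fp: int) -> float:
    """True positives among all predicted positives."""
    predicted_positives = tp + fp
    return tp / predicted_positives if predicted_positives else 0.0

def recall(tp: int, fn: int) -> float:
    """True positives among all actual positives."""
    actual_positives = tp + fn
    return tp / actual_positives if actual_positives else 0.0

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# Hypothetical results for 100 test examples.
tp, fp, fn, tn = 8, 2, 8, 82

p = precision(tp, fp)   # 8 / (8 + 2)
r = recall(tp, fn)      # 8 / (8 + 8)
print("accuracy :", accuracy(tp, fp, fn, tn))
print("precision:", p)
print("recall   :", r)
print("f1       :", round(f1_score(p, r), 4))
```

With these sample counts the accuracy is high (0.9) while recall is only 0.5, which is one reason a single measure is rarely enough to judge a model.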

1. Choosing the most suitable algorithm for a problem.
2. Suggesting actions based on data predictions.
3. Proportion of true positive results among all actual positives.
6. Gathering data from structured, semi-structured, and unstructured sources.
8. Sequential data points collected at fixed time intervals.
9. Cycle of gathering user/system feedback to improve models.
10. R² value measuring how well predictions approximate real outcomes.
11. Using models to forecast future outcomes.
12. Creating new features from existing data to improve models.
14. Square root of MSE, giving the typical size of prediction errors in the target's original units.
17. Harmonic mean of precision and recall to balance both metrics.
20. A model that performs well on training data but poorly on unseen data.
22. Regression metric measuring the average of squared prediction errors.
23. The portion of data used to fit a machine learning model.
24. The chosen strategy for analyzing data: descriptive, diagnostic, predictive, or prescriptive.
25. Integrating the trained model into production systems for real-world use.
28. A technique for testing model reliability by splitting data into folds and rotating which fold is held out for evaluation.
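
Worked example for clues 10, 14, 22, and 28: a minimal sketch of the regression measures and of splitting data into folds, written in plain Python. The function names, the sample values, and the choice of contiguous, unshuffled folds are assumptions made for illustration. Real projects often shuffle the data first or rely on a library implementation.

```python
import math

def mse(y_true, y_pred):
    """Average of squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Square root of MSE, in the same units as the target."""
    return math.sqrt(mse(y_true, y_pred))

def r_squared(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares).

    Assumes y_true is not constant, otherwise the total sum is zero.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Hypothetical actual values and predictions.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.0, 5.0, 8.0, 9.0]

print("MSE :", mse(y_true, y_pred))
print("RMSE:", round(rmse(y_true, y_pred), 4))
print("R^2 :", round(r_squared(y_true, y_pred), 4))

# Each sample index lands in the test fold exactly once.
for train, test in k_fold_indices(5, 2):
    print("train:", train, "test:", test)
```

In each round the model would be fitted on the train indices and scored on the test indices, and the scores are then averaged across folds.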