Data Science Methodology
Across
- 2. The process of clearly defining the real-world problem before attempting to solve it with data science methods.
- 4. A regression metric measuring the average of squared differences between predictions and actual values.
- 5. A system for collecting responses after deployment to improve models iteratively.
- 7. The process of creating AI/ML models using algorithms trained on data.
- 9. A portion of the dataset used to evaluate the model's performance after training.
- 10. When a model fits noise in the training data instead of the underlying signal, so it performs poorly on unseen data.
- 15. The process of gathering raw data from structured, semi-structured, and unstructured sources.
- 17. An interdisciplinary field that combines statistics, computer science, and domain knowledge to extract insights from data.
- 22. The proportion of true positive predictions among all positive predictions.
- 24. The square root of MSE, expressing typical prediction error in the same units as the target variable.
- 25. A sequence of data points collected or recorded at successive time intervals.
- 26. Choosing the appropriate algorithm based on problem type and data characteristics.
- 27. The step where raw data is cleaned, transformed, and prepared for modeling.
- 28. When a model is too simple to capture the underlying patterns in the data.
- 29. Analytics used to identify reasons why something happened.
- 30. A portion of the dataset used to train the model.
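
The arithmetic behind clues 4, 24, and 22 can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical and describe the computation rather than spell out the answers, the sample values are made up, and class labels are assumed to be 0/1.

```python
import math

def mean_squared_difference(actual, predicted):
    """Clue 4: average of squared differences between predictions and actual values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_difference(actual, predicted):
    """Clue 24: square root of the clue 4 quantity, in the units of the target."""
    return math.sqrt(mean_squared_difference(actual, predicted))

def true_positive_share_of_predicted(actual, predicted):
    """Clue 22: true positives divided by all positive predictions (labels assumed 0/1)."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    predicted_pos = sum(1 for p in predicted if p == 1)
    # Return 0.0 when nothing was predicted positive, to avoid dividing by zero.
    return true_pos / predicted_pos if predicted_pos else 0.0

# Made-up values, for illustration only.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(mean_squared_difference(y_true, y_pred))       # (1 + 0 + 4) / 3
print(root_mean_squared_difference(y_true, y_pred))  # square root of the line above

labels_true = [1, 0, 1, 1, 0]
labels_pred = [1, 1, 1, 0, 0]
print(true_positive_share_of_predicted(labels_true, labels_pred))  # 2 of 3 positive predictions are correct
```

Squaring the differences weights large errors more heavily than small ones, which is why the clue 24 quantity is usually at least as large as a plain average of absolute errors.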
Down
- 1. The process of integrating a validated model into production for real-world use.
- 3. Analytics that use models and machine learning to predict future outcomes.
- 6. The step involving exploratory data analysis (EDA) to identify patterns and anomalies in data.
- 8. The proportion of true positive predictions among all actual positives.
- 11. Also known as R², it measures how well regression predictions approximate real values.
- 12. Creating new input features from existing data to improve model performance.
- 13. The method chosen to analyze data, including descriptive, diagnostic, predictive, and prescriptive analytics.
- 14. The harmonic mean of precision and recall, balancing the two measures.
- 16. Analytics that suggest actions or strategies based on predictions and outcomes.
- 18. A statistical technique for assessing model performance using multiple train-test splits.
- 19. Quantitative measures used to evaluate the accuracy and effectiveness of models.
- 20. A human-centered iterative approach to problem-solving that emphasizes empathy, ideation, and experimentation.
- 21. Specification of the content, format, and sources of data needed for analysis.
- 23. Analytics focused on summarizing past data to understand what happened.
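
The measures in clues 8, 14, 11, and 18 can be sketched the same way. This is a hedged, minimal example in plain Python: the function names are hypothetical, the sample values are made up, labels are assumed to be 0/1, and the splitting helper uses contiguous folds without shuffling, which is only one of several common ways to build the splits.

```python
def true_positive_share_of_actual(actual, predicted):
    """Clue 8: true positives divided by all actual positives (labels assumed 0/1)."""
    true_pos = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    actual_pos = sum(1 for a in actual if a == 1)
    return true_pos / actual_pos if actual_pos else 0.0

def harmonic_mean(x, y):
    """Clue 14: harmonic mean of two ratios; 0.0 when both are zero."""
    return 2 * x * y / (x + y) if (x + y) else 0.0

def r_squared(actual, predicted):
    """Clue 11: 1 - (residual sum of squares / total sum of squares).

    Assumes the actual values are not all identical, otherwise the total
    sum of squares is zero and the ratio is undefined.
    """
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

def rotating_train_test_splits(n_samples, n_folds):
    """Clue 18: yield (train_indices, test_indices), one pair per fold."""
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop

# Made-up values, for illustration only.
labels_true = [1, 0, 1, 1, 0, 1]
labels_pred = [1, 1, 1, 0, 0, 0]
share_of_actual = true_positive_share_of_actual(labels_true, labels_pred)  # 2 of 4 actual positives found
share_of_predicted = 2 / 3  # 2 of the 3 positive predictions are correct
print(harmonic_mean(share_of_predicted, share_of_actual))

print(r_squared([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))  # 1 - 5/8

for train, test in rotating_train_test_splits(5, 2):
    print(train, test)
```

The harmonic mean is pulled toward the smaller of the two ratios, so a model cannot score well on clue 14 by doing well on only one of them.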