Data Science Methodology

Across
  2. The process of clearly defining the real-world problem before attempting to solve it with data science methods.
  4. A regression metric measuring the average of squared differences between predictions and actual values.
  5. A system for collecting responses after deployment to improve models iteratively.
  7. The process of creating AI/ML models using algorithms trained on data.
  9. A portion of the dataset used to evaluate the model's performance after training.
  10. When a model learns noise instead of signal, performing poorly on unseen data.
  15. The process of gathering raw data from structured, semi-structured, and unstructured sources.
  17. An interdisciplinary field that combines statistics, computer science, and domain knowledge to extract insights from data.
  22. The proportion of true positive predictions among all positive predictions.
  24. The square root of MSE, representing average prediction error magnitude.
  25. A sequence of data points collected or recorded at successive time intervals.
  26. Choosing the appropriate algorithm based on problem type and data characteristics.
  27. The step where raw data is cleaned, transformed, and prepared for modeling.
  28. When a model is too simple to capture underlying patterns in data.
  29. Analytics used to identify reasons why something happened.
  30. A portion of the dataset used to train the model.
Down
  1. The process of integrating a validated model into production for real-world use.
  3. Analytics that use models and machine learning to predict future outcomes.
  6. The step involving exploratory data analysis (EDA) to identify patterns and anomalies in data.
  8. The proportion of true positive predictions among all actual positives.
  11. Also known as R², it measures how well regression predictions approximate real values.
  12. Creating new input features from existing data to improve model performance.
  13. The method chosen to analyze data, including descriptive, diagnostic, predictive, and prescriptive analytics.
  14. The harmonic mean of precision and recall, balancing the two measures.
  16. Analytics that suggest actions or strategies based on predictions and outcomes.
  18. A statistical technique for assessing model performance using multiple train-test splits.
  19. Quantitative measures used to evaluate the accuracy and effectiveness of models.
  20. A human-centered iterative approach to problem-solving that emphasizes empathy, ideation, and experimentation.
  21. Specification of the content, format, and sources of data needed for analysis.
  23. Analytics focused on summarizing past data to understand what happened.