Data engineering

Across
  2. A distributed computing framework that allows for the processing of large datasets across clusters of computers, often using a MapReduce paradigm.
  3. A system for managing and processing streaming data.
  6. A method of data storage that allows for the efficient querying of large datasets by organizing data into a multi-dimensional cube.
  7. The process of executing a DAG on a specific schedule.
  9. A process of evaluating the performance of a machine learning model.
  10. A feature that allows tasks to run in parallel in Airflow.
Down
  1. A framework for building and managing data pipelines that emphasizes automation, monitoring, and collaboration among data teams.
  4. A tool for orchestrating data workflows.
  5. A collection of tasks in Airflow that are executed in a specific order.
  8. The process of cleaning and organizing raw data.
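
Several clues above touch on Airflow concepts: a DAG of tasks, running it on a schedule, and tasks executing in parallel. The sketch below shows how these fit together in code; it assumes a recent Airflow installation (2.4 or later), and the DAG id, task ids, and Python callables are hypothetical placeholders, not part of the puzzle.

from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders():
    # Placeholder: pull raw order data from a hypothetical source.
    print("extracting orders")


def extract_users():
    # Placeholder: pull raw user data from a hypothetical source.
    print("extracting users")


def combine():
    # Placeholder: join the two extracts into one dataset.
    print("combining datasets")


with DAG(
    dag_id="example_parallel_pipeline",  # hypothetical DAG name
    schedule="@daily",                   # the DAG is executed on a daily schedule
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    orders = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    users = PythonOperator(task_id="extract_users", python_callable=extract_users)
    combined = PythonOperator(task_id="combine", python_callable=combine)

    # extract_orders and extract_users have no dependency on each other,
    # so the scheduler may run them in parallel; combine waits for both.
    [orders, users] >> combined

Because the two extract tasks are independent in the DAG, Airflow is free to schedule them concurrently, while the dependency arrows guarantee the combine step only runs once both have finished.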