Big Data Applications, Apache Spark, MapReduce Workflow, Pig Latin Parser, HiveQL Data Definition
Across
- 3. Framework widely used for distributed storage and processing of big data.
- 4. The phase in MapReduce that combines and aggregates mapper output.
- 6. Data warehouse infrastructure that uses SQL-like queries for big data.
- 9. Pig component responsible for syntax checking and producing a logical plan.
- 11. Custom function written by users to extend Hive or Pig functionality.
- 12. Fundamental data structure in Apache Spark representing immutable distributed collections.
- 14. The first phase of MapReduce responsible for processing and filtering data.
Down
- 1. A complete unit of execution in MapReduce, submitted by a user.
- 2. Fast in-memory data processing engine used as an alternative to MapReduce.
- 5. The Spark program component that schedules tasks and maintains metadata.
- 7. The Spark component responsible for executing tasks assigned by the driver.
- 8. Resource management layer used by Hadoop to run applications like Spark.
- 10. Structure that defines column names, types, and metadata in Hive tables.
- 13. Category of HiveQL commands used to create, alter, and drop tables.
- 15. High-level platform used for analyzing large datasets with a scripting language.
