Across
- 1. Stores redundant copies of data across clusters of commodity servers
- 2. Reads log files into Hadoop ("Apache" Log Ride)
- 6. Big Data infrastructure for distributing parallel processing jobs and managing job completion
- 7. Stores schema with data to pass to various programming languages
- 9. Popular Hadoop query tool focused on parallel processing of large data sets
- 10. Manages jobs across large clusters and HBase configurations
Down
- 1. NoSQL Database for random, realtime read/write access to Big Data via Hadoop and HDFS
- 3. Strategy for distributing parallel jobs by grouping data into sets for aggregation
- 4. Reads RDBMS data into Hadoop: SQ(L to Had)oop
- 5. Machine learning library for classifying data
- 6. Query tool for accessing HDFS data using a SQL-like language
- 8. Schedules and prioritizes Hadoop Batch Jobs
- 11. Buzzword for Download, Modify, and Save or optionally Import, Clean, and Export
