Hadoop Ecosystem

Across
  1. Stores redundant copies of data across clusters of commodity servers
  2. Reads log files into Hadoop ("Apache" Log Ride)
  6. Big Data infrastructure for distributing parallel processing jobs and managing job completion
  7. Stores schema with data to pass to various programming languages
  9. Popular Hadoop query tool focused on parallel processing of large data sets
  10. Manages jobs across large clusters and HBase configurations
Down
  1. NoSQL database for random, real-time read/write access to Big Data via Hadoop and HDFS
  3. Strategy for distributing parallel jobs by grouping data into sets for aggregation
  4. Reads RDBMS data into Hadoop ("SQ(L to Had)oop")
  5. Machine learning library used to classify data
  6. Query tool for accessing HDFS data using an SQL-like language
  8. Schedules and prioritizes Hadoop batch jobs
  11. Buzzword for Download, Modify, and Save, or optionally Import, Clean, and Export