BIG DATA

Across
  1. (of data) the range of structured, unstructured, and semi-structured data gathered from multiple sources
  3. a query language in Apache Hive for processing and analyzing structured data
  7. (data) data that were collected in the past, usually for a purpose other than research
  9. provides a complete record of the information resources maintained by an organisation
  11. cloud-based software tools for working with data, such as managing data in a data warehouse or analyzing data with business intelligence
  13. the delay before a transfer of data begins following an instruction for its transfer
  14. (computing) a system for connecting a large number of computer nodes into a distributed architecture that delivers the compute resources necessary to solve complex problems
  15. the proportion of visitors to a web page who follow a hypertext link to a particular site
  17. a process that allocates system resources to control the execution of unattended background programs
  19. (analytics) the use of data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data
  21. (processing) the technique of linking multiple computer servers over a network into a cluster to share data and coordinate processing power
  24. (processing) a method of running high-volume, repetitive data jobs
Down
  1. (of data) the speed at which data enters a system and must be processed
  2. (hardware) a device or device component that is relatively inexpensive, widely available, and more or less interchangeable with other hardware of its type
  4. the resource management and job scheduling technology in the open-source Hadoop distributed processing framework
  5. a set of computers that work together so that they can be viewed as a single system
  6. (database) a database that arranges data elements in vertical columns and horizontal rows
  8. (data) data that fits a predefined model or format
  10. a distributed file system that handles large data sets running on commodity hardware
  12. a database language designed for the retrieval and management of data in a relational database
  16. (data) information that either does not have a predefined data model or is not organized in a predefined manner
  18. (of data) how reliable and significant the data really is
  20. (database) a type of database that stores and provides access to data points that are related to one another
  21. (data) data that are essentially unlike, or distinctly different in kind, quality, or character, and so cannot be readily integrated to meet business information demands
  22. (also cleansing) the process of fixing incorrect, incomplete, duplicate, or otherwise erroneous data in a data set
  23. a collection of open-source software utilities that facilitates using a network of many computers to solve problems involving massive amounts of data and computation