Data Engineering

Across

3. A data schema in which one central fact table is linked to multiple related dimension tables.
5. A type of database that stores and retrieves data without requiring its structure to be defined first; an alternative to the more rigid relational databases.
8. The process of efficiently organizing data in a database, often by eliminating redundancy.
9. An open-source framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.
10. A series of data processing steps or stages in which data is ingested from one system or format, transformed, and loaded into another.
12. A storage system or repository that holds a vast amount of raw data in its native format until it's needed. It allows businesses to store all of their data in one place.

Down

1. The database practice of splitting data into multiple smaller, more manageable pieces that can still be treated as a single dataset.
2. A large store of data collected from a wide range of sources within a company and used to guide management decisions.
3. A method by which data is split across multiple databases to improve performance, scalability, and manageability.
4. An open-source stream-processing platform that provides unified, high-throughput, low-latency handling of real-time data feeds.
6. The practice of processing data on the fly, as it is created, rather than in batches.
7. An open-source distributed computing system known for processing data quickly and for handling big data analytics workloads.
11. Extract, Transform, Load