Data Lake & Data Pipeline
A data lake is a centralized repository that allows you to store all your structured and unstructured data at any scale. You can store your data as-is, without having to structure it first, and run different types of analytics on it—from dashboards and visualizations to big data processing, real-time analytics, and machine learning—to guide better decisions.

"Data Pipeline" is a broader term that encompasses ETL as a subset. It refers to a system for moving data from one system to another. The data may or may not be transformed, and it may be processed in real time (streaming) instead of in batches. When data is streamed, it is processed as a continuous flow, which is useful for data that needs constant updating, such as readings from a sensor monitoring traffic.
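As a rough sketch (not tied to any particular framework), the Python snippet below illustrates the core idea of a pipeline: records move from a source to a sink, with an optional transform in between. Because the source is consumed lazily, the same function handles both a finite batch and a continuous stream of sensor readings. All names here (`pipeline`, `sensor_stream`) are illustrative, not a real library API.

```python
import itertools
import random
from typing import Callable, Iterable, Iterator, Optional

# Hypothetical record type: each record is a plain dict.
Record = dict


def pipeline(
    source: Iterable[Record],
    sink: Callable[[Record], None],
    transform: Optional[Callable[[Record], Record]] = None,
) -> None:
    """Move records from a source to a sink, optionally transforming them.

    The source is consumed lazily, so this works equally well for a
    finite batch (a list) or an unbounded stream (a generator).
    """
    for record in source:
        if transform is not None:
            record = transform(record)
        sink(record)


# Batch example: a fixed set of records is moved in one pass.
batch = [
    {"sensor": "traffic-01", "count": 12},
    {"sensor": "traffic-01", "count": 7},
]
pipeline(batch, sink=print)


# Streaming example: a generator stands in for a continuous feed of
# sensor readings; each record is processed as soon as it arrives.
def sensor_stream() -> Iterator[Record]:
    for i in itertools.count():
        yield {"sensor": "traffic-01", "count": random.randint(0, 20), "seq": i}
        if i >= 4:  # stop early so the example terminates
            return


pipeline(
    sensor_stream(),
    sink=print,
    transform=lambda r: {**r, "count": r["count"] * 2},  # optional transform step
)
```

In a real streaming pipeline the generator would be replaced by a message queue or sensor feed and the sink by a database or data lake writer, but the shape of the flow stays the same.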