Abstract
Cloud data lakes are a modern approach to storing large amounts of data conveniently and inexpensively. The main idea is the separation of the compute and storage layers. However, to perform analytics in this architecture, the data must be moved from the storage layer to the compute layer over the network for each calculation. This hurts calculation performance and requires huge network bandwidth. We explore different approaches for adding indexing to cloud data lakes, with the goal of reducing the amount of data read from storage and, as a result, improving query execution time.
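The abstract does not say which index structures are explored. As an illustration only, the following minimal sketch assumes one common form of data-lake indexing, a per-object min/max index (sometimes called a zone map), and shows how it can reduce the amount of data read for a range query. `StoredObject`, `build_minmax_index`, and `query_range` are hypothetical names introduced here, and the in-memory lists stand in for objects held in remote storage.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class StoredObject:
    """Hypothetical stand-in for one object (file) in the storage layer."""
    key: str
    values: List[int]


def build_minmax_index(objects: List[StoredObject]) -> Dict[str, Tuple[int, int]]:
    """Record the smallest and largest value held by each object."""
    return {obj.key: (min(obj.values), max(obj.values)) for obj in objects}


def query_range(
    objects: List[StoredObject],
    lo: int,
    hi: int,
    index: Optional[Dict[str, Tuple[int, int]]] = None,
) -> Tuple[List[int], int]:
    """Return the values in [lo, hi] and how many values were read to find them."""
    matches: List[int] = []
    values_read = 0
    for obj in objects:
        if index is not None:
            obj_min, obj_max = index[obj.key]
            if obj_max < lo or obj_min > hi:
                # The index proves this object holds no match, so it is never read.
                continue
        # Reading the object models moving it from storage to compute.
        values_read += len(obj.values)
        matches.extend(v for v in obj.values if lo <= v <= hi)
    return matches, values_read


# Assumed toy data: three objects with non-overlapping value ranges.
lake = [
    StoredObject("part-0", [1, 5, 9]),
    StoredObject("part-1", [20, 25, 31]),
    StoredObject("part-2", [40, 47, 52]),
]
index = build_minmax_index(lake)

full_scan = query_range(lake, 22, 30)          # reads every object
indexed = query_range(lake, 22, 30, index)     # reads only part-1
```

Both calls return the same matching values, but the indexed query reads a third of the data in this toy setup. The benefit in practice depends on how well the data layout separates value ranges across objects, which this sketch simply assumes.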
Original language | English |
---|---|
Title of host publication | SYSTOR '21: The 14th ACM International Systems and Storage Conference, Haifa, Israel, June 14-16, 2021 |
Editors | Bruno Wassermann, Michal Malka, Vijay Chidambaram, Danny Raz |
Publisher | Association for Computing Machinery, Inc |
ISBN (Electronic) | 9781450383981 |
State | Published - 2021 |
Event | 14th ACM International Conference on Systems and Storage, SYSTOR 2021 - Virtual, Online, Israel. Duration: 14 Jun 2021 → 16 Jun 2021 |
Conference
Conference | 14th ACM International Conference on Systems and Storage, SYSTOR 2021 |
---|---|
Country/Territory | Israel |
City | Virtual, Online |
Period | 14/06/21 → 16/06/21 |
Keywords
- Engineering
- Electrical and Electronic Engineering
- Computer Science Applications
- Hardware and Architecture
- Software