Indexing cloud data lakes within the lakes

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    3 Scopus citations

    Abstract

    Cloud data lakes are a modern approach to storing large amounts of data conveniently and inexpensively. The main idea is the separation of the compute and storage layers. However, to perform analytics on data in this architecture, the data must be moved from the storage layer to the compute layer over the network for each calculation. This hurts calculation performance and requires substantial network bandwidth. We are exploring different approaches to adding indexing to cloud data lakes, with the goal of reducing the amount of data read from storage and, as a result, improving query execution time.

    Original language: English
    Title of host publication: SYSTOR 2021 - Proceedings of the 14th ACM International Conference on Systems and Storage
    Publisher: Association for Computing Machinery, Inc
    ISBN (Electronic): 9781450383981
    State: Published - 14 Jun 2021
    Event: 14th ACM International Conference on Systems and Storage, SYSTOR 2021 - Virtual, Online, Israel
    Duration: 14 Jun 2021 – 16 Jun 2021

    Publication series

    Name: SYSTOR 2021 - Proceedings of the 14th ACM International Conference on Systems and Storage

    Conference

    Conference: 14th ACM International Conference on Systems and Storage, SYSTOR 2021
    Country/Territory: Israel
    City: Virtual, Online
    Period: 14/06/21 – 16/06/21

    ASJC Scopus subject areas

    • Electrical and Electronic Engineering
    • Computer Science Applications
    • Hardware and Architecture
    • Software
