Analyzing large-scale genomic data with cloud data lakes

Grisha Weintraub, Noam Hadar, Ehud Gudes, Shlomi Dolev, Ohad Birk

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

In recent years there has been a huge influx of genomic data and a growing need for its analysis, yet existing genomic databases do not make that data easy to access. We developed a pipeline that continuously pre-processes raw human genetic data. The data is then stored in a cloud data lake and can be accessed via a simple and intuitive web service and API.

Original language: English
Title of host publication: Proceedings of the 16th ACM International Conference on Systems and Storage, SYSTOR 2023
Publisher: Association for Computing Machinery, Inc
Pages: 142
Number of pages: 1
ISBN (Electronic): 9781450399623
State: Published - Jun 2023
Event: 16th ACM International Conference on Systems and Storage, SYSTOR 2023 - Haifa, Israel
Duration: 5 Jun 2023 to 7 Jun 2023

Publication series

Name: Proceedings of the 16th ACM International Conference on Systems and Storage, SYSTOR 2023

Conference

Conference: 16th ACM International Conference on Systems and Storage, SYSTOR 2023
Country/Territory: Israel
City: Haifa
Period: 5/06/23 to 7/06/23

Keywords

  • cloud storage
  • data lakes
  • geniepool
  • genomics

ASJC Scopus subject areas

  • Computer Science Applications
  • Hardware and Architecture
  • Software
  • Electrical and Electronic Engineering
