Approximating Aggregated SQL Queries with LSTM Networks

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review


Despite continuous investments in data technologies, the latency of querying data still poses a significant challenge. Modern analytic solutions require near real-time responsiveness both to make them interactive and to support automated processing. Current technologies (Hadoop, Spark, Dataflow) scan the dataset to execute queries and focus on providing scalable data storage and in-memory concurrent data processing to maximize task execution speed. We argue that these solutions fail to offer an adequate level of interactivity, since they depend on continual access to data. In this paper, we present a method for query approximation, also known as approximate query processing (AQP), that reduces the need to scan data during inference (query calculation), thus enabling a rapid query processing tool. We use an LSTM network to learn the relationship between queries and their results, and to provide a rapid inference layer for the prediction of query results. Our method (referred to as 'Hunch') produces a lightweight LSTM network which provides high query throughput. We evaluated our method using 12 datasets and compared it to state-of-the-art AQP engines (VerdictDB, BlinkDB) in terms of query latency, model weight, and accuracy. The results show that our method predicted query results with a normalized root mean squared error (NRMSE) ranging from approximately 1% to 4%, which, for the majority of our datasets, was better than the results of the benchmarks. Moreover, our method was able to predict up to 120,000 queries per second (streamed together), with a single-query latency of no more than 2 ms.
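The accuracy figures above are reported as NRMSE. As a minimal sketch of how such a score could be computed, the Python snippet below normalizes the RMSE by the range of the exact results. The abstract does not state which normalizer the authors use, so the range-based convention is an assumption, and the function name `nrmse` and the sample values are purely illustrative.

```python
import math


def nrmse(y_true, y_pred):
    """Range-normalized RMSE.

    Assumption: normalization by (max - min) of the exact values; the
    abstract does not specify the normalizer actually used.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between exact and predicted query results.
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    value_range = max(y_true) - min(y_true)
    if value_range == 0:
        raise ValueError("exact values have zero range; NRMSE is undefined")
    return math.sqrt(mse) / value_range


# Hypothetical exact vs. predicted aggregate results (illustrative numbers only).
exact = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
score = nrmse(exact, predicted)
print(f"NRMSE: {score:.2%}")
```

Under a different convention (for example, dividing by the mean of the exact values), the same predictions would yield a different percentage, so scores are only comparable when the normalizer matches.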

Original language: English
Title of host publication: IJCNN 2021 - International Joint Conference on Neural Networks, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9780738133669
State: Published - 18 Jul 2021
Event: 2021 International Joint Conference on Neural Networks, IJCNN 2021 - Virtual, Shenzhen, China
Duration: 18 Jul 2021 to 22 Jul 2021

Publication series

Name: Proceedings of the International Joint Conference on Neural Networks


Conference: 2021 International Joint Conference on Neural Networks, IJCNN 2021
City: Virtual, Shenzhen


  • Approximate query processing (AQP)
  • LSTM
  • SQL
  • Supervised learning

ASJC Scopus subject areas

  • Software
  • Artificial Intelligence

