Several approaches harvest, query, or combine Deep Web sources. Yet, beyond well-studied aspects of the problem such as query answering using views, access limitations, and top-k querying, the Deep Web exhibits a number of peculiarities that are often neglected. First, services usually do not deliver all results, but only the top-n results according to some ranking function. This function may not be compatible with the ordering specified in the user's query. Subsequent results have to be obtained by paging, or may not be accessible at all. Second, services may deliver results at a granularity that is incompatible with the query or with joinable services (e.g., months vs. exact dates). Moreover, services may perform selections or ranking over attributes that are not exposed in the results, which poses an incompleteness problem. Additional challenges come from uncertainty, recency constraints, and inter-service dependencies. In this article, we shed light on these peculiarities and compile a list of desiderata for a query answering system for the Deep Web.
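
The first peculiarity can be illustrated with a minimal sketch. It is not taken from the paper: `ToyService`, `Offer`, `cheapest_accessible`, and the sample data are all hypothetical, and the service is simulated in memory. The toy source ranks by a hidden `relevance` attribute, exposes only a limited number of pages, and the client asks for the cheapest offers, an ordering the source does not support.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    name: str
    price: int
    relevance: int  # hidden ranking attribute, never exposed in results


class ToyService:
    """Hypothetical Deep Web source: ranks by hidden relevance, pages results."""

    def __init__(self, offers, page_size, max_pages):
        # The source's own ranking, which the client cannot change.
        self._ranked = sorted(offers, key=lambda o: o.relevance, reverse=True)
        self.page_size = page_size
        self.max_pages = max_pages

    def fetch(self, page):
        """Return (name, price) pairs for one page; empty past the page limit."""
        if page >= self.max_pages:
            return []
        start = page * self.page_size
        chunk = self._ranked[start:start + self.page_size]
        return [(o.name, o.price) for o in chunk]


def cheapest_accessible(service, k):
    """Page through everything reachable, then re-sort by the user's order."""
    seen = []
    page = 0
    while True:
        rows = service.fetch(page)
        if not rows:
            break
        seen.extend(rows)
        page += 1
    return sorted(seen, key=lambda row: row[1])[:k]


offers = [
    Offer("a", 90, 5),
    Offer("b", 70, 4),
    Offer("c", 80, 3),
    Offer("d", 10, 2),
    Offer("e", 20, 1),
]
service = ToyService(offers, page_size=2, max_pages=2)
result = cheapest_accessible(service, k=2)
print(result)  # [('d', 10), ('b', 70)]
```

In this toy setup the true two cheapest offers are `d` and `e`, but `e` sits beyond the last accessible page, so the client returns `b` instead and has no way of detecting the gap from the results alone.
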
| Field | Value |
|---|---|
| Number of pages | 4 |
| Journal | CEUR Workshop Proceedings |
| State | Published - 1 Dec 2012 |
| Event | 2nd International Workshop on Searching and Integrating New Web Data Sources: Very Large Data Search, VLDS 2012 - Istanbul, Turkey |
| Event duration | 31 Aug 2012 → 31 Aug 2012 |

ASJC Scopus subject areas
- Computer Science (all)