Approximating a Distribution Using Weight Queries

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

    1 Scopus citation

    Abstract

    We consider a novel challenge: approximating a distribution without the ability to randomly sample from that distribution. We study how such an approximation can be obtained using weight queries. Given some data set of examples, a weight query presents one of the examples to an oracle, which returns the probability, according to the target distribution, of observing examples similar to the presented example. This oracle can represent, for instance, counting queries to a database of the target population, or an interface to a search engine which returns the number of results that match a given search. We propose an interactive algorithm that iteratively selects data set examples and performs corresponding weight queries. The algorithm finds a reweighting of the data set that approximates the weights according to the target distribution, using a limited number of weight queries. We derive an approximation bound on the total variation distance between the reweighting found by the algorithm and the best achievable reweighting. Our algorithm takes inspiration from the UCB approach common in multi-armed bandit problems, and combines it with a new discrepancy estimator and a greedy iterative procedure. In addition to our theoretical guarantees, we demonstrate in experiments the advantages of the proposed algorithm over several baselines. A Python implementation of the proposed algorithm and of all the experiments can be found at https://github.com/Nadav-Barak/AWP.
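
    To make the setting in the abstract concrete, the following is a minimal sketch of a weight-query oracle and of the total variation distance used to score a reweighting. It is not the algorithm from the paper or from the linked repository: the names make_counting_oracle, naive_reweighting and total_variation are hypothetical, the toy target counts are invented for illustration, and the query strategy shown is a naive fixed-order baseline rather than the UCB-style selection the abstract describes.

```python
from typing import Callable, Dict, List


def make_counting_oracle(target_counts: Dict[str, int]) -> Callable[[str], float]:
    """Hypothetical oracle backed by counts from a target population.

    A weight query on an example returns the fraction of the target
    population that matches it (a stand-in for a database counting query).
    """
    total = sum(target_counts.values())

    def weight_query(example: str) -> float:
        return target_counts.get(example, 0) / total

    return weight_query


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two discrete weightings."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def naive_reweighting(
    dataset: List[str], weight_query: Callable[[str], float], budget: int
) -> Dict[str, float]:
    """Naive baseline (an assumption, not the paper's method).

    Queries the first `budget` examples and spreads the remaining
    probability mass uniformly over the unqueried examples.
    """
    weights = {x: weight_query(x) for x in dataset[:budget]}
    rest = dataset[budget:]
    leftover = max(0.0, 1.0 - sum(weights.values()))
    for x in rest:
        weights[x] = leftover / len(rest)
    return weights


# Toy data: four distinct examples and invented target counts.
dataset = ["a", "b", "c", "d"]
oracle = make_counting_oracle({"a": 50, "b": 30, "c": 15, "d": 5})
target = {x: oracle(x) for x in dataset}
uniform = {x: 1.0 / len(dataset) for x in dataset}

print("uniform weights:", total_variation(uniform, target))
for budget in (2, 4):
    approx = naive_reweighting(dataset, oracle, budget)
    print(f"budget {budget}:", total_variation(approx, target))
```

    In this toy setup the distance to the target shrinks as more weight queries are spent, which is the trade-off the abstract's query budget refers to. Choosing which examples to query, rather than taking them in a fixed order, is the part the paper's algorithm addresses.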

    Original language: English
    Title of host publication: Proceedings of the 38th International Conference on Machine Learning, ICML 2021
    Publisher: ML Research Press
    Pages: 674-683
    Number of pages: 10
    ISBN (Electronic): 9781713845065
    State: Published - 1 Jan 2021
    Event: 38th International Conference on Machine Learning, ICML 2021 - Virtual, Online
    Duration: 18 Jul 2021 - 24 Jul 2021

    Publication series

    Name: Proceedings of Machine Learning Research
    Volume: 139
    ISSN (Electronic): 2640-3498

    Conference

    Conference: 38th International Conference on Machine Learning, ICML 2021
    City: Virtual, Online
    Period: 18/07/21 - 24/07/21

    ASJC Scopus subject areas

    • Artificial Intelligence
    • Software
    • Control and Systems Engineering
    • Statistics and Probability
