PhyLoNet: Physically-Constrained Long-Term Video Prediction

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    Motions in videos are often governed by physical and biological laws such as gravity, collisions, flocking, etc. Accounting for such natural properties is an appealing way to improve realism in future frame video prediction. Nevertheless, the definition and computation of intricate physical and biological properties in motion videos are challenging. In this work, we introduce PhyLoNet, a PhyDNet extension that learns long-term future frame prediction and manipulation. Similar to PhyDNet, our network consists of a two-branch deep architecture that explicitly disentangles physical dynamics from complementary information. It uses a recurrent physical cell (PhyCell) for performing physically-constrained prediction in latent space. In contrast to PhyDNet, PhyLoNet introduces a modified encoder-decoder architecture together with a novel relative flow loss. This enables a longer-term future frame prediction from a small input sequence with higher accuracy and quality. We have carried out extensive experiments, showing the ability of PhyLoNet to outperform PhyDNet on various challenging natural motion datasets such as ball collisions, flocking, and pool games. Ablation studies highlight the importance of our new components. Finally, we show an application of PhyLoNet for video manipulation and editing by a novel class label modification architecture.

    Original language: English
    Title of host publication: Computer Vision – ACCV 2022 - 16th Asian Conference on Computer Vision, Proceedings
    Editors: Lei Wang, Juergen Gall, Tat-Jun Chin, Imari Sato, Rama Chellappa
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 570-587
    Number of pages: 18
    ISBN (Print): 9783031262920
    State: Published - 1 Jan 2023
    Event: 16th Asian Conference on Computer Vision, ACCV 2022 - Hybrid, Macao, China
    Duration: 4 Dec 2022 - 8 Dec 2022

    Publication series

    Name: Lecture Notes in Computer Science
    Volume: 13847 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 16th Asian Conference on Computer Vision, ACCV 2022
    Country/Territory: China
    City: Hybrid, Macao
    Period: 4/12/22 - 8/12/22

    Keywords

    • Deep learning
    • Long-term video prediction
    • Physical motion

    ASJC Scopus subject areas

    • Theoretical Computer Science
    • General Computer Science
