PhyLoNet: Physically-Constrained Long-Term Video Prediction

Nir Ben Zikri, Andrei Sharf

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Motions in videos are often governed by physical and biological laws such as gravity, collisions, flocking, etc. Accounting for such natural properties is an appealing way to improve realism in future frame video prediction. Nevertheless, the definition and computation of intricate physical and biological properties in motion videos are challenging. In this work, we introduce PhyLoNet, a PhyDNet extension that learns long-term future frame prediction and manipulation. Similar to PhyDNet, our network consists of a two-branch deep architecture that explicitly disentangles physical dynamics from complementary information. It uses a recurrent physical cell (PhyCell) for performing physically-constrained prediction in latent space. In contrast to PhyDNet, PhyLoNet introduces a modified encoder-decoder architecture together with a novel relative flow loss. This enables a longer-term future frame prediction from a small input sequence with higher accuracy and quality. We have carried out extensive experiments, showing the ability of PhyLoNet to outperform PhyDNet on various challenging natural motion datasets such as ball collisions, flocking, and pool games. Ablation studies highlight the importance of our new components. Finally, we show an application of PhyLoNet for video manipulation and editing by a novel class label modification architecture.

Original languageEnglish
Title of host publicationComputer Vision – ACCV 2022 - 16th Asian Conference on Computer Vision, Proceedings
EditorsLei Wang, Juergen Gall, Tat-Jun Chin, Imari Sato, Rama Chellappa
PublisherSpringer Science and Business Media Deutschland GmbH
Pages570-587
Number of pages18
ISBN (Print)9783031262920
DOIs
StatePublished - 1 Jan 2023
Event16th Asian Conference on Computer Vision, ACCV 2022 - Macao, China
Duration: 4 Dec 20228 Dec 2022

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume13847 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference16th Asian Conference on Computer Vision, ACCV 2022
Country/TerritoryChina
CityMacao
Period4/12/228/12/22

Keywords

  • Deep learning
  • Long-term video prediction
  • Physical motion

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'PhyLoNet: Physically-Constrained Long-Term Video Prediction'. Together they form a unique fingerprint.

Cite this