Prediction of visual saliency in video with deep CNNs

Souad Chaabouni, Jenny Benois-Pineau, Ofer Hadar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations


Prediction of visual saliency in images and video is a highly researched topic. Target applications include quality assessment of multimedia services in mobile contexts, video compression, and recognition of objects in video streams. In mobile and egocentric settings, visual saliency models cannot be founded only on bottom-up features, as suggested by feature integration theory. The central bias hypothesis does not hold either. In this case, the top-down component of human visual attention becomes prevalent, and visual saliency can be predicted on the basis of seen data. Deep Convolutional Neural Networks (CNNs) have proven to be a powerful tool for predicting salient areas in still images. In our work we also focus on the sensitivity of the human visual system to residual motion in video. A deep CNN architecture is designed whose primary input maps are the color values of pixels and the magnitude of local residual motion. Complementary contrast maps yield a slight increase in accuracy compared with the use of color and residual motion only. The experiments show that the choice of input features for the deep CNN depends on the visual task: for interest in dynamic content, the 4K model with residual motion is more efficient, and for object recognition in egocentric video the purely spatial input is more appropriate.
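The abstract describes feeding the network both pixel color values and the magnitude of local residual motion. The sketch below illustrates one way such an input could be assembled with NumPy. It is not the authors' implementation: the function names `residual_motion_magnitude` and `build_input` are hypothetical, the dense motion field is assumed to be given, and global (camera) motion is crudely approximated by the median motion vector, whereas the paper may rely on a richer global motion model.

```python
import numpy as np


def residual_motion_magnitude(flow):
    """Return the per-pixel magnitude of residual motion.

    flow: array of shape (H, W, 2) holding a dense motion field (dx, dy).
    Assumption: global motion is a pure translation, estimated here as the
    per-component median of the field. This is a stand-in for illustration.
    """
    global_motion = np.median(flow.reshape(-1, 2), axis=0)
    residual = flow - global_motion          # remove the assumed camera motion
    return np.linalg.norm(residual, axis=-1)  # sqrt(dx^2 + dy^2) per pixel


def build_input(rgb, flow):
    """Stack color and residual-motion magnitude into one (H, W, 4) array.

    rgb:  array of shape (H, W, 3), dtype uint8.
    flow: array of shape (H, W, 2), dense motion field for the same frame.
    Both kinds of map are scaled to [0, 1] (an assumed normalization).
    """
    color = rgb.astype(np.float32) / 255.0
    mag = residual_motion_magnitude(flow).astype(np.float32)
    peak = mag.max()
    if peak > 0:
        mag = mag / peak
    return np.concatenate([color, mag[..., None]], axis=-1)


# Synthetic example: a static background moving with the camera,
# plus a small patch that moves differently.
frame = np.zeros((8, 8, 3), dtype=np.uint8)
field = np.tile(np.array([2.0, 1.0]), (8, 8, 1))
field[2:4, 2:4] = [5.0, 5.0]
x = build_input(frame, field)
```

In this toy frame only the small patch departs from the dominant motion, so it is the only region with a non-zero value in the fourth channel. A purely spatial variant of the input would simply omit that channel.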

Original language: English
Title of host publication: Applications of Digital Image Processing XXXIX
Editors: Andrew G. Tescher
ISBN (Electronic): 9781510603332
State: Published - 1 Jan 2016
Externally published: Yes
Event: Applications of Digital Image Processing XXXIX - San Diego, United States
Duration: 29 Aug 2016 to 1 Sep 2016

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X


Conference: Applications of Digital Image Processing XXXIX
Country/Territory: United States
City: San Diego


  • Contrast
  • Deep Convolutional Neural Networks
  • Residual motion
  • Visual saliency

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering

