Generating Synthetic Sign Language Datasets Using Conditional Generative Adversarial Networks

Yehuda Yadid, Sarel Cohen, Raid Saabni

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Approximately 70 million people worldwide use sign languages as their primary means of communication because of hearing impairment. Research on sign languages is extensive and fruitful; however, while more than 300 sign languages exist worldwide, most work focuses on a single language [1]. Building AI models for sign language tasks often requires large datasets, which can be difficult to produce. Prior work has created sign language datasets using a variety of methodologies, but these efforts typically target specific sign languages. Developing hand skeleton templates for sign languages offers a more efficient alternative to collecting numerous instances of distinct signs: given such a basic structure, generative models such as GANs [2] can produce a wide range of variations of it. These generative models can reproduce and adapt the underlying skeletons into many sign language forms, capturing the diversity in hand shapes, orientations, and movements required for precise sign representation. The main objective of our research is to develop a conditional generative adversarial network (cGAN) that generates hand images conditioned on hand skeletons; this approach not only scales up the generation of sign language data but also guarantees consistency across different versions of signs, making it easier to build sign language recognition systems that are more reliable and flexible. To train the model, we devised a web scraping pipeline that produced a large collection of hand images extracted from TED lecture recordings, together with their corresponding skeletons. The resulting cGAN-based model lets researchers generate synthetic hand images from target skeleton inputs, enabling the creation of extensive sign language datasets. We expect our contribution to ease the study of additional sign languages for which collecting datasets is challenging.
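
To make the skeleton-conditioning setup concrete, below is a minimal sketch of a skeleton-conditioned GAN in PyTorch. It assumes a pix2pix-style formulation in which the generator maps a rendered one-channel skeleton image to an RGB hand crop and the discriminator scores (skeleton, image) pairs concatenated along the channel axis. The layer sizes, 64x64 image resolution, loss choice, and hyperparameters are illustrative assumptions, not the configuration reported in the paper (whose keywords also list WGAN, suggesting a Wasserstein loss may have been used instead of the standard adversarial loss shown here).

# Minimal sketch of a skeleton-conditioned GAN in PyTorch (pix2pix-style).
# All names, shapes, and hyperparameters are illustrative assumptions,
# not the architecture reported in the paper.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a rendered hand-skeleton image (1 channel) to an RGB hand image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, 4, stride=2, padding=1),    # 64x64 -> 32x32
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),  # 32x32 -> 16x16
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),  # 16 -> 32
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.ConvTranspose2d(64, 3, 4, stride=2, padding=1),    # 32 -> 64
            nn.Tanh(),
        )

    def forward(self, skeleton):
        return self.net(skeleton)

class Discriminator(nn.Module):
    """Scores (skeleton, hand image) pairs; conditioning via channel concat."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1 + 3, 64, 4, stride=2, padding=1),
            nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1),
            nn.BatchNorm2d(128),
            nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # PatchGAN-style logit map
        )

    def forward(self, skeleton, image):
        return self.net(torch.cat([skeleton, image], dim=1))

def training_step(G, D, opt_g, opt_d, skeleton, real, bce, l1, l1_weight=100.0):
    # Discriminator step: real pairs vs. generated pairs.
    fake = G(skeleton).detach()
    d_real = D(skeleton, real)
    d_fake = D(skeleton, fake)
    loss_d = bce(d_real, torch.ones_like(d_real)) + bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator step: fool D while staying close to the real hand image (L1).
    fake = G(skeleton)
    d_fake = D(skeleton, fake)
    loss_g = bce(d_fake, torch.ones_like(d_fake)) + l1_weight * l1(fake, real)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
    return loss_d.item(), loss_g.item()

if __name__ == "__main__":
    G, D = Generator(), Discriminator()
    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
    bce, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()
    skeleton = torch.randn(4, 1, 64, 64)  # stand-in for rendered skeleton images
    real = torch.randn(4, 3, 64, 64)      # stand-in for real hand crops
    print(training_step(G, D, opt_g, opt_d, skeleton, real, bce, l1))

In a real pipeline, the random tensors would be replaced by the scraped training pairs: each hand crop taken from a TED lecture frame coupled with the skeleton estimated from it, with the L1 term encouraging generated hands to stay aligned with the conditioning skeleton.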

Original language: English
Title of host publication: Seventeenth International Conference on Machine Vision, ICMV 2024
Editors: Wolfgang Osten
Publisher: SPIE
ISBN (Electronic): 9781510688278
DOIs
State: Published - 1 Jan 2025
Externally published: Yes
Event: 17th International Conference on Machine Vision, ICMV 2024 - Edinburgh, United Kingdom
Duration: 10 Oct 2024 - 13 Oct 2024

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 13517
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 17th International Conference on Machine Vision, ICMV 2024
Country/Territory: United Kingdom
City: Edinburgh
Period: 10/10/24 - 13/10/24

Keywords

  • cGAN
  • GAN
  • generative models
  • sign language
  • WGAN

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering
