Abstract
Synthetic data generation (SDG) can be used to augment an existing dataset or create a new dataset with statistical characteristics similar to the original. SDG for tabular data is challenging because of the need to model both continuous and categorical features and their correlations. Multiple approaches for tabular SDG use generative adversarial networks (GANs) or variational autoencoders (VAEs). Generally, GAN-based architectures create high-quality samples but have greater difficulty modeling the distribution of the target dataset. VAE-based approaches accurately model the data distribution but sometimes produce lower-quality samples. In this study, we propose T-VAE-GAN, a novel solution for tabular SDG. Our approach hierarchically combines GANs and VAEs to enable the generation of high-quality samples while ensuring that the overall feature distribution is highly similar to that of the original dataset. Extensive evaluation on a large number of datasets shows that our approach either outperforms or achieves comparable results to leading approaches while also being more computationally efficient.
| Original language | English |
|---|---|
| Article number | 113997 |
| Journal | Knowledge-Based Systems |
| Volume | 326 |
| DOIs | |
| State | Published - 27 Sep 2025 |
Keywords
- Generative adversarial networks
- Synthetic data generation
- Tabular data
- Variational autoencoders
ASJC Scopus subject areas
- Management Information Systems
- Software
- Information Systems and Management
- Artificial Intelligence
Fingerprint
Dive into the research topics of 'Synthetic tabular data generation using a VAE-GAN architecture'. Together they form a unique fingerprint.