Harnessing the Power of Generative Adversarial Networks Style Learning for Tabular Data Generation

PAKDD Tutorial -- Harnessing the Power of Generative Adversarial Networks Style Learning for Tabular Data Generation

Generative Adversarial Network (GAN) models and their variants have been shown to be effective at producing high-quality data in Computer Vision, Text Mining, and Natural Language Processing. A GAN consists of two parts, a generator and a discriminator, trained end-to-end in a game-theoretic manner: the generator tries to produce samples that the discriminator cannot distinguish from real data. The success of GANs in producing high-quality structured data has inspired many researchers to apply similar modelling to tabular data. Tabular data, however, combines apparently unrelated columns of numeric, rank (ordinal), and categorical types, which makes the direct application of GAN-based deep learning methods challenging. This tutorial discusses recent advances in tabular data generation with GAN-style learning.
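
To make the mixed column types mentioned above concrete, the following is a minimal sketch of one common way to turn a numeric, a rank, and a categorical column into a single fixed-length vector that a neural network can consume. The toy table, the column names, the `EDUCATION_ORDER` list, and the `encode` helper are all hypothetical and chosen for illustration only; they are not taken from any specific method covered in the tutorial.

```python
# Hypothetical toy table: one numeric, one rank (ordinal), one categorical column.
rows = [
    {"age": 25, "education": "high_school", "city": "Osaka"},
    {"age": 40, "education": "bachelor", "city": "Taipei"},
    {"age": 55, "education": "master", "city": "Osaka"},
]

# Assumed rank order for the ordinal column (illustrative only).
EDUCATION_ORDER = ["high_school", "bachelor", "master"]


def encode(rows):
    """Map each mixed-type row to a fixed-length list of floats in [0, 1]."""
    ages = [r["age"] for r in rows]
    lo, hi = min(ages), max(ages)
    cities = sorted({r["city"] for r in rows})  # fixed category order
    max_rank = len(EDUCATION_ORDER) - 1

    encoded = []
    for r in rows:
        # Numeric column: min-max scaling.
        age = (r["age"] - lo) / (hi - lo) if hi > lo else 0.0
        # Rank column: position in the assumed order, scaled to [0, 1].
        rank = EDUCATION_ORDER.index(r["education"]) / max_rank
        # Categorical column: one-hot vector over the observed categories.
        one_hot = [1.0 if r["city"] == c else 0.0 for c in cities]
        encoded.append([age, rank] + one_hot)
    return encoded


vectors = encode(rows)
for v in vectors:
    print(v)
```

Even in this small sketch, each column type needs its own encoding, and a generator's continuous outputs would have to be mapped back to valid ranks and categories. That per-column handling is one illustration of why image-style GANs do not transfer directly to tables.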

In this tutorial, we will start with a brief review of the recent literature on GAN-based techniques for tabular data generation. We will then discuss the characteristics of tabular data and highlight the challenges they pose for generation. We will also discuss the need for standardized evaluation and propose a centralized repository for comparing tabular data generation methods. We will conclude with a discussion of applications of tabular data generation in privacy-preserving analytics, robustness analysis (concept drift analysis and adversarial attack analysis), and anomaly detection.