SYNTHETIC GENERATION OF ELECTRICAL CONSUMPTION PROFILES USING GENERATIVE ADVERSARIAL NETWORKS (GANS)

Luis Ferney Ortíz Torres

Received: 15/11/2024 - Accepted: 14/10/2025

Conectando mentes, energizando el futuro

Abstract

Accurate energy consumption forecasting is essential for the effective planning and management of electrical infrastructure. This article introduces a model that uses Generative Adversarial Networks (GANs) to produce synthetic energy consumption profiles, addressing the challenges posed by limited access to the critical corporate or enterprise data needed to operate electrical systems. The GAN-based approach generates realistic consumption profiles, which were evaluated for their statistical similarity to real-world datasets. The results indicate that the synthetic profiles closely mimic authentic data, underscoring the capability of GANs as a robust tool for simulating and predicting energy consumption patterns. In conclusion, this article highlights the potential of GANs for advancing energy planning and enabling more accurate simulations in contexts where real-world data is scarce or difficult to obtain.

KEYWORDS: Generative Adversarial Networks (GANs), predictive models, machine learning, data analysis, data privacy, energy efficiency, predictive modeling.

1. INTRODUCTION

In an electrical grid, data from generation to commercialization and the end user/prosumer must be systematically collected, integrated, and analyzed. These datasets must align with the capabilities of modern measurement systems while ensuring stringent privacy and security protocols for data acquisition and transmission. For instance, Advanced Metering Infrastructure (AMI) (Hart, 2008; Ashari, 2022) is a key technology used for real-time monitoring and management of electricity consumption (Park et al., 2010). Households, buildings, and industries equipped with AMI automatically transmit energy consumption data to their electricity providers. This enables providers to improve energy supply management, anticipate rationing needs, and validate energy demand more effectively (Park et al., 2010).

The growing need to optimize energy consumption has become a critical challenge within the evolving dynamics of the electric sector (Hossain et al., 2024). This challenge is compounded by exponential demand growth and the urgency of advancing the energy transition and sustainability initiatives. These demands necessitate the development of scenarios that allow continuous state and condition validation across electrical grids (Zhen et al., 2022; Ortiz et al., 2024). However, this also creates significant obstacles for researchers, particularly in testing innovative instruments, methods, and theories (National Academies of Sciences, Engineering, and Medicine, 2016; Yilmaz, 2023). Given the vital role of electrical grids in daily life, access to data has become indispensable for designing and validating advanced mathematical and computational tools. Therefore, stakeholders including policymakers, industry professionals, and researchers must collaborate to generate, validate, and make synthetic data accessible to drive advancements in the field (Akbari et al., 2024; Luo et al., 2023; Enhancing Security in Public Spaces Through Generative Adversarial Networks (GANs), 2024). These efforts have the potential to improve the planning, operation, and optimization of electrical grids. Nonetheless, a major impediment lies in the restricted access to real-world data, a sensitive issue that could compromise national privacy and security if mishandled (Lim et al., 2024; Shi, 2021; Dunmore et al., 2023; Goodfellow et al., 2020). This limitation restricts the availability of data for researchers and other key players, prompting the need for innovative approaches that transcend conventional constraints. Tools like Generative Adversarial Networks (GANs) offer a promising avenue to address these challenges by creating realistic synthetic datasets, thereby fostering opportunities for progress in the sector.

Figure 1. Description of the operation of a GAN.

Source: own elaboration.

Sección ganadores “call for papers”

Goodfellow et al. pioneered the concept of Generative Adversarial Networks (GANs) as an adversarial process (Sharma et al., 2024). This framework involves the simultaneous training of two models: a Generator and a Discriminator. As depicted in Figure 1, the Generator serves as a generative model designed to approximate the data distribution, while the Discriminator acts as a discriminative model tasked with estimating the probability that a given sample originates from the training data rather than the Generator (Nayak et al., 2024; Yadav et al., 2023; Dutta et al., 2020). One of the most prevalent applications of GANs is in privacy protection, where they create synthetic datasets that mimic the statistical properties of original data without exposing sensitive information (Choi et al., 2017).
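The adversarial process between Generator and Discriminator described above is commonly written as a two-player minimax game. The expression below is the standard formulation from the GAN literature, not something specific to this article; the symbols $p_{\text{data}}$ (real-data distribution), $p_z$ (latent noise distribution), $G$ (Generator) and $D$ (Discriminator) are notation introduced here for illustration.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

Here $D(x)$ is the probability the Discriminator assigns to $x$ being real. The Discriminator is trained to raise $D(x)$ on real samples and lower $D(G(z))$ on generated ones, while the Generator is trained to raise $D(G(z))$.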

Beyond GANs, alternative methods exist for generating statistically synthetic data. Ping et al. demonstrated the utility of Bayesian models for capturing the relationships within synthetic data generation frameworks (Hindistan & Yetkin, 2023). However, the primary advantage of GANs over traditional statistical approaches lies in their superior capability to approximate real-world data distributions. Xu and Veeramachaneni (2023) highlighted the potential of GANs in producing high-quality synthetic datasets beneficial for data science applications. For instance, techniques such as Recurrent Conditional GANs (RCGANs) (Yilmaz & Korn, 2022), Time-Series GANs (TimeGANs) (Esteban et al., 2017), and Wasserstein-based models, including Conditional Wasserstein GANs (CWGANs) (Arjovsky, 2017) and Recurrent Conditional Wasserstein GANs (RCWGANs), have been explored for generating synthetic data with high fidelity.

Traditional methods like ARIMA or recurrent neural networks (RNNs) have also been applied to synthetic data generation but often fall short in capturing complex, nonlinear relationships. GANs have emerged as a robust alternative, finding applications in sectors such as healthcare and cybersecurity. However, their integration into the energy sector remains at an early stage (Fekri, 2020).

Amasyali and El-Gohary (2018) conducted an extensive review of energy forecasting methodologies, reporting that 67% of the analyzed studies utilized real data, 19% employed simulated data, and 14% relied on publicly available reference datasets. This reliance on real data underscores the importance of historical records and highlights the urgent need to develop larger, high-quality datasets to advance energy prediction capabilities. Although some real datasets are publicly accessible, many studies depend on private, proprietary data derived from real-world scenarios (Sehovac & Grolinger, 2019).

In their review, Amasyali and El-Gohary (2018) emphasized the role of simulation-based approaches using tools such as EnergyPlus, eQUEST, and Ecotect. These physical models estimate energy consumption based on detailed environmental and building characteristics. However, acquiring such granular information is often impractical. In contrast, data-driven approaches leverage sensor-derived data and do not require the same level of specificity. Simulation techniques are predominantly utilized in the design phase, whereas data-driven methods are more commonly applied to demand and supply management scenarios. Both approaches are complementary and are selected based on the specific objectives and constraints of each application.

Deb et al. (2017) reviewed time-series forecasting techniques for building energy consumption and noted the effectiveness of simulation tools like EnergyPlus, IES, and Ecotect in modeling energy use for new buildings. When historical data is unavailable, simulations offer a viable alternative. Nevertheless, accurately forecasting energy consumption involves accounting for numerous complex factors, such as material properties, climate conditions, and occupant behavior. While simulations can approximate these variables, data-driven methods often achieve greater accuracy for existing buildings with accessible historical data.

Lazos et al. (2014) categorized energy forecasting approaches into statistical, machine learning, and physics-based models. Physics-based models provide detailed, explainable predictions without requiring historical data but demand extensive input on structural, thermodynamic, and operational parameters. Modeling occupant behavior within these systems remains a significant challenge. Conversely, data-driven methods, though reliant on substantial historical data, excel in capturing behavioral patterns without necessitating detailed structural information.

Pillai et al. (2014) proposed a hybrid approach combining consumption and weather data to generate synthetic load profiles, marking a significant advancement in realistic synthetic data generation for energy applications. Despite these advancements, generating synthetic energy consumption profiles remains challenging due to the interplay of human behavior and building characteristics.

1.1 Traditional Methods for Synthetic Data Generation

Traditional techniques, such as statistical models (e.g., ARIMA) and interpolation-based methods, provide foundational tools but are inherently limited in their ability to capture dynamic, nonlinear patterns in energy data.

1.2 Applications of GANs in the Energy Sector

The application of GANs in the energy sector, while still nascent, has shown promise. Studies like Yilmaz (2023) have demonstrated their capability to generate synthetic electricity demand profiles that replicate complex temporal patterns with high fidelity.

1.3 Privacy Preservation Techniques

Techniques such as Differential Privacy and Privacy-Preserving GANs have emerged to address ethical concerns surrounding the use of sensitive data. These methods ensure that synthetic data does not compromise the privacy of individual contributors.

1.4 Evaluation Metrics for Synthetic Data

Commonly employed metrics for evaluating synthetic data include Fréchet Inception Distance (FID), Root Mean Square Error (RMSE), and Kolmogorov-Smirnov (KS) tests. These metrics provide objective assessments of the statistical similarity between real and synthetic datasets (Haizea, 2025).
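Two of the metrics named above, RMSE and the two-sample Kolmogorov-Smirnov statistic, can be computed in a few lines. The following is a minimal sketch under stated assumptions, not the evaluation code used in the study: the function names `rmse` and `ks_statistic` are hypothetical, the arrays are random stand-ins rather than the article's data, and FID is omitted because it relies on a pretrained feature extractor.

```python
import numpy as np


def rmse(real, synthetic):
    """Root Mean Square Error between two equally shaped arrays."""
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    return float(np.sqrt(np.mean((real - synthetic) ** 2)))


def ks_statistic(real, synthetic):
    """Two-sample KS statistic: the largest gap between the empirical
    CDFs of the two samples (0 = identical, 1 = fully separated)."""
    real = np.sort(np.ravel(real))
    synthetic = np.sort(np.ravel(synthetic))
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([real, synthetic])
    cdf_real = np.searchsorted(real, grid, side="right") / real.size
    cdf_synth = np.searchsorted(synthetic, grid, side="right") / synthetic.size
    return float(np.max(np.abs(cdf_real - cdf_synth)))


# Hypothetical stand-ins for real and synthetic daily profiles (365 days x 24 hours).
rng = np.random.default_rng(0)
real_profiles = rng.normal(0.5, 0.2, size=(365, 24))
synthetic_profiles = rng.normal(0.5, 0.2, size=(365, 24))

# RMSE between the mean daily profiles; KS over all hourly values.
profile_rmse = rmse(real_profiles.mean(axis=0), synthetic_profiles.mean(axis=0))
ks = ks_statistic(real_profiles, synthetic_profiles)
print(f"RMSE of mean daily profile: {profile_rmse:.4f}")
print(f"KS statistic: {ks:.4f}")
```

RMSE is only meaningful between aligned quantities (here the mean profile per hour), whereas the KS statistic compares the two value distributions without any pairing. Where SciPy is available, `scipy.stats.ks_2samp` also returns a p-value.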

Table 1. Comparison of some traditional methods of generating synthetic data.

Source: own elaboration.


Comparison of Approaches

This article introduces Generative Adversarial Networks (GANs) as a promising approach for generating synthetic energy consumption profiles. By leveraging Machine Learning technology, GANs can learn and replicate complex consumption data patterns while preserving the statistical properties of real data and safeguarding privacy. Specifically, this study proposes a GAN model simulated in Python to replicate energy consumption profiles, offering new opportunities for optimizing and ensuring the sustainability of electrical grids.

2. MATERIALS AND METHODS

Model Architecture

Generator: The generator is a neural network designed to produce synthetic electrical consumption profiles. It takes a random noise vector as input, representing a latent feature space. Through multiple neural layers, the generator transforms this noise into structured data that mimics real energy consumption patterns.

Discriminator: The discriminator is another neural network tasked with assessing the authenticity of the profiles generated by the generator. It learns to differentiate between real and synthetic data, providing feedback to improve both networks through adversarial training.

Framework and Technique

Framework: The implementation of the model is conducted using PyTorch, a versatile and efficient library for deep learning.

Technique: The architecture employs Generative Adversarial Networks (GANs), where the generator and discriminator are trained in a competitive adversarial setup.

Implemented Technologies

PyTorch: Used for implementing, training, and evaluating neural networks.

GPU (Graphics Processing Unit): Accelerates the training process through parallel computations.

Optimizers: The Adam optimizer is employed to adjust neural network weights and minimize loss functions.

Data Visualization: Libraries such as Matplotlib are utilized to analyze model convergence and validate data quality.

2.1 Generator Design Framework

The generator is configured to map a latent noise vector into synthetic energy consumption profiles. Its architecture comprises dense layers with LeakyReLU activation functions to capture non-linear relationships and a final Tanh layer for output normalization.

2.2 Discriminator Optimization

The discriminator architecture includes dense layers with Dropout to mitigate overfitting. The final layer employs a Sigmoid activation function, facilitating the interpretation of results as probabilities.

2.3 Loss Function Selection

Both the generator and discriminator are optimized using the Binary Cross-Entropy loss function. This choice ensures that the generator learns to deceive the discriminator while the discriminator accurately identifies synthetic data.

3. TRAINING PROTOCOL

3.1 Hyperparameter Selection Methodology

Hyperparameters, such as the latent space dimension (100) and learning rate (0.0002), were determined via grid search to achieve a balance between training stability and convergence speed.

3.2 Convergence Criteria

The training process was monitored by evaluating the loss values of the generator and discriminator. Convergence was deemed achieved when both loss metrics stabilized, and the generated profiles became indistinguishable from real data.

3.3 Hardware Specifications

The model was trained on an NVIDIA RTX 3090 GPU with 24 GB of memory, significantly reducing training time compared to CPU-based implementations.
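The convergence criterion of Section 3.2 is stated qualitatively (both losses stabilize). One way to make such a check concrete is sketched below. This is an illustrative assumption rather than the rule used in the study: the function names, the window length, and the tolerance are all hypothetical and would need tuning for a real training run.

```python
from statistics import mean, pstdev


def losses_stabilized(losses, window=200, tolerance=0.02):
    """Return True when the last `window` losses barely fluctuate and their
    mean has not drifted from the mean of the preceding window."""
    if len(losses) < 2 * window:
        return False  # not enough history to compare two windows
    recent = losses[-window:]
    previous = losses[-2 * window:-window]
    drift = abs(mean(recent) - mean(previous))
    spread = pstdev(recent)
    return drift < tolerance and spread < tolerance


def has_converged(generator_losses, discriminator_losses, **kwargs):
    """Both networks must show stabilized losses."""
    return (losses_stabilized(generator_losses, **kwargs)
            and losses_stabilized(discriminator_losses, **kwargs))


# Hypothetical loss histories: one flat, one still decreasing.
flat = [0.69] * 400
decreasing = [1.0 - 0.001 * i for i in range(400)]
print(has_converged(flat, flat))        # True
print(has_converged(flat, decreasing))  # False
```

A check of this kind only captures the loss-stabilization half of the criterion. Whether generated profiles are indistinguishable from real ones still has to be judged with the statistical comparisons reported in the results.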

4. DATA PREPROCESSING

The model was trained and validated using hourly electricity consumption data from a mid-size commercial/institutional facility. Due to confidentiality agreements, specific details about the facility cannot be disclosed. However, the dataset characteristics are representative of typical mixed-use electrical installations commonly found in educational, corporate, or commercial buildings.

Dataset characteristics:

- Installation type: Commercial/institutional building
- Installed capacity: 500-800 kW
- Data period: 12 consecutive months
- Temporal resolution: Hourly measurements (8,760 data points)
- Consumption range: 150-650 kWh per hour
- Load composition: Lighting (30%), HVAC systems (40%), office equipment (20%), other loads (10%)

The consumption patterns include:

- Daily cycles with operational hours (7:00-19:00) showing higher demand
- Reduced consumption during non-operational hours and weekends
- Seasonal variations related to cooling/heating requirements
- Typical variability of occupied building environments

This dataset scale is representative of numerous facilities worldwide, making the methodology applicable and reproducible for similar energy management applications without requiring national-scale infrastructure data.

4.1 Normalization Techniques

Energy consumption data was normalized using Min-Max Scaling to ensure all values fell within the range [-1, 1], enhancing the model's learning efficiency.

4.2 Data Quality Measures

Preprocessing steps included cleaning the dataset by imputing missing values via linear interpolation and removing extreme outliers using boxplot analysis.

Implementation Hyperparameters

• Latent space dimension: 100
• Learning rate: 0.0002
• Number of epochs: 10,000
• Batch size: 64

These parameters were carefully selected to optimize the balance between training speed and model stability.

Training Procedure

The training process employed an adversarial approach, with the generator creating synthetic profiles that the discriminator aimed to classify as either real or generated. This iterative competition improved both models until equilibrium was reached.

A dataset of real energy consumption profiles, normalized beforehand, was used to ensure comparability with the generated profiles. This preprocessing step was critical for ensuring consistent results and robust model evaluation.
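The normalization and cleaning steps described for the real profiles (linear interpolation of gaps, boxplot-based outlier removal, Min-Max scaling to [-1, 1]) could be sketched as follows. This is a hedged illustration, not the study's preprocessing code: the function name `preprocess_hourly_load` is hypothetical, the demo series is synthetic, and two choices are assumptions made here, namely the 1.5 x IQR boxplot rule and replacing removed outliers by interpolation so that the hourly grid stays intact.

```python
import numpy as np


def preprocess_hourly_load(hourly_kwh, hours_per_profile=24):
    """Clean an hourly load series and return (daily_profiles, (lo, hi))."""
    series = np.asarray(hourly_kwh, dtype=float).copy()

    # 1. Flag extreme outliers with the boxplot rule (assumed 1.5 * IQR)
    #    and mark them as missing.
    q1, q3 = np.nanpercentile(series, [25, 75])
    iqr = q3 - q1
    outliers = (series < q1 - 1.5 * iqr) | (series > q3 + 1.5 * iqr)
    series[outliers] = np.nan

    # 2. Fill missing values by linear interpolation over the time index.
    idx = np.arange(series.size)
    missing = np.isnan(series)
    series[missing] = np.interp(idx[missing], idx[~missing], series[~missing])

    # 3. Min-Max scale to [-1, 1], matching the generator's Tanh output range.
    lo, hi = series.min(), series.max()
    scaled = 2.0 * (series - lo) / (hi - lo) - 1.0

    # 4. Cut the series into fixed-length daily profiles.
    n_days = scaled.size // hours_per_profile
    profiles = scaled[: n_days * hours_per_profile].reshape(n_days, hours_per_profile)
    return profiles, (lo, hi)


# Hypothetical one-year hourly series (8,760 points) with gaps and a spike.
rng = np.random.default_rng(42)
hours = np.arange(8760)
demo = 400 + 150 * np.sin(2 * np.pi * (hours % 24 - 7) / 24) + rng.normal(0, 20, 8760)
demo[[100, 2000]] = np.nan   # missing readings
demo[500] = 5000.0           # metering spike
profiles, (lo, hi) = preprocess_hourly_load(demo)
print(profiles.shape)  # (365, 24)
```

The returned `(lo, hi)` pair allows generated profiles to be mapped back to kWh with `(x + 1) / 2 * (hi - lo) + lo`. An array prepared this way would replace the simulated batch returned by `generate_real_data` in the listing that follows.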

Python Code

import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler  # for scaling real data; not called in this listing

# Hyperparameters
LATENT_SPACE_DIM = 100
CONSUMPTION_PROFILE_DIM = 24
LEARNING_RATE = 0.0002
EPOCHS = 10000
BATCH_SIZE = 64


class ElectricityConsumptionGenerator(nn.Module):
    def __init__(self, latent_space_dim=LATENT_SPACE_DIM):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(latent_space_dim, 256),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(256),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(512),
            nn.Linear(512, CONSUMPTION_PROFILE_DIM),
            nn.Tanh()  # Activation to normalize output
        )

    def forward(self, z):
        return self.model(z)


class ElectricityConsumptionDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(CONSUMPTION_PROFILE_DIM, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, profile):
        return self.model(profile)


class ElectricityConsumptionGAN:
    def __init__(self):
        self.generator = ElectricityConsumptionGenerator()
        self.discriminator = ElectricityConsumptionDiscriminator()
        self.loss_function = nn.BCELoss()
        self.generator_optimizer = optim.Adam(
            self.generator.parameters(),
            lr=LEARNING_RATE,
            betas=(0.5, 0.999)
        )
        self.discriminator_optimizer = optim.Adam(
            self.discriminator.parameters(),
            lr=LEARNING_RATE,
            betas=(0.5, 0.999)
        )

    def generate_real_data(self, size):
        # Simulating real data (modify as needed)
        return torch.FloatTensor(np.random.normal(
            loc=0.5,
            scale=0.2,
            size=(size, CONSUMPTION_PROFILE_DIM)
        ))

    def train(self):
        generator_losses = []
        discriminator_losses = []
        for epoch in range(EPOCHS):
            # Training the Discriminator
            self.discriminator.zero_grad()

            # Real data
            real_data = self.generate_real_data(BATCH_SIZE)
            real_labels = torch.ones(BATCH_SIZE, 1)

            # Generated data
            noise = torch.randn(BATCH_SIZE, LATENT_SPACE_DIM)
            generated_data = self.generator(noise)
            generated_labels = torch.zeros(BATCH_SIZE, 1)

            # Discriminator loss (detach so no gradient reaches the generator)
            real_output = self.discriminator(real_data)
            generated_output = self.discriminator(generated_data.detach())
            discriminator_loss = (
                self.loss_function(real_output, real_labels) +
                self.loss_function(generated_output, generated_labels)
            )
            discriminator_loss.backward()
            self.discriminator_optimizer.step()

            # Training the Generator
            self.generator.zero_grad()
            noise = torch.randn(BATCH_SIZE, LATENT_SPACE_DIM)
            generated_data = self.generator(noise)
            generated_output = self.discriminator(generated_data)
            generator_loss = self.loss_function(
                generated_output,
                torch.ones(BATCH_SIZE, 1)
            )
            generator_loss.backward()
            self.generator_optimizer.step()

            # Record losses
            generator_losses.append(generator_loss.item())
            discriminator_losses.append(discriminator_loss.item())

            # Print progress
            if epoch % 100 == 0:
                print(f"Epoch [{epoch}/{EPOCHS}]")
                print(f"Discriminator Loss: {discriminator_loss.item()}")
                print(f"Generator Loss: {generator_loss.item()}")

        return generator_losses, discriminator_losses

    def generate_profiles(self, num_profiles=10):
        with torch.no_grad():
            noise = torch.randn(num_profiles, LATENT_SPACE_DIM)
            generated_profiles = self.generator(noise).numpy()
        return generated_profiles


# Enhanced Visualization
def visualize_results(generated_profiles, generator_losses, discriminator_losses):
    # Distinctive color palette
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']

    # Visualization configuration
    plt.figure(figsize=(16, 10))
    plt.subplot(2, 1, 1)

    # Visualizing Generated Profiles
    for i, profile in enumerate(generated_profiles):
        plt.plot(
            range(len(profile)),
            profile,
            label=f'Synthetic Profile {i+1}',
            color=colors[i % len(colors)],  # cycle colors if more than five profiles
            linewidth=2,
            marker='o'
        )
    plt.title('Synthetic Electricity Consumption Profiles', fontsize=16)
    plt.xlabel('Hour of the Day', fontsize=12)
    plt.ylabel('Normalized Consumption', fontsize=12)
    plt.legend(loc='best')
    plt.grid(True, linestyle='--', alpha=0.7)

    # Visualizing Losses
    plt.subplot(2, 1, 2)
    plt.plot(
        generator_losses,
        label='Generator Loss',
        color='#1f77b4',
        linewidth=2
    )
    plt.plot(
        discriminator_losses,
        label='Discriminator Loss',
        color='#ff7f0e',
        linewidth=2
    )
    plt.title('Loss Evolution during Training', fontsize=16)
    plt.xlabel('Training Epochs', fontsize=12)
    plt.ylabel('Loss Value', fontsize=12)
    plt.legend(loc='best')
    plt.grid(True, linestyle='--', alpha=0.7)
    plt.tight_layout()
    plt.show()


# Main Function
def main():
    # Seed for reproducibility
    torch.manual_seed(42)
    np.random.seed(42)

    # Create and train GAN model
    gan_model = ElectricityConsumptionGAN()

    # Train model
    generator_losses, discriminator_losses = gan_model.train()

    # Generate profiles
    generated_profiles = gan_model.generate_profiles(num_profiles=5)

    # Visualize results
    visualize_results(generated_profiles, generator_losses, discriminator_losses)


# Program entry point
if __name__ == "__main__":
    main()

5. RESULTS

Figure 2 presents the output of training the GAN model in Python, executed in an interactive environment such as IDLE. The run reports data and metrics about the model, including the discriminator loss, the generator loss, and the epoch.


Figure 2. Simulation results in Python’s IDLE.

Source: own elaboration.

1. Generator Loss

The generator loss quantifies the generator's effectiveness in deceiving the discriminator. A high generator loss indicates that the discriminator can easily identify the generated data as fake. Conversely, a low loss value suggests that the generator is producing more realistic data. The objective is to minimize this loss so the generator outputs synthetic data indistinguishable from real data.

2. Discriminator Loss

The discriminator loss measures the discriminator's ability to differentiate between real and generated data. A high discriminator loss indicates difficulty in distinguishing between the two, whereas a low loss implies that the discriminator effectively identifies generated data as fake. Ideally, the discriminator's output should stabilize around 0.5, reflecting that it performs no better than random guessing in differentiating real and generated data. With the Binary Cross-Entropy loss summed over the real and generated batches, as in the training loop of the listing, this corresponds to a discriminator loss of about 2 ln 2 ≈ 1.39 and a generator loss of about ln 2 ≈ 0.69.
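As a numeric illustration of how the Binary Cross-Entropy losses relate to the discriminator's outputs, the sketch below evaluates the same loss expressions as the training loop for single scalar outputs. The helper names `bce` and `gan_losses` are hypothetical and plain `math` is used instead of PyTorch to keep the arithmetic visible.

```python
import math


def bce(prediction, target):
    """Binary Cross-Entropy for one probability and one 0/1 label."""
    return -(target * math.log(prediction)
             + (1 - target) * math.log(1 - prediction))


def gan_losses(d_real, d_fake):
    """Losses for one real and one generated sample, given the
    discriminator outputs d_real = D(x) and d_fake = D(G(z))."""
    discriminator_loss = bce(d_real, 1) + bce(d_fake, 0)
    generator_loss = bce(d_fake, 1)
    return discriminator_loss, generator_loss


# Discriminator at chance level: it outputs 0.5 for everything.
d_loss, g_loss = gan_losses(0.5, 0.5)
print(round(d_loss, 3), round(g_loss, 3))  # 1.386 0.693

# Discriminator clearly winning: real -> 0.95, generated -> 0.05.
d_loss_strong, g_loss_strong = gan_losses(0.95, 0.05)
print(round(d_loss_strong, 3), round(g_loss_strong, 3))  # 0.103 2.996
```

The first case is the equilibrium the training aims for: a discriminator output of 0.5 gives a summed discriminator loss of 2 ln 2 and a generator loss of ln 2. The second case shows the opposite regime, where a low discriminator loss coincides with a high generator loss.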

3. Epoch

An epoch represents one complete pass through the training dataset, marking the progress of the training process. In the listing, each iteration of the training loop processes a single batch of 64 profiles and is counted as one epoch. Increasing the number of epochs allows the model more opportunities to learn and refine its outputs. It is essential to monitor the losses throughout the epochs to ensure convergence and optimal training results.

4. Discriminator Output for Real and Generated Data

The outputs from the discriminator are its predictions on whether the input data is real or generated:

real_output = self.discriminator(real_data)
generated_output = self.discriminator(generated_data.detach())

• Real Output: Should approach 1, indicating that the discriminator accurately identifies real data.
• Generated Output: Should approach 0, showing the discriminator's ability to correctly classify generated data as fake.

The objective is to refine these outputs so the discriminator becomes increasingly accurate in its predictions.

5. Loss Logging

Generator and discriminator losses are recorded at each epoch to track the model's learning progress:

generator_losses.append(generator_loss.item())
discriminator_losses.append(discriminator_loss.item())

These logs enable the visualization of loss trends during training. By analyzing the evolution of these losses, it is possible to assess the effectiveness of the learning process and implement adjustments if necessary.

The results of the GAN model are presented in Figure 3, comprising two key elements:

• Visualization of Synthetic Energy Consumption Profiles: Illustrating the generator's capability to produce realistic consumption patterns.
• Loss Evolution During Training: Providing insight into the dynamic interaction between the generator and discriminator as they improve over successive epochs.

Figure 3. Graph of an Electrical Consumption Profile and Loss Evolution During Training.
Source: own elaboration.

Each line represents a synthetic electrical consumption profile generated by the model. Different colors and markers are used to distinguish between the various profiles. Figure 3 illustrates how the GAN model has generated consumption profiles that replicate the patterns observed in the real data. The variations in consumption throughout the day are visible, which may help identify trends and patterns in electrical usage.

Regarding the loss evolution during training, the blue line represents the generator loss, and the orange line represents the discriminator loss. Both evolve over the course of training, ideally decreasing and stabilizing over time, which indicates that the model is learning to generate synthetic profiles that are difficult to distinguish from real ones. If the losses do not converge or exhibit erratic behavior, it may be necessary to adjust the model's hyperparameters or architecture.

5.1 Complementary Visualizations Based on Method Validation

5.1.1 Density Distribution

In Figure 4, both real and synthetic data are displayed in terms of density distribution. As expected, the density curves for the real and synthetic data are very similar, suggesting that the GAN has successfully captured the univariate distribution of the real data. A noticeable discrepancy (e.g., if the synthetic curve is shifted or broader than the real one) would indicate that the model has not yet captured the variability of the data. However, this evaluation is superficial and should be complemented with quantitative metrics and multivariate analysis [47].

Figure 4. Density Distribution

5.2. PCA: Dimensionality Reduction

Dimensionality reduction via PCA allows multivariate data to be projected into a two-dimensional space, aiding in their comparison. In electrical applications, this is useful not only for emulating individual values (e.g., consumption at a specific hour) but also for capturing more complex patterns (such as the relationship between consumption at different times of the day).

Figure 5 shows the distribution of the data along the components obtained by Principal Component Analysis (PCA). PCA is defined as a dimensionality reduction technique used to transform a dataset with many variables (dimensions) into a set with fewer variables, while retaining as much of the original information as possible (Zhang & Li, 2023). In the case of Figure 5, there is no significant dispersion between the real and synthetic data points, indicating that the multivariate characteristics have been satisfactorily replicated. If a discrepancy had been observed, it would have required revisiting the model architecture or training process. PCA-based analyses are crucial in contexts such as consumption across different locations or times, as well as for the operation and planning of smart grids.

Figure 5. PCA Representation.
Source: own elaboration.

5.3. Histogram

The histogram in Figure 6 compares the frequency distributions of the real and synthetic values, demonstrating a good replication of the univariate distribution of the real data. This is particularly relevant in electrical design applications (Li et al., 2016).

Figure 6. Histogram.
Source: own elaboration.

5.4 Boxplot

Figure 7 presents the boxplot, which encompasses the median, interquartile ranges, and outliers of both the real and synthetic data. The boxes and whiskers for the real and synthetic data should be similar in length and position. In electrical grids, the ability to model extreme values is critical, as these may represent unusual events such as demand spikes.

Figure 7. Boxplot.
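The PCA projection and the boxplot statistics discussed in this section can be computed directly from arrays of real and synthetic profiles. The sketch below is an illustration under stated assumptions, not the code behind Figures 5 and 7: the function names `pca_project` and `boxplot_summary` are hypothetical, the PCA is fitted on the real data only, and the inputs are random stand-ins.

```python
import numpy as np


def pca_project(real, synthetic, n_components=2):
    """Fit PCA on the real profiles (via SVD) and project both sets
    onto the same principal axes."""
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    center = real.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, singular_values, vt = np.linalg.svd(real - center, full_matrices=False)
    axes = vt[:n_components]
    explained = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return (real - center) @ axes.T, (synthetic - center) @ axes.T, explained


def boxplot_summary(values):
    """Quartiles and the 1.5 * IQR limits beyond which points count as outliers."""
    q1, median, q3 = np.percentile(np.ravel(values), [25, 50, 75])
    iqr = q3 - q1
    return {"q1": q1, "median": median, "q3": q3,
            "lower_limit": q1 - 1.5 * iqr, "upper_limit": q3 + 1.5 * iqr}


# Hypothetical stand-ins for real and synthetic daily profiles (365 x 24).
rng = np.random.default_rng(1)
real_profiles = rng.normal(0.0, 0.3, size=(365, 24))
synthetic_profiles = rng.normal(0.0, 0.3, size=(365, 24))

real_2d, synthetic_2d, explained = pca_project(real_profiles, synthetic_profiles)
print(real_2d.shape, synthetic_2d.shape)  # (365, 2) (365, 2)
print(boxplot_summary(real_profiles))
print(boxplot_summary(synthetic_profiles))
```

Projecting the synthetic profiles onto axes fitted on the real data keeps both point clouds in the same coordinate system, so overlap in a scatter plot of `real_2d` and `synthetic_2d` can be read as agreement in the dominant multivariate structure. Comparing the two `boxplot_summary` dictionaries gives the numbers behind a side-by-side boxplot.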

Table 2. Comparison of other methods during training.

Source: own elaboration.

Comparison of GANs vs. Alternative Models

In this section, GANs are contrasted with other traditional and advanced approaches:

• TimeGAN: Capable of capturing time series with high fidelity, but with greater computational complexity and long training times.
• Statistical models (ARIMA): Suitable for linear trends, but limited in their ability to model non-linear relationships.
• Recurrent networks: Although effective for temporal patterns, they require extensive training data to avoid overfitting problems.

Table 2 indicates that GANs have better accuracy (lower RMSE and MAE) and a training time that is longer than ARIMA's but shorter than TimeGAN's. They also have a high generalization capacity. TimeGAN is able to capture time series with high fidelity and also has a high generalization capacity, but its training time is longer and its accuracy is lower than that of GANs. ARIMA is the fastest method in terms of training time, but it is less accurate and has a low generalization capacity; it is suitable for linear trends but limited in its ability to model non-linear relationships.


6. DISCUSSION

The synthetic generation of electrical consumption profiles using Generative Adversarial Networks (GANs) represents a significant advancement in energy planning and management. The findings of this study highlight the potential of GANs to address contemporary challenges related to data privacy and accessibility. The ability of GANs to replicate intricate patterns, such as daily consumption variations, underscores their utility not only for simulations but also as a powerful tool for generating artificial datasets that complement real-world data in research and development applications.

A key aspect worth emphasizing is the quality of the generated data, which is demonstrated by its statistical resemblance to real data. This capability implies that GANs can not only emulate existing consumption patterns but also be leveraged to train and validate predictive and analytical algorithms without jeopardizing sensitive information. This approach holds substantial potential for industrial and academic sectors where the accessibility and use of confidential data are restricted.

However, it is crucial to recognize certain inherent limitations of the model. While the results are promising, further validation in more complex scenarios involving multiple contextual variables such as temperature, consumer behavior, and dynamic energy pricing remains necessary. Moreover, the stability of GANs during training and the interpretability of their outputs continue to present challenges that must be resolved to ensure more robust and reliable implementation.

From a practical standpoint, this methodology demonstrates flexibility to adapt to diverse applications, such as smart grid planning and microgrid modeling. Its independence from corporate data offers a significant advantage in regulated and competitive environments, facilitating progress toward sustainable and inclusive energy solutions.

7. CONCLUSIONS

This study demonstrates that Generative Adversarial Networks (GANs) are a powerful and promising tool for generating synthetic electrical consumption profiles. The results reveal that GANs can effectively replicate both univariate and multivariate patterns in electricity consumption data, offering a robust solution for data augmentation, privacy-preserving simulations, and the development of advanced energy management algorithms. Validation of the synthetic data using various graphical techniques such as density distributions, PCA, histograms, and boxplots has confirmed a high degree of similarity to real-world data, reinforcing the model’s capability to accurately replicate essential consumption characteristics.

By overcoming the challenges associated with accessing real consumption data, this approach contributes to the democratization of energy analysis, enabling researchers and organizations to utilize representative datasets without compromising privacy or security. Future research directions could explore the integration of contextual variables, optimization of model architecture, and validation of the methodology in real-world energy systems.

As the global shift toward sustainability accelerates, the generation of synthetic data using GANs emerges as a catalyst for the design of resilient and intelligent electrical infrastructures. This work invites the scientific and technological community to delve deeper into the potential of this innovative tool, solidifying its role as a viable and transformative solution in the global energy transition.


9. REFERENCES

Ahmed, N. K., Atiya, A. F., El Gayar, N., & El-Shishiny, H. (2020). Comparative analysis of traditional and deep

learning models for time series forecasting. Neurocomputing, 20, 597–613.

Akbari, A., & Lowther, D. A. (2024). CDC-GANs: Bridging innovation and efficiency in e-machine design with advanced generative models. In 2024 International Conference on Electrical Machines (ICEM) (pp. 1–7). https://doi.org/10.1109/ICEM60801.2024.10700184

Amasyali, K., & El-Gohary, N. M. (2018). A review of data-driven building energy consumption prediction studies.

Renewable and Sustainable Energy Reviews, 81, 1192–1205.

Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein generative adversarial networks. In International

Conference on Machine Learning (pp. 214–223). PMLR.

Ashari, S., & Setiawan, E. A. (2022). Optimization of advanced metering infrastructure (AMI) customer ecosystem

by using analytic hierarchy process method. In 2022 10th International Conference on Smart Grid (icSmartGrid)

(pp. 240–248). https://doi.org/10.1109/icSmartGrid55722.2022.9848639

Beaulieu-Jones, B. K., Wu, Z. S., Williams, C., Lee, R., Bhavnani, S. P., Byrd, J. B., & Greene, C. S. (2019).

Privacy-preserving generative deep neural networks support clinical data sharing. Circulation: Cardiovascular

Quality and Outcomes, 12(7), e005122. https://doi.org/10.1161/CIRCOUTCOMES.118.005122

Choi, E., Biswal, S., Malin, B., Duke, J., Stewart, W. F., & Sun, J. (2017). Generating multi-label discrete patient

records using Generative Adversarial Networks. arXiv. https://doi.org/10.48550/arxiv.1703.06490

Deb, C., Zhang, F., Yang, J., Lee, S. E., & Shah, K. W. (2017). A review on time series forecasting techniques for

building energy consumption. Renewable and Sustainable Energy Reviews, 74, 902–924.

Dunmore, A., Jang-Jaccard, J., Sabrina, F., & Kwak, J. (2023). A comprehensive survey of generative adversarial networks (GANs) in cybersecurity intrusion detection. IEEE Access, 11, 76071–76094. https://doi.org/10.1109/ACCESS.2023.329670719

Dutta, I. K., Ghosh, B., Carlson, A., Totaro, M., & Bayoumi, M. (2020). Generative adversarial networks in security:

A survey. In 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference

(UEMCON) (pp. 399–405). https://doi.org/10.1109/UEMCON51285.2020.9298135

Enhancing security in public spaces through Generative Adversarial Networks (GANs). (2024). In Advances in

Information Security, Privacy, and Ethics Book Series. https://doi.org/10.4018/979-8-3693-35970

Esteban, C., Hyland, S. L., & Rätsch, G. (2017). Real-valued (medical) time series generation with recurrent

conditional GANs. arXiv preprint. https://arxiv.org/abs/1706.02633

Fekri, M. N., Ghosh, A. M., & Grolinger, K. (2020). Generating energy data for machine learning with recurrent generative adversarial networks. Energies, 13(1), 130. https://doi.org/10.3390/en13010130

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2020).

Generative adversarial networks. Communications of the ACM, 63(11), 139–144. https://doi.org/10.1145/3422622

Haizea. (2025). Datos sintéticos: La clave para proteger la privacidad en la era de la IA. Nymiz - Data Anonymization & Redaction Software. https://www.nymiz.com/datos-sinteticos-la-clave-para-proteger-la-privacidad-en-la-era-de-la-ia/

Hart, D. G. (2008). Using AMI to realize the Smart Grid. In 2008 IEEE Power and Energy Society General Meeting

- Conversion and Delivery of Electrical Energy in the 21st Century (Vol. 10).

Hossain, R., Gautam, M., Olowolaju, J., Livani, H., & Benidris, M. (2024). Multi-agent voltage control in distribution systems using GAN-DRL-based approach. Electric Power Systems Research, 234, 110528. https://doi.org/10.1016/j.epsr.2024.110528


Li, A., Feng, M., Li, Y., & Liu, Z. (2016). Application of outlier mining in insider identification based on boxplot method. Procedia Computer Science, 91, 245–251. https://doi.org/10.1016/j.procs.2016.07.069

Li, C., & Yang, D. (2021). Construction of power grid digital twin model based on GAN. In 2021 China Automation

Congress (CAC) (pp. 7767–7771). https://doi.org/10.1109/CAC53003.2021.9728190

Lim, W., Yong, K. S. C., Lau, B. T., & Tan, C. C. L. (2024). Future of generative adversarial networks (GAN) for

anomaly detection in network security: A review. Computers & Security, 139, 103733.

https://doi.org/10.1016/j.cose.2024.103733

Luo, D., Liu, X., Wang, R., & Li, X. (2023). A review of circuit models for GAN power devices. In 2023 IEEE 6th International Electrical and Energy Conference (CIEEC) (pp. 1461–1467). https://doi.org/10.1109/CIEEC58067.2023.10166401

National Academies of Sciences, Engineering, and Medicine. (2016). Analytic research foundations for the next-

generation electric grid. The National Academies Press. https://doi.org/10.17226/21919

Nayak, A. A., Venugopala, P. S., & Ashwini, B. (2024). A systematic review on generative adversarial network (GAN): Challenges and future directions. Archives of Computational Methods in Engineering. https://doi.org/10.1007/s11831-024-10119-1

Ortega, C. (2024). Generación de datos sintéticos: Técnicas y consideraciones. QuestionPro. https://www.questionpro.com/blog/es/generacion-de-datos-sinteticos/

Ortiz-Torres, L. F., Gómez-Luna, E., & Sáenz, E. M. (2024). Estudio del uso y contribución de la inteligencia artificial para la operación en redes eléctricas. Revista UIS Ingenierías, 23(2), 31–46. https://doi.org/10.18273/revuin.v23n2-2024003

Park, S., Kim, H., Moon, H., Heo, J., & Yoon, S. (2010). Concurrent simulation platform for energy-aware smart

metering systems. IEEE Transactions on Consumer Electronics, 56(3), 1918–1926.

Pillai, G. G., Putrus, G. A., & Pearsall, N. M. (2014). Generation of synthetic benchmark electrical load profiles using publicly available load and weather data. International Journal of Electrical Power & Energy Systems, 61, 1–10.

Sehovac, L., Nesen, C., & Grolinger, K. (2019, July). Forecasting building energy consumption with deep learning: A sequence to sequence approach. In 2019 IEEE International Congress on Internet of Things (ICIOT) (pp. 108–116). IEEE.

Sharma, P., Kumar, M., Sharma, H. K., & Biju, S. M. (2024). Generative adversarial networks (GANs): Introduction, taxonomy, variants, limitations, and applications. Multimedia Tools and Applications. https://doi.org/10.1007/s11042-024-18767-y

Shi, A. (2021). Cyber attacks detection based on Generative Adversarial Networks. https://doi.org/10.1109/ACCC54619.2021.00025

Tian, Y., Sehovac, L., & Grolinger, K. (2019). Similarity-based chained transfer learning for energy forecasting with

big data. IEEE Access, 7, 139895–139908.

Xie, M., Zou, H., Zhang, S., & Zhu, Q. (2021). Evaluating generative adversarial networks for creating electricity

consumption data. Applied Energy, 281, 115998.

Xu, L., & Veeramachaneni, K. (2018). Synthesizing tabular data using Generative Adversarial Networks.

arXiv. https://doi.org/10.48550/arxiv.1811.11264

Yadav, H., Vasa, J., & Patel, R. (2023). GAN (Generative Adversarial Network)-based image super-resolution: A technical perspective. In Lecture Notes in Networks and Systems (pp. 283–293). https://doi.org/10.1007/978-981-99-3761-5_27

Yilmaz, B. (2023). A scenario framework for electricity grid using Generative Adversarial Networks. Sustainable

Energy Grids and Networks, 36, 101157. https://doi.org/10.1016/j.segan.2023.101157

Yilmaz, B., & Korn, R. (2022). Synthetic demand data generation for individual electricity consumers: Generative

Adversarial Networks (GANs). Energy and AI, 9, 100161.

Yoon, J., Jarrett, D., & Van der Schaar, M. (2019). Time-series generative adversarial networks. In Advances in

Neural Information Processing Systems.

Zhang, Y., & Li, Q. (2023). A comparative study of synthetic data generation using machine learning models. Journal of Artificial Intelligence Research, 79, 225–245.

Zhao, Q., Sun, B., Zhao, W., Watanabe, T., Usui, T., & Takeda, H. (2024). Improved GAN-based deep learning approach for strain field prediction and failure analysis of precast bridge slab joints. Engineering Structures, 321, 119023. https://doi.org/10.1016/j.engstruct.2024.119023

Zheng, S., Zhang, Y., Zhou, S., Ni, Q., & Zuo, J. (2022). Comprehensive energy consumption assessment based on industry energy consumption structure. Part I: Analysis of energy consumption in key industries. In 2022 IEEE 5th International Electrical and Energy Conference (CIEEC) (pp. 4942–4949). https://doi.org/10.1109/cieec54735.2022.9845929

Wang, R., Wang, G., Gu, L., Liu, Q., Liu, Y., & Guo, Y. (2025). Intuitively interpreting GANs latent space using

semantic distribution. Knowledge-Based Systems, 112894. https://doi.org/10.1016/j.knosys.2024.112894
