GENERACIÓN SINTÉTICA DE PERFILES DE
CONSUMO ELÉCTRICO MEDIANTE REDES
GENERATIVAS ANTAGÓNICAS (GAN)
SYNTHETIC GENERATION OF ELECTRICAL CONSUMPTION
PROFILES USING GENERATIVE ADVERSARIAL NETWORKS
(GANS)
Luis Ferney Ortíz Torres
Received: 15/11/2024. Accepted: 14/10/2025
Conectando mentes, energizando el futuro
Resumen
La previsión precisa del consumo de energía es esencial para la planificación y gestión eficaces de las
infraestructuras eléctricas. Este artículo presenta un modelo que aprovecha las redes generativas
adversariales (GAN) para producir perfiles sintéticos de consumo de energía, abordando los retos planteados
por el acceso limitado a los datos críticos corporativos o empresariales necesarios para el funcionamiento
de los sistemas eléctricos. El enfoque basado en GAN genera perfiles de consumo realistas, cuya similitud
estadística con los conjuntos de datos del mundo real se evaluó rigurosamente. Los resultados demuestran
que los perfiles sintéticos se asemejan mucho a los datos auténticos, lo que subraya la capacidad de los
GAN como herramienta robusta para simular y predecir patrones de consumo energético. En conclusión,
este artículo subraya el potencial transformador de los GAN para avanzar en la planificación energética y
permitir simulaciones más precisas en contextos en los que los datos del mundo real son escasos o difíciles
de obtener.
PALABRAS CLAVE: Redes generativas antagónicas (GAN), modelos predictivos, aprendizaje automático,
análisis de datos, eficiencia energética, modelado predictivo.
Abstract
Accurate energy consumption forecasting is essential for the effective planning and management of electrical
infrastructure. This article introduces a model leveraging Generative Adversarial Networks (GANs) to produce
synthetic energy consumption profiles, addressing the challenges posed by limited access to critical corporate
or enterprise data necessary for the operation of electrical systems. The GAN-based approach generates
realistic consumption profiles, which were rigorously evaluated for their statistical similarity to real-world
datasets. The results demonstrate that the synthetic profiles closely mimic authentic data, underscoring the
capability of GANs as a robust tool for simulating and predicting energy consumption patterns. In conclusion,
this article highlights the transformative potential of GANs in advancing energy planning and enabling more
accurate simulations in contexts where real-world data is scarce or difficult to obtain.
KEYWORDS: Generative Adversarial Networks (GANs), predictive models, machine learning, data privacy,
energy efficiency, predictive modeling.
1. INTRODUCTION
In an electrical grid, data from generation to
commercialization and the end user/prosumer
must be systematically collected, integrated, and
analyzed. These datasets must align with the
capabilities of modern measurement systems
while ensuring stringent privacy and security
protocols for data acquisition and
transmission. For instance, Advanced Metering
Infrastructure (AMI) (Hart, 2008; Ashari, 2022) is a
key technology used for real-time monitoring and
management of electricity consumption (Park et
al., 2010). Households, buildings, and industries
equipped with AMI automatically transmit energy
consumption data to their electricity providers.
This enables providers to improve energy supply
management, anticipate rationing needs, and
validate energy demand more effectively (Park et
al., 2010).
The growing need to optimize energy consumption
has become a critical challenge within the
evolving dynamics of the electric sector (Hossain
et al., 2024). This challenge is compounded by
exponential demand growth and the urgency of
advancing the energy transition and sustainability
initiatives. These demands necessitate the
development of scenarios that allow continuous
state and condition validation across electrical
grids (Zhen et al., 2022; Ortiz et al., 2024).
However, this also creates significant obstacles
for researchers, particularly in testing innovative
instruments, methods, and theories (National
Academies of Sciences, Engineering, and
Medicine, 2016; Yilmaz, 2023). Given the vital role
of electrical grids in daily life, access to data has
become indispensable for designing and validating
advanced mathematical and computational tools.
Therefore, stakeholders including policymakers,
industry professionals, and researchers must
collaborate to generate, validate, and make
synthetic data accessible to drive advancements
in the field (Akbari et al., 2024; Luo et al., 2023;
Enhancing Security in Public Spaces Through
Generative Adversarial Networks (GANs), 2024).
These efforts have the potential to improve the
planning, operation, and optimization of electrical
grids. Nonetheless, a major impediment lies in the
restricted access to real-world data, a sensitive
issue that could compromise national privacy and
security if mishandled (Lim et al., 2024; Shi, 2021;
Dunmore et al., 2023; Goodfellow et al., 2020).
This limitation restricts the availability of data for
researchers and other key players, prompting the
need for innovative approaches that transcend
conventional constraints. Tools like Generative
Adversarial Networks (GANs) offer a promising
avenue to address these challenges by creating
realistic synthetic datasets, thereby fostering
opportunities for progress in the sector.
Figure 1. Description of the operation of a GAN.
Source: own elaboration.
Sección ganadores “call for papers”
Goodfellow et al. pioneered the concept of
Generative Adversarial Networks (GANs) as an
adversarial process (Sharma et al., 2024). This
framework involves the simultaneous training of
two models: a Generator and a Discriminator. As
depicted in Figure 1, the Generator serves as a
generative model designed to approximate the
data distribution, while the Discriminator acts as
a discriminative model tasked with estimating the
probability that a given sample originates from the
training data rather than the Generator (Nayak et
al., 2024; Yadav et al., 2023; Dutta et al., 2020).
One of the most prevalent applications of GANs is
in privacy protection, where they create synthetic
datasets that mimic the statistical properties of
original data without exposing sensitive information
(Choi et al., 2017).
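
The adversarial process described above is commonly written as a two-player minimax game. The formulation below is the standard one from the GAN literature rather than an equation taken from this article; the symbols $G$ (Generator), $D$ (Discriminator), $p_{\text{data}}$ (distribution of the training data) and $p_z$ (distribution of the latent noise) are notation assumed here for illustration.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here $D(x)$ is the estimated probability that a sample $x$ originates from the training data, and $G(z)$ is the sample the Generator produces from the noise vector $z$.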
Beyond GANs, alternative methods exist for
generating statistically synthetic data. Ping et
al. demonstrated the utility of Bayesian models
for capturing the relationships within synthetic
data generation frameworks (Hindistan & Yetkin,
2023). However, the primary advantage of GANs
over traditional statistical approaches lies in their
superior capability to approximate real-world data
distributions. Xu and Veeramachaneni (2023)
highlighted the potential of GANs in producing
high-quality synthetic datasets beneficial
for data science applications. For instance,
techniques such as Recurrent Conditional GANs
(RCGANs) (Yilmaz & Korn, 2022), Time-Series
GANs (TimeGANs) (Esteban et al., 2017), and
Wasserstein-based models, including Conditional
Wasserstein GANs (CWGANs) (Arjovsky, 2017)
and Recurrent Conditional Wasserstein GANs
(RCWGANs), have been explored for generating
synthetic data with high fidelity.
Traditional methods like ARIMA or recurrent neural
networks (RNNs) have also been applied to synthetic
data generation but often fall short in capturing
complex, nonlinear relationships. GANs have
emerged as a robust alternative, finding applications
in sectors such as healthcare and cybersecurity.
However, their integration into the energy sector
remains at an early stage (Fekri, 2020).
Amasyali and El-Gohary (2018) conducted
an extensive review of energy forecasting
methodologies, reporting that 67% of the
analyzed studies utilized real data, 19% employed
simulated data, and 14% relied on publicly
available reference datasets. This reliance on real
data underscores the importance of historical
records and highlights the urgent need to develop
larger, high-quality datasets to advance energy
prediction capabilities. Although some real
datasets are publicly accessible, many studies
depend on private, proprietary data derived from
real-world scenarios (Sehovac & Grolinger, 2019).
In their review, Amasyali and El-Gohary (2018)
emphasized the role of simulation-based
approaches using tools such as EnergyPlus,
eQUEST, and Ecotect. These physical models
estimate energy consumption based on detailed
environmental and building characteristics.
However, acquiring such granular information
is often impractical. In contrast, data-driven
approaches leverage sensor-derived data and do
not require the same level of specificity. Simulation
techniques are predominantly utilized in the
design phase, whereas data-driven methods are
more commonly applied to demand and supply
management scenarios. Both approaches are
complementary and are selected based on
the specific objectives and constraints of each
application.
Deb et al. (2017) reviewed time-series forecasting
techniques for building energy consumption and
noted the effectiveness of simulation tools like
EnergyPlus, IES, and Ecotect in modeling energy
use for new buildings. When historical data is
unavailable, simulations offer a viable alternative.
Nevertheless, accurately forecasting energy
consumption involves accounting for numerous
complex factors, such as material properties,
climate conditions, and occupant behavior. While
simulations can approximate these variables, data-
driven methods often achieve greater accuracy for
existing buildings with accessible historical data.
Lazos et al. (2014) categorized energy forecasting
approaches into statistical, machine learning, and
physics-based models. Physics-based models
provide detailed, explainable predictions without
requiring historical data but demand extensive input
on structural, thermodynamic, and operational
parameters. Modeling occupant behavior within
these systems remains a significant challenge.
Conversely, data-driven methods, though reliant
on substantial historical data, excel in capturing
behavioral patterns without necessitating detailed
structural information.
Pillai et al. (2014) proposed a hybrid approach
combining consumption and weather data
to generate synthetic load profiles, marking a
significant advancement in realistic synthetic
data generation for energy applications. Despite
these advancements, generating synthetic energy
consumption profiles remains challenging due
to the interplay of human behavior and building
characteristics.
1.1 Traditional Methods for Synthetic Data Generation
Traditional techniques, such as statistical models
(e.g., ARIMA) and interpolation-based methods,
provide foundational tools but are inherently
limited in their ability to capture dynamic, nonlinear
patterns in energy data.
Table 1. Comparison of some traditional methods of generating synthetic data.
Source: own elaboration.
1.2 Applications of GANs in the Energy Sector
The application of GANs in the energy sector,
while still nascent, has shown promise. Studies like
Yilmaz (2023) have demonstrated their capability
to generate synthetic electricity demand profiles
that replicate complex temporal patterns with high
fidelity.
1.3 Privacy Preservation Techniques
Techniques such as Differential Privacy and
Privacy-Preserving GANs have emerged to
address ethical concerns surrounding the use
of sensitive data. These methods ensure that
synthetic data does not compromise the privacy
of individual contributors.
1.4 Evaluation Metrics for Synthetic Data
Commonly employed metrics for evaluating
synthetic data include Fréchet Inception Distance
(FID), Root Mean Square Error (RMSE), and
Kolmogorov-Smirnov (KS) tests. These metrics
provide objective assessments of the statistical
similarity between real and synthetic datasets
(Haizea, 2025).
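
As an illustration of the Kolmogorov-Smirnov (KS) test named among the evaluation metrics, the sketch below computes the two-sample KS statistic with NumPy only. The helper `ks_statistic` and the placeholder arrays are assumptions made for this example (the normal distribution with mean 0.5 and standard deviation 0.2 mirrors the simulated data in the article's listing); it is not the article's evaluation code.

```python
import numpy as np

def ks_statistic(real, synthetic):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    real = np.sort(np.asarray(real, dtype=float).ravel())
    synthetic = np.sort(np.asarray(synthetic, dtype=float).ravel())
    # Evaluate both empirical CDFs at every observed value.
    grid = np.concatenate([real, synthetic])
    cdf_real = np.searchsorted(real, grid, side="right") / real.size
    cdf_synth = np.searchsorted(synthetic, grid, side="right") / synthetic.size
    return float(np.max(np.abs(cdf_real - cdf_synth)))

rng = np.random.default_rng(0)
real = rng.normal(0.5, 0.2, size=(365, 24))      # placeholder for measured profiles
similar = rng.normal(0.5, 0.2, size=(365, 24))   # placeholder for well-matched synthetic profiles
shifted = rng.normal(0.8, 0.2, size=(365, 24))   # placeholder for poorly matched synthetic profiles

ks_similar = ks_statistic(real, similar)
ks_shifted = ks_statistic(real, shifted)
print(f"KS statistic, similar distributions: {ks_similar:.3f}")
print(f"KS statistic, shifted distribution:  {ks_shifted:.3f}")
```

A statistic close to 0 means the two empirical distributions nearly coincide, while values approaching 1 indicate that they barely overlap.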
Comparison of Approaches
This article introduces Generative Adversarial
Networks (GANs) as a promising approach for
generating synthetic energy consumption profiles.
By leveraging Machine Learning technology, GANs
can learn and replicate complex consumption data
patterns while preserving the statistical properties
of real data and safeguarding privacy. Specifically,
this study proposes a GAN model simulated in
Python to replicate energy consumption profiles,
offering new opportunities for optimizing and
ensuring the sustainability of electrical grids.
2. MATERIALS AND METHODS
Model Architecture
Generator: The generator is a neural network
designed to produce synthetic electrical
consumption profiles. It takes a random noise
vector as input, representing a latent feature space.
Through multiple neural layers, the generator
transforms this noise into structured data that
mimics real energy consumption patterns.
Discriminator: The discriminator is another neural
network tasked with assessing the authenticity of
the profiles generated by the generator. It learns
to differentiate between real and synthetic data,
providing feedback to improve both networks
through adversarial training.
Framework and Technique
Framework: The implementation of the model is
conducted using PyTorch, a versatile and efficient
library for deep learning.
Technique: The architecture employs Generative
Adversarial Networks (GANs), where the generator
and discriminator are trained in a competitive
adversarial setup.
Implemented Technologies
PyTorch: Used for implementing, training, and
evaluating neural networks.
GPU (Graphics Processing Unit): Accelerates the
training process through parallel computations.
Optimizers: Adam optimizer is employed to
adjust neural network weights and minimize loss
functions.
Data Visualization: Libraries such as Matplotlib
are utilized to analyze model convergence and
validate data quality.
2.1 Generator Design Framework
The generator is configured to map a latent
noise vector into synthetic energy consumption
profiles. Its architecture comprises dense layers
with LeakyReLU activation functions to capture
non-linear relationships and a final Tanh layer for
output normalization.
2.2 Discriminator Optimization
The discriminator architecture includes dense
layers with Dropout to mitigate overfitting.
The final layer employs a Sigmoid activation
function, facilitating the interpretation of results as
probabilities.
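
To make the role of the output activations concrete, the following NumPy sketch pushes a batch of latent noise vectors through a reduced, untrained stand-in for the two networks. It is an illustration under stated assumptions: the weights are random, there is a single hidden layer of 256 units, and biases, batch normalization and Dropout are omitted; the helper names `leaky_relu` and `dense` are hypothetical. The latent dimension (100) and profile length (24) follow the article's listing.

```python
import numpy as np

rng = np.random.default_rng(42)

def leaky_relu(x, slope=0.2):
    """LeakyReLU: identity for positive inputs, small slope for negative ones."""
    return np.where(x > 0, x, slope * x)

def dense(x, n_out):
    """Dense layer with random, untrained weights (illustration only, no bias)."""
    w = rng.normal(0.0, 0.05, size=(x.shape[1], n_out))
    return x @ w

# Generator stand-in: latent noise -> hidden layer -> 24 hourly values.
z = rng.normal(size=(8, 100))                 # batch of 8 latent vectors
hidden_g = leaky_relu(dense(z, 256))
profiles = np.tanh(dense(hidden_g, 24))       # Tanh bounds every value to [-1, 1]

# Discriminator stand-in: profile -> hidden layer -> single probability.
hidden_d = leaky_relu(dense(profiles, 256))
scores = 1.0 / (1.0 + np.exp(-dense(hidden_d, 1)))   # Sigmoid maps to (0, 1)

print(profiles.shape, float(profiles.min()), float(profiles.max()))
print(scores.shape, float(scores.min()), float(scores.max()))
```

Because the generator output is bounded by Tanh, real profiles need to be scaled to the same [-1, 1] range before they are shown to the discriminator.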
2.3 Loss Function Selection
Both the generator and discriminator are optimized
using the Binary Cross-Entropy loss function.
This choice ensures that the generator learns to
deceive the discriminator while the discriminator
accurately identifies synthetic data.
3. TRAINING PROTOCOL
3.1 Hyperparameter Selection Methodology
Hyperparameters, such as the latent space
dimension (100) and learning rate (0.0002), were
determined via grid search to achieve a balance
between training stability and convergence speed.
3.2 Convergence Criteria
The training process was monitored by evaluating
the loss values of the generator and discriminator.
Convergence was deemed achieved when both
loss metrics stabilized, and the generated profiles
became indistinguishable from real data.
3.3 Hardware Specifications
The model was trained on an NVIDIA RTX
3090 GPU with 24 GB of memory, significantly
reducing training time compared to CPU-based
implementations.
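
The article does not state a numeric rule for deciding that the losses have stabilized. One possible way to automate such a check, shown here only as a sketch, is to compare the mean loss over the two most recent windows of training steps. The function name `losses_stabilized`, the window of 500 steps and the tolerance of 0.05 are all assumptions chosen for illustration.

```python
import numpy as np

def losses_stabilized(losses, window=500, tolerance=0.05):
    """True when the mean loss of the last two windows differs by less than `tolerance`."""
    losses = np.asarray(losses, dtype=float)
    if losses.size < 2 * window:
        return False  # not enough history to compare two windows
    previous = losses[-2 * window:-window].mean()
    latest = losses[-window:].mean()
    return bool(abs(latest - previous) < tolerance)

steps = np.arange(3000)
settling = 0.7 + 1.5 * np.exp(-steps / 300.0)   # made-up loss curve that levels off
drifting = 0.7 + 0.001 * steps                  # made-up loss curve that keeps growing

print(losses_stabilized(settling))   # levels off, so the check passes
print(losses_stabilized(drifting))   # still moving, so the check fails
```

In practice the same check would be applied to both the generator and the discriminator loss histories, alongside a visual or statistical comparison of the generated profiles.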
4. DATA PREPROCESSING
The model was trained and validated using
hourly electricity consumption data from a mid-
size commercial/institutional facility. Due to
condentiality agreements, specic details about
the facility cannot be disclosed. However, the
dataset characteristics are representative of typical
mixed-use electrical installations commonly found
in educational, corporate, or commercial buildings.
Dataset characteristics:
- Installation type: Commercial/institutional
building
- Installed capacity: 500-800 kW
- Data period: 12 consecutive months
- Temporal resolution: Hourly measurements
(8,760 data points)
- Consumption range: 150-650 kWh per
hour
- Load composition: Lighting (30%), HVAC
systems (40%), office equipment (20%),
other loads (10%)
The consumption patterns include:
- Daily cycles with operational hours (7:00-
19:00) showing higher demand
- Reduced consumption during non-
operational hours and weekends
- Seasonal variations related to cooling/
heating requirements
- Typical variability of occupied building
environments
This dataset scale is representative of numerous
facilities worldwide, making the methodology
applicable and reproducible for similar energy
management applications without requiring
national-scale infrastructure data.
4.1 Normalization Techniques
Energy consumption data was normalized using
Min-Max Scaling to ensure all values fell within
the range [-1, 1], enhancing the model’s learning
efficiency.
4.2 Data Quality Measures
Preprocessing steps included cleaning the dataset
by imputing missing values via linear interpolation
and removing extreme outliers using boxplot
analysis.
Implementation Hyperparameters
Latent space dimension: 100
Learning rate: 0.0002
Number of epochs: 10,000
Batch size: 64
These parameters were carefully selected to
optimize the balance between training speed and
model stability.
Training Procedure
The training process employed an adversarial
approach, with the generator creating synthetic
proles that the discriminator aimed to classify as
either real or generated. This iterative competition
improved both models until equilibrium was
reached.
A dataset of real energy consumption profiles,
normalized beforehand, was used to ensure
comparability with the generated profiles. This
preprocessing step was critical for ensuring
consistent results and robust model evaluation.
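
The preprocessing described in Sections 4.1 and 4.2 (outlier removal by boxplot analysis, linear interpolation of missing values, Min-Max scaling to [-1, 1]) can be sketched as follows. This is a minimal illustration, not the article's code: the function `preprocess`, the 1.5 × IQR whisker rule, the choice to treat outliers as missing values before interpolating, and the demo series are all assumptions. The final reshape into 24-hour rows matches the profile length of 24 used in the listing.

```python
import numpy as np

def preprocess(hourly_kwh, profile_len=24):
    """Clean an hourly series and return daily profiles scaled to [-1, 1]."""
    x = np.asarray(hourly_kwh, dtype=float).copy()

    # Boxplot rule (assumed 1.5 * IQR whiskers): mark extreme values as missing.
    q1, q3 = np.nanpercentile(x, [25, 75])
    iqr = q3 - q1
    x[(x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)] = np.nan

    # Linear interpolation over every missing hour.
    idx = np.arange(x.size)
    missing = np.isnan(x)
    x[missing] = np.interp(idx[missing], idx[~missing], x[~missing])

    # Min-Max scaling to [-1, 1], the range of the generator's Tanh output.
    x_min, x_max = x.min(), x.max()
    scaled = 2.0 * (x - x_min) / (x_max - x_min) - 1.0

    # One row per day: 8,760 hourly values become 365 profiles of 24 hours.
    n_days = scaled.size // profile_len
    return scaled[: n_days * profile_len].reshape(n_days, profile_len), (x_min, x_max)

# Made-up hourly series with a daily cycle, one gap and one implausible spike.
rng = np.random.default_rng(0)
hours = np.arange(8760)
demo = 400 + 150 * np.sin(2 * np.pi * (hours % 24 - 7) / 24) + rng.normal(0, 20, hours.size)
demo[100] = np.nan
demo[2000] = 5000.0

profiles, (kwh_min, kwh_max) = preprocess(demo)
print(profiles.shape, float(profiles.min()), float(profiles.max()))
```

Keeping `kwh_min` and `kwh_max` allows synthetic profiles to be mapped back to kWh after generation by inverting the scaling.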
Python Code
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler  # for scaling real data; unused with the simulated data below

# Hyperparameters
LATENT_SPACE_DIM = 100
CONSUMPTION_PROFILE_DIM = 24
LEARNING_RATE = 0.0002
EPOCHS = 10000
BATCH_SIZE = 64


class ElectricityConsumptionGenerator(nn.Module):
    def __init__(self, latent_space_dim=LATENT_SPACE_DIM):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(latent_space_dim, 256),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(256),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.BatchNorm1d(512),
            nn.Linear(512, CONSUMPTION_PROFILE_DIM),
            nn.Tanh()  # Activation to normalize output
        )

    def forward(self, z):
        return self.model(z)


class ElectricityConsumptionDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(CONSUMPTION_PROFILE_DIM, 512),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(512, 256),
            nn.LeakyReLU(0.2),
            nn.Dropout(0.3),
            nn.Linear(256, 1),
            nn.Sigmoid()
        )

    def forward(self, profile):
        return self.model(profile)


class ElectricityConsumptionGAN:
    def __init__(self):
        self.generator = ElectricityConsumptionGenerator()
        self.discriminator = ElectricityConsumptionDiscriminator()
        self.loss_function = nn.BCELoss()
        self.generator_optimizer = optim.Adam(
            self.generator.parameters(),
            lr=LEARNING_RATE,
            betas=(0.5, 0.999)
        )
        self.discriminator_optimizer = optim.Adam(
            self.discriminator.parameters(),
            lr=LEARNING_RATE,
            betas=(0.5, 0.999)
        )

    def generate_real_data(self, size):
        # Simulating real data (modify as needed)
        return torch.FloatTensor(np.random.normal(
            loc=0.5,
            scale=0.2,
            size=(size, CONSUMPTION_PROFILE_DIM)
        ))

    def train(self):
        generator_losses = []
        discriminator_losses = []
        for epoch in range(EPOCHS):
            # Training the Discriminator
            self.discriminator.zero_grad()
            # Real data
            real_data = self.generate_real_data(BATCH_SIZE)
            real_labels = torch.ones(BATCH_SIZE, 1)
            # Generated data
            noise = torch.randn(BATCH_SIZE, LATENT_SPACE_DIM)
            generated_data = self.generator(noise)
            generated_labels = torch.zeros(BATCH_SIZE, 1)
            # Discriminator loss (detach so no gradient reaches the generator)
            real_output = self.discriminator(real_data)
            generated_output = self.discriminator(generated_data.detach())
            discriminator_loss = (
                self.loss_function(real_output, real_labels) +
                self.loss_function(generated_output, generated_labels)
            )
            discriminator_loss.backward()
            self.discriminator_optimizer.step()

            # Training the Generator
            self.generator.zero_grad()
            noise = torch.randn(BATCH_SIZE, LATENT_SPACE_DIM)
            generated_data = self.generator(noise)
            generated_output = self.discriminator(generated_data)
            generator_loss = self.loss_function(
                generated_output,
                torch.ones(BATCH_SIZE, 1)
            )
            generator_loss.backward()
            self.generator_optimizer.step()

            # Record losses
            generator_losses.append(generator_loss.item())
            discriminator_losses.append(discriminator_loss.item())

            # Print progress
            if epoch % 100 == 0:
                print(f"Epoch [{epoch}/{EPOCHS}]")
                print(f"Discriminator Loss: {discriminator_loss.item()}")
                print(f"Generator Loss: {generator_loss.item()}")
        return generator_losses, discriminator_losses

    def generate_profiles(self, num_profiles=10):
        self.generator.eval()  # use BatchNorm running statistics at inference
        with torch.no_grad():
            noise = torch.randn(num_profiles, LATENT_SPACE_DIM)
            generated_profiles = self.generator(noise).numpy()
        return generated_profiles


# Enhanced Visualization
def visualize_results(generated_profiles, generator_losses, discriminator_losses):
    # Distinctive color palette
    colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd']
    # Visualization configuration
    plt.figure(figsize=(16, 10))
    plt.subplot(2, 1, 1)
    # Visualizing Generated Profiles
    for i, profile in enumerate(generated_profiles):
        plt.plot(
            range(len(profile)),
            profile,
            label=f'Synthetic Profile {i+1}',
            color=colors[i % len(colors)],  # cycle colors if there are more than five profiles
            linewidth=2,
            marker='o'
        )
    plt.title('Synthetic Electricity Consumption Profiles', fontsize=16)
    plt.xlabel('Hour of the Day', fontsize=12)
    plt.ylabel('Normalized Consumption', fontsize=12)
    plt.legend(loc='best')
    plt.grid(True, linestyle='--', alpha=0.7)
    # Visualizing Losses
    plt.subplot(2, 1, 2)
    plt.plot(
        generator_losses,
        label='Generator Loss',
        color='#1f77b4',
        linewidth=2
    )
    plt.plot(
        discriminator_losses,
        label='Discriminator Loss',
        color='#ff7f0e',
        linewidth=2
    )
    plt.title('Loss Evolution during Training', fontsize=16)
    plt.xlabel('Training Epochs', fontsize=12)
    plt.ylabel('Loss Value', fontsize=12)
    plt.legend(loc='best')
    plt.grid(True, linestyle='--', alpha=0.7)
    plt.tight_layout()
    plt.show()


# Main Function
def main():
    # Seed for reproducibility
    torch.manual_seed(42)
    np.random.seed(42)
    # Create and train GAN model
    gan_model = ElectricityConsumptionGAN()
    # Train model
    generator_losses, discriminator_losses = gan_model.train()
    # Generate profiles
    generated_profiles = gan_model.generate_profiles(num_profiles=5)
    # Visualize results
    visualize_results(generated_profiles, generator_losses, discriminator_losses)


# Program entry point
if __name__ == "__main__":
    main()
5. RESULTS
In Figure 2, the results of the GAN model training
in Python are presented, specifically executed
in an interactive environment such as IDLE. This
process generated data and metrics about the
model, including parameters such as discriminator
losses, generator losses, and the epoch.
Figure 2. Simulation results in Python’s IDLE.
Source: own elaboration.
1. Generator Loss
The generator loss quantifies the generator’s
effectiveness in deceiving the discriminator. A high
generator loss indicates that the discriminator
can easily identify the generated data as fake.
Conversely, a low loss value suggests that the
generator is producing more realistic data. The
objective is to minimize this loss so the generator
outputs synthetic data indistinguishable from real
data.
2. Discriminator Loss
The discriminator loss measures the discriminator’s
ability to differentiate between real and generated
data. A high discriminator loss indicates difficulty in
distinguishing between the two, whereas a low
loss implies that the discriminator effectively
identifies generated data as fake. Ideally, the
discriminator’s output should stabilize around 0.5,
reflecting that the discriminator performs no better
than random guessing in differentiating real and
generated data; with Binary Cross-Entropy, each
loss term then equals ln 2 (about 0.69).
3. Epoch
An epoch represents one complete pass through
the training dataset, marking the progress of
the training process. In the listing above, each
epoch processes a single batch of 64 profiles.
Increasing the number of epochs allows the model
more opportunities to learn and refine its outputs.
It is essential to monitor the losses throughout the
epochs to ensure convergence and optimal
training results.
4. Discriminator Output for Real and
Generated Data
The outputs from the discriminator are its
predictions on whether the input data is real or
generated:
real_output = self.discriminator(real_data)
generated_output = self.discriminator(generated_data.detach())
- Real Output: Should approach 1,
indicating that the discriminator accurately
identifies real data.
- Generated Output: Should approach
0, showing the discriminator’s ability to
correctly classify generated data as fake.
The objective is to refine these outputs so
the discriminator becomes increasingly
accurate in its predictions.
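
For reference, the relation between the discriminator's outputs and the Binary Cross-Entropy loss values can be checked with a few lines of plain Python. The helper `bce` below is illustrative and not part of the article's listing; it evaluates the loss for a single prediction, whereas the listing averages it over a batch and sums the real and generated terms for the discriminator.

```python
import math

def bce(prediction, label):
    """Binary cross-entropy for one prediction in (0, 1) and a label of 0 or 1."""
    return -(label * math.log(prediction) + (1 - label) * math.log(1 - prediction))

# Discriminator that cannot tell real from generated data: it outputs 0.5 for both.
d_real, d_fake = 0.5, 0.5
discriminator_loss = bce(d_real, 1) + bce(d_fake, 0)   # sum of real and generated terms
generator_loss = bce(d_fake, 1)                        # generator wants its output labeled real

print(round(discriminator_loss, 4))   # 1.3863, i.e. 2 * ln 2
print(round(generator_loss, 4))       # 0.6931, i.e. ln 2

# Confident and correct discriminator: real -> 0.9, generated -> 0.1.
confident_loss = bce(0.9, 1) + bce(0.1, 0)
print(round(confident_loss, 4))       # 0.2107
```

A summed discriminator loss hovering near 1.39 together with a generator loss near 0.69 is therefore what an undecided discriminator looks like in the training log.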
5. Loss Logging
Generator and discriminator losses are recorded
at each epoch to track the model’s learning
progress:
generator_losses.append(generator_loss.item())
discriminator_losses.append(discriminator_loss.item())
These logs enable the visualization of loss trends
during training. By analyzing the evolution of these
losses, it is possible to assess the effectiveness of
the learning process and implement adjustments
if necessary.
The results of the GAN model are presented in
Figure 3, comprising two key elements:
- Visualization of Synthetic Energy
Consumption Profiles: Illustrating the
generator’s capability to produce realistic
consumption patterns.
- Loss Evolution During Training:
Providing insight into the dynamic interaction
between the generator and discriminator as
they improve over successive epochs.
Figure 3. Graph of an Electrical Consumption Profile and Loss Evolution During Training.
Source: own elaboration.
Each line represents a synthetic electrical
consumption profile generated by the model.
Different colors and markers are used to distinguish
between the various profiles. Figure 3 illustrates
how the GAN model has generated consumption
profiles that replicate the patterns observed in
the real data. The variations in consumption
throughout the day can be observed, which may
help identify trends and patterns in electrical usage.
Regarding the loss evolution during training, the
blue line represents the generator loss, and the
orange line represents the discriminator loss.
Both evolve over the course of training, ideally
stabilizing over time, which indicates that the
model is learning to generate synthetic profiles
that are difficult to distinguish from real ones. If
the losses do not converge or exhibit erratic
behavior, it may be necessary to adjust the
model’s hyperparameters or architecture.
5.1 Complementary Visualizations Based on Method Validation
5.1.1 Density Distribution
In Figure 4, both real and synthetic data are
displayed in terms of density distribution. As
expected, the density curves for the real and
synthetic data are very similar, suggesting that
the GAN has successfully captured the univariate
distribution of the real data. A noticeable
discrepancy (e.g., if the synthetic curve is shifted
or broader than the real one) would indicate that
the model has not yet captured the variability of
the data. However, this evaluation is superficial
and should be complemented with quantitative
metrics and multivariate analysis [47].
Figure 4. Density Distribution.
Source: own elaboration.
5.2 PCA: Dimensionality Reduction
Dimensionality reduction via PCA allows
multivariate data to be projected into a two-
dimensional space, aiding in their comparison. In
electrical applications, this is useful not only for
emulating individual values (e.g., consumption at a
specific hour) but also for capturing more complex
patterns (such as the relationship between
consumption at different times of the day).
Figure 5 shows the distribution of the data along
the components obtained by Principal Component
Analysis (PCA). PCA is defined as a dimensionality
reduction technique used to transform a dataset
with many variables (dimensions) into a set with
fewer variables, while retaining as much of the
original information as possible (Zhang & Li, 2023).
In the case of Figure 5, there is no significant
dispersion between the real and synthetic
data points, indicating that the multivariate
characteristics have been satisfactorily replicated.
If a discrepancy had been observed, it would have
required a review of the model architecture or
training process. PCA-based analyses are crucial
in contexts such as consumption across different
locations or times, as well as for the operation and
planning of smart grids.
Figure 5. PCA Representation.
Source: own elaboration.
5.3 Histogram
The histogram in Figure 6 compares the frequency
distributions of the real and synthetic values,
demonstrating a good replication of the univariate
distribution of the real data. This is particularly
relevant in electrical design applications (Li et al.,
2016).
Figure 6. Histogram.
Source: own elaboration.
5.4 Boxplot
Figure 7 presents the boxplot, which encompasses
the median, interquartile ranges, and outliers of
both the real and synthetic data. The boxes and
whiskers for the real and synthetic data should be
similar in length and position.
In electrical grids, the ability to model extreme
values is critical, as these may represent unusual
events such as demand spikes.
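
The graphical checks in Sections 5.2 and 5.4 have simple numerical counterparts. The sketch below, which is an illustration and not the article's validation code, projects real and synthetic profiles onto the first two principal components fitted on the real data (via NumPy's SVD) and compares the quartiles that a boxplot draws. The function names and the placeholder arrays are assumptions.

```python
import numpy as np

def pca_project(real, synthetic, n_components=2):
    """Project both datasets onto principal components fitted on the real data."""
    mean = real.mean(axis=0)
    # Rows of vt are the principal directions of the centered real data.
    _, _, vt = np.linalg.svd(real - mean, full_matrices=False)
    components = vt[:n_components].T
    return (real - mean) @ components, (synthetic - mean) @ components

def box_stats(values):
    """Quartiles shown by a boxplot: Q1, median, Q3."""
    return np.percentile(np.ravel(values), [25, 50, 75])

rng = np.random.default_rng(1)
real = rng.normal(0.5, 0.2, size=(365, 24))        # placeholder for measured daily profiles
synthetic = rng.normal(0.5, 0.2, size=(365, 24))   # placeholder for generator output

real_2d, synth_2d = pca_project(real, synthetic)
print(real_2d.shape, synth_2d.shape)
print("real quartiles:     ", box_stats(real))
print("synthetic quartiles:", box_stats(synthetic))
```

Fitting the components on the real data only keeps both point clouds on the same axes, so any offset or spread difference in the 2-D plot reflects a genuine difference between the datasets.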
Figure 7. Boxplot.
Source: own elaboration.
Comparison of GANs vs. Alternative Models
In this section, GANs are contrasted with other
traditional and advanced approaches:
- TimeGAN: Capable of capturing time
series with high fidelity, but with greater
computational complexity and long training
times.
- Statistical models (ARIMA): Suitable for
linear trends, but limited in their ability to
model non-linear relationships.
- Recurrent networks: Although effective for
temporal patterns, they require extensive
training data to avoid overfitting problems.
Table 2. Comparison of other methods during training.
Source: own elaboration.
Table 2 indicates that GANs achieve better accuracy
(lower RMSE and MAE) and a high generalization
capacity, with a training time longer than ARIMA
but shorter than TimeGAN. TimeGAN captures
time series with high fidelity and also has a high
generalization capacity, but its training time is
longer and its accuracy is lower than that of GANs.
ARIMA is the fastest method in terms of training
time, but it is less accurate and has a low
generalization capacity: it is suitable for linear
trends, but limited in its ability to model non-linear
relationships.
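
The two accuracy metrics named for Table 2, RMSE and MAE, can be computed as shown in this short sketch. How the article pairs real and synthetic values is not specified; comparing mean hourly profiles, as done here, is an assumption, and the six-value arrays are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def rmse(a, b):
    """Root Mean Square Error: penalizes large deviations more strongly."""
    return float(np.sqrt(np.mean((np.asarray(a) - np.asarray(b)) ** 2)))

def mae(a, b):
    """Mean Absolute Error: average size of the deviations."""
    return float(np.mean(np.abs(np.asarray(a) - np.asarray(b))))

real_mean_profile = np.array([0.2, 0.2, 0.6, 0.9, 0.7, 0.3])        # hypothetical values
synthetic_mean_profile = np.array([0.3, 0.2, 0.5, 0.9, 0.9, 0.3])   # hypothetical values

rmse_value = rmse(real_mean_profile, synthetic_mean_profile)
mae_value = mae(real_mean_profile, synthetic_mean_profile)
print(f"RMSE: {rmse_value:.4f}")   # 0.1000
print(f"MAE:  {mae_value:.4f}")    # 0.0667
```

RMSE is never smaller than MAE on the same data, and a large gap between the two points to a few large errors rather than many small ones.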
6. DISCUSSION
The synthetic generation of electrical consumption
profiles using Generative Adversarial Networks
(GANs) represents a significant advancement in
energy planning and management. The findings
of this study highlight the potential of GANs to
address contemporary challenges related to data
privacy and accessibility. GANs’ ability to replicate
intricate patterns, such as daily consumption
variations, underscores their utility not only
for simulations but also as a powerful tool for
generating artificial datasets that complement
real-world data in research and development
applications.
A key aspect worth emphasizing is the quality of
the generated data, which is demonstrated by its
statistical resemblance to real data. This capability
implies that GANs can not only emulate existing
consumption patterns but also be leveraged to train
and validate predictive and analytical algorithms
without jeopardizing sensitive information. This
approach holds substantial potential for industrial
and academic sectors where the accessibility and
use of confidential data are restricted.
However, it is crucial to recognize certain inherent
limitations of the model. While the results are
promising, further validation in more complex
scenarios involving multiple contextual variables
such as temperature, consumer behavior, and
dynamic energy pricing remains necessary.
Moreover, the stability of GANs during training
and the interpretability of their outputs continue
to present challenges that must be resolved to
ensure more robust and reliable implementation.
From a practical standpoint, this methodology
demonstrates flexibility to adapt to diverse
applications, such as smart grid planning and
microgrid modeling. Its independence from
corporate data offers a significant advantage
in regulated and competitive environments,
facilitating progress toward sustainable and
inclusive energy solutions.
7. CONCLUSIONS
This study demonstrates that Generative
Adversarial Networks (GANs) are a powerful
and promising tool for generating synthetic
electrical consumption profiles. The results
reveal that GANs can effectively replicate both
univariate and multivariate patterns in electricity
consumption data, offering a robust solution
for data augmentation, privacy-preserving
simulations, and the development of advanced
energy management algorithms. Validation of the
synthetic data using various graphical techniques
such as density distributions, PCA, histograms,
and boxplots has confirmed a high degree of
similarity to real-world data, reinforcing the
model’s capability to accurately replicate essential
consumption characteristics.
By overcoming the challenges associated
with accessing real consumption data, this
approach contributes to the democratization
of energy analysis, enabling researchers and
organizations to utilize representative datasets
without compromising privacy or security. Future
research directions could explore the integration
of contextual variables, optimization of model
architecture, and validation of the methodology in
real-world energy systems.
As the global shift toward sustainability
accelerates, the generation of synthetic data
using GANs emerges as a catalyst for the design
of resilient and intelligent electrical infrastructures.
This work invites the scientic and technological
community to delve deeper into the potential of
this innovative tool, solidifying its role as a viable
and transformative solution in the global energy
transition.
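The conclusions refer to validating synthetic data with histograms and boxplots, among other graphical techniques. As a rough numerical counterpart, the sketch below computes the five-number summary that a boxplot draws and a simple histogram-overlap score between a real and a synthetic sample. It is an illustration under stated assumptions, not the validation procedure used in this study: the function names are hypothetical, the overlap score is a stand-in for visual comparison, and density and PCA checks are not covered.

```python
import statistics


def five_number_summary(values):
    """Min, Q1, median, Q3, max: the quantities a boxplot displays."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def histogram(values, low, high, bins):
    """Relative frequencies over `bins` equal-width bins on [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        # A value equal to `high` is placed in the last bin.
        idx = min(int((v - low) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def histogram_overlap(real, synthetic, bins=4):
    """Shared histogram mass: 1.0 for identical histograms, 0.0 for disjoint ones."""
    low = min(min(real), min(synthetic))
    high = max(max(real), max(synthetic))
    if high == low:  # all values identical, so the histograms coincide
        return 1.0
    h_real = histogram(real, low, high, bins)
    h_syn = histogram(synthetic, low, high, bins)
    return sum(min(r, s) for r, s in zip(h_real, h_syn))


# Hypothetical samples (illustrative only, not from the study).
print(five_number_summary([1, 2, 3, 4, 5]))         # (1, 2.0, 3.0, 4.0, 5)
print(histogram_overlap([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(histogram_overlap([1, 2], [9, 10]))             # 0.0
```

An overlap close to 1.0 together with similar quartiles suggests that the marginal distributions agree. It says nothing about temporal structure, which would need separate checks.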
9. REFERENCIAS
Ahmed, N. K., Atiya, A. F., El Gayar, N., & El-Shishiny, H. (2020). Comparative analysis of traditional and deep
learning models for time series forecasting. Neurocomputing, 20, 597–613.
Akbari, A., & Lowther, D. A. (2024). CDC-GANs: Bridging innovation and eciency in e-machine design with
advanced generative models. In 2024 International Conference on Electrical Machines (ICEM) (pp. 1–7). https://
doi.org/10.1109/ICEM60801.2024.10700184
Amasyali, K., & El-Gohary, N. M. (2018). A review of data-driven building energy consumption prediction studies.
Renewable and Sustainable Energy Reviews, 81, 1192–1205.
Arjovsky, M., Chintala, S., & Bottou, L. (2017). Wasserstein generative adversarial networks. In International
Conference on Machine Learning (pp. 214–223). PMLR.
Ashari, S., & Setiawan, E. A. (2022). Optimization of advanced metering infrastructure (AMI) customer ecosystem
by using analytic hierarchy process method. In 2022 10th International Conference on Smart Grid (icSmartGrid)
(pp. 240–248). https://doi.org/10.1109/icSmartGrid55722.2022.9848639
Beaulieu-Jones, B. K., Wu, Z. S., Williams, C., Lee, R., Bhavnani, S. P., Byrd, J. B., & Greene, C. S. (2019).
Privacy-preserving generative deep neural networks support clinical data sharing. Circulation: Cardiovascular
Quality and Outcomes, 12(7), e005122. https://doi.org/10.1161/CIRCOUTCOMES.118.005122
Choi, E., Biswal, S., Malin, B., Duke, J., Stewart, W. F., & Sun, J. (2017). Generating multi-label discrete patient
records using Generative Adversarial Networks. arXiv. https://doi.org/10.48550/arxiv.1703.06490
Deb, C., Zhang, F., Yang, J., Lee, S. E., & Shah, K. W. (2017). A review on time series forecasting techniques for
building energy consumption. Renewable and Sustainable Energy Reviews, 74, 902–924.
Dunmore, A., Jang-Jaccard, J., Sabrina, F., & Kwak, J. (2023). A comprehensive survey of generative adversarial
networks (GANs) in cybersecurity intrusion detection. IEEE Access, 11, 76071–76094. https://doi.
org/10.1109/ACCESS.2023.329670719
Dutta, I. K., Ghosh, B., Carlson, A., Totaro, M., & Bayoumi, M. (2020). Generative adversarial networks in security:
A survey. In 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference
(UEMCON) (pp. 399–405). https://doi.org/10.1109/UEMCON51285.2020.9298135
Enhancing security in public spaces through Generative Adversarial Networks (GANs). (2024). In Advances in
Information Security, Privacy, and Ethics Book Series. https://doi.org/10.4018/979-8-3693-35970
Esteban, C., Hyland, S. L., & Rätsch, G. (2017). Real-valued (medical) time series generation with recurrent
conditional GANs. arXiv preprint. https://arxiv.org/abs/1706.02633
Fekri, M. N., Ghosh, A. M., & Grolinger, K. (2020). Generación de datos de energía para aprendizaje automático
con redes generativas adversarias recurrentes. Energies, 13(1), 130. https://doi.org/10.3390/en13010130
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2020).
Generative adversarial networks. Communications of the ACM, 63(11), 139–144. https://doi.org/10.1145/3422622
Haizea. (2025). Datos sintéticos: La clave para proteger la privacidad en la era de la IA. Nymiz -
Data Anonymization & Redaction Software. https://www.nymiz.com/datos-sinteticos-la-clave-
para-proteger-la-privacidad-en- la-era-de-la-ia/
Hart, D. G. (2008). Using AMI to realize the Smart Grid. In 2008 IEEE Power and Energy Society General Meeting
- Conversion and Delivery of Electrical Energy in the 21st Century (Vol. 10).
Hossain, R., Gautam, M., Olowolaju, J., Livani, H., & Benidris, M. (2024). Multi-agent voltage control in distribution
systems using GAN-DRL-based approach. Electric Power Systems Research, 234, 110528. https://doi.
org/10.1016/j.epsr.2024.110528
Sección ganadores “call for papers”
77
Li, A., Feng, M., Li, Y., & Liu, Z. (2016). Application of outlier mining in insider identication based on boxplot
method. Procedia Computer Science, 91, 245–251. https://doi.org/10.1016/j.procs.2016.07.069
Li, C., & Yang, D. (2021). Construction of power grid digital twin model based on GAN. In 2021 China Automation
Congress (CAC) (pp. 7767–7771). https://doi.org/10.1109/CAC53003.2021.9728190
Lim, W., Yong, K. S. C., Lau, B. T., & Tan, C. C. L. (2024). Future of generative adversarial networks (GAN) for
anomaly detection in network security: A review. Computers & Security, 139, 103733.
https://doi.org/10.1016/j.cose.2024.103733
Luo, D., Liu, X., Wang, R., & Li, X. (2023). A review of circuit models for GAN power devices. In 2023 IEEE
6th International Electrical and Energy Conference (CIEEC) (pp. 1461–1467). https://doi.org/10.1109/
CIEEC58067.2023.10166401
National Academies of Sciences, Engineering, and Medicine. (2016). Analytic research foundations for the next-
generation electric grid. The National Academies Press. https://doi.org/10.17226/21919
Nayak, A. A., Venugopala, P. S., & Ashwini, B. (2024). A systematic review on generative adversarial network
(GAN): Challenges and future directions.
Archives of Computational Methods in Engineering. https://doi.org/10.1007/
s11831-024-10119-1
Ortega, C. (2024). Generación de datos sintéticos: Técnicas y consideraciones. QuestionPro. h t t p s : / /
www.questionpro.com/blog/es/generacion-de-datos- sinteticos/
Ortiz-Torres, L. F., Gómez-Luna, E., & Sáenz, E. M. (2024). Estudio del uso y contribución de la inteligencia articial
para la operación en redes eléctricas. Revista UIS Ingenierías, 23(2), 31–46. https://doi.org/10.18273/revuin.
v23n2- 2024003
Park, S., Kim, H., Moon, H., Heo, J., & Yoon, S. (2010). Concurrent simulation platform for energy-aware smart
metering systems. IEEE Transactions on Consumer Electronics, 56(3), 1918–1926.
Pillai, G. G., Putrus, G. A., & Pearsall, N. M. (2014). Generation of synthetic benchmark electrical load proles
using publicly available load and weather data. International Journal of Electrical Power & Energy Systems, 61,
1–10.
Sehovac, L., Nesen, C., & Grolinger, K. (2019, July). Forecasting building energy consumption with deep learning:
A sequence to sequence approach. In 2019 IEEE International Congress on Internet of Things (ICIOT) (pp. 108–
116). IEEE.
Sharma, P., Kumar, M., Sharma, H. K., & Biju, S. M. (2024). Generative adversarial networks (GANs): Introduction,
taxonomy, variants, limitations, and applications. Multimedia Tools and Applications. https://doi.
org/10.1007/s11042-024-18767-y
Shi, A. (2021). Cyber attacks detection based on Generative Adversarial Networks. https://doi.org/10.1109/
ACCC54619.2021.00025
Tian, Y., Sehovac, L., & Grolinger, K. (2019). Similarity-based chained transfer learning for energy forecasting with
big data. IEEE Access, 7, 139895–139908.
Xie, M., Zou, H., Zhang, S., & Zhu, Q. (2021). Evaluating generative adversarial networks for creating electricity
consumption data. Applied Energy, 281, 115998.
Xu, L., & Veeramachaneni, K. (2018). Synthesizing tabular data using Generative Adversarial Networks.
arXiv. https://doi.org/10.48550/arxiv.1811.11264
Yadav, H., Vasa, J., & Patel, R. (2023). GAN (Generative Adversarial Network)- based image super-resolution: A
technical perspective. In Lecture Notes in
Conectando mentes, energizando el futuro
78
Networks and Systems (pp. 283–293). https://doi.org/10.1007/978-981-99-3761- 5_27
Yilmaz, B. (2023). A scenario framework for electricity grid using Generative Adversarial Networks. Sustainable
Energy Grids and Networks, 36, 101157. https://doi.org/10.1016/j.segan.2023.101157
Yilmaz, B., & Korn, R. (2022). Synthetic demand data generation for individual electricity consumers: Generative
Adversarial Networks (GANs). Energy and AI, 9, 100161.
Yoon, J., Jarrett, D., & Van der Schaar, M. (2019). Time-series generative adversarial networks. In Advances in
Neural Information Processing Systems.
Zhang, Y., & Li, Q. (2023). A comparative study of synthetic data generation using machine learning models.
Journal of Articial Intelligence Research, 79, 225–245.
Zhao, Q., Sun, B., Zhao, W., Watanabe, T., Usui, T., & Takeda, H. (2024). Improved GAN-based deep learning
approach for strain eld prediction and failure analysis of precast bridge slab joints. Engineering Structures, 321,
119023. https://doi.org/10.1016/j.engstruct.2024.119023
Zheng, S., Zhang, Y., Zhou, S., Ni, Q., & Zuo, J. (2022). Comprehensive energy consumption assessment based
on industry energy consumption structure. Part I: Analysis of energy consumption in key industries. In 2022
IEEE 5th International Electrical and Energy Conference (CIEEC) (pp. 4942–4949). https://doi.org/10.1109/
cieec54735.2022.9845929
Wang, R., Wang, G., Gu, L., Liu, Q., Liu, Y., & Guo, Y. (2025). Intuitively interpreting GANs latent space using
semantic distribution. Knowledge-Based Systems, 112894. https://doi.org/10.1016/j.knosys.2024.112894
Sección ganadores “call for papers”