Solar energy time series analysis via Markov chains

Marianne Bechara Elabras da Motta Veiga, Gabriel Kelab Sigaud, Fernando Luiz Cyrino Oliveira, Gustavo de Andrade Melo

1.- mbemveiga@gmail.com

2.- gabrielsigaud@gmail.com

3.- Industrial Engineering Department, Pontifical Catholic University of Rio de Janeiro, cyrino@puc-rio.br, ORCID: 0000-0003-1870-9440

4.- Industrial Engineering Department, Pontifical Catholic University of Rio de Janeiro, gustavo.melo.rio@gmail.com

Received: 18/11/2024; Accepted: 04/2/2025

Abstract

Brazil, given a global scenario of concern about climate change, has been increasing its use of renewable energy, especially solar energy, in recent years. With the growth in its share, the characteristics of solar energy, such as intermittency and random fluctuations, have been affecting the operation planning of the Brazilian Electricity System (BES). Such factors can be studied with time series modeling, helping the planning of power plants and of the BES. To contribute to the analysis of these factors, the objective of this research is to analyze the characteristics of photovoltaic energy generation in the meteorological seasons of the year in two regions of Brazil with different solar incidences. For this, a methodology based on Markov Chain concepts is applied to two stationary time series. The work stands out for the subdivision of the time series by climatic season, for the use of data not yet studied, and for the detailed presentation of the methodology and results. The objective of the research was achieved, revealing differences in the solar energy generation models across the meteorological seasons and between the two regions studied.

KEYWORDS: Renewable Energy Sources, Variable Energy Sources, Solar Energy, Climatic Seasons, Markov Chains, K-means

1. INTRODUCTION

Faced with a scenario of concern about climate

change, countries are carrying out the energy

transition, thus moving away from using fossil

energy sources and increasing the use of

renewable sources (Malar, 2022). According

to the International Renewable Energy Agency

(2023), the planet had an increase in renewable

energy capacity in 2022 of 13% compared

to the previous year. Renewable energies are

considered inexhaustible, as they can always be

renewed by nature, and generate considerably

lower environmental impacts than non-renewable

energies (EPE, 2022).

Brazil has been following this transformation

in the world’s energy matrix. According to the

2023 National Energy Balance, 47.4% of Brazil’s

domestic energy supply in 2022 came from

renewable sources. In 2013, this percentage was

40.6%, that is, in 9 years, there was an increase of

approximately 17% (EPE, 2023).

In this context, solar energy is a source that

deserves to be highlighted. In 2022, it accounted

for 3.6% of the domestic energy supply in Brazil.

In addition, between 2021 and 2022, it had an

82.4% growth in installed capacity, being the

fastest growing in the country (EPE, 2023).

Solar energy is generated from solar radiation,

captured by photovoltaic panels. In addition to

being renewable, it has the advantages of being

silent, requiring little maintenance and being able

to be installed in a short time (Imhoff, 2007). With

the increase in its use in Brazil, its characteristics,

such as intermittency and random uctuations,

will increasingly aect the country’s energy

generation. Considering this scenario, the use of

time series modeling and simulation methods to

study this impact is important for the planning of

the plants and the BES.

In order to contribute to this theme, the objective

of this work is to analyze the characteristics

of photovoltaic energy generation in dierent

climatic seasons (summer, autumn, winter and

spring) in two regions of Brazil with dierent solar

incidences. For this, the time series discretization

approach was used for Markov Chain modeling, a

methodology already widely used in the literature

for the analysis of electric energy time series.

Furthermore, the subdivision by climatic season

diers from other studies because it is based on

a natural phenomenon, as opposed to monthly

subdivisions, which are more frequently used, for

example.

It is worth noting that this study makes relevant

contributions to the literature. First, to

the authors’ knowledge, data that have not yet

been studied are used. Also, these data are from

two plants located in regions with considerably

different characteristics and were divided by the

climatic seasons of the year, which allowed both

geographical and temporal comparisons.

The analysis presented in the study was carried

out through two daily photovoltaic energy

generation databases from ONS (National Electric

System Operator): Nova Olinda Complex, located

in Piauí (PI) and founded in 2017 (G1, 2017); and

Guaimbê Complex, located in the state of São

Paulo (SP) and inaugurated in 2019 (G1, 2019).

According to Gadelha de Lima (2020), the state of

Piauí has dierent meteorological characteristics

depending on the quarter of the year, which could

justify a division into four seasons.

Figure 1 shows the location of the two plants on

the Brazilian solarimetric map. This map is an

adaptation of the one presented in the Brazilian

Atlas of Solar Energy (Pereira et al., 2017) and

shows the annual average of the total daily normal

direct irradiation over Brazil. It is possible to

perceive the dierence in the averages of direct

irradiation between the two locations of the plants,

which is greater in the Nova Olinda Complex

(Ribeira do Piauí – PI) in relation to the Guaimbê

Complex (Guaimbê – SP).

Figure 1 - Brazilian Solarimetric Map - Average annual normal direct irradiation.
Source: Adapted from Pereira et al. (2017).

The applied methodology is exploratory and can be divided into three main phases. The first relates to data pre-processing, including data collection, analysis, and treatment. In the second phase, data processing is performed, involving modeling via Markov Chains and obtaining results such as stationary distribution, recurrence time, and first passage time. In the last phase, data post-processing, the results obtained were analyzed for comparison between the climatic seasons and between the plants.

2. THEORETICAL FRAMEWORK

In the literature, there are several renewable energy modeling studies that apply the concept of Markov Chains in their methodologies. Sigauke and Chikobvu (2017) performed an analysis of daily peaks of electricity demand through Markov Chains, seeking to find the stationary distribution (distribution of states in which the chain will stabilize). To do this, the authors used demand data from South Africa from 2000 to 2011. Two-state models, representing positive or negative variations between consecutive days, and three-state models, which additionally distinguish small from large positive variations, were considered.

Maçaira et al. (2019), faced with a scenario of increased wind energy use in Brazil, showed that the dispatch model used in the period of their research did not consider the stochastic behavior of this energy source. The model, which sought to optimize long-term energy planning, only evaluated the future aspects of hydro and thermal sources. In view of this, the work proposed the wind-hydrothermal dispatch model, which incorporated wind power generation using the MCMC (Markov Chain Monte Carlo) method to simulate energy scenarios.

Ma et al. (2020) proposed a methodology for aggregating solar photovoltaic time series data through clustering via k-means, Markov Chains, and Monte Carlo simulation. For the authors, Markovian processes efficiently represent the transitions of photovoltaic power generation time series. In the proposed k-means-MCMC methodology, the power generation data are first grouped following the optimal number of clusters, and then the transition matrix is assembled. Finally, from this matrix, energy scenarios are generated via simulation.

Melo (2022) sought to show the spatial and temporal complementarity between variable renewable energies through the joint stochastic modeling and simulation of solar and wind energy. To this end, the author used two methodologies and performed three applications, using databases of plants located in the Northeast of Brazil. Both methodologies use Markov Chain modeling, Monte Carlo simulation to obtain scenarios, and the k-means technique to perform data clustering.

3. METHODOLOGY

A methodology based on Markov Chains was applied to model the time series of photovoltaic solar power generation. Figure 2 shows the flowchart with the main stages of the methodology, divided into the data pre-processing, processing, and post-processing phases.

Figure 2 - Main steps of the methodology.

3.1. Pre-processing

The pre-processing phase consists of obtaining, analyzing, and treating data. The time series of daily photovoltaic energy generation of the Nova Olinda (Piauí) and Guaimbê (São Paulo) complexes were obtained from the National Electric System Operator (ONS, 2022) for a period of four years, from 06/21/2018 to 06/20/2022, with a total of 1,461 observations for each complex. The only two variables used were date and energy generation. According to Ma et al. (2020), due to the characteristics of photovoltaic power generation data, the optimal time scale to fragment scenarios would be daily. The methodology is applied first to the Nova Olinda Complex and then to the Guaimbê Complex, so the two series are modeled separately.

A preliminary analysis of the energy generation data for the period was performed. First, to test the stationarity of the time series over the four years, Augmented Dickey-Fuller (ADF) unit root tests were carried out. The null hypothesis of the ADF test is that the time series has a unit root and is therefore not stationary (Dickey and Fuller, 1979). The stationarity test is essential for the application of the Markov Chain concepts, because a non-stationary series depends on time, and in Markovian processes, the probabilities of transition to the next state depend only on the current state (Norris, 1998). Furthermore, non-stationarity would mean a change in the installed capacity of the plants.

To complete the pre-processing phase, the databases are treated so that the time series can be modeled as Markov Chains. First, the null or missing values were replaced by the average of the month in the corresponding year, as it is an adequate estimate for the value of generation in the period, given seasonality. Then, so that the time series could be analyzed by climatic season, they were subdivided into four subsets: Summer, Autumn, Winter, and Spring.

3.2. Processing

3.2.1 Series discretization via k-means

In order to group the observations with greater similarities, the subsets of the solar energy generation time series, divided by climatic season, were discretized into Markovian states independently. The clustering method used was k-means (MacQueen, 1967), as it is easy to program and computationally inexpensive. In the k-means method, a number k of clusters is pre-specified, and k initial centroids (average value of clusters) are defined at random. Then, the following steps are performed:

1. Observations are assigned to the cluster of the nearest centroid by calculating the distance from each observation to each centroid;
2. New k centroids are calculated from the average of intra-cluster observations;
3. Steps 1 and 2 are iterated until the centroid values no longer change.

The method can be summarized by the objective function (1).
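The objective function itself does not appear in this text. A standard statement of the k-means objective is given here as a reference point only; the notation is an assumption rather than the authors' own ($x_i$ for an observation, $C_j$ for the set of observations in cluster $j$, and $\mu_j$ for the centroid of that cluster):

```latex
\min_{C_1,\dots,C_k} \; J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \lVert x_i - \mu_j \rVert^{2},
\qquad \mu_j = \frac{1}{|C_j|} \sum_{x_i \in C_j} x_i
```

For a one-dimensional series such as daily generation, the norm reduces to the absolute difference between an observation and its centroid.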

However, to apply the k-means method, it is

necessary to pre-dene the number k of clusters.

According to Fritz et al. (2020), choosing the

wrong values for k can lead to poor results,

and to choose the ideal number of clusters, it is

common to use the elbow method, first discussed

by Thorndike (1953). As the number of clusters

increases, the sum of the squared error of the

distance between the observations and the

centroids tends to decrease (Thorndike, 1953).

Hence, the elbow method helps to limit the choice

of very high values for k, in which there are no

relevant benets with the addition of a new cluster.

The elbow method can be used in conjunction with

the k-means method to nd the optimal number of

clusters (Fritz et al., 2020).

To apply the elbow method using k-means, it is

first necessary to perform the k-means steps for

each k-value up to a chosen maximum number.

Then, the sum of the intra-cluster squared error,

or Within-Cluster-Sum of Squared Errors (WSS),

is calculated for each clustering obtained by the

k-means result. The WSS consists of the sum of

the square of the Euclidean distances from each

observation to the centroid of the cluster to which

it belongs.

Consequently, a graph can be created that presents the WSS for each value of k. It is then possible to observe the point k at which the curve presents a “fold”, like an elbow, beyond which the reduction in WSS from k to k+1 clusters no longer provides substantial gains to the clustering.
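The k-means steps and the WSS curve described above can be sketched for a one-dimensional series as follows. This is an illustration in Python, not the authors' R code: the function names, the seeded random initialization from distinct observations, and the final relabelling of states in ascending order of centroid are all assumptions of this sketch.

```python
import random


def kmeans_1d(values, k, seed=0, max_iter=100):
    """Lloyd-style k-means for one-dimensional data; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Initial centroids: k distinct observations drawn at random (an assumption).
    centroids = rng.sample(sorted(set(values)), k)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Step 1: assign every observation to its nearest centroid.
        labels = [min(range(k), key=lambda j: abs(x - centroids[j])) for x in values]
        # Step 2: recompute each centroid as the mean of its cluster.
        new_centroids = []
        for j in range(k):
            members = [x for x, lab in zip(values, labels) if lab == j]
            new_centroids.append(sum(members) / len(members) if members else centroids[j])
        # Step 3: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    # Relabel so that state 0 has the lowest centroid and state k-1 the highest.
    order = sorted(range(k), key=lambda j: centroids[j])
    rank = {old: new for new, old in enumerate(order)}
    return [centroids[j] for j in order], [rank[lab] for lab in labels]


def wss(values, centroids, labels):
    """Within-cluster sum of squared errors for a fitted clustering."""
    return sum((x - centroids[lab]) ** 2 for x, lab in zip(values, labels))


def wss_curve(values, max_k, seed=0):
    """WSS for k = 1..max_k, the input needed to draw the elbow plot."""
    curve = []
    for k in range(1, max_k + 1):
        centroids, labels = kmeans_1d(values, k, seed=seed)
        curve.append(wss(values, centroids, labels))
    return curve


if __name__ == "__main__":
    toy_generation = [100, 110, 120, 500, 510, 900, 910, 920]  # made-up values
    print(wss_curve(toy_generation, 4))
```

Because the initialization is random, a single run can stop at a local optimum; running several seeds and keeping the lowest WSS is a common safeguard, although the text does not say whether this was done.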

3.2.2 Creating State Transition Matrices

The next step is to create the daily state transition matrices, P. Transition matrices are composed of the transition probabilities $p_{i,j}$ from a state i to a state j between periods n and n+1 (Chung, 1960). The transition probabilities and transition matrices are represented by (2) and (3), respectively.

In this step, based on Melo (2022) and Ma et al. (2020), the transition probabilities are calculated as the ratio between the number of occurrences of transitions from state i to state j and the total number of transitions out of state i, as represented by (4).
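The counting estimator described above, $\hat{p}_{i,j} = n_{i,j} / \sum_{l} n_{i,l}$ with $n_{i,j}$ the number of observed transitions from state i to state j, can be sketched as follows. The function name and the zero-based state coding are assumptions made for illustration.

```python
def transition_matrix(states, n_states):
    """Estimate P by counting: p[i][j] = n_ij / (transitions out of i).

    `states` is the sequence of daily states, coded 0..n_states-1.
    """
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state with no observed outgoing transition gets a row of zeros
        # (an assumption of this sketch; the text does not discuss that case).
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


if __name__ == "__main__":
    daily_states = [0, 1, 1, 2, 1, 0, 0, 1]  # made-up sequence
    for row in transition_matrix(daily_states, 3):
        print(row)
```

The text does not state how the boundary between non-contiguous blocks of the same season (for example, the last winter day of one year and the first winter day of the next) is handled; this sketch simply treats its input as one contiguous sequence.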

3.2.3 Obtaining the results

To analyze the properties of the transition matrices, three measures of interest were calculated: (i) stationary distribution (π), which represents the distribution of states in which the chain will stabilize, satisfying equations (5) and (6); (ii) recurrence time ($m_{ii}$), the expected number of periods for a system in state i to return to that state, as in equation (7); and (iii) first passage time ($m_{ij}$), the expected number of periods for a system in state i to reach state j for the first time, as in equation (8) (Chung, 1960).

These measures help in analyzing the behavior of the Markov Chain model once the process stabilizes. With the stationary distribution, it is possible to identify the most frequent states of the system, where the process is most likely to be in the future. The recurrence time allows us to understand, for example, the average time to return to a state of maximum or minimum energy generation, while the first passage time indicates the average transition time between these two states.

3.3. Post-processing

Finally, in the post-processing phase, the results obtained in the previous phase were analyzed and evaluated, with the objective of characterizing the generation of the two plants in the four climatic seasons and in regions of Brazil with different solar incidences. In this phase, the main purposes were: to identify the most frequent states of each season; to compare the recurrence times of the most extreme power generation states; and to compare the first passage times between the states of highest and lowest power generation of each climatic season.
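The three measures of Section 3.2.3 can be computed from a transition matrix with a few linear solves. The sketch below assumes the standard relations for an irreducible finite chain, namely $\pi = \pi P$ with $\sum_i \pi_i = 1$, $m_{ii} = 1/\pi_i$, and $m_{ij} = 1 + \sum_{k \neq j} p_{ik} m_{kj}$; these are textbook forms and may differ in notation from the authors' equations (5) to (8). All function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def stationary_distribution(p):
    """Solve pi = pi P together with sum(pi) = 1."""
    n = len(p)
    # Rows 0..n-2 encode (P^T - I) pi = 0; the last row encodes sum(pi) = 1.
    a = [[p[j][i] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    a[-1] = [1.0] * n
    return solve(a, [0.0] * (n - 1) + [1.0])


def first_passage_times(p):
    """m[i][j]: expected steps from i until j is first reached.

    The diagonal m[j][j] is the recurrence time of state j.
    """
    n = len(p)
    m = [[0.0] * n for _ in range(n)]
    for j in range(n):
        others = [i for i in range(n) if i != j]
        # For i != j: m_ij - sum over k != j of p_ik * m_kj = 1.
        a = [[(1.0 if i == k else 0.0) - p[i][k] for k in others] for i in others]
        for i, value in zip(others, solve(a, [1.0] * len(others))):
            m[i][j] = value
        m[j][j] = 1.0 + sum(p[j][k] * m[k][j] for k in others)
    return m


if __name__ == "__main__":
    toy = [[0.9, 0.1], [0.5, 0.5]]  # made-up two-state chain
    print(stationary_distribution(toy))
    print(first_passage_times(toy))
```

As a consistency check of $m_{ii} = 1/\pi_i$ against a figure reported later in the text, a stationary probability of 2.27% corresponds to 1/0.0227 ≈ 44 periods, which matches the 44-day recurrence time given for the lowest state of the Guaimbê summer.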

4. DISCUSSION AND PRESENTATION OF RESULTS

In this chapter, the results of the methodology’s application are presented for the two plants individually, starting with the Nova Olinda Complex (PI) and, later, addressing the Guaimbê Complex (SP). Finally, the results of the two plants are compared. All the computational steps in this chapter were performed in the R® programming language (R Development Core Team, 2009).

4.1. Nova Olinda Complex (Piauí)

4.1.1 Pre-processing

4.1.1.1 Collection, analysis and treatment of data

When testing the stationarity of the time series of the Nova Olinda Complex in the analyzed period, the result obtained was a p-value lower than 0.01, i.e., the null hypothesis that the time series has a unit root is rejected. Thus, it is concluded that the time series is stationary and, therefore, the installed capacity is constant, which is fundamental for the Markov Chain modeling performed in this work. The stationarity of the time series in the period can be seen in Figure 3, which represents the average daily generation per month. In addition, the series presents considerable volatility and annual seasonality, with higher energy generation in the months of July, August, and September and lower generation in the months of December, January, February, and March, while the other months assume intermediate energy generation values. Greater similarities can be noticed among the data of months in the same climatic season. This observation suggests modeling the time series by subdividing it into four subsets, one for each climatic season, for a better representation of the data in each period.

Figure 3 - Average daily generation - Nova Olinda Complex.
Source: Based on data from ONS (2022).
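The data treatment described in Section 3.1 (filling null or missing values with the mean of the same month and year, then splitting the series by climatic season) can be sketched as follows. The exact season boundary dates are not given in the text; the fixed dates below are an assumption chosen so that the series start, 21 June, falls on the first day of the southern-hemisphere winter. Treating zeros as "null" values is likewise an assumption, and all function names are hypothetical.

```python
from datetime import date, timedelta


def season_of(day):
    """Southern-hemisphere season for a date, using fixed boundary dates (assumed)."""
    md = (day.month, day.day)
    if (6, 21) <= md <= (9, 22):
        return "Winter"
    if (9, 23) <= md <= (12, 20):
        return "Spring"
    if (3, 21) <= md <= (6, 20):
        return "Autumn"
    return "Summer"  # 21 December to 20 March


def fill_with_monthly_mean(series):
    """Replace None or zero values by the mean of the valid values of that month and year."""
    sums = {}
    for day, value in series:
        if value:  # skips None and 0
            key = (day.year, day.month)
            total, count = sums.get(key, (0.0, 0))
            sums[key] = (total + value, count + 1)
    filled = []
    for day, value in series:
        key = (day.year, day.month)
        # A month with no valid value at all is left unchanged.
        if not value and key in sums:
            total, count = sums[key]
            value = total / count
        filled.append((day, value))
    return filled


def split_by_season(series):
    """Group (date, value) pairs into the four seasonal subsets."""
    subsets = {"Summer": [], "Autumn": [], "Winter": [], "Spring": []}
    for day, value in series:
        subsets[season_of(day)].append((day, value))
    return subsets


if __name__ == "__main__":
    start = date(2018, 6, 21)
    days = [start + timedelta(days=i) for i in range(1461)]
    series = [(d, 1000.0) for d in days]  # placeholder values, not ONS data
    for name, subset in split_by_season(fill_with_monthly_mean(series)).items():
        print(name, len(subset))
```

The 1,461 daily observations reported in the text correspond exactly to the span from 21 June 2018 to 20 June 2022, which includes the leap day of 2020.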

Table 1 shows that winter has the highest daily average of energy generation in the Nova Olinda Complex in Piauí, with 1,572.87 MWh/day. Meanwhile, summer has a daily average 32% lower than that of winter, with 1,066.39 MWh/day, probably due to a higher number of cloudy days in this period of the year, which reduces the average daily solar radiation in the region of the plant. Furthermore, winter has the lowest standard deviation, while spring, the season with the second-highest average energy generation, has the highest standard deviation and, therefore, the greatest dispersion of data.

Table 1: Measures of daily energy generation - Nova Olinda Complex.

4.1.2 Processing

4.1.2.1 Discretization of the series via k-means

The discretization of the photovoltaic time series was performed individually for each climatic season, so that the number of clusters and the values for the centroids were better suited specifically to each of the subsets.

The first step in the execution was to create a function that would calculate the k-means for values of k from 1 to 20. The maximum number of 20 clusters was chosen because it was verified that this is a sufficient amount to represent the data. The second step was to create a function that returned the WSS for each of the 20 values of k. The third step was to apply the previously created functions to each of the subsets created. The fourth step was the application of the elbow method. With the results of the WSS calculation, a list was created that contained the ratio between the WSS for k+1 clusters and the WSS for k clusters, for k=1 to k=19. Then, for each of the subsets, the value of k at which the calculated ratio was greater than 0.90 was identified, i.e., the number of clusters beyond which including a new cluster reduces the sum of the intra-cluster squared error by less than 10%, which would not justify the addition.

Thus, the k-means result for each of the subsets gave the ideal number of clusters (Table 2) and centroid values (Table 3).

Table 2: Ideal number of clusters - Nova Olinda Complex.

Table 3: Centroids of the states - Nova Olinda Complex.
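The ratio rule used to read the elbow can be sketched as below. Two points are assumptions of this sketch rather than statements from the text: the ratio is taken as WSS(k+1)/WSS(k), the only direction in which "greater than 0.90" corresponds to a reduction of less than 10%, and the smallest k that satisfies the rule is returned. The function name and the sample WSS values are made up.

```python
def choose_k(wss_by_k, threshold=0.90):
    """Pick the number of clusters from a WSS curve.

    wss_by_k[0] is the WSS for k=1, wss_by_k[1] for k=2, and so on.
    Returns the smallest k with WSS(k+1) / WSS(k) > threshold, or the
    largest k tested if no k satisfies the rule.
    """
    for index in range(len(wss_by_k) - 1):
        current, following = wss_by_k[index], wss_by_k[index + 1]
        # A WSS of zero cannot be improved, so stop there as well.
        if current == 0 or following / current > threshold:
            return index + 1
    return len(wss_by_k)


if __name__ == "__main__":
    sample_wss = [1000.0, 400.0, 200.0, 150.0, 140.0, 135.0]  # made-up curve
    print(choose_k(sample_wss))
```

A fixed threshold makes the choice reproducible across the eight seasonal subsets, at the cost of being sensitive to local plateaus in the WSS curve.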

4.1.2.2 Creating State Transition Matrices

In this step, the transition matrices of the Nova Olinda Complex (Figure 4) were constructed from the transition frequencies between the states for each subset.

Figure 4 - Transition matrices - Nova Olinda Complex.

4.1.2.3 Obtaining the results

All the transition matrices created were classified as irreducible and ergodic, important properties for the Markov Chain to have a stationary distribution. Then, stationary distributions (Table 4), recurrence times (Table 5), and first passage times (Figure 5) were calculated.

Table 4: Stationary distribution - Nova Olinda Complex.

Table 5: Recurrence time - Nova Olinda Complex.

Figure 5 - First passage time - Nova Olinda Complex.
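How the matrices were classified is not described in the text. One way to check the two properties for a finite chain is sketched below, taking "ergodic" to mean irreducible and aperiodic, which is one common convention and an assumption here. Irreducibility is tested by reachability; ergodicity is tested as primitivity of the zero pattern of P, using the known bound that a primitive n-by-n matrix has a strictly positive power no later than the exponent (n-1)^2 + 1. Function names are hypothetical.

```python
def is_irreducible(p):
    """True if every state can be reached from every other state."""
    n = len(p)
    for start in range(n):
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if p[i][j] > 0 and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if len(seen) < n:
            return False
    return True


def is_ergodic(p):
    """True if some power of the 0/1 pattern of P has no zero entry."""
    n = len(p)
    pattern = [[p[i][j] > 0 for j in range(n)] for i in range(n)]
    power = pattern
    # Check powers 1 .. (n-1)^2 + 1 of the pattern.
    for _ in range((n - 1) ** 2 + 1):
        if all(all(row) for row in power):
            return True
        power = [
            [any(power[i][k] and pattern[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)
        ]
    return False


if __name__ == "__main__":
    toy = [[0.9, 0.1], [0.5, 0.5]]  # made-up chain
    print(is_irreducible(toy), is_ergodic(toy))
```

A chain that alternates deterministically between two states is irreducible but periodic, so it passes the first check and fails the second.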


4.1.3 Post-processing

4.1.3.1 Evaluation and interpretation of results

After obtaining the model’s results, it becomes

possible to analyze and interpret the generated

values and better understand the behavior of the

daily photovoltaic power generation time series,

mainly from the stationary distributions, recurrence

times and rst passage times.

From the stationary distribution, in Table 4, it

is observed that the system presents higher

probabilities for the states of intermediate

generation values in the summer and spring

seasons. Also, the probabilities decay little by

little and in a similar way for the lower and higher

extreme states. Another way to analyze it is

by the time of recurrence of the states in Table

5. In both seasons, the recurrence times of the

extreme states are signicantly higher compared

to the central states and very close to each other.

Analyzing the extreme states, in the summer,

states 1 (289 MWh) and 11 (1,819 MWh) have a

recurrence of 25 and 29 days, respectively, while

states 1 (309 MWh) and 8 (1,969 MWh) in spring

have a recurrence of 18 and 19 days, respectively.

Consequently, the tendency is for the system

to remain in medium-generation states and the

extremes to be rarer, with lower expectations

of low or high generation in a day, especially in

the summer, whose recurrence times of extreme

states are even longer.

Autumn, on the other hand, has a higher probability

of being in the central and upper states, with lower

probabilities in states of lower energy generation,

comparatively. In this season, the two states with

the highest power generation, states 11 (1,582

MWh) and 12 (1,709 MWh), have recurrence

times of 9 and 13 days, respectively. Meanwhile,

the recurrence times of the two lowest-generation

states, states 1 (307 MWh) and 2 (624 MWh),

are 37 and 22 days, respectively. In addition, the

recurrence time of state 1 of autumn is the longest

among all states of all seasons, i.e., autumn

presents the longest average period for the system

to return to low levels of power generation. Hence,

it appears that the system has a tendency towards

higher states with a lower risk of low generation.

Furthermore, by investigating the rst passage

times of summer and spring, it is possible to

analyze that the time to leave the state of lowest

energy generation and reach the state of highest

generation for the rst time is longer than the

reverse. For example, the rst passage time from

state 1 (309 MWh) to state 8 (1,969 MWh) in the

spring is 51 days, while the time from state 8 to

state 1 is 30 days, approximately 40% shorter.

Therefore, although the probabilities of the system

being at each extreme are close, once the system

is in a low-generation state, it will take longer

to reach the higher-generation states in both

seasons.

Meanwhile, in winter, the first passage times between the two most extreme states, lower and upper, are close: 32 days from state 1 (875 MWh) to state 8 (1,945 MWh) and 30 days from state 8 to state 1. Their recurrence times, however, are quite different (20 days for state 1 and 11 days for state 8). It is interesting to note that the first passage times between states with more distant generation levels may be shorter than between states with closer generation levels. For example, the first passage time from state 2 (1,209 MWh) to state 3 (1,407 MWh) is 11 days, while the time from state 2 to state 6 (1,720 MWh) is 8 days.

4.2. Guaimbê Complex (São Paulo)

4.2.1 Pre-processing

4.2.1.1 Collection, analysis and treatment of data

By testing the stationarity of the time series of the Guaimbê Complex in the analyzed period, it was concluded that the time series is stationary and, therefore, the installed capacity is constant. The stationarity of the time series in the period can be seen in Figure 6, which represents the average daily generation per month. In addition, the series has considerably lower volatility than that of the Nova Olinda Complex, with low variations in energy generation over the months.

Table 6 shows that spring has the highest daily average of energy generation in the Guaimbê Complex, with 752.92 MWh/day. Meanwhile, summer has a daily average 5% lower than that of spring, with 717.33 MWh/day, the lowest average for the plant. Consequently, the low variability of energy generation between the seasons of the year is evident, with all values considerably close to the general average. The standard deviations of the seasons are also close to one another.

Table 6: Measurements of daily energy generation - Guaimbê Complex.

4.2.2 Processing

4.2.2.1 Discretization of the series via k-means

The discretization of the photovoltaic energy time series for the Guaimbê Complex was performed with the same method as for the Nova Olinda Complex, but in a totally independent way, using the k-means technique and the elbow method. The ideal number of clusters and centroid values are shown in Tables 7 and 8, respectively.

Table 7: Ideal number of clusters - Guaimbê Complex.

Table 8: Centroids of the states - Guaimbê Complex.

4.2.2.2 Creating State Transition Matrices

Thus, the state transition matrices of the Guaimbê Complex were created, represented in Figure 7.

Figure 7 - Transition matrices - Guaimbê Complex.

4.2.2.3 Obtaining the results

Then, stationary distributions (Table 9), recurrence times (Table 10), and first passage times (Figure 8) were calculated.

Table 9: Stationary distribution - Guaimbê Complex.

Table 10: Recurrence time - Guaimbê Complex.

Figure 8 - First passage time - Guaimbê Complex.

4.2.3 Post-processing

4.2.3.1 Evaluation and interpretation of results

With the measures of interest obtained for the Guaimbê Complex, the next step is to analyze the model’s characteristics for each of the seasons.

In summer, the stationary probability of the system being in the state of lowest power generation is the lowest (2.27%), resulting in a recurrence time of 44 days; that is, the occurrence of a state of low generation is extremely rare, and, once the system is in this state, many transitions are expected before it returns to it. Furthermore, state 2 (386 MWh) has the second-lowest stationary probability at 7.47%, followed by state 8 (1,018 MWh) at 10.11%. The states with the highest probabilities are the higher-generation states, and the state with the highest stationary probability is state 7 (909 MWh), with 19.86%.

In autumn, the stationary probabilities are highest in the upper central states of power generation, gradually decreasing toward the lower and upper extreme states. The two states with the longest recurrence times are the extremes, states 1 (244 MWh) and 9 (992 MWh), with 15 and 23 days, respectively. Winter in the Guaimbê Complex, on the other hand, has higher stationary probabilities for the upper states and very low probabilities for the four states with lower energy generation. An interesting case is that the recurrence time of state 2 (324 MWh) is 38 days, approximately 40% longer than that of state 1 (182 MWh), at 27 days. In this way, the risk of the system being in low-generation states is lower, and higher energy generation is expected, comparatively.

Spring has more balanced stationary probabilities among its eight states, with the exception of state 1 (269 MWh), which has the lowest power generation and a probability of only 5%.

Analyzing the first passage times, it can be seen that, in the summer of the Guaimbê Complex, the time to go from the state of highest generation to the state of lowest generation is more than double the reverse. The first passage time from state 8 (1,018 MWh) to state 1 (234 MWh) is 47 days, while from state 1 to state 8 it is 20 days. This characteristic is favorable to generation because the average time to reach low generation from high generation is long.

However, autumn and winter have first passage times with the reverse logic: it takes longer to move from a state of lower generation to one of greater generation. This analysis is important because, in low-generation situations, the expected time to return to high generation is longer. In the autumn, the first passage time from state 1 (244 MWh) to state 9 (992 MWh) is 38 days, and the reverse is 23 days. Meanwhile, in winter, the time from state 1 (182 MWh) to state 11 (983 MWh) is 62 days, and the reverse is 30 days.

4.3 Comparison of results

Analyzing the time series of the Nova Olinda and Guaimbê complexes, the differences in the variability of the average photovoltaic energy generation throughout the year are evident: Nova Olinda presents seasonality, with higher average generation in winter and lower in summer, which is the wet period, while the averages of Guaimbê are closer across all climatic seasons. In the case of Nova Olinda, the reason for subdividing the series by climatic season for the modeling is more evident. However, although the Guaimbê Complex presents more homogeneous monthly averages, its stationary distributions, recurrence times, and first passage times were significantly different in each season, as previously analyzed. Thus, the subdivision by climatic season proved to be relevant for both plants.

Another interesting fact is that the climatic seasons affect each region differently, with similarities between different seasons in the two regions. For example, the highest concentration of stationary probabilities in upper central states occurs in summer and spring in the Nova Olinda Complex, but it also happens in the autumn in the Guaimbê Complex. On the other hand, the autumn of Nova Olinda is similar to the winter of Guaimbê because the lower-generation states have significantly lower stationary probabilities than the others, with higher probabilities in the higher states. Meanwhile, Nova Olinda’s winter and Guaimbê’s spring are the seasons with the most balanced stationary probabilities between the states.

In addition, analyzing the first passage times, other similarities were found. The cases in which the first passage time from the state with the highest generation to the lowest was longer than the inverse were the autumn in Nova Olinda and the summer in Guaimbê. The opposite happened in the summer and spring in Nova Olinda and in the autumn and winter in Guaimbê. On the other hand, the winter of Nova Olinda and the spring of Guaimbê had the closest first passage times when comparing the most extreme states.

Finally, the BES can use this analysis to assist in the country’s energy planning by calculating the probability of possible scenarios of low or high photovoltaic generation by region and climatic season. The detailed study of the characteristics of renewable sources brings greater security to meeting the country’s energy demand.
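The scenario probabilities mentioned above can be illustrated with a short Python sketch, assuming a seasonal transition matrix is already available. It computes the stationary distribution π (solving π = πP with Σπ = 1), sums it over a chosen group of states to obtain the long-run probability of a low-generation scenario, and derives the mean recurrence times as 1/π. The function names, the choice of which states count as "low", and the matrix values are hypothetical and do not come from the study's data.

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi = pi P with sum(pi) = 1, assuming an irreducible chain."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the normalization row.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

def scenario_probability(pi, states):
    """Long-run probability of being in any of the given states."""
    return float(np.sum(pi[list(states)]))

# Hypothetical 3-state seasonal transition matrix (illustrative values only).
P_season = np.array([
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
])
pi_season = stationary_distribution(P_season)
low_generation = scenario_probability(pi_season, [0])   # state 0 assumed "low"
recurrence_days = 1.0 / pi_season                       # mean recurrence times
print(pi_season.round(3), round(low_generation, 3), recurrence_days.round(1))
```

Repeating the calculation for each region and season gives directly comparable scenario probabilities.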

5. CONCLUSION

Brazil has been going through a process of changing its energy matrix and increasing the use of non-dispatchable renewable energies. In this context, photovoltaic solar energy has stood out due to the significant growth of its share in the country. Hence, its characteristics of intermittency and random fluctuations have a greater impact on the national energy supply scenario. Therefore, the study of photovoltaic generation through modeling methods is relevant, and the present work found an opportunity to contribute to the literature.

This work studies the generation characteristics of two photovoltaic solar power plants located in regions with solar incidences of different magnitudes and seasonalities. The methodology used was based on Markov Chains. The time series were subdivided among the climatic seasons of the year. Then, the state transition matrices were created, and the results of the measures of interest, such as stationary distribution, recurrence time, and first passage time, were investigated. Consequently, it was possible to analyze the differences between the photovoltaic energy generation in the different seasons and regions. In this way, the objective of the work was achieved.
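The step of building a transition matrix per season can be sketched as follows, assuming the daily generation values have already been mapped to integer state labels (0 to K-1) and grouped into contiguous segments for one season, so that no transition is counted across the gap between different years. The function name, the segment layout, and the sample labels are hypothetical; the estimator shown is the usual row-normalized count of observed transitions.

```python
import numpy as np

def estimate_transition_matrix(segments, n_states):
    """Row-normalized transition counts, pooled over contiguous segments."""
    counts = np.zeros((n_states, n_states))
    for seg in segments:
        # Count transitions only between consecutive days of the same segment.
        for current, nxt in zip(seg[:-1], seg[1:]):
            counts[current, nxt] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows with no observed departures are left as zeros rather than guessed.
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Hypothetical state labels for two summers (illustrative values only).
summer_segments = [
    [0, 1, 1, 2, 2, 1],
    [2, 2, 1, 0, 0, 1],
]
P_summer = estimate_transition_matrix(summer_segments, n_states=3)
print(P_summer.round(2))
```

One such matrix per season and per plant is then the input for the stationary distribution, recurrence time, and first passage time calculations.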

Confirming the initial hypothesis, the results showed significant differences in solar energy generation between the regions and between the climatic seasons, which evidenced the relevance of the comparative study carried out. By analyzing and better understanding the specificities of each location and season, power plants and the Brazilian Electric System can plan energy generation more efficiently, analyzing the probabilities of the occurrence of states with different generation values.
