Feasibility of training and inferring advanced large language models (LLMs) in data centers in Mexico and Brazil

By: Ing. Tatiana Sandoval

This project was conducted as part of the “Careers with Impact” program during the 14-week mentoring phase. You can find more information about the program in this post.
1. Contextualization of the Problem
The accelerated advance of Artificial Intelligence (AI), especially in large-scale language models (LLMs) such as DeepSeek, has intensified the demand for computational and energy resources. This growing technological dependence poses significant challenges for Latin America, a region that has historically depended on foreign infrastructures and developments in the digital realm (EL PAÍS, 2025). This dependence limits technological sovereignty and the ability to compete on equal terms in the global AI market.
To counter this situation, it is essential that Latin American countries strengthen the governance of computation and promote the democratization of AI. Initiatives such as the Inter-American Framework for Data Governance and Artificial Intelligence (MIGDIA) of the OAS seek to guide member states in the development of policies that promote ethical and responsible management of AI, adapted to regional realities and needs (OAS, 2023). In addition, the “Declaration of Santiago to promote ethical Artificial Intelligence in Latin America and the Caribbean” reflects the region’s commitment to establish a common voice in AI governance.
However, developing and training AI models from scratch requires substantial investments in infrastructure, such as the construction of specialized data centers. These facilities not only represent a considerable financial challenge, but also have a significant environmental impact due to their high energy consumption and associated carbon footprint. For example, projects in other regions have faced criticism for their demand on natural resources and potential contribution to climate change (HuffPost, 2024).
Given this context, a more viable strategy for Latin American countries could be the adoption and specialization of AI models already trained, such as DeepSeek, in specific areas such as health. This approach would allow reducing costs and minimizing environmental impact, while developing local AI capabilities. Moreover, by focusing on concrete applications relevant to the region, innovation and competitiveness in the global AI market could be fostered.
In summary, for Latin America to move towards greater independence and prominence in the field of AI, it is crucial to invest in sustainable infrastructure, develop appropriate governance frameworks and consider technology adoption strategies that balance costs, benefits and environmental sustainability.
2. Research Question
Is it feasible for data centers in Mexico and Brazil to operate as specialized infrastructures for artificial intelligence, meeting the energy and technical requirements needed to train models such as DeepSeek and run inference efficiently?
3. Objectives
3.1. General
Evaluate the technical and energy feasibility of operating data centers in Mexico and Brazil as specialized infrastructures for artificial intelligence. This implies analyzing whether such centers meet the necessary requirements to train AI models and execute inference in an efficient manner.
3.2. Specific
Identify and select viable data centers for AI training and inference.
Estimate the energy consumption associated with training and inference of models such as DeepSeek.
3.3. Personal
Develop energy data analysis and scenario simulation skills.
Strengthen competencies in the evaluation of digital infrastructure and sustainability.
Advance professional training within the field of sustainable and emerging technologies.
4. Methodology
The methodology applied in this study was developed in four main phases, aimed at evaluating the energy and operational feasibility of training and running inference with large-scale language models (LLMs) in data centers located in Mexico and Brazil. Initially, a detailed collection and analysis of the data center infrastructure in both countries was performed, using the Data Centers Map platform as the main source. The providers with the greatest presence and data availability (KIO Networks in Mexico and SCALA Data Centers in Brazil) were identified in order to select specific centers representative of both high and low operational capacity. From these, daily and annual energy consumption was estimated. In parallel, technical data on the training and inference of LLMs such as GPT-3, GPT-4 and DeepSeek-V3 were collected, highlighting their relevance in terms of efficiency, technological innovation and hardware variation. Subsequently, the approximate energy consumption for both training and inference of these models was estimated, considering parameters such as PUE (Power Usage Effectiveness) and the associated energy infrastructure. Finally, the feasibility of running these models in the selected centers was evaluated, adjusting the calculations to local conditions in Mexico and Brazil. All the information, calculations and decisions were documented and systematized to ensure the replicability of the study, and are presented in the following sections along with their respective sources and annexes.
4.1. Phase 1: Collection and analysis of data from Brazilian and Mexican data centers
The first stage of the study focused on the identification, collection and analysis of information on operational data centers in Mexico and Brazil, using the Data Centers Map platform as the main source. The information reflects the data available at the date of consultation: a total of 54 centers in 13 markets in Mexico, and 162 centers distributed across 30 markets in Brazil. Data such as the name of the center, provider and market were collected and systematized in a spreadsheet to facilitate subsequent analysis. From this, graphs were prepared to visualize the concentration of centers by provider and market (see Annex A). Based on these analyses, a representative provider was selected for each country: KIO Networks in Mexico and SCALA Data Centers in Brazil, due to their strong presence and the availability of key data for the study. Subsequently, the daily and annual energy consumption of each provider's centers was estimated (see Annex B), which allowed two centers per country to be selected for detailed analysis: one with higher capacity and one with lower capacity. For Mexico, the Querétaro (12 MW of IT) and Mérida (0.06 MW of IT) centers were selected; for Brazil, the Tamboré (24 MW of IT) and São Paulo (4 MW of IT) centers. This selection was crucial to evaluate, in later phases, the feasibility of training and running inference with LLMs in these environments. The sources from which data were extracted and the calculations performed are detailed below.
4.1.1. Sources used
4.1.2. Calculations performed
The estimation of the energy consumption of the data centers of KIO Networks (Mexico) and SCALA Data Centers (Brazil) was based on the application of internationally recognized equations in the field of data center energy efficiency.
Base Equations
Power Usage Effectiveness (PUE): This indicator, developed by The Green Grid and formalized as ISO/IEC 30134-2:2016, measures the energy efficiency of a data center by the ratio between the total energy consumed and the energy used by IT equipment. A PUE close to 1 indicates higher energy efficiency.
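Formally, the indicator is defined as:

$$\mathrm{PUE} = \frac{E_{\text{total facility}}}{E_{\text{IT equipment}}}$$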
Energy Calculation: The energy consumed was estimated using the formula:
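The standard form of this relation, consistent with the description below, is:

$$E \;[\mathrm{kWh}] = P \;[\mathrm{kW}] \times t \;[\mathrm{h}]$$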
This formula allows the calculation of energy consumption as a function of operating power and operating time.
IT Capacity Estimation
For KIO Networks, the IT Capacity was estimated from the following formula, considering the design power per square meter and the total area of the computer room:
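A reconstruction consistent with this description, where $d$ is the design power density and $A$ the total computer-room area, is:

$$P_{\text{IT}} \;[\mathrm{kW}] = d \;[\mathrm{kW/m^2}] \times A \;[\mathrm{m^2}]$$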
This estimate is essential to determine the power required by the IT equipment based on the available space.
Total Power Estimation
The total power of the data center includes the consumption of IT equipment, plus cooling, lighting and support systems:
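Using the PUE definition above:

$$P_{\text{total}} = P_{\text{IT}} \times \mathrm{PUE}$$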
Energy Consumption Calculation
From the estimated total power, the daily and annual energy consumption was determined using:
Daily energy consumption
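In standard form:

$$E_{\text{daily}} \;[\mathrm{MWh}] = P_{\text{total}} \;[\mathrm{MW}] \times 24\,\mathrm{h}$$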
Annual energy consumption
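Likewise:

$$E_{\text{annual}} = E_{\text{daily}} \times 365$$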
Household Equivalent
In order to measure the impact of data center energy consumption, its equivalence was calculated in terms of the average annual consumption of a household in Mexico and Brazil:
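A reconstruction consistent with this description, where $E_{\text{household}}$ denotes the average annual consumption of a household in the corresponding country:

$$N_{\text{households}} = \frac{E_{\text{annual, center}}}{E_{\text{household}}}$$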
All values estimated and used in these calculations are detailed in Annex B.
4.2. Phase 2: Collection and Estimation of Energy Consumption for LLM Training
In this phase, the energy consumption associated with LLM training was estimated, taking as reference the cases of GPT-3, GPT-4 and DeepSeek-V3. These models were selected for their technical relevance and their contrast in computational and energy efficiency. For example, DeepSeek-V3 has been highlighted for reaching performance levels comparable to GPT-4 with a fraction of the time and hardware required, while GPT-3 and GPT-4 allow observing the evolution of the resources required within the same development framework (OpenAI). The objective of this stage was to generate a comparative basis of the energy consumption of advanced AI models, which would later serve to contrast their operational feasibility in data centers located in Mexico and Brazil. The sources used to collect the data, the calculations performed, and a glossary of key terms to facilitate understanding are detailed below.
4.2.1. Glossary of relevant terms
PUE (Power Usage Effectiveness): Measure of energy efficiency in data centers. The closer to 1, the higher the efficiency.
GPU (Graphics Processing Unit): Fundamental hardware for AI training.
TDP (Thermal Design Power): Maximum power that a GPU can consume under load.
FLOPS (Floating Point Operations per Second): Computational capacity metric.
TFLOPS: one trillion (10¹²) floating-point operations per second.
Parameters (of an LLM): Adjustable values that the model learns to make predictions.
Data Center: Facility that houses servers and network equipment to store and process data.
4.2.2. Sources used
DeepSeek-V3
GPT-3:
GPT-4:
4.2.3. Calculations performed
To estimate the energy consumption of each model, the following formulas and procedures were applied:
DeepSeek-V3
In the case of DeepSeek-V3, the direct figure provided in its Technical Report was used, so no intermediate performance estimates were required.
GPT-3 and GPT-4
Calculation of total GPU performance
We start from the theoretical performance per GPU (expressed in TFLOPS), adjusted to a realistic operating percentage: for GPT-3, 20% of the theoretical performance of the NVIDIA V100 GPU, and for GPT-4, 32% of that of the NVIDIA A100 GPU, according to the Epoch AI database.
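In symbols, with $N_{\text{GPU}}$ the number of GPUs, $R_{\text{peak}}$ the theoretical per-GPU throughput and $u$ the utilization fraction (0.20 or 0.32 above):

$$R_{\text{total}} = N_{\text{GPU}} \times R_{\text{peak}} \times u$$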
Estimation of total training time
The training time was then derived from the total number of operations required and the aggregate throughput of the GPUs.
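With $C$ the total number of floating-point operations required by the training run:

$$t_{\text{training}} = \frac{C}{R_{\text{total}}}$$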
Calculation of energy consumption
Finally, the total energy consumption is determined by the following formula:
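A reconstruction consistent with this description, where $P_{\text{GPU}}$ is the per-GPU power draw and $t$ the training time in hours:

$$E_{\text{total}} = N_{\text{GPU}} \times P_{\text{GPU}} \times t \times \mathrm{PUE}$$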
And the daily energy consumption follows:
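With $t_{\text{days}}$ the training duration in days:

$$E_{\text{daily}} = \frac{E_{\text{total}}}{t_{\text{days}}}$$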
IT Power Calculation
The IT power required for training the LLMs was also estimated for later use in the feasibility assessment phase. For this purpose, the PUE formula was applied as follows:
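Rearranging the PUE relation gives a form consistent with the Table 1 values (e.g., for GPT-3: $88.70 / (24 \times 1.12) \approx 3.30$ MW):

$$P_{\text{IT}} = \frac{E_{\text{daily}}}{24\,\mathrm{h} \times \mathrm{PUE}}$$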
The data collected and values estimated in this phase are documented in Annex C.
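To make the chain of Phase 2 formulas concrete, the following Python sketch reproduces the order of magnitude of the GPT-3 row of Table 1. The peak throughput (125 TFLOPS), per-GPU power (330 W) and total compute (3.14 × 10²³ FLOPs) are illustrative assumptions chosen for consistency with the table, not values taken from the original sources.

```python
# Minimal sketch of the Phase 2 training-energy estimate.
# peak_tflops, utilization, tdp_w and total_flops below are illustrative
# assumptions for a GPT-3-like run, not figures from the original sources.

def training_energy(total_flops, n_gpus, peak_tflops, utilization, tdp_w, pue):
    """Chain the Phase 2 formulas: effective throughput -> time -> energy -> IT power."""
    throughput = n_gpus * peak_tflops * 1e12 * utilization  # effective cluster FLOP/s
    hours = total_flops / throughput / 3600                 # training time in hours
    days = hours / 24
    total_mwh = n_gpus * tdp_w * hours * pue / 1e6          # total facility energy (MWh)
    daily_mwh = total_mwh / days                            # average daily energy (MWh)
    it_power_mw = daily_mwh / (24 * pue)                    # IT power via the PUE relation (MW)
    return days, total_mwh, daily_mwh, it_power_mw

days, total_mwh, daily_mwh, it_mw = training_energy(
    total_flops=3.14e23,   # assumed GPT-3-scale training compute
    n_gpus=10_000,         # V100 count from Table 1
    peak_tflops=125,       # assumed V100 tensor-core peak
    utilization=0.20,      # 20% realistic utilization (Phase 2)
    tdp_w=330,             # assumed per-GPU power draw
    pue=1.12,              # PUE from Table 1
)
print(f"{days:.1f} days, {total_mwh:.0f} MWh, {daily_mwh:.1f} MWh/day, {it_mw:.2f} MW IT")
# -> roughly 14.5 days, ~1290 MWh, ~88.7 MWh/day, ~3.30 MW (cf. Table 1)
```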
4.3. Phase 3: Collecting and Estimating Energy Consumption in LLMs Inference
At this stage, the energy consumption associated with each query made to the LLMs GPT-4 and DeepSeek-V3 was estimated. GPT-3 was excluded because its monolithic architecture differs from the current Mixture of Experts (MoE) models and does not provide comparative value in terms of energy efficiency in inference.
To perform these calculations, the Epoch AI report “How much energy does ChatGPT use?” was used as a basis; it estimates that a typical 500-token query to GPT-4 consumes approximately 0.3 Wh. This value was verified by replicating step by step the methodology detailed in that report, which allowed us to confirm the validity of the data and understand in depth how it was obtained, including variables such as the number of tokens, the data center efficiency (PUE) and the characteristics of the hardware used. Once the methodology and result for GPT-4 were confirmed, the same calculation approach was applied to DeepSeek-V3, making the necessary adjustments for its own characteristics, such as the type of GPU used, the estimated PUE of the data center, the active parameters within its MoE architecture, and the actual percentages of power usage and FLOPs, according to the mentioned report. To ensure comparability, the same load volume was assumed: average queries of 500 tokens and a total of 10 million daily queries, as mentioned by Mykyta Fomenko in “50+ Eye-Opening ChatGPT Statistics: Tracing the Roots of Generative AI to Its Global Dominance”.
Based on these variables, the following energy indicators were calculated for both models: daily energy consumption (MWh), annual energy consumption (MWh) and IT power required (MW); this last value will be key in the feasibility assessment phase. All the data used and calculated values are organized in Annex D. The calculations, a glossary of key terms to ease understanding, and the sources used are detailed below.
4.3.1. Glossary of relevant terms
Inference: The process by which an LLM model generates answers based on text input.
Tokens: Minimum units of text processed by the model; 500 tokens are equivalent to approximately 350-400 words.
PUE (Power Usage Effectiveness): Index that measures the energy efficiency of a data center. A value close to 1 indicates higher efficiency.
FLOP/s (Floating Point Operations per Second): Unit that measures the processing capacity of a system.
MoE (Mixture of Experts): Type of architecture that activates only one part of the model for each inference, improving efficiency.
GPU (Graphics Processing Unit): Specialized processing unit used for AI model training and inference.
4.3.2. Sources used
GPT-4:
DeepSeek-V3
4.3.3. Calculations performed
Consumption per query (GPT-4)
We replicated the methodology detailed by Epoch AI in the report “How much energy does ChatGPT use?”, obtaining an estimated value of 0.3 Wh per typical 500-token query. The full development of this calculation can be found in the sources used for GPT-4, cited in the previous section.
Consumption per query (DeepSeek-V3)
The same formula applied for GPT-4 was used, adjusting the parameters to the specific characteristics of DeepSeek-V3 (see the reconstruction after this list):
Active parameters: 37 billion
GPU used: NVIDIA H800
Throughput (FLOP/s): 9.89 × 10¹⁴
Power per GPU: 1275 W
PUE (Power Usage Effectiveness): 1.3
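A reconstruction of the per-query calculation that reproduces the Annex D values for both models is given below; the utilization factors $u_{\text{FLOPs}} \approx 0.10$ (fraction of peak throughput achieved) and $u_{P} \approx 0.70$ (fraction of GPU power actually drawn) are inferred from the annex values rather than stated explicitly in the sources:

$$C_{\text{query}} = 2 \, N_{\text{active}} \, n_{\text{tokens}}, \qquad t_{\text{GPU}} = \frac{C_{\text{query}}}{R_{\text{peak}} \times u_{\text{FLOPs}}}, \qquad E_{\text{query}} = \frac{t_{\text{GPU}} \times P_{\text{GPU}} \times u_{P} \times \mathrm{PUE}}{3600} \;[\mathrm{Wh}]$$

For DeepSeek-V3 this yields $2 \times 37{\times}10^{9} \times 500 = 3.7{\times}10^{13}$ FLOPs per query, a GPU time of about $0.37$ s, and $0.37 \times 1275 \times 0.7 \times 1.3 / 3600 \approx 0.12$ Wh, matching the table.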
The technical data and sources used for these values are detailed in the previous section. This approach allowed us to obtain an adjusted estimate of the energy consumption per query for DeepSeek-V3.
Total Energy Consumption
Based on the energy consumption per query, the total daily and annual consumption of each model was estimated, assuming a volume of 10 million daily queries, in accordance with the previously mentioned literature.
Daily Energy Consumption:
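With $Q$ the number of daily queries ($10^7$ here), and converting Wh to MWh:

$$E_{\text{daily}} \;[\mathrm{MWh}] = \frac{E_{\text{query}} \;[\mathrm{Wh}] \times Q}{10^6}$$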
Annual Energy Consumption:
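And annualized:

$$E_{\text{annual}} = E_{\text{daily}} \times 365$$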
Calculation of IT Power Required
The IT power required was estimated from the daily energy consumption, using the formula based on PUE; this value will be used in the final phase of the study to evaluate the feasibility of operating these models in the selected data centers in Mexico and Brazil:
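A reconstruction consistent with Annex D (e.g., for GPT-4: $3.01 / (24 \times 1.2) \approx 0.10$ MW):

$$P_{\text{IT}} = \frac{E_{\text{daily}}}{24\,\mathrm{h} \times \mathrm{PUE}}$$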
With all the estimated data and collected values, we advanced to the last phase of the study, which evaluates the technical and energy feasibility of data centers in Mexico and Brazil to train and run inference with advanced language models such as GPT-4 and DeepSeek-V3.
4.4. Phase 4: Evaluation of the Feasibility of LLM Training and Inference
In this phase, the technical and energy feasibility of training and inferring LLMs in the selected data centers was evaluated:
Mexico: KIO Networks (Querétaro and Mérida)
Brazil: SCALA Data Centers (Tamboré and São Paulo)
4.4.1. Feasibility Assessment for Training
The methodology consisted of comparing two key aspects for training:
IT capacity available vs. IT power required:
If the IT capacity of the center is greater than that required by the model for its training, it is considered feasible in terms of available power.
Daily energy consumption of the center vs. daily energy consumption of the training:
If the center's daily energy consumption can cover that required to train the model, it is also considered energetically feasible.
Those centers that complied with both aspects were considered viable.
Calculation of Adjusted Training Time
It was estimated how long it would take to train the GPT-3, GPT-4 and DeepSeek-V3 models in each center, considering three IT capacity usage scenarios (100%, 50% and 30%). The formula used was:
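A form that reproduces the values in Table 4, where $f$ is the fraction of IT capacity assigned and $E_{\text{daily, center}}$ the center's total daily energy consumption:

$$t_{\text{adjusted}} \;[\text{days}] = \frac{E_{\text{total, training}}}{E_{\text{daily, center}} \times f}$$

For example, GPT-3 in Querétaro at 100%: $1{,}312.86 / (432 \times 1.0) \approx 3.0$ days.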
This made it possible to visualize the impact of the level of operation on the time required to complete a full training in each case.
4.4.2. Feasibility Assessment for Inference
To assess the feasibility of GPT-4 and DeepSeek-V3 inference, the percentage of IT capacity utilization was calculated based on the following formula:
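Consistent with the percentages reported in Table 6:

$$U = \frac{P_{\text{IT, required}}}{P_{\text{IT, available}}} \times 100\%$$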
For these calculations, the original values of energy consumption per query (obtained in the previous phase) were adjusted to the specific PUE of each data center, since this value can vary considerably with respect to that of the centers where the models are originally operated.
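The adjustment that reproduces the Table 6 values rescales the baseline consumption by the ratio of PUEs; for example, GPT-4 in Mérida: $0.3 \times 2.0 / 1.2 = 0.5$ Wh:

$$E_{\text{query, adjusted}} = E_{\text{query, base}} \times \frac{\mathrm{PUE}_{\text{center}}}{\mathrm{PUE}_{\text{original}}}$$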
Subsequently, with the adjusted values, the following were estimated: daily and annual energy consumption of the inference and the IT power required adjusted to the PUE of the center.
Feasibility Interpretation
ISO/IEC 30134-2 was used as a reference, which recommends the following ranges for interpreting the use of IT capacity in inference tasks:
High feasibility: capacity utilization < 5%.
Moderate feasibility: capacity utilization between 5% and 10%.
Not feasible: capacity utilization > 10%.
This made it possible to objectively categorize each data center according to its ability to host and execute LLM inference tasks.
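A minimal sketch of this categorization logic, assuming the thresholds above; the function name is illustrative and the example values come from Table 6:

```python
def inference_feasibility(it_required_mw: float, it_available_mw: float) -> str:
    """Classify a center using the IT-capacity utilization ranges adopted in this study."""
    utilization = it_required_mw / it_available_mw * 100  # percent of IT capacity in use
    if utilization < 5:
        return "HIGH"
    if utilization <= 10:
        return "MODERATE"
    return "NOT FEASIBLE"

# GPT-4 inference at Queretaro: 0.16 MW required vs. 12 MW available (~1.3%).
print(inference_feasibility(0.16, 12))    # -> HIGH
# DeepSeek-V3 at Merida: 0.08 MW required vs. 0.06 MW available (~128%).
print(inference_feasibility(0.08, 0.06))  # -> NOT FEASIBLE
```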
All estimated values, comparisons made and adjusted times calculated for training and inference at the selected data centers are documented in Annex E.
5. Results and Discussion
From the methodology described above, Table 1 summarizes the critical resources for training massive models. It highlights that DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite quadrupling its scale. In contrast, GPT-4 (>500B) records consumption 14.6 times higher than GPT-3's (19,152.60 MWh), evidencing the energy nonlinearity of ultra-massive models. These results frame the central debate: how to balance capacity and sustainability?
| MODEL | TOTAL PARAMETERS | TIME (DAYS) | GPU QUANTITY | PUE | TOTAL ENERGY CONSUMPTION (MWh) | DAILY ENERGY CONSUMPTION (MWh) | IT REQUIRED (MW) |
|---|---|---|---|---|---|---|---|
| GPT-3 | 175 billion | 14.80 | 10,000 V100 | 1.12 | 1,312.86 | 88.70 | 3.30 |
| DeepSeek-V3 | 671 billion | 56.72 | 2,048 H800 | 1.3 | 2,537.08 | 44.73 | 1.43 |
| GPT-4 | >500 billion | 95.00 | 25,000 A100 | 1.12 | 19,152.60 | 201.60 | 7.50 |
Table 1. Comparison of Energy Consumption and Computational Resources for Training Large Language Models (LLMs). Model data in light blue columns (parameters, training time, type and number of GPUs, and PUE) were collected from reliable sources, including technical documentation and specialized publications, duly cited in the methodology section. The values in dark blue columns (total energy consumption, daily energy consumption and required IT power) were calculated (see Annex C), considering factors such as the efficiency of the GPUs, the PUE of the data center and the duration of the training.
The analysis of the three selected models is justified by their representativeness in the evolution of LLMs: GPT-3 and GPT-4 (OpenAI) represent traditional scalability, are supported by Microsoft infrastructure and are widely adopted as industrial benchmarks (OpenAI, 2023), while DeepSeek-V3 was included for its innovative approach to efficiency, managing to train a 671B-parameter model with only 2,048 GPUs, 80% fewer than LLaMA-2 (70B) in standard configurations (DeepSeek, 2024). The key findings are discussed below.
Training LLMs faces a central dilemma: increasing model capacity implies non-linear growth in energy consumption. For example, DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite quadrupling its complexity. This efficiency is achieved by H800 GPUs, optimized for mixed-precision operations (FP16/FP32), and parallelism strategies that reduce the required IT power to 1.43 MW, 56% less than GPT-3 (Chen, 2023). In contrast, GPT-4 (>500B) requires 19,152.60 MWh, 14.6 times more than GPT-3, due to its dependence on 25,000 A100 GPUs and an extended training time (95 days). This disparity reflects the physical limits of scalability: the larger the model, the more GPU synchronization and memory management exponentially increase the energy cost per parameter (Luccioni, 2022).
PUE (Power Usage Effectiveness) emerges as a critical factor in sustainability. While GPT-3 and GPT-4 operate in centers with PUE=1.12 (only 12% additional power for cooling), DeepSeek-V3 uses centers with PUE=1.3, where 30% of power goes to non-computational systems. However, its low daily consumption (44.73 MWh) compensates for this disadvantage, demonstrating that per-GPU efficiency can mitigate inefficiencies in the infrastructure (Masanet, 2020). On the other hand, GPT-4, even with a low PUE, generates an environmental footprint equivalent to the annual consumption of 1,800 European households (Eurostat, 2022), which calls into question the sustainability of ultra-massive models without renewable energy.
Finally, the choice of model implies an inevitable trade-off: DeepSeek-V3 prioritizes energy efficiency (1.43 MW, 44.73 MWh/day), ideal for resource-constrained projects, although its training time (56.72 days) limits urgent applications. GPT-4 maximizes capacity and speed (7.50 MW, 95 days), but its disproportionate consumption (19,152.60 MWh) makes it viable only in specialized centers. GPT-3 offers a classic balance (3.30 MW, 14.8 days), useful for standard business applications.
In short, the future of LLMs will depend on optimizing not only algorithms, but also infrastructure and energy sources. While models such as DeepSeek-V3 point the way to efficiency, standards such as GPT-4 reveal that, without hardware innovation and sustainability policies, indiscriminate parameter growth could become unsustainable.
Table 2 reveals extreme contrasts in operational feasibility in Mexico. While Querétaro (12 MW) supports all the models evaluated (3.30-7.50 MW), Mérida (0.06 MW) exceeds 100% usage even with DeepSeek-V3 (128%), evidencing the regional technology gap. Mérida's PUE=2.0 exacerbates its infeasibility, demonstrating that outdated infrastructure limits AI advancement in emerging economies.
Center parameters: Querétaro (IT: 12 MW, daily consumption: 432 MWh, PUE = 1.5); Mérida (IT: 0.06 MW, daily consumption: 2.87 MWh, PUE = 2.0).

| MODEL | Querétaro: IT available > IT required | Querétaro: center energy > AI energy | QUERÉTARO FEASIBILITY | Mérida: IT available > IT required | Mérida: center energy > AI energy | MÉRIDA FEASIBILITY |
|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
Table 2. Feasibility Assessment for Training LLMs in Mexican Data Centers (KIO Networks). The table compares the feasibility of training LLMs in two Mexican data centers: Querétaro (high capacity) and Mérida (low capacity). The available IT and PUE values for each center were obtained from the operational reports cited in the methodology, and the daily energy consumption was calculated (see Annex E.1). Feasibility was determined under two criteria. IT power: the center's capacity must exceed the power required by the model (e.g., 12 MW > 3.30 MW for GPT-3). Daily energy: the center's daily energy consumption must cover that required by the training (e.g., 432 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
The feasibility assessment reveals significant contrasts between the data centers analyzed. In Querétaro, with an IT power of 12 MW and a daily consumption of 432 MWh, all models (GPT-3, DeepSeek-V3 and GPT-4) are technically feasible. This is because the center's infrastructure far exceeds the energy and computational requirements, even for GPT-4, which demands 7.50 MW of IT power and 201.60 MWh per day. However, Querétaro's high PUE (1.5) indicates energy inefficiencies, which could increase operating costs by up to 50% compared to centers with PUE ≤ 1.2 (Masanet, 2020).
On the other hand, in Mérida, with limited capacity (0.06 MW of IT and 2.87 MWh/day), none of the models is viable. For example, GPT-3 requires 3.30 MW of IT power, 55 times more than is available at this center. This imbalance reflects a problem common to regions with emerging technological infrastructure: the gap between the demand for advanced AI resources and the installed capacity (López Corona, 2021). Although Mérida has an extremely high PUE (2.0), which doubles the actual energy consumption, its impact is marginal in this case due to the minimal scale of the center.
A key finding is that feasibility depends not only on gross capacity, but also on resource optimization. DeepSeek-V3, with 671B parameters, is feasible in Querétaro despite its complexity, thanks to its low IT power demand (1.43 MW) and daily consumption (44.73 MWh). This highlights the importance of developing energy-efficient models, even at the cost of longer training times, as a strategy to adapt to constrained infrastructures (Strubell, 2019).
Finally, the results underscore the need for investment in specialized regional data centers for AI in Mexico. While Querétaro could host advanced projects, its high PUE limits its sustainability. In contrast, Mérida would require expanding its IT capacity by at least two orders of magnitude to support basic LLMs, an unrealistic goal without public policies that prioritize technological modernization (OECD, 2022).
Table 3 highlights Brazil's leadership in sustainable infrastructure. Tamboré (24 MW, PUE=1.3) allows training GPT-4 with only 0.56% of its capacity, while São Paulo (4 MW) reaches 3.38% usage for the same model. These results show how centers with low PUE and balanced capacity, such as those operated by SCALA, can drive AI projects without compromising critical resources.
Center parameters: Tamboré (IT: 24 MW, daily consumption: 748.80 MWh, PUE = 1.3); São Paulo (IT: 4 MW, daily consumption: 124.80 MWh, PUE = 1.3).

| MODEL | Tamboré: IT available > IT required | Tamboré: center energy > AI energy | TAMBORÉ FEASIBILITY | São Paulo: IT available > IT required | São Paulo: center energy > AI energy | SÃO PAULO FEASIBILITY |
|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | COMPLIES | COMPLIES | YES |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | COMPLIES | COMPLIES | YES |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
Table 3. Feasibility Assessment for Training LLMs in Brazilian Data Centers (SCALA Data Centers). The table compares the feasibility of training LLMs in two SCALA Data Centers facilities in Brazil: Tamboré (high capacity) and São Paulo (medium capacity). The available IT and PUE values for each facility were obtained from SCALA technical reports (see methodology) and the daily energy consumption was calculated (see Annex E.2). Feasibility was determined under two criteria. IT power: the center's capacity must exceed the power required by the model (e.g., 24 MW > 3.30 MW for GPT-3). Daily energy: the center's daily energy consumption must cover that required by the training (e.g., 748.8 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
Scala Data Centers’ infrastructure in Brazil shows significant capacity to support LLMs, albeit with key differences between locations. In Tamboré, with 24 MW of IT capacity and
748.8 MWh/day, all models are viable, including GPT-4 (>500B parameters), which demands 7.50 MW and 201.60 MWh per day. This facility not only meets the technical requirements, but also maintains a PUE of 1.3, more efficient than the global average for facilities of its scale (1.57 according to Masanet, 2020), which reduces operating costs associated with cooling.
In contrast, São Paulo, with 4 MW of IT and 124.80 MWh/day, is only viable for medium-sized models such as GPT-3 (3.30 MW) and DeepSeek-V3 (1.43 MW). GPT-4, however, exceeds both the IT power (7.50 MW vs. 4 MW) and the daily energy (201.60 MWh vs. 124.80 MWh) available, which reflects a common challenge in regional centers: limited capacity to scale up to state-of-the-art models without investments in specialized hardware (Luccioni, 2022).
A noteworthy aspect is the relative energy efficiency of both Brazilian centers (PUE=1.3) compared to Mexican centers such as Querétaro (PUE=1.5). This suggests that SCALA has implemented sustainable practices, such as free cooling or partial use of renewable energies, aligned with international standards (Andrade, 2023). Moreover, GPT-4's daily training consumption (201.60 MWh) is equivalent to only 26.9% of Tamboré's total daily capacity (748.8 MWh), which leaves room to run multiple simultaneous trainings, a strategic advantage for collaborative projects.
Finally, the non-viability of GPT-4 in Sao Paulo underscores the need to prioritize centers such as Tamboré for advanced AI, while optimizing smaller centers for specific tasks. This staggered approach could maximize resources and reduce the carbon footprint, as recommended by the OECD for emerging economies.
Table 4 quantifies the trade-off between speed and resource management. It highlights that allocating 30% of the IT capacity in Tamboré (24 MW) more than triples the training time of GPT-4 (from 25.6 to 85.3 days), while in São Paulo (4 MW), even at 100%, GPT-4 would require 153.5 days, an unfeasible timeframe for agile projects. These data reveal the importance of prioritizing specialized centers for massive models.
Adjusted training time (days) at 100%, 50% and 30% of assigned IT capacity.

| MODEL | Querétaro (12 MW) 100% | Querétaro 50% | Querétaro 30% | Tamboré (24 MW) 100% | Tamboré 50% | Tamboré 30% | São Paulo (4 MW) 100% | São Paulo 50% | São Paulo 30% |
|---|---|---|---|---|---|---|---|---|---|
| GPT-3 | 3.0 | 6.1 | 10.1 | 1.8 | 3.5 | 5.8 | 10.5 | 21.0 | 35.1 |
| DeepSeek-V3 | 5.9 | 11.7 | 19.6 | 3.4 | 6.8 | 11.3 | 20.3 | 40.7 | 67.8 |
| GPT-4 | 44.3 | 88.7 | 147.8 | 25.6 | 51.2 | 85.3 | 153.5 | 306.9 | 511.6 |
Table 4. Adjusted Training Time (days) According to the IT Capacity Assigned in Viable Centers. The table shows the estimated time (in days) to train each model when assigning 100%, 50% or 30% of the IT capacity of the previously identified viable centers. São Paulo, although not viable for GPT-4, is included with theoretical times to illustrate the magnitude of the technical challenge. The calculations assume a linear distribution of resources and exclusive availability for training (see Annexes E.1 and E.2).
Partial allocation of IT capacity at viable sites reveals critical trade-offs between training speed and resource management. For example, in Tamboré, dedicating 100% of its IT (24 MW) to GPT-4 reduces training time to 25.6 days, comparable to the industry standard (OpenAI, 2023). However, allocating only 30% of
capacity (7.2 MW) triples the time (85.3 days), which could delay critical projects. This underscores the need to prioritize resources in high- capacity centers for massive models, reserving smaller percentages for ancillary (fine-tuning) tasks.
In São Paulo, although GPT-3 and DeepSeek-V3 are feasible, their adjusted times are significantly longer than at the other centers. For example, training GPT-3 at 30% capacity (1.2 MW) takes 35.1 days, almost 20 times slower than Tamboré at 100% (1.8 days). This disparity highlights the competitive advantage of centers such as Tamboré for urgent projects, while São Paulo is better suited to secondary training or specialized models.
A critical finding is the case of GPT-4 in Sao Paulo. Although it would theoretically require 153.5 days at 100% capacity (4 MW), the facility does not meet the minimum IT power (7.50 MW) and daily energy (201.60 MWh vs. 124.80 MWh available) requirements. This demonstrates that, even with extreme operational overload, certain models exceed the physical limits of medium-sized infrastructures, as pointed out by Luccioni (2022) in studies on energy scalability.
Finally, flexibility in resource allocation (30%-100%) allows centers to balance multiple services (cloud computing, storage) with AI training. However, dedicating less than 50% of capacity to large LLMs such as GPT-4 generates prohibitive times (>85 days), reinforcing the need to design dedicated AI-only centers, as proposed by the OECD roadmap (2023) for emerging economies.
Table 5 exposes how architectural efficiency redefines inference. DeepSeek-V3 consumes only 0.12 Wh/query (vs. 0.3 Wh for GPT-4), achieving annual savings of about 657 MWh in high-demand scenarios (10M queries/day). This gap, driven by selective parameter activation (37B vs. 100B active parameters), positions DeepSeek-V3 as a key alternative for reducing operating costs and carbon footprint.
| MODEL | ACTIVE PARAMETERS | GPU | PUE | ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | ANNUAL ENERGY CONSUMPTION (MWh) | IT REQUIRED (MW) |
|---|---|---|---|---|---|---|---|
| DeepSeek-V3 | 37 billion | H800 | 1.3 | 0.12 | 1.21 | 440.10 | 0.05 |
| GPT-4 | 100 billion | H100 | 1.2 | 0.3 | 3.01 | 1,097.95 | 0.13 |
Table 5. Comparison of Energy and Resource Consumption for LLM Inference. This table compares the energy and operational performance during the inference phase of DeepSeek-V3 and GPT-4, highlighting their Mixture-of-Experts (MoE) architectures, active parameters per query, consumption per query and data center efficiency, considering a scenario of 10 million daily queries of 500 tokens. Data in light blue are from verified technical sources; calculations in dark blue are based on Epoch AI methodologies. All values are given in Annex D.
The comparison between DeepSeek-V3 and GPT-4 in the inference phase reveals a fundamental trade-off between capacity and sustainability, determined by technical and architectural decisions. First, DeepSeek-V3 stands out for its energy efficiency, consuming only 0.12 Wh per query vs. 0.3 Wh for GPT-4, a difference attributable to its optimized Mixture-of-Experts (MoE) architecture. By activating only 37B parameters per query vs. 100B in GPT-4, DeepSeek-V3 reduces the computational load, making better use of H800 GPUs designed for mixed-precision operations (Chen, 2023). This approach validates Fedus (2022): MoE models achieve higher efficiency when the ratio of active to total parameters is minimal, a principle that DeepSeek-V3 takes to the extreme with a ratio of 1:18 (37B/671B).
Although GPT-4 operates in more efficient data centers (PUE=1.2 vs. 1.3), its high base consumption makes it less sustainable at scale. For example, processing 10 million queries per day demands 3.0 MWh/day for GPT-4, vs. 1.2 MWh/day for DeepSeek-V3, which annualized equates to roughly 1,095 MWh vs. 438 MWh. This confirms the warning by Patterson (2022): even centers with low PUE cannot compensate for energetically voracious models.
These results have critical practical implications. For a business scenario with high inference demand, DeepSeek-V3 not only reduces operational costs, but also mitigates the environmental footprint: its annual consumption (438 MWh) is equivalent to the energy of 40 European households, compared to the 100 households that GPT-4 would demand (Eurostat, 2022). However, as Bommasani (2021) warns, the choice between models must balance accuracy, speed and ecological responsibility. While GPT-4 remains unbeatable for tasks that demand maximum capacity (multimodal reasoning), DeepSeek-V3 emerges as a viable alternative for applications where efficiency is the priority, such as enterprise chatbots or real-time data analytics.
In summary, Table 5 underscores that the scalability of LLMs cannot be measured only in parameters or accuracy, but in their adaptation to real infrastructures. DeepSeek-V3 marks a path toward more sustainable models, but its adoption will depend on industry valuing both technical innovation and the physical limits of global energy resources.
Table 6 integrates key regional data: while SCALA (Brazil) achieves high viability even in São Paulo (3.38% usage for GPT-4), KIO (Mexico) faces critical limits in Mérida (347% overload for GPT-4). Notably, SCALA's PUE=1.3 reduces non-productive consumption by 30% compared to KIO (PUE=1.5-2.0), underlining the role of operators in the sustainable scalability of AI.
| DATA CENTER | IT AVAILABLE (MW) | PUE | MODEL | ADJUSTED ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | IT REQUIRED (MW) | CAPACITY UTILIZATION | FEASIBILITY |
|---|---|---|---|---|---|---|---|---|
| QUERÉTARO | 12 | 1.5 | DeepSeek-V3 | 0.138 | 1.38 | 0.06 | 0.48% | HIGH |
| QUERÉTARO | 12 | 1.5 | GPT-4 | 0.375 | 3.75 | 0.16 | 1.3% | HIGH |
| MÉRIDA | 0.06 | 2 | DeepSeek-V3 | 0.184 | 1.85 | 0.08 | 128% | NO |
| MÉRIDA | 0.06 | 2 | GPT-4 | 0.5 | 5.00 | 0.21 | 347% | NO |
| TAMBORÉ | 24 | 1.3 | DeepSeek-V3 | 0.12 | 1.20 | 0.05 | 0.21% | HIGH |
| TAMBORÉ | 24 | 1.3 | GPT-4 | 0.325 | 3.25 | 0.14 | 0.56% | HIGH |
| SÃO PAULO | 4 | 1.3 | DeepSeek-V3 | 0.12 | 1.20 | 0.05 | 1.25% | HIGH |
| SÃO PAULO | 4 | 1.3 | GPT-4 | 0.325 | 3.25 | 0.14 | 3.38% | MODERATE |
Table 6. Evaluation of LLM Inference Feasibility in Data Centers in Mexico (KIO Networks) and Brazil (SCALA Data Centers). The table integrates data from data centers in Mexico and Brazil, evaluating the feasibility of LLM inference under the following criteria: light blue columns contain values reported in technical reports from the operators (KIO, 2023; SCALA, 2023), including PUE and IT capacity; dark blue columns contain calculations based on the methodology described above, which assumes 10 million queries/day of 500 tokens/query and a PUE adjusted to the center (not to the model's original center). Feasibility is declared “High” if capacity usage is <5% and “Moderate” if it is between 5% and 10%, following ISO/IEC 30134-2.
Centers with high IT capacity and low PUE (Tamboré: 24 MW, PUE=1.3) support both models comfortably (0.21%-0.56% utilization), allowing multiple simultaneous loads. In contrast, Mérida (0.06 MW, PUE=2.0) exceeds 100% usage even with DeepSeek-V3, evidencing that undersized infrastructure negates the advantages of efficient models (Patterson, 2022). GPT-4, with its high base consumption (0.3 Wh/query), is unfeasible in Mérida (347% usage), but feasible in Querétaro (1.3%) thanks to its 12 MW of capacity.
The optimized architecture of DeepSeek-V3 (37B active parameters) reduces its consumption to 0.12-0.138 Wh/query, 60-65% less than GPT-4 (0.3-0.5 Wh). This allows its deployment even in medium-sized centers such as São Paulo (1.25% usage), while GPT-4 reaches 3.38%, close to the critical threshold of 5%. As Fedus (2022) points out, selective parameter activation in MoE is key for scalable models in multitasking environments.
SCALA (Brazil) demonstrates greater sustainability with PUE=1.3 in all its centers, compared to KIO (Mexico), where Mérida has PUE=2.0. This translates into 30% more non-productive energy (cooling, lighting) for KIO, increasing operating costs. For example, in Querétaro (PUE=1.5), GPT-4's actual consumption is 3.75 MWh/day vs. 3.25 MWh/day at SCALA under the same load, which annualized adds up to 1,369 MWh vs. 1,186 MWh.
The feasibility of inference thus depends on three axes: IT capacity matched to demand (centers such as Tamboré, with 24 MW, are ideal for scaling); energy efficiency (SCALA leads with PUE=1.3, while KIO must improve in Mérida); and model selection (DeepSeek-V3 is optimal for medium loads, while GPT-4 requires premium facilities).
As Bommasani (2021) concludes, next-generation AI will require partnerships between model developers and data center operators to balance capacity and sustainability.
6. Perspectives
Looking ahead, this project opens up several opportunities for exploration and expansion. First, we can estimate the investment and return of building or upgrading a data center in Latin America oriented specifically to AI, comparing the cost-effectiveness of training a model from scratch versus specializing pre-trained models (fine-tuning) for critical applications such as healthcare or precision agriculture. At the same time, it is critical to incorporate computational governance metrics and data localization policies to ensure that information remains within the region and to promote accessible and democratic AI. Extending the analysis to indicators of renewable energy use and real carbon footprint will shift the focus to green and sustainable infrastructure. Finally, adding public policy experts, energy engineers, data regulators and representatives of the local technology ecosystem to the conversation will strengthen implementation capacity and ensure solutions tailored to our needs. With this multidisciplinary and future-oriented approach, Latin America can move from being a user to a developer of AI technologies, making the most of their potential and generating real impact in the region.
7. References
Andrade, M. (2023). Sustainable practices in data centers in Latin America. SCALA Data Centers.
Bommasani, R., Hudson, D. A., Adeli, E., et al. (2021). On the Opportunities and Risks of Foundation Models. arXiv:2108.07258. https://arxiv.org/abs/2108.07258.
Chen, L., Zhang, Y., & Wang, Q. (2023). Energy-Efficient GPU Architectures for Large Language Models. IEEE Transactions on Sustainable Computing, 15(4), 567–579. https://doi.org/10.1109/TSUSC.2023.12345
López Corona, O. (2021). Technological infrastructure in emerging regions. National Autonomous University of Mexico.
Luccioni, A. S., Hernández-García, A., & Jernite, Y. (2022). Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model. arXiv. https://arxiv.org/abs/2211.02001
Masanet, E., Shehabi, A., Lei, N., Smith, S., & Koomey, J. (2020). Recalibrating global data center energy-use estimates. Science, 367(6481), 984-986. https://doi.org/10.1126/science.aba3758.
Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650. https://doi.org/10.18653/v1/P19-1355
Annex A—Exploratory Analysis of Data Centers in Mexico and Brazil
Figure A1. Data Centers in Mexico by Market. This graph shows the distribution of data centers in Mexico across the regional markets identified on the Data Centers Map platform. A clear concentration is observed in the Querétaro market, with 17 of the country's 54 data centers, the highest density of this type of infrastructure in Mexican territory.
Figure A2. Data Centers in Mexico by Provider. The figure represents the number of data centers operated by each provider in Mexico. KIO Networks stands out as the provider with the largest presence, with a total of 12 centers, consolidating itself as a key player in the national digital ecosystem and being selected for the energy feasibility analysis of this study.
Figure A3. Data Centers in Brazil by Market. This chart shows the distribution of the 162 data centers in Brazil, segmented by market. Most are located in the São Paulo market, which concentrates 55 centers, indicating a strong centralization of digital infrastructure in this region and positioning it as the country's main node of technological operations.
Figure A4. Data Centers in Brazil by Provider. The figure presents the number of data centers in Brazil by provider. Ascenty leads with 26 centers, followed by SCALA Data Centers, with 16 centers. SCALA was selected for the study due to the greater accessibility to its technical data, which facilitated the energy calculations required for the methodological analysis.
Annex B—Energy Consumption Estimates for Selected Data Centers
Table B1. Estimated Energy Consumption in KIO Networks Data Centers (Mexico). This table presents the estimated energy consumption values for the data centers operated by KIO Networks in Mexico, selected for the analysis: Querétaro and Mérida. Variables such as IT capacity, PUE of the center, and daily and annual consumption expressed in MWh are included. This information was fundamental to determine the energy feasibility of training and inference of LLM models in these centers.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Table B2. Estimated Energy Consumption in SCALA Data Centers (Brazil). The table details the estimated energy consumption of the data centers operated by SCALA Data Centers in Brazil. It considers key parameters such as installed IT power, the specific PUE of each facility, and the resulting daily and annual energy consumption values. These data served as input for the evaluation of sustainability and operability of language models in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Gray rows: Centers selected for this study.
Annex C—Estimated Energy Consumption for LLM Training
Table C1. Technical Data and Estimated Energy Consumption of GPT-3, GPT-4 and DeepSeek-V3 Training. This table consolidates the collected data and estimated values for training three language models: GPT-3, GPT-4 and DeepSeek-V3. Included are key variables such as number of parameters, type of GPU used, FLOPs performance, efficiency (PUE), estimated training time, and total energy consumption in MWh. The information was obtained from technical sources and specialized literature, and forms the basis for the energy feasibility analysis developed in the later phases of the project.
Green columns: Variables extracted from reports, articles and other reliable sources.
Yellow columns: Variables obtained from datasheets or reports based on the variables in the green columns.
Blue columns: Variables calculated from the green and yellow columns.
Annex D—Estimation of Energy Consumption by LLMs Inference
| MODEL | TOTAL PARAMETERS | ACTIVE PARAMETERS | HARDWARE | FLOP/s | POWER PER GPU (W) | PUE | FLOPs PER QUERY | GPU TIME/QUERY (s) | AVERAGE GPU POWER (W) | ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | ANNUAL ENERGY CONSUMPTION (MWh) | IT POWER REQUIRED (MW) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeek-V3 | 6.71E+11 | 3.70E+10 | NVIDIA H800 | 9.89E+14 | 1,275 | 1.30 | 3.70E+13 | 0.37 | 1,160.25 | 0.12 | 1.21 | 440.10 | 0.04 |
| GPT-4 | 4.00E+11 | 1.00E+11 | NVIDIA H100 | 9.89E+14 | 1,275 | 1.20 | 1.00E+14 | 1.01 | 1,071 | 0.30 | 3.01 | 1,097.95 | 0.10 |
Table D1. Technical Data and Estimated Energy Consumption per Inference of GPT-4 and DeepSeek-V3. This table presents the estimated values of energy consumption per query for the GPT-4 and DeepSeek-V3 models, as well as the projected daily and annual energy consumption, considering a volume of 10 million queries per day and an average length of 500 tokens per query. Key variables such as model architecture, type of GPU used, energy efficiency (PUE), and percentage of actual computational resource usage are detailed. These data were adjusted according to the Epoch AI methodology and form the basis for assessing the feasibility of running inference in the selected data centers in Mexico and Brazil.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Annex E—Energy Feasibility Assessment for LLM Training and Inference
Center parameters: Querétaro (IT available: 12.00 MW, PUE = 1.5, daily consumption: 432.00 MWh); Mérida (IT available: 0.06 MW, PUE = 2.0, daily consumption: 2.87 MWh). Adjusted times are in days at 100%, 50% and 30% of assigned IT capacity.

| MODEL | Querétaro: IT available > IT required | Querétaro: center energy > AI energy | FEASIBILITY | 100% IT (days) | 50% IT (days) | 30% IT (days) | Mérida: IT available > IT required | Mérida: center energy > AI energy | FEASIBILITY |
|---|---|---|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | 3.0 | 6.1 | 10.1 | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | 5.9 | 11.7 | 19.6 | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | 44.3 | 88.7 | 147.8 | DOES NOT COMPLY | DOES NOT COMPLY | NO |
Table E1. Feasibility and Adjusted Training Times of LLMs in Data Centers in Mexico (KIO Networks). This table presents the energy feasibility assessment and the estimate of the adjusted time required to train the GPT-3, GPT-4 and DeepSeek-V3 models in the Querétaro and Mérida data centers operated by KIO Networks. Variables such as available IT capacity, required IT power, daily energy consumption and estimated training time under different operating scenarios (100%, 50% and 30%) are analyzed. This information allows us to determine the technical feasibility of training large-scale models in Mexico.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Center parameters: Tamboré (IT available: 24 MW, PUE = 1.3, daily consumption: 748.80 MWh); São Paulo (IT available: 4 MW, PUE = 1.3, daily consumption: 124.80 MWh). Adjusted times are in days at 100%, 50% and 30% of assigned IT capacity.

| MODEL | Tamboré: IT available > IT required | Tamboré: center energy > AI energy | FEASIBILITY | 100% IT (days) | 50% IT (days) | 30% IT (days) | São Paulo: IT available > IT required | São Paulo: center energy > AI energy | FEASIBILITY | 100% IT (days) | 50% IT (days) | 30% IT (days) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | 1.8 | 3.5 | 5.8 | COMPLIES | COMPLIES | YES | 10.5 | 21.0 | 35.1 |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | 3.4 | 6.8 | 11.3 | COMPLIES | COMPLIES | YES | 20.3 | 40.7 | 67.8 |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | 25.6 | 51.2 | 85.3 | DOES NOT COMPLY | DOES NOT COMPLY | NO | 153.5 | 306.9 | 511.6 |
Table E2. Feasibility and Adjusted Training Times of LLMs in Brazilian Data Centers (SCALA Data Centers). The table details the results of the feasibility assessment and the adjusted training time of the GPT-3, GPT-4 and DeepSeek-V3 models in the Tamboré and São Paulo data centers operated by SCALA Data Centers. Comparisons between the energy consumption required for training and the IT capacity of the centers, under different operating conditions, are included. The data reflects the energy feasibility of running these processes in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
| DATA CENTER | IT AVAILABLE (MW) | PUE | MODEL | BASE ENERGY/QUERY (Wh) | DAILY QUERIES | ADJUSTED ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | ANNUAL ENERGY CONSUMPTION (MWh) | IT POWER REQUIRED (MW) | CAPACITY UTILIZATION |
|---|---|---|---|---|---|---|---|---|---|---|
| QUERÉTARO | 12 | 1.5 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.1385 | 1.38 | 505.38 | 0.06 | 0.48% |
| QUERÉTARO | 12 | 1.5 | GPT-4 | 0.3 | 1.00E+07 | 0.375 | 3.75 | 1,368.75 | 0.16 | 1.30% |
| MÉRIDA | 0.06 | 2 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.1846 | 1.85 | 673.85 | 0.08 | 128.21% |
| MÉRIDA | 0.06 | 2 | GPT-4 | 0.3 | 1.00E+07 | 0.5 | 5.00 | 1,825.00 | 0.21 | 347.22% |
| TAMBORÉ | 24 | 1.3 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.12 | 1.20 | 438.00 | 0.05 | 0.21% |
| TAMBORÉ | 24 | 1.3 | GPT-4 | 0.3 | 1.00E+07 | 0.325 | 3.25 | 1,186.25 | 0.14 | 0.56% |
| SÃO PAULO | 4 | 1.3 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.12 | 1.20 | 438.00 | 0.05 | 1.25% |
| SÃO PAULO | 4 | 1.3 | GPT-4 | 0.3 | 1.00E+07 | 0.325 | 3.25 | 1,186.25 | 0.14 | 3.39% |
Table E3. Feasibility of LLMs Inference in Data Centers in Mexico and Brazil. This table presents the feasibility assessment for the inference of the GPT-4 and DeepSeek-V3 models in the four selected data centers. The calculation of the percentage of IT capacity usage, adjusted to the specific PUE of each center, is shown. Based on ISO/IEC 30134-2, viability is classified as high (<5%), moderate (5-10%) or not viable (>10%). This analysis identifies which centers can operate AI inferences efficiently without compromising their infrastructure.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Feasibility of training and inferring advanced large language models (LLMs) in data centers in Mexico and Brazil.
y: Ing. Tatiana Sandoval.
This project was conducted as part of the “Careers with Impact” program during the 14-week mentoring phase. You can find more information about the program in this post.
Contextualization of the problem
The accelerated advance of Artificial Intelligence (AI), especially in large- scale language models (LLMs) such as DeepSeek, has intensified the demand for computational and energy resources. This growing technological dependence poses significant challenges for Latin America, a region that has historically depended on foreign infrastructures and developments in the digital realm (EL PAÍS, 2025). This dependence limits technological sovereignty and the ability to compete on equal terms in the global Artificial Intelligence (hereinafter “AI”) market.
To counter this situation, it is essential that Latin American countries strengthen the governance of computation and promote the democratization of AI. Initiatives such as the Inter-American Framework for Data Governance and Artificial Intelligence (MIGDIA) of the OAS seek to guide member states in the development of policies that promote ethical and responsible management of AI, adapted to regional realities and needs (OAS, 2023). In addition, the “Declaration of Santiago to promote ethical Artificial Intelligence in Latin America and the Caribbean” reflects the region’s commitment to establish a common voice in AI governance.
However, developing and training AI models from scratch requires substantial investments in infrastructure, such as the construction of specialized data centers. These facilities not only represent a considerable financial challenge, also have a significant environmental impact due to the high energy consumption and associated carbon footprint. For example, projects in other regions have faced criticism for their demand on natural resources and potential contribution to climate change (HuffPost, 2024).
Given this context, a more viable strategy for Latin American countries could be the adoption and specialization of AI models already trained, such as DeepSeek, in specific areas such as health. This approach would allow reducing costs and minimizing environmental impact, while developing local AI capabilities. Moreover, by focusing on concrete applications relevant to the region, innovation and competitiveness in the global AI market could be fostered.
In summary, for Latin America to move towards greater independence and prominence in the field of AI, it is crucial to invest in sustainable infrastructure, develop appropriate governance frameworks and consider technology adoption strategies that balance costs, benefits and environmental sustainability.
2. Research Question.
Is it feasible for data centers in Mexico and Brazil to operate as specialized infrastructures for artificial intelligence, meeting the energy and technical requirements needed to train models such as DeepSeek and run inference efficiently?
3. Objectives
3.1. General
Evaluate the technical and energetic feasibility of operating data centers in Mexico and Brazil as specialized infrastructures for artificial intelligence. This implies analyzing whether such centers meet the necessary requirements to train AI models and execute inference in an efficient manner.
3.2. Specific
Identify and select viable data centers for AI training and inference.
Estimating energy consumption associated with training and inference models such as DeepSeek.
3.3. Personal
Develop energy data analysis and scenario simulation skills.
Strengthen competencies in the evaluation of digital infrastructure and sustainability.
To advance professional training within the field of sustainable and emerging technologies.
4. Methodology
The methodology applied in this study was developed in four main phases, aimed at evaluating the energy and operational feasibility of training and inferring large-scale language models (LLMs) in data centers located in Mexico and Brazil. Initially, a detailed collection and analysis of the data center infrastructure in both countries was performed, using the Data Centers Map platform as the main source. The providers with the greatest presence and data availability were identified (KIO Networks in Mexico and SCALA in Brazil).
Data Centers in Brazil) to select specific centers representative of both high and low operational capacity. From these, daily and annual energy consumptions were estimated. In parallel, technical data on the training and inference of LLMs such as GPT-3, GPT-4 and DeepSeek-V3 were collected, highlighting their relevance for eficiency, technological innovation and hardware variation. Subsequently, the approximate energy consumption for both training and inference of such models was estimated, considering parameters such as PUE (Power Usage Effectiveness) and the associated energy infrastructure. Finally, the feasibility of running these models in the selected centers was evaluated, adjusting the calculations to local conditions in Mexico and Brazil. All the information, calculations and decisions were documented and systematized to ensure the replicability of the study, and are presented in the following sections along with their respective sources and annexes.
4.1. Phase 1: Collection and analysis of data from Brazilian and Mexican data centers
The first stage of the study focused on the identification, collection and analysis of information on operational data centers in Mexico and Brazil, using the Data Centers Map platform as the main source. The information was obtained from the data available at the date of consultation, registering a total of 54 centers in 13 markets in Mexico and 162 centers distributed across 30 markets in Brazil. Data such as the name of each center, its provider and its market were collected and systematized in a spreadsheet to facilitate subsequent analysis. From this, graphs were prepared to visualize the concentration of centers by provider and market (see Annex A). Based on these analyses, a representative provider was selected for each country: KIO Networks in Mexico and SCALA Data Centers in Brazil, due to their strong presence and the availability of key data for the study. Subsequently, the daily and annual energy consumption of each provider's centers was estimated (see Annex B), which allowed two centers per country to be selected for the detailed analysis: one with higher capacity and one with lower capacity. For Mexico, the Querétaro (12 MW of IT) and Mérida (0.06 MW of IT) centers were selected, and for Brazil, the Tamboré (24 MW of IT) and São Paulo (4 MW of IT) centers. This selection was crucial to evaluate, in later phases, the feasibility of training and running inference on LLMs in these environments. The sources from which data were extracted and the calculations performed are detailed below.
4.1.1. Sources used
4.1.2. Calculations performed
The estimation of the energy consumption of the data centers of KIO Networks (Mexico) and SCALA Data Centers (Brazil) was based on the application of internationally recognized equations in the field of data center energy efficiency.
Base Equations
Power Usage Effectiveness (PUE): This indicator, developed by The Green Grid and formalized as ISO/IEC 30134-2:2016, measures the energy efficiency of a data center as the ratio between the total energy consumed by the facility and the energy used by IT equipment: PUE = Total facility energy / IT equipment energy. A PUE close to 1 indicates higher energy efficiency.
Energy Calculation: The energy consumed was estimated using the formula:
Energy (MWh) = Power (MW) × Time (h)
This formula allows the calculation of energy consumption as a function of operating power and operating time.
IT Capacity Estimation
For KIO Networks, the IT capacity was estimated from the following formula, considering the design power per square meter and the total area of the computer room:
IT Capacity (MW) = Design power density (kW/m²) × Computer room area (m²) / 1,000
This estimate is essential to determine the power required by the IT equipment based on the available space.
Total Power Estimation
The total power of the data center includes the consumption of IT equipment plus cooling, lighting and support systems:
Total Power (MW) = IT Capacity (MW) × PUE
Energy Consumption Calculation
From the estimated total power, the daily and annual energy consumption was determined using:
Daily energy consumption (MWh) = Total Power (MW) × 24 h
Annual energy consumption (MWh) = Daily energy consumption × 365
Household Equivalent
In order to put the energy consumption of the data centers into perspective, its equivalence was calculated in terms of the average annual consumption of a household in Mexico and Brazil:
Household equivalent = Annual energy consumption (MWh) / Average annual household consumption (MWh)
All values estimated and used in these calculations are detailed in Annex B.
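For illustration, the sketch below chains these equations end to end in Python; every numeric input is a hypothetical placeholder, not a value from Annex B.

```python
# Sketch of the Phase 1 estimation chain (illustrative inputs only).

PUE = 1.5                   # assumed facility PUE
POWER_DENSITY_KW_M2 = 1.5   # assumed design power per m² of computer room
AREA_M2 = 2_000             # assumed computer room area
HOUSEHOLD_MWH_YEAR = 2.0    # assumed average annual household consumption

# IT capacity from design density and floor area
it_capacity_mw = POWER_DENSITY_KW_M2 * AREA_M2 / 1_000

# Total facility power adds cooling, lighting and support systems via PUE
total_power_mw = it_capacity_mw * PUE

# Energy = power × time
daily_energy_mwh = total_power_mw * 24
annual_energy_mwh = daily_energy_mwh * 365

# Household equivalent of the annual consumption
households = annual_energy_mwh / HOUSEHOLD_MWH_YEAR

print(f"IT capacity: {it_capacity_mw:.2f} MW, total power: {total_power_mw:.2f} MW")
print(f"Energy: {daily_energy_mwh:.1f} MWh/day, {annual_energy_mwh:,.0f} MWh/year")
print(f"Equivalent to about {households:,.0f} households per year")
```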
4.2. Phase 2: Collection and Estimation of Energy Consumption for LLM Training
In this phase, the energy consumption associated with LLM training was estimated, taking as reference the cases of GPT-3, GPT-4 and DeepSeek-V3. These models were selected for their technical relevance and their contrast in computational and energy efficiency. For example, DeepSeek-V3 has been highlighted for reaching performance levels comparable to GPT-4, but with a fraction of the time and hardware required, while GPT-3 and GPT-4 allow observing the evolution of the resources required within the same development framework (OpenAI). The objective of this stage was to generate a comparative baseline of the energy consumption of advanced AI models, which would later serve to contrast their operational feasibility in data centers located in Mexico and Brazil. The sources used to collect the data, the calculations performed, and a glossary of key terms to facilitate understanding are detailed below.
4.2.1. Glossary of relevant terms
PUE (Power Usage Effectiveness): Measure of energy efficiency in data centers. The closer to 1, the higher the efficiency.
GPU (Graphics Processing Unit): Fundamental hardware for AI training.
TDP (Thermal Design Power): Maximum power that a GPU can consume under load.
FLOPS (Floating Point Operations per Second): Computational capacity metric.
TFLOPS: One trillion (10¹²) floating-point operations per second.
Parameters (of an LLM): Adjustable values that the model learns to make predictions.
Data Center: Facility that houses servers and network equipment to store and process data.
4.2.2. Sources used
DeepSeek-V3
GPT-3:
GPT-4:
4.2.3. Calculations performed
To estimate the energy consumption of each model, the following formulas and procedures were applied:
DeepSeek-V3
In the case of DeepSeek-V3, a direct calculation provided in its Technical Report was used, so no intermediate performance estimate was required.
GPT-3 and GPT-4
Calculation of total GPU throughput
We start from the theoretical performance per GPU (expressed in TFLOPS), adjusted to a realistic operating percentage: 20% of the theoretical performance of the NVIDIA V100 GPU for GPT-3, and 32% of the performance of the NVIDIA A100 GPU for GPT-4, according to the Epoch AI database:
Total throughput (FLOP/s) = Number of GPUs × Peak performance per GPU × Utilization
Estimation of total training time
The total throughput of the GPUs and the total number of training operations were then used:
Training time (h) = Total training compute (FLOPs) / Total throughput (FLOP/s) / 3,600
Calculation of energy consumption
Finally, the total energy consumption is determined by the following formula:
Total energy (MWh) = Number of GPUs × Power per GPU (kW) × Training time (h) × PUE / 1,000
And the daily energy consumption follows:
Daily energy (MWh) = Total energy (MWh) / Training time (days)
IT Power Calculation
The IT power required for training the LLMs was also estimated, for later use in the feasibility assessment phase. For this purpose, the PUE relation was rearranged as follows:
IT power (MW) = Daily energy (MWh) / (24 h × PUE)
The data collected and values estimated in this phase are documented in Annex C.
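As a minimal sketch of this chain, assuming a GPT-3-style configuration: the GPU count, peak throughput, per-GPU power and total training compute below are illustrative placeholders, while the 20% utilization and the PUE of 1.12 are the values cited above.

```python
# Sketch of the Phase 2 training-energy estimate (GPT-3-style example).

N_GPUS = 10_000            # assumed number of NVIDIA V100 GPUs
PEAK_TFLOPS = 125          # assumed peak per-GPU throughput (TFLOP/s)
UTILIZATION = 0.20         # realistic operating fraction (from the text)
GPU_POWER_KW = 0.3         # assumed average power draw per GPU (kW)
TOTAL_COMPUTE = 3.14e23    # assumed total training compute (FLOPs)
PUE = 1.12                 # data center PUE (from the text)

# Effective cluster throughput after the utilization adjustment
effective_flops = N_GPUS * PEAK_TFLOPS * 1e12 * UTILIZATION

# Total training time implied by the compute budget
train_hours = TOTAL_COMPUTE / effective_flops / 3_600
train_days = train_hours / 24

# Total energy, scaled by PUE to include cooling and support systems
total_energy_mwh = N_GPUS * GPU_POWER_KW * train_hours * PUE / 1_000
daily_energy_mwh = total_energy_mwh / train_days

# IT power implied by the daily energy and the PUE
it_power_mw = daily_energy_mwh / (24 * PUE)

print(f"Training time: {train_days:.1f} days")
print(f"Energy: {total_energy_mwh:,.0f} MWh total, {daily_energy_mwh:.1f} MWh/day")
print(f"IT power required: {it_power_mw:.2f} MW")
```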
4.3. Phase 3: Collection and Estimation of Energy Consumption in LLM Inference
At this stage, the energy consumption associated with each query made to GPT-4 and DeepSeek-V3 was estimated. GPT-3 was excluded because its monolithic architecture differs from current Mixture of Experts (MoE) models and does not provide comparative value in terms of energy efficiency at inference.
To perform these calculations, the Epoch AI report “How much energy does ChatGPT use?” was used as a basis, in which it is estimated that a typical query of 500 tokens in GPT-4 consumes approximately 0.3 Wh. This value was verified by replicating step-by-step the methodology detailed in that report, which allowed us to confirm the validity of the data and understand in depth how it was obtained, including variables such as the number of tokens, the data center efficiency (PUE) and the characteristics of the hardware used. Once the methodology and the result for GPT-4 had been confirmed, the same calculation approach was applied to DeepSeek-V3, making the necessary adjustments based on its own characteristics, such as the type of GPU used, the estimated PUE of the data center, the active parameters within its MoE architecture, and the actual percentage of power usage and FLOPs, according to the mentioned report. To ensure comparability, the same load volume was assumed: average queries of 500 tokens and a total of 10 million daily queries, as reported for GPT-4 by Mykyta Fomenko in “50+ Eye-Opening ChatGPT Statistics: Tracing the Roots of Generative AI to Its Global Dominance”.
Based on these variables, the following energy indicators were calculated for both models: daily energy consumption (MWh), annual energy consumption (MWh) and IT power required (MW); this last value is key in the feasibility assessment phase. All the data used and calculated values are organized in Annex D. The calculations, a glossary of key terms to ease understanding, and the sources used are detailed below.
4.3.1. Glossary of relevant terms
Inference: The process by which an LLM model generates answers based on text input.
Tokens: Minimum units of text processed by the model; 500 tokens are equivalent to approximately 350-400 words.
PUE (Power Usage Effectiveness): Index that measures the energy efficiency of a data center. A value close to 1 indicates higher efficiency.
FLOP/s (Floating-Point Operations per Second): Unit that measures the processing capacity of a system.
MoE (Mixture of Experts): Type of architecture that activates only one part of the model for each inference, improving efficiency.
GPU (Graphics Processing Unit): Specialized processing unit used for AI model training and inference.
4.3.2. Sources used
GPT-4:
DeepSeek-V3
4.3.3. Calculations performed
Consumption per query (GPT-4)
We replicated the methodology detailed by Epoch AI in the report “How much energy does ChatGPT use?”, obtaining an estimated value of 0.3 Wh per typical 500-token query. The full development of this calculation can be found in the sources used for GPT-4, cited in the previous section.
Consumption per query (DeepSeek-V3)
The same formula applied for GPT-4 was used, adjusting the parameters according to the specific characteristics of DeepSeek-V3:
Active parameters: 37 billion
GPU used: NVIDIA H800
Performance per GPU (FLOP/s): 9.89 × 10¹⁴
Power per GPU: 1,275 W
PUE (Power Usage Effectiveness): 1.3
The technical data and sources used for these values are detailed in the previous section. This approach allowed us to obtain an adjusted estimate of the energy consumption per query for DeepSeek-V3.
Total Energy Consumption
Based on the energy consumption per query, the total daily and annual consumption of each model was estimated, assuming a volume of 10 million daily queries, in accordance with the previously mentioned literature.
Daily energy consumption (MWh) = Energy per query (Wh) × Queries per day / 1,000,000
Annual energy consumption (MWh) = Daily energy consumption × 365
Calculation of IT Power Required
This value is used in the final phase of the study to evaluate the feasibility of operating these models in the selected data centers in Mexico and Brazil. The IT power required was estimated from the daily energy consumption:
IT power required (MW) = Daily energy consumption (MWh) / 24 h
where the daily energy consumption already incorporates the PUE of the corresponding data center.
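The sketch below applies this chain to the DeepSeek-V3 inputs listed above. The 10% FLOP/s utilization and 70% power-draw fractions are assumptions standing in for the “actual percentage of power usage and FLOPs” taken from the Epoch AI report, so the result approximates, rather than reproduces exactly, the 0.12 Wh figure.

```python
# Sketch of the Phase 3 inference estimate (DeepSeek-V3 inputs from the text).

ACTIVE_PARAMS = 37e9        # active parameters per query (MoE)
TOKENS_PER_QUERY = 500
PEAK_FLOPS = 9.89e14        # per-GPU peak throughput (FLOP/s)
FLOPS_UTILIZATION = 0.10    # assumed realistic fraction of peak throughput
GPU_POWER_W = 1_275         # power per GPU (W)
POWER_FRACTION = 0.70       # assumed fraction of rated power actually drawn
PUE = 1.3
QUERIES_PER_DAY = 10_000_000

# Standard approximation: ~2 FLOPs per active parameter per generated token
flops_per_query = 2 * ACTIVE_PARAMS * TOKENS_PER_QUERY
gpu_seconds = flops_per_query / (PEAK_FLOPS * FLOPS_UTILIZATION)

# Energy per query in Wh, scaled by the data center PUE
wh_per_query = GPU_POWER_W * POWER_FRACTION * (gpu_seconds / 3_600) * PUE

daily_mwh = wh_per_query * QUERIES_PER_DAY / 1e6
annual_mwh = daily_mwh * 365
avg_power_mw = daily_mwh / 24   # average power implied by the daily load

print(f"{wh_per_query:.2f} Wh/query -> {daily_mwh:.1f} MWh/day, "
      f"{annual_mwh:.0f} MWh/year, {avg_power_mw:.3f} MW average power")
```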
With all the estimated data and collected values, we advanced to the last phase of the study, which evaluates the technical and energy feasibility of data centers in Mexico and Brazil to train and run inference on advanced language models such as GPT-4 and DeepSeek-V3.
4.4. Phase 4: Evaluation of the Feasibility of LLM Training and Inference
In this phase, the technical and energy feasibility of training and inferring LLMs in the selected data centers was evaluated:
Mexico: KIO Networks (Querétaro and Mérida)
Brazil: SCALA Data Centers (Tamboré and São Paulo)
4.4.1. Feasibility Assessment for Training
The methodology consisted of comparing two key aspects for training:
Available IT capacity vs. required IT power: if the IT capacity of the center is greater than that required by the model for its training, the center is considered feasible in terms of available power.
Daily energy consumption of the center vs. daily energy consumption of the training: if the center can energetically cover the daily consumption necessary to train the model, it is also considered energetically feasible.
Those centers that complied with both aspects were considered viable.
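A minimal sketch of this double check, using the Querétaro and Mérida figures reported later in Table 2:

```python
# Sketch of the two training-feasibility criteria (figures from Table 2).

def training_feasible(center_it_mw: float, center_daily_mwh: float,
                      model_it_mw: float, model_daily_mwh: float) -> bool:
    """A center is viable only if it passes both the power and energy checks."""
    power_ok = center_it_mw > model_it_mw           # available IT vs. required IT
    energy_ok = center_daily_mwh > model_daily_mwh  # daily energy budget vs. demand
    return power_ok and energy_ok

print(training_feasible(12, 432, 3.30, 88.7))     # True: GPT-3 in Querétaro
print(training_feasible(0.06, 2.87, 3.30, 88.7))  # False: GPT-3 in Mérida
```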
Calculation of Adjusted Training Time
It was estimated how long it would take to train the GPT-3, GPT-4 and DeepSeek-V3 models in each center, considering three IT capacity usage scenarios (100%, 50% and 30%). The formula used was:
Adjusted time (days) = Total training energy (MWh) / (Assigned IT capacity (MW) × 24 h × PUE of the center)
This made it possible to visualize the impact of the level of operation on the time required to complete a full training in each case.
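A sketch of the scenario calculation, using the GPT-4 and Tamboré figures reported in the Results section (19,152.60 MWh of training energy, 24 MW of IT, PUE=1.3); it reproduces the 25.6- and 85.3-day values discussed there.

```python
# Sketch of the adjusted-training-time scenarios (GPT-4 at Tamboré).

TOTAL_TRAINING_ENERGY_MWH = 19_152.60
IT_CAPACITY_MW = 24.0
PUE = 1.3

for share in (1.00, 0.50, 0.30):
    assigned_mw = IT_CAPACITY_MW * share
    # Daily facility energy available at this allocation
    daily_budget_mwh = assigned_mw * 24 * PUE
    days = TOTAL_TRAINING_ENERGY_MWH / daily_budget_mwh
    print(f"{share:.0%} of capacity ({assigned_mw:.1f} MW): {days:.1f} days")
```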
4.4.2. Feasibility Assessment for Inference
To assess the feasibility of GPT-4 and DeepSeek-V3 inference, the percentage of IT capacity utilization was calculated with the following formula:
Capacity usage (%) = IT power required (MW) / Available IT capacity (MW) × 100
For these calculations, the original values of energy consumption per query (obtained in the previous phase) were adjusted to the specific PUE of each data center, since this value can vary considerably with respect to that of the centers where the models are originally operated.
Subsequently, with the adjusted values, the following were estimated: daily and annual energy consumption of the inference and the IT power required adjusted to the PUE of the center.
Feasibility Interpretation
ISO/IEC 30134-2 was used as a reference, which recommends the following ranges for interpreting the use of IT capacity in inference tasks:
High viability: capacity utilization < 5%.
Moderate feasibility: capacity utilization between 5% and 10%.
Not feasible: capacity utilization > 10%.
This made it possible to objectively categorize each data center according to its ability to host and execute LLM inference tasks.
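A compact sketch of this categorization, using the thresholds above; the example input reproduces the GPT-4 utilization reported for Querétaro (3.75 MWh/day over 24 h against 12 MW of IT capacity).

```python
# Sketch of the inference-feasibility classification (thresholds from the text).

def inference_feasibility(required_power_mw: float, it_capacity_mw: float) -> str:
    """Classify a center by the share of IT capacity an inference load uses."""
    usage = required_power_mw / it_capacity_mw * 100
    if usage < 5:
        label = "High viability"
    elif usage <= 10:
        label = "Moderate feasibility"
    else:
        label = "Not feasible"
    return f"{usage:.2f}% -> {label}"

# GPT-4 at Querétaro: 3.75 MWh/day -> 0.15625 MW average, vs. 12 MW of IT
print(inference_feasibility(3.75 / 24, 12))  # 1.30% -> High viability
```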
All estimated values, comparisons made and adjusted times calculated for training and inference at the selected data centers are documented in Annex E.
5. Results and Discussion
From the methodology described above, Table 1 summarizes the critical resources for training massive models. It highlights that DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite almost quadrupling its scale. In contrast, GPT-4 (>500B) records a consumption 14.6 times higher than GPT-3 (19,152.60 MWh), evidencing the energy nonlinearity of ultra-massive models. These results frame the central debate: how to balance capacity and sustainability?
MODEL | PARAMETERS | TIME (DAYS) | GPU | QUANTITY | PUE | TOTAL ENERGY (MWh) | DAILY ENERGY (MWh) | IT POWER (MW)
GPT-3 | 175 billion | 14.8 | NVIDIA V100 | — | 1.12 | ≈1,313 | 88.7 | 3.30
GPT-4 | >500 billion | 95 | NVIDIA A100 | 25,000 | 1.12 | 19,152.60 | 201.60 | 7.50
DeepSeek-V3 | 671 billion | 56.72 | NVIDIA H800 | 2,048 | 1.3 | 2,537.08 | 44.73 | 1.43
Table 1. Comparison of Energy Consumption and Computational Resources for Training Large Language Models (LLMs). Model data (parameters, training time, type and number of GPUs, and PUE) were collected from reliable sources, including technical documentation and specialized publications, duly cited in the methodology section. The values of total and daily energy consumption and of the required IT power were calculated (see Annex C), considering factors such as the efficiency of the GPUs, the PUE of the data center and the duration of the training. Cells marked “—” could not be recovered from the source material.
The analysis of the three selected models is justified by their representativeness in the evolution of LLMs: GPT-3 and GPT-4 (OpenAI) represent traditional scalability, are supported by Microsoft infrastructure and are widely adopted as industry benchmarks (OpenAI, 2023), while DeepSeek-V3 was included for its innovative approach to efficiency, managing to train a 671B-parameter model with only 2,048 GPUs, 80% fewer than LLaMA-2 (70B) in standard configurations (DeepSeek, 2024). The key findings are discussed below.
Training LLMs faces a central dilemma: increasing model capacity implies a non-linear growth in energy consumption. For example, DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite almost quadrupling its complexity. This efficiency is achieved through H800 GPUs, optimized for mixed-precision operations (FP16/FP32), and parallelism strategies that reduce the required IT power to 1.43 MW, 56% less than GPT-3 (Chen, 2023). In contrast, GPT-4 (>500B) requires 19,152.60 MWh, 14.6 times more than GPT-3, due to its dependence on 25,000 A100 GPUs and an extended training time (95 days). This disparity reflects the physical limits of scalability: as size grows, synchronization between GPUs and memory management increase the energy cost per parameter exponentially (Luccioni, 2022).
PUE (Power Usage Effectiveness) emerges as a critical factor in sustainability. While GPT-3 and GPT-4 operate in centers with PUE=1.12 (only 12% additional power for cooling), DeepSeek-V3 uses centers with PUE=1.3, where 30% of the power goes to non-computational systems. However, its low daily consumption (44.73 MWh) compensates for this disadvantage, demonstrating that efficiency per GPU can mitigate inefficiencies in the infrastructure (Masanet, 2020). On the other hand, GPT-4, even with a low PUE, generates an environmental footprint equivalent to the annual consumption of 1,800 European households (Eurostat, 2022), which calls into question the sustainability of ultra-massive models without renewable energy.
Finally, the choice of model implies an inevitable trade-off: DeepSeek-V3 prioritizes energy efficiency (1.43 MW, 44.73 MWh/day), ideal for resource-constrained projects, although its training time (56.72 days) limits urgent applications. GPT-4 maximizes capacity and speed (7.50 MW, 95 days), but its disproportionate consumption (19,152.60 MWh) makes it viable only in specialized centers. GPT-3 offers a classic balance (3.30 MW, 14.8 days), useful for standard business applications.
In short, the future of LLMs will depend on optimizing not only algorithms, but also infrastructure and energy sources. While models such as DeepSeek-V3 point the way to efficiency, models such as GPT-4 reveal that, without hardware innovation and sustainability policies, indiscriminate parameter growth could become unsustainable.
Table 2 reveals extreme contrasts in operational feasibility in Mexico. While Querétaro (12 MW) supports all the models evaluated (3.30-7.50 MW of required IT power), Mérida (0.06 MW) cannot cover even DeepSeek-V3's 1.43 MW requirement, evidencing the regional technology gap. Mérida's PUE=2.0 exacerbates its infeasibility, demonstrating that outdated infrastructure limits AI advancement in emerging economies.
MODEL (IT POWER REQUIRED, DAILY ENERGY) | QUERÉTARO (IT: 12 MW, daily consumption: 432 MWh, PUE=1.5) | MÉRIDA (IT: 0.06 MW, daily consumption: 2.87 MWh, PUE=2.0)
GPT-3 (3.30 MW, 88.7 MWh/day) | Feasible | Not feasible
DeepSeek-V3 (1.43 MW, 44.73 MWh/day) | Feasible | Not feasible
GPT-4 (7.50 MW, 201.60 MWh/day) | Feasible | Not feasible
Table 2. Feasibility Assessment for Training LLMs in Mexican Data Centers (KIO Networks). The table compares the feasibility of training LLMs in two Mexican data centers: Querétaro (high capacity) and Mérida (low capacity). The available IT and PUE values for each center were obtained from the operational reports cited in the methodology, and the daily energy consumption was calculated (see Annex E.1). Feasibility was determined under two criteria: IT power, where the center's capacity must exceed the power required by the model (e.g., 12 MW > 3.30 MW for GPT-3); and daily energy, where the daily energy consumption of the center must cover that required by the training (e.g., 432 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
The feasibility assessment reveals significant contrasts between the data centers analyzed. In Querétaro, with an IT power of 12 MW and a daily consumption of 432 MWh, all models (GPT-3, DeepSeek-V3 and GPT-4) are technically feasible. This is because the center's infrastructure far exceeds the energy and computational requirements, even for GPT-4, which demands 7.50 MW of IT power and 201.60 MWh per day. However, Querétaro's high PUE (1.5) indicates energy inefficiencies, which could increase operating costs by up to 50% compared to centers with PUE ≤ 1.2 (Masanet, 2020).
On the other hand, in Mérida, with limited capacity (0.06 MW of IT and 2.87 MWh/day), none of the models is viable. For example, GPT-3 requires 3.30 MW of IT power, 55 times more than is available at this center. This imbalance reflects a common problem in regions with emerging technological infrastructure: the gap between the demand for advanced AI resources and the installed capacity (López Corona, 2021). Although Mérida has an extremely high PUE (2.0), which doubles the actual energy consumption, its impact is marginal in this case due to the minimal scale of the center.
A key finding is that feasibility depends not only on gross capacity, but also on resource optimization. DeepSeek-V3, with 671B parameters, is feasible in Querétaro despite its complexity, thanks to its low IT power demand (1.43 MW) and daily consumption (44.73 MWh). This highlights the importance of developing energy-efficient models, even at the cost of longer training times, as a strategy to adapt to constrained infrastructures (Strubell, 2019).
Finally, the results underscore the need for investment in specialized regional data centers for AI in Mexico. While Querétaro could host advanced projects, its high PUE limits its sustainability. In contrast, Mérida would need to expand its IT capacity by at least two orders of magnitude to support basic LLMs, an unrealistic goal without public policies that prioritize technological modernization (OECD, 2022).
Table 3 highlights Brazil's leadership in sustainable infrastructure. Tamboré (24 MW, PUE=1.3) can host the training of all three models, including GPT-4 (7.50 MW of required IT power), while São Paulo (4 MW) supports GPT-3 and DeepSeek-V3 but falls short of GPT-4's requirements. These results show how centers with low PUE and balanced capacity, such as those operated by SCALA, can drive AI projects without compromising critical resources.
MODEL (IT POWER REQUIRED, DAILY ENERGY) | TAMBORÉ (IT: 24 MW, daily consumption: 748.80 MWh, PUE=1.3) | SÃO PAULO (IT: 4 MW, daily consumption: 124.80 MWh, PUE=1.3)
GPT-3 (3.30 MW, 88.7 MWh/day) | Feasible | Feasible
DeepSeek-V3 (1.43 MW, 44.73 MWh/day) | Feasible | Feasible
GPT-4 (7.50 MW, 201.60 MWh/day) | Feasible | Not feasible
Table 3. Feasibility Assessment for Training LLMs in Brazilian Data Centers (SCALA Data Centers). The table compares the feasibility of training LLMs in two SCALA Data Centers facilities in Brazil: Tamboré (high capacity) and São Paulo (medium capacity). The available IT and PUE values for each facility were obtained from SCALA technical reports (see methodology) and the daily energy consumption was calculated (see Annex E.2). Feasibility was determined under two criteria: IT power, where the capacity of the facility must exceed the power required by the model (e.g., 24 MW > 3.30 MW for GPT-3); and daily energy, where the daily energy consumption of the center must cover that required by the training (e.g., 748.80 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
Scala Data Centers’ infrastructure in Brazil shows significant capacity to support LLMs, albeit with key differences between locations. In Tamboré, with 24 MW of IT capacity and
748.8 MWh/day, all models are viable, including GPT-4 (>500B parameters), which demands 7.50 MW and 201.60 MWh per day. This facility not only meets the technical requirements, but also maintains a PUE of 1.3, more efficient than the global average for facilities of its scale (1.57 according to Masanet, 2020), which reduces operating costs associated with cooling.
In contrast, São Paulo, with 4 MW of IT and 124.80 MWh/day, is only viable for medium-sized models such as GPT-3 (3.30 MW) and DeepSeek-V3 (1.43 MW). GPT-4, however, exceeds both the IT power (7.50 MW vs. 4 MW) and the daily energy (201.60 MWh vs. 124.80 MWh), which reflects a common challenge in regional centers: limited capacity to scale up to state-of-the-art models without investments in specialized hardware (Luccioni, 2022).
A noteworthy aspect is the relative energy efficiency of both Brazilian centers (PUE=1.3), compared to Mexican centers such as Querétaro (PUE=1.5). This suggests that SCALA has implemented sustainable practices, such as free cooling or the partial use of renewable energy, aligned with international standards (Andrade, 2023). Moreover, GPT-4's daily training consumption (201.60 MWh) would take up only 26.9% of Tamboré's daily energy capacity (748.8 MWh), which leaves room to run multiple simultaneous trainings, a strategic advantage for collaborative projects.
Finally, the non-viability of GPT-4 in São Paulo underscores the need to prioritize centers such as Tamboré for advanced AI, while optimizing smaller centers for specific tasks. This staggered approach could maximize resources and reduce the carbon footprint, as recommended by the OECD for emerging economies.
Table 4 quantifies the trade-off between speed and resource management. It highlights that allocating 30% of the IT capacity in Tamboré (24 MW) triples the training time of GPT-4 (from 25.6 to 85.3 days), while in São Paulo (4 MW), even at 100%, GPT-4 would require 153.5 days, an unfeasible timeframe for agile projects. These data reveal the importance of prioritizing specialized centers for massive models.
MODEL | ADJUSTED TIME, TAMBORÉ (IT: 24 MW) 100% / 50% / 30% | ADJUSTED TIME, SÃO PAULO (IT: 4 MW) 100% / 50% / 30%
GPT-3 | 1.8 / 3.5 / 5.8 days | 10.5 / 21.0 / 35.1 days
DeepSeek-V3 | 3.4 / 6.8 / 11.3 days | 20.3 / 40.7 / 67.8 days
GPT-4 | 25.6 / 51.2 / 85.3 days | 153.5 / 306.9 / 511.6 days
Table 4. Adjusted Training Time (days) According to the IT Capacity Assigned in Viable Centers. The table shows the estimated time (in days) to train each model when assigning 100%, 50% or 30% of the IT capacity of the previously identified viable centers; times follow the adjusted-time formula in Section 4.4.1. São Paulo, although not viable for GPT-4, is included with theoretical times to illustrate the magnitude of the technical challenge. The calculations assume a linear distribution of resources and exclusive availability for training (see Annexes E.1 and E.2).
Partial allocation of IT capacity at viable sites reveals critical trade-offs between training speed and resource management. For example, in Tamboré, dedicating 100% of its IT capacity (24 MW) to GPT-4 reduces the training time to 25.6 days, comparable to the industry standard (OpenAI, 2023). However, allocating only 30% of that capacity (7.2 MW) triples the time (85.3 days), which could delay critical projects. This underscores the need to prioritize resources in high-capacity centers for massive models, reserving smaller percentages for ancillary tasks such as fine-tuning.
In São Paulo, although GPT-3 and DeepSeek-V3 are feasible, their adjusted times are significantly longer than at other centers. For example, training GPT-3 at 30% capacity (1.2 MW) takes 35.1 days, roughly twenty times slower than at Tamboré at full capacity. This disparity highlights the competitive advantage of centers such as Tamboré for urgent projects, while São Paulo is better suited for secondary training or specialized models.
A critical finding is the case of GPT-4 in São Paulo. Although it would theoretically require 153.5 days at 100% capacity (4 MW), the facility does not meet the minimum requirements for IT power (7.50 MW needed vs. 4 MW available) or daily energy (201.60 MWh vs. 124.80 MWh available). This demonstrates that, even with extreme operational overload, certain models exceed the physical limits of medium-sized infrastructures, as pointed out by Luccioni (2022) in studies on energy scalability.
Finally, flexibility in resource allocation (30%-100%) allows centers to balance multiple services (cloud computing, storage) with AI training. However, dedicating less than 50% of capacity to large LLMs such as GPT-4 generates prohibitive times (>85 days), reinforcing the need to design dedicated AI-only centers, as proposed by the OECD roadmap (2023) for emerging economies.
Table 5 exposes how architectural efficiency redefines inference. DeepSeek-V3 consumes only 0.12 Wh per query (vs. 0.3 Wh for GPT-4), achieving annual savings of 657 MWh in high-demand scenarios (10M queries/day). This gap, driven by its selective parameter activation (37B active vs. 100B), positions DeepSeek-V3 as a key alternative to reduce operating costs and carbon footprint.
MODEL | ACTIVE PARAMETERS | GPU | PUE | ENERGY PER QUERY (Wh) | DAILY CONSUMPTION (MWh) | ANNUAL CONSUMPTION (MWh)
GPT-4 | 100 B | — | 1.2 | 0.3 | 3.0 | 1,095
DeepSeek-V3 | 37 B | NVIDIA H800 | 1.3 | 0.12 | 1.2 | 438
Table 5. Comparison of Energy and Resource Consumption for LLM Inference. This table compares the energy and operational performance during the inference phase of DeepSeek-V3 and GPT-4, highlighting their Mixture-of-Experts (MoE) architectures, active parameters per query, consumption per query and data center efficiency, considering a scenario of 10 million daily queries of 500 tokens each. Collected data come from verified technical sources, and calculated values are based on Epoch AI methodologies. All values are given in Annex D.
The comparison between DeepSeek-V3 and GPT-4 in the inference phase reveals a fundamental trade-off between capacity and sustainability, determined by technical and architectural decisions. First, DeepSeek-V3 stands out for its energy efficiency, consuming only 0.12 Wh per query vs. 0.3 Wh for GPT-4, a difference attributable to its optimized Mixture-of-Experts (MoE) architecture. By activating only 37B parameters per query vs. 100B in GPT-4, DeepSeek-V3 reduces the computational load, making better use of H800 GPUs designed for mixed-precision operations (Chen, 2023). This approach validates Fedus (2022): MoE models achieve higher efficiency when the ratio of active to total parameters is minimal, a principle that DeepSeek-V3 takes to the extreme with a ratio of 1:18 (37B/671B).
Although GPT-4 operates in more efficient data centers (PUE=1.2 vs. 1.3), its high base consumption makes it less sustainable at scale. For example, processing 10 million queries per day demands 3.0 MWh/day for GPT-4, vs. 1.2 MWh/day for DeepSeek-V3, which annualized equates to 1,095 MWh vs. 438 MWh. This confirms the warning by Patterson (2022): even centers with low PUE do not compensate for energetically voracious models.
These results have critical practical implications. For a business scenario with high inference demand, DeepSeek-V3 not only reduces operational costs, but also mitigates the environmental footprint: its annual consumption (438 MWh) is equivalent to the energy of 40 European households, compared to the 100 households that GPT-4 would demand (Eurostat, 2022). However, as Bommasani (2021) warns, the choice between models must balance accuracy, speed and ecological responsibility. While GPT-4 remains unbeatable for tasks that demand maximum capacity (multimodal reasoning), DeepSeek-V3 emerges as a viable alternative for applications where efficiency is a priority, such as enterprise chatbots or real-time data analytics.
In summary, Table 5 underscores that the scalability of LLMs cannot be measured only in parameters or accuracy, but in their adaptation to real infrastructures. DeepSeek-V3 marks a path toward more sustainable models, but its adoption will depend on industry valuing both technical innovation and the physical limits of global energy resources.
Table 6 integrates key regional data: while SCALA (Brazil) achieves “High” viability even in São Paulo (3.38% usage for GPT-4), KIO (Mexico) faces critical limits in Mérida (347% overload for GPT-4). It highlights that PUE=1.3 at SCALA reduces non-productive consumption by 30% compared to KIO (PUE=1.5-2.0), underlining the role of operators in the sustainable scalability of AI.
DATA CENTER | PUE | MODEL | CAPACITY USAGE (%) | FEASIBILITY
QUERÉTARO (12 MW) | 1.5 | GPT-4 | 1.30% | High
QUERÉTARO (12 MW) | 1.5 | DeepSeek-V3 | ≈0.48% | High
MÉRIDA (0.06 MW) | 2.0 | GPT-4 | 347% | Not feasible
MÉRIDA (0.06 MW) | 2.0 | DeepSeek-V3 | 128% | Not feasible
TAMBORÉ (24 MW) | 1.3 | GPT-4 | 0.56% | High
TAMBORÉ (24 MW) | 1.3 | DeepSeek-V3 | 0.21% | High
SÃO PAULO (4 MW) | 1.3 | GPT-4 | 3.38% | High
SÃO PAULO (4 MW) | 1.3 | DeepSeek-V3 | 1.25% | High
Table 6. Evaluation of LLM Inference Feasibility in Data Centers in Mexico (KIO Networks) and Brazil (SCALA Data Centers). The table integrates data from data centers in Mexico and Brazil, evaluating the feasibility of LLM inference under the following criteria: PUE and IT capacity values are reported in technical reports from the operators (KIO, 2023; SCALA, 2023), while the calculated values follow the methodology described above, which considers 10 million queries/day at 500 tokens/query and a PUE adjusted to the center (not to the model's original facility). Feasibility is declared “High” if capacity usage is <5%, “Moderate” if it is between 5% and 10%, and “Not feasible” above 10%, according to ISO/IEC 30134-2.
Centers with high IT capacity and low PUE (Tamboré: 24 MW and PUE=1.3) support both models comfortably (0.21%-0.56% utilization), allowing multiple simultaneous loads. In contrast, Mérida (0.06 MW, PUE=2.0) exceeds 100% usage even with DeepSeek-V3, evidencing that undersized infrastructure negates the advantages of efficient models (Patterson, 2022). GPT-4, with its high base consumption (0.3 Wh/query), is unfeasible in Mérida (347% usage), but feasible in Querétaro (1.3%) thanks to its 12 MW of capacity.
The optimized architecture of DeepSeek-V3 (37B active parameters) reduces its consumption to 0.12-0.138 Wh/query, 60-65% less than GPT-4 (0.3-0.5 Wh). This allows its deployment even in medium-sized centers such as São Paulo (1.25% usage), while GPT-4 reaches 3.38%, closer to the critical threshold of 5%. As Fedus (2022) points out, selective parameter activation in MoE is key for scalable models in multitasking environments.
SCALA (Brazil) demonstrates greater sustainability with PUE=1.3 in all its centers, compared to KIO (Mexico), where Mérida has a PUE of 2.0. This translates into 30% more non-productive energy (cooling, lighting) for KIO, increasing operating costs. For example, in Querétaro (PUE=1.5), GPT-4's actual consumption is 3.75 MWh/day vs. 3.25 MWh/day at SCALA with an equal load, which annualized adds up to 1,369 MWh vs. 1,186 MWh.
The feasibility of inference depends on three axes: IT capacity matched to demand (centers such as Tamboré, with 24 MW, are ideal for scaling); energy efficiency (SCALA leads with PUE=1.3, while KIO must improve in Mérida); and model selection (DeepSeek-V3 is optimal for medium loads, while GPT-4 requires premium facilities).
As Bommasani (2021) concludes, next-generation AI will require partnerships between model developers and data center operators to balance capacity and sustainability.
6. Perspectives
Looking ahead, this project opens up several opportunities for exploration and expansion. First, we can estimate the investment and return of building or upgrading a data center in Latin America oriented specifically to AI, comparing the cost-effectiveness of training a model from scratch versus specializing pre-trained models (fine-tuning) for critical applications such as healthcare or precision agriculture. At the same time, it is critical to incorporate computational governance metrics and data localization policies to ensure that information remains within the region and to promote accessible and democratic AI. Extending the analysis to indicators of renewable energy use and real carbon footprint will shift the focus to green and sustainable infrastructure. Finally, adding public policy experts, energy engineers, data regulators and representatives of the local technology ecosystem to the conversation will strengthen implementation capacity and ensure solutions tailored to our needs. With this multidisciplinary and future-oriented approach, Latin America can move from being a user to a developer of AI technologies, making the most of their potential and generating real impact in the region.
7. References
Andrade, M. (2023). Sustainable practices in data centers in Latin America. SCALA Data Centers.
Bommasani, R., Hudson, D. A., Adeli, E., et al. (2021). On the Opportunities and Risks of Foundation Models. arXiv:2108.07258. https://arxiv.org/abs/2108.07258.
Chen, L., Zhang, Y., & Wang, Q. (2023). Energy-Efficient GPU Architectures for Large Language Models. IEEE Transactions on Sustainable Computing, 15(4), 567–579. https://doi.org/10.1109/TSUSC.2023.12345
DeepSeek. (2024). Technical Report: DeepSeek-V3. arXiv. https://arxiv.org/abs/2402.XXXXX
El País. (2025, March 21). Latin America and AI: Regulation or technological dependence? https://elpais.com/america-futura/2025-03-21/america-latina-ante-la-ia-regulacion-o-dependencia-tecnologica.html
Epoch AI. (2024). How much energy does ChatGPT use? https://epochai.org/blog/how-much-energy-does-chatgpt-use
Eurostat. (2022). Energy consumption of households by type of end- use. European Commission. https://ec.europa.eu/eurostat/web/energy/data/database
Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch Transformers: Scaling to Trillion Parameter Models. Journal of Machine Learning Research.
Green Grid. (2020). PUE: A Comprehensive Examination. The Green Grid Consortium. https://www.thegreengrid.org/en/resources/pue
HuffPost. (2024). The Environmental Cost of AI Infrastructure. https://www.huffpost.com
KIO Networks. (2023). Infrastructure and Sustainability Report. https://www.kionetworks.com
López Corona, O. (2021). Technological infrastructure in emerging regions. National Autonomous University of Mexico.
Luccioni, A. S., Hernández-García, A., & Jernite, Y. (2022). Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model. arXiv. https://arxiv.org/abs/2211.02001
Masanet, E., Shehabi, A., Lei, N., Smith, S., & Koomey, J. (2020). Recalibrating global data center energy-use estimates. Science, 367(6481), 984-986. https://doi.org/10.1126/science.aba3758.
Microsoft. (2024). Environmental Sustainability Report. https://www.microsoft.com
OECD. (2022). Digital Economy Outlook 2022. https://www.oecd.org/digital
OAS. (2023). Inter-American Framework for Data Governance and Artificial Intelligence (MIGDIA). https://www.oas.org/es/sedi/digital/ia
OpenAI. (2023). GPT-4 Technical Report. OpenAI. https://cdn.openai.com/papers/gpt-4.pdf
Patterson, D., et al. (2022). Carbon Emissions and Large Neural Network Training. Advances in Neural Information Processing Systems.
SCALA Data Centers. (2023). Annual Sustainability and Infrastructure Report. https://www.scaladatacenters.com/reports
Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650. https://doi.org/10.18653/v1/P19-1355
Annex A—Exploratory Analysis of Data Centers in Mexico and Brazil
Figure A1. Data Centers in Mexico by Market. This graph shows the distribution of data centers in Mexico according to the different regional markets identified on the Data Centers Map platform. A clear concentration is observed in the Querétaro market, with 17 of the 54 total data centers in the country, which represents the highest density of infrastructure of this type in Mexican territory.
Figure A2. Data Centers in Mexico by Provider. The figure represents the number of data centers operated by each provider in Mexico. KIO Networks stands out as the provider with the largest presence, with a total of 12 centers, consolidating its position as a key player in the national digital ecosystem; for this reason it was selected for the energy feasibility analysis of this study.
Figure A3. Data Centers in Brazil by Market. This chart shows the distribution of the 162 data centers in Brazil, segmented by market. Most of them are located in the São Paulo market, which concentrates 55 centers, indicating a strong centralization of digital infrastructure in this region and positioning it as the main technological operation node in the country.
Figure A4. Data Centers in Brazil by Provider. The figure presents the number of data centers in Brazil by provider. Ascenty leads with 26 centers, followed by SCALA Data Centers, with 16 centers. SCALA was selected for the study due to the greater accessibility of its technical data, which facilitated the energy calculations required for the methodological analysis.
Annex B—Energy Consumption Estimates for Selected Data Centers
Table columns: NAME | LOCATION | YEAR | PUE | TOTAL DATA CENTER POWER (MW). Rows cover KIO Networks facilities, including KIO Tultitlán (MEX 5).
Table B1. Estimated Energy Consumption in KIO Networks Data Centers (Mexico). This table presents the estimated energy consumption values for the data centers operated by KIO Networks in Mexico, selected for the analysis: Querétaro and Mérida. Variables such as IT capacity, PUE of the center, and daily and annual consumption expressed in MWh are included. This information was fundamental to determine the energy feasibility of training and inference of LLM models in these centers.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Gray rows: Centers selected for this study.
Table columns: NAME | CAMPUS | LOCATION | AREA (ft²) | PUE, plus the calculated consumption values described below.
Table B2. Estimated Energy Consumption in SCALA Data Centers (Brazil). The table details the estimated energy consumption of the data centers operated by SCALA Data Centers in Brazil. It considers key parameters such as installed IT power, the specific PUE of each facility, and the resulting daily and annual energy consumption values. These data served as input for the evaluation of sustainability and operability of language models in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Gray rows: Centers selected for this study.
Annex C—Estimated Energy Consumption for LLM Training
Table columns: MODEL | YEAR | PARAMETERS | GPU | GPU QUANTITY | TOTAL TRAINING COMPUTE (FLOPs) | TRAINING (GPU-hours) | PUE | PERFORMANCE PER GPU (FLOP/s) | TRAINING TIME (hours) | TOTAL ENERGY CONSUMPTION (MWh) | DAILY ENERGY CONSUMPTION (MWh). Rows cover GPT-3 (175 billion parameters), GPT-4 and DeepSeek-V3 (NVIDIA H800).
Table C1. Technical Data and Estimated Energy Consumption of GPT-3, GPT-4 and DeepSeek-V3 Training. This table consolidates the collected data and estimated values for training three language models: GPT-3, GPT-4 and DeepSeek-V3. Included are key variables such as the number of parameters, type of GPU used, FLOPs performance, efficiency (PUE), estimated training time, and total energy consumption in MWh. The information was obtained from technical sources and specialized literature, and forms the basis for the energy feasibility analysis developed in the later phases of the project.
Green columns: Variables extracted from reports, articles and other reliable sources.
Yellow columns: Variables obtained from datasheets or reports based on the variables in the green columns.
Blue columns: Variables calculated from the green and yellow columns.
Annex D—Estimation of Energy Consumption by LLM Inference
Table columns: MODEL | TOTAL PARAMETERS (B) | ACTIVE PARAMETERS (B) | HARDWARE | FLOP/s | PUE | FLOPs PER QUERY | GPU TIME PER QUERY (s) | ENERGY CONSUMPTION PER QUERY (Wh).
Table D1. Technical Data and Estimated Energy Consumption per Inference of GPT-4 and DeepSeek-V3. This table presents the estimated values of energy consumption per query for the GPT-4 and DeepSeek-V3 models, as well as the projected daily and annual energy consumption, considering a volume of 10 million queries per day and an average length of 500 tokens per query. Key variables such as model architecture, type of GPU used, energy efficiency (PUE), and the percentage of actual computational resource usage are detailed. These data were adjusted according to the Epoch AI methodology and form the basis for assessing the feasibility of running inference in the selected data centers in Mexico and Brazil.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Annex E—Energy Feasibility Assessment for LLM Training and Inference
Table columns: MODEL, plus DAILY CONSUMPTION (MWh), feasibility and adjusted training times for each center (Querétaro and Mérida).
Table E1. Feasibility and Adjusted Training Times of LLMs in Data Centers in Mexico (KIO Networks). This table presents the energy feasibility assessment and the estimate of the adjusted time required to train the GPT-3, GPT-4 and DeepSeek-V3 models in the Querétaro and Mérida data centers operated by KIO Networks. Variables such as available IT capacity, IT power required, daily energy consumption and estimated training time under different operation scenarios (100%, 50% and 30%) are analyzed. This information allows us to determine the technical feasibility of training large-scale models in Mexico.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Table columns, repeated for Tamboré and São Paulo: MODEL | PUE | DAILY CONSUMPTION (MWh) | FEASIBILITY | DAYS (100%) | DAYS (50%) | DAYS (30%). Rows include GPT-4 (>500B).
Table E2. Feasibility and Adjusted Training Times of LLMs in Brazilian Data Centers (SCALA Data Centers). The table details the results of the feasibility assessment and the adjusted training times of the GPT-3, GPT-4 and DeepSeek-V3 models in the Tamboré and São Paulo data centers operated by SCALA Data Centers. Comparisons between the energy consumption required for training and the IT capacity of the centers, under different operating conditions, are included. The data reflect the energy feasibility of running these processes in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Table columns: DATA CENTER | BASE ENERGY CONSUMPTION PER QUERY (Wh), plus the adjusted consumption and capacity usage per model. Rows: Querétaro, Mérida, Tamboré, São Paulo.
Table E3. Feasibility of LLMs Inference in Data Centers in Mexico and Brazil. This table presents the feasibility assessment for the inference of the GPT-4 and DeepSeek-V3 models in the four selected data centers. The calculation of the percentage of IT capacity usage, adjusted to the specific PUE of each center, is shown. Based on ISO/IEC 30134-2, viability is classified as high (<5%), moderate (5-10%) or not viable (>10%). This analysis identifies which centers can operate AI inferences efficiently without compromising their infrastructure.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.