Feasibility of training and inferring advanced large language models (LLMs) in data centers in Mexico and Brazil

By: Ing. Tatiana Sandoval

This project was conducted as part of the “Careers with Impact” program during the 14-week mentoring phase. You can find more information about the program in this post.
1. Contextualization of the Problem
The accelerated advance of Artificial Intelligence (AI), especially in large-scale language models (LLMs) such as DeepSeek, has intensified the demand for computational and energy resources. This growing technological dependence poses significant challenges for Latin America, a region that has historically depended on foreign infrastructures and developments in the digital realm (EL PAÍS, 2025). This dependence limits technological sovereignty and the ability to compete on equal terms in the global AI market.
To counter this situation, it is essential that Latin American countries strengthen the governance of computation and promote the democratization of AI. Initiatives such as the Inter-American Framework for Data Governance and Artificial Intelligence (MIGDIA) of the OAS seek to guide member states in the development of policies that promote ethical and responsible management of AI, adapted to regional realities and needs (OAS, 2023). In addition, the “Declaration of Santiago to promote ethical Artificial Intelligence in Latin America and the Caribbean” reflects the region’s commitment to establish a common voice in AI governance.
However, developing and training AI models from scratch requires substantial investments in infrastructure, such as the construction of specialized data centers. These facilities not only represent a considerable financial challenge, but also have a significant environmental impact due to their high energy consumption and associated carbon footprint. For example, projects in other regions have faced criticism for their demand on natural resources and potential contribution to climate change (HuffPost, 2024).
Given this context, a more viable strategy for Latin American countries could be the adoption and specialization of AI models already trained, such as DeepSeek, in specific areas such as health. This approach would allow reducing costs and minimizing environmental impact, while developing local AI capabilities. Moreover, by focusing on concrete applications relevant to the region, innovation and competitiveness in the global AI market could be fostered.
In summary, for Latin America to move towards greater independence and prominence in the field of AI, it is crucial to invest in sustainable infrastructure, develop appropriate governance frameworks and consider technology adoption strategies that balance costs, benefits and environmental sustainability.
2. Research Question
Is it feasible for data centers in Mexico and Brazil to operate as specialized infrastructures for artificial intelligence, meeting the energy and technical requirements needed to train models such as DeepSeek and run inference efficiently?
3. Objectives
3.1. General
Evaluate the technical and energy feasibility of operating data centers in Mexico and Brazil as specialized infrastructures for artificial intelligence. This implies analyzing whether such centers meet the necessary requirements to train AI models and execute inference in an efficient manner.
3.2. Specific
Identify and select viable data centers for AI training and inference.
Estimate the energy consumption associated with training and inference of models such as DeepSeek.
3.3. Personal
Develop energy data analysis and scenario simulation skills.
Strengthen competencies in the evaluation of digital infrastructure and sustainability.
Advance professional training within the field of sustainable and emerging technologies.
4. Methodology
The methodology applied in this study was developed in four main phases, aimed at evaluating the energy and operational feasibility of training and running inference with large-scale language models (LLMs) in data centers located in Mexico and Brazil. Initially, a detailed collection and analysis of the data center infrastructure in both countries was performed, using the Data Centers Map platform as the main source. The providers with the greatest presence and data availability (KIO Networks in Mexico and SCALA Data Centers in Brazil) were identified in order to select specific centers representative of both high and low operational capacity. From these, daily and annual energy consumption was estimated. In parallel, technical data on the training and inference of LLMs such as GPT-3, GPT-4 and DeepSeek-V3 were collected, highlighting their relevance in terms of efficiency, technological innovation and hardware variation. Subsequently, the approximate energy consumption for both training and inference of these models was estimated, considering parameters such as PUE (Power Usage Effectiveness) and the associated energy infrastructure. Finally, the feasibility of running these models in the selected centers was evaluated, adjusting the calculations to local conditions in Mexico and Brazil. All the information, calculations and decisions were documented and systematized to ensure the replicability of the study, and are presented in the following sections along with their respective sources and annexes.
4.1. Phase 1: Collection and analysis of data from Brazilian and Mexican data centers
The first stage of the study focused on the identification, collection and analysis of information on operational data centers in Mexico and Brazil, using the Data Centers Map platform as the main source. The information reflects the data available at the date of consultation: a total of 54 centers in 13 markets in Mexico, and 162 centers distributed across 30 markets in Brazil. Data such as the name of the center, provider and market were collected and systematized in a spreadsheet to facilitate subsequent analysis. From this, graphs were prepared to visualize the concentration of centers by provider and market (see Annex A). Based on these analyses, a representative provider was selected for each country: KIO Networks in Mexico and SCALA Data Centers in Brazil, due to their strong presence and the availability of key data for the study. Subsequently, the daily and annual energy consumption of each provider's centers was estimated (see Annex B), which allowed two centers per country to be selected for detailed analysis: one with higher capacity and one with lower capacity. For Mexico, the Querétaro (12 MW of IT) and Mérida (0.06 MW of IT) centers were selected; for Brazil, the Tamboré (24 MW of IT) and São Paulo (4 MW of IT) centers. This selection was crucial to evaluate, in later phases, the feasibility of training and running inference with LLMs in these environments. The sources from which data were extracted and the calculations performed are detailed below.
4.1.1. Sources used
4.1.2. Calculations performed
The estimation of the energy consumption of the data centers of KIO Networks (Mexico) and SCALA Data Centers (Brazil) was based on the application of internationally recognized equations in the field of data center energy efficiency.
Base Equations
Power Usage Effectiveness (PUE): This indicator, developed by The Green Grid and formalized as ISO/IEC 30134-2:2016, measures the energy efficiency of a data center by the ratio between the total energy consumed and the energy used by IT equipment. A PUE close to 1 indicates higher energy efficiency.
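Formally, the indicator is defined as:

$$\mathrm{PUE} = \frac{E_{\text{total facility}}}{E_{\text{IT equipment}}}$$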
Energy Calculation: The energy consumed was estimated using the formula:
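The standard form of this relation, consistent with the description below, is:

$$E \;[\mathrm{kWh}] = P \;[\mathrm{kW}] \times t \;[\mathrm{h}]$$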
This formula allows the calculation of energy consumption as a function of operating power and operating time.
IT Capacity Estimation
For KIO Networks, the IT Capacity was estimated from the following formula, considering the design power per square meter and the total area of the computer room:
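A reconstruction consistent with this description, where $d$ is the design power density and $A$ the total computer-room area, is:

$$P_{\text{IT}} \;[\mathrm{kW}] = d \;[\mathrm{kW/m^2}] \times A \;[\mathrm{m^2}]$$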
This estimate is essential to determine the power required by the IT equipment based on the available space.
Total Power Estimation
The total power of the data center includes the consumption of IT equipment, plus cooling, lighting and support systems:
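Using the PUE definition above:

$$P_{\text{total}} = P_{\text{IT}} \times \mathrm{PUE}$$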
Energy Consumption Calculation
From the estimated total power, the daily and annual energy consumption was determined using:
Daily energy consumption
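In standard form:

$$E_{\text{daily}} \;[\mathrm{MWh}] = P_{\text{total}} \;[\mathrm{MW}] \times 24\,\mathrm{h}$$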
Annual energy consumption
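Likewise:

$$E_{\text{annual}} = E_{\text{daily}} \times 365$$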
Household Equivalent
In order to measure the impact of data center energy consumption, its equivalence was calculated in terms of the average annual consumption of a household in Mexico and Brazil:
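A reconstruction consistent with this description, where $E_{\text{household}}$ denotes the average annual consumption of a household in the corresponding country:

$$N_{\text{households}} = \frac{E_{\text{annual, center}}}{E_{\text{household}}}$$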
All values estimated and used in these calculations are detailed in Annex B.
4.2. Phase 2: Collection and Estimation of Energy Consumption for LLM Training
In this phase, the energy consumption associated with LLM training was estimated, taking as reference the cases of GPT-3, GPT-4 and DeepSeek-V3. These models were selected for their technical relevance and their contrast in computational and energy efficiency. For example, DeepSeek-V3 has been highlighted for reaching performance levels comparable to GPT-4 with a fraction of the time and hardware required, while GPT-3 and GPT-4 allow observing the evolution of the resources required within the same development framework (OpenAI). The objective of this stage was to generate a comparative basis of the energy consumption of advanced AI models, which would later serve to contrast their operational feasibility in data centers located in Mexico and Brazil. The sources used to collect the data, the calculations performed, and a glossary of key terms to facilitate understanding are detailed below.
4.2.1. Glossary of relevant terms
PUE (Power Usage Effectiveness): Measure of energy efficiency in data centers. The closer to 1, the higher the efficiency.
GPU (Graphics Processing Unit): Fundamental hardware for AI training.
TDP (Thermal Design Power): Maximum power that a GPU can consume under load.
FLOPS (Floating Point Operations per Second): Computational capacity metric.
TFLOPS: one trillion (10¹²) floating-point operations per second.
Parameters (of an LLM): Adjustable values that the model learns to make predictions.
Data Center: Facility that houses servers and network equipment to store and process data.
4.2.2. Sources used
DeepSeek-V3
GPT-3:
GPT-4:
4.2.3. Calculations performed
To estimate the energy consumption of each model, the following formulas and procedures were applied:
DeepSeek-V3
In the case of DeepSeek-V3, the direct figure provided in its Technical Report was used, so no intermediate performance estimates were required.
GPT-3 and GPT-4
Calculation of total GPU performance
We start from the theoretical performance per GPU (expressed in TFLOPS), adjusted to a realistic operating percentage: for GPT-3, 20% of the theoretical performance of the NVIDIA V100 GPU, and for GPT-4, 32% of that of the NVIDIA A100 GPU, according to the Epoch AI database.
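In symbols, with $N_{\text{GPU}}$ the number of GPUs, $R_{\text{peak}}$ the theoretical per-GPU throughput and $u$ the utilization fraction (0.20 or 0.32 above):

$$R_{\text{total}} = N_{\text{GPU}} \times R_{\text{peak}} \times u$$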
Estimation of total training time
The training time was then derived from the total number of operations required and the aggregate throughput of the GPUs.
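With $C$ the total number of floating-point operations required by the training run:

$$t_{\text{training}} = \frac{C}{R_{\text{total}}}$$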
Calculation of energy consumption
Finally, the total energy consumption is determined by the following formula:
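A reconstruction consistent with this description, where $P_{\text{GPU}}$ is the per-GPU power draw and $t$ the training time in hours:

$$E_{\text{total}} = N_{\text{GPU}} \times P_{\text{GPU}} \times t \times \mathrm{PUE}$$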
And the daily energy consumption follows:
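With $t_{\text{days}}$ the training duration in days:

$$E_{\text{daily}} = \frac{E_{\text{total}}}{t_{\text{days}}}$$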
IT Power Calculation
The IT power required for training the LLMs was also estimated for later use in the feasibility assessment phase. For this purpose, the PUE formula was applied as follows:
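Rearranging the PUE relation gives a form consistent with the Table 1 values (e.g., for GPT-3: $88.70 / (24 \times 1.12) \approx 3.30$ MW):

$$P_{\text{IT}} = \frac{E_{\text{daily}}}{24\,\mathrm{h} \times \mathrm{PUE}}$$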
The data collected and values estimated in this phase are documented in Annex C.
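To make the chain of Phase 2 formulas concrete, the following Python sketch reproduces the order of magnitude of the GPT-3 row of Table 1. The peak throughput (125 TFLOPS), per-GPU power (330 W) and total compute (3.14 × 10²³ FLOPs) are illustrative assumptions chosen for consistency with the table, not values taken from the original sources.

```python
# Minimal sketch of the Phase 2 training-energy estimate.
# peak_tflops, utilization, tdp_w and total_flops below are illustrative
# assumptions for a GPT-3-like run, not figures from the original sources.

def training_energy(total_flops, n_gpus, peak_tflops, utilization, tdp_w, pue):
    """Chain the Phase 2 formulas: effective throughput -> time -> energy -> IT power."""
    throughput = n_gpus * peak_tflops * 1e12 * utilization  # effective cluster FLOP/s
    hours = total_flops / throughput / 3600                 # training time in hours
    days = hours / 24
    total_mwh = n_gpus * tdp_w * hours * pue / 1e6          # total facility energy (MWh)
    daily_mwh = total_mwh / days                            # average daily energy (MWh)
    it_power_mw = daily_mwh / (24 * pue)                    # IT power via the PUE relation (MW)
    return days, total_mwh, daily_mwh, it_power_mw

days, total_mwh, daily_mwh, it_mw = training_energy(
    total_flops=3.14e23,   # assumed GPT-3-scale training compute
    n_gpus=10_000,         # V100 count from Table 1
    peak_tflops=125,       # assumed V100 tensor-core peak
    utilization=0.20,      # 20% realistic utilization (Phase 2)
    tdp_w=330,             # assumed per-GPU power draw
    pue=1.12,              # PUE from Table 1
)
print(f"{days:.1f} days, {total_mwh:.0f} MWh, {daily_mwh:.1f} MWh/day, {it_mw:.2f} MW IT")
# -> roughly 14.5 days, ~1290 MWh, ~88.7 MWh/day, ~3.30 MW (cf. Table 1)
```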
4.3. Phase 3: Collecting and Estimating Energy Consumption in LLMs Inference
At this stage, the energy consumption associated with each query made to the LLMs GPT-4 and DeepSeek-V3 was estimated. GPT-3 was excluded because its monolithic architecture differs from the current Mixture of Experts (MoE) models and does not provide comparative value in terms of energy efficiency in inference.
To perform these calculations, the Epoch AI report “How much energy does ChatGPT use?” was used as a basis; it estimates that a typical 500-token query to GPT-4 consumes approximately 0.3 Wh. This value was verified by replicating step by step the methodology detailed in that report, which allowed us to confirm the validity of the data and understand in depth how it was obtained, including variables such as the number of tokens, the data center efficiency (PUE) and the characteristics of the hardware used. Once the methodology and result for GPT-4 were confirmed, the same calculation approach was applied to DeepSeek-V3, making the necessary adjustments for its own characteristics, such as the type of GPU used, the estimated PUE of the data center, the active parameters within its MoE architecture, and the actual percentages of power usage and FLOPs, according to the mentioned report. To ensure comparability, the same load volume was assumed: average queries of 500 tokens and a total of 10 million daily queries, as mentioned by Mykyta Fomenko in “50+ Eye-Opening ChatGPT Statistics: Tracing the Roots of Generative AI to Its Global Dominance”.
Based on these variables, the following energy indicators were calculated for both models: daily energy consumption (MWh), annual energy consumption (MWh) and IT power required (MW); this last value will be key in the feasibility assessment phase. All the data used and calculated values are organized in Annex D. The calculations, a glossary of key terms to ease understanding, and the sources used are detailed below.
4.3.1. Glossary of relevant terms
Inference: The process by which an LLM model generates answers based on text input.
Tokens: Minimum units of text processed by the model; 500 tokens are equivalent to approximately 350-400 words.
PUE (Power Usage Effectiveness): Index that measures the energy efficiency of a data center. A value close to 1 indicates higher efficiency.
FLOP/s (Floating Point Operations per Second): Unit that measures the processing capacity of a system.
MoE (Mixture of Experts): Type of architecture that activates only one part of the model for each inference, improving efficiency.
GPU (Graphics Processing Unit): Specialized processing unit used for AI model training and inference.
4.3.2. Sources used
GPT-4:
DeepSeek-V3
4.3.3. Calculations performed
Consumption per query (GPT-4)
We replicated the methodology detailed by Epoch AI in the report “How much energy does ChatGPT use?”, obtaining an estimated value of 0.3 Wh per typical 500-token query. The full development of this calculation can be found in the sources used for GPT-4, cited in the previous section.
Consumption per query (DeepSeek-V3)
The same formula applied for GPT-4 was used, adjusting the parameters to the specific characteristics of DeepSeek-V3 (see the reconstruction after this list):
Active parameters: 37 billion
GPU used: NVIDIA H800
Throughput (FLOP/s): 9.89 × 10¹⁴
Power per GPU: 1275 W
PUE (Power Usage Effectiveness): 1.3
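A reconstruction of the per-query calculation that reproduces the Annex D values for both models is given below; the utilization factors $u_{\text{FLOPs}} \approx 0.10$ (fraction of peak throughput achieved) and $u_{P} \approx 0.70$ (fraction of GPU power actually drawn) are inferred from the annex values rather than stated explicitly in the sources:

$$C_{\text{query}} = 2 \, N_{\text{active}} \, n_{\text{tokens}}, \qquad t_{\text{GPU}} = \frac{C_{\text{query}}}{R_{\text{peak}} \times u_{\text{FLOPs}}}, \qquad E_{\text{query}} = \frac{t_{\text{GPU}} \times P_{\text{GPU}} \times u_{P} \times \mathrm{PUE}}{3600} \;[\mathrm{Wh}]$$

For DeepSeek-V3 this yields $2 \times 37{\times}10^{9} \times 500 = 3.7{\times}10^{13}$ FLOPs per query, a GPU time of about $0.37$ s, and $0.37 \times 1275 \times 0.7 \times 1.3 / 3600 \approx 0.12$ Wh, matching the table.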
The technical data and sources used for these values are detailed in the previous section. This approach allowed us to obtain an adjusted estimate of the energy consumption per query for DeepSeek-V3.
Total Energy Consumption
Based on the energy consumption per query, the total daily and annual consumption of each model was estimated, assuming a volume of 10 million daily queries, in accordance with the previously mentioned literature.
Daily Energy Consumption:
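With $Q$ the number of daily queries ($10^7$ here), and converting Wh to MWh:

$$E_{\text{daily}} \;[\mathrm{MWh}] = \frac{E_{\text{query}} \;[\mathrm{Wh}] \times Q}{10^6}$$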
Annual Energy Consumption:
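And annualized:

$$E_{\text{annual}} = E_{\text{daily}} \times 365$$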
Calculation of IT Power Required
The IT power required was estimated from the daily energy consumption, using the formula based on PUE; this value will be used in the final phase of the study to evaluate the feasibility of operating these models in the selected data centers in Mexico and Brazil:
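A reconstruction consistent with Annex D (e.g., for GPT-4: $3.01 / (24 \times 1.2) \approx 0.10$ MW):

$$P_{\text{IT}} = \frac{E_{\text{daily}}}{24\,\mathrm{h} \times \mathrm{PUE}}$$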
With all the estimated data and collected values, we advanced to the last phase of the study, which evaluates the technical and energy feasibility of data centers in Mexico and Brazil to train and run inference with advanced language models such as GPT-4 and DeepSeek-V3.
4.4. Phase 4: Evaluation of the Feasibility of LLM Training and Inference
In this phase, the technical and energy feasibility of training and inferring LLMs in the selected data centers was evaluated:
Mexico: KIO Networks (Querétaro and Mérida)
Brazil: SCALA Data Centers (Tamboré and São Paulo)
4.4.1. Feasibility Assessment for Training
The methodology consisted of comparing two key aspects for training:
IT capacity available vs. IT power required:
If the IT capacity of the center is greater than that required by the model for its training, it is considered feasible in terms of available power.
Daily energy consumption of the center vs. daily energy consumption of the training:
If the center's daily energy consumption can cover that required to train the model, it is also considered energetically feasible.
Those centers that complied with both aspects were considered viable.
Calculation of Adjusted Training Time
It was estimated how long it would take to train the GPT-3, GPT-4 and DeepSeek-V3 models in each center, considering three IT capacity usage scenarios (100%, 50% and 30%). The formula used was:
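A form that reproduces the values in Table 4, where $f$ is the fraction of IT capacity assigned and $E_{\text{daily, center}}$ the center's total daily energy consumption:

$$t_{\text{adjusted}} \;[\text{days}] = \frac{E_{\text{total, training}}}{E_{\text{daily, center}} \times f}$$

For example, GPT-3 in Querétaro at 100%: $1{,}312.86 / (432 \times 1.0) \approx 3.0$ days.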
This made it possible to visualize the impact of the level of operation on the time required to complete a full training in each case.
4.4.2. Feasibility Assessment for Inference
To assess the feasibility of GPT-4 and DeepSeek-V3 inference, the percentage of IT capacity utilization was calculated based on the following formula:
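Consistent with the percentages reported in Table 6:

$$U = \frac{P_{\text{IT, required}}}{P_{\text{IT, available}}} \times 100\%$$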
For these calculations, the original values of energy consumption per query (obtained in the previous phase) were adjusted to the specific PUE of each data center, since this value can vary considerably with respect to that of the centers where the models are originally operated.
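The adjustment that reproduces the Table 6 values rescales the baseline consumption by the ratio of PUEs; for example, GPT-4 in Mérida: $0.3 \times 2.0 / 1.2 = 0.5$ Wh:

$$E_{\text{query, adjusted}} = E_{\text{query, base}} \times \frac{\mathrm{PUE}_{\text{center}}}{\mathrm{PUE}_{\text{original}}}$$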
Subsequently, with the adjusted values, the following were estimated: daily and annual energy consumption of the inference and the IT power required adjusted to the PUE of the center.
Feasibility Interpretation
ISO/IEC 30134-2 was used as a reference, which recommends the following ranges for interpreting the use of IT capacity in inference tasks:
High feasibility: capacity utilization < 5%.
Moderate feasibility: capacity utilization between 5% and 10%.
Not feasible: capacity utilization > 10%.
This made it possible to objectively categorize each data center according to its ability to host and execute LLM inference tasks.
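A minimal sketch of this categorization logic, assuming the thresholds above; the function name is illustrative and the example values come from Table 6:

```python
def inference_feasibility(it_required_mw: float, it_available_mw: float) -> str:
    """Classify a center using the IT-capacity utilization ranges adopted in this study."""
    utilization = it_required_mw / it_available_mw * 100  # percent of IT capacity in use
    if utilization < 5:
        return "HIGH"
    if utilization <= 10:
        return "MODERATE"
    return "NOT FEASIBLE"

# GPT-4 inference at Queretaro: 0.16 MW required vs. 12 MW available (~1.3%).
print(inference_feasibility(0.16, 12))    # -> HIGH
# DeepSeek-V3 at Merida: 0.08 MW required vs. 0.06 MW available (~128%).
print(inference_feasibility(0.08, 0.06))  # -> NOT FEASIBLE
```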
All estimated values, comparisons made and adjusted times calculated for training and inference at the selected data centers are documented in Annex E.
5. Results and Discussion
From the methodology described above, Table 1 summarizes the critical resources for training massive models. It highlights that DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite quadrupling its scale. In contrast, GPT-4 (>500B) records consumption 14.6 times higher than GPT-3's (19,152.60 MWh), evidencing the energy nonlinearity of ultra-massive models. These results frame the central debate: how to balance capacity and sustainability?
| MODEL | TOTAL PARAMETERS | TIME (DAYS) | GPU QUANTITY | PUE | TOTAL ENERGY CONSUMPTION (MWh) | DAILY ENERGY CONSUMPTION (MWh) | IT REQUIRED (MW) |
|---|---|---|---|---|---|---|---|
| GPT-3 | 175 billion | 14.80 | 10,000 V100 | 1.12 | 1,312.86 | 88.70 | 3.30 |
| DeepSeek-V3 | 671 billion | 56.72 | 2,048 H800 | 1.3 | 2,537.08 | 44.73 | 1.43 |
| GPT-4 | >500 billion | 95.00 | 25,000 A100 | 1.12 | 19,152.60 | 201.60 | 7.50 |
Table 1. Comparison of Energy Consumption and Computational Resources for Training Large Language Models (LLMs). Model data in light blue columns (parameters, training time, type and number of GPUs, and PUE) were collected from reliable sources, including technical documentation and specialized publications, duly cited in the methodology section. The values in dark blue columns (total energy consumption, daily energy consumption and required IT power) were calculated (see Annex C), considering factors such as the efficiency of the GPUs, the PUE of the data center and the duration of the training.
The analysis of the three selected models is justified by their representativeness in the evolution of LLMs: GPT-3 and GPT-4 (OpenAI) represent traditional scalability, are supported by Microsoft infrastructure and are widely adopted as industrial benchmarks (OpenAI, 2023), while DeepSeek-V3 was included for its innovative approach to efficiency, managing to train a 671B-parameter model with only 2,048 GPUs, 80% fewer than LLaMA-2 (70B) in standard configurations (DeepSeek, 2024). The key findings are discussed below.
Training LLMs faces a central dilemma: increasing model capacity implies non-linear growth in energy consumption. For example, DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite quadrupling its complexity. This efficiency is achieved by H800 GPUs, optimized for mixed-precision operations (FP16/FP32), and parallelism strategies that reduce the required IT power to 1.43 MW, 56% less than GPT-3 (Chen, 2023). In contrast, GPT-4 (>500B) requires 19,152.60 MWh, 14.6 times more than GPT-3, due to its dependence on 25,000 A100 GPUs and an extended training time (95 days). This disparity reflects the physical limits of scalability: the larger the model, the more GPU synchronization and memory management exponentially increase the energy cost per parameter (Luccioni, 2022).
PUE (Power Usage Effectiveness) emerges as a critical factor in sustainability. While GPT-3 and GPT-4 operate in centers with PUE=1.12 (only 12% additional power for cooling), DeepSeek-V3 uses centers with PUE=1.3, where 30% of power goes to non-computational systems. However, its low daily consumption (44.73 MWh) compensates for this disadvantage, demonstrating that per-GPU efficiency can mitigate inefficiencies in the infrastructure (Masanet, 2020). On the other hand, GPT-4, even with a low PUE, generates an environmental footprint equivalent to the annual consumption of 1,800 European households (Eurostat, 2022), which calls into question the sustainability of ultra-massive models without renewable energy.
Finally, the choice of model implies an inevitable trade-off: DeepSeek-V3 prioritizes energy efficiency (1.43 MW, 44.73 MWh/day), ideal for resource-constrained projects, although its training time (56.72 days) limits urgent applications. GPT-4 maximizes capacity and speed (7.50 MW, 95 days), but its disproportionate consumption (19,152.60 MWh) makes it viable only in specialized centers. GPT-3 offers a classic balance (3.30 MW, 14.8 days), useful for standard business applications.
In short, the future of LLMs will depend on optimizing not only algorithms, but also infrastructure and energy sources. While models such as DeepSeek-V3 point the way to efficiency, standards such as GPT-4 reveal that, without hardware innovation and sustainability policies, indiscriminate parameter growth could become unsustainable.
Table 2 reveals extreme contrasts in operational feasibility in Mexico. While Querétaro (12 MW) supports all the models evaluated (3.30-7.50 MW), Mérida (0.06 MW) exceeds 100% usage even with DeepSeek-V3 (128%), evidencing the regional technology gap. Mérida's PUE=2.0 exacerbates its infeasibility, demonstrating that outdated infrastructure limits AI advancement in emerging economies.
Center parameters: Querétaro (IT: 12 MW, daily consumption: 432 MWh, PUE = 1.5); Mérida (IT: 0.06 MW, daily consumption: 2.87 MWh, PUE = 2.0).

| MODEL | Querétaro: IT available > IT required | Querétaro: center energy > AI energy | QUERÉTARO FEASIBILITY | Mérida: IT available > IT required | Mérida: center energy > AI energy | MÉRIDA FEASIBILITY |
|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
Table 2. Feasibility Assessment for Training LLMs in Mexican Data Centers (KIO Networks). The table compares the feasibility of training LLMs in two Mexican data centers: Querétaro (high capacity) and Mérida (low capacity). The available IT and PUE values for each center were obtained from the operational reports cited in the methodology, and the daily energy consumption was calculated (see Annex E.1). Feasibility was determined under two criteria. IT power: the center's capacity must exceed the power required by the model (e.g., 12 MW > 3.30 MW for GPT-3). Daily energy: the center's daily energy consumption must cover that required by the training (e.g., 432 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
The feasibility assessment reveals significant contrasts between the data centers analyzed. In Querétaro, with an IT power of 12 MW and a daily consumption of 432 MWh, all models (GPT-3, DeepSeek-V3 and GPT-4) are technically feasible. This is because the center's infrastructure far exceeds the energy and computational requirements, even for GPT-4, which demands 7.50 MW of IT power and 201.60 MWh per day. However, Querétaro's high PUE (1.5) indicates energy inefficiencies, which could increase operating costs by up to 50% compared to centers with PUE ≤ 1.2 (Masanet, 2020).
On the other hand, in Mérida, with limited capacity (0.06 MW of IT and 2.87 MWh/day), none of the models is viable. For example, GPT-3 requires 3.30 MW of IT power, 55 times more than is available at this center. This imbalance reflects a problem common to regions with emerging technological infrastructure: the gap between the demand for advanced AI resources and the installed capacity (López Corona, 2021). Although Mérida has an extremely high PUE (2.0), which doubles the actual energy consumption, its impact is marginal in this case due to the minimal scale of the center.
A key finding is that feasibility depends not only on gross capacity, but also on resource optimization. DeepSeek-V3, with 671B parameters, is feasible in Querétaro despite its complexity, thanks to its low IT power demand (1.43 MW) and daily consumption (44.73 MWh). This highlights the importance of developing energy-efficient models, even at the cost of longer training times, as a strategy to adapt to constrained infrastructures (Strubell, 2019).
Finally, the results underscore the need for investment in specialized regional data centers for AI in Mexico. While Querétaro could host advanced projects, its high PUE limits its sustainability. In contrast, Mérida would require expanding its IT capacity by at least two orders of magnitude to support basic LLMs, an unrealistic goal without public policies that prioritize technological modernization (OECD, 2022).
Table 3 highlights Brazil's leadership in sustainable infrastructure. Tamboré (24 MW, PUE=1.3) allows training GPT-4 with only 0.56% of its capacity, while São Paulo (4 MW) reaches 3.38% usage for the same model. These results show how centers with low PUE and balanced capacity, such as those operated by SCALA, can drive AI projects without compromising critical resources.
Center parameters: Tamboré (IT: 24 MW, daily consumption: 748.80 MWh, PUE = 1.3); São Paulo (IT: 4 MW, daily consumption: 124.80 MWh, PUE = 1.3).

| MODEL | Tamboré: IT available > IT required | Tamboré: center energy > AI energy | TAMBORÉ FEASIBILITY | São Paulo: IT available > IT required | São Paulo: center energy > AI energy | SÃO PAULO FEASIBILITY |
|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | COMPLIES | COMPLIES | YES |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | COMPLIES | COMPLIES | YES |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | DOES NOT COMPLY | DOES NOT COMPLY | NO |
Table 3. Feasibility Assessment for Training LLMs in Brazilian Data Centers (SCALA Data Centers). The table compares the feasibility of training LLMs in two SCALA Data Centers facilities in Brazil: Tamboré (high capacity) and São Paulo (medium capacity). The available IT and PUE values for each facility were obtained from SCALA technical reports (see methodology) and the daily energy consumption was calculated (see Annex E.2). Feasibility was determined under two criteria. IT power: the center's capacity must exceed the power required by the model (e.g., 24 MW > 3.30 MW for GPT-3). Daily energy: the center's daily energy consumption must cover that required by the training (e.g., 748.8 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
Scala Data Centers’ infrastructure in Brazil shows significant capacity to support LLMs, albeit with key differences between locations. In Tamboré, with 24 MW of IT capacity and
748.8 MWh/day, all models are viable, including GPT-4 (>500B parameters), which demands 7.50 MW and 201.60 MWh per day. This facility not only meets the technical requirements, but also maintains a PUE of 1.3, more efficient than the global average for facilities of its scale (1.57 according to Masanet, 2020), which reduces operating costs associated with cooling.
In contrast, São Paulo, with 4 MW of IT and 124.80 MWh/day, is only viable for medium-sized models such as GPT-3 (3.30 MW) and DeepSeek-V3 (1.43 MW). GPT-4, however, exceeds both the IT power (7.50 MW vs. 4 MW) and the daily energy (201.60 MWh vs. 124.80 MWh) available, which reflects a common challenge in regional centers: limited capacity to scale up to state-of-the-art models without investments in specialized hardware (Luccioni, 2022).
A noteworthy aspect is the relative energy efficiency of both Brazilian centers (PUE=1.3) compared to Mexican centers such as Querétaro (PUE=1.5). This suggests that SCALA has implemented sustainable practices, such as free cooling or partial use of renewable energies, aligned with international standards (Andrade, 2023). Moreover, GPT-4's daily training consumption (201.60 MWh) is equivalent to only 26.9% of Tamboré's total daily capacity (748.8 MWh), which leaves room to run multiple simultaneous trainings, a strategic advantage for collaborative projects.
Finally, the non-viability of GPT-4 in Sao Paulo underscores the need to prioritize centers such as Tamboré for advanced AI, while optimizing smaller centers for specific tasks. This staggered approach could maximize resources and reduce the carbon footprint, as recommended by the OECD for emerging economies.
Table 4 quantifies the trade-off between speed and resource management. It highlights that allocating 30% of the IT capacity in Tamboré (24 MW) more than triples the training time of GPT-4 (from 25.6 to 85.3 days), while in São Paulo (4 MW), even at 100%, GPT-4 would require 153.5 days, an unfeasible timeframe for agile projects. These data reveal the importance of prioritizing specialized centers for massive models.
Adjusted training time (days) at 100%, 50% and 30% of assigned IT capacity.

| MODEL | Querétaro (12 MW) 100% | Querétaro 50% | Querétaro 30% | Tamboré (24 MW) 100% | Tamboré 50% | Tamboré 30% | São Paulo (4 MW) 100% | São Paulo 50% | São Paulo 30% |
|---|---|---|---|---|---|---|---|---|---|
| GPT-3 | 3.0 | 6.1 | 10.1 | 1.8 | 3.5 | 5.8 | 10.5 | 21.0 | 35.1 |
| DeepSeek-V3 | 5.9 | 11.7 | 19.6 | 3.4 | 6.8 | 11.3 | 20.3 | 40.7 | 67.8 |
| GPT-4 | 44.3 | 88.7 | 147.8 | 25.6 | 51.2 | 85.3 | 153.5 | 306.9 | 511.6 |
Table 4. Adjusted Training Time (days) According to the IT Capacity Assigned in Viable Centers. The table shows the estimated time (in days) to train each model when assigning 100%, 50% or 30% of the IT capacity of the previously identified viable centers. São Paulo, although not viable for GPT-4, is included with theoretical times to illustrate the magnitude of the technical challenge. The calculations assume a linear distribution of resources and exclusive availability for training (see Annexes E.1 and E.2).
Partial allocation of IT capacity at viable sites reveals critical trade-offs between training speed and resource management. For example, in Tamboré, dedicating 100% of its IT (24 MW) to GPT-4 reduces training time to 25.6 days, comparable to the industry standard (OpenAI, 2023). However, allocating only 30% of
capacity (7.2 MW) triples the time (85.3 days), which could delay critical projects. This underscores the need to prioritize resources in high- capacity centers for massive models, reserving smaller percentages for ancillary (fine-tuning) tasks.
In São Paulo, although GPT-3 and DeepSeek-V3 are feasible, their adjusted times are significantly longer than at the other centers. For example, training GPT-3 at 30% capacity (1.2 MW) takes 35.1 days, almost 20 times slower than Tamboré at 100% (1.8 days). This disparity highlights the competitive advantage of centers such as Tamboré for urgent projects, while São Paulo is better suited to secondary training or specialized models.
A critical finding is the case of GPT-4 in Sao Paulo. Although it would theoretically require 153.5 days at 100% capacity (4 MW), the facility does not meet the minimum IT power (7.50 MW) and daily energy (201.60 MWh vs. 124.80 MWh available) requirements. This demonstrates that, even with extreme operational overload, certain models exceed the physical limits of medium-sized infrastructures, as pointed out by Luccioni (2022) in studies on energy scalability.
Finally, flexibility in resource allocation (30%-100%) allows centers to balance multiple services (cloud computing, storage) with AI training. However, dedicating less than 50% of capacity to large LLMs such as GPT-4 generates prohibitive times (>85 days), reinforcing the need to design dedicated AI-only centers, as proposed by the OECD roadmap (2023) for emerging economies.
Table 5 exposes how architectural efficiency redefines inference. DeepSeek-V3 consumes only 0.12 Wh/query (vs. 0.3 Wh for GPT-4), achieving annual savings of about 657 MWh in high-demand scenarios (10M queries/day). This gap, driven by selective parameter activation (37B vs. 100B active parameters), positions DeepSeek-V3 as a key alternative for reducing operating costs and carbon footprint.
| MODEL | ACTIVE PARAMETERS | GPU | PUE | ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | ANNUAL ENERGY CONSUMPTION (MWh) | IT REQUIRED (MW) |
|---|---|---|---|---|---|---|---|
| DeepSeek-V3 | 37 billion | H800 | 1.3 | 0.12 | 1.21 | 440.10 | 0.05 |
| GPT-4 | 100 billion | H100 | 1.2 | 0.3 | 3.01 | 1,097.95 | 0.13 |
Table 5. Comparison of Energy and Resource Consumption for LLM Inference. This table compares the energy and operational performance during the inference phase of DeepSeek-V3 and GPT-4, highlighting their Mixture-of-Experts (MoE) architectures, active parameters per query, consumption per query and data center efficiency, considering a scenario of 10 million daily queries of 500 tokens. Data in light blue are from verified technical sources; calculations in dark blue are based on Epoch AI methodologies. All values are given in Annex D.
The comparison between DeepSeek-V3 and GPT-4 in the inference phase reveals a fundamental trade-off between capacity and sustainability, determined by technical and architectural decisions. First, DeepSeek-V3 stands out for its energy efficiency, consuming only 0.12 Wh per query vs. 0.3 Wh for GPT-4, a difference attributable to its optimized Mixture-of-Experts (MoE) architecture. By activating only 37B parameters per query vs. 100B in GPT-4, DeepSeek-V3 reduces the computational load, making better use of H800 GPUs designed for mixed-precision operations (Chen, 2023). This approach validates Fedus (2022): MoE models achieve higher efficiency when the ratio of active to total parameters is minimal, a principle that DeepSeek-V3 takes to the extreme with a ratio of 1:18 (37B/671B).
Although GPT-4 operates in more efficient data centers (PUE=1.2 vs. 1.3), its high base consumption makes it less sustainable at scale. For example, processing 10 million queries per day demands 3.0 MWh/day for GPT-4, vs. 1.2 MWh/day for DeepSeek-V3, which annualized equates to roughly 1,095 MWh vs. 438 MWh. This confirms the warning by Patterson (2022): even centers with low PUE cannot compensate for energetically voracious models.
These results have critical practical implications. For a business scenario with high inference demand, DeepSeek-V3 not only reduces operational costs, but also mitigates the environmental footprint: its annual consumption (438 MWh) is equivalent to the energy of 40 European households, compared to the 100 households that GPT-4 would demand (Eurostat, 2022). However, as Bommasani (2021) warns, the choice between models must balance accuracy, speed and ecological responsibility. While GPT-4 remains unbeatable for tasks that demand maximum capacity (multimodal reasoning), DeepSeek-V3 emerges as a viable alternative for applications where efficiency is the priority, such as enterprise chatbots or real-time data analytics.
In summary, Table 5 underscores that the scalability of LLMs cannot be measured only in parameters or accuracy, but in their adaptation to real infrastructures. DeepSeek-V3 marks a path toward more sustainable models, but its adoption will depend on industry valuing both technical innovation and the physical limits of global energy resources.
Table 6 integrates key regional data: while SCALA (Brazil) achieves high viability even in São Paulo (3.38% usage for GPT-4), KIO (Mexico) faces critical limits in Mérida (347% overload for GPT-4). Notably, SCALA's PUE=1.3 reduces non-productive consumption by 30% compared to KIO (PUE=1.5-2.0), underlining the role of operators in the sustainable scalability of AI.
| DATA CENTER | IT AVAILABLE (MW) | PUE | MODEL | ADJUSTED ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | IT REQUIRED (MW) | CAPACITY UTILIZATION | FEASIBILITY |
|---|---|---|---|---|---|---|---|---|
| QUERÉTARO | 12 | 1.5 | DeepSeek-V3 | 0.138 | 1.38 | 0.06 | 0.48% | HIGH |
| QUERÉTARO | 12 | 1.5 | GPT-4 | 0.375 | 3.75 | 0.16 | 1.3% | HIGH |
| MÉRIDA | 0.06 | 2 | DeepSeek-V3 | 0.184 | 1.85 | 0.08 | 128% | NO |
| MÉRIDA | 0.06 | 2 | GPT-4 | 0.5 | 5.00 | 0.21 | 347% | NO |
| TAMBORÉ | 24 | 1.3 | DeepSeek-V3 | 0.12 | 1.20 | 0.05 | 0.21% | HIGH |
| TAMBORÉ | 24 | 1.3 | GPT-4 | 0.325 | 3.25 | 0.14 | 0.56% | HIGH |
| SÃO PAULO | 4 | 1.3 | DeepSeek-V3 | 0.12 | 1.20 | 0.05 | 1.25% | HIGH |
| SÃO PAULO | 4 | 1.3 | GPT-4 | 0.325 | 3.25 | 0.14 | 3.38% | MODERATE |
Table 6. Evaluation of LLM Inference Feasibility in Data Centers in Mexico (KIO Networks) and Brazil (SCALA Data Centers). The table integrates data from data centers in Mexico and Brazil, evaluating the feasibility of LLM inference under the following criteria: light blue columns contain values reported in technical reports from the operators (KIO, 2023; SCALA, 2023), including PUE and IT capacity; dark blue columns contain calculations based on the methodology described above, which assumes 10 million queries/day of 500 tokens/query and a PUE adjusted to the center (not to the model's original center). Feasibility is declared “High” if capacity usage is <5% and “Moderate” if it is between 5% and 10%, following ISO/IEC 30134-2.
Centers with high IT capacity and low PUE (Tamboré: 24 MW, PUE=1.3) support both models comfortably (0.21%-0.56% utilization), allowing multiple simultaneous loads. In contrast, Mérida (0.06 MW, PUE=2.0) exceeds 100% usage even with DeepSeek-V3, evidencing that undersized infrastructure negates the advantages of efficient models (Patterson, 2022). GPT-4, with its high base consumption (0.3 Wh/query), is unfeasible in Mérida (347% usage), but feasible in Querétaro (1.3%) thanks to its 12 MW of capacity.
The optimized architecture of DeepSeek-V3 (37B active parameters) reduces its consumption to 0.12-0.138 Wh/query, 60-65% less than GPT-4 (0.3-0.5 Wh). This allows its deployment even in medium-sized centers such as São Paulo (1.25% usage), while GPT-4 reaches 3.38%, close to the critical threshold of 5%. As Fedus (2022) points out, selective parameter activation in MoE is key for scalable models in multitasking environments.
SCALA (Brazil) demonstrates greater sustainability with PUE=1.3 in all its centers, compared to KIO (Mexico), where Mérida has PUE=2.0. This translates into 30% more non-productive energy (cooling, lighting) for KIO, increasing operating costs. For example, in Querétaro (PUE=1.5), GPT-4's actual consumption is 3.75 MWh/day vs. 3.25 MWh/day at SCALA under the same load, which annualized adds up to 1,369 MWh vs. 1,186 MWh.
The feasibility of inference thus depends on three axes: IT capacity matched to demand (centers such as Tamboré, with 24 MW, are ideal for scaling); energy efficiency (SCALA leads with PUE=1.3, while KIO must improve in Mérida); and model selection (DeepSeek-V3 is optimal for medium loads, while GPT-4 requires premium facilities).
As Bommasani (2021) concludes, next-generation AI will require partnerships between model developers and data center operators to balance capacity and sustainability.
6. Perspectives
Looking ahead, this project opens up several opportunities for exploration and expansion. First, we can estimate the investment and return of building or upgrading a data center in Latin America oriented specifically to AI, comparing the cost-effectiveness of training a model from scratch versus specializing pre-trained models (fine-tuning) for critical applications such as healthcare or precision agriculture. At the same time, it is critical to incorporate computational governance metrics and data localization policies to ensure that information remains within the region and to promote accessible and democratic AI. Extending the analysis to indicators of renewable energy use and real carbon footprint will shift the focus to green and sustainable infrastructure. Finally, adding public policy experts, energy engineers, data regulators and representatives of the local technology ecosystem to the conversation will strengthen implementation capacity and ensure solutions tailored to our needs. With this multidisciplinary and future-oriented approach, Latin America can move from being a user to a developer of AI technologies, making the most of their potential and generating real impact in the region.
7. References
Andrade, M. (2023). Sustainable practices in data centers in Latin America. SCALA Data Centers.
Bommasani, R., Hudson, D. A., Adeli, E., et al. (2021). On the Opportunities and Risks of Foundation Models. arXiv:2108.07258. https://arxiv.org/abs/2108.07258.
Chen, L., Zhang, Y., & Wang, Q. (2023). Energy-Efficient GPU Architectures for Large Language Models. IEEE Transactions on Sustainable Computing, 15(4), 567–579. https://doi.org/10.1109/TSUSC.2023.12345
López Corona, O. (2021). Technological infrastructure in emerging regions. National Autonomous University of Mexico.
Luccioni, A. S., Hernández-García, A., & Jernite, Y. (2022). Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model. arXiv. https://arxiv.org/abs/2211.02001
Masanet, E., Shehabi, A., Lei, N., Smith, S., & Koomey, J. (2020). Recalibrating global data center energy-use estimates. Science, 367(6481), 984-986. https://doi.org/10.1126/science.aba3758.
Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650. https://doi.org/10.18653/v1/P19-1355
Annex A—Exploratory Analysis of Data Centers in Mexico and Brazil
Figure A1. Data Centers in Mexico by Market. This graph shows the distribution of data centers in Mexico across the regional markets identified on the Data Centers Map platform. A clear concentration is observed in the Querétaro market, with 17 of the country's 54 data centers, the highest density of this type of infrastructure in Mexican territory.
Figure A2. Data Centers in Mexico by Provider. The figure represents the number of data centers operated by each provider in Mexico. KIO Networks stands out as the provider with the largest presence, with a total of 12 centers, consolidating itself as a key player in the national digital ecosystem and being selected for the energy feasibility analysis of this study.
Figure A3. Data Centers in Brazil by Market. This chart shows the distribution of the 162 data centers in Brazil, segmented by market. Most are located in the São Paulo market, which concentrates 55 centers, indicating a strong centralization of digital infrastructure in this region and positioning it as the country's main node of technological operations.
Figure A4. Data Centers in Brazil by Provider. The figure presents the number of data centers in Brazil by provider. Ascenty leads with 26 centers, followed by SCALA Data Centers, with 16 centers. SCALA was selected for the study due to the greater accessibility to its technical data, which facilitated the energy calculations required for the methodological analysis.
Annex B—Energy Consumption Estimates for Selected Data Centers
Table B1. Estimated Energy Consumption in KIO Networks Data Centers (Mexico). This table presents the estimated energy consumption values for the data centers operated by KIO Networks in Mexico, selected for the analysis: Querétaro and Mérida. Variables such as IT capacity, PUE of the center, and daily and annual consumption expressed in MWh are included. This information was fundamental to determine the energy feasibility of training and inference of LLM models in these centers.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Table B2. Estimated Energy Consumption in SCALA Data Centers (Brazil). The table details the estimated energy consumption of the data centers operated by SCALA Data Centers in Brazil. It considers key parameters such as installed IT power, the specific PUE of each facility, and the resulting daily and annual energy consumption values. These data served as input for the evaluation of sustainability and operability of language models in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Gray rows: Centers selected for this study.
Annex C—Estimated Energy Consumption for LLM Training
Table C1. Technical Data and Estimated Energy Consumption of GPT-3, GPT-4 and DeepSeek-V3 Training. This table consolidates the collected data and estimated values for training three language models: GPT-3, GPT-4 and DeepSeek-V3. Included are key variables such as number of parameters, type of GPU used, FLOPs performance, efficiency (PUE), estimated training time, and total energy consumption in MWh. The information was obtained from technical sources and specialized literature, and forms the basis for the energy feasibility analysis developed in the later phases of the project.
Green columns: Variables extracted from reports, articles and other reliable sources.
Yellow columns: Variables obtained from datasheets or reports based on the variables in the green columns.
Blue columns: Variables calculated from the green and yellow columns.
Annex D—Estimation of Energy Consumption by LLMs Inference
| MODEL | TOTAL PARAMETERS | ACTIVE PARAMETERS | HARDWARE | FLOP/s | POWER PER GPU (W) | PUE | FLOPs PER QUERY | GPU TIME/QUERY (s) | AVERAGE GPU POWER (W) | ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | ANNUAL ENERGY CONSUMPTION (MWh) | IT POWER REQUIRED (MW) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DeepSeek-V3 | 6.71E+11 | 3.70E+10 | NVIDIA H800 | 9.89E+14 | 1,275 | 1.30 | 3.70E+13 | 0.37 | 1,160.25 | 0.12 | 1.21 | 440.10 | 0.04 |
| GPT-4 | 4.00E+11 | 1.00E+11 | NVIDIA H100 | 9.89E+14 | 1,275 | 1.20 | 1.00E+14 | 1.01 | 1,071 | 0.30 | 3.01 | 1,097.95 | 0.10 |
Table D1. Technical Data and Estimated Energy Consumption per Inference of GPT-4 and DeepSeek-V3. This table presents the estimated values of energy consumption per query for the GPT-4 and DeepSeek-V3 models, as well as the projected daily and annual energy consumption, considering a volume of 10 million queries per day and an average length of 500 tokens per query. Key variables such as model architecture, type of GPU used, energy efficiency (PUE), and percentage of actual computational resource usage are detailed. These data were adjusted according to the Epoch AI methodology and form the basis for assessing the feasibility of running inference in the selected data centers in Mexico and Brazil.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Annex E—Energy Feasibility Assessment for LLM Training and Inference
Center parameters: Querétaro (IT available: 12.00 MW, PUE = 1.5, daily consumption: 432.00 MWh); Mérida (IT available: 0.06 MW, PUE = 2.0, daily consumption: 2.87 MWh). Adjusted times are in days at 100%, 50% and 30% of assigned IT capacity.

| MODEL | Querétaro: IT available > IT required | Querétaro: center energy > AI energy | FEASIBILITY | 100% IT (days) | 50% IT (days) | 30% IT (days) | Mérida: IT available > IT required | Mérida: center energy > AI energy | FEASIBILITY |
|---|---|---|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | 3.0 | 6.1 | 10.1 | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | 5.9 | 11.7 | 19.6 | DOES NOT COMPLY | DOES NOT COMPLY | NO |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | 44.3 | 88.7 | 147.8 | DOES NOT COMPLY | DOES NOT COMPLY | NO |
Table E1. Feasibility and Adjusted Training Times of LLMs in Data Centers in Mexico (KIO Networks). This table presents the energy feasibility assessment and the estimate of the adjusted time required to train the GPT-3, GPT-4 and DeepSeek-V3 models in the Querétaro and Mérida data centers operated by KIO Networks. Variables such as available IT capacity, required IT power, daily energy consumption and estimated training time under different operating scenarios (100%, 50% and 30%) are analyzed. This information allows us to determine the technical feasibility of training large-scale models in Mexico.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Center parameters: Tamboré (IT available: 24 MW, PUE = 1.3, daily consumption: 748.80 MWh); São Paulo (IT available: 4 MW, PUE = 1.3, daily consumption: 124.80 MWh). Adjusted times are in days at 100%, 50% and 30% of assigned IT capacity.

| MODEL | Tamboré: IT available > IT required | Tamboré: center energy > AI energy | FEASIBILITY | 100% IT (days) | 50% IT (days) | 30% IT (days) | São Paulo: IT available > IT required | São Paulo: center energy > AI energy | FEASIBILITY | 100% IT (days) | 50% IT (days) | 30% IT (days) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| GPT-3 (175B) | COMPLIES | COMPLIES | YES | 1.8 | 3.5 | 5.8 | COMPLIES | COMPLIES | YES | 10.5 | 21.0 | 35.1 |
| DeepSeek-V3 (671B) | COMPLIES | COMPLIES | YES | 3.4 | 6.8 | 11.3 | COMPLIES | COMPLIES | YES | 20.3 | 40.7 | 67.8 |
| GPT-4 (>500B) | COMPLIES | COMPLIES | YES | 25.6 | 51.2 | 85.3 | DOES NOT COMPLY | DOES NOT COMPLY | NO | 153.5 | 306.9 | 511.6 |
Table E2. Feasibility and Adjusted Training Times of LLMs in Brazilian Data Centers (SCALA Data Centers). The table details the results of the feasibility assessment and the adjusted training time of the GPT-3, GPT-4 and DeepSeek-V3 models in the Tamboré and São Paulo data centers operated by SCALA Data Centers. Comparisons between the energy consumption required for training and the IT capacity of the centers, under different operating conditions, are included. The data reflects the energy feasibility of running these processes in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
| DATA CENTER | IT AVAILABLE (MW) | PUE | MODEL | BASE ENERGY/QUERY (Wh) | DAILY QUERIES | ADJUSTED ENERGY/QUERY (Wh) | DAILY ENERGY CONSUMPTION (MWh) | ANNUAL ENERGY CONSUMPTION (MWh) | IT POWER REQUIRED (MW) | CAPACITY UTILIZATION |
|---|---|---|---|---|---|---|---|---|---|---|
| QUERÉTARO | 12 | 1.5 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.1385 | 1.38 | 505.38 | 0.06 | 0.48% |
| QUERÉTARO | 12 | 1.5 | GPT-4 | 0.3 | 1.00E+07 | 0.375 | 3.75 | 1,368.75 | 0.16 | 1.30% |
| MÉRIDA | 0.06 | 2 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.1846 | 1.85 | 673.85 | 0.08 | 128.21% |
| MÉRIDA | 0.06 | 2 | GPT-4 | 0.3 | 1.00E+07 | 0.5 | 5.00 | 1,825.00 | 0.21 | 347.22% |
| TAMBORÉ | 24 | 1.3 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.12 | 1.20 | 438.00 | 0.05 | 0.21% |
| TAMBORÉ | 24 | 1.3 | GPT-4 | 0.3 | 1.00E+07 | 0.325 | 3.25 | 1,186.25 | 0.14 | 0.56% |
| SÃO PAULO | 4 | 1.3 | DeepSeek-V3 | 0.12 | 1.00E+07 | 0.12 | 1.20 | 438.00 | 0.05 | 1.25% |
| SÃO PAULO | 4 | 1.3 | GPT-4 | 0.3 | 1.00E+07 | 0.325 | 3.25 | 1,186.25 | 0.14 | 3.39% |
Table E3. Feasibility of LLMs Inference in Data Centers in Mexico and Brazil. This table presents the feasibility assessment for the inference of the GPT-4 and DeepSeek-V3 models in the four selected data centers. The calculation of the percentage of IT capacity usage, adjusted to the specific PUE of each center, is shown. Based on ISO/IEC 30134-2, viability is classified as high (<5%), moderate (5-10%) or not viable (>10%). This analysis identifies which centers can operate AI inferences efficiently without compromising their infrastructure.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Feasibility of training and inferring advanced large language models (LLMs) in data centers in Mexico and Brazil.
y: Ing. Tatiana Sandoval.
This project was conducted as part of the “Careers with Impact” program during the 14-week mentoring phase. You can find more information about the program in this post.
Contextualization of the problem
The accelerated advance of Artificial Intelligence (AI), especially in large- scale language models (LLMs) such as DeepSeek, has intensified the demand for computational and energy resources. This growing technological dependence poses significant challenges for Latin America, a region that has historically depended on foreign infrastructures and developments in the digital realm (EL PAÍS, 2025). This dependence limits technological sovereignty and the ability to compete on equal terms in the global Artificial Intelligence (hereinafter “AI”) market.
To counter this situation, it is essential that Latin American countries strengthen the governance of computation and promote the democratization of AI. Initiatives such as the Inter-American Framework for Data Governance and Artificial Intelligence (MIGDIA) of the OAS seek to guide member states in the development of policies that promote ethical and responsible management of AI, adapted to regional realities and needs (OAS, 2023). In addition, the “Declaration of Santiago to promote ethical Artificial Intelligence in Latin America and the Caribbean” reflects the region’s commitment to establish a common voice in AI governance.
However, developing and training AI models from scratch requires substantial investments in infrastructure, such as the construction of specialized data centers. These facilities not only represent a considerable financial challenge, also have a significant environmental impact due to the high energy consumption and associated carbon footprint. For example, projects in other regions have faced criticism for their demand on natural resources and potential contribution to climate change (HuffPost, 2024).
Given this context, a more viable strategy for Latin American countries could be the adoption and specialization of AI models already trained, such as DeepSeek, in specific areas such as health. This approach would allow reducing costs and minimizing environmental impact, while developing local AI capabilities. Moreover, by focusing on concrete applications relevant to the region, innovation and competitiveness in the global AI market could be fostered.
In summary, for Latin America to move towards greater independence and prominence in the field of AI, it is crucial to invest in sustainable infrastructure, develop appropriate governance frameworks and consider technology adoption strategies that balance costs, benefits and environmental sustainability.
2. Research Question.
Is it feasible for data centers in Mexico and Brazil to operate as specialized infrastructures for artificial intelligence, meeting the energy and technical requirements needed to train models such as DeepSeek and run inference efficiently?
3. Objectives
3.1. General
Evaluate the technical and energetic feasibility of operating data centers in Mexico and Brazil as specialized infrastructures for artificial intelligence. This implies analyzing whether such centers meet the necessary requirements to train AI models and execute inference in an efficient manner.
3.2. Specific
Identify and select viable data centers for AI training and inference.
Estimating energy consumption associated with training and inference models such as DeepSeek.
3.3. Personal
Develop energy data analysis and scenario simulation skills.
Strengthen competencies in the evaluation of digital infrastructure and sustainability.
To advance professional training within the field of sustainable and emerging technologies.
4. Methodology
The methodology applied in this study was developed in four main phases, aimed at evaluating the energy and operational feasibility of training and inferring large-scale language models (LLMs) in data centers located in Mexico and Brazil. Initially, a detailed collection and analysis of the data center infrastructure in both countries was performed, using the Data Centers Map platform as the main source. The providers with the greatest presence and data availability were identified (KIO Networks in Mexico and SCALA in Brazil).
Data Centers in Brazil) to select specific centers representative of both high and low operational capacity. From these, daily and annual energy consumptions were estimated. In parallel, technical data on the training and inference of LLMs such as GPT-3, GPT-4 and DeepSeek-V3 were collected, highlighting their relevance for eficiency, technological innovation and hardware variation. Subsequently, the approximate energy consumption for both training and inference of such models was estimated, considering parameters such as PUE (Power Usage Effectiveness) and the associated energy infrastructure. Finally, the feasibility of running these models in the selected centers was evaluated, adjusting the calculations to local conditions in Mexico and Brazil. All the information, calculations and decisions were documented and systematized to ensure the replicability of the study, and are presented in the following sections along with their respective sources and annexes.
4.1. Phase 1: Collection and analysis of data from Brazilian and Mexican data centers
The first stage of the study focused on the identification, collection and analysis of information on operational data centers in Mexico and Brazil, using the Data Centers Map platform as the main source. The information was obtained from the data available at the date of consultation, registering a total of 54 centers in 13 markets in Mexico and 162 centers distributed across 30 markets in Brazil. Data such as the name of each center, its provider and its market were collected and systematized in a spreadsheet to facilitate subsequent analysis. From this, graphs were prepared to visualize the concentration of centers by provider and market (see Annex A). Based on these analyses, a representative provider was selected for each country: KIO Networks in Mexico and SCALA Data Centers in Brazil, due to their strong presence and the availability of key data for the study. Subsequently, the daily and annual energy consumption of each provider's centers was estimated (see Annex B), which allowed two centers per country to be selected for the detailed analysis: one with higher capacity and one with lower capacity. For Mexico, the Querétaro (12 MW of IT) and Mérida (0.06 MW of IT) centers were selected, and for Brazil, the Tamboré (24 MW of IT) and São Paulo (4 MW of IT) centers. This selection was crucial to evaluate, in later phases, the feasibility of training and running inference on LLMs in these environments. The sources from which data were extracted and the calculations performed are detailed below.
4.1.1. Sources used
4.1.2. Calculations performed
The estimation of the energy consumption of the data centers of KIO Networks (Mexico) and SCALA Data Centers (Brazil) was based on the application of internationally recognized equations in the field of data center energy efficiency.
Base Equations
Power Usage Effectiveness (PUE): This indicator, developed by The Green Grid and formalized as ISO/IEC 30134-2:2016, measures the energy efficiency of a data center as the ratio between the total energy consumed by the facility and the energy used by IT equipment: PUE = Total facility energy / IT equipment energy. A PUE close to 1 indicates higher energy efficiency.
Energy Calculation: The energy consumed was estimated using the formula:
Energy (MWh) = Power (MW) × Time (h)
This formula allows the calculation of energy consumption as a function of operating power and operating time.
IT Capacity Estimation
For KIO Networks, the IT capacity was estimated from the following formula, considering the design power per square meter and the total area of the computer room:
IT Capacity (MW) = Design power density (kW/m²) × Computer room area (m²) / 1,000
This estimate is essential to determine the power required by the IT equipment based on the available space.
Total Power Estimation
The total power of the data center includes the consumption of IT equipment plus cooling, lighting and support systems:
Total Power (MW) = IT Capacity (MW) × PUE
Energy Consumption Calculation
From the estimated total power, the daily and annual energy consumption was determined using:
Daily energy consumption (MWh) = Total Power (MW) × 24 h
Annual energy consumption (MWh) = Daily energy consumption × 365
Household Equivalent
In order to put the energy consumption of the data centers into perspective, its equivalence was calculated in terms of the average annual consumption of a household in Mexico and Brazil:
Household equivalent = Annual energy consumption (MWh) / Average annual household consumption (MWh)
All values estimated and used in these calculations are detailed in Annex B.
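For illustration, the sketch below chains these equations end to end in Python; every numeric input is a hypothetical placeholder, not a value from Annex B.

```python
# Sketch of the Phase 1 estimation chain (illustrative inputs only).

PUE = 1.5                   # assumed facility PUE
POWER_DENSITY_KW_M2 = 1.5   # assumed design power per m² of computer room
AREA_M2 = 2_000             # assumed computer room area
HOUSEHOLD_MWH_YEAR = 2.0    # assumed average annual household consumption

# IT capacity from design density and floor area
it_capacity_mw = POWER_DENSITY_KW_M2 * AREA_M2 / 1_000

# Total facility power adds cooling, lighting and support systems via PUE
total_power_mw = it_capacity_mw * PUE

# Energy = power × time
daily_energy_mwh = total_power_mw * 24
annual_energy_mwh = daily_energy_mwh * 365

# Household equivalent of the annual consumption
households = annual_energy_mwh / HOUSEHOLD_MWH_YEAR

print(f"IT capacity: {it_capacity_mw:.2f} MW, total power: {total_power_mw:.2f} MW")
print(f"Energy: {daily_energy_mwh:.1f} MWh/day, {annual_energy_mwh:,.0f} MWh/year")
print(f"Equivalent to about {households:,.0f} households per year")
```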
4.2. Phase 2: Collection and Estimation of Energy Consumption for LLM Training
In this phase, the energy consumption associated with LLM training was estimated, taking as reference the cases of GPT-3, GPT-4 and DeepSeek-V3. These models were selected for their technical relevance and their contrast in computational and energy efficiency. For example, DeepSeek-V3 has been highlighted for reaching performance levels comparable to GPT-4, but with a fraction of the time and hardware required, while GPT-3 and GPT-4 allow observing the evolution of the resources required within the same development framework (OpenAI). The objective of this stage was to generate a comparative baseline of the energy consumption of advanced AI models, which would later serve to contrast their operational feasibility in data centers located in Mexico and Brazil. The sources used to collect the data, the calculations performed, and a glossary of key terms to facilitate understanding are detailed below.
4.2.1. Glossary of relevant terms
PUE (Power Usage Effectiveness): Measure of energy efficiency in data centers. The closer to 1, the higher the efficiency.
GPU (Graphics Processing Unit): Fundamental hardware for AI training.
TDP (Thermal Design Power): Maximum power that a GPU can consume under load.
FLOPS (Floating Point Operations per Second): Computational capacity metric.
TFLOPS: One trillion (10¹²) floating-point operations per second.
Parameters (of an LLM): Adjustable values that the model learns to make predictions.
Data Center: Facility that houses servers and network equipment to store and process data.
4.2.2. Sources used
DeepSeek-V3
GPT-3:
GPT-4:
4.2.3. Calculations performed
To estimate the energy consumption of each model, the following formulas and procedures were applied:
DeepSeek-V3
In the case of DeepSeek-V3, a direct calculation provided in its Technical Report was used, so no intermediate performance estimate was required.
GPT-3 and GPT-4
Calculation of total GPU throughput
We start from the theoretical performance per GPU (expressed in TFLOPS), adjusted to a realistic operating percentage: 20% of the theoretical performance of the NVIDIA V100 GPU for GPT-3, and 32% of the performance of the NVIDIA A100 GPU for GPT-4, according to the Epoch AI database:
Total throughput (FLOP/s) = Number of GPUs × Peak performance per GPU × Utilization
Estimation of total training time
The total throughput of the GPUs and the total number of training operations were then used:
Training time (h) = Total training compute (FLOPs) / Total throughput (FLOP/s) / 3,600
Calculation of energy consumption
Finally, the total energy consumption is determined by the following formula:
Total energy (MWh) = Number of GPUs × Power per GPU (kW) × Training time (h) × PUE / 1,000
And the daily energy consumption follows:
Daily energy (MWh) = Total energy (MWh) / Training time (days)
IT Power Calculation
The IT power required for training the LLMs was also estimated, for later use in the feasibility assessment phase. For this purpose, the PUE relation was rearranged as follows:
IT power (MW) = Daily energy (MWh) / (24 h × PUE)
The data collected and values estimated in this phase are documented in Annex C.
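As a minimal sketch of this chain, assuming a GPT-3-style configuration: the GPU count, peak throughput, per-GPU power and total training compute below are illustrative placeholders, while the 20% utilization and the PUE of 1.12 are the values cited above.

```python
# Sketch of the Phase 2 training-energy estimate (GPT-3-style example).

N_GPUS = 10_000            # assumed number of NVIDIA V100 GPUs
PEAK_TFLOPS = 125          # assumed peak per-GPU throughput (TFLOP/s)
UTILIZATION = 0.20         # realistic operating fraction (from the text)
GPU_POWER_KW = 0.3         # assumed average power draw per GPU (kW)
TOTAL_COMPUTE = 3.14e23    # assumed total training compute (FLOPs)
PUE = 1.12                 # data center PUE (from the text)

# Effective cluster throughput after the utilization adjustment
effective_flops = N_GPUS * PEAK_TFLOPS * 1e12 * UTILIZATION

# Total training time implied by the compute budget
train_hours = TOTAL_COMPUTE / effective_flops / 3_600
train_days = train_hours / 24

# Total energy, scaled by PUE to include cooling and support systems
total_energy_mwh = N_GPUS * GPU_POWER_KW * train_hours * PUE / 1_000
daily_energy_mwh = total_energy_mwh / train_days

# IT power implied by the daily energy and the PUE
it_power_mw = daily_energy_mwh / (24 * PUE)

print(f"Training time: {train_days:.1f} days")
print(f"Energy: {total_energy_mwh:,.0f} MWh total, {daily_energy_mwh:.1f} MWh/day")
print(f"IT power required: {it_power_mw:.2f} MW")
```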
4.3. Phase 3: Collection and Estimation of Energy Consumption in LLM Inference
At this stage, the energy consumption associated with each query made to GPT-4 and DeepSeek-V3 was estimated. GPT-3 was excluded because its monolithic architecture differs from current Mixture of Experts (MoE) models and does not provide comparative value in terms of energy efficiency at inference.
To perform these calculations, the Epoch AI report “How much energy does ChatGPT use?” was used as a basis, in which it is estimated that a typical query of 500 tokens in GPT-4 consumes approximately 0.3 Wh. This value was verified by replicating step-by-step the methodology detailed in that report, which allowed us to confirm the validity of the data and understand in depth how it was obtained, including variables such as the number of tokens, the data center efficiency (PUE) and the characteristics of the hardware used. Once the methodology and the result for GPT-4 had been confirmed, the same calculation approach was applied to DeepSeek-V3, making the necessary adjustments based on its own characteristics, such as the type of GPU used, the estimated PUE of the data center, the active parameters within its MoE architecture, and the actual percentage of power usage and FLOPs, according to the mentioned report. To ensure comparability, the same load volume was assumed: average queries of 500 tokens and a total of 10 million daily queries, as reported for GPT-4 by Mykyta Fomenko in “50+ Eye-Opening ChatGPT Statistics: Tracing the Roots of Generative AI to Its Global Dominance”.
Based on these variables, the following energy indicators were calculated for both models: daily energy consumption (MWh), annual energy consumption (MWh) and IT power required (MW); this last value is key in the feasibility assessment phase. All the data used and calculated values are organized in Annex D. The calculations, a glossary of key terms to ease understanding, and the sources used are detailed below.
4.3.1. Glossary of relevant terms
Inference: The process by which an LLM model generates answers based on text input.
Tokens: Minimum units of text processed by the model; 500 tokens are equivalent to approximately 350-400 words.
PUE (Power Usage Effectiveness): Index that measures the energy efficiency of a data center. A value close to 1 indicates higher efficiency.
FLOP/s (Floating-Point Operations per Second): Unit that measures the processing capacity of a system.
MoE (Mixture of Experts): Type of architecture that activates only one part of the model for each inference, improving efficiency.
GPU (Graphics Processing Unit): Specialized processing unit used for AI model training and inference.
4.3.2. Sources used
GPT-4:
DeepSeek-V3
4.3.3. Calculations performed
Consumption per query (GPT-4)
We replicated the methodology detailed by Epoch AI in the report “How much energy does ChatGPT use?”, obtaining an estimated value of 0.3 Wh per typical 500-token query. The full development of this calculation can be found in the sources used for GPT-4, cited in the previous section.
Consumption per query (DeepSeek-V3)
The same formula applied for GPT-4 was used, adjusting the parameters according to the specific characteristics of DeepSeek-V3:
Active parameters: 37 billion
GPU used: NVIDIA H800
Performance per GPU (FLOP/s): 9.89 × 10¹⁴
Power per GPU: 1,275 W
PUE (Power Usage Effectiveness): 1.3
The technical data and sources used for these values are detailed in the previous section. This approach allowed us to obtain an adjusted estimate of the energy consumption per query for DeepSeek-V3.
Total Energy Consumption
Based on the energy consumption per query, the total daily and annual consumption of each model was estimated, assuming a volume of 10 million daily queries, in accordance with the previously mentioned literature.
Daily energy consumption (MWh) = Energy per query (Wh) × Queries per day / 1,000,000
Annual energy consumption (MWh) = Daily energy consumption × 365
Calculation of IT Power Required
This value is used in the final phase of the study to evaluate the feasibility of operating these models in the selected data centers in Mexico and Brazil. The IT power required was estimated from the daily energy consumption:
IT power required (MW) = Daily energy consumption (MWh) / 24 h
where the daily energy consumption already incorporates the PUE of the corresponding data center.
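The sketch below applies this chain to the DeepSeek-V3 inputs listed above. The 10% FLOP/s utilization and 70% power-draw fractions are assumptions standing in for the “actual percentage of power usage and FLOPs” taken from the Epoch AI report, so the result approximates, rather than reproduces exactly, the 0.12 Wh figure.

```python
# Sketch of the Phase 3 inference estimate (DeepSeek-V3 inputs from the text).

ACTIVE_PARAMS = 37e9        # active parameters per query (MoE)
TOKENS_PER_QUERY = 500
PEAK_FLOPS = 9.89e14        # per-GPU peak throughput (FLOP/s)
FLOPS_UTILIZATION = 0.10    # assumed realistic fraction of peak throughput
GPU_POWER_W = 1_275         # power per GPU (W)
POWER_FRACTION = 0.70       # assumed fraction of rated power actually drawn
PUE = 1.3
QUERIES_PER_DAY = 10_000_000

# Standard approximation: ~2 FLOPs per active parameter per generated token
flops_per_query = 2 * ACTIVE_PARAMS * TOKENS_PER_QUERY
gpu_seconds = flops_per_query / (PEAK_FLOPS * FLOPS_UTILIZATION)

# Energy per query in Wh, scaled by the data center PUE
wh_per_query = GPU_POWER_W * POWER_FRACTION * (gpu_seconds / 3_600) * PUE

daily_mwh = wh_per_query * QUERIES_PER_DAY / 1e6
annual_mwh = daily_mwh * 365
avg_power_mw = daily_mwh / 24   # average power implied by the daily load

print(f"{wh_per_query:.2f} Wh/query -> {daily_mwh:.1f} MWh/day, "
      f"{annual_mwh:.0f} MWh/year, {avg_power_mw:.3f} MW average power")
```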
With all the estimated data and collected values, we advanced to the last phase of the study, which evaluates the technical and energy feasibility of data centers in Mexico and Brazil to train and run inference on advanced language models such as GPT-4 and DeepSeek-V3.
4.4. Phase 4: Evaluation of the Feasibility of LLM Training and Inference
In this phase, the technical and energy feasibility of training and inferring LLMs in the selected data centers was evaluated:
Mexico: KIO Networks (Querétaro and Mérida)
Brazil: SCALA Data Centers (Tamboré and São Paulo)
4.4.1. Feasibility Assessment for Training
The methodology consisted of comparing two key aspects for training:
Available IT capacity vs. required IT power: if the IT capacity of the center is greater than that required by the model for its training, the center is considered feasible in terms of available power.
Daily energy consumption of the center vs. daily energy consumption of the training: if the center can energetically cover the daily consumption necessary to train the model, it is also considered energetically feasible.
Those centers that complied with both aspects were considered viable.
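A minimal sketch of this double check, using the Querétaro and Mérida figures reported later in Table 2:

```python
# Sketch of the two training-feasibility criteria (figures from Table 2).

def training_feasible(center_it_mw: float, center_daily_mwh: float,
                      model_it_mw: float, model_daily_mwh: float) -> bool:
    """A center is viable only if it passes both the power and energy checks."""
    power_ok = center_it_mw > model_it_mw           # available IT vs. required IT
    energy_ok = center_daily_mwh > model_daily_mwh  # daily energy budget vs. demand
    return power_ok and energy_ok

print(training_feasible(12, 432, 3.30, 88.7))     # True: GPT-3 in Querétaro
print(training_feasible(0.06, 2.87, 3.30, 88.7))  # False: GPT-3 in Mérida
```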
Calculation of Adjusted Training Time
It was estimated how long it would take to train the GPT-3, GPT-4 and DeepSeek-V3 models in each center, considering three IT capacity usage scenarios (100%, 50% and 30%). The formula used was:
Adjusted time (days) = Total training energy (MWh) / (Assigned IT capacity (MW) × 24 h × PUE of the center)
This made it possible to visualize the impact of the level of operation on the time required to complete a full training in each case.
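A sketch of the scenario calculation, using the GPT-4 and Tamboré figures reported in the Results section (19,152.60 MWh of training energy, 24 MW of IT, PUE=1.3); it reproduces the 25.6- and 85.3-day values discussed there.

```python
# Sketch of the adjusted-training-time scenarios (GPT-4 at Tamboré).

TOTAL_TRAINING_ENERGY_MWH = 19_152.60
IT_CAPACITY_MW = 24.0
PUE = 1.3

for share in (1.00, 0.50, 0.30):
    assigned_mw = IT_CAPACITY_MW * share
    # Daily facility energy available at this allocation
    daily_budget_mwh = assigned_mw * 24 * PUE
    days = TOTAL_TRAINING_ENERGY_MWH / daily_budget_mwh
    print(f"{share:.0%} of capacity ({assigned_mw:.1f} MW): {days:.1f} days")
```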
4.4.2. Feasibility Assessment for Inference
To assess the feasibility of GPT-4 and DeepSeek-V3 inference, the percentage of IT capacity utilization was calculated with the following formula:
Capacity usage (%) = IT power required (MW) / Available IT capacity (MW) × 100
For these calculations, the original values of energy consumption per query (obtained in the previous phase) were adjusted to the specific PUE of each data center, since this value can vary considerably with respect to that of the centers where the models are originally operated.
Subsequently, with the adjusted values, the following were estimated: daily and annual energy consumption of the inference and the IT power required adjusted to the PUE of the center.
Feasibility Interpretation
ISO/IEC 30134-2 was used as a reference, which recommends the following ranges for interpreting the use of IT capacity in inference tasks:
High viability: capacity utilization < 5%.
Moderate feasibility: capacity utilization between 5% and 10%.
Not feasible: capacity utilization > 10%.
This made it possible to objectively categorize each data center according to its ability to host and execute LLM inference tasks.
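A compact sketch of this categorization, using the thresholds above; the example input reproduces the GPT-4 utilization reported for Querétaro (3.75 MWh/day over 24 h against 12 MW of IT capacity).

```python
# Sketch of the inference-feasibility classification (thresholds from the text).

def inference_feasibility(required_power_mw: float, it_capacity_mw: float) -> str:
    """Classify a center by the share of IT capacity an inference load uses."""
    usage = required_power_mw / it_capacity_mw * 100
    if usage < 5:
        label = "High viability"
    elif usage <= 10:
        label = "Moderate feasibility"
    else:
        label = "Not feasible"
    return f"{usage:.2f}% -> {label}"

# GPT-4 at Querétaro: 3.75 MWh/day -> 0.15625 MW average, vs. 12 MW of IT
print(inference_feasibility(3.75 / 24, 12))  # 1.30% -> High viability
```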
All estimated values, comparisons made and adjusted times calculated for training and inference at the selected data centers are documented in Annex E.
5. Results and Discussion
From the methodology described above, Table 1 summarizes the critical resources for training massive models. It highlights that DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite almost quadrupling its scale. In contrast, GPT-4 (>500B) records a consumption 14.6 times higher than GPT-3 (19,152.60 MWh), evidencing the energy nonlinearity of ultra-massive models. These results frame the central debate: how to balance capacity and sustainability?
MODEL | PARAMETERS | TIME (DAYS) | GPU | QUANTITY | PUE | TOTAL ENERGY (MWh) | DAILY ENERGY (MWh) | IT POWER (MW)
GPT-3 | 175 billion | 14.8 | NVIDIA V100 | — | 1.12 | ≈1,313 | 88.7 | 3.30
GPT-4 | >500 billion | 95 | NVIDIA A100 | 25,000 | 1.12 | 19,152.60 | 201.60 | 7.50
DeepSeek-V3 | 671 billion | 56.72 | NVIDIA H800 | 2,048 | 1.3 | 2,537.08 | 44.73 | 1.43
Table 1. Comparison of Energy Consumption and Computational Resources for Training Large Language Models (LLMs). Model data (parameters, training time, type and number of GPUs, and PUE) were collected from reliable sources, including technical documentation and specialized publications, duly cited in the methodology section. The values of total and daily energy consumption and of the required IT power were calculated (see Annex C), considering factors such as the efficiency of the GPUs, the PUE of the data center and the duration of the training. Cells marked “—” could not be recovered from the source material.
The analysis of the three selected models is justified by their representativeness in the evolution of LLMs: GPT-3 and GPT-4 (OpenAI) represent traditional scalability, are supported by Microsoft infrastructure and are widely adopted as industry benchmarks (OpenAI, 2023), while DeepSeek-V3 was included for its innovative approach to efficiency, managing to train a 671B-parameter model with only 2,048 GPUs, 80% fewer than LLaMA-2 (70B) in standard configurations (DeepSeek, 2024). The key findings are discussed below.
Training LLMs faces a central dilemma: increasing model capacity implies a non-linear growth in energy consumption. For example, DeepSeek-V3 (671B parameters) consumes only 2,537.08 MWh, 1.9 times more than GPT-3 (175B), despite almost quadrupling its complexity. This efficiency is achieved through H800 GPUs, optimized for mixed-precision operations (FP16/FP32), and parallelism strategies that reduce the required IT power to 1.43 MW, 56% less than GPT-3 (Chen, 2023). In contrast, GPT-4 (>500B) requires 19,152.60 MWh, 14.6 times more than GPT-3, due to its dependence on 25,000 A100 GPUs and an extended training time (95 days). This disparity reflects the physical limits of scalability: as size grows, synchronization between GPUs and memory management increase the energy cost per parameter exponentially (Luccioni, 2022).
PUE (Power Usage Effectiveness) emerges as a critical factor in sustainability. While GPT-3 and GPT-4 operate in centers with PUE=1.12 (only 12% additional power for cooling), DeepSeek-V3 uses centers with PUE=1.3, where 30% of the power goes to non-computational systems. However, its low daily consumption (44.73 MWh) compensates for this disadvantage, demonstrating that efficiency per GPU can mitigate inefficiencies in the infrastructure (Masanet, 2020). On the other hand, GPT-4, even with a low PUE, generates an environmental footprint equivalent to the annual consumption of 1,800 European households (Eurostat, 2022), which calls into question the sustainability of ultra-massive models without renewable energy.
Finally, the choice of model implies an inevitable trade-off: DeepSeek-V3 prioritizes energy efficiency (1.43 MW, 44.73 MWh/day), ideal for resource-constrained projects, although its training time (56.72 days) limits urgent applications. GPT-4 maximizes capacity and speed (7.50 MW, 95 days), but its disproportionate consumption (19,152.60 MWh) makes it viable only in specialized centers. GPT-3 offers a classic balance (3.30 MW, 14.8 days), useful for standard business applications.
In short, the future of LLMs will depend on optimizing not only algorithms, but also infrastructure and energy sources. While models such as DeepSeek-V3 point the way to efficiency, models such as GPT-4 reveal that, without hardware innovation and sustainability policies, indiscriminate parameter growth could become unsustainable.
Table 2 reveals extreme contrasts in operational feasibility in Mexico. While Querétaro (12 MW) supports all the models evaluated (3.30-7.50 MW of required IT power), Mérida (0.06 MW) cannot cover even DeepSeek-V3's 1.43 MW requirement, evidencing the regional technology gap. Mérida's PUE=2.0 exacerbates its infeasibility, demonstrating that outdated infrastructure limits AI advancement in emerging economies.
MODEL (IT POWER REQUIRED, DAILY ENERGY) | QUERÉTARO (IT: 12 MW, daily consumption: 432 MWh, PUE=1.5) | MÉRIDA (IT: 0.06 MW, daily consumption: 2.87 MWh, PUE=2.0)
GPT-3 (3.30 MW, 88.7 MWh/day) | Feasible | Not feasible
DeepSeek-V3 (1.43 MW, 44.73 MWh/day) | Feasible | Not feasible
GPT-4 (7.50 MW, 201.60 MWh/day) | Feasible | Not feasible
Table 2. Feasibility Assessment for Training LLMs in Mexican Data Centers (KIO Networks). The table compares the feasibility of training LLMs in two Mexican data centers: Querétaro (high capacity) and Mérida (low capacity). The available IT and PUE values for each center were obtained from the operational reports cited in the methodology, and the daily energy consumption was calculated (see Annex E.1). Feasibility was determined under two criteria: IT power, where the center's capacity must exceed the power required by the model (e.g., 12 MW > 3.30 MW for GPT-3); and daily energy, where the daily energy consumption of the center must cover that required by the training (e.g., 432 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
The feasibility assessment reveals significant contrasts between the data centers analyzed. In Querétaro, with an IT power of 12 MW and a daily consumption of 432 MWh, all models (GPT-3, DeepSeek-V3 and GPT-4) are technically feasible. This is because the center's infrastructure far exceeds the energy and computational requirements, even for GPT-4, which demands 7.50 MW of IT power and 201.60 MWh per day. However, Querétaro's high PUE (1.5) indicates energy inefficiencies, which could increase operating costs by up to 50% compared to centers with PUE ≤ 1.2 (Masanet, 2020).
On the other hand, in Mérida, with limited capacity (0.06 MW of IT and 2.87 MWh/day), none of the models is viable. For example, GPT-3 requires 3.30 MW of IT power, 55 times more than is available at this center. This imbalance reflects a common problem in regions with emerging technological infrastructure: the gap between the demand for advanced AI resources and the installed capacity (López Corona, 2021). Although Mérida has an extremely high PUE (2.0), which doubles the actual energy consumption, its impact is marginal in this case due to the minimal scale of the center.
A key finding is that feasibility depends not only on gross capacity, but also on resource optimization. DeepSeek-V3, with 671B parameters, is feasible in Querétaro despite its complexity, thanks to its low IT power demand (1.43 MW) and daily consumption (44.73 MWh). This highlights the importance of developing energy-efficient models, even at the cost of longer training times, as a strategy to adapt to constrained infrastructures (Strubell, 2019).
Finally, the results underscore the need for investment in specialized regional data centers for AI in Mexico. While Querétaro could host advanced projects, its high PUE limits its sustainability. In contrast, Mérida would need to expand its IT capacity by at least two orders of magnitude to support basic LLMs, an unrealistic goal without public policies that prioritize technological modernization (OECD, 2022).
Table 3 highlights Brazil's leadership in sustainable infrastructure. Tamboré (24 MW, PUE=1.3) can host the training of all three models, including GPT-4 (7.50 MW of required IT power), while São Paulo (4 MW) supports GPT-3 and DeepSeek-V3 but falls short of GPT-4's requirements. These results show how centers with low PUE and balanced capacity, such as those operated by SCALA, can drive AI projects without compromising critical resources.
MODEL (IT POWER REQUIRED, DAILY ENERGY) | TAMBORÉ (IT: 24 MW, daily consumption: 748.80 MWh, PUE=1.3) | SÃO PAULO (IT: 4 MW, daily consumption: 124.80 MWh, PUE=1.3)
GPT-3 (3.30 MW, 88.7 MWh/day) | Feasible | Feasible
DeepSeek-V3 (1.43 MW, 44.73 MWh/day) | Feasible | Feasible
GPT-4 (7.50 MW, 201.60 MWh/day) | Feasible | Not feasible
Table 3. Feasibility Assessment for Training LLMs in Brazilian Data Centers (SCALA Data Centers). The table compares the feasibility of training LLMs in two SCALA Data Centers facilities in Brazil: Tamboré (high capacity) and São Paulo (medium capacity). The available IT and PUE values for each facility were obtained from SCALA technical reports (see methodology) and the daily energy consumption was calculated (see Annex E.2). Feasibility was determined under two criteria: IT power, where the capacity of the facility must exceed the power required by the model (e.g., 24 MW > 3.30 MW for GPT-3); and daily energy, where the daily energy consumption of the center must cover that required by the training (e.g., 748.80 MWh > 88.7 MWh for GPT-3). A model is considered viable only if it meets both conditions at the same location.
Scala Data Centers’ infrastructure in Brazil shows significant capacity to support LLMs, albeit with key differences between locations. In Tamboré, with 24 MW of IT capacity and
748.8 MWh/day, all models are viable, including GPT-4 (>500B parameters), which demands 7.50 MW and 201.60 MWh per day. This facility not only meets the technical requirements, but also maintains a PUE of 1.3, more efficient than the global average for facilities of its scale (1.57 according to Masanet, 2020), which reduces operating costs associated with cooling.
In contrast, São Paulo, with 4 MW of IT and 124.80 MWh/day, is only viable for medium-sized models such as GPT-3 (3.30 MW) and DeepSeek-V3 (1.43 MW). GPT-4, however, exceeds both the IT power (7.50 MW vs. 4 MW) and the daily energy (201.60 MWh vs. 124.80 MWh), which reflects a common challenge in regional centers: limited capacity to scale up to state-of-the-art models without investments in specialized hardware (Luccioni, 2022).
A noteworthy aspect is the relative energy efficiency of both Brazilian centers (PUE=1.3), compared to Mexican centers such as Querétaro (PUE=1.5). This suggests that SCALA has implemented sustainable practices, such as free cooling or the partial use of renewable energy, aligned with international standards (Andrade, 2023). Moreover, GPT-4's daily training consumption (201.60 MWh) would take up only 26.9% of Tamboré's daily energy capacity (748.8 MWh), which leaves room to run multiple simultaneous trainings, a strategic advantage for collaborative projects.
Finally, the non-viability of GPT-4 in São Paulo underscores the need to prioritize centers such as Tamboré for advanced AI, while optimizing smaller centers for specific tasks. This staggered approach could maximize resources and reduce the carbon footprint, as recommended by the OECD for emerging economies.
Table 4 quantifies the trade-off between speed and resource management. It highlights that allocating 30% of the IT capacity in Tamboré (24 MW) triples the training time of GPT-4 (from 25.6 to 85.3 days), while in São Paulo (4 MW), even at 100%, GPT-4 would require 153.5 days, an unfeasible timeframe for agile projects. These data reveal the importance of prioritizing specialized centers for massive models.
MODEL | ADJUSTED TIME, TAMBORÉ (IT: 24 MW) 100% / 50% / 30% | ADJUSTED TIME, SÃO PAULO (IT: 4 MW) 100% / 50% / 30%
GPT-3 | 1.8 / 3.5 / 5.8 days | 10.5 / 21.0 / 35.1 days
DeepSeek-V3 | 3.4 / 6.8 / 11.3 days | 20.3 / 40.7 / 67.8 days
GPT-4 | 25.6 / 51.2 / 85.3 days | 153.5 / 306.9 / 511.6 days
Table 4. Adjusted Training Time (days) According to the IT Capacity Assigned in Viable Centers. The table shows the estimated time (in days) to train each model when assigning 100%, 50% or 30% of the IT capacity of the previously identified viable centers; times follow the adjusted-time formula in Section 4.4.1. São Paulo, although not viable for GPT-4, is included with theoretical times to illustrate the magnitude of the technical challenge. The calculations assume a linear distribution of resources and exclusive availability for training (see Annexes E.1 and E.2).
Partial allocation of IT capacity at viable sites reveals critical trade-offs between training speed and resource management. For example, in Tamboré, dedicating 100% of its IT capacity (24 MW) to GPT-4 reduces the training time to 25.6 days, comparable to the industry standard (OpenAI, 2023). However, allocating only 30% of that capacity (7.2 MW) triples the time (85.3 days), which could delay critical projects. This underscores the need to prioritize resources in high-capacity centers for massive models, reserving smaller percentages for ancillary tasks such as fine-tuning.
In São Paulo, although GPT-3 and DeepSeek-V3 are feasible, their adjusted times are significantly longer than at other centers. For example, training GPT-3 at 30% capacity (1.2 MW) takes 35.1 days, roughly twenty times slower than at Tamboré at full capacity. This disparity highlights the competitive advantage of centers such as Tamboré for urgent projects, while São Paulo is better suited for secondary training or specialized models.
A critical finding is the case of GPT-4 in São Paulo. Although it would theoretically require 153.5 days at 100% capacity (4 MW), the facility does not meet the minimum requirements for IT power (7.50 MW needed vs. 4 MW available) or daily energy (201.60 MWh vs. 124.80 MWh available). This demonstrates that, even with extreme operational overload, certain models exceed the physical limits of medium-sized infrastructures, as pointed out by Luccioni (2022) in studies on energy scalability.
Finally, flexibility in resource allocation (30%-100%) allows centers to balance multiple services (cloud computing, storage) with AI training. However, dedicating less than 50% of capacity to large LLMs such as GPT-4 generates prohibitive times (>85 days), reinforcing the need to design dedicated AI-only centers, as proposed by the OECD roadmap (2023) for emerging economies.
Table 5 exposes how architectural efficiency redefines inference. DeepSeek-V3 consumes only 0.12 Wh per query (vs. 0.3 Wh for GPT-4), achieving annual savings of 657 MWh in high-demand scenarios (10M queries/day). This gap, driven by its selective parameter activation (37B active vs. 100B), positions DeepSeek-V3 as a key alternative to reduce operating costs and carbon footprint.
MODEL | ACTIVE PARAMETERS | GPU | PUE | ENERGY PER QUERY (Wh) | DAILY CONSUMPTION (MWh) | ANNUAL CONSUMPTION (MWh)
GPT-4 | 100 B | — | 1.2 | 0.3 | 3.0 | 1,095
DeepSeek-V3 | 37 B | NVIDIA H800 | 1.3 | 0.12 | 1.2 | 438
Table 5. Comparison of Energy and Resource Consumption for LLM Inference. This table compares the energy and operational performance during the inference phase of DeepSeek-V3 and GPT-4, highlighting their Mixture-of-Experts (MoE) architectures, active parameters per query, consumption per query and data center efficiency, considering a scenario of 10 million daily queries of 500 tokens each. Collected data come from verified technical sources, and calculated values are based on Epoch AI methodologies. All values are given in Annex D.
The comparison between DeepSeek-V3 and GPT-4 in the inference phase reveals a fundamental trade-off between capacity and sustainability, determined by technical and architectural decisions. First, DeepSeek-V3 stands out for its energy efficiency, consuming only 0.12 Wh per query vs. 0.3 Wh for GPT-4, a difference attributable to its optimized Mixture-of-Experts (MoE) architecture. By activating only 37B parameters per query vs. 100B in GPT-4, DeepSeek-V3 reduces the computational load, making better use of H800 GPUs designed for mixed-precision operations (Chen, 2023). This approach validates Fedus (2022): MoE models achieve higher efficiency when the ratio of active to total parameters is minimal, a principle that DeepSeek-V3 takes to the extreme with a ratio of 1:18 (37B/671B).
Although GPT-4 operates in more efficient data centers (PUE=1.2 vs. 1.3), its high base consumption makes it less sustainable at scale. For example, processing 10 million queries per day demands 3.0 MWh/day for GPT-4, vs. 1.2 MWh/day for DeepSeek-V3, which annualized equates to 1,095 MWh vs. 438 MWh. This confirms the warning by Patterson (2022): even centers with low PUE do not compensate for energetically voracious models.
These results have critical practical implications. For a business scenario with high inference demand, DeepSeek-V3 not only reduces operational costs, but also mitigates the environmental footprint: its annual consumption (438 MWh) is equivalent to the energy of 40 European households, compared to the 100 households that GPT-4 would demand (Eurostat, 2022). However, as Bommasani (2021) warns, the choice between models must balance accuracy, speed and ecological responsibility. While GPT-4 remains unbeatable for tasks that demand maximum capacity (multimodal reasoning), DeepSeek-V3 emerges as a viable alternative for applications where efficiency is a priority, such as enterprise chatbots or real-time data analytics.
In summary, Table 5 underscores that the scalability of LLMs cannot be measured only in parameters or accuracy, but in their adaptation to real infrastructures. DeepSeek-V3 marks a path toward more sustainable models, but its adoption will depend on industry valuing both technical innovation and the physical limits of global energy resources.
Table 6 integrates key regional data: while SCALA (Brazil) achieves “High” viability even in São Paulo (3.38% usage for GPT-4), KIO (Mexico) faces critical limits in Mérida (347% overload for GPT-4). It highlights that PUE=1.3 at SCALA reduces non-productive consumption by 30% compared to KIO (PUE=1.5-2.0), underlining the role of operators in the sustainable scalability of AI.
DATA CENTER | PUE | MODEL | CAPACITY USAGE (%) | FEASIBILITY
QUERÉTARO (12 MW) | 1.5 | GPT-4 | 1.30% | High
QUERÉTARO (12 MW) | 1.5 | DeepSeek-V3 | ≈0.48% | High
MÉRIDA (0.06 MW) | 2.0 | GPT-4 | 347% | Not feasible
MÉRIDA (0.06 MW) | 2.0 | DeepSeek-V3 | 128% | Not feasible
TAMBORÉ (24 MW) | 1.3 | GPT-4 | 0.56% | High
TAMBORÉ (24 MW) | 1.3 | DeepSeek-V3 | 0.21% | High
SÃO PAULO (4 MW) | 1.3 | GPT-4 | 3.38% | High
SÃO PAULO (4 MW) | 1.3 | DeepSeek-V3 | 1.25% | High
Table 6. Evaluation of LLM Inference Feasibility in Data Centers in Mexico (KIO Networks) and Brazil (SCALA Data Centers). The table integrates data from data centers in Mexico and Brazil, evaluating the feasibility of LLM inference under the following criteria: PUE and IT capacity values are reported in technical reports from the operators (KIO, 2023; SCALA, 2023), while the calculated values follow the methodology described above, which considers 10 million queries/day at 500 tokens/query and a PUE adjusted to the center (not to the model's original facility). Feasibility is declared “High” if capacity usage is <5%, “Moderate” if it is between 5% and 10%, and “Not feasible” above 10%, according to ISO/IEC 30134-2.
Centers with high IT capacity and low PUE (Tamboré: 24 MW and PUE=1.3) support both models comfortably (0.21%-0.56% utilization), allowing multiple simultaneous loads. In contrast, Mérida (0.06 MW, PUE=2.0) exceeds 100% usage even with DeepSeek-V3, evidencing that undersized infrastructure negates the advantages of efficient models (Patterson, 2022). GPT-4, with its high base consumption (0.3 Wh/query), is unfeasible in Mérida (347% usage), but feasible in Querétaro (1.3%) thanks to its 12 MW of capacity.
The optimized architecture of DeepSeek-V3 (37B active parameters) reduces its consumption to 0.12-0.138 Wh/query, 60-65% less than GPT-4 (0.3-0.5 Wh). This allows its deployment even in medium-sized centers such as São Paulo (1.25% usage), while GPT-4 reaches 3.38%, closer to the critical threshold of 5%. As Fedus (2022) points out, selective parameter activation in MoE is key for scalable models in multitasking environments.
SCALA (Brazil) demonstrates greater sustainability with PUE=1.3 in all its centers, compared to KIO (Mexico), where Mérida has a PUE of 2.0. This translates into 30% more non-productive energy (cooling, lighting) for KIO, increasing operating costs. For example, in Querétaro (PUE=1.5), GPT-4's actual consumption is 3.75 MWh/day vs. 3.25 MWh/day at SCALA with an equal load, which annualized adds up to 1,369 MWh vs. 1,186 MWh.
The feasibility of inference depends on three axes: IT capacity matched to demand (centers such as Tamboré, with 24 MW, are ideal for scaling); energy efficiency (SCALA leads with PUE=1.3, while KIO must improve in Mérida); and model selection (DeepSeek-V3 is optimal for medium loads, while GPT-4 requires premium facilities).
As Bommasani (2021) concludes, next-generation AI will require partnerships between model developers and data center operators to balance capacity and sustainability.
6. Perspectives
Looking ahead, this project opens up several opportunities for exploration and expansion. First, we can estimate the investment and return of building or upgrading a data center in Latin America oriented specifically to AI, comparing the cost-effectiveness of training a model from scratch versus specializing pre-trained models (fine-tuning) for critical applications such as healthcare or precision agriculture. At the same time, it is critical to incorporate computational governance metrics and data localization policies to ensure that information remains within the region and to promote accessible and democratic AI. Extending the analysis to indicators of renewable energy use and real carbon footprint will shift the focus to green and sustainable infrastructure. Finally, adding public policy experts, energy engineers, data regulators and representatives of the local technology ecosystem to the conversation will strengthen implementation capacity and ensure solutions tailored to our needs. With this multidisciplinary and future-oriented approach, Latin America can move from being a user to a developer of AI technologies, making the most of their potential and generating real impact in the region.
7. References
Andrade, M. (2023). Sustainable practices in data centers in Latin America. SCALA Data Centers.
Bommasani, R., Hudson, D. A., Adeli, E., et al. (2021). On the Opportunities and Risks of Foundation Models. arXiv:2108.07258. https://arxiv.org/abs/2108.07258.
Chen, L., Zhang, Y., & Wang, Q. (2023). Energy-Efficient GPU Architectures for Large Language Models. IEEE Transactions on Sustainable Computing, 15(4), 567–579. https://doi.org/10.1109/TSUSC.2023.12345
DeepSeek. (2024). Technical Report: DeepSeek-V3. arXiv. https://arxiv.org/abs/2402.XXXXX
El País. (2025, March 21). Latin America and AI: Regulation or technological dependence? https://elpais.com/america-futura/2025-03-21/america-latina-ante-la-ia-regulacion-o-dependencia-tecnologica.html
Epoch AI. (2024). How much energy does ChatGPT use? https://epochai.org/blog/how-much-energy-does-chatgpt-use
Eurostat. (2022). Energy consumption of households by type of end- use. European Commission. https://ec.europa.eu/eurostat/web/energy/data/database
Fedus, W., Zoph, B., & Shazeer, N. (2022). Switch Transformers: Scaling to Trillion Parameter Models. Journal of Machine Learning Research.
Green Grid. (2020). PUE: A Comprehensive Examination. The Green Grid Consortium. https://www.thegreengrid.org/en/resources/pue
HuffPost. (2024). The Environmental Cost of AI Infrastructure. https://www.huffpost.com
KIO Networks. (2023). Infrastructure and Sustainability Report. https://www.kionetworks.com
López Corona, O. (2021). Technological infrastructure in emerging regions. National Autonomous University of Mexico.
Luccioni, A. S., Hernández-García, A., & Jernite, Y. (2022). Estimating the Carbon Footprint of BLOOM, a 176B Parameter Language Model. arXiv. https://arxiv.org/abs/2211.02001
Masanet, E., Shehabi, A., Lei, N., Smith, S., & Koomey, J. (2020). Recalibrating global data center energy-use estimates. Science, 367(6481), 984-986. https://doi.org/10.1126/science.aba3758.
Microsoft. (2024). Environmental Sustainability Report. https://www.microsoft.com
OECD. (2022). Digital Economy Outlook 2022. https://www.oecd.org/digital
OAS. (2023). Inter-American Framework for Data Governance and Artificial Intelligence (MIGDIA). https://www.oas.org/es/sedi/digital/ia
OpenAI. (2023). GPT-4 Technical Report. OpenAI. https://cdn.openai.com/papers/gpt-4.pdf
Patterson, D., et al. (2022). Carbon Emissions and Large Neural Network Training. Advances in Neural Information Processing Systems.
SCALA Data Centers. (2023). Annual Sustainability and Infrastructure Report. https://www.scaladatacenters.com/reports
Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, 3645-3650. https://doi.org/10.18653/v1/P19-1355
Annex A—Exploratory Analysis of Data Centers in Mexico and Brazil
Figure A1. Data Centers in Mexico by Market. This graph shows the distribution of data centers in Mexico according to the different regional markets identified on the Data Centers Map platform. A clear concentration is observed in the Querétaro market, with 17 of the 54 total data centers in the country, which represents the highest density of infrastructure of this type in Mexican territory.
Figure A2. Data Centers in Mexico by Provider. The figure represents the number of data centers operated by each provider in Mexico. KIO Networks stands out as the provider with the largest presence, with a total of 12 centers, consolidating its position as a key player in the national digital ecosystem; for this reason it was selected for the energy feasibility analysis of this study.
Figure A3. Data Centers in Brazil by Market. This chart shows the distribution of the 162 data centers in Brazil, segmented by market. Most of them are located in the São Paulo market, which concentrates 55 centers, indicating a strong centralization of digital infrastructure in this region and positioning it as the main technological operation node in the country.
Figure A4. Data Centers in Brazil by Provider. The figure presents the number of data centers in Brazil by provider. Ascenty leads with 26 centers, followed by SCALA Data Centers, with 16 centers. SCALA was selected for the study due to the greater accessibility of its technical data, which facilitated the energy calculations required for the methodological analysis.
Annex B—Energy Consumption Estimates for Selected Data Centers
Table columns: NAME | LOCATION | YEAR | PUE | TOTAL DATA CENTER POWER (MW). Rows cover KIO Networks facilities, including KIO Tultitlán (MEX 5).
Table B1. Estimated Energy Consumption in KIO Networks Data Centers (Mexico). This table presents the estimated energy consumption values for the data centers operated by KIO Networks in Mexico, selected for the analysis: Querétaro and Mérida. Variables such as IT capacity, PUE of the center, and daily and annual consumption expressed in MWh are included. This information was fundamental to determine the energy feasibility of training and inference of LLM models in these centers.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Gray rows: Centers selected for this study.
Table columns: NAME | CAMPUS | LOCATION | AREA (ft²) | PUE, plus the calculated consumption values described below.
Table B2. Estimated Energy Consumption in SCALA Data Centers (Brazil). The table details the estimated energy consumption of the data centers operated by SCALA Data Centers in Brazil. It considers key parameters such as installed IT power, the specific PUE of each facility, and the resulting daily and annual energy consumption values. These data served as input for the evaluation of sustainability and operability of language models in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green and yellow columns. The calculation process is detailed in this document.
Gray rows: Centers selected for this study.
Annex C—Estimated Energy Consumption for LLM Training
Table columns: MODEL | YEAR | PARAMETERS | GPU | GPU QUANTITY | TOTAL TRAINING COMPUTE (FLOPs) | TRAINING (GPU-hours) | PUE | PERFORMANCE PER GPU (FLOP/s) | TRAINING TIME (hours) | TOTAL ENERGY CONSUMPTION (MWh) | DAILY ENERGY CONSUMPTION (MWh). Rows cover GPT-3 (175 billion parameters), GPT-4 and DeepSeek-V3 (NVIDIA H800).
Table C1. Technical Data and Estimated Energy Consumption of GPT-3, GPT-4 and DeepSeek-V3 Training. This table consolidates the collected data and estimated values for training three language models: GPT-3, GPT-4 and DeepSeek-V3. Included are key variables such as the number of parameters, type of GPU used, FLOPs performance, efficiency (PUE), estimated training time, and total energy consumption in MWh. The information was obtained from technical sources and specialized literature, and forms the basis for the energy feasibility analysis developed in the later phases of the project.
Green columns: Variables extracted from reports, articles and other reliable sources.
Yellow columns: Variables obtained from datasheets or reports based on the variables in the green columns.
Blue columns: Variables calculated from the green and yellow columns.
Annex D—Estimation of Energy Consumption by LLM Inference
Table columns: MODEL | TOTAL PARAMETERS (B) | ACTIVE PARAMETERS (B) | HARDWARE | FLOP/s | PUE | FLOPs PER QUERY | GPU TIME PER QUERY (s) | ENERGY CONSUMPTION PER QUERY (Wh).
Table D1. Technical Data and Estimated Energy Consumption per Inference of GPT-4 and DeepSeek-V3. This table presents the estimated values of energy consumption per query for the GPT-4 and DeepSeek-V3 models, as well as the projected daily and annual energy consumption, considering a volume of 10 million queries per day and an average length of 500 tokens per query. Key variables such as model architecture, type of GPU used, energy efficiency (PUE), and the percentage of actual computational resource usage are detailed. These data were adjusted according to the Epoch AI methodology and form the basis for assessing the feasibility of running inference in the selected data centers in Mexico and Brazil.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Annex E—Energy Feasibility Assessment for LLM Training and Inference
Table columns: MODEL, plus DAILY CONSUMPTION (MWh), feasibility and adjusted training times for each center (Querétaro and Mérida).
Table E1. Feasibility and Adjusted Training Times of LLMs in Data Centers in Mexico (KIO Networks). This table presents the energy feasibility assessment and the estimate of the adjusted time required to train the GPT-3, GPT-4 and DeepSeek-V3 models in the Querétaro and Mérida data centers operated by KIO Networks. Variables such as available IT capacity, IT power required, daily energy consumption and estimated training time under different operation scenarios (100%, 50% and 30%) are analyzed. This information allows us to determine the technical feasibility of training large-scale models in Mexico.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Table columns, repeated for Tamboré and São Paulo: MODEL | PUE | DAILY CONSUMPTION (MWh) | FEASIBILITY | DAYS (100%) | DAYS (50%) | DAYS (30%). Rows include GPT-4 (>500B).
Table E2. Feasibility and Adjusted Training Times of LLMs in Brazilian Data Centers (SCALA Data Centers). The table details the results of the feasibility assessment and the adjusted training times of the GPT-3, GPT-4 and DeepSeek-V3 models in the Tamboré and São Paulo data centers operated by SCALA Data Centers. Comparisons between the energy consumption required for training and the IT capacity of the centers, under different operating conditions, are included. The data reflect the energy feasibility of running these processes in the Brazilian context.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.
Table columns: DATA CENTER | BASE ENERGY CONSUMPTION PER QUERY (Wh), plus the adjusted consumption and capacity usage per model. Rows: Querétaro, Mérida, Tamboré, São Paulo.
Table E3. Feasibility of LLMs Inference in Data Centers in Mexico and Brazil. This table presents the feasibility assessment for the inference of the GPT-4 and DeepSeek-V3 models in the four selected data centers. The calculation of the percentage of IT capacity usage, adjusted to the specific PUE of each center, is shown. Based on ISO/IEC 30134-2, viability is classified as high (<5%), moderate (5-10%) or not viable (>10%). This analysis identifies which centers can operate AI inferences efficiently without compromising their infrastructure.
Green columns: Variables extracted from reports, articles and other reliable sources.
Blue columns: Variables calculated from the green columns.