Wind

Machine Learning Model for Wind Load Prediction on Tall Buildings

By Aniket Panchal, Anastasia Athanasiou, and Nenghui (Chris) Lin
September 2, 2025

To view the figures and tables associated with this article, please refer to the flipbook above.

Machine learning is a branch of artificial intelligence that uses data and an underlying algorithm to identify patterns and make predictions without being explicitly programmed to do so. The trained model imitates the underlying relationship between the independent and dependent variables, and it updates that relationship iteratively so that accuracy improves as training proceeds. The last few decades have seen an explosion of machine learning applications, driven by readily available computational resources and by the development of complex models that use multi-layered networks to learn from large, complex datasets. Traditional machine learning models often cannot capture the underlying physics of the governing phenomena. Physics-based machine learning models are now being introduced that integrate physical constraints into the learning process, so that the model adheres to scientific laws while improving predictive accuracy.

Traditional machine learning models are deterministic, i.e., they produce the same output for a given input. In contrast, generative machine learning models, which have recently gained popularity, learn the underlying data distribution and can generate multiple possible outputs for the same input, which enables exploration of a broader range of outcomes. The most popular examples of generative models are in image generation (DALL·E, Sora, Midjourney, etc.) and text generation (ChatGPT, Perplexity). Similarly, in civil engineering, historical data and experimental results can be leveraged to train machine learning models aimed at reducing project costs and timelines. Burton (2021) summarizes various such applications, including improving the empirical relations proposed by design standards, surrogate modeling, and information/feature extraction.

The field of wind engineering has also seen significant development, offering innovative solutions for analyzing and mitigating wind effects on tall buildings (He et al., 2021; Kareem, 2020; Liu et al., 2023). One example is a machine learning model that uses image processing to identify low-velocity areas around a building when assessing the pedestrian-level wind environment. The model was trained on a large dataset of velocity vectors around a building obtained from computational fluid dynamics simulations. Machine learning models have also been extended to predict wind loads on tall buildings (Meddage et al., 2024), where the wind time history at a point on the building surface is predicted from its spatial coordinates. Such emerging paradigms of machine learning offer promising opportunities in civil and structural engineering, particularly for the analysis and modeling of complex wind-structure interactions.

Currently, state-of-the-art structural wind engineering revolves around nonlinear response history simulations of tall buildings subjected to recurring winds of increasing intensity (Athanasiou et al., 2022). Performance-based wind engineering, vital for resilient infrastructure, relies heavily on simulating building response to service- and design-level winds using dynamic wind histories. The wind loads are typically evaluated in dimensionless form through standard atmospheric boundary layer wind tunnel experiments. Such a test typically serves as the benchmark for generating multiple sets of alongwind, crosswind, and torsional wind load histories, which are the input for the nonlinear response history analysis. Monte Carlo simulation, such as the method of Shinozuka (1972), is generally used to generate the wind time histories while ensuring that their energy distribution over frequency (power spectral density) is consistent with that of the benchmark case. The generation of reliable wind loadings is crucial for predicting engineering demand parameters. However, Monte Carlo simulations are computationally intensive and require a skilled user each time wind loads are generated.
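
As a point of reference for the Monte Carlo step, the sketch below generates time histories from a target power spectral density by summing cosines with random phases, in the spirit of the spectral representation approach. It is a minimal illustration under stated assumptions, not the authors' procedure: the function name `simulate_from_psd`, the placeholder spectrum, and the sampling settings are all hypothetical.

```python
import numpy as np

def simulate_from_psd(psd, freqs, t, rng):
    """Return one zero-mean realization x(t) whose one-sided PSD approximates `psd`.

    psd   : target one-sided PSD values S(f_k), one per frequency
    freqs : equally spaced frequencies f_k in Hz (excluding 0)
    t     : time instants in seconds
    rng   : numpy random Generator supplying the random phases
    """
    df = freqs[1] - freqs[0]
    amplitudes = np.sqrt(2.0 * psd * df)                # A_k = sqrt(2 S(f_k) df)
    phases = rng.uniform(0.0, 2.0 * np.pi, freqs.size)  # independent random phases
    args = 2.0 * np.pi * np.outer(freqs, t) + phases[:, None]
    return (amplitudes[:, None] * np.cos(args)).sum(axis=0)

# Placeholder target spectrum (illustrative shape only, not taken from any dataset).
df = 0.01
freqs = df * np.arange(1, 201)           # 0.01 ... 2.00 Hz
psd = 1.0 / (1.0 + (freqs / 0.2) ** 2)

t = 0.1 * np.arange(1000)                # 100 s sampled at 10 Hz
rng = np.random.default_rng(seed=0)
x = simulate_from_psd(psd, freqs, t, rng)

# A new set of phases gives another realization with the same spectral content.
x2 = simulate_from_psd(psd, freqs, t, rng)
```

Because every realization shares the same amplitudes and differs only in its phases, the variance of each record matches the area under the target spectrum, which is the spectral consistency described above.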

To mitigate these drawbacks, this article proposes a framework for generating wind load time histories on a rectangular building with generative machine learning models, incorporating the physics of the phenomenon by constraining the models to match the power spectral densities of the wind loads. The framework is intended to significantly reduce the computational effort and simplify the process for practicing engineers who require a large set of loading time histories for the performance assessment of such buildings.

Framework

Figure 1 summarizes the major steps involved in developing a machine learning model for generating physically meaningful wind time histories. These are:

  • Conduct atmospheric boundary layer wind tunnel experiments on wind-sensitive buildings, varying their Height-Breadth-Depth (H-B-D) ratios, angles of wind incidence, and atmospheric exposure conditions.
  • Preprocess the wind load histories to train the machine learning models.
  • Develop the base machine learning model to learn the time history patterns of the processed data.
  • Develop the generative machine learning model to generate the wind time histories.
  • Combine the base and generative models from the two preceding steps to generate the wind loading time histories of the required duration.

Atmospheric Boundary Layer Wind Tunnel Database

The atmospheric boundary layer wind tunnel experiment database is a collection of wind-induced pressure measurements on isolated rectangular clad buildings with various geometric configurations under simulated boundary layer wind conditions. This is a standard experimental setup that practitioners are advised to use for tall buildings and slender structures to obtain reliable wind loading coefficients. Typically, experiments are performed for a range of angles of wind attack (0°–100°) on the buildings examined. A Simultaneous Multi-Pressure System is employed to measure the wind pressures on each building surface. Tokyo Polytechnic University (TPU) developed an open-access aerodynamic database based on numerous atmospheric boundary layer wind tunnel experiments performed on low- and high-rise buildings in various exposure environments (TPU, https://wind.arch.t-kougei.ac.jp/system/eng/contents/code/tpu). The TPU high-rise building section comprises data from 22 high-rise building models, offering statistical contours of local wind pressure coefficients, graphs of area-averaged wind pressure coefficients on wall surfaces, and time-series data of point wind pressure coefficients for 394 test cases. These data facilitate the calculation of local and area-averaged wind pressures, as well as wind-induced dynamic responses of high-rise buildings. This dataset is used as the benchmark for developing the machine learning model in the present framework. A sample building from the TPU database, showing the pressure tap locations along with the inlet velocity and turbulence intensity profiles, is shown in Figure 2.

Preprocessing of Atmospheric Boundary Layer Wind Tunnel Database

A single atmospheric boundary layer wind tunnel experiment provides wind pressure coefficients at various locations on all the building surfaces (Fig. 2a). These are then used to calculate force coefficient time histories for the alongwind, crosswind, and torsional components by averaging the measured coefficients in the respective directions. The power spectral density function of the force coefficients for a sample building with a width-depth-height ratio of 1:1:3 is shown in Figure 3a for a given height (z = 0.75H). Training a machine learning model directly on such a high-dimensional dataset is typically difficult. To overcome this, the dimensionality of the system is reduced to a few dominant modes using Proper Orthogonal Decomposition (POD; Weiss, 2019). Since the three wind components are independent of each other, the decomposition is performed separately for each component, which helps reduce the model complexity. For each component, two or three modes are typically sufficient to capture 90% of the energy, and these are used to train the models. A sample spectral characteristic of the first mode of each of the wind force coefficients is shown in Figure 3b. The spectral shape of the modal component is nearly identical to that of the force coefficient for all three components, confirming that the POD retains the relevant information.
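
The mode truncation can be illustrated with a short sketch that computes a POD through the singular value decomposition and keeps the smallest number of modes reaching a target energy fraction. The data here are synthetic, and the function name `pod`, the array layout (time along rows, measurement locations along columns), and the 90% threshold argument are assumptions made for illustration.

```python
import numpy as np

def pod(snapshots, energy_target=0.90):
    """POD of a (n_time, n_points) array via the SVD.

    Returns the temporal mean, the retained spatial modes, the modal
    time histories, and the energy fraction carried by every mode.
    """
    mean = snapshots.mean(axis=0)
    fluct = snapshots - mean                       # POD acts on fluctuations
    U, s, Vt = np.linalg.svd(fluct, full_matrices=False)
    energy = s ** 2 / np.sum(s ** 2)               # energy fraction per mode
    n_modes = int(np.searchsorted(np.cumsum(energy), energy_target)) + 1
    n_modes = min(n_modes, s.size)
    coeffs = U[:, :n_modes] * s[:n_modes]          # modal time histories
    modes = Vt[:n_modes]                           # spatial mode shapes
    return mean, modes, coeffs, energy

# Synthetic stand-in for a set of coefficient time histories at 20 locations.
rng = np.random.default_rng(seed=0)
t = np.linspace(0.0, 10.0, 500)
z = np.linspace(0.0, 1.0, 20)
data = (2.0 * np.sin(2 * np.pi * 0.5 * t)[:, None] * np.sin(np.pi * z)
        + 1.0 * np.sin(2 * np.pi * 1.3 * t)[:, None] * np.sin(2 * np.pi * z)
        + 0.05 * rng.standard_normal((t.size, z.size)))

mean, modes, coeffs, energy = pod(data)

# Low-rank reconstruction from the retained modes only.
reconstructed = mean + coeffs @ modes
rel_error = np.linalg.norm(data - reconstructed) / np.linalg.norm(data)
```

In this toy case two modes carry almost all of the energy, so `coeffs` has two columns; those columns play the role of the reduced modal time histories that would be passed on for training.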

The reduced modes of the time histories are first standardized using their respective mean and standard deviation to ensure that each time history contributes equally during the training of the machine learning model. The time histories are then adjusted to remove any trends over time (using successive differencing) and any outliers due to measurement errors. Such operations ensure numerical stability as well as faster convergence of the machine learning models. The final processed time histories are used to train the machine learning models.
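
A minimal sketch of the standardization and differencing steps, together with their inverse, is given below. The helper names `preprocess` and `postprocess` are hypothetical, first-order differencing is assumed, and outlier removal is omitted because the article does not specify how it is done.

```python
import numpy as np

def preprocess(series):
    """Standardize a modal time history, then take successive differences."""
    mu, sigma = series.mean(), series.std()
    z = (series - mu) / sigma
    dz = np.diff(z)                      # removes slow trends
    params = {"mean": mu, "std": sigma, "first": z[0]}
    return dz, params

def postprocess(dz, params):
    """Invert preprocess(): cumulative sum, then undo the standardization."""
    z = np.concatenate(([params["first"]], params["first"] + np.cumsum(dz)))
    return z * params["std"] + params["mean"]

# Toy signal with a linear trend (illustrative only).
rng = np.random.default_rng(seed=1)
n = np.arange(300)
series = 0.01 * n + rng.standard_normal(n.size)

processed, params = preprocess(series)
restored = postprocess(processed, params)
```

Storing the mean, the standard deviation, and the first standardized value is what makes the transformation exactly reversible, which matters later when generated sequences have to be mapped back to physical coefficients.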

Development of Base Machine Learning Model

The processed wind time histories for each component (alongwind, crosswind, and torsional) are first used to train a traditional machine learning model (referred to as the base model). The objective of the base model is to learn the underlying trends in the time histories, updating the model parameters through an iterative process. The most commonly used base models for this purpose are the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Transformer models. The LSTM model developed by Hochreiter and Schmidhuber (1997) is used in the present framework because of its effectiveness in capturing long-term dependencies in time series. The LSTM is a special type of RNN with three key components, (1) the forget gate, (2) the input gate, and (3) the output gate, which together manage the flow of information through time. The model requires a fixed-length sequence of input data points (the “sequence length”) to predict the next output in the time series. This prediction is then used recursively as input for future steps, allowing the model to generate a time series of the required length.
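
The windowing and recursive prediction described above can be sketched independently of the network itself. In the example below, a least-squares linear predictor is used purely as a stand-in for the trained LSTM so that the snippet stays self-contained; the helper names `make_windows` and `rollout`, the sequence length of 20, and the sinusoidal test signal are all assumptions for illustration.

```python
import numpy as np

def make_windows(series, seq_len):
    """Build (input window, next value) training pairs from one time history."""
    n = len(series) - seq_len
    X = np.stack([series[i:i + seq_len] for i in range(n)])
    y = series[seq_len:]
    return X, y

def rollout(predict_next, seed, n_steps):
    """Generate n_steps values by feeding each prediction back as input."""
    window = np.array(seed, dtype=float)
    out = np.empty(n_steps)
    for k in range(n_steps):
        out[k] = predict_next(window)
        window = np.append(window[1:], out[k])   # slide the window forward
    return out

seq_len = 20
series = np.sin(2 * np.pi * 0.05 * np.arange(400))   # toy signal
X, y = make_windows(series, seq_len)

# Stand-in one-step predictor (NOT an LSTM): linear least squares on the windows.
coef = np.linalg.lstsq(X, y, rcond=1e-8)[0]
predict_next = lambda w: float(w @ coef)

generated = rollout(predict_next, series[:seq_len], n_steps=100)
```

Any trained one-step model exposing the same `predict_next(window)` interface could be dropped into `rollout`. Since every prediction is fed back as input, errors can accumulate over long rollouts, which is one reason to evaluate the generated series on its spectral content as well as point by point.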

In addition to the sequence length, other parameters, known as hyperparameters, are required to define the architecture of the LSTM model. These must be specified before training, unlike the model weights and biases, which are learned during training, and they need to be tuned to maximize the model’s performance. Training typically involves minimizing the mean squared error between the predicted and actual time series, with the model parameters updated using a gradient descent algorithm. Additionally, physics-based constraints are incorporated by minimizing the mean squared error between the auto-spectral densities of the predicted and actual time series. The performance of the model is evaluated by comparing not only the auto-spectral densities but also the coherence between the time series signals.
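
One way to write such a combined objective is sketched below with NumPy, using a simple periodogram as the auto-spectral density estimate. The function names, the `spectral_weight` factor that balances the two terms, and the choice of estimator are assumptions; an actual training loop would evaluate the same expression inside an automatic differentiation framework so that gradients are available.

```python
import numpy as np

def auto_spectral_density(x, fs):
    """One-sided periodogram estimate of the auto-spectral density."""
    n = x.size
    spectrum = np.fft.rfft(x - x.mean())
    psd = np.abs(spectrum) ** 2 / (fs * n)
    psd[1:] *= 2.0                 # fold negative frequencies onto positive ones
    if n % 2 == 0:
        psd[-1] /= 2.0             # the Nyquist bin has no negative counterpart
    return np.fft.rfftfreq(n, d=1.0 / fs), psd

def combined_loss(pred, actual, fs, spectral_weight=1.0):
    """Time-domain MSE plus MSE between the auto-spectral densities."""
    time_mse = np.mean((pred - actual) ** 2)
    _, s_pred = auto_spectral_density(pred, fs)
    _, s_actual = auto_spectral_density(actual, fs)
    spectral_mse = np.mean((s_pred - s_actual) ** 2)
    return time_mse + spectral_weight * spectral_mse

# Toy check with synthetic signals (illustrative only).
fs = 10.0
rng = np.random.default_rng(seed=2)
actual = rng.standard_normal(512)
pred = actual + 0.1 * rng.standard_normal(512)

loss_perfect = combined_loss(actual, actual, fs)
loss_noisy = combined_loss(pred, actual, fs)
```

The spectral term penalizes a prediction whose energy is distributed over the wrong frequencies even when its pointwise error is small.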

After training the base model, the initial segment of the time series is given as an input, and the required length of time series is predicted recursively. Once the base machine learning model is developed, a generative model is created to generate random input segments, enabling the generation of random and realistic time histories.

Development of Generative Machine Learning Model

The base model developed operates deterministically, meaning it always produces the same output for a given input segment, which limits the ability to generate diverse or random time history realizations. To address this, a complementary generative model is developed to introduce randomness. This model generates random, physically consistent input sequences, allowing the production of diverse and realistic output sequences.

The most popular generative model is the Generative Adversarial Network (GAN), developed by Goodfellow et al. (2014), which generates realistic data by learning the distribution of the original dataset. A GAN consists of two key components: (1) the generator model and (2) the discriminator model. It is trained through a min-max game between the two: the generator is trained to minimize the error in creating samples, whereas the discriminator is trained to maximize its ability to distinguish between real and generated data.
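
For reference, the min-max game can be written as the value function given by Goodfellow et al. (2014), where D is the discriminator, G is the generator, p_data is the distribution of the real data, and p_z is the noise distribution fed to the generator. Whether the present framework uses this original form or a variant of it is not stated in the article.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator raises V by assigning high probability to real samples and low probability to generated ones, while the generator lowers V by producing samples that the discriminator accepts as real.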

Similar to the base machine learning model, the generator model also requires a set of hyperparameters, which need to be specified before training and fine-tuned to maximize the performance of the GAN. After training, the performance of the generative machine learning model is evaluated by comparing the distribution of the generated data segments with that of the real data segments.
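
The article does not name the measure used for this comparison. As one possible choice, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, i.e., the largest gap between the empirical cumulative distribution functions of two samples. The function name `ks_statistic` and the synthetic samples are assumptions for illustration.

```python
import numpy as np

def ks_statistic(a, b):
    """Largest absolute difference between the empirical CDFs of a and b."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])                  # evaluate at every sample point
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

# Synthetic stand-ins for real and generated segment values.
rng = np.random.default_rng(seed=3)
real = rng.standard_normal(2000)
generated_good = rng.standard_normal(2000)         # same distribution as `real`
generated_poor = rng.standard_normal(2000) + 1.0   # shifted distribution

ks_good = ks_statistic(real, generated_good)
ks_poor = ks_statistic(real, generated_poor)
```

A value near zero indicates that the generated values follow the same distribution as the real ones, while a value near one indicates almost no overlap between the two samples.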

Combining Models to Generate Random Wind Force Coefficient Time Histories

The random wind time histories can be generated with the help of the machine learning models developed in the previous sections through the following steps:

  • Generate a random input segment of the wind time history, CpCg(t), using the GAN model.
  • Use this random input segment as the seed for the LSTM model, which predicts the remainder of the CpCg time history recursively.
  • Apply the inverse of the transformations used during preprocessing to the predicted modal CpCg time histories to recover the original modes.
  • Reconstruct the final wind CpCg time history from the recovered modes using the same POD basis.

The models can be developed for a given set of buildings and angles of attack. An overall conditional model will then be developed to adjust the model parameters based on the following inputs: building geometry (ratio of building dimensions), terrain roughness, and angle of attack.

Conclusions

The present framework proposes an innovative approach to generating wind loads through the application of multiple machine learning models. Until now, most applications in wind engineering have been limited to traditional machine learning models, which lack randomness. This limitation is overcome by combining a generative model with a traditional physics-based machine learning model to generate reliable random outputs. The approach highlights the growing potential of hybrid models that combine the interpretability of physics-based frameworks with the adaptability of generative machine learning models.

This framework could help practicing engineers reduce project cost and time by potentially eliminating the need for a standard atmospheric boundary layer wind tunnel experiment to obtain reliable wind force time histories. It also reduces the computational effort and skill required to perform Monte Carlo simulations when generating multiple realistic, random time histories from experimental data.

Future work may focus on the extension of such a framework to include structures of various geometries, improving the interpretability of models, real-time implementation in structural health monitoring, and prediction of climate change. ■

About the Authors

Aniket Panchal is currently pursuing a Ph.D. at the Indian Institute of Technology Gandhinagar, India, with a focus on multi-hazard assessment of tall buildings under wind and earthquake excitations.

Anastasia Athanasiou (Ph.D.) is an Assistant Professor of Natural Hazards and Structural Resilience at Bauhaus-Universität Weimar, Germany.

Nenghui (Chris) Lin is a Project Engineer at Mott MacDonald and a graduate student at the Department of Computer Science at the University of Texas at Austin. He specializes in end-to-end AI solution design and LLM development.

References

Athanasiou, A., Tirca, L., Stathopoulos, T., 2022. Nonlinear wind and earthquake loads on tall steel braced frame buildings. Journal of Structural Engineering 148(8), 04022098. https://ascelibrary.org/doi/10.1061/%28ASCE%29ST.1943-541X.0003375
Burton, H., 2021. Machine Learning Applications. STRUCTURE.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y., 2014. Generative adversarial nets. Advances in neural information processing systems 27.
He, Y., Liu, X.-H., Zhang, H.-L., Zheng, W., Zhao, F.-Y., Aurel Schnabel, M., Mei, Y., 2021. Hybrid framework for rapid evaluation of wind environment around buildings through parametric design, CFD simulation, image processing and machine learning. Sustainable Cities and Society 73, 103092. https://doi.org/10.1016/j.scs.2021.103092
Hochreiter, S., Schmidhuber, J., 1997. Long Short-Term Memory. Neural Computation 9, 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
Kareem, A., 2020. Emerging frontiers in wind engineering: Computing, stochastics, machine learning and beyond. Journal of Wind Engineering and Industrial Aerodynamics 206, 104320. https://doi.org/10.1016/j.jweia.2020.104320
Liu, Y.J., Fu, J.Y., Tong, B., Liu, Y.H., He, Y.C., 2023. Assessment of approaching wind field for high-rise buildings based on wind pressure records via machine learning techniques. Engineering Structures 280, 115663. https://doi.org/10.1016/j.engstruct.2023.115663
Meddage, D.P.P., Mohotti, D., Wijesooriya, K., 2024. Predicting transient wind loads on tall buildings in three-dimensional spatial coordinates using machine learning. Journal of Building Engineering 85, 108725. https://doi.org/10.1016/j.jobe.2024.108725
Ouyang, Z., Spence, S.M.J., 2021. Performance-based wind-induced structural and envelope damage assessment of engineered buildings through nonlinear dynamic analysis. Journal of Wind Engineering and Industrial Aerodynamics 208, 104452. https://doi.org/10.1016/j.jweia.2020.104452
Shinozuka, M., 1972. Monte Carlo solution of structural dynamics. Computers & Structures 2, 855–874. https://doi.org/10.1016/0045-7949(72)90043-0
Weiss, J., 2019. A Tutorial on the Proper Orthogonal Decomposition. https://doi.org/10.14279/DEPOSITONCE-8512