1 Introduction

1.1 Background

Most urban dwellers now suffer from urban heat, driven both by systematic changes in climate, such as warmer summers, and by the increasing severity of extreme events, such as heat waves [1]. A considerable number of studies have shown that the number of summer heat-stress days experienced by Chinese urban residents has increased year by year over the past half century, causing significant urban health problems, especially for the elderly. Research also shows that high urban temperatures have a significant negative impact on the birth rate and on the safety of pregnancy [2]. According to the Lancet report, there were an estimated 24,966 heatwave-related deaths in 2021 [3]. In addition, heat-related labor loss indirectly resulted in a loss of 1.68% of gross domestic product (GDP) in 2021 [3].

In the face of the growing urban heat problem, the number of government publications has risen each year in recent years. In 2021 alone, the number of papers related to climate health increased by 3.7 times compared with the average annual number of papers issued in the past decades [3]. Academia has also paid close attention to urban climate comfort, especially outdoor thermal comfort in hot summer climates. Over the past century, various models and metrics for thermal environment evaluation have been proposed for the study of the urban environment and thermal comfort. Among them, the UTCI model, which is based on the human heat exchange mechanism and combined with a clothing model, integrates a variety of climate elements such as temperature, humidity and wind speed. Because it generalizes across scales, regions and climates, it has become the mainstream evaluation index of outdoor thermal comfort today.

However, outdoor thermal comfort at the human scale has long been neglected in urban construction [1], especially in traditional neighborhood spaces, where it is difficult to consider at the beginning of design. Yet urban neighborhood spaces are essential for residents because they provide spaces for daily activities, and thermal comfort, as measured by the Universal Thermal Climate Index (UTCI), influences the quality of these spaces: good thermal comfort leads to more lingering and interactive activities, promoting healthy travel and improving quality of life [6].

Therefore, the problem addressed in this study is how to conduct a large-scale, fine-grained evaluation of outdoor environmental comfort in urban traditional neighborhood spaces efficiently and rapidly.

1.2 Research Overview on UTCI in Outdoor Environment

In general, research on urban-level outdoor UTCI is still in its initial stage in China. It focuses mostly on macroscopic urban space, with relatively little research at the microscopic human scale. Current research on thermal environmental comfort in urban space can be summarized from three aspects: research themes, research methods, and experimental means.

  1. In terms of research topics, the main areas are: the spatial distribution and temporal trends of UTCI; evaluation of the applicability of UTCI and attempts to correct the index; performance-driven design with UTCI as the goal; and the factors influencing outdoor thermal comfort;

  2. In terms of research methods, there are: descriptive statistics of computational results; analysis of spatial distribution patterns and temporal trends based on GIS; and modeling studies based on machine learning [6];

  3. In terms of experimental methods, there are three main types of research. The first builds a laboratory for human perception research [4]; this allows small-sample experiments that are more accurate and make variables easier to control, but it is limited by the data samples and data sources. The second collects climate data (temperature, humidity, wind speed, etc.) through on-site measurement and then calculates UTCI values with the help of software; this approach is usually limited by its high cost in time and money. The third uses simulation software to simulate a virtual model of the site numerically and then validates the results in the field; this approach is limited by the long running time of the simulation.

1.3 Research Gaps

Although current research on outdoor thermal comfort is extensive, knowledge gaps remain, mainly as follows:

  1. Data sparsity is common: because the number of weather stations is limited, spatial climate characteristics at the meso- and micro-scale cannot be captured; most studies use mid-term reanalysis data from climate websites, whose accuracy and reliability are difficult to ensure;

  2. Calculating outdoor thermal comfort by numerical simulation is inefficient: the model takes a long time to compute the UTCI equivalent temperature, which makes it difficult to evaluate UTCI at both the urban scale and the human scale. In practical UTCI studies, data are mostly obtained from on-site measurement or simulation, which is costly and inefficient;

  3. The relationship between outdoor thermal comfort and built environment elements in microclimate environments is relatively underexplored, and some of the relevant findings are valid only for the sample areas;

  4. Modeling based on small measurement samples lacks diversity, so the findings are difficult to apply at a larger scale. Producing location-specific predictions, rather than probabilistic predictions over entire urban fields, limits their operational utility [7].

1.4 Research Framework

Compared with conventional qualitative methods of urban morphology analysis, the rapidly developing approaches of algorithm-supported data acquisition and machine learning modelling are more efficient and accurate. They ease the problems of under-representation and interference by episodic factors in traditional research methods, and they better model non-linear phenomena that have traditionally been difficult to capture [7]. However, machine learning models that generalize well need sufficient training samples in order to produce accurate predictions.

To address these problems, we train a generative adversarial network (GAN) model to replace numerical simulation; related studies show that using a GAN instead of numerical simulation for UTCI can achieve a speedup of 120–240 times [8]. We propose a Grasshopper-based workflow (Fig. 1) that combines data simulation, augmentation and estimation.

Fig. 1. Workflow of this study

Specifically, we build a typical city model from authoritative mapping data on the Rhino/Grasshopper platform and use the Ladybug and Eddy3D plug-ins to perform human-scale microclimate simulation; Ladybug Tools is then used to calculate UTCI and generate UTCI images. Finally, based on a deep learning framework, we train a GAN model for future city-wide UTCI mapping.

The remainder of this paper is organized as follows. Section 2 reviews related research, including the definition of UTCI, its applications, and research on GANs. Section 3 describes how the required dataset was prepared and how the GAN model was trained. Section 4 presents and discusses the GAN prediction results. Section 5 summarizes the main points of this article and outlines possible applications of the proposed model, as well as its limitations and future directions.

2 Literature Review of Related Research

2.1 The Definition of UTCI and Its Application

The Universal Thermal Climate Index (UTCI), based on the multi-node dynamic thermo-physiological UTCI-Fiala model [9], is used to predict human temperature and regulatory responses for combinations of the prevailing outdoor climate conditions. The UTCI is defined as the air temperature of the reference condition causing the same model response as actual conditions [10], which provides a human-based representation of the environment temperature, covering the whole climate range from heat to cold [11].
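The definition above is commonly written as an offset from air temperature. The notation below is a sketch and the symbol names are chosen here for illustration: $T_a$ is air temperature, $T_{mrt}$ mean radiant temperature, $v_a$ wind speed, and $p_a$ water vapour pressure.

```latex
\mathrm{UTCI}(T_a, T_{mrt}, v_a, p_a) = T_a + \mathrm{Offset}(T_a, T_{mrt}, v_a, p_a)
```

Under the reference condition the offset is zero by construction, so the UTCI equals the air temperature; in actual conditions the offset captures the combined effect of radiation, wind and humidity on the modelled physiological response.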

Compared with physical air temperature alone, UTCI distinguishes more accurately the degree to which the human body perceives cold and heat discomfort, and it has been widely applied in tourism, urban planning, construction, etc., at different scales and in different climate zones [12, 13]. As research has deepened, some recent studies have examined the regional applicability of UTCI [14, 15]. At the same time, research on the factors that affect UTCI is also emerging. In general, these factors include climate factors, urban traffic factors, urban development intensity factors, micro-environmental landscape factors [18], etc.
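As an illustration of how UTCI values are read as degrees of thermal stress, the sketch below maps a UTCI equivalent temperature to the ten-category UTCI assessment scale. The function and variable names are hypothetical, and the treatment of values that fall exactly on a threshold (assigned here to the warmer category) is an assumption.

```python
from bisect import bisect_right

# Thresholds (degrees Celsius) separating the ten UTCI stress categories.
_THRESHOLDS = [-40, -27, -13, 0, 9, 26, 32, 38, 46]
_LABELS = [
    "extreme cold stress",
    "very strong cold stress",
    "strong cold stress",
    "moderate cold stress",
    "slight cold stress",
    "no thermal stress",
    "moderate heat stress",
    "strong heat stress",
    "very strong heat stress",
    "extreme heat stress",
]


def utci_stress_category(utci_celsius):
    """Return the stress category for a UTCI equivalent temperature.

    Assumption: a value exactly on a threshold is placed in the warmer category.
    """
    return _LABELS[bisect_right(_THRESHOLDS, utci_celsius)]


for value in (20.0, 35.0, 50.0):
    print(value, utci_stress_category(value))
```

A lookup of this kind could be applied pixel by pixel to a simulated or predicted UTCI image to highlight overheated areas.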

2.2 Research on GANs

The Generative Adversarial Network (GAN), proposed by Goodfellow in 2014 [18], rapidly created a research boom in deep learning and image generation and has been applied in various research areas. Many variants have since been developed, such as DCGAN, WGAN, StyleGAN, etc. A GAN trains a generator network and a discriminator network: the goal of the generator is to map a random vector to a realistic image, whereas the goal of the discriminator is to distinguish generated images from real ones [19].

Because GANs allow fast numerical generation through image transformation, they have been applied in a growing number of studies, such as residential floor plan generation, building layout generation, garden layout generation, NDVI/NDRE prediction [20], precipitation nowcasting [7] and so on. Among them, the Digital Futures Workshop led by YAO et al. has explored GANs in generative urban design in numerous ways and found that GANs work well as surrogate models for environmental performance [8].
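The adversarial game described above is usually stated as the minimax objective from the original GAN formulation, reproduced here for reference. $G$ is the generator, $D$ the discriminator, $p_{\text{data}}$ the distribution of real images and $p_z$ the distribution of the random input vector.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator maximizes $V$ by assigning high probability to real images and low probability to generated ones, while the generator minimizes it by producing images that the discriminator accepts as real.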

3 Methods

3.1 Model Generation Based on Rhino/Grasshopper Platform

Rhino/Grasshopper, a parametric modeling platform, is a main modeling tool used by architects today and supports rapid model generation. GAN training requires a large amount of data, but building refined urban models is usually a complex process. In addition, previous datasets suffered from inconsistent scales: the real-world extent represented by the input two-dimensional images varied, resulting in inaccurate model predictions. Huang's study [8] proposed a refined method of "Prototype summary-Type derivation" to obtain a large number of city models analogous to the study area in a short period of time. However, the traditional numerical simulation used to build such datasets simulates the environment of independent plots in a wind box, neglecting the correlation between the selected area and its surroundings.

Therefore, unlike Huang's study [8], we take into account the realistic characteristics and contextual relationships of urban scenes. We choose 35 typical tracts (250 m × 250 m) for modeling based on authoritative mapping data, and together the tracts cover diverse building layout forms. Our research area is the traditional historic district within the second ring road of Beijing. We focus on this area for two reasons: on the one hand, historic districts, with their complex morphology, have been relatively little studied in terms of UTCI; on the other hand, the outdoor thermal comfort of a historic district influences the pedestrian spatial experience and can promote the vitality of the district.

3.2 Simulation and Calculation of UTCI Based on Ladybug Tools

Ladybug Tools, a collection of tools for environmental performance simulation on the Rhino/Grasshopper platform, allows the simulation of wind, light, heat and other climate parameters; outdoor comfort was evaluated using microclimatic and energy modelling with OpenFOAM and EnergyPlus, respectively. In this project, the final UTCI values were calculated from the simulated parameters using Ladybug, generating 35 overall UTCI images, one per tract. Following the principle of the convolutional neural network, the 35 tract-level UTCI images were then segmented using windows of different sizes and different strides, so that the image dataset is multi-scale and adjacent samples remain spatially connected. Finally, we obtain a dataset of 4,500 paired images.
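The segmentation step described above can be sketched as a sliding window applied identically to the input layout image and the simulated UTCI image, so that each pair stays aligned. This is a minimal sketch under stated assumptions, not the study's actual script: the function names, the 500 × 500 pixel tract size, and the window and stride values are all hypothetical placeholders.

```python
import numpy as np


def sliding_window_patches(image, window, stride):
    """Cut square patches of side `window`, moving `stride` pixels per step."""
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - window + 1, stride):
        for left in range(0, width - window + 1, stride):
            patches.append(image[top:top + window, left:left + window])
    return patches


def extract_pairs(layout, utci, window, stride):
    """Apply the same windows to both images so each (layout, UTCI) pair is aligned."""
    if layout.shape[:2] != utci.shape[:2]:
        raise ValueError("layout and UTCI images must have the same size")
    return list(zip(sliding_window_patches(layout, window, stride),
                    sliding_window_patches(utci, window, stride)))


# Hypothetical tract rasterized at 500 x 500 pixels (placeholder data).
layout = np.zeros((500, 500), dtype=np.uint8)
utci = np.arange(500 * 500, dtype=np.float32).reshape(500, 500)

pairs = []
# Stride equal to the window gives non-overlapping tiles;
# a smaller stride gives overlapping, spatially connected patches.
for window, stride in [(100, 100), (100, 50)]:
    pairs.extend(extract_pairs(layout, utci, window, stride))

print(len(pairs))
```

Because every patch is cut from a tract-level simulation result, each sample retains the influence of the buildings surrounding it.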

3.3 GAN-Based Image Generation

GANs are used to predict outdoor environmental comfort from the full image information, learning global features rather than the detailed features of each object [8]. Based on the TensorFlow framework, we train a pix2pix adversarial network model for fast prediction of UTCI values, which can effectively reduce the time needed for environmental performance simulation. Pix2pix, one of the GAN models, conducts image-to-image translation with paired training data.

Finally, we perform data augmentation on the dataset: images are translated and cropped in four directions to achieve an 8-fold augmentation, resulting in 36,000 data samples. We then train the pix2pix generative adversarial network, starting from a pre-trained model, based on the TensorFlow framework. We divided the dataset into training, test and validation sets in the ratio 7:1.5:1.5; the model was trained on the training set, its robustness was checked on the validation set, and its performance was evaluated on the test set.
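For reference, the objective from the original pix2pix formulation combines a conditional adversarial loss with an L1 reconstruction term. In the notation below, $x$ would be the input layout image, $y$ the simulated UTCI image and $z$ the noise input; this mapping to the present dataset is an interpretation, and the weight $\lambda$ used in this study is not reported (the pix2pix paper uses $\lambda = 100$).

```latex
\mathcal{L}_{cGAN}(G, D) =
  \mathbb{E}_{x, y}\left[\log D(x, y)\right]
  + \mathbb{E}_{x, z}\left[\log\left(1 - D(x, G(x, z))\right)\right]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x, y, z}\left[\lVert y - G(x, z) \rVert_1\right]

G^{*} = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The L1 term ties each generated image to its paired target, which is why pix2pix requires paired training data.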
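The augmentation and split can be sketched as follows. The exact augmentation recipe is not specified in the text, so the version below is only one plausible reading and is labeled as an assumption: each pair is shifted in four directions at two hypothetical distances (8 variants per pair) and cropped back to its original size. All function names and pixel offsets are illustrative.

```python
import numpy as np


def shift_and_crop(image, dy, dx):
    """Shift `image` by (dy, dx) pixels, keep the original size, pad with edge values."""
    height, width = image.shape[:2]
    pad = max(abs(dy), abs(dx))
    pad_width = [(pad, pad), (pad, pad)] + [(0, 0)] * (image.ndim - 2)
    padded = np.pad(image, pad_width, mode="edge")
    top, left = pad - dy, pad - dx
    return padded[top:top + height, left:left + width]


def augment_pair(layout, utci, distances=(8, 16)):
    """Assumed recipe: up/down/left/right shifts at two distances, 8 variants per pair."""
    variants = []
    for d in distances:
        for dy, dx in [(-d, 0), (d, 0), (0, -d), (0, d)]:
            # The same shift is applied to both images so the pair stays aligned.
            variants.append((shift_and_crop(layout, dy, dx),
                             shift_and_crop(utci, dy, dx)))
    return variants


def split_indices(n_samples, ratios=(0.7, 0.15, 0.15), seed=0):
    """Shuffle sample indices and split them into training, test and validation sets."""
    order = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(round(ratios[0] * n_samples))
    n_test = int(round(ratios[1] * n_samples))
    return (order[:n_train],
            order[n_train:n_train + n_test],
            order[n_train + n_test:])


train_idx, test_idx, val_idx = split_indices(36000)
print(len(train_idx), len(test_idx), len(val_idx))
```

With 36,000 samples, a 7:1.5:1.5 split gives 25,200 training, 5,400 test and 5,400 validation samples. One design choice worth considering is to split at the level of the original patches before augmenting, so that shifted copies of the same patch do not end up in different sets.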

4 Results and Discussion

Figure 2 shows that the training of the pix2pix model gradually converges as the number of training iterations increases, and the adversarial game between the discriminator and the generator can be seen in the curves. The generator loss increases slightly in the initial stage; between 280 K and 600 K iterations it fluctuates around 0.308, and it increases thereafter. The discriminator loss gradually decreases as the number of iterations increases; the model gradually converges after about 280 K iterations, with a discriminator loss of around 0.3, which decreases further after 400 K iterations. Overall, the model began to converge at about 280 K iterations and showed signs of overfitting after 600 K iterations. An iteration count of about 300 K therefore appears appropriate, and we reset the total number of iterations to 400 K for model training. Training was performed on an RTX 3090 GPU and took about 12.5 h.

Fig. 2. Training loss curves of generator and discriminator

Figure 3 shows the output of the model on the test set. The predicted images largely meet the performance and fineness requirements of the project, and the GAN model captures the overall relationship between building layout and UTCI and its structural pattern well. The model predicts UTCI especially well for enclosed building compounds in the selected area, particularly on the north side of buildings and in larger courtyards. However, its predictions need to be improved for highly dense and overly complex building layouts.

Fig. 3. Examples of model performance on test set

To further assess the effectiveness of the pix2pix model, we trained a CycleGAN model on the same dataset. The dataset was prepared in the same way as above, and training took a total of 20.4 h on an RTX 3090 GPU. The model eventually converged after 100 epochs, and its predictions on the test set are shown in Fig. 4. Overall, compared with the pix2pix model, the CycleGAN results show a certain gap in detail, which suggests that the strictly paired image-to-image translation of pix2pix performs better for this task.

Overall, the pix2pix model captures the overall semantic information behind the UTCI maps to a high degree. Although this study was limited by time and computing power and no further iterations were run, the model has converged well so far, and stopping at this point also reduces the risk of overfitting caused by over-training.
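One way to read this comparison is through the loss terms. CycleGAN was designed for unpaired data and, in its original formulation, relies on a cycle-consistency term rather than a direct comparison with the paired target. In the sketch below, $G$ maps layouts to UTCI images and $F$ maps UTCI images back to layouts; this assignment of roles is an interpretation for the present dataset.

```latex
\mathcal{L}_{cyc}(G, F) =
  \mathbb{E}_{x}\left[\lVert F(G(x)) - x \rVert_1\right]
  + \mathbb{E}_{y}\left[\lVert G(F(y)) - y \rVert_1\right]
```

Because this term only requires that a translated image can be mapped back to its source, it constrains the generated UTCI image less directly than the pixel-wise L1 term of pix2pix, which may partly explain the gap in detail.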

Fig. 4. Examples of CycleGAN model performance on test set

5 Conclusions

With the support of new urban science and technology, bottom-up, human-centered research on street quality has become key to fine-grained urban governance. In this paper, we propose an approach based on a generative adversarial network (GAN) to predict UTCI in traditional blocks. A total of 36,000 data samples were obtained from the simulations to train a pix2pix model based on the TensorFlow framework. After more than 300 thousand iterations, the model gradually converges, with the loss gradually decreasing as the number of iterations increases. The results indicate that the pix2pix model captures the relationship between the architectural form of historic city blocks and UTCI well. With the help of this model, we can quickly predict fine-scale UTCI at the urban block scale in Beijing. On the one hand, this allows us to identify overheated and uncomfortable areas in the historic city and formulate more accurate policies. On the other hand, data mining can be used to explore the relationship between UTCI and other urban factors. Overall, the model, which learns from data to depict non-linear relationships between input parameters and output metrics, captures the overall semantic information behind the UTCI maps to a high degree.

Compared with other studies on the use of GANs in built environments, the contributions of this article are as follows. Firstly, the training dataset is built by modeling the historic city from official data. Secondly, each sample is cut from a large-scale simulation result image, so each sample takes into account the influence of the surrounding environment. Thirdly, the samples are cut using simple segmentation and sliding windows to ensure the continuity of the dataset. To ensure the consistency of the input data, each image represents a 50 m × 50 m block; the model built in this study is therefore limited to prediction at a 50 m scale.

The drawbacks of this study are also clear: the model training process lacks a correction mechanism, and without testing against data actually collected in the field there is still a risk of overfitting. These issues are the focus of our next research, which includes controlling the model training process, improving the data types, and correcting the model based on measured data. Moreover, if computing power allows, we can check whether the convergence of the model fluctuates under more training iterations.

The key to future research lies in model evaluation. In addition to the Inception Score and FID (Fréchet Inception Distance), the next step is to construct a scale that can be easily understood in terms of subjective experience.
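For reference, FID compares the feature statistics of real and generated images under a Gaussian assumption. In the standard definition below, $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of Inception-network features for the real images (here, the simulated UTCI images) and the generated images, respectively.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2\left(\Sigma_r \Sigma_g\right)^{1/2}\right)
```

Lower values indicate that the generated images are statistically closer to the reference images; the score does not by itself indicate whether individual UTCI values are physically accurate, which is one motivation for a complementary, experience-based scale.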