Abstract
The rational planning of store types and locations to maximize street vitality is essential in real estate planning. Traditional business planning relies heavily on the subjective experience of developers. Currently, developers have access to low-resolution urban data to support their decision making, and researchers have carried out much image-based machine learning research at the scale of urban texture. However, research on functional layout with shop-level accuracy is still lacking. This paper uses a recurrent neural network (RNN), a sequence-based model, to explore the relationship between the sequence of store types along a street and its commercial vitality. The use of RNNs in the architectural and urban fields is still very rare. We use customer review data for 80 streets from O2O platforms to represent store vitality. In the machine learning model, the input is the sequence of store types on the street, and the output is the corresponding sequence of business vitality indexes. After training and evaluation, the model was shown to have acceptable accuracy. We further combined this evaluation model with a genetic algorithm to develop a business planning optimization tool that maximizes the overall street business value, thus guiding real estate business planning at a high resolution.
1 Introduction
Business management experience shows that the order in which stores are located on a street has a significant effect on business. Stores at a street corner tend to be more popular and therefore command higher rents. People without specific goals are more likely to shop at the first supermarket they see. A store's neighbours may also have a complex impact on its operation, depending on the types of stores involved. For example, two supermarkets located close to each other may compete destructively, but for McDonald's and KFC, being placed together may enhance their visibility on the street. A bank placed next to a luxury store may help to increase the sales of the luxury store.
1.1 Problem Statement
The current business planning model still relies heavily on the subjective experience of real estate developers, which leads to uncertainty in planning results and adversely affects the profitability of businesses. There has been an increasing amount of data analysis, such as customer base analysis and regional vitality analysis, to support low-resolution decisions like the proportion of store types [5]. However, the resolution of these data is still not sufficient to guide shop-level planning, and research on more precise planning, such as the location sequence of business types along a street, is still lacking. This research therefore has strong practical significance.
1.2 Literature Review
There have been many machine learning studies on regional vitality. However, most of them are image-based, with GAN models predominant, and do not achieve store-level accuracy. One study transforms citizens’ cycling route data into an urban heat map to represent community vitality and explores its relationship with urban fabric [8]. Similar approaches can be used to predict other urban metrics, such as urban crime rate [3] and commercial value [7]. However, due to the limitations of computation and data resolution, the generated results always contain ambiguous areas. For this reason, some studies have attempted to vectorize images before performing machine learning [9].
In this study, we choose the RNN as the basic neural network model. RNNs operate on sequential data and are widely used in natural language processing, advertising recommendation and so on. Compared with other models, the features of RNNs are highly compatible with our research object and goal, for the following reasons:
1. RNN uses sequential data as input and output.
2. In RNN models, the order of data has a decisive influence on the results.
3. The input and output in the RNN training set can be of different lengths.
Among the few RNN-based studies in the architectural and urban fields, one is relevant to the topic of business optimization [4]. Using the behaviour of pedestrians inside a mall as data, the researchers trained a behavioural predictor that can infer a pedestrian's walking direction. This model in turn guides the design of the mall, leading to higher commercial value along the pedestrian's expected route. In addition, some researchers have tried to use RNNs from the perspective of software operation. Toulkeridou describes a method to train RNNs to assist in parametric design decisions (Toulkeridou 2019) [1, 2].
1.3 Project Goal
The paper aims to explore the relationship between the order of store business types along a street and their commercial vitality using a recurrent neural network (RNN), a sequence-based model. The machine learning model simulates the behaviour of people walking down the street and passing stores. In the model, the input is the sequence of store types and the output is the sequence of vitality indexes. After training, this machine learning model can predict the vitality of each store, thus guiding real estate business planning at a high resolution.
2 Methodology
The research process is divided into three parts: data collection, model training and model evaluation. We collected data on stores along the streets from O2O platforms including Gaode Map, Meituan and Dianping, and transformed these data into sequences that represent the types of stores and their sales status. The sequence data are then fed into a seq2seq model and trained in its LSTM layers. The model outputs a sequence of letters that represent the vitality levels. Finally, we use a cross-entropy loss function and a prediction accuracy function to evaluate the effectiveness of this prediction model (Fig. 1).
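As a rough illustration of the transformation step, the sketch below turns store-type labels and vitality letters into padded integer sequences of the kind a seq2seq model consumes. The type codes, the helper names `build_vocab` and `encode`, and the use of id 0 for padding are assumptions made for illustration, not the authors' preprocessing.

```python
from typing import Dict, List


def build_vocab(sequences: List[List[str]]) -> Dict[str, int]:
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for seq in sequences:
        for token in seq:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab


def encode(seq: List[str], vocab: Dict[str, int], length: int) -> List[int]:
    """Convert tokens to ids and right-pad with 0 up to `length`."""
    ids = [vocab[t] for t in seq]
    return ids + [0] * (length - len(ids))


# Hypothetical streets: store-type codes and their vitality classes.
store_types = [["CS", "F", "STS", "CS"], ["B", "HC", "B"]]
vitality = [["B", "C", "B", "A"], ["C", "A", "C"]]

type_vocab = build_vocab(store_types)
vital_vocab = build_vocab(vitality)
max_len = max(len(s) for s in store_types)

x = [encode(s, type_vocab, max_len) for s in store_types]  # model input
y = [encode(s, vital_vocab, max_len) for s in vitality]    # model target
```

Each input position lines up with the target at the same position, so one vitality class is predicted per store.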
After obtaining the prediction model, a street outside the training set is used to verify the effectiveness of the model. Furthermore, we can combine this prediction model with a genetic algorithm to develop a business planning optimization tool: it automatically gives the best ranking order based on the input store types to maximize the business value of the whole street.
2.1 Data Collection
We selected 80 streets in 8 representative cities in China, covering 1261 stores of 29 store types, from O2O platforms (Fig. 2). As the main O2O platforms vary from city to city and different merchants on the same street might choose different platforms, it is necessary to collate data from multiple mainstream platforms. In this research, the commercial data was collected from Meituan, Dianping and Gaode Map. In this way, we collected data that are as complete as possible for every store on each of the 80 streets. For the small number of shops with missing data, we substitute the average of nearby shops of the same type. O2O platforms provide a variety of information: shop type, number of reviews and per capita spending. There is also information on sales volume (some semi-annual, some monthly).
2.2 Data Processing
Quantitative assessment of business vitality is very complex since no platform provides direct information on the sales of every shop in the street. If all shops had the same review rate, we could use the number of reviews multiplied by the per capita spending to estimate the sales of each shop. However, we found that the type of shop significantly affects the number of reviews. For example, milk tea shops and fast-food restaurants tend to have very high review rates. In contrast, some support facilities such as banks and bicycle repair points have low review rates, although their existence can have a significant impact on the surrounding stores.
In order to provide a more objective assessment of the commercial vitality of shops, a relative quantity approach is applied here. For these 1261 shops, we compare the number of reviews multiplied by the per capita consumption within each type of shop, and then classify their relative vitality into five classes: A, B, C, D and E. For example, there are 75 pastry shops; we rank them by vitality, so that the top 10 are ranked A, shops 11–25 are ranked B, and so on (Fig. 3). Supporting facilities with few reviews, such as banks, are uniformly assigned vitality class C. After calculating the vitality values of the stores in these 80 streets, we can draw some interesting statistical conclusions. Shanghai, Nanjing, Wuhan and Suzhou have higher average store vitality than Kunming and Changsha, which is in line with daily experience: store vitality is positively correlated with the economic development of a city (Fig. 4).
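The relative ranking described above can be sketched as follows. Only the first two cut-offs (10 of 75 and 25 of 75) come from the pastry-shop example; the remaining cut-offs, the tuple layout, the function name `classify` and the `SUPPORT_TYPES` set are placeholders assumed for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Cumulative rank fractions for classes A..E. Only the A and B values follow
# the pastry-shop example; the C and D values are assumed placeholders.
CUTOFFS = [("A", 10 / 75), ("B", 25 / 75), ("C", 45 / 75), ("D", 62 / 75), ("E", 1.0)]

# Assumed names of supporting facilities that are uniformly assigned class C.
SUPPORT_TYPES = {"bank"}

Shop = Tuple[str, str, int, float]  # (shop_id, shop_type, n_reviews, per_capita_spend)


def classify(shops: List[Shop]) -> Dict[str, str]:
    """Rank shops within each type by reviews x spend and map ranks to A..E."""
    by_type = defaultdict(list)
    labels: Dict[str, str] = {}
    for shop_id, shop_type, n_reviews, spend in shops:
        if shop_type in SUPPORT_TYPES:
            labels[shop_id] = "C"
        else:
            by_type[shop_type].append((n_reviews * spend, shop_id))
    for group in by_type.values():
        group.sort(reverse=True)  # highest estimated sales first
        n = len(group)
        for rank, (_, shop_id) in enumerate(group, start=1):
            # First class whose cumulative share of the ranking covers this rank.
            labels[shop_id] = next(c for c, frac in CUTOFFS if rank <= frac * n + 1e-9)
    return labels
```

Because the ranking is done per shop type, a milk tea shop with many reviews is compared only with other milk tea shops, which is the point of the relative approach.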
2.3 Training Set Expansion
The machine learning model simulates the behaviour of people walking down the street and passing stores, which is a one-way experience. However, since either end of the street can be the starting point, every sequence can also be trained in reverse, so the dataset was expanded from 80 streets to 160. To expand the sample size further, we extracted all the subsequences whose length is greater than five, starting from the beginning of each of these 160 sequences (Fig. 5). This is reasonable because in daily shopping we may not walk the whole street but finish shopping after passing several stores. By this method, we obtained a total of 1820 sequences. This method of expanding the database is inspired by the research of Weixin Huang's team on modelling operations, in which they applied a similar subsequence approach [2].
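A minimal sketch of this expansion, assuming each street is stored as a pair of equal-length lists (store types, vitality classes). The function name `expand` and the data layout are illustrative; "length greater than five" is read here as a minimum prefix length of six.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[List[str], List[str]]  # (store types, vitality classes)


def expand(streets: Sequence[Pair], min_len: int = 6) -> List[Pair]:
    """Add the reversed walk for each street, then keep every prefix of
    length >= min_len taken from the start of each walk."""
    samples: List[Pair] = []
    for types, vitality in streets:
        walks = ((types, vitality), (types[::-1], vitality[::-1]))
        for t, v in walks:
            for end in range(min_len, len(t) + 1):
                samples.append((list(t[:end]), list(v[:end])))
    return samples


# Hypothetical street with 8 stores: 3 prefixes (lengths 6, 7, 8) per direction.
street = (list("abcdefgh"), list("ABCDEABC"))
print(len(expand([street])))  # 6
```

Under this reading, a street with n stores contributes 2 × (n − 5) samples when n is at least six, and none otherwise.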
2.4 Machine Learning
Training is based on the Seq2Seq attention model (Fig. 1). The data set is divided into a training set, a validation set and a test set in the ratio 7:2:1. We evaluate the effectiveness of this model with two functions: the cross-entropy loss function (Eq. 1) and the prediction accuracy function (Eq. 2). The prediction accuracy function is formulated specifically for the problem addressed in this paper. The difference between the predicted value and the target value varies depending on the predicted value (Table 1). The accuracy of a random guess, computed as the sum of all the values in Table 3 divided by 25, is 46.56%.
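The 7:2:1 division can be sketched as below. The function name `split_721`, the shuffling step and the fixed seed are assumptions; the text does not say how samples were assigned to the three subsets.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_721(samples: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle and split into train/validation/test subsets in the ratio 7:2:1."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the boundaries.
    n_train = len(shuffled) * 7 // 10
    n_val = len(shuffled) * 2 // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


train, val, test = split_721(range(1820))
print(len(train), len(val), len(test))  # 1274 364 182
```

With the 1820 sequences reported above, this ratio gives 1274 training, 364 validation and 182 test samples.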
Symbols in Eq. 1: \(M\): number of categories; \(y_{ic}\): sign function (0 or 1); \(P_{ic}\): the predicted probability that the \(i\)th item belongs to category \(c\).
Symbols in Eq. 2: \(R\): range of vitality level; \(n_{t}\): target sequence length; \(n_{p}\): predicted sequence length; \(r_{it}\): target vitality of the \(i\)th term; \(r_{ip}\): predicted vitality of the \(i\)th term.
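For orientation, the usual multi-class cross-entropy written with the symbols \(M\), \(y_{ic}\) and \(P_{ic}\) defined above is given below. \(N\), the number of items, is an assumed symbol that does not appear in the surviving definitions, and the surviving text does not confirm that Eq. 1 takes exactly this form.

\(L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{M} y_{ic}\,\log P_{ic}\)

Since \(y_{ic}\) is 1 only for the true category of item \(i\), each item contributes the negative log of the probability the model assigns to its true vitality class.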
The training results after 600 epochs with 15 batches per epoch are shown in Fig. 6. The model trains well: the training loss and validation loss curves remain stable and the accuracy curve keeps increasing, so no overfitting is observed.
3 Case Study
We chose Gongyuan West Street in Nanjing, which is outside the training set, to apply our trained evaluation model. Gongyuan West Street is in the historical centre of Nanjing, with a wide variety of businesses and high popularity. The commercial situation of the site is shown in Fig. 7.
The types of stores in West Street were input into the trained model, and the output vitality prediction was “b c b c b c b c b c b c b c b c b c b c”, with an accuracy of 77% according to Eq. 2 (Table 1). In Experiment 1, a movie theatre was added at the beginning of the street, and the model predicted higher street vitality (Fig. 8). In Experiment 2, stores of the same kind were arranged together, and the model again predicted higher overall street vitality (Fig. 9 and Table 2).
3.1 Vitality Optimization Based on Genetic Algorithm
Further, we combined this evaluation model with a genetic algorithm to develop a reference tool that can provide suggestions for optimizing the location of stores. The vitality levels are mapped to scores: A scores 5, B scores 4, C scores 3, D scores 2, and E scores 1. The genetic algorithm takes the total vitality score of the street as the optimization target. At each iteration, the genetic algorithm randomly swaps two store locations. Through repeated iterations, the genetic algorithm converges on a store arrangement that the prediction model scores highly.
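The scoring and swap step can be sketched as follows. This is a simplified single-candidate search that keeps a swap only when the score does not drop, not a full genetic algorithm with a population and crossover. `toy_predict` is a hypothetical stand-in for the trained model, and the store codes and function names are illustrative.

```python
import random
from typing import Callable, List, Tuple

SCORE = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}


def street_score(vitality: List[str]) -> int:
    """Total vitality score of a street, using the mapping given in the text."""
    return sum(SCORE[v] for v in vitality)


def optimize(stores: List[str], predict: Callable[[List[str]], List[str]],
             iterations: int = 300, seed: int = 0) -> Tuple[List[str], int]:
    """Repeatedly swap two random store positions, keeping non-worsening swaps."""
    rng = random.Random(seed)
    best = list(stores)
    best_score = street_score(predict(best))
    for _ in range(iterations):
        candidate = best[:]
        i, j = rng.sample(range(len(candidate)), 2)
        candidate[i], candidate[j] = candidate[j], candidate[i]
        score = street_score(predict(candidate))
        if score >= best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_predict(stores: List[str]) -> List[str]:
    """Hypothetical predictor: a store next to a same-type neighbour gets A, else C."""
    out = []
    for k, s in enumerate(stores):
        neighbours = stores[max(k - 1, 0):k] + stores[k + 1:k + 2]
        out.append("A" if s in neighbours else "C")
    return out


start = ["CS", "F", "CS", "F", "B", "B"]
best, score = optimize(start, toy_predict)
```

Swapping positions keeps the set of stores fixed, so the search only changes their order along the street, which matches the planning question posed here.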
After hundreds of iterations, the system did find a solution with a high vitality index: “CS F STS CS AS JS IS STS B HC B DH B H F FAFR”. The vitality prediction for this sequence is: “B C B A A A A A A A A A A A A A A A A” with a score of 76. Figure 10 records an evolutionary process.
4 Conclusion
This paper presents a method that uses machine learning to predict commercial vitality along streets and provide optimization advice. This study has important practical value for high-precision business planning. Although there have been many machine learning studies based on urban texture images, few of them predict vitality at the level of individual stores. Compared with previous studies, this study interprets people’s walking and shopping behaviour in the street as a linear sequence and converts POI data collected from O2O platforms into a sequence format to train the RNN model.
This study still has much room for improvement. The accuracy of the current model is not yet high enough. In the data collection stage, a larger data set is needed. Since the required information accuracy is very high (the relative location of each store), automatic POI data collection based on geographic coordinates is not applicable, and we currently collect data manually, store by store along the street. Automated data collection algorithms will have to be developed to replace the current manual methods and substantially expand the scale of the training set. In the data processing stage, the data set contains many noise points because many factors affect the vitality of real-world stores. In the future, homogenized data algorithms will be used to eliminate the effect of noise [6]. In the model training phase, we will compare other sequence models such as Transformer, GRU and BiLSTM to determine which is more suitable for this research.
References
1. Gao W et al (2021) A data structure for studying 3D modeling design behavior based on event logs. Autom Constr 132(103967):103967. https://doi.org/10.1016/j.autcon.2021.103967
2. Gao W et al (2022) Command prediction based on early 3D modeling design logs by deep neural networks. Autom Constr 133(104026):104026. https://doi.org/10.1016/j.autcon.2021.104026
3. He J, Zheng H (2021) Prediction of crime rate in urban neighborhoods based on machine learning. Eng Appl Artif Intell 106(104460):104460. https://doi.org/10.1016/j.engappai.2021.104460
4. Karoji G, Hotta K, Hotta A, Ikeda Y (2019) Pedestrian dynamic behaviour modeling. In: Proceedings of the 24th international conference on computer-aided architectural design research in Asia: intelligent and informed, CAADRIA 2019, pp 281–290
5. Schlegel A, Birkel HS, Hartmann E (2021) Enabling integrated business planning through big data analytics: a case study on sales and operations planning. Int J Phys Distrib Logist Manag 51(6):607–633. https://doi.org/10.1108/ijpdlm-05-2019-0156
6. Sheng Q et al (2018) The application of space syntax modeling in data-based urban design—an example of Chaoyang square renewal in Jilin city. Lands Archit Front 6(2):102. https://doi.org/10.15302/j-laf-20180211
7. Shou X, Chen P, Zheng H (2021) Predicting the heat map of street vendors from pedestrian flow through machine learning. In: Proceedings of the 26th international conference on computer-aided architectural design research in Asia: projections, CAADRIA 2021, pp 569–578
8. Sun YJ, Jiang L, Zheng H (2021) A machine learning method of predicting behavior vitality via urban forms. In: Proceedings of the 40th international conference on computer aided design in architecture: distributed proximities, ACADIA 2021, pp 160–168
9. Xia X, Tong Z (2020) A machine learning-based method for predicting urban land use. In: Proceedings of the 25th international conference on computer-aided architectural design research in Asia: anthropocene, CAADRIA 2020, pp 21–30
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Liu, Z., Li, Y., Xiao, X. (2023). Predicting the Vitality of Stores Along the Street Based on Business Type Sequence via Recurrent Neural Network. In: Yuan, P.F., Chai, H., Yan, C., Li, K., Sun, T. (eds) Hybrid Intelligence. CDRF 2022. Computational Design and Robotic Fabrication. Springer, Singapore. https://doi.org/10.1007/978-981-19-8637-6_29
DOI: https://doi.org/10.1007/978-981-19-8637-6_29
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8636-9
Online ISBN: 978-981-19-8637-6
eBook Packages: Intelligent Technologies and Robotics (R0)