Predicting the Vitality of Stores Along the Street Based on Business Type Sequence via Recurrent Neural Network

Liu, Zidong; Li, Yan; Xiao, Xiao

doi:10.1007/978-981-19-8637-6_29


Part of the book series: Computational Design and Robotic Fabrication (CDRF)

Included in the following conference series:

The International Conference on Computational Design and Robotic Fabrication


Abstract

The rational planning of store types and locations to maximize street vitality is essential in real estate planning. Traditional business planning relies heavily on the subjective experience of developers. Developers currently have access only to low-resolution urban data to support their decisions, and researchers have carried out much image-based machine learning research at the scale of urban texture. However, research on functional layout with shop-level accuracy is still lacking. This paper uses a recurrent neural network (RNN), a sequence-based model, to explore the relationship between the sequence of store types along a street and its commercial vitality. The use of RNNs in the architectural and urban fields is currently very rare. We use customer review data for 80 streets from O2O platforms to represent the degree of store vitality. In the machine learning model, the input is the sequence of store types on the street, and the output is the corresponding sequence of business vitality indexes. After training and evaluation, the model showed acceptable accuracy. We further combined this evaluation model with a genetic algorithm to develop a business planning optimization tool that maximizes the overall business value of the street, thus guiding real estate business planning at a high resolution.

1 Introduction

Business management experience shows that the order in which stores are located along a street has a significant effect on business. Stores at street corners tend to be more popular and therefore command higher rents. People without specific goals are more likely to shop at the first supermarket they see. A store's neighbours may also have a complex impact on its operation, depending on the types of stores involved. For example, two supermarkets located close to each other may fall into vicious competition, whereas placing McDonald's and KFC together may enhance the visibility of both on the street. A bank placed next to a luxury store may help to increase the sales of the luxury store.

1.1 Problem Statement

The current business planning model still relies heavily on the subjective experience of real estate developers, which makes planning results uncertain and adversely affects the profitability of businesses. Data analysis, such as customer base analysis and regional vitality analysis, is increasingly used to support low-resolution decisions like the proportion of store types [5]. However, the resolution of these data is still not sufficient to guide shop-level planning, and research on more precise planning, such as the location sequence of business types along a street, is still lacking. This paper addresses that gap, which gives it direct practical significance.

1.2 Literature Review

There have been many machine learning studies on regional vitality. However, most of them are image-based, with GAN models predominant, and do not achieve store-level accuracy. One study transforms citizens' cycling route data into an urban heat map to represent community vitality and explores its relationship with urban fabric [8]. Similar approaches can be used to predict other urban metrics, such as urban crime rate [3] and commercial value [7]. However, owing to limited computation and data resolution, the generated results always contain ambiguous areas. For this reason, some studies have attempted to vectorize images before performing machine learning [9].

In this study, we choose the RNN as the basic neural network model. RNNs operate on sequential data and are widely used in natural language processing, advertising recommendation and so on. Compared with other models, the features of RNNs are highly compatible with our research object and goal, for the following reasons:

1. RNN uses sequential data as input and output.
2. In RNN models, the order of data has a decisive influence on the results.
3. The input and output in the RNN training set can be of different lengths.

Among the few RNN-based studies in the architectural and urban fields, one is relevant to business optimization [4]. Using the behaviour of pedestrians inside a mall as data, the researchers trained a behavioural predictor that can infer a pedestrian's walking direction. This model in turn guides the design of the mall, leading to higher commercial value along the pedestrian's expected route. In addition, some researchers have applied RNNs from the perspective of software operation. Toulkeridou describes a method to train RNNs to assist in parametric design decisions (Toulkeridou 2019) [1, 2].

1.3 Project Goal

This paper aims to explore the relationship between the order of store business types along a street and their commercial vitality using a recurrent neural network (RNN). The machine learning model simulates the behaviour of people walking down the street and passing stores. In the model, the input is the sequence of store types and the output is the sequence of vitality indexes. After training, the model can predict the vitality of each store, thus guiding real estate business planning at a high resolution.

2 Methodology

The research process is divided into three parts: data collection, model training and model evaluation. We collected data on stores along the streets from O2O platforms, including Gaode Map, Meituan and Dianping, and transformed these data into sequences that represent the types of stores and their sales status. The sequence data are then fed into a seq2seq model built on LSTM layers for training. The model outputs a sequence of letters that represent the vitality levels. Finally, we use the Cross Entropy Loss Function and the Prediction Accuracy Function to evaluate the effectiveness of the prediction model (Fig. 1).
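To make the sequence format concrete, the following is a minimal sketch of how store-type sequences and vitality-letter sequences might be turned into integer ids for a seq2seq model. The special tokens `<pad>`, `<sos>` and `<eos>`, the helper names and the example street are assumptions for illustration; the paper does not describe its vocabulary or encoding.

```python
# Hypothetical special tokens; the paper does not specify its vocabulary.
PAD, SOS, EOS = "<pad>", "<sos>", "<eos>"


def build_vocab(sequences):
    """Map every distinct token to an integer id, reserving ids 0-2 for special tokens."""
    vocab = {PAD: 0, SOS: 1, EOS: 2}
    for seq in sequences:
        for token in seq:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(seq, vocab):
    """Wrap a token sequence in start/end markers and convert it to ids."""
    return [vocab[SOS]] + [vocab[t] for t in seq] + [vocab[EOS]]


# Made-up street: store-type codes as input, vitality letters as output.
store_types = [["CS", "F", "STS", "CS", "B"]]
vitality = [["B", "C", "A", "C", "C"]]

src_vocab = build_vocab(store_types)
tgt_vocab = build_vocab(vitality)
src_ids = encode(store_types[0], src_vocab)
tgt_ids = encode(vitality[0], tgt_vocab)
print(src_ids, tgt_ids)
```

Keeping separate vocabularies for inputs and outputs mirrors the usual seq2seq setup, where the encoder and decoder read different symbol sets.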

After obtaining the prediction model, we use a street outside the training set to verify its effectiveness. Furthermore, we can combine this prediction model with a genetic algorithm to develop a business planning optimization tool: given the input store types, it automatically proposes the ordering that maximizes the business value of the whole street.

2.1 Data Collection

We selected 80 streets in 8 representative cities in China, covering 1261 stores and 29 store types, from O2O platforms (Fig. 2). As the main O2O platforms vary from city to city, and different merchants on the same street might choose different platforms, it is necessary to collate data from multiple mainstream platforms. In this research, the commercial data was collected from Meituan, Dianping and Gaode Map, which gave us data that is as complete as possible for every store on each of the 80 streets. For the tiny number of shops with missing data, we substitute the average of nearby shops of the same type. O2O platforms provide a variety of information: shop type, number of reviews and per capita spending. Some also provide sales volume (some semi-annual, some monthly).
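The substitution of missing values can be sketched as follows. This is only an illustration: it assumes that "nearby" means "on the same street", and the record layout (`street`, `type`, `reviews`) and function name are hypothetical.

```python
from statistics import mean


def fill_missing(shops, field):
    """Replace None in `field` with the mean over same-type shops on the same street.

    Assumption: "nearby" is read as "on the same street"; the paper does not define it.
    """
    filled = []
    for shop in shops:
        if shop[field] is None:
            peers = [
                s[field]
                for s in shops
                if s is not shop
                and s["type"] == shop["type"]
                and s["street"] == shop["street"]
                and s[field] is not None
            ]
            # Leave the gap in place if there is no comparable shop.
            value = mean(peers) if peers else None
            shop = {**shop, field: value}
        filled.append(shop)
    return filled


# Made-up records for illustration.
shops = [
    {"street": "S1", "type": "pastry", "reviews": 100},
    {"street": "S1", "type": "pastry", "reviews": None},
    {"street": "S1", "type": "pastry", "reviews": 300},
    {"street": "S1", "type": "bank", "reviews": 5},
]
filled = fill_missing(shops, "reviews")
print(filled[1]["reviews"])
```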

2.2 Data Processing

Quantitative assessment of business vitality is very complex, since no platform provides direct information on the sales of every shop on a street. Under the assumption that all shops have the same review rate, the number of reviews multiplied by the per capita spend would estimate the sales of each shop. However, we found that the type of shop significantly affects the number of reviews. For example, milk tea shops and fast-food restaurants tend to have very high review rates. In contrast, some support facilities such as banks and bicycle repair points have low review rates, even though their existence can have a significant impact on the surrounding stores.

To provide a more objective assessment of the commercial vitality of shops, we apply a relative approach. For these 1261 shops, we compare the number of reviews multiplied by per capita consumption within each type of shop, and then classify relative vitality into five classes, A to E. For example, there are 75 pastry shops; we rank them by vitality, and the top 10 are ranked A, those ranked 11–25 are ranked B, and so on (Fig. 3). Supporting facilities with few reviews, such as banks, are uniformly assigned vitality class C. Computing the vitality values of the stores on these 80 streets yields some interesting statistics. Shanghai, Nanjing, Wuhan and Suzhou have higher average store vitality than Kunming and Changsha, which is in line with daily experience: store vitality is positively correlated with the economic development of a city (Fig. 4).
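The within-type ranking can be sketched as below. The first two cut-offs follow the pastry-shop example (top 10 and top 25 of 75); the cut-offs for classes C and D are assumptions, because the text only says "and so on". The record fields and the set of support-facility types are likewise hypothetical.

```python
from collections import defaultdict

# Cumulative rank fractions for classes A-D; everything below is E.
# 10/75 and 25/75 come from the pastry-shop example; 50/75 and 65/75 are assumed.
CUTOFFS = (10 / 75, 25 / 75, 50 / 75, 65 / 75)
SUPPORT_TYPES = {"bank"}  # illustrative set of support facilities fixed at class C


def assign_vitality(shops):
    """Return {shop name: class letter}, ranking shops within each store type."""
    by_type = defaultdict(list)
    for shop in shops:
        by_type[shop["type"]].append(shop)

    classes = {}
    for shop_type, group in by_type.items():
        if shop_type in SUPPORT_TYPES:
            for shop in group:
                classes[shop["name"]] = "C"
            continue
        # Sales proxy: number of reviews x per capita spending.
        ranked = sorted(group, key=lambda s: s["reviews"] * s["spend"], reverse=True)
        for rank, shop in enumerate(ranked, start=1):
            fraction = rank / len(ranked)
            letter = "E"
            for cutoff, candidate in zip(CUTOFFS, "ABCD"):
                if fraction <= cutoff:
                    letter = candidate
                    break
            classes[shop["name"]] = letter
    return classes


# 75 made-up pastry shops (p1 has the most reviews) plus one bank.
shops = [
    {"name": f"p{i}", "type": "pastry", "reviews": 1000 - i, "spend": 20}
    for i in range(1, 76)
]
shops.append({"name": "bank1", "type": "bank", "reviews": 3, "spend": 0})
classes = assign_vitality(shops)
print(classes["p10"], classes["p11"], classes["bank1"])
```

Ranking within a type, rather than across all shops, is what removes the review-rate bias between, say, milk tea shops and other store types.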

2.3 Training Set Expansion

The machine learning model simulates the behaviour of people walking down the street and passing stores, which is a one-way experience. However, since either end of the street can be the starting point, every sequence can also be trained in reverse, which expands the dataset from 80 streets to 160 sequences. To expand the sample size further, we extracted all the subsequences longer than five stores that start from the beginning of these 160 sequences (Fig. 5). This is reasonable because in daily shopping we may not walk the whole street but finish after passing several stores. By this method, we obtained a total of 1820 sequences. This way of expanding the dataset is inspired by the research of Weixin Huang's team on modelling operations, which applied a similar subsequence approach [2].
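The two expansion steps can be sketched as follows, assuming each street is stored as a pair of equal-length lists (store types, vitality letters). The function name and the example street are hypothetical; "longer than five" is read as a minimum length of six.

```python
def expand_dataset(streets, min_len=6):
    """Add reversed copies, then collect every prefix with at least `min_len` stores."""
    samples = []
    for types, vitality in streets:
        # Walk the street from either end; the labels are reversed with the inputs.
        for t, v in ((types, vitality), (types[::-1], vitality[::-1])):
            for end in range(min_len, len(t) + 1):
                samples.append((t[:end], v[:end]))
    return samples


# One made-up street with eight stores.
street = (
    ["CS", "F", "STS", "CS", "B", "H", "F", "DH"],
    ["B", "C", "A", "C", "C", "B", "D", "A"],
)
samples = expand_dataset([street])
print(len(samples))
```

A street of eight stores yields prefixes of length six, seven and eight in each direction, so six samples in total.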

2.4 Machine Learning

Training is based on the Seq2Seq attention model (Fig. 1). The data set is divided into training, validation and test sets in the ratio 7:2:1. We evaluate the effectiveness of the model with two functions: the Cross Entropy Loss Function (Eq. 1) and the Prediction Accuracy Function (Eq. 2). The Prediction Accuracy Function is tailored to the specific problem of this paper: the score awarded for a given gap between the predicted and target values varies with the values themselves (Table 1). The accuracy of a random guess, i.e. the sum of all the values in Table 3 divided by 25, is 46.56%.

$$ L = \frac{1}{N}\sum\limits_{i} {L_{i} } = - \frac{1}{N}\sum\limits_{i} {\sum\limits_{c = 1}^{M} {y_{ic} } \log (P_{ic} )} $$

(1)

Table 1. Accuracy calculation table


where $N$ is the number of items, $M$ is the number of categories, $y_{ic}$ is an indicator function (1 if the $i$th item belongs to category $c$, 0 otherwise), and $P_{ic}$ is the predicted probability that the $i$th item belongs to category $c$.
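As a worked illustration of the cross-entropy loss in Eq. 1, the sketch below computes it for two made-up items. Because $y_{ic}$ is 1 only for the true category, the inner sum reduces to the log-probability of the true class. The function name and the example probabilities are assumptions.

```python
import math


def cross_entropy(probs, targets):
    """Mean cross-entropy; probs[i][c] is P_ic and targets[i] is the true class index."""
    total = 0.0
    for p_i, c in zip(probs, targets):
        # With one-hot y_ic, only the true-class term of the inner sum survives.
        total -= math.log(p_i[c])
    return total / len(targets)


# Two made-up items with two categories each.
probs = [[0.5, 0.5], [0.25, 0.75]]
targets = [0, 1]
loss = cross_entropy(probs, targets)
print(round(loss, 4))
```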

$$ P = \frac{1}{{n_{t} }}\sum\limits_{i = 1}^{{\min \left( {n_{t} ,n_{p} } \right)}} {\left( {1 - \frac{{\Delta r_{i} }}{{\max \left( {R - r_{it} ,r_{it} - 1} \right)}}} \right)} \times 100\% ,\quad \Delta r_{i} = \left| {r_{ip} - r_{it} } \right| $$

(2)

where $R$ is the range of vitality levels, $n_{t}$ is the target sequence length, $n_{p}$ is the predicted sequence length, $r_{it}$ is the target vitality of the $i$th term, and $r_{ip}$ is the predicted vitality of the $i$th term.
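The accuracy function in Eq. 2 can be sketched as follows. The sketch assumes that the vitality letters map to levels 1 to 5 (E = 1 up to A = 5), that $R = 5$, and that the denominator uses the target level of the $i$th item; the function name and the example sequences are hypothetical.

```python
# Assumed numeric levels for the vitality letters (E lowest, A highest).
LEVELS = {"E": 1, "D": 2, "C": 3, "B": 4, "A": 5}


def prediction_accuracy(target, predicted, R=5):
    """Accuracy in percent: each item scores 1 minus its normalized level error."""
    n_t = len(target)
    total = 0.0
    # zip stops at min(n_t, n_p), so missing predictions contribute nothing.
    for t, p in zip(target, predicted):
        r_t, r_p = LEVELS[t], LEVELS[p]
        # Normalize by the largest error possible for this target level.
        total += 1 - abs(r_p - r_t) / max(R - r_t, r_t - 1)
    return 100 * total / n_t


acc = prediction_accuracy(["B", "C", "A"], ["B", "B", "C"])
print(round(acc, 2))
```

In this example the first item is exact (score 1), the second is off by one level out of a possible two (0.5), and the third is off by two levels out of a possible four (0.5), giving 2/3 overall.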

The training results after 600 epochs, with 15 batches per epoch, are shown in Fig. 6. The model trains well and shows no sign of overfitting: the training and validation loss curves remain stable while the accuracy curve keeps increasing.

3 Case Study

We chose Gongyuan West Street in Nanjing, which is outside the training set, to apply our trained evaluation model. Gongyuan West Street is in the historical centre of Nanjing, with a wide variety of businesses and high popularity. The commercial situation of the site is shown in Fig. 7.

The store types on Gongyuan West Street were input into the trained model, and the output vitality prediction was “b c b c b c b c b c b c b c b c b c b c”, with an accuracy of 77% according to Eq. 2 (Table 1). Experiment 1 adds a movie theatre at the beginning of the street, and the model predicts higher street vitality (Fig. 8). Experiment 2 groups stores of the same kind together, and the model again predicts higher overall street vitality (Fig. 9 and Table 2).

Table 2. Accuracy calculation table


3.1 Vitality Optimization Based on Genetic Algorithm

Further, we combined this evaluation model with a genetic algorithm to develop a reference tool that suggests how to optimize the location of stores. The vitality levels correspond to numeric scores: A scores 5, B scores 4, C scores 3, D scores 2, and E scores 1. The genetic algorithm takes the total vitality score of the street as the optimization target. At each iteration, it randomly swaps two store locations. Through continued iteration, it approaches the arrangement that this prediction model scores highest.
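The optimization loop described above can be sketched as a simple mutation-only genetic algorithm with swap mutation and elitist selection. The trained RNN is not available here, so `toy_predictor` is a hypothetical stand-in that merely rewards stores whose neighbours are of a different type; the population size, generation count and selection scheme are also assumptions rather than the settings used in the paper.

```python
import random

SCORES = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}


def toy_predictor(stores):
    """Hypothetical stand-in for the trained model: 'A' if both neighbours differ."""
    out = []
    for i, s in enumerate(stores):
        neighbours = stores[max(i - 1, 0):i] + stores[i + 1:i + 2]
        out.append("A" if s not in neighbours else "D")
    return out


def fitness(stores, predict):
    """Total vitality score of the street under the given predictor."""
    return sum(SCORES[v] for v in predict(stores))


def swap_mutation(stores, rng):
    """Return a copy of the street with two randomly chosen stores swapped."""
    child = list(stores)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def optimise(stores, predict, generations=200, pop_size=20, seed=0):
    """Keep the better half each generation and refill it with swap mutants."""
    rng = random.Random(seed)
    population = [list(stores)]
    population += [swap_mutation(stores, rng) for _ in range(pop_size - 1)]
    for _ in range(generations):
        population.sort(key=lambda s: fitness(s, predict), reverse=True)
        survivors = population[: pop_size // 2]
        children = [
            swap_mutation(rng.choice(survivors), rng)
            for _ in range(pop_size - len(survivors))
        ]
        population = survivors + children
    return max(population, key=lambda s: fitness(s, predict))


start = ["CS", "CS", "F", "F", "B", "B"]
best = optimise(start, toy_predictor)
print(best, fitness(best, toy_predictor))
```

Because only swaps are applied, every candidate is a permutation of the input stores, so the tool rearranges a given set of stores rather than changing which stores are present.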

After hundreds of iterations, the system found a solution with a high vitality index: “CS F STS CS AS JS IS STS B HC B DH B H F FAFR”. The vitality prediction for this sequence is “B C B A A A A A A A A A A A A A A A A”, with a score of 76. Figure 10 records one such evolutionary process.

4 Conclusion

This paper presents a method that uses machine learning to predict commercial vitality along streets and to provide optimization advice, which has practical value for high-precision business planning. Although there have been many machine learning studies based on urban texture images, few predict vitality at the level of individual stores. In contrast to previous studies, this study interprets people's walking and shopping behaviour in the street as a linear sequence, and converts POI data collected from O2O platforms into a sequence format to train the RNN model.

This study still has much room for improvement. The accuracy of the current model is not yet high enough. In the data collection stage, a larger data set is needed. Because the required information accuracy is very high (the relative location of each store), automatic POI data collection based on geographic coordinates is not applicable, and we currently collect data manually, store by store along each street. Automated data collection algorithms will have to be developed to replace the manual method and substantially expand the training set. In the data processing stage, the data set contains many noise points, because many factors affect the vitality of real-world stores. In future work, data homogenization algorithms will be used to reduce the effect of noise [6]. In the model training stage, we will compare further sequence models, such as Transformer, GRU and BiLSTM, to determine which is most suitable for this research.

References

1. Gao W et al (2021) A data structure for studying 3D modeling design behavior based on event logs. Autom Constr 132(103967):103967. https://doi.org/10.1016/j.autcon.2021.103967
2. Gao W et al (2022) Command prediction based on early 3D modeling design logs by deep neural networks. Autom Constr 133(104026):104026. https://doi.org/10.1016/j.autcon.2021.104026
3. He J, Zheng H (2021) Prediction of crime rate in urban neighborhoods based on machine learning. Eng Appl Artif Intell 106(104460):104460. https://doi.org/10.1016/j.engappai.2021.104460
4. Karoji G, Hotta K, Hotta A, Ikeda Y (2019) Pedestrian dynamic behaviour modeling. In: Proceedings of the 24th international conference on computer-aided architectural design research in Asia: intelligent and informed, CAADRIA 2019, pp 281–290
5. Schlegel A, Birkel HS, Hartmann E (2021) Enabling integrated business planning through big data analytics: a case study on sales and operations planning. Int J Phys Distrib Logist Manag 51(6):607–633. https://doi.org/10.1108/ijpdlm-05-2019-0156
6. Sheng Q et al (2018) The application of space syntax modeling in data-based urban design: an example of Chaoyang square renewal in Jilin city. Lands Archit Front 6(2):102. https://doi.org/10.15302/j-laf-20180211
7. Shou X, Chen P, Zheng H (2021) Predicting the heat map of street vendors from pedestrian flow through machine learning. In: Proceedings of the 26th international conference on computer-aided architectural design research in Asia: projections, CAADRIA 2021, pp 569–578
8. Sun YJ, Jiang L, Zheng H (2021) A machine learning method of predicting behavior vitality via urban forms. In: Proceedings of the 40th international conference on computer aided design in architecture: distributed proximities, ACADIA 2021, pp 160–168
9. Xia X, Tong Z (2020) A machine learning-based method for predicting urban land use. In: Proceedings of the 25th international conference on computer-aided architectural design research in Asia: anthropocene, CAADRIA 2020, pp 21–30


Author information

Authors and Affiliations

University College London, Gower St., London, WC1E 6BT, UK
Zidong Liu
University of Sydney, Camperdown, NSW, 2006, Australia
Yan Li
Politecnico di Torino, Corso Duca degli Abruzzi, 24, 10129, Torino, TO, Italy
Xiao Xiao

Corresponding author

Correspondence to Xiao Xiao.

Editor information

Editors and Affiliations

College of Architecture and Urban Planning, Tongji University, Shanghai, China
Philip F. Yuan
College of Architecture and Urban Planning, Tongji University, Shanghai, China
Hua Chai
College of Architecture and Urban Planning, Tongji University, Shanghai, China
Chao Yan
College of Architecture and Urban Planning, Tongji University, Shanghai, China
Keke Li
College of Architecture and Urban Planning, Tongji University, Shanghai, China
Tongyue Sun

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this paper

Cite this paper

Liu, Z., Li, Y., Xiao, X. (2023). Predicting the Vitality of Stores Along the Street Based on Business Type Sequence via Recurrent Neural Network. In: Yuan, P.F., Chai, H., Yan, C., Li, K., Sun, T. (eds) Hybrid Intelligence. CDRF 2022. Computational Design and Robotic Fabrication. Springer, Singapore. https://doi.org/10.1007/978-981-19-8637-6_29

DOI: https://doi.org/10.1007/978-981-19-8637-6_29
Published: 04 April 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8636-9
Online ISBN: 978-981-19-8637-6
eBook Packages: Intelligent Technologies and Robotics (R0)
