2.1. Study Area
Excluding Hong Kong, Macao and Taiwan province, China's major economic regions can be divided in several ways. Beyond their geographical differences, these regions broadly reflect overall differences in social and economic development. Since the beginning of China's reform and opening up, the natural geographical advantages of southeast China, together with a series of supportive policies, have driven rapid economic development and made it the most economically developed region in the country. The Yangtze River Delta and Pearl River Delta economic zones are the most active regions in China's economic development, and the southeast hosts both Shanghai, China's economic center and a national metropolis, and Shenzhen, a special economic zone. Therefore, this paper chooses the economically and culturally developed southeast region as the research area (Figure 1). The region includes Jiangsu, Shanghai, Zhejiang, Fujian, Jiangxi, Anhui, Guangdong, Guangxi and Hainan (excluding Sansha City) (Table 1).
Four kinds of data were used in this study, as shown in Table 2: NPP/VIIRS 2015 annual data [25], MODIS normalized difference vegetation index (NDVI) data, administrative boundaries, and socioeconomic statistics of provinces, municipalities and prefecture-level cities.
2.3. Light Index Model
Considering that the study area is mainland China and that land cover types and vegetation coverage differ greatly among areas, two types of indices were selected: the NTL index without other restrictions, and the improved impervious surface index (IISI), in which weak-light areas are enhanced by a logarithm. They can be represented by the following formulas:
Here, NTL stands for the preprocessed VIIRS-DNB data, and DNB is the value of the corresponding pixel.
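$$NTL_{nor} = \frac{NTL - NTL_{min}}{NTL_{max} - NTL_{min}} \qquad (2)$$
where $NTL_{nor}$ represents the normalized VIIRS-DNB image value, which ranges from 0 to 1, and $NTL_{min}$ and $NTL_{max}$ are the minimum and maximum values, which were 0 and 200, respectively.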
where the input is the normalized VIIRS-DNB value. The logarithmic model and its coefficient guarantee an index value between 0 and 1. This index both suppresses extremely large values and enhances small ones.
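The normalization and the logarithmic stretch described above can be illustrated with a minimal sketch. The min-max normalization uses the stated bounds of 0 and 200. The logarithmic form `log_enhance` and its coefficient `k` are assumptions chosen only because they map [0, 1] onto [0, 1]; they are not the paper's Equation (3). The function names and sample values are hypothetical.

```python
import numpy as np

def normalize_ntl(ntl, vmin=0.0, vmax=200.0):
    """Min-max normalize preprocessed VIIRS-DNB values to [0, 1]."""
    ntl = np.asarray(ntl, dtype=float)
    scaled = (ntl - vmin) / (vmax - vmin)
    # Clip in case a pixel falls outside the assumed [vmin, vmax] range.
    return np.clip(scaled, 0.0, 1.0)

def log_enhance(ntl_nor, k=9.0):
    """Assumed logarithmic stretch (NOT the paper's Equation (3)).

    ln(1 + k*x) / ln(1 + k) is concave, equals 0 at x = 0 and 1 at x = 1,
    so it lifts small values and compresses large ones.
    """
    ntl_nor = np.asarray(ntl_nor, dtype=float)
    return np.log1p(k * ntl_nor) / np.log1p(k)

# Illustrative pixel values only, not data from the study.
ntl = np.array([0.0, 2.0, 20.0, 100.0, 200.0])
ntl_nor = normalize_ntl(ntl)
enhanced = log_enhance(ntl_nor)
```

Under this assumed form, a weakly lit pixel gains proportionally more than a bright one, which is the qualitative behavior the text attributes to IISI.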
Vegetation is negatively correlated with human activities: in the city center, the intensity of human nonagricultural activities is high and vegetation coverage is generally low, whereas in rural areas the intensity of nonagricultural activities is low and vegetation coverage is high. The NTL light intensity decreases gradually from the urban center to the outskirts, while the NDVI shows an almost opposite trend. After NTL is normalized, its square root is taken to enhance the light intensity.
Owing to the nature of the square root function, the closer the normalized NTL value is to 1 (that is, the closer the pixel is to the city center), the weaker the enhancement, and the closer the value is to 0 (the farther the pixel is from the city center), the stronger the enhancement. The square root of the normalized NTL is equal to or less than NDVI mainly in the rural-urban transition area and in the countryside far from the city center, so the case in which it equals NDVI is taken as the reference, and the adjustment coefficient of NTL is set to 1 there so that the light value of the lit area is retained. When this enhanced light value is greater than NDVI, a larger difference indicates a greater light intensity and a location closer to the active light area. When it is less than NDVI, a larger difference indicates a weaker light intensity. Therefore, when the enhanced light value is larger than NDVI, the adjustment coefficient of NTL is greater than 1, which enhances the spatial differences of NTL within the region. When it is less than NDVI, the adjustment coefficient of NTL is set to be less than 1 in order to enhance the difference in light intensity between suburban and rural areas and the city center. Based on this idea, the light index is given by Formula (4):
where VHNI is the improved light index. Theoretically, when the square root of the normalized NTL is equal to NDVI, VHNI remains equal to NTL. When it is larger than NDVI, the maximum theoretical value of VHNI is two times NTL. When it is less than NDVI, the minimum theoretical value of VHNI is 0.5 times NTL. Because the NTL data are preprocessed, the calculated VHNI will not contain infinite or infinitesimal outliers.
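The comparison that drives the adjustment coefficient can be sketched as follows. This minimal example only computes the square-root enhancement and reports whether the coefficient would be greater than, equal to, or less than 1; it does not implement Formula (4) itself. The function name and the sample pixel values are hypothetical.

```python
import numpy as np

def adjustment_regime(ntl_nor, ndvi):
    """Return +1, 0 or -1 where the NTL coefficient is >1, =1 or <1.

    The sign comes from comparing sqrt(normalized NTL) with NDVI,
    as described in the text; the coefficient itself is not computed.
    """
    enhanced = np.sqrt(np.asarray(ntl_nor, dtype=float))
    diff = enhanced - np.asarray(ndvi, dtype=float)
    return np.sign(diff).astype(int)

# Illustrative pixels: city center, transition area, countryside.
ntl_nor = np.array([0.81, 0.25, 0.04])
ndvi = np.array([0.10, 0.50, 0.70])

# Relative gain of the square root: larger for weakly lit pixels.
gain = np.sqrt(ntl_nor) / ntl_nor
regime = adjustment_regime(ntl_nor, ndvi)
```

For these sample pixels, the bright center falls in the "greater than 1" regime, the transition pixel sits at the reference case, and the rural pixel falls in the "less than 1" regime.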
The NTL model is the preprocessed light data. Because noise removal and the elimination of extremely high values do not affect the distribution characteristics of the DN values of the light data, normalizing NTL does not distort them either. Therefore, both NTL and the normalized NTL can be defined as unrestricted light indices.
The IISI index follows Formulas (2) and (3). Normalization of the light data only rescales the DN values into the interval [0, 1]. However, the IISI index adopts a logarithmic model. Because the logarithm is a concave function, the dependent variable grows more slowly as the independent variable approaches the upper end of its domain, so the DN values of the light data are constrained and their original distribution characteristics change. Therefore, IISI is defined as a single-light restricted light index.
VHNI adds the vegetation information represented by NDVI, and its light input is the NTL model. Its adjustment coefficient is not a simple “non-negative is positive” rule; instead, it highlights either the lit area or the vegetation-covered area by comparing the light value with vegetation. Taking the square root of the normalized NTL turns the original unrestricted light index (NTL or the normalized NTL) into a single-light restricted index. After the vegetation factor is introduced, the adjustment coefficient of NTL becomes more complicated, so VHNI is defined as the light restricted index under vegetation constraint.
First, the nighttime light data and NDVI data of China were obtained by clipping the preprocessed light data and the MODIS NDVI data with the administrative boundary of China. Then, the data were clipped with the administrative boundaries of prefecture-level cities and processed to obtain the light data of the three models, and the TNL values of the three models were computed for each city. The flow chart is shown in Figure 2.
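The per-city summation step can be sketched as a zonal sum. This is a minimal illustration, assuming the city boundaries have already been rasterized onto the same grid as the light index, with one integer id per city and a negative id for pixels outside every city. The function name, array names and values are hypothetical.

```python
import numpy as np

def total_night_light(index_raster, zone_raster, n_zones):
    """Sum the pixel values of a light index within each zone (city)."""
    values = np.asarray(index_raster, dtype=float).ravel()
    zones = np.asarray(zone_raster).ravel()
    valid = zones >= 0  # negative ids mark pixels outside every city
    return np.bincount(zones[valid], weights=values[valid], minlength=n_zones)

# Illustrative 2 x 3 grid with two cities (ids 0 and 1).
index_raster = np.array([[1.0, 2.0, 0.5],
                         [4.0, 3.0, 0.0]])
zone_raster = np.array([[0, 0, 1],
                        [0, 1, -1]])
tnl = total_night_light(index_raster, zone_raster, n_zones=2)
```

The same function would be called once per light index model, so that each city ends up with one TNL value for each of the three models.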
2.4. Regression Model
We chose the four regression models most commonly used in regression analysis to evaluate the potential of nighttime light data for modeling socioeconomic parameters [27,28,29,30,31]: the linear regression model (Equation (5)), the logarithmic regression model (Equation (6)), the exponential regression model (Equation (7)), and the double logarithmic model, which is used when both the dependent and independent variables are small (Equation (8)):
where P represents the socioeconomic parameter (GDP, power consumption or population) and TNL is the total night light intensity of each administrative region (i.e., the sum of all pixel values of the light data within that region), computed separately for each light index model (NTL, IISI, VHNI).
The regression model used in the subsequent analysis was chosen according to the performance of the four models in the overall regression of the different indicators in southeast China.
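The model comparison can be sketched as below. The functional forms are the usual textbook ones and are assumed here, since they are not taken from Equations (5) to (8): linear P = a·TNL + b, logarithmic P = a·ln(TNL) + b, exponential P = a·exp(b·TNL), and double logarithmic ln(P) = a·ln(TNL) + b. The exponential model is fitted on ln(P) for simplicity, which is not the same as nonlinear least squares. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_models(tnl, p):
    """Fit four assumed regression forms and return their R^2 values."""
    tnl = np.asarray(tnl, dtype=float)
    p = np.asarray(p, dtype=float)
    scores = {}
    # Linear: P = a * TNL + b
    a, b = np.polyfit(tnl, p, 1)
    scores["linear"] = r_squared(p, a * tnl + b)
    # Logarithmic: P = a * ln(TNL) + b
    a, b = np.polyfit(np.log(tnl), p, 1)
    scores["logarithmic"] = r_squared(p, a * np.log(tnl) + b)
    # Exponential: P = a * exp(b * TNL), fitted as ln(P) = ln(a) + b * TNL
    slope, ln_a = np.polyfit(tnl, np.log(p), 1)
    scores["exponential"] = r_squared(p, np.exp(ln_a + slope * tnl))
    # Double logarithmic: ln(P) = a * ln(TNL) + b (R^2 is on the log scale)
    a, b = np.polyfit(np.log(tnl), np.log(p), 1)
    scores["double_log"] = r_squared(np.log(p), a * np.log(tnl) + b)
    return scores

# Synthetic, exactly linear data for illustration only.
tnl = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
p = 3.0 * tnl + 5.0
scores = fit_models(tnl, p)
best = max(scores, key=scores.get)
```

Because the double logarithmic R² is computed on the log scale, it is not strictly comparable with the other three; a fuller comparison would back-transform the predictions before scoring.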