1 - DHS - IEEE - Deep - Air

The document proposes a Deep Air Learning (DAL) approach to solve the interpolation, prediction, and feature analysis of fine-grained air quality data in a single model. DAL embeds feature selection and semi-supervised learning in different layers of a deep learning network. This utilizes unlabeled spatio-temporal data to improve interpolation and prediction performance, and performs feature selection and association analysis to identify the main factors influencing air quality variations. The approach is evaluated on real data from Beijing, China, and is shown to outperform other methods for these air quality tasks.


IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING

Deep Air Learning: Interpolation, Prediction, and Feature Analysis of Fine-grained Air Quality

Zhongang Qi, Tianchun Wang, Guojie Song, Weisong Hu, Xi Li∗, Zhongfei (Mark) Zhang

Abstract—The interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban
air computing. Solutions to these topics can provide extremely useful information to support air pollution control, and consequently
generate great societal and technical impacts. Most existing work solves the three problems separately with different models. In
this paper, we propose a general and effective approach, called Deep Air Learning (DAL), that solves the three problems in one model.
The main idea of DAL lies in embedding feature selection and semi-supervised learning in different layers of the deep learning network.
The proposed approach utilizes the information pertaining to the unlabeled spatio-temporal data to improve the performance of
interpolation and prediction, and performs feature selection and association analysis to reveal the main features relevant to the
variation of air quality. We evaluate our approach with extensive experiments based on real data sources obtained in Beijing, China.
Experiments show that DAL is superior to the peer models from the recent literature when solving the topics of interpolation, prediction,
and feature analysis of fine-grained air quality.

Index Terms—Feature Selection, Feature Analysis, Spatio-temporal Semi-supervised Learning, Deep Learning.
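
The abstract states the main idea only at a high level: feature selection and semi-supervised learning are embedded in different layers of one network. The following is a minimal sketch of what such a combined objective could look like, not the formulation used by DAL, which this page does not give. It assumes a one-hidden-layer network, a group-sparsity penalty on the first-layer weights as the feature selection term, and a smoothness penalty between spatio-temporal neighbors as the term that uses unlabeled samples. All names (`forward`, `feature_group_norms`, `objective`, `lam_select`, `lam_smooth`) and the toy data are hypothetical.

```python
import math

def forward(x, W1, w2):
    """One hidden tanh layer followed by a linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(v * h for v, h in zip(w2, hidden))

def feature_group_norms(W1):
    """L2 norm of the first-layer weights attached to each input feature.

    A norm driven to zero means the feature no longer influences the network,
    which is how a group-sparsity penalty acts as feature selection.
    """
    n_features = len(W1[0])
    return [math.sqrt(sum(row[j] ** 2 for row in W1)) for j in range(n_features)]

def objective(X, y, neighbors, W1, w2, lam_select=0.1, lam_smooth=0.5):
    """Assumed objective: supervised loss + smoothness term + sparsity term.

    y[i] is None for unlabeled samples (e.g. a location without a station,
    or a period when the monitor device produced no output).
    neighbors is a list of (i, j) index pairs that are close in space or time.
    """
    preds = [forward(x, W1, w2) for x in X]
    # Squared error over labeled samples only.
    labeled = [(p, t) for p, t in zip(preds, y) if t is not None]
    supervised = sum((p - t) ** 2 for p, t in labeled) / max(len(labeled), 1)
    # Neighboring samples, labeled or not, should get similar predictions.
    smooth = sum((preds[i] - preds[j]) ** 2 for i, j in neighbors) / max(len(neighbors), 1)
    # Group-sparsity penalty: one group per input feature.
    select = sum(feature_group_norms(W1))
    return supervised + lam_smooth * smooth + lam_select * select

# Toy usage: three samples, only the first one is labeled.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, None, None]
neighbors = [(0, 2), (1, 2)]
W1 = [[0.0, 0.0], [0.0, 0.0]]
w2 = [1.0, 1.0]
print(objective(X, y, neighbors, W1, w2))  # prints 4.0
```

In this sketch the unlabeled samples affect training only through the smoothness term, and the per-feature group norms can be inspected after training to see which inputs the network still relies on.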

1 INTRODUCTION

THE interpolation, prediction, and feature analysis of fine-grained air quality are three important topics in the area of urban air computing. A good interpolation solves the problem that there are limited air-quality-monitor-stations whose distribution is uneven in a city; a precise prediction provides valuable information to protect humans from being damaged by air pollution; a reasonable feature analysis reveals the main factors relevant to the variation of air quality. In general, the solutions to these topics can extract extremely useful information to support air pollution control, and consequently generate great societal and technical impacts.

However, there exist several challenges for urban air computing, as the related data have some special characteristics. First, since there are insufficient air-quality-monitor-stations in a city due to the high cost of building and maintaining such a station, it is expensive to obtain labeled training samples when dealing with fine-grained air quality. Second, the labeled data of the air-quality-monitor-stations are incomplete, and many labels are missing from the historical data in some time periods for some stations. The reason for the incomplete labels is related to the air quality monitor devices. In general, each station has only one monitor device, which needs to be maintained at intervals; thus there will be no outputs for the station when the device is being maintained, recalibrated, or has other problems. Third, the kinds of urban air related data are various, owing to the development of data acquisition technologies. However, there is no universally accepted judgment that reveals the main causes of the occurrence and dissipation of air pollution, especially the pollution of PM2.5. Hence, it is hard to know what kinds of data are the main relevant features for interpolation and prediction, and what the key factors are for environment departments to prevent and control air pollution.

This paper is motivated to address all these challenges by utilizing the information contained in the unlabeled data and the spatio-temporal data, and performing feature selection and association analysis for the urban air related data. Though labeled data are difficult or expensive to obtain, large amounts of unlabeled examples can often be gathered cheaply. In general, unlabeled data can help in providing information to better exploit the geometric structure of the data. Moreover, most of the urban air related data contain both space and time information. In Figure 1, (a) and (b) show totally different observations in each place with a long time interval¹; each row in (c) shows continuous air quality changing observed in one place²; (d), (e), and (f) show observations at different spatial locations of Beijing

• Z. Qi is with the School of Electrical Engineering and Computer Science, Oregon State University, 1148 Kelley Engineering Center, Corvallis, OR 97331-5501, USA.
E-mail: [email protected]
• T. Wang is with the School of Information Systems, Singapore Management University, 178902, Singapore.
E-mail: [email protected]
• G. Song is with the Key Laboratory of Machine Perception (Ministry of Education), Peking University, Beijing 100871, China.
E-mail: [email protected]
• W. Hu is with the NEC Laboratories China, 11F Bldg. A, Innovation Plaza, Tsinghua Science Park, Haidian District, Beijing 100084, China.
E-mail: hu [email protected]
• X. Li∗ (corresponding author) is with the College of Computer Science and Technology, Zhejiang University, No. 38, Zheda Road, Hangzhou 310027, China.
E-mail: [email protected]
• Z. Zhang is with Computer Science Department, Watson School, State University of New York, Binghamton, NY 13902-6000, USA.
E-mail: [email protected]

1. from: http://www.guardian.co.uk/
2. from: http://www.cma.gov.cn/

1041-4347 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
