Image-Based Appraisal of Real Estate Properties
IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 19, NO. 12, DECEMBER 2017

Manuscript received March 28, 2016; revised February 26, 2017 and April 18, 2017; accepted May 15, 2017. Date of publication June 1, 2017; date of current version November 15, 2017. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Benoit Huet. (Corresponding author: Quanzeng You.)
Q. You and J. Luo are with the Department of Computer Science, University of Rochester, Rochester, NY 14623 USA (e-mail: [email protected]; [email protected]).
R. Pang is with PayPal, San Jose, CA 95131 USA (e-mail: pangrr89@gmail.com).
L. Cao is with the Electrical Engineering and Computer Sciences Department, Columbia University, New York, NY 10013 USA, and also with customerserviceAI, New York, NY 10013 USA (e-mail: [email protected]).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMM.2017.2710804
1520-9210 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
Authorized licensed use limited to: UNIVERSITY OF ROCHESTER. Downloaded on March 17,2020 at 20:53:35 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

Real estate appraisal, which is the process of estimating the price of real estate properties, is crucial for both buyers and sellers as the basis for negotiation and transaction. Real estate plays a vital role in all aspects of our contemporary society. In a report published by the European Public Real Estate Association (EPRA, http://alturl.com/7snxx), it was shown that real estate in all its forms accounts for nearly 20% of economic activity. Therefore, accurate prediction of real estate prices, or of their trends, helps governments and companies make informed decisions. On the other hand, for most of the working class, housing is one of the largest expenses. A right decision on a house, which heavily depends on their judgement of the value of the property, can help them save money or even profit from their investment in their homes. From this perspective, real estate appraisal is also closely related to people's lives.

Current research from both the real estate industry and academia has reached the conclusion that real estate value is closely related to property infrastructure [1], traffic [2], online user reviews [3] and so on. Generally speaking, there are several different types of appraisal values. In particular, we are interested in the market value, which refers to the trade price in a competitive Walrasian auction setting [4]. Today, people are likely to trade through real estate brokers, who provide easily accessible websites for browsing real estate properties in an interactive and convenient way. Fig. 1 shows an example of a house listing from Realtor (http://www.realtor.com/), which is the largest real estate broker in North America. From the figure, we see that a typical listing for a real estate property presents the infrastructure data of the house in text, along with some pictures of the house. Typically, a buyer will look at those pictures to obtain a general idea of the overall property in a selected area before making his next move.

Fig. 1. Example of homes for sale from Realtor.

Traditionally, both real estate industry professionals and researchers have relied on a number of factors, such as economic index, house age, trade history and neighborhood environment [5], to estimate the price. Indeed, these factors have been shown to be related to the house price, which is quite difficult to estimate and sensitive to many different human activities. Therefore, researchers have devoted much effort in
building a robust house price index [6]–[9]. In addition, quantitative features including Area, Year, Storeys, Rooms and Centre [10], [11] are also employed to build neural network models for estimating house prices. However, pictures, which are probably the most important factor in a buyer's initial decision-making process [12], have been ignored in this process. This is partially due to the fact that visual content is much more difficult for computers to interpret or quantify than it is for human beings.

A picture is worth a thousand words. One advantage of images and videos is that they act like universal languages. People with different backgrounds can easily understand the main content of an image or video. In the real estate industry, pictures can easily tell people exactly what the house looks like, which is in many ways impossible to describe using language. From the given house pictures, people can easily form an overall impression of the house, e.g., what the overall construction style is and what the neighboring environment looks like. These high-level attributes are difficult to describe quantitatively. On the other hand, today's computational infrastructure is also much cheaper and more powerful, which makes computationally intensive visual content analysis feasible. Indeed, there are existing works focusing on the analysis of visual content for tasks such as prediction [13], [14] and online user profiling [15]. Due to recently developed deep learning, computers have become smart enough to interpret visual content in a way similar to human beings.

Recently, deep learning has enabled robust and accurate feature learning, which in turn produces state-of-the-art performance on many computer vision related tasks, e.g., digit recognition [16], [17], image classification [18], [19], aesthetics estimation [20] and scene recognition [21]. These systems suggest that deep learning is very effective in learning robust features in a supervised or unsupervised fashion. Even though deep neural networks may be trapped in local optima [22], [23], with different optimization techniques one can achieve state-of-the-art performance on the many challenging tasks mentioned above.

Inspired by the recent successes of deep learning, in this work we are interested in solving the challenging real estate appraisal problem using deep visual features. In particular, for image-related tasks, Convolutional Neural Networks (CNNs) are widely used because of their convolutional layers. A convolutional layer takes into consideration the locations and neighbors of image pixels, which are important for capturing useful features for visual tasks. Convolutional Neural Networks [18], [19], [24] have proved very powerful in solving computer vision related tasks.

We intend to employ pictures for the task of real estate price estimation. We want to know whether visual features, which are a reflection of a real estate property, can help estimate the real estate price. Intuitively, if visual features can characterize a property in a way similar to human beings, we should be able to quantify the house features using those visual responses. Meanwhile, real estate properties are closely related to their neighborhood. In this work, we develop algorithms which rely only on 1) the neighbor information and 2) the attributes from pictures to estimate real estate property price.

To preserve the local relation among properties, we employ a novel approach that uses random walks to generate house sequences. In building the random walk graph, only the locations of houses are utilized. In this way, the problem of real estate appraisal is transformed into a sequence learning problem. The Recurrent Neural Network (RNN) is designed specifically to solve sequence-related problems. Recently, RNNs have been successfully applied to challenging tasks including machine translation [25], image captioning [26], and speech recognition [27]. Inspired by the success of RNNs, we deploy an RNN to learn regression models on the transformed problem.

The main contributions of our work are as follows.
1) To the best of our knowledge, we are the first to quantify the impact of visual content on real estate price estimation. We attribute the feasibility of our work to newly designed computer vision algorithms, in particular Convolutional Neural Networks (CNNs).
2) We employ random walks to generate house sequences according to the locations of the houses. In this way, we are able to transform the problem into a novel sequence prediction problem, which is able to preserve the relation among houses.
3) We employ Recurrent Neural Networks (RNNs) to predict real estate prices and achieve accurate results.

II. RELATED WORK

Real estate appraisal has been studied by both real estate industry professionals and academic researchers. Earlier work focused on building price indexes for real properties. The seminal work in [6] built a price index according to the repeat prices of the same property at different times. They employed regression analysis to build the price index, which shows good performance. Another widely used regression model, Hedonic regression, is developed on the assumption that the characteristics of a house can predict its price [7], [8]. However, it is argued that the Hedonic regression model requires more assumptions in terms of explaining its target [28]. They also mentioned that, for the repeat sales model, the main problem is lack of data, which may lead to failure of the model. Recent work in [9] employed locations and sale price series to build an autoregressive component. Their model is able to use both single-sale homes and repeat-sales homes, which can offer a more robust sale price index.

More studies have been conducted on employing feed-forward neural networks for real estate appraisal [29]–[32]. However, their results suggest that neural network models are unstable, even when the same package is used across different runs [29]. The performance of neural networks is closely related to the features and data size [32]. Recently, Kontrimas and Verikas [33] empirically studied several different models on 12 selected feature dimensions, e.g., type of the house, size, and construction year. Their results show that linear regression outperforms neural networks on their selected 100 houses.

More recent studies in [1] propose a ranking objective, which takes geographical individual, peer and zone dependencies into
YOU et al.: IMAGE-BASED APPRAISAL OF REAL ESTATE PROPERTIES 2753
A. Random Walks

One main feature of a real estate property is its location. In particular, houses in the same neighborhood tend to have similar extrinsic features, including traffic, schools and so on. We build an undirected graph $G$ for all the houses collected, where each node $v_i$ represents the $i$-th house in our data set. The similarity $s_{ij}$ between house $h_i$ and house $h_j$ is defined using the Gaussian kernel function, which is a widely used similarity measure (see footnote 1):

$$s_{ij} = \exp\left(-\frac{\mathrm{dist}(h_i, h_j)^2}{2\sigma^2}\right) \tag{1}$$

where $\mathrm{dist}(h_i, h_j)$ is the geodesic distance between houses $h_i$ and $h_j$, and $\sigma$ is a hyper-parameter that controls how quickly the similarity decays as the distance increases. In all of our experiments, we set $\sigma$ to 0.5 miles, so that houses within 1.5 miles (that is, within $3\sigma$) have a relatively larger similarity. The $\epsilon$-neighborhood graph [34] is employed to build $G$ in our implementation. We assign the weight of each edge $e_{ij}$ as the similarity $s_{ij}$ between house $h_i$ and house $h_j$.

1 [Online]. Available: http://en.wikipedia.org/wiki/Radial_basis_function_kernel

Given this graph $G$, we can then employ random walks to generate sequences. In particular, each time we randomly choose one node $v_i$ as the root node, and then jump to one of its neighboring nodes $v_j$ with probability proportional to the weights between $v_i$ and its neighbors. The probability of jumping to node $v_j$ is defined as

$$p_j = \frac{e_{ji}}{\sum_{k \in N(i)} e_{ki}} \tag{2}$$

where $N(i)$ is the set of neighbor nodes of $v_i$. We continue this process until we generate a sequence of the desired length. The employment of random walks is mainly motivated by the recently proposed DeepWalk [35], which learns feature representations for graph nodes. It has been shown that random walks can capture the local structure of graphs. In this way, we can keep the local location structure of houses and build sequences for houses in the graph. Algorithm 1 summarizes the detailed steps for generating sequences from a similarity graph.

We have generated sequences by employing random walks. Each sequence contains a number of houses, which are related in terms of their locations. Since we build the graph on top of house locations, the houses within the same sequence are very likely to be close to each other. In other words, the prices of houses in the same sequence are related to each other. We can employ this context for estimating real estate property price, a task that can be solved by the recurrent neural network discussed in the following sections.

B. Recurrent Neural Network

With a Recurrent Neural Network (RNN), we are trying to predict the output sequence $\{y_1, y_2, \ldots, y_T\}$ given the input sequence $\{x_1, x_2, \ldots, x_T\}$. Between the input layer and the output layer, there is a hidden layer, which is usually estimated as in

$$h_t = \Delta(W_{hi} h_{t-1} + W_x x_t + b_h). \tag{3}$$

$\Delta$ represents some selected activation function or other complex architecture employed to process the input $x_t$ and $h_{t-1}$. One of the most widely deployed architectures is the Long Short-Term Memory (LSTM) cell [36], which can overcome the vanishing and exploding gradient problem [37] when training an RNN with gradient descent. Fig. 2 shows the details of a single Long Short-Term Memory (LSTM) block [38]. Each LSTM cell contains an input gate, an output gate and a forget gate, which is also called a memory cell in that it is able to remember the error in
the error propagation stage [39]. In this way, LSTM is more capable of modeling long-range dependencies than conventional RNNs.

Fig. 2. Illustration of a single long short-term memory (LSTM) cell.

For completeness, we give the detailed calculation of $h_t$ given input $x_t$ and $h_{t-1}$ in the following equations. Let $W_{\cdot i}$, $W_{\cdot f}$, $W_{\cdot o}$ represent the parameters related to the input, forget and output gates, respectively. $\odot$ denotes the element-wise multiplication between two vectors. $\phi$ and $\psi$ are some selected activation functions and $\sigma$ is the fixed logistic sigmoid function. Following [27], [38], [40], we employ tanh for both $\phi$ in (6) and $\psi$ in (8):

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i) \tag{4}$$
$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f) \tag{5}$$
$$c_t = f_t \odot c_{t-1} + i_t \odot \phi(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{6}$$
$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o) \tag{7}$$
$$h_t = o_t \odot \psi(c_t). \tag{8}$$

C. Multilayer Bidirectional LSTM

In previous sections, we have discussed the generation of sequences as well as the Recurrent Neural Network. Recall that we have built an undirected graph in generating the sequences, which indicates that the price of one house is related to all the houses in the same sequence, including those in the later part. The Bidirectional Recurrent Neural Network (BRNN) [41] has been proposed to enable the usage of both earlier and future contexts. In a bidirectional recurrent neural network, there is an additional backward hidden layer iterating from the last element of the sequence to the first. The output layer is calculated by employing both the forward and the backward hidden layers.

Bidirectional-LSTM (B-LSTM) is a particular type of BRNN, where each hidden node is calculated by the long short-term memory as shown in Fig. 2. Graves et al. [40] have employed Bidirectional-LSTM for speech recognition. Fig. 3 shows the architecture of the bidirectional recurrent neural network. We have two Bidirectional-LSTM layers. During the forward pass of the network, we calculate the response of both the forward and the backward hidden layers in the 1st-LSTM and 2nd-LSTM layers, respectively. Next, the output for each house (in our problem, the output is the price of each house) is calculated using the output of the 2nd-LSTM layer as input to the output layer.

Fig. 3. Multilayer BRNN architecture for real estate price estimation. There are two bidirectional recurrent layers in this architecture. For real estate price estimation, the price of each house is related to all houses in the same sequence, which is the main motivation to employ bidirectional recurrent layers.

The objective function for training the Multi-Layer Bidirectional LSTM is defined as follows:

$$L(W) = \frac{1}{N} \sum_{i=1}^{N} \sum_{j} \left\| \hat{y}_{ij} - y_{ij} \right\|^2 \tag{9}$$

where $W$ is the set of all the weights between different layers, $y_{ij}$ is the actual trade price for the $j$-th house in the $i$-th generated sequence, and $\hat{y}_{ij}$ is the corresponding estimated price for this house.

When training our Multi-Layer B-LSTM model, we employ the RMSProp [42] optimizer, which is an adaptive method for automatically adjusting the learning rates. In particular, it normalizes the gradients by the average of their recent magnitudes. We conduct the back propagation in a mini-batch approach. Algorithm 2 summarizes the main steps of our proposed algorithm.

D. Prediction

In the prediction stage, the first step is again to generate sequences. We add each testing house as a new node to the similarity graph previously built on the training data. Next, we add edges between the testing nodes and the training nodes, using the same settings as before when adding edges to the new $\epsilon$-neighborhood graph. Given the new graph $G'$, we randomly generate sequences and keep those sequences that contain one and only one testing node. In this way, for each house, we are able to generate many different sequences that contain this house. Fig. 4 shows the idea. Each testing sequence has only one testing house. The remaining nodes in the sequence are the known training houses.

a) Average: The above strategy implies that we are able to build many different sequences for each testing house. To obtain the final prediction price for each testing house, one simple strategy is to average the prediction results from the different sequences and report the average price as the final prediction price.

IV. EXPERIMENTAL RESULTS

In this section, we discuss how to collect data and evaluate the proposed framework as well as several state-of-the-art approaches. In this work, all the data are collected from Realtor (http://www.realtor.com/), which is the largest realtor association in North America. We collect data from San Jose, CA, one of the most active cities in the U.S., and Rochester, NY, one of the least active cities in the U.S., over a period of one year. In the next section, we discuss the details of how to preprocess the data for further experiments.

A. Data Preparation

The data collected from Realtor contain a description, school information and, where available, pictures of each real property, as shown in Fig. 1. We are particularly interested in employing the pictures of each house to conduct the price estimation. We filter out those houses without images from our data set. Since houses located in the same neighborhood seem to have similar prices, the location is another important feature in our data set. However, after an inspection of the data, we notice that some of the house prices are abnormal. Thus, we preprocess the data by filtering out houses with extremely high or low prices compared with their neighborhood.

Table I shows the overall statistics of our dataset after filtering. Overall, the city of San Jose has more houses on the market than Rochester (as expected for one of the hottest markets in the country). The house prices in the two cities also differ significantly. Fig. 5 shows some example house pictures from the two cities. From these pictures, we observe that houses whose prices are above average typically have larger yards and better curb appeal, and vice versa. The same can be observed among house interior pictures (examples not shown due to space).

TABLE I
AVERAGE PRICE PER SQUARE FOOT AND THE STANDARD DEVIATION (STD) OF THE PRICE OF THE TWO STUDIED CITIES

Realtor does not provide the exact geo-location for each house. However, geo-location is important for us to build the $\epsilon$-neighborhood graph for random walks. We employ the Microsoft Bing Map API (https://msdn.microsoft.com/en-us/library/ff701715.aspx) to obtain the latitude and longitude of each house given its collected address. Fig. 6 shows some of the houses in our collected data from San Jose and Rochester using the geo-locations returned by the Bing Map API.

According to these coordinates, we are able to calculate the distance between any pair of houses. In particular, we employ the Vincenty distance (https://en.wikipedia.org/wiki/Vincenty's_formulae) to calculate the geodesic distances according to the coordinates. Fig. 7 shows the distribution of the distance between any pair of houses in our data set. The distance is less than 4 miles for most randomly picked pairs of houses. In building our $\epsilon$-neighborhood graph, we assign an edge between any pair of houses whose distance is smaller than 5 miles ($\epsilon = 5$ miles).
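
As a rough illustration of the graph construction just described, together with the Gaussian edge weights of (1) and the transition rule of (2) from the Random Walks subsection, the following Python sketch builds an ε-neighborhood graph from latitude/longitude pairs and draws one random-walk sequence. It is a minimal sketch under stated assumptions rather than the authors' implementation: the haversine great-circle distance stands in for the Vincenty distance, the weight is assumed to take the standard Gaussian form exp(−d²/(2σ²)), and the names `haversine_miles`, `build_graph` and `random_walk` are hypothetical.

```python
import math
import random

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees.

    Assumption: used here as a simple stand-in for the Vincenty distance.
    """
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def build_graph(coords, eps=5.0, sigma=0.5):
    """Build an undirected epsilon-neighborhood graph as {i: {j: weight}}.

    Two houses are connected when they are closer than `eps` miles; the edge
    weight is the (assumed) Gaussian similarity exp(-d^2 / (2 * sigma^2)).
    """
    graph = {i: {} for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = haversine_miles(coords[i], coords[j])
            if d < eps:
                w = math.exp(-d * d / (2 * sigma ** 2))
                graph[i][j] = w
                graph[j][i] = w
    return graph


def random_walk(graph, root, length, rng=random):
    """Generate one sequence of at most `length` nodes starting at `root`.

    The next node is drawn with probability proportional to the edge weight,
    as in (2). The walk stops early at a node without neighbors.
    """
    walk = [root]
    while len(walk) < length:
        neighbors = graph[walk[-1]]
        if not neighbors:
            break
        nodes = list(neighbors)
        weights = [neighbors[n] for n in nodes]
        walk.append(rng.choices(nodes, weights=weights, k=1)[0])
    return walk
```

For example, after `graph = build_graph(coords)`, a call such as `random_walk(graph, root=0, length=10)` returns a list of at most ten house indices, and a house with no neighbor within ε yields a sequence containing only the root. The nested loop is quadratic in the number of houses, which is acceptable for a sketch but would likely need a spatial index for larger data sets.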
Fig. 5. Examples of house pictures of the two cities, respectively. Top row: houses whose prices (per square foot) are above the average of their neighborhood.
Bottom row: houses whose prices (per square foot) are below the average of their neighborhood. (a) Rochester. (b) San Jose.
Fig. 7. Distribution of distances between different pairs of houses.

2 We also tried max-pooling. However, the results are not as good as average-pooling. In the following experiments, we report the results using average-pooling.
TABLE II
PREDICTION DEVIATION OF DIFFERENT MODELS FROM THE ACTUAL SALE PRICES

City        LASSO            DeepWalk         RNN-best         RNN-avg
            MAE    MAPE      MAE    MAPE      MAE    MAPE      MAE    MAPE
San Jose    70.79  16.92%    68.05  16.12%    17.98  4.58%     66.3   16.11%
Rochester   14.19  24.83%    13.68  23.28%    5.21   9.94%     13.32  22.69%

Note that RNN-best is the upper-bound performance of the RNN based model proposed in this work.
Fig. 8. Performance of B-LSTM-avg in different groups. All the testing houses are grouped by the predicted standard deviation. (a) MAE. (b) MAPE.

$$\mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{t_i - p_i}{t_i} \right| \tag{11}$$
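
The MAPE in (11), and the MAE reported next to it in Fig. 8, can be computed as in the short Python sketch below. The MAE is assumed here to be the standard mean absolute error, (1/N) Σ |t_i − p_i|, where t_i is the true price and p_i the predicted price. The function names `mae` and `mape` are hypothetical and chosen only for illustration.

```python
def mae(true_prices, predicted_prices):
    """Mean absolute error: average of |t_i - p_i| (assumed standard form)."""
    n = len(true_prices)
    return sum(abs(t - p) for t, p in zip(true_prices, predicted_prices)) / n


def mape(true_prices, predicted_prices):
    """Mean absolute percentage error as in (11): average of |(t_i - p_i) / t_i|.

    Returns a fraction; multiply by 100 to express it as a percentage.
    True prices must be non-zero.
    """
    n = len(true_prices)
    return sum(abs((t - p) / t)
               for t, p in zip(true_prices, predicted_prices)) / n
```

With true prices `[100.0, 200.0]` and predictions `[110.0, 180.0]`, the absolute errors are 10 and 20, so the MAE is 15, and both relative errors are 10%, so the MAPE is 0.1.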
We use the same training and testing split to evaluate all the approaches. Table II shows the regression results for all the different approaches in the two selected cities. For each testing house, we generate about 100 sequences. In Table II, we report both the best and the average of the predicted prices. For Rochester, the average standard deviation of the predicted prices over all the houses is 5.6, which is 7.33% of the average price in Rochester (see Table I). Comparably, the average standard deviation for San Jose is 34.64, which is 7.63% of the average price in San Jose. The best is the price closest to the true price among all the available sequences for each house (see footnote 3). Overall, our B-LSTM model outperforms the other two baseline algorithms in both cities. All of the evaluated approaches perform better in San Jose than in Rochester in terms of MAPE. This is possibly due to the availability of more training data in the city of San Jose. DeepWalk shows slightly better performance than LASSO, which suggests that location is relatively more important than the visual features in the realtor business. This is expected

3 This is the upper bound of the prediction results. We choose the closest price using the ground truth price as reference.

D. Confidence Level

For each testing house, the proposed model can give a group of predictions. We want to know whether or not the proposed model can distinguish the confidence level of its prediction. In particular, we group the testing houses evenly into three groups for each city. The first group has the smallest standard deviation of the predicted prices. The second group is the middle one and the last group is the one with the largest standard deviation. Fig. 8 shows the MAE and MAPE for the different groups. The results show that the standard deviation can be viewed as a rough measure of the confidence level of the proposed model on the current testing house. A small standard deviation tends to indicate a high confidence of the model, and overall it also suggests a smaller prediction error.

V. CONCLUSION

In this work, we propose a novel framework for real estate appraisal. In particular, the proposed framework is able to take both the location and the visual attributes into consideration. The evaluation of the proposed model on two selected cities suggests the effectiveness and flexibility of the model. Indeed, our work also offers new approaches for applying deep neural networks to graph-structured data. We hope our model can not only give insights into real estate appraisal, but also inspire others to employ deep neural networks on graph-structured data.
Ran Pang is currently working toward the M.S. degree in computer science at the University of Rochester, Rochester, NY, USA.
He is interested in artificial intelligence and his research focuses on social multimedia and data mining.

Liangliang Cao received the B.E. degree from the University of Science and Technology of China, Hefei, China, in 2003, the M.E. degree from The Chinese University of Hong Kong, Hong Kong, China, in 2005, and the Ph.D. degree from the University of Illinois at Urbana-Champaign, Urbana, IL, USA, in 2011.
He is currently a Senior Research Scientist at Yahoo! Laboratories, Sunnyvale, CA, USA, and an Adjunct Faculty at Columbia University, New York, NY, USA. He has authored or coauthored more than 40 papers in top conferences and journals, including the International Conference on Computer Vision, the Computer Vision and Pattern Recognition Conference, the European Conference on Computer Vision, the Conference on Neural Information Processing Systems, the ACM Multimedia, the International World Wide Web Conference, the IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, and the PROCEEDINGS OF THE IEEE. His research interests include the intersection of computer vision, multimedia, and big data analytics.
Mr. Cao was the General Chair of the Greater New York Area Multimedia and Vision Meeting in 2012 and 2013. He was an Area Chair of WACV 2014 and ACM Multimedia 2012. He was a Guest Editor of the ACM Transactions on Multimedia Computing, Communications, and Applications, and the Computer Vision and Image Understanding journal.

Jiebo Luo (S'93–M'96–SM'99–F'09) received the B.S. and M.S. degrees in electrical engineering from the University of Science and Technology of China, Hefei, China, in 1989 and 1992, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Rochester, Rochester, NY, USA, in 1995.
He joined the University of Rochester, Rochester, NY, USA, in fall 2011, after more than 15 years at Kodak Research Laboratories, Rochester, NY, USA, where he was a Senior Principal Scientist leading research and advanced development.
Prof. Luo is a Fellow of the International Society for Optics and Photonics, and the International Association for Pattern Recognition. He has been involved in numerous technical conferences, and served as the Program Co-Chair of ACM Multimedia 2010 and the IEEE CVPR 2012. He is the Editor-in-Chief of the Journal of Multimedia, and has served on the Editorial Boards of the IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, the IEEE TRANSACTIONS ON MULTIMEDIA, the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, Pattern Recognition, Machine Vision and Applications, and the Journal of Electronic Imaging.