Understanding and Visualizing Generative Adversarial Networks
1. Introduction
data, especially for output data whose format is similar to, or the same as, that of the
input data. Given training data in pairs, the program figures out the
most suitable parameters in the network, so that the discriminator (D) has the
smallest possible chance of distinguishing the generated data (G) from the original data.
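This adversarial objective, introduced by Goodfellow et al. (2014), can be sketched as a single training step. The following is a minimal PyTorch sketch, not the exact PIX2PIXHD training code; the function and variable names, the optimizers, and the use of binary cross-entropy are illustrative assumptions:

```python
import torch
import torch.nn as nn

bce = nn.BCELoss()

def train_step(G, D, x, y_real, opt_G, opt_D):
    """One adversarial step. x: input data, y_real: its paired target."""
    y_fake = G(x)

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    opt_D.zero_grad()
    d_real = D(y_real)
    d_fake = D(y_fake.detach())          # detach: no gradient flows into G here
    loss_D = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    loss_D.backward()
    opt_D.step()

    # Generator step: push D(fake) toward 1, i.e. try to fool D.
    opt_G.zero_grad()
    d_fake = D(y_fake)
    loss_G = bce(d_fake, torch.ones_like(d_fake))
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```

Repeating this step over the training pairs drives G and D against each other, which is what "adversarial" refers to.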
Then, Wang et al. (2017) built a refined network called PIX2PIXHD (Figure 1) for generating and evaluating 2D image data. The input image is translated into three 2D matrices, based on its width, height, and RGB channels. The matrices then pass through 5 groups of convolution layers, each containing one convolution layer, one batch normalization layer, and one ReLU layer; then 9 groups of residual network layers, each consisting of two sets of ReflectionPad2d-Conv2d-InstanceNorm2d-ReLU layers; and finally 5 groups of deconvolution layers, each containing one deconvolution layer, one batch normalization layer, and one ReLU (or Tanh) layer.
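The three kinds of blocks described above can be sketched in PyTorch as follows. This is a minimal sketch: the kernel sizes, strides, and channel counts are illustrative assumptions, not the exact PIX2PIXHD hyperparameters:

```python
import torch.nn as nn

def conv_block(c_in, c_out):
    # Downsampling group: Conv2d -> BatchNorm2d -> ReLU.
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True))

class ResBlock(nn.Module):
    # Residual group: two ReflectionPad2d-Conv2d-InstanceNorm2d-ReLU
    # sets, plus a skip connection; size and channels are unchanged.
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(c, c, kernel_size=3),
            nn.InstanceNorm2d(c), nn.ReLU(inplace=True),
            nn.ReflectionPad2d(1), nn.Conv2d(c, c, kernel_size=3),
            nn.InstanceNorm2d(c), nn.ReLU(inplace=True))

    def forward(self, x):
        return x + self.body(x)

def deconv_block(c_in, c_out, last=False):
    # Upsampling group: ConvTranspose2d -> BatchNorm2d -> ReLU,
    # with Tanh instead of ReLU in the final group.
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, kernel_size=3, stride=2,
                           padding=1, output_padding=1),
        nn.BatchNorm2d(c_out),
        nn.Tanh() if last else nn.ReLU(inplace=True))
```

Chaining 5 conv blocks, 9 residual blocks, and 5 deconv blocks in this style yields a generator whose output has the same height, width, and channel count as its input.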
Figure 2. Apartment floor plan drawing (left); labelled image (middle); labelling rule (right).
After training with 100 image pairs, we gave new floor plan drawings to
the network and asked the program to generate predicted labelled images (Figure 3). As figure 3 shows, the network generated a highly similar labelled image, which means it performed well in recognizing architectural drawings.
Figure 3. Apartment floor plan drawing (left); generated labelled image (middle); original
labelled image (right).
2. Working principles
In order to reveal how the PIX2PIXHD network learns image pairs, this chapter
analyses all three parts of the network and explains why they work well in processing image data.
Since the batch normalization layer and the ReLU layer do not build connections between pixels, the convolution layer acts as the main calculation rule, extracting and mixing the features of an image.
As figure 4 shows, a convolution kernel is a 3 × 3 (or larger) matrix. When
we input a 5 × 5 matrix, the kernel slides to each corresponding position, multiplies and sums up 9 numbers, and finally outputs a new 3 × 3 matrix. Generally speaking, a convolution kernel is a feature extractor, turning a matrix into a
smaller but refined new matrix. A convolution layer usually contains hundreds
of kernels, to make sure all features are captured in the layer. The numbers in the kernels are the parameters, which the program figures out by machine learning.
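The sliding multiply-and-sum described above can be written out directly. A minimal NumPy sketch of a "valid" (no-padding) convolution, in which a 3 × 3 kernel turns a 5 × 5 input into a 3 × 3 output; the function name and the identity-kernel example are illustrative:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image with no padding: at each position,
    multiply the overlapping numbers element-wise and sum them."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1          # output height
    ow = image.shape[1] - kw + 1          # output width
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

image = np.arange(25, dtype=float).reshape(5, 5)   # a 5 x 5 input matrix
kernel = np.zeros((3, 3))
kernel[1, 1] = 1.0                                 # identity kernel
result = conv2d_valid(image, kernel)               # 3 x 3 output
# With the identity kernel, result equals the central 3 x 3 of the image.
```

Replacing the identity kernel with, say, a horizontal-gradient kernel would instead extract edge features, which is exactly the "feature extractor" role described above.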
to the original size, while using the features to generate data similar to the
second image in each image pair. Considering the length of this article, the
reversed matrix operation will not be elaborated.
Next comes the ResNet layer, which does not change the image size or the
number of channels, but further shifts the combination of features. Last, as
figure 7 shows, the deconvolution layer enlarges the image and decreases the
number of channels to match the original image.
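The enlarging behaviour of the deconvolution layer can be illustrated with PyTorch's ConvTranspose2d: where a stride-2 convolution halves the height and width, a stride-2 transposed convolution doubles them. The channel counts and sizes below are illustrative assumptions, not PIX2PIXHD's exact values:

```python
import torch
import torch.nn as nn

# Transposed ("de-") convolution: doubles height and width at stride 2,
# and here also reduces a deep 64-channel feature map to 3 RGB channels.
up = nn.ConvTranspose2d(in_channels=64, out_channels=3,
                        kernel_size=3, stride=2,
                        padding=1, output_padding=1)

feat = torch.randn(1, 64, 32, 32)   # deep feature map: 64 channels, 32 x 32
img = up(feat)
print(img.shape)                    # torch.Size([1, 3, 64, 64])
```

Stacking several such layers walks the representation back from many small, abstract feature channels to one full-size 3-channel image.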
4. Conclusion
Based on Generative Adversarial Networks, PIX2PIXHD is a powerful machine learning tool for recognizing and generating architectural drawings. The features in the drawings become more concise as the network goes deeper, and
clearer as the number of training epochs increases. This may be an inspiring parallel to the learning process of human beings, noting that we learn from concrete
entities to abstract concepts, and from fuzzy cognition to accurate judgement.
In the future, therefore, Generative Adversarial Networks can not only be used for
generating images, but may also have the potential for self-designing art or architectural works.
Acknowledgements
I would like to express my gratitude to Prof. Weixin Huang from Tsinghua University, who supervised
me in this research, and Yuming Lin, Lijing Yang, Chenglin Wu, Zhijia Chen, and Xia Su for
providing labelled image data and advice.
References
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio,
Y. (2014). Generative adversarial nets. In Advances in neural information processing sys-
tems (pp. 2672-2680).
Wang, T. C., Liu, M. Y., Zhu, J. Y., Tao, A., Kautz, J., & Catanzaro, B. (2017). High-resolution
image synthesis and semantic manipulation with conditional GANs. arXiv preprint arXiv:1711.11585.