[Figure 4: layout of the FramebufferUpdate message: Header, Extended header, Payload (encoded stream).
Header fields (byte offsets marked 0, 1, 2, 4, 12, 16): message-type, padding, #-of-rects, x, y, w, h, encoding-type.
Extended header fields (byte offsets marked 0, 2, 4, 6, 8, 12, 16): srcX, srcY, srcW, srcH, codecID, encoded-byte-size.
encoding-type numbers and names: 0 Raw, 1 CopyRect, ..., 21 VideoMode.
codecID values: CODEC_ID_MPEG4, CODEC_ID_MJPEG, CODEC_ID_MPEG2VIDEO, CODEC_ID_MPEG1VIDEO.]
Fig. 4. Header extension for supporting video encoders in the FramebufferUpdate message.
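The header layout of Fig. 4 can be illustrated with a minimal Python sketch. The field widths are inferred from the byte offsets printed in the figure and are assumptions, as are the function names, the choice to set the source rectangle equal to the destination rectangle, and the numeric codecID used in the example; none of this is taken from the authors' implementation.

```python
import struct

# Field widths below are inferred from the byte offsets printed in Fig. 4;
# they are assumptions, not the authors' code. RFB integers are big-endian.
MAIN_HEADER = struct.Struct(">BxHHHHHi")  # message-type, padding, #-of-rects, x, y, w, h, encoding-type
EXT_HEADER = struct.Struct(">HHHHII")     # srcX, srcY, srcW, srcH, codecID, encoded-byte-size

FRAMEBUFFER_UPDATE = 0  # RFB message-type of FramebufferUpdate
VIDEO_MODE = 21         # encoding-type number listed for 'VideoMode' in Fig. 4


def pack_video_update(x, y, w, h, codec_id, payload):
    """Build one FramebufferUpdate carrying a single VideoMode rectangle."""
    main = MAIN_HEADER.pack(FRAMEBUFFER_UPDATE, 1, x, y, w, h, VIDEO_MODE)
    # Assumption: the source rectangle equals the destination rectangle.
    ext = EXT_HEADER.pack(x, y, w, h, codec_id, len(payload))
    return main + ext + payload


def parse_update(data):
    """Parse the first rectangle of a message built by pack_video_update."""
    msg_type, n_rects, x, y, w, h, encoding = MAIN_HEADER.unpack_from(data, 0)
    info = {"message_type": msg_type, "n_rects": n_rects,
            "rect": (x, y, w, h), "encoding_type": encoding}
    if encoding != VIDEO_MODE:
        return info  # legacy encodings (Raw, CopyRect, ...) carry no extended header
    sx, sy, sw, sh, codec_id, size = EXT_HEADER.unpack_from(data, MAIN_HEADER.size)
    start = MAIN_HEADER.size + EXT_HEADER.size
    info.update(src=(sx, sy, sw, sh), codec_id=codec_id,
                payload=data[start:start + size])
    return info


# Hypothetical codecID value; the real numbers come from FFmpeg's codec enum.
message = pack_video_update(0, 0, 320, 240, codec_id=8, payload=b"\x01\x02\x03")
print(parse_update(message)["codec_id"])
```

Because the extended header is read only when encoding-type equals the 'VideoMode' number, rectangles that use the legacy encodings are parsed exactly as before, which is one plausible reading of how the extension stays backward compatible.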
IV. VIDEO CODEC INTEGRATION AND PROTOCOL EXTENSION

A. Video Codec Integration
A mobile VNC server requires fast encoding of screen images because of its strict resource constraints. When screen images are encoded at the server, encoder complexity and compression ratio become important factors that determine overall VNC performance, such as the screen update rate. The encoding method implemented in the original VNC system is run-length encoding. It has low complexity but a poor compression ratio. When gaming or movie playback is shared on a screen, a low compression ratio produces a large bitstream, which causes network congestion and slow screen updates at the client. On the other hand, video encoding techniques such as MPEG achieve a high compression ratio, but their computation increases accordingly because of complex motion estimation and compensation. Encoder complexity at the server is particularly critical for mobile devices. Thus, the method used to encode screen images should be carefully selected in terms of both encoding time and compression efficiency. As mentioned in the previous section, we integrated FFmpeg into our prototype system because it provides a variety of image and video encoders. Among them, we select MPEG2, MPEG4 and motion JPEG as candidate encoding methods, and their performance is reported in the experimental results.

B. Protocol Extension and Improvement
The RFB protocol implementation of ‘droid VNC’ does not support any video encoder for screen image encoding. Integrating video codecs into VNC systems requires us to define a new encoding-type in RFB, along with a mechanism to convey additional information associated with the video encoder. For this, the RFB protocol is extended in a straightforward way while preserving backward compatibility with conventional VNC systems. To support video encoders, we newly define ‘VideoMode’ as an encoding-type, which is specified in the header of the FramebufferUpdate message as shown in Fig. 4. Several kinds of video codecs are available for ‘VideoMode’. To identify the video codec in the video encoding mode, ‘codecID’ is additionally defined in the extended header, which follows the main header. Note that the extended header also contains other information needed for decoding at the client, such as the top-left position of the encoding block and its size (width and height). When the client receives the FramebufferUpdate message, it can decode it successfully with the appropriate decoder by parsing this information.
As described in Section III, VNC is a client-pull system, which is generally appropriate for a thin-client architecture. However, a client-pull system can be a drawback in high-latency network environments. For instance, consider the screen update flow illustrated in Fig. 5 (a). The client waits for the next update message after sending a request to the server, and it remains idle while waiting. Similarly, the server enters an idle state after sending the screen update. This idle state unnecessarily wastes scarce resources on both server and client. To remove this idle time from the conventional VNC system, we modified the protocol operations as shown in Fig. 5 (b). As soon as the server receives the screen request, it immediately sends the previous screen image, which has already been encoded, so the idle time can be reduced. Thus, the client receives screen image data soon after sending the screen request, which allows it to update its screen faster.

V. MODIFIED REGION CODING

A. Motivation
Depending on the application being shared, there may be large regions that do not change between consecutive screen images. Motivated by this observation, we propose modified region coding, which encodes only the modified region, as illustrated in Fig. 6. Note that modified region coding is allowed for MJPEG only; it is not applicable to typical video coding standards such as MPEG and H.264 because of their inter-frame coding.
In the first step of modified region coding, a screen image is segmented into unit rectangles, which are fixed-size blocks. Then, difference detection between the current and previous screen images is performed for each unit rectangle. If all pixel values are identical, the unit rectangle is regarded as a skip block, which does not need to be transmitted. If any difference is detected, the unit rectangle is encoded and transmitted to the client as usual.
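The segmentation into unit rectangles and the skip/code decision can be sketched as follows. This is a minimal illustration, assuming frames are stored as lists of pixel rows and that edge blocks are clipped to the screen; the function names and the clipping rule are hypothetical and do not come from the prototype.

```python
def unit_rectangles(width, height, unit_w, unit_h):
    """Yield (x, y, w, h) for fixed-size blocks covering the screen.

    Assumption: blocks on the right and bottom edges are clipped.
    """
    for y in range(0, height, unit_h):
        for x in range(0, width, unit_w):
            yield x, y, min(unit_w, width - x), min(unit_h, height - y)


def classify_blocks(prev, curr, unit_w, unit_h):
    """Split unit rectangles into code blocks (changed) and skip blocks."""
    height, width = len(curr), len(curr[0])
    code_blocks, skip_blocks = [], []
    for x, y, w, h in unit_rectangles(width, height, unit_w, unit_h):
        # A block is a skip block only if every row segment is identical.
        changed = any(prev[r][x:x + w] != curr[r][x:x + w]
                      for r in range(y, y + h))
        (code_blocks if changed else skip_blocks).append((x, y, w, h))
    return code_blocks, skip_blocks


# Two 8x8 frames that differ in a single pixel at column 5, row 1.
previous = [[0] * 8 for _ in range(8)]
current = [row[:] for row in previous]
current[1][5] = 9
to_encode, to_skip = classify_blocks(previous, current, 4, 4)
print(len(to_encode), "block(s) to encode,", len(to_skip), "skipped")
```

Only the rectangles returned as code blocks would be handed to the MJPEG encoder and transmitted; the skip blocks cost detection time but no encoding or bandwidth.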
1214 IEEE Transactions on Consumer Electronics, Vol. 58, No. 4, November 2012
[Figure 5: two timing diagrams, (a) and (b), of server and client activity along a time axis. Server labels: receiving screen request, framebuffer, CSC, video encoding, sending screen update, idle. Client labels: sending screen request, receiving screen update, video decoding, CSC, display, idle.]
Fig. 5. Protocol improvement from (a) serial operation to (b) parallel one.

In modified region coding, the unit rectangle size may strongly affect the overall encoding performance. If it is too large, performance may instead degrade, because fewer blocks qualify as skip blocks and detection takes longer. On the other hand, if the unit rectangle size is too small, considerable overhead is incurred by the encoder initialization and additional data transmission required for each unit rectangle.
Modified region coding plays a decisive role in fast screen image encoding. However, detecting modified regions requires additional computation, which may make performance worse than that of full region coding. Therefore, we propose a fast modified region detection method in this paper.

[Figure 6: the previous and current framebuffers are compared to find the modified region for the screen update.]
Fig. 6. Modified region detection.

B. Conventional Raster Scan Detection
The conventional modified region detection method simply compares each pixel between the previous and current screens in raster scan order (i.e., from the top-left corner to the bottom-right corner). During the raster scan, if even a single modified pixel is detected, the comparison stops, and the corresponding unit rectangle is regarded as a code block. Then the comparison moves to the next unit rectangle. Such early termination is devised to minimize unnecessary computation. Nevertheless, the closer the modified pixel is to the bottom-right corner along the raster scan order, the more the computation load increases, and it does so linearly. In the worst case, if the modified pixel is located at the bottom-right corner, the algorithm must carry out the comparison for all pixels in the unit rectangle. This is a severe overhead. The larger the unit rectangle, the more vulnerable the algorithm is to this phenomenon. Motivated by this observation, we propose a hierarchical region detection algorithm.

C. Proposed Hierarchical Region Detection
In modified region coding, precise prediction of the modified pixel location is crucial for reducing the time consumed in detecting modified regions. However, methods such as motion estimation, as used in video coding, require far more computation than modified region coding saves. Thus, instead of predicting the modified pixel location, we hierarchically determine the pixel locations to compare. For this, we propose two different methods, which are described in the remainder of this subsection.

1. Hierarchical region detection algorithm
We first propose a hierarchical 3-step region detection algorithm.
In ‘step 1’, as shown in Fig. 7 (a), the unit rectangle is down-sampled by a factor of 4 in both the horizontal and vertical directions. Then, pixel comparison on the low-resolution version is done in raster scan order. If a modified pixel is detected in ‘step 1’, the remaining ‘step 2’ and ‘step 3’ are skipped, and
H.-Y. Ko et al.: Implementation and Evaluation of Fast Mobile VNC Systems 1215
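[Figure 7: three panels, (a), (b) and (c), each showing a unit rectangle with its width w and height h marked, along with fractional labels (denominators 2 and 4) on w and h.]
Fig. 7. Hierarchical 3 step detection method; (a) Step 1, (b) Step 2, and (c) Step 3.

[Figure 9: test setup with the labels Server, Server AP, Client and Client AP.]
Fig. 9. Experiments over our prototype system. (The game images courtesy of Olivestudio, EBS and Dream Search.)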
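To make the cost difference concrete, the sketch below contrasts raster scan detection with early termination against ‘step 1’ of the hierarchical method of Fig. 7 (a), counting pixel comparisons for one unit rectangle. It assumes that down-sampling by 4 means visiting every fourth pixel in each direction, which the text does not specify; steps 2 and 3 are not implemented because their details are not given here, and the function names are hypothetical.

```python
def raster_scan_changed(prev, curr, rect):
    """Full raster scan with early termination.

    Returns (changed, number_of_pixel_comparisons).
    """
    x0, y0, w, h = rect
    comparisons = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            comparisons += 1
            if prev[y][x] != curr[y][x]:
                return True, comparisons  # stop at the first modified pixel
    return False, comparisons


def step1_changed(prev, curr, rect, factor=4):
    """'Step 1' only: raster scan over a sub-sampled grid.

    Assumption: down-sampling keeps every `factor`-th pixel per direction.
    """
    x0, y0, w, h = rect
    comparisons = 0
    for y in range(y0, y0 + h, factor):
        for x in range(x0, x0 + w, factor):
            comparisons += 1
            if prev[y][x] != curr[y][x]:
                return True, comparisons
    return False, comparisons


# One 16x16 unit rectangle with a single modified pixel at (12, 12).
size = 16
previous = [[0] * size for _ in range(size)]
current = [row[:] for row in previous]
current[12][12] = 1
rect = (0, 0, size, size)
print("raster:", raster_scan_changed(previous, current, rect))
print("step 1:", step1_changed(previous, current, rect))
```

Under this assumption, step 1 examines at most one sixteenth of the pixels, but a change that falls between sampled positions goes unnoticed by step 1 alone.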
milliseconds with almost no PSNR reduction. Also, even with only ‘step 1’ of the hierarchical 3-step algorithm, we can save approximately 20 milliseconds of detection time while preserving received visual quality comparable to that of the raster scan order, as shown in Table III. It is worth noting that the negligible reduction in the PSNR results of Table III is caused by applying the proposed modified region coding

Fig. 13. Total encoding time for a screen image.

The next experiment evaluates the screen image encoding time for MJPEG. For this experiment, we use the same environment as in the previous region detection experiment. The total time required for encoding a screen image is compared in Fig. 13. We can see from Fig. 13 that our system can provide
faster image encoding than ‘VNC’ and ‘VNC with raster’ over the test sequences. In particular, the proposed ‘VNC with 1 step’ algorithm performs better in terms of encoding time while maintaining almost the same PSNR.

TABLE IV
Average ratio of modified blocks per video frame
(average number of modified blocks / total unit rectangles)

Sequence   Raster    3 Step    2 Step    1 Step    Dual
Angry      7.71/12   7.71/12   7.70/12   7.69/12   7.71/12
Mole       5.31/12   5.31/12   5.30/12   5.29/12   5.31/12
Animal     7.88/12   7.88/12   7.87/12   7.86/12   7.88/12
Akiyo      8.97/12   8.97/12   8.96/12   8.95/12   8.97/12
MnD        9.36/12   9.36/12   9.35/12   9.35/12   9.36/12

[Figure 14: screen update rate (vertical axis, ticks from 5 to 19) for each sequence (angry, mole, animal, akiyo, MnD), with the series VNC, VNC + Raster, VNC + 3 Step, VNC + 2 Step and VNC + 1 Step.]
Fig. 14. Screen update rate for game sequences.

The performance of the proposed algorithms depends on the characteristics of the video sequences. As shown in Table IV, the gaming sequences have a relatively low ratio of modified blocks because of their little motion and small texture regions. Meanwhile, natural video sequences such as ‘Akiyo’ and ‘MnD’ typically contain more modified regions, and may yield a smaller performance gain. However, the proposed algorithms are still effective for the natural video sequences. As confirmed in Fig. 14, the proposed algorithms achieve a higher screen update rate than VNC.
SUR (‘screen update rate’) is a performance metric measured at the client display. Note that SUR is conceptually equal to the frame rate in video playback. While we measure and calculate the SUR per frame, we report only the average SUR over hundreds of frames. Higher SUR is better. Figure 14 shows the average SUR resulting from five detection methods and the original VNC. Our system provides a higher SUR than ‘original VNC’ and ‘VNC with raster’ over the test sequences. This shows that the proposed algorithms significantly reduce encoding complexity at the server and thereby achieve a higher screen update rate at the client. In addition, applying modified region coding usually reduces the compressed bit rate because only the modified region is encoded. That is another benefit of the proposed method, even though this paper primarily investigates encoding complexity.

VII. CONCLUSION
In this paper, we implemented a prototype system for mobile VNC and reported practical performance evaluations. To integrate video codecs into our VNC system, the existing RFB protocol is extended while preserving backward compatibility. Also, protocol operations are modified to run in parallel in order to reduce unnecessary idle time. In addition to adopting a video codec, we propose modified region coding to further reduce the encoding time of screen images. Based on numerous experiments, we found that MJPEG is the most suitable for mobile VNC systems in terms of both complexity and compression ratio. Moreover, the proposed modified region coding can further decrease encoding time and consequently increase the screen update rate at the client. The experimental results demonstrate that the proposed methods provide fast screen image encoding without visual quality degradation. In particular, modified region detection is highly effective for gaming video content with small texture regions.