
POLISH MARITIME RESEARCH Special Issue 2017 S3 (95), Vol. 24, pp. 102-109
DOI: 10.1515/pomr-2017-0111

A DEEP Q-LEARNING NETWORK FOR SHIP STOWAGE PLANNING PROBLEM

Yifan Shen 1, Ning Zhao 2*, Mengjue Xia 1, Xueqiang Du 2
1 Scientific Research Academy, Shanghai Maritime University, China
2 Logistics Engineering College, Shanghai Maritime University, China
* corresponding author

ABSTRACT

The ship stowage plan is the management link between quay crane scheduling and yard crane scheduling, and its quality greatly affects terminal productivity. Previous studies mainly focus on solving the stowage planning problem with online search algorithms, whose efficiency is significantly affected by case size. In this study, a Deep Q-Learning Network (DQN) is proposed to solve the ship stowage planning problem. With a DQN, the massive calculation and training is done in a pre-training stage, while in the application stage a stowage plan can be produced in seconds. To formulate the network input, decision factors are analyzed to compose the feature vector of a stowage plan, and the states subject to constraints, the available actions and the reward function of the Q-value are designed. With this information and design, an 8-layer DQN with a mean-square-error evaluation function is formulated to learn stowage planning. At the end of this study, several production cases are solved with the proposed DQN to validate its effectiveness and generalization ability. The results show that the DQN is well suited to solving the ship stowage planning problem.

Keywords: Deep Q-Learning Network (DQN); Container terminal; Ship stowage plan; Markov decision process; Value function approximation; Generalization

INTRODUCTION

In recent years, more and more researchers have devoted themselves to the study of marine science, port logistics and related topics [1-3], because the ocean is one of the most important resources for human beings. In ports especially, researchers have made great contributions to the intelligence of container terminal equipment [4-7] and planning [8-10]. In container terminals, the stowage plan is one of the most important and time-consuming planning phases. At present, stowage planning is mainly done by hand with computer assistance. Such a manual planning management mode relies heavily on the experience of planners, which costs labor and time. As automation becomes prevalent in container terminals, manual planning hinders the automation of terminal management. At the same time, container ships have become larger and larger in recent times, which further increases planning labor and time consumption. Under such circumstances, stowage planning automation, or intelligent stowage planning, has become a critical technology to be mastered in container terminal management in order to optimize both cost and efficiency.

Previous studies of container ship stowage planning focus on stowage planning models and on algorithms to solve the problem.

In terms of stowage planning models, the Master Bay Plan and the In-Bay Plan have been the mainstream. Among Master Bay Plan studies, Todd D S and Sen P [11] propose a Master Bay Plan model minimizing reshuffling, with trimming moment, heeling moment, ship stability and position as constraints; a GA is designed to solve this problem. Zhao N and Mi W J [12] built a multi-objective mixed integer programming model with ship stability factors and operation factors as constraints to optimize reshuffles and yard crane efficiency.



The proposed MIP model can only solve small-scale problems with a traditional planning solver. Moura A, Oliveira J et al. [13] proposed a MIP model optimizing total transportation cost with the shipping line taken into consideration. Among In-Bay Plan studies, Avriel M and Penn M [14] proposed a MIP model minimizing reshuffles; the proposed algorithm can solve small-scale problems. J. J. Shields [15] compared model-solving outputs with actual loading outputs to validate the proposed model. Imai et al. [16-18] proposed multi-objective MIP models minimizing reshuffles; numerical experiments reveal that more binary variables and binary constraints significantly increase complexity and lower solving efficiency. Haghani and Kaisar et al. [19] proposed a MIP model with turnaround time and ship parameters as constraints.

In terms of stowage planning algorithms, most studies prefer intelligent optimization algorithms. Álvarez et al. [20] proposed a tabu-search algorithm with multiple initial solutions to solve a problem optimizing the moving distance of stackers, the shuffles and the container weight distribution in the ship. Numerical experiments show that the proposed tabu-search algorithm can solve cases with more than 100 containers in a short time, while MIP solvers cannot solve cases with more than 40 containers. Kim et al. [21-24] proposed beam-search algorithms to solve the stowage planning problem. Y. Lee et al. [25] decompose the stowage planning problem into smaller-scale sub-problems using hierarchy theory; an Ant Colony Optimization-Tabu Search hybrid algorithm is proposed, and numerical experiments show the superiority of the hybrid algorithm over the original independent algorithms.

These analyses show that current stowage planning studies concentrate on composing a MIP model and designing a heuristic algorithm to solve it. Such methods perform well in small-scale cases but have limitations such as poor performance in large-scale cases and weak generalization ability. Thus, in this paper a deep reinforcement learning algorithm is proposed to solve the stowage planning problem. An intelligent stowage planning agent is trained to solve the problem efficiently while maintaining better generalization.

CONTAINER SHIP STOWAGE PLANNING PROBLEM

DECISION FACTORS

In the stowage planning process, several factors need to be considered to ensure the seaworthiness of the container ship and to improve operation efficiency.

1. Ship slot location and sequence relationship factor
To ensure efficiency during the ship loading process, ship slots have relative loading-sequence relationships: for example, slot 8401 can only be loaded after the slot right under it, slot 8201, is loaded, and sea-side slots should preferably be loaded before land-side slots. This sequence relationship between slot locations differs between the Deck Stowage Plan and the Hold Stowage Plan; the deck has more constraints to ensure ship stability and operation safety.

2. Ship slot weight limit factor
Before In-Bay Planning, the Master Bay Plan has preplanned the allocation and suggests a weight limit for each ship slot to guarantee ship stability and securing capacity. Thus each slot has a weight range constraint.

3. Heavy-over-light limit factor
Theoretically, heavy containers should be loaded under light containers to ensure ship stability. In actual planning, however, heavy containers are allowed to be loaded over light containers if their weights are close. Thus, a heavy-over-light limit factor is applied to formulate this constraint.

OPTIMIZATION OBJECTIVES

1. Staircase shape sequencing in deck stowage planning
In terms of the loading sequence, containers should preferably be loaded in a staircase shape, which means avoiding inserting a container between containers, to improve loading operation efficiency and safety.

2. Minimizing reshuffles
When a container needs to be loaded before containers stacked above it in the container yard, a reshuffle is needed. Reshuffles caused by the stowage plan should be minimized during planning to improve loading efficiency.

3. Minimizing yard crane shifts
When containers with adjacent planned loading sequences are located in different yard bays or even in different yard blocks, the yard crane needs to shift from one bay to another to load these containers. An unreasonable plan causes the yard crane to shift back and forth to pick containers, which affects loading efficiency. Thus, yard crane shifts should be minimized to improve loading efficiency.
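To make the decision factors described above concrete, the following is a minimal sketch of how the three constraints might be checked for a candidate (container, slot) assignment. The data structure, function name and attribute names are illustrative assumptions, not part of the original paper; the default tolerance of 0.5 t mirrors the heavy-over-light limit factor δ reported later in Tab. 2.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Slot:
    tier: int                  # vertical position of the slot in the ship bay
    min_weight: float          # weight range preplanned by the Master Bay Plan (t)
    max_weight: float
    below_occupied: bool       # True if the slot directly underneath is already stowed
    weight_below: Optional[float] = None   # weight of the container underneath, if any

def assignment_is_feasible(container_weight: float, slot: Slot,
                           heavy_over_light_delta: float = 0.5) -> bool:
    """Check the three decision factors for stowing one container into one slot."""
    # 1. Sequence relationship: a slot can only be used once the slot under it is filled.
    if slot.tier > 1 and not slot.below_occupied:
        return False
    # 2. Slot weight limit preplanned by the Master Bay Plan.
    if not (slot.min_weight <= container_weight <= slot.max_weight):
        return False
    # 3. Heavy-over-light limit: a heavier container may sit on a lighter one
    #    only if the weight gap stays within the limit factor delta.
    if slot.weight_below is not None:
        if container_weight - slot.weight_below > heavy_over_light_delta:
            return False
    return True
```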



DEEP REINFORCEMENT LEARNING ALGORITHM FOR STOWAGE PLANNING PROBLEM

MARKOV DECISION PROCESS

L. S. Shapley first proposed Markov Decision Processes (MDP) in stochastic games research. R. Bellman then proposed the dynamic programming method to solve general sequential problems, and R. A. Howard and D. Blackwell proposed a general theoretical framework and effective methods for MDP. An MDP is a 5-tuple (st, at, rt, T, π), where
st is a finite set of states,
at is a finite set of actions available from state st,
rt is the immediate reward (or expected immediate reward),
T is the transition function from state st to st+1,
π is the strategy or policy.
The problem of an MDP is to find a policy π that specifies the actions the decision maker will choose when in state st.

MDP FOR STOWAGE PROBLEM

The MDP for the stowage planning problem is formulated on this basis. Fig. 2 shows an example of the stowage planning MDP.

Fig. 1. Slot Scheme

Fig. 2. MDP for Stowage Planning Problem

1. Stowage State
In stowage planning, t is the stowage sequence: t = 0 is the initial state when no container is loaded, t = 1 is the next state when the first container is loaded, t = 2 is the state when the second container is loaded, and so on. In Fig. 2, S0 is the initial state when the whole bay is empty, S1 is the next state when the first container C1 is stowed, and then S2 follows. As shown, in S0 there are several available actions, namely which container to stow in which available slot. S1(C2, M1) means stowing C2 into slot M1 in state S1; S2(C6, M2 | S1(C2, M1)) means two containers are stowed, first C2 into slot M1 and then C6 into slot M2, giving state S2.

2. Stowage Action
In stowage planning, an action is to mate a container with a slot, i.e. to stow this container in this slot. In different states, the available actions are different. In Fig. 2, when in S0 and stowage constraints are ignored, there are 36 available actions, or matings of containers to slots.
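As one illustration of how the action set depends on the state, here is a minimal sketch that enumerates candidate (container, slot) matings and filters them with a feasibility predicate. The names and data representation are assumptions for illustration; for example, 6 unstowed containers and 6 free slots with constraints ignored would give the 36 candidate actions mentioned for Fig. 2.

```python
from typing import Callable, Iterable

def available_actions(containers: Iterable[str],
                      free_slots: Iterable[str],
                      is_feasible: Callable[[str, str], bool] = lambda c, s: True
                      ) -> list[tuple[str, str]]:
    """Enumerate candidate (container, slot) matings, optionally filtered by constraints."""
    return [(c, s) for c in containers for s in free_slots if is_feasible(c, s)]

# With constraints ignored, e.g. 6 unstowed containers and 6 free slots
# yield 6 * 6 = 36 candidate actions.
actions = available_actions([f"C{i}" for i in range(1, 7)],
                            [f"M{i}" for i in range(1, 7)])
print(len(actions))  # 36
```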
a stowage state, Φ ={φ (1),φ (2),φ (8),φ (9)}
3. Stowage Reward
In reinforcement learning, the reward represents the environment. In stowage planning learning, the reward mainly expresses the objectives and constraints. Since the result of a stowage plan is evaluated by availability, reshuffling and yard crane shifting, and these evaluations have different scales of importance, the stowage reward is defined as follows:

r1 = −500, if the stowage plan is available; 500, if the stowage plan is not available; 0, else   (1)

r2 = −10 · φ(5)   (2)

r3 = −30 · φ(9)   (3)

Formula (1) is the reward of availability, (2) is the reward of reshuffling, and (3) is the reward of yard crane shifting.

4. Stowage Planning Evaluation Function and Action Evaluation Function

vπ(s) = Σs′ T(s′|s, a) [ rt+1(s′|s, a) + λ vπ(st+1) | st = s ]   (4)

Qπ(s, a) = Σs′ T(s′|s, a) [ rt+1(s′|s, a) + λ v(st+1) ]   (5)

V*(s) = maxa Σs′ T(s′|s, a) [ R(s′|s, a) + λ V*(s′) ]   (6)

Q*(s, a) = Σs′ T(s′|s, a) [ R(s′|s, a) + λ V*(s′) ]   (7)

where
π is the stowage strategy or policy,
λ is the discount factor, which represents the influence of the next stowage move on this move,
T(s′|s, a) is the probability of reaching state s′ by taking action a in state s,
R(s′|s, a) is the reward of reaching state s′ by taking action a in state s,
v(s) is the expected reward of taking various actions in state s, i.e. the expected reward of stowing the remaining containers after state s,
Q(s, a) is the total reward of taking action a in state s,
V*(s) is the maximum reward in state s,
Q*(s, a) is the maximum reward of taking action a in state s.
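A minimal sketch of how the composite reward in (1)-(3) could be computed is given below. The function and argument names are assumptions for illustration; the constants simply mirror formulas (1)-(3) as printed, where φ(5) is the normalized reshuffle feature and φ(9) the same-yard-bay feature defined in the next section.

```python
def stowage_reward(plan_status: str, phi_5: float, phi_9: float) -> float:
    """Composite reward r = r1 + r2 + r3 mirroring formulas (1)-(3).

    plan_status: "available", "not available", or anything else for
                 intermediate (non-terminal) stowage steps.
    phi_5:       normalized reshuffles caused by the action, formula (12).
    phi_9:       1 if the container shares a yard bay with the previous one,
                 0 otherwise, formula (16).
    """
    if plan_status == "available":
        r1 = -500.0            # availability term as paired in formula (1)
    elif plan_status == "not available":
        r1 = 500.0
    else:
        r1 = 0.0
    r2 = -10.0 * phi_5         # reshuffling term, formula (2)
    r3 = -30.0 * phi_9         # yard crane shifting term, formula (3)
    return r1 + r2 + r3
```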



STOWAGE PLANNING FEATURES

The dimensions of different ship bays are usually different in stowage planning, while reinforcement learning needs a training set with the same dimensions. Thus, stowage features are introduced to approximate different stowage states. In this research, 9 features are selected as the feature vector of a stowage state, Φ = {φ(1), φ(2), ..., φ(8), φ(9)}:

φi(1) = Wi / Wmax   (8)

φi(2) = Ti / Tmax   (9)

φi(3) = (Wj − Wi) / Wmax, if Ti > 1;  (Wmax − Wi) / Wmax, if Ti = 1   (10)

φi(4) = Pi − Si   (11)

φi(5) = Fi / Fmax   (12)

φi(6) = (Xi − Xj) / Xmax   (13)

φi(7) = (Xi − Xk) / Xmax   (14)

φi(8) = ((Gi − Gj) + (Gi − Gk)) / 2Gmax   (15)

φi(9) = 1, if this container is located in the same yard bay as the previous one; 0, else   (16)

(8) represents the normalized weight of the selected container.
(9) represents the normalized tier number of the selected slot.
(10) represents the normalized weight gap between the selected container and the container located right under it on the ship.
(11) represents the potential of the selected match of container and slot (or action), i.e. the number of remaining lighter containers minus the available ship slots above the selected slot; it expresses the influence of the selected action on later stowage planning.
(12) represents the normalized reshuffles caused by this action.
(13) represents the normalized sequential gap between the selected container and the containers located to the left of the selected one.
(14) represents the normalized sequential gap between the selected container and the container located right under the selected container.
(15) represents the normalized sum of the sequential gap between the selected container and the containers located to its left and the sequential gap between the selected container and the container located right under it.
(16) represents whether this container is located in the same yard bay as the previous one.
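The following is a minimal sketch of assembling the 9-dimensional feature vector Φ from formulas (8)-(16). The argument names follow the symbols used above (W weights, T tiers, P/S potential terms, F reshuffles, X and G sequential gaps); the exact data source of each symbol is not spelled out in the paper, so this mapping is an assumption made here for illustration.

```python
def stowage_features(W_i, W_j, W_max, T_i, T_max, P_i, S_i, F_i, F_max,
                     X_i, X_j, X_k, X_max, G_i, G_j, G_k, G_max,
                     same_yard_bay: bool) -> list[float]:
    """Assemble the feature vector Φ = {φ(1), ..., φ(9)} of formulas (8)-(16)."""
    phi1 = W_i / W_max                                   # (8)  normalized container weight
    phi2 = T_i / T_max                                   # (9)  normalized slot tier
    phi3 = ((W_j - W_i) / W_max if T_i > 1               # (10) weight gap to the container below
            else (W_max - W_i) / W_max)
    phi4 = P_i - S_i                                     # (11) potential of the match
    phi5 = F_i / F_max                                   # (12) normalized reshuffles
    phi6 = (X_i - X_j) / X_max                           # (13) sequential gap, left neighbours
    phi7 = (X_i - X_k) / X_max                           # (14) sequential gap, slot below
    phi8 = ((G_i - G_j) + (G_i - G_k)) / (2 * G_max)     # (15) combined sequential gap
    phi9 = 1.0 if same_yard_bay else 0.0                 # (16) same yard bay as previous container
    return [phi1, phi2, phi3, phi4, phi5, phi6, phi7, phi8, phi9]
```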
DEEP REINFORCEMENT LEARNING ALGORITHM FOR STOWAGE PLANNING PROBLEM

Figure 3 shows the framework of reinforcement learning, or Q-Learning, for stowage planning. In the initial state of learning, the intelligent agent is like a naive planner: every action the planner takes returns a reward that updates Q(s, a), and the agent decides the next action for the next state depending on the updated Q(s, a); this is the iteration of reinforcement learning. In effect, the agent learns from iterations of attempts and rewards to maximize the final reward and improve the policy.

Fig. 3. Framework of Reinforcement Learning

The difference between Deep Q-Learning and Q-Learning is that the look-up table is replaced by a deep neural network to update Q(s, a), which keeps the method effective in very large state spaces. The deep neural network is trained by minimizing the loss function Li(wi), which is updated in each iteration:

Li(wi) = E[ (r + γ maxa′ Q(s′, a′; wi) − Q(s, a; wi))² ]   (17)

where the term r + γ maxa′ Q(s′, a′; wi) is the target, s′ is the next state and a′ is the next action. The gradient with respect to wi is given in (18):

∇wi Li(wi) = E s,a,r,s′ [ (r + γ maxa′ Q(s′, a′; wi) − Q(s, a; wi)) ∇wi Q(s, a; wi) ]   (18)

Stochastic Gradient Descent is used to optimize the loss function, and the weights are updated after every iteration, which is quite similar to the traditional Q-Learning algorithm.


In order to approximate the reward for new states that have never appeared before, an evaluation function approximation is introduced to improve generalization ability. Unlike supervised learning, reinforcement learning does not have known labels for training; labels are obtained through iterations. When a state-action pair is updated, the change of weights for this pair can affect other pairs, which invalidates previous state-action updates and in turn causes longer training times or even failure of training. Thus, an experience replay method is introduced to prevent this.

Experience replay stores the experience at time t as (φt, at, rt, φt+1) in an experience history queue D, and D is then sampled stochastically as (φj, aj, rj, φj+1) for mini-batch updates of the weights. This ensures that every history point is considered when a new data point is added. Experience replay stores all previous states and actions in a sequence to minimize the objective function when the Q-function is updated:

Li(wi) = E (s,a,r,s′)~U(D) [ (r + γ maxb Q(s′, b; wi) − Q(s, a; wi))² ]   (19)

Here D represents the sequence of previous states and actions, and U(D) is a uniform distribution over the experience sequence D. Experience replay based upon a uniform distribution lowers data dependency and improves learning robustness.
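Below is a minimal sketch of an experience replay queue of the kind described above: a fixed-length history D from which mini-batches are drawn uniformly, as in (19). The class and method names are illustrative assumptions; the default queue length of 5000 follows the experience replay depth given later in Tab. 3.

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-length experience history queue D with uniform sampling U(D)."""

    def __init__(self, capacity: int = 5000):    # experience replay depth, cf. Tab. 3
        self.queue = deque(maxlen=capacity)      # oldest experiences are dropped when full

    def store(self, phi_t, a_t, r_t, phi_next):
        """Save one experience tuple (phi_t, a_t, r_t, phi_t+1)."""
        self.queue.append((phi_t, a_t, r_t, phi_next))

    def sample(self, batch_size: int):
        """Draw a mini-batch uniformly at random from the history."""
        return random.sample(list(self.queue), min(batch_size, len(self.queue)))
```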
The deep network for the stowage planning problem is designed as follows.

1. Input layer and output layer
For the stowage planning problem, the input layer takes the feature vectors of the stowage samples and the output layer gives the approximate Q-value. Thus, the number of nodes in the input layer is 9 and the number of nodes in the output layer is 1.

2. Number of hidden layers
Generally, more hidden layers give a higher precision of approximation, while they also cost more training and carry a greater probability of over-fitting. In this case, 9 hidden layers are accepted.

3. Number of nodes in hidden layers
To avoid over-fitting and maintain better generalization ability, the number of nodes in the hidden layers should be minimized as long as the precision is assured. The number of nodes in a hidden layer is related to the number of nodes in the input layer, the number of nodes in the output layer, the complexity of the learning problem, the transition function and the sample data. Too few nodes cause poor training performance, while too many nodes give a smaller system error but may cause over-fitting.

4. Activation function
There are three widely used activation functions: TanH, Sigmoid and ReLU (Rectified Linear Units). ReLU has better training performance, especially regarding gradient attenuation and network sparsity. Thus, ReLU is used as the activation function in this research:

ReLU: f(x) = max(0, x)   (20)

The designed deep neural network for the stowage planning problem is shown in Figure 4.

Fig. 4. Deep Neural Network for Stowage Planning Problem
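A minimal sketch of such a network is given below, assuming 9 hidden layers of 128 nodes each (the hidden-layer width from Tab. 3), ReLU activations, 9 input features and a single Q-value output. PyTorch is used here purely as an illustrative choice; the paper does not state which framework the authors used.

```python
import torch
import torch.nn as nn

def build_stowage_dqn(n_features: int = 9, n_hidden_layers: int = 9,
                      hidden_width: int = 128) -> nn.Sequential:
    """Fully connected Q-network: 9 features in, approximate Q-value out."""
    layers: list[nn.Module] = []
    in_dim = n_features
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(in_dim, hidden_width), nn.ReLU()]   # ReLU, formula (20)
        in_dim = hidden_width
    layers.append(nn.Linear(in_dim, 1))    # single output node: the approximate Q-value
    return nn.Sequential(*layers)

# Example forward pass on one feature vector Φ
q_net = build_stowage_dqn()
phi = torch.rand(1, 9)        # placeholder 9-dimensional feature vector
q_value = q_net(phi)          # approximate Q-value for this state-action pair
```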
According to the deep network design, the DQN training algorithm is designed. Pseudo-code for the DQN algorithm for stowage planning is shown in Table 1, and the flowchart in Figure 5.

Tab. 1. DQN Training Algorithm for Stowage Planning

Initialize experience history queue D with length N
Initialize Q(s, a; w) with random weights w0
For each stowage episode:
    Initialize observation sequence s1 = {x1} and feature sequence φ1 = φ(s1)
    For each step in the episode:
        Select an action to perform in state s with ε (softmax)-greedy
        Update the reward and extract the feature vector φt+1 = φ(st+1)
        Save the experience tuple (φt, at, rt, φt+1) into the experience history queue D
        Collect samples (φj, aj, rj, φj+1) by randomly sampling a mini-batch
        Transform each sample (φj, aj, rj, φj+1) into a training tuple (xk, yk):
            xk = φj,  yk = r + λ maxa′ Q(φj+1, a′; wi−1)
        Update the network weights on the training set {(xk, yk)}m according to ∇wi Li(wi)
        with stochastic gradient descent
    Loop until the end of states s
Loop until the end of episodes

Fig. 5. Flowchart for DQN Algorithm for Stowage Planning
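To illustrate the inner update step of Tab. 1 (target yk = r + λ maxa′ Q(φj+1, a′), mean-square-error loss over a sampled mini-batch, and an SGD weight update), here is a minimal self-contained PyTorch sketch. The network shape, batch contents and hyper-parameter values are placeholders; λ = 0.3 and the learning rate follow Tab. 3, and this is one plausible realization under those assumptions rather than the authors' exact implementation.

```python
import torch
import torch.nn as nn

lam = 0.3                                                  # discount factor λ from Tab. 3
q_net = nn.Sequential(nn.Linear(9, 128), nn.ReLU(), nn.Linear(128, 1))
optimizer = torch.optim.SGD(q_net.parameters(), lr=1e-4)   # learning ratio α from Tab. 3

# Placeholder mini-batch: feature vectors φ_j of the chosen actions, their rewards r_j,
# and, for each sample, the feature vectors of all candidate actions in the next state.
batch_phi = torch.rand(32, 9)
batch_r = torch.rand(32)
batch_next_phis = [torch.rand(5, 9) for _ in range(32)]    # 5 candidate next actions each

# Targets y_k = r + λ · max_a' Q(φ_{j+1}, a'; w_{i-1}), held fixed during this update.
with torch.no_grad():
    y = torch.stack([r + lam * q_net(next_phi).max()
                     for r, next_phi in zip(batch_r, batch_next_phis)])

# Mean-square-error loss on the sampled mini-batch, as in (19), then one SGD step.
loss = nn.functional.mse_loss(q_net(batch_phi).squeeze(1), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```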



STOWAGE CASE STUDY OF DQN STOWAGE PLANNING

CASE DESCRIPTION

In this case, production data of Ningbo Port is used to verify the proposed method. The selected ship bay has 19 slots, and the 19 corresponding containers are located in 4 yard bays in 2 blocks. The ship bay is shown in Figure 6; this bay has 4 tiers and 5 rows, and each weight box is a slot to be stowed. The container distribution in the yard is shown in Figure 7; the number inside each box in Figure 7 is the weight of the container.

The parameter setup for stowage planning is shown in Table 2 and the parameter setup for the DQN learning algorithm in Table 3. The random exploration rate ε equals 1 in the initial iterations to encourage exploration; after each 1% of the total iterations it decreases by a step of 0.09, reaching 0.1 when the iterations finish. With this descending schedule, the agent can gradually focus on the optimized solution and converge while keeping a moderate ability to explore.

Fig. 6. Ship Bay Layout

Fig. 7. Container Distribution

Tab. 2. Parameter setup for stowage planning
Heavy-over-light limit factor δ: 0.5 t
Reshuffle weight w1: 3
Yard crane shift weight w2: 1

Tab. 3. Parameter setup for DQN learning algorithm
Learning ratio α: 1·10^-4
Discount factor λ: 0.3
Random exploration rate ε: 1 to 0.1 with step of −0.09
Random exploration rate update interval: 1%
Experience replay depth: 5000
Number of nodes in hidden layers: 128
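The exploration-rate schedule described above (start at 1, drop by 0.09 after every 1% of the total iterations, with 0.1 as the floor) can be written as a small helper. This is a sketch of one way to implement it; the function and argument names are chosen here for illustration, and the floor simply clamps ε once 0.1 is reached.

```python
def exploration_rate(iteration: int, total_iterations: int,
                     start: float = 1.0, step: float = 0.09, floor: float = 0.1) -> float:
    """ε for a given iteration: decreases by `step` after each 1% of total iterations."""
    completed_percent_blocks = (100 * iteration) // total_iterations
    return max(floor, start - step * completed_percent_blocks)

# Example with the 200 000 training iterations used in the case study:
for it in (0, 2_000, 10_000, 199_999):
    print(it, round(exploration_rate(it, 200_000), 2))   # 1.0, 0.91, 0.55, 0.1
```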
STOWAGE RESULT ANALYSIS

The proposed DQN is trained with production data for 200 000 iterations, which takes 2 hours and 46 minutes. The trained DQN can finish the test case in 0.069 seconds, and the stowage result of the test case is shown in Figure 8. The upper left part of the figure shows the weight distribution of the stowage and the upper right part shows the stowage sequence; the boxes are filled with different colors to distinguish their original yard bays. In this stowage plan, 1 reshuffle and 3 shifts are needed to finish loading this ship bay, of which the 3 shifts are necessary (because there are 4 yard bays in total). The reshuffle of the container with sequence 18 is unnecessary, but the plan is still a good solution. With all that, the effectiveness of the DQN trained with production data is validated.

Fig. 8. Stowage result of test case

GENERALIZATION ABILITY ANALYSIS

1. Generalization to data of the same size
To verify generalization to data of the same size, another stowage case with 19 containers is introduced. This case (case 2) comes from a different ship at the same port. The stowage result has 1 reshuffle and 4 shifts, and all 4 shifts are necessary. Compared with the port planners' stowage results, the stowage plan of the DQN shows good performance: the manual plan costs 121 seconds on average, while the DQN completes the calculation in 0.073 s. This case study shows a good result in terms of generalization to data of the same size.

2. Generalization to data of a different size
To verify generalization to data of a different size, a stowage case with 40 containers is introduced. This case (case 3) differs greatly from the previous ones both in case size and in container distribution. The DQN result for this case shows some heavy-over-light containers, but the weight gaps are all within the heavy-over-light limit. The result has 12 reshuffles and 11 shifts, of which 4 shifts are unnecessary. Because of the complexity of this case, the port planners show variety in their plans, with an average of 10.2 reshuffles and 9.6 shifts. The port planners take 237 s to make the plan while the DQN costs 0.131 s. Thus, the DQN shows ability comparable to its human competitors in this case, with much better time consumption. This case study shows a good result in terms of generalization to data of a different size.



ROBUSTNESS ANALYSIS

In stowage planning DQN learning, the robustness of the algorithm refers to whether the training algorithm can obtain a good DQN from various stowage planning cases. In the generalization analysis above, the DQN trained with case 1 is used to plan case 2 and case 3. To verify the robustness of the proposed algorithm, case 2 and case 3 are also used as training sets to obtain new DQNs. The planning results of the different DQNs are shown below.

Tab. 4. Training parameters and time consumption
Training Set   No. of Containers   Iterations   Training time
Case 1         19                  150000       2 h 46 min
Case 2         19                  150000       2 h 53 min
Case 3         28                  150000       4 h 21 min

Table 4 shows that the three trainings have the same iteration setup and that, for the same case size, the training time is quite similar.

Tab. 5. Comparison of DQNs' planning results
Training Set   Test Case   Reshuffles   Shifts   Planning time (s)
Case 1         Case 1      1            3        0.069
Case 1         Case 2      1            4        0.073
Case 1         Case 3      12           11       0.131
Case 2         Case 1      2            3        0.071
Case 2         Case 2      1            4        0.081
Case 2         Case 3      12           12       0.142
Case 3         Case 1      1            3        0.068
Case 3         Case 2      1            5        0.073
Case 3         Case 3      10           11       0.137

As Table 5 shows, the different test cases give good results with the differently trained DQNs, and the efficiency of the different DQNs is quite similar, which means the influence of different training cases and test cases is negligible. Thus, the proposed algorithm has good stability and robustness.

CONCLUSIONS

In this study, a DQN and a learning method for this DQN are proposed to solve the ship stowage planning problem. The innovations are as follows.
1. A deep learning algorithm is introduced to solve the planning problem. With the DQN, the massive calculation and training is done in a pre-training stage, while in application the planning problem can be solved in seconds.
2. The objectives and constraints of the ship stowage planning problem are transformed into feature vectors so that stowage policies are extracted automatically with the deep learning algorithm. Policies learned from data tend to have less bias than the designed heuristics of previous studies.
3. Experience replay is introduced into the DQN to strengthen the generalization and robustness of the proposed algorithm.
4. The study provides a reference for solving other planning problems in container terminals, such as yard storage planning and equipment scheduling.

ACKNOWLEDGEMENTS

This research was supported by the National Natural Science Foundation of China (No. 61540045), the Science and Technology Commission of Shanghai Municipality (No. 15YF1404900, No. 14170501500), the Ministry of Education of the PR China (No. 20133121110005), the Shanghai Municipal Education Commission (No. 14ZZ140), and Shanghai Maritime University (No. 2014ycx040).

REFERENCES

1. M. Omar, S. S. Supadi, 2012. Integrated models for shipping a vendor's final production batch to a single buyer under linearly decreasing demand for consignment policy. Sains Malaysiana, 41(3): 367-370.

2. C. Mi, Z. W. Zhang, Y. F. Huang, Y. Shen, 2013. A fast automated vision system for container corner casting recognition. Journal of Marine Science and Technology-Taiwan, 24(1): 54-60. DOI: 10.6119/JMST-016-0125-8.

3. X. P. Rui, X. T. Yu, J. Lu, et al., 2016. An algorithm for generation of DEMs from contour lines considering geomorphic features. Earth Sciences Research Journal, 20(2): G1-G9.

4. Y. Shen, 2016. An Anti-Collision Method of Slip Barrel for Automatic Ship Loading in Bulk Terminal. Polish Maritime Research, 23(s1).

5. C. Mi, Y. Shen, W. J. Mi, Y. F. Huang, 2015. Ship Identification Algorithm Based on 3D Point Cloud for Automated Ship Loaders. Journal of Coastal Research, 2015(SI.73): 28-34. DOI: 10.2112/SI73-006.

6. C. Mi, Z. W. Zhang, X. He, Y. F. Huang, W. J. Mi, 2015. Two-stage classification approach for human detection in camera video in bulk ports. Polish Maritime Research, 22(SI.1): 163-170. DOI: 10.1515/pomr-2015-0049.

7. C. Mi, H. W. Liu, Y. F. Huang, W. J. Mi, Y. Shen, 2016. Fatigue alarm systems for port machine operators. Asia Life Sciences, 25(1): 31-41.

8. Yifan S., Ning Z., Weijian M., 2016. Group-Bay Stowage Planning Problem for Container Ship. Polish Maritime Research, 23(s1).

9. Mengjue X., Ning Z., Weijian M., 2016. Storage Allocation in Automated Container Terminals: the Upper Level. Polish Maritime Research, 23(s1).

10. C. Mi, X. He, H. W. Liu, Y. F. Huang, W. J. Mi, 2014. Research on a Fast Human-Detection Algorithm for Unmanned Surveillance Area in Bulk Ports. Mathematical Problems in Engineering. DOI: 10.1155/2014/386764.


11. D. S. Todd, P. Sen, 1997. A Multiple Criteria Genetic Algorithm for Containership Loading. International Conference on Genetic Algorithms, East Lansing, MI, USA, July. DBLP, 674-681.

12. N. Zhao, W. J. Mi, 2008. Robust approach in stowage planning at container terminals. IEEE Proceedings of the 4th International Conference on Intelligent Logistic Systems, 191-204.

13. A. Moura, J. Oliveira, C. Pimentel, 2013. A Mathematical Model for the Container Stowage and Ship Routing Problem. Journal of Mathematical Modelling and Algorithms in Operations Research, 12(3): 217-231.

14. M. Avriel, M. Penn, 1993. Exact and approximate solutions of the container ship stowage problem. Computers & Industrial Engineering, 25(1-4): 271-274.

15. J. J. Shields, 1984. Containership Stowage: A Computer-Aided Preplanning System. Marine Technology, 21(4): 370-383.

16. A. Imai, T. Miki, 1989. A heuristic algorithm with expected utility for an optimal sequence of loading containers into a containerized ship. Journal of Japan Institute of Navigation, 80: 117-124 (in Japanese).

17. A. Imai, E. Nishimura, K. Sasaki, S. Papadimitriou, 2001. Solution comparisons of algorithms for the containership loading problem. Proceedings of the International Conference on Shipping: Technology and Environment, available on CD-ROM.

18. A. Imai, E. Nishimura, K. Sasaki, S. Papadimitriou, 2001. Solution comparisons of algorithms for the containership loading problem. Proceedings of the International Conference on Shipping: Technology and Environment, available on CD-ROM.

19. A. Haghani, E. I. Kaisar, 2001. A model for designing container loading plans for containerships. In: 80th Transportation Research Board Annual Meeting, Washington, DC, USA.

20. J. F. Álvarez, 2006. A heuristic for vessel planning in a reach stacker terminal. Journal of Maritime Research, 3(1): 3-16.

21. K. H. Kim, 1994. Analysis of rehandles of transfer crane in a container yard. APORS-Conference, 3: 357-365.

22. K. H. Kim, 1997. Evaluation of the number of rehandles in container yards. Computers & Industrial Engineering, 32: 701-711.

23. K. H. Kim, Y. M. Park, K. R. Ryu, 2000. Deriving decision rules to locate export containers in container yards. European Journal of Operational Research, 124: 89-101.

24. K. H. Kim, J. S. Kang, K. R. Ryu, 2004. A beam search algorithm for the load sequencing of outbound containers in port container terminals. OR Spectrum, 26: 93-116.

25. Y. Lee, J. Kang, K. R. Ryu, K. H. Kim, 2005. Optimization of Container Load Sequencing by a Hybrid of Ant Colony Optimization and Tabu Search. Natural Computation, Lecture Notes in Computer Science, 3611: 1259-1268.

CONTACT WITH THE AUTHORS

Ning Zhao
Logistics Engineering College
Shanghai Maritime University
Shanghai 201306
China
