
Conference Paper · October 2018


ScOSA - Scalable On-Board Computing for Space Avionics

Carl Treudler, Heike Benninghoff, Kai Borchers, Bernhard Brunner, Jan Cremer, Michael Dumke, Thomas Gärtner, Kilian Höflinger, Jörg Langwald, Daniel Lüdtke, Ting Peng, Eicke-Alexander Risse, Kurt Schwenk, Martin Stelzer, Moritz Ulmer, Simon Vellas, Karsten Westerdorff

German Aerospace Center (DLR)
[email protected]
Abstract

Computational demands on-board spacecraft have continuously increased for years, whereas available space-qualified hardware cannot fully satisfy these requirements. The project ScOSA (Scalable On-Board Computing for Space Avionics) addresses this issue by establishing a system that combines highly reliable space-qualified hardware with high-performance commercial off-the-shelf (COTS) components. This paper introduces the basic concepts of the system as well as the concrete characteristics that lead to a distributed, fault-tolerant, and reconfigurable architecture. This architecture in turn provides multiple ways to increase computational performance for space applications. Several demonstrations, such as earth observation and robotic on-orbit servicing, are used to evaluate the system's capabilities. Additionally, these demonstrations show dynamic system reconfiguration for fault mitigation.

1 Introduction

Current and future spacecraft face many challenges that may require a change of methodology in on-board computing design. On the one hand, the amount of data to be handled is rising, and so is the need for on-board data processing. Yet the architectures of today's on-board computing systems have not changed over the years: typically, many dedicated units are centered around an on-board computer that serves as the main control instance of the system. This structure leads to several disadvantages.

Depending on the required reliability, most or even all units are designed to be radiation hardened, which massively increases costs. Additionally, the computational performance of radiation-hardened units is relatively low.

Furthermore, each dedicated unit is often deployed redundantly, and each spare part can only cover units of the same type, which increases the system mass. During operation, these spare parts generally perform only observing tasks or are powered off entirely. From the overall workload perspective, this kind of operation is inefficient because available resources are not utilized.

The objective of ScOSA, as a successor project of [1], is to overcome these disadvantages. This is done by combining space-qualified and COTS components. They are used to create computing nodes with different characteristics, which are connected by a network. These nodes can handle a wide range of tasks and are also able to take over different tasks throughout system operation. This reconfiguration capability enables the system to mask faulty nodes and to maintain a predefined reliability. Additionally, workload can be distributed dynamically throughout operation to utilize the available processing capabilities efficiently.

This paper introduces the concepts applied to create an alternative kind of on-board computing system with a distributed, fault-tolerant and reconfigurable architecture. The distributed and reconfigurable nature of the system also offers great flexibility and scalability.

The remainder of this paper is organized as follows. Related work is given in Section 2, followed by an introduction to the core concepts of ScOSA in Section 3. The classification of nodes and how they are connected to form the whole system is provided in Section 4. Section 5 describes the system's software. Section 6 presents the evaluation through several applied application demonstrations, including command and data-handling tasks (CDH), attitude control (ACS), earth-observation (EO) data processing, autonomous visual navigation, and robotics. High performance is achieved by parallel processing in combination with Field Programmable Gate Array (FPGA) acceleration, explained in Section 7; this is done by using COTS parts with tailored fault mitigation. Finally, Section 8 provides some conclusions.

2 Related Work

COTS components are frequently investigated for usage in space environments; some research is given in [2]. The evaluation of platforms for vision-based navigation [3] investigates both rad-hard and COTS-based systems, and shows the differences in performance and power between rad-hard and COTS components.

At the time the Xilinx Zynq was selected, not much was known about its performance in a radiation environment. This situation has changed; to date, multiple instances have been flown. The flight computer CSPv1 takes advantage of COTS components [4]. Like our approach, the CSPv1 is based on a Xilinx Zynq and also supported by rad-hard components. The flight computer was able to handle several radiation tests, including an experiment on the ISS (International Space Station), without destructive events. Gomspace also offers a System on Module based on a Xilinx Zynq, the Z7000 [5], which has been tested in orbit.

The Xilinx Zynq products, manufactured in 28 nm technology, have already been subject to radiation tests. Increased currents, which were recovered by power cycling, and corruptions of on-chip memory were observed; however, the occurrence of destructive events was not stated [6][7]. This product may be a meaningful choice for systems designed to detect and handle recoverable data corruptions.

Promising information was recently provided by Xilinx: existing radiation test results for the Kintex UltraScale indicate that the device meets space-domain requirements even without an RHBD (Radiation Hardened By Design) re-design [8].

Distributed systems are not a standard approach to solving on-board computing challenges in spacecraft. This approach has, for example, been addressed by the research project A3M [9] and OBC-SA [10]. Also similar in the techniques applied are Network-on-Chip approaches like [9].

An overview of fault-tolerant techniques for distributed systems in space is provided in [11]. That research also proposes a distributed spacecraft architecture based on nodes realized as a multiprocessor system-on-chip. The system aims to provide task migration in case of faulty behavior as well as workload sharing throughout operation.

3 Core Concepts

On the conceptual level, ScOSA is built around multiple central concepts. These concepts enable the main features of ScOSA; the added value of the features and concepts is elaborated below.

3.1 Distributed System

As a distributed system, multiple computing devices (nodes) can be interconnected by a network. This allows the workload of an application to be shared by multiple nodes and provides potential for parallelisation of the algorithm, and therefore faster processing.
3.2 Network-interconnected

ScOSA uses a packet-oriented network to exchange data between its nodes. There are no constraints imposed regarding the network topology, allowing point-to-point connections as well as fully meshed setups.

3.3 Heterogeneous

The system does not require all nodes to be identical. This allows nodes to have different properties, adapted to the specific tasks to be performed. This grid-computing approach allows a dedicated tailoring of the system, as demanded by the constraints and requirements of the technical process it controls. This ensures a future-proof design, enabling further upgrades. In ScOSA, we use Reliable Computing Nodes (RCN) and High Performance Nodes (HPN), which are explained in greater detail in Section 4.

3.4 Reconfigurable

The reconfigurability of the system is an essential functionality, allowing changes to the operational system state, i.e. the system configuration. Configuration is used as a collective term and includes the activation of nodes, their roles in the system, the network interconnection, and the operational parameters of application tasks. A system reconfiguration can be conducted as a planned reconfiguration, determined by the mission timeline, or as a reaction to mitigate the effects of an internal fault.

3.5 Fault-tolerance

The system allows parts of it to be affected by faults and has mitigation and isolation measures prepared. Modeling faults is essential when implementing an actual ScOSA system. To reduce the complexity and costs of fault tolerance, a hybrid scheme is supported which allows the combination of nodes with different fault models (e.g. fail-stop and benign byzantine). There is also fault tolerance embedded into some of the nodes, such as tolerance to Single-Event Effects (SEE) in the RCN. To increase the reliability of the COTS-based nodes, mainly software mitigation means are integrated to cope with SEEs.

3.6 Hardware Acceleration

Some ScOSA nodes contain FPGAs that allow the system to load application-specific accelerators, providing a drastic increase in throughput and offering power-efficient computation. The COTS CPU of the HPN itself offers increased computational performance compared to the radiation-hard RCN, but with the disadvantage of reduced reliability. Additionally, the accelerator Intellectual Property (IP) core of the HPN can be changed at run time.

3.7 Scalability

The system is designed to scale over a broad range of design parameters. For instance, we do not impose an artificial limit on the number of possible nodes. Also, ties to COTS technology are seen as contemporary, with future upgrades and swaps expected.

4 Base Design

For the duration of the project, a base design was selected for development and testing purposes. The following paragraphs explain this design and the rationale behind it. Flight hardware is not strictly required, but a reasonable path to a flight-capable system has to be available.

We chose to implement three different types of nodes: Reliable Computing Nodes (RCN), High Performance Nodes (HPN) and Interface Nodes (IFN). Figure 1 shows our base design at node level. Two RCNs, one IFN and three HPNs are connected via multiple SpaceWire links. The IFN and the HPNs contain SpaceWire router IP cores and enable the routing of SpaceWire packets to other nodes.

4.1 High Performance Node

A Xilinx Zynq is used for its built-in FPGA and fast ARM Cortex-A9 CPU (compared to the highly reliable RCNs).
The COTS Zynq is also the target System on Chip (SoC) for usage in space, but with its surrounding support circuitry adjusted to the demands of the space environment. The lack of built-in SEE robustness is addressed by fault detection, isolation and recovery means on system and node level, in hardware and software.

Figure 1: The block diagram of the base design and test-bed shows the distributed and heterogeneous nature of the system. Multiple routes are useful to avoid single points of failure, which the system would route around in the case of a link failure.

4.2 Reliable Computing Node

The RCN is based on a prior development effort named Compact On-Board Computer (COBC). It is based on a rad-hard LEON3 SoC accompanied by a flash-based FPGA. The RCN offers a 50 MHz dual LEON3, a 64 MiByte EDAC-protected SDRAM, and non-volatile memories for boot images and arbitrary uses. For a detailed description, consider our earlier publication [12]. For the development and demonstration work in ScOSA, an adapted breadboard version was used. The current RCN implementation also supports Consultative Committee for Space Data Systems (CCSDS) compatible telemetry and telecommand interfaces, and is prepared to take the role of the satellite bus on-board computer.

4.3 Interface Node

In our base design, the IFN exclusively interfaces the components of the Attitude Control System (ACS), respectively their simulation in the Hardware-in-the-Loop (HiL) setup. IFNs are likely to be application-specific, concentrating most hardware modifications in this type of component. The usage of multiple IFNs is possible, enabling their physical distribution in the overall system.

4.4 Network

We chose SpaceWire for ScOSA in the space environment, as it is well supported in the space domain and was developed with a focus on space requirements. SpaceWire's limited speed is a concern, but adopting a different technology, such as SpaceFibre, is considered feasible. The HPNs are additionally interconnected via Gigabit Ethernet among each other, as an optional inter-node link, and with the development PC for debugging. The topology of the SpaceWire network can be changed flexibly, either by altering the physical connections or via the FPGA SpaceWire router firmware.

5 Software

The system software stack enables applications to fully utilize ScOSA's distributed nature. It enables a distribution of the tasks of an application across the available computing nodes and connects them via communication channels. To enable this and to mitigate potential system-internal faults, a layered approach was chosen. This results in a clear hierarchy and data flow within the separated system components, forming a robust and reactive system. These components are illustrated in Figure 2.

The following subsections provide insights into the different components that compose the system software stack. The system management services are at the topmost layer. They provide high-level functionality, such as reconfiguration and reintegration of nodes as well as monitoring and health management tasks. The Distributed Tasking Framework is positioned at the next lower layer. It offers the interface for application developers to integrate their algorithms by defining tasks and communication channels between them. Which task is executed on which node is determined by the configuration of the system and can be changed at run-time.
Lower layers in the software stack ensure correct routing of the packets from one task on one node to another task on another node via the communication channels. This mechanism and others are implemented in SpaceWire-IPC [13], our robust and Fault Detection, Isolation and Recovery (FDIR)-capable transmission protocol. It can be deployed on both SpaceWire and Ethernet links. Application developers are free to include third-party libraries, and their operating-system dependencies are also respected in the deployment of tasks. The final layer is represented by the OUTPOST library [14] and the operating system, which provide the abstraction from the underlying hardware. This is an important aspect due to the heterogeneous hardware components employed in the ScOSA setup.

Figure 2: The layered software structure enables developing system components with clear interfaces. Nonetheless, applications can access the OS-agnostic layers for maximum flexibility while remaining portable across the heterogeneous nodes.

5.1 System Management Services

The individual system management services are listed in Figure 3. These implement the majority of the system's FDIR capabilities. They range from the data-based voter, which implements triple modular redundancy, to the inter-node-level reconfiguration and reintegration services. Subsequently, a selection of services is given a short introduction that includes their sphere of influence and the faults they handle. The remaining services will be described in the project's concluding software paper.

Figure 3: A list of system management services and how they are categorized according to their FDIR capabilities. The green services mask errors, while the higher levels contain errors within their domains.

The reconfiguration and reintegration services work hand in hand to configure which nodes are currently active, integrating rebooted nodes back into the system and reacting to failures reported by other services. When the system initially boots, the reintegration service on each node is responsible for acquiring the system's current state and then requesting the master node to include it in the initial configuration. The reintegration service on the master node then notifies all active nodes to switch to a specific configuration, such as the initial configuration. Changing configurations is handled by the reconfiguration service on each node.

During a reconfiguration, all application tasks are paused to allow them to be redistributed among the active nodes. During this time, the associations between the channels and the tasks are set according to the configuration. This also includes the routing of the channels between the nodes. This is an important aspect, since a channel's endpoints can be required on multiple nodes; to avoid unnecessary traffic, the channel is only routed to those nodes that have tasks depending on it. Once the reconfiguration has been completed, each node is informed by the master node that the system is ready, and the application's execution is resumed.

When a fault occurs that cannot be masked, the standard system response is to reconfigure itself to exclude the node on which the error occurred. This is handled by the reconfiguration service, which places the system into a reconfiguration state, redistributes the application's tasks, and then resumes operation once each active node has confirmed that it successfully reconfigured itself.
The node that was excluded will reboot itself in order to resume from a known good state. Once it passes its own self-checks, it sends a request to the master to be reintegrated. The process of recovering the failed node is handled by the reintegration service, and the reconfiguration service is then informed of this node's renewed availability. The reconfiguration service can then, at a given point in the future, reconfigure the system to include the recovered node. This allows the system's degraded performance to be fully recovered.

The monitoring service is responsible for detecting the loss of a node and informing the relevant services about this loss. Once a loss has been detected, its state is updated locally and the master node is informed. The detection is performed through heartbeats that regularly ping a node and expect an acknowledgment within a given time. The absence of the acknowledgment leads to the conclusion that the node is no longer available. Other services are responsible for determining the cause of the failure.

The health manager service measures soft metrics that can lead to faults which, if not handled, cause failures. The health manager service is broken down into increasingly higher-level observations. At the lowest level, its abnormality detection is used to notice when specific system parameters leave their expected range. In the case of the channels that connect tasks with one another, how full they are during nominal operation is compared to their current state. If the remaining buffer of a channel starts to diminish, it can be a good signal that a node is unable to keep up with the data it is sent to process. Another aspect is the number of transmission attempts that SpaceWire-IPC, the network protocol, needs to transmit a message.

Furthermore, the health manager's tasks include diagnostics reporting that gives context to individual parameters. This is useful for drawing conclusions from individual events, which can range from abnormal behavior to faults and system-wide failures. This can be used to extract a percentage of the system's observed health state. At the highest level, the service provides prognostics about when the next system failure is expected.

Another important service is the checkpointing service, which is responsible for data consistency across reconfigurations. The service provides an interface for application developers to periodically store their task's state. This state is then distributed to other nodes so that, if the task's node fails, the task can be resumed from the checkpointed state. After a reconfiguration completes, a task is provided with the last available globally consistent state, which reduces data loss. One example is the relocation of a Kalman filter: its stability can be preserved if it is initialized with a state from its past which is an approximation of its current state had the relocation not occurred.

Figure 4: Example of a Distributed Tasking application. Despite tasks 2 and 3 running on different nodes, the same data is pushed to both of them. Effectively, the channel exists on both nodes and is handled transparently from the application's view.

5.2 Distributed Tasking

In order to deal with distributed systems, the Tasking Framework [15] was extended to handle enabling and disabling tasks during reconfigurations and routing the data that is pushed into channels to the appropriate nodes.

Application developers implement the encapsulated logic of their application in tasks and then connect these discrete tasks with one another via channels. How the tasks can be distributed between nodes is shown in Figure 4.

After a reconfiguration that results in the loss of Node B, it is possible to relocate Task 3 to Node A. This will increase the CPU load of that node and potentially reduce its throughput. Nonetheless, from a functional point of view, the system behaves in the same way as it did before the reconfiguration. This can be extended to more complex applications consisting of a multitude of nodes, tasks and channels.
From a practical point of view, the system can scale up to 20 individual nodes.

5.3 SpaceWire-IPC

The communication between nodes is handled by the SpaceWire-IPC [13] transmission protocol. Its main features include being FDIR-capable and functioning with a range of networks and architectures.

Regarding the FDIR aspects, SpaceWire-IPC is capable of sending heartbeat requests to other nodes and, in case of a missed acknowledgment, will create an error notification. Similarly, in order to fulfill the system's robustness requirements, if a message cannot be delivered within a specific time frame, a similar notification will be generated. These error notifications are forwarded to the system management services, where the appropriate actions, i.e. FDIR, can then be taken.

5.4 OUTPOST and the OSes

OUTPOST [14] is a collection of low-level libraries that provide uniform interfaces for operating-system functionality. These include mutexes, interfaces to User Datagram Protocol (UDP) sockets, and clocks. This is necessary for creating cross-platform applications. Due to the heterogeneous nature of the hardware platforms and operating systems, this makes it possible, for example, to write and run an application for an x86-based desktop and subsequently recompile the same source code for RTEMS on an ARM-based node.

Two operating systems are provided: RTEMS for real-time tasks and a custom Yocto Linux for the remaining tasks. The customization allows for including third-party system libraries and configuring the device drivers. This is relevant for adding support for the SpaceWire router residing in the Xilinx Zynq's FPGA.

6 Application Demonstration

6.1 Earth Observation

To illustrate the suitability of the ScOSA platform for earth observation applications, a ship detection application was ported to the ScOSA platform. It is a standard image-processing application, which processes RapidEye satellite imagery and prints out the position and object features of detected ships. It consists mainly of three steps: the creation of a land-water mask, the object extraction, and the deeper examination of each single object. One example of the processing can be seen in Figure 5. Details of the ship detection application can be found in [16].

Figure 5: Ship detection: example images. Left: input data. Right: result.

The application was available as an experimental version for demonstrating and analysing the capabilities of a ship detection method. It is a C++ application using the open-source computer vision library OpenCV [17] for reading in image data and for applying various filter functions to the images. Also, the Boost library [18] was used for various components, such as file-system access and the unit-test framework. The application was never meant to be ported to an on-board computer system, therefore no preparation was done in this regard.

When porting the application to the ScOSA platform, one goal was to share as much code as possible with the original application. The processing routines of the application were split into an application-core library, to be used by both the original and the ScOSA application. The application-core library depends on third-party libraries such as OpenCV and Boost. With the ScOSA platform supporting them, no parts of the application-core library had to be adapted. For the integration of the application into the tasking framework, at first a very simple approach was chosen.
We implemented three tasks: one for reading the input data, one for processing the data, and one for writing the results back to disk. This is illustrated in Figure 6. A runtime analysis reveals that most of the computing time is spent in the processing task. Therefore, we built a version with three processing tasks working in parallel, as illustrated in Figure 7.

Figure 6: Ship detection, dataflow of the first version: TaskReadData → TaskShipDetect → TaskWriteData.

Figure 7: Ship detection, dataflow of the second version, with three processing tasks (TaskShipDetect 1–3) working in parallel between TaskReadData and TaskWriteData.

For a single-CPU system, no increase in throughput is expected when migrating from the first to the second version. In the distributed setting of ScOSA, however, the second version can distribute the processing load across multiple nodes, thus effectively multiplying the total throughput.

6.2 Robotic On-Orbit Servicing

Regarding future On-Orbit Servicing (OOS) missions, the necessary robotics system not only requires a dedicated manipulator system able to work in space environments, but also the corresponding computing system to control the robotics system according to the specific requirements, e.g. concerning real-time behavior. In particular, a cycle time of 1 kHz for motion control implies an HPN running a real-time operating system to fulfil this time constraint. Besides this motion controller, the robotics system should have a certain level of autonomy, which means that high-level task commands, e.g. received from a ground station on Earth, must be interpreted on-board, i.e. decomposed into subtasks for execution on the robot controller. Therefore, two function blocks have to be integrated into the ScOSA system:

• A task interpreter, which checks the incoming telecommands and provides the robot controller with the respective subcommands and parameters.

• A robot controller, which calculates the motion reference values for the manipulator system in real-time.

The OOS application demonstrates the implementation of such a space robotics control application on the ScOSA system, using the example described in [19]. It demonstrates the capability of running a 7 degrees of freedom (DOF) robotic arm with the ScOSA HPN nodes. The system setup shown in Figure 8 contains three different nodes, of which two are HPNs (Nodes 2 and 3) and one is the "ground operation" node for generating and sending telecommands as well as receiving and displaying telemetry data (Node 1).

Node 2, running Yocto Linux on an HPN, is responsible for the task decomposition, control and supervision of the application. It provides telemetry data and receives the command data from the ground operation node.

Node 3, running RTEMS on an HPN, contains all the real-time critical software components and connects to the robot hardware via the network interface. This real-time controller module (designed in Matlab/Simulink) runs at 1 kHz and contains the model with all control calculations as well as the hardware interface, which acts as an EtherCAT master for the robot. This module is controlled by the robotControl task, which in turn communicates with the taskControl task on Node 2.
Figure 8: OOS architectural overview

2 talks with Node 3 via SpaceWire. But using the of the framework the HiL test bench utilized for the
ScOSA tasking and communication system, the com- verification of the original mission was reused to test
munication type is fully transparent. the performance of the software running on a RCN.

6.3 Attitude Control System

The Attitude Control System (ACS) application demonstrates the real-time capability of the ScOSA framework by running realistic software on the space-qualified architecture of the RCN. The application was ported from the ACS software of the satellite mission Eu:CROPIS [20]. Eu:CROPIS is a spin-stabilized satellite with magnetic attitude control torquers, scheduled to be launched in 2018. For demonstrating the maturity of the framework, the HiL test bench utilized for the verification of the original mission was reused to test the performance of the software running on an RCN.

The RCN was extended with an IFN, overcoming the limited number of serial ports; the ScOSA framework transparently gains access to the serial channels on the IFN. Both nodes are connected via SpaceWire. The twelve available serial ports are connected to a dSPACE system, a simulation environment running a Simulink model in real time. This model simulates the sensor characteristics, including the real hardware's protocol, and the magnetic torquer control unit. Within the Eu:CROPIS mission, sensors and actuators are available in a redundant fashion.
For the HiL setup, only the gyroscopes are partly redundant, given that four sensors are arranged on the faces of a tetrahedron. That way, one gyroscope can fail during the mission while maintaining a three-axis attitude determination. The (simulated) devices connected to the RCN are:

• 4 Gyroscopes
• 1 Magnetometer
• 5 Sun sensors
• 1 Magnetic torquer control unit
• 1 Global positioning system (GPS) receiver
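The tetrahedral gyroscope arrangement can be illustrated numerically: with sense axes along the four face normals, any three of the four measurements still determine the full three-axis body rate. This is a toy calculation for intuition, not flight code, and the axis geometry is an assumed regular tetrahedron.

```python
import numpy as np

# Four gyro sense axes along the face normals of a regular tetrahedron.
A = np.array([[ 1,  1,  1],
              [ 1, -1, -1],
              [-1,  1, -1],
              [-1, -1,  1]]) / np.sqrt(3.0)

omega_true = np.array([0.1, -0.2, 0.05])  # body rate [rad/s]
meas = A @ omega_true                     # one reading per gyroscope

# All four gyros: least-squares recovery of the three-axis rate.
omega_est, *_ = np.linalg.lstsq(A, meas, rcond=None)

# Gyro 0 failed: the remaining three axes still span R^3, so
# three-axis attitude rate determination is maintained.
omega_fail, *_ = np.linalg.lstsq(A[1:], meas[1:], rcond=None)

print(np.allclose(omega_est, omega_true))   # True
print(np.allclose(omega_fail, omega_true))  # True
```

The redundancy comes from the measurement matrix: every 3-row sub-matrix of the tetrahedral configuration has full rank, which is exactly why one sensor may fail without losing observability.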
The basic functionality of the state estimation, the magnetic control, and the vehicle mode transitions was proven on the ScOSA hardware. This verifies the real-time capability and the sufficient performance of the breadboard.

The Distributed Tasking framework from Section 5.2 is derived from the same framework that Eu:CROPIS uses for scheduling. This limited the porting effort, except for the interface of the (SpaceWire-tunneled) serial ports. It also gives the opportunity to split the application along the already implemented task structure, mainly the attitude Estimator task and the Controller task. These tasks could be distributed over multiple RCNs, transparently diverting the data over the SpaceWire network to balance the computational effort over multiple CPUs, or to handle redundancy in case one RCN breaks down.
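The Estimator/Controller split described above can be sketched as two tasks coupled only through a channel abstraction. The sketch is hypothetical: the real Tasking Framework is a C++ middleware and its API differs, but the point stands that a SpaceWire-backed channel with the same push/pop interface would move the Controller to another RCN without code changes.

```python
from queue import Queue

class Channel:
    """Stand-in for a ScOSA communication channel. A SpaceWire-backed
    implementation exposing the same push/pop interface would make the
    task distribution transparent to both tasks."""
    def __init__(self):
        self._q = Queue()
    def push(self, item):
        self._q.put(item)
    def pop(self):
        return self._q.get()

def estimator_task(samples, out):
    # Placeholder attitude estimate: average the raw sensor samples.
    out.push(sum(samples) / len(samples))

def controller_task(inp):
    # Placeholder control law acting on the received estimate.
    return -0.5 * inp.pop()

ch = Channel()
estimator_task([1.0, 1.0, 1.0], ch)
command = controller_task(ch)
print(command)  # -0.5
```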
6.4 Autonomous Rendezvous Navigation

The Autonomous Rendezvous Navigation application is capable of performing a semi-autonomous, ground-controlled rendezvous maneuver towards an uncooperative target satellite on the European Proximity Operations Simulator (EPOS) HiL simulator. The application makes use of three Yocto Linux HPNs to perform the computationally expensive 6D pose estimation.

Figure 9: Rendezvous Navigation Setup

The components pose estimation, guidance, navigation and control (GNC), shown in Figure 9 and described in detail in [21], are implemented as separate tasks inside the ScOSA Tasking Framework. They can be controlled independently via PUS-conform telecommands from ground and monitored by telemetry. This GNC part, which interfaces with the satellite simulator (which in turn interfaces with EPOS, see [22]), operates at a fixed frame rate of 10 Hz and handles the estimated poses asynchronously.

This setup can be integrated into the End-to-End Simulation, enabling the verification of ScOSA hardware and software in an environment as realistic as possible, including the ground segment, communication, the on-board computer and the HiL-simulated space segment. See [23] for more details.

7 Hardware Accelerator

Sometimes even the best commercial processors are not fast enough for demanding tasks like image processing. For these tasks the parallel architecture of FPGAs is more powerful and efficient. The Xilinx Zynq SoC, the core element of the HPN, already contains a significant number of FPGA cells. Some FPGA fabric is reserved for chip interfacing and the SpaceWire router, but a significant part remains available for application-specific hardware accelerators.

The FPGA configuration file is essentially a combination of fixed building blocks: I/O-specific functionality, the SpaceWire router, and one or more Reconfigurable Partitions (RP) that act as placeholders for application-specific code. The application-specific blocks are so-called Reconfigurable Modules (RM), which the processor can load into the appropriate RP during the runtime of the system. This combination of RPs and RMs supports the system reconfiguration explained in Section 3.4.
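On Linux-based Zynq nodes, loading an RM (a partial bitstream) into an RP at runtime is often driven from the processor through the kernel's fpga_manager interface. The sketch below models such a load sequence; the sysfs paths, flag value, and bitstream name are assumptions for illustration and are not taken from the paper (the exact mechanism varies by kernel version and is RTEMS/ScOSA-specific in reality).

```python
# Assumed sysfs entry points of the Linux fpga_manager (illustrative).
FPGA_FLAGS = "/sys/class/fpga_manager/fpga0/flags"
FPGA_FIRMWARE = "/sys/class/fpga_manager/fpga0/firmware"

def load_module(bitstream_name, write=None):
    """Request loading of a partial bitstream into an RP. The `write`
    function is injectable so the sequence can be exercised without
    real hardware; by default the writes are only recorded."""
    ops = []
    w = write or (lambda path, value: ops.append((path, value)))
    w(FPGA_FLAGS, "1")                # request partial reconfiguration
    w(FPGA_FIRMWARE, bitstream_name)  # firmware loader fetches the RM
    return ops

# Hypothetical RM name for a stereo-matching accelerator.
ops = load_module("sgm_rm.bit")
print(ops)
```

Because only the RP region is rewritten, the static part of the design (SpaceWire router, chip interfacing) keeps running while the application-specific accelerator is exchanged.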
7.1 FPGA API

To preserve the scalable nature of ScOSA, we sought a general approach for integrating different types of hardware accelerators seamlessly into the tasking middleware. The FPGA accelerators shall process small or medium-sized pieces of an algorithm. The input data is taken directly from memory and the output data is written back into memory via DMA, so from the hardware point of view a shared memory model is used. The tasking middleware, however, follows a message-based approach. Two tasks are therefore utilized to involve hardware accelerators in the overall processing flow (see Figure 10). The first task sets up the accelerator and the corresponding DMA engine. The second task acquires the results of the hardware processing and transfers them to the next task. The hardware issues an interrupt request after the processing has finished. The interrupt is fetched by the tasking middleware, which combines the message sent by the first task with the interrupt and puts the second task into the execution queue.
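The two-task handshake described above can be modeled in a few lines. This is a purely illustrative model: the class and function names are invented and the real ScOSA middleware API is not shown in the paper.

```python
from collections import deque

class Middleware:
    """Minimal model of the message/interrupt pairing."""
    def __init__(self):
        self.pending_msgs = deque()  # messages sent by the setup task
        self.ready = deque()         # tasks queued for execution

    def send(self, msg):
        self.pending_msgs.append(msg)

    def on_interrupt(self, completion_task):
        # Combine the oldest setup message with the interrupt and put
        # the completion task into the execution queue.
        msg = self.pending_msgs.popleft()
        self.ready.append((completion_task, msg))

def setup_task(mw, buf):
    # Task 1: would program the accelerator and the DMA engine here.
    mw.send({"len": len(buf)})

def result_task(msg):
    # Task 2: would read the DMA output buffer here.
    return f"acquire {msg['len']} words"

mw = Middleware()
setup_task(mw, buf=[0] * 4)
mw.on_interrupt(result_task)     # hardware signals completion
task, msg = mw.ready.popleft()   # middleware dispatches task 2
output = task(msg)
print(output)  # acquire 4 words
```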
Figure 10: Involving hardware-acceleration in the tasking-middleware

7.2 Stereo Image Processing

A depth map generated from stereo images is useful or even mandatory for many tasks in robotics, especially for capturing free-flying objects or for autonomous rover exploration. An algorithm generally used to compute dense disparity and depth maps from stereo images is Semi-Global Matching (SGM) [24]. Besides the advantages of SGM in robustness, density and accuracy, there is the disadvantage of algorithmic complexity and expense. An FPGA implementation of this algorithm was chosen as an example to demonstrate the performance capabilities of FPGAs and the reconfiguration concept. The FPGA-based algorithm can significantly speed up the depth-map generation task compared to a CPU-only solution.
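For intuition, the per-pixel matching cost that SGM starts from can be shown with a tiny 1-D block-matching example: the disparity of a pixel is the shift that minimizes the sum of absolute differences (SAD) between windows of the two scanlines. Real SGM additionally aggregates these costs along several image paths; this sketch is in no way the FPGA implementation referenced above.

```python
def disparity(left, right, x, window=1, max_disp=2):
    """Disparity of pixel x on a 1-D scanline pair, by minimizing the
    SAD matching cost (indices are assumed to stay in range)."""
    def sad(d):
        return sum(abs(left[x + k] - right[x - d + k])
                   for k in range(-window, window + 1))
    return min(range(max_disp + 1), key=sad)

left  = [0, 0, 9, 8, 7, 0, 0, 0]
right = [9, 8, 7, 0, 0, 0, 0, 0]    # same feature, shifted by 2 pixels
print(disparity(left, right, x=3))  # 2
```

Because this local cost is computed independently per pixel and per disparity candidate, it maps naturally onto the parallel FPGA fabric, which is the property the SGM accelerator exploits.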

8 Conclusion

This paper introduces the concepts and the actual architecture of a distributed on-board computing system for space applications, with the overall intent of providing high performance and reliability. These attributes are achieved by creating a hybrid system consisting of traditional radiation-hardened and COTS components. These components are used to create different nodes with specific properties, connected through a SpaceWire network. Thereby it is possible to reconfigure the whole system during runtime, which on the one hand allows workload balancing to further increase the computational performance, and on the other hand covers the reaction to and the mitigation of occurring faults.

Multiple demonstrations are applied to evaluate the system's capabilities and performance. These demonstrations range from real-time-dependent ACS tasks and robotic motion control to computationally expensive autonomous rendezvous navigation and data processing for earth observation.

At the time of writing, the development and evaluation are ongoing. The prospect after a successful evaluation is to increase the technological readiness of the ScOSA platform and to identify a suitable path from the testbed to demonstration and use in space.

References

[1] D. Lüdtke, K. Westerdorff, K. Stohlmann, A. Börner, O. Maibaum, T. Peng, B. Weps, G. Fey, and A. Gerndt, "OBC-NG: Towards a reconfigurable on-board computing architecture for spacecraft," in IEEE Aerospace Conference, March 2014.
[2] M. Pignol, "COTS-based applications in space avionics," in Design, Automation and Test in Europe Conference and Exhibition (DATE), 2010.
[3] G. Lentaris, K. Maragos, I. Stratakos, L. Papadopoulos, O. Papanikolaou, D. Soudris, M. Lourakis, X. Zabulis, D. Gonzalez-Arjona, and G. Furano, "High-performance embedded computing in space: Evaluation of platforms for vision-based navigation," Journal of Aerospace Information Systems, 2018, pp. 178–192.
[4] C. Wilson and A. George, "CSP hybrid space computing," Journal of Aerospace Information Systems, 2018, pp. 215–227.
[5] GomSpace A/S, "NanoMind Z7000." [Online]. Available: https://gomspace.com/Shop/subsystems/computers/nanomind-z7000.aspx
[6] M. Amrbar, F. Irom, S. M. Guertin, and G. Allen, "Heavy ion single event effects measurements of Xilinx Zynq-7000 FPGA," in IEEE Radiation Effects Data Workshop (REDW), 2015.
[7] L. A. Tambara, F. L. Kastensmidt, N. H. Medina, N. Added, V. A. P. Aguiar, F. Aguirre, E. L. A. Macchione, and M. A. G. Silveira, "Heavy ions induced single event upsets testing of the 28 nm Xilinx Zynq-7000 all programmable SoC," in IEEE Radiation Effects Data Workshop (REDW), 2015.
[8] D. Elftmann, "Xilinx on-orbit reconfigurable Kintex UltraScale FPGA technology for space," Xilinx, Tech. Rep., 2018. [Online]. Available: https://indico.esa.int/event/232/contributions/2161/attachments/1811/2111/2018-04-09 Xilinx Space Products SEFUW.pdf
[9] J. W. Alexander, B. J. Clement, K. P. Gostelow, and J. Y. Lai, "Fault mitigation schemes for future spaceflight multicore processors," in Design, Automation and Test in Europe (DATE), 2012, pp. 358–367. [Online]. Available: https://trs.jpl.nasa.gov/bitstream/handle/2014/42752/12-2172 A1b.pdf
[10] "OBC-SA." [Online]. Available: https://www.fokus.fraunhofer.de/go/obcsa
[11] M. Fayyaz and T. Vladimirova, "Survey and future directions of fault-tolerant distributed computing on board spacecraft," Advances in Space Research, 2016, pp. 2352–2357.
[12] C. J. Treudler, J.-C. Schröder, F. Greif, K. Stohlmann, G. Aydos, and G. Fey, "Scalability of a base level design for an on-board-computer for scientific missions," in Data Systems In Aerospace (DASIA), 2014.
[13] T. Peng, B. Weps, K. Höflinger, K. Borchers, D. Lüdtke, and A. Gerndt, "A new SpaceWire protocol for reconfigurable distributed on-board computers," in International SpaceWire Conference 2016, October 2016, pp. 175–182. [Online]. Available: https://elib.dlr.de/108116/
[14] F. Greif, M. Bassam, J. Sommer, N. Toth, J.-G. Mess, A. Ofenloch, R. Rinaldo, M. Goeksu, B. Weps, O. Maibaum, J. Reinking, and M. Ulmer, "Outpost," https://github.com/DLR-RY/outpost-core/, 2018.
[15] O. Maibaum, D. Lüdtke, and A. Gerndt, "Tasking framework: Parallelization of computations in onboard control systems," in ITG/GI Fachgruppentreffen Betriebssysteme, November 2013, http://www.betriebssysteme.org/Aktivitaeten/Treffen/2013-Berlin/Programm/. [Online]. Available: https://elib.dlr.de/87505/
[16] K. A. M. Willburger and K. Schwenk, "Using the time shift in single pushbroom datatakes to detect ships and their heading," in Proc. SPIE, vol. 10427, 2017. [Online]. Available: https://doi.org/10.1117/12.2277535
[17] OpenCV. (2018) OpenCV project homepage. [Online]. Available: https://opencv.org/
[18] Boost. (2018) Boost C++ libraries project homepage. [Online]. Available: https://www.boost.org/
[19] G. Hirzinger, K. Landzettel, B. Brunner, M. Fischer, C. Preusche, D. Reintsema, A. Albu-Schäffer, G. Schreiber, and B.-M. Steinmetz, "DLR's robotics technologies for on-orbit servicing," Advanced Robotics, vol. 18, no. 2, pp. 139–174, 2004. [Online]. Available: https://doi.org/10.1163/156855304322758006
[20] A. Heidecker, T. Kato, O. Maibaum, and M. Hölzel, "Attitude control system of the Eu:CROPIS mission," in 65th International Astronautical Congress, International Astronautical Federation, October 2014. [Online]. Available: https://elib.dlr.de/90977/
[21] F. Rems, E. Risse, and H. Benninghoff, "Rendezvous GNC-system for autonomous orbital servicing of uncooperative targets," in Proceedings of the 10th International ESA Conference on Guidance, Navigation and Control Systems, Salzburg, Austria, 2017. [Online]. Available: https://elib.dlr.de/112587/
[22] H. Benninghoff, F. Rems, E.-A. Risse, and C. Mietner, "European Proximity Operations Simulator 2.0 (EPOS) – a robotic-based rendezvous and docking simulator," Journal of large-scale research facilities JLSRF, vol. 3, April 2017.
[23] H. Benninghoff, F. Rems, E. Risse, B. Brunner, M. Stelzer, R. Krenn, M. Reiner, C. Stangl, and M. Gnat, "End-to-end simulation and verification of GNC and robotic systems considering both space segment and ground segment," CEAS Space Journal, January 2018.
[24] H. Hirschmüller, "Accurate and efficient stereo processing by semi-global matching and mutual information," in CVPR 2005, vol. 2, IEEE, June 2005, pp. 807–814. [Online]. Available: https://elib.dlr.de/22952/
