Search Results (406)

Search Parameters:
Keywords = Xilinx

19 pages, 3611 KiB  
Article
Delineation of Optimized Single and Multichannel Approximate DA-Based Filter Design Using Influential Single MAC Strategy for Trans-Multiplexer
by Britto Pari James, Leung Man-Fai, Mariammal Karuthapandian and Vaithiyanathan Dhandapani
Sensors 2024, 24(22), 7149; https://doi.org/10.3390/s24227149 - 7 Nov 2024
Viewed by 291
Abstract
This paper proposes a multichannel FIR filter design based on the Time Division Multiplex (TDM) approach. By applying resource sharing, the design uses a single multiply-and-add unit regardless of the coefficient length and the number of channels. A multiplier based on approximate distributed arithmetic (DA) circuits is employed for effective resource optimization. Although no explicit multiplication is performed in this realization, the radix-8 and radix-4 Booth algorithms are used within the DA framework to reduce the number of partial products (PPs). Furthermore, the input stream is truncated, with an error-correction unit, to construct the partial products approximately. The PPs are accumulated with an approximate Wallace tree to further reduce hardware cost. Consequently, the latency, area, and power consumption of the proposed design are largely reduced. The proposed 16-tap multichannel realization is described and simulated in Verilog and synthesized for a Xilinx Virtex device. The single-channel filter structure produces the desired results and supports a clock frequency of up to 429 MHz with a reduced area. Cell-level performance is also evaluated using TSMC 180 nm CMOS technology and the Cadence RC compiler. Full article

32 pages, 28323 KiB  
Article
FPGA Realization of an Image Encryption System Using a 16-CPSK Modulation Technique
by Jose-Cruz Nuñez-Perez, Miguel-Angel Estudillo-Valdez, Yuma Sandoval-Ibarra and Vincent-Ademola Adeyemi
Electronics 2024, 13(22), 4337; https://doi.org/10.3390/electronics13224337 - 5 Nov 2024
Viewed by 722
Abstract
Nowadays, M-Quadrature Amplitude Modulation (M-QAM) techniques are widely used to modulate information by bit packets due to their ability to increase transfer rates. These techniques require more power when increasing the modulation index M to avoid interference between symbols. This article proposes a technique that does not suffer from interference between symbols, but instead uses memory elements to store the modulation symbols. In addition, the aim of this paper is to implement a four-dimensional reconfigurable chaotic oscillator that generates 16-Chaotic Phase Shift Keying (16-CPSK) modulation–demodulation carriers. The system consists of an encryption and modulation transmitter module, a reception module, and a master–slave Hamiltonian synchronization module. The main contributions of this work are a 16-CPSK modulation scheme implemented on a Field Programmable Gate Array (FPGA) and its application to a red-green-blue (RGB) and grayscale image encryption system. Matlab and Vivado were used to verify the modulation–demodulation scheme and synchronization. This proposal achieved excellent correlation coefficients according to various investigations, the lowest being 15.9 × 10⁻⁶ and 0.13 × 10⁻³ for RGB and grayscale format images, respectively. The FPGA implementation of the 16-CPSK modulation–demodulation system was carried out using a manufacturer’s card, Xilinx’s Artix-7 AC701 (XC7A200TFBG676-2). Full article
(This article belongs to the Section Microwave and Wireless Communications)

20 pages, 6537 KiB  
Article
A Field-Programmable Gate Array-Based Adaptive Sleep Posture Analysis Accelerator for Real-Time Monitoring
by Mangali Sravanthi, Sravan Kumar Gunturi, Mangali Chinna Chinnaiah, Siew-Kei Lam, G. Divya Vani, Mudasar Basha, Narambhatla Janardhan, Dodde Hari Krishna and Sanjay Dubey
Sensors 2024, 24(22), 7104; https://doi.org/10.3390/s24227104 - 5 Nov 2024
Viewed by 319
Abstract
This research presents a sleep posture monitoring system designed to assist the elderly and patient attendees. Monitoring sleep posture in real time is challenging, and this approach introduces hardware-based edge computation methods. Initially, we detected the postures using minimally optimized sensing modules and fusion techniques. This was achieved based on subject (human) data at standard and adaptive levels using posture-learning processing elements (PEs). Intermittent posture evaluation was performed with respect to static and adaptive PEs. The final stage was accomplished using the learned subject posture data versus the real-time posture data using posture classification. An FPGA-based Hierarchical Binary Classifier (HBC) algorithm was developed to learn and evaluate sleep posture in real time. The IoT and display devices were used to communicate the monitored posture to attendant/support services. Posture learning and analysis were developed using customized, reconfigurable VLSI architectures for sensor fusion, control, and communication modules in static and adaptive scenarios. The proposed algorithms were coded in Verilog HDL, simulated, and synthesized using VIVADO 2017.3. A Zed Board-based field-programmable gate array (FPGA) Xilinx board was used for experimental validation. Full article
(This article belongs to the Special Issue Robust Motion Recognition Based on Sensor Technology)

16 pages, 6070 KiB  
Article
Implementation of a Reduced Decoding Algorithm Complexity for Quasi-Cyclic Split-Row Threshold Low-Density Parity-Check Decoders
by Bilal Mejmaa, Chakir Aqil, Ismail Akharraz and Abdelaziz Ahaitouf
Information 2024, 15(11), 684; https://doi.org/10.3390/info15110684 - 1 Nov 2024
Viewed by 348
Abstract
We propose two decoding algorithms for quasi-cyclic LDPC codes (QC-LDPC) and implement the more efficient one in this paper. These algorithms depend on the split row for the layered decoding method applied to the Min-Sum (MS) algorithm. We designate the first algorithm “Split-Row Layered Min-Sum” (SRLMS), and the second algorithm “Split-Row Threshold Layered Min-Sum” (SRTLMS). A threshold message passes from one partition to another in SRTLMS, minimizing the gap from the MS and achieving a bit error rate of 3 × 10⁻⁵ with Imax = 4 as the maximum number of iterations, resulting in a decrease of 0.25 dB. The simulation results indicate that SRTLMS is the most efficient decoding variant for LDPC codes, thanks to its compromise between performance and complexity. This paper presents the two proposed algorithms and a comprehensive study of the co-design and implementation of the SRTLMS algorithm. We executed the implementation on a Xilinx Kintex-7 XC7K160 FPGA, achieving a maximum operating frequency of 101 MHz and a throughput of 606 Mbps. Full article

16 pages, 4393 KiB  
Article
A Field-Programmable Gate Array-Based Quasi-Cyclic Low-Density Parity-Check Decoder with High Throughput and Excellent Decoding Performance for 5G New-Radio Standards
by Bilal Mejmaa, Ismail Akharraz and Abdelaziz Ahaitouf
Technologies 2024, 12(11), 215; https://doi.org/10.3390/technologies12110215 - 31 Oct 2024
Viewed by 628
Abstract
This work presents a novel fully parallel decoder architecture designed for high-throughput decoding of Quasi-Cyclic Low-Density Parity-Check (QC-LDPC) codes within the context of 5G New-Radio (NR) communication. The design uses the layered Min-Sum (MS) algorithm and focuses on increasing throughput to meet the strict needs of enhanced Mobile BroadBand (eMBB) applications. We incorporated a Sub-Optimal Low-Latency (SOLL) technique to enhance the critical check node processing stage inherent to the MS algorithm. This technique efficiently computes the two minimum values, rendering the architecture well-suited for specific Ultra-Reliable Low-Latency Communication (URLLC) scenarios. We design the decoder to be reconfigurable, enabling efficient operation across all expansion factors. We rigorously validate the decoder’s effectiveness through meticulous bit-error-rate (BER) performance evaluations using Hardware Description Language (HDL) co-simulation. This co-simulation utilizes a well-established suite of tools encompassing MATLAB/Simulink for system modeling and Vivado, a prominent FPGA design suite, for hardware representation. With 380,737 Look-Up Tables (LUTs) and 32,898 registers, the decoder’s implementation on a Virtex-7 XC7VX980T FPGA platform by AMD/Xilinx shows good hardware utilization. The architecture attains a robust operating frequency of 304.5 MHz and a normalized throughput of 49.5 Gbps, marking a 36% enhancement compared to the state-of-the-art. This advancement propels decoding capabilities to meet the demands of high-speed data processing. Full article
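The check-node step mentioned in the abstract above can be illustrated with a minimal sketch. It shows the exact two-minimum search commonly used in Min-Sum decoding, not the paper's Sub-Optimal Low-Latency (SOLL) approximation, whose details the abstract does not give. The function name `check_node_update` and the floating-point message format are assumptions made for illustration only.

```python
def check_node_update(msgs):
    """Min-Sum check-node update for one check node.

    msgs: incoming variable-to-check messages (LLRs), at least two entries.
    Returns the outgoing check-to-variable messages in the same order.
    Hypothetical helper: exact two-minimum search, not the paper's SOLL unit.
    """
    min1 = min2 = float("inf")   # smallest and second-smallest magnitudes
    idx1 = -1                    # position of the smallest magnitude
    sign_product = 1             # product of the signs of all inputs
    for i, m in enumerate(msgs):
        mag = abs(m)
        if m < 0:
            sign_product = -sign_product
        if mag < min1:
            min2, min1, idx1 = min1, mag, i
        elif mag < min2:
            min2 = mag

    out = []
    for i, m in enumerate(msgs):
        # Each edge excludes its own input: the edge holding min1 gets min2.
        mag = min2 if i == idx1 else min1
        # Dividing the total sign product by the edge's own sign.
        sign = sign_product if m >= 0 else -sign_product
        out.append(sign * mag)
    return out


if __name__ == "__main__":
    print(check_node_update([2.0, -0.5, 3.0, -1.5]))
```

Only the two smallest magnitudes, the index of the smallest, and the overall sign need to be stored per check node, which is why computing them efficiently matters for hardware throughput.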

24 pages, 5816 KiB  
Article
Adaptive FPGA-Based Accelerators for Human–Robot Interaction in Indoor Environments
by Mangali Sravanthi, Sravan Kumar Gunturi, Mangali Chinna Chinnaiah, Siew-Kei Lam, G. Divya Vani, Mudasar Basha, Narambhatla Janardhan, Dodde Hari Krishna and Sanjay Dubey
Sensors 2024, 24(21), 6986; https://doi.org/10.3390/s24216986 - 30 Oct 2024
Viewed by 396
Abstract
This study addresses the challenges of human–robot interactions in real-time environments with adaptive field-programmable gate array (FPGA)-based accelerators. Predicting human posture in indoor environments in confined areas is a significant challenge for service robots. The proposed approach works on two levels: the estimation of human location and the robot’s intention to serve based on the human’s location at static and adaptive positions. This paper presents three methodologies to address these challenges: binary classification to analyze static and adaptive postures for human localization in indoor environments using the sensor fusion method, adaptive Simultaneous Localization and Mapping (SLAM) for the robot to deliver the task, and human–robot implicit communication. VLSI hardware schemes are developed for the proposed method. Initially, the control unit processes real-time sensor data through PIR sensors and multiple ultrasonic sensors to analyze the human posture. Subsequently, static and adaptive human posture data are communicated to the robot via Wi-Fi. Finally, the robot performs services for humans using an adaptive SLAM-based triangulation navigation method. The experimental validation was conducted in a hospital environment. The proposed algorithms were coded in Verilog HDL, simulated, and synthesized using VIVADO 2017.3. A Zed-board-based FPGA Xilinx board was used for experimental validation. Full article
(This article belongs to the Special Issue Deep Learning for Perception and Recognition: Method and Applications)

37 pages, 1450 KiB  
Article
FPGA-Based Design of a Ready-to-Use and Configurable Soft IP Core for Frame Blocking Time-Sampled Digital Speech Signals
by Nettimi Satya Sai Srinivas, Nagarajan Sugan, Lakshmi Sutha Kumar, Malaya Kumar Nath and Aniruddha Kanhe
Electronics 2024, 13(21), 4180; https://doi.org/10.3390/electronics13214180 - 24 Oct 2024
Viewed by 715
Abstract
‘Frame blocking’ or ‘Framing’ is a technique that divides a time-sampled speech or audio signal into consecutive and equi-sized short-time frames, either overlapped or non-overlapped, for analysis. The framing hardware architectures (FHA) in the literature support framing speech or audio samples of specific word size with specific frame size and frame overlap size. However, speech and audio applications often require framing signal samples of varied word sizes with varied frame sizes and frame overlap sizes. Therefore, the existing FHAs must be redesigned appropriately to keep up with the variability in word size, frame size and frame overlap size, as demanded across multiple applications. Redesigning the existing FHAs for each specific application is laborious, prompting the need for a configurable intellectual property (IP) core. The existing FHAs are inappropriate for creating configurable IP cores as they lack adaptability to accommodate variability in frame size and frame overlap size. Therefore, to address these issues, a novel FHA, adaptable to accommodate the desired variability, is proposed. Furthermore, the proposed FHA is transformed into a field-programmable gate array-based soft, ready-to-use and configurable frame blocking IP core using the Xilinx® Vivado tool. The resulting IP core is versatile, offering configurability for framing in numerous applications incorporating real-time digital speech and audio systems. This research article discusses the proposed FHA and frame blocking IP core in detail. Full article
(This article belongs to the Special Issue Recent Advances in Signal Processing and Applications)
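The frame blocking operation described in the abstract above can be sketched in software as follows. This is a minimal illustration of the general technique, not the paper's hardware architecture; the function name `frame_blocking` and the choice to drop trailing samples that do not fill a whole frame are assumptions.

```python
def frame_blocking(samples, frame_size, overlap):
    """Split a sampled signal into consecutive equal-sized frames.

    samples:    sequence of time samples
    frame_size: number of samples per frame
    overlap:    number of samples shared by consecutive frames (0 = none)
    Trailing samples that do not fill a complete frame are dropped
    (an assumption; padding is another common choice).
    """
    if frame_size <= 0 or not 0 <= overlap < frame_size:
        raise ValueError("need frame_size > 0 and 0 <= overlap < frame_size")
    hop = frame_size - overlap  # distance between the starts of two frames
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frames.append(list(samples[start:start + frame_size]))
    return frames


if __name__ == "__main__":
    for frame in frame_blocking(range(10), frame_size=4, overlap=2):
        print(frame)
```

The three parameters that the abstract identifies as varying across applications (word size, frame size, and overlap size) map onto the element type of `samples`, `frame_size`, and `overlap` in this sketch.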

28 pages, 8907 KiB  
Article
LSTM-CRP: Algorithm-Hardware Co-Design and Implementation of Cache Replacement Policy Using Long Short-Term Memory
by Yizhou Wang, Yishuo Meng, Jiaxing Wang and Chen Yang
Big Data Cogn. Comput. 2024, 8(10), 140; https://doi.org/10.3390/bdcc8100140 - 21 Oct 2024
Viewed by 651
Abstract
As deep learning has produced dramatic breakthroughs in many areas, it has motivated emerging studies on combining neural networks with cache replacement algorithms. However, deep learning is a poor fit for cache replacement in hardware because its neural network models are impractically large and slow. Many studies have tried to use the guidance of the Belady algorithm to speed up the prediction of cache replacement, but it is still impractical to accurately predict the characteristics of future access addresses, which introduces inaccuracy in the discrimination of complex access patterns. Therefore, this paper presents the LSTM-CRP algorithm as well as its efficient hardware implementation, which employs long short-term memory (LSTM) for access pattern identification at run-time to guide the cache replacement algorithm. LSTM-CRP first converts the address into a novel key according to the frequency of the access address and a virtual capacity of the cache, which has the advantages of low information redundancy and high timeliness. Using the key as the input of four offline-trained LSTM network-based predictors, LSTM-CRP can accurately classify different access patterns and identify current cache characteristics in a timely manner via an online set dueling mechanism on sampling caches. For efficient implementation, heterogeneous lightweight LSTM networks are constructed in LSTM-CRP to lower hardware overhead and inference delay. The experimental results show that LSTM-CRP improved the cache hit rate on average by 20.10%, 15.35%, 12.11% and 8.49% compared with LRU, RRIP, Hawkeye and Glider, respectively. Implemented on a Xilinx XCVU9P FPGA at the cost of 15,973 LUTs and 1610 FF registers, LSTM-CRP ran at a 200 MHz frequency with 2.74 W power consumption. Full article

16 pages, 2967 KiB  
Technical Note
Field Programmable Gate Array (FPGA) Implementation of Parallel Jacobi for Eigen-Decomposition in Direction of Arrival (DOA) Estimation Algorithm
by Shuang Zhou and Li Zhou
Remote Sens. 2024, 16(20), 3892; https://doi.org/10.3390/rs16203892 - 19 Oct 2024
Viewed by 608
Abstract
The eigen-decomposition of a covariance matrix is a key step in Direction of Arrival (DOA) estimation algorithms such as the subspace class of methods. Eigen-decomposition using the parallel Jacobi algorithm implemented on FPGA offers excellent parallelism and real-time performance. Addressing the high complexity and resource consumption of the traditional parallel Jacobi algorithm implemented on FPGA, this study proposes an improved FPGA-based parallel Jacobi algorithm for eigen-decomposition. By analyzing the relationship between angle calculation and rotation during the Jacobi decomposition process, leveraging parallelism in the data processing, and applying the concepts of time-division multiplexing and parallel partition processing, this approach effectively reduces FPGA resource consumption. The improved parallel Jacobi algorithm is then applied to the classic DOA estimation algorithm, the MUSIC algorithm, and implemented on Xilinx’s Zynq FPGA. Experimental results demonstrate that this parallel approach can reduce resource consumption by approximately 75% compared to the traditional method while introducing little additional time consumption. The proposed method addresses the high hardware cost of FPGA-based eigen-decomposition in DOA applications. Full article
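To make the angle-calculation and rotation steps concrete, here is a minimal sequential sketch of the classical cyclic Jacobi eigenvalue method for a real symmetric matrix. It is not the paper's parallel FPGA scheme, and it assumes a real symmetric input, whereas DOA covariance matrices are generally complex Hermitian. The function name `jacobi_eigenvalues` and its parameters are hypothetical.

```python
import math


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations.

    Sequential reference sketch (assumption: real symmetric input).
    Returns the eigenvalues in ascending order.
    """
    n = len(matrix)
    a = [list(row) for row in matrix]  # work on a copy
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:  # off-diagonal energy is negligible: converged
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Angle calculation: choose theta so the (p, q) entry vanishes.
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                # Keep |theta| <= pi/4; tan(2*theta) is unchanged by this shift.
                if theta > math.pi / 4:
                    theta -= math.pi / 2
                elif theta < -math.pi / 4:
                    theta += math.pi / 2
                c, s = math.cos(theta), math.sin(theta)
                # Rotation, columns p and q: A <- A J
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # Rotation, rows p and q: A <- J^T A
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


if __name__ == "__main__":
    print(jacobi_eigenvalues([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]))
```

Rotations on disjoint index pairs (p, q) touch different rows and columns, which is the property that parallel Jacobi implementations exploit.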

19 pages, 2406 KiB  
Article
FPGA Realization of a Fractional-Order Model of Universal Memory Elements
by Opeyemi-Micheal Afolabi, Vincent-Ademola Adeyemi, Esteban Tlelo-Cuautle and Jose-Cruz Nuñez-Perez
Fractal Fract. 2024, 8(10), 605; https://doi.org/10.3390/fractalfract8100605 - 18 Oct 2024
Viewed by 1146
Abstract
This paper addresses critical gaps in the digital implementations of fractional-order memelement emulators, particularly given the challenges associated with the development of solid-state devices using nanomaterials. Despite the potential of these devices for industrial applications, the digital implementation of fractional-order models has received limited attention. This research contributes to bridging this knowledge gap by presenting the FPGA realization of the memelements based on a universal voltage-controlled circuit topology. The digital emulators successfully exhibit the pinched hysteresis behaviors of memristors, memcapacitors, and meminductors, showing the retention of historical states of their constitutive electronic variables. Additionally, we analyze the impact of the fractional-order parameters and excitation frequencies on the behaviors of the memelements. The design methodology involves using Xilinx System Generator for DSP blocks to lay out the architectures of the emulators, with synthesis and gate-level implementation performed on the Xilinx Artix-7 AC701 Evaluation kit, where the designs use about 1% of the available hardware resources. Further hardware analysis shows successful timing validation and low power consumption across all designs, with an average on-chip power of 0.23 W and an average worst negative slack of 0.6 ns against a 5 ns constraint. We validate these results with Matlab 2020b simulations, which align with the hardware models. Full article
(This article belongs to the Section Engineering)

19 pages, 6481 KiB  
Article
Parallel Lossless Compression of Raw Bayer Images on FPGA-Based High-Speed Camera
by Žan Regoršek, Aleš Gorkič and Andrej Trost
Sensors 2024, 24(20), 6632; https://doi.org/10.3390/s24206632 - 15 Oct 2024
Viewed by 577
Abstract
Digital image compression is applied to reduce camera bandwidth and storage requirements, but real-time lossless compression on a high-speed high-resolution camera is a challenging task. The article presents a hardware implementation of a Bayer colour filter array lossless image compression algorithm on an FPGA-based camera. The compression algorithm reduces colour and spatial redundancy and employs Golomb–Rice entropy coding. A rule limiting the maximum code length is introduced for the edge cases. The proposed algorithm is based on integer operators for efficient hardware implementation. The algorithm is first verified as a C++ model and later implemented on an AMD-Xilinx Zynq UltraScale+ device using VHDL. An effective tree-like pipeline structure is proposed to concatenate codes of compressed pixel data to generate a bitstream representing data of 16 parallel pixels. The proposed parallel compression achieves up to a 56% reduction in image size for high-resolution images. A pipelined implementation without any state machine ensures operating frequencies up to 320 MHz. Parallelised operation on 16 pixels effectively increases data throughput to 40 Gbit/s while keeping the total memory requirements low due to real-time processing. Full article
(This article belongs to the Section Sensing and Imaging)
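The Golomb–Rice entropy coding named in the abstract above can be sketched as follows. This is the textbook form of the code operating on bit strings, using only integer shifts and masks; the paper's rule limiting the maximum code length and its predictor are not reproduced because the abstract does not specify them. The helper names `zigzag`, `rice_encode`, and `rice_decode` are assumptions.

```python
def zigzag(n):
    """Map a signed residual to a non-negative integer: 0,-1,1,-2 -> 0,1,2,3."""
    return (n << 1) if n >= 0 else ((-n << 1) - 1)


def rice_encode(value, k):
    """Golomb-Rice code of a non-negative integer with parameter k.

    Quotient (value >> k) in unary: that many '1' bits, then a '0'.
    Remainder (low k bits) in plain binary.
    """
    quotient = value >> k
    bits = "1" * quotient + "0"
    if k:
        bits += format(value & ((1 << k) - 1), "0{}b".format(k))
    return bits


def rice_decode(bits, k):
    """Decode one codeword from the front of a bit string.

    Returns (value, number_of_bits_consumed).
    """
    quotient = 0
    pos = 0
    while bits[pos] == "1":  # unary part
        quotient += 1
        pos += 1
    pos += 1  # skip the terminating '0'
    remainder = int(bits[pos:pos + k], 2) if k else 0
    return (quotient << k) | remainder, pos + k


if __name__ == "__main__":
    code = rice_encode(zigzag(-5), k=2)
    print(code, rice_decode(code, k=2))
```

Because the unary part grows with the value, a large residual at an image edge can produce a very long codeword, which is the kind of edge case a maximum-code-length rule is meant to bound.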

17 pages, 610 KiB  
Article
Gaussian Kernel Approximations Require Only Bit-Shifts
by R. J. Cintra, Paulo Martinez, André Leite, Vítor A. Coutinho, Fábio M. Bayer, Arjuna Madanayake and Diego F. G. Coelho
Information 2024, 15(10), 618; https://doi.org/10.3390/info15100618 - 9 Oct 2024
Viewed by 558
Abstract
An approach to approximate the 2D Gaussian filter for all possible kernel sizes based on the binary optimization technique is introduced. The approximate filter coefficients are designed as negative powers of two, allowing hardware implementation with remarkable savings in the chip area. The proposed approximate filters were evaluated and compared with competing methods using both similarity analysis and edge detection applications. The proposed method and the competing works for masks of size 3×3, 5×5, and 7×7 were implemented on a Xilinx Artix-7 FPGA. The proposed method showed up to a 60.0% reduction in DSP usage and a 75.0% increase in the maximum operating frequency when compared with state-of-the-art methods for the 7×7 kernel size case and a 48.8% reduction in the dynamic power normalized by the maximum operating frequency. Full article
(This article belongs to the Special Issue Feature Papers in Information in 2024–2025)
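The idea of using coefficients that are negative powers of two can be illustrated with a minimal sketch. It uses the well-known 3×3 binomial kernel with weights 1/4 (centre), 1/8 (edges), and 1/16 (corners), which is an assumption chosen for illustration and not necessarily the coefficient set derived in the paper. The function name `gaussian3x3_shift` and the choice to copy border pixels unchanged are also assumptions.

```python
def gaussian3x3_shift(img):
    """Blur an integer image with a 3x3 kernel using only shifts and adds.

    Weights: centre 1/4, edge neighbours 1/8, corner neighbours 1/16
    (the classic [1 2 1; 2 4 2; 1 2 1] / 16 mask, assumed for illustration).
    img is a list of rows of non-negative integers; border pixels are
    copied unchanged.
    """
    height, width = len(img), len(img[0])
    out = [list(row) for row in img]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            corners = (img[y - 1][x - 1] + img[y - 1][x + 1]
                       + img[y + 1][x - 1] + img[y + 1][x + 1])
            edges = (img[y - 1][x] + img[y + 1][x]
                     + img[y][x - 1] + img[y][x + 1])
            # Scale by 4, 2, 1 with left shifts, then divide by 16 with a
            # right shift: no multiplier is needed.
            acc = (img[y][x] << 2) + (edges << 1) + corners
            out[y][x] = acc >> 4
    return out


if __name__ == "__main__":
    impulse = [[0] * 5 for _ in range(5)]
    impulse[2][2] = 16
    for row in gaussian3x3_shift(impulse):
        print(row)
```

Accumulating at full precision and shifting once at the end avoids the rounding loss of shifting each term separately, at the cost of a wider accumulator.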

27 pages, 5306 KiB  
Article
Area-Time-Efficient Secure Comb Scalar Multiplication Architecture Based on Recoding
by Zhantao Zhang, Weijiang Wang, Jingqi Zhang, Xiang He, Mingzhi Ma, Shiwei Ren and Hua Dang
Micromachines 2024, 15(10), 1238; https://doi.org/10.3390/mi15101238 - 7 Oct 2024
Viewed by 742
Abstract
With the development of mobile communication, digital signatures with low latency, low area, and high security are in increasing demand. Elliptic curve cryptography (ECC) is widely used because of its security and lightweight nature. Elliptic curve scalar multiplication (ECSM) is the basic arithmetic operation in ECC. In this paper, a low-latency and low-area ECSM architecture based on the comb algorithm is proposed. The recoding-k algorithm and randomization-Z algorithm are used to improve security by resisting simple power analysis (SPA) and differential power analysis (DPA). A low-area multi-functional architecture for comb is proposed, which takes into account the different stages of the comb algorithm. Based on this, the data dependency is considered and the comb architecture is optimized to achieve a uniform and efficient execution pattern. The interleaved modular multiplication algorithm and modified binary inverse algorithm are used to achieve a short clock cycle delay and high frequency while keeping the area low. The proposed architecture has been implemented on a Xilinx Virtex-7 series FPGA to perform ECSM on a 256-bit prime field GF(p). Using only 7351 slices, a single ECSM takes 0.74 ms, resulting in an area-time product (ATP) of 5.41. The implementation results show that our design can compete with existing state-of-the-art designs in terms of performance and has higher security. Our design is suitable for computing scenarios where security and computing speed are required. Full article

15 pages, 1106 KiB  
Article
GPU@SAT DevKit: Empowering Edge Computing Development Onboard Satellites in the Space-IoT Era
by Gionata Benelli, Giovanni Todaro, Matteo Monopoli, Gianluca Giuffrida, Massimiliano Donati and Luca Fanucci
Electronics 2024, 13(19), 3928; https://doi.org/10.3390/electronics13193928 - 4 Oct 2024
Cited by 1 | Viewed by 836
Abstract
Advancements in technology have driven the miniaturization of embedded systems, making them more cost-effective and energy-efficient for wireless applications. As a result, the number of connectable devices in Internet of Things (IoT) networks has increased significantly, creating the challenge of linking them effectively and economically. The space industry has long recognized this challenge and invested in satellite infrastructure for IoT networks, exploiting the potential of edge computing technologies. In this context, it is of critical importance to enhance the onboard computing capabilities of satellites and develop enabling technologies for their advancement. This is necessary to ensure that satellites are able to connect devices while reducing latency, bandwidth utilization, and development costs, and improving privacy and security measures. This paper presents the GPU@SAT DevKit: an ecosystem for testing a high-performance, general-purpose accelerator designed for FPGAs and suitable for edge computing tasks on satellites. This ecosystem provides a streamlined way to exploit GPGPU processing in space, enabling faster development times and more efficient resource use. Designed for FPGAs and tailored to edge computing tasks, the GPU@SAT accelerator mimics the parallel architecture of a GPU, allowing developers to leverage its capabilities while maintaining flexibility. Its compatibility with OpenCL simplifies the development process, enabling faster deployment of satellite-based applications. The DevKit was implemented and tested on a Zynq UltraScale+ MPSoC evaluation board from Xilinx, integrating the GPU@SAT IP core with the system’s embedded processor. A client/server approach is used to run applications, allowing users to easily configure and execute kernels through a simple XML document. This intuitive interface provides end-users with the ability to run and evaluate kernel performance and functionality without dealing with the underlying complexities of the accelerator itself. By making the GPU@SAT IP core more accessible, the DevKit significantly reduces development time and lowers the barrier to entry for satellite-based edge computing solutions. The DevKit was also compared with other onboard processing solutions, demonstrating similar performance. Full article

27 pages, 22292 KiB  
Article
RFSoC Softwarisation of a 2.45 GHz Doppler Microwave Radar Motion Sensor
by Peter Hobden, Edmond Nurellari and Saket Srivastava
J. Sens. Actuator Netw. 2024, 13(5), 58; https://doi.org/10.3390/jsan13050058 - 23 Sep 2024
Viewed by 1033
Abstract
Microwave Doppler sensors are used extensively in motion detection as they are energy-efficient, small and relatively low-cost sensors. Common applications of microwave Doppler sensors are detecting intrusion from behind a car roof liner inside an automotive vehicle and detecting moving objects. These applications require a millisecond response from the target for effective detection. A Doppler microwave sensor is ideally suited to the task, as we are only interested in the movement of a large water-based mass (i.e., a person); FMCW radar, by contrast, also detects static objects. Although microwave components at 2.45 GHz are now relatively cheap due to mass production of other Industrial, Scientific and Medical (ISM) devices, they do require tuning for temperature compensation, dielectric, and manufacturing variability. A digital solution would be ideal, as chip solutions are known to be more repeatable, but Application-Specific Integrated Circuits (ASICs) are expensive to initially prototype. This paper presents the first completely digital Doppler motion sensor solution at 2.45 GHz, implemented on the new RFSoC from Xilinx without the need to up/downconvert the frequency externally. Our proposed system uses a completely digital approach bringing the benefits of product repeatability, better overtemperature performance and softwarisation, without compromising any performance metric associated with a comparable analogue motion sensor. The RFSoC is shown to give superior distance versus false detection performance, as the Signal-to-Noise Ratio (SNR) is better than that of a typical analogue system. This is mainly due to the high gain amplification requirement of an analogue system, making it susceptible to electrical noise appearing in the intermediate-frequency (IF) baseband. The proposed RFSoC-based Doppler sensor shows how digital technology can replace traditional analogue radio frequency (RF). A case study is presented showing how multiple Doppler channels can be used to provide range discrimination, which can be performed in both an analogue and a digital implementation (RFSoC). Full article