Showing 1–50 of 52 results for author: Di Stefano, L

  1. arXiv:2407.04092  [pdf, other]

    cs.CV

    Looking for Tiny Defects via Forward-Backward Feature Transfer

    Authors: Alex Costanzino, Pierluigi Zama Ramirez, Giuseppe Lisanti, Luigi Di Stefano

    Abstract: Motivated by efficiency requirements, most anomaly detection and segmentation (AD&S) methods focus on processing low-resolution images, e.g., $224\times 224$ pixels, obtained by downsampling the original input images. In this setting, downsampling is typically applied also to the provided ground-truth defect masks. Yet, as numerous industrial applications demand identification of both large and ti…

    Submitted 8 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2406.11840  [pdf, other]

    cs.CV

    LLaNA: Large Language and NeRF Assistant

    Authors: Andrea Amaduzzi, Pierluigi Zama Ramirez, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano

    Abstract: Multimodal Large Language Models (MLLMs) have demonstrated an excellent understanding of images and 3D data. However, both modalities have shortcomings in holistically capturing the appearance and geometry of objects. Meanwhile, Neural Radiance Fields (NeRFs), which encode information within the weights of a simple Multi-Layer Perceptron (MLP), have emerged as an increasingly widespread modality t…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Under review. Project page: https://andreamaduzzi.github.io/llana/

  3. arXiv:2404.07993  [pdf, other]

    cs.CV

    Connecting NeRFs, Images, and Text

    Authors: Francesco Ballerini, Pierluigi Zama Ramirez, Roberto Mirabella, Samuele Salti, Luigi Di Stefano

    Abstract: Neural Radiance Fields (NeRFs) have emerged as a standard framework for representing 3D scenes and objects, introducing a novel data type for information exchange and storage. Concurrently, significant progress has been made in multimodal representation learning for text and image data. This paper explores a novel research direction that aims to connect the NeRF modality with other modalities, sim…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPRW-INRV 2024

  4. arXiv:2404.03743  [pdf, other]

    cs.CV

    Test Time Training for Industrial Anomaly Segmentation

    Authors: Alex Costanzino, Pierluigi Zama Ramirez, Mirko Del Moro, Agostino Aiezzo, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano

    Abstract: Anomaly Detection and Segmentation (AD&S) is crucial for industrial quality control. While existing methods excel in generating anomaly scores for each pixel, practical applications require producing a binary segmentation to identify anomalies. Due to the absence of labeled anomalies in many real scenarios, standard practices binarize these maps based on some statistics derived from a validation s…

    Submitted 4 April, 2024; originally announced April 2024.

    Comments: Accepted at VAND 2.0, CVPRW 2024

  5. arXiv:2312.13277  [pdf, other]

    cs.CV

    Deep Learning on Object-centric 3D Neural Fields

    Authors: Pierluigi Zama Ramirez, Luca De Luigi, Daniele Sirocchi, Adriano Cardace, Riccardo Spezialetti, Francesco Ballerini, Samuele Salti, Luigi Di Stefano

    Abstract: In recent years, Neural Fields (NFs) have emerged as an effective tool for encoding diverse continuous signals such as images, videos, audio, and 3D shapes. When applied to 3D data, NFs offer a solution to the fragmentation and limitations associated with prevalent discrete representations. However, given that NFs are essentially neural networks, it remains unclear whether and how they can be seam…

    Submitted 15 July, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Extended version of the paper "Deep Learning on Implicit Neural Representations of Shapes" that was presented at ICLR 2023. Accepted at TPAMI. arXiv admin note: text overlap with arXiv:2302.05438

  6. arXiv:2312.04521  [pdf, other]

    cs.CV

    Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping

    Authors: Alex Costanzino, Pierluigi Zama Ramirez, Giuseppe Lisanti, Luigi Di Stefano

    Abstract: The paper explores the industrial multimodal Anomaly Detection (AD) task, which exploits point clouds and RGB images to localize anomalies. We introduce a novel light and fast framework that learns to map features from one modality to the other on nominal samples. At test time, anomalies are detected by pinpointing inconsistencies between observed and mapped features. Extensive experiments show th…

    Submitted 8 July, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Comments: Accepted at CVPR 2024

  7. arXiv:2310.01140  [pdf, other]

    cs.CV

    Neural Processing of Tri-Plane Hybrid Neural Fields

    Authors: Adriano Cardace, Pierluigi Zama Ramirez, Francesco Ballerini, Allan Zhou, Samuele Salti, Luigi Di Stefano

    Abstract: Driven by the appealing properties of neural fields for storing and communicating 3D data, the problem of directly processing them to address tasks such as classification and part segmentation has emerged and has been investigated in recent works. Early approaches employ neural fields parameterized by shared networks trained on the whole dataset, achieving good task performance but sacrificing rec…

    Submitted 30 January, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: Accepted at ICLR 2024

  8. arXiv:2309.07917  [pdf, other]

    cs.CV

    Looking at words and points with attention: a benchmark for text-to-shape coherence

    Authors: Andrea Amaduzzi, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano

    Abstract: While text-conditional 3D object generation and manipulation have seen rapid progress, the evaluation of coherence between generated 3D shapes and input textual descriptions lacks a clear benchmark. The reason is twofold: a) the low quality of the textual descriptions in the only publicly available dataset of text-shape pairs; b) the limited effectiveness of the metrics used to quantitatively asse…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: ICCV 2023 Workshop "AI for 3D Content Creation", Project page: https://cvlab-unibo.github.io/CrossCoherence-Web/, 26 pages

  9. arXiv:2307.15052  [pdf, other]

    cs.CV

    Learning Depth Estimation for Transparent and Mirror Surfaces

    Authors: Alex Costanzino, Pierluigi Zama Ramirez, Matteo Poggi, Fabio Tosi, Stefano Mattoccia, Luigi Di Stefano

    Abstract: Inferring the depth of transparent or mirror (ToM) surfaces represents a hard challenge for either sensors, algorithms, or deep networks. We propose a simple pipeline for learning to estimate depth properly for such surfaces with neural networks, without requiring any ground-truth annotation. We unveil how to obtain reliable pseudo labels by in-painting ToM objects in images and processing them wi…

    Submitted 27 July, 2023; originally announced July 2023.

    Comments: Accepted at ICCV 2023. Project Page: https://cvlab-unibo.github.io/Depth4ToM

  10. arXiv:2307.09776  [pdf, other]

    cs.LO cs.FL cs.PL eess.SY

    LTL Synthesis on Infinite-State Arenas defined by Programs

    Authors: Shaun Azzopardi, Nir Piterman, Gerardo Schneider, Luca di Stefano

    Abstract: This paper deals with the problem of automatically and correctly controlling infinite-state reactive programs to achieve LTL goals. Applications include adapting a program to new requirements, or to repair bugs discovered in the original specification or program code. Existing approaches are able to solve this problem for safety and some reachability properties, but require an a priori template of…

    Submitted 19 July, 2023; originally announced July 2023.

  11. arXiv:2304.10448  [pdf, other]

    cs.CV

    ReLight My NeRF: A Dataset for Novel View Synthesis and Relighting of Real World Objects

    Authors: Marco Toschi, Riccardo De Matteo, Riccardo Spezialetti, Daniele De Gregorio, Luigi Di Stefano, Samuele Salti

    Abstract: In this paper, we focus on the problem of rendering novel views from a Neural Radiance Field (NeRF) under unobserved light conditions. To this end, we introduce a novel dataset, dubbed ReNe (Relighting NeRF), framing real world objects under one-light-at-time (OLAT) conditions, annotated with accurate ground-truth camera and light poses. Our acquisition pipeline leverages two robotic arms holding,…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: Accepted at CVPR 2023 as a highlight

  12. arXiv:2304.02991  [pdf, other]

    cs.CV

    Exploiting the Complementarity of 2D and 3D Networks to Address Domain-Shift in 3D Semantic Segmentation

    Authors: Adriano Cardace, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

    Abstract: 3D semantic segmentation is a critical task in many real-world applications, such as autonomous driving, robotics, and mixed reality. However, the task is extremely challenging due to ambiguities coming from the unstructured, sparse, and uncolored nature of the 3D point clouds. A possible solution is to combine the 3D information with others coming from sensors featuring a different modality, such…

    Submitted 6 April, 2023; originally announced April 2023.

    Comments: Accepted at the CVPR2023 Workshop on Autonomous Driving (WAD)

  13. arXiv:2302.05438  [pdf, other]

    cs.CV

    Deep Learning on Implicit Neural Representations of Shapes

    Authors: Luca De Luigi, Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

    Abstract: Implicit Neural Representations (INRs) have emerged in the last few years as a powerful tool to encode continuously a variety of different signals like images, videos, audio and 3D shapes. When applied to 3D shapes, INRs allow to overcome the fragmentation and shortcomings of the popular discrete representations used so far. Yet, considering that INRs consist in neural networks, it is not clear wh…

    Submitted 10 February, 2023; originally announced February 2023.

    Comments: Accepted at ICLR 2023

  14. arXiv:2301.11310  [pdf, other]

    cs.CV

    Learning Good Features to Transfer Across Tasks and Domains

    Authors: Pierluigi Zama Ramirez, Adriano Cardace, Luca De Luigi, Alessio Tonioni, Samuele Salti, Luigi Di Stefano

    Abstract: Availability of labelled data is the major obstacle to the deployment of deep learning algorithms for computer vision tasks in new domains. The fact that many frameworks adopted to solve different tasks share the same architecture suggests that there should be a way of reusing the knowledge learned in a specific setting to solve novel tasks with limited or no additional supervision. In this work,…

    Submitted 26 January, 2023; originally announced January 2023.

    Comments: Extended version of the paper "Learning Across Tasks and Domains" presented at ICCV 2019. Accepted at TPAMI

  15. arXiv:2301.08245  [pdf, other]

    cs.CV

    Booster: a Benchmark for Depth from Images of Specular and Transparent Surfaces

    Authors: Pierluigi Zama Ramirez, Alex Costanzino, Fabio Tosi, Matteo Poggi, Samuele Salti, Stefano Mattoccia, Luigi Di Stefano

    Abstract: Estimating depth from images nowadays yields outstanding results, both in terms of in-domain accuracy and generalization. However, we identify two main challenges that remain open in this field: dealing with non-Lambertian materials and effectively processing high-resolution images. Purposely, we propose a novel dataset that includes accurate and dense ground-truth labels at high resolution, featu…

    Submitted 30 January, 2024; v1 submitted 19 January, 2023; originally announced January 2023.

    Comments: Extension of the paper "Open Challenges in Deep Stereo: the Booster Dataset" presented at CVPR 2022. Accepted at TPAMI

  16. arXiv:2211.13762  [pdf, other]

    cs.CV

    ScanNeRF: a Scalable Benchmark for Neural Radiance Fields

    Authors: Luca De Luigi, Damiano Bolognini, Federico Domeniconi, Daniele De Gregorio, Matteo Poggi, Luigi Di Stefano

    Abstract: In this paper, we propose the first-ever real benchmark thought for evaluating Neural Radiance Fields (NeRFs) and, in general, Neural Rendering (NR) frameworks. We design and implement an effective pipeline for scanning real objects in quantity and effortlessly. Our scan station is built with less than 500$ hardware budget and can collect roughly 4000 images of a scanned object in just 5 minutes.…

    Submitted 20 December, 2022; v1 submitted 24 November, 2022; originally announced November 2022.

    Comments: WACV 2023. The first three authors contributed equally. Project page: https://eyecan-ai.github.io/scannerf/

  17. arXiv:2210.08226  [pdf, other]

    cs.CV

    Self-Distillation for Unsupervised 3D Domain Adaptation

    Authors: Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

    Abstract: Point cloud classification is a popular task in 3D vision. However, previous works usually assume that point clouds at test time are obtained with the same procedure or sensor as those at training time. Unsupervised Domain Adaptation (UDA), instead, breaks this assumption and tries to solve the task on an unlabeled target domain, leveraging only a supervised source domain. For point cloud class…

    Submitted 15 October, 2022; originally announced October 2022.

    Comments: WACV 2023, Project Page: https://cvlab-unibo.github.io/FeatureDistillation/

  18. arXiv:2209.00648  [pdf, other]

    cs.CV

    Cross-Spectral Neural Radiance Fields

    Authors: Matteo Poggi, Pierluigi Zama Ramirez, Fabio Tosi, Samuele Salti, Stefano Mattoccia, Luigi Di Stefano

    Abstract: We propose X-NeRF, a novel method to learn a Cross-Spectral scene representation given images captured from cameras with different light spectrum sensitivity, based on the Neural Radiance Fields formulation. X-NeRF optimizes camera poses across spectra during training and exploits Normalized Cross-Device Coordinates (NXDC) to render images of different modalities from arbitrary viewpoints, which a…

    Submitted 1 September, 2022; originally announced September 2022.

    Comments: 3DV 2022. Project page: https://cvlab-unibo.github.io/xnerf-web/

  19. arXiv:2206.07047  [pdf, other]

    cs.CV

    RGB-Multispectral Matching: Dataset, Learning Methodology, Evaluation

    Authors: Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi, Samuele Salti, Stefano Mattoccia, Luigi Di Stefano

    Abstract: We address the problem of registering synchronized color (RGB) and multi-spectral (MS) images featuring very different resolution by solving stereo matching correspondences. Purposely, we introduce a novel RGB-MS dataset framing 13 different scenes in indoor environments and providing a total of 34 image pairs annotated with semi-dense, high-resolution ground-truth labels in the form of disparity…

    Submitted 14 June, 2022; originally announced June 2022.

    Comments: CVPR 2022, New Orleans. Project page: https://cvlab-unibo.github.io/rgb-ms-web/

  20. arXiv:2206.05194  [pdf, other]

    cs.CV cs.LG

    Learning the Space of Deep Models

    Authors: Gianluca Berardi, Luca De Luigi, Samuele Salti, Luigi Di Stefano

    Abstract: Embedding of large but redundant data, such as images or text, in a hierarchy of lower-dimensional spaces is one of the key features of representation learning approaches, which nowadays provide state-of-the-art solutions to problems once believed hard or impossible to solve. In this work, in a plot twist with a strong meta aftertaste, we show how trained deep models are as redundant as the data t…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: Accepted at ICPR2022

  21. arXiv:2206.04671  [pdf, other]

    cs.CV

    Open Challenges in Deep Stereo: the Booster Dataset

    Authors: Pierluigi Zama Ramirez, Fabio Tosi, Matteo Poggi, Samuele Salti, Stefano Mattoccia, Luigi Di Stefano

    Abstract: We present a novel high-resolution and challenging stereo dataset framing indoor scenes annotated with dense and accurate ground-truth disparities. Peculiar to our dataset is the presence of several specular and transparent surfaces, i.e. the main causes of failures for state-of-the-art stereo networks. Our acquisition pipeline leverages a novel deep space-time stereo framework which allows for ea…

    Submitted 9 June, 2022; originally announced June 2022.

    Comments: CVPR 2022, New Orleans. Project page: https://cvlab-unibo.github.io/booster-web/

  22. arXiv:2110.15367  [pdf, other]

    cs.CV

    Neural Disparity Refinement for Arbitrary Resolution Stereo

    Authors: Filippo Aleotti, Fabio Tosi, Pierluigi Zama Ramirez, Matteo Poggi, Samuele Salti, Stefano Mattoccia, Luigi Di Stefano

    Abstract: We introduce a novel architecture for neural disparity refinement aimed at facilitating deployment of 3D computer vision on cheap and widespread consumer devices, such as mobile phones. Our approach relies on a continuous formulation that enables to estimate a refined disparity map at any arbitrary output resolution. Thereby, it can handle effectively the unbalanced camera setup typical of nowaday…

    Submitted 28 October, 2021; originally announced October 2021.

    Comments: 3DV 2021 Oral paper. Project page: https://cvlab-unibo.github.io/neural-disparity-refinement-web

  23. arXiv:2110.11036  [pdf, other]

    cs.CV

    RefRec: Pseudo-labels Refinement via Shape Reconstruction for Unsupervised 3D Domain Adaptation

    Authors: Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

    Abstract: Unsupervised Domain Adaptation (UDA) for point cloud classification is an emerging research problem with relevant practical motivations. Reliance on multi-task learning to align features across domains has been the standard way to tackle it. In this paper, we take a different path and propose RefRec, the first approach to investigate pseudo-labels and self-training in UDA for point clouds. We pres…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: 3DV 2021 (Oral) Code: https://github.com/CVLAB-Unibo/RefRec

  24. arXiv:2110.06685  [pdf, other]

    cs.CV

    Plugging Self-Supervised Monocular Depth into Unsupervised Domain Adaptation for Semantic Segmentation

    Authors: Adriano Cardace, Luca De Luigi, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

    Abstract: Although recent semantic segmentation methods have made remarkable progress, they still rely on large amounts of annotated training data, which are often infeasible to collect in the autonomous driving scenario. Previous works usually tackle this issue with Unsupervised Domain Adaptation (UDA), which entails training a network on synthetic images and applying the model to real ones while minimizin…

    Submitted 13 October, 2021; originally announced October 2021.

    Comments: Accepted at WACV 2022

  25. arXiv:2110.02833  [pdf, other]

    cs.CV

    Shallow Features Guide Unsupervised Domain Adaptation for Semantic Segmentation at Class Boundaries

    Authors: Adriano Cardace, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

    Abstract: Although deep neural networks have achieved remarkable results for the task of semantic segmentation, they usually fail to generalize towards new domains, especially when performing synthetic-to-real adaptation. Such domain shift is particularly noticeable along class boundaries, invalidating one of the main goals of semantic segmentation that consists in obtaining sharp segmentation masks. In thi…

    Submitted 6 October, 2021; originally announced October 2021.

    Comments: Accepted at WACV 2022

  26. arXiv:2012.13210  [pdf, other]

    cs.RO cs.CV

    Effective Deployment of CNNs for 3DoF Pose Estimation and Grasping in Industrial Settings

    Authors: Daniele De Gregorio, Riccardo Zanella, Gianluca Palli, Luigi Di Stefano

    Abstract: In this paper we investigate how to effectively deploy deep learning in practical industrial settings, such as robotic grasping applications. When a deep-learning based solution is proposed, it usually lacks any simple method to generate the training data. In the industrial field, where automation is the main goal, not bridging this gap is one of the main reasons why deep learning is not as widesp…

    Submitted 24 December, 2020; originally announced December 2020.

  27. SAFFIRE: System for Autonomous Feature Filtering and Intelligent ROI Estimation

    Authors: Marco Boschi, Luigi Di Stefano, Martino Alessandrini

    Abstract: This work introduces a new framework, named SAFFIRE, to automatically extract a dominant recurrent image pattern from a set of image samples. Such a pattern shall be used to eliminate pose variations between samples, which is a common requirement in many computer vision and machine learning tasks. The framework is specialized here in the context of a machine vision system for automated product ins…

    Submitted 4 March, 2021; v1 submitted 4 December, 2020; originally announced December 2020.

    Comments: 14 pages, 23 figures, 2 tables

    Journal ref: ICPR International Workshops and Challenges. ICPR 2021. Lecture Notes in Computer Science, vol 12664, pp 552-565

  28. arXiv:2011.03298  [pdf, other]

    cs.CV

    Learning to Orient Surfaces by Self-supervised Spherical CNNs

    Authors: Riccardo Spezialetti, Federico Stella, Marlon Marcon, Luciano Silva, Samuele Salti, Luigi Di Stefano

    Abstract: Defining and reliably finding a canonical orientation for 3D surfaces is key to many Computer Vision and Robotics applications. This task is commonly addressed by handcrafted algorithms exploiting geometric cues deemed as distinctive and robust by the designer. Yet, one might conjecture that humans learn the notion of the inherent orientation of 3D objects from experience and that machines may do…

    Submitted 13 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Accepted to NeurIPS 2020

  29. arXiv:2007.05233  [pdf, other]

    cs.CV cs.LG eess.IV

    Continual Adaptation for Deep Stereo

    Authors: Matteo Poggi, Alessio Tonioni, Fabio Tosi, Stefano Mattoccia, Luigi Di Stefano

    Abstract: Depth estimation from stereo images is carried out with unmatched results by convolutional neural networks trained end-to-end to regress dense disparities. Like for most tasks, this is possible if large amounts of labelled samples are available for training, possibly covering the whole data distribution encountered at deployment time. Being such an assumption systematically unmet in real applicati…

    Submitted 3 May, 2021; v1 submitted 10 July, 2020; originally announced July 2020.

    Comments: Extended version of CVPR 2019 paper "Real-time self-adaptive deep stereo" - Accepted to TPAMI

  30. arXiv:2003.14030  [pdf, other]

    cs.CV cs.LG

    Distilled Semantics for Comprehensive Scene Understanding from Videos

    Authors: Fabio Tosi, Filippo Aleotti, Pierluigi Zama Ramirez, Matteo Poggi, Samuele Salti, Luigi Di Stefano, Stefano Mattoccia

    Abstract: Whole understanding of the surroundings is paramount to autonomous systems. Recent works have shown that deep neural networks can learn geometry (depth) and motion (optical flow) from a monocular video without any explicit supervision from ground truth annotations, particularly hard to source for these two tasks. In this paper, we take an additional step toward holistic scene understanding with mo…

    Submitted 31 March, 2020; originally announced March 2020.

    Comments: CVPR 2020. Code will be available at https://github.com/CVLAB-Unibo/omeganet

  31. arXiv:2003.10381  [pdf, other]

    stat.ML cs.CV cs.LG

    Ambiguity in Sequential Data: Predicting Uncertain Futures with Recurrent Models

    Authors: Alessandro Berlati, Oliver Scheel, Luigi Di Stefano, Federico Tombari

    Abstract: Ambiguity is inherently present in many machine learning tasks, but especially for sequential models seldom accounted for, as most only output a single prediction. In this work we propose an extension of the Multiple Hypothesis Prediction (MHP) model to handle ambiguous predictions with sequential data, which is of special importance, as often multiple futures are equally likely. Our approach can…

    Submitted 10 March, 2020; originally announced March 2020.

    Journal ref: Robotics and Automation Letters 2020 (RA-L)

  32. Boosting Object Recognition in Point Clouds by Saliency Detection

    Authors: Marlon Marcon, Riccardo Spezialetti, Samuele Salti, Luciano Silva, Luigi Di Stefano

    Abstract: Object recognition in 3D point clouds is a challenging task, mainly when time is an important factor to deal with, such as in industrial applications. Local descriptors are an amenable choice whenever the 6 DoF pose of recognized objects should also be estimated. However, the pipeline for this kind of descriptors is highly time-consuming. In this work, we propose an update to the traditional pipel…

    Submitted 6 November, 2019; originally announced November 2019.

    Comments: International Conference on Image Analysis and Processing (ICIAP) 2019

  33. arXiv:1910.05021  [pdf, other]

    cs.CV

    Shooting Labels: 3D Semantic Labeling by Virtual Reality

    Authors: Pierluigi Zama Ramirez, Claudio Paternesi, Luca De Luigi, Luigi Lella, Daniele De Gregorio, Luigi Di Stefano

    Abstract: Availability of a few, large-size, annotated datasets, like ImageNet, Pascal VOC and COCO, has led deep learning to revolutionize computer vision research by achieving astonishing results in several vision tasks. We argue that new tools to facilitate generation of annotated datasets may help spreading data-driven AI throughout applications and domains. In this work we propose Shooting Labels, the…

    Submitted 24 October, 2020; v1 submitted 11 October, 2019; originally announced October 2019.

  34. arXiv:1909.06887  [pdf, other]

    cs.CV

    Learning an Effective Equivariant 3D Descriptor Without Supervision

    Authors: Riccardo Spezialetti, Samuele Salti, Luigi Di Stefano

    Abstract: Establishing correspondences between 3D shapes is a fundamental task in 3D Computer Vision, typically addressed by matching local descriptors. Recently, a few attempts at applying the deep learning paradigm to the task have shown promising results. Yet, the only explored way to learn rotation invariant descriptors has been to feed neural networks with highly engineered and invariant representation…

    Submitted 15 September, 2019; originally announced September 2019.

    Comments: Accepted to International Conference on Computer Vision 2019

  35. arXiv:1909.06884  [pdf, other]

    cs.CV

    Performance Evaluation of Learned 3D Features

    Authors: Riccardo Spezialetti, Samuele Salti, Luigi Di Stefano

    Abstract: Matching surfaces is a challenging 3D Computer Vision problem typically addressed by local features. Although a variety of 3D feature detectors and descriptors has been proposed in literature, they have seldom been proposed together and it is yet not clear how to identify the most effective detector-descriptor pair for a specific application. A promising solution is to leverage machine learning to…

    Submitted 15 September, 2019; originally announced September 2019.

    Journal ref: International Conference on Image Analysis and Processing. Springer, Cham, 2019

  36. arXiv:1909.03943  [pdf, other]

    cs.CV

    Unsupervised Domain Adaptation for Depth Prediction from Images

    Authors: Alessio Tonioni, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano

    Abstract: State-of-the-art approaches to infer dense depth measurements from images rely on CNNs trained end-to-end on a vast amount of data. However, these approaches suffer a drastic drop in accuracy when dealing with environments much different in appearance and/or context from those observed at training time. This domain shift issue is usually addressed by fine-tuning on smaller sets of images from the…

    Submitted 9 September, 2019; originally announced September 2019.

    Comments: 14 pages, 7 pages. Accepted to TPAMI

  37. arXiv:1908.01862  [pdf, other]

    cs.CV

    Semi-Automatic Labeling for Deep Learning in Robotics

    Authors: Daniele De Gregorio, Alessio Tonioni, Gianluca Palli, Luigi Di Stefano

    Abstract: In this paper, we propose Augmented Reality Semi-automatic labeling (ARS), a semi-automatic method which leverages on moving a 2D camera by means of a robot, providing precise camera tracking, and an augmented reality pen to define initial object bounding box, to create large labeled datasets with minimal human intervention. By removing the burden of generating annotated data from humans, we make th…

    Submitted 5 August, 2019; originally announced August 2019.

  38. arXiv:1907.07745  [pdf, other]

    cs.CV eess.IV eess.SP

    Real-Time Highly Accurate Dense Depth on a Power Budget using an FPGA-CPU Hybrid SoC

    Authors: Oscar Rahnama, Tommaso Cavallari, Stuart Golodetz, Alessio Tonioni, Thomas Joy, Luigi Di Stefano, Simon Walker, Philip H. S. Torr

    Abstract: Obtaining highly accurate depth from stereo images in real time has many applications across computer vision and robotics, but in some contexts, upper bounds on power consumption constrain the feasible hardware to embedded platforms such as FPGAs. Whilst various stereo algorithms have been deployed on these platforms, usually cut down to better match the embedded architecture, certain key parts of…

    Submitted 17 July, 2019; originally announced July 2019.

    Comments: 6 pages, 7 figures, 2 tables, journal

    Journal ref: IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 66, no. 5, pp. 773-777, May 2019

  39. arXiv:1904.04744  [pdf, other]

    cs.CV

    Learning Across Tasks and Domains

    Authors: Pierluigi Zama Ramirez, Alessio Tonioni, Samuele Salti, Luigi Di Stefano

    Abstract: Recent works have proven that many relevant visual tasks are closely related one to another. Yet, this connection is seldom deployed in practice due to the lack of practical methodologies to transfer learned concepts across different training processes. In this work, we introduce a novel adaptation framework that can operate across both task and domains. Our framework learns to transfer knowledge…

    Submitted 3 October, 2019; v1 submitted 9 April, 2019; originally announced April 2019.

    Comments: Accepted at ICCV 2019

  40. arXiv:1904.02957  [pdf, other]

    cs.CV

    Learning to Adapt for Stereo

    Authors: Alessio Tonioni, Oscar Rahnama, Thomas Joy, Luigi Di Stefano, Thalaiyasingam Ajanthan, Philip H. S. Torr

    Abstract: Real world applications of stereo depth estimation require models that are robust to dynamic variations in the environment. Even though deep learning based stereo methods are successful, they often fail to generalize to unseen variations in the environment, making them less suitable for practical applications such as autonomous driving. In this work, we introduce a "learning-to-adapt" framework th…

    Submitted 5 April, 2019; originally announced April 2019.

    Comments: Accepted at CVPR2019. Code available at https://github.com/CVLAB-Unibo/Learning2AdaptForStereo

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 9661-9670

  41. arXiv:1902.00760  [pdf, other]

    cs.CV

    Domain invariant hierarchical embedding for grocery products recognition

    Authors: Alessio Tonioni, Luigi Di Stefano

    Abstract: Recognizing packaged grocery products based solely on appearance is still an open issue for modern computer vision systems due to peculiar challenges. Firstly, the number of different items to be recognized is huge (i.e., in the order of thousands) and rapidly changing over time. Moreover, there exists a significant domain shift between the images that should be recognized at test time, taken in st…

    Submitted 2 February, 2019; originally announced February 2019.

  42. Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade

    Authors: Tommaso Cavallari, Stuart Golodetz, Nicholas A. Lord, Julien Valentin, Victor A. Prisacariu, Luigi Di Stefano, Philip H. S. Torr

    Abstract: Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achie…

    Submitted 2 July, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorship

    MSC Class: 68T45

  43. arXiv:1810.05852  [pdf, other]

    cs.CV

    Exploiting Semantics in Adversarial Training for Image-Level Domain Adaptation

    Authors: Pierluigi Zama Ramirez, Alessio Tonioni, Luigi Di Stefano

    Abstract: The performance achievable by modern deep learning approaches is directly related to the amount of data used at training time. Unfortunately, the annotation process is notoriously tedious and expensive, especially for pixel-wise tasks like semantic segmentation. Recent works have proposed to rely on synthetically generated imagery to ease the training set creation. However, models trained on these ki…

    Submitted 13 October, 2018; originally announced October 2018.

    Comments: 6 pages, Accepted to IPAS 2018

  44. arXiv:1810.05424  [pdf, other]

    cs.CV

    Real-time self-adaptive deep stereo

    Authors: Alessio Tonioni, Fabio Tosi, Matteo Poggi, Stefano Mattoccia, Luigi Di Stefano

    Abstract: Deep convolutional neural networks trained end-to-end are the state-of-the-art methods to regress dense disparity maps from stereo pairs. These models, however, suffer from a notable decrease in accuracy when exposed to scenarios significantly different from the training set (e.g., real vs. synthetic images). We argue that it is extremely unlikely to gather enough samples to achieve effective…

    Submitted 5 April, 2019; v1 submitted 12 October, 2018; originally announced October 2018.

    Comments: Accepted at CVPR2019 as oral presentation. Code Available https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stereo

  45. arXiv:1810.04461  [pdf, other]

    cs.CV

    Let's take a Walk on Superpixels Graphs: Deformable Linear Objects Segmentation and Model Estimation

    Authors: Daniele De Gregorio, Gianluca Palli, Luigi Di Stefano

    Abstract: While robotic manipulation of rigid objects is quite straightforward, coping with deformable objects is an open issue. More specifically, tasks like tying a knot, wiring a connector or even surgical suturing deal with the domain of Deformable Linear Objects (DLOs). In particular, the detection of a DLO is a non-trivial problem, especially under clutter and occlusions (as well as self-occlusions). Th…

    Submitted 10 October, 2018; originally announced October 2018.

    Comments: Accepted as Oral to ACCV 2018, Perth

  46. arXiv:1810.04093  [pdf, other]

    cs.CV

    Geometry meets semantics for semi-supervised monocular depth estimation

    Authors: Pierluigi Zama Ramirez, Matteo Poggi, Fabio Tosi, Stefano Mattoccia, Luigi Di Stefano

    Abstract: Depth estimation from a single image represents a very exciting challenge in computer vision. While other image-based depth sensing techniques leverage the geometry between different viewpoints (e.g., stereo or structure from motion), the lack of these cues within a single image renders the monocular depth estimation task ill-posed. For inference, state-of-the-art encoder-decoder architectures…

    Submitted 26 October, 2018; v1 submitted 9 October, 2018; originally announced October 2018.

    Comments: 16 pages, Accepted to ACCV 2018

  47. arXiv:1810.01733  [pdf, other]

    cs.CV

    A deep learning pipeline for product recognition on store shelves

    Authors: Alessio Tonioni, Eugenio Serra, Luigi Di Stefano

    Abstract: Recognition of grocery products on store shelves poses peculiar challenges. Firstly, the task mandates the recognition of an extremely high number of different items, in the order of several thousands for medium-small shops, with many of them featuring small inter- and intra-class variability. Then, available product databases usually include just one or a few studio-quality images per product (ref…

    Submitted 27 January, 2019; v1 submitted 3 October, 2018; originally announced October 2018.

  48. Towards formal models and languages for verifiable Multi-Robot Systems

    Authors: Rocco De Nicola, Luca Di Stefano, Omar Inverso

    Abstract: Incorrect operations of a Multi-Robot System (MRS) may not only lead to unsatisfactory results, but can also cause economic losses and threats to safety. These threats may not always be apparent, since they may arise as unforeseen consequences of the interactions between elements of the system. This calls for tools and techniques that can help in providing guarantees about MRSs behaviour. We think…

    Submitted 8 May, 2018; v1 submitted 22 April, 2018; originally announced April 2018.

    Comments: Changed formatting

  49. arXiv:1707.08378  [pdf, other]

    cs.CV

    Product recognition in store shelves as a sub-graph isomorphism problem

    Authors: Alessio Tonioni, Luigi Di Stefano

    Abstract: The arrangement of products in store shelves is carefully planned to maximize sales and keep customers happy. However, verifying compliance of real shelves to the ideal layout is a costly task routinely performed by the store personnel. In this paper, we propose a computer vision pipeline to recognize products on shelves and verify compliance to the planned layout. We deploy local invariant featur…

    Submitted 19 September, 2017; v1 submitted 26 July, 2017; originally announced July 2017.

    Comments: Slightly extended version of the paper accepted at ICIAP 2017. More information @project_page --> http://vision.disi.unibo.it/index.php?option=com_content&view=article&id=111&catid=78

  50. arXiv:1704.05832  [pdf, other]

    cs.CV cs.RO

    SkiMap: An Efficient Mapping Framework for Robot Navigation

    Authors: Daniele De Gregorio, Luigi Di Stefano

    Abstract: We present a novel mapping framework for robot navigation which features a multi-level querying system capable of rapidly obtaining representations as diverse as a 3D voxel grid, a 2.5D height map and a 2D occupancy grid. These are inherently embedded into a memory- and time-efficient core data structure organized as a Tree of SkipLists. Compared to the well-known Octree representation, our approach e…

    Submitted 19 April, 2017; originally announced April 2017.

    Comments: Accepted by International Conference on Robotics and Automation (ICRA) 2017. This is the submitted version. The final published version may be slightly different