Showing 1–20 of 20 results for author: Kanai, S

Searching in archive cs.
  1. arXiv:2408.16261

    cs.LG cs.AI

    Evaluating Time-Series Training Dataset through Lens of Spectrum in Deep State Space Models

    Authors: Sekitoshi Kanai, Yasutoshi Ida, Kazuki Adachi, Mihiro Uchida, Tsukasa Yoshida, Shin'ya Yamaguchi

    Abstract: This study investigates a method to evaluate time-series datasets in terms of the performance of deep neural networks (DNNs) with state space models (deep SSMs) trained on the dataset. SSMs have attracted attention as components inside DNNs to address time-series data. Since deep SSMs have powerful representation capacities, training datasets play a crucial role in solving a new task. However, the…

    Submitted 29 August, 2024; originally announced August 2024.

    Comments: 11 pages, 5 figures

  2. arXiv:2403.10097

    cs.LG cs.AI cs.CV

    Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks

    Authors: Shin'ya Yamaguchi, Sekitoshi Kanai, Kazuki Adachi, Daiki Chijiwa

    Abstract: While fine-tuning is a de facto standard method for training deep neural networks, it still suffers from overfitting when using small target datasets. Previous methods improve fine-tuning performance by maintaining knowledge of the source datasets or introducing regularization terms such as contrastive loss. However, these methods require auxiliary source information (e.g., source labels or datase…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: Accepted to CVPR 2024

  3. arXiv:2311.06642

    cond-mat.mes-hall cs.ET

    Double-Free-Layer Stochastic Magnetic Tunnel Junctions with Synthetic Antiferromagnets

    Authors: Kemal Selcuk, Shun Kanai, Rikuto Ota, Hideo Ohno, Shunsuke Fukami, Kerem Y. Camsari

    Abstract: Stochastic magnetic tunnel junctions (sMTJ) using low-barrier nanomagnets have shown promise as fast, energy-efficient, and scalable building blocks for probabilistic computing. Despite recent experimental and theoretical progress, sMTJs exhibiting the ideal characteristics necessary for probabilistic bits (p-bit) are still lacking. Ideally, the sMTJs should have (a) voltage bias independence prev…

    Submitted 30 March, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Journal ref: Phys. Rev. Applied 21, 054002 (2024)

  4. arXiv:2308.16454

    cs.CV cs.LG

    Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff

    Authors: Satoshi Suzuki, Shin'ya Yamaguchi, Shoichiro Takeda, Sekitoshi Kanai, Naoki Makishima, Atsushi Ando, Ryo Masumura

    Abstract: This paper addresses the tradeoff between standard accuracy on clean examples and robustness against adversarial examples in deep neural networks (DNNs). Although adversarial training (AT) improves robustness, it degrades the standard accuracy, thus yielding the tradeoff. To mitigate this tradeoff, we propose a novel AT method called ARREST, which comprises three components: (i) adversarial finetu…

    Submitted 31 August, 2023; originally announced August 2023.

    Comments: Accepted by International Conference on Computer Vision (ICCV) 2023

  5. arXiv:2307.13899

    cs.LG cs.AI cs.CV

    Regularizing Neural Networks with Meta-Learning Generative Models

    Authors: Shin'ya Yamaguchi, Daiki Chijiwa, Sekitoshi Kanai, Atsutoshi Kumagai, Hisashi Kashima

    Abstract: This paper investigates methods for improving generative data augmentation for deep learning. Generative data augmentation leverages the synthetic samples produced by generative models as an additional dataset for classification with small dataset settings. A key challenge of generative data augmentation is that the synthetic data contain uninformative samples that degrade accuracy. This is becaus…

    Submitted 23 October, 2023; v1 submitted 25 July, 2023; originally announced July 2023.

    Comments: Accepted to NeurIPS 2023

  6. arXiv:2304.05949

    cond-mat.mes-hall cs.AI cs.ET cs.LG

    CMOS + stochastic nanomagnets: heterogeneous computers for probabilistic inference and learning

    Authors: Nihal Sanjay Singh, Keito Kobayashi, Qixuan Cao, Kemal Selcuk, Tianrui Hu, Shaila Niazi, Navid Anjum Aadit, Shun Kanai, Hideo Ohno, Shunsuke Fukami, Kerem Y. Camsari

    Abstract: Extending Moore's law by augmenting complementary-metal-oxide semiconductor (CMOS) transistors with emerging nanotechnologies (X) has become increasingly important. One important class of problems involve sampling-based Monte Carlo algorithms used in probabilistic machine learning, optimization, and quantum simulation. Here, we combine stochastic magnetic tunnel junction (sMTJ)-based probabilistic…

    Submitted 23 February, 2024; v1 submitted 12 April, 2023; originally announced April 2023.

    Journal ref: Nature Communications volume 15, Article number: 2685 (2024)

  7. arXiv:2303.07597

    cs.LG stat.ML

    Fast Regularized Discrete Optimal Transport with Group-Sparse Regularizers

    Authors: Yasutoshi Ida, Sekitoshi Kanai, Kazuki Adachi, Atsutoshi Kumagai, Yasuhiro Fujiwara

    Abstract: Regularized discrete optimal transport (OT) is a powerful tool to measure the distance between two discrete distributions that have been constructed from data samples on two different domains. While it has a wide range of applications in machine learning, in some cases the sampled data from only one of the domains will have class labels such as unsupervised domain adaptation. In this kind of probl…

    Submitted 13 March, 2023; originally announced March 2023.

    Comments: This is an extended version of the paper accepted by the 37th AAAI Conference on Artificial Intelligence (AAAI 2023)

  8. arXiv:2302.06457

    cs.ET cs.AR cs.DC cs.NE physics.comp-ph

    A full-stack view of probabilistic computing with p-bits: devices, architectures and algorithms

    Authors: Shuvro Chowdhury, Andrea Grimaldi, Navid Anjum Aadit, Shaila Niazi, Masoud Mohseni, Shun Kanai, Hideo Ohno, Shunsuke Fukami, Luke Theogarajan, Giovanni Finocchio, Supriyo Datta, Kerem Y. Camsari

    Abstract: The transistor celebrated its 75th birthday in 2022. The continued scaling of the transistor defined by Moore's Law continues, albeit at a slower pace. Meanwhile, computing demands and energy consumption required by modern artificial intelligence (AI) algorithms have skyrocketed. As an alternative to scaling transistors for general-purpose computing, the integration of transistors with…

    Submitted 16 March, 2023; v1 submitted 13 February, 2023; originally announced February 2023.

    Journal ref: IEEE Journal on Exploratory Solid-State Computational Devices and Circuits (2023)

  9. arXiv:2210.01348

    cs.LG cs.NE

    Fast Saturating Gate for Learning Long Time Scales with Recurrent Neural Networks

    Authors: Kentaro Ohno, Sekitoshi Kanai, Yasutoshi Ida

    Abstract: Gate functions in recurrent models, such as an LSTM and GRU, play a central role in learning various time scales in modeling time series data by using a bounded activation function. However, it is difficult to train gates to capture extremely long time scales due to gradient vanishing of the bounded function for large inputs, which is known as the saturation problem. We closely analyze the relatio…

    Submitted 3 October, 2022; originally announced October 2022.

    Comments: 9 pages of main texts with 4 pages appendices, 12 figures

  10. arXiv:2207.10283

    cs.LG cs.AI stat.ML

    One-vs-the-Rest Loss to Focus on Important Samples in Adversarial Training

    Authors: Sekitoshi Kanai, Shin'ya Yamaguchi, Masanori Yamada, Hiroshi Takahashi, Kentaro Ohno, Yasutoshi Ida

    Abstract: This paper proposes a new loss function for adversarial training. Since adversarial training has difficulties, e.g., necessity of high model capacity, focusing on important data points by weighting cross-entropy loss has attracted much attention. However, they are vulnerable to sophisticated attacks, e.g., Auto-Attack. This paper experimentally reveals that the cause of their vulnerability is thei…

    Submitted 26 April, 2023; v1 submitted 20 July, 2022; originally announced July 2022.

    Comments: ICML2023, 26 pages, 19 figures

  11. arXiv:2204.12833

    cs.LG cs.AI stat.ML

    Transfer Learning with Pre-trained Conditional Generative Models

    Authors: Shin'ya Yamaguchi, Sekitoshi Kanai, Atsutoshi Kumagai, Daiki Chijiwa, Hisashi Kashima

    Abstract: Transfer learning is crucial in training deep neural networks on new target tasks. Current transfer learning methods always assume at least one of (i) source and target task label spaces overlap, (ii) source datasets are available, and (iii) target network architectures are consistent with source ones. However, holding these assumptions is difficult in practical settings because the target task ra…

    Submitted 29 September, 2022; v1 submitted 27 April, 2022; originally announced April 2022.

    Comments: 24 pages, 6 figures

  12. arXiv:2106.02343

    cs.CV cs.LG eess.IV

    F-Drop&Match: GANs with a Dead Zone in the High-Frequency Domain

    Authors: Shin'ya Yamaguchi, Sekitoshi Kanai

    Abstract: Generative adversarial networks built from deep convolutional neural networks (GANs) lack the ability to exactly replicate the high-frequency components of natural images. To alleviate this issue, we introduce two novel training techniques called frequency dropping (F-Drop) and frequency matching (F-Match). The key idea of F-Drop is to filter out unnecessary high-frequency components from the inpu…

    Submitted 18 August, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Accepted to ICCV 2021; Added experiments on StyleGAN2-ADA

  13. arXiv:2103.01400

    cs.LG cs.AI stat.ML

    Smoothness Analysis of Adversarial Training

    Authors: Sekitoshi Kanai, Masanori Yamada, Hiroshi Takahashi, Yuki Yamanaka, Yasutoshi Ida

    Abstract: Deep neural networks are vulnerable to adversarial attacks. Recent studies about adversarial robustness focus on the loss landscape in the parameter space since it is related to optimization and generalization performance. These studies conclude that the difficulty of adversarial training is caused by the non-smoothness of the loss function: i.e., its gradient is not Lipschitz continuous. However,…

    Submitted 15 June, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

    Comments: 22 pages, 7 figures. In V3, we add the results of EntropySGD for adversarial training

  14. arXiv:2102.02950

    stat.ML cs.AI cs.LG

    Adversarial Training Makes Weight Loss Landscape Sharper in Logistic Regression

    Authors: Masanori Yamada, Sekitoshi Kanai, Tomoharu Iwata, Tomokatsu Takahashi, Yuki Yamanaka, Hiroshi Takahashi, Atsutoshi Kumagai

    Abstract: Adversarial training is actively studied for learning robust models against adversarial examples. A recent study finds that adversarially trained models degenerate generalization performance on adversarial examples when their weight loss landscape, which is loss changes with respect to weights, is sharp. Unfortunately, it has been experimentally shown that adversarial training sharpens the weight…

    Submitted 4 February, 2021; originally announced February 2021.

    Comments: 9 pages, 5 figures

  15. arXiv:2010.02558

    stat.ML cs.AI cs.LG

    Constraining Logits by Bounded Function for Adversarial Robustness

    Authors: Sekitoshi Kanai, Masanori Yamada, Shin'ya Yamaguchi, Hiroshi Takahashi, Yasutoshi Ida

    Abstract: We propose a method for improving adversarial robustness by addition of a new bounded function just before softmax. Recent studies hypothesize that small logits (inputs of softmax) by logit regularization can improve adversarial robustness of deep learning. Following this hypothesis, we analyze norms of logit vectors at the optimal point under the assumption of universal approximation and explore…

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: 19 pages, 16 figures

  16. arXiv:1912.11603

    stat.ML cs.CV cs.LG

    Image Enhanced Rotation Prediction for Self-Supervised Learning

    Authors: Shin'ya Yamaguchi, Sekitoshi Kanai, Tetsuya Shioda, Shoichiro Takeda

    Abstract: The rotation prediction (Rotation) is a simple pretext-task for self-supervised learning (SSL), where models learn useful representations for target vision tasks by solving pretext-tasks. Although Rotation captures information of object shapes, it hardly captures information of textures. To tackle this problem, we introduce a novel pretext-task called image enhanced rotation prediction (IE-Rot) fo…

    Submitted 4 June, 2021; v1 submitted 25 December, 2019; originally announced December 2019.

    Comments: Accepted to IEEE ICIP 2021. The title has been changed from "Multiple Pretext-Task for Self-Supervised Learning via Mixing Multiple Image Transformations"

  17. arXiv:1912.11597

    stat.ML cs.CV cs.LG

    Effective Data Augmentation with Multi-Domain Learning GANs

    Authors: Shin'ya Yamaguchi, Sekitoshi Kanai, Takeharu Eda

    Abstract: For deep learning applications, the massive data development (e.g., collecting, labeling), which is an essential process in building practical applications, still incurs seriously high costs. In this work, we propose an effective data augmentation method based on generative adversarial networks (GANs), called Domain Fusion. Our key idea is to import the knowledge contained in an outer dataset to a…

    Submitted 25 December, 2019; originally announced December 2019.

    Comments: AAAI-2020

  18. arXiv:1909.08830

    stat.ML cs.CV cs.LG

    Absum: Simple Regularization Method for Reducing Structural Sensitivity of Convolutional Neural Networks

    Authors: Sekitoshi Kanai, Yasutoshi Ida, Yasuhiro Fujiwara, Masanori Yamada, Shuichi Adachi

    Abstract: We propose Absum, which is a regularization method for improving adversarial robustness of convolutional neural networks (CNNs). Although CNNs can accurately recognize images, recent studies have shown that the convolution operations in CNNs commonly have structural sensitivity to specific noise composed of Fourier basis functions. By exploiting this sensitivity, they proposed a simple black-box a…

    Submitted 19 September, 2019; originally announced September 2019.

    Comments: 16 pages, 39 figures

  19. arXiv:1903.10709

    stat.ML cs.LG

    Autoencoding Binary Classifiers for Supervised Anomaly Detection

    Authors: Yuki Yamanaka, Tomoharu Iwata, Hiroshi Takahashi, Masanori Yamada, Sekitoshi Kanai

    Abstract: We propose the Autoencoding Binary Classifiers (ABC), a novel supervised anomaly detector based on the Autoencoder (AE). There are two main approaches in anomaly detection: supervised and unsupervised. The supervised approach accurately detects the known anomalies included in training data, but it cannot detect the unknown anomalies. Meanwhile, the unsupervised approach can detect both known and u…

    Submitted 26 March, 2019; originally announced March 2019.

  20. arXiv:1805.10829

    stat.ML cs.LG

    Sigsoftmax: Reanalysis of the Softmax Bottleneck

    Authors: Sekitoshi Kanai, Yasuhiro Fujiwara, Yuki Yamanaka, Shuichi Adachi

    Abstract: Softmax is an output activation function for modeling categorical probability distributions in many applications of deep learning. However, a recent study revealed that softmax can be a bottleneck of representational capacity of neural networks in language modeling (the softmax bottleneck). In this paper, we propose an output activation function for breaking the softmax bottleneck without addition…

    Submitted 28 May, 2018; originally announced May 2018.

    Comments: 15 pages, 2 figures