
CNN Interview Questions – Special Topics & Applications


Can CNNs be applied to non-image data? If so, how?
Yes, CNNs can be applied to any data with a spatial or grid-like structure, such as:
- 1D CNNs for time-series data
- 2D CNNs for images
- 3D CNNs for volumetric data (e.g., CT scans)
Text data can be converted into sequences or matrices (such as embeddings) to apply 1D or 2D CNNs.
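As a concrete illustration, the sliding-window operation a 1D CNN layer applies to a time series can be sketched in plain NumPy (an educational sketch of the operation itself, not framework code):

```python
import numpy as np

def conv1d(signal, kernel, stride=1):
    """Valid (no-padding) 1D cross-correlation, the operation a Conv1D layer applies."""
    k = len(kernel)
    out_len = (len(signal) - k) // stride + 1
    return np.array([np.dot(signal[i * stride:i * stride + k], kernel)
                     for i in range(out_len)])

# A 3-tap difference kernel responds to the rising edge in a step signal.
x = np.array([0., 0., 0., 1., 1., 1.])
w = np.array([-1., 0., 1.])
y = conv1d(x, w)   # -> [0., 1., 1., 0.]: fires around the step
```

Note that, despite the name, CNN libraries compute cross-correlation (no kernel flip), which is what this sketch does as well.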

What is the attention mechanism, and how does it relate to CNNs?


Attention allows models to focus on important parts of the input. While CNNs capture local
patterns, attention can capture global dependencies. Hybrid models combine CNNs with
attention for tasks like image captioning and segmentation.
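A minimal NumPy sketch of scaled dot-product attention (the core of the mechanism; real models add learned projections and multiple heads) shows how every output position mixes information from all input positions — the global view a fixed-size convolution lacks:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # shift for numerical stability
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: each query's output is a weighted average
    of all value rows, so every position can draw on the whole input."""
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    return softmax(scores) @ V

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # 4 query positions, dimension 8
K = rng.normal(size=(6, 8))   # 6 key/value positions
V = rng.normal(size=(6, 8))
out = attention(Q, K, V)      # shape (4, 8)
```

Each row of the softmax output is a probability distribution over all six input positions, which is what lets attention capture long-range dependencies in one step.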

What are group convolutions and why are they used?


Group convolutions split the input channels and filters into groups and apply convolution separately within each group. This reduces computation and parameter count by a factor of the number of groups. AlexNet used grouping to split work across two GPUs, and ResNeXt uses it to improve accuracy at a given computational budget.
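The parameter savings are easy to verify by counting weights. This small sketch compares a standard 3×3 convolution with a 32-group version at the same channel widths (illustrative arithmetic only; the channel counts are arbitrary examples):

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution layer. A grouped convolution runs
    `groups` independent convolutions on channel slices, dividing the count
    by `groups` (biases omitted for simplicity)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * (c_out // groups) * k * k * groups

standard = conv_params(256, 256, 3)             # 589,824 weights
grouped = conv_params(256, 256, 3, groups=32)   # 18,432 weights: 32x fewer
```

Depthwise convolution (as in MobileNet-style architectures) is the extreme case where the number of groups equals the number of channels.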

How is CNN different from Transformer for vision tasks?


CNNs use convolution to capture local patterns and carry strong inductive biases (locality, translation equivariance), which makes them data-efficient. Transformers use self-attention to capture long-range dependencies and global context. Vision Transformers (ViTs) can outperform CNNs when trained on large datasets, but typically require more training data or stronger augmentation to match CNNs on smaller ones.

How can you visualize filters or feature maps in a trained CNN?


You can visualize:
- Filters: By plotting the learned weights
- Feature maps: By feeding an image to the model and extracting intermediate layer outputs
Visualization helps interpret what the model is learning.
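A toy NumPy sketch of both ideas: the small hand-written filters stand in for the learned weights you would plot, and `feature_maps` collects the intermediate output each filter produces. (In a real framework you would extract intermediate layer outputs from the trained model, e.g. via forward hooks, rather than reimplement the convolution.)

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2D cross-correlation of a single-channel image with one filter."""
    kh, kw = kernel.shape
    H, W = img.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def feature_maps(img, filters):
    """Collect the feature map each named filter produces -- what you would plot."""
    return {name: conv2d(img, k) for name, k in filters.items()}

img = np.eye(5)  # toy 5x5 "image" with a diagonal edge
filters = {"vertical_edge": np.array([[1., -1.]]),
           "horizontal_edge": np.array([[1.], [-1.]])}
maps = feature_maps(img, filters)
```

Plotting each map as a heatmap shows which image regions activate each filter, which is the basic interpretation workflow.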

What’s the difference between training a CNN from scratch and using pre-trained models?
- Training from scratch: Requires large data and more time
- Using pre-trained: Faster, needs less data, and leverages learned features from large
datasets (e.g., ImageNet)
Transfer learning is commonly used in practice.
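The transfer-learning recipe — keep the pre-trained feature extractor frozen and fit only a small task head — can be sketched abstractly in NumPy. Here a fixed random projection is a hypothetical stand-in for frozen convolutional features; no real pre-trained network is involved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a frozen pre-trained backbone: a fixed projection
# plus ReLU. Its weights are never updated during "fine-tuning".
W_frozen = rng.normal(size=(8, 4))

def features(x):
    return np.maximum(x @ W_frozen, 0.0)

# Only the small task head is fit, here in closed form with least squares.
X = rng.normal(size=(32, 8))   # the new task's inputs
y = rng.normal(size=(32, 1))   # the new task's targets
F = features(X)                # frozen features for the new data
W_head, *_ = np.linalg.lstsq(F, y, rcond=None)
pred = F @ W_head
```

Because only the head's parameters are estimated, far less labeled data is needed than training the whole feature extractor from scratch.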
How do you choose the depth and number of filters in each CNN layer?
Typically:
- Start shallow and increase depth gradually
- Use fewer filters in early layers and more in deeper layers to capture complex patterns
Hyperparameter tuning and architecture search can help find optimal values.
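The "more filters in deeper layers" heuristic is often implemented by doubling the channel count at each stage as pooling halves the spatial resolution; a sketch of that schedule (the base width of 16 is an arbitrary example):

```python
def channel_schedule(depth, base=16):
    """A common heuristic: double the filter count at each stage while pooling
    halves the spatial resolution, keeping compute per stage roughly balanced."""
    return [base * 2 ** i for i in range(depth)]

channel_schedule(4)        # [16, 32, 64, 128]
channel_schedule(3, 32)    # [32, 64, 128]
```

This is a starting point, not a rule; tuning or architecture search refines it for a given task and budget.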

What are common challenges in deploying CNNs in production?


- High inference latency
- Large model size
- Need for model compression or quantization
- Hardware limitations
- Maintaining performance across real-world scenarios
- Explainability and debugging difficulties
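Model compression is concrete enough to sketch: symmetric int8 post-training quantization stores each weight in one byte instead of four, roughly quartering model size at the cost of bounded rounding error. (This is a simplified per-tensor scheme; production toolchains typically use per-channel scales and calibration data.)

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor int8 quantization: one float scale, 1-byte weights."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.default_rng(1).normal(size=(256, 256)).astype(np.float32)
q, scale = quantize_int8(w)
size_ratio = q.nbytes / w.nbytes          # 0.25: weights shrink 4x
max_err = np.max(np.abs(dequantize(q, scale) - w))
```

Smaller weights also cut memory bandwidth at inference time, which is often the dominant latency cost on edge hardware.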
