NLP Assignment 2-2
Problem: Fine-tune a lightweight transformer model (like DistilBERT or ALBERT) on a small subset of a
question-answering dataset (10000 examples).
a) After fine-tuning, visualize the attention patterns and explore the impact of pruning specific
attention heads on the model's performance: prune a few attention heads and measure the change
in model accuracy or loss.
b) Implement layer freezing: freeze different layers during fine-tuning (e.g., freeze the bottom N
layers and train only the top layers). Compare how freezing different layers affects performance,
and analyze the trade-offs between computational efficiency and performance.
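The freezing scheme in part b) can be sketched as below. This is a minimal sketch, not a prescribed solution: it assumes parameter names contain ".layer.<index>." for encoder blocks and ".embeddings." for the embedding table, which matches the naming I would expect from a Hugging Face DistilBERT checkpoint but should be checked against the actual model. The function name and the stand-in parameter objects are hypothetical; the stand-ins only let the snippet run without downloading a model.

```python
import re
from types import SimpleNamespace

# Assumption: encoder blocks are named like "...transformer.layer.3.ffn.lin1.weight".
LAYER_PATTERN = re.compile(r"\.layer\.(\d+)\.")


def freeze_bottom_layers(named_parameters, num_frozen, freeze_embeddings=True):
    """Disable gradients for the embeddings and for encoder layers below num_frozen.

    `named_parameters` is an iterable of (name, parameter) pairs.
    Returns the names of the parameters that remain trainable.
    """
    trainable = []
    for name, param in named_parameters:
        match = LAYER_PATTERN.search(name)
        if match:
            frozen = int(match.group(1)) < num_frozen
        else:
            # Non-layer parameters: freeze embeddings, keep the task head trainable.
            frozen = freeze_embeddings and ".embeddings." in name
        param.requires_grad = not frozen
        if not frozen:
            trainable.append(name)
    return trainable


# Stand-in parameters (hypothetical) so the sketch runs without a real model.
names = [
    "distilbert.embeddings.word_embeddings.weight",
    "distilbert.transformer.layer.0.attention.q_lin.weight",
    "distilbert.transformer.layer.5.ffn.lin2.weight",
    "qa_outputs.weight",
]
params = [(name, SimpleNamespace(requires_grad=True)) for name in names]
trainable_names = freeze_bottom_layers(params, num_frozen=4)
print(trainable_names)
```

With a real PyTorch model, the same function would be called as freeze_bottom_layers(model.named_parameters(), num_frozen=N). Repeating the fine-tuning run for several values of N, and recording training time alongside the evaluation metric, gives the efficiency versus performance comparison the task asks for.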
Fine-tune the model with adapter modules added for efficient, task-specific fine-tuning. Adapters are
lightweight neural modules that are inserted into transformer layers and trained on a specific task
without modifying the original model weights.
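To make "lightweight" concrete, the following back-of-the-envelope count estimates how many parameters a set of bottleneck adapters would add. It is a sketch under stated assumptions: each adapter is a down-projection followed by an up-projection (with biases), two adapters are placed in every layer, the bottleneck width is 64, and the base model has roughly 66 million parameters (an approximate figure for DistilBERT-base). None of these choices is fixed by the assignment.

```python
def adapter_param_count(hidden_size: int, bottleneck: int) -> int:
    """Parameters in one bottleneck adapter: down-projection plus up-projection."""
    down = hidden_size * bottleneck + bottleneck      # weights + bias
    up = bottleneck * hidden_size + hidden_size       # weights + bias
    return down + up


HIDDEN_SIZE = 768              # hidden size of DistilBERT-base
NUM_LAYERS = 6                 # transformer layers in DistilBERT-base
ADAPTERS_PER_LAYER = 2         # assumption: one after attention, one after the feed-forward block
BOTTLENECK = 64                # assumption: adapter bottleneck width
BASE_PARAMS = 66_000_000       # approximate size of the frozen base model

per_adapter = adapter_param_count(HIDDEN_SIZE, BOTTLENECK)
total_adapter_params = per_adapter * ADAPTERS_PER_LAYER * NUM_LAYERS
fraction = total_adapter_params / BASE_PARAMS

print(f"per adapter: {per_adapter}")
print(f"all adapters: {total_adapter_params}")
print(f"share of base model: {fraction:.2%}")
```

Under these assumptions the adapters amount to a small percentage of the base model's parameters, which is the source of the efficiency gain when only the adapters (and the QA head) are trained.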
Use BertViz, Captum, or other libraries to visualize attention across heads and layers for different QA
examples.
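Visualization suggests which heads look important; the pruning experiment in part a) can check that impression by ablating heads one at a time and ranking them by the metric drop they cause. The harness below is a minimal sketch: `evaluate` is a hypothetical callable that returns a higher-is-better score (for example exact match or F1) with the listed heads disabled, and how the heads are actually disabled in the model is deliberately left to that callable. The toy `fake_evaluate` exists only so the snippet runs on its own.

```python
from typing import Callable, List, Tuple

Head = Tuple[int, int]  # (layer index, head index)


def rank_heads_by_ablation(
    evaluate: Callable[[List[Head]], float],
    num_layers: int,
    num_heads: int,
) -> List[Tuple[Head, float]]:
    """Disable one head at a time and rank heads by the score drop they cause."""
    baseline = evaluate([])  # score with no heads removed
    drops = []
    for layer in range(num_layers):
        for head in range(num_heads):
            score = evaluate([(layer, head)])
            drops.append(((layer, head), baseline - score))
    # Largest drop first: these heads matter most for the task.
    return sorted(drops, key=lambda item: item[1], reverse=True)


# Toy stand-in for a real evaluation loop: only head (1, 0) affects the score.
def fake_evaluate(pruned: List[Head]) -> float:
    return 0.75 - (0.25 if (1, 0) in pruned else 0.0)


ranking = rank_heads_by_ablation(fake_evaluate, num_layers=2, num_heads=2)
for head, drop in ranking:
    print(head, drop)
```

For DistilBERT-base the loop would cover 6 layers with 12 heads each, so 72 evaluation passes over the 200 evaluation examples. Heads at the bottom of the ranking are the natural candidates to prune together in a follow-up run.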
Dataset: Use a small subset of the SQuAD dataset or another small dataset, such as BoolQ. Aim for a
dataset size of 10000 examples for training and 200 examples for evaluation.
Dataset links:
https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json
https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json
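One way to draw the training and evaluation subsets is sketched below. It assumes the SQuAD v2.0 JSON layout (a "data" list of articles, each with "paragraphs" that hold a "context" and a list of "qas"); the function names, the seed, and the tiny in-memory sample are illustrative and not part of the assignment.

```python
import json
import random


def flatten_squad(squad):
    """Turn the nested SQuAD JSON (data -> paragraphs -> qas) into a flat list."""
    examples = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                examples.append({
                    "id": qa["id"],
                    "question": qa["question"],
                    "context": paragraph["context"],
                    "answers": qa["answers"],
                    # v2.0 marks unanswerable questions with this flag.
                    "is_impossible": qa.get("is_impossible", False),
                })
    return examples


def sample_subset(examples, size, seed=0):
    """Draw a reproducible random subset of at most `size` examples."""
    rng = random.Random(seed)
    return rng.sample(examples, min(size, len(examples)))


def load_subset(path, size, seed=0):
    """Read a SQuAD-format file from disk and return a random subset."""
    with open(path, encoding="utf-8") as handle:
        return sample_subset(flatten_squad(json.load(handle)), size, seed)


# Tiny made-up sample in the same layout, so the sketch runs without the real files.
tiny = {
    "version": "v2.0",
    "data": [{
        "title": "Example",
        "paragraphs": [{
            "context": "Paris is the capital of France.",
            "qas": [
                {"id": "q1", "question": "What is the capital of France?",
                 "answers": [{"text": "Paris", "answer_start": 0}],
                 "is_impossible": False},
                {"id": "q2", "question": "Who founded Paris?",
                 "answers": [], "is_impossible": True},
                {"id": "q3", "question": "Which country has Paris as its capital?",
                 "answers": [{"text": "France", "answer_start": 24}],
                 "is_impossible": False},
            ],
        }],
    }],
}

examples = flatten_squad(tiny)
subset = sample_subset(examples, 2, seed=0)
print(len(examples), [example["id"] for example in subset])
```

With the downloaded files, load_subset("train-v2.0.json", 10000) and load_subset("dev-v2.0.json", 200) would give the two subsets. Fixing the seed keeps the subsets identical across the pruning, freezing, and adapter experiments, so their results stay comparable.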
Q2
Design a system to convert natural language questions into SPARQL queries for retrieving answers from
Wikidata. Your system should:
1. Train and test a Neural Machine Translation (NMT) model using the QALD-9 dataset, which contains
question-query pairs.
2. Perform entity and relation linking to map question entities and relations to Wikidata using tools like
BLINK, TagMe, or Falcon 2.0.
During inference, experiment with greedy decoding vs. beam search decoding.
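The difference between the two decoding strategies can be seen on a toy model before running them on the trained NMT system. The sketch below is illustrative only: `TOY_MODEL` is a hypothetical table of next-token probabilities chosen so that the locally best first token leads to a worse complete sequence, and the beam search omits length normalization for brevity.

```python
import math

EOS = "</s>"

# Hypothetical next-token probabilities keyed by the prefix generated so far.
TOY_MODEL = {
    (): {"SELECT": 0.6, "ASK": 0.4},
    ("SELECT",): {"?x": 0.55, "?y": 0.45},
    ("ASK",): {"{": 0.9, "?x": 0.1},
}


def next_probs(prefix):
    """Return the next-token distribution; unknown prefixes end the sequence."""
    return TOY_MODEL.get(prefix, {EOS: 1.0})


def greedy_decode(step, max_len=10):
    """Pick the single most probable token at every step."""
    prefix, score = (), 0.0
    while len(prefix) < max_len:
        token, prob = max(step(prefix).items(), key=lambda kv: kv[1])
        score += math.log(prob)
        if token == EOS:
            break
        prefix += (token,)
    return prefix, score


def beam_search(step, beam_size=2, max_len=10):
    """Keep the beam_size best partial sequences at every step."""
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            for token, prob in step(prefix).items():
                new_score = score + math.log(prob)
                if token == EOS:
                    finished.append((prefix, new_score))
                else:
                    candidates.append((prefix + (token,), new_score))
        if not candidates:
            break
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    # If nothing finished within max_len, fall back to the open beams.
    return max(finished or beams, key=lambda c: c[1])


greedy_seq, greedy_score = greedy_decode(next_probs)
beam_seq, beam_score = beam_search(next_probs, beam_size=2)
print("greedy:", greedy_seq, round(math.exp(greedy_score), 2))
print("beam:  ", beam_seq, round(math.exp(beam_score), 2))
```

In this toy setting greedy decoding commits to the most probable first token and cannot recover, while a beam of width 2 keeps the alternative alive and ends with the higher-probability sequence. On the real model, the comparison would report query accuracy and decoding time for several beam widths.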
Compare the results with and without teacher forcing during training.
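The two training regimes differ only in what the decoder is fed at each step, which the sketch below isolates. It is not a training loop: `predict_next` is a hypothetical stand-in for the decoder's prediction, and `teacher_forcing_ratio` is an assumed knob where 1.0 means the previous gold token is always fed and 0.0 means the model's own previous prediction is always fed.

```python
import random


def build_decoder_inputs(gold, predict_next, teacher_forcing_ratio, rng, bos="<s>"):
    """Return the tokens fed to the decoder and the predictions it made.

    At each step the next input is the gold token with probability
    teacher_forcing_ratio, otherwise the model's own prediction.
    """
    inputs, predictions = [], []
    previous = bos
    for gold_token in gold:
        inputs.append(previous)
        predicted = predict_next(inputs)  # hypothetical model call
        predictions.append(predicted)
        use_gold = rng.random() < teacher_forcing_ratio
        previous = gold_token if use_gold else predicted
    return inputs, predictions


# Toy "model" that always predicts the same token, to make the difference visible.
def always_x(_inputs):
    return "?x"


gold_query = ["SELECT", "?uri", "WHERE"]
forced_inputs, _ = build_decoder_inputs(gold_query, always_x, 1.0, random.Random(0))
free_inputs, _ = build_decoder_inputs(gold_query, always_x, 0.0, random.Random(0))
print("with teacher forcing:   ", forced_inputs)
print("without teacher forcing:", free_inputs)
```

With teacher forcing the decoder always conditions on the correct prefix, so early mistakes do not compound during training; without it, training conditions match inference but errors propagate. Comparing the two runs on the QALD-9 test set shows which effect dominates for this task.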
Dataset links:
Train set: https://github.com/KGQA/QALD_9_plus/blob/main/data/qald_9_plus_train_wikidata.json
Test set: https://github.com/KGQA/QALD_9_plus/blob/main/data/qald_9_plus_test_wikidata.json
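Extracting the question-query pairs for the NMT model might look like the sketch below. It assumes the QALD JSON layout as I understand it (a top-level "questions" list whose items hold a multilingual "question" list and a "query" object with a "sparql" string); this should be verified against the downloaded files. The sample record is made up, and its Wikidata IDs are placeholders rather than a real answer.

```python
import json


def extract_pairs(qald, language="en"):
    """Collect (question, SPARQL) pairs for one language, skipping incomplete items."""
    pairs = []
    for item in qald.get("questions", []):
        text = next(
            (q["string"] for q in item.get("question", [])
             if q.get("language") == language and q.get("string")),
            None,
        )
        sparql = item.get("query", {}).get("sparql")
        if text and sparql:
            pairs.append((text, sparql))
    return pairs


def load_pairs(path, language="en"):
    """Read a QALD-format JSON file from disk and return its question-query pairs."""
    with open(path, encoding="utf-8") as handle:
        return extract_pairs(json.load(handle), language)


# Made-up record in the assumed layout; Q1 and P1 are placeholder IDs.
sample = {
    "questions": [
        {
            "id": "1",
            "question": [
                {"language": "en", "string": "What is the time zone of Salt Lake City?"},
                {"language": "de", "string": "In welcher Zeitzone liegt Salt Lake City?"},
            ],
            "query": {"sparql": "SELECT ?x WHERE { wd:Q1 wdt:P1 ?x }"},
        },
        {
            "id": "2",
            "question": [{"language": "de", "string": "Nur auf Deutsch vorhanden"}],
            "query": {"sparql": "ASK WHERE { wd:Q1 wdt:P1 ?x }"},
        },
    ]
}

pairs = extract_pairs(sample)
print(pairs)
```

The links above point to GitHub's page view of each file, so the raw JSON has to be downloaded before load_pairs can parse it. Items without a question in the requested language are skipped rather than filled in, which keeps the training pairs aligned.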