NLP Assignment 2-2

The document outlines two tasks: fine-tuning a lightweight transformer model on a question-answering dataset and designing a system to convert natural language questions into SPARQL queries. The first task involves visualizing attention patterns, pruning attention heads, and implementing layer freezing with Adapter Modules for efficient fine-tuning. The second task focuses on training a Neural Machine Translation model using the QALD-9 dataset, performing entity and relation linking, and evaluating the accuracy of SPARQL query generation and answer retrieval from Wikidata.


Q1

Problem: Fine-tune a lightweight transformer model (e.g., DistilBERT or ALBERT) on a small subset of a question-answering dataset (10,000 examples).

a) After fine-tuning, visualize the attention patterns and explore the impact of pruning specific attention heads on the model's performance. Prune a few attention heads and measure the impact on model accuracy or loss.
b) Implement layer freezing: freeze different layers during fine-tuning (e.g., freeze the bottom N layers and only train the top layers). Compare how freezing different layers affects performance, and analyze the trade-offs between computational efficiency and performance. (A sketch covering both head pruning and layer freezing follows this list.)
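
A minimal sketch of both experiments with the Hugging Face Transformers API, assuming a fine-tuned DistilBERT QA checkpoint saved at ./distilbert-qa (the path and the layer/head indices are illustrative, not prescribed by the assignment):

    from transformers import AutoModelForQuestionAnswering

    model = AutoModelForQuestionAnswering.from_pretrained("./distilbert-qa")

    # a) Prune specific attention heads: {layer_index: [head_indices]}.
    #    Re-run evaluation afterwards to measure the change in accuracy/loss.
    model.prune_heads({0: [2, 5], 3: [1]})

    # b) Freeze the bottom N transformer layers; only the remaining top layers
    #    (and the QA head) keep requires_grad=True and receive gradient updates.
    N = 4
    for layer in model.distilbert.transformer.layer[:N]:
        for param in layer.parameters():
            param.requires_grad = False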

Additionally, fine-tune the model with Adapter Modules for efficient, task-specific fine-tuning. Adapters are lightweight neural modules that can be inserted into transformer layers and trained on a specific task without modifying the original model weights.
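
One way to do this is with the AdapterHub "adapters" package (pip install adapters); the adapter name and the "seq_bn" bottleneck config below are illustrative choices, not requirements:

    import adapters
    from transformers import AutoModelForQuestionAnswering

    model = AutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")
    adapters.init(model)                              # add adapter support to the model
    model.add_adapter("qa_adapter", config="seq_bn")  # insert bottleneck adapters
    model.train_adapter("qa_adapter")                 # freeze base weights; train adapter only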

Use BertViz, Captum, or other libraries to visualize attention across heads and layers for different QA examples.
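
For example, a minimal BertViz sketch (renders an interactive view in a Jupyter notebook; the question/context pair is a made-up example):

    from transformers import AutoTokenizer, AutoModel
    from bertviz import head_view

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    model = AutoModel.from_pretrained("distilbert-base-uncased",
                                      output_attentions=True)

    inputs = tokenizer("Who wrote Hamlet?", "Hamlet was written by Shakespeare.",
                       return_tensors="pt")
    outputs = model(**inputs)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    head_view(outputs.attentions, tokens)  # per-head, per-layer attention view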

Dataset: Use a small subset of the SQuAD dataset or another small dataset, such as BoolQ. Aim for a dataset size of 10,000 examples for training and 200 examples for evaluation.

Dataset Link:

https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v2.0.json
https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v2.0.json
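
If you work with the Hugging Face datasets library instead of the raw JSON files, the prescribed subset sizes can be taken directly from the squad_v2 splits (which mirror the files linked above):

    from datasets import load_dataset

    train_ds = load_dataset("squad_v2", split="train[:10000]")
    eval_ds = load_dataset("squad_v2", split="validation[:200]")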

Q2

Design a system to convert natural language questions into SPARQL queries for retrieving answers from
Wikidata. Your system should:

1. Train and test a Neural Machine Translation (NMT) model using the QALD-9 dataset, which contains question-query pairs.

2. Perform entity and relation linking to map question entities and relations to Wikidata using tools like BLINK, TagMe, or Falcon 2.0 (see the Falcon 2.0 sketch after this list).

3. Find the corresponding answers to the questions.
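
As one option for step 2, Falcon 2.0 exposes a public REST API; the endpoint and response field names below are taken from the Falcon 2.0 documentation as best recalled, so verify them before relying on this sketch:

    import requests

    resp = requests.post(
        "https://labs.tib.eu/falcon/falcon2/api?mode=long",
        json={"text": "Where did Abraham Lincoln die?"},
        headers={"Content-Type": "application/json"},
    )
    result = resp.json()
    print(result.get("entities_wikidata"))   # e.g. Abraham Lincoln -> wd:Q91
    print(result.get("relations_wikidata"))  # e.g. place of death -> wdt:P20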

Report two metrics:

1. Accuracy of correct SPARQL query generation.

2. Accuracy of correct answer retrieval from Wikidata based on the generated SPARQL queries.

Utilize a sequence-to-sequence (Seq2Seq) architecture enhanced with Bi-directional LSTM (Bi-LSTM) and multi-layer LSTM components to capture complex patterns in the data.
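
A minimal PyTorch sketch of this backbone (a multi-layer bidirectional LSTM encoder feeding an LSTM decoder, without attention; all dimensions are illustrative):

    import torch
    import torch.nn as nn

    def merge_directions(s):
        # (num_layers * 2, batch, hid) -> (num_layers, batch, 2 * hid)
        layers2, batch, hid = s.shape
        s = s.view(layers2 // 2, 2, batch, hid)
        return torch.cat([s[:, 0], s[:, 1]], dim=-1)

    class Encoder(nn.Module):
        def __init__(self, vocab_size, emb_dim=256, hid_dim=256, num_layers=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hid_dim, num_layers=num_layers,
                                bidirectional=True, batch_first=True)

        def forward(self, src):                        # src: (batch, src_len)
            outputs, (h, c) = self.lstm(self.embedding(src))
            # outputs: (batch, src_len, 2 * hid_dim); keep them for attention later
            return outputs, (merge_directions(h), merge_directions(c))

    class Decoder(nn.Module):
        def __init__(self, vocab_size, emb_dim=256, hid_dim=256, num_layers=2):
            super().__init__()
            self.embedding = nn.Embedding(vocab_size, emb_dim)
            # hidden size is 2 * hid_dim so the merged encoder state fits directly
            self.lstm = nn.LSTM(emb_dim, 2 * hid_dim, num_layers=num_layers,
                                batch_first=True)
            self.out = nn.Linear(2 * hid_dim, vocab_size)

        def forward(self, tgt_tokens, state):          # tgt_tokens: (batch, t)
            output, state = self.lstm(self.embedding(tgt_tokens), state)
            return self.out(output), state             # logits: (batch, t, vocab)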

Train the model both with and without attention mechanisms.

During inference, experiment with greedy decoding vs. beam search decoding.
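
A greedy decoder over the sketch above simply takes the argmax token at every step; beam search instead keeps the k highest-scoring partial queries per step and returns the best finished one (bos_id/eos_id are assumed vocabulary entries):

    import torch

    def greedy_decode(encoder, decoder, src, bos_id, eos_id, max_len=60):
        with torch.no_grad():
            _, state = encoder(src)                    # src: (1, src_len)
            token = torch.tensor([[bos_id]])
            out_ids = []
            for _ in range(max_len):
                logits, state = decoder(token, state)
                token = logits[:, -1:].argmax(dim=-1)  # greedy: single best token
                if token.item() == eos_id:
                    break
                out_ids.append(token.item())
        return out_ids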

Compare the results with and without teacher forcing during training.
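
Teacher forcing is a one-line switch in the training loop: with probability tf_ratio the gold previous token is fed to the decoder, otherwise the model's own prediction (tf_ratio=1.0 is pure teacher forcing, 0.0 is free running). A sketch against the Decoder above:

    import random
    import torch

    def decode_for_training(decoder, state, tgt, tf_ratio=0.5):
        # tgt: (batch, tgt_len) gold token ids; first column assumed to be BOS
        token = tgt[:, :1]
        step_logits = []
        for t in range(1, tgt.size(1)):
            logits, state = decoder(token, state)      # logits: (batch, 1, vocab)
            step_logits.append(logits)
            if random.random() < tf_ratio:
                token = tgt[:, t:t + 1]                # teacher forcing: feed gold
            else:
                token = logits.argmax(dim=-1)          # free running: feed prediction
        return torch.cat(step_logits, dim=1)           # (batch, tgt_len - 1, vocab)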

Dataset link:

Train set: https://github.com/KGQA/QALD_9_plus/blob/main/data/qald_9_plus_train_wikidata.json

Test set: https://github.com/KGQA/QALD_9_plus/blob/main/data/qald_9_plus_test_wikidata.json

Consider only the English language questions in the dataset.

SPARQL query example:

Question: Where did Abraham Lincoln die?

SPARQL query: SELECT DISTINCT ?uri WHERE { wd:Q91 wdt:P20 ?uri }
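
To check answer-retrieval accuracy, generated queries can be executed against the public Wikidata endpoint, for instance with SPARQLWrapper (pip install sparqlwrapper); the user-agent string is an illustrative courtesy value:

    from SPARQLWrapper import SPARQLWrapper, JSON

    sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                           agent="nlp-assignment/0.1")
    sparql.setQuery("""
        PREFIX wd: <http://www.wikidata.org/entity/>
        PREFIX wdt: <http://www.wikidata.org/prop/direct/>
        SELECT DISTINCT ?uri WHERE { wd:Q91 wdt:P20 ?uri }
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    for row in results["results"]["bindings"]:
        print(row["uri"]["value"])  # the place-of-death entity for Abraham Lincoln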
