Technical Seminar Report
on
“Large Language Model”
Submitted by
SamiAhmad Sanadi
(2JR20CS080)
It is certified that the Seminar work entitled “Large Language Model” is a bonafide
work satisfactorily completed by SamiAhmad Sanadi (2JR20CS080), in partial fulfillment of the
requirements for the award of the Bachelor of Engineering degree in Computer Science and
Engineering of Visvesvaraya Technological University, Belagavi, during the year 2023-2024.
The Seminar report fulfils the requirements of the regulations relating to the degree.
ACKNOWLEDGEMENT
It is my proud privilege and duty to acknowledge the kind help and guidance received
from several people in the preparation of this report. It would not have been possible to prepare
this report in this form without their valuable help, cooperation, and guidance.
First and foremost, I wish to record my sincere gratitude to the Management of our college
and to our beloved Principal Dr. S. V. Gorabal, Jain College of Engineering and Research,
Belagavi, for his constant support and encouragement towards the completion of this seminar.
I am also thankful to Dr. Virupaxi B. Dalal, Dean Academics, for creating the right kind of
milieu and giving moral support.
My sincere thanks to Dr. Pritam Dhumale, Head of the Department of Computer Science and
Engineering, JCER, for her support, which helped in the completion of the seminar.
My sincere thanks to Prof. Bharateesh N.F and Prof. Basavaraj M, Seminar Coordinators,
for providing the platform to present the seminar.
My sincere gratitude is also owed to all the staff members and my parents for their direct and
indirect support towards the successful completion of this seminar.
SamiAhmad
(2JR20CS080)
ABSTRACT
Large language models (LLMs) hold immense potential, yet their true capabilities can remain
veiled without proper guidance. Prompt engineering emerges as the key that unlocks this
potential. This report delves into the intricate art of crafting effective prompts, acting as a
bridge between human intent and LLM understanding. We embark on a journey to explore the
core principles of prompt engineering, fortified by insights gleaned from relevant research
papers. By dissecting the advantages and potential drawbacks of this innovative approach, we
gain a nuanced understanding of its impact. Finally, we unveil the transformative power of
prompt engineering through a captivating exploration of its diverse applications across various
domains. This report serves as a compass, guiding us towards a future where prompt
engineering empowers LLMs to revolutionize human-machine interaction and shape the very
landscape of AI.
CONTENTS
Declaration
Acknowledgement
Abstract
Chapter 1 Introduction
1.1 Defining Large Language Model
Chapter 2 Literature Survey
Chapter 3 Methodology
3.1 Data Collection and Preprocessing
3.2 Model Selection and Training
3.3 Evaluation
3.4 Deployment and Integration
3.5 Ethical Considerations
3.6 Documentation and Reporting
Chapter 1
INTRODUCTION
The development of Large Language Models marks a milestone in the evolution of NLP,
building upon decades of research and innovation in the field. Early NLP systems relied on
rule-based approaches and shallow linguistic models, which had limited effectiveness in
capturing the complexity of human language. However, with the advent of deep learning and
the availability of massive datasets, LLMs have emerged as the state-of-the-art approach to
various NLP tasks, including language translation, text summarization, sentiment analysis,
and more. This evolution has been driven by advances in machine learning algorithms,
computational power, and the availability of large-scale annotated datasets.
The rise of Large Language Models has led to a proliferation of applications across diverse
domains, revolutionizing how we interact with technology and process vast amounts of
textual data. From virtual assistants and chatbots to content generation and language
understanding tasks, LLMs have demonstrated remarkable versatility and utility. However,
their widespread adoption also raises important ethical, societal, and technical considerations,
including issues related to bias, privacy, and the environmental impact of training such large
models. Understanding the capabilities, limitations, and implications of LLMs is crucial for
harnessing their potential while mitigating associated risks.
Chapter 2
LITERATURE SURVEY
Wayne Xin Zhao, Kun Zhou, Junyi Li, Tianyi Tang, Xiaolei Wang, Yupeng
Hou, Yingqian Min, Beichen Zhang, Junjie Zhang, 24 Nov 2023
Shervin Minaee, Tomas Mikolov, Narjes Nikzad, Meysam Chenaghlu, Richard Socher,
Xavier Amatriain, Jianfeng Gao, 20 Feb 2024
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance
on a wide range of natural language tasks, since the release of ChatGPT in November 2022.
LLMs acquire their general-purpose language understanding and generation ability by training
billions of parameters on massive amounts of text data, as predicted by scaling laws
(Kaplan et al., 2020; Hoffmann et al., 2022). The research area of LLMs, while very
recent, is evolving rapidly in many different ways.
Yang Liu, Jiahuan Cao, Chongyu Liu, Kai Ding, Lianwen Jin, 28 Feb 2024
This paper embarks on an exploration into the Large Language Model (LLM) datasets, which
play a crucial role in the remarkable advancements of LLMs. The datasets serve as the
foundational infrastructure analogous to a root system that sustains and nurtures the
development of LLMs. Consequently, examination of these datasets emerges as a critical
topic in research. In order to address the current lack of a comprehensive overview and
thorough analysis of LLM datasets, and to gain insights into their current status and future
trends, this survey consolidates and categorizes the fundamental aspects of LLM datasets
from five perspectives.
Tongtong Wu, Linhao Luo, Yuan-Fang Li, Shirui Pan, Thuy-Trang Vu, Gholamreza
Haffari, 7 Feb 2024
Large language models (LLMs) are not amenable to frequent re-training, due to high training
costs arising from their massive scale. However, updates are necessary to endow LLMs with
new skills and keep them up-to-date with rapidly evolving human knowledge. This paper
surveys recent works on continual learning for LLMs. Due to the unique nature of LLMs, we
catalog continual learning techniques in a novel multi-staged categorization scheme,
involving continual pretraining, instruction tuning, and alignment. We contrast continual
learning for LLMs with simpler adaptation methods used in smaller models, as well as with
other enhancement strategies like retrieval-augmented generation and model editing.
Moreover, informed by a discussion of benchmarks and evaluation, we identify several
challenges and future work directions for this crucial task.
Chapter 3
METHODOLOGY
3.1 Data Collection and Preprocessing
Data Selection: Identify and collect a diverse range of text data sources relevant to the scope
of your project. These may include books, articles, websites, social media posts, and more.
Data Preprocessing: Clean and preprocess the collected data to ensure consistency and
remove noise. This may involve tasks such as tokenization, lowercasing, removing special
characters, and handling stop words.
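To make the preprocessing steps above concrete, the following Python sketch tokenizes, lowercases, strips special characters, and removes stop words. It is a minimal illustration only; the use of the NLTK library and the exact cleaning rules are assumptions, not choices prescribed by this report.

    # Minimal preprocessing sketch (assumes the NLTK library; illustrative only).
    import re
    import nltk
    from nltk.corpus import stopwords
    from nltk.tokenize import word_tokenize

    nltk.download("punkt", quiet=True)       # tokenizer data
    nltk.download("stopwords", quiet=True)   # English stop-word list
    STOP_WORDS = set(stopwords.words("english"))

    def preprocess(text):
        """Lowercase, remove special characters, tokenize, and drop stop words."""
        text = text.lower()
        text = re.sub(r"[^a-z0-9\s]", " ", text)   # keep only letters, digits, spaces
        tokens = word_tokenize(text)
        return [tok for tok in tokens if tok not in STOP_WORDS]

    print(preprocess("Large Language Models (LLMs) are transforming NLP!"))
    # -> ['large', 'language', 'models', 'llms', 'transforming', 'nlp']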
3.2 Model Selection and Training
• Choice of Large Language Model: Select a pre-trained Large Language Model suitable
for your project requirements. Popular choices include GPT (Generative Pre-trained
Transformer), BERT (Bidirectional Encoder Representations from Transformers), and
their variants.
• Fine-tuning: Fine-tune the selected pre-trained model on your specific dataset and task.
Fine-tuning involves updating the parameters of the pre-trained model using your
annotated or labeled data to adapt it to the target task.
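The bullet above refers to fine-tuning a pre-trained model on labeled data. The sketch below shows one possible way to do this for a simple text-classification task with the Hugging Face Transformers and Datasets libraries; the model name (bert-base-uncased), the dataset (IMDB reviews), and the hyperparameters are illustrative assumptions rather than recommendations of this report.

    # Minimal fine-tuning sketch (assumes Hugging Face Transformers and Datasets).
    # Model, dataset, and hyperparameters are illustrative placeholders.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_name = "bert-base-uncased"
    dataset = load_dataset("imdb")                      # labeled sentiment dataset
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

    tokenized = dataset.map(tokenize, batched=True)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    args = TrainingArguments(output_dir="finetuned-model", num_train_epochs=1,
                             per_device_train_batch_size=8, learning_rate=2e-5)

    trainer = Trainer(model=model, args=args,
                      train_dataset=tokenized["train"].shuffle(seed=42).select(range(2000)),
                      eval_dataset=tokenized["test"].select(range(500)))

    trainer.train()          # updates the pre-trained weights on the labeled data
    print(trainer.evaluate())

Fine-tuning in this way reuses the knowledge already captured during pre-training, so far less labeled data and compute are needed than when training a model from scratch.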
3.3 Evaluation
• Metric Selection: Choose appropriate evaluation metrics based on the nature of your
task. Common metrics include perplexity for language modeling, BLEU score for
translation, and accuracy and F1 score for classification (a short worked sketch follows
this list).
• Cross-Validation: Perform cross-validation or split your dataset into training, validation,
and test sets to evaluate the generalization performance of the model.
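The sketch below illustrates the metrics mentioned in this list: perplexity is computed as the exponential of the average cross-entropy loss on held-out text, and accuracy and F1 are computed with scikit-learn. The loss value, labels, and predictions are placeholders used only for illustration.

    # Evaluation metric sketch; the loss value, labels, and predictions are placeholders.
    import math
    from sklearn.metrics import accuracy_score, f1_score

    avg_cross_entropy = 3.2                  # e.g. the eval loss reported after training
    perplexity = math.exp(avg_cross_entropy) # perplexity = exp(average cross-entropy)
    print(f"perplexity: {perplexity:.2f}")

    y_true = [0, 1, 1, 0, 1, 0]              # gold labels (placeholder)
    y_pred = [0, 1, 0, 0, 1, 1]              # model predictions (placeholder)
    print("accuracy:", accuracy_score(y_true, y_pred))
    print("F1:", f1_score(y_true, y_pred))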
Chapter 4
4.2 Disadvantages
1. Computational Resources: Training and fine-tuning Large Language Models
require significant computational resources, including high-performance GPUs or
TPUs and large-scale distributed computing infrastructure. This can pose barriers to
entry for smaller research teams or organizations with limited resources.
2. Data Dependency: LLMs are highly dependent on the quality and diversity of
training data. Biases, inaccuracies, or lack of representativeness in the training data
can lead to biased model outputs and unreliable performance, particularly in real-
world applications where the data may be noisy or unstructured.
Chapter 5
APPLICATIONS
5.1 Virtual Assistants and Chatbots
LLMs power virtual assistants and chatbots that interact with users in natural language,
providing information, answering questions, and assisting with tasks. Examples include
Apple's Siri, Amazon's Alexa, Google Assistant, and chatbots used in customer service.
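As a minimal illustration of how an LLM can drive a chatbot, the sketch below wraps a small open text-generation model in a question-and-answer loop using the Hugging Face pipeline API. The model choice (gpt2) and the prompt format are assumptions made only for this example; production assistants such as Siri, Alexa, or Google Assistant rely on far larger models and much more elaborate pipelines.

    # Minimal chatbot loop (assumes Hugging Face Transformers; gpt2 is an illustrative model).
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    def reply(user_message):
        """Generate a short continuation of the user's message as the bot reply."""
        prompt = f"User: {user_message}\nAssistant:"
        output = generator(prompt, max_new_tokens=50, do_sample=True, temperature=0.7)
        # The pipeline returns the prompt plus the generated continuation.
        return output[0]["generated_text"][len(prompt):].strip()

    print("Type 'quit' to exit.")
    while True:
        message = input("You: ")
        if message.strip().lower() == "quit":
            break
        print("Bot:", reply(message))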
Chapter 6
CONCLUSION
Prompt engineering has unveiled a new frontier in the realm of artificial intelligence. It
empowers us to harness the true potential of large language models, transforming them from
repositories of data into powerful tools for understanding, generating, and interacting with the
world around us. By crafting effective prompts, we act as conductors, guiding LLMs towards
desired outcomes and unlocking a universe of possibilities.
As we delve deeper into this nascent field, the applications of prompt engineering are only
beginning to unfold. From revolutionizing information retrieval to fostering cross-cultural
communication and unleashing creative expression, the potential is limitless. The ability to
fine-tune LLMs through prompts offers a cost-effective and efficient alternative to traditional
model training, accelerating innovation and democratizing access to advanced AI capabilities.
REFERENCES
[1] Zhongwei Wan, Xin Wang, Che Liu, Samiul Alam, Yu Zheng, Jiachen Liu, Zhongnan Qu,
Shen Yan, Yi Zhu, Quanlu Zhang, Mosharaf Chowdhury, Mi Zhang, “Efficient Large Language
Models: A Survey”, Jan 2024.
[2] Shashi Kant Singh, Shubham Kumar, Pawan Singh Mehra, “ChatGPT & Google Bard AI:
A Review”, 23-24 June 2023.
[3] Roberto Gozalo-Brizuela, Eduardo C. Garrido-Merchan, “ChatGPT is not all you need.
A State of the Art Review of large Generative AI models”, 11 Jan 2023.
[4] Santosh Maher, Suvarnsing Bhable, Ashish Lahase, Sunil Nimbore, “AI and Deep
Learning-driven Chatbots: A Comprehensive Analysis and Application Trends”, 08 June 2023.