
Advances in AI

Module-1
Co-Training

Co-Training is a semi-supervised learning technique in machine learning, where two or more models (or learners) are trained simultaneously using different views (or subsets) of the data. The key idea behind co-training is that each model is trained on a different subset of features, and the models help each other improve by providing additional labeled data.
Co-Training

• Multiple Views: Co-training works under the assumption that the data can be
represented by multiple "views," where each view provides sufficient information
to make predictions. For example, in a text classification task, one view might be
based on the words in the text, and another view might be based on the
metadata (e.g., author, publication date).

• Semi-Supervised Learning: Co-training is especially useful in scenarios where labeled data is scarce but unlabeled data is abundant. The models start with a small amount of labeled data and gradually expand their labeled dataset by labeling the unlabeled examples they are most confident about.
Co-Training

• Iterative Process: The models are trained iteratively. Initially, they are trained on
a small labeled dataset. Then, each model makes predictions on the unlabeled
data. The most confident predictions are added to the labeled dataset, and the
models are retrained on this expanded dataset. This process is repeated until the
models converge.

• Assumptions: Co-training assumes that the views are conditionally independent given the class label, meaning that if you know the class label, knowing one view doesn’t give you additional information about the other view. This assumption allows the models to correct each other’s mistakes and improve overall accuracy.
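
A minimal sketch of this iterative procedure, written with scikit-learn, is shown below. It assumes the two views arrive as two pre-split feature matrices (X_view1, X_view2) over the same examples; the labeled-pool size, number of rounds, confidence-based selection, and the choice of logistic regression are illustrative assumptions, not a standard recipe.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def co_train(X_view1, X_view2, y, n_labeled=50, n_rounds=10, n_add=10):
        # y holds true labels for the first n_labeled rows; the rest are treated as unlabeled.
        labeled = np.arange(n_labeled)
        unlabeled = np.arange(n_labeled, len(y))
        y_work = y.copy()

        clf1 = LogisticRegression(max_iter=1000)
        clf2 = LogisticRegression(max_iter=1000)
        for _ in range(n_rounds):
            if len(unlabeled) == 0:
                break
            # Retrain each learner on its own view of the current labeled pool.
            clf1.fit(X_view1[labeled], y_work[labeled])
            clf2.fit(X_view2[labeled], y_work[labeled])
            # Each learner pseudo-labels the unlabeled pool and keeps its most confident picks.
            newly_labeled = []
            for clf, X in ((clf1, X_view1), (clf2, X_view2)):
                proba = clf.predict_proba(X[unlabeled])
                top = unlabeled[np.argsort(-proba.max(axis=1))[:n_add]]
                y_work[top] = clf.predict(X[top])
                newly_labeled.extend(top)
            # Move the pseudo-labeled examples into the labeled pool and repeat.
            newly_labeled = np.unique(newly_labeled)
            labeled = np.union1d(labeled, newly_labeled)
            unlabeled = np.setdiff1d(unlabeled, newly_labeled)
        return clf1, clf2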
Multi-Task Learning

Multi-task learning (MTL) is an approach in artificial intelligence and machine learning where a model is trained on multiple tasks simultaneously. Instead of focusing on a single objective, the model learns to perform several related tasks, sharing knowledge across them. This shared knowledge often leads to better performance, especially in tasks with limited data, as the model can generalize better by leveraging information from related tasks.
Multi-Task Learning
• In MTL, tasks share a common representation (e.g., shared layers in a neural
network). This allows the model to learn features that are useful across multiple
tasks, leading to better generalization.

• The effectiveness of MTL depends on the relatedness of the tasks. If the tasks are
too dissimilar, sharing representations might not be beneficial and could even
degrade performance.

• MTL can act as a form of regularization. By requiring the model to perform well
on multiple tasks, it discourages overfitting to a single task and promotes learning
more general features.
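
As a concrete illustration of this shared-representation idea, the PyTorch sketch below uses a shared trunk with two task-specific heads trained on one joint loss. The layer sizes, the two classification tasks, and the equal loss weighting are placeholder assumptions for the example, not a prescribed architecture.

    import torch
    import torch.nn as nn

    class SharedMTLNet(nn.Module):
        def __init__(self, in_dim=128, hidden=64, n_classes_a=5, n_classes_b=3):
            super().__init__()
            # Shared trunk: features learned here are reused by both tasks.
            self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            # Task-specific heads.
            self.head_a = nn.Linear(hidden, n_classes_a)
            self.head_b = nn.Linear(hidden, n_classes_b)

        def forward(self, x):
            h = self.shared(x)
            return self.head_a(h), self.head_b(h)

    # One joint training step: sum the per-task losses so both tasks update the shared trunk.
    model = SharedMTLNet()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()

    x = torch.randn(32, 128)              # one shared batch of inputs (toy data)
    y_a = torch.randint(0, 5, (32,))      # labels for task A
    y_b = torch.randint(0, 3, (32,))      # labels for task B

    out_a, out_b = model(x)
    loss = loss_fn(out_a, y_a) + loss_fn(out_b, y_b)
    opt.zero_grad()
    loss.backward()
    opt.step()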
Multi-Task Learning
Applications

MTL is widely used in areas like natural language processing (NLP), where tasks like sentiment analysis, translation, and named entity recognition can benefit from shared learning. It's also applied in computer vision, where tasks like object detection, segmentation, and classification are often learned together.
Multi-Task Learning
Advantages of Multi-Task Learning

• Improved Performance: By learning related tasks together, MTL can lead to better overall performance, especially on smaller datasets.
• Data Efficiency: MTL allows the model to leverage data from multiple
tasks, which is especially useful when data for individual tasks is
scarce.
• Generalization: The shared learning encourages the model to find
features that generalize well across tasks, reducing overfitting.
Multi-Task Learning
Challenges in Multi-Task Learning

• Task Imbalance: If one task is much harder or has more data than others, it can dominate the learning process, leading to suboptimal performance on the other tasks.
• Negative Transfer: If the tasks are not sufficiently related, the model might struggle, as knowledge from one task could negatively impact the performance on another.
Coupled Semi-Supervised Learning (CSSL)
Coupled Semi-Supervised Learning (CSSL) is an approach in machine
learning that combines elements of both supervised and unsupervised
learning to leverage the strengths of both paradigms. The core idea is
to use a small amount of labeled data in conjunction with a large
amount of unlabeled data to improve learning performance.
Coupled Semi-Supervised Learning (CSSL)
• In traditional supervised learning, a model is trained on labeled data, where each
input comes with an associated label. In contrast, semi-supervised learning uses
both labeled and unlabeled data. The main goal is to improve model accuracy by
exploiting the structure in the unlabeled data to assist the learning process. This
is particularly useful when labeling data is expensive or time-consuming.

• In the context of CSSL, "coupling" refers to the interaction or integration between multiple learners or models, each learning from different views or aspects of the data. These coupled learners share information during training, which allows them to enhance each other’s learning process.
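
One simple way to realize this coupling is sketched below: each learner pseudo-labels the unlabeled pool from its own view, and its confident predictions are handed to the other learner as extra training data. This is an illustrative cross-teaching step under assumed names and thresholds, not a specific published CSSL algorithm.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def coupled_step(model_a, model_b, Xa_lab, Xb_lab, y_lab,
                     Xa_unlab, Xb_unlab, threshold=0.9):
        # Train each learner on its own view of the shared labeled set.
        model_a.fit(Xa_lab, y_lab)
        model_b.fit(Xb_lab, y_lab)

        # Each learner pseudo-labels the unlabeled pool; keep only confident predictions.
        pseudo_a = model_a.predict(Xa_unlab)
        conf_a = model_a.predict_proba(Xa_unlab).max(axis=1) >= threshold
        pseudo_b = model_b.predict(Xb_unlab)
        conf_b = model_b.predict_proba(Xb_unlab).max(axis=1) >= threshold

        # Coupling: A's confident pseudo-labels augment B's training data, and vice versa.
        model_b.fit(np.vstack([Xb_lab, Xb_unlab[conf_a]]),
                    np.concatenate([y_lab, pseudo_a[conf_a]]))
        model_a.fit(np.vstack([Xa_lab, Xa_unlab[conf_b]]),
                    np.concatenate([y_lab, pseudo_b[conf_b]]))
        return model_a, model_b

    # Example usage with two logistic-regression learners (placeholder data assumed):
    # model_a, model_b = coupled_step(LogisticRegression(max_iter=1000),
    #                                 LogisticRegression(max_iter=1000),
    #                                 Xa_lab, Xb_lab, y_lab, Xa_unlab, Xb_unlab)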
Coupled Semi-Supervised Learning (CSSL)
Applications

• Image and Video Analysis: CSSL can be used in image classification, where
different views might be different augmentations of the same image or features
extracted using different convolutional layers.

• Natural Language Processing (NLP): In tasks like sentiment analysis or machine translation, different views could be based on different linguistic features or models.

• Bioinformatics: CSSL is used in genomic data analysis, where one model might
focus on sequence data and another on structural data.
Coupled Semi-Supervised Learning (CSSL)
Advantages

• Efficiency: It reduces the need for large amounts of labeled data, which can
be expensive and time-consuming to obtain.

• Improved Accuracy: By leveraging unlabeled data, CSSL can achieve better generalization and accuracy compared to purely supervised approaches.
Coupled Semi-Supervised Learning (CSSL)
Challenges

• Complexity: Designing effective coupling strategies and ensuring the models complement each other rather than reinforcing incorrect assumptions can be challenging.

• Sensitivity to Noisy Labels: If the pseudo-labeling process introduces errors, they can propagate through the learning process, potentially harming performance.
Macro reading vs Micro reading
Macro Reading
Macro reading involves understanding the overall meaning or theme of a larger text or document. It
focuses on grasping the big picture, such as the main topics, key ideas, or general sentiment.
Use Cases:
• Document Summarization: AI condenses a lengthy document into a brief summary, capturing the
essential points.
• Topic Modeling: Identifying and categorizing the major themes or topics within a collection of
texts.
• Sentiment Analysis: Determining the general sentiment or emotion conveyed in a large piece of
text (e.g., a full article, book, or series of tweets).
• Content Categorization: Classifying large amounts of content into predefined categories (e.g.,
news articles into topics like politics, sports, or entertainment).
Macro reading vs Micro reading
Techniques

• Latent Dirichlet Allocation (LDA): For topic modeling.
• Recurrent Neural Networks (RNNs) or Transformers: For understanding sequences in text over longer spans.
• TextRank: For summarization tasks.
• BERT or GPT models: For capturing context across paragraphs or entire documents.
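
For instance, a few lines of scikit-learn are enough to sketch LDA-style macro reading over a toy corpus; the four documents and the choice of two topics below are invented purely for illustration.

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "the government passed a new budget and tax bill",
        "the team won the championship after a late goal",
        "parliament debated the election and foreign policy",
        "the striker scored twice in the final match",
    ]

    # Bag-of-words counts are the usual input for LDA.
    vec = CountVectorizer(stop_words="english")
    X = vec.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(X)      # per-document topic proportions

    # Top words per topic give a "macro" view of the corpus themes.
    terms = vec.get_feature_names_out()
    for k, comp in enumerate(lda.components_):
        top = comp.argsort()[-5:][::-1]
        print(f"topic {k}:", [terms[i] for i in top])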
Macro reading vs Micro reading
Micro Reading
Micro reading focuses on understanding specific details, individual sentences, or even words within
a text. It emphasizes the precise interpretation of small text segments.

Use Cases:
• Named Entity Recognition (NER): Identifying and classifying proper nouns (e.g., names of people,
places, organizations) within a text.
• Part-of-Speech Tagging: Determining the grammatical categories of each word (e.g., noun, verb,
adjective).
• Dependency Parsing: Analyzing the grammatical structure of a sentence to understand how
words relate to each other.
• Question Answering: Finding precise answers to questions within a specific passage of text.
Macro reading vs Micro reading
Techniques

• Word2Vec or GloVe: For learning word embeddings that capture word meanings and similarities.
• CRFs (Conditional Random Fields): For structured prediction tasks like NER.
• BERT: For understanding the nuanced meaning of words or sentences within their context.
• Seq2Seq Models: For translating or generating text based on specific inputs.
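
As a small example of micro reading, the snippet below runs named entity recognition with spaCy; it assumes the English model en_core_web_sm has been installed (python -m spacy download en_core_web_sm), and the sentence is an invented example.

    import spacy

    nlp = spacy.load("en_core_web_sm")
    doc = nlp("Apple opened a new office in Berlin in 2023, led by Tim Cook.")

    # Micro reading: each entity span and its label is extracted at the token level.
    for ent in doc.ents:
        print(ent.text, ent.label_)
    # Expected output is along the lines of:
    #   Apple ORG, Berlin GPE, 2023 DATE, Tim Cook PERSON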
Open IE
Open Information Extraction (Open IE) in AI refers to the process of
automatically extracting structured information (such as relationships,
entities, and events) from unstructured text without relying on a
predefined schema or ontology. Unlike traditional Information
Extraction (IE) systems, which require a predefined set of relations to
look for, Open IE systems can extract a wide range of relationships from
text, making them more flexible and scalable.
Open IE
• Unsupervised Learning: Open IE systems often operate in an unsupervised or
semi-supervised manner. They do not need extensive labeled training data to
learn specific relations but instead rely on linguistic patterns and heuristics to
identify and extract potential relations.
• Relation Extraction: The primary goal of Open IE is to extract relationships
between entities in a sentence.
• Textual Triples: Open IE systems typically represent the extracted information as
triples (subject, relation, object), which can then be used for further processing,
such as building knowledge graphs or querying.
• Scalability: Open IE is designed to handle large-scale text corpora, making it
suitable for applications like web-scale information extraction, where the number
of potential relations is vast and constantly growing.
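
The toy function below illustrates the triple idea with a deliberately naive subject-verb-object heuristic over spaCy's dependency parse; real Open IE systems (e.g., ReVerb, Stanford OpenIE) use far richer extraction rules, and the example sentences are invented.

    import spacy

    nlp = spacy.load("en_core_web_sm")

    def naive_triples(text):
        # Extract (subject, relation, object) triples from simple subject-verb-object sentences.
        triples = []
        for sent in nlp(text).sents:
            for token in sent:
                if token.pos_ == "VERB":
                    subjects = [w for w in token.lefts if w.dep_ in ("nsubj", "nsubjpass")]
                    objects = [w for w in token.rights if w.dep_ in ("dobj", "attr")]
                    if subjects and objects:
                        triples.append((subjects[0].text, token.lemma_, objects[0].text))
        return triples

    print(naive_triples("Barack Obama visited Paris. Marie Curie discovered radium."))
    # Roughly: [('Obama', 'visit', 'Paris'), ('Curie', 'discover', 'radium')]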
Open IE
Applications of Open IE
• Knowledge Graph Construction: Open IE is used to populate knowledge graphs with
entities and relationships extracted from large text corpora, enabling more effective
search and discovery of information.
• Question Answering Systems: Open IE can be used to extract relevant facts from text to
answer user queries, especially in open-domain question answering where the range of
possible answers is broad.
• Summarization: Extracted triples can be used to generate concise summaries of large
documents or articles by highlighting the key entities and their relationships.
• Content Analysis: Open IE helps in analyzing and understanding large volumes of text
data, such as social media content, news articles, or scientific papers, by extracting and
organizing relevant information.
Open IE
Challenges in Open IE

• Ambiguity and Polysemy: Words and phrases can have multiple meanings
depending on the context, making it challenging for Open IE systems to
accurately extract the correct relations.
• Complex Sentences: Sentences with complex structures, such as nested clauses
or multiple entities and relations, can be difficult for Open IE systems to parse
correctly.
• Lack of Schema: While the lack of a predefined schema provides flexibility, it can
also lead to inconsistencies in the extracted information, as the same relation
might be represented in multiple ways.
QUESTIONS
• Name some popular Open IE systems.
• What challenges arise when applying coupled semi-supervised
learning to domains with highly imbalanced datasets, and how can
these challenges be mitigated to ensure robust model training?
• In what scenarios is coupled semi-supervised learning most effective,
and how does it compare to other learning paradigms like co-training
or multi-task learning in terms of data efficiency and generalization?
