SOTA Topics
Roadmap
13.03.
• We present different topics
• Discuss with us which one you want to work on
14.03. 20:00 - 17.03. 23:59
• Choose your topic on TUWEL
18.03. - 10.04.
• Work on the state of the art (SOTA) of your selected topic
09.04. 23:59
• Upload your presentation on TUWEL
10.04. (14:00 - 18:00)
• Present your work at the presentation day
Any questions during this time? Let us know and we’ll help you!
Literature Topics
Quantifying and directing Bias in Textual Data
Task: Explore methodologies for quantifying and directing bias in text. You
may focus on general biases or specific domains.
Explainability through Modularization
Task: Investigate how modularization can increase the explainability of
subsymbolic recommender systems (cf. Unix-pipes concept, multi-agent
systems for micro-decisions).
• Research different current methods / state-of-the-art and applications.
Which concepts exist outside the space of GenAI that could be applied
to modularize LLMs for specific tasks?
Exploring Text Similarity Dimensions
When comparing text, it's essential to consider multiple dimensions of similarity. These dimensions capture
the various ways texts can be similar or different, ranging from their subject matter to the style and
structure of the writing. Accurately capturing similarity in different dimensions is a necessity for building
accurate information retrieval systems, which are a core part of recommender systems.
A few examples would be:
• Semantic Similarity at different levels of granularity (word level, sentence level, paragraph level, page level, ...)
• Structural Similarity
• Stylistic Similarity
• Domain Similarity (subject area/sector/industry context of the texts)
• Similarity in the applied methodology: Two texts could describe the "divide and conquer" principle in
two completely different ways, applied to two problems from two domains, each with a different structure
and style (cf. software design patterns).
Task: Research the State-of-the-Art for comparing text by considering multiple dimensions of similarity.
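To illustrate why the dimensions above need separate measures, here is a minimal sketch with two deliberately crude scores. It is not a state-of-the-art method. The function names and the choice of features (word-set overlap for lexical similarity, average sentence and word length for style) are assumptions made only for this illustration.

```python
import re


def tokens(text):
    """Lowercase word tokens; a deliberately simple tokenizer."""
    return re.findall(r"[a-z0-9']+", text.lower())


def lexical_similarity(a, b):
    """Jaccard overlap of the word sets: a crude proxy for topical similarity."""
    set_a, set_b = set(tokens(a)), set(tokens(b))
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


def style_profile(text):
    """Two toy style features: average sentence length and average word length."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = tokens(text)
    avg_sentence_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    return (avg_sentence_len, avg_word_len)


def stylistic_similarity(a, b):
    """Map the L1 distance between the style profiles into (0, 1]."""
    distance = sum(abs(x - y) for x, y in zip(style_profile(a), style_profile(b)))
    return 1.0 / (1.0 + distance)


def similarity_report(a, b):
    """One score per dimension instead of a single aggregated number."""
    return {
        "lexical": lexical_similarity(a, b),
        "stylistic": stylistic_similarity(a, b),
    }


if __name__ == "__main__":
    short_sentences = "The cat sat. The dog ran."
    one_long_sentence = "The cat sat the dog ran."
    print(similarity_report(short_sentences, one_long_sentence))
```

The two example texts use the same words but a different sentence structure, so the lexical score is high while the stylistic score is low. A single similarity number would hide this difference.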
Reduce Degrees of Freedom of LLMs to enable them as recommender
engines
Text has an infinite number of degrees of freedom. Most use cases for
recommender engines, however, have a finite number of possible outputs to
choose from.
Task: Examine methodologies for transforming the vast output of LLMs into
structured, machine-processable formats. This might include enforcing a
strict format in the output or incorporating symbolic concepts like knowledge
graphs, etc. Usage of BNF grammars, …
Chat LLMs should not be the last step of a pipeline, but one part of it.
What can this look like?
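One simple way to place an LLM inside a pipeline is to validate its raw text against a fixed schema and a finite catalog before any downstream step sees it. The sketch below assumes this setup: the catalog, the JSON shape with a single "items" key, and the function name are hypothetical, and no real LLM is called. Grammar-constrained decoding would enforce the format during generation; this sketch only checks it afterwards.

```python
import json

# Hypothetical finite set of recommendable items.
CATALOG = {"item_1", "item_2", "item_3"}


def parse_recommendation(raw_output, catalog, max_items=3):
    """Validate raw LLM text; return a list of known item ids or raise ValueError."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"output is not valid JSON: {exc}") from exc

    # Enforce the strict format: an object with exactly one key, "items".
    if not isinstance(data, dict) or set(data) != {"items"}:
        raise ValueError("expected an object with exactly one key: 'items'")

    items = data["items"]
    if not isinstance(items, list) or not 1 <= len(items) <= max_items:
        raise ValueError(f"'items' must be a list with 1 to {max_items} entries")

    # Reduce the degrees of freedom: only ids from the catalog are accepted.
    unknown = [i for i in items if not isinstance(i, str) or i not in catalog]
    if unknown:
        raise ValueError(f"unknown items: {unknown}")
    return items


if __name__ == "__main__":
    print(parse_recommendation('{"items": ["item_2", "item_1"]}', CATALOG))
    try:
        parse_recommendation("I would recommend item_2!", CATALOG)
    except ValueError as err:
        print("rejected:", err)
```

A rejected output can be retried or routed to a fallback, so the next pipeline step only ever receives machine-processable item ids.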
Methodologies for LLM-based Automated Data Labeling
Manually annotating data is very expensive and time-consuming. LLMs can
support this process by labeling data automatically.
Task: Investigate the use of LLMs for automating the data labeling process,
focusing on methodologies to ensure accuracy and efficiency. Conduct a
State-of-the-Art survey.
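One common way to check the accuracy of automatic labels is to compare them with human labels on a small sample and report chance-corrected agreement. The sketch below computes Cohen's kappa for that purpose. The example labels are made up, and treating kappa as the acceptance criterion is an assumption of this illustration, not a prescribed methodology.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two label sequences of equal length."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if both annotators labeled independently
    # according to their own label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:  # both used one and the same label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up example: human gold labels vs. labels produced by an LLM.
    human = ["pos", "pos", "neg", "neg"]
    llm = ["pos", "pos", "neg", "pos"]
    print(cohens_kappa(human, llm))
```

Raw accuracy on this sample is 0.75, but kappa is lower because part of that agreement is expected by chance.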
Fairness / Bias in Recommender Systems
Fairness and bias are important topics within the research field of
recommender systems. Good starting points are the conferences within this
field (https://recsys.acm.org/, https://www.um.org/umap2024/, etc.), other
related conferences, and their corresponding workshops.
Task: Research the State-of-the-Art within this domain by identifying the
methodologies proposed within the field and highlighting the current trends
and developments.
Responsible Recommendations
The rise of LLMs as part of recommender systems heightens the need for
building responsible systems. It's important to focus not only on the
theoretical concepts but also on how these aspects can be applied to build
responsible systems. In addition, evaluate how the different aspects can be
measured.
Task: Identify the current state-of-the-art within the field and identify the
different aspects of building responsible recommender systems.
Bot Detection of User Traffic
• Identifying bot traffic on websites in real time is a challenge. There are traditional
approaches to this problem (e.g. [1]). These rely in particular on user modelling:
“abnormal” behavior is modelled so that a user can be identified as a “bot”.
Task: Identify the State-of-the-Art within this domain, especially in the context of the new
possibilities gained through the rise of LLMs.
Keep in mind that this is about a real-time system: the methods need to be executed
live and not as a batch job!
[1] https://iris.unige.it/retrieve/e268c4ce-7ff8-a6b7-e053-3a05fe0adea1/1-s2.0-S0950705121003373-main.pdf
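To make the real-time constraint concrete, here is a minimal sketch of a streaming check that updates per-client state on every request instead of scanning logs in a batch. It is a plain request-rate heuristic, not the method from [1] and not LLM-based. The class name, window size, and threshold are hypothetical.

```python
from collections import defaultdict, deque


class SlidingWindowRateDetector:
    """Flags a client once it exceeds max_requests within window_seconds."""

    def __init__(self, window_seconds=10.0, max_requests=20):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self._events = defaultdict(deque)  # client id -> recent timestamps

    def observe(self, client_id, timestamp):
        """Record one request; return True if the client looks abnormal now."""
        window = self._events[client_id]
        window.append(timestamp)
        # Drop timestamps that fell out of the window, so each call
        # only touches the recent history of this one client.
        while window and timestamp - window[0] > self.window_seconds:
            window.popleft()
        return len(window) > self.max_requests


if __name__ == "__main__":
    detector = SlidingWindowRateDetector(window_seconds=10.0, max_requests=3)
    for t in [0.0, 1.0, 2.0, 3.0]:
        print("fast client at", t, "->", detector.observe("fast", t))
    for t in [0.0, 20.0, 40.0, 60.0]:
        print("slow client at", t, "->", detector.observe("slow", t))
```

Each decision is made at the moment the request arrives. A richer user model would replace the simple count with more behavioral features while keeping the same per-event update pattern.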
Overview of Open Source LLMs and their performance versus proprietary
LLMs
• Give an overview of available open source and proprietary models
• Compare their performance on a predefined set of tasks
• Additional suggestion: track the development of performance over time
• Starting point: https://arxiv.org/abs/2402.06196
LLMs as agents for the evaluation of recommender systems
• Create agents that simulate users of a recommender system
• Assess potential to evaluate new systems
• Starting point: https://arxiv.org/abs/2310.10108;
https://arxiv.org/abs/2402.09176
Impact of LLMs on conversational recommender systems
• Discuss the impact of Generative AI on conversational recommender systems based on recent
surveys and subsequent development
• How do systems differ now? What are emerging trends?
• How are user models built?
• Potential subtopic: Focus on evaluation of LLM-based conversational recommender systems
• Review/survey papers: https://www.sciencedirect.com/science/article/pii/S0957417422008612;
https://dl.acm.org/doi/10.1145/3453154;
https://www.sciencedirect.com/science/article/pii/S2666651021000164
Example Structure:
• Introduction / Motivation
• Background
• State-of-the-Art Overview
• Critical Analysis
• Future Directions
• Conclusion
• Q&A
Discussion Round
Present your own ideas/topics
Next Steps
Remember to select your topic on TUWEL between 14.03.
20:00 and 17.03. 23:59.
Further Questions?
Please use the TUWEL forum for general questions. For more
specific questions, please contact us via: