Course: Large Language Models & Agents
School of Computer Science, Holon Institute of Technology
2025 Spring
Lecturer: Dr. Alexander (Sasha) Apartsin
HoS Course Series Home: Here
Uncover the Hidden Leader: Reveal True Seniority in Every Resume
Matan Cohen, Shira Shany, Edan Menahem
Accurately assessing candidate seniority from resumes is a critical yet challenging task, complicated by the prevalence of overstated experience and ambiguous self-presentation. This study investigates the effectiveness of large language models (LLMs), including fine-tuned BERT architectures, for automating seniority classification in resumes. To rigorously evaluate model performance, we introduce a hybrid dataset comprising real-world resumes and synthetically generated hard examples designed to simulate exaggerated qualifications and understated seniority.
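A minimal sketch of the fine-tuning setup the abstract describes, assuming a HuggingFace `bert-base-uncased` checkpoint; the three-level label scheme, file name, and column names are illustrative placeholders, not the authors' actual configuration:

```python
# Sketch: fine-tuning BERT for resume seniority classification.
# The label scheme and the "resumes.csv" columns are assumed for illustration.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

LABELS = ["junior", "mid", "senior"]  # assumed three-level seniority scheme

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

dataset = load_dataset("csv", data_files="resumes.csv")["train"]

def preprocess(batch):
    enc = tokenizer(batch["resume_text"], truncation=True, max_length=512)
    enc["labels"] = [LABELS.index(s) for s in batch["seniority"]]
    return enc

dataset = dataset.map(preprocess, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="seniority-bert", num_train_epochs=3),
    train_dataset=dataset,
    tokenizer=tokenizer,  # enables dynamic padding via DataCollatorWithPadding
)
trainer.train()
```

The synthetically generated hard examples would simply be appended to the training split with the same label columns.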
LLM-Driven Insights for Effortless Customer Support
Shanu Kupiec, Inbal Bolshinsky, Almog Sasson, Nadav Margalit
In the era of conversational AI, generating accurate and contextually appropriate service responses remains a critical challenge. A central design question is: is explicit intent recognition a prerequisite for generating high-quality service responses, or can models bypass this step and produce effective replies directly? This paper conducts a rigorous comparative study to address this fundamental design dilemma. Leveraging two publicly available service interaction datasets, we benchmark several state-of-the-art language models, including a fine-tuned T5 variant, under two regimes: Intent-First Response Generation and Direct Response Generation.
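A minimal sketch of the two generation regimes being compared, assuming a seq2seq T5 checkpoint; the `t5-base` name and the prompt templates are illustrative, and the paper's fine-tuned variant would replace the off-the-shelf weights:

```python
# Sketch: Intent-First vs. Direct response generation with a T5 model.
# Checkpoint name and prompt templates are assumptions for illustration.
from transformers import AutoTokenizer, T5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

def generate(prompt: str) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=64)
    return tokenizer.decode(out[0], skip_special_tokens=True)

query = "My package never arrived and the tracking page is blank."

# Intent-First: recognize the intent, then condition the reply on it.
intent = generate(f"classify intent: {query}")
reply_intent_first = generate(f"respond to a '{intent}' request: {query}")

# Direct: generate the service reply from the raw query alone.
reply_direct = generate(f"respond: {query}")
```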
Accelerating Code Reviews with Transfer Learning — Embracing Every Language at Lightning Speed
Yogev Cohen, Romi Simkin, David Ohayon
Automating the determination of whether a code change requires manual review is vital for maintaining software quality in modern development workflows. However, the emergence of new programming languages and frameworks creates a critical bottleneck: while large volumes of unlabelled code are readily available, there is insufficient labelled data to train supervised models for review classification. We address this challenge by leveraging Large Language Models (LLMs) to translate code changes from well-resourced languages into equivalent changes in underrepresented or emerging languages, generating synthetic training data where labelled examples are scarce.
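A minimal sketch of the translation step, assuming an OpenAI-style chat endpoint; the model name, prompt, and example diff are illustrative, and any instruction-following LLM could stand in:

```python
# Sketch: generating synthetic labelled data by translating a code change
# from a well-resourced language (Python) into an emerging target language.
# The review label carries over to the translated example unchanged.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def translate_change(diff: str, target_lang: str) -> str:
    prompt = (f"Translate this Python code change into an equivalent "
              f"{target_lang} change, preserving the semantics of the edit:\n\n"
              f"{diff}")
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

labelled = {
    "diff": "-def area(r): return 3.14 * r * r\n"
            "+def area(r): return math.pi * r * r",
    "needs_review": False,
}
synthetic = {
    "diff": translate_change(labelled["diff"], "Kotlin"),
    "needs_review": labelled["needs_review"],  # label transfers with the change
}
```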
Feel the Emotion Behind Every Line
Naor Mazliah, Shay Dahary, Mazal Lemalem, Avi Edana
The emotional content of song lyrics plays a pivotal role in shaping listener experiences and influencing musical preferences. This paper investigates the task of multi-label emotional attribution of song lyrics by predicting an intensity score for each of six fundamental emotions. A manually labeled dataset is constructed using a mean opinion score (MOS) approach, which aggregates annotations from multiple human raters to ensure reliable ground-truth labels. Leveraging this dataset, we comprehensively evaluate several large language models (LLMs) across zero-shot and few-shot learning scenarios.
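A minimal sketch of the MOS labelling step described above; the 1-to-5 rating scale, emotion inventory, and rater data are illustrative assumptions:

```python
# Sketch: aggregating per-emotion intensity ratings from multiple human
# raters into one mean-opinion-score (MOS) label vector for a single lyric.
EMOTIONS = ["joy", "sadness", "anger", "fear", "surprise", "disgust"]

ratings = [  # one dict per rater, scores on an assumed 1-5 scale
    {"joy": 4, "sadness": 1, "anger": 1, "fear": 2, "surprise": 3, "disgust": 1},
    {"joy": 5, "sadness": 2, "anger": 1, "fear": 1, "surprise": 4, "disgust": 1},
    {"joy": 4, "sadness": 1, "anger": 2, "fear": 2, "surprise": 3, "disgust": 2},
]

mos = {e: sum(r[e] for r in ratings) / len(ratings) for e in EMOTIONS}
print(mos)  # the six-dimensional ground-truth label for this lyric
```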
Tracing Emotions in Native Expressions
Yotam Hasid, Amit Keinan, Edo Koren
Hebrew sentiment analysis remains challenging due to the scarcity of large, high-quality labeled datasets. This study explores a cross-lingual transfer approach that leverages well-established English sentiment analysis datasets and automatically translates them into Hebrew. The translated datasets serve as a foundation for training and evaluating sentiment classifiers in Hebrew. We verify the accuracy of the translated test partitions and their corresponding labels to ensure the reliability of the evaluation. Our experimental framework includes assessing the few-shot learning capabilities of several state-of-the-art pretrained large language models (LLMs) and fine-tuning BERT-based models for sentiment classification.
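A minimal sketch of the translation stage, assuming the `Helsinki-NLP/opus-mt-en-he` machine-translation checkpoint on HuggingFace; any English-to-Hebrew MT model would serve, and the sample sentences are illustrative:

```python
# Sketch: machine-translating English sentiment examples into Hebrew so
# that labels transfer to the target language for training and evaluation.
from transformers import pipeline

translate = pipeline("translation", model="Helsinki-NLP/opus-mt-en-he")

english_examples = [
    ("The movie was wonderful.", "positive"),
    ("Terrible service, never again.", "negative"),
]

hebrew_examples = [
    (translate(text)[0]["translation_text"], label)
    for text, label in english_examples
]
# hebrew_examples can now train a Hebrew classifier or seed few-shot prompts.
```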
Decoding deceptive headlines with transparent LLM insights
Aviv Elbaz, Lihi Nofar, Tomer Protal
The proliferation of clickbait headlines poses significant challenges to the credibility of information and user trust in digital media. While recent advances in machine learning have improved the detection of manipulative content, the lack of explainability limits their practical adoption. This paper presents an explainable framework for clickbait detection that identifies clickbait titles and attributes them to specific linguistic manipulation strategies. We introduce a synthetic dataset generated by systematically augmenting real news headlines using a predefined catalogue of clickbait strategies.
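A minimal sketch of the augmentation step, with a toy strategy catalogue; the strategy names and templates are illustrative placeholders for the paper's predefined catalogue:

```python
# Sketch: rewriting a real headline under each clickbait strategy to yield
# (headline, strategy) pairs for the synthetic training set.
CATALOGUE = {
    "curiosity_gap": "You won't believe what happened next: {h}",
    "exaggeration": "This changes everything: {h}",
    "urgency": "Read this before it disappears: {h}",
}

def augment(headline: str) -> list[dict]:
    return [{"headline": tpl.format(h=headline), "strategy": name}
            for name, tpl in CATALOGUE.items()]

for example in augment("City council approves new budget"):
    print(example["strategy"], "->", example["headline"])
```

Because each synthetic headline is generated from a known strategy, the attribution label comes for free.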
Empowering Fairness in AI—Detecting and Correcting Bias in LLM Text
Netta Robinson, Shay Yafet, Katrin Zablianov, Ariel Sofer
This paper explores bias detection and mitigation in Natural Language Inference (NLI), with a focus on hypothesis statements that encode socially sensitive attributes such as gender, race, or profession. We introduce a curated dataset of premise–hypothesis pairs in which the hypothesis contains potentially biased language. Each pair is annotated to indicate whether the attribute reference is inferable from the premise or reflects an unwarranted bias. We develop a classification model that predicts this bias-aware NLI label, distinguishing between justified and biased inferences. To mitigate such biases, we implement a rewriting mechanism that generates bias-neutral hypotheses while preserving the original entailment relation.
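A minimal sketch of how the bias-aware classifier would be invoked, assuming a BERT-style pair classifier; the checkpoint, label names, and example are illustrative, and the paper's model would be fine-tuned on the curated dataset:

```python
# Sketch: scoring a premise-hypothesis pair for whether the sensitive
# attribute reference is inferable from the premise or an unwarranted bias.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["justified_inference", "unwarranted_bias"]  # assumed label scheme

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(LABELS))

premise = "The surgeon finished a twelve-hour shift."
hypothesis = "He is exhausted."  # gendered reference unsupported by the premise

inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    pred = model(**inputs).logits.argmax(dim=-1).item()
print(LABELS[pred])
```

The rewriting mechanism would then produce a bias-neutral hypothesis such as "The surgeon is exhausted." while preserving the entailment relation.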
Keeping Wikipedia Collaborative, Not Combative
Ron Butbul, Yuval Horesh, Rotem Mustacchi
Toxicity detection in online discourse is crucial for maintaining healthy digital environments. This study compares classification methods for identifying toxic comments on Wikipedia. Using the publicly available Wikipedia Detox dataset, we benchmark traditional machine learning classifiers (e.g., logistic regression, Naive Bayes), deep neural networks, and transformer-based language models such as BERT. Each model is evaluated on its ability to detect various forms of toxicity, including insults, threats, and identity-based hate.
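A minimal sketch of the traditional-ML baseline in the benchmark, with toy data standing in for the Wikipedia Detox corpus; the sample comments are illustrative:

```python
# Sketch: TF-IDF features with logistic regression as a toxicity baseline,
# trained on Detox-style (comment, toxic) pairs.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

comments = [
    "Thanks for fixing the citation!",
    "You are an idiot and your edits are garbage.",
]
labels = [0, 1]  # 0 = non-toxic, 1 = toxic

clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(comments, labels)
print(clf.predict(["What a stupid edit."]))
```

The BERT-based entries in the comparison would replace this pipeline with a fine-tuned sequence classifier evaluated on the same splits.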
Microsoft GenAI suite for the development of next-generation AI applications
GenAI in action: from research to practical applications
Securing and Scaling LLMs