Paper ID

d9d50e6d98f01f357357eafde24ab66370fc3559


Title

Dynamic Fairness Adaptation in LLM-based Recommender Systems: A Preference-Aligned Approach


Introduction

Problem Statement

Static fairness interventions in LLM-based recommender systems fail to adapt to changing user preferences and societal norms, potentially leading to outdated or misaligned fairness criteria over time. This problem is particularly acute in domains where user preferences and societal values evolve rapidly, such as content recommendation or job matching platforms.

Motivation

Existing fairness approaches for recommender systems typically use fixed fairness constraints or objectives defined at training time. While some dynamic methods exist, they often focus on item-side fairness or single-attribute user fairness, neglecting the multifaceted nature of user preferences and intersectional fairness concerns. Our proposed Dynamic Fairness Adaptation (DFA) framework is motivated by the need to continuously update fairness criteria in alignment with evolving user expectations and values. By leveraging the natural language understanding capabilities of LLMs, we can engage users in meaningful dialogues about fairness, extract structured representations of their preferences, and adapt the recommender system accordingly.


Proposed Method

We propose Dynamic Fairness Adaptation (DFA), a novel framework for continuously updating fairness criteria in LLM-based recommenders. DFA consists of three main components:

1) Fairness Perception Elicitation: We design prompts for the LLM to engage users in dialogues about their fairness perceptions, extracting structured representations of user-specific fairness criteria.

2) Adaptive Fairness Aggregation: We develop a mechanism that aggregates individual user fairness perceptions into a global fairness objective, drawing on techniques from social choice theory and multi-stakeholder optimization. This aggregated objective evolves over time as new user feedback is incorporated.

3) Online Fairness Fine-tuning: We introduce a novel online learning algorithm that allows the LLM to continuously fine-tune its representations and outputs to align with the evolving fairness objective, balancing exploration of new fairness criteria with exploitation of known good practices.
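To make the intended data flow concrete, the following is a minimal, self-contained sketch of one DFA adaptation cycle. Every function body is a toy stand-in for the corresponding component (the concrete techniques for each stage are specified in the experiment plan below), and the aspect names and weights are illustrative assumptions.

```python
# Toy sketch of one DFA adaptation cycle; every function body is a stand-in
# for the corresponding component above, and all names/weights are assumptions.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class FairnessCriterion:
    aspect: str        # e.g. "genre_balance", "director_diversity"
    importance: float  # elicited weight in [0, 1]

def elicit_fairness_criteria(user_response: str) -> List[FairnessCriterion]:
    """Stand-in for Fairness Perception Elicitation (LLM dialogue -> structured criteria)."""
    return [FairnessCriterion("director_diversity", 0.8),
            FairnessCriterion("genre_balance", 0.5)]

def aggregate_criteria(all_criteria: List[List[FairnessCriterion]]) -> Dict[str, float]:
    """Stand-in for Adaptive Fairness Aggregation: mean importance per aspect."""
    totals: Dict[str, List[float]] = {}
    for criteria in all_criteria:
        for c in criteria:
            totals.setdefault(c.aspect, []).append(c.importance)
    return {aspect: sum(vals) / len(vals) for aspect, vals in totals.items()}

def augment_prompt(base_prompt: str, objective: Dict[str, float]) -> str:
    """Stand-in for Online Fairness Fine-tuning, realized as prompt augmentation."""
    top_aspect = max(objective, key=objective.get)
    return f"{base_prompt} Prioritize {top_aspect.replace('_', ' ')} in the recommendations."

# One adaptation cycle over simulated user feedback.
responses = ["I care about diverse directors.", "Genre balance matters most to me."]
objective = aggregate_criteria([elicit_fairness_criteria(r) for r in responses])
print(augment_prompt("Recommend a movie for a 35-year-old female who likes sci-fi.", objective))
```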


Experiments Plan

Step-by-Step Experiment Plan

Step 1: Dataset Preparation

Use the MovieLens-25M dataset, which provides rating timestamps for temporal analysis. Because MovieLens-25M does not ship with user demographics, augment the data with user attributes (age, gender, occupation), for example from the MovieLens-1M user metadata or a comparable source, and include movie attributes (genre, release year, etc.). Split the data into training, validation, and test sets while preserving temporal order.
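A minimal preprocessing sketch, assuming the standard MovieLens-25M file layout (`ratings.csv`, `movies.csv`) and an 80/10/10 temporal split; paths and split ratios are assumptions.

```python
# Temporal 80/10/10 split of MovieLens ratings; file layout follows the
# standard MovieLens-25M release (paths and split ratios are assumptions).
import pandas as pd

ratings = pd.read_csv("ml-25m/ratings.csv")   # userId, movieId, rating, timestamp
movies = pd.read_csv("ml-25m/movies.csv")     # movieId, title, genres

# Parse release year from titles such as "Heat (1995)".
movies["year"] = movies["title"].str.extract(r"\((\d{4})\)\s*$", expand=False).astype(float)

# Sort globally by timestamp so train < validation < test in time (no leakage).
ratings = ratings.sort_values("timestamp").reset_index(drop=True)
n = len(ratings)
train = ratings.iloc[: int(0.8 * n)].merge(movies, on="movieId", how="left")
valid = ratings.iloc[int(0.8 * n): int(0.9 * n)].merge(movies, on="movieId", how="left")
test = ratings.iloc[int(0.9 * n):].merge(movies, on="movieId", how="left")
```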

Step 2: Baseline LLM Recommender

Fine-tune a pre-trained LLM (e.g., GPT-3.5) on the MovieLens dataset to create a baseline recommender system. Use prompts like 'Recommend a movie for a [age]-year-old [gender] who likes [favorite genres].' Evaluate the baseline recommender using standard ranking metrics (NDCG, MAP) and fairness measures (demographic parity, equal opportunity) computed over sliding time windows.
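A sketch of the baseline prompting step; `query_llm` is a hypothetical placeholder for whichever fine-tuned model endpoint is used, and the output parsing is an assumption about the model's list format.

```python
# Baseline prompting sketch; `query_llm` is a hypothetical wrapper standing in
# for the fine-tuned model endpoint.
def query_llm(prompt: str) -> str:
    """Placeholder for a call to the fine-tuned model; returns a canned answer here."""
    return "1. The Matrix (1999)\n2. Arrival (2016)\n3. Edge of Tomorrow (2014)"

def build_baseline_prompt(age: int, gender: str, favorite_genres: list[str]) -> str:
    genres = " and ".join(favorite_genres)
    return f"Recommend a movie for a {age}-year-old {gender} who likes {genres} movies."

def recommend(age: int, gender: str, favorite_genres: list[str], k: int = 10) -> list[str]:
    """Request a ranked list of k titles; parsing depends on the model's output format."""
    prompt = build_baseline_prompt(age, gender, favorite_genres)
    response = query_llm(prompt + f" Return a ranked list of {k} titles, one per line.")
    return [line.lstrip("0123456789.- ").strip()
            for line in response.splitlines() if line.strip()][:k]

print(recommend(35, "female", ["action", "sci-fi"], k=3))
```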

Step 3: Fairness Perception Elicitation

Design a set of prompts for the LLM to engage users in dialogues about fairness in movie recommendations. Example prompt: 'How important is it to you that movie recommendations are balanced across different genres? What about representation of diverse actors and directors?' Extract structured fairness criteria from user responses using named entity recognition and sentiment analysis.
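The extraction below is a deliberately simple, rule-based sketch of turning a free-text answer into weighted fairness criteria; the actual pipeline would use NER and a trained sentiment model, and the aspect lexicon and cue lists here are assumptions.

```python
# Rule-based sketch of extracting structured fairness criteria from a user's
# free-text reply; the lexicon and sentiment cues are assumptions standing in
# for a full NER + sentiment-analysis pipeline.
import re

ASPECT_LEXICON = {
    "genre_balance": ["genre", "genres", "balanced"],
    "director_diversity": ["director", "directors"],
    "cast_diversity": ["actor", "actors", "cast", "representation"],
    "cultural_diversity": ["culture", "cultural", "countries", "international"],
}
POSITIVE_CUES = ["important", "valuable", "great", "care", "matters"]
NEGATIVE_CUES = ["not important", "don't care", "irrelevant"]

def extract_fairness_criteria(response: str) -> dict[str, float]:
    """Map each mentioned fairness aspect to an importance weight in [0, 1]."""
    text = response.lower()
    pos = sum(cue in text for cue in POSITIVE_CUES)
    neg = sum(cue in text for cue in NEGATIVE_CUES)
    sentiment = 0.5 + 0.5 * (pos - neg) / max(pos + neg, 1)   # crude polarity score
    criteria = {}
    for aspect, keywords in ASPECT_LEXICON.items():
        if any(re.search(rf"\b{kw}\b", text) for kw in keywords):
            criteria[aspect] = round(sentiment, 2)
    return criteria

# Example on the elicitation response shown in the test cases below:
print(extract_fairness_criteria(
    "Diversity is quite important to me; action movies with female directors would be great."))
```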

Step 4: Adaptive Fairness Aggregation

Implement an aggregation mechanism that combines individual user fairness criteria into a global fairness objective. Use techniques like weighted majority voting or Borda count to balance different user preferences. Update this aggregated objective periodically (e.g., daily or weekly) based on new user feedback.
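A self-contained sketch of Borda-count aggregation over per-user rankings of fairness criteria; the candidate aspects and the example rankings are illustrative assumptions.

```python
# Borda-count aggregation of per-user fairness-criterion rankings into a single
# global objective; the example rankings are illustrative.
from collections import defaultdict

def borda_aggregate(user_rankings: list[list[str]]) -> dict[str, float]:
    """Each user ranks criteria from most to least important; a criterion at
    position i of an m-item ranking scores m - 1 - i points. Scores are
    normalized to sum to 1 so they can be used as objective weights."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in user_rankings:
        m = len(ranking)
        for i, criterion in enumerate(ranking):
            scores[criterion] += m - 1 - i
    total = sum(scores.values()) or 1.0
    return {c: s / total for c, s in scores.items()}

rankings = [
    ["director_diversity", "genre_balance", "cast_diversity"],
    ["genre_balance", "cast_diversity", "director_diversity"],
    ["cast_diversity", "genre_balance", "director_diversity"],
]
print(borda_aggregate(rankings))
# Periodic re-aggregation (e.g. daily or weekly) simply reruns this over the latest rankings.
```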

Step 5: Online Fairness Fine-tuning

Develop an online learning algorithm that fine-tunes the LLM's outputs based on the evolving fairness objective. Use a multi-armed bandit approach to balance exploration of new fairness criteria with exploitation of known good practices. Implement this as a prompt augmentation technique that dynamically adjusts the LLM's recommendation generation process.
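A minimal epsilon-greedy bandit over candidate fairness prompt augmentations; the augmentation texts, the epsilon value, and the reward signal (e.g., a blend of user feedback and a window-level fairness score) are assumptions.

```python
# Epsilon-greedy bandit over candidate fairness prompt augmentations; the
# augmentation texts, epsilon, and reward definition are assumptions.
import random

class PromptAugmentationBandit:
    def __init__(self, augmentations: list[str], epsilon: float = 0.1):
        self.augmentations = augmentations
        self.epsilon = epsilon
        self.counts = [0] * len(augmentations)
        self.values = [0.0] * len(augmentations)   # running mean reward per arm

    def select(self) -> int:
        if random.random() < self.epsilon:                      # explore a new criterion
            return random.randrange(len(self.augmentations))
        return max(range(len(self.values)), key=self.values.__getitem__)  # exploit

    def update(self, arm: int, reward: float) -> None:
        """Reward could blend user satisfaction with a window-level fairness score."""
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

bandit = PromptAugmentationBandit([
    "Prefer films by directors from under-represented groups when quality is comparable.",
    "Balance recommendations across sub-genres rather than repeating one franchise.",
    "Include at least one non-English-language film that matches the user's genres.",
])
arm = bandit.select()
# augmented_prompt = base_prompt + " " + bandit.augmentations[arm]
# ... after observing feedback for this recommendation list:
bandit.update(arm, reward=0.7)
```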

Step 6: Evaluation

Compare DFA against the baseline recommender and static fairness approaches. Use a sliding window approach to compute recommendation quality metrics (NDCG, MAP) and fairness measures (demographic parity, equal opportunity) over time. Analyze how these metrics change as user preferences and the fairness objective evolve.
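A sketch of the sliding-window evaluation: NDCG@k is computed per recommendation list, and demographic parity is approximated here as the gap in mean recommendation quality between user groups within each window. The window length, grouping column, and quality proxy are assumptions.

```python
# Sliding-window NDCG and a demographic-parity-style gap between user groups;
# window length, group column, and the quality proxy are assumptions.
import numpy as np
import pandas as pd

def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    """NDCG@k for one recommendation list, given graded relevance per position."""
    rel = np.asarray(relevances[:k], dtype=float)
    if rel.size == 0:
        return 0.0
    discounts = 1.0 / np.log2(np.arange(2, rel.size + 2))
    dcg = float((rel * discounts).sum())
    ideal = float((np.sort(rel)[::-1] * discounts).sum())
    return dcg / ideal if ideal > 0 else 0.0

def windowed_fairness(df: pd.DataFrame, window: str = "7D", group_col: str = "gender") -> pd.DataFrame:
    """df has one row per recommendation list with columns 'timestamp' (datetime),
    group_col, and 'ndcg' (e.g., from ndcg_at_k). Returns per-window mean NDCG per
    group plus the max between-group gap as a demographic-parity-style measure."""
    df = df.set_index("timestamp").sort_index()
    per_group = df.groupby(group_col)["ndcg"].resample(window).mean().unstack(0)
    per_group["parity_gap"] = per_group.max(axis=1) - per_group.min(axis=1)
    return per_group
```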

Step 7: User Study

Conduct a small-scale user study with 50-100 participants. Ask users to interact with both the baseline and DFA-enhanced recommender systems over a period of 2-4 weeks. Collect feedback on perceived fairness, recommendation quality, and overall satisfaction. Use qualitative analysis techniques to identify themes in user responses.

Step 8: Ablation Studies

Perform ablation studies to assess the impact of each component of DFA. Compare performance with and without the fairness perception elicitation, adaptive aggregation, and online fine-tuning components. This will help identify which aspects of the framework contribute most to improved fairness and recommendation quality.

Test Case Examples

Baseline Prompt Input

Recommend a movie for a 35-year-old female who likes action and sci-fi movies.

Baseline Prompt Expected Output

I recommend 'The Matrix' (1999). It's a groundbreaking sci-fi action film that combines intense action sequences with mind-bending concepts. The movie features strong performances, innovative special effects, and a compelling storyline that should appeal to fans of both action and science fiction genres.

Proposed Prompt Input (Fairness Perception Elicitation)

How important is it to you that movie recommendations include diverse representation in terms of directors, actors, and cultural perspectives? Please explain your thoughts on balancing this with your genre preferences.

Proposed Prompt Expected Output (Fairness Perception Elicitation)

Diversity in movie recommendations is quite important to me. I believe it's valuable to see different perspectives and experiences in film. However, I wouldn't want this to completely override my genre preferences. Ideally, I'd like to see a mix of movies that includes diverse representations within the action and sci-fi genres I enjoy. For example, recommending sci-fi films from different countries or action movies with female directors would be great.

Proposed Prompt Input (DFA-enhanced Recommendation)

Recommend a movie for a 35-year-old female who likes action and sci-fi movies, taking into account her preference for diverse representation while maintaining genre alignment.

Proposed Prompt Expected Output (DFA-enhanced Recommendation)

I recommend 'Black Panther' (2018) directed by Ryan Coogler. This film brilliantly combines elements of action and sci-fi, aligning with your genre preferences. It offers groundbreaking representation with a predominantly Black cast and explores themes of African culture and technology. The movie features strong female characters, innovative special effects, and a compelling storyline that should appeal to fans of both action and science fiction genres while also providing the diverse representation you value.

Explanation

The DFA-enhanced recommendation takes into account both the user's genre preferences and their expressed interest in diverse representation. It suggests a movie that not only fits the action and sci-fi genres but also offers diversity in terms of cast, cultural perspective, and strong female characters. This demonstrates how DFA can adapt recommendations to evolving fairness criteria while maintaining relevance to user preferences.

Fallback Plan

If the proposed DFA method doesn't significantly improve fairness metrics or user satisfaction compared to baselines, we can pivot the project to focus on analyzing why dynamic fairness adaptation is challenging in this context. We could conduct a more in-depth analysis of how user fairness perceptions change over time and how these changes correlate with recommendation quality and satisfaction. This could involve developing a taxonomy of fairness criteria in movie recommendations and tracking their prevalence and importance to users over time. Additionally, we could investigate potential barriers to implementing dynamic fairness in recommender systems, such as conflicting user preferences or limitations in LLM understanding of complex fairness concepts. This analysis could provide valuable insights for future work on adaptive fairness in AI systems and help identify specific challenges that need to be addressed in the next generation of fair recommender systems.

