PRADALab_KAUST

Fraud-R1 Public

[ACL 2025 Findings] Fraud-R1: A Multi-Round Benchmark for Assessing the Robustness of LLM Against Augmented Fraud and Phishing Inducements

Python 24 3

LLM-Persona-Steering Public

Official code of "Exploring the Personality Traits of LLMs through Latent Features Steering"

Python 16 2

repeat-curse-llm Public

[ACL 2025 Findings] Understanding the Repeat Curse in Large Language Models from a Feature Perspective

Python 16 2

SAE-Factory Public

Train SAEs for your LLM and visualize them in one place

Python 7

CoT-Dataset Public

Jupyter Notebook 5 1
