Dispatch
Revision history
Paper proposes activation-steering method to detect and reduce sycophancy in language models
Original publication · no revisions.