Can AI sandbag safety checks to sabotage users? Yes, but not very well — for now
AI companies claim their models come with robust safety checks that keep them from engaging in unsafe or illegal activities. However, research from Anthropic shows that AI models could bypass those safeguards to mislead or manipulate users. The risk is low so far, but the potential remains.