...

lostnground

Karma: 15

Created: 2025-04-22

Recent Activity

  • It cannot really oversee this. If you can decompose a problem into individual steps that are not, in themselves, against the agent's alignment, it's certainly possible for the aggregate of those steps to be.

  • While I agree that officers should be accountable, more enforcement will not suddenly make them good officers. Other nations train their police for years before putting them into the thick of it. US police spend far less time studying, and it shows in everything from de-escalation tactics to general legal understanding. If you create a pipeline to weed out bad officers, then there also needs to be a pipeline producing better officers.

  • After a cursory read, I see how this might prevent exfiltration, but not potential escalation.

    It seems to keep you inside a box, but suppose the intention of my attack was to social-engineer Bob by including instructions to whitelist attackers@location, which I would then hit with the next prompt. Would this stop me?

HackerNews