Ask HN: Could LLMs be used to moderate content?

by brad0 on 2/19/2024, 5:01 AM with 3 comments

Would content moderation be possible via LLMs? If so, would it be cost-effective vs. human moderation?

by defrost on 2/19/2024, 5:15 AM

"I want to stick my long-necked Giraffe up your fluffy white bunny".

The Untold History of Toontown’s SpeedChat aka The impossible task of monitoring user content.

http://habitatchronicles.com/2007/03/the-untold-history-of-t...

by mooreds on 2/19/2024, 5:09 AM

1. Yes.

---

Me:

Can you please remove any curse words in the following statements? Replace them with asterisks.

Fuck the machine.

You are a douchebag.

What the hell is going on?

Shit shit shit.

ChatGPT:

Certainly! Here are the statements with the curse words replaced by asterisks:

    **** the machine.

    You are a ********.

    What the **** is going on?

    **** **** ****.

---
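For comparison, the masking ChatGPT does above can be approximated with a plain regex pass. A minimal sketch — the word list here is illustrative, not a real lexicon, and this is exactly the kind of filter the giraffe quote defeats:

```python
import re

# Illustrative word list only; a real filter uses a large maintained lexicon
# and still misses euphemisms ("long-necked giraffe") that an LLM might catch.
PROFANITY = ["fuck", "shit", "hell", "douchebag"]
PATTERN = re.compile(r"\b(" + "|".join(PROFANITY) + r")\b", re.IGNORECASE)

def mask(text: str) -> str:
    # Replace each matched word with asterisks of the same length.
    return PATTERN.sub(lambda m: "*" * len(m.group()), text)

print(mask("What the hell is going on?"))  # What the **** is going on?
```

Cheap and fast, but brittle: it can't see intent, only strings.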

2. Depends. I haven't run the numbers on costs. Speed is also a concern.

Depending on the kind of moderation, I could see three passes:

* regexp/algorithmic moderation

* LLM

* humans (for the thorny stuff the LLM can't handle)

Full disclosure: my employer has a product called Cleanspeak which does algorithmic profanity filtering. I'm not close to the product, but I don't think there's any LLM usage in it right now.
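A hand-wavy sketch of what those three passes might look like wired together. The blocklist terms and the `llm_flagged` function are placeholders I made up, not any real product's API:

```python
import re

# Pass 1 input: hypothetical blocklist terms, purely for illustration.
BLOCKLIST = re.compile(r"\b(badword1|badword2)\b", re.IGNORECASE)

def llm_flagged(text: str) -> bool:
    # Placeholder for pass 2: an LLM moderation call (e.g. an API request).
    # Stubbed out here since no specific provider is named in the thread.
    return False

def moderate(text: str) -> str:
    # Pass 1: cheap, fast, deterministic regex/algorithmic check.
    if BLOCKLIST.search(text):
        return "rejected"
    # Pass 2: LLM classification for anything the regex pass allows;
    # pass 3: escalate the thorny stuff to a human queue.
    if llm_flagged(text):
        return "needs_human_review"
    return "approved"
```

The ordering matters for cost: regex handles the bulk for nearly free, the LLM only sees what survives, and humans only see what the LLM can't decide.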

by slater on 2/19/2024, 5:12 AM

1. Yes, of course.

2. No, of course not. Hire people to make accurate judgment calls, instead of deluding yourself that "statistics on steroids" will ever provide the necessary nuance, just so the CEO can grift his way to a new Porsche.