Text Moderation Overview

We currently use third-party AI text moderation tools to detect several kinds of undesirable content, including but not limited to sexual content, hate, violence, bullying, promotions, and links to external sites. The tools support more than 15 languages, and the text models are also trained to understand the semantic meaning of emojis.

Text Classification

Within each class, levels are ordered by severity, from level 3 (most severe) to level 0 (benign). Spam and Promotions use only levels 3 and 0.

Sexual

  • 3: Intercourse, masturbation, porn, sex toys, and genitalia
  • 2: Sexual intent, nudity, and lingerie
  • 1: Informational statements that are sexual, affectionate activities (kissing, hugging, etc.), flirting, pet names, relationship status, sexual insults, and rejecting sexual advances
  • 0: The text does not contain any of the above

 
Hate

  • 3: Slurs, hate speech, promotion of hateful ideology
  • 2: Negative stereotypes or jokes, degrading comments, denouncing slurs, challenging a protected group's morality or identity, violence against religion
  • 1: Positive stereotypes, informational statements, reclaimed slurs, references to hateful ideology, the immorality of a protected group's rights
  • 0: The text does not contain any of the above

 
Violence

  • 3: Serious and realistic threats, mentions of past violence
  • 2: Calls for violence, destruction of property, calls for military action, calls for the death penalty outside a legal setting, mentions of self-harm/suicide
  • 1: Denouncing acts of violence, soft threats (kicking, punching, etc.), violence against non-human subjects, descriptions of violence, gun usage, abortion, self-defense, calls for capital punishment in a legal setting, destruction of small personal belongings, violent jokes
  • 0: The text does not contain any of the above

 
Bullying

  • 3: Slurs or profane descriptors toward specific individuals, encouraging suicide or severe self-harm, severe violent threats toward specific individuals
  • 2: Non-profane insults toward specific individuals, encouraging non-severe self-harm, non-severe violent threats toward specific individuals, silencing or exclusion
  • 1: Profanity in a non-bullying context, playful teasing, self-deprecation, reclaimed slurs, degrading a person's belongings, bullying toward organizations, denouncing bullying
  • 0: The text does not contain any of the above

 
Spam

  • 3: The text is intended to redirect a user to a different platform, for example through email addresses, phone numbers, or specific links
  • 0: The text does not include the above OR consists of a link to a safelisted domain (i.e., a popular, reputable site)
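
The Spam rule above can be illustrated with a small sketch. This is not how the third-party model works internally; it is a rough rule-based approximation, assuming a hypothetical SAFELIST and deliberately crude regular expressions, meant only to show how a safelisted link differs from other redirects.

```python
import re
from urllib.parse import urlparse

# Hypothetical safelist; the actual list of reputable domains is not given here.
SAFELIST = {"wikipedia.org", "youtube.com"}

# Deliberately crude patterns, for illustration only.
URL_RE = re.compile(r"https?://\S+", re.IGNORECASE)
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def is_safelisted(url: str) -> bool:
    """Return True if the URL's host is a safelisted domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in SAFELIST)


def spam_level(text: str) -> int:
    """Approximate the Spam class: 3 for redirects off-platform, otherwise 0."""
    # Strip trailing punctuation so "https://youtube.com." parses cleanly.
    urls = [u.rstrip(".,!?)") for u in URL_RE.findall(text)]
    if any(not is_safelisted(u) for u in urls):
        return 3
    # Check for emails and phone numbers outside of the URLs themselves.
    remainder = URL_RE.sub(" ", text)
    if EMAIL_RE.search(remainder) or PHONE_RE.search(remainder):
        return 3
    return 0
```

With this sketch, a message containing only a Wikipedia link scores 0, while one containing an email address, a phone number, or a link to a domain outside the safelist scores 3.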

 
Promotions

  • 3: Asking for likes/follows/shares, advertising monthly newsletters/special promotions, asking for donations/payments, advertising products, selling pornography, giveaways
  • 0: The text does not include the above.
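
As a minimal sketch of how the levels above might be consumed, the following models each class with the levels it can take and flags any class whose level meets a threshold. The class keys, the `LEVEL_1`/`LEVEL_2` names, the thresholds, and the `flagged_classes` helper are all assumptions made for illustration; the moderation tools' actual response format and any enforcement policy are not described in this article.

```python
from enum import IntEnum
from typing import Dict, List


class Severity(IntEnum):
    BENIGN = 0       # level 0 in this article
    LEVEL_1 = 1      # intermediate levels are unnamed in the article
    LEVEL_2 = 2
    MOST_SEVERE = 3  # level 3 in this article


# Levels each class can take, per the lists above.
ALLOWED_LEVELS = {
    "sexual": {0, 1, 2, 3},
    "hate": {0, 1, 2, 3},
    "violence": {0, 1, 2, 3},
    "bullying": {0, 1, 2, 3},
    "spam": {0, 3},
    "promotions": {0, 3},
}

# Hypothetical policy: flag a class at or above this level.
DEFAULT_THRESHOLDS = {
    "sexual": Severity.LEVEL_2,
    "hate": Severity.LEVEL_2,
    "violence": Severity.LEVEL_2,
    "bullying": Severity.LEVEL_2,
    "spam": Severity.MOST_SEVERE,
    "promotions": Severity.MOST_SEVERE,
}


def flagged_classes(
    scores: Dict[str, int],
    thresholds: Dict[str, int] = DEFAULT_THRESHOLDS,
) -> List[str]:
    """Return the sorted names of classes whose level meets its threshold."""
    flagged = []
    for name, level in scores.items():
        if name not in ALLOWED_LEVELS:
            raise ValueError(f"unknown class: {name}")
        if level not in ALLOWED_LEVELS[name]:
            raise ValueError(f"level {level} is not valid for class {name}")
        if level >= thresholds[name]:
            flagged.append(name)
    return sorted(flagged)
```

For example, `flagged_classes({"sexual": 1, "hate": 3, "spam": 3})` returns `["hate", "spam"]` under these assumed thresholds, and a score such as `{"spam": 2}` raises `ValueError` because Spam only takes levels 0 and 3.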
