Safety & Moderation

Content Moderation & Flagging

Version 1.0 | Last revised: 2026-05-13

Purpose

Define how user-generated content and interactions are moderated to maintain platform safety and quality.

What Is Moderated

  • Mentor and student profile bios and photos
  • Session chat messages and shared files
  • Forum posts and community discussions (if enabled)
  • Session feedback and reviews

Automated Moderation

  • Keyword Filtering — Blocked terms list covering profanity, threats, personal information sharing, and grooming indicators
  • Contact Information Detection — Automatic flagging when phone numbers, email addresses, or social media handles are shared in messages (a pattern-matching sketch follows this list)
  • Image Scanning — Profile photos and shared images scanned for inappropriate content
  • Rate Limiting — Excessive messaging triggers throttling and review
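
The sketch below shows how the keyword and contact-information checks might be combined into a single pre-send filter. It is a minimal illustration, not the platform's actual implementation: the function name check_message, the BLOCKED_TERMS list, the regular expressions, and the flag labels are all assumptions.

```python
import re
from dataclasses import dataclass, field

# Illustrative blocked-terms list; the real list is maintained separately.
BLOCKED_TERMS = {"example-profanity", "example-threat"}

# Hypothetical patterns for contact-information detection.
PHONE_RE = re.compile(r"\b(?:\+?\d[\d\s().-]{7,}\d)\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")
HANDLE_RE = re.compile(r"(?<!\w)@\w{3,}")  # e.g. social media handles


@dataclass
class ModerationResult:
    allowed: bool
    flags: list[str] = field(default_factory=list)


def check_message(text: str) -> ModerationResult:
    """Run automated checks on a chat message before delivery (sketch)."""
    flags = []
    lowered = text.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        flags.append("blocked_term")
    if PHONE_RE.search(text) or EMAIL_RE.search(text):
        flags.append("contact_info")
    if HANDLE_RE.search(text):
        flags.append("social_handle")
    # In this sketch, any flag routes the message to the manual moderation
    # queue; a blocked term also prevents delivery outright.
    return ModerationResult(allowed="blocked_term" not in flags, flags=flags)


if __name__ == "__main__":
    print(check_message("Text me on +1 555 123 4567"))   # flags contact_info
    print(check_message("See you at the next session!"))  # clean
```

In practice, flagged messages would be written to the moderation queue described in the next section rather than simply annotated.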

Manual Moderation Queue

  1. Flagged content appears in country admin moderation dashboard
  2. Admin reviews context: full conversation, user history, severity assessment
  3. Actions available: approve (false positive), warn user, remove content, suspend user, escalate to safeguarding lead
  4. Target response time: 12 hours for standard flags, 2 hours for critical flags (see the sketch below)
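
To make the queue's actions and response targets concrete, here is a minimal data-model sketch. The enum names, the REVIEW_SLA mapping, and review_deadline are illustrative assumptions; only the action types and the 12-hour / 2-hour targets come from the policy above.

```python
from datetime import datetime, timedelta
from enum import Enum


class FlagSeverity(Enum):
    STANDARD = "standard"
    CRITICAL = "critical"


class ModerationAction(Enum):
    APPROVE = "approve"  # false positive
    WARN_USER = "warn_user"
    REMOVE_CONTENT = "remove_content"
    SUSPEND_USER = "suspend_user"
    ESCALATE = "escalate_to_safeguarding_lead"


# Target response times stated in the policy.
REVIEW_SLA = {
    FlagSeverity.STANDARD: timedelta(hours=12),
    FlagSeverity.CRITICAL: timedelta(hours=2),
}


def review_deadline(flagged_at: datetime, severity: FlagSeverity) -> datetime:
    """Return the time by which a country admin should have reviewed the flag."""
    return flagged_at + REVIEW_SLA[severity]
```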

Appeals Process

Users who receive moderation actions can appeal within 14 days. Appeals are reviewed by a different admin than the original reviewer. Final decisions are communicated within 7 business days.
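
The two rules in this paragraph, the 14-day appeal window and the requirement that a different admin review the appeal, could be enforced with checks along these lines. This is a sketch under assumed names (can_appeal, assign_appeal_reviewer, APPEAL_WINDOW), not the platform's actual code.

```python
from datetime import date, timedelta

APPEAL_WINDOW = timedelta(days=14)


def can_appeal(action_date: date, today: date) -> bool:
    """A user may appeal a moderation action within 14 days of receiving it."""
    return today <= action_date + APPEAL_WINDOW


def assign_appeal_reviewer(original_reviewer_id: str, available_admins: list[str]) -> str:
    """Appeals must be reviewed by a different admin than the original reviewer."""
    candidates = [a for a in available_admins if a != original_reviewer_id]
    if not candidates:
        raise ValueError("No eligible reviewer available")
    return candidates[0]
```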