
Overseer mitigates bias in ML-based hiring by pruning over-represented groups from hiring datasets. Each record is converted to a text embedding, the embeddings are clustered to identify over-represented groups, and those groups are pruned so that no single demographic dominates the training set. The balanced dataset can then be fed into downstream ML or LLM pipelines, giving recruiters a fairer signal and reducing gender- or ethnicity-based bias. Overseer is built with a Flask back-end (Cohere API, Scikit-Learn) and a Next.js/Three.js front-end. It won "Best DEI AI Hack" at GenAI Genesis 2025.
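
The cluster-then-prune step described above can be sketched roughly as follows. This is a minimal illustration, not Overseer's actual code: the function name `prune_overrepresented`, the choice of k-means, and the policy of capping every cluster at the median cluster size are all assumptions made for the example. Embeddings are assumed to be already computed and passed in as a 2-D array, one row per record.

```python
import numpy as np
from sklearn.cluster import KMeans


def prune_overrepresented(embeddings, n_clusters=3, cap=None, seed=0):
    """Cluster embeddings and cap the size of each cluster.

    Hypothetical helper: returns (kept_row_indices, cluster_labels).
    `cap` defaults to the median cluster size, which is an assumed policy.
    """
    embeddings = np.asarray(embeddings, dtype=float)

    # Group similar records; each cluster stands in for one "group".
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(embeddings)

    sizes = np.bincount(labels, minlength=n_clusters)
    if cap is None:
        cap = max(1, int(np.median(sizes)))

    rng = np.random.default_rng(seed)
    keep = []
    for cluster_id in range(n_clusters):
        members = np.flatnonzero(labels == cluster_id)
        if len(members) > cap:
            # Over-represented cluster: keep a random sample of `cap` rows.
            members = rng.choice(members, size=cap, replace=False)
        keep.extend(members.tolist())

    return sorted(keep), labels
```

Calling `keep, labels = prune_overrepresented(X, n_clusters=3)` on an embedding matrix `X` returns the row indices to retain, so the balanced dataset is simply the original records at those indices. Capping at the median leaves small clusters untouched and only shrinks the dominant ones; a stricter policy could cap at the smallest cluster size instead.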