Custom MapReduce Engine
A from-scratch Python implementation of the MapReduce programming model, applied to large-scale document processing.
data engineering, systems
Overview
Built entirely without Hadoop or Spark, this engine implements the core MapReduce paradigm in Python, including the shuffle, sort, and reduce phases, and benchmarks it against MongoDB aggregation pipelines on real datasets.
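To make the phases concrete, here is a minimal single-process sketch of a map, shuffle/sort, and reduce flow applied to word counting. It is an illustration only, not the engine's actual code: the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`, `run_job`), the word-count job, and the sample documents are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical names throughout; this is a sketch, not the project's API.

def map_phase(documents, mapper):
    """Apply the mapper to every (doc_id, text) pair, yielding (key, value) pairs."""
    for doc_id, text in documents:
        yield from mapper(doc_id, text)

def shuffle_and_sort(pairs):
    """Group values by key (shuffle), then order the groups by key (sort)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return sorted(groups.items())

def reduce_phase(grouped, reducer):
    """Collapse each (key, values) group into a single output record."""
    return [reducer(key, values) for key, values in grouped]

def run_job(documents, mapper, reducer):
    """Chain the three phases into one job."""
    return reduce_phase(shuffle_and_sort(map_phase(documents, mapper)), reducer)

# Example job: word count over a tiny, made-up document set.
def word_count_mapper(doc_id, text):
    for word in text.lower().split():
        yield word, 1

def word_count_reducer(word, counts):
    return word, sum(counts)

docs = [
    ("d1", "the quick brown fox"),
    ("d2", "the lazy dog and the fox"),
]
result = run_job(docs, word_count_mapper, word_count_reducer)
print(result)
```

In this sketch the shuffle step is an in-memory dictionary. An engine handling large document sets would more likely partition keys across workers and spill intermediate data to disk, which this example does not attempt.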