Big data platform for precision agriculture
Problem / Need
Raw agricultural signals (weather, crops, soil, yields) arrived in heterogeneous formats with no unified storage or compute layer, blocking the team from turning the data into product features.
Solution
Designed and deployed a big data architecture combining a Hadoop data lake, an Apache Spark cluster, and Elasticsearch. Ingested 100+ GB of agricultural data from multiple sources (scraped market data, spreadsheets, PDFs, images) covering crop information, fertilizers, yields, rainfall, and climate. Built end-to-end ingestion and transformation pipelines that turn the raw data into ML-ready features for the smart-farming app.
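The ingestion-to-features step can be illustrated with a minimal sketch. It uses plain Python in place of the Spark jobs so that it runs anywhere, and every name in it is an assumption made for illustration: the column headers, the unified schema (`region`, `crop`, `rainfall_mm`, `yield_t_ha`), and the helper functions are hypothetical and not taken from the project.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def from_spreadsheet(csv_text):
    """Parse a CSV export into the (hypothetical) unified schema."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [
        {
            "region": r["Region"].strip().lower(),
            "crop": r["Crop"].strip().lower(),
            "rainfall_mm": float(r["Rainfall (mm)"]),
            "yield_t_ha": float(r["Yield (t/ha)"]),
        }
        for r in rows
    ]


def from_scraped(records):
    """Normalize scraped records, assumed here to report rainfall in cm."""
    return [
        {
            "region": r["area"].strip().lower(),
            "crop": r["commodity"].strip().lower(),
            "rainfall_mm": r["rain_cm"] * 10.0,  # cm -> mm
            "yield_t_ha": float(r["yield"]),
        }
        for r in records
    ]


def build_features(records):
    """Aggregate unified records into per-(region, crop) features."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["region"], rec["crop"])].append(rec)
    return {
        key: {
            "mean_rainfall_mm": mean(r["rainfall_mm"] for r in recs),
            "mean_yield_t_ha": mean(r["yield_t_ha"] for r in recs),
            "n_obs": len(recs),
        }
        for key, recs in groups.items()
    }
```

Each source gets its own small normalizer that maps into one shared schema, so the feature step never needs to know where a record came from. In a Spark setting the same shape would typically become per-source readers followed by a grouped aggregation.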
Tools
Hadoop · Apache Spark · Elasticsearch · Python
