FutureGrid - Building a Smart Energy Platform From Scratch
FutureGrid started as a simple idea - give solar panel owners real-time visibility into their energy production and consumption. It grew into a full platform with edge devices, a central management system, energy cooperatives, and a scenario engine for optimizing battery dispatch.
This is a side project built entirely outside my day job, in collaboration with engineers who have real experience in the energy sector. Every design decision, hardware choice, and protocol was debated and tested against actual grid constraints.
The Architecture
Two main components:
SolarManager - the edge device. Runs on custom embedded Linux hardware. Connects to solar inverters, reads energy data, and pushes it to the central platform over MQTT with mTLS.
SolarManager-Central - the cloud backend. Built in Go. Handles multi-tenant fleet management, energy analytics, billing, and a scenario engine for battery optimization. React frontend for the dashboard.
The edge and central systems talk over MQTT with mutual TLS - every device has its own certificate, auto-refreshed on connection failure. No shared secrets, no API keys in plaintext.
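To make the mutual TLS setup concrete, here is a minimal sketch of how a device could build its client-side TLS configuration in Go. The function name, parameters, and placeholder hostname are assumptions for illustration, not the actual SolarManager code. The resulting config would be handed to whichever MQTT client library is in use.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
)

// NewDeviceTLSConfig builds a client-side config for mutual TLS: the device
// presents its own certificate and trusts only the platform CA.
// Name and parameters are hypothetical, chosen for this sketch.
func NewDeviceTLSConfig(certPEM, keyPEM, caPEM []byte, serverName string) (*tls.Config, error) {
	// The per-device certificate and private key identify this device.
	cert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		return nil, fmt.Errorf("load device certificate: %w", err)
	}
	// Only the platform CA is trusted, not the system root store.
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("no CA certificates found in PEM data")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{cert},
		RootCAs:      pool,
		ServerName:   serverName,
		MinVersion:   tls.VersionTLS12,
	}, nil
}

func main() {
	// Empty inputs are rejected instead of silently producing a weak config.
	if _, err := NewDeviceTLSConfig(nil, nil, nil, "mqtt.example.invalid"); err != nil {
		fmt.Println("config rejected:", err)
	}
}
```

The broker side would mirror this by requiring and verifying client certificates, which is what makes the TLS mutual.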
Edge Device (SolarManager)
62 commits of embedded Linux pain. The device supports two boot modes - SD card and eMMC with TPM2 for secure credential storage.
I built a custom image builder that produces minimal Linux images. The eMMC variant went from 14GB to 6GB by stripping unnecessary packages and using firstboot expansion. Flashing is automated - plug the device in, run one script, wait.
Multi-Inverter Support
The first version only talked to one inverter model. Real installations have mixed hardware - different brands, different protocols (Modbus RTU, Modbus TCP, SunSpec, proprietary APIs). We added a plugin architecture that auto-detects the inverter type and loads the right driver.
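A plugin architecture like this usually comes down to a driver interface plus a registry that probes each driver in turn. The sketch below shows that shape in Go. The Driver interface, the Registry type, and the fake address-prefix probes are all assumptions made for illustration. Real drivers would probe by speaking the protocol (for example, reading a known Modbus register).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Driver is a hypothetical interface each inverter plugin would implement.
type Driver interface {
	Name() string
	// Probe reports whether the device at addr speaks this driver's protocol.
	Probe(addr string) bool
}

// Registry holds drivers in the order they should be probed.
type Registry struct{ drivers []Driver }

// Register appends a driver to the probe order.
func (r *Registry) Register(d Driver) { r.drivers = append(r.drivers, d) }

// ErrNoDriver is returned when no registered driver recognises the device.
var ErrNoDriver = errors.New("no driver recognised the inverter")

// Detect returns the first registered driver whose probe succeeds.
func (r *Registry) Detect(addr string) (Driver, error) {
	for _, d := range r.drivers {
		if d.Probe(addr) {
			return d, nil
		}
	}
	return nil, ErrNoDriver
}

// fakeDriver stands in for a real protocol implementation; it "probes" by
// matching an address prefix so the sketch runs without hardware.
type fakeDriver struct{ name, prefix string }

func (f fakeDriver) Name() string           { return f.name }
func (f fakeDriver) Probe(addr string) bool { return strings.HasPrefix(addr, f.prefix) }

// DemoRegistry wires up two fake drivers for the example.
func DemoRegistry() *Registry {
	r := &Registry{}
	r.Register(fakeDriver{name: "modbus-rtu", prefix: "serial://"})
	r.Register(fakeDriver{name: "modbus-tcp", prefix: "tcp://"})
	return r
}

func main() {
	d, err := DemoRegistry().Detect("tcp://192.0.2.10:502")
	if err != nil {
		fmt.Println("detect failed:", err)
		return
	}
	fmt.Println("loaded driver:", d.Name())
}
```

Probe order matters in a design like this: the more specific drivers should be registered ahead of generic ones, so a generic probe does not claim a device that a dedicated driver handles better.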
The WiFi Problem
Getting WiFi to work reliably on a headless embedded device is harder than it sounds. The device needs to:
- Boot into AP mode (hotspot) for initial setup
- Let the user configure their home WiFi via a web interface
- Switch from AP to client mode without losing the setup session
- Fall back to AP mode if the configured WiFi is unreachable
I went through five iterations of the hotspot manager. NetworkManager, wpa_supplicant, nmcli direct, static profiles - each one broke in a different way. The final version uses nmcli's device wifi hotspot command and handles the handoff cleanly. Seven commits just fixing WiFi.
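The requirements in the list above boil down to a small state machine. The sketch below models only the decision logic in Go. The actual mode switching (done with nmcli on the device) is left out. The type names and the failure threshold are assumptions for illustration.

```go
package main

import "fmt"

// Mode is the WiFi role the device is currently in.
type Mode int

const (
	ModeAP     Mode = iota // hotspot for initial setup or recovery
	ModeClient             // joined to the user's home network
)

// State is what the hotspot manager remembers between connectivity checks.
type State struct {
	Mode     Mode
	Failures int // consecutive failed attempts to reach the configured network
}

// Next decides the mode after one connectivity check.
// hasConfig: the user has saved home WiFi credentials.
// connected: the latest attempt to use the configured network succeeded.
// maxFailures: hypothetical threshold before falling back to AP mode.
func Next(s State, hasConfig, connected bool, maxFailures int) State {
	if !hasConfig {
		return State{Mode: ModeAP} // nothing to connect to yet: stay in setup mode
	}
	if connected {
		return State{Mode: ModeClient} // success resets the failure counter
	}
	failures := s.Failures + 1
	if failures >= maxFailures {
		return State{Mode: ModeAP, Failures: failures} // fall back to hotspot
	}
	return State{Mode: ModeClient, Failures: failures} // keep retrying as client
}

func main() {
	s := State{Mode: ModeClient}
	for i := 0; i < 3; i++ {
		s = Next(s, true, false, 3)
	}
	fmt.Println("in AP mode after repeated failures:", s.Mode == ModeAP)
}
```

Keeping the decision as a pure function makes the fallback behaviour testable without touching a radio, which helps when each iteration breaks in a different way.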
Device Hardening
For a device sitting on someone's home network, security matters. We restrict GPIO, SPI, I2C, and serial access. TPM2 stores device credentials. SSH is locked to key-only auth. The image builder has a dedicated hardening milestone with its own issue tracker.
Central Platform (SolarManager-Central)
71 commits. Written in Go with a React frontend. This is where the business logic lives.
Energy Cooperative
The biggest feature - energy cooperatives (spółdzielnia energetyczna). Polish energy law allows groups of prosumers to share surplus energy within a cooperative. The platform handles:
- Energy Ledger - hourly per-member energy records
- Balancing Engine - P2P surplus/deficit matching between cooperative members
- Monthly Settlement - member balances and amounts due
- Compliance Checks - Polish energy regulation rules
Building the balancing engine was the hardest part. Energy flows are directional, time-dependent, and subject to grid operator constraints. We had to model the physical grid topology to determine which members can actually share energy with each other.
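To give a feel for the core matching step, here is a deliberately simplified Go sketch that pairs surplus with deficit for a single hour. It is a greedy first-come matcher, not the production balancing engine: it ignores grid topology and operator constraints entirely, and all type and function names are assumptions.

```go
package main

import "fmt"

// Reading is one member's net energy for one hour, in watt-hours.
// Positive means surplus (exported), negative means deficit (imported).
type Reading struct {
	Member string
	NetWh  int64
}

// Transfer is energy notionally shared between two members.
type Transfer struct {
	From, To string
	Wh       int64
}

// Balance greedily matches surplus members to deficit members in input order.
// Whatever cannot be matched inside the cooperative is returned as grid
// export (leftover surplus) or grid import (leftover deficit).
func Balance(readings []Reading) (transfers []Transfer, gridExportWh, gridImportWh int64) {
	var surplus, deficit []Reading
	for _, r := range readings {
		switch {
		case r.NetWh > 0:
			surplus = append(surplus, r)
		case r.NetWh < 0:
			deficit = append(deficit, Reading{Member: r.Member, NetWh: -r.NetWh})
		}
	}
	i, j := 0, 0
	for i < len(surplus) && j < len(deficit) {
		wh := surplus[i].NetWh
		if deficit[j].NetWh < wh {
			wh = deficit[j].NetWh
		}
		transfers = append(transfers, Transfer{From: surplus[i].Member, To: deficit[j].Member, Wh: wh})
		surplus[i].NetWh -= wh
		deficit[j].NetWh -= wh
		if surplus[i].NetWh == 0 {
			i++
		}
		if deficit[j].NetWh == 0 {
			j++
		}
	}
	for ; i < len(surplus); i++ {
		gridExportWh += surplus[i].NetWh
	}
	for ; j < len(deficit); j++ {
		gridImportWh += deficit[j].NetWh
	}
	return transfers, gridExportWh, gridImportWh
}

func main() {
	hour := []Reading{{"A", 500}, {"B", -300}, {"C", -400}}
	transfers, exp, imp := Balance(hour)
	fmt.Println(transfers, exp, imp)
}
```

Integer watt-hours keep the ledger free of floating-point drift, and energy is conserved by construction: every watt-hour of surplus ends up either in a transfer or in the grid export total.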
Scenario Engine
A simulation environment where operators can test battery dispatch strategies before deploying them. Feed in historical production and consumption data, define rules ("charge battery when spot price < 200 PLN/MWh, discharge when > 500"), and see the financial impact.
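The example rule from the paragraph above can be sketched as a threshold rule replayed over historical hourly prices. This is a toy model in Go under stated assumptions: the battery starts empty, moves a fixed amount of energy per hour, and has no efficiency losses. The type names and the battery parameters are hypothetical. Only the 200 and 500 PLN/MWh thresholds come from the text.

```go
package main

import "fmt"

// Action is what the rule tells the battery to do for one hour.
type Action string

const (
	Charge    Action = "charge"
	Discharge Action = "discharge"
	Hold      Action = "hold"
)

// PriceRule charges below one spot price and discharges above another.
// Prices are in PLN/MWh.
type PriceRule struct {
	ChargeBelow    float64
	DischargeAbove float64
}

// Decide maps a spot price to an action.
func (r PriceRule) Decide(spotPrice float64) Action {
	switch {
	case spotPrice < r.ChargeBelow:
		return Charge
	case spotPrice > r.DischargeAbove:
		return Discharge
	default:
		return Hold
	}
}

// Battery is a simplified storage model (hypothetical parameters).
type Battery struct {
	CapacityMWh float64 // usable capacity
	StepMWh     float64 // max energy moved in or out per hour
}

// Simulate replays hourly prices and returns the net cash flow in PLN:
// charging costs money at the spot price, discharging earns it.
func Simulate(rule PriceRule, b Battery, prices []float64) float64 {
	soc, cash := 0.0, 0.0 // state of charge in MWh, cash in PLN
	for _, p := range prices {
		switch rule.Decide(p) {
		case Charge:
			e := b.StepMWh
			if room := b.CapacityMWh - soc; e > room {
				e = room
			}
			soc += e
			cash -= e * p
		case Discharge:
			e := b.StepMWh
			if e > soc {
				e = soc
			}
			soc -= e
			cash += e * p
		}
	}
	return cash
}

func main() {
	rule := PriceRule{ChargeBelow: 200, DischargeAbove: 500}
	battery := Battery{CapacityMWh: 0.010, StepMWh: 0.005}
	fmt.Printf("net result: %.2f PLN\n", Simulate(rule, battery, []float64{150, 300, 600}))
}
```

A realistic scenario would also account for round-trip efficiency, degradation, and tariffs, but the structure stays the same: a pure decision function plus a replay loop.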
The engine now auto-logs decisions for audit trails - every charge/discharge decision records the input data, the rule that triggered it, and the outcome. Required for regulatory compliance.
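One way to picture such an audit entry is a flat record serialized as one JSON line per decision. The Go sketch below is an assumption about the shape, not the platform's actual schema. The field names and the EncodeRecord and DecodeRecord helpers are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// DecisionRecord captures one charge/discharge decision: the inputs,
// the rule that fired, and the outcome. Field names are illustrative.
type DecisionRecord struct {
	Timestamp       time.Time `json:"timestamp"`
	SpotPricePLNMWh float64   `json:"spot_price_pln_mwh"` // input data
	StateOfCharge   float64   `json:"state_of_charge"`    // input data, 0..1
	Rule            string    `json:"rule"`               // rule that triggered
	Action          string    `json:"action"`             // outcome: charge, discharge, hold
	EnergyMWh       float64   `json:"energy_mwh"`         // outcome: energy moved
}

// EncodeRecord serializes a record as a single JSON line.
func EncodeRecord(rec DecisionRecord) (string, error) {
	b, err := json.Marshal(rec)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// DecodeRecord parses a JSON line back into a record, e.g. for an auditor's tool.
func DecodeRecord(line string) (DecisionRecord, error) {
	var rec DecisionRecord
	err := json.Unmarshal([]byte(line), &rec)
	return rec, err
}

func main() {
	line, err := EncodeRecord(DecisionRecord{
		Timestamp:       time.Date(2024, 1, 15, 13, 0, 0, 0, time.UTC),
		SpotPricePLNMWh: 150,
		StateOfCharge:   0.2,
		Rule:            "charge when spot price < 200 PLN/MWh",
		Action:          "charge",
		EnergyMWh:       0.005,
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(line)
}
```

Append-only JSON lines are easy to ship into the same log aggregation as everything else, though a regulator-facing trail may need stronger guarantees, such as write-once storage.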
Other Milestones
- Multi-Tenant Gateway - one platform instance serves multiple clients with isolated data
- Demo Environment - sandboxed realm for sales presentations
- Tiered Billing - pricing based on fleet size
- Monitoring & SIEM - platform health monitoring and security event logging
- TGE Integration - pulling day-ahead market prices from the Polish power exchange
- KSeF Integration - electronic invoice system required by Polish tax authority
- EV Charging Coordination - scheduling EV charging to minimize grid impact
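As an illustration of the Tiered Billing milestone above, pricing by fleet size can be sketched as a lookup over tiers. The tier boundaries and prices below are invented for the example and are not FutureGrid's real pricing. The sketch uses volume pricing, where the whole fleet is billed at the rate of the tier it falls into.

```go
package main

import "fmt"

// Tier applies to fleets with at least MinDevices devices.
// Prices are in grosze (1/100 PLN) per device per month to avoid float rounding.
type Tier struct {
	MinDevices       int
	PricePerDeviceGr int64
}

// DemoTiers is a made-up price list, sorted by MinDevices ascending.
var DemoTiers = []Tier{
	{MinDevices: 1, PricePerDeviceGr: 1000},
	{MinDevices: 50, PricePerDeviceGr: 800},
	{MinDevices: 500, PricePerDeviceGr: 600},
}

// MonthlyFee bills the whole fleet at the rate of the highest tier it reaches.
// tiers must be sorted by MinDevices ascending.
func MonthlyFee(tiers []Tier, devices int) int64 {
	if devices <= 0 {
		return 0
	}
	var price int64
	for _, t := range tiers {
		if devices >= t.MinDevices {
			price = t.PricePerDeviceGr
		}
	}
	return int64(devices) * price
}

func main() {
	for _, n := range []int{10, 50, 600} {
		fmt.Printf("%d devices: %d gr/month\n", n, MonthlyFee(DemoTiers, n))
	}
}
```

Volume pricing produces a small discontinuity at each tier boundary. A graduated scheme that bills each slice of the fleet at its own rate avoids that, at the cost of a less obvious invoice.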
Infrastructure & Operations
Both SolarManager and SolarManager-Central run fully containerized. Every component has its own Docker image with pinned versions - no "latest" tags in production. Docker Compose for local development, dedicated deployment pipelines for staging and production.
The central platform runs behind Cloudflare for DDoS protection, SSL termination, and CDN caching. The frontend is a static React build served through Cloudflare Pages - sub-50ms load times globally. API traffic goes through Cloudflare's WAF before hitting the backend.
Monitoring is self-hosted. Grafana dashboards track everything - API latency percentiles, MQTT message throughput, device heartbeat status, energy data ingestion rates, error budgets. Prometheus scrapes metrics from every service. Alerting goes to a dedicated channel when SLA thresholds are breached.
Logging follows a structured format across all services. Centralized log aggregation with retention policies. Every API request gets a correlation ID that traces through the entire pipeline from edge device to database write.
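On the HTTP side, correlation IDs are typically attached by a small middleware. The Go sketch below uses only the standard library. The header name X-Correlation-ID and all function names are assumptions. It covers the API layer only: carrying the same ID across MQTT messages from the edge device would need an equivalent field in the message payload or metadata.

```go
package main

import (
	"context"
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// headerName is an assumed header; the platform may use a different one.
const headerName = "X-Correlation-ID"

type ctxKey struct{}

// newID returns 16 hex characters of randomness.
func newID() string {
	b := make([]byte, 8)
	if _, err := rand.Read(b); err != nil {
		return "0000000000000000" // fall back rather than fail the request
	}
	return hex.EncodeToString(b)
}

// WithCorrelationID reuses the caller's ID if present, otherwise creates one,
// then exposes it on the response header and in the request context.
func WithCorrelationID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		id := r.Header.Get(headerName)
		if id == "" {
			id = newID()
		}
		w.Header().Set(headerName, id)
		ctx := context.WithValue(r.Context(), ctxKey{}, id)
		next.ServeHTTP(w, r.WithContext(ctx))
	})
}

// CorrelationID fetches the ID for use in log lines further down the stack.
func CorrelationID(ctx context.Context) string {
	id, _ := ctx.Value(ctxKey{}).(string)
	return id
}

// RoundTrip sends one in-memory request through the middleware and returns
// the ID from the response header and the ID the handler saw in its context.
func RoundTrip(incoming string) (header, seen string) {
	h := WithCorrelationID(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, CorrelationID(r.Context()))
	}))
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	if incoming != "" {
		req.Header.Set(headerName, incoming)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Header().Get(headerName), rec.Body.String()
}

func main() {
	header, seen := RoundTrip("device-42-req-7")
	fmt.Println(header, seen)
}
```

Every log call that has access to the request context can then include CorrelationID(ctx) as a structured field, which is what lets one search in the log aggregator follow a request through the pipeline.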
The codebase is documented with architecture decision records (ADRs) for every significant design choice. API contracts are defined in OpenAPI specs and validated in CI. Database migrations are versioned and reversible. The deployment process is scripted end to end - a single command promotes a tested build from staging to production.
Almost everything is open source tooling, so there is little vendor lock-in: PostgreSQL, Redis, MQTT (Mosquitto), Grafana, and Prometheus, with Cloudflare's free tier in front. Everything apart from Cloudflare can be replicated from the docker-compose files in the repo.
What Made It Hard
Domain complexity. Energy systems have their own language, regulations, and physics. Every feature required research into grid codes, tariff structures, and metering standards. The engineers I worked with saved months of wrong turns.
Hardware diversity. Each new inverter model means a new protocol implementation. Documentation ranges from excellent (SMA) to nonexistent (cheap Chinese inverters with no English docs).
Regulatory changes. Polish energy law changed twice during development. Each change required adjustments to the balancing engine and settlement calculations.
Embedded Linux. Getting a reliable, secure, auto-updating edge device is 10x harder than a cloud service. Everything that can go wrong on a customer's home network will go wrong.
Current State
The platform is live at futuregrid.pl. Development and staging environments run on the OLAB cluster. We're working toward mobile app support and expanding the cooperative features for larger installations.