"well just build it ourselves."
famous last words for literally every team that tries to provision per-tenant infra from scratch. every single one.
it starts simple enough. spin up a container per customer. route some traffic. done, right?
then you need persistent storage. then environment isolation. then wildcard DNS. then SSL per subdomain. then a provisioning API. then monitoring per tenant. then independent scaling.
three months later your infra engineer is maintaining a custom orchestration system instead of shipping the features your customers actually asked for. sounds fun right.
the iceberg of DIY provisioning
what looks like "just spin up a container" is actually a massive stack of interconnected problems that keep multiplying.
container orchestration
you need to create, start, stop, restart, and destroy containers on demand. you need health checks. automatic restarts on failure. resource limits so one tenant cant eat the entire host machine.
if youre on kubernetes thats writing Deployments, Services, Ingresses, PersistentVolumeClaims, ConfigMaps, and Secrets per tenant. if youre NOT on kubernetes then congrats youre building your own orchestrator. neither option is quick or fun.
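to make that concrete, heres a rough python sketch of what "one tenant" costs in raw kubernetes objects. every name, label, and resource number below is made up for illustration, not a real platform's API:

```python
# hypothetical sketch: the pile of kubernetes objects ONE tenant needs.
# names, labels, and resource numbers are assumptions for illustration.

def tenant_manifests(slug: str, image: str) -> list[dict]:
    """Return a minimal set of k8s objects for one isolated tenant."""
    labels = {"app": slug, "tenant": slug}
    return [
        {"kind": "Deployment",
         "metadata": {"name": f"{slug}-app", "labels": labels},
         "spec": {"replicas": 1,
                  "template": {"spec": {"containers": [
                      {"name": "app", "image": image,
                       # resource limits so one tenant cant eat the host
                       "resources": {"limits": {"cpu": "500m",
                                                "memory": "512Mi"}}}]}}}},
        {"kind": "Service",
         "metadata": {"name": f"{slug}-svc", "labels": labels}},
        {"kind": "Ingress",
         "metadata": {"name": f"{slug}-ing", "labels": labels},
         "spec": {"rules": [{"host": f"{slug}.yourdomain.com"}]}},
        {"kind": "PersistentVolumeClaim",
         "metadata": {"name": f"{slug}-data", "labels": labels}},
        {"kind": "ConfigMap",
         "metadata": {"name": f"{slug}-config", "labels": labels}},
        {"kind": "Secret",
         "metadata": {"name": f"{slug}-secrets", "labels": labels}},
    ]

manifests = tenant_manifests("acme", "yourco/ai-app:latest")
print(len(manifests), "objects for one tenant")
```

and thats before HPAs, NetworkPolicies, and whatever else your security review demands. multiply by tenant count.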
networking and routing
each tenant needs a unique URL. wildcard DNS, reverse proxy that routes based on subdomain, SSL certs per tenant (or a wildcard cert with proper management which is its own headache).
oh and custom domains. enterprise customers will absolutely want ai.theircompany.com instead of slug.yourdomain.com. that means DNS verification, cert provisioning, and proxy reconfiguration per custom domain. its a whole thing.
persistent storage
containers are ephemeral. your customers data is not. you need persistent volumes that survive restarts and redeployments. backup strategies. and you gotta make sure one tenants volume is never accessible to another. ever.
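a minimal sketch of that "never accessible to another tenant" guarantee at the path level, assuming a hypothetical /srv/volumes layout. one missed check like this is a cross-tenant data leak:

```python
# hypothetical sketch: scope every file access to one tenant's volume.
# the /srv/volumes layout is an assumption for illustration.
from pathlib import Path

VOLUME_ROOT = Path("/srv/volumes")

def tenant_volume_path(slug: str, subpath: str = "") -> Path:
    """Resolve a path inside one tenant's volume, rejecting escapes."""
    base = (VOLUME_ROOT / slug).resolve()
    target = (base / subpath).resolve() if subpath else base
    # refuse anything that resolves outside this tenant's directory
    if base != target and base not in target.parents:
        raise PermissionError(f"path escapes tenant volume: {subpath}")
    return target

assert tenant_volume_path("acme", "models/v1") == Path("/srv/volumes/acme/models/v1")
try:
    tenant_volume_path("acme", "../other-tenant/secrets")
except PermissionError:
    pass  # escape attempt correctly rejected
```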
environment isolation
API keys, model configs, feature flags, rate limits. each tenant needs their own set of env vars injected at runtime. you need a secure way to store, update, and rotate these without redeploying the whole thing.
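a sketch of what per-tenant env assembly looks like at container start: shared defaults merged with tenant overrides. key names and values here are made up:

```python
# hypothetical sketch of per-tenant env injection. key names are invented.
# real systems pull the overrides from a secrets store, not a dict literal.

BASE_ENV = {"LOG_LEVEL": "info", "MODEL": "default-v1", "RATE_LIMIT_RPM": "60"}

tenant_overrides = {
    "acme":  {"MODEL": "premium-v2", "RATE_LIMIT_RPM": "600", "API_KEY": "acme-secret"},
    "bravo": {"API_KEY": "bravo-secret"},
}

def env_for(slug: str) -> dict[str, str]:
    """Env vars injected into one tenant's container: base, then overrides."""
    return {**BASE_ENV, **tenant_overrides.get(slug, {})}

assert env_for("acme")["MODEL"] == "premium-v2"    # tenant override wins
assert env_for("bravo")["MODEL"] == "default-v1"   # falls back to base
assert env_for("acme")["API_KEY"] != env_for("bravo")["API_KEY"]  # never shared
```

the merge is the easy part. storing the secrets securely and rotating them without a redeploy is where the real work hides.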
provisioning automation
someone has to create all of this when a new customer signs up. manually? that doesnt scale past 5 customers. you need a provisioning API that creates the container, volume, network config, DNS record, SSL cert, and env vars all in one operation. building that API is a project in itself.
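the hard part of that "one operation" is partial failure: if the DNS step dies, the container and volume you already created have to be torn down or you leak resources. a hedged sketch of the core loop, with illustrative step names only:

```python
# hypothetical sketch of a provisioning API's core loop: run each step,
# and if one fails, undo everything that already succeeded. real steps
# would call k8s, a DNS provider, and a cert authority.

def provision_tenant(slug: str, steps: list[tuple]) -> list[str]:
    """Each step is (name, do, undo). Roll back completed steps on failure."""
    done = []
    try:
        for name, do, undo in steps:
            do(slug)
            done.append((name, undo))
        return [name for name, _ in done]
    except Exception:
        for name, undo in reversed(done):  # unwind in reverse order
            undo(slug)
        raise

log = []
ok = lambda what: (lambda s: log.append(f"create {what}"),
                   lambda s: log.append(f"delete {what}"))
steps = [("container", *ok("container")), ("volume", *ok("volume")),
         ("dns", *ok("dns")), ("ssl", *ok("ssl")), ("env", *ok("env"))]
assert provision_tenant("acme", steps) == ["container", "volume", "dns", "ssl", "env"]
```

and thats still ignoring retries, idempotency, and two signups racing each other.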
monitoring and observability
at 10 tenants you can maybe manage manually. at 100? no chance. you need per-tenant metrics: CPU, memory, disk, network. health status. uptime tracking. alerting. you need to know when tenant 47 is running hot BEFORE they file a support ticket. not after.
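a sketch of the per-tenant sweep that catches tenant 47 early. the thresholds and metric names are invented for illustration:

```python
# hypothetical sketch: per-tenant threshold alerting. thresholds and
# metric names are made up; real systems feed this from a metrics store.

THRESHOLDS = {"cpu_pct": 85.0, "mem_pct": 90.0, "disk_pct": 80.0}

def check_tenant(slug: str, metrics: dict[str, float]) -> list[str]:
    """Return alert strings for any metric over its threshold."""
    return [f"{slug}: {name} at {metrics[name]:.0f}% (limit {limit:.0f}%)"
            for name, limit in THRESHOLDS.items()
            if metrics.get(name, 0.0) > limit]

assert check_tenant("tenant-47", {"cpu_pct": 96, "mem_pct": 70, "disk_pct": 82}) == [
    "tenant-47: cpu_pct at 96% (limit 85%)",
    "tenant-47: disk_pct at 82% (limit 80%)",
]
assert check_tenant("tenant-12", {"cpu_pct": 40, "mem_pct": 55, "disk_pct": 30}) == []
```

now wire that to pagerduty, tune the thresholds so it doesnt cry wolf, and keep it running across 100 tenants. thats the actual job.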
the real cost here is opportunity
engineering time is expensive yeah but its not the biggest cost. the biggest cost is opportunity.
every week your team spends building infrastructure is a week they dont spend on:
- features that actually differentiate your product
- onboarding flows that convert trial users to paying customers
- integrations that unlock new markets
- the actual AI capabilities your customers are paying you for
your customers dont care about your provisioning system. they care about what your AI does for them. infrastructure is necessary but its not differentiating. the less time you spend on it the more time you spend on what actually matters.
the maintenance tax (its permanent btw)
building it is only the beginning. and honestly its the easy part.
infrastructure requires ongoing maintenance forever:
- container runtime updates and security patches
- SSL certificate renewals (they expire. they always expire at the worst time.)
- storage capacity planning
- network config changes
- monitoring system upkeep
- incident response procedures
- documentation for the team (that nobody reads but you still gotta write it)
this is a permanent tax on your engineering capacity. it doesnt shrink as your product matures. it grows as your tenant count increases. more tenants = more things to break = more time maintaining instead of building.
when DIY actually makes sense
to be fair building your own makes sense in specific situations:
- you have a dedicated infrastructure team with nothing else on their plate
- your isolation requirements are unusual enough that no platform supports them
- infrastructure itself is your competitive advantage
- youre operating at scale where platform costs genuinely exceed build costs
for most teams shipping AI products? none of these apply. you need isolation, not an infrastructure science project.
what ShipClaw replaces
ShipClaw replaces the entire stack above with a visual builder and a deploy button. thats not marketing speak its literally what it does.
| DIY component | ShipClaw equivalent |
|---|---|
| Container orchestration | drag a Runtime node onto canvas |
| Networking and routing | Gateway node handles wildcard routing + SSL |
| Persistent storage | Volume node mounts at /data automatically |
| Environment isolation | Env Config node per tenant |
| Provisioning automation | click deploy. platform does everything. |
| Monitoring | built-in dashboard with per-tenant metrics |
| Custom domains | Custom Domain node with automatic SSL |
no Dockerfiles. no kubernetes manifests. no terraform. no provisioning API to build and maintain.
design the topology visually. deploy. get back to building the thing your customers actually pay for.
the math (since nobody does it)
rough estimate for a team of two engineers building DIY tenant provisioning:
- initial build: 8-12 weeks of focused engineering (assuming nothing goes wrong lol)
- ongoing maintenance: 10-20 hours per week, every week, forever
- incident response: unpredictable but guaranteed to happen at worst possible time
- opportunity cost: ~3 months of product development just gone
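same estimate as arithmetic, using the rough numbers above (these are the posts estimates, not measurements):

```python
# back-of-envelope using the rough numbers above: two engineers,
# 8-12 week build, 10-20 hrs/week maintenance afterward. estimates only.

build_weeks      = (8, 12)     # initial focused build range
maint_hrs_week   = (10, 20)    # ongoing maintenance range, forever
hrs_per_eng_week = 40

def first_year_hours(weeks: int, maint: int, engineers: int = 2) -> int:
    build = weeks * hrs_per_eng_week * engineers
    upkeep = maint * (52 - weeks)  # maintenance starts once it ships
    return build + upkeep

low  = first_year_hours(build_weeks[0], maint_hrs_week[0])   # best case
high = first_year_hours(build_weeks[1], maint_hrs_week[1])   # still optimistic
print(f"year one: {low}-{high} engineer-hours")  # → year one: 1080-1760 engineer-hours
```

call it a thousand to nearly two thousand engineer-hours in year one, before a single incident.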
ShipClaw gets you to the same outcome in an afternoon. the rest of the quarter is yours for actual product work.
bottom line
DIY tenant provisioning is a trap. looks like a weekend project. turns into a permanent engineering commitment that slowly eats your team alive.
the question isnt "can we build this?" you can. every team can. the question is "should we?"
for most teams the answer is no. use the platform. ship the product. stop spending your best engineers time on problems that are already solved.
