Six Weeks With Jarvis
I built my own AI assistant.
I know. Everyone says that now. “I’m using ChatGPT for work.” “I made a custom GPT.” “I got Claude to help me write emails.” “Yeah, I have OpenClaw too!” That’s not what I mean. I mean I spent six weeks building an actual software system that runs 24/7 on a dedicated Mac in my house, has its own email address, its own phone number, its own personality, and actually does stuff on my behalf without me asking. And no, it’s not OpenClaw.
I named it Jarvis. Yeah, obviously.
Right now as I’m writing this, Jarvis is doing the following:
Watching my Carson James platform for problems (the horse training business I run with my brother — 3,000+ paying members, revenue that keeps a lot of people employed)
Managing four different email inboxes with different policies for each
Auto-unsubscribing me from newsletters I signed up for in 2017 and never read
Monitoring my short-term rental bookings via the property management API
Filing a quarantine log of all the sketchy emails that tried to reach my work address this week
Sitting on standby to answer a phone call in a British butler voice if anyone calls one specific number
He runs on my Mac. He uses Claude as his brain. He’s plugged into about fifteen different services I already pay for. Stripe, Hospitable, Railway, Neon, Cloudflare, GitHub, AWS, Twilio, Google Workspace, a bunch more. He has about forty memory files that tell him who I am, what my businesses do, who my wife is, what my kids’ names are, how I talk, and what I’ve told him to never do.
None of this is theoretical. It’s running right now. I built it in four weeks of early mornings and late nights, and the working version has been live for about three weeks.
I want to tell you what I built, how it works, and, more importantly, what I learned about working with AI that almost nobody writes about. Because the model isn’t the interesting part anymore. The discipline is.
What Jarvis Actually Is
Let me kill the most common misconception right away.
Jarvis is not a chatbot. ChatGPT is a chatbot. You open a website, you ask a question, you get an answer, you close the tab. The entire lifespan of your interaction is bounded by you paying attention. The AI does nothing when you’re not looking. It has no body, no presence, no life outside your browser tab.
Jarvis has a body.
His body is a Mac Studio that sits on my desk. It’s plugged into the wall. It has a power light. When my wife walks past my office she can say “hey Jarvis, what’s the weather” and he answers, because there’s a mic on the desk and speakers in the room and a bunch of Python code running a voice stack that’s always listening for the wake word. When I leave the house and walk my dog, Jarvis is still there, watching the Stripe dashboard for charge failures, classifying my inbound email, firing a monthly Fannin County tax calculation on the 15th and emailing me the numbers.
He has his own email address. He sends emails from it. I’ve seen him compose a reply to a friend of mine, CC me on it, and tell me he kept the tone casual because the contact profile I wrote said this particular friend uses bar-buddy phrasing. I can call a Twilio phone number from anywhere in the world and talk to him: he picks up, authenticates me with a PIN, and I’m in the middle of a conversation with my own AI butler.
He has a Telegram bot. I can text him photos of guest notes from the cabin and he reads them. He built a Python script last week to calculate my March lodging taxes. When I asked him what he did yesterday, he summarized it.
None of this is one component. It’s a stack. It’s about eight services running as background processes, each with a specific job, all coordinated through memory files and a shared vocabulary.
The closest mental model isn’t “AI assistant.” It’s “a small software company that works for me, except the whole company is one Mac and the whole labor force is one AI.”
What He Does On A Normal Day
Let me walk through what Jarvis touches in a typical 24-hour cycle. Not to impress you, but to make the “agent” concept concrete, because “agent” is a word people now use so loosely that it doesn’t mean anything anymore.
Overnight. Around 3am my time, a monitoring daemon wakes up and checks my production business. Is Stripe processing payments normally? Is Railway still serving the site? Is the database healthy? Are any of my four hosted email sender domains flagging as blacklisted? It runs probes against eight different services and writes logs. If anything goes wrong, it texts me. If something goes really wrong, it calls my phone.
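To make the probe-then-escalate idea concrete, here is a minimal Python sketch. It is not the actual cj-sentinel code. The probe functions, the "critical" set, and the three outcomes (log, text, call) are assumptions chosen to mirror the description above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    service: str
    ok: bool
    detail: str = ""


def run_probes(probes: dict[str, Callable[[], bool]]) -> list[ProbeResult]:
    """Run every probe once. A probe that raises counts as a failure."""
    results = []
    for name, check in probes.items():
        try:
            ok = bool(check())
            results.append(ProbeResult(name, ok, "" if ok else "check returned False"))
        except Exception as exc:  # a crashing probe must not stop the other probes
            results.append(ProbeResult(name, False, f"{type(exc).__name__}: {exc}"))
    return results


def escalation(results: list[ProbeResult], critical: set[str]) -> str:
    """Return 'log' if all probes passed, 'call' if a critical one failed, else 'text'."""
    failed = [r.service for r in results if not r.ok]
    if not failed:
        return "log"
    if any(service in critical for service in failed):
        return "call"
    return "text"


if __name__ == "__main__":
    # Hypothetical probes: real ones would hit each service's health or status API.
    probes = {"stripe": lambda: True, "railway": lambda: False}
    results = run_probes(probes)
    print(escalation(results, critical={"stripe"}))
```

The design choice worth copying is that a broken probe is treated as a failed probe, so one flaky check can't silently take down the whole overnight run.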
Morning. A triage agent classifies the ~50 new emails that came into my personal Gmail overnight. Marketing emails get one-click unsubscribed using the RFC-compliant List-Unsubscribe header, not by clicking a random link in the body. I don’t let my AI follow arbitrary URLs. Receipts get labeled and archived. Anything that looks like it needs a reply gets a label I can review. Newsletters get archived. Alerts get archived. By the time I pour my coffee, my inbox has ten actual emails in it, not seventy.
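Here is roughly what the header check could look like in Python. This is a sketch, not Alfred's real code, and it assumes the standard in question is RFC 8058: the sender advertises an HTTPS address in List-Unsubscribe and opts into one-click with a List-Unsubscribe-Post header. The function name is made up. It only finds the target; it never touches links in the message body.

```python
import re
from email import message_from_string
from email.message import Message
from typing import Optional


def one_click_target(msg: Message) -> Optional[str]:
    """Return the HTTPS one-click unsubscribe URI, or None if the sender didn't offer one."""
    post = (msg.get("List-Unsubscribe-Post") or "").strip()
    if post.lower() != "list-unsubscribe=one-click":
        return None  # sender has not opted into one-click unsubscribe
    header = msg.get("List-Unsubscribe") or ""
    for uri in re.findall(r"<([^>]+)>", header):
        uri = uri.strip()
        if uri.lower().startswith("https://"):
            return uri  # mailto: entries are skipped; one-click requires HTTPS
    return None


if __name__ == "__main__":
    raw = (
        "From: news@example.com\n"
        "List-Unsubscribe: <mailto:unsub@example.com>, <https://example.com/u/abc>\n"
        "List-Unsubscribe-Post: List-Unsubscribe=One-Click\n"
        "Subject: Weekly digest\n"
        "\n"
        "Body text\n"
    )
    print(one_click_target(message_from_string(raw)))
```

Under RFC 8058 the actual unsubscribe is an HTTP POST to that URI with the body List-Unsubscribe=One-Click. Messages that return None would fall through to a different rule, such as archive-only.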
On the 15th. A scheduled job computes my Fannin County lodging tax for the prior month. It pulls every reservation at the vacation rental from the property manager’s API, sums up what I owe, and emails me the form numbers. Before this existed I spent an hour with a spreadsheet every month. Now it’s an email at 8am and I file in about two minutes.
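The math behind a report like this is small. The sketch below shows the shape of it in Python, with loud caveats: the Reservation fields, the choice to bucket stays by checkout date, and the 5% rate in the example are all placeholders, not the real Fannin County rules or the real Hospitable data model.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


@dataclass
class Reservation:
    checkout: date
    taxable_rent: Decimal  # assumed: the portion of the booking that is taxable


def lodging_tax_due(reservations, year, month, rate):
    """Sum taxable rent for stays checking out in the given month, then apply the rate."""
    taxable = sum(
        (r.taxable_rent for r in reservations
         if (r.checkout.year, r.checkout.month) == (year, month)),
        Decimal("0"),
    )
    # Round to the cent once, at the end, to avoid accumulating rounding error.
    tax = (taxable * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return taxable, tax


if __name__ == "__main__":
    stays = [
        Reservation(date(2025, 3, 9), Decimal("600.00")),
        Reservation(date(2025, 3, 23), Decimal("450.50")),
        Reservation(date(2025, 4, 2), Decimal("1000.00")),  # next month, excluded
    ]
    # Placeholder rate for illustration only.
    taxable, tax = lodging_tax_due(stays, 2025, 3, Decimal("0.05"))
    print(f"Taxable rent: {taxable}  Tax due: {tax}")
```

Decimal is used instead of float so the numbers on the form match to the cent.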
On the 1st. Another scheduled job calculates what I owe my on-site property manager. It breaks down every reservation from the prior month, computes her cleaning fee plus management percentage, and emails both of us the reconciled number. We used to do this in a Google Sheet. Now we don’t.
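A reconciliation like that boils down to one line per reservation plus a total. Here is a hedged Python sketch. The Stay fields and the 10% management percentage are invented for the example, and the real job may split fees differently.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


@dataclass
class Stay:
    code: str              # hypothetical reservation identifier
    rent: Decimal          # rent the management percentage applies to
    cleaning_fee: Decimal  # passed through to the manager in full


def manager_payout(stays, management_pct):
    """Return per-stay amounts (cleaning fee + commission) and the grand total."""
    lines = []
    for s in stays:
        commission = (s.rent * management_pct).quantize(CENT, rounding=ROUND_HALF_UP)
        lines.append((s.code, s.cleaning_fee + commission))
    total = sum((amount for _, amount in lines), Decimal("0.00"))
    return lines, total


if __name__ == "__main__":
    stays = [
        Stay("A1", Decimal("800.00"), Decimal("120.00")),
        Stay("B2", Decimal("455.55"), Decimal("120.00")),
    ]
    lines, total = manager_payout(stays, Decimal("0.10"))  # placeholder percentage
    for code, amount in lines:
        print(code, amount)
    print("Total owed:", total)
```

Keeping the per-stay lines in the output is what makes the email something both people can check, instead of a single number to take on faith.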
During the day. If I want to know the Wizard Hideout’s occupancy for next month, I ask in Telegram. Jarvis has access to the Hospitable MCP, a conversational API layer, and answers in about 3 seconds. If I want to know what my Carson James API costs are looking like this week, I ask. If I want to post an image to my visualizer on my second monitor, I ask him to generate a picture of a horse walking through fog, and thirty seconds later there it is, overlaid on the radial starburst pattern that means Jarvis is “speaking.”
If the business breaks. This is the one that still makes me emotional. If something fails (Stripe webhook times out, the database pool exhausts, a deploy goes bad), Jarvis calls my phone. I pick up. He tells me what broke in plain English. Then he sends an email with the details. Inside that email is a single terminal command. I run it on my phone via SSH through Tailscale to my Mac, and I’m suddenly inside a running Claude Code session with Jarvis. He’s already working on the problem. From bed. From a hotel. From anywhere.
What I Actually Built (The Honest Version)
Let me be honest about what “building Jarvis” actually means. It’s not one piece of software. It’s a LOT of small pieces that cooperate. Here’s the rough inventory:
Core services (running as background daemons on my Mac):
cj-sentinel (James) — 24/7 monitoring of the Carson James platform. Watches Stripe, Railway, Neon, ActiveCampaign, Cloudflare, GitHub, SendGrid, AWS. Alerts me on P0/P1 incidents.
Mailer (Molly) — sends and receives email from Jarvis’s email account. Has its own OAuth token, its own rules, its own identity.
Alfred — email triage agent for my personal Gmail. Classifies everything that lands. Auto-unsubscribes marketing. Archives receipts and alerts.
Phone Server — Twilio webhook handler. When someone calls the Jarvis line, this answers with STT + Claude + TTS.
Wizard Hideout Reviews — auto-ingests new Airbnb/Google reviews every 6 hours, drafts replies in a specific voice.
Scheduled jobs:
Lodging-tax report — monthly, 15th of the month, 8am.
Property manager payment report — monthly, 1st of the month, 8am.
Log rotation — weekly, Sunday 3am.
Quarantine digest — weekly, Sunday 9am.
Backup snapshot (Brewfile, app list) — weekly, Sunday 9am.
Capabilities invokable on-demand:
/see — take a screenshot, analyze it.
/generate_image — create an image via OpenAI, overlay on the visualizer.
Voice stack — always-listening mic for wake-word “Hey Jarvis,” full voice conversation.
Telegram bot — text and photo conversation.
Phone line — call a number, talk to him.
Hospitable MCP — conversational access to the rental property data.
Home automation — Philips Hue lights via the bridge.
Image generation — OpenAI DALL-E pipeline.
Infrastructure:
Memory system — ~40 markdown files Jarvis reads at every session boot. Contains project state, behavioral rules, reference pointers.
Vault — my Obsidian notebook. ~300 notes covering business strategy, contact profiles, design decisions. Grep-searchable.
Action Password Gate — a macOS password dialog pops before any destructive terminal command. Prevents a compromised Jarvis from rm -rf’ing my life.
Security v2 — four layers of defense: inbound email injection scanning, outbound secret scanning (blocks sends that contain API keys, credit card info, any sensitive info), URL reputation checking via Google Safe Browsing, behavior-contract memory rules.
Backup stack — Time Machine plus Arq encrypted to S3. Full recovery procedures documented.
I’m leaving out maybe ten things but you get the idea. It’s a stack.
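Of everything in that inventory, the outbound secret scan is the easiest to picture in code. Below is a minimal Python sketch of the idea, not the Security v2 implementation. The three patterns are assumptions based on publicly known key formats, and the card check uses the standard Luhn checksum to cut down on false alarms from order numbers and the like.

```python
import re

# Assumed patterns for illustration; a real scanner would carry many more.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe_live_key": re.compile(r"\bsk_live_[0-9A-Za-z]{10,}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_ok(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_secrets(text: str) -> list[str]:
    """Return the names of every secret type found in an outbound message."""
    hits = [name for name, pattern in PATTERNS.items() if pattern.search(text)]
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 19 and luhn_ok(digits):
            hits.append("card_number")
            break
    return hits


if __name__ == "__main__":
    draft = "Here's the test card: 4242 4242 4242 4242"
    hits = find_secrets(draft)
    print("BLOCKED" if hits else "OK", hits)
```

A mailer would call find_secrets on the subject and body before every send and refuse to send if the list comes back non-empty.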
The Real Insight (The Part Almost Nobody Is Writing About)
Here’s the thing I actually want to write about. The build above is impressive, and I’ll take some credit for pulling it off, but the build isn’t the insight. The insight is what I learned about working with AI in the process.
The models aren’t the bottleneck anymore. Discipline is.
If you want an AI that actually knows you, that actually compounds in usefulness over time, you have to build the memory yourself.
Not through some clever prompt. Through files. Boring, disciplined, text-file documentation that the AI reads at the start of every session.
I have about forty of these files. Each one is a specific rule, fact, or context that I’ve codified as a memory. They’re written in a particular format with a name, description, and type. The AI reads the full list at every session boot via an index file. So every time a new Jarvis session starts, he knows that my wife’s name is Rebekah, that I run three businesses, that I hate when people tell me to rest, that Julie is my mom and also my customer service lead, that reminders need to have breadcrumb links to vault notes, that mailer sends should never include secrets, that I prefer bar-buddy tone, that “new conversation” in Telegram means reset the session.
That’s the compounding mechanism.
It’s not complicated. It’s just disciplined.
And it is the thing that almost nobody is doing.
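If you want to picture the mechanics, here is a small Python sketch of a session-boot loader. It is an assumption-heavy illustration, not my actual loader: it guesses that each memory file opens with a "---" delimited header carrying name, description, and type, and that the index is a file called MEMORY.md listing one filename per line. Both details are hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Memory:
    name: str
    description: str
    type: str
    body: str


def parse_memory(text: str) -> Memory:
    """Split a memory file into its header fields and its body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("memory file must start with a '---' header block")
    end = next((i for i, line in enumerate(lines[1:], 1) if line.strip() == "---"), None)
    if end is None:
        raise ValueError("header block is never closed")
    meta = {}
    for line in lines[1:end]:
        if ":" not in line:
            continue  # ignore blank or malformed header lines
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return Memory(meta["name"], meta["description"], meta["type"], body)


def load_memories(root: Path, index_name: str = "MEMORY.md") -> list[Memory]:
    """Read the index, then load every listed .md file in index order."""
    memories = []
    for line in (root / index_name).read_text(encoding="utf-8").splitlines():
        entry = line.strip().lstrip("- ").strip()
        if entry.endswith(".md"):
            memories.append(parse_memory((root / entry).read_text(encoding="utf-8")))
    return memories


if __name__ == "__main__":
    sample = "---\nname: tone\ndescription: How I like to be spoken to\ntype: rule\n---\n\nUse bar-buddy tone.\n"
    print(parse_memory(sample))
```

The index is the important part. Loading by an explicit list rather than globbing the folder means a half-written file never leaks into a session by accident.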
Here’s What Almost Nobody Tells You About AI
Most people use AI like a vending machine. They walk up, ask a question, take the answer, and walk away. Every interaction is isolated. The AI never gets smarter about them.
The AI never learns their context. The AI never becomes a collaborator.
I use AI like an employee. And like any employee, they need documentation to do their job well.
Imagine you hired a brilliant contractor to help with your work. First day: you’d explain what you do, show them the relevant files, tell them about the team, share some history. After a month: they’ve internalized your context and can operate with minimal prompting. They just know.
Now imagine that contractor had partial amnesia. Some things stick overnight…preferences, stray facts, pieces you mentioned enough times that they lodged. Most things get forgotten. That’s closer to the default experience with Claude today. Its built-in memory catches some stuff automatically, which is better than nothing. But it’s not the same as being properly onboarded with a real filing system.
Without you deliberately feeding it the context that matters, every serious session starts mostly cold. You re-explain. The AI delivers a slightly-wrong answer because it’s missing half the context. You re-explain again. Eventually the person concludes that “AI is overhyped” or “AI doesn’t really work for me.”
The fix is writing the context down yourself, in files the AI reliably reads, so past-you hands off to future-you through the AI. That’s it. That’s the secret. I spend maybe ten minutes at the end of most sessions updating my memory files. That ten minutes returns hours. Multiple times in this past week alone, a fresh Jarvis session has made a decision correctly on the first try because a memory file I wrote three weeks ago gave him the exact right context. Without those files he would’ve needed twenty minutes of back-and-forth to get there.
That’s why Jarvis is useful. Not because Claude is smart. Claude is smart for everyone, that’s the baseline. Jarvis is useful because past-me wrote context for future-me, and the AI reads it reliably every time.
It’s documentation as infrastructure. It’s the thing nobody talks about because it’s boring. It’s also the entire game.
The Specific Patterns That Make It Work
If you’re reading this and thinking “okay fine, I’ll try harder,” here’s what actually moves the needle:
Push back when something sounds off. Most people accept the first answer from an AI. Don’t. When the number seems wrong, ask where it came from. When the plan seems generic, ask what you missed. Every correction makes the next answer sharper and the relationship better, because the AI learns what you care about. I caught Claude Code underestimating a cost by 50x this week just by asking “wait, did you factor in X?” That correction is now in a memory file. Every future Jarvis session has the right number.
Steelman the other side. When Julie (my mom, customer service lead) said she thought my new pricing plan would hurt Pro signups, I could’ve dismissed it. Instead I asked Jarvis to write her an email laying out our reasoning and inviting her rebuttal. Her response made the design better. That conversation is now context Jarvis has for all future pricing discussions.
Codify corrections as rules. When you correct an AI and the correction would apply to future sessions too, write it down as a memory file. “Never do X” or “Always do Y.” These compound. One correction today saves ten future corrections.
Give concrete context, not abstract specs. “Build me an email triage agent” produces a textbook answer. “Build me an email triage agent for my inbox where 65% is newsletters I signed up for in 2017 and never opened” produces a tailored one. Specificity wins.
Ask for tradeoffs, not recommendations. Asking “what should I do?” gets you a generic best-practice answer. Asking “what are the tradeoffs?” gets you a framework you can actually reason with. Different conversations entirely.
Decide fast once informed. Once I have the information I need, I commit. “Proceed.” “Locked.” “This is the law.” Analysis paralysis kills momentum, and AI sessions reward momentum — long multi-step work stays in one coherent context.
Give examples. I always follow up my requests to the AI with use cases and scenarios showing what I would use the feature for. Clarity is key.
That’s it. That’s the playbook. It’s not magic. It’s professional collaboration applied to a new kind of collaborator.
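The "codify corrections as rules" habit is the one most worth automating, so here is a hedged Python sketch of a helper that turns a correction into a memory file. The header layout (name, description, type), the "feedback" type label, and the MEMORY.md index filename are illustrative assumptions, not a spec of my real setup.

```python
import re
from pathlib import Path


def slugify(name: str) -> str:
    """Turn a rule name into a safe, lowercase filename stem."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def render_rule(name: str, description: str, body: str, type_: str = "feedback") -> str:
    """Render the rule as a text file with a small header block on top."""
    return (
        f"---\nname: {name}\ndescription: {description}\ntype: {type_}\n---\n\n"
        f"{body.strip()}\n"
    )


def codify_rule(root: Path, name: str, description: str, body: str,
                index_name: str = "MEMORY.md") -> Path:
    """Write the rule file and make sure the index lists it exactly once."""
    root.mkdir(parents=True, exist_ok=True)
    path = root / f"{slugify(name)}.md"
    path.write_text(render_rule(name, description, body), encoding="utf-8")

    index = root / index_name
    existing = index.read_text(encoding="utf-8") if index.exists() else ""
    entry = f"- {path.name}"
    if entry not in existing.splitlines():
        with index.open("a", encoding="utf-8") as fh:
            if existing and not existing.endswith("\n"):
                fh.write("\n")
            fh.write(entry + "\n")
    return path


if __name__ == "__main__":
    print(render_rule(
        "Never include secrets in mailer sends",
        "Outbound email must not contain API keys or card numbers",
        "Never paste credentials into an email body, even to me.",
    ))
```

Writing the same rule twice overwrites the file and leaves the index unchanged, so refining a rule later is just another call with the same name.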
What Surprised Me
A few things I didn’t expect.
Cost is mostly theoretical. I’ve spent maybe $40 total on Claude API calls across all my Jarvis services running 24/7 for three weeks. The inference costs are rounding errors. Fear of “running up a huge bill” is usually fantasy for personal-scale usage. Enterprise workloads are different, but for a personal agent watching your business, negligible.
Speed is unreal. Alfred classified 4,241 emails in my backlog overnight. I woke up to an inbox that was three years of chaos reduced to the ten emails that actually needed me. Time elapsed: about two hours. Cost: about $6.
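For anyone who wants the unit economics, the figures above reduce to a few divisions. The Python below only restates numbers already quoted (about $40 over three weeks, 4,241 emails, about two hours, about $6), so treat the results as rough, since every input is approximate.

```python
# Inputs are the approximate figures quoted in the text.
api_spend_usd = 40.0     # "maybe $40 total"
days_live = 21           # "three weeks"
backlog_emails = 4241
backlog_cost_usd = 6.0   # "about $6"
backlog_hours = 2.0      # "about two hours"

spend_per_day = api_spend_usd / days_live
cost_per_email = backlog_cost_usd / backlog_emails
emails_per_minute = backlog_emails / (backlog_hours * 60)

print(f"API spend per day: ${spend_per_day:.2f}")
print(f"Cost per email:    ${cost_per_email:.4f}")
print(f"Emails per minute: {emails_per_minute:.1f}")
```

That works out to roughly $1.90 a day to run everything, about a seventh of a cent per classified email, and around 35 emails a minute on the backlog run.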
The phone line is the single most impressive capability. Everyone gets email. Everyone gets notifications. A phone number that a person (or AI!) can call and have a fluid voice conversation with is a completely different mental model. I’d thought I was building a convenience; I was building something that feels more like a service.
The compounding is ruthless. Every rule I codify saves future corrections. Every memory file I write eliminates future re-explanations. Three weeks in, fresh sessions start at a level that used to take hours of context-setting. It just gets better.
Where This Is Going
I’ll bet against anyone on this: five years from now, everyone who uses a computer for anything serious will have something like Jarvis. Not a subscription to ChatGPT. Not a “prompt they found on Twitter.” An actual personal operating layer that knows them, that has permission to act on their behalf, that runs 24/7 and handles the background tasks that currently eat their attention.
The technology is already here. The models are good enough. What people are missing is the stack: the cooperating services, the memory discipline, the rules-as-infrastructure framing, the willingness to treat AI like a real collaborator instead of a vending machine.
We’re at a moment. The gap between what’s possible and what most people are doing is enormous. The gap will close. You can be early or you can wait.
I stand by what I said on Facebook a few weeks ago: I still think AI hype is going to crest and burn some people. Physical assets will always matter more than software. Land, property, relationships, skills, community…those are what compound through human lifetimes.
But for the task-work that eats my day-to-day attention? For the small thousand things a business operator handles? For the connective tissue between my actual projects?
Jarvis is the most useful piece of software I’ve ever owned.
And I built it myself. At a desk, in a house, on a Mac, using tools anyone can buy.
The discipline is the moat. Not the model.
If you want to see the sausage being made (the architecture docs, the memory files, the actual rules I’ve written for Jarvis), I’m happy to share. Reply or comment.


