Real Work. Real Results.

Every engagement starts with a diagnosis. Here's what we found, and what changed.

LEGAL / WORKFLOW AUDIT

15-20 Hours a Week Recovered for a 2-Person Law Firm

15-20 hrs
recovered per week
$85/mo
total tool cost
60-70%
reduction in admin time

A wills and trusts attorney reached out for a workflow audit. Her firm: one attorney, one admin, one paralegal. Small team, high client volume, every hour counted.

She had a sense things could be more efficient. She didn't have time to figure out how.

What I found

A 90-minute deep-dive session mapped the full client journey from first contact to signed documents. Five problem areas surfaced immediately.

Document drafting was taking 2 to 4 hours per client. Every will, every trust, written more or less from scratch. The attorney had templates, but they weren't connected to anything.

Client intake forms were being filled out by clients, then manually re-entered into the practice management system by the paralegal. Every client. Every time. The form and the system were already capable of connecting. Nobody had set it up. That process had been running for two years.

Follow-up emails and appointment reminders were handled manually. Asset inventory collection (gathering account details, beneficiary designations, and financial information from clients) arrived in a mix of emails, PDFs, and handwritten notes with no consistent structure. No systematic way to track document revisions or client communications.

Five fixable problems. One firm. Two years of accumulated friction.

What we built

The roadmap rebuilt the full workflow, from new client to signed documents, with five tool integrations at a total cost of $85/month.

Smart intake: Typeform with conditional logic replaced the paper forms. A Zapier connection automatically creates the contact, matter, and populated templates in the practice management system the moment a client submits. The paralegal's manual re-entry step disappears entirely.

AI document drafting: a custom prompt library for Claude generates first drafts of wills and trusts from the intake data. What was taking 2 to 4 hours per client drops to a review-and-edit workflow.

Automated client communication: trigger-based emails for appointment confirmations, document request reminders, signing instructions, and post-signing follow-ups. When a meeting is booked, a prep checklist goes out automatically. When a document is signed, the next steps email fires on its own.

Asset inventory portal: clients get a secure link and fill out their asset information at their own pace. Data populates a structured spreadsheet by asset category. No more chasing PDFs and handwritten notes.

Meeting capture: Fireflies joins client consultations, transcribes them, and generates a summary of the client's wishes and follow-up actions. Searchable records for compliance without anyone taking notes.

The implementation plan

Eight weeks. Foundation in weeks 1 and 2: intake system and practice management integration live and tested. Communication automation in weeks 3 and 4. Document automation in weeks 5 and 6. Full deployment, asset portal, and team training in weeks 7 and 8.

The actual point

The paralegal had been manually copying client data from one system to another for two years. Not because she wasn't capable of something better. Because nobody had stopped long enough to ask why it was happening at all.

That's the audit. Not finding broken tools. Finding the work that nobody questioned.

ENTERPRISE SOFTWARE / AGILE CONSULTING

Building a Continuous Improvement System a Team Could Actually Own

4 capability areas
assessed across 80+ data points
1 week onsite
embedded with the team
CI backlog + roadmap
built and owned by the team

The Amsterdam office of ServiceNow had a capable, enthusiastic team. Smart people who worked well together, communicated openly, and cared about what they were building. What they didn't have was clarity. No shared definition of what good looked like. No visibility into where work was getting stuck. No real process holding things together. A team inside one of the world's leading enterprise software companies was operating like a bootstrapped startup.

How I ran it

I flew to Amsterdam and spent a week embedded with the team during release planning. Not observing from a distance. In the room. In the meetings. Talking to the people doing the work, the people managing it, and the teams upstream and downstream who depended on it.

The assessment framework I used covered four capability areas, each scored across multiple inspection points.

People Capabilities asked whether the team had the basic setup to deliver complete, potentially releasable work each sprint. Were they dedicated? Cross-functional? Did they hold retrospectives? What were the role boundaries between the Product Owner, Scrum Master, and development team?

Collaboration Capabilities looked at how they planned, refined, and communicated. Were sprint goals customer-facing? Was velocity stable enough to forecast? Did the working agreements actually hold? Were stakeholders engaged or peripheral?

Responsive Capabilities measured how quickly the team could react to change. Could they release their own product to customers? Were release activities handled in-sprint or batched at the end of a cycle? How close was the team to real continuous deployment?

Delivery Capabilities assessed what they were actually shipping. Did work meet the Definition of Done? Were PBIs prioritized and independently valuable? Was quality built in during the sprint or bolted on after?

Every finding got scored on a 1 to 5 maturity scale. Every score had an observation behind it. Not impressions. Evidence.

What I found

The team was genuinely strong on the human side. People communicated. They trusted each other. They engaged. That's harder to build than any process, and they already had it.

The problems were structural.

They weren't fully cross-functional. UI design and certain release activities lived outside the team, which meant external dependencies could block delivery regardless of how well the team executed internally. The people who needed to be in backlog refinement weren't in it.

Retrospectives were inconsistently held. When they did happen, the items weren't documented or tracked. Which meant the same problems kept surfacing without getting fixed.

Velocity was unstable, which made planning unreliable, which eroded stakeholder trust. Sprints weren't resulting in potentially releasable increments because release activities were batched for the end of the cycle rather than built into each sprint.

The Definition of Done existed but wasn't consistently followed, which meant “done” sometimes meant done and sometimes meant done-ish.

None of these were character flaws. They were system gaps. The team was capable of more than their structure was allowing.

What we built

The assessment findings went into a facilitated stakeholder session. Before any recommendations, before any roadmap, we put the data in front of the people who had to act on it.

Dot voting. Everyone gets five points. Spend them on the areas you think matter most. You can concentrate your points on one thing or spread them across several. Count the votes, rank the priorities, have the conversation. The loudest person in the room doesn't set the agenda. The data does.

From the prioritized list, the team broke down each focus area into specific, actionable improvement items. Not “improve retrospectives.” Specific: hold a retrospective every sprint, document the items, assign owners, track them in the next sprint. Not “get more cross-functional.” Specific: include UI in backlog refinement starting next release cycle, define the acceptance criteria for when specialists need to be pulled in.

Those items became the continuous improvement backlog. Milestones got assigned. Owners got named. A tracking rubric mapped improvement progress by area across sprints.

The team built it. Not me. I facilitated the sessions and held the framework. They made the decisions.

Why that distinction matters

A CI process only works if the people doing the work care about it. A consultant can hand you a beautiful roadmap and you can ignore it the moment they leave. That happens more than anyone admits.

The only way to build something that sticks is to make the team the authors of it. They know what's actually hard. They know which recommendations are realistic and which ones sound good on paper. They know what they'll actually do.

My job wasn't to tell them what to fix. It was to show them clearly what was happening, give them a framework to prioritize, and get out of the way while they built the plan.

AI INFRASTRUCTURE / INTERNAL BUILD

Building an AI Operating System to Run a Solo Consulting Business

60-90%
token savings per session
21 nodes
outreach pipeline, runs without me
14 slash commands
structured business workflows

Every time I opened a new Claude session, I started from scratch. Who are my prospects? What did I promise PPG last week? What does my brand voice actually sound like? All of that context lived in my head, or in disconnected documents I had to dig up manually. The AI was smart, but it had no idea what business it was working in.

I needed Claude to know my business. Not just answer questions about it. Know it. So I built a system that makes that possible.

What the AIOS is

AIOS stands for AI Operating System. The name is aspirational in the way that most startup names are aspirational. What it actually is: a layered infrastructure that gives Claude Code persistent access to everything relevant about my business, then automates the parts that don't need me.

Five layers:

Infrastructure: Python virtual environment, .env for API keys, SQLite databases for metrics and search, and a CLAUDE.md file that is the single most important file in the system. It's the rules document Claude reads at the start of every session. Business description, three KPIs, workspace structure, critical rules, full command reference. The difference between starting a session cold and starting with a briefed colleague.
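For illustration, a stripped-down skeleton of what a rules file like that might contain. Every section name and entry here is invented for the example, not the actual file:

```markdown
# CLAUDE.md — read at the start of every session

## Business
Solo consulting practice: workflow audits and AI infrastructure builds.

## KPIs (the only three that matter)
1. Discovery calls booked per week
2. Proposals outstanding
3. Monthly recurring revenue

## Critical rules
- Never send outreach without explicit approval.
- All client files live under clients/ — do not create files elsewhere.

## Commands
/prime — load full business context before doing anything else.
```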

Data collection: Scheduled Python scripts pull data from GA4, LinkedIn, Apollo, Google Calendar, and FX rates into SQLite. A daily brief script synthesizes this into a morning report pushed to Discord every day.
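The shape of one such collector can be sketched in a few lines. This is a hedged sketch, not the actual AIOS scripts: the table schema, metric name, and the stubbed GA4 fetch are all illustrative, and the Discord push is left as a comment.

```python
# Sketch of one scheduled collector: fetch a metric, upsert it into SQLite,
# and assemble the daily brief. All names here are illustrative.
import sqlite3
from datetime import date

def fetch_ga4_sessions() -> int:
    # Placeholder for the real GA4 API call.
    return 42

def record_metric(conn: sqlite3.Connection, name: str, value: float) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metrics "
        "(day TEXT, name TEXT, value REAL, PRIMARY KEY (day, name))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO metrics VALUES (?, ?, ?)",
        (date.today().isoformat(), name, value),
    )
    conn.commit()

def daily_brief(conn: sqlite3.Connection) -> str:
    rows = conn.execute(
        "SELECT name, value FROM metrics WHERE day = ?",
        (date.today().isoformat(),),
    ).fetchall()
    return "Morning brief\n" + "\n".join(f"{n}: {v:g}" for n, v in rows)

conn = sqlite3.connect(":memory:")
record_metric(conn, "ga4_sessions", fetch_ga4_sessions())
brief = daily_brief(conn)  # in the real system, this text is pushed to Discord
```

The upsert keyed on (day, name) means a rerun on the same day overwrites rather than duplicates, which is what you want from a script on a schedule.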

Intel layer: Makes meeting transcripts, Gmail, and Slack searchable via a dedicated database. When I need to know what a client said three weeks ago, I search the intel layer instead of digging through email.
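The mechanism behind that search can be sketched with SQLite's built-in full-text index. Table and column names below are invented for the example; the real intel layer's schema may differ.

```python
# Sketch of the intel layer: an FTS5 virtual table over transcript and
# email snippets, queried by keyword and ranked by relevance.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE intel USING fts5(source, body)")
conn.executemany(
    "INSERT INTO intel VALUES (?, ?)",
    [
        ("meeting-2024-05-02", "Client asked to move the signing date to June."),
        ("gmail", "Invoice for the April retainer attached."),
    ],
)

def search(query: str):
    return conn.execute(
        "SELECT source, body FROM intel WHERE intel MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()

hits = search("signing")
```

One keyword query against the index replaces scrolling through weeks of email.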

Orchestration: n8n handles multi-step automation that runs without me.

Interface: Claude Code for desktop sessions. A Discord bot called CommandOS for mobile access. Both connect to the same underlying business context.

How Claude Code runs the business

Every session starts with /prime. It loads full business context: current prospect status, open proposals, active client notes, today's metrics, GTD next actions. After running it, Claude knows who the active clients are, what the outstanding proposals say, what the brand voice rules are, and what the three KPIs are. The AI isn't a general assistant anymore. It's a briefed colleague who was there for the last conversation.

14 slash commands, each triggering a structured workflow, including /prime (load context), /outreach (draft personalized outreach from prospect data), /process (GTD inbox to zero), /review (weekly review), /todo, /brainstorm, /task-audit, /create-plan, /implement, /explore, /install, and /commit.

These aren't just prompts. Each one is a full SKILL.md file that Claude Code reads and executes. /create-plan generates a structured plan file, asks clarifying questions, then calls ExitPlanMode to get approval before touching anything. /commit follows a specific git safety protocol.

The automation layer

The outreach pipeline is where the AIOS does its most independent work.

Stack: Apollo (prospect data) + Apify (LinkedIn post scraping) + Claude API (copy generation) + Instantly (email sequencing) + n8n (orchestration) + Discord (notifications).

The flow: manual trigger in n8n pulls a batch of prospects from Apollo, enriches with recent LinkedIn posts via Apify, loops through each prospect, calls Claude to generate personalized email copy and connection request text, adds the lead to Instantly with the generated copy, updates Apollo with the new stage. On errors: Discord error notification, loop back. When the batch is done: build summary, post to Discord.
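The control flow above can be sketched as plain Python. Every function here stands in for an n8n node; the Apollo, Apify, Claude, and Instantly calls are stubbed, and the names are illustrative, not the actual workflow.

```python
# Sketch of the batch loop the n8n workflow implements: pull, enrich,
# generate, sequence, with a per-lead error branch and a final summary.
def pull_prospects(batch_size):          # Apollo node: prospect batch
    return [{"name": "Alice"}, {"name": "Bob"}, {"name": "Eve"}][:batch_size]

def enrich_with_posts(prospect):         # Apify node: recent LinkedIn posts
    prospect["posts"] = ["..."]

def generate_copy(prospect):             # Claude node: personalized copy
    return f"Hi {prospect['name']}, saw your recent post..."

def add_to_instantly(prospect, copy):    # Instantly node: may fail per lead
    if prospect["name"] == "Eve":
        raise RuntimeError("rejected lead")

notifications = []                       # Discord node stand-in
succeeded = failed = 0
for prospect in pull_prospects(batch_size=3):
    try:
        enrich_with_posts(prospect)
        copy = generate_copy(prospect)
        add_to_instantly(prospect, copy)
        succeeded += 1                   # would also update the Apollo stage
    except Exception:
        notifications.append(f"Error on {prospect['name']}")
        failed += 1

summary = f"Processed: {succeeded + failed} | Succeeded: {succeeded} | Failed: {failed}"
notifications.append(summary)
```

The point of the structure: one bad lead trips the error branch and a Discord ping, but the loop keeps going, and the batch still closes with a summary.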

21 nodes. Runs without me. Fires a Discord message at the end: “Processed: 12 | Succeeded: 11 | Failed: 1.”

The copy system prompt went through at least six iterations. Each time, I ran a batch, evaluated the output, described what was off, and Claude rewrote the relevant section. That kind of rapid iteration on a live system is exactly what Claude Code is built for.

Keeping the context window clean

Claude Code has a context window. Everything you send it goes into that window. A single git diff or API response could eat 56 KB of context. One bad command and the session gets sluggish or loses track of earlier instructions. So I built two systems to fix it.

RTK (Rust Token Killer) sits in front of every shell command. Instead of running git status, I run rtk git status. RTK intercepts the output and strips the noise before it hits the context window. Git diffs get 80% smaller. Test output drops by 90-99%. Only failures come through. It works because most command output is structural noise. You don't need 400 lines telling you which tests passed. You need the 3 lines telling you which ones failed.
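The core idea can be sketched in a few lines of Python. This is a minimal illustration of the filtering principle, not RTK itself (the real tool is a Rust binary), and the failure markers are assumptions for the example.

```python
# Sketch of output filtering: run the command, keep only lines that signal
# failure, and hand the model the filtered result instead of the raw dump.
import subprocess
import sys

FAIL_MARKERS = ("FAILED", "ERROR", "error:")

def run_filtered(cmd):
    out = subprocess.run(cmd, capture_output=True, text=True).stdout
    kept = [line for line in out.splitlines()
            if any(marker in line for marker in FAIL_MARKERS)]
    return "\n".join(kept) if kept else "(no failures; output suppressed)"

# Demo: five passing-test lines collapse to the single failing one.
noisy = [f"print('PASSED test_{i}')" for i in range(5)]
noisy.append("print('FAILED test_auth')")
result = run_filtered([sys.executable, "-c", "; ".join(noisy)])
```

Six lines of output enter, one line reaches the context window.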

context-mode routes large outputs to a sandboxed subprocess with a full-text search index. Raw file contents, API responses, web pages, log files. They get indexed in the sandbox. Only my printed summary enters context. One call replaces what would otherwise be 30+ individual reads flooding the window.

Between the two systems, typical token savings run 60-90% per session. That translates directly to longer, more coherent sessions where Claude doesn't lose the thread halfway through a complex build.

What Claude Code actually makes possible

The honest answer: this system wouldn't exist without Claude Code. Not because the individual pieces are impossible to build, but because the iteration speed would have required a team.

I'm an Agile coach by background, not a developer. Claude Code let me build a Python async Discord bot, a multi-node n8n automation, a SQLite data pipeline, and a slash command system in the same four months I was building the consulting business. The 30+ Python scripts running on a schedule were written with Claude Code. The n8n workflows were designed in Claude Code sessions, exported as JSON, debugged node by node. Audit reports, SOWs, one-pagers — all generated and grounded in the same business context that lives in memory across sessions.

The actual point

The AIOS is both a product I'm selling and the system I'm using to sell it. When I show a prospective client what a workflow audit looks like, I'm running a workflow audit on my own business in real time. The PPG case study came out of an audit that ran through this system. The pipeline generates the outreach emails going to prospects right now.

The consulting business is the AIOS being used in public.

Ready to find out what's hiding in your workflows?

Every engagement starts with a diagnosis. No guessing, no generic recommendations.

Book a Discovery Call