Where are you based, and where do you work?
I’m based in the Lower Mainland, British Columbia, Canada. Most
client work happens in person within an hour or so of home —
Surrey, Vancouver, Burnaby, Richmond, Delta, Langley, New Westminster,
and Coquitlam — with remote engagements running anywhere in the
world for organizations that just need the engineering brain on the
problem.
How do I get started?
Start with the free AI readiness audit: a
no-cost 60-minute review and a short written report on where private
AI (and tighter IT generally) would actually save you money. The report covers real
CAPEX-vs-subscription math for your situation, what’s safe to
keep in-house vs. in the cloud given your data, and one quick win you can act
on immediately. Yours to keep, no obligation.
What’s your pricing?
The AI readiness audit costs nothing. The
Private AI Starter pilot is a fixed-scope project
quoted after a brief scoping conversation — one GPU server, one
or two use cases, IdP-backed login, handover runbook. The
Managed IT + AI Concierge retainer is a monthly fee
sized to your stack. Bespoke work is hourly or fixed-bid by scope.
Whatever the shape, the number is the number: no surprise
line items.
Will my data leave my network?
For private AI engagements — no. That’s
the entire point of the offering. Models run on hardware you own,
prompts and uploaded files stay on your LAN, and you keep the audit
log. I configure the stack, hand over runbooks, and operate only what
we’ve explicitly contracted.
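One way to sanity-check the "stays on your LAN" claim is to confirm that every model endpoint an app is configured to call is a private or loopback address. The sketch below is a minimal illustration, not the actual tooling used on engagements: the function name `is_lan_endpoint` and the example URLs are hypothetical, and it only inspects IP-literal hosts (it does not resolve DNS names, so a hostname is reported as unverified).

```python
import ipaddress
from urllib.parse import urlparse


def is_lan_endpoint(url):
    """Return True only if the URL's host is a private or loopback IP literal.

    Hostnames are NOT resolved here; they return False ("not verified"),
    which is an assumption made to keep the check free of network calls.
    """
    host = urlparse(url).hostname
    if host is None:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal (e.g. a DNS name): treat as unverified.
        return False
    return ip.is_private or ip.is_loopback


# Hypothetical endpoints, for illustration only.
for endpoint in ("http://192.168.1.20:11434", "https://8.8.8.8/v1"):
    print(endpoint, "->", "LAN" if is_lan_endpoint(endpoint) else "not verified as LAN")
```

A check like this covers configuration only; firewall rules and the audit log remain the real evidence of what left the network.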
How does on-prem AI compare to per-seat AI subscriptions?
For moderate-to-heavy team usage, a single capable GPU server
amortized over 2–3 years often beats per-seat AI subscriptions
on a 36-month total cost basis. The exact crossover depends on your seat
count and token volume. You also get data sovereignty, customization (RAG over your
own docs, custom prompts and tools), full auditability, and no
forced model deprecation. Trade-offs I’m transparent about:
hardware capital cost upfront, ops responsibility (which is what I
provide as a service), and slightly more friction to upgrade to the
latest frontier model. For most business tasks, mid-range open-weight
models are sufficient.
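The crossover can be sketched as simple arithmetic: on-prem cost is an upfront hardware figure plus a monthly running cost, and subscription cost is seats times price times months. The function names and every number below are placeholders chosen for illustration, not quotes or measured figures, and the model deliberately ignores usage-based API pricing and hardware resale value.

```python
import math


def on_prem_total(months, hardware_cost, monthly_ops_cost):
    """Cumulative on-prem spend: upfront hardware plus monthly running costs."""
    return hardware_cost + monthly_ops_cost * months


def subscription_total(months, seats, price_per_seat_month):
    """Cumulative per-seat subscription spend."""
    return seats * price_per_seat_month * months


def breakeven_month(hardware_cost, monthly_ops_cost, seats, price_per_seat_month):
    """First whole month where on-prem is no more expensive, or None if never."""
    monthly_saving = seats * price_per_seat_month - monthly_ops_cost
    if monthly_saving <= 0:
        return None  # subscriptions stay cheaper at this seat count
    return math.ceil(hardware_cost / monthly_saving)


# Placeholder inputs (assumptions, not real pricing).
HARDWARE, OPS, SEATS, SEAT_PRICE = 12_000, 200, 25, 30

print(breakeven_month(HARDWARE, OPS, SEATS, SEAT_PRICE))   # 22 with these inputs
print(on_prem_total(36, HARDWARE, OPS))                    # 19200
print(subscription_total(36, SEATS, SEAT_PRICE))           # 27000
```

With a small team the monthly saving shrinks or goes negative and the function returns `None`, which is the "moderate-to-heavy usage" caveat expressed in code.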
Which open-weight models do you typically deploy?
Mostly 7B–14B parameter models from the Llama, Qwen, and
Mistral families. They are sufficient for most business tasks and
run comfortably on a single mid-range GPU. For workflows that
genuinely need frontier capability, I’ll wire in an API model
(Claude, GPT-4-class) with explicit guardrails so you know exactly
what leaves the network and when.
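To see why this size range fits on one card, a back-of-envelope estimate of weight memory helps: parameters times bits per weight, divided by eight. The precision levels below (16-, 8-, and 4-bit) are assumptions for illustration, since the right quantization depends on the model and workload, and the figure covers weights only. KV cache and activations add to it in practice.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes).

    params_billions * 1e9 params * (bits / 8) bytes each, divided by 1e9,
    simplifies to params_billions * bits / 8.
    """
    return params_billions * bits_per_weight / 8


for size in (7, 14):
    for bits in (16, 8, 4):
        print(f"{size}B at {bits}-bit: ~{weight_memory_gb(size, bits):.1f} GB of weights")
```

The estimate is a floor, not a sizing recommendation: leave headroom for context length and concurrent users.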
Do you handle Microsoft 365 / Entra / Active Directory work?
Yes — deeply. My day job is Windows Server + SharePoint + M365
administration. Strong on Entra ID, conditional access, MFA rollouts,
AD cleanup, hybrid identity, and Tier-0 separation for admin accounts.
If your shop is mid-migration from on-prem AD to cloud-only Entra (or
living comfortably in hybrid), I’ve been there.
Do you support Mac, Windows, and Linux?
Yes to all three. My daily driver is a MacBook Pro M2, and I have deep
experience with Windows Server and Microsoft 365 (day job) and with Linux
(home lab, most consulting work, every client GPU server I’ve
racked).
Can I see what you’ve built?
The clearest demos are the two AI tools running on this same
infrastructure: LLM Chat (private on-prem Ollama,
nothing leaves the LAN) and Tower (browser SSH +
AI assistant for ops work). Both are tools I use every day — not
marketing demos. Client work itself is covered by NDA, but I can talk
through patterns and architecture in an audit.
Are you taking new engagements?
Yes — availability is signalled by the
Available for new engagements badge on the
home page. Audits are always open even when project
capacity is tight.
What stacks do you work in?
PHP, Python, JavaScript/TypeScript, Bash, PowerShell, and a fair bit
of SQL on the code side. nginx, Apache, systemd, Docker, and the
usual Linux sysadmin tooling on the infrastructure side —
whatever the job calls for.