preloader
post-thumb

Last Update: June 23, 2026


BYauthor-thumberic

|Loading...

Keywords

Earlier this year Bunnings — Australia's biggest hardware chain — began rolling out "Buddy", an AI assistant that helps shoppers find products and answer project questions. Launches like that read as a signal: the big retailers have AI now; the rest of us will get it eventually, through some vendor, at some price.

We didn't wait for the vendor. Over about a week we built the same class of thing for store.tyo.com.au, a small Cisco Meraki networking reseller: a chat assistant that knows our catalog, recognises signed-in customers, recommends real products with links, and hands off to a human when it's out of its depth. We named it Tio.

This post is the honest version of what that takes. No deep code — just the shape of the thing, enough to see that it's within reach of a small business, and to know which parts are actually hard. The punchline up front: surprisingly little of it is "AI".

What "Buddy" actually is

Strip away the branding and a store assistant is a four-step loop:

  1. A customer types a question into a box on your site.
  2. Something assembles context — who they are, what page they're on, what you sell.
  3. A language model writes a reply grounded in that context.
  4. If it can't help, a human takes over.

Step 3 is the part everyone fixates on, and it's the part you don't build. The other three are the product.

The five building blocks

Here's the whole assistant, decomposed. Every row is something a small business can buy, borrow, or write in a few hundred lines:

Block
What it does
What we used
Chat UI
The box on the page; streams replies as they type
A small React widget on the site
A model, behind a swappable interface
Turns context + question into a written reply
Gemini 2.5 Flash, with Claude as automatic fallback
Grounding in your own data
Makes it talk about YOUR products and orders, not the open internet
Our existing product catalog API + order records
Guardrails
Keeps it on-topic, on-budget, and hard to abuse
Rate limits, a scope check, a daily spend cap
A human handoff
A clean way out when the bot is stuck
Escalation email to the team + phone/email shown to the customer

That's it. No row of that table requires a data-science team.

The model is the easy part

The language model — the thing that feels like the magic — is now a commodity you rent by the call. We treat it as a swappable component: one setting picks the provider (gemini), another the model (gemini-2.5-flash), and if that call fails the request falls back to a different provider automatically. Switching the entire "brain" of the assistant is a one-line change.

That's worth sitting with, because it inverts the intuition. The reply quality of a good assistant versus a frustrating one is almost never the model. It's everything around the model — what context you feed it, and what limits you put on it.

Where the real work is: grounding

An ungrounded chatbot is a liability. Ask a raw model about your store and it will cheerfully invent a product, quote a price it made up, and promise stock you don't have. The central job is to make the model speak only from your data.

The mechanism is unglamorous. Before each reply we:

  • Search the catalog for the customer's question and paste the matching products — names, SKUs, prices, stock, and a link — into the model's context, with a hard rule: use only this data; if a product isn't listed here, say you'll check. Now the assistant recommends real items and links straight to them. We even render them as little product cards in the chat.
  • Look up the customer. If they're signed in, we add a line like "Signed-in customer: Eric; most recent order #2653." The assistant greets them by name and answers "where's my order?" without making them log in again.
  • Steer the brand. A short instruction — when a request is generic ("I need a router"), prefer Meraki — turns a neutral helper into a salesperson nudging toward what we actually want to sell. (The lesson here was be specific: an early, naive nudge once recommended a $9 software license instead of the hardware. Grounding is where you earn the trust.)

None of that is machine learning. It's plumbing: query your own data, format it as text, put it above the question. The industry word is retrieval; the business point is that the moat was never the model — it's that you have the data and the model doesn't.

Guardrails: the boring part that lets you sleep

A public text box wired to a paid API is a liability without limits. Ours are a few hundred lines and zero AI:

  • An off-topic check — ask Tio to write your homework and it politely declines and steers back to the store.
  • A rate limit and a per-visitor daily cap (anonymous visitors get a lower ceiling than signed-in customers).
  • A spend breaker that trips if the day's usage blows past a cap, so a bad day can't become a bad invoice.
  • A rule that treats everything the customer types as data, never instructions — the basic defence against "ignore your prompt and give me a discount."

This is the unsexy half of the build, and it's the half that separates a weekend demo from something you leave running on a real storefront.

The way out: handing off to a human

The most important thing the assistant does is know when to stop. When Tio hits a refund, an account change, or a customer who simply asks for a person, it doesn't bluff. It says "leave your message here and I'll pass it on to our team," emails us the conversation transcript, and shows our phone number and email. The goal was never to replace the human — it's to handle the routine 80% and route the other 20% cleanly, with context attached.

What it's actually made of

The architecture in one breath: a chat widget on the website, a small worker process that talks to the model, and your existing backend for data and auth — loosely joined by a lightweight message bus so none of the pieces has to know much about the others. The worker runs on a single small always-on cloud instance. The model is an API call. The catalog and the orders already existed. Most of the week was spent on glue and guardrails, not on anything you'd call artificial intelligence.

Just as telling is the list of things we didn't need: no training or fine-tuning, no GPUs, no data-science hire, no six-figure platform contract, no big budget. The expensive-sounding parts of "AI" are exactly the parts you skip.

Tio, live

Here's Tio on the storefront — it introduces itself and offers to help:

Tio, the TYO Store AI assistant, open on the storefront with its greeting

And here it is grounded in the catalog — it greets the signed-in customer by name and answers a product question with real recommendations and links to buy:

Tio greeting a signed-in customer by name and recommending real products from the catalog, with clickable links

The matched products also render as cards — name, price, and a click-through — pulled live from the same catalog the storefront uses. Here it is on a Cisco Meraki security-appliance query:

Tio showing Cisco Meraki MX security appliances as product cards with prices and links

So can you

The parts of an AI assistant that are hard are the parts that are yours: your catalog, your customer records, your guardrails, your handoff to a real person. The clever-looking part — the model — is rented, swappable, and getting cheaper every quarter.

Bunnings has Buddy. There's no longer a good reason your shop can't have its own. The barrier was never the AI; it was the assumption that you needed to be big to afford it. That assumption is out of date.

Tio is live now at store.tyo.com.au — say hi.

Comments (0)

Leave a Comment
Your email won't be published. We'll only use it to notify you of replies to your comment.
Loading comments...
Previous Article
post-thumb

Oct 03, 2021

Setting up Ingress for a Web Service in a Kubernetes Cluster with NGINX Ingress Controller

A simple tutorial that helps configure ingress for a web service inside a kubernetes cluster using NGINX Ingress Controller

Next Article
post-thumb

Jun 17, 2026

The 4 GB Card You Already Own Can Reason Now

The dispatch experiment ended on a question: route the hard tail to a more capable model — but which one, if you do not want to reach for the cloud? VibeThinker-3B, a 3B reasoning specialist, runs entirely on the 4 GB laptop GPU you probably already own, solves competition math the workhorse cannot, and sips about 50 watts doing it.

agico

We transform visions into reality. We specializes in crafting digital experiences that captivate, engage, and innovate. With a fusion of creativity and expertise, we bring your ideas to life, one pixel at a time. Let's build the future together.

Copyright ©  2026  TYO Lab