Building your first invoice-chase agent

A real, ugly walk-through of an agent that nudges late invoices without sounding like a robot.

Every business has the same leak. You send the invoice. The customer is good for it. Forty-seven days later you’re writing the third polite-but-firm follow-up while your bookkeeper is on holiday. Multiply that by sixty open invoices and you’re running a part-time collections department out of your inbox.

This is the easiest agent to build first. The data is already in your accounting tool. The rules are obvious. The downside of getting it wrong is small (worst case: an awkward email a human would’ve sent anyway). And the upside is real money you already earned, sitting in someone else’s account.

Here’s the version we keep building. Not pretty. Works.

What it actually does

Once every working day, the agent asks your accounting system for every invoice that’s past due, looks at how long ago you last contacted that customer, and decides whether to send a follow-up and what tone to use.

Day 3 past due: a soft nudge. “Just flagging this in case it slipped.”
Day 14 past due: a clearer ask. References the invoice number, the amount, the due date.
Day 30 past due: a firm message that mentions consequences without threats. Pings you in Slack.
Day 45+ past due: stops. Writes a hand-off note, drops it in your queue. A human takes it from there.

That’s the whole behavior. It’s not magic. The reason it works is that the agent doesn’t forget, doesn’t skip days, doesn’t accidentally double-send, and doesn’t write the same robotic copy every time.

The four pieces

We use the same four pieces for every operational agent we build. This is the smallest version that survives contact with reality.

1. A read of the world

A scheduled job pulls open invoices, customer info, last-contact timestamps, and any recent payments. We don’t use webhooks here. Cron is fine. The agent doesn’t need to be real-time; nobody’s waiting for a 9:03 AM nudge instead of a 9:00 AM one.

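// Every weekday at 09:00. The cron expression carries no timezone, so set
// IL time in the scheduler's own config. Days 1-5 mean Monday to Friday;
// use 0-4 if your work week runs Sunday to Thursday.
schedule('0 9 * * 1-5', async () => {
  // Only overdue invoices come back; the rules layer decides what happens to each.
  const invoices = await accounting.listOverdueInvoices();
  for (const invoice of invoices) {
    await maybeChase(invoice);
  }
});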

2. A rules layer (not the LLM)

This is the bit founders skip and regret. The decision of whether to send a message is not an LLM call. It’s a deterministic rule: days past due, last contact, payment status, do-not-contact flag. If you let the model decide whether to message someone, eventually it will message someone twice on the same day, or three times in a week, or someone who paid yesterday but whose payment hadn’t synced yet.

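// Pure decision function: returns 'soft', 'clear', 'firm', 'handoff', or null for "do nothing".
function shouldChase(invoice, lastContact, today) {
  // doNotContact is assumed to be copied from the customer record onto the invoice.
  if (invoice.doNotContact) return null;
  if (invoice.paidAt) return null;
  // Assumed field, set when the hand-off note is written. Without a guard
  // like this, 'handoff' would come back every day from day 45 onward.
  if (invoice.handedOffAt) return null;
  const daysOverdue = daysBetween(invoice.dueDate, today);
  // A customer who was never contacted passes every "days since contact" check.
  const daysSinceContact = lastContact
    ? daysBetween(lastContact, today)
    : Infinity;

  // Most severe tier first; the first matching tier wins.
  if (daysOverdue >= 45) return 'handoff';
  if (daysOverdue >= 30 && daysSinceContact >= 7) return 'firm';
  if (daysOverdue >= 14 && daysSinceContact >= 7) return 'clear';
  if (daysOverdue >= 3 && daysSinceContact >= 14) return 'soft';
  return null;
}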

Boring code. That’s the point. The agent calls this function and gets back one of four words or null. Most days it gets null. That’s good. A useful invoice-chase agent does nothing on most days for most invoices.
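
The rules lean on a daysBetween helper that the snippet never defines. Here’s a minimal sketch, assuming both arguments are Date objects or ISO date strings and that counting whole calendar days in UTC is good enough for you. If your accounting tool reports dates in local time, you may want a different cut-off.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole calendar days from `from` to `to`, compared at UTC midnight so the
// time of day never changes the count. Negative when `to` is before `from`.
function daysBetween(from, to) {
  const a = new Date(from);
  const b = new Date(to);
  const startA = Date.UTC(a.getUTCFullYear(), a.getUTCMonth(), a.getUTCDate());
  const startB = Date.UTC(b.getUTCFullYear(), b.getUTCMonth(), b.getUTCDate());
  return Math.round((startB - startA) / MS_PER_DAY);
}
```

An invoice that isn’t due yet gets a negative number here, fails every >= check in the rules, and falls through to null.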

3. The model — for the words, only

Now the LLM gets to do what it’s good at: write a short, in-tone, non-robotic message that mentions the invoice number, the amount, the due date, and the customer’s name. We give it the tone (“soft” / “clear” / “firm”) and the company’s voice rules. We do not give it the decision power.

// Assumes `claude` is an Anthropic SDK client. The Messages API requires max_tokens.
const draft = await claude.messages.create({
  model: 'claude-haiku-4-5',
  max_tokens: 300, // plenty of room for an 80-word email
  system: COMPANY_VOICE,
  messages: [{
    role: 'user',
    content: `Write a follow-up email at the "${tone}" level for invoice
    ${invoice.number}, amount ${invoice.amount}, due ${invoice.dueDate}.
    Customer: ${customer.name}. Their last reply: "${customer.lastReply ?? 'n/a'}".
    Constraints: under 80 words, mention the invoice number once,
    no threats, no exclamation marks, no "kindly".`,
  }],
});

We log the draft. Always. If something goes weird, we want to know what the model was about to send.
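
One way to do that logging, sketched under assumptions. The Messages API returns the reply as an array of content blocks, so the text has to be pulled out before it can be stored. The names extractDraftText and buildDraftLog, and the log fields, are hypothetical; write the record to whatever store you already use.

```javascript
// Join the text blocks of a Messages API response into one string.
function extractDraftText(response) {
  return response.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('')
    .trim();
}

// Build the record to persist before anything is sent.
// Field names are illustrative, not a fixed schema.
function buildDraftLog(invoice, tone, response, now = new Date()) {
  return {
    invoiceNumber: invoice.number,
    tone,
    model: response.model,
    draft: extractDraftText(response),
    loggedAt: now.toISOString(),
    sent: false, // flip this only after the pre-send check passes
  };
}
```

Logging before sending, with sent set to false, means a crash between the two steps still leaves a trace of what the model wrote.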

4. A guardrail before sending

Before the email goes out, a final deterministic check confirms: amount is correct, customer email is current, do-not-contact flag is still false, no payment came in between the rules check and now. The check takes one database read. It’s the cheapest insurance you can buy and it’s saved us from sending a chase email ten minutes after a customer paid.
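
A rough sketch of that check, assuming the single database read comes back as a flat row shaped like { amount, email, paidAt, doNotContact }. The function name, the row shape, and the reason strings are all made up for illustration.

```javascript
// `snapshot` is the data the draft was written from.
// `fresh` is one re-read taken just before sending.
// Returns { ok: true } or { ok: false, reason } so the reason can be logged.
function preSendCheck(snapshot, fresh) {
  if (fresh.doNotContact) return { ok: false, reason: 'do-not-contact' };
  if (fresh.paidAt) return { ok: false, reason: 'paid-since-rules-check' };
  if (fresh.amount !== snapshot.amount) {
    return { ok: false, reason: 'amount-changed' };
  }
  if (!fresh.email || fresh.email !== snapshot.email) {
    return { ok: false, reason: 'email-changed' };
  }
  return { ok: true };
}
```

Any failure drops the draft instead of fixing it up. The next scheduled run will re-evaluate the invoice from clean data.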

The wrinkles you hit on day three

Every agent we’ve shipped has the same set of wrinkles in week one. Plan for them in advance.

Partial payments

A customer paid half. Your accounting system might or might not track this cleanly. Treat any invoice with a non-zero payment in the last 14 days as off-limits until a human looks at it. The cost of being wrong here is “customer paid you something and you yelled at them.” Don’t.
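
As a sketch, the hold can be one predicate that runs before the rules layer. It assumes your accounting tool can give you a list of payments per invoice shaped like { amount, paidAt }, which may not match what yours returns.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when any non-zero payment landed within the last `windowDays` days.
// The `payments` shape ({ amount, paidAt }) is an assumption.
function hasRecentPartialPayment(payments, today, windowDays = 14) {
  const cutoff = new Date(today).getTime() - windowDays * MS_PER_DAY;
  return payments.some(
    (p) => p.amount > 0 && new Date(p.paidAt).getTime() >= cutoff
  );
}
```

When it returns true, skip the invoice and put it in the human queue rather than drafting anything.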

Sync delay

Your accounting system polls the bank. The bank takes a day. The customer paid via bank transfer Tuesday morning; your system still thinks they’re late on Tuesday evening. Add a 48-hour grace window beyond the technical due date for any invoice whose payment method isn’t real-time.
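
One way to apply the grace window is to shift the due date before the rules ever see it. This is a sketch: the paymentMethod field and the list of real-time methods are assumptions, so fill them in from what your accounting tool actually reports.

```javascript
// Methods assumed to settle instantly. Hypothetical values.
const REAL_TIME_METHODS = new Set(['card', 'instant_transfer']);
const GRACE_MS = 48 * 60 * 60 * 1000;

// Due date used for chasing. Anything not known to be real-time,
// including a missing method, gets the 48-hour grace window.
function effectiveDueDate(invoice) {
  const due = new Date(invoice.dueDate);
  if (REAL_TIME_METHODS.has(invoice.paymentMethod)) return due;
  return new Date(due.getTime() + GRACE_MS);
}
```

Pass the result to shouldChase in place of invoice.dueDate and the tiers stay untouched.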

Retainers and recurring

If the same customer has a recurring invoice and a one-off project invoice, you do not want to fire two messages on the same day. Group by customer before generating any drafts. One human-feeling message per customer per day, max.
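
A sketch of the grouping step. It assumes the rules layer has already produced a list of { customerId, invoice, tone } for the message tones. Using the strongest tone for the combined message is our guess at a sensible default, not a rule from the steps above.

```javascript
const TONE_RANK = { soft: 1, clear: 2, firm: 3 };

// Collapse per-invoice actions into one entry per customer:
// all of that customer's invoices plus the strongest tone among them.
function groupByCustomer(actions) {
  const byCustomer = new Map();
  for (const { customerId, invoice, tone } of actions) {
    const entry =
      byCustomer.get(customerId) ?? { customerId, invoices: [], tone };
    entry.invoices.push(invoice);
    if (TONE_RANK[tone] > TONE_RANK[entry.tone]) entry.tone = tone;
    byCustomer.set(customerId, entry);
  }
  return [...byCustomer.values()];
}
```

Each entry then becomes one draft request that lists every open invoice for that customer.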

The do-not-contact flag

Some customers are mid-renegotiation. Some are fired. Some are on hold pending a legal thing. You need a manual override that says “leave this customer alone until I say so.” This is a column in your customer table. The agent reads it. That’s the whole feature.

Where this fails without us

We’ve seen people try to build this with a single LLM call and a prompt that says “please follow up on overdue invoices appropriately.” It doesn’t work. Not because the model isn’t smart enough — because the system around it has no memory of what was sent yesterday, no idempotency, no logging of why it made a decision, and no way to roll back when it does something stupid.
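
Here’s roughly what the memory and idempotency part can look like. This is a sketch with an in-memory Map standing in for a database table with a unique constraint on the key; sendKey and SendLog are hypothetical names.

```javascript
// One key per invoice, tone, and calendar day (UTC).
function sendKey(invoiceNumber, tone, date) {
  const day = new Date(date).toISOString().slice(0, 10); // YYYY-MM-DD
  return `${invoiceNumber}:${tone}:${day}`;
}

// In-memory stand-in for a table with a unique constraint on `key`.
class SendLog {
  constructor() {
    this.entries = new Map();
  }

  // Returns true and records the reason the first time a key is claimed.
  // Returns false on any repeat, so a retried job cannot double-send.
  claim(key, reason) {
    if (this.entries.has(key)) return false;
    this.entries.set(key, { reason, claimedAt: new Date().toISOString() });
    return true;
  }
}
```

Claim the key before sending, and send only if claim returns true. The stored reason is what lets you answer “why did it email them?” a week later.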

The model is one of four pieces. The other three are what make this an agent and not a chatbot.

Three days, not three months

The first version of this agent should ship in three working days, not three months. Day one: read your invoices, log what it would have done. No sending. Day two: send only the “soft” tone, only to a hand-picked test cohort of three customers. Day three: turn on the rest, watch logs hard for a week.
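
The three days map neatly onto a send gate. A minimal sketch, assuming you track the current phase as a number and keep the test cohort as a Set of customer ids; ROLLOUT and maySend are illustrative names.

```javascript
// Phase 1: dry run. Phase 2: soft tone, test cohort only. Phase 3: everything.
const ROLLOUT = {
  1: { send: false, tones: [], cohortOnly: false },
  2: { send: true, tones: ['soft'], cohortOnly: true },
  3: { send: true, tones: ['soft', 'clear', 'firm'], cohortOnly: false },
};

// Decides whether a drafted message may actually go out in this phase.
// Unknown phases are treated as "do not send".
function maySend(phase, tone, customerId, testCohort) {
  const cfg = ROLLOUT[phase];
  if (!cfg || !cfg.send) return false;
  if (!cfg.tones.includes(tone)) return false;
  if (cfg.cohortOnly && !testCohort.has(customerId)) return false;
  return true;
}
```

When maySend returns false, still log the draft and the decision. On day one that log is the whole product.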

Then it runs. Quietly. Forever. And one day you notice your DSO (days sales outstanding) dropped by twelve days and your bookkeeper is back to doing actual books.

If you want us to build it

This is exactly what our $99/mo-per-agent service line does. We do the data plumbing, the rules layer, the model integration, the guardrails, the logging. You point us at your accounting tool, give us a sample of three invoices and three of your real follow-up emails (so the agent matches your voice), and the thing is live in a week.

Or build it yourself with the four pieces above. Either way, stop chasing invoices manually. It’s 2026.