Best Practices July 13, 2026

BCP for Insurance Agencies: When Your Office, Internet, or Carrier Goes Down

David Castillo

Operations Manager

Most agency business continuity plans (BCPs) defend against the wrong failure modes. The PDF in your compliance binder talks about hurricanes and pandemics, while the outages your operation actually experiences look like this: a single carrier portal down for 6 hours during the Annual Enrollment Period (AEP), a regional ISP outage that takes out your office for half a day, a vendor that suddenly enforces an MFA change you can't roll out in time, or a payroll system migration that locks 40 agents out of their dialer profiles overnight. The 2026 BCP framework starts from what actually breaks, not what makes for compelling tabletop exercises.

BCP Targets Worth Defending

≤ 2 hrs: RTO target, phones working again
≤ 24 hrs: RPO target, data recovery point
3 / yr: Tabletop exercise minimum
$30K+: Lost revenue per AEP day for a 50-agent op

The Failures That Actually Hit Insurance Agencies

A useful BCP starts with a reality-grounded threat list. After running operations through several years of incidents and reading NAIC's business-continuity guidance and the FFIEC BCP handbook (which insurance regulators frequently echo), the failure-mode list every agency should plan against looks like this:

Real-World Failure Modes, by Frequency

Failure	Typical Frequency	Typical Duration
Carrier portal down	2–6× / year	1–8 hours
ISP / office internet	2–4× / year	30 min – 12 hours
Lead vendor delivery glitch	1–3× / year	Hours – days
Telephony / dialer outage	1–2× / year	15 min – several hours
Regional weather event	~1× / year	Hours – days
Cybersecurity incident	1× / 2–3 yrs	Days – weeks
Office uninhabitable	< 1× / decade	Days – months

Most binders are built for the bottom of this list. The agencies that hold their numbers across a year are the ones that have plans for the top — the boring, frequent, recoverable failures.
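To make the point concrete, the frequency and duration ranges in the table can be multiplied into a rough hours-per-year exposure. The sketch below is a minimal illustration in Python, not a forecast: it covers only the two rows whose durations are stated as numeric hour ranges, it pairs the low ends and the high ends to get simple bounds, and the names `FAILURE_MODES` and `annual_exposure_hours` are made up for this example.

```python
# Rough annual exposure per failure mode: (events/year) x (hours/event).
# Ranges are copied from the table above; rows without numeric hour
# bounds ("Hours - days", "several hours") are deliberately left out.
FAILURE_MODES = {
    # name: ((min events/yr, max events/yr), (min hours, max hours))
    "Carrier portal down": ((2, 6), (1.0, 8.0)),
    "ISP / office internet": ((2, 4), (0.5, 12.0)),
}


def annual_exposure_hours(freq, duration):
    """Return (low, high) hours per year for one failure mode."""
    return freq[0] * duration[0], freq[1] * duration[1]


for name, (freq, duration) in FAILURE_MODES.items():
    low, high = annual_exposure_hours(freq, duration)
    print(f"{name}: {low:g} to {high:g} hours/year")
# Carrier portal down: 2 to 48 hours/year
# ISP / office internet: 1 to 48 hours/year
```

Even at the low end, these two rows alone recur every year, which is why they deserve written playbooks before the once-a-decade scenarios do.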

Carrier-Portal Outage: The Most Common Real Outage

Carrier portal outages are the most frequent operational disruption an insurance agency experiences. Submission portals go down. Quote engines stop responding. Subscriber-facing portals reroute calls to your floor while their back-end stays offline. The tested response set: monitor each carrier's status page, redirect agents to a secondary product or carrier with similar appetite, switch outbound campaigns to non-affected carriers, capture applications offline for late submission, and have written authority from compliance for what's allowed during the outage window.
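The "monitor, then redirect" part of that response set can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the carrier names, the `.example` status URLs, and the idea that each carrier has one pre-approved fallback. A plain HTTP check only shows that a status page answers; a real monitor would read what the page reports or use whatever status feed the carrier actually offers.

```python
import urllib.request

# Hypothetical carriers, status URLs, and fallbacks; replace with your own.
CARRIERS = {
    "carrier_a": {"status_url": "https://status.carrier-a.example/", "fallback": "carrier_b"},
    "carrier_b": {"status_url": "https://status.carrier-b.example/", "fallback": "carrier_a"},
}


def http_ok(url, timeout=5.0):
    """Return True if the URL answers with HTTP 2xx within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def routing_plan(carriers, is_up=http_ok):
    """Map each carrier to where agents should send business right now."""
    up = {name: is_up(cfg["status_url"]) for name, cfg in carriers.items()}
    plan = {}
    for name, cfg in carriers.items():
        if up[name]:
            plan[name] = name                # portal healthy: business as usual
        elif up.get(cfg["fallback"], False):
            plan[name] = cfg["fallback"]     # redirect to the similar-appetite carrier
        else:
            plan[name] = "offline-capture"   # compliance-approved offline capture
    return plan
```

Calling `routing_plan(CARRIERS)` on a schedule gives the floor supervisor a single answer per carrier. The `is_up` parameter exists so the check can be swapped for a smarter one, or for a stub during a tabletop exercise.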

Pre-Authorized "Outage Playbook"

Compliance-approved offline application capture saves the day during a carrier outage. The play: agent collects PHI/PII on a pre-printed verification sheet, reads required disclosures from a paper script, captures a recorded affirmation, and submits to the carrier the moment the portal returns. Without pre-authorization from compliance, agents extemporize — which is where market-conduct findings come from.

Office Internet Down: The 2-Hour Recovery Test

Local ISP outages are nearly inevitable in a multi-year operating window. The right RTO target is two hours: phones working again from somewhere, even if not at the primary office. Agencies that have moved to browser-based dialer infrastructure already pass this test by default — agents log in from home, a coworking space, or a hotspot, and production resumes within minutes. Agencies running on installed-software stacks fail this test consistently.

For agencies still office-based, the practical preparation has three parts: every agent has a tested home-work setup with the dialer accessible from a personal laptop or browser; every agent's PSTN forwarding works to a personal cell as a last resort; and the floor supervisor can unilaterally invoke a published "office is down, work from home" decision protocol without a one-hour leadership huddle. Tie this to the broader virtual-call-center playbook in our virtual call center buildout piece: the agencies that built virtually resilient infrastructure already passed the BCP test as a side effect.

Telephony / Dialer Outage: The Vendor's Failure

When the dialer or telephony platform goes down, your agency is fully dependent on the vendor's recovery posture. The BCP question is twofold: how fast can your vendor recover, and what's your fallback during the outage? The fallback usually looks like: agents handle existing scheduled callbacks via personal cell or a backup VoIP line, no new outbound campaigns run, inbound calls forward to a designated cell-phone tree until restoration, and post-restoration the floor pushes hard on missed callbacks.

This is one place where the agency's vendor selection materially affects the BCP. Browser-based, multi-data-center dialer architecture has a structurally lower probability of region-wide outages than any single-tenant on-prem PBX. When evaluating vendors, ask for their published SLA, historical incident reports for the past 24 months, and the redundancy posture of their platform. The 15 vendor-evaluation questions in our tech buyer's checklist include several SLA and uptime probes worth asking before you sign.

The Single-Vendor Trap

Some agencies depend so heavily on a single platform that an outage there takes everything down: dialer, CRM, recordings, dispositioning. The mitigation is not "have two dialers." It is exporting your data continuously to a second store you control, so that even if a vendor disappears tomorrow, you can rebuild operations on a new platform with recent data intact, ideally no older than your RPO.
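One way to keep that second store honest is to alert when its newest export is older than the RPO target quoted in this article (24 hours for systems holding customer data). The sketch below assumes you can obtain a last-successful-export timestamp per data set; the data-set names and the `stale_exports` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=24)  # customer-data target quoted in this article


def stale_exports(last_exports, now=None, rpo=RPO):
    """Return names of data sets whose newest export is older than the RPO.

    last_exports maps a data-set name to the timezone-aware datetime
    of its last successful export to the store you control.
    """
    now = now or datetime.now(timezone.utc)
    return sorted(name for name, ts in last_exports.items() if now - ts > rpo)


# Example with fixed, made-up timestamps.
now = datetime(2026, 7, 13, 12, 0, tzinfo=timezone.utc)
exports = {
    "crm_contacts": now - timedelta(hours=3),
    "call_recordings": now - timedelta(hours=30),
}
print(stale_exports(exports, now=now))  # ['call_recordings']
```

An empty list means a rebuild on a new platform would start from data within the RPO; anything in the list is a gap to close before the next outage, not during it.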

Cybersecurity Incident: The Plan Most Agencies Don't Have

A ransomware event or a credential-compromise incident is operationally and reputationally devastating, especially for an agency holding PHI/PII at scale. The BCP playbook for this scenario must be written with counsel, not just operations: incident-response retainer with a forensics firm, breach-notification template aligned with state breach laws, carrier notification procedures, employee-communication script, and clear isolation steps to contain a compromise without destroying evidence.

Many state DOIs now have specific cybersecurity event notification requirements (the NAIC Insurance Data Security Model Law, #668, has been adopted in over 20 states), so the regulatory clock starts ticking the moment an event is identified, typically with 72 hours to first notification. Agencies that haven't pre-loaded the regulatory contact list, the legal-counsel call tree, and the notification templates spend the first 24 hours assembling them, exactly when they should be containing the incident.
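Because the clock is the part people lose track of under pressure, it helps to compute the deadline once and post it in the incident channel. A minimal sketch, assuming the typical 72-hour window mentioned above; the actual window and trigger vary by state and should be confirmed with counsel, and the function names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Typical window cited above; an assumption to confirm per state with counsel.
NOTIFY_WINDOW = timedelta(hours=72)


def notification_deadline(identified_at, window=NOTIFY_WINDOW):
    """Latest time for first regulator notification."""
    return identified_at + window


def hours_remaining(identified_at, now, window=NOTIFY_WINDOW):
    """Hours left on the clock; negative once the deadline has passed."""
    remaining = notification_deadline(identified_at, window) - now
    return remaining.total_seconds() / 3600


# Example with a made-up identification time.
identified = datetime(2026, 10, 20, 14, 30, tzinfo=timezone.utc)
print(notification_deadline(identified).isoformat())  # 2026-10-23T14:30:00+00:00
```

Recording `identified_at` in UTC at the moment the event is identified avoids arguments later about which time zone the clock started in.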

The Tabletop Exercise That Actually Works

A BCP that isn't exercised isn't a plan. The cadence: three tabletop exercises per year, each about 90 minutes, each focused on a different scenario from the realistic threat list. The format is simple: the operations director presents the scenario, the principal and key managers walk through the response in real time, gaps are identified and closed within 30 days, and the exercise log goes into the compliance binder. Pre-AEP, mid-year, and post-AEP are the three time slots most agencies converge on.

Tabletop Scenarios Worth Running

Carrier portal down 6 hours during peak AEP — what does the floor do, how does compliance approve offline application capture?
Office ISP failure 2 hours before open — who decides "work from home today," when is the announcement made, who confirms agent connectivity?
Ransomware on the back-office finance system — what's isolated, who's notified within 72 hours, how is payroll run?
Loss of a single key employee (compliance director) — who covers, what tribal knowledge is documented?
Lead-vendor delivery breakage during mid-week peak — what alternate sources are pre-vetted, how fast can spend redirect?

RTO and RPO: The Two Numbers the Plan Has to Quote

Recovery Time Objective (RTO) is the maximum time the business is willing to be down. Recovery Point Objective (RPO) is the maximum data loss measured in time. For an active insurance agency, the right targets are roughly: RTO ≤ 2 hours for operational systems (phones, dialer, CRM), RTO ≤ 24 hours for back-office (commission systems, reporting), RPO ≤ 24 hours for any system holding customer data. Quote these explicitly in the plan; they drive every infrastructure and vendor decision downstream. Pair them with the agency-level controls in your mid-year compliance audit to keep the numbers honest as systems change.
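Quoting the targets explicitly also makes them checkable after each incident. The sketch below encodes the three targets from the paragraph above and reports which ones an incident exceeded; the `System` class, the tier names, and the example incident are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta

# Targets as quoted above.
RTO_BY_TIER = {
    "operational": timedelta(hours=2),   # phones, dialer, CRM
    "back_office": timedelta(hours=24),  # commission systems, reporting
}
RPO_CUSTOMER_DATA = timedelta(hours=24)  # any system holding customer data


@dataclass(frozen=True)
class System:
    name: str
    tier: str  # "operational" or "back_office"
    holds_customer_data: bool


def breaches(system, downtime, data_loss):
    """List which targets an incident exceeded for one system."""
    found = []
    if downtime > RTO_BY_TIER[system.tier]:
        found.append("RTO")
    if system.holds_customer_data and data_loss > RPO_CUSTOMER_DATA:
        found.append("RPO")
    return found


# Hypothetical incident: CRM down 3 hours, 6 hours of data lost.
crm = System("CRM", "operational", holds_customer_data=True)
print(breaches(crm, downtime=timedelta(hours=3), data_loss=timedelta(hours=6)))
# ['RTO']
```

Running a check like this in each post-incident review turns the two numbers into a pass/fail record that can go into the same binder as the tabletop logs.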

Key Takeaways for Agency Operators

Plan for what actually breaks — carrier portals, ISPs, lead vendors, dialer outages.
Two-hour RTO for phones — anything longer at AEP is a five-figure revenue event.
Pre-authorize the offline carrier playbook — agents extemporizing during outages create market-conduct exposure.
Browser-based infrastructure passes the BCP test by default — installed-software stacks usually don't.
Three tabletops per year — pre-AEP, mid-year, post-AEP, each with documented gap closure.
Cybersecurity incident plan needs counsel — 72-hour notification clocks start the moment you identify the event.

BCP isn't insurance-policy paperwork. It's the operational discipline that lets the floor keep producing through the predictable, frequent failures that hit every multi-year operation. Defend against the boring failures, and the dramatic ones become survivable — not the other way around.


References & Authoritative Sources

The information on this page is supported by the following official and authoritative sources.

1. NAIC, Business Continuity Planning Topic Center
2. FFIEC, Business Continuity Management Booklet
3. NAIC, Insurance Data Security Model Law (#668)
4. NIST, SP 800-34, Contingency Planning Guide