TL;DR
The NAIC adopted the Model Bulletin on the Use of Artificial Intelligence Systems by Insurers in December 2023, and 24 US states had adopted or proposed versions of it by Q1 2026. Combined with the existing model risk management baseline from Federal Reserve SR 11-7 (2011) and OCC 2011-12 (2011), plus jurisdiction-specific bulletins from Colorado DOI and New York DFS, actuaries now operate under the most prescriptive AI model risk regime in any US-regulated industry. This guide is the practical framework. Source: NAIC Model Bulletin on the Use of AI Systems by Insurers, adopted December 4, 2023. Updated 2026-05-20.
The regulatory stack actuaries operate under in 2026
Insurance AI governance does not start with a clean slate. The supervisory framework for insurance models has 50 years of evolution behind it. AI-specific regulation has been layered on top of that existing framework rather than replacing it, which means actuaries must understand at least five overlapping regimes simultaneously. The stack below is what we recommend chief actuaries and model risk managers map their programmes against.
Layer 1: NAIC AI Model Bulletin (December 2023)
The NAIC Model Bulletin is the most concrete US-wide AI governance statement for insurers. It establishes that an insurer's use of AI systems is subject to all applicable insurance laws and regulations, requires a written AI Systems Programme documented to a level commensurate with the insurer's use of AI, and articulates expectations around governance, risk management, third-party AI vendor oversight, and model validation. The bulletin is not a model law, so a state can adopt it without legislation, which is why states moved quickly: by Q1 2026, 24 states had adopted or proposed versions of it.
The trap to avoid: treating the NAIC bulletin as a finished destination. NAIC's Innovation, Cybersecurity, and Technology (H) Committee continues to publish supporting guidance, including the 2024 Big Data and Artificial Intelligence Working Group survey results that inform supervisory examinations. Chief actuaries should track NAIC working group output the way they track ASOP exposures.
Layer 2: State DOI bulletins and examinations
State Departments of Insurance (DOIs) examine insurers using the NAIC bulletin as a framework, but with state-specific overlays. Colorado is the most aggressive: Colorado SB21-169 (2021) and the resulting Division of Insurance regulations (3 CCR 702-10) impose detailed governance, testing, and reporting requirements specifically on life insurers' use of external consumer data and information sources (ECDIS) and predictive models. Other states have followed with bulletins of varying specificity, including Connecticut, Illinois, Maryland, Minnesota, New York, Oklahoma, Oregon, Pennsylvania, Virginia, and Washington.
New York DFS Circular Letter No. 1 (January 2024) is the most prescriptive AI-specific guidance from a state regulator, addressing insurers' use of external consumer data and information sources and artificial intelligence systems with detailed requirements on testing, transparency, governance, and consumer-facing notice. NY DFS examinations in 2025-2026 have used Circular Letter No. 1 as the operative standard.
Layer 3: Model risk management (SR 11-7, OCC 2011-12)
Federal Reserve SR 11-7 (April 2011) and OCC 2011-12 (same date) established the foundational model risk management discipline for US banks - and insurance regulators expect insurers to apply the same discipline to consequential models, including AI. The four-pillar SR 11-7 framework (model development and implementation; use; validation; governance) is now the de facto standard for any quantitative model that affects regulated decisions, including AI-driven underwriting and pricing models.
The practical implication: actuaries cannot treat AI as a separate discipline outside model risk management. The model inventory, the validation function, the effective challenge requirement, the documentation standards - all extend to AI models. Per NAIC's 2024 Big Data and AI survey, 71 percent of state DOIs reported using SR 11-7-equivalent expectations when examining insurer AI programmes.
Layer 4: Fairness, anti-discrimination, and adverse action
Insurance underwriting and pricing are among the most heavily regulated decisions when it comes to anti-discrimination, and AI does not change the underlying law. The Fair Housing Act, the Equal Credit Opportunity Act (ECOA), the Civil Rights Act, state-level unfair discrimination statutes, and emerging AI-specific requirements (Colorado SB21-169, NY DFS Circular Letter No. 1) collectively require that AI-driven underwriting and pricing decisions do not produce disparate impact on protected classes, that adverse actions are explainable to consumers, and that insurers can demonstrate the fairness of their models on examination.
The 2024-2025 enforcement trend is unambiguous: state DOIs have asked for fairness test results, disparate impact analyses, and adverse action explanations as a standard examination request. Actuaries who cannot produce these on demand face regulatory findings independent of any actual discrimination outcome.
Layer 5: NIST AI RMF and ISO/IEC 42001 as best-practice anchors
While not directly mandatory, NIST AI RMF (NIST AI 100-1, January 2023) and ISO/IEC 42001 (December 2023) are increasingly cited in NAIC and state DOI guidance as the best-practice anchors for AI governance. Insurers building their AI programmes against the NAIC bulletin alone often miss governance, testing, and documentation expectations that NIST AI RMF and ISO 42001 cover more comprehensively. The Areebi NIST AI RMF hub and ISO 42001 hub walk through the crosswalks.
The SR 11-7-aligned baseline applied to insurance AI
The cleanest way to organise an insurance AI governance programme in 2026 is to start from the SR 11-7 four-pillar model and layer in AI-specific requirements. Each pillar maps to a set of actuarial deliverables that hold up under DOI examination.
| SR 11-7 pillar | Insurance AI deliverable | NAIC bulletin expectation | State overlay (where applicable) |
|---|---|---|---|
| Model development and implementation | Model card, training data documentation, code repository, validation testing record | AI Systems Programme documentation | Colorado: ECDIS testing protocols (3 CCR 702-10) |
| Model use | Approved use case register, user training, monitoring plan, escalation triggers | Risk management and use governance | NY DFS: consumer-facing transparency (Circular Letter No. 1) |
| Model validation | Independent validation report, fairness testing, drift detection, performance monitoring | Independent validation and ongoing monitoring | Colorado: required quantitative testing reports |
| Governance | Board-level AI risk reporting, audit committee briefings, named accountable executive | Board oversight and senior management accountability | NY DFS: board-level governance attestation |
This mapping is what insurance examiners are now using. Programmes built without an explicit crosswalk to SR 11-7 tend to receive findings around effective challenge, validator independence, and documentation depth even when the underlying model risk practice is otherwise sound.
Model documentation that holds up in examination
The single most common DOI finding in 2024-2025 AI examinations was inadequate model documentation. Actuaries who built models under traditional GLM or credibility-weighted frameworks have decades of documentation practice; the documentation expectation for AI models is comparable in rigor but different in content.
The minimum AI model documentation set we recommend, aligned with NAIC, NY DFS Circular Letter No. 1, and Colorado 3 CCR 702-10:
- Purpose and scope. The specific business purpose, the decision the model informs, the regulatory frameworks the model is subject to, and the populations affected.
- Data sources and lineage. All data inputs including External Consumer Data and Information Sources (ECDIS), with vendor agreements, retention periods, refresh cadence, and known limitations. Per Colorado 3 CCR 702-10, ECDIS sources require specific disclosure.
- Methodology and algorithm. The model class (GLM, tree ensemble, neural network, ensemble), the feature engineering steps, the training procedure, and any human-in-the-loop adjustments.
- Validation results. Out-of-time and out-of-sample performance, calibration analysis, residual analysis, and benchmarking against challenger models. Independent validation must be performed by a function reporting outside the model development team.
- Fairness and disparate impact testing. Quantitative tests for disparate impact on protected classes (sex, race, national origin, age where applicable, plus state-specific protected classes), with thresholds and remediation plans for any findings. This is the deliverable that has caused the most examination findings.
- Adverse action and consumer disclosure. The specific reasons that will be communicated to consumers receiving adverse decisions, including the principal reasons under ECOA and any AI-specific transparency obligations under state law.
- Monitoring plan. Drift detection thresholds, recalibration triggers, escalation paths, and the cadence of revalidation. Monitoring is a separate deliverable, not an afterthought.
- Governance log. Approvals, sign-offs, exception decisions, and the audit committee briefing record.
The Areebi audit log captures most of the operational evidence (the last two items, the monitoring plan and the governance log) as a byproduct of model use; the analytical deliverables (the first six items) remain the actuarial team's primary work product.
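The eight-item documentation set above can be treated as a checklist that is tested mechanically before a package goes to validation. The following is a minimal sketch of that idea, not a prescribed format: the section keys, the `missing_sections` helper, and the sample package are all hypothetical names chosen for illustration.

```python
# Hypothetical section keys mirroring the eight documentation items listed above.
REQUIRED_SECTIONS = (
    "purpose_and_scope",
    "data_sources_and_lineage",
    "methodology_and_algorithm",
    "validation_results",
    "fairness_testing",
    "adverse_action_disclosure",
    "monitoring_plan",
    "governance_log",
)


def missing_sections(package):
    """Return required sections that are absent or empty in a documentation package."""
    return [s for s in REQUIRED_SECTIONS if not str(package.get(s, "")).strip()]


# Illustrative package: four sections are empty or missing.
package = {
    "purpose_and_scope": "Term life accelerated underwriting triage",
    "data_sources_and_lineage": "Application data plus two ECDIS feeds",
    "methodology_and_algorithm": "Gradient-boosted tree ensemble",
    "validation_results": "",
    "monitoring_plan": "Quarterly drift and fairness pack",
}

gaps = missing_sections(package)
print(gaps)
```

A check like this only confirms that each section exists; whether the content is adequate remains a judgement for the independent validation function.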
Fairness testing: the deliverable that decides the examination
Of all AI governance deliverables, fairness testing is the one where actuaries are most likely to fail an examination if they have not planned ahead. The traditional actuarial approach (model is fair because the rating variables are causally justified) does not meet the 2026 disparate impact standard, particularly under Colorado 3 CCR 702-10 and NY DFS Circular Letter No. 1.
The expected test programme. Quantitative fairness tests must be run before deployment and on a defined cadence thereafter (typically quarterly for high-impact models). The test set includes: statistical parity difference, equalised odds, disparate impact ratio, calibration by group, and group-conditional false positive and false negative rates. Each metric must be computed for each protected class and against documented thresholds.
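To make the metrics named above concrete, here is a minimal sketch that computes group-level selection rates, false positive and false negative rates, and from them the statistical parity difference and disparate impact ratio. It assumes binary decisions where 1 is the favourable outcome, and the function names, the toy data, and the 0.8 threshold are illustrative assumptions rather than values taken from any regulation.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, false positive rate, and false negative rate."""
    counts = defaultdict(lambda: {"n": 0, "sel": 0, "pos": 0, "neg": 0, "fp": 0, "fn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        c = counts[g]
        c["n"] += 1
        c["sel"] += p
        if t == 1:
            c["pos"] += 1
            c["fn"] += 1 if p == 0 else 0
        else:
            c["neg"] += 1
            c["fp"] += 1 if p == 1 else 0
    return {
        g: {
            "selection_rate": c["sel"] / c["n"],
            "fpr": c["fp"] / c["neg"] if c["neg"] else float("nan"),
            "fnr": c["fn"] / c["pos"] if c["pos"] else float("nan"),
        }
        for g, c in counts.items()
    }


def fairness_summary(rates, reference, protected):
    """Compare a protected group against a reference group."""
    ref, prot = rates[reference], rates[protected]
    return {
        "statistical_parity_difference": prot["selection_rate"] - ref["selection_rate"],
        "disparate_impact_ratio": prot["selection_rate"] / ref["selection_rate"],
        "fpr_gap": prot["fpr"] - ref["fpr"],  # equalised odds: both gaps should be near zero
        "fnr_gap": prot["fnr"] - ref["fnr"],
    }


# Toy data: 1 = favourable decision (for example, offer at standard rates).
y_true = [1, 1, 1, 1, 0, 0, 0, 0] * 2
y_pred = [1, 1, 1, 0, 1, 0, 0, 0] + [1, 1, 0, 0, 0, 0, 0, 0]
groups = ["A"] * 8 + ["B"] * 8

rates = group_rates(y_true, y_pred, groups)
summary = fairness_summary(rates, reference="A", protected="B")

DIR_THRESHOLD = 0.8  # placeholder; the documented threshold is the insurer's own decision
breach = summary["disparate_impact_ratio"] < DIR_THRESHOLD
print(summary, breach)
```

Calibration by group is not shown here; it needs predicted probabilities rather than binary decisions, and would be computed per protected class in the same loop structure.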
The remediation expectation. When a model fails a fairness threshold, examiners expect: documented analysis of the cause, evaluation of remediation options (data adjustment, feature removal, post-processing, model retraining), a documented decision and rationale, and ongoing monitoring after remediation. The expectation is not zero disparate impact - it is documented effort to identify and address it.
The Colorado-specific overlay. Colorado 3 CCR 702-10 requires life insurers to provide the Division with the quantitative testing methodology and results for ECDIS and algorithms, including testing for unfair discrimination. Insurers operating in Colorado must have a Colorado-specific testing pack ready for examiner request. Per Colorado DOI 2024-2025 examination patterns, fairness testing has been the most common request and the most common finding.
Drift, recalibration, and the monitoring discipline
AI models in insurance underwriting and pricing drift faster than traditional rating models because the inputs (consumer behaviour, third-party data sources, market conditions) change faster. The 2026 supervisory expectation is that insurers monitor drift continuously, detect material drift quickly, and recalibrate on a defined cadence.
What examiners look for in monitoring. A documented monitoring plan covering: input data drift (covariate shift on key features), output distribution drift, performance drift (accuracy, calibration, fairness metrics), and concept drift (the underlying relationship between inputs and outcomes). Thresholds must be defined per metric, with escalation paths and decision rights. The monitoring log must show actual triggered events and the responses.
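One common way to quantify input data drift on a single feature is the population stability index (PSI), which compares the binned distribution of a baseline sample with a current sample. The sketch below is one possible implementation under stated assumptions: the bin edges, the sample values, and the 0.25 alert level (a widely used rule of thumb, not a regulatory threshold) are all illustrative.

```python
import math


def psi(expected, actual, bin_edges, eps=1e-6):
    """Population stability index between a baseline sample and a current sample."""

    def shares(values):
        counts = [0] * (len(bin_edges) + 1)
        for v in values:
            idx = sum(v > e for e in bin_edges)  # index of the bin that v falls into
            counts[idx] += 1
        total = len(values)
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / total, eps) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Illustrative feature values at model build time and in the current quarter.
baseline = [10, 20, 30, 40, 50, 60, 70, 80]
current = [60, 70, 80, 90, 65, 75, 85, 95]
edges = [25, 50, 75]

PSI_ALERT = 0.25  # rule-of-thumb level for material shift; set per metric in the monitoring plan
drift_score = psi(baseline, current, edges)
print(round(drift_score, 3), drift_score > PSI_ALERT)
```

PSI covers covariate shift only. Performance, fairness, and concept drift need outcome data and would be tracked with their own metrics and thresholds.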
What gets insurers in trouble. Monitoring plans that look comprehensive on paper but show no triggered events over months of operation. Either the model is genuinely stable, in which case the evidence must support that conclusion, or the monitoring is mis-calibrated and material drift is going undetected. Examiners now ask for the alert history and resolution log as a standard examination request.
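A simple self-check for the "comprehensive on paper, silent in practice" pattern is to list the metrics in the monitoring plan that have raised no alert over a review window, so that each one can be confirmed as genuinely stable or investigated as mis-calibrated. This is a sketch only; the alert log structure and field names (`metric`, `raised_on`) are hypothetical.

```python
from datetime import date


def silent_monitors(planned_metrics, alert_log, since):
    """Return planned metrics with no recorded alert on or after `since`."""
    fired = {a["metric"] for a in alert_log if a["raised_on"] >= since}
    return sorted(set(planned_metrics) - fired)


planned = ["input_psi", "output_drift", "calibration", "fairness_dir"]
alert_log = [
    {"metric": "input_psi", "raised_on": date(2026, 2, 10)},
    {"metric": "calibration", "raised_on": date(2025, 6, 1)},  # outside the window
]

quiet = silent_monitors(planned, alert_log, since=date(2026, 1, 1))
print(quiet)
```

A metric appearing in this list is not a finding by itself; it is a prompt to document why no alert fired, which is the evidence examiners ask for.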
Our AI agent monitoring guide covers the technical patterns; the Areebi audit log captures the operational evidence.
Third-party AI and ECDIS: the vendor question
The majority of insurance AI exposure in 2026 comes through third-party models and External Consumer Data and Information Sources (ECDIS), not internally built models. The NAIC bulletin makes clear that insurers remain responsible for compliance regardless of whether the model or data source is internal or external, which means vendor oversight is now an actuarial concern, not just a procurement concern.
The minimum vendor oversight programme. Each AI vendor and ECDIS source must have: a documented use case scope, a contract that grants the insurer audit rights and includes deletion SLA, an independent validation report or equivalent technical attestation, a fairness testing protocol (either run by the vendor or by the insurer), and named risk owners on both sides. Tier 1 vendors (those providing models or data that drive underwriting or pricing decisions) require additional documentation including model lineage, training data provenance, and refresh cadence.
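The vendor requirements above lend themselves to a per-vendor record with a gap check, where Tier 1 vendors carry extra required items. The sketch below assumes a simple boolean evidence flag per requirement; the `VendorRecord` class, its field names, and the example vendor are hypothetical.

```python
from dataclasses import dataclass

# Requirements that apply to every AI vendor or ECDIS source.
BASE_REQUIREMENTS = [
    "use_case_scope",
    "audit_rights_and_deletion_sla",
    "validation_report",
    "fairness_protocol",
    "named_risk_owners",
]
# Additional documentation for Tier 1 vendors (those driving underwriting or pricing).
TIER1_REQUIREMENTS = ["model_lineage", "training_data_provenance", "refresh_cadence"]


@dataclass
class VendorRecord:
    name: str
    tier: int
    use_case_scope: bool = False
    audit_rights_and_deletion_sla: bool = False
    validation_report: bool = False
    fairness_protocol: bool = False
    named_risk_owners: bool = False
    model_lineage: bool = False
    training_data_provenance: bool = False
    refresh_cadence: bool = False


def oversight_gaps(vendor):
    """Return the requirements for which the vendor record holds no evidence."""
    required = BASE_REQUIREMENTS + (TIER1_REQUIREMENTS if vendor.tier == 1 else [])
    return [r for r in required if not getattr(vendor, r)]


vendor = VendorRecord(
    name="Example Score Provider",  # illustrative vendor, not a real company
    tier=1,
    use_case_scope=True,
    audit_rights_and_deletion_sla=True,
    validation_report=True,
    named_risk_owners=True,
    model_lineage=True,
)

vendor_gaps = oversight_gaps(vendor)
print(vendor_gaps)
```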
The Colorado ECDIS-specific expectation. Colorado 3 CCR 702-10 imposes a specific obligation: insurers using ECDIS in life underwriting must conduct quantitative testing, document the methodology and results, and make the documentation available to the Division. The vendor cannot perform this testing in lieu of the insurer; the insurer remains accountable. See the AI vendor list guide for the inventory side of this programme.
Common pitfalls in insurance AI governance
Three failure patterns recur across insurance AI examinations.
Pitfall 1: SR 11-7 was for banks, AI is different. Some insurance teams resist applying SR 11-7-style discipline to AI on the grounds that AI is a different paradigm. State DOIs that have adopted the NAIC bulletin almost universally expect SR 11-7-aligned model risk management. Avoid this by mapping the existing actuarial model risk framework to AI explicitly, retaining the same independence, effective challenge, and documentation standards.
Pitfall 2: Fairness testing as a one-off deployment gate. Insurers run a fairness test at deployment, document a passing result, and never re-test. State DOIs expect ongoing testing, particularly when models are retrained, data sources change, or the population in scope shifts. Avoid this by including fairness metrics in the quarterly monitoring pack, with the same thresholds and escalation paths as performance monitoring.
Pitfall 3: Treating AI explainability as a technical artefact rather than a consumer disclosure. The principal reasons for an adverse action must be communicated to the consumer in plain language. Explainability outputs (SHAP values, partial dependence plots) are technical artefacts that do not directly satisfy the consumer disclosure requirement. Avoid this by maintaining a parallel "consumer reason" mapping that translates technical explainability outputs into the language used in adverse action notices, and by validating that mapping before an examiner asks for it.
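A "consumer reason" mapping of the kind described above can be as simple as a lookup from model features to approved notice wording, applied to the features that contributed most to the adverse decision. The sketch below assumes per-feature contribution scores are already available (for example from a SHAP-style attribution) and that positive values push toward the adverse outcome; the feature names, reason texts, and `principal_reasons` function are hypothetical.

```python
# Hypothetical mapping from model features to plain-language notice wording.
REASON_TEXT = {
    "prior_claims_count": "Number of claims filed in recent years",
    "months_since_lapse": "Recent lapse in prior coverage",
    "vehicle_annual_mileage": "Estimated annual mileage",
}


def principal_reasons(contributions, reason_text, top_n=2):
    """Translate the largest adverse feature contributions into consumer reasons."""
    # Assumption: a positive contribution pushes the decision toward the adverse outcome.
    adverse = sorted(
        ((f, c) for f, c in contributions.items() if c > 0),
        key=lambda fc: fc[1],
        reverse=True,
    )[:top_n]
    unmapped = [f for f, _ in adverse if f not in reason_text]
    if unmapped:
        # Fail loudly: an unmapped feature means the notice wording has not been approved.
        raise KeyError(f"No consumer reason mapped for: {unmapped}")
    return [reason_text[f] for f, _ in adverse]


# Illustrative attribution scores for one declined applicant.
contributions = {
    "prior_claims_count": 0.42,
    "months_since_lapse": 0.17,
    "vehicle_annual_mileage": -0.05,
}

reasons = principal_reasons(contributions, REASON_TEXT)
print(reasons)
```

Raising an error on an unmapped feature is a deliberate design choice in this sketch: it forces the mapping to be extended and reviewed whenever the model's feature set changes.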
At Areebi, we built the platform's audit and policy layers specifically to produce evidence that maps to all four SR 11-7 pillars plus the NAIC bulletin expectations as a single dataset, which is the integration step most insurance programmes miss.
What to read next
To complete an insurance AI governance reading set:
- NIST AI RMF compliance hub - the Areebi reference on the framework that NAIC and state DOI guidance increasingly aligns with.
- ISO/IEC 42001 certification guide - the management system standard that pairs with the NAIC AI Systems Programme expectation.
- AI agent monitoring and observability - the technical companion on drift, recalibration, and operational evidence.
- AI vendor list for CFOs - the third-party AI inventory that maps to the NAIC vendor oversight requirement.
- DORA and AI for financial institutions - the EU operational resilience overlay that increasingly intersects insurance AI governance.
Sources
- NAIC Model Bulletin on the Use of Artificial Intelligence Systems by Insurers - Adopted December 4, 2023. content.naic.org
- Federal Reserve SR 11-7 - Supervisory Guidance on Model Risk Management, April 4, 2011. federalreserve.gov/supervisionreg/srletters/sr1107
- OCC Bulletin 2011-12 - Sound Practices for Model Risk Management. occ.gov/bulletins/2011/2011-12
- Colorado Division of Insurance Regulation 3 CCR 702-10 - Governance and Risk Management Framework Requirements for Life Insurance Use of External Consumer Data and Information Sources, Algorithms, and Predictive Models. doi.colorado.gov
- NY DFS Circular Letter No. 1 (2024) - Use of Artificial Intelligence Systems and External Consumer Data and Information Sources in Insurance Underwriting and Pricing. dfs.ny.gov/industry_guidance/circular_letters/cl2024_01
- NIST AI Risk Management Framework (AI RMF 1.0) - NIST AI 100-1, January 26, 2023. nvlpubs.nist.gov
Frequently Asked Questions
Is the NAIC AI Model Bulletin legally binding?
The NAIC bulletin itself is not legally binding because the NAIC is a standard-setting body made up of state insurance regulators, not a regulator in its own right. It becomes binding when a state Department of Insurance adopts it, either through formal regulation or supervisory bulletin. As of Q1 2026, 24 states had adopted or proposed versions of the bulletin, and state DOI examinations routinely use it as the operative framework. Insurers operating in any of these states should treat the bulletin's expectations as effectively mandatory.
Does SR 11-7 apply to insurance companies?
SR 11-7 was issued by the Federal Reserve, with the OCC issuing the companion Bulletin 2011-12; both are addressed to banks, not insurance companies. However, NAIC bulletin guidance and state DOI examinations now expect insurers to apply SR 11-7-equivalent model risk management discipline to consequential models, including AI. Per NAIC's 2024 Big Data and AI survey, 71 percent of state DOIs reported using SR 11-7-equivalent expectations when examining insurer AI programmes. Insurance actuaries who structure their AI programmes around the four SR 11-7 pillars consistently fare better in examinations.
What is ECDIS and why does Colorado regulate it specifically?
ECDIS stands for External Consumer Data and Information Sources - any data about a consumer obtained from a source other than the consumer or an insurance application. Examples include credit-based insurance scores, telematics data, motor vehicle records, public records, and data from data brokers. Colorado SB21-169 and the resulting regulation 3 CCR 702-10 impose specific governance, testing, and reporting requirements on life insurers using ECDIS and predictive models, in response to concerns about disparate impact on protected classes. Other states are following with similar requirements.
How often must fairness testing be repeated?
At minimum, fairness testing must be repeated when the model is retrained, when significant data sources change, when the population in scope shifts materially, and on a defined ongoing cadence (typically quarterly for high-impact models). The cadence must be documented in the monitoring plan and the actual testing log must show consistent execution. Colorado 3 CCR 702-10 and NY DFS Circular Letter No. 1 both contemplate ongoing testing rather than one-off deployment testing.
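The retest triggers described above can be written as a small rule that returns the reasons a retest is due, which also gives the testing log a ready-made rationale. This is a sketch under assumptions: the function name, the flag names, and the 90-day default (standing in for a quarterly cadence) are illustrative, and the actual cadence is whatever the monitoring plan documents.

```python
from datetime import date, timedelta


def fairness_retest_due(last_test, today, retrained=False,
                        data_source_changed=False, population_shifted=False,
                        cadence_days=90):
    """Return the list of reasons a fairness retest is due (empty if none)."""
    reasons = []
    if retrained:
        reasons.append("model retrained")
    if data_source_changed:
        reasons.append("significant data source change")
    if population_shifted:
        reasons.append("material population shift")
    if today - last_test >= timedelta(days=cadence_days):
        reasons.append("scheduled cadence elapsed")
    return reasons


not_due = fairness_retest_due(date(2026, 1, 1), date(2026, 2, 1))
event_due = fairness_retest_due(date(2026, 1, 1), date(2026, 2, 1), retrained=True)
cadence_due = fairness_retest_due(date(2026, 1, 1), date(2026, 4, 15))
print(not_due, event_due, cadence_due)
```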
What documentation does a state DOI examiner ask for in an AI examination?
Based on examination patterns in Colorado, New York, Connecticut, and Illinois in 2024-2025: the AI Systems Programme document; the model inventory with risk tiering; for each high-impact model, a complete documentation package (purpose, data sources, methodology, validation, fairness testing, monitoring, governance); the independent validation function's organisational structure and recent validation reports; vendor agreements for any third-party AI or ECDIS; the consumer disclosure language used for adverse actions; the monitoring alert history and resolution log; and audit committee briefing materials covering AI risk.
Can we use a vendor to handle fairness testing for us?
The vendor can perform the technical work but the insurer remains accountable. Colorado 3 CCR 702-10 explicitly contemplates that the insurer must document the testing methodology and results and make them available to the Division on examination - the insurer cannot substitute the vendor's accountability for its own. The practical pattern that works: the vendor produces test outputs; the insurer's model validation function performs effective challenge on the methodology and reviews the results; the insurer retains the deliverables and signs the attestation. Vendor outputs without insurer review do not satisfy the regulatory expectation.
About the Author
Areebi Research
The Areebi research team combines hands-on enterprise security work with deep AI governance research. Our analysis is informed by primary sources (NIST, ISO, OECD, federal registers, IAPP) and the operational realities of CISOs running AI programs in regulated industries today.