Compliance Checking Service Implementation Guide

Overview

This guide describes how to implement the regulatory compliance checking service, which uses enriched chemical data to automate compliance assessments.

Table of Contents

  1. Compliance Check Types
  2. Data Sources & Seeding
  3. Database Schema
  4. Service Implementation
  5. Trigger Points
  6. API Endpoints
  7. Frontend Integration
  8. Background Jobs
  9. Testing Strategy

Compliance Check Types

1. OSHA HazCom Requirements

What It Checks:

  • Is the chemical hazardous (per GHS classification)?
  • Is the chemical on the company's written inventory?
  • Are workers trained on this specific chemical?

Determination Logic:

def check_hazcom_required(composition: List[SDSComposition]) -> HazComRequirement:
    """
    HazCom applies to ANY hazardous chemical in the workplace.
    Source: OSHA 29 CFR 1910.1200
    """
    hazardous_chemicals = [
        comp for comp in composition
        if comp.is_hazardous
    ]

    return HazComRequirement(
        required=len(hazardous_chemicals) > 0,
        chemicals=hazardous_chemicals,
        requirements=[
            "Maintain written hazard communication program",
            "Keep SDS accessible to employees",
            "Ensure proper container labeling",
            "Train employees on chemical hazards"
        ]
    )

Data Source: Extracted from SDS Section 2 (GHS classification) - no external data needed.


2. Exposure Monitoring Requirements

What It Checks:

  • Does this chemical have an OSHA PEL?
  • Is there a substance-specific OSHA standard (29 CFR 1910.1001-1052)?
  • What monitoring frequency is required?

Determination Logic:

# OSHA substance-specific standards requiring monitoring
OSHA_SPECIFIC_STANDARDS = {
    # CAS Number: (Standard, Monitoring Requirements)
    '71-43-2': ('1910.1028 Benzene', 'initial + periodic if >AL'),
    '50-00-0': ('1910.1048 Formaldehyde', 'initial + periodic if >AL'),
    '7439-92-1': ('1910.1025 Lead', 'initial + every 6 months'),
    '7440-43-9': ('1910.1027 Cadmium', 'initial + periodic'),
    '1332-21-4': ('1910.1001 Asbestos', 'initial + periodic'),
    '75-21-8': ('1910.1047 Ethylene Oxide', 'initial + periodic if >AL'),
    '106-99-0': ('1910.1051 1,3-Butadiene', 'initial + periodic if >AL'),
    '107-13-1': ('1910.1045 Acrylonitrile', 'initial + periodic if >AL'),
    '75-01-4': ('1910.1017 Vinyl Chloride', 'initial + periodic'),
    '7440-38-2': ('1910.1018 Inorganic Arsenic', 'initial + periodic if >AL'),
    '18540-29-9': ('1910.1026 Chromium VI', 'initial + periodic'),
    '75-09-2': ('1910.1052 Methylene Chloride', 'initial + periodic if >AL'),
}

def check_exposure_monitoring(
    cas_number: str,
    enriched_data: Optional[PubChemCache],
    composition_percent: float
) -> ExposureMonitoringRequirement:
    """
    Determine exposure monitoring requirements.

    Hierarchy:
    1. Substance-specific OSHA standard (mandatory)
    2. OSHA PEL listed (flagged as required; monitor if exposure may exceed AL)
    3. NIOSH REL only (advisory)
    """
    result = ExposureMonitoringRequirement(
        required=False,
        frequency=None,
        method=None,
        action_level=None,
        pel=None
    )

    # Check for substance-specific standard (highest priority)
    if cas_number in OSHA_SPECIFIC_STANDARDS:
        standard, monitoring = OSHA_SPECIFIC_STANDARDS[cas_number]
        result.required = True
        result.frequency = 'per_osha_standard'
        result.osha_standard = standard
        result.notes = f"Subject to OSHA {standard}. {monitoring}"
        result.record_retention_years = 30  # Most require 30-year retention

    # Check for PEL (from enriched data)
    elif enriched_data and enriched_data.osha_pel_ppm:
        result.required = True
        result.frequency = 'initial_then_as_needed'
        result.pel = enriched_data.osha_pel_ppm
        # Assumes AL = 50% of PEL, the usual convention in OSHA standards
        result.action_level = enriched_data.osha_pel_ppm * 0.5
        result.notes = f"PEL: {enriched_data.osha_pel_ppm} ppm. Monitor if exposure may exceed AL."
        result.record_retention_years = 30

    # NIOSH REL only (advisory)
    elif enriched_data and enriched_data.niosh_rel_ppm:
        result.required = False  # Not legally required
        result.frequency = 'recommended'
        result.notes = f"No OSHA PEL. NIOSH REL: {enriched_data.niosh_rel_ppm} ppm (advisory)."

    return result

Data Sources:

  • chemiq_pubchem_cache.osha_pel_ppm - from NIOSH Pocket Guide
  • chemiq_pubchem_cache.niosh_rel_ppm - from NIOSH Pocket Guide
  • Hardcoded list of OSHA substance-specific standards (~30 chemicals)
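
Once monitoring data exists, a measured exposure can be compared against the action level and PEL produced by the logic above. The following is a minimal sketch, not part of the service: the function name `classify_exposure`, its return labels, and the sample readings are hypothetical, and the default action level of 50% of the PEL simply mirrors the simplification used in `check_exposure_monitoring`.

```python
from typing import Optional


def classify_exposure(
    measured_ppm: float,
    pel_ppm: float,
    action_level_ppm: Optional[float] = None,
) -> str:
    """Classify a measured exposure against the action level (AL) and PEL.

    Assumption: when no action level is supplied, it defaults to 50% of
    the PEL, as in check_exposure_monitoring.
    """
    if pel_ppm <= 0:
        raise ValueError("pel_ppm must be positive")
    al = action_level_ppm if action_level_ppm is not None else pel_ppm * 0.5
    if measured_ppm > pel_ppm:
        return "above_pel"
    if measured_ppm >= al:
        return "at_or_above_action_level"
    return "below_action_level"


# Hypothetical readings for a chemical with a PEL of 200 ppm
for reading in (40.0, 120.0, 250.0):
    print(reading, classify_exposure(reading, pel_ppm=200.0))
```

Readings at or above the action level are the ones that would typically prompt periodic monitoring under the "periodic if >AL" entries in the table above.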

3. TRI Reporting (SARA 313)

What It Checks:

  • Is the chemical on the EPA TRI list?
  • Does annual usage exceed the reporting threshold?
  • Which Form (R or A) is required?

Determination Logic:

# TRI thresholds: 25,000 lbs if manufactured or processed, 10,000 lbs if otherwise used
# Some PBT chemicals have lower thresholds (10, 100, or 1,000 lbs)

def check_tri_reporting(
    cas_number: str,
    reg_data: ChemiqRegulatoryLists,
    composition_percent: float,
    annual_product_usage_lbs: float
) -> TRIReportingRequirement:
    """
    Check SARA 313 / TRI reporting requirements.

    TRI Form R required if:
    1. Chemical is on TRI list AND
    2. Company has 10+ FTEs in manufacturing/processing AND
    3. Annual usage exceeds threshold (usually 10,000 or 25,000 lbs)

    Form A (short form) allowed for lower releases.
    """
    result = TRIReportingRequirement(
        reportable=False,
        form_type=None,
        deadline=None,
        chemical_usage_lbs=0,
        threshold_lbs=0
    )

    if not reg_data or not reg_data.is_epa_sara_313:
        return result

    # Calculate how much of this specific chemical was used
    threshold = reg_data.sara_313_threshold_lbs or 10000
    chemical_usage_lbs = annual_product_usage_lbs * (composition_percent / 100)

    result.threshold_lbs = threshold
    result.chemical_usage_lbs = chemical_usage_lbs

    if chemical_usage_lbs >= threshold:
        result.reportable = True
        result.form_type = 'Form R'
        result.deadline = 'July 1 of following year'
        result.notes = (
            f"TRI reporting required. Annual usage ({chemical_usage_lbs:.0f} lbs) "
            f"exceeds threshold ({threshold} lbs)."
        )

        # Form A eligibility depends on release data (annual reportable amount
        # of 500 lbs or less) and is not available for PBT chemicals. Usage
        # alone cannot confirm it, so treat this flag as a placeholder.
        result.form_a_eligible = chemical_usage_lbs < 5000  # Simplified check

    return result

Data Source: EPA TRI Chemical List
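
The threshold comparison above reduces to one multiplication. As a worked example under assumed numbers (the product quantities and the helper name `chemical_usage_lbs` are hypothetical, and the 10,000 lb threshold is the default used in `check_tri_reporting`):

```python
def chemical_usage_lbs(annual_product_usage_lbs: float, composition_percent: float) -> float:
    """Pounds of one chemical contained in a year's usage of a product."""
    return annual_product_usage_lbs * (composition_percent / 100)


THRESHOLD_LBS = 10_000  # default threshold assumed by check_tri_reporting

# Hypothetical product A: 60,000 lbs/year containing 20% of a TRI-listed chemical
usage_a = chemical_usage_lbs(60_000, 20.0)
# Hypothetical product B: 60,000 lbs/year containing 5% of the same chemical
usage_b = chemical_usage_lbs(60_000, 5.0)

for label, usage in (("A", usage_a), ("B", usage_b)):
    print(f"Product {label}: {usage:.0f} lbs, reportable: {usage >= THRESHOLD_LBS}")
```

The same product volume crosses the threshold at 20% concentration but not at 5%, which is why the check needs both the composition percentage and the annual usage.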

Seeding the TRI List:

# Sample TRI list structure for seeding
TRI_CHEMICALS = [
    # CAS, Chemical Name, Threshold (lbs), PBT flag, Notes
    ('108-88-3', 'Toluene', 10000, False, None),
    ('71-43-2', 'Benzene', 10000, False, 'Carcinogen'),
    ('7439-92-1', 'Lead', 100, True, 'PBT chemical - lower threshold'),
    ('7439-97-6', 'Mercury', 10, True, 'PBT chemical - 10 lb threshold'),
    ('50-00-0', 'Formaldehyde', 10000, False, 'Carcinogen'),
    ('7664-93-9', 'Sulfuric acid', 10000, False, 'Aerosol form only'),
    # ... ~770 more chemicals
]

async def seed_tri_list(db: Session):
    """Pre-load EPA TRI chemical list into regulatory_lists table."""
    for cas, name, threshold, is_pbt, notes in TRI_CHEMICALS:
        existing = db.query(ChemiqRegulatoryLists).filter(
            ChemiqRegulatoryLists.cas_number == cas
        ).first()

        if existing:
            # Update TRI fields on a row created by another seeding step
            existing.is_epa_sara_313 = True
            existing.sara_313_threshold_lbs = threshold
            existing.sara_313_pbt = is_pbt
        else:
            entry = ChemiqRegulatoryLists(
                cas_number=cas,
                chemical_name=name,
                is_epa_sara_313=True,
                sara_313_threshold_lbs=threshold,
                sara_313_pbt=is_pbt,
                notes=notes
            )
            db.add(entry)

    db.commit()

4. CERCLA Reportable Quantities

What It Checks:

  • Is the chemical a CERCLA hazardous substance?
  • What is the Reportable Quantity (RQ)?

Purpose: Alert users about spill reporting obligations. A release that equals or exceeds the RQ within any 24-hour period must be reported immediately to the National Response Center.

Determination Logic:

def check_cercla_rq(
    cas_number: str,
    reg_data: ChemiqRegulatoryLists
) -> CERCLARequirement:
    """
    Check CERCLA reportable quantity requirements.

    Source: 40 CFR 302.4 - Table 302.4
    RQs range from 1 lb to 5,000 lbs depending on chemical.
    """
    result = CERCLARequirement(
        has_rq=False,
        rq_lbs=None,
        reporting_required=False
    )

    if not reg_data or not reg_data.is_epa_cercla:
        return result

    if reg_data.cercla_rq_lbs:
        result.has_rq = True
        result.rq_lbs = reg_data.cercla_rq_lbs
        result.notes = (
            f"CERCLA RQ: {reg_data.cercla_rq_lbs} lbs. "
            f"Releases totaling {reg_data.cercla_rq_lbs} lbs or more within any "
            f"24-hour period must be reported to NRC (1-800-424-8802)."
        )

    return result

Data Source: 40 CFR 302.4 - Table 302.4

  • ~800 hazardous substances with RQs
  • Static list (changes require rulemaking)
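
Because the RQ applies to the total released within any 24-hour period, several small spills can add up to a reportable release. A minimal sketch of that aggregation follows; the spill log structure (a list of timestamp and pounds pairs) and the function name `exceeds_rq_in_24h` are assumptions for illustration and do not correspond to an existing table or service method.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def exceeds_rq_in_24h(releases: List[Tuple[datetime, float]], rq_lbs: float) -> bool:
    """Return True if releases inside any 24-hour window total >= rq_lbs."""
    events = sorted(releases)
    window = timedelta(hours=24)
    start = 0
    running = 0.0
    for end in range(len(events)):
        running += events[end][1]
        # Drop events that are 24 hours or more older than the current one
        while events[end][0] - events[start][0] >= window:
            running -= events[start][1]
            start += 1
        if running >= rq_lbs:
            return True
    return False


# Hypothetical spill log for a chemical with an RQ of 10 lbs
t0 = datetime(2025, 1, 1, 8, 0)
same_day = [(t0, 6), (t0 + timedelta(hours=10), 5)]
spread_out = [(t0, 6), (t0 + timedelta(hours=30), 5)]
print(exceeds_rq_in_24h(same_day, rq_lbs=10))
print(exceeds_rq_in_24h(spread_out, rq_lbs=10))
```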

Sample CERCLA Data:

CERCLA_SUBSTANCES = [
# CAS, Chemical Name, RQ (lbs)
('67-64-1', 'Acetone', 5000),
('71-43-2', 'Benzene', 10),
('7664-93-9', 'Sulfuric acid', 1000),
('7647-01-0', 'Hydrogen chloride', 5000),
('7439-92-1', 'Lead', 10),
('7439-97-6', 'Mercury', 1),
('50-00-0', 'Formaldehyde', 100),
('7664-38-2', 'Phosphoric acid', 5000),
# ... ~800 more substances
]

5. California Proposition 65

What It Checks:

  • Is the chemical on the Prop 65 list?
  • What type of warning is required (cancer, reproductive, or both)?
  • Does the company ship to or operate in California?

Determination Logic:

def check_prop65_requirements(
    cas_number: str,
    reg_data: ChemiqRegulatoryLists,
    company_state: str,
    ships_to_california: bool = False
) -> Prop65Requirement:
    """
    Check California Proposition 65 requirements.

    Source: OEHHA Proposition 65 List
    https://oehha.ca.gov/proposition-65/proposition-65-list

    Prop 65 applies to:
    - Companies with 10+ employees doing business in California
    - Products sold in California
    """
    result = Prop65Requirement(
        warning_required=False,
        chemical_type=None,
        warning_text=None
    )

    # Only applies if operating/selling in California
    if company_state != 'CA' and not ships_to_california:
        return result

    if not reg_data or not reg_data.is_california_prop65:
        return result

    result.warning_required = True
    result.chemical_type = reg_data.prop65_type  # 'cancer', 'reproductive', 'both'

    # Standard warning text per California Code of Regulations, Title 27
    if reg_data.prop65_type == 'cancer':
        result.warning_text = (
            "⚠️ WARNING: This product can expose you to chemicals including "
            f"{reg_data.chemical_name}, which is known to the State of California "
            "to cause cancer. For more information go to www.P65Warnings.ca.gov."
        )
    elif reg_data.prop65_type == 'reproductive':
        result.warning_text = (
            "⚠️ WARNING: This product can expose you to chemicals including "
            f"{reg_data.chemical_name}, which is known to the State of California "
            "to cause birth defects or other reproductive harm. "
            "For more information go to www.P65Warnings.ca.gov."
        )
    elif reg_data.prop65_type == 'both':
        result.warning_text = (
            "⚠️ WARNING: This product can expose you to chemicals including "
            f"{reg_data.chemical_name}, which is known to the State of California "
            "to cause cancer and birth defects or other reproductive harm. "
            "For more information go to www.P65Warnings.ca.gov."
        )

    return result

Data Source: CA OEHHA Proposition 65 List

Prop 65 Data Structure:

PROP65_CHEMICALS = [
# CAS, Chemical Name, Type ('cancer', 'reproductive', 'both'), Listed Date
('71-43-2', 'Benzene', 'cancer', '1987-02-27'),
('50-00-0', 'Formaldehyde', 'cancer', '1988-01-01'),
('7439-92-1', 'Lead and lead compounds', 'both', '1987-02-27'),
('108-95-2', 'Phenol', 'reproductive', '2014-12-19'),
('127-18-4', 'Tetrachloroethylene', 'cancer', '1988-01-01'),
('79-01-6', 'Trichloroethylene', 'cancer', '1988-01-01'),
('75-09-2', 'Methylene chloride', 'cancer', '1988-02-27'),
# ... ~900 more chemicals
]
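
`check_prop65_requirements` evaluates one chemical at a time, but a product label usually carries a single warning. One way to pick the warning type for a whole product is to merge the per-chemical types, as sketched below. The helper name `combine_prop65_types` is hypothetical; the three type strings are the ones used in the data structure above.

```python
from typing import Iterable, Optional


def combine_prop65_types(types: Iterable[str]) -> Optional[str]:
    """Merge per-chemical Prop 65 types into one product-level type."""
    seen = set()
    for t in types:
        if t == 'both':
            seen.update({'cancer', 'reproductive'})
        elif t in ('cancer', 'reproductive'):
            seen.add(t)
        else:
            raise ValueError(f"Unknown Prop 65 type: {t!r}")
    if seen == {'cancer', 'reproductive'}:
        return 'both'
    # Either exactly one type was seen, or none at all
    return next(iter(seen), None)


# A product with one cancer-listed and one reproductive-listed component
print(combine_prop65_types(['cancer', 'reproductive']))
```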

6. Carcinogen Classification

What It Checks:

  • Is the chemical classified as a carcinogen by IARC, NTP, or OSHA?
  • What is the classification level?

Determination Logic:

# IARC Classifications
IARC_GROUPS = {
    '1': 'Carcinogenic to humans',
    '2A': 'Probably carcinogenic to humans',
    '2B': 'Possibly carcinogenic to humans',
    '3': 'Not classifiable as to its carcinogenicity to humans',
}

# NTP Classifications
NTP_CLASSES = {
    'K': 'Known to be a human carcinogen',
    'R': 'Reasonably anticipated to be a human carcinogen',
}

def check_carcinogen_status(
    cas_number: str,
    reg_data: ChemiqRegulatoryLists
) -> CarcinogenRequirement:
    """
    Check carcinogen classification and associated requirements.

    Sources:
    - IARC Monographs: https://monographs.iarc.who.int/list-of-classifications
    - NTP Report on Carcinogens: https://ntp.niehs.nih.gov/whatwestudy/assessments/cancer/roc
    - OSHA regulated carcinogens: 29 CFR 1910.1003-1016
    """
    result = CarcinogenRequirement(
        is_carcinogen=False,
        classification=None,
        source=None,
        requirements=[]
    )

    if not reg_data or not reg_data.is_carcinogen:
        return result

    result.is_carcinogen = True
    result.classification = reg_data.carcinogen_classification
    result.source = reg_data.carcinogen_source  # 'IARC', 'NTP', 'OSHA'

    # Special requirements for carcinogens
    result.requirements = [
        "Minimize employee exposure to lowest feasible level",
        "Implement engineering controls before PPE",
        "Establish regulated areas with access controls",
        "Provide annual medical surveillance",
        "Maintain exposure records for 30 years",
        "Post warning signs in regulated areas"
    ]

    # Additional requirements for OSHA-regulated carcinogens
    # (column names match the chemiq_regulatory_lists schema)
    if reg_data.has_osha_specific_standard:
        result.requirements.append(
            f"Comply with OSHA substance-specific standard {reg_data.osha_standard_citation}"
        )

    return result

Data Sources:

  • IARC Monographs: ~120 Group 1 + ~90 Group 2A chemicals
  • NTP Report on Carcinogens: ~250 chemicals (overlaps with IARC)
  • Pre-load known carcinogens during NIOSH Pocket Guide seeding

Data Sources & Seeding

Master Seeding Strategy

# app/scripts/seed_regulatory_data.py

async def seed_all_regulatory_data(db: Session):
    """
    Master function to seed all regulatory reference data.
    Run once during initial setup, then periodically for updates.
    """
    print("Starting regulatory data seeding...")

    # 1. NIOSH Pocket Guide (~600 chemicals)
    # Contains: PEL, REL, IDLH, carcinogen status
    print(" Loading NIOSH Pocket Guide...")
    await seed_niosh_pocket_guide(db)

    # 2. EPA TRI List (~770 chemicals)
    # Contains: SARA 313 status, reporting thresholds
    print(" Loading EPA TRI List...")
    await seed_tri_list(db)

    # 3. CERCLA Table 302.4 (~800 substances)
    # Contains: Reportable quantities
    print(" Loading CERCLA RQs...")
    await seed_cercla_rqs(db)

    # 4. California Prop 65 List (~900 chemicals)
    # Contains: Cancer/reproductive toxicant status
    print(" Loading CA Prop 65 List...")
    await seed_prop65_list(db)

    # 5. OSHA Substance-Specific Standards (~30 chemicals)
    # Contains: Special monitoring/medical requirements
    print(" Loading OSHA specific standards...")
    await seed_osha_specific_standards(db)

    print("Regulatory data seeding complete!")

    # Summary
    stats = get_seeding_stats(db)
    print(f" Total chemicals in regulatory_lists: {stats['total']}")
    print(f" With OSHA PEL: {stats['with_pel']}")
    print(f" TRI Listed: {stats['tri_listed']}")
    print(f" Prop 65 Listed: {stats['prop65_listed']}")
    print(f" Carcinogens: {stats['carcinogens']}")

Data File Locations

tellus-ehs-hazcom-service/
├── app/
│   └── data/
│       └── regulatory/
│           ├── niosh_pocket_guide.json        # ~600 chemicals
│           ├── epa_tri_list.json              # ~770 chemicals
│           ├── cercla_rqs.json                # ~800 substances
│           ├── prop65_list.json               # ~900 chemicals
│           └── osha_specific_standards.json   # ~30 chemicals
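
Every one of these files is keyed by CAS number, so it can help to validate the CAS check digit before loading a record. The sketch below implements the standard CAS checksum (each digit before the check digit is weighted by its position counted from the right, and the sum modulo 10 must equal the check digit). The function name `is_valid_cas` is hypothetical. A passing checksum only catches typing errors; it does not confirm that the number belongs to the named chemical.

```python
import re

# Two to seven digits, two digits, one check digit
_CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(cas_number: str) -> bool:
    """Validate the format and check digit of a CAS Registry Number."""
    match = _CAS_PATTERN.match(cas_number.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Rightmost digit has weight 1, the next has weight 2, and so on
    total = sum(
        int(digit) * weight
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit


for cas in ("71-43-2", "7439-92-1", "71-43-3", "benzene"):
    print(cas, is_valid_cas(cas))
```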

Update Schedule

Data Source          Update Frequency      How to Update
NIOSH Pocket Guide   Annually (Sept)       Download from CDC, convert to JSON
EPA TRI List         Annually (Jan 1)      Download from EPA, convert to JSON
CERCLA RQs           Rarely (rulemaking)   Monitor Federal Register
CA Prop 65           Quarterly             Download from OEHHA, convert to JSON
OSHA Standards       Rarely                Monitor Federal Register
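
The schedule above can be turned into a simple staleness check that flags sources due for a refresh. This is a sketch under assumptions: the source keys, the maximum ages in days (365 for annual sources, 90 for quarterly), and the function name `stale_sources` are all illustrative, and the timestamps stand in for whatever the seeding job records, for example in `last_verified_at`.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional

# Assumed maximum ages, loosely derived from the update schedule
MAX_AGE_DAYS: Dict[str, int] = {
    'niosh_pocket_guide': 365,
    'epa_tri_list': 365,
    'prop65_list': 90,
}


def stale_sources(
    last_verified: Dict[str, Optional[datetime]],
    now: datetime,
) -> List[str]:
    """Return sources that were never verified or exceed their maximum age."""
    stale = []
    for source, max_age in MAX_AGE_DAYS.items():
        verified_at = last_verified.get(source)
        if verified_at is None or now - verified_at > timedelta(days=max_age):
            stale.append(source)
    return stale


# Hypothetical verification timestamps
now = datetime(2025, 6, 1, tzinfo=timezone.utc)
verified = {
    'niosh_pocket_guide': datetime(2025, 1, 1, tzinfo=timezone.utc),
    'epa_tri_list': datetime(2024, 1, 1, tzinfo=timezone.utc),
    'prop65_list': datetime(2025, 1, 1, tzinfo=timezone.utc),
}
print(stale_sources(verified, now))
```

CERCLA RQs and OSHA standards change only through rulemaking, so they are left out of the age-based check here and would be tracked by monitoring the Federal Register instead.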

Database Schema

Enhanced Regulatory Lists Table

-- chemiq_regulatory_lists table (enhanced)
CREATE TABLE chemiq_regulatory_lists (
    list_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    cas_number VARCHAR(20) NOT NULL UNIQUE,
    chemical_name VARCHAR(255),

    -- OSHA
    is_osha_pel_listed BOOLEAN DEFAULT FALSE,
    has_osha_specific_standard BOOLEAN DEFAULT FALSE,
    osha_standard_citation VARCHAR(50), -- e.g., '1910.1028'

    -- NIOSH
    is_niosh_rel_listed BOOLEAN DEFAULT FALSE,

    -- ACGIH
    is_acgih_tlv_listed BOOLEAN DEFAULT FALSE,

    -- EPA TRI (SARA 313)
    is_epa_sara_313 BOOLEAN DEFAULT FALSE,
    sara_313_threshold_lbs INT DEFAULT 10000,
    sara_313_category VARCHAR(50), -- 'manufactured', 'processed', 'otherwise_used'
    sara_313_pbt BOOLEAN DEFAULT FALSE, -- Persistent Bioaccumulative Toxic

    -- EPA CERCLA
    is_epa_cercla BOOLEAN DEFAULT FALSE,
    cercla_rq_lbs INT,

    -- California Prop 65
    is_california_prop65 BOOLEAN DEFAULT FALSE,
    prop65_type VARCHAR(20), -- 'cancer', 'reproductive', 'both'
    prop65_listing_date DATE,
    prop65_nsrl_ug DECIMAL(10,4), -- No Significant Risk Level (cancer)
    prop65_madl_ug DECIMAL(10,4), -- Maximum Allowable Dose Level (reproductive)

    -- Carcinogen Status
    is_carcinogen BOOLEAN DEFAULT FALSE,
    carcinogen_source VARCHAR(20), -- 'IARC', 'NTP', 'OSHA', 'CA_Prop65'
    carcinogen_classification VARCHAR(20), -- 'Group 1', '2A', 'K', 'R'

    -- EU REACH (future)
    is_eu_reach BOOLEAN DEFAULT FALSE,
    reach_status VARCHAR(50),

    -- Metadata
    last_verified_at TIMESTAMP WITH TIME ZONE,
    source_urls JSONB,
    notes TEXT,

    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Indexes
-- cas_number is already indexed by its UNIQUE constraint, so no separate index is needed
CREATE INDEX idx_reg_lists_sara313 ON chemiq_regulatory_lists(is_epa_sara_313)
    WHERE is_epa_sara_313 = TRUE;
CREATE INDEX idx_reg_lists_prop65 ON chemiq_regulatory_lists(is_california_prop65)
    WHERE is_california_prop65 = TRUE;
CREATE INDEX idx_reg_lists_carcinogen ON chemiq_regulatory_lists(is_carcinogen)
    WHERE is_carcinogen = TRUE;
CREATE INDEX idx_reg_lists_cercla ON chemiq_regulatory_lists(is_epa_cercla)
    WHERE is_epa_cercla = TRUE;

Company Compliance Settings Table

-- Track company-specific compliance settings
CREATE TABLE chemiq_company_compliance_settings (
    setting_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    company_id UUID NOT NULL REFERENCES core_companies(company_id),

    -- Applicability
    ships_to_california BOOLEAN DEFAULT FALSE,
    employee_count INT, -- For TRI threshold (10+ employees)
    naics_code VARCHAR(10), -- For TRI applicability

    -- Preferences
    use_niosh_rel_over_pel BOOLEAN DEFAULT FALSE, -- More protective
    exposure_monitoring_frequency VARCHAR(20) DEFAULT 'as_needed',

    -- Notification settings
    notify_on_compliance_issues BOOLEAN DEFAULT TRUE,
    compliance_notification_emails JSONB, -- Array of emails

    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),

    UNIQUE(company_id)
);

Compliance Check Results Table

-- Store compliance check results for each inventory item
CREATE TABLE chemiq_compliance_results (
    result_id UUID PRIMARY KEY DEFAULT gen_random_uuid(),

    -- References
    company_id UUID NOT NULL REFERENCES core_companies(company_id),
    site_id UUID REFERENCES core_company_sites(site_id),
    inventory_id UUID REFERENCES chemiq_inventory(chemical_id),
    sds_id UUID REFERENCES chemiq_sds_documents(sds_id),

    -- Overall status
    compliance_score INT, -- 0-100
    is_compliant BOOLEAN,
    check_date TIMESTAMP WITH TIME ZONE DEFAULT NOW(),

    -- Individual check results (JSONB for flexibility)
    hazcom_result JSONB,
    exposure_monitoring_result JSONB,
    tri_reporting_result JSONB,
    cercla_result JSONB,
    prop65_result JSONB,
    carcinogen_result JSONB,

    -- Action items
    required_actions JSONB, -- Array of action strings
    warnings JSONB, -- Array of warning strings

    -- Metadata
    checked_by VARCHAR(50), -- 'system', 'user:{user_id}'

    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

-- Index for dashboard queries
CREATE INDEX idx_compliance_results_company ON chemiq_compliance_results(company_id, check_date DESC);
CREATE INDEX idx_compliance_results_site ON chemiq_compliance_results(site_id, check_date DESC);

Service Implementation

Complete Compliance Check Service

# app/services/compliance_check_service.py

from typing import List, Optional, Dict, Any
from dataclasses import dataclass, field
from datetime import datetime
from uuid import UUID
from sqlalchemy.orm import Session

from app.db.models.regulatory_lists import ChemiqRegulatoryLists
from app.db.models.pubchem_cache import PubChemCache
from app.db.models.sds import SDSComposition
from app.db.models.compliance import ChemiqComplianceResults


@dataclass
class ComplianceCheckResult:
    """Complete compliance check result for a product."""

    # Overall
    is_compliant: bool = True
    compliance_score: int = 100

    # Individual checks
    hazcom: Dict = field(default_factory=dict)
    exposure_monitoring: Dict = field(default_factory=dict)
    tri_reporting: Dict = field(default_factory=dict)
    cercla: Dict = field(default_factory=dict)
    prop65: Dict = field(default_factory=dict)
    carcinogen: Dict = field(default_factory=dict)

    # Aggregated
    required_actions: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)
    chemicals_checked: int = 0
    chemicals_with_issues: int = 0


class ComplianceCheckService:
    """
    Comprehensive regulatory compliance checking service.

    Checks:
    1. OSHA HazCom requirements
    2. Exposure monitoring requirements
    3. TRI (SARA 313) reporting
    4. CERCLA reportable quantities
    5. California Prop 65
    6. Carcinogen status
    """

    def __init__(self, db: Session):
        self.db = db

    async def check_product_compliance(
        self,
        composition: List[SDSComposition],
        company_id: UUID,
        site_id: Optional[UUID] = None,
        annual_usage_lbs: Optional[float] = None,
    ) -> ComplianceCheckResult:
        """
        Run all compliance checks for a product based on its composition.

        Args:
            composition: List of SDS composition components
            company_id: Company ID for company-specific settings
            site_id: Optional site ID for location-specific requirements
            annual_usage_lbs: Optional annual usage for TRI calculations

        Returns:
            ComplianceCheckResult with all check details
        """
        result = ComplianceCheckResult()

        # Get company settings
        company_settings = self._get_company_settings(company_id)
        company_state = self._get_company_state(company_id, site_id)

        # Track chemicals for aggregation
        tri_chemicals = []
        prop65_chemicals = []
        carcinogen_chemicals = []
        pel_chemicals = []

        for component in composition:
            if not component.cas_number:
                result.warnings.append(
                    f"No CAS number for {component.chemical_name} - "
                    "compliance check limited"
                )
                continue

            result.chemicals_checked += 1
            component_has_issues = False

            # Get regulatory data
            reg_data = self.db.query(ChemiqRegulatoryLists).filter(
                ChemiqRegulatoryLists.cas_number == component.cas_number
            ).first()

            # Get enriched data (exposure limits)
            enriched = self.db.query(PubChemCache).filter(
                PubChemCache.cas_number == component.cas_number
            ).first()

            if not reg_data:
                result.warnings.append(
                    f"No regulatory data for {component.chemical_name} "
                    f"(CAS {component.cas_number})"
                )

            # 1. HazCom Check
            if component.is_hazardous:
                result.hazcom['required'] = True
                result.hazcom.setdefault('chemicals', []).append({
                    'name': component.chemical_name,
                    'cas': component.cas_number
                })

            # 2. Exposure Monitoring Check
            if enriched and (enriched.osha_pel_ppm or enriched.niosh_rel_ppm):
                pel_chemicals.append({
                    'name': component.chemical_name,
                    'cas': component.cas_number,
                    'pel': enriched.osha_pel_ppm,
                    'rel': enriched.niosh_rel_ppm,
                    'idlh': enriched.niosh_idlh_ppm
                })

            # Check for substance-specific standards
            if reg_data and reg_data.has_osha_specific_standard:
                result.required_actions.append(
                    f"OSHA STANDARD: {component.chemical_name} is subject to "
                    f"OSHA {reg_data.osha_standard_citation}. "
                    "Review specific requirements."
                )
                component_has_issues = True
                result.compliance_score -= 10

            # 3. TRI Check
            if reg_data and reg_data.is_epa_sara_313:
                threshold = reg_data.sara_313_threshold_lbs or 10000

                # Compare against None so a recorded usage of 0 lbs is still evaluated
                if annual_usage_lbs is not None:
                    chemical_usage = annual_usage_lbs * (
                        (component.concentration_percent or 0) / 100
                    )

                    if chemical_usage >= threshold:
                        tri_chemicals.append({
                            'name': component.chemical_name,
                            'cas': component.cas_number,
                            'usage_lbs': chemical_usage,
                            'threshold_lbs': threshold
                        })
                        result.required_actions.append(
                            f"TRI REPORTING: {component.chemical_name} usage "
                            f"({chemical_usage:.0f} lbs) exceeds threshold "
                            f"({threshold} lbs). Form R required by July 1."
                        )
                        component_has_issues = True
                        result.compliance_score -= 15
                else:
                    tri_chemicals.append({
                        'name': component.chemical_name,
                        'cas': component.cas_number,
                        'threshold_lbs': threshold,
                        'note': 'Usage data needed for threshold check'
                    })
                    result.warnings.append(
                        f"TRI: {component.chemical_name} is TRI-listed. "
                        "Provide annual usage to check reporting requirement."
                    )

            # 4. CERCLA Check
            if reg_data and reg_data.is_epa_cercla and reg_data.cercla_rq_lbs:
                result.cercla.setdefault('chemicals', []).append({
                    'name': component.chemical_name,
                    'cas': component.cas_number,
                    'rq_lbs': reg_data.cercla_rq_lbs
                })
                result.warnings.append(
                    f"CERCLA: {component.chemical_name} has RQ of "
                    f"{reg_data.cercla_rq_lbs} lbs. Report spills >= RQ to NRC."
                )

            # 5. Prop 65 Check
            if reg_data and reg_data.is_california_prop65:
                if company_state == 'CA' or company_settings.get('ships_to_california'):
                    prop65_chemicals.append({
                        'name': component.chemical_name,
                        'cas': component.cas_number,
                        'type': reg_data.prop65_type
                    })
                    result.required_actions.append(
                        f"PROP 65: {component.chemical_name} ({reg_data.prop65_type}) "
                        "requires California Prop 65 warning label."
                    )
                    component_has_issues = True
                    result.compliance_score -= 5

            # 6. Carcinogen Check
            if reg_data and reg_data.is_carcinogen:
                carcinogen_chemicals.append({
                    'name': component.chemical_name,
                    'cas': component.cas_number,
                    'classification': reg_data.carcinogen_classification,
                    'source': reg_data.carcinogen_source
                })
                result.required_actions.append(
                    f"CARCINOGEN: {component.chemical_name} is classified as "
                    f"{reg_data.carcinogen_classification} ({reg_data.carcinogen_source}). "
                    "Special controls and medical surveillance required."
                )
                component_has_issues = True
                result.compliance_score -= 20

            if component_has_issues:
                result.chemicals_with_issues += 1

        # Aggregate results
        result.exposure_monitoring = {
            'chemicals': pel_chemicals,
            'monitoring_recommended': len(pel_chemicals) > 0
        }
        result.tri_reporting = {
            'chemicals': tri_chemicals,
            'reporting_may_be_required': len([c for c in tri_chemicals if 'usage_lbs' in c]) > 0
        }
        result.prop65 = {
            'chemicals': prop65_chemicals,
            'warning_required': len(prop65_chemicals) > 0
        }
        result.carcinogen = {
            'chemicals': carcinogen_chemicals,
            'special_controls_required': len(carcinogen_chemicals) > 0
        }

        # Set overall compliance
        result.compliance_score = max(0, result.compliance_score)
        result.is_compliant = result.compliance_score >= 70

        return result

    def _get_company_settings(self, company_id: UUID) -> Dict:
        """Get company-specific compliance settings."""
        from app.db.models.compliance import ChemiqCompanyComplianceSettings

        settings = self.db.query(ChemiqCompanyComplianceSettings).filter(
            ChemiqCompanyComplianceSettings.company_id == company_id
        ).first()

        if settings:
            return {
                'ships_to_california': settings.ships_to_california,
                'employee_count': settings.employee_count,
                'use_niosh_rel_over_pel': settings.use_niosh_rel_over_pel
            }
        return {}

    def _get_company_state(
        self,
        company_id: UUID,
        site_id: Optional[UUID]
    ) -> Optional[str]:
        """Get state for company/site."""
        from app.db.models.core import CoreCompanySites, CoreCompanies

        if site_id:
            site = self.db.query(CoreCompanySites).filter(
                CoreCompanySites.site_id == site_id
            ).first()
            if site and site.state:
                return site.state

        # Fall back to company primary address
        company = self.db.query(CoreCompanies).filter(
            CoreCompanies.company_id == company_id
        ).first()
        if company:
            return company.state

        return None

    async def save_compliance_result(
        self,
        result: ComplianceCheckResult,
        company_id: UUID,
        site_id: Optional[UUID] = None,
        inventory_id: Optional[UUID] = None,
        sds_id: Optional[UUID] = None,
        checked_by: str = 'system'
    ) -> ChemiqComplianceResults:
        """Save compliance check result to database.

        checked_by records what ran the check, e.g. 'system',
        'system:sds_parse', or 'user:{user_id}'.
        """

        db_result = ChemiqComplianceResults(
            company_id=company_id,
            site_id=site_id,
            inventory_id=inventory_id,
            sds_id=sds_id,
            compliance_score=result.compliance_score,
            is_compliant=result.is_compliant,
            hazcom_result=result.hazcom,
            exposure_monitoring_result=result.exposure_monitoring,
            tri_reporting_result=result.tri_reporting,
            cercla_result=result.cercla,
            prop65_result=result.prop65,
            carcinogen_result=result.carcinogen,
            required_actions=result.required_actions,
            warnings=result.warnings,
            checked_by=checked_by
        )

        self.db.add(db_result)
        self.db.commit()
        self.db.refresh(db_result)

        return db_result
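
The scoring in `check_product_compliance` is plain arithmetic: start at 100, subtract a fixed penalty per finding, floor at 0, and treat 70 or above as compliant. The sketch below restates that rule in isolation so the effect of combined findings is easy to see. The penalty values and the threshold come from the service code above; the finding keys and the function name `score_findings` are hypothetical.

```python
from typing import Iterable, Tuple

# Penalties as applied in check_product_compliance
PENALTIES = {
    'osha_specific_standard': 10,
    'tri_reporting': 15,
    'prop65': 5,
    'carcinogen': 20,
}
COMPLIANT_THRESHOLD = 70


def score_findings(findings: Iterable[str]) -> Tuple[int, bool]:
    """Return (compliance_score, is_compliant) for a list of findings."""
    score = 100 - sum(PENALTIES[finding] for finding in findings)
    score = max(0, score)
    return score, score >= COMPLIANT_THRESHOLD


# One component that is a carcinogen, has an OSHA standard, and is Prop 65 listed
print(score_findings(['carcinogen', 'osha_specific_standard', 'prop65']))
```

Because penalties accumulate per component, a single chemical that appears on several lists can push a product below the threshold on its own.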

Trigger Points

This section documents when compliance checks should be automatically invoked throughout the system. Compliance checking is a progressive process - initial results may be incomplete but improve as more data becomes available through enrichment.

Data Flow Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                       COMPLIANCE CHECK TRIGGER POINTS                       │
└─────────────────────────────────────────────────────────────────────────────┘

┌──────────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐
│  SDS Upload  │ ──► │ SDS Parsing  │ ──► │   Chemical   │ ──► │  Compliance  │
│  (User/API)  │     │  (Extract    │     │  Enrichment  │     │   Results    │
│              │     │ Composition) │     │  (PubChem)   │     │   (Final)    │
└──────────────┘     └──────────────┘     └──────────────┘     └──────────────┘
                            │                    │                    │
                            ▼                    ▼                    │
                     ┌──────────────┐     ┌──────────────┐            │
                     │  TRIGGER #1  │     │  TRIGGER #2  │            │
                     │   Initial    │     │   Enhanced   │            │
                     │  Compliance  │     │  Compliance  │            │
                     │    Check     │     │    Check     │            │
                     │  (Limited)   │     │    (Full)    │            │
                     └──────────────┘     └──────────────┘            │
                                                                      │
┌──────────────┐     ┌──────────────┐     ┌──────────────┐            │
│  Inventory   │ ──► │  TRI Usage   │ ──► │  TRIGGER #3  │ ───────────┘
│    Update    │     │ Calculation  │     │ Usage-Based  │
│  (Quantity)  │     │              │     │    Check     │
└──────────────┘     └──────────────┘     └──────────────┘

┌──────────────┐                         ┌──────────────┐
│   Company    │ ───────────────────────►│  TRIGGER #4  │
│   Settings   │  (State changed to CA)  │   Re-check   │
│    Change    │                         │   Prop 65    │
└──────────────┘                         └──────────────┘

┌──────────────┐                         ┌──────────────┐
│  Background  │ ───────────────────────►│  TRIGGER #5  │
│  Scheduler   │   (Daily/Weekly cron)   │    Batch     │
│              │                         │   Refresh    │
└──────────────┘                         └──────────────┘

Trigger #1: On SDS Processing Completion (Primary)

When: Immediately after SDS parsing completes and composition data is extracted.

What's Available:

  • SDS Section 2 hazard classifications (GHS)
  • Composition with CAS numbers and concentrations
  • Hazardous chemical identification

What Gets Checked:

  • OSHA HazCom requirements (based on GHS classification)
  • Basic regulatory list lookups (TRI, CERCLA, Prop 65)

Limitations:

  • Exposure limits (PEL/REL) may not yet be available
  • Results are marked as "preliminary" until enrichment completes

Implementation:

# app/workers/sds_parse_job.py

from uuid import UUID
from sqlalchemy.orm import Session

from app.db.models.sds import SDSComposition
from app.services.compliance_check_service import ComplianceCheckService

async def process_sds_document(sds_id: UUID, db: Session):
    """Parse SDS and trigger initial compliance check."""

    # ... existing SDS parsing logic ...
    # (sds_document is the SDS record loaded by the parsing logic above)

    # After composition is extracted and saved:
    composition = db.query(SDSComposition).filter(
        SDSComposition.sds_id == sds_id
    ).all()

    if composition:
        # TRIGGER #1: Initial compliance check
        compliance_service = ComplianceCheckService(db)

        result = await compliance_service.check_product_compliance(
            composition=composition,
            company_id=sds_document.company_id,
            site_id=None,  # Site-specific check happens on inventory add
            annual_usage_lbs=None  # Not known yet
        )

        # Save with "preliminary" flag
        await compliance_service.save_compliance_result(
            result=result,
            company_id=sds_document.company_id,
            sds_id=sds_id,
            checked_by='system:sds_parse'
        )

        # Queue enrichment for each CAS number (triggers #2 when complete)
        for comp in composition:
            if comp.cas_number:
                await queue_chemical_enrichment(comp.cas_number, sds_id)

Trigger #2: On Chemical Enrichment Completion (Secondary)

When: After PubChem data enrichment completes for a chemical.

What's Now Available:

  • OSHA PEL (Permissible Exposure Limit)
  • NIOSH REL (Recommended Exposure Limit)
  • NIOSH IDLH (Immediately Dangerous to Life or Health)
  • Carcinogen classifications
  • Additional regulatory status data

What Gets Checked:

  • Exposure monitoring requirements (now with PEL/REL data)
  • Substance-specific OSHA standards
  • Enhanced carcinogen status
  • Full regulatory compliance assessment

Why Re-check:

  • Initial check may have missed exposure limits
  • Carcinogen status may now be confirmed from multiple sources
  • Results are now marked as "complete"

Implementation:

# app/workers/chemical_enrichment_job.py

from typing import Optional
from uuid import UUID

from sqlalchemy.orm import Session

from app.db.models.sds import SDSComposition
from app.services.compliance_check_service import ComplianceCheckService

async def enrich_chemical(cas_number: str, trigger_sds_id: Optional[UUID], db: Session):
    """Enrich chemical data from PubChem and re-run compliance check."""

    # ... existing enrichment logic ...

    # After enrichment completes:
    if trigger_sds_id:
        # TRIGGER #2: Enhanced compliance check
        sds_document = db.query(SDSDocument).filter(
            SDSDocument.sds_id == trigger_sds_id
        ).first()

        if sds_document:
            composition = db.query(SDSComposition).filter(
                SDSComposition.sds_id == trigger_sds_id
            ).all()

            # Check if all chemicals in composition are now enriched
            all_enriched = all(
                db.query(PubChemCache).filter(
                    PubChemCache.cas_number == comp.cas_number
                ).first() is not None
                for comp in composition if comp.cas_number
            )

            if all_enriched:
                compliance_service = ComplianceCheckService(db)

                result = await compliance_service.check_product_compliance(
                    composition=composition,
                    company_id=sds_document.company_id,
                    site_id=None,
                    annual_usage_lbs=None
                )

                # Save as "complete" check (replaces preliminary)
                await compliance_service.save_compliance_result(
                    result=result,
                    company_id=sds_document.company_id,
                    sds_id=trigger_sds_id,
                    checked_by='system:enrichment_complete'
                )
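
The "preliminary" versus "complete" distinction comes down to whether every CAS number in the composition has enrichment data. A minimal sketch of that decision as a pure function follows; `enrichment_status` and its return shape are assumptions made for illustration, and the set of enriched CAS numbers would in practice come from a single query against the enrichment cache.

```python
from typing import Iterable, List, Optional, Set, Tuple


def enrichment_status(
    composition_cas: Iterable[Optional[str]],
    enriched_cas: Set[str],
) -> Tuple[str, List[str]]:
    """Return ("complete" | "preliminary", CAS numbers still missing)."""
    # Rows without a CAS number are ignored, as in the worker above.
    wanted = sorted({cas for cas in composition_cas if cas})
    missing = [cas for cas in wanted if cas not in enriched_cas]
    status = "complete" if not missing else "preliminary"
    return status, missing
```

As with `all()` over an empty sequence, a composition with no CAS numbers is reported as complete.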

Trigger #3: On Inventory Addition/Update (Usage-Based)

When:

  • When a product is added to inventory with quantity/usage data
  • When inventory quantities are updated
  • When annual usage is recorded or updated

What's Now Available:

  • Annual usage in pounds (for TRI threshold calculations)
  • Site-specific location (for state regulations like Prop 65)
  • Quantity on hand (for CERCLA RQ awareness)

What Gets Checked:

  • TRI (SARA 313) reporting thresholds with actual usage data
  • Site-specific regulatory requirements
  • Cumulative chemical quantities

Why This Matters:

  • TRI reporting is based on annual usage thresholds
  • A chemical that does not trigger TRI at low usage may require Form R once annual usage exceeds the threshold
  • Site location affects which state regulations apply
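
The threshold comparison can be sketched as simple arithmetic: the usage attributed to a chemical is the product's annual usage multiplied by that chemical's concentration. The helper names below are hypothetical, and the service's own calculation may differ; the numbers mirror the unit test later in this guide (50% of 30,000 lbs against a 10,000 lb threshold).

```python
def component_usage_lbs(product_usage_lbs: float, concentration_percent: float) -> float:
    """Pounds of one chemical used per year within a product."""
    return product_usage_lbs * concentration_percent / 100.0


def exceeds_tri_threshold(
    product_usage_lbs: float,
    concentration_percent: float,
    threshold_lbs: float,
) -> bool:
    """True when the chemical's share of usage is above the threshold."""
    return component_usage_lbs(product_usage_lbs, concentration_percent) > threshold_lbs


# 50% of 30,000 lbs is 15,000 lbs of the listed chemical
usage = component_usage_lbs(30000, 50)
```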

Implementation:

# app/services/inventory_service.py

from uuid import UUID

from sqlalchemy.orm import Session

from app.db.models.inventory import ChemIQInventory
from app.db.models.sds import SDSComposition
from app.services.compliance_check_service import ComplianceCheckService

async def add_to_inventory(
    inventory_data: InventoryCreate,
    company_id: UUID,
    db: Session
):
    """Add product to inventory and run usage-based compliance check."""

    # ... existing inventory creation logic (creates new_inventory) ...

    # TRIGGER #3: Usage-based compliance check
    if inventory_data.annual_usage_lbs or inventory_data.quantity_lbs:
        product = db.query(CompanyProductCatalog).filter(
            CompanyProductCatalog.product_id == inventory_data.product_id
        ).first()

        if product and product.current_sds_id:
            composition = db.query(SDSComposition).filter(
                SDSComposition.sds_id == product.current_sds_id
            ).all()

            if composition:
                compliance_service = ComplianceCheckService(db)

                result = await compliance_service.check_product_compliance(
                    composition=composition,
                    company_id=company_id,
                    site_id=inventory_data.site_id,
                    annual_usage_lbs=inventory_data.annual_usage_lbs
                )

                await compliance_service.save_compliance_result(
                    result=result,
                    company_id=company_id,
                    site_id=inventory_data.site_id,
                    inventory_id=new_inventory.chemical_id,
                    sds_id=product.current_sds_id,
                    checked_by='system:inventory_update'
                )


async def update_inventory_quantity(
    inventory_id: UUID,
    quantity_update: QuantityUpdate,
    db: Session
):
    """Update inventory quantity and re-check TRI thresholds."""

    # ... existing update logic ...

    # TRIGGER #3: Re-check when annual usage crosses a threshold boundary.
    # "is not None" so that an update to 0 lbs is still evaluated.
    if quantity_update.annual_usage_lbs is not None:
        inventory = db.query(ChemIQInventory).filter(
            ChemIQInventory.chemical_id == inventory_id
        ).first()

        if inventory is not None:
            old_usage = inventory.annual_usage_lbs or 0
            new_usage = quantity_update.annual_usage_lbs

            # Re-check if crossing common thresholds (100, 1000, 10000, 25000 lbs)
            thresholds = [100, 1000, 10000, 25000]
            if any(
                (old_usage < t <= new_usage) or (new_usage < t <= old_usage)
                for t in thresholds
            ):
                # Helper assumed to wrap the check-and-save steps shown above
                await _run_compliance_check_for_inventory(inventory, db)
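
The boundary test used in `update_inventory_quantity` can be isolated into a small pure function, which makes the edge cases easy to check. This is a sketch only; `crossed_thresholds` is a hypothetical name, and the default thresholds are the ones listed in the code above.

```python
from typing import List, Sequence


def crossed_thresholds(
    old_usage: float,
    new_usage: float,
    thresholds: Sequence[float] = (100, 1000, 10000, 25000),
) -> List[float]:
    """Return the thresholds lying between the old and new usage.

    A threshold counts as crossed when usage moves from below it to
    at-or-above it, or from at-or-above it to below it.
    """
    low, high = sorted((old_usage, new_usage))
    return [t for t in thresholds if low < t <= high]
```

A non-empty result means a re-check is warranted; an increase from 9,000 to 15,000 lbs and a decrease from 15,000 to 9,000 lbs both report the 10,000 lb threshold.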

Trigger #4: On Company Settings Change

When:

  • Company state/location changes
  • ships_to_california flag is updated
  • Employee count changes (affects TRI applicability)
  • NAICS code changes

What Gets Checked:

  • Prop 65 requirements (if now California or shipping to CA)
  • TRI applicability (based on employee count and NAICS)
  • State-specific regulations

Why This Matters:

  • A company moving to California suddenly needs Prop 65 warnings for all listed chemicals
  • Crossing the 10 FTE threshold changes TRI reporting obligations

Implementation:

# app/services/company_settings_service.py

from datetime import datetime
from uuid import UUID

from sqlalchemy.orm import Session

from app.services.compliance_check_service import ComplianceCheckService

async def update_company_compliance_settings(
    company_id: UUID,
    settings_update: ComplianceSettingsUpdate,
    db: Session
):
    """Update company compliance settings and re-check affected inventory."""

    old_settings = db.query(ChemiqCompanyComplianceSettings).filter(
        ChemiqCompanyComplianceSettings.company_id == company_id
    ).first()

    # TRIGGER #4: Check if settings change affects compliance
    needs_recheck = False

    if settings_update.ships_to_california is not None:
        old_ships_to_ca = old_settings.ships_to_california if old_settings else False
        if settings_update.ships_to_california != old_ships_to_ca:
            needs_recheck = True

    if settings_update.state and old_settings:
        if settings_update.state == 'CA' and old_settings.state != 'CA':
            needs_recheck = True

    # ... update settings ...

    if needs_recheck:
        # Queue batch compliance re-check for all active inventory
        await queue_batch_compliance_recheck(
            company_id=company_id,
            reason='company_settings_changed',
            priority='high'
        )


async def queue_batch_compliance_recheck(
    company_id: UUID,
    reason: str,
    priority: str = 'normal'
):
    """Queue all company inventory for compliance re-check."""

    # Add to background job queue
    await job_queue.enqueue(
        'compliance_batch_recheck',
        {
            'company_id': str(company_id),
            'reason': reason,
            'triggered_at': datetime.utcnow().isoformat()
        },
        priority=priority
    )
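
The implementation above reacts to the California flags only, while the trigger description also lists employee count. A minimal sketch of a fuller decision function follows, assuming a 10 FTE cutoff as stated in the text; the `Settings` dataclass and `requires_recheck` are hypothetical names, and this version also treats a move out of California as a reason to re-check, which the code above does not.

```python
from dataclasses import dataclass
from typing import Optional

TRI_EMPLOYEE_THRESHOLD = 10  # FTE threshold named in the trigger description


@dataclass(frozen=True)
class Settings:
    state: Optional[str] = None
    ships_to_california: bool = False
    employee_count: Optional[int] = None


def requires_recheck(old: Settings, new: Settings) -> bool:
    """Decide whether a settings change can alter compliance results."""
    # Prop 65 applicability changes when California status flips either way.
    if (old.state == "CA") != (new.state == "CA"):
        return True
    if old.ships_to_california != new.ships_to_california:
        return True
    # TRI applicability changes when headcount crosses the FTE threshold.
    if old.employee_count is not None and new.employee_count is not None:
        was_over = old.employee_count >= TRI_EMPLOYEE_THRESHOLD
        is_over = new.employee_count >= TRI_EMPLOYEE_THRESHOLD
        if was_over != is_over:
            return True
    return False
```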

Trigger #5: Scheduled Background Job (Periodic)

When:

  • Daily cron job for high-priority items
  • Weekly cron job for full inventory scan
  • Quarterly after regulatory list updates

What Gets Checked:

  • Items not checked in 30+ days
  • Items where regulatory data has been updated
  • New items added without compliance checks

Why This Matters:

  • Regulatory lists change (Prop 65 updates quarterly)
  • Catches any items missed by event-based triggers
  • Ensures compliance data stays current

Implementation:

See the Background Jobs section for the complete ComplianceCheckWorker implementation.

# Cron schedule (app/workers/scheduler.py)

COMPLIANCE_CHECK_SCHEDULE = {
    # Daily: Check items flagged for urgent review
    'compliance_check_urgent': {
        'cron': '0 2 * * *',  # 2 AM daily
        'handler': 'compliance_check_worker.run_urgent_checks',
        'description': 'Check items flagged for urgent compliance review'
    },

    # Weekly: Check items not reviewed in 30+ days
    'compliance_check_periodic': {
        'cron': '0 3 * * 0',  # 3 AM Sunday
        'handler': 'compliance_check_worker.run_periodic_checks',
        'description': 'Periodic compliance check for stale items'
    },

    # After regulatory updates: Re-check affected chemicals
    'compliance_check_regulatory_update': {
        'cron': '0 4 1 2,5,8,11 *',  # 4 AM, 1st of Feb/May/Aug/Nov
        'handler': 'compliance_check_worker.run_regulatory_update_checks',
        'description': 'Re-check after quarterly Prop 65 updates'
    }
}
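
The weekly job selects items that have never been checked or whose last check is more than 30 days old. That rule can be sketched as a pure function; `needs_periodic_check` is a hypothetical helper, and the 30-day default is taken from the text above.

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_periodic_check(
    last_checked: Optional[datetime],
    now: datetime,
    interval_days: int = 30,
) -> bool:
    """True for items never checked or last checked too long ago."""
    if last_checked is None:
        return True  # new item added without a compliance check
    return now - last_checked > timedelta(days=interval_days)
```

Passing `now` explicitly rather than calling the clock inside the function keeps the rule deterministic in tests.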

Trigger Summary Table

Trigger | When | Data Available | Checks Performed | Priority
#1 SDS Parse | After SDS composition extracted | GHS classification, CAS numbers, concentrations | HazCom, basic regulatory lookups | High
#2 Enrichment | After PubChem enrichment completes | PEL, REL, IDLH, carcinogen status | Full exposure monitoring, enhanced carcinogen | High
#3 Inventory | On add/update with quantities | Annual usage, site location | TRI thresholds, site-specific regs | Medium
#4 Settings | Company settings change | Updated state, shipping flags | Prop 65 re-check, TRI applicability | Medium
#5 Background | Scheduled cron | All available data | Full compliance re-check | Low

Progressive Compliance Results

Compliance checking is progressive - results improve as more data becomes available:

┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPLIANCE CHECK PROGRESSION │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ Stage 1: After SDS Parse (Trigger #1) │
│ ├── ✓ HazCom requirements (from GHS classification) │
│ ├── ✓ Basic TRI listing (yes/no, but no threshold check) │
│ ├── ✓ Basic Prop 65 listing (if CAS in regulatory list) │
│ ├── ✓ CERCLA RQ values │
│ ├── ⚠ Exposure monitoring: "Data pending enrichment" │
│ └── ⚠ Carcinogen status: "Limited - awaiting PubChem data" │
│ │
│ Stage 2: After Enrichment (Trigger #2) │
│ ├── ✓ HazCom requirements │
│ ├── ✓ TRI listing │
│ ├── ✓ Prop 65 status │
│ ├── ✓ CERCLA RQ values │
│ ├── ✓ Exposure monitoring: PEL/REL/IDLH values │
│ ├── ✓ Carcinogen status: Full IARC/NTP/OSHA classification │
│ └── ⚠ TRI threshold check: "Provide annual usage for threshold calculation" │
│ │
│ Stage 3: After Inventory Add (Trigger #3) │
│ ├── ✓ All previous checks │
│ └── ✓ TRI threshold check: "Usage 15,000 lbs exceeds 10,000 lb threshold" │
│ │
└─────────────────────────────────────────────────────────────────────────────┘
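
The progression in the diagram can be sketched as a function of two facts: whether enrichment has completed and whether annual usage is known. The function name and the labels for pending checks are illustrative assumptions, not identifiers from the service.

```python
from typing import Dict, List, Optional


def pending_checks(enriched: bool, annual_usage_lbs: Optional[float]) -> Dict[str, object]:
    """Map available data to a progression stage and the checks still pending."""
    pending: List[str] = []
    if not enriched:
        # Stage 1: exposure limits and full carcinogen data await PubChem
        pending.append("exposure_monitoring")
        pending.append("carcinogen_status")
    if annual_usage_lbs is None:
        # TRI listing is known, but the threshold comparison needs usage
        pending.append("tri_threshold")

    if not enriched:
        stage = 1
    elif annual_usage_lbs is None:
        stage = 2
    else:
        stage = 3
    return {"stage": stage, "pending": pending}
```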

API Endpoints

Compliance Check Endpoints

# app/api/v1/chemiq/compliance.py

from fastapi import APIRouter, Depends, HTTPException, Query
from sqlalchemy.orm import Session
from typing import Optional, List
from uuid import UUID

from app.api.dependencies.permissions import get_user_context, UserContext
from app.db.session import get_db
from app.db.models.sds import SDSComposition
# ChemiqComplianceResults, ChemiqRegulatoryLists and PubChemCache must also
# be imported; their module paths are not shown in this guide.
from app.services.compliance_check_service import ComplianceCheckService
from app.schemas.compliance import (
    ComplianceCheckRequest,
    ComplianceCheckResponse,
    ComplianceDashboardResponse,
    ComplianceReportRequest,
    ComplianceReportResponse
)

router = APIRouter()


@router.post("/check", response_model=ComplianceCheckResponse)
async def check_compliance(
    request: ComplianceCheckRequest,
    ctx: UserContext = Depends(get_user_context),
    db: Session = Depends(get_db)
):
    """
    Run compliance check for a product/SDS.

    Checks OSHA HazCom, exposure monitoring, TRI, CERCLA, Prop 65, carcinogens.
    """
    service = ComplianceCheckService(db)

    # Get composition from SDS
    composition = db.query(SDSComposition).filter(
        SDSComposition.sds_id == request.sds_id
    ).all()

    if not composition:
        raise HTTPException(404, "No composition data found for SDS")

    result = await service.check_product_compliance(
        composition=composition,
        company_id=ctx.company_id,
        site_id=request.site_id,
        annual_usage_lbs=request.annual_usage_lbs
    )

    # Save result
    if request.save_result:
        await service.save_compliance_result(
            result=result,
            company_id=ctx.company_id,
            site_id=request.site_id,
            sds_id=request.sds_id,
            inventory_id=request.inventory_id
        )

    return ComplianceCheckResponse(
        compliance_score=result.compliance_score,
        is_compliant=result.is_compliant,
        hazcom=result.hazcom,
        exposure_monitoring=result.exposure_monitoring,
        tri_reporting=result.tri_reporting,
        cercla=result.cercla,
        prop65=result.prop65,
        carcinogen=result.carcinogen,
        required_actions=result.required_actions,
        warnings=result.warnings,
        chemicals_checked=result.chemicals_checked,
        chemicals_with_issues=result.chemicals_with_issues
    )


@router.get("/dashboard", response_model=ComplianceDashboardResponse)
async def get_compliance_dashboard(
    site_id: Optional[UUID] = None,
    ctx: UserContext = Depends(get_user_context),
    db: Session = Depends(get_db)
):
    """
    Get compliance dashboard for company or site.

    Shows:
    - Overall compliance score
    - Chemicals requiring action
    - Upcoming deadlines
    - Recent compliance checks
    """
    service = ComplianceCheckService(db)

    # Get latest compliance results
    query = db.query(ChemiqComplianceResults).filter(
        ChemiqComplianceResults.company_id == ctx.company_id
    )

    if site_id:
        query = query.filter(ChemiqComplianceResults.site_id == site_id)

    results = query.order_by(
        ChemiqComplianceResults.check_date.desc()
    ).limit(100).all()

    # Aggregate data
    total_checks = len(results)
    compliant_count = sum(1 for r in results if r.is_compliant)
    avg_score = sum(r.compliance_score for r in results) / total_checks if total_checks else 100

    # Get chemicals with issues
    chemicals_with_issues = set()
    all_actions = []
    for r in results:
        if r.required_actions:
            all_actions.extend(r.required_actions)
        # Extract chemical names from various results
        for key in ['carcinogen_result', 'prop65_result', 'tri_reporting_result']:
            data = getattr(r, key)
            if data and 'chemicals' in data:
                for chem in data['chemicals']:
                    chemicals_with_issues.add(chem.get('name', 'Unknown'))

    return ComplianceDashboardResponse(
        site_compliance_score=int(avg_score),
        total_checks=total_checks,
        compliant_checks=compliant_count,
        non_compliant_checks=total_checks - compliant_count,
        chemicals_with_issues=list(chemicals_with_issues),
        top_actions=all_actions[:10],
        upcoming_deadlines=_get_upcoming_deadlines()
    )
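
The aggregation step of the dashboard can be exercised without a database by working on plain dictionaries. The sketch below is an assumption-laden variant, not the endpoint's code: `summarize_results` is a hypothetical name, and unlike the endpoint it ranks required actions by how often they occur, so repeated actions do not fill the top-10 list.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_results(results: List[Dict[str, Any]], top_n: int = 10) -> Dict[str, Any]:
    """Aggregate per-check results into dashboard totals."""
    total = len(results)
    compliant = sum(1 for r in results if r["is_compliant"])
    # With no results, fall back to a score of 100 as the endpoint does
    avg = sum(r["compliance_score"] for r in results) / total if total else 100

    # Count each distinct action once per occurrence, then rank by frequency
    counts = Counter(
        action
        for r in results
        for action in (r.get("required_actions") or [])
    )
    return {
        "site_compliance_score": int(avg),
        "total_checks": total,
        "compliant_checks": compliant,
        "non_compliant_checks": total - compliant,
        "top_actions": [action for action, _ in counts.most_common(top_n)],
    }
```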


@router.get("/chemical/{cas_number}", response_model=dict)
async def get_chemical_regulatory_status(
    cas_number: str,
    ctx: UserContext = Depends(get_user_context),
    db: Session = Depends(get_db)
):
    """
    Get regulatory status for a specific chemical by CAS number.
    """
    reg_data = db.query(ChemiqRegulatoryLists).filter(
        ChemiqRegulatoryLists.cas_number == cas_number
    ).first()

    enriched = db.query(PubChemCache).filter(
        PubChemCache.cas_number == cas_number
    ).first()

    if not reg_data and not enriched:
        raise HTTPException(404, f"No regulatory data found for CAS {cas_number}")

    return {
        'cas_number': cas_number,
        # At least one of reg_data / enriched exists at this point
        'chemical_name': reg_data.chemical_name if reg_data else enriched.common_name,
        'osha': {
            'has_pel': (enriched.osha_pel_ppm is not None) if enriched else False,
            'pel_ppm': enriched.osha_pel_ppm if enriched else None,
            'has_specific_standard': reg_data.has_osha_specific_standard if reg_data else False,
            'standard_citation': reg_data.osha_standard_citation if reg_data else None
        },
        'niosh': {
            'has_rel': (enriched.niosh_rel_ppm is not None) if enriched else False,
            'rel_ppm': enriched.niosh_rel_ppm if enriched else None,
            'idlh_ppm': enriched.niosh_idlh_ppm if enriched else None
        },
        'epa': {
            'sara_313_listed': reg_data.is_epa_sara_313 if reg_data else False,
            'sara_313_threshold_lbs': reg_data.sara_313_threshold_lbs if reg_data else None,
            'cercla_listed': reg_data.is_epa_cercla if reg_data else False,
            'cercla_rq_lbs': reg_data.cercla_rq_lbs if reg_data else None
        },
        'state': {
            'prop65_listed': reg_data.is_california_prop65 if reg_data else False,
            'prop65_type': reg_data.prop65_type if reg_data else None
        },
        'carcinogen': {
            'is_carcinogen': reg_data.is_carcinogen if reg_data else False,
            'classification': reg_data.carcinogen_classification if reg_data else None,
            'source': reg_data.carcinogen_source if reg_data else None
        }
    }


def _get_upcoming_deadlines():
    """Get the next occurrence of each annual regulatory deadline."""
    from datetime import date

    today = date.today()

    def next_occurrence(month: int, day: int) -> date:
        # Once this year's date has passed, roll over to next year so the
        # dashboard never shows an empty deadline list late in the year.
        candidate = date(today.year, month, day)
        if candidate < today:
            candidate = date(today.year + 1, month, day)
        return candidate

    deadlines = [
        # TRI Form R due July 1
        {
            'program': 'EPA TRI (Form R)',
            'deadline': next_occurrence(7, 1).isoformat(),
            'description': 'Annual Toxics Release Inventory reporting'
        },
        # Tier II due March 1
        {
            'program': 'SARA Tier II',
            'deadline': next_occurrence(3, 1).isoformat(),
            'description': 'Emergency planning hazardous chemical inventory'
        },
    ]

    # ISO dates sort chronologically as strings
    return sorted(deadlines, key=lambda x: x['deadline'])

Pydantic Schemas

# app/schemas/compliance.py

from pydantic import BaseModel
from typing import Optional, List, Dict, Any
from uuid import UUID
from datetime import datetime


class ComplianceCheckRequest(BaseModel):
    """Request to run compliance check."""
    sds_id: UUID
    site_id: Optional[UUID] = None
    inventory_id: Optional[UUID] = None
    annual_usage_lbs: Optional[float] = None
    save_result: bool = True


class ComplianceCheckResponse(BaseModel):
    """Response from compliance check."""
    compliance_score: int
    is_compliant: bool

    hazcom: Dict[str, Any]
    exposure_monitoring: Dict[str, Any]
    tri_reporting: Dict[str, Any]
    cercla: Dict[str, Any]
    prop65: Dict[str, Any]
    carcinogen: Dict[str, Any]

    required_actions: List[str]
    warnings: List[str]

    chemicals_checked: int
    chemicals_with_issues: int


class ComplianceDashboardResponse(BaseModel):
    """Response for compliance dashboard."""
    site_compliance_score: int
    total_checks: int
    compliant_checks: int
    non_compliant_checks: int
    chemicals_with_issues: List[str]
    top_actions: List[str]
    upcoming_deadlines: List[Dict[str, Any]]


class CompanyComplianceSettingsUpdate(BaseModel):
    """Update company compliance settings."""
    ships_to_california: Optional[bool] = None
    employee_count: Optional[int] = None
    use_niosh_rel_over_pel: Optional[bool] = None

Frontend Integration

TypeScript Types

// src/types/compliance.ts

export interface ComplianceCheckResult {
  compliance_score: number;
  is_compliant: boolean;

  hazcom: HazComResult;
  exposure_monitoring: ExposureMonitoringResult;
  tri_reporting: TRIResult;
  cercla: CERCLAResult;
  prop65: Prop65Result;
  carcinogen: CarcinogenResult;

  required_actions: string[];
  warnings: string[];

  chemicals_checked: number;
  chemicals_with_issues: number;
}

export interface HazComResult {
  required: boolean;
  chemicals?: { name: string; cas: string }[];
}

export interface ExposureMonitoringResult {
  chemicals: {
    name: string;
    cas: string;
    pel?: number;
    rel?: number;
    idlh?: number;
  }[];
  monitoring_recommended: boolean;
}

export interface TRIResult {
  chemicals: {
    name: string;
    cas: string;
    usage_lbs?: number;
    threshold_lbs: number;
    note?: string;
  }[];
  reporting_may_be_required: boolean;
}

export interface CERCLAResult {
  chemicals?: {
    name: string;
    cas: string;
    rq_lbs: number;
  }[];
}

export interface Prop65Result {
  chemicals: {
    name: string;
    cas: string;
    type: 'cancer' | 'reproductive' | 'both';
  }[];
  warning_required: boolean;
}

export interface CarcinogenResult {
  chemicals: {
    name: string;
    cas: string;
    classification: string;
    source: string;
  }[];
  special_controls_required: boolean;
}

export interface ComplianceDashboard {
  site_compliance_score: number;
  total_checks: number;
  compliant_checks: number;
  non_compliant_checks: number;
  chemicals_with_issues: string[];
  top_actions: string[];
  upcoming_deadlines: {
    program: string;
    deadline: string;
    description: string;
  }[];
}

API Service

// src/services/api/compliance.api.ts

import { apiClient } from './client';
// ChemicalRegulatoryStatus is assumed to be exported from the types module;
// it mirrors the dict returned by GET /chemical/{cas_number}.
import type {
  ChemicalRegulatoryStatus,
  ComplianceCheckResult,
  ComplianceDashboard,
} from '../../types/compliance';

export async function runComplianceCheck(
  token: string,
  sdsId: string,
  options?: {
    siteId?: string;
    inventoryId?: string;
    annualUsageLbs?: number;
    saveResult?: boolean;
  }
): Promise<ComplianceCheckResult> {
  const response = await apiClient.post(
    '/api/v1/chemiq/compliance/check',
    {
      sds_id: sdsId,
      site_id: options?.siteId,
      inventory_id: options?.inventoryId,
      annual_usage_lbs: options?.annualUsageLbs,
      save_result: options?.saveResult ?? true,
    },
    {
      headers: { Authorization: `Bearer ${token}` },
    }
  );
  return response.data;
}

export async function getComplianceDashboard(
  token: string,
  siteId?: string
): Promise<ComplianceDashboard> {
  const response = await apiClient.get('/api/v1/chemiq/compliance/dashboard', {
    headers: { Authorization: `Bearer ${token}` },
    params: siteId ? { site_id: siteId } : undefined,
  });
  return response.data;
}

export async function getChemicalRegulatoryStatus(
  token: string,
  casNumber: string
): Promise<ChemicalRegulatoryStatus> {
  const response = await apiClient.get(
    `/api/v1/chemiq/compliance/chemical/${casNumber}`,
    {
      headers: { Authorization: `Bearer ${token}` },
    }
  );
  return response.data;
}

Background Jobs

Periodic Compliance Check Job

# app/workers/compliance_check_worker.py

import asyncio
from datetime import datetime, timedelta

from sqlalchemy import or_
from sqlalchemy.orm import Session

from app.db.session import get_db
from app.db.models.inventory import ChemIQInventory
from app.db.models.sds import SDSComposition
# ChemiqComplianceResults (see Database Schema) must also be imported;
# its module path is not shown in this guide.
from app.services.compliance_check_service import ComplianceCheckService
from app.core.logging import get_logger

logger = get_logger(__name__)


class ComplianceCheckWorker:
    """
    Background worker for periodic compliance checks.

    Runs compliance checks on inventory items that haven't been
    checked recently or have updated SDS data.
    """

    CHECK_INTERVAL_DAYS = 30  # Re-check every 30 days
    BATCH_SIZE = 50

    async def run_periodic_checks(self):
        """Run compliance checks on items needing review."""
        db = next(get_db())

        try:
            cutoff_date = datetime.utcnow() - timedelta(days=self.CHECK_INTERVAL_DAYS)

            # Find inventory items needing checks:
            # either never checked or checked > 30 days ago.
            # or_ comes from sqlalchemy; a Session has no or_ attribute.
            items_to_check = db.query(ChemIQInventory).outerjoin(
                ChemiqComplianceResults,
                ChemIQInventory.chemical_id == ChemiqComplianceResults.inventory_id
            ).filter(
                ChemIQInventory.is_active.is_(True),
                or_(
                    ChemiqComplianceResults.check_date.is_(None),
                    ChemiqComplianceResults.check_date < cutoff_date
                )
            ).limit(self.BATCH_SIZE).all()

            if not items_to_check:
                logger.info("No inventory items need compliance checks")
                return

            logger.info(f"Running compliance checks on {len(items_to_check)} items")

            service = ComplianceCheckService(db)

            for item in items_to_check:
                await self._check_item(db, service, item)

            logger.info("Periodic compliance checks complete")

        finally:
            db.close()

    async def _check_item(
        self,
        db: Session,
        service: ComplianceCheckService,
        item: ChemIQInventory
    ):
        """Run compliance check on a single inventory item."""
        try:
            # Get SDS ID from company product
            sds_id = item.company_product.current_sds_id if item.company_product else None

            if not sds_id:
                logger.warning(f"No SDS for inventory {item.chemical_id}")
                return

            # Get composition
            composition = db.query(SDSComposition).filter(
                SDSComposition.sds_id == sds_id
            ).all()

            if not composition:
                logger.warning(f"No composition for SDS {sds_id}")
                return

            # Run check
            result = await service.check_product_compliance(
                composition=composition,
                company_id=item.company_id,
                site_id=item.site_id
            )

            # Save result
            await service.save_compliance_result(
                result=result,
                company_id=item.company_id,
                site_id=item.site_id,
                inventory_id=item.chemical_id,
                sds_id=sds_id
            )

            logger.debug(
                f"Compliance check complete for {item.chemical_id}: "
                f"score={result.compliance_score}"
            )

        except Exception as e:
            logger.error(f"Error checking compliance for {item.chemical_id}: {e}")
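
The worker isolates failures per item so that one bad record does not stop the batch. A minimal, database-free sketch of that pattern follows; `run_batch` and the `check_one` callback are hypothetical names. Items are processed sequentially, on the assumption that a single database session is shared across the batch as in the worker above.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List, Tuple


async def run_batch(
    items: Iterable[Any],
    check_one: Callable[[Any], Awaitable[None]],
) -> Tuple[int, List[Tuple[Any, str]]]:
    """Run check_one on every item, collecting failures instead of raising."""
    succeeded = 0
    failed: List[Tuple[Any, str]] = []
    for item in items:
        try:
            await check_one(item)
            succeeded += 1
        except Exception as exc:  # keep going; report the failure afterwards
            failed.append((item, str(exc)))
    return succeeded, failed
```

Returning the failures lets the caller log a summary or re-queue the failed items, rather than relying on per-item log lines alone.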

Testing Strategy

Unit Tests

# tests/services/test_compliance_check_service.py
#
# check_product_compliance is awaited everywhere else in this guide, so the
# tests are written as coroutines. This assumes the pytest-asyncio plugin.

import pytest
from unittest.mock import Mock, patch
from app.services.compliance_check_service import ComplianceCheckService


@pytest.fixture
def mock_db():
    return Mock()


@pytest.fixture
def service(mock_db):
    return ComplianceCheckService(mock_db)


class TestHazComCheck:
    """Tests for OSHA HazCom compliance checking."""

    @pytest.mark.asyncio
    async def test_hazcom_required_for_hazardous_chemicals(self, service, mock_db):
        """HazCom should be required when any chemical is hazardous."""
        composition = [
            Mock(cas_number='67-64-1', chemical_name='Acetone', is_hazardous=True)
        ]

        mock_db.query.return_value.filter.return_value.first.return_value = None

        result = await service.check_product_compliance(
            composition=composition,
            company_id='test-company'
        )

        assert result.hazcom['required'] is True
        assert len(result.hazcom['chemicals']) == 1

    @pytest.mark.asyncio
    async def test_hazcom_not_required_for_non_hazardous(self, service, mock_db):
        """HazCom should not be required for non-hazardous chemicals."""
        composition = [
            Mock(cas_number='7732-18-5', chemical_name='Water', is_hazardous=False)
        ]

        mock_db.query.return_value.filter.return_value.first.return_value = None

        result = await service.check_product_compliance(
            composition=composition,
            company_id='test-company'
        )

        assert result.hazcom.get('required', False) is False


class TestTRIReporting:
    """Tests for TRI/SARA 313 reporting checks."""

    @pytest.mark.asyncio
    async def test_tri_reporting_required_above_threshold(self, service, mock_db):
        """TRI reporting should be flagged when usage exceeds threshold."""
        composition = [
            Mock(
                cas_number='67-64-1',
                chemical_name='Acetone',
                is_hazardous=True,
                concentration_percent=50
            )
        ]

        reg_data = Mock(
            is_epa_sara_313=True,
            sara_313_threshold_lbs=10000
        )
        mock_db.query.return_value.filter.return_value.first.return_value = reg_data

        result = await service.check_product_compliance(
            composition=composition,
            company_id='test-company',
            annual_usage_lbs=30000  # 50% * 30000 = 15000 > 10000 threshold
        )

        assert result.tri_reporting['reporting_may_be_required'] is True
        assert any('TRI REPORTING' in action for action in result.required_actions)


class TestProp65:
    """Tests for California Prop 65 checks."""

    @pytest.mark.asyncio
    async def test_prop65_warning_for_california_company(self, service, mock_db):
        """Prop 65 warning should be required for CA companies with listed chemicals."""
        composition = [
            Mock(cas_number='71-43-2', chemical_name='Benzene', is_hazardous=True)
        ]

        reg_data = Mock(
            is_california_prop65=True,
            prop65_type='cancer'
        )
        mock_db.query.return_value.filter.return_value.first.return_value = reg_data

        # Mock company state as California
        with patch.object(service, '_get_company_state', return_value='CA'):
            result = await service.check_product_compliance(
                composition=composition,
                company_id='test-company'
            )

        assert result.prop65['warning_required'] is True
        assert any('PROP 65' in action for action in result.required_actions)

Summary

This implementation guide provides:

  1. Clear determination logic for each compliance check type
  2. Data source documentation with URLs and update schedules
  3. Complete database schema for storing regulatory data
  4. Service implementation with all check logic
  5. API endpoints for frontend integration
  6. Background jobs for periodic compliance checks
  7. Testing strategy with example unit tests

The compliance checking service integrates with the chemical database enrichment system documented in Chemical Database Integration to provide comprehensive regulatory compliance automation.