Causal Analysis: Unknown-Unknown Reasoning and Drift Monitoring
This document explains the src/analysis/causal.py and src/analysis/monitor.py
prototype modules introduced in issue #63.
1. Unknown-Unknown Causal Reasoner (src/analysis/causal.py)
Problem
The C3 DiD model (src/score/causal_sanction.py) quantifies causal links between
sanction announcements and AIS-gap behaviour for vessels already connected to the
known sanctions graph. This leaves a blind spot: vessels with no current sanctions
overlap that nonetheless exhibit evasion-consistent behaviour — the "unknown-unknowns".
Method
For every vessel with sanctions_distance = 99 (no graph link):
- Feature-delta profile: compare the AIS gap rate in the recent 30-day window against the 30–90-day baseline window (same DiD intuition as C3, applied per vessel).
- Static signal checks: sts_candidate_count ≥ 3 and flag_changes_2y ≥ 2 from vessel_features are treated as additional evasion signals.
- Signal scoring: matching signals are combined via mean log-uplift, normalised to [0, 1] with a soft cap at 10× uplift.
- Causal evidence attachment: C3 CausalEffect objects (ATT, CI, p-value) from regimes with positive ATT are attached as context for analyst prompts.
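The scoring step can be illustrated with a minimal sketch. It is not the module's code: the function names `uplift` and `causal_score` are hypothetical, the zero-baseline fallback is an assumption chosen to match the example figures later in this document, and the cap is implemented here as a plain clamp, which may differ from the soft cap used in src/analysis/causal.py.

```python
import math


def uplift(recent: float, baseline: float) -> float:
    """Ratio of recent to baseline.

    Assumption: with a zero baseline the recent value itself is used.
    """
    return recent / baseline if baseline > 0 else recent


def causal_score(uplifts: list[float], cap: float = 10.0) -> float:
    """Mean log-uplift scaled so that an uplift of `cap` maps to 1.0."""
    if not uplifts:
        return 0.0
    # Clamp each uplift to [1, cap] so every log term lies in [0, log(cap)].
    logs = [math.log(min(max(u, 1.0), cap)) for u in uplifts]
    return sum(logs) / len(logs) / math.log(cap)


# Two matching signals: a 16x gap-count uplift and a 3x flag-change uplift.
signals = [uplift(8.0, 0.5), uplift(3.0, 0.0)]
print(round(causal_score(signals), 3))  # 0.739
```

Working in log space keeps a single extreme uplift from dominating the mean, which is presumably why the score is defined on log-uplift rather than on the raw ratio.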
Confidence and limitations
- A high causal score is an investigative lead, not a confirmed finding.
- The module cannot distinguish vessels with legitimately elevated activity from evasion candidates without additional field evidence.
- C3 causal evidence is regime-level (not vessel-level); it describes in general how sanctions affect gap behaviour, not whether this specific vessel responded to an announcement.
- The minimum-signals threshold (min_signals=1 by default) can be raised to reduce false positives at the cost of lower recall.
Usage
from src.analysis.causal import score_unknown_unknowns
from src.score.causal_sanction import run_causal_model
effects = run_causal_model()
candidates = score_unknown_unknowns(db_path="data/processed/mpol.duckdb",
                                    causal_effects=effects)
# Print the five top-ranked candidates with their prompt context
for c in candidates[:5]:
    print(c.mmsi, c.causal_score)
    print(c.prompt_context())
Analyst brief integration
When GET /api/briefs/{mmsi} is called, the brief system prompt automatically
includes the candidate's causal evidence context (if the vessel appears in the
unknown-unknown ranked list). The context block has the form:
CAUSAL EVIDENCE (unknown-unknown candidate):
• [OFAC Russia] ATT=+2.345 (95% CI [0.800, 3.890]), p=0.0210 (significant)
BEHAVIOURAL SIGNALS:
• ais_gap_count: recent=8.00, baseline=0.50, uplift×16.00
• flag_changes_2y: recent=3.00, baseline=0.00, uplift×3.00
NOTE: This vessel is NOT in any current sanctions list. ...
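As a rough illustration of how lines in this layout could be rendered, the sketch below formats one effect line and one signal line. The helper names `format_effect` and `format_signal`, the 0.05 significance threshold, and the "not significant" label are all assumptions; the actual prompt_context() implementation may differ.

```python
def format_effect(regime: str, att: float, ci_low: float, ci_high: float,
                  p_value: float, alpha: float = 0.05) -> str:
    """Render one regime-level effect line (alpha=0.05 is an assumed threshold)."""
    verdict = "significant" if p_value < alpha else "not significant"
    return (f"• [{regime}] ATT={att:+.3f} "
            f"(95% CI [{ci_low:.3f}, {ci_high:.3f}]), p={p_value:.4f} ({verdict})")


def format_signal(name: str, recent: float, baseline: float, uplift: float) -> str:
    """Render one behavioural-signal line."""
    return f"• {name}: recent={recent:.2f}, baseline={baseline:.2f}, uplift×{uplift:.2f}"


print(format_effect("OFAC Russia", 2.345, 0.800, 3.890, 0.021))
print(format_signal("ais_gap_count", 8.0, 0.5, 16.0))
```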
2. Drift Monitor (src/analysis/monitor.py)
Overview
The drift monitor runs four automated checks and emits DriftAlert objects with
severity levels ok | warning | critical.
| Check | What it detects |
|---|---|
| ais_gap_rate | Shift in AIS gap rate (gaps/vessel-day) between recent 30d and baseline 30–90d |
| flag_distribution | Shift in mean high-risk flag ratio vs a reference baseline (0.35) |
| watchlist_score_shift | Change in mean confidence score across sequential review history halves |
| concept_drift_proxy | Drop in confirmed/probable ratio across two sequential 90-day review windows |
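To make the ais_gap_rate windows concrete, here is a small sketch that computes gaps per vessel-day for the recent 30 days and for the 60 days before that. The function `window_gap_rates` and its inputs (an in-memory list of gap timestamps and a vessel count) are hypothetical stand-ins; the real check presumably reads these from the DuckDB file.

```python
from datetime import datetime, timedelta


def window_gap_rates(gap_times: list[datetime], vessel_count: int,
                     now: datetime) -> tuple[float, float]:
    """Return (recent_rate, baseline_rate) in gaps per vessel-day."""
    if vessel_count <= 0:
        raise ValueError("vessel_count must be positive")
    recent_start = now - timedelta(days=30)
    baseline_start = now - timedelta(days=90)
    recent = sum(1 for t in gap_times if recent_start <= t <= now)
    baseline = sum(1 for t in gap_times if baseline_start <= t < recent_start)
    # The recent window spans 30 days; the baseline spans the 60 days before it.
    return recent / (vessel_count * 30), baseline / (vessel_count * 60)


now = datetime(2024, 6, 30)
gaps = [now - timedelta(days=d) for d in (1, 5, 10, 40, 80, 100)]
recent_rate, baseline_rate = window_gap_rates(gaps, vessel_count=2, now=now)
print(recent_rate, baseline_rate)
```

Dividing by window length matters because the two windows are not the same size; comparing raw counts would understate any recent increase.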
Severity thresholds
| Check | Warning | Critical |
|---|---|---|
| AIS gap rate | ±30% relative change | ±60% |
| Flag distribution | ±10% relative change | ±25% |
| Watchlist score | ±8% relative change | ±15% |
| Concept drift proxy | ±10% relative change | ±20% |
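The table can be read as a simple classification on relative change. The sketch below is one way to express it; the `THRESHOLDS` mapping and `classify` function are hypothetical names, and treating the thresholds as inclusive (a change of exactly 30% counts as a warning) is an assumption.

```python
# (warning, critical) thresholds as absolute relative change, from the table above.
THRESHOLDS = {
    "ais_gap_rate": (0.30, 0.60),
    "flag_distribution": (0.10, 0.25),
    "watchlist_score_shift": (0.08, 0.15),
    "concept_drift_proxy": (0.10, 0.20),
}


def classify(check: str, baseline: float, recent: float) -> str:
    """Map a baseline/recent pair to 'ok', 'warning' or 'critical'."""
    if baseline == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    relative_change = abs((recent - baseline) / baseline)
    warning, critical = THRESHOLDS[check]
    if relative_change >= critical:
        return "critical"
    if relative_change >= warning:
        return "warning"
    return "ok"


# A high-risk flag ratio of 0.40 against the 0.35 reference is about +14%.
print(classify("flag_distribution", 0.35, 0.40))  # warning
```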
What counts as a drift alert
A warning alert should trigger investigation: review whether recent ingestion
quality has changed, whether new vessel classes have been added to the watchlist,
or whether analyst review behaviour has shifted.
A critical alert should trigger escalation: notify the data/model owner and
consider pausing automated ranking decisions until the root cause is understood.
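A scheduled job could turn this policy into a single recommended action by looking at the worst severity in a run. The sketch below assumes nothing about DriftAlert beyond a severity attribute; the `Alert` dataclass is a hypothetical stand-in, and the action labels are illustrative.

```python
from dataclasses import dataclass

SEVERITY_RANK = {"ok": 0, "warning": 1, "critical": 2}


@dataclass
class Alert:
    """Hypothetical stand-in for DriftAlert; only `severity` is relied on."""
    check: str
    severity: str


def worst_severity(alerts: list[Alert]) -> str:
    """Return the highest severity present, or 'ok' for an empty run."""
    return max((a.severity for a in alerts),
               key=SEVERITY_RANK.__getitem__, default="ok")


def recommended_action(alerts: list[Alert]) -> str:
    """Map the worst severity to the response described above."""
    worst = worst_severity(alerts)
    if worst == "critical":
        return "escalate"      # notify the owner, consider pausing ranking
    if worst == "warning":
        return "investigate"   # review ingestion, watchlist, analyst behaviour
    return "none"


run = [Alert("ais_gap_rate", "ok"), Alert("flag_distribution", "critical")]
print(recommended_action(run))  # escalate
```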
Limitations
- watchlist_score_shift and concept_drift_proxy require at least 10 review records to produce non-trivial output.
- flag_distribution uses a hard-coded reference baseline (0.35) rather than a stored snapshot; this is a prototype approximation.
- None of these checks substitute for a proper held-out evaluation set (see issue #62).
CLI usage
# Human-readable output
uv run python src/analysis/monitor.py --db data/processed/mpol.duckdb
# Machine-readable JSON
uv run python src/analysis/monitor.py --db data/processed/mpol.duckdb --json
Programmatic usage
from src.analysis.monitor import run_drift_checks, alerts_to_dict
alerts = run_drift_checks("data/processed/mpol.duckdb")
# Print only the checks that raised a warning or critical alert
for alert in alerts:
    if alert.severity != "ok":
        print(alert)
# JSON export
import json
print(json.dumps(alerts_to_dict(alerts), indent=2))