
arktrace

arktrace is a causal inference engine for shadow fleet prediction. It uses Difference-in-Differences (DiD) to identify vessels that causally respond to sanction announcements with evasion behaviour. In backtests this detects unknown-unknown threats (vessels with no current sanctions link) 60–90 days before they appear on public sanctions lists. AIS behaviour, ownership graph proximity, and trade flow data form the evidentiary substrate; causal inference and network-based backtracking propagation are the novel methodology.


The Problem

A shadow fleet vessel moves sanctioned oil or cargo while deliberately hiding what it is doing. It combines several techniques at once, which is why individual tracking tools miss it:

| Technique | How it works |
| --- | --- |
| Going dark (AIS off) | Switches off its tracking transponder during cargo transfers so no position record exists |
| GPS spoofing | Broadcasts a false position while the actual transfer happens elsewhere |
| Flag hopping | Changes its country of registration frequently to reset port inspection history |
| Name and identity change | Renames itself and re-registers under new shell companies to break continuity in watchlists |
| Ship-to-ship transfer at sea | Moves cargo vessel-to-vessel far from any port, leaving no port record |
| Shell company ownership | Buries beneficial ownership behind 4–6 layers of holding companies across multiple jurisdictions |

Conventional tools detect individual techniques in isolation. arktrace goes further: it uses causal inference to identify vessels whose evasion behaviour was triggered by specific sanction events, separating genuine evasion from ordinary commercial route changes, and propagates signals through the ownership network to surface connected threats not yet on any list.
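The ownership-network propagation mentioned above can be pictured as a breadth-first search outward from entities that are already flagged. The sketch below is illustrative only: the graph layout, the entity names, and the `hops_from_flagged` helper are hypothetical and are not taken from the arktrace codebase.

```python
from collections import deque


def hops_from_flagged(edges, flagged, max_hops=3):
    """Return {node: hop distance} for every node within max_hops of a flagged node."""
    # Build an undirected adjacency map from (owner, owned) pairs.
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    # Breadth-first search guarantees the first distance found is the shortest.
    distance = {node: 0 for node in flagged}
    queue = deque(flagged)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in adjacency.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return distance


# Hypothetical ownership records: (owner, owned company or vessel).
edges = [
    ("HoldCo A", "Vessel 1"),
    ("HoldCo A", "ShellCo B"),
    ("ShellCo B", "Vessel 2"),
    ("ShellCo C", "Vessel 3"),
]
result = hops_from_flagged(edges, flagged=["Vessel 1"])
```

Here `Vessel 2` is reached through two intermediate companies, while `Vessel 3` has no ownership path to the flagged vessel and is left out of the result.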

The default area of interest is the Strait of Malacca and Singapore Strait, one of the world's busiest shipping lanes. Five regions are supported: Singapore/Malacca Strait, Japan Sea, Middle East, Europe, and US Gulf.


How It Works — Four Steps

1. Ingest public data. The pipeline pulls vessel tracking data (AIS), international sanctions lists (OFAC, EU, UN), vessel ownership records, and bilateral trade statistics from public sources. No proprietary data feeds or costly subscriptions are required.

2. Apply causal inference and compute 19 signals per vessel. The core model (src/score/causal_sanction.py) runs a Difference-in-Differences regression for each vessel around every major sanction announcement, testing whether a behavioural change was causally driven by the event rather than coincidental. This is the primary innovation. Four signal families (movement anomaly, identity churn, ownership network distance, and trade flow mismatch) serve as the evidentiary substrate. They feed both the causal model and an unknown-unknown detector (src/analysis/causal.py), which surfaces vessels that have no current sanctions link but show evasion-consistent causal signatures.

3. Rank candidates on the watchlist. Causal scores and network propagation results are combined into a single confidence score. The dashboard shows a map and a ranked table with a plain-English explanation of the top signals that drove each vessel's ranking, for example "causal DiD response to 2024-10 OFAC announcement (p < 0.01), one ownership hop from a sanctioned entity, changed flag 3 times in 2 years", so an analyst can immediately understand the causal chain.

4. Hand off to a patrol officer. High-scoring vessels are exported as a task file for the patrol vessel. The officer dispatches the patrol vessel for close-range inspection. Results (confirmed, cleared, inconclusive) feed back into the causal model as hard labels, tightening future DiD estimates and triggering graph-wide backtracking to surface connected threats.
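The Difference-in-Differences idea behind step 2 can be illustrated with the classic two-group, two-period estimate. This is a minimal sketch under stated assumptions, not the model in src/score/causal_sanction.py: the `did_estimate` helper, the choice of daily AIS-gap hours as the outcome, and all numbers are hypothetical. The real model is described as a regression with significance testing, while this sketch returns only a point estimate.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Two-by-two difference-in-differences on group means.

    The control group's change stands in for what would have happened to the
    treated vessel without the announcement (the parallel-trends assumption).
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical daily AIS-gap hours before and after a sanction announcement.
vessel_pre = [1.0, 2.0, 1.0, 2.0]    # candidate vessel, mean 1.5
vessel_post = [5.0, 6.0, 5.0, 6.0]   # candidate vessel, mean 5.5
control_pre = [1.0, 1.0, 2.0, 2.0]   # comparable vessels, mean 1.5
control_post = [2.0, 2.0, 3.0, 3.0]  # comparable vessels, mean 2.5

effect = did_estimate(vessel_pre, vessel_post, control_pre, control_post)
# (5.5 - 1.5) - (2.5 - 1.5) = 3.0 extra gap hours per day beyond the control trend
```

Subtracting the control group's change is what separates a response to the announcement from a shift that affected all comparable vessels, such as a region-wide rerouting.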


How Effective Is It?

| Capability | What to expect |
| --- | --- |
| Pre-designation lead time | 60–90 days before OFAC listing (backtested via DiD on historical sanction announcements). See docs/scoring-model.md. |
| Unknown-unknown detection | Causal signatures surface vessels with no sanctions link whose behaviour pattern matches confirmed evaders, catching threats before they appear on any list. |
| Detection rate | Precision@50 target ≥ 0.60: at least 30 of the top-50 ranked candidates are confirmed OFAC-listed vessels. AUROC and Recall@200 are also tracked. |
| False positive reduction | Geopolitical rerouting filter down-weights DiD scores for vessels on declared diversion routes (e.g. Cape of Good Hope since 2023), reducing noise from legitimate commercial rerouting. |
| Network propagation | Confirmed vessels trigger graph-wide backtracking (scripts/run_backtracking.py) to surface ownership-connected threats not yet on any watchlist. |
| Explainability | Every score has a per-feature SHAP breakdown. Analysts see the exact causal and network signals that drove each result, with no black-box verdicts. |
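The detection-rate metrics in the table can be computed from a ranked watchlist and a set of confirmed vessels. The sketch below shows the standard definitions of Precision@k and Recall@k; the helper names and the vessel identifiers are hypothetical and are not taken from the arktrace validation code.

```python
def precision_at_k(ranked_ids, confirmed_ids, k):
    """Fraction of the top-k ranked candidates that are confirmed."""
    hits = sum(1 for vessel in ranked_ids[:k] if vessel in confirmed_ids)
    return hits / k


def recall_at_k(ranked_ids, confirmed_ids, k):
    """Fraction of all confirmed vessels that appear in the top-k."""
    if not confirmed_ids:
        return 0.0
    hits = sum(1 for vessel in ranked_ids[:k] if vessel in confirmed_ids)
    return hits / len(confirmed_ids)


# Hypothetical watchlist of 200 ranked vessels, best candidate first.
ranked = [f"V{i:03d}" for i in range(200)]

# Hypothetical ground truth: 30 confirmed vessels in the top 50, two further
# down the list, and eight that the ranking missed entirely.
confirmed = {f"V{i:03d}" for i in range(30)} | {"V120", "V150"}
confirmed |= {f"X{i:03d}" for i in range(8)}

p50 = precision_at_k(ranked, confirmed, 50)   # 30 / 50 = 0.60, exactly at target
r200 = recall_at_k(ranked, confirmed, 200)    # 32 / 40 = 0.80
```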

How Efficient Is It?

| Resource | Requirement |
| --- | --- |
| Hardware | Standard laptop (4 vCPU / 8 GB RAM). No GPU, no cloud, no external server. |
| Full pipeline run | ~45 minutes from raw data to ranked watchlist |
| Incremental re-score | Under 60 seconds per batch during live monitoring |
| Live alerting | Real-time SSE alerts when a vessel crosses a configurable confidence threshold |
| Software cost | Fully open-source. No licensing fees. |
| Regions | Switch between Singapore, Japan Sea, Middle East, Europe, and US Gulf with a single CLI flag; no code changes |
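The live-alerting row can be illustrated with a threshold-crossing check and the Server-Sent Events wire format (an `event:` line, a `data:` line, then a blank line). This is a hedged sketch: the event name `vessel_alert`, the payload fields, the vessel labels, and the 0.75 threshold are all assumptions made for illustration, not arktrace's actual alert schema.

```python
import json


def crossed_threshold(previous, current, threshold):
    """True only when a score moves from below the threshold to at or above it."""
    return previous < threshold <= current


def format_sse(event, payload):
    """Serialise one Server-Sent Events message, terminated by a blank line."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


# Hypothetical confidence scores before and after an incremental re-score.
previous_scores = {"VESSEL-A": 0.62, "VESSEL-B": 0.81}
new_scores = {"VESSEL-A": 0.78, "VESSEL-B": 0.85}
threshold = 0.75  # hypothetical configured value

alerts = [
    format_sse("vessel_alert", {"vessel": vessel, "confidence": score})
    for vessel, score in new_scores.items()
    if crossed_threshold(previous_scores.get(vessel, 0.0), score, threshold)
]
```

Only `VESSEL-A` produces an alert. `VESSEL-B` was already above the threshold, so alerting on the crossing rather than on the level avoids repeating the same alert after every re-score.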

Screening and Physical Investigation

arktrace covers Phase A, Screening (this repository). Phase B, Physical Investigation, is the patrol vessel software suite implemented in edgesentry-rs and edgesentry-app.

| Phase | What it does | Status |
| --- | --- | --- |
| A: Screening | Ingest public data → compute risk signals → rank candidates → analyst dashboard | Working |
| B: Physical Investigation | Patrol vessel dispatch → OCR identity check → LiDAR hull scan → cryptographically signed evidence → VDES secure transmission | Design specification complete; implementation begins after trial contract award |

Phase A produces the watchlist. Phase B acts on it. Patrol outcomes flow back into Phase A to improve future rankings.


Human Oversight

The model ranks candidates — humans decide what to do. No automated decision triggers legal or operational action.

  • Every candidate is reviewed by an analyst before escalation. Review tiers: Confirmed, Probable, Suspect, Cleared, Inconclusive.
  • "Confirmed" requires at least two independent high-credibility sources, or one official designation with a verified vessel identifier (MMSI/IMO match).
  • Every review decision is recorded with a rationale, evidence references, and reviewer identity.

Full policy: docs/triage-governance.md.
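The "Confirmed" rule above can be expressed as a small predicate. The sketch below is one possible reading, with assumptions flagged: the `Evidence` record and its field names are hypothetical, and source independence is approximated by distinct source names, which the governance policy may define more strictly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    source: str                        # who reported it (hypothetical field)
    high_credibility: bool = False     # assessed by the reviewing analyst
    official_designation: bool = False
    identifier_verified: bool = False  # MMSI/IMO match has been checked


def meets_confirmed_bar(evidence):
    """Return True if the evidence set satisfies either route to 'Confirmed'."""
    # Route 1: one official designation with a verified vessel identifier.
    if any(e.official_designation and e.identifier_verified for e in evidence):
        return True
    # Route 2: at least two independent high-credibility sources.
    # Assumption: two sources are independent if their names differ.
    independent = {e.source for e in evidence if e.high_credibility}
    return len(independent) >= 2


# Hypothetical evidence sets.
designation = [Evidence("official list", official_designation=True, identifier_verified=True)]
two_sources = [
    Evidence("port authority report", high_credibility=True),
    Evidence("satellite imagery provider", high_credibility=True),
]
same_source_twice = [
    Evidence("port authority report", high_credibility=True),
    Evidence("port authority report", high_credibility=True),
]
```

A predicate like this only tells the analyst whether the evidence bar is met. The tier decision itself stays with the reviewer, in line with the policy that no automated decision triggers action.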


Document Index

| To understand… | Read |
| --- | --- |
| Detection signals and scoring formula | docs/scoring-model.md |
| All 19 features and what each detects | docs/feature-engineering.md |
| Causal reasoning and unknown-unknown detection | docs/causal-analysis.md |
| Physical vessel investigation (Phase B) | docs/field-investigation.md |
| Human oversight, evidence policy, tier taxonomy | docs/triage-governance.md |
| Validation metrics and backtesting methodology | docs/backtesting-validation.md |
| Three operational scenarios (duty officer, analyst, patrol) | docs/scenarios.md |
| Full roadmap (Phase A and Phase B) | docs/roadmap.md |
| Deployment (local, Docker, cloud VM) | docs/deployment.md |
| Tech stack and algorithm details | docs/technical-solution.md |
| Pipeline operations reference | docs/pipeline-operations.md |
| Regional configuration (Singapore, Japan Sea, etc.) | docs/regional-playbooks.md |