The end of the volume-based SIEM tax.
Caver is the complete security operations platform built to retire the expensive, per-GB SIEM model of the last decade. One open OCSF Parquet lakehouse holds your data. Detection, SOAR, ITSI, UBA, and AI security run on top as integrated layers. We work with everything you already use, federate with most search tools (Splunk, Elastic, Sentinel, Sumo, Datadog), support every major query language, and deploy faster than anything else on the market.
Five integrated layers
Companion products
Caver Core
Caver is a language-agnostic security data platform. The storage layer is open OCSF Parquet on any S3-compatible object store. The query layer speaks whatever language your team prefers: SPL, SQL, KQL, PromQL, or natural language via AI agents.
Architecture
The central concept: your security data lives as OCSF Parquet in a bucket you control. Caver sits on top, exposing multiple query interfaces and running detection, SOAR, ITSI, and UBA logic against the same data lake.
Query Languages
Caver exposes a unified security data layer that any query interface can address. There is no lock-in to a single language; teams pick the tool they already know.
/ui. Speaks every supported language transparently, type SPL, SQL, KQL, Sigma, PromQL, or natural language and caver routes to the right engine. Built-in search lab, Detection IDE, SLAM notables, ECHO service trees.
-- Same data, any language
-- SPL
index=aws_cloudtrail sourcetype=aws:cloudtrail eventName=ConsoleLogin
| stats count by userIdentity.userName | sort -count
-- SQL (DuckDB / Trino)
SELECT actor_user_name, COUNT(*) as logins
FROM caver_lake.aws_cloudtrail
WHERE class_uid = 3002 AND activity_name = 'Logon'
GROUP BY 1 ORDER BY 2 DESC
-- KQL
CaverLake
| where class_uid == 3002 and activity_name == "Logon"
| summarize logins=count() by actor_user_name
| order by logins desc
Ingest Paths
CAVERN Security for Enterprise
CAVERN is the detection engine inside Caver. It replaces Splunk Enterprise Security with risk-based alerting, ATT&CK-mapped correlation rules, and 123 out-of-box content packs covering 4,000+ detection rules.
Detection Rules
Every CAVERN rule is a YAML file with a Sigma-compatible match block, an optional query override in any supported language (SPL, SQL, KQL, PromQL, raw Sigma), risk score, ATT&CK tags, and at least one fixture for CI validation.
id: cavern.ai_usage.prompt_injection_candidate
enabled: true
title: Prompt-injection candidate in LLM request
severity: high
score: 65
attack: [T1059, T1190]
match:
  selection:
    sourcetype: ai_usage
    prompt|contains:
      - "ignore previous instructions"
      - "forget the system prompt"
  condition: selection
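To make the match semantics concrete, here is a minimal sketch of how a Sigma-style selection like the one above could be evaluated against a single event. It is not CAVERN's evaluator: the function names are hypothetical, the rule is written as an already-parsed dict, and only plain equality, the `contains` modifier, and a single-selection condition are handled. It follows the usual Sigma conventions that fields in a selection are ANDed, list values are ORed, and string matching is case-insensitive.

```python
from typing import Any

# The rule above, already parsed into a dict (YAML parsing is out of scope here).
RULE = {
    "id": "cavern.ai_usage.prompt_injection_candidate",
    "score": 65,
    "match": {
        "selection": {
            "sourcetype": "ai_usage",
            "prompt|contains": [
                "ignore previous instructions",
                "forget the system prompt",
            ],
        },
        "condition": "selection",
    },
}


def field_matches(event: dict[str, Any], key: str, expected: Any) -> bool:
    """Evaluate one selection entry: plain equality or the `contains` modifier."""
    name, _, modifier = key.partition("|")
    actual = event.get(name)
    if actual is None:
        return False
    candidates = expected if isinstance(expected, list) else [expected]
    if modifier == "contains":
        text = str(actual).lower()
        return any(str(c).lower() in text for c in candidates)  # OR across list values
    if modifier == "":
        return any(actual == c for c in candidates)
    raise ValueError(f"unsupported modifier: {modifier}")


def rule_fires(rule: dict[str, Any], event: dict[str, Any]) -> bool:
    """True when every field of the named selection matches (AND across fields)."""
    match = rule["match"]
    selection = match[match["condition"]]  # only a single-selection condition is handled
    return all(field_matches(event, k, v) for k, v in selection.items())


hit = {"sourcetype": "ai_usage", "prompt": "Please IGNORE previous instructions and ..."}
miss = {"sourcetype": "ai_usage", "prompt": "Summarise this incident report"}
print(rule_fires(RULE, hit), rule_fires(RULE, miss))
```

The two sample events play the role of a rule fixture: one that should fire and one that should stay silent.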
Sigma Integration
Caver ships first-class support for Sigma, the community standard for vendor-agnostic detection rules.
enabled: false; enable per-category after tuning thresholds for your environment.
caver-sigma import ./rules/ -o ./cavern/ to transpile any Sigma rule file or directory. caver-sigma stats ./rules/ for transpile coverage report.
sigma-sync.yml GitHub Actions workflow polls SigmaHQ releases weekly. When a new release drops, it transpiles, opens a PR with a coverage delta table, and waits for your review.
# Transpile a single Sigma rule
caver-sigma import ./my_rule.yml
# Import the full SigmaHQ corpus
caver-sigma import ~/sigma/rules/ -o src/caver/cavern/content/v1/sigma/
# Coverage report
caver-sigma stats ~/sigma/rules/
# → 2960 transpiled (94.5%), 172 skipped (aggregations, field-refs)
Risk-Based Alerting
CAVERN's RBA model accumulates per-entity risk scores over a rolling window, then creates a SLAM notable when an entity's score crosses the threshold.
| Multiplier type | Factor |
|---|---|
| Admin account | 2.0× |
| Critical production asset | 1.875× |
| Service account | 0× (suppressed) |
| Default (standard user) | 1.0× |
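The accumulation step can be sketched as follows. The multipliers come from the table above; the 24-hour window, the threshold of 100, and every name in the code are illustrative assumptions rather than CAVERN defaults. The sketch also assumes events arrive in timestamp order.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field

# Multipliers from the table above. Window and threshold are assumed values.
MULTIPLIERS = {
    "admin": 2.0,
    "critical_asset": 1.875,
    "service_account": 0.0,  # suppressed
    "default": 1.0,
}
WINDOW_SECONDS = 24 * 3600
THRESHOLD = 100.0


@dataclass
class RiskLedger:
    """Holds (timestamp, weighted_score) contributions per entity in a rolling window."""

    events: dict = field(default_factory=lambda: defaultdict(deque))

    def add(self, entity: str, entity_type: str, score: float, ts: float) -> bool:
        """Record a rule hit; return True when the windowed total reaches THRESHOLD."""
        weighted = score * MULTIPLIERS.get(entity_type, MULTIPLIERS["default"])
        window = self.events[entity]
        window.append((ts, weighted))
        # Drop contributions that have aged out of the rolling window.
        while window and window[0][0] <= ts - WINDOW_SECONDS:
            window.popleft()
        return sum(w for _, w in window) >= THRESHOLD


ledger = RiskLedger()
print(ledger.add("alice", "admin", 65, ts=0))                  # 65 x 2.0 = 130
print(ledger.add("bob", "default", 65, ts=0))                  # 65 alone stays below 100
print(ledger.add("bob", "default", 40, ts=3600))               # 65 + 40 = 105
print(ledger.add("svc-backup", "service_account", 65, ts=0))   # suppressed to 0
```

A `True` return is the point at which a SLAM notable would be created for the entity.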
Content Packs (123)
| Category | Packs | Rules |
|---|---|---|
| AI / LLM | ai_usage · ai_coding_assistants · shadow_ai · langflow · managed_llm · litellm · portkey · cloudflare_ai_gateway · mcp · vector_db · embeddings · rag_pipelines · agent_frameworks · browser_agents · voice_ai · image_generation · fine_tuning · on_prem_inference · huggingface · model_supply_chain · prompt_library_drift · ai_governance · ai_cost_governance · ai_data_governance | 200+ |
| ATT&CK / Endpoint | art_endpoint · credential_access · defense_evasion · discovery · initial_access · lateral_movement · persistence · command_and_control · execution · privilege_escalation · collection · exfiltration · impact | 240+ |
| Host telemetry | win_security_eventlog · sysmon · linux_auditd · powershell_scriptblock · osquery · yara · clamav | 140+ |
| Edge / Network | edge_security · network · network_infrastructure · firewall_syslog · rmm_tools | 75+ |
| OT / ICS | industrial_bacnet · industrial_dnp3 · industrial_modbus · industrial_iec104 · industrial_s7comm · industrial_ethernet_ip · scada_windows · it_ot_correlation | 95+ |
| Cloud platforms | aws_cloudtrail · aws_guardduty · azure_ad · google_workspace · k8s · m365_audit · devsec_correlation · cloud_correlation | 220+ |
| EDR / Endpoint | crowdstrike_falcon · sentinelone · microsoft_defender · wazuh · sysdig_secure · edr_common · edr_to_cloud_correlation | 130+ |
| SaaS audit (70+ adapters) | okta · slack_audit · github_audit · salesforce_audit · zoom_operations · and 65+ more | 380+ |
| Community (Sigma) | Auto-transpiled at every SigmaHQ release, 100% of corpus mapped | 2,960+ |
Detection IDE
The built-in Detection IDE at /ui/cavern-detection-ide lets you author, lint, backtest, and diff rules without touching the filesystem. Write a rule, paste a sample event, see it fire or not fire, live. Auto-tune proposes threshold adjustments based on historical FP/TP rates; apply with one click and roll back if needed.
Supported Technologies
Caver ships out-of-box integrations for 121+ vendors and protocols across security, identity, infrastructure, and OT. Every integration includes a caver-collector receiver/adapter and at least one CAVERN content pack with detection rules tuned to the source's event shape.
Cloud Platforms 12
Identity & SSO 10
Productivity & Collaboration 12
Developer & DevOps 14
EDR & Endpoint 10
Network & Perimeter 11
Cloud Security 8
AI & LLM 12
OT / ICS 7
Sales, CRM & Support 15
Observability & Analytics 10
If you do not see a tool you rely on, suggest it and we will scope it. New integrations land as a caver-collector receiver plus a CAVERN content pack, typically inside a single release cycle.
Already have a dashboard you love? Keep it. Caver registers as a peer on most search tools (Splunk, for example), so your existing dashboards, saved searches, and correlation rules keep running unchanged against the OCSF lake. Federation modes ship for Elastic / Kibana, Microsoft Sentinel, Sumo Logic, and Datadog, with more being added.
SLAM SOAR
SLAM (Security Localization And Mapping) is Caver's SOAR layer. It handles the full lifecycle from CAVERN notable creation through analyst triage, case management, evidence collection, and compliance-ready reporting.
Notables
A notable is created when an entity's RBA score crosses the configured threshold, or when a rule fires directly at high severity. Each notable carries the contributing rules, scores, entity identity, and a timeline of events.
Playbooks
Operator-editable YAML. Shipped playbooks cover AI usage, host telemetry, cloud privilege escalation, edge appliance exploitation, and OT/ICS anomalies. Playbook steps are composable: mix enrichment, routing, alerting, and case creation freely.
name: ai_usage_critical
trigger:
  notable_severity: [critical]
  pack: ai_usage
steps:
  - enrich_identity:
      field: actor.user.name
  - alert_telegram:
      message: "🔴 {title}, {actor.user.name} ({actor.department})"
  - create_case:
      priority: P1
      sla_minutes: 15
  - tag_notable:
      tags: [ai-security, auto-cased]
Cases
Cases aggregate related notables, attach evidence (Parquet exports, network captures, screenshots, forensic images) into the Evidence Locker with full chain of custody, track SLA timers, and export signed PDF reports for compliance, legal, or post-incident review. SOC KPI dashboards surface MTTD, MTTR, and false-positive rates per content pack.
Oncall Integration
Evidence Locker
The Evidence Locker is Caver's tamper-evident, audit-logged store for any artifact that needs to survive an incident's full legal lifetime. It pairs with SLAM cases, CAVERN notables, ECHO episodes, and compliance reports. Every artifact lands with a SHA-256 hash, a timestamped signature, and an immutable audit trail. The same artifact can be exported decades later with full chain-of-custody verification.
Architecture
The Locker is a thin write-once layer over standard S3-compatible object storage. Caver writes artifacts with S3 Object Lock in compliance mode (legal-grade WORM). A separate ledger tracks every read, write, hold, and export. The ledger itself is append-only and Merkle-tree hashed for integrity.
Evidence Types
The Locker is artifact-agnostic. Anything you can serialise to bytes lands as evidence with the same chain-of-custody guarantees.
forensic_acquire playbook step. Standard E01 / AFF4 / raw supported.
Chain of Custody
Every artifact is admissible-ready. Caver records who touched it, when, and with what cryptographic proof. The chain survives operator turnover, vendor swaps, and platform migrations.
| Field | What it proves |
|---|---|
| artifact_sha256 | The bytes have not been altered since the moment of ingestion |
| ingest_timestamp + Ed25519 signature | The artifact existed in Caver no later than this moment, signed by a Caver-controlled key |
| ingest_actor | Which user / playbook / API caller deposited the artifact, with their IdP identity at time of write |
| ingest_source_event_id | Which CAVERN notable / SLAM case / ECHO episode triggered the deposit, fully linkable |
| ledger_merkle_root | The state of the entire Locker ledger at the moment of ingestion, so the ledger itself cannot be edited without detection |
| access_log | Every subsequent read, including who, when, from what IP, with what justification (free-text + dropdown) |
| retention_policy | The retention class that applied at ingest, including any legal holds layered on later |
| verification_payload | Bundled SHA-256 + signature + ledger proof, downloadable as a standalone JSON for independent verification |
Independent verification
Caver ships caver-evidence-verify, a standalone CLI binary signed and distributed with the product. Point it at an exported evidence bundle and it re-computes the SHA-256, validates the Ed25519 signature, walks the Merkle proof, and exits 0 if the chain is intact. The verifier runs offline. A defence expert, opposing counsel, or auditor can confirm chain integrity without access to your Caver deployment.
caver-evidence-verify --bundle case-2026-04127.evidence.zip
# → verifying 41 artifacts...
# → 41/41 SHA-256 valid
# → 41/41 Ed25519 signature valid
# → ledger Merkle proof valid (root: 0x4f3a...)
# → chain of custody intact through 2026-04-18T14:23:01Z
# → exit 0
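For readers unfamiliar with Merkle proofs, the hash-and-proof part of that check can be sketched with the standard library alone. This is an illustration of the general technique, not the bundle format or the algorithm used by caver-evidence-verify: the tree layout (pairwise SHA-256, odd nodes paired with themselves), the proof encoding, and all function names are assumptions, and the Ed25519 signature step is omitted because it needs a third-party library.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    """Hash adjacent pairs; the caller guarantees an even number of nodes."""
    return [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root over at least one leaf; an odd node is paired with itself."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = _next_level(level)
    return level[0]


def merkle_proof(leaves: list[bytes], index: int) -> list[tuple[str, bytes]]:
    """Sibling hashes from leaf to root, each tagged with the sibling's side."""
    level = [sha256(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1  # the neighbour within the same pair
        proof.append(("left" if sibling < index else "right", level[sibling]))
        level = _next_level(level)
        index //= 2
    return proof


def verify(artifact: bytes, proof: list[tuple[str, bytes]], root: bytes) -> bool:
    """Re-hash the artifact and walk the proof; True only if it reproduces the root."""
    node = sha256(artifact)
    for side, sibling in proof:
        node = sha256(sibling + node) if side == "left" else sha256(node + sibling)
    return node == root


artifacts = [b"pcap-bytes", b"screenshot-bytes", b"parquet-export", b"disk-image", b"notes"]
root = merkle_root(artifacts)
print(verify(artifacts[3], merkle_proof(artifacts, 3), root))          # untouched artifact
print(verify(b"disk-image-edited", merkle_proof(artifacts, 3), root))  # altered artifact
```

The property that matters for an offline verifier is that `verify` needs only the artifact bytes, the proof, and the published root, not the rest of the ledger.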
Retention & Legal Hold
Retention is policy-driven and per-evidence-class. Legal holds layer on top and freeze deletion until the hold is released by an authorised user. Every hold action is logged.
| Policy class | Default retention | Common use |
|---|---|---|
| p1_critical | 10 years | P1 / P2 incidents, regulated industries |
| standard | 7 years | Default for all SLAM cases |
| operational | 3 years | Low-severity tuning / FP investigations |
| research | 1 year | Threat-hunt artifacts, intel-gathering |
| legal_hold | Indefinite | Applied on top of any class, blocks deletion until released |
| regulatory_lock | Per-policy (HIPAA 6y, PCI 1y after closure, SOX 7y) | Auto-applied when the affected asset is tagged with the compliance regime |
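The interaction between a retention class and a legal hold can be sketched as a small eligibility check. The retention periods are the defaults from the table; counting from the ingest date, the `Artifact` shape, and the function names are assumptions made for illustration, and the regulatory_lock class is left out.

```python
from dataclasses import dataclass
from datetime import date

# Default retention per policy class, in years, from the table above.
RETENTION_YEARS = {"p1_critical": 10, "standard": 7, "operational": 3, "research": 1}


@dataclass
class Artifact:
    policy_class: str
    ingested: date            # assumption: the retention clock starts at ingest
    legal_hold: bool = False


def deletable_on(artifact: Artifact) -> date:
    """First date on which the retention class alone would allow deletion."""
    target_year = artifact.ingested.year + RETENTION_YEARS[artifact.policy_class]
    try:
        return artifact.ingested.replace(year=target_year)
    except ValueError:
        # 29 February ingested, target year is not a leap year.
        return artifact.ingested.replace(year=target_year, day=28)


def can_delete(artifact: Artifact, today: date) -> bool:
    """A legal hold blocks deletion regardless of the retention class."""
    if artifact.legal_hold:
        return False
    return today >= deletable_on(artifact)


case_export = Artifact("standard", date(2020, 1, 15))
held_export = Artifact("research", date(2020, 1, 15), legal_hold=True)
print(deletable_on(case_export))
print(can_delete(case_export, date(2027, 1, 14)), can_delete(held_export, date(2030, 1, 1)))
```

Checking the hold first mirrors the table: the hold sits on top of any class and wins until it is released.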
Legal hold workflow
Export Formats
API & SDK
The Locker exposes a REST API and Python SDK for integration with custom workflows, SOAR connectors, and external case-management systems.
REST endpoints
# Deposit an artifact under a case
POST /api/locker/cases/<case_id>/artifacts
Content-Type: multipart/form-data
X-Caver-Justification: incident-response
# Retrieve an artifact (logged in access ledger)
GET /api/locker/artifacts/<artifact_id>?justification=...
# Apply legal hold (requires dual-auth)
POST /api/locker/holds
{"case_id": "...", "second_auth_token": "...", "reason": "..."}
# Export a case as signed PDF
POST /api/locker/cases/<case_id>/export?format=pdf
# Verify an exported bundle
POST /api/locker/verify
Content-Type: application/zip
Python SDK
from caver_sdk import Locker

# `token`, `pcap_bytes`, and `counsel_token` are supplied by the caller.
vault = Locker(endpoint="https://caver.internal", token=token)

# Deposit a PCAP under a case
art = vault.deposit(
    case_id="2026-04127",
    artifact_type="pcap",
    body=pcap_bytes,
    justification="suspicious egress flow capture",
)
print(art.sha256, art.locker_uri)

# Apply legal hold with dual-auth
vault.apply_hold(
    case_id="2026-04127",
    reason="pending litigation, see legal-2026-31",
    second_auth_token=counsel_token,
)

# Export and download a signed PDF
pdf_bytes = vault.export(case_id="2026-04127", format="pdf")
with open("incident-report.pdf", "wb") as fh:
    fh.write(pdf_bytes)
Configuration
[evidence_locker]
bucket = "caver-evidence-locker"
object_lock_mode = "compliance" # or "governance"
retention_default = "standard" # see retention table
signing_key_env = "CAVER_EVIDENCE_SIGNING_KEY" # Ed25519 private key
ledger_anchor_cadence = "1h" # how often to write the Merkle root to a public timestamping authority
[evidence_locker.acl]
deposit = ["role:analyst", "role:playbook"]
read = ["role:analyst", "role:lead", "role:legal"]
hold = ["role:legal", "role:ciso"] # dual-auth enforced
export = ["role:lead", "role:legal"]
[evidence_locker.retention.regulatory]
hipaa = "6y_after_closure"
pci = "1y_after_closure"
sox = "7y_after_closure"
gdpr_subject_rights = "subject_request_override" # allow regulator-driven deletion
ECHO ITSI
ECHO is Caver's service intelligence layer. It replaces Splunk ITSI with service trees, KPI tracking, episode correlation, and real-time health scores, all built on the same OCSF Parquet lake.
Service Trees
Define services hierarchically with parent/child relationships. Each node has a health score derived from its KPIs and contributing notables. Health propagates upward: a degraded leaf impacts every ancestor.
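One simple way to model upward propagation is to give each node the worst score found in its subtree. ECHO's actual aggregation function is not described here, so the `min` rule, the 0-100 scale, and the class below are assumptions chosen only to show the shape of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceNode:
    name: str
    own_health: float = 100.0  # 0-100, assumed to be derived elsewhere from KPIs and notables
    children: list["ServiceNode"] = field(default_factory=list)

    def health(self) -> float:
        """Effective health: the worst of this node's own score and its children's."""
        return min([self.own_health] + [child.health() for child in self.children])


gateway = ServiceNode("api-gateway", own_health=31)
billing = ServiceNode("billing", own_health=96)
platform = ServiceNode("customer-platform", children=[gateway, billing])
root = ServiceNode("company", children=[platform])

for node in (gateway, billing, platform, root):
    print(node.name, node.health())
```

With this rule the degraded `api-gateway` leaf drags `customer-platform` and `company` down to 31, while `billing` keeps its own score. A weighted average would soften the effect; which trade-off suits a given tree is a design choice.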
KPI Types
Attach any SPL, SQL, or KQL query as a KPI on any service node. ECHO ships 12 built-in KPI types covering throughput, latency, error rate, saturation, and security signal.
| KPI type | Use case | Default thresholds |
|---|---|---|
| throughput_rps | Requests per second | warn < 60% peer · crit < 30% |
| latency_p50 | Median response time | warn > 2× baseline · crit > 5× |
| latency_p95 | 95th-percentile latency | warn > 2× · crit > 5× |
| latency_p99 | 99th-percentile latency | warn > 2× · crit > 5× |
| error_rate | HTTP 5xx % | warn > 1% · crit > 5% |
| saturation | CPU / memory / queue depth | warn > 70% · crit > 90% |
| availability | Uptime % over window | warn < 99.5% · crit < 99% |
| notable_count | SLAM notables per window | warn > 1 · crit > 3 |
| anomaly_count | UBA anomalies per window | warn > 2 · crit > 5 |
| auth_failure_rate | 4xx auth fail % | warn > 5% · crit > 15% |
| data_freshness | Time since last event | warn > 5m · crit > 15m |
| custom | Any SPL/SQL/KQL expression | operator-defined |
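As a sketch of how these defaults translate into states, the function below classifies a KPI value as ok, warn, or crit. It covers only the KPI types whose thresholds are absolute numbers in the table; the baseline-relative and peer-relative types are left out. The dictionary layout and function name are hypothetical.

```python
import operator

# (breach comparison, warn threshold, crit threshold), copied from the table above.
THRESHOLDS = {
    "error_rate": (operator.gt, 1.0, 5.0),          # percent
    "saturation": (operator.gt, 70.0, 90.0),        # percent
    "availability": (operator.lt, 99.5, 99.0),      # percent, lower is worse
    "notable_count": (operator.gt, 1, 3),
    "anomaly_count": (operator.gt, 2, 5),
    "auth_failure_rate": (operator.gt, 5.0, 15.0),  # percent
}


def kpi_state(kpi: str, value: float) -> str:
    """Return 'crit', 'warn', or 'ok' for a KPI value; crit is checked first."""
    breached, warn, crit = THRESHOLDS[kpi]
    if breached(value, crit):
        return "crit"
    if breached(value, warn):
        return "warn"
    return "ok"


print(kpi_state("error_rate", 7.2))      # above the 5% crit line
print(kpi_state("availability", 99.2))   # below 99.5% but not below 99%
print(kpi_state("saturation", 70.0))     # thresholds are strict, so exactly 70 is ok
```

Storing the comparison operator alongside the thresholds keeps "higher is worse" and "lower is worse" KPIs on one code path.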
Episode Correlation
Correlated bursts of notables that share an entity, timeframe, and kill-chain stage become episodes. Episodes cut alert volume: for example, 40 individual notables collapse into one full-story incident with a timeline.
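A minimal sketch of that grouping: notables are bucketed by entity and kill-chain stage, then split wherever the gap between consecutive notables exceeds a window. The 15-minute gap, the `Notable` fields, and the function name are assumptions; ECHO's real correlation logic is not specified here.

```python
from dataclasses import dataclass
from itertools import groupby

WINDOW_SECONDS = 15 * 60  # assumed maximum gap between notables in one episode


@dataclass(frozen=True)
class Notable:
    entity: str
    stage: str   # kill-chain stage, for example "credential_access"
    ts: int      # epoch seconds
    title: str


def correlate(notables: list[Notable], window: int = WINDOW_SECONDS) -> list[list[Notable]]:
    """Group notables into episodes sharing entity, stage, and a contiguous time burst."""
    ordered = sorted(notables, key=lambda n: (n.entity, n.stage, n.ts))
    episodes = []
    for _, group in groupby(ordered, key=lambda n: (n.entity, n.stage)):
        current: list[Notable] = []
        for notable in group:
            # A gap larger than the window closes the current episode.
            if current and notable.ts - current[-1].ts > window:
                episodes.append(current)
                current = []
            current.append(notable)
        episodes.append(current)
    return episodes


burst = [Notable("alice", "credential_access", 10 * i, f"spray attempt {i}") for i in range(40)]
later = [Notable("alice", "credential_access", 7200, "late follow-up")]
other = [Notable("bob", "discovery", 30, "port scan")]
episodes = correlate(burst + later + other)
print([len(e) for e in episodes])
```

The 40-notable burst becomes a single episode, while the notable two hours later and the unrelated entity each form their own.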
Grafana Integration
Every service health score, KPI value, and episode count is exposed as a Prometheus metric at /metrics. Point Grafana at the endpoint for real-time dashboards. PagerDuty / Opsgenie / Telegram alerting via Grafana Alertmanager.
# Service health gauge
caver_echo_service_health{service="api-gateway"} 31
# KPI values
caver_echo_kpi_value{service="api-gateway",kpi="latency_p99"} 4823
caver_echo_kpi_value{service="api-gateway",kpi="error_rate"} 7.2
# Episode count
caver_echo_episode_count{service="customer-platform",window="15m"} 2
UBA: User Behavior Analytics
Caver UBA builds per-entity behavioral baselines and surfaces anomalies no single rule could catch. It operates across identity, endpoint, network, cloud, and OT signals simultaneously, with 65+ unsupervised anomaly models and ATT&CK kill-chain mapping built in.
Per-entity Baselines
Baselines are computed per individual entity over a rolling 30-day window. Deviation is scored relative to each entity's own history, not a global threshold.
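The difference between a per-entity baseline and a global threshold is easiest to see with a z-score computed against each entity's own history. This is a deliberately simple stand-in for the models UBA ships: the daily-count input, the population standard deviation, and the function name are assumptions.

```python
from statistics import mean, pstdev


def deviation_score(history: list[float], today: float, window_days: int = 30) -> float:
    """z-score of today's value against the entity's own last `window_days` values."""
    window = history[-window_days:]
    if len(window) < 2:
        return 0.0  # not enough history to form a baseline
    mu, sigma = mean(window), pstdev(window)
    if sigma == 0:
        return 0.0 if today == mu else float("inf")
    return (today - mu) / sigma


# Daily login counts over 30 days for two entities.
alice_history = [10] * 15 + [12] * 15   # mean 11, standard deviation 1
batch_history = [14] * 15 + [18] * 15   # mean 16, standard deviation 2

# The same observed value means different things for each entity.
print(deviation_score(alice_history, 16))
print(deviation_score(batch_history, 16))
```

Sixteen logins is five standard deviations above Alice's baseline but exactly average for the second entity, which is the case a fixed global threshold cannot express.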
Anomaly Models (65+)
Caver UBA ships 65+ unsupervised models organised by signal type. Models run continuously; scores update on every event. No labelled training data required.
Peer Group Analysis
UBA constructs peer groups from the identity graph: managers + reports, department members, role-based clusters, security-group membership. An action is "anomalous" not because it crosses a fixed threshold but because it is rare for this entity AND rare among its peers.
| Peer source | Cluster basis | Used by models |
|---|---|---|
| Identity graph (Azure AD / Okta / Google) | Manager · department · title | Department-outlier · role-outlier · manager-cluster |
| Security groups | Group membership intersect | Privilege-footprint · resource-set |
| Access patterns | K-means on application access histogram (30d) | Access-path · session-pattern |
| Asset graph | Service tag · tier · environment | Asset-tier outlier |
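The "rare for this entity AND rare among its peers" test can be sketched with frequency counts. The rarity measure, the 0.95 cutoff, and the names below are illustrative assumptions, not the scoring used by the shipped models.

```python
from collections import Counter


def rarity(action: str, observed: Counter) -> float:
    """1.0 when the action was never seen; approaches 0.0 as it dominates activity."""
    total = sum(observed.values())
    if total == 0:
        return 1.0
    return 1.0 - observed[action] / total


def is_anomalous(action: str, own: Counter, peers: list[Counter], cutoff: float = 0.95) -> bool:
    """Flag only when the action is rare for the entity AND rare across its peer group."""
    peer_total = Counter()
    for peer in peers:
        peer_total.update(peer)
    return rarity(action, own) >= cutoff and rarity(action, peer_total) >= cutoff


alice = Counter({"crm": 100})
finance_peers = [Counter({"crm": 80, "erp": 20}), Counter({"crm": 90})]
dba_peers = [Counter({"prod-db": 50, "crm": 10})]

print(is_anomalous("prod-db", alice, finance_peers))  # new for Alice and for her peers
print(is_anomalous("prod-db", alice, dba_peers))      # new for Alice, routine for these peers
```

The second call shows why the peer check matters: a first-time action is not flagged when the entity's peer group does it routinely.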
Threat Assembly
Individual anomalies are assembled into threat timelines when they share an entity, overlap in time, and span multiple ATT&CK phases. A timeline is one SLAM notable that contains the full attack story rather than 40 individual alerts.
ATT&CK Kill-Chain Mapping
Every anomaly model is tagged with the ATT&CK tactics it surfaces. Threat assembly preferentially groups anomalies that span multiple kill-chain phases, a high-confidence signal of an active intrusion.
| ATT&CK tactic | UBA models that fire here |
|---|---|
| Initial Access (TA0001) | First-time-app · impossible travel · MFA bypass · DGA shape |
| Execution (TA0002) | Rare process · LOLBin spike · unsigned binary · code injection |
| Persistence (TA0003) | Registry autorun · scheduled task · disabled-account reactivation · service account drift |
| Privilege Escalation (TA0004) | Token impersonation · privilege escalation · LSASS access · driver load |
| Defense Evasion (TA0005) | Token reuse · DLL sideload · unsigned binary · CloudTrail tamper |
| Credential Access (TA0006) | Password spray · credential dump · LSASS access · KMS key abuse |
| Discovery (TA0007) | Port scan · IAM enumeration · schema enumeration · column-mass-select |
| Lateral Movement (TA0008) | SMB/RDP lateral · token impersonation · cross-account assumption |
| Collection (TA0009) | Bulk export · sensitive-table access · S3 mass download |
| Exfiltration (TA0010) | Low-and-slow exfil · DNS tunneling · dump-to-stage anomaly |
| Command & Control (TA0011) | Beaconing detector · DGA shape · DNS tunneling · TLS fingerprint anomaly |
| Impact (TA0040) | Resource-deletion burst · KMS key abuse · disabled MFA admin |
Cross-source Correlation
UBA correlates anomalies across signal sources through the entity graph. An identity anomaly on alice@corp + an endpoint anomaly on alice-laptop (which Azure AD links to alice) + a cloud anomaly under role alice-iam become a single threat timeline scoring 3× their individual sum.
Identity + Asset Graph Enrichment
Every anomaly is enriched at fire time with the entity's full context:
Intelligence & AI
Caver exposes its full capabilities to AI agents via the Model Context Protocol, and ships a built-in orchestrator for natural-language operator workflows.
MCP Server
Install caver[mcp-server] and connect any MCP-compatible AI client: Claude Desktop, Claude Code, GPT-4 with function calling, or a custom agent. The MCP server enforces Caver's auth model across all queries.
// Claude Desktop / Claude Code config
{
  "mcpServers": {
    "caver": {
      "command": "caver",
      "args": ["mcp", "--config", "/etc/caver/caver.toml"]
    }
  }
}
Available MCP tools: query (run SPL/SQL/KQL), search_rules (find CAVERN rules by name/ATT&CK), get_notables, get_case, trigger_playbook, get_entity_timeline, get_service_health.
Orchestrator: 13 primitives
The Intelligence orchestrator is a Claude-powered admin interface. Operators describe what they need; the orchestrator plans and dispatches across these primitives:
| Primitive | Description |
|---|---|
| onboard_sample | Ingest a sample event → propose OCSF mapping → generate source config + rule + fixture |
| nl_to_config | Convert plain-language description to a caver.toml stanza or CAVERN rule YAML |
| forge_rule | Author a new CAVERN rule from a threat description or CVE |
| replay_against_history | Run a rule against historical Parquet data, count fires, estimate FP rate |
| lint_config | Validate caver.toml, CAVERN rule, or pipeline config against the schema |
| enrich_text | Look up an IP, domain, hash, or CVE in threat-intel feeds |
| explain_rule_silence | Why hasn't this rule fired in N days? (data gap, filter, threshold) |
| tune_threshold | Propose + apply a score/threshold change based on FP/TP history |
| start_canary | Route X% of traffic through a new rule/pipeline config for A/B validation |
| pipeline_diff | Compare two pipeline configs in plain language |
| compliance_report | Generate a coverage map for SOC 2, PCI, and HIPAA, with controls mapped to CAVERN rules |
Threat Intelligence
Caver ships first-class threat intelligence as a layer in the platform. Every supported feed normalises to OCSF 2003 (Threat Intelligence) or OCSF 2002 (Vulnerability Finding), populates the IOC bloom-filter for detection-time matching, and is queryable via the orchestrator's enrich_text primitive. Free open feeds ship enabled by default. Commercial feeds plug in with bring-your-own credentials.
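For readers who have not used one, a Bloom filter answers "have I possibly seen this indicator?" in constant time and fixed memory, with no false negatives and a small false-positive rate. The sketch below shows the general data structure only; the sizing, the hashing scheme, and the class are assumptions and do not describe Caver's implementation.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array probed by several hash positions per item."""

    def __init__(self, size_bits: int = 1 << 16, hashes: int = 4) -> None:
        if size_bits % 8:
            raise ValueError("size_bits must be a multiple of 8")
        self.size = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item: str):
        # Derive independent positions by salting SHA-256 with the hash index.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


iocs = BloomFilter()
for indicator in ("203.0.113.7", "evil.example.net", "198.51.100.23"):
    iocs.add(indicator)

print("203.0.113.7" in iocs)   # a loaded indicator is always reported as present
print("192.0.2.1" in iocs)     # an unseen address is almost certainly reported absent
```

Because a hit only means "possibly present", a detection pipeline would typically confirm it against the full indicator store before enriching a notable.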
Default Feeds (free, open, ship enabled)
These feeds ship enabled in a default Caver install. No commercial licence is required, and attribution is preserved per each source's licence.
| Feed | What you get | Indicator types | Cadence |
|---|---|---|---|
| abuse.ch URLhaus | Malicious URLs, live + recent | URL · domain · IP | 5m polling |
| abuse.ch MalwareBazaar | Malware samples, family + tags | SHA256 · YARA · family | real-time |
| abuse.ch ThreatFox | IOCs tied to malware family | IP · domain · URL · hash | 5m polling |
| abuse.ch Feodo Tracker | Botnet C2 IPs, Emotet / Dridex / TrickBot / QakBot / IcedID / BumbleBee | IP · port · family | hourly |
| abuse.ch SSL Blacklist | Malicious SSL/TLS certificates | SHA1 · family | daily |
| abuse.ch YARAify | YARA rules + sample matches | YARA · hash | 30m polling + webhook |
| LevelBlue OTX (formerly AlienVault) | Curated pulses with attribution + ATT&CK tagging | IP · domain · URL · hash · email · CVE | per-pulse subscription |
| OpenPhish Community | Verified live phishing URLs, last 5h | URL · target brand | 5m polling |
| CISA Known Exploited Vulnerabilities (KEV) | CVEs with confirmed in-the-wild exploitation + remediation due dates | CVE · vendor · product | daily |
| CISA AIS (Automated Indicator Sharing) | STIX/TAXII 2.1 indicator feed, US government and partner indicators | STIX bundle (full) | real-time TAXII |
| CISA CSAF Advisories | Structured security advisories from CISA + ICS-CERT | CVE · vendor · product | real-time |
| CISA / US-CERT Alerts | Human-readable APT bulletins + threat alerts | narrative + indicators | RSS / Atom |
| MISP (community sharing) | Plug into any MISP instance you have access to | full MISP attribute set | per-event subscription |
| Emerging Threats Open | Proofpoint community Suricata rules + IOCs | signature · IP · domain | daily |
| Spamhaus DROP / EDROP / ASN-DROP | Spam / hijacked / criminal hosting netblocks | CIDR · ASN | daily |
| Shodan InternetDB (free tier) | CVE-to-IP mapping, exposed services | IP · CVE · port · banner | on-demand lookup |
| GreyNoise Community | Benign internet-scan suppression (so an SSH scan does not light up every CAVERN rule) | IP · classification · noise score | on-demand lookup |
Commercial / Paid Feeds (bring your own credentials)
Commercial feeds plug in via standard receivers using credentials you already pay for. Configure your API key in the threat_intel section of caver.toml (Doppler-backed) and the feed lights up.
| Feed | Coverage | Auth |
|---|---|---|
| Recorded Future | Full Intelligence Card API, risk lists, alerts | API token |
| Mandiant Advantage / Google Threat Intel | APT profiles, IOCs, vulnerability intel | OAuth2 service principal |
| CrowdStrike Falcon Intelligence | Adversary tracking, IOCs, MalQuery hash lookups | OAuth2 client credentials |
| VirusTotal Enterprise | File / URL / domain / IP reputation lookups, intelligence searches | API key |
| Anomali ThreatStream | Aggregated threat intel platform feeds | API key |
| OpenPhish Premium | Full historic feed + target-brand attribution + WebSocket live stream | API key |
| abuse.ch Commercial | Commercial use of all abuse.ch feeds with no rate limits | Auth-Key header |
| GreyNoise Enterprise | Full-context noise dataset, IP scan history, RIOT trust list | API key |
| DomainTools Iris | Domain whois history, hosting graph, threat scoring | API key + username |
| Mimecast Threat Intelligence | Email-borne threat intel (URLs, attachments, sender reputation) | OAuth2 |
Custom Feeds
Any feed you can produce as STIX 2.1, OCSF JSON, CSV, NDJSON, or a webhook can be wired in without writing code. For weirder shapes, drop in a Python normaliser under src/caver_collector/transforms/ and the rest of the pipeline picks it up.
api_poll receiver against any HTTPS endpoint that returns JSON, NDJSON, or CSV. Map fields to OCSF via a small YAML config, no Python required.
webhook receiver. Caver listens, normalises, and indexes.
s3_sqs receiver picks it up via SQS notification, normalises, and loads it into the bloom-filter.
src/caver_collector/transforms/ implementing the standard interface. Output OCSF, get the bloom-filter + enrichment + CAVERN integration for free.
Custom feed config example
[[threat_intel.custom]]
name = "acme-internal-misp"
receiver = "misp"
endpoint = "https://misp.acme.corp"
auth_key_env = "ACME_MISP_KEY"
sharing_groups = ["internal-ops", "soc-tier-3"]
poll_interval = "5m"
[[threat_intel.custom]]
name = "isac-stix-feed"
receiver = "taxii"
endpoint = "https://taxii.example-isac.org/api/v21/"
collection_id = "94b1a4eb-..."
auth_basic_env = "ISAC_AUTH_BASIC"
poll_interval = "15m"
[[threat_intel.custom]]
name = "soar-playbook-output"
receiver = "webhook"
listen_path = "/ti/playbook"
secret_env = "SOAR_WEBHOOK_SECRET"
How feeds integrate
Every feed, whether default open, commercial, or custom, follows the same path through the platform.
From there, four things automatically light up:
enrich_indicator on notable creation to attach feed context: which feed first saw this IP, when, with what attribution.
caver.toml threat-intel section
[threat_intel]
bloom_filter_rebuild = "0 3 * * *" # nightly 3am
enabled_default_feeds = ["abusech", "otx", "openphish", "cisa_kev", "cisa_ais"]
[threat_intel.commercial]
recorded_future.api_key_env = "RECORDED_FUTURE_KEY"
crowdstrike.client_id_env = "CS_CLIENT_ID"
crowdstrike.client_secret_env = "CS_CLIENT_SECRET"
virustotal.api_key_env = "VT_API_KEY"
[threat_intel.attribution]
preserve_source = true # every IOC carries original feed + first-seen + tags
preserve_sharing_group = true # MISP sharing groups + STIX TLP retained
AI Observatory
Caver provides the most comprehensive security coverage for AI/LLM usage in the industry. 24 dedicated content packs and 200+ purpose-built CAVERN rules cover the full AI threat surface: prompt injection, shadow AI, agent framework abuse, vector-DB exfiltration, supply-chain compromise, voice/image generation misuse, and on-prem inference monitoring.
Core CAVERN Rules (ai_usage pack)
| Rule | ATT&CK technique | Category |
|---|---|---|
| api_key_leak_in_prompt | T1552 | credential access |
| system_prompt_exfil_suspect | T1213 | collection |
| model_supply_chain_match | T1195 | supply chain |
| prompt_injection_candidate | T1059 | initial access |
| output_pii_leak | T1567 | exfiltration |
| rag_indirect_injection_indicator | T1059 | indirect exec |
| dataset_poisoning_indicator | T1567 | exfiltration |
| cost_anomaly_per_user | T1496 | resource hijack |
| off_hours_token_spike | T1078 | valid accounts |
| model_variance_burst | T1190 | exploit public app |
AI Content Packs (24)
ai_usage · ai_coding_assistants · shadow_ai · langflow · managed_llm · litellm · portkey · cloudflare_ai_gateway · mcp · vector_db · embeddings · rag_pipelines · agent_frameworks · browser_agents · voice_ai · image_generation · fine_tuning · on_prem_inference · huggingface · model_supply_chain · prompt_library_drift · ai_governance · ai_cost_governance · ai_data_governance
Supported AI Tools
Gateways: LiteLLM · Portkey · Cloudflare AI Gateway · Helicone · LangSmith Proxy · Kong AI Gateway
Agents & orchestration: LangChain · LangGraph · AutoGen · CrewAI · LlamaIndex · LangFlow · MCP servers
Coding assistants: GitHub Copilot · Cursor · Codeium · Tabnine · Continue.dev · Amazon Q · Sourcegraph Cody · Windsurf
Browser / computer-use: Claude Computer Use · Browser-Use · Skyvern · Multi-On · OpenInterpreter
Voice / image / multimodal: ElevenLabs · OpenAI Realtime · Bland AI · DALL-E · Midjourney · Stable Diffusion · Flux · Replicate · Runway
Vector DBs: Pinecone · Weaviate · Qdrant · Chroma · pgvector · Milvus · LanceDB
On-prem inference: Ollama · LM Studio · llama.cpp · vLLM · TGI · TensorRT-LLM · any OpenAI-compatible endpoint
Telemetry is normalized to OCSF Application Activity (class 6005) by the caver-collector ai_usage_normalize source. All rules operate on OCSF fields, with no raw HTTP inspection and no TLS interception at the perimeter.
Caver-Collector v1.175
The security data pipeline layer. Three equally featured backends: Python for orchestration and complex normalisation, Vector (Rust) for high-throughput hot paths, and an OTel collector distro for organisations already running OTel infrastructure.
Python Pipeline
The Python orchestrator handles control-plane logic, vendor-specific API polling, and complex normalisation that benefits from Python's ecosystem.
Receivers (21)
| Receiver | Purpose | Protocol / source |
|---|---|---|
| hec | Splunk HTTP Event Collector ingress | HTTPS · JSON / raw events |
| hec_replay | Replay recorded HEC NDJSON payloads | file / S3 NDJSON archive |
| syslog | Standard syslog ingress | RFC3164 + RFC5424 · TCP/UDP/TLS |
| tcp_line | Reliable newline-delimited TCP | TCP framed by newline |
| udp_packet | Generic UDP datagram (no syslog framing) | UDP raw |
| file_tail | File log tailing with checkpointing | Local files · rotation aware |
| webhook | Generic HTTP POST endpoint for SaaS push | HTTPS · arbitrary JSON |
| otlp_http | OpenTelemetry Protocol over HTTP | OTLP/HTTP · logs/metrics/traces |
| kafka_consumer | Kafka topic consumer with offset tracking | Kafka 2.x+ · SASL/SCRAM · mTLS |
| kinesis_firehose | AWS Kinesis Firehose HTTP push | HTTPS · Firehose record format |
| s3_sqs | S3 object notifications via SQS queue | S3 → SQS → object fetch |
| caver_lake_replay | Replay caver lake Parquet partitions | OCSF Parquet · time-bounded |
| windows_event_log | Windows EvtSubscribe live channel subscription | WMI EvtSubscribe |
| uf_compat | Splunk Universal Forwarder S2S inbound | Splunk S2S binary protocol |
| api_poll | SaaS API poller (70+ adapters) | HTTPS · per-vendor pagination |
| modbus_tcp | Modbus TCP industrial protocol | Modbus TCP · function code 1-127 |
| dnp3 | DNP3 SCADA protocol monitor | DNP3 TCP/UDP |
| bacnet | BACnet/IP building automation | BACnet/IP UDP |
| iec104 | IEC 60870-5-104 power grid SCADA | IEC 104 TCP |
| ethernet_ip | EtherNet/IP (Allen-Bradley CIP) | EtherNet/IP TCP/UDP |
| s7comm | Siemens S7-300/400/1200/1500 PLC | S7Comm TCP |
SaaS API Adapters (70+)
Transformation & Normalization (44)
Vector Backend (vector-caver)
Rust-speed hot path for high-throughput log sources. Fork of Vector (MPL-2.0) with four custom Caver crates targeting 300k+ events/s.
remap transform.transform processor factory.
OTel Backend
An OpenTelemetry Collector distribution (caver-otelcol) built with ocb v0.129. Deploy it anywhere you already run OTel infrastructure: it speaks native OTLP and drops into existing pipelines.
receivers:
  filelog:
    include: ["/var/log/app/*.jsonl"]
processors:
  transform/ocsf:
    log_statements:
      - context: log
        statements:
          - set(attributes["class_uid"], 4001)
          - set(attributes["metadata.product.name"], "caver-otelcol")
  batch:
    send_batch_size: 500
    timeout: 30s
exporters:
  awss3:
    s3uploader:
      endpoint: "http://minio:9000"
      s3_bucket: "caver-lake"
    marshaler: parquet
# wire the components above into a logs pipeline
service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [transform/ocsf, batch]
      exporters: [awss3]
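To make the processor semantics above concrete, here is a small Python sketch of what the two `set(...)` statements and the batch settings do to a stream of log records. It is a simplified model, not the collector's implementation: the function and class names are hypothetical, and the real batch processor's timer behaviour is more involved than the "time since first buffered item" rule used here.

```python
def apply_ocsf_attributes(record: dict) -> dict:
    """Mimic the two set(...) statements: stamp OCSF attributes on a log record."""
    attrs = dict(record.get("attributes", {}))
    attrs["class_uid"] = 4001
    attrs["metadata.product.name"] = "caver-otelcol"
    return {**record, "attributes": attrs}


class Batcher:
    """Flush when the batch is full or the oldest buffered item is too old."""

    def __init__(self, send_batch_size: int = 500, timeout_s: float = 30.0):
        self.send_batch_size = send_batch_size
        self.timeout_s = timeout_s
        self.items: list = []
        self.first_at = 0.0

    def add(self, item, now: float):
        if not self.items:
            self.first_at = now  # start the timeout clock on the first item
        self.items.append(item)
        return self.tick(now)

    def tick(self, now: float):
        """Return a flushed batch, or None if neither limit has been reached."""
        full = len(self.items) >= self.send_batch_size
        stale = bool(self.items) and now - self.first_at >= self.timeout_s
        if full or stale:
            batch, self.items = self.items, []
            return batch
        return None
```

Passing the clock in as `now` keeps the sketch deterministic; a real pipeline would read a monotonic clock instead.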
Caver Forge
Caver Forge is the companion product that turns newly published CVEs into queryable detection content with no human in the loop. It scrapes new CVEs from configurable threat-intel feeds, drafts Sigma rules via Claude grounded in your OCSF schema, transpiles to SPL, runs the query through Caver's /services/search/run endpoint, and writes Detection Finding (OCSF 2004) rows to stage_alerts/ in the lake. The output answers one question continuously: "would my lake have caught this CVE if I had been watching?"
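The last step of that flow, turning a search response into a staged result, can be sketched as follows. Only the `X-Caver-Row-Count` header name and the `StagedResult` name come from this page; the dataclass fields, the function name, and the choice to treat a missing header as an error are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class StagedResult:
    # Field names are hypothetical; only the class name appears in the docs.
    cve_id: str
    spl: str
    row_count: int

    @property
    def observed_in_lake(self) -> bool:
        """True when the generated query matched at least one row."""
        return self.row_count > 0


def build_staged_result(cve_id: str, spl: str, response_headers: Mapping[str, str]) -> StagedResult:
    """Read the row count from response headers (header names are case-insensitive)."""
    lowered = {k.lower(): v for k, v in response_headers.items()}
    if "x-caver-row-count" not in lowered:
        # Assumption: a missing header is an error rather than "zero rows".
        raise KeyError("X-Caver-Row-Count header missing from response")
    return StagedResult(cve_id=cve_id, spl=spl, row_count=int(lowered["x-caver-row-count"]))
```

The CVE ID used with this sketch should be whatever the scrape stage produced; nothing here validates its format.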
What it does (and what it doesn't)
stage_alerts/, queryable like any other lake data.
Threat Intel Feed Registry
Forge ships a curated default feed registry. Two feeds are enabled by default (NVD + CISA KEV); the rest are listed and opt-in. User overrides live in ~/.config/cve-forge/feeds.yaml.
| ID | Type | Default | What it covers |
|---|---|---|---|
| nvd | NVD | enabled | NIST canonical CVE feed (CPE-matched, CVSS-scored) |
| cisa-kev | CISA KEV | enabled | CISA Known Exploited Vulnerabilities. Federal patch deadline |
| osv | OSV | opt-in | OSV.dev open-source ecosystem vulnerabilities |
| ghsa | GHSA | opt-in | GitHub Security Advisories |
| zdi | RSS | opt-in | Zero Day Initiative published advisories |
| msrc | MSRC | opt-in | Microsoft Patch Tuesday CVRF bulletins |
| cisco-psirt | Cisco openVuln | opt-in | Cisco PSIRT advisories |
| project-zero | Atom | opt-in | Google Project Zero public bug tracker |
| github-advisory-db | Git repo | opt-in | github/advisory-database canonical mirror |
| vulncheck-nvd | VulnCheck | opt-in | VulnCheck NVD++ commercial mirror (faster + enriched) |
Custom feeds drop in via the same YAML. Point at any RSS / JSON / Atom endpoint:
feeds:
  - id: my-private-feed
    name: "Internal threat intel"
    type: rss
    url: https://intel.internal.example.com/feed
    enabled: true
    auth_env: INTERNAL_FEED_TOKEN
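A feed entry like the one above can be sanity-checked before it is used. The Python sketch below shows one way to do that, assuming the entry has already been parsed into a dict. The validation rules, the set of accepted `type` strings, and the function name are assumptions; only the keys themselves come from the YAML above.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

# Assumed type strings for custom feeds (the docs mention RSS / JSON / Atom).
ALLOWED_TYPES = {"rss", "json", "atom"}


def validate_feed(entry: dict, env: Mapping[str, str] = os.environ) -> dict:
    """Check required keys and resolve the token named by auth_env, if any."""
    missing = [key for key in ("id", "type", "url") if not entry.get(key)]
    if missing:
        raise ValueError(f"feed entry missing keys: {missing}")
    if entry["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unsupported feed type: {entry['type']}")
    if urlparse(entry["url"]).scheme not in ("http", "https"):
        raise ValueError(f"feed url must be http(s): {entry['url']}")

    token: Optional[str] = None
    auth_env = entry.get("auth_env")
    if auth_env:
        # auth_env names an environment variable; the secret never lives in YAML.
        if auth_env not in env:
            raise KeyError(f"environment variable {auth_env} is not set")
        token = env[auth_env]

    return {
        "id": entry["id"],
        "url": entry["url"],
        "enabled": bool(entry.get("enabled", False)),
        "token": token,
    }


feed = {
    "id": "my-private-feed",
    "name": "Internal threat intel",
    "type": "rss",
    "url": "https://intel.internal.example.com/feed",
    "enabled": True,
    "auth_env": "INTERNAL_FEED_TOKEN",
}
```

Keeping only the variable name in the config file means the feed registry can be committed to version control without leaking credentials.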
Pipeline
Each CVE flows through five stages. Every stage is observable, retry-safe, and idempotent on CVE ID. Stages emit structured telemetry to Caver's /metrics for ECHO health tracking.
Enabled feeds (NVD, CISA KEV, custom, ...)
│
▼
cve_forge.scrape normalised CVE record (id, description,
│ CVSS, CPEs, CWEs, references)
▼
cve_forge.generate Claude prompt grounded in caver/sources.json,
│ returns a validated Sigma YAML
▼
cve_forge.transpile pysigma + pysigma-backend-splunk to SPL
│ (sentinel-aware: _not_applicable: true on
│ rules that cannot match this lake's sources)
▼
cve_forge.run POST SPL to caver, read X-Caver-Row-Count
│ header, build StagedResult
▼
S3 OCSF parquet sink s3://lake/stage_alerts/year=.../...
(class_uid 2004, Detection Finding)
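The retry-safe, idempotent-on-CVE-ID behaviour of the flow above can be sketched in Python as a runner that only records a CVE as done after every stage has succeeded. The stage bodies here are stubs that just record their name, and the function names and the in-memory `seen` set are assumptions, not Forge internals.

```python
from typing import Callable, Optional

Stage = Callable[[dict], dict]


def make_stage(name: str) -> Stage:
    """Stub stage: append its own name to the record's trace."""
    def stage(record: dict) -> dict:
        return {**record, "trace": record.get("trace", []) + [name]}
    return stage


# The five stages from the diagram, in order.
STAGES = [(n, make_stage(n)) for n in ("scrape", "generate", "transpile", "run", "sink")]


def run_pipeline(cve: dict, stages: list, seen: set) -> Optional[dict]:
    """Run all stages once per CVE ID; a failed run can be retried safely."""
    cve_id = cve["id"]
    if cve_id in seen:
        return None  # idempotent: already processed
    result = cve
    for _name, fn in stages:
        result = fn(result)  # an exception here leaves `seen` untouched
    seen.add(cve_id)  # mark done only after the final stage succeeds
    return result
```

Because the ID is added to `seen` last, a crash mid-pipeline means the next attempt starts from the beginning rather than being skipped.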
Sentinel handling
Some CVEs do not map to any data shape your lake collects (firmware vulns in IoT devices you do not monitor, vulns in apps your org does not run). Forge does not pretend to generate a rule for those. The generate stage returns _not_applicable: true with a structured reason, the row lands in stage_alerts/ with status not_applicable, and the loop moves on. The kiosk surfaces these so an operator can review the coverage gap and decide whether to add a source.
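A minimal Python sketch of that branch is shown below. The `_not_applicable` flag, the `not_applicable` status, and class_uid 2004 come from this page; the other field names, the `staged` status value, and the function name are illustrative assumptions.

```python
def to_stage_row(cve_id: str, generated: dict) -> dict:
    """Turn a generate-stage result into a row destined for stage_alerts/."""
    row = {"class_uid": 2004, "cve_id": cve_id}
    if generated.get("_not_applicable"):
        # Sentinel: no rule was produced, but the coverage gap is still recorded.
        row["status"] = "not_applicable"
        row["reason"] = generated.get("reason", "")
    else:
        row["status"] = "staged"  # hypothetical status for a normal finding
        row["sigma"] = generated["sigma"]
    return row
```

Writing a row in both branches is what lets an operator later query for every CVE the lake could not have matched.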
Configuration
Forge runs as its own process with its own config. Caver reads Forge's output as one more lake source. Minimal install:
pip install caver-forge # separate distribution
# point at your caver instance + LLM
export CAVER_URL=https://caver.internal:8089
export CAVER_TOKEN=$(doppler secrets get CAVER_HEC_TOKEN --plain)
export ANTHROPIC_API_KEY=$(doppler secrets get ANTHROPIC_API_KEY --plain)
# write target (same MinIO bucket caver reads from is fine)
export FORGE_S3_BUCKET=caver-lake
export FORGE_S3_PREFIX=stage_alerts/
# kick off a one-shot scrape
cve-forge scrape --hours 24
# or run the autonomous background loop
cve-forge loop --interval 1h
Caver-side source config
Add stage_alerts to your Caver sources.json so the kiosk + SPL surfaces light up the same way they would for any other source:
{
  "stage_alerts": {
    "bucket": "caver-lake",
    "prefix": "stage_alerts/",
    "schema": "ocsf_detection_finding_v1.3.0",
    "default_index": "stage_alerts"
  }
}
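To show how an entry like this ties an index name to a lake location, the Python sketch below resolves `default_index` to an S3 URI. The lookup logic and function name are assumptions about how such a file could be consumed, not a description of Caver's loader.

```python
import json

SOURCES_JSON = """
{
  "stage_alerts": {
    "bucket": "caver-lake",
    "prefix": "stage_alerts/",
    "schema": "ocsf_detection_finding_v1.3.0",
    "default_index": "stage_alerts"
  }
}
"""


def resolve_index(sources: dict, index: str) -> str:
    """Return the s3:// location backing a given index name."""
    for _name, cfg in sources.items():
        if cfg.get("default_index") == index:
            return f"s3://{cfg['bucket']}/{cfg['prefix']}"
    raise KeyError(f"no source configured for index {index!r}")


sources = json.loads(SOURCES_JSON)
location = resolve_index(sources, "stage_alerts")
```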
CLI Reference
| Command | Description |
|---|---|
| cve-forge feeds list | Show the registry + enabled state |
| cve-forge feeds enable <id> | Enable a feed (writes to ~/.config/cve-forge/feeds.yaml) |
| cve-forge feeds disable <id> | Disable a feed |
| cve-forge scrape --hours N | One-shot, pull from every enabled feed for the last N hours |
| cve-forge scrape --feeds-config /path/to/feeds.yaml | Use a custom feed config file |
| cve-forge loop --interval 1h | Run autonomously, scrape every interval |
| cve-forge replay <cve_id> | Re-process a specific CVE (debugging / re-grounding) |
| cve-forge stats | Coverage stats: how many CVEs in the last 7d, how many staged, how many not_applicable, cost |
| cve-forge cost --since <date> | LLM token cost report |
Forge vs CAVERN
Forge and CAVERN sit at opposite ends of the detection lifecycle. Forge is breadth-first (every CVE, no human); CAVERN is depth-first (every rule hand-tuned). They feed each other.
| Aspect | Caver Forge | CAVERN |
|---|---|---|
| Authoring | Autonomous (Claude-generated) | Hand-authored by detection engineers |
| Trigger | New CVE published | Threat model / hypothesis / incident |
| Speed | Minutes to hours after CVE drop | Days to weeks (proper engineering cycle) |
| Output | stage_alerts/ Detection Finding rows | Production CAVERN rules, RBA-scored notables |
| Alerting | None (staged, batch) | Real-time, oncall paging via SLAM |
| Tuning | Per-CVE auto-regenerate | FP/TP feedback loop with caver-cavern auto-tune |
| Cost model | ~$0.01 per CVE in LLM tokens | Detection engineer time |
| Promotion | Stage to production via SLAM Phase-5 bridge | Already in production |
Promotion bridge
When a Forge-generated staged finding surfaces an obviously valuable detection (high CVSS, observed in your lake, plausible threat model), SLAM's Phase-5 bridge offers a one-click promotion: the staged Sigma rule moves into a CAVERN content pack with operator review, gets fixtures, and joins the production detection set. The lineage from CVE to staged rule to production CAVERN rule is preserved end-to-end.
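The two machine-checkable criteria in that description (high CVSS, observed in your lake) could be pre-filtered before an operator reviews the threat model. The Python sketch below is one way to express that; the field names, the 9.0 threshold, and the function name are hypothetical, and the final promotion decision stays with the operator.

```python
def is_promotion_candidate(finding: dict, min_cvss: float = 9.0) -> bool:
    """Shortlist staged findings that are both severe and seen in the lake."""
    high_severity = finding.get("cvss", 0.0) >= min_cvss
    observed = finding.get("row_count", 0) > 0
    return high_severity and observed


def shortlist(findings: list, min_cvss: float = 9.0) -> list:
    """Order candidates by CVSS, highest first, for operator review."""
    candidates = [f for f in findings if is_promotion_candidate(f, min_cvss)]
    return sorted(candidates, key=lambda f: f["cvss"], reverse=True)
```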
OT / ICS, Caver Industrial
Caver Industrial extends the core platform to operational technology and industrial control system environments. It ships as a separate plugin with per-deployment pricing.
IT/OT Correlation
The it_ot_correlation and scada_windows packs bridge IT and OT telemetry, correlating Windows event logs from engineering workstations with industrial protocol anomalies on the same OT network segment. An attacker moving from the IT perimeter to a SCADA workstation to a PLC shows up as a single correlated timeline in SLAM.
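The core idea, joining IT and OT events that share a network segment and fall close together in time, can be sketched in Python as below. This is not the packs' actual logic: the event shape, the five-minute window, and the rule that the OT anomaly must follow the IT event are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    ts: float      # epoch seconds
    segment: str   # network segment label
    source: str    # e.g. "windows_event_log" or "modbus_tcp"
    detail: str


def correlate(it_events: list, ot_events: list, window_s: float = 300.0) -> list:
    """Pair each IT event with OT anomalies on the same segment shortly after it."""
    pairs = []
    for it in it_events:
        for ot in ot_events:
            same_segment = it.segment == ot.segment
            follows = 0 <= ot.ts - it.ts <= window_s
            if same_segment and follows:
                pairs.append((it, ot))
    return pairs
```

The nested loop is quadratic and only suitable for small batches; a production join would bucket events by segment and time first.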
Vendor integrations
Compare Caver
Honest side-by-side comparisons against the field. Pick a tab and the comparison expands inline: an at-a-glance table, where the incumbent wins, where Caver wins, and how to decide.
Caver vs Splunk
An honest comparison of Caver against Splunk Enterprise and Splunk Cloud. What Splunk does well, where Caver wins, and how to decide.
Splunk is the platform most teams compare Caver against, because Splunk is the platform most teams are running today. This page is the honest comparison.
At a glance
| | Splunk Enterprise / Cloud | Caver |
|---|---|---|
| License model | Ingest-per-day (Splunk Enterprise) or workload pricing (Splunk Cloud). Notoriously hard to predict. | Per-deployment, transparent license-key. No per-GB meter. |
| Cost trajectory at scale | Grows with telemetry volume. | Flat per deployment. |
| Cold-data search | Requires rehydration from frozen / archived buckets to a hot tier. Slow and expensive. | Native search over object storage. No rehydration. |
| Storage format | Splunk proprietary buckets. | Parquet, iceberg, and similar open formats. |
| Query languages | SPL. | SPL + KQL + SQL natively, all on the same backend with a language toggle. Plus AI agents over MCP, Grafana, DuckDB, Trino, and Athena against the same OCSF Parquet lake. |
| Forwarders | Universal Forwarder, per-agent licensing implications. | Pairs with caver-collector or your existing forwarders. No per-agent license. |
| Deploy time | Quarters for enterprise rollout. | Days for a working pilot. |
| App ecosystem | Splunkbase: largest catalog (thousands), 15+ years deep. Quality varies; many apps abandoned or shallow. Field normalization left to the operator. | Curated, meticulously authored. Each vendor pack includes dashboards, saved searches, data inputs, and OCSF field mappings. Daily updates. Ships with the product, no third-party install. Migrators auto-port Splunk dashboards, saved searches, ES correlation, ITSI, UBA, and SOAR playbooks. |
| Vendor lock-in | High (proprietary format, proprietary catalog). | Low (open storage formats, no proprietary catalog). |
| OT / ICS coverage | Splunk Industrial Asset Intelligence was deprecated in 2023. What’s left is community Splunkbase apps and the OT Security Add-on, both layered on the same general-purpose stack. No first-class industrial protocol decoding. | caver-industrial: passive deep-packet decoders for BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA. Framework alignment for NIST 800-82 + IEC 62443. Air-gap-friendly deploy. Curated industrial threat intel. |
| AI security visibility | Limited. | caver-aisec, purpose-built. |
Strengths
| Where Splunk wins | Where Caver wins |
|---|---|
| Ecosystem maturity. Splunkbase has been around for fifteen years. If there’s an obscure integration you need, someone has probably written it for Splunk. | Cost predictability. No quarterly ingest-license renegotiation. No “we ingested too much last month” surprises. |
| Enterprise sales motion. Splunk has the procurement story your CFO’s office already knows how to evaluate. | Cold data is just data. Years of telemetry, searchable at interactive speeds, without paying hot-tier storage cost or running a rehydration pipeline. |
| Operator pool. Hiring “a Splunk admin” is easier than hiring for any newer platform, just by candidate density. | No storage lock-in. Your data lives in object storage in standard formats. Your data engineering team can use the same data for non-SIEM purposes. |
| | Query however you want. SPL, KQL, and SQL natively on the same backend. AI agents over MCP. Grafana, DuckDB, Trino, Athena over the same Parquet lake. Splunk gives you SPL. |
| | Content packs that ship complete. Each Caver pack includes dashboards, saved searches, data inputs, and OCSF field mappings, daily-updated. Splunkbase apps vary widely in completeness and freshness. |
| | Drop-in compatibility. Add Caver as a search peer to your existing Splunk environment. Operators don’t change tools. SPL queries fan out to both. |
| | Faster pilots. A working Caver pilot is days of work, not the multi-quarter procurement-plus-implementation cycle. |
| | OT and AI surfaces. caver-industrial and caver-aisec are first-class extensions, not bolt-ons. |
How to decide
If you have an existing Splunk investment you can’t justify ripping out, run Caver as a search peer alongside it. Use Caver for long-retention and cold-tier search; let Splunk continue to serve the workflows operators already know. The two work together.
If you’re greenfield, evaluate both. Splunk’s ecosystem maturity matters; Caver’s cost trajectory matters more if you expect telemetry volume to grow.
If you’ve already hit your Splunk license ceiling and the next true-up is the trigger for this conversation: that’s a great moment for Caver.
Talk to us about scoping — or read about how Caver works first.
Caver vs Elastic
Caver compared to Elasticsearch / Elastic Stack and Elastic Security. Open source pedigree vs. commercial focus, scale-engineering tax, and where each one fits.
At a glance
| | Elastic Stack / Elastic Security | Caver |
|---|---|---|
| License model | Open source (Apache 2.0 / Elastic License v2) + commercial subscriptions. | Per-deployment commercial license-key. |
| Storage | Elasticsearch indices. ILM moves data through hot / warm / cold / frozen tiers. | Native parquet / iceberg on object storage. |
| Cold-tier search | Frozen tier requires searchable snapshots with a performance penalty. | First-class search over object storage. |
| Query languages | KQL, EQL, ES\|QL, Lucene. | SPL + KQL + SQL natively, all on the same backend with a language toggle. Plus AI agents over MCP, Grafana, DuckDB, Trino, Athena over the same OCSF Parquet lake. ES\|QL native on roadmap. |
| Operator pool | Broad open-source community. | Smaller, focused on commercial deployments. |
| Scale engineering | Your team owns shard sizing, ILM tuning, rolling restarts, version upgrades. | We own the operational complexity. |
| Content ecosystem | Elastic integrations catalog plus community-authored content. Quality varies; many integrations require operator tuning. | Curated vendor packs that ship with dashboards, saved searches, data inputs, and OCSF field mappings. Daily updates. No third-party install. |
| OT / ICS coverage | No first-class OT product. Beats can ingest industrial telemetry via custom processors and community-authored content, but no out-of-box industrial protocol decoders and no framework-aligned content. | caver-industrial: passive deep-packet decoders for BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA. Framework alignment for NIST 800-82 + IEC 62443. Air-gap-friendly deploy. Curated industrial threat intel. |
| AI security visibility | Limited. | caver-aisec, purpose-built. |
Strengths
| Where Elastic wins | Where Caver wins |
|---|---|
| Open source heritage. You can run Elasticsearch entirely under your own roof, no commercial relationship required. | No scale-engineering tax. Shard layout, ILM policy, version upgrades, rolling restarts: these are operator-burden line items at any non-trivial Elastic deployment. Caver doesn’t ship that burden to you. |
| Broad community. Decades of community-authored detection content, dashboards, integrations. | Cold-tier economics. Searchable snapshots are a real Elastic capability, but they pay a measurable performance penalty. Caver’s object-storage search doesn’t. |
| Flexibility. Elasticsearch is a general-purpose engine that happens to also do SIEM. If you need full-text search, geospatial queries, log analytics, and observability all on the same platform, that’s a real story. | Query however you want. SPL, KQL, and SQL natively on the same backend. AI agents over MCP, Grafana, DuckDB, Trino, Athena over the same Parquet lake. Operators coming from KQL keep their language; operators on SPL or SQL get theirs too. |
| No license-key gate. Run as many clusters as you like with no per-deployment commercial conversation. | Content packs that ship complete. Each Caver pack includes dashboards, saved searches, data inputs, and OCSF field mappings, daily-updated. Elastic integrations vary widely in depth and freshness. |
| | Purpose-built. Caver is SIEM-focused. Elasticsearch is general-purpose with a SIEM product layered on it. The differences show up at the edges (deployment hardening defaults, audit posture, retention guarantees). |
How to decide
If you have strong Elasticsearch operators on staff and the cluster is already healthy, Elastic Security on top of it is a reasonable answer.
If you’re paying real money in operator time for cluster maintenance, version upgrades, or shard sizing, and you’d rather that time go elsewhere, Caver removes that line item.
If you need OT / ICS visibility, caver-industrial is in a different league than what’s available for Elastic.
Caver vs Microsoft Sentinel
Caver compared to Microsoft Sentinel. Azure-bound SaaS SIEM with KQL as its native language vs Caver, which speaks KQL natively but doesn't lock you to Azure.
Microsoft Sentinel is the SIEM most cloud-native shops compare Caver against if they’re already running on Azure. The honest answer: if you’re all-in on Azure and the Microsoft ecosystem, Sentinel is hard to beat. If you’re not, or if Azure lock-in is a problem you want to avoid, Caver is the structural answer.
Worth noting up front: Caver speaks KQL natively. The Sentinel-migration story is real, not aspirational.
At a glance
| | Microsoft Sentinel | Caver |
|---|---|---|
| Deployment model | Azure-only SaaS. Multi-tenant managed by Microsoft. | Self-hosted in your cloud, on-prem, or air-gapped. BYO Azure / AWS / GCP / on-prem. |
| License model | Azure ingest + retention pricing. Variable by Log Analytics workspace and commit tier. | Per-deployment commercial license-key. Flat. No per-GB meter. |
| Cost predictability | Variable. Ingest spikes cost real money. | Predictable. Flat per deployment. |
| Cold-tier search | Archived logs require restore to interactive tier. Slow and expensive. | First-class search over object storage. No rehydration. |
| Storage | Microsoft-managed Log Analytics + Azure storage. Opaque to you. | Native Parquet on object storage. Your bucket, your keys, open format. |
| Query languages | KQL only. | KQL natively (same operator surface: where, extend, summarize, join inner/leftouter/anti, let, parse, mv-expand, bin, case/iff, union). Plus SPL and SQL. Plus AI agents over MCP, Grafana, DuckDB, Trino, Athena. |
| Azure ecosystem integration | Deep. Defender for Cloud, Defender for Endpoint, Purview, Entra ID, Sentinel Notebooks. | Standalone. Integrates via APIs and standard log sources, but isn’t a Microsoft-blessed component. |
| SOAR | Logic Apps. Mature but Azure-bound. | SLAM. Built into Caver. Configuration-as-code playbooks, version-controlled, no separate Logic Apps subscription. |
| Content packs | Sentinel content hub (community + Microsoft + partner). Variable depth, Azure-flavored. | 35+ vendor packs with bundled dashboards + data inputs + OCSF field mappings. Daily updates. |
| Threat intelligence | Microsoft Threat Intelligence + Sentinel TI connectors. | Curated industrial threat intel for caver-industrial; AI threat feeds (NIST AI 100-2, OWASP) for caver-aisec; built-in TI integration for the core. |
| OT / ICS coverage | Defender for IoT (separate product, separate license). | caver-industrial: passive decoders for BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA. NIST 800-82 + IEC 62443. Air-gap-friendly. |
| AI security visibility | Limited. | caver-aisec: prompt-injection detection, AI Observatory for LLM spend, NIST AI 100-2 + OWASP feeds. |
| Data residency | Azure regions only. | Your chosen environment, your storage account. |
| Air-gap deployment | Not supported. | Supported, including caver-industrial. |
| Migration tooling | Migration paths from Splunk, ArcSight, QRadar via Microsoft-published guides (manual). | caver-migrate ports dashboards, saved searches, ES correlation, ITSI, UBA, SOAR/Demisto playbooks. 9-of-9 migrator coverage tested end-to-end. KQL-native landing for Sentinel queries. |
Strengths
| Where Sentinel wins | Where Caver wins |
|---|---|
| Deep Azure integration. If your stack is Defender + Purview + Entra ID + Logic Apps + Sentinel, the integration story is unbeatable. Microsoft does this category as well as anyone. | Not locked to Azure. Multi-cloud, on-prem, hybrid, air-gapped, edge. Sentinel can’t follow. |
| Microsoft enterprise sales motion. Procurement, EA renegotiation, Microsoft credit consumption: all paths your finance team already understands. | Transparent licensing. Per-deployment flat. No ingest meter, no commit-tier negotiation, no quarterly Azure-bill surprise. |
| KQL is its native language. Operators who came up on Sentinel won’t switch query languages. | Query flexibility. KQL natively (so Sentinel queries land directly), plus SPL and SQL. Sentinel gives you KQL only. |
| Mature SOAR via Logic Apps. If you’re already a Logic Apps shop, the SOAR story is already familiar. | Cost predictability at scale. Azure ingest pricing punishes growth. Per-deployment pricing doesn’t. |
| Sentinel Notebooks for hunt-style investigations in Jupyter. | OT / ICS coverage. caver-industrial is first-class. Defender for IoT is a separate product with separate licensing and limited integration. |
| | AI security visibility. caver-aisec is first-class. Sentinel doesn’t have a comparable AI-runtime product. |
| | Data residency. Your storage account, your keys, open format. Sentinel’s storage is opaque. |
| | Air-gap and on-prem. Sentinel can’t run in either; Caver can. |
| | Open storage. Same OCSF Parquet data is queryable by Grafana, DuckDB, Trino, Athena, AI agents, your data engineering team for non-SIEM purposes. Sentinel’s storage is locked behind KQL only. |
| | Migration tooling. caver-migrate ports a Sentinel deployment in one command, including the KQL queries that land directly on Caver’s native KQL engine. |
How to decide
Stay on Sentinel if:
- You’re all-in on Azure and the Microsoft ecosystem.
- Your security team’s depth is in KQL and the Defender stack.
- Variable ingest pricing is something your finance team is OK negotiating quarterly.
- You don’t need OT/ICS, AI security, multi-cloud, or air-gap deployment.
Move to Caver if:
- You want to leave Azure-only or you’ve already left.
- Your ingest is growing fast enough that the Azure bill has become an existential conversation.
- You need OT/ICS, AI security, multi-cloud, or air-gapped deployment.
- You want your data in open formats your data engineering team can also use.
- Your operators speak SPL or want to add SQL alongside KQL.
Run both during a migration window if:
- You have an existing Sentinel investment you can’t justify ripping out immediately.
- Stand Caver up alongside, point new data sources at it, gradually move the KQL queries (they land natively). Decommission the Sentinel workspace when the bill is gone.
Talk to us about scoping — or read about caver-migrate for the Sentinel migration path.
Caver vs Sumo Logic
Caver compared to Sumo Logic. Cloud-native SIEM economics, multi-tenant SaaS tradeoffs, and where each fits.
At a glance
| | Sumo Logic | Caver |
|---|---|---|
| Deployment model | Multi-tenant SaaS only. | Self-hosted in your cloud, on-prem, or air-gapped. |
| License model | Per-GB ingest + per-credit pricing. | Per-deployment commercial license-key. |
| Data residency | Sumo’s chosen regions. | Your chosen environment, your storage account. |
| Storage | Sumo’s. | Your object storage. |
| Cold-tier search | Continuous and frequent tiers; performance varies by tier. | Single object-storage tier, consistent performance. |
| Query languages | Sumo’s own query language plus LogReduce / LogCompare. | SPL + KQL + SQL natively, all on the same backend with a language toggle. Plus AI agents over MCP, Grafana, DuckDB, Trino, Athena over the same OCSF Parquet lake. |
| Content ecosystem | Sumo Apps catalog (vendor-published, varying depth). | Curated vendor packs that ship with dashboards, saved searches, data inputs, and OCSF field mappings. Daily updates. No third-party install. |
| Air-gap deployment | Not supported. | Supported, including for caver-industrial. |
| Custom integration cost | API integration. | Direct repo access in customer environment. |
| OT / ICS coverage | No first-class OT product. The multi-tenant SaaS deployment model is structurally incompatible with air-gapped industrial environments. | caver-industrial: passive deep-packet decoders for BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA. Framework alignment for NIST 800-82 + IEC 62443. Air-gap-friendly deploy. Curated industrial threat intel. |
| AI security visibility | Limited. | caver-aisec, purpose-built. |
Strengths
| Where Sumo Logic wins | Where Caver wins |
|---|---|
| Zero infrastructure operations. Sumo runs the platform. Your team writes queries and reads dashboards. | Data sovereignty. Your data stays in your environment. Important for regulated industries, government work, OT, and any organization with a “data does not leave our infrastructure” policy. |
| MSP and multi-tenant friendliness. Sumo has a strong MSP / managed-service motion built into the product. | Cost trajectory at scale. Per-GB ingest pricing punishes growth. Per-deployment pricing doesn’t. |
| Predictable SaaS economics if your volume is stable. When your ingest doesn’t grow much month-to-month, Sumo’s pricing is straightforward. | Air-gap and on-prem. Sumo cannot run inside an air-gapped industrial network. Caver can. |
| | Open storage format. Your data engineering team can use the same parquet / iceberg data for non-SIEM purposes. Sumo’s storage is opaque to you. |
| | Query however you want. SPL, KQL, and SQL natively on the same backend. AI agents over MCP, Grafana, DuckDB, Trino, Athena over the same Parquet lake. Sumo gives you Sumo’s query language. |
| | Content packs that ship complete. Each Caver pack includes dashboards, saved searches, data inputs, and OCSF field mappings, daily-updated. |
| | OT and AI surfaces. caver-industrial and caver-aisec have no Sumo equivalent. |
How to decide
If you’re cloud-native, multi-tenant, and your data-residency requirements don’t matter, Sumo is a reasonable SaaS answer.
If you have regulatory, compliance, or operational reasons to keep data in your own environment, or if your ingest is growing fast enough that per-GB pricing has become an existential conversation, Caver is the structural answer.
If you have OT, ICS, or air-gapped requirements, Caver is the only one of the two that can actually deploy there.
caver-collector vs Cribl Stream
A direct comparison between the caver-collector pre-index pipeline tier and Cribl Stream. Where the established player wins, where the integrated stack wins.
This comparison is specifically about caver-collector, the Caver-family pre-index pipeline component, against Cribl Stream. (For full SIEM-vs-SIEM comparisons, see vs Splunk, vs Elastic, or vs Sumo Logic.)
At a glance
| | Cribl Stream | caver-collector |
|---|---|---|
| Position | Independent pipeline tier in front of any SIEM. | Pipeline tier integrated with the Caver storage and search stack (also runs standalone). |
| Underlying engine | Cribl’s purpose-built pipeline. | Vector + OpenTelemetry dual backend. |
| License model | Cribl commercial license. | Per-deployment license-key (or included with Caver). |
| Pipeline UI | Mature visual pipeline builder. | Configuration-as-code first; UI a secondary surface. |
| Routing | Multi-destination routing, broadly. | Multi-destination routing, broadly. |
| Transformation primitives | Cribl’s own pack catalog. | Vector + OTel native primitives plus Caver-specific manipulation. 14 new stateless transforms shipped last week (parse_csv, parse_kv, cast_field, hash_field, rename_field, coalesce, extract_timestamp, filter, mask_value, json_parse, field_extract, rate_limit, dedupe, and more). |
| Adapter / source ecosystem | Cribl Packs catalog plus vendor-published TAs. | 60+ vendor adapters across two release cycles (Webex, Lacework, Mattermost, Buildkite, Discord, Meraki, CircleCI, Linode, MongoDB, and many more). Each ships with OCSF field mapping built in. |
| Industrial protocol decoding | Cribl Stream can route OT telemetry (syslog, raw TCP, custom inputs) but doesn’t decode industrial protocols natively. No first-class OT product story. | 7 passive deep-packet decoders (BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA). Air-gap-friendly. Pairs with caver-industrial on the SIEM side for NIST 800-82 + IEC 62443 framework alignment. |
| Vendor independence | Vendor-neutral (works in front of any SIEM). | Vendor-neutral (works in front of any SIEM), with first-class integration into Caver. |
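For readers unfamiliar with the term, a stateless transform is a pure function from one event to one event, with no memory of earlier events. The Python sketch below shows what a few of the transform names in the table could look like under that definition. The semantics are assumptions inferred from the names; they are not the caver-collector implementations.

```python
import hashlib
import re

# key=value or key="quoted value"
_KV = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_kv(line: str) -> dict:
    """Parse key=value pairs out of a log line; quoted values may contain spaces."""
    return {
        key: quoted if quoted is not None else bare
        for key, quoted, bare in (m.groups() for m in _KV.finditer(line))
    }


def rename_field(record: dict, old: str, new: str) -> dict:
    """Return a copy with `old` renamed to `new` (no-op if `old` is absent)."""
    out = dict(record)
    if old in out:
        out[new] = out.pop(old)
    return out


def hash_field(record: dict, field: str) -> dict:
    """Return a copy with the field replaced by its SHA-256 hex digest."""
    out = dict(record)
    out[field] = hashlib.sha256(str(record[field]).encode("utf-8")).hexdigest()
    return out


def mask_value(value: str, keep_last: int = 4) -> str:
    """Replace all but the last `keep_last` characters with asterisks."""
    visible = value[-keep_last:] if keep_last > 0 else ""
    return "*" * (len(value) - len(visible)) + visible
```

Because each function returns a new dict or string and touches no shared state, transforms like these can be chained or parallelised freely.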
Where Cribl wins
- Maturity. Cribl has been the dominant independent pipeline tier for years. The UI is more polished. The pack catalog is broader. The operator pool is bigger.
- Pure-play independence. Cribl explicitly positions as vendor-neutral. If neutrality from any single SIEM vendor is a procurement requirement, Cribl tells that story cleanly.
- Visual pipeline builder. If your team prefers to drag-and-drop transformations rather than write configuration, Cribl’s UI is better.
Where caver-collector wins
- Open-source engine pedigree. Vector and OpenTelemetry are both open-source, both mature, both have huge community ecosystems. Cribl’s engine is proprietary.
- Integration with Caver. When paired with Caver, the storage, search, content, and pipeline tiers all share an operator surface. With Cribl + Caver, you get two separate operator surfaces.
- Configuration-as-code first. GitOps-friendly. Version-controlled pipeline definitions. PR review on pipeline changes.
- No additional commercial conversation if you already have Caver. caver-collector is included with Caver deployments. Cribl is a separate purchase.
How to decide
If you’ve already chosen Cribl and it’s working, there’s no urgent reason to replace it. Cribl + Caver is a valid combination; Caver doesn’t care what fronts it.
If you’re greenfield and considering both, evaluate the integrated-stack benefit of caver-collector + Caver against Cribl’s maturity advantage. For most teams, the integrated stack wins on operational complexity. For teams that need a long-term independent pipeline tier as a deliberate architectural choice, Cribl wins.
Caver vs Wazuh
Caver compared to Wazuh (open-source SIEM/XDR). Why these two are a deliberate combo, not an either-or choice.
Wazuh is the dominant open-source SIEM/XDR. Free, GPLv2 licensed, deployed on roughly 25 million endpoints worldwide. The honest answer to “Caver vs Wazuh” is that for most teams, they’re a combo, not a choice. Wazuh is the endpoint-agent + assessment layer; Caver is the storage, search, analytics, AI-security, and OT layer behind it.
At a glance
| | Wazuh | Caver |
|---|---|---|
| License model | Open source (GPLv2). Free agent + manager + indexer. Paid Wazuh Cloud for SaaS. | Per-deployment commercial license-key. |
| Endpoint agent | First-class: Windows, macOS, Linux, Solaris, AIX. Self-updating, signed, manageable. | None native. We recommend Wazuh agent + Caver as the analytics tier. |
| File Integrity Monitoring (FIM) | First-class via Wazuh agent syscheck module. | Inherited from Wazuh agent via the partnership integration; CAVERN detection content consumes the events. |
| Compliance modules | PCI DSS, HIPAA, NIST 800-53, GDPR mapped to rules out-of-box. | Same coverage via Wazuh agent + dedicated compliance mapping page on docs.etairos.ai (planned). |
| Configuration assessment | CIS benchmark scanning via Wazuh sca module. | Inherited from Wazuh agent. |
| Vulnerability scanning | Package-CVE matching via Wazuh vulnerability detector. | Inherited from Wazuh agent. |
| Storage backing | Wazuh indexer (OpenSearch fork). Same scale-engineering tax as Elastic. | Native Parquet on object storage. No shard sizing, no ILM tuning. |
| Query languages | OpenSearch DSL (Lucene). | SPL + KQL + SQL natively. AI agents over MCP, Grafana, DuckDB, Trino, Athena over the same Parquet lake. |
| Container / K8s security | Wazuh-Kubernetes integration via API audit. | Falco + Trivy partnership for first-class K8s coverage: runtime detection, image scanning, and policy violations all normalised into the CAVERN content pipeline. |
| OT / ICS coverage | Limited; no first-class industrial product. | caver-industrial: passive decoders for BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA. NIST 800-82 + IEC 62443. Air-gap-friendly. |
| AI security visibility | None. | caver-aisec: prompt-injection detection, AI Observatory for LLM spend, NIST AI 100-2 + OWASP feeds. |
| Content packs / integrations | Wazuh agent built-in modules plus community-contributed rules. | 35+ vendor packs shipping with dashboards, saved searches, data inputs, and OCSF field mappings. Daily updates. |
| Cold-tier search | OpenSearch ISM moves data through hot/warm/cold/frozen with performance penalties. | First-class search over object storage. No rehydration. |
Strengths
| Where Wazuh wins | Where Caver wins |
|---|---|
| The endpoint agent itself. ~25M endpoints can’t be wrong. Mature, signed, self-updating, broad OS coverage. | The analytics tier. Object storage backing instead of OpenSearch indexer. No shard sizing, no ILM tuning, no version-upgrade pain. |
| Free OSS pricing for the agent + manager + indexer. If you can run the indexer yourself, the only cost is your operator time. | Query however you want. SPL, KQL, and SQL natively. Wazuh gives you OpenSearch DSL. |
| Established compliance posture. PCI DSS / HIPAA / NIST 800-53 mappings are mature and accepted by auditors. | AI security. caver-aisec has no Wazuh equivalent. |
| Active-response framework. Lightweight SOAR built into the agent (block IP, kill process, quarantine file). | OT / ICS. caver-industrial has no Wazuh equivalent. |
| Community + Wazuh Cloud. Real ecosystem, real managed-service option. | Daily-updated content packs with bundled dashboards + data inputs + OCSF field mapping. Wazuh’s vendor coverage is broad but variable in depth. |
| | Transparent commercial license-key on the analytics tier vs OSS+paid-cloud bimodal pricing. |
How to decide
For most teams: deploy both. Run Wazuh agent on your endpoints. Land the events in Caver. Use Wazuh for endpoint coverage (FIM, CIS, CVE, OS-level events) and Caver for storage, search, AI security, OT/ICS, and cross-source correlation.
For shops already running Wazuh end-to-end (Wazuh agent + Wazuh manager + Wazuh indexer): Caver replaces just the indexer + analytics tier. Keep everything else.
For shops greenfield-evaluating SIEMs: Wazuh alone is a strong free starting point. The reasons to add Caver are scale economics, query-language flexibility, AI security, or OT/ICS coverage. None of those are urgent at small scale; all of them become urgent at scale.
For shops with regulated workloads where the “free OSS” answer doesn’t fly: Caver gives you the commercial backing, transparent licensing, and analytics tier that procurement can sign off on. Wazuh stays as the endpoint layer.
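To make the "run the Wazuh agent, land the events in Caver" pattern concrete, here is a minimal sketch of normalising one Wazuh file-integrity alert into an OCSF-shaped File Activity record (class UID 1001). The input field names follow Wazuh's alert JSON; the function name `wazuh_fim_to_ocsf`, the severity bucketing, and the choice of output fields are assumptions for illustration, not Caver's actual integration.

```python
from datetime import datetime

# Wazuh syscheck event names mapped to OCSF File System Activity activity_id
# values (1 = Create, 3 = Update, 4 = Delete). Anything else falls back to 99 (Other).
ACTIVITY_IDS = {"added": 1, "modified": 3, "deleted": 4}


def severity_from_level(level: int) -> int:
    """Hypothetical bucketing of Wazuh rule levels (0-15) into OCSF severity_id."""
    if level <= 3:
        return 1  # Informational
    if level <= 6:
        return 2  # Low
    if level <= 11:
        return 3  # Medium
    if level <= 14:
        return 4  # High
    return 5      # Critical


def wazuh_fim_to_ocsf(alert: dict) -> dict:
    """Convert a Wazuh syscheck alert into an OCSF-shaped dict (illustrative only)."""
    ts = datetime.strptime(alert["timestamp"], "%Y-%m-%dT%H:%M:%S.%f%z")
    class_uid = 1001
    activity_id = ACTIVITY_IDS.get(alert["syscheck"]["event"], 99)
    return {
        "class_uid": class_uid,
        "activity_id": activity_id,
        "type_uid": class_uid * 100 + activity_id,
        "severity_id": severity_from_level(alert["rule"]["level"]),
        "time": int(ts.timestamp() * 1000),  # epoch milliseconds
        "message": alert["rule"]["description"],
        "file": {"path": alert["syscheck"]["path"]},
        "device": {"name": alert["agent"]["name"]},
    }


sample = {
    "timestamp": "2024-05-01T12:00:00.000+0000",
    "rule": {"level": 7, "description": "Integrity checksum changed."},
    "agent": {"id": "001", "name": "web01"},
    "syscheck": {"path": "/etc/passwd", "event": "modified"},
}
event = wazuh_fim_to_ocsf(sample)
```

Once the event has this shape, the same record can be searched alongside every other source in the lake, which is the point of splitting the endpoint layer from the analytics tier.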
Caver vs Dragos
Caver compared to Dragos Platform for OT/ICS buyers. Where Dragos's decade of OT focus wins, where unified IT plus OT in one stack wins.
Dragos is the established OT/ICS security platform. A decade of industrial focus, deep ICS-vendor relationships, well-known threat research (the WorldView intel, the CHERNOVITE / VOLTZITE attributions), and a large federal customer base. If you’re an OT-only buyer evaluating OT-only platforms, Dragos is the incumbent.
Caver answers a different procurement question: do you want one stack that covers IT and OT, or two stacks? caver-industrial extends the same Caver lakehouse SIEM with passive industrial-protocol decoders and OT-aware detection content, so OT and IT events land in one analytics tier with one query language.
At a glance
| | Dragos Platform | Caver (caver-industrial + caver-collector) |
|---|---|---|
| Buyer focus | OT-only specialist. | IT and OT in one stack (convergence buyer). |
| Deployment model | Appliance and virtual sensor; vendor-managed and on-prem options. | On-prem lakehouse, air-gap friendly. caver-collector pipeline on the OT side, caver storage and analytics on the IT side. |
| Passive protocol coverage | Broad and mature: 25+ industrial protocols including DNP3, Modbus, IEC 60870-5-104, IEC 61850 GOOSE/SV, S7Comm, EtherNet/IP, OPC-UA, BACnet, plus vendor-specific dialects. | Active decoders for BACnet/IP, S7Comm, IEC 60870-5-104, DNP3, Modbus TCP, EtherNet/IP, OPC-UA. Roadmap aligns with the published industrial integration order. |
| Asset inventory | First-class, mature, with vendor-firmware mapping. | Native asset inventory built from passive decoder output and partner ingestion. |
| Industrial threat intel | WorldView intel program with named-threat attribution and quarterly releases. | Curated TI feeds focused on industrial CVEs and adversary TTPs. Updates daily through the Caver content pipeline rather than quarterly. |
| Framework alignment | NIST 800-82, IEC 62443, NERC CIP. | NIST 800-82, IEC 62443. Roadmap includes NERC CIP content. |
| IT-side coverage | Limited; integrates with IT SIEMs rather than being one. | Native. Caver is an IT SIEM that also covers OT. No second stack. |
| AI security visibility | None. | caver-aisec: prompt-injection detection, AI Observatory, NIST AI 100-2 + OWASP feeds. |
| Query languages | Platform-native UI and queries. | SPL, KQL, SQL natively against the Parquet lake. AI agents over MCP. |
| Pricing model | Enterprise procurement; per-asset and per-site licensing typical. | Transparent per-deployment license-key. Industrial pricing is quoted as Custom. |
| Update cadence | Quarterly platform releases plus intel updates. | Daily content updates through the pipeline. |
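For readers unfamiliar with what a passive protocol decoder does, the sketch below parses a single Modbus TCP request frame captured off the wire. It is a generic illustration of the technique, not caver-industrial's decoder: the `ModbusRequest` type and `decode_modbus_tcp` function are hypothetical names, and the code assumes the frame is a client request (responses with the same function code have a different layout).

```python
import struct
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModbusRequest:
    transaction_id: int
    unit_id: int
    function_code: int
    start_address: Optional[int]
    quantity: Optional[int]


def decode_modbus_tcp(frame: bytes) -> ModbusRequest:
    """Decode the MBAP header and, for read requests, the address and quantity."""
    if len(frame) < 8:
        raise ValueError("frame too short for MBAP header plus function code")
    # MBAP header: transaction id, protocol id, length (big-endian u16), unit id (u8)
    tid, pid, length, unit = struct.unpack(">HHHB", frame[:7])
    if pid != 0:
        raise ValueError("protocol id must be 0 for Modbus")
    # The length field counts the unit id plus the PDU that follows it.
    if length != len(frame) - 6:
        raise ValueError("MBAP length field does not match frame size")
    function_code = frame[7]
    start = qty = None
    # Function codes 1-4 are the read requests: start address + quantity.
    if function_code in (1, 2, 3, 4) and len(frame) >= 12:
        start, qty = struct.unpack(">HH", frame[8:12])
    return ModbusRequest(tid, unit, function_code, start, qty)


# Read Holding Registers (0x03): unit 0x11, start address 0x006B, 3 registers.
request = decode_modbus_tcp(bytes.fromhex("000100000006110300 6B0003".replace(" ", "")))
```

A passive decoder never transmits on the OT network; it only turns observed frames like this one into structured events that detection content can reason about.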
Strengths
| Where Dragos wins | Where Caver wins |
|---|---|
| A decade of OT-only focus. When the entire product roadmap is industrial, the depth shows. Protocol coverage, vendor-firmware mappings, and threat research are mature in a way a younger product cannot match. | One stack, not two. Caver is an IT SIEM that also handles OT. You don’t run separate query languages, separate dashboards, separate alerting tiers for the IT and OT halves of the same investigation. |
| Named-threat attribution. WorldView’s adversary profiles (CHERNOVITE, VOLTZITE, ELECTRUM, and others) carry weight with executive readers and federal buyers. | Transparent licensing. Per-deployment license-key with a published model on the Caver landing page. Dragos requires enterprise procurement. |
| Established federal and asset-owner base. If your procurement requires Dragos-by-name references in critical infrastructure, that’s a real moat. | Lakehouse economics. Caver stores everything as Parquet on object storage. The cost ceiling that keeps OT teams from keeping more than 90 days of historian data does not apply. |
| OT-vendor relationships. Deep integrations with Schneider, Siemens, GE, Honeywell, Rockwell, ABB built over years of co-engagement. | Native AI security. caver-aisec is part of the same stack. OT teams that are starting to deploy LLM-assisted maintenance copilots have one place to monitor that traffic. |
| OT-only specialization is sometimes the right answer. Some buyers do not want their OT analytics commingled with IT log volume; Dragos lets them buy an OT-only platform without that tradeoff. | Daily content updates. Caver’s content pipeline ships daily. Dragos’s platform releases are quarterly. |
| | Convergence is the trend. OT and IT are merging operationally (remote engineering access, cloud-connected historians, IT-IDS in the OT DMZ). Buyers who plan for the convergence buy convergence-native tools. |
How to decide
For OT-only shops with a mature, separately-staffed OT security program and no plans to converge with IT: Dragos is the safer pick. A decade of OT focus is hard to argue with.
For shops where the same team owns IT and OT, or where the IT SOC is being asked to cover OT events as part of the convergence push: Caver removes a stack. One query language, one analytics tier, one license.
For shops doing a competitive bake-off: the honest comparison is depth-of-OT vs breadth-of-coverage. Dragos is deeper on OT specifically. Caver is broader across IT, OT, and AI security in one place. Map that to which procurement question is louder for you.
For shops where industrial pricing is a budget blocker: Caver’s per-deployment license-key is published, and industrial pricing is quoted Custom but built on the same transparent model rather than enterprise per-asset arithmetic.
caver-aisec vs Lakera
caver-aisec compared to Lakera Guard. Why these two are complementary, not competitive, for most AI-security buyers.
caver-aisec and Lakera live in adjacent halves of the AI-security problem. Lakera is the inline guard at the prompt boundary. caver-aisec is the runtime detection + SOC correlation layer. For most teams the right answer is to use both, not pick one.
At a glance
| | Lakera | caver-aisec |
|---|---|---|
| Posture | Inline guard. Intercepts prompts, scores, blocks or allows before the LLM call. | Runtime detection. Sees prompts after the LLM call, alerts, correlates with the rest of your security telemetry. |
| Latency profile | Sub-100ms; sits in the request path. | Asynchronous; doesn’t add latency to the LLM call itself. |
| Deployment | SaaS-first. | Self-hosted. On-prem, air-gapped, or cloud. |
| PII detection | Built-in PII detector with configurable redaction. | Shipped: Presidio integration with per-tenant policy and redaction. |
| Hallucination scoring | Faithfulness / groundedness scoring on RAG outputs. | Shipped: Ragas + TruLens integration with per-response scoring surfaced to SLAM notables. |
| Red-team testing | Lakera Red for adversarial probing. | Shipped: Garak + PyRIT scheduled probes with MITRE ATLAS coverage matrix. |
| Inline blocking | Lakera Guard, primary product. | Shipped: NeMo Guardrails + Rebuff runtime allow / deny / rewrite on the LLM-to-tool boundary. |
| SOC integration | Limited. AI-side product, not SOC-side. | First-class. AI events flow into Caver alongside identity, endpoint, network telemetry. |
| Cross-source correlation | None native. | Native. Prompt-injection attempt correlates with the IP, identity, endpoint, and tool-call activity around it. |
| AI Observatory / spend tracking | Limited. | First-class per-tenant LLM spend tracking with budget alerts. |
| Alert channels | Webhook + integrations. | PagerDuty, Discord, Teams, Slack, Telegram, SMTP, webhook. |
| Threat feeds | Lakera-curated. | NIST AI 100-2, OWASP Agentic AI Top 10, OWASP ML Top 10, HuggingFace Security, vendor advisories. |
| MCP tool-call audit | Not a focus. | First-class: LLM-to-MCP bridge instrumentation with CAVERN detection content. |
| Pricing | SaaS subscription. | Per-deployment commercial license-key. |
| Open source posture | Closed-source commercial. | Closed-source commercial. Built on open-source components (planned: NeMo Guardrails, Presidio, Ragas, Garak). |
Where Lakera wins
- Inline blocking is shipping today. Lakera Guard intercepts and blocks at the prompt boundary in production now. caver-aisec’s inline-block ticket is open but unshipped.
- PII detection is shipping today. Same.
- Lakera Red is shipping today. Pre-deploy adversarial test framework.
- SaaS-first delivery. Fast pilots, zero infrastructure.
- Established AI-security brand. Lakera holds mindshare in the inline-guard category.
- Browser-side Chrome extension for shadow-AI catch is real and useful.
Where caver-aisec wins
- SOC correlation. A prompt-injection attempt that touches identity, network, endpoint, and tool-call activity should show up in one queryable timeline. caver-aisec puts AI traffic next to the rest of your security telemetry; Lakera doesn’t.
- AI Observatory. Per-tenant LLM spend tracking with budget alerts. Lakera doesn’t track spend.
- Self-host / air-gap. Lakera is SaaS-first. For regulated industries, government, OT-adjacent, or air-gapped environments, Lakera can’t deploy. caver-aisec can.
- Alert channel breadth. 7 named channels (PagerDuty, Discord, Teams, Slack, Telegram, SMTP, webhook) vs Lakera’s webhook + integrations.
- Threat feed breadth. Multiple public AI-security feeds ingested as detection content, not just curated by the vendor.
- MCP tool-call audit. Caver instruments the LLM-to-tool boundary specifically; useful for agentic AI deployments.
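The SOC-correlation point above comes down to a time-window join between AI events and everything else. A minimal sketch, assuming events are plain dicts with hypothetical `src_ip`, `time`, and `kind` fields (not Caver's schema or API):

```python
from datetime import datetime, timedelta


def correlate(ai_events, other_events, window=timedelta(minutes=5)):
    """For each AI event, collect telemetry from the same source IP within +/- window."""
    by_ip = {}
    for ev in other_events:
        by_ip.setdefault(ev["src_ip"], []).append(ev)

    timeline = []
    for ai in ai_events:
        related = [
            ev for ev in by_ip.get(ai["src_ip"], [])
            if abs(ev["time"] - ai["time"]) <= window
        ]
        related.sort(key=lambda ev: ev["time"])
        timeline.append({"ai_event": ai, "related": related})
    return timeline


t0 = datetime(2024, 1, 1, 12, 0)
ai_events = [{"src_ip": "10.0.0.5", "time": t0, "kind": "prompt_injection"}]
other_events = [
    {"src_ip": "10.0.0.5", "time": t0 - timedelta(minutes=2), "kind": "login"},
    {"src_ip": "10.0.0.5", "time": t0 + timedelta(minutes=30), "kind": "dns"},
    {"src_ip": "10.0.0.9", "time": t0, "kind": "login"},
]
timeline = correlate(ai_events, other_events)
```

In practice the join key would also include identity and endpoint, but the shape is the same: one timeline per AI event, with the surrounding activity attached.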
How to decide
Most teams: use both.
Lakera + caver-aisec is a deliberate combo. Lakera blocks at the prompt boundary. caver-aisec gives you SOC-side visibility into what was blocked, why, and what other telemetry correlates with the attempt. Same posture as a WAF + SIEM combo on the traditional web stack: the WAF blocks, the SIEM investigates.
Lakera-only is reasonable when:
- You’re SaaS-only, no SOC, no regulated workloads.
- AI security is your only security tool (small shop, AI-first product).
- You need inline blocking today and can’t wait for caver-aisec parity (#24).
caver-aisec-only is reasonable when:
- You have air-gap, on-prem, or data-residency requirements that SaaS won’t satisfy.
- You’re already running Caver as your SIEM and want AI security on the same operator surface.
- You need OT-adjacent or industrial AI deployments where SaaS can’t deploy.
- You want a single per-tenant spend + visibility surface, not just guarding.
Both is best when:
- You want inline prevention + runtime detection + SOC correlation.
- You’re at the scale where a missed prompt-injection costs more than two vendor relationships.
caver-aisec vs Mindgard
caver-aisec compared to Mindgard. Pre-deployment red-teaming vs runtime detection, complementary not competitive.
caver-aisec and Mindgard solve different halves of the AI-security problem. Mindgard tests AI systems before deployment with automated red-teaming. caver-aisec detects attacks at runtime and correlates them with your SOC telemetry. Same defense-in-depth posture as a vulnerability scanner + EDR pairing on the traditional security stack: scanner finds the exposures, EDR catches the exploits.
At a glance
| | Mindgard | caver-aisec |
|---|---|---|
| Posture | Pre-deployment + scheduled continuous testing. Probes target AI systems with adversarial inputs to find weaknesses. | Runtime detection. Observes deployed AI systems and detects attacks in flight. |
| Test catalog | Curated test pack covering jailbreak, prompt injection, model extraction, training-data leakage, supply chain. | Detection content pack for the same categories, but observed at runtime, not probed pre-deploy. |
| ATT&CK-for-ML mapping | Formal mapping of test coverage to MITRE ATLAS techniques. | Shipped: Garak + PyRIT integration with MITRE ATLAS technique coverage matrix and OWASP LLM / Agentic / ML Top 10 tagging across the detection corpus. |
| Deployment | SaaS platform. | Self-hosted. On-prem, air-gapped, or cloud. |
| Continuous testing | Scheduled probes against deployed AI endpoints. | Shipped: scheduled Garak + PyRIT probes against your deployed endpoints, with results flowing into Caver alongside runtime telemetry for cross-source correlation. |
| SOC integration | Limited. AI-testing product, not SOC-side. | First-class. AI events + test results flow into Caver alongside identity, endpoint, network telemetry. |
| Cross-source correlation | None native. | Native. Failed probe correlates with deployed model version, the prompt patterns that triggered it, and the rest of your security telemetry. |
| AI Observatory / spend tracking | Limited. | First-class per-tenant LLM spend tracking with budget alerts. |
| Alert channels | Email + integrations. | PagerDuty, Discord, Teams, Slack, Telegram, SMTP, webhook. |
| Threat feeds | Mindgard-curated test catalog. | NIST AI 100-2, OWASP Agentic AI Top 10, OWASP ML Top 10, HuggingFace Security, vendor advisories. |
| MCP tool-call audit | Not a focus. | First-class: LLM-to-MCP bridge instrumentation. |
| Pricing | SaaS subscription. | Per-deployment commercial license-key. |
Where Mindgard wins
- SaaS-first delivery. Fast pilots, zero infrastructure.
- Established AI red-team brand. Mindgard holds mindshare in the pre-deploy testing category and ships a polished probe authoring UI.
- Curated proprietary test catalog. Their in-house research team adds novel attacks the open Garak / PyRIT corpora may lag on.
Where caver-aisec wins
- Pre-deploy probes + runtime detection in one stack. Shipped: Garak + PyRIT probes run on a schedule against your deployed AI endpoints, results land in Caver alongside runtime telemetry, the same operator surface investigates both. You don’t pick test-first vs detect-first, you get both.
- Formal MITRE ATLAS coverage matrix. Shipped: per-technique coverage report driven by the Garak + PyRIT corpus + Caver’s hand-authored detection content, with OWASP LLM, Agentic, and ML Top 10 tagging cross-referenced.
- Runtime detection + SOC correlation. A failed probe is a hypothesis; a successful exploit in production is the actual attack. caver-aisec catches the latter and correlates it with the rest of your telemetry.
- AI Observatory. Per-tenant LLM spend tracking with budget alerts. Mindgard doesn’t track spend; testing is its product.
- Self-host / air-gap. Mindgard is SaaS. For regulated industries, government, OT-adjacent, or air-gapped environments, Mindgard can’t deploy. caver-aisec can.
- Alert channel breadth. 7 named channels vs email + integrations.
- Threat feed breadth. Multiple public AI-security feeds ingested as runtime detection content, not just as testing inputs.
- MCP tool-call audit. Runtime instrumentation of the LLM-to-tool boundary; useful for agentic AI deployments.
How to decide
Most teams: use caver-aisec.
caver-aisec now ships both halves: pre-deploy probes (Garak + PyRIT, scheduled, with MITRE ATLAS coverage) and runtime detection on the same operator surface. The same notable surface investigates both probe failures and live exploits. For a team building an AI security program from one stack, this is the simpler path.
Mindgard remains a good fit when:
- Your AI security mandate is testing-only and you want a polished SaaS authoring UI.
- You’re a security testing firm or red-team consultancy and runtime detection is your customer’s problem, not yours.
- You don’t have a SOC and aren’t building one.
- You specifically value Mindgard’s in-house proprietary test catalog over the open Garak + PyRIT corpora.
caver-aisec-only is reasonable when:
- You have air-gap, on-prem, or data-residency requirements that SaaS won’t satisfy.
- You already have a SIEM and want AI security on the same operator surface.
- Your AI systems are deployed and you need to know what’s happening to them now, not what could happen in a lab.
Both is best when:
- You ship AI systems to production and want both prevention (testing) and detection (runtime).
- You operate at the scale where a missed AI attack costs more than two vendor relationships.
OCSF, design choices
Caver's storage layer is open OCSF (Open Cybersecurity Schema Framework) Parquet on any S3-compatible object store. This page explains what that means, why we picked it, and the other architectural calls we made along the way.
What is OCSF?
OCSF (Open Cybersecurity Schema Framework) is a vendor-neutral, open-source schema for security data. Originally announced by Splunk and AWS at Black Hat 2022 and now maintained at the OCSF Project on GitHub under the Linux Foundation, it has broad industry support: AWS, Splunk, IBM, CrowdStrike, Cloudflare, JupiterOne, Tanium, Zscaler, Salesforce, Okta, Trend Micro, DTEX, and dozens more.
OCSF defines a normalised event shape for every category of security telemetry: authentication, network activity, process activity, file activity, application activity, vulnerabilities, findings, configuration, audit. Every event has a stable schema with class_uid, category_uid, severity_id, activity_id, ATT&CK tagging, actor / target / source structures, and an OCSF-defined version field.
| OCSF class | Class UID | Examples |
|---|---|---|
| Authentication | 3002 | Okta logon, AWS Console login, SSH auth |
| Network Activity | 4001 | Firewall connection, VPN tunnel, DNS query |
| Process Activity | 1007 | EDR process start, sysmon event 1 |
| File Activity | 1001 | File write, file delete, file rename |
| Application Activity | 6005 | LLM API call, SaaS audit event |
| Detection Finding | 2004 | CAVERN detection fire, Sigma rule match |
| Vulnerability Finding | 2002 | Tenable scan result, Wiz CVE finding |
| Cloud API | 3005 | CloudTrail event, Azure activity log |
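As a concrete instance of the event shape described above, here is a minimal sketch of an Authentication (class UID 3002) logon record built as a plain dict. The helper name `authentication_event` and the sample values are hypothetical; the `type_uid = class_uid * 100 + activity_id` relationship is the one OCSF defines, and only a handful of the schema's fields are shown.

```python
def authentication_event(user: str, src_ip: str, time_ms: int, activity_id: int = 1) -> dict:
    """Build a minimal OCSF-shaped Authentication event (activity_id 1 = Logon)."""
    class_uid = 3002
    return {
        "category_uid": 3,            # Identity & Access Management
        "class_uid": class_uid,
        "activity_id": activity_id,
        "type_uid": class_uid * 100 + activity_id,
        "severity_id": 1,             # Informational
        "time": time_ms,              # epoch milliseconds
        "user": {"name": user},
        "src_endpoint": {"ip": src_ip},
        "metadata": {
            "version": "1.1.0",       # OCSF schema version the event claims
            "product": {"name": "example-idp"},  # placeholder product name
        },
    }


logon = authentication_event("alice", "203.0.113.7", 1_700_000_000_000)
```

Because every source is normalised to the same keys, a filter such as `class_uid == 3002` selects Okta, AWS Console, and SSH logons with one predicate.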
Why we chose OCSF
Why Parquet on object storage
The second half of the storage decision: store OCSF as Apache Parquet on S3 / MinIO / R2 / GCS, not in a proprietary index format.
| Format | Vendor lock | Cold-storage cost | Cross-tool query | Caver pick |
|---|---|---|---|---|
| Splunk tsidx | Hard | High (replicated indexers) | None | No |
| Elastic shards | Hard | Medium | Limited | No |
| QRadar Ariel | Hard | High | None | No |
| OCSF Parquet on S3 | None (Apache 2.0) | Lowest (~$0.023/GB-month at S3 Standard list price) | DuckDB, Trino, Athena, Spark, Pandas, anything | Yes |
The same files are queryable by Caver, by your data team via Athena or Snowflake external tables, and by your ML/AI team via PyArrow or Spark. The data lives once.
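To put the object-storage figure in the table in context, a rough back-of-envelope sketch follows. It assumes the ~$0.023/GB price is per GB-month, treats stored size as equal to ingested size (Parquet compression would lower it), and ignores request and egress charges; the function name and the sample volumes are illustrative.

```python
def monthly_storage_cost(daily_ingest_gb: float, retention_days: int,
                         usd_per_gb_month: float = 0.023) -> float:
    """Steady-state monthly storage bill once the retention window is full."""
    retained_gb = daily_ingest_gb * retention_days
    return retained_gb * usd_per_gb_month


# 100 GB/day kept for a full year is 36,500 GB at rest.
cost = round(monthly_storage_cost(100, 365), 2)
```

Under these assumptions a year of retention at 100 GB/day costs roughly $840 per month in storage, which is why retention length stops being the main budget constraint.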
Other architectural choices
| Choice | Caver pick | Why |
|---|---|---|
| Detection rule format | CAVERN YAML (Sigma-shaped) | Sigma is the community standard. CAVERN rules transpile cleanly from Sigma and back. Operators do not need to learn a proprietary DSL. |
| Query layer | Multi-language (SPL, SQL, KQL, Sigma, PromQL, NL) | Teams pick the language they already know. Federation across tools is a first-class capability, not a migration project. |
| Pipeline backend | Vector (Rust) + Python + OpenTelemetry, equally featured | No single backend optimises for every use case. Vector for hot paths, Python for SaaS API polling, OTel for orgs already running OTel infrastructure. |
| Schema enforcement | OCSF validators with coerce / reject / DLQ modes | Bad events go to a DLQ, not into the lake. Schema drift surfaces as a notable, not a silent corruption. |
| AI integration | MCP server, native primitive registry | AI agents are first-class operators, not chatbots bolted on. The same primitives the orchestrator uses are exposed via MCP for any client. |
| Tenancy | Per-tenant S3 prefix + row filter + LLM config | MSSPs and large enterprises get hard tenant isolation at the storage layer, not just at the query layer. |
| Identity | OIDC / SAML / LDAP, role-based access control with audit | No custom auth. Plug Caver into the IdP you already use. |
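The schema-enforcement row names three modes: coerce, reject, and DLQ. The sketch below shows one plausible reading of them. The exact semantics are an assumption (coerce tries to cast and dead-letters what it cannot fix, reject drops the event, DLQ parks it with the reasons), and `REQUIRED` and `validate` are hypothetical names, not Caver's validator.

```python
# Hypothetical subset of required fields and their expected types.
REQUIRED = {"class_uid": int, "activity_id": int, "severity_id": int, "time": int}


def validate(event: dict, mode: str):
    """Return (destination, payload); destination is 'lake', 'dlq', or 'rejected'."""
    fixed = dict(event)
    problems = []
    for field, expected in REQUIRED.items():
        value = fixed.get(field)
        if type(value) is expected:
            continue
        if mode == "coerce" and value is not None:
            try:
                fixed[field] = expected(value)  # e.g. "3002" -> 3002
                continue
            except (TypeError, ValueError):
                pass
        problems.append(field)

    if not problems:
        return "lake", fixed
    if mode == "reject":
        return "rejected", event
    # 'dlq' mode, and 'coerce' failures, keep the original event plus the reasons.
    return "dlq", {"event": event, "problems": problems}


raw = {"class_uid": "3002", "activity_id": 1, "severity_id": 1, "time": 1_700_000_000_000}
coerced = validate(raw, "coerce")
parked = validate(raw, "dlq")
dropped = validate(raw, "reject")
```

Whatever the mode, the malformed record never reaches the lake unchanged, so schema drift shows up as DLQ volume rather than as silently corrupt rows.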
External references
- OCSF Schema Browser, browse every class and field
- OCSF Project home, governance and contributors
- OCSF on GitHub, schema source
- Apache Parquet, columnar storage format
- SigmaHQ/sigma, the community detection rule corpus
Deployment
Installation
# Core install
pip install caver
# Install with specific roles
pip install 'caver[cavern-engine,slam-engine,scheduler]'
pip install 'caver[search-head]'
pip install 'caver[mcp-server]'
pip install 'caver[all-roles]' # dev / dogfood only
# Caver-Collector
pip install caver-collector
pip install 'caver-collector[parquet]' # adds PyArrow for Parquet sink
Configuration (caver.toml)
[roles]
enabled = ["search-peer", "cavern-engine", "slam-engine", "scheduler"]
[storage]
bucket = "caver-lake"
endpoint = "http://localhost:9000" # MinIO / LocalStack / R2
access_key_env = "MINIO_ACCESS_KEY"
secret_key_env = "MINIO_SECRET_KEY"
[cluster]
discovery = "static"
peers = ["localhost"]
[cavern]
content_path = "/etc/caver/content/v1"
rba.notable_threshold = 100
[slam]
oncall_telegram_chat_id = "YOUR_CHAT_ID"
telegram_token_env = "TELEGRAM_BOT_TOKEN"
[caver.ai]
user_daily_cost_usd = 50
Available roles
Splunk Peer Mode
Register Caver as a distributed search peer of your existing Splunk SH. No Splunk source or forwarder changes required. Caver receives SPL, scans the OCSF Parquet lake, and returns results over the splunkd wire format.
caver-cluster join \
--splunk-host splunk.internal \
--splunk-port 8089 \
--splunk-user admin \
--splunk-pass "$SPLUNK_PASS" \
--peer-uri https://caver.internal:8089
Any licensing change to your Splunk contract is made through Splunk per your existing agreement. Caver does not modify, patch, or bypass any Splunk license mechanism.
Health Check
caver-doctor --role search-peer
caver-doctor --role cavern-engine --config /etc/caver/caver.toml
# → exits 0 if healthy, 1 with structured error output if not
Docker / Kubernetes
docker run --rm -it \
-e MINIO_ACCESS_KEY=minioadmin \
-e MINIO_SECRET_KEY=minioadmin \
-v /etc/caver:/etc/caver \
ghcr.io/redeyesecurity/caver:latest \
--config /etc/caver/caver.toml
# Helm
helm install caver oci://ghcr.io/redeyesecurity/charts/caver \
--set storage.bucket=caver-lake \
--set storage.endpoint=http://minio:9000
Configuration Reference
Storage
| Key | Default | Description |
|---|---|---|
| storage.bucket | none | S3/MinIO bucket for OCSF lake |
| storage.endpoint | AWS S3 | Override for MinIO / LocalStack / R2 |
| storage.prefix | "" | Key prefix for all lake writes |
| storage.region | us-east-1 | AWS region |
CAVERN
| Key | Default | Description |
|---|---|---|
| cavern.content_path | bundled | Path to v1/ content directory |
| cavern.rba.entity_window_hours | 24 | Rolling window for risk score aggregation |
| cavern.rba.notable_threshold | 100 | Score threshold to create a SLAM notable |
| cavern.rba.admin_multiplier | 2.0 | Score multiplier for admin accounts |
| cavern.rba.critprod_multiplier | 1.875 | Score multiplier for critical-prod assets |
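The sketch below shows how the risk-based alerting settings in this table could interact. It is an assumption-laden illustration, not the CAVERN engine: it supposes the two multipliers stack multiplicatively, that the threshold comparison is inclusive, and that risk events carry hypothetical `risk_score`, `is_admin`, and `is_critprod` fields.

```python
from datetime import datetime, timedelta


def entity_risk(events, now, window_hours=24,
                admin_multiplier=2.0, critprod_multiplier=1.875):
    """Sum weighted risk scores for one entity over the rolling window."""
    cutoff = now - timedelta(hours=window_hours)
    total = 0.0
    for ev in events:
        if ev["time"] < cutoff:
            continue  # outside the rolling window
        score = ev["risk_score"]
        if ev.get("is_admin"):
            score *= admin_multiplier
        if ev.get("is_critprod"):
            score *= critprod_multiplier
        total += score
    return total


def is_notable(total, notable_threshold=100):
    """Assumed inclusive comparison against cavern.rba.notable_threshold."""
    return total >= notable_threshold


now = datetime(2024, 1, 2, 12, 0)
events = [
    {"time": now - timedelta(hours=1), "risk_score": 20, "is_admin": True},
    {"time": now - timedelta(hours=3), "risk_score": 16, "is_admin": True, "is_critprod": True},
    {"time": now - timedelta(hours=30), "risk_score": 80},  # too old, ignored
]
total = entity_risk(events, now)
```

With the defaults, the two in-window events contribute 40 and 60, so the entity reaches the threshold of 100 even though neither event would on its own.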
SLAM
| Key | Default | Description |
|---|---|---|
| slam.oncall_telegram_chat_id | none | Telegram chat ID for oncall alerts |
| slam.sla.p1_minutes | 15 | P1 notable SLA in minutes |
| slam.sla.p2_minutes | 60 | P2 notable SLA in minutes |
| slam.playbook_path | bundled | Path to custom playbooks directory |
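A short sketch of what the SLA settings imply for a given notable, assuming the clock starts when the notable is created (an assumption; the table does not say). `SLA_MINUTES`, `sla_deadline`, and `is_breached` are illustrative names.

```python
from datetime import datetime, timedelta

# Defaults from slam.sla.p1_minutes and slam.sla.p2_minutes.
SLA_MINUTES = {"P1": 15, "P2": 60}


def sla_deadline(created: datetime, priority: str) -> datetime:
    """Time by which a notable of the given priority should be handled."""
    return created + timedelta(minutes=SLA_MINUTES[priority])


def is_breached(created: datetime, priority: str, now: datetime) -> bool:
    return now > sla_deadline(created, priority)


created = datetime(2024, 1, 1, 9, 0)
p1_deadline = sla_deadline(created, "P1")
```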
AI config
| Key | Default | Description |
|---|---|---|
| caver.ai.user_daily_cost_usd | 50 | Per-user AI spend threshold (USD) for cost_anomaly rule |
| caver.ai.off_hours_start | 22 | Off-hours start (local hour) for token spike rule |
| caver.ai.off_hours_end | 6 | Off-hours end (local hour) |
| caver.ai.variance_model_count | 5 | Model count threshold for model_variance_burst |
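The default off-hours window (22 to 6) wraps past midnight, which is easy to get wrong in a range check. A minimal sketch of the comparison, assuming the start hour is inclusive and the end hour exclusive (the table does not specify either); `is_off_hours` is a hypothetical helper.

```python
def is_off_hours(hour: int, start: int = 22, end: int = 6) -> bool:
    """True if the local hour (0-23) falls inside the off-hours window."""
    if start <= end:
        # Window within a single day, e.g. 1 -> 5.
        return start <= hour < end
    # Window wraps midnight, e.g. 22 -> 6.
    return hour >= start or hour < end


late_night = is_off_hours(23)
mid_day = is_off_hours(12)
```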
Multi-tenancy (MSSP)
[[tenants]]
id = "acme"
display_name = "Acme Corp"
s3_prefix = "tenants/acme/"
row_filter = "metadata.tenant_uid = 'acme'"
[tenants.llm]
provider = "anthropic"
model = "claude-sonnet-4-6"
api_key_env = "ACME_ANTHROPIC_KEY"
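To illustrate how a per-tenant prefix and row filter could be applied together, here is a minimal sketch. The `Tenant` dataclass mirrors the keys in the config above, but `lake_key` and `scoped_sql` are hypothetical helpers, and the sketch assumes `row_filter` comes from trusted operator config rather than user input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    id: str
    s3_prefix: str
    row_filter: str


def lake_key(tenant: Tenant, relative_path: str) -> str:
    """Build an object key that cannot escape the tenant's prefix."""
    if relative_path.startswith("/") or ".." in relative_path.split("/"):
        raise ValueError("relative_path must stay inside the tenant prefix")
    return tenant.s3_prefix + relative_path


def scoped_sql(tenant: Tenant, table: str) -> str:
    """Append the tenant's row filter to a query (trusted config only)."""
    return f"SELECT * FROM {table} WHERE {tenant.row_filter}"


acme = Tenant(id="acme", s3_prefix="tenants/acme/",
              row_filter="metadata.tenant_uid = 'acme'")
key = lake_key(acme, "auth/2024/part-0.parquet")
query = scoped_sql(acme, "events")
```

The prefix gives isolation at the storage layer and the row filter adds a second check at query time, so a mistake in one does not expose another tenant's data on its own.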