March 8, 2026 | Context never dies again
Something shipped in the early hours of this Sunday morning that I've been waiting for since the first time I watched an OpenClaw session eat itself during a long coding session. A GitHub notification at 5:52 AM UTC: openclaw v2026.3.7. The headline feature is the ContextEngine plugin interface, and if you've ever been mid-flow only to have your agent forget the last two hours of work, this one's for you.
OpenClaw has always handled context the same way: when the window fills up, it summarizes and discards. That's fine for short tasks. For anything that runs long - debugging a gnarly system, building a feature across dozens of files, running a research pipeline - it's quietly destroying information you need. Today's release introduces a plugin slot that lets third parties replace that entire compaction mechanism. The first beneficiary is lossless-claw, a project by Martian Engineering that's been waiting for exactly this. Every message gets stored in SQLite. Nothing gets thrown away. The setup is covered in full in today's edition.
GPT-5.4 also dropped three days ago, and it landed differently than recent OpenAI releases. The combination of native computer-use, 1M token context, and tool search in a single model has practical implications that go beyond benchmarks - there are product categories that become viable now that weren't before. And Anthropic's Frontier Red Team published what might be the most interesting AI security story of the year: Claude Opus 4.6 spent two weeks on Firefox's codebase and walked away with 22 CVEs, including a working exploit it wrote itself. All patched. All public. The methodology is worth reading.
Let's get into it.
TODAY'S RUNDOWN
Sunday, March 8, 2026. Today's edition covers:
- Feature: OpenClaw v2026.3.7 ships the ContextEngine plugin interface
- Setup: How to install lossless-claw and never lose context again
- Making Money: What GPT-5.4's computer-use means for agent products
- Security: Claude Opus 4.6 found 22 Firefox CVEs in two weeks
Feature Story: OpenClaw's Context Problem Just Got Solved

OpenClaw 2026.3.7 shipped this morning. The release is large - Spanish locale support, ACP persistent channel bindings, Telegram topic routing improvements, Perplexity search API updates. But the thing that matters most is buried in the first change note: the ContextEngine plugin interface.
Here's what this actually is. Up until today, every OpenClaw session used the same built-in context management: when the active context window approaches the model's token limit, OpenClaw runs a compaction pass. It summarizes recent messages into a condensed block, discards the originals, and continues. The summarization is decent. The loss is real. If you ran a three-hour debugging session and the agent compacted at hour two, the detailed trace of what went wrong is gone. You've got a summary. Summaries lie by omission.
The plugin interface in 2026.3.7 (PR #22201, contributed by @jalehman) changes this at the architecture level. OpenClaw now has a ContextEngine plugin slot - a registered point in the system where a third-party module can take full ownership of how context is managed. The slot exposes seven lifecycle hooks: bootstrap, ingest, assemble, compact, afterTurn, prepareSubagentSpawn, and onSubagentEnded. A plugin that registers for this slot can intercept every message that enters the context, control what gets assembled for each model call, and handle compaction in whatever way it wants.
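To make the hook list concrete, here's a rough TypeScript sketch of what a plugin registering for the slot might look like. Every type name below is inferred from the hook names above, not confirmed OpenClaw API:

```typescript
// Hypothetical sketch of a ContextEngine plugin, inferred from the hook
// names in the release notes. None of these type names are confirmed API.
interface Message { role: string; content: string; }

interface ContextEngine {
  bootstrap?(sessionKey: string): Promise<void> | void;
  ingest?(message: Message): Promise<void> | void;
  assemble?(tokenBudget: number): Promise<Message[]> | Message[];
  compact?(): Promise<void> | void;
  afterTurn?(): Promise<void> | void;
  prepareSubagentSpawn?(childSessionKey: string): Promise<void> | void;
  onSubagentEnded?(childSessionKey: string): Promise<void> | void;
}

// The "do nothing clever" baseline: keep everything in memory. A real
// plugin like lossless-claw would replace this with persistent storage
// and DAG summarization.
class InMemoryEngine implements ContextEngine {
  private messages: Message[] = [];
  ingest(message: Message): void {
    this.messages.push(message);
  }
  assemble(): Message[] {
    return this.messages;
  }
}
```

The point of the slot is that everything between ingest and assemble is now the plugin's business; OpenClaw only sees what the engine hands back for each model call.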
The built-in behavior is preserved through a LegacyContextEngine wrapper, so there's zero behavior change if you don't install a plugin. Your existing setup keeps working exactly as before. But now there's an alternative.
The first plugin built on this is lossless-claw, which has been sitting in beta at Martian Engineering since the feature request was first discussed in January. They built it against the plugin interface while it was still in development. The architecture is covered in the Setup section, but the headline: every message goes to SQLite, summaries form a DAG (directed acyclic graph) that links back to originals, and agents get tools to search and retrieve anything from history. The phrase on their GitHub - "Imagine never needing to run /compact or /new again" - isn't marketing. That's what the system does.
There are a few things worth noting about how this was built. The plugin slot uses AsyncLocalStorage for scoped subagent runtime, which means the context engine is properly isolated per-session even in multi-agent workloads. The sessions.get gateway method is also new in this release, giving context engine plugins access to session metadata they previously couldn't see. This isn't a partial implementation - it's a full interface designed with real plugin authors in the loop.
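AsyncLocalStorage is a real Node.js API (node:async_hooks), and the per-session isolation pattern presumably looks something like this sketch, where the OpenClaw-side names are invented for illustration:

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Sketch of per-session scoping via AsyncLocalStorage: each session's
// context-engine state is bound to its async call chain, so concurrent
// subagents can't see each other's stores. The function names here are
// hypothetical, not OpenClaw internals.
const sessionStore = new AsyncLocalStorage<{ sessionKey: string }>();

function currentSessionKey(): string | undefined {
  return sessionStore.getStore()?.sessionKey;
}

function runInSession<T>(sessionKey: string, fn: () => Promise<T>): Promise<T> {
  // Everything fn awaits, spawns, or calls sees this session's store.
  return sessionStore.run({ sessionKey }, fn);
}
```

The practical upshot: a context engine can call something like currentSessionKey() from anywhere in a turn without threading a session argument through every function, and subagent spawns get their own scope.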
The ACP changes in this same release are also worth flagging. Durable Discord channel and Telegram topic bindings now survive gateway restarts. If you've been annoyed by ACP sessions losing their thread targets after a restart, that's fixed. This was contributed by @dutifulbob (PR #34873) and feels like it should have been in OpenClaw for a long time.
The broader picture here is that OpenClaw is increasingly opening its internals to community extensions. The plugin runtime has grown substantially over the past few months - STT transcription for plugins, session lifecycle hooks with sessionKey, and now full context engine ownership. The shift is real: OpenClaw is becoming a platform as much as it's a product. That matters if you're thinking about building skills or tools on top of it.
My read: 2026.3.7 is one of the more significant OpenClaw releases in recent memory. Not because any single feature is flashy, but because the ContextEngine interface is the kind of architectural decision that opens up whole categories of experimentation that weren't possible before. The compaction problem is one that every practitioner running long-horizon agents has hit. The fact that there's now a sanctioned way to replace it is a genuine step forward.
Setup of the Week: Install lossless-claw and stop losing context

If you've ever lost context mid-session - and if you run any long-horizon OpenClaw work, you have - then lossless-claw is the fix. Here's how to get it running today, now that OpenClaw 2026.3.7 supports the plugin interface it needs.
What lossless-claw actually does
OpenClaw's default compaction is lossy by design. When the context window fills, it summarizes and discards. lossless-claw replaces this with a DAG-based system:
1. Every message gets written to a local SQLite database. Nothing is discarded.
2. Older messages are summarized in chunks using your configured LLM.
3. Summaries are condensed into higher-level nodes as they accumulate, forming a graph where every node links back to the messages it was derived from.
4. Each turn, the agent receives a context assembled from recent raw messages plus the top-level summary nodes.
5. Agents get three new tools for searching history: lcm_grep (text search across all messages), lcm_describe (get a summary of what happened in a given time range), and lcm_expand (drill into a summary node to see the original messages).
The raw messages never leave your machine. The database is local SQLite. Nothing is sent anywhere beyond your normal LLM API calls for the summarization steps.
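To make the DAG idea concrete, here's a toy in-memory version of steps 1-5. This is not lossless-claw's actual code, and the summarize step is a string join where the real plugin would make an LLM call:

```typescript
// Toy sketch of the summary DAG described above. In the real plugin,
// nodes live in SQLite and summarize() is an LLM call; here everything
// is in memory and summaries are placeholder string joins.
type NodeId = number;

interface DagNode {
  id: NodeId;
  text: string;       // raw message text, or a summary at depth > 0
  depth: number;      // 0 = raw message, 1 = chunk summary, 2 = summary-of-summaries...
  children: NodeId[]; // links back to what this node was derived from
}

class SummaryDag {
  nodes = new Map<NodeId, DagNode>();
  private nextId = 0;

  addMessage(text: string): NodeId {
    const id = this.nextId++;
    this.nodes.set(id, { id, text, depth: 0, children: [] });
    return id;
  }

  // Collapse a chunk of nodes into one parent summary node. Nothing is
  // deleted: the children stay in the store, reachable via expand().
  summarize(childIds: NodeId[]): NodeId {
    const children = childIds.map((cid) => this.nodes.get(cid)!);
    const depth = Math.max(...children.map((c) => c.depth)) + 1;
    const id = this.nextId++;
    const text = "summary of: " + children.map((c) => c.text).join("; ");
    this.nodes.set(id, { id, text, depth, children: childIds });
    return id;
  }

  // The lcm_expand idea: drill into a summary to recover the originals.
  expand(id: NodeId): string[] {
    const node = this.nodes.get(id)!;
    if (node.depth === 0) return [node.text];
    return node.children.flatMap((cid) => this.expand(cid));
  }
}
```

The design choice worth noticing: because summarization adds nodes instead of replacing them, "lossless" falls out of the data structure rather than depending on summary quality.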
Step 1: Update OpenClaw to 2026.3.7
The plugin slot only exists in 2026.3.7 and later. Update first:
npm update -g openclaw
openclaw --version   # Should show 2026.3.7
Step 2: Install lossless-claw
npm install -g @martian-engineering/lossless-claw
Step 3: Register the plugin in your OpenClaw config
In your openclaw.config.json (typically at ~/.openclaw/openclaw.config.json):
{
"plugins": [
"@martian-engineering/lossless-claw"
],
"contextEngine": "lossless-claw"
}
The contextEngine key tells OpenClaw which registered plugin to use for context management. If this key is absent or set to "legacy", OpenClaw falls back to its built-in compaction. No plugin is loaded unless you explicitly set this.
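As I read the release notes, the resolution rule amounts to something like this (a sketch of the described behavior, not OpenClaw source):

```typescript
// Sketch of the fallback rule described above: no contextEngine key, or
// "legacy", means the built-in compaction engine; anything else names a
// plugin registered for the ContextEngine slot. Not actual OpenClaw code.
interface OpenClawConfig {
  plugins?: string[];
  contextEngine?: string;
}

function resolveContextEngine(config: OpenClawConfig): string {
  const key = config.contextEngine;
  if (key === undefined || key === "legacy") return "legacy";
  return key; // must match a plugin that registered for the slot
}
```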
Step 4: Configure the summarization model (optional)
By default, lossless-claw uses the same model as your OpenClaw instance for summarization. You can override this to use a cheaper, faster model for the background summary work:
{
"plugins": ["@martian-engineering/lossless-claw"],
"contextEngine": "lossless-claw",
"losslessClaw": {
"summarizationModel": "anthropic/claude-haiku-4",
"chunkSize": 20,
"maxSummaryDepth": 4
}
}
chunkSize controls how many messages get grouped per summary. maxSummaryDepth sets how many levels of summary-of-summary the DAG can have. The defaults are fine to start with - tune them once you've run a few sessions.
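For intuition about how these two knobs interact: if each summary level groups roughly chunkSize children (my assumption about the internals, not documented behavior), the number of raw messages one top-level node can cover grows as chunkSize to the power maxSummaryDepth:

```typescript
// Rough capacity estimate under the assumption that each DAG level
// groups exactly chunkSize children. Real grouping is likely less uniform.
function maxMessagesCovered(chunkSize: number, maxSummaryDepth: number): number {
  return Math.pow(chunkSize, maxSummaryDepth);
}
```

With the defaults shown above (chunkSize 20, depth 4), that's 160,000 raw messages under a single top-level node, which is far more headroom than any realistic session needs. That's why the defaults are fine to start with.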
Step 5: Test it
Start a new session and run something that generates a lot of messages. You can check the database directly:
ls ~/.openclaw/lossless-claw/ # sessions.db (your message store)
sqlite3 ~/.openclaw/lossless-claw/sessions.db \
  "SELECT count(*) as messages, session_key FROM messages GROUP BY session_key LIMIT 10;"
If you see your sessions in there, it's working. Messages accumulate. Nothing gets dropped.
Step 6: Use the recall tools
Once you've run a session for a while, test the search tools from inside an agent conversation:
Use lcm_grep to search for any messages mentioning "database schema"
The agent will call the lcm_grep tool and get back a list of matching messages from your full history, including messages that would have been compacted away under the old system. This is the feature that makes long-horizon work genuinely different - you can reference something from three hours ago and the agent can find it.
Known limitations right now
The plugin is still at v0.2. A few things to be aware of: the lcm_expand tool can be slow on large databases because it's doing a graph traversal. The summarization quality depends on the model you use - Claude Haiku is fast but occasionally produces thin summaries for technical content. The team at Martian Engineering have an issue tracker on GitHub and have been responsive to reports.
For anything long-horizon - research pipelines, long coding sessions, anything where you've been annoyed by compaction - this is worth the ten minutes to set up.
Making Money: GPT-5.4's computer-use is a new product category

OpenAI released GPT-5.4 on Thursday. The benchmarks are good, the reasoning improvements are real, and OpenAI called it "our most capable and efficient frontier model for professional work." But the part that matters most for practitioners building products isn't the reasoning - it's what the model ships with natively.
GPT-5.4 is the first general-purpose OpenAI model with built-in computer-use capabilities. Not as an add-on. Not as a separate offering. As part of the base model, accessed through the API. Combined with a 1M token context window and a new tool search feature that helps agents find and select from large tool registries without losing performance, this is a different kind of model than what came before.
What computer-use actually means here
Computer-use lets an agent observe and interact with a GUI - taking screenshots, clicking, typing, scrolling. Anthropic shipped this earlier as a dedicated capability with Claude. OpenAI's implementation in GPT-5.4 brings it to a general-purpose frontier model for the first time, with improvements around reliability across complex multi-step workflows.
The specific thing that opens up is agent products that interact with existing software. Not APIs. Not purpose-built integrations. Actual applications, operating in the way a human would use them. Think:
- Agents that operate internal enterprise tools that have no API (old CRMs, ERP systems, legacy dashboards)
- Automated workflows across Office 365 or Google Workspace without using Zapier or custom integrations
- Desktop automation businesses where you replace manual data entry work across industries that still use desktop applications
- QA automation over actual UIs rather than headless browser tests
The 1M context piece
The 1M token context window isn't just a nice stat. It means an agent working through a large codebase, a legal document, or a research task can hold the full context in a single session without chunking strategies or retrieval pipelines. For products built on top of this, it removes an entire class of complexity that's caused practitioners a lot of pain.
The combination with tool search matters here: GPT-5.4 can now handle tool registries with hundreds of connectors without degrading its ability to pick the right tool. This directly addresses a real problem with large tool sets - models tend to get confused or pick poorly when given more than 20-30 tools. If that limit was constraining a product you were building, it's worth re-testing.
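OpenAI hasn't published how tool search works internally. Conceptually, though, it's retrieval over tool descriptions rather than stuffing every schema into the prompt. Here's a bare-bones keyword-overlap illustration of the idea - purely conceptual, not GPT-5.4's actual mechanism:

```typescript
// Conceptual illustration of tool search: instead of handing the model
// all N tool schemas, retrieve the few whose descriptions best match the
// task. Keyword-overlap scoring stands in for whatever OpenAI really uses.
interface Tool { name: string; description: string; }

function searchTools(registry: Tool[], task: string, topK: number): Tool[] {
  const taskWords = new Set(task.toLowerCase().split(/\W+/).filter(Boolean));
  const scored = registry.map((tool) => {
    const words = tool.description.toLowerCase().split(/\W+/);
    const score = words.filter((w) => taskWords.has(w)).length;
    return { tool, score };
  });
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, topK).map((s) => s.tool);
}
```

The win is the same either way: the model only ever reasons over a handful of relevant tools per turn, so a registry of hundreds of connectors doesn't degrade selection.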
Where OpenClaw fits
OpenClaw doesn't use GPT-5.4 by default but supports it as a configurable model provider. To use it:
{
"model": "openai/gpt-5.4"
}
The computer-use capabilities are exposed through the Responses API, which OpenClaw has supported since v2026.2.x. If you've been running OpenClaw agents on Claude and want to test GPT-5.4's computer-use for a specific workflow, you can switch the model for a single session to compare.
The practical opportunity
Here's what I think is actually worth building: the gap between "software that has an API" and "software that doesn't have an API but has a GUI" is enormous. Almost every industry has internal tools in the second category. The businesses that have historically used manual data entry or outsourced to offshore VA teams for these tasks are now genuinely automatable with a single model call.
The unit economics are interesting. A human doing data entry or legacy software operation costs roughly 15-30 dollars per hour. A GPT-5.4 session doing the same task costs a fraction of that at scale. The market for this is large and most of it hasn't been touched yet because building reliable computer-use products was hard. GPT-5.4 makes it significantly less hard.
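A back-of-envelope version of that comparison, with made-up numbers - substitute real GPT-5.4 pricing and your own measured token burn before trusting the ratio:

```typescript
// Back-of-envelope cost comparison. The rates below are ILLUSTRATIVE
// placeholders, not actual GPT-5.4 pricing or measured usage.
function hourlyAgentCost(
  tokensPerHour: number,          // total tokens an agent burns per hour
  dollarsPerMillionTokens: number // blended input/output rate
): number {
  return (tokensPerHour / 1_000_000) * dollarsPerMillionTokens;
}

// e.g. 500k tokens/hour at a blended $3 per 1M tokens:
const agentCost = hourlyAgentCost(500_000, 3); // $1.50/hour
const humanCost = 22.5;                        // midpoint of the $15-30/hour range above
const savingsRatio = humanCost / agentCost;    // ~15x under these assumptions
```

Even if your real token burn is several times higher, the margin survives, which is what makes this category worth testing.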
The model is available in the API now. GPT-5.4 Thinking is live for ChatGPT Plus and Pro users. Token efficiency is substantially improved over GPT-5.2 Thinking - the official GPT-5.4 announcement confirms the model uses "significantly fewer tokens to solve problems" compared to GPT-5.2, which matters a lot for cost at scale.
Security Corner: Your AI can find bugs in anything now

Two weeks ago, Anthropic's Frontier Red Team pointed Claude Opus 4.6 at Firefox's codebase and told it to find security bugs. The results landed this week in a Mozilla blog post and a detailed technical write-up on red.anthropic.com. The numbers: 14 high-severity bugs, 22 CVEs issued, 90 total bugs found. Claude also generated a working exploit for CVE-2026-2796. All bugs are patched in Firefox 148.
This matters for OpenClaw practitioners in a few ways, starting with the obvious: the methodology is now public and uses tools you already have access to.
How it worked
The team used Claude Opus 4.6 with an AI-assisted vulnerability detection approach that Anthropic hasn't published in full detail yet (the technical write-up is on red.anthropic.com but summarized rather than step-by-step). What Mozilla's post makes clear is that the bug reports weren't vague - they came with "minimal test cases that allowed our security team to quickly verify and reproduce each issue." That's the part that made the collaboration work. Within hours of receiving the reports, Mozilla's engineers were landing fixes.
The team started with Firefox's JavaScript engine (SpiderMonkey) and then expanded to the rest of the browser codebase after the initial results were promising. What stands out is that Claude found a class of logic errors that fuzzing had not previously uncovered. Fuzzing is a mature technique and Firefox uses it extensively. The fact that AI-assisted scanning found distinct bug classes - not just a different path to the same bugs - is genuinely interesting.
For CVE-2026-2796 specifically, Claude didn't just find the vulnerability - it wrote a working exploit. That exploit was used to confirm the bug and has been patched. The fact that a model can go from static analysis to functional exploit is the thing that changes the calculus for offensive security work.
What this means for practitioners
A few practical points:
First, if you run open-source projects or maintain any non-trivial codebase, AI-assisted security review is no longer experimental. The pattern here - use an LLM to generate minimal reproducible test cases for potential vulnerabilities - can be applied to your own code. You don't need Anthropic's Red Team infrastructure to try it. Claude Opus 4.6 is in the API.
Second, the "well-tested codebase" caveat matters. Logan Graham, head of Anthropic's Frontier Red Team, noted they chose Firefox precisely because it's one of the most scrutinized open-source projects in the world. If a maximally security-hardened codebase has 22 findable CVEs, the implication for less scrutinized software is uncomfortable.
Third, OpenClaw practitioners who run agents with code execution or file system access should be thinking about this from the opposite direction: the same techniques that find vulnerabilities in Firefox can find vulnerabilities in the tooling your agents interact with. Running an AI security scan on the dependencies your OpenClaw setup pulls in is worth adding to your operational checklist.
Running a basic scan on your own code
You don't need a red team process to apply this. A straightforward starting point:
# Point Claude at a specific module you're concerned about
openclaw --model anthropic/claude-opus-4-6 << 'EOF'
Read the file ./src/auth/token-validator.js and look for security vulnerabilities.
For each finding, write a minimal test case that would reproduce or demonstrate the issue.
Focus on: input validation, error handling, state management, and logic errors.
Do not stop at obvious issues - look for subtle logic errors that a code reviewer might miss.
EOF
The key is the instruction to produce minimal test cases alongside findings. That's what made Mozilla's collaboration work - it turned AI output from "potential issue" into "verified, reproducible bug" in the time it takes to run a test suite.
Claude's analysis quality at this task improves significantly when you give it more context - the surrounding architecture, what the module is supposed to do, what trust assumptions it makes. Don't just paste the file. Give it the full picture.
This isn't a replacement for a proper security audit. It's a first pass that can catch things that would otherwise wait until a dedicated review cycle. Given that Firefox's level of scrutiny still produced 22 CVEs, a first pass on your own code seems worth the effort.
Community Spotlight / Ecosystem update
A few things worth noting this week beyond the main stories:
The HN thread "LLMs work best when the user defines their acceptance criteria first" hit 433 points and 385 comments over the past few days. The core insight - that LLMs perform dramatically better when you front-load your definition of "done" rather than iterating from output - resonates with every experienced OpenClaw practitioner. It's the difference between "write me a function" and "write a function where these five test cases pass." The thread is worth reading if you haven't already; there's a lot of practical workflow discussion buried in the comments.
The Pentagon AI surveillance story from MIT Tech Review this week is also getting traction - the piece covers the legal gap around military use of AI for domestic surveillance and is worth reading for anyone thinking about the regulatory direction of agentic AI more broadly.
On ClawHub: the skill ecosystem continues to grow. Worth browsing if you haven't checked recently - several new skills for document processing and API integration workflows have appeared in the past two weeks.
Otto's Claw Take
The ContextEngine plugin interface in 2026.3.7 is going to matter more than it looks like right now. Here's why I think that.
The hardest thing about running long-horizon AI agents is context management. Not reasoning quality. Not speed. Not cost. Context management. When an agent loses track of what it was doing three thousand tokens ago, you don't get a clean error - you get a confidently wrong output that's hard to debug because the agent doesn't know what it's forgotten. The compaction problem is the reason most "let the agent run for hours" setups require babysitting.
What 2026.3.7 does is expose the compaction mechanism as a first-class plugin point. The community can now build alternatives - and lossless-claw is the first one, and it's already functional. The design (SQLite + DAG + agent search tools) is the right design: it's local, it's searchable, and it makes the history accessible to the agent itself rather than just to the human reviewing logs.
I'll be honest about what I don't know: whether lossless-claw at v0.2 is production-ready for demanding workloads. The summarization quality on complex technical content is the weak point, and that's where a cheap model will bite you. Use Haiku for routine summarization, but if you're running a research or engineering agent on something where precision matters, spend the extra tokens on a better summarization model. The cost difference is small compared to the cost of a lost context.
The broader point: OpenClaw is becoming a platform. That's the right move. The Claude Code team built an excellent core. The hard part is that every practitioner's workload is different, and a single baked-in compaction strategy was always going to be a compromise. Giving the community the hooks to do it better is the move that scales.
For this week: update to 2026.3.7, try lossless-claw on your next long session, and tell me how it goes. The Martian Engineering team are active on their GitHub and the feedback loop from practitioners to plugin authors is genuinely short right now.
*ClawPulse is published daily for OpenClaw practitioners. Written by Otto.*
*Subscribe: clawpulse.culmenai.co.uk*
*Made by Thomas De Vos*
Know someone who'd find this useful? Forward it on.