Reference for the ChatCompress Display action — summarises the current conversation transcript via one LLM call and atomically replaces it with a single synthetic message, preserving semantic continuity while reducing token cost.
ChatCompress is a Display Action that condenses a long conversation history into a single compact summary. It makes exactly one LLM call — passing the full transcript plus a system-level summarisation prompt — then atomically replaces the in-memory transcript with a single synthetic assistant-role message whose body is the LLM-produced summary. Subsequent ChatRequest turns see only that summary as their prior context.
Use it when a conversation has grown long enough to push against the model’s context window, or when you want to reduce token consumption on follow-up turns without discarding conversational context entirely.
If you want to discard context entirely rather than summarise it, use ChatClear instead (see ChatClear Action Reference).
When fired, ChatCompress executes the following sequence for the resolved (clientGuid, chatName) pair:
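1. The target TChatSession control is resolved via the Object field (explicit name) or a visual-tree walk (auto, if Object is empty).
2. If the transcript is empty or contains a single message, the action returns status="ok" immediately without calling the LLM. Compressing a one-message or empty transcript is a no-op.
3. Otherwise, the full transcript and the summarisation prompt are sent to the LLM in one call. On success, the transcript is atomically replaced with a single synthetic assistant-role message containing the summary. If the call fails, the action returns status="error". No partial replacement occurs.

The next ChatRequest turn sees only the single summary message as prior context. The full original turn history is not recoverable after a successful compress; use ChatClear if you need a reversible reset.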
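The sequence above can be modelled in a few lines. The following Python sketch is illustrative only: `Message`, `compress_transcript`, and the `summarise` callable are hypothetical names, not part of the platform's API, and the real action runs inside the runtime rather than in user code. It shows the short-circuit for transcripts with fewer than two messages, the single LLM call, and the all-or-nothing replacement.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str  # "user" or "assistant"
    text: str


def compress_transcript(
    transcript: List[Message],
    summarise: Callable[[List[Message]], str],
) -> dict:
    """Hypothetical model of the compress sequence; not the platform's code."""
    # Short-circuit: an empty or one-message transcript is a no-op.
    if len(transcript) < 2:
        return {"status": "ok", "text": ""}

    try:
        # Exactly one LLM call over a copy of the full transcript.
        summary = summarise(list(transcript))
    except Exception:
        # On failure the transcript is left untouched (no partial replacement).
        return {"status": "error", "text": ""}

    # Swap the whole history for a single synthetic assistant message.
    transcript[:] = [Message(role="assistant", text=summary)]
    return {"status": "ok", "text": summary}
```

Because the transcript is only reassigned after `summarise` has returned, a failed call can never leave a half-replaced history behind, which is the property the "no partial replacement" guarantee describes.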
On a Button (or any clickable control), add an Action dynamic with:
| Field | Setting |
|---|---|
| Action type | ChatCompress |
| Object | Name of the target TChatSession control. Leave empty to let the platform find it via a visual-tree walk. |
| Return | Optional tag that receives the reply envelope JSON. |
| Result 1, Result 2, … (optional) | Tags computed from the reply via Expressions. |
| Query | Not used by ChatCompress. |
The Action editor hides fields that do not apply to ChatCompress. The Query field is hidden; Object, Return, and Result/Expression rows surface.
The reply envelope follows the same JSON schema as ChatRequest and AI.Execute — see Local AI Reply Envelope Schema for the full field reference. Key fields for ChatCompress:
{
"text": "<LLM-produced summary on success, or '' on error>",
"status": "ok | error | disabled",
"toolTrace": [],
"latencyMs": 1840,
"warnings": []
}
On success (status="ok"), text carries the summary that was written into the transcript as the replacement message. On failure (status="error"), text is empty and the original transcript is intact. toolTrace is always empty — compress does not dispatch platform tools. latencyMs reflects the LLM round-trip for the summarisation call.
- Empty transcript: status="ok", no LLM call, transcript unchanged.
- Single-message transcript: status="ok", no LLM call, transcript unchanged. A single message cannot be meaningfully summarised further.

Both short-circuit cases return immediately without calling the LLM. Their status is the same as for a successful compress, so status alone does not distinguish them; the text field, however, is empty for short-circuit returns.
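A consumer of the Return tag may want to tell a real compress apart from a short-circuit. The Python sketch below shows one way to do that from the envelope alone. `classify_compress_reply` is a hypothetical helper, and it assumes that a successful compress always carries a non-empty summary in `text`, which the reference implies but does not state outright.

```python
import json


def classify_compress_reply(envelope_json: str) -> str:
    """Map a reply envelope to an outcome label (hypothetical helper)."""
    reply = json.loads(envelope_json)
    status = reply.get("status")
    text = reply.get("text", "")

    if status == "disabled":
        return "disabled"    # Local AI is switched off; transcript untouched
    if status == "error":
        return "error"       # LLM call failed; transcript untouched
    if status == "ok" and text == "":
        return "no-op"       # short-circuit: empty or one-message transcript
    if status == "ok":
        return "compressed"  # transcript now holds the summary found in text
    return "unknown"         # any status value not listed in the schema
```

A display could use the label to decide whether to show a "conversation compressed" notice or to stay silent on a no-op.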
ChatCompress checks a single gate before executing:
- SolutionCapabilities[LocalAI].Enabled must be true. This is the master Local AI kill-switch. When disabled, ChatCompress returns status="disabled" without calling the LLM or touching the transcript.

ChatCompress does not inspect ModelOptions tool-surface bits (EnableChatHistory, EnableRuntimeMCP, per-category sub-bits). Transcript management is independent of the tool-surface configuration; the compress call uses the LLM solely for summarisation, not for tool dispatch.
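As a rough illustration of the gate, the Python sketch below models the check with hypothetical stand-ins: `LocalAICapability.enabled` plays the role of SolutionCapabilities[LocalAI].Enabled, and `model_options` is an arbitrary dictionary representing the ModelOptions bits. The function accepts the options only to make the point that it never reads them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocalAICapability:
    enabled: bool  # stand-in for SolutionCapabilities[LocalAI].Enabled


def gate_check(capability: LocalAICapability, model_options: dict) -> Optional[dict]:
    """Return a 'disabled' envelope when the gate is closed, else None.

    model_options is deliberately ignored: tool-surface bits do not
    influence whether a compress is allowed to run.
    """
    if not capability.enabled:
        return {"status": "disabled"}
    return None  # gate open; the caller proceeds with the compress
```

In this model, `gate_check(LocalAICapability(True), {"EnableRuntimeMCP": False})` returns `None`, so the compress would go ahead even though a tool-surface bit is off.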
ChatCompress is subject to the same 60-second wall-clock timeout as ChatRequest. For very long transcripts, the summarisation POST may itself take several seconds. If the LLM does not respond within the budget, the action returns status="error" and the original transcript is preserved.
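The timeout behaviour can be pictured as a wall-clock budget wrapped around the single summarisation call. The Python sketch below is a model under stated assumptions, not the platform's implementation: `summarise_with_budget` and the `summarise` callable are hypothetical, and the transcript is simplified to a list of strings. It uses the standard `concurrent.futures` module to abandon the call once the budget is spent.

```python
import concurrent.futures
from typing import Callable, List

TIMEOUT_SECONDS = 60.0  # wall-clock budget stated in the reference


def summarise_with_budget(
    transcript: List[str],
    summarise: Callable[[List[str]], str],
    timeout: float = TIMEOUT_SECONDS,
) -> dict:
    """Run one summarisation call under a wall-clock budget (illustrative)."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(summarise, list(transcript))
    try:
        summary = future.result(timeout=timeout)
    except concurrent.futures.TimeoutError:
        # Budget exhausted: report an error and keep the original transcript.
        return {"status": "error", "text": ""}
    except Exception:
        # The call itself failed: same outcome, transcript untouched.
        return {"status": "error", "text": ""}
    finally:
        # Do not block on a call that may still be running in the worker.
        pool.shutdown(wait=False)

    # Only reached on success, so the replacement is all-or-nothing.
    transcript[:] = [summary]
    return {"status": "ok", "text": summary}
```

A late result from an abandoned call is simply discarded here, because the transcript is only reassigned on the success path.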
On a CPU-based Ollama host (typical on-premise SCADA server), compressing a 20–30 turn transcript typically takes 3–8 seconds. GPU-equipped hosts are substantially faster. Run a quick test at your hardware tier before exposing a “Compress” button to operators who may expect an immediate response.
ChatCompress resolves the target chat session using the same two-path logic as ChatClear:
| Path | When used | Behavior |
|---|---|---|
| Path A (explicit name) | Object is set to a control name | The platform resolves the named element on the active Display panel. If no control with that name exists, the action returns an error envelope. |
| Path B (visual-tree walk) | Object is empty | The platform walks the visual tree of the active Display and targets the first