compiler: concept update can write partial pages after finish_reason=length

## Problem

During `openkb add`, concept update calls can hit the provider/model output length limit (`finish_reason == "length"`). OpenKB prints a warning, but the truncated response can still be parsed via `json_repair` and written to `wiki/concepts/*.md`, leaving partially rewritten concept pages on disk.

In practice this means the command can finish with `[OK]` while one or more concept pages are silently corrupted/truncated and require manual repair.

## Environment

- OpenKB: `v0.4.2` (`c92f2e9`)
- Python: 3.12 venv, installed via `uv tool install`
- KB config:

```yaml
language: zh
model: openai/claude-sonnet-4-6
pageindex_threshold: 20
```

## Reproduction

Re-ingest a medium/large markdown document that updates already-large concept pages:

```bash
openkb --kb-dir /Volumes/WDSN7100-2TB/大喵OpenKB add \
  /Volumes/WDSN7100-2TB/大喵OpenKB/raw/claude-desktop-1m-context-zh-patch-guide.md
```

Observed output:

```text
Adding: claude-desktop-1m-context-zh-patch-guide.md
  Compiling short doc...
    summary.......................... 34.2s (in=6428, out=1884)
    concepts-plan....... 7.2s (in=11882, out=452)
    Generating 5 concept(s) (concurrency=5)...
    concept: electron-app-bundle-patching... 37.1s (in=9693, out=1946)
    update: macos-codesign-entitlements... 46.5s (in=12118, out=3429)
    entity: claude-desktop... 24.4s (in=9719, out=1298)
openkb.agent.compiler WARNING: LLM [update: claude-model-whitelist] hit length limit — output may be truncated.
    [WARN] update: claude-model-whitelist hit length limit — output may be truncated.
    update: claude-model-whitelist... 67.3s (in=12869, out=4096)
openkb.agent.compiler WARNING: LLM [update: patch-survival-strategy] hit length limit — output may be truncated.
    [WARN] update: patch-survival-strategy hit length limit — output may be truncated.
    update: patch-survival-strategy... 70.1s (in=13566, out=4096)
openkb.agent.compiler WARNING: LLM [update: claude-desktop-3p-gateway] hit length limit — output may be truncated.
    [WARN] update: claude-desktop-3p-gateway hit length limit — output may be truncated.
    update: claude-desktop-3p-gateway... 72.6s (in=13444, out=4096)
  [OK] claude-desktop-1m-context-zh-patch-guide.md added to knowledge base.
```

Afterwards, at least two concept pages had visibly truncated tails and had to be manually repaired:

- `wiki/concepts/claude-model-whitelist.md`
- `wiki/concepts/claude-desktop-3p-gateway.md`

Example of a truncated tail:

```md
| Code tab 仍显示 200k，3P 配置看起来正确 | 按版本确认
```

and another:

```md
- **已是 ad-hoc
```
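Both tails above can be caught with cheap heuristics. The following is only a sketch of such a check, not code from OpenKB: the function name `looks_truncated` and the three rules it applies (unclosed code fence, table row without a closing pipe, unbalanced bold marker on the last line) are assumptions chosen to match the two examples.

```python
def looks_truncated(markdown: str) -> bool:
    """Heuristic check for a page whose tail was cut off mid-structure.

    Hypothetical helper: the rules below are illustrative, not exhaustive.
    """
    lines = markdown.rstrip().splitlines()
    if not lines:
        # Empty content is treated as incomplete.
        return True

    # An odd number of fence lines means a code block was never closed.
    fence_count = sum(1 for line in lines if line.lstrip().startswith("```"))
    if fence_count % 2 == 1:
        return True

    last = lines[-1].strip()

    # A table row that opens with a pipe should also end with one.
    if last.startswith("|") and not last.endswith("|"):
        return True

    # An odd number of bold markers on the last line means "**" was never closed.
    if last.count("**") % 2 == 1:
        return True

    return False
```

A check like this will miss truncation that happens to land on a clean line boundary, so it can only complement a `finish_reason` check, not replace it.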

## Root cause hypothesis

`_warn_if_truncated()` detects the length finish reason correctly:

```python
if finish_reason != "length":
    return
sys.stdout.write(f"    [WARN] {step_name} hit length limit{cap} — output may be truncated.\n")
```

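However, the response is still returned to the caller as ordinary content. `_gen_update()` then calls `_parse_json(raw)`, and `json_repair` may salvage the truncated JSON prefix into a parseable object, which is then written to disk. All three truncated calls in the log above report `out=4096`, which suggests a 4096-token output cap on these calls.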

Relevant flow:

```python
raw = await _llm_call_async(... f"update: {name}", response_format=_JSON_RESPONSE_FORMAT)
parsed = _parse_json(raw)
content = parsed.get("content") or ""
...
return name, content, True, brief
```
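To illustrate why a repaired prefix is dangerous, here is a toy stand-in for the repair step. It is not how `json_repair` works internally; `naive_repair` is a hypothetical helper that assumes the cut happened inside the last string value, and the sample payload is invented for the demonstration.

```python
import json


def naive_repair(raw: str) -> dict:
    """Toy repair: parse as-is, else close an unterminated string and object.

    Hypothetical stand-in for a real repair library, for illustration only.
    """
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Assumes the output was cut off inside the final string value.
        return json.loads(raw + '"}')


# A response cut off mid-table-row, as if the output token limit was reached.
truncated = (
    '{"brief": "adds gateway notes", '
    '"content": "# Gateway\\n\\n| Symptom | Fix |\\n| Code tab shows 200k | check ver'
)

parsed = naive_repair(truncated)
content = parsed.get("content") or ""
```

The repaired object has both expected keys and a non-empty `content`, so nothing downstream can tell that the page ends mid-row unless the finish reason is carried along with it.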

This is related to, but not covered by:

- #90: fixed a hard `max_tokens=2048` cap for `concepts-plan`
- #71 / #75: hardened JSON mode and plan parsing
- #73: markdown input/context chunking

This issue is specifically about **per-concept create/update calls hitting output length limits and still writing partial pages**.

## Expected behavior

When an LLM response has `finish_reason == "length"`, OpenKB should not write the parsed/repaired content to `wiki/concepts/*.md` as if it were complete.

Safer options:

1. Treat `finish_reason == "length"` as a failed concept update and skip writing that page.
2. Retry once with a higher configurable `max_tokens` for concept create/update calls.
3. Return structured status from `_llm_call(_async)` such as `{content, finish_reason}` instead of only `content`, so callers can decide whether it is safe to write.
4. Make per-step output token caps configurable, e.g. `concept_update_max_tokens`, while defaulting high enough for large existing concept pages.
5. Optionally add a final validation step that rejects concept pages with clearly incomplete markdown/table/code block endings.
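Options 1 and 3 could be combined roughly as follows. This is a minimal sketch under assumptions: `LLMResult` and `apply_concept_update` are hypothetical names, and an in-memory dict stands in for the files under `wiki/concepts/`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LLMResult:
    """Hypothetical return type: content plus the provider's finish reason."""

    content: str
    finish_reason: Optional[str]

    @property
    def truncated(self) -> bool:
        return self.finish_reason == "length"


def apply_concept_update(name: str, result: LLMResult, pages: dict[str, str]) -> bool:
    """Write the page only if the response completed; return whether it was written."""
    if result.truncated:
        # Leave the existing page untouched so the caller can report a failure.
        return False
    pages[name] = result.content
    return True
```

With a return value like this, the caller can count skipped pages and avoid printing `[OK]` when any update was dropped.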

## Why this matters

The warning currently looks non-fatal, but the side effect can be data corruption in the generated wiki. Because `openkb add` still reports `[OK]`, users may not notice until later search/query results are based on incomplete concept pages.

