
# LLM-Agent-SHA — Opaque Agent Attribution (Phase 0)

Convention for attributing work to a specific LLM session/workstream across
issues, branches, PRs, and review handoffs, without exposing a human or model
identity. Approved by the owner decision on issue #86
(`#issuecomment-1354`); this document implements **Phase 0 only**.

## The one rule that matters

`LLM-Agent-SHA` is **informational attribution metadata only**. It must never
be used for authentication, authorization, review eligibility, merge
eligibility, profile permissions, or any other security decision.

The security gates remain, unchanged:

- the **authenticated Gitea user** (self-review/self-merge protection),
- the **active MCP profile** and its `allowed_operations`
  (see [`gitea-execution-profiles.md`](gitea-execution-profiles.md)),
- the fail-closed eligibility checks in `gitea_check_pr_eligibility`.

Two sessions with different `LLM-Agent-SHA` values that authenticate as the
same Gitea user are **the same actor** for review/merge safety. A different
SHA never unlocks self-review or self-merge. The negative tests in
`tests/test_llm_agent_sha.py` assert that the eligibility logic never
consults the SHA.

## Format

```text
LLM-Agent-SHA: llm-<12 lowercase hex chars>
```

Validation regex:

```text
^llm-[0-9a-f]{12}$
```

Examples: `llm-8f3a9c2d6b41`, `llm-41d0e7aa9f2c`, `llm-b7c93d441a08`.
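
A minimal sketch of a format check built on the regex above. The function name
`is_valid_llm_agent_sha` is hypothetical and not part of any existing tool; the
check only says whether a string is well-formed and says nothing about who
produced it.

```python
import re

# Pattern copied from the "Validation regex" above.
LLM_AGENT_SHA_RE = re.compile(r"^llm-[0-9a-f]{12}$")


def is_valid_llm_agent_sha(value: str) -> bool:
    """Return True if value has the documented llm-<12 lowercase hex> shape."""
    # fullmatch rejects a trailing newline, which `$` alone would tolerate.
    return LLM_AGENT_SHA_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("llm-8f3a9c2d6b41", "LLM-8F3A9C2D6B41", "llm-8f3a9c2d6b4"):
        print(candidate, is_valid_llm_agent_sha(candidate))
```

Uppercase hex and short values are rejected on purpose, so the same workstream
always renders to one exact string.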

### Generation

Generate 48 random bits (12 hex characters), for example with
`python3 -c "import secrets; print('llm-' + secrets.token_hex(6))"`,
or hash a non-secret session UUID and keep the first 12 hex characters of the
digest. An operator-provided opaque ID is also fine.

Do **not** derive the value from any of:

- a Gitea token or other secret,
- an email address or username,
- a machine hostname or private filesystem path,
- a model or provider name,
- conversation contents.

The SHA must contain no model name, provider name, human name, email,
hostname, token, private path, or conversation-derived content. It is safe to
include in PR bodies, issue comments, and audit logs — and only there.
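
Both generation options can be sketched as follows. The function names are
hypothetical, and the choice of SHA-256 with truncation for the UUID variant is
an assumption: this document only says "hash".

```python
import hashlib
import secrets
import uuid


def new_random_sha() -> str:
    """48 random bits rendered as 12 lowercase hex characters."""
    return "llm-" + secrets.token_hex(6)


def sha_from_session_uuid(session_uuid: uuid.UUID) -> str:
    """Hash a non-secret session UUID and keep the first 12 hex characters.

    Assumption: SHA-256 is used here; any stable hash would serve.
    """
    digest = hashlib.sha256(session_uuid.bytes).hexdigest()
    return "llm-" + digest[:12]


if __name__ == "__main__":
    print(new_random_sha())
    print(sha_from_session_uuid(uuid.uuid4()))
```

The UUID variant is deterministic, so the same session always maps to the same
SHA; the random variant needs the value to be recorded once and reused.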

## Lifetime

Canonical lifetime is **per PR/workstream**: pick one SHA when starting an
issue and keep it through the branch, PR, and handoff for that workstream. A
per-session SHA is acceptable when the session maps cleanly to one
workstream. Do not reuse a SHA across unrelated workstreams.

## Placement

Phase 0 uses **visible markdown metadata blocks** (not hidden HTML
comments). Include the block in PR bodies and review handoffs; keep it out of
ordinary comments unless attribution is genuinely useful there.

**Never put the SHA in branch or worktree names.** Branches stay
issue-linked and human-readable (`docs/issue-86-llm-agent-sha-phase0`), per
the branch standard.
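
One way to catch a SHA leaking into a branch or worktree name is a simple
pattern search. This is a sketch only; `contains_agent_sha` is a hypothetical
helper, not an existing hook or tool.

```python
import re

# Unanchored variant of the format regex, so it matches anywhere in a name.
EMBEDDED_SHA_RE = re.compile(r"llm-[0-9a-f]{12}")


def contains_agent_sha(name: str) -> bool:
    """Return True if a branch or worktree name embeds an LLM-Agent-SHA."""
    return EMBEDDED_SHA_RE.search(name) is not None


if __name__ == "__main__":
    print(contains_agent_sha("docs/issue-86-llm-agent-sha-phase0"))
    print(contains_agent_sha("docs/llm-8f3a9c2d6b41-notes"))
```

The example branch name from this section passes, because `llm-agent-sha` is
not followed by 12 hex characters.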

### Handoff metadata block (implementer → PR body / handoff report)

```markdown
LLM Handoff Metadata:
- LLM-Agent-SHA: llm-8f3a9c2d6b41
- LLM-Role: implementer
- Authenticated-Gitea-User: jcwalker3
- MCP-Profile: gitea-default
- Branch: docs/example-branch
- Worktree: branches/docs-example-branch
- Self-review allowed: no
```

### Review metadata block (reviewer → review comment)

```markdown
Review Metadata:
- LLM-Agent-SHA: llm-41d0e7aa9f2c
- LLM-Role: reviewer
- Authenticated-Gitea-User: sysadmin
- MCP-Profile: prgs-reviewer
- Eligibility: passed
```
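
For tooling that wants to read these blocks back (for example, to build an
audit summary), a small parser is enough. The sketch below assumes the
`- Key: value` layout shown above; `parse_metadata_block` is a hypothetical
helper, and anything it returns is attribution data, never an input to a
permission decision.

```python
def parse_metadata_block(text: str) -> dict[str, str]:
    """Collect `- Key: value` lines from a handoff or review metadata block."""
    fields: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip the block title and anything that is not a key/value bullet.
        if not line.startswith("- ") or ":" not in line:
            continue
        key, _, value = line[2:].partition(":")
        fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    block = """Review Metadata:
- LLM-Agent-SHA: llm-41d0e7aa9f2c
- LLM-Role: reviewer
- Authenticated-Gitea-User: sysadmin
- MCP-Profile: prgs-reviewer
- Eligibility: passed
"""
    print(parse_metadata_block(block))
```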

## Same SHA vs same user vs same profile

Reviewers and operators must keep three distinct identities straight:

| Comparison | Meaning | Effect on eligibility |
|---|---|---|
| same `LLM-Agent-SHA` | same LLM session/workstream wrote both artifacts | **none — attribution only** |
| same authenticated Gitea user | same Gitea actor | **blocks** self-review / self-merge, regardless of SHA |
| same MCP profile | same capability set | governs `allowed_operations` (what actions are permitted at all) |

Concretely: an implementer session (`llm-8f3a…`, user `jcwalker3`) and a
would-be reviewer session (`llm-41d0…`, also user `jcwalker3`) have different
SHAs but the **same Gitea user** — the reviewer session is still the PR
author to Gitea and must not review, approve, or merge. Review handoffs
require a genuinely different authenticated user (e.g. user `sysadmin` under
the `prgs-reviewer` profile).
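
The scenario above can be sketched in a few lines. This is an illustration
under stated assumptions, not the logic of `gitea_check_pr_eligibility`: the
`Session` type and `may_review` function are hypothetical, and the only point
is that the comparison uses the Gitea user and never the SHA.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    gitea_user: str      # authenticated Gitea user (the security identity)
    llm_agent_sha: str   # attribution only; deliberately unused below


def may_review(author: Session, reviewer: Session) -> bool:
    """Hypothetical self-review guard: compares Gitea users, never SHAs."""
    return author.gitea_user != reviewer.gitea_user


if __name__ == "__main__":
    implementer = Session("jcwalker3", "llm-8f3a9c2d6b41")
    same_user = Session("jcwalker3", "llm-41d0e7aa9f2c")
    other_user = Session("sysadmin", "llm-41d0e7aa9f2c")
    print(may_review(implementer, same_user))   # different SHA, same user
    print(may_review(implementer, other_user))  # different user
```

A different SHA with the same user is still blocked; only a different
authenticated user changes the outcome.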

## Phase 0 scope (and what is deferred)

Phase 0 is documentation, handoff/review templates, and negative tests only.
The following are deferred to later owner-approved phases; none of them exist
yet:

- launcher-enforced SHA generation,
- `LLM_AGENT_SHA` / `LLM_AGENT_ROLE` environment injection,
- `gitea_whoami` returning SHA/role,
- automatic PR body injection by MCP tools,
- audit schema changes requiring the SHA,
- release/orchestrator lineage tracking.

MCP tools neither read nor emit the SHA. Setting an `LLM_AGENT_SHA`
environment variable has no effect on any tool; the negative tests assert
eligibility results are byte-identical with and without it.
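
The shape of such a negative test might look like the sketch below. It does not
reproduce `tests/test_llm_agent_sha.py`; `check_eligibility` is a hypothetical
stand-in for the real check, and the serialized-JSON comparison is one assumed
way to express "byte-identical".

```python
import json
import os


def check_eligibility(pr_author: str, authenticated_user: str) -> dict:
    """Hypothetical stand-in for an eligibility check; reads no environment."""
    eligible = pr_author != authenticated_user
    reason = "ok" if eligible else "self-review blocked"
    return {"eligible": eligible, "reason": reason}


def eligibility_bytes(pr_author: str, authenticated_user: str) -> bytes:
    """Serialize the result deterministically so runs can be compared as bytes."""
    result = check_eligibility(pr_author, authenticated_user)
    return json.dumps(result, sort_keys=True).encode("utf-8")


def result_ignores_sha_env(pr_author: str, authenticated_user: str) -> bool:
    """Run the check without and with LLM_AGENT_SHA set and compare the bytes."""
    saved = os.environ.pop("LLM_AGENT_SHA", None)
    try:
        without_sha = eligibility_bytes(pr_author, authenticated_user)
        os.environ["LLM_AGENT_SHA"] = "llm-8f3a9c2d6b41"
        with_sha = eligibility_bytes(pr_author, authenticated_user)
    finally:
        # Restore the caller's environment exactly as it was.
        os.environ.pop("LLM_AGENT_SHA", None)
        if saved is not None:
            os.environ["LLM_AGENT_SHA"] = saved
    return without_sha == with_sha


if __name__ == "__main__":
    print(result_ignores_sha_env("jcwalker3", "jcwalker3"))
    print(result_ignores_sha_env("jcwalker3", "sysadmin"))
```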

## Related documents

- [`llm-workflow-runbooks.md`](llm-workflow-runbooks.md) — the runbooks whose
  handoffs carry these blocks
- [`gitea-execution-profiles.md`](gitea-execution-profiles.md) — profiles and
  `allowed_operations` (the real permission gate)
- [`safety-model.md`](safety-model.md) — audit, redaction, confirmation gates