diff --git a/README.md b/README.md index 2f052d9..ceae647 100644 --- a/README.md +++ b/README.md @@ -381,8 +381,15 @@ python3 -m pytest tests/ -v | `test_python_cli.py` | `close_issue.py` + `mark_issue.py` CLI validation | | `test_mirror_refs.py` | Flags, safety defaults, local integration tests | +(Core suites — the table is non-exhaustive; see `tests/` for the full set.) + All tests mock network and keychain access — no real API calls are made. +For how to write tests — mocking the API/auth safely, testing profile and +self-review/self-merge gates, no-secret regression expectations, and unit vs. +integration guidance — see +[`docs/developer-testing-guidelines.md`](docs/developer-testing-guidelines.md). + ## Troubleshooting ### macOS: `com.apple.provenance` blocks Python execution (#3) diff --git a/docs/developer-testing-guidelines.md b/docs/developer-testing-guidelines.md new file mode 100644 index 0000000..dc5693c --- /dev/null +++ b/docs/developer-testing-guidelines.md @@ -0,0 +1,240 @@ +# Developer Testing Guidelines + +How to write and run tests for Gitea-Tools. This guide reflects the current +repository behavior and the safety model documented in +[`safety-model.md`](safety-model.md), +[`credential-isolation.md`](credential-isolation.md), and +[`gitea-execution-profiles.md`](gitea-execution-profiles.md). + +Core principle: **tests never make real network calls and never touch real +credentials.** Every test mocks the HTTP client and the keychain/auth lookup. + +--- + +## 1. Standard test commands + +The test suite needs the project virtualenv (it provides the MCP SDK): + +```bash +# From the repository root +source venv/bin/activate +python3 -m pytest tests/ -q +``` + +Or invoke the venv interpreter directly without activating: + +```bash +./venv/bin/python -m pytest tests/ -q +``` + +Use `-q` for a compact summary and `-v` to see individual test names. + +### Run the full suite + +```bash +./venv/bin/python -m pytest tests/ -q +``` + +### Run targeted tests + +```bash +# One file +./venv/bin/python -m pytest tests/test_mcp_server.py -q + +# One class +./venv/bin/python -m pytest tests/test_merge_pr.py::TestMergeDisabled -q + +# One test, by node id +./venv/bin/python -m pytest tests/test_review_pr.py::TestAPIPayload::test_payload_fields_and_workflow -q + +# By keyword expression +./venv/bin/python -m pytest tests/ -q -k "merge and fails_closed" +``` + +--- + +## 2. Syntax and formatting checks + +These are fast and belong in any pre-PR loop: + +```bash +# Byte-compile the main modules (catches syntax errors) +python3 -m py_compile mcp_server.py +python3 -m py_compile manage_labels.py + +# Lint shell scripts without executing them +bash -n scripts/clear-provenance + +# Detect stray conflict markers and whitespace errors in the diff +git diff --check +``` + +Run `git diff --check` before every commit; it flags leftover merge-conflict +markers and trailing-whitespace/whitespace-error lines. + +--- + +## 3. How to add an MCP tool test + +MCP tools live in `mcp_server.py` and are exercised in +`tests/test_mcp_server.py`. Tests call the underlying tool function directly +with the network layer and auth mocked. The established pattern: + +```python +from unittest.mock import patch, MagicMock + +FAKE_AUTH = "Basic ZmFrZTpmYWtl" # not a real credential + +class TestCreateIssue(unittest.TestCase): + @patch("mcp_server.api_request") + @patch("mcp_server.get_auth_header", return_value=FAKE_AUTH) + def test_creates_issue(self, _auth, mock_api): + mock_api.return_value = {"number": 1, "html_url": "https://gitea.example.com/issues/1"} + + result = mcp_server.gitea_create_issue(title="Add tests", remote="prgs") + + # Assert on the request the tool would have made + mock_api.assert_called_once() + method, url = mock_api.call_args[0][0], mock_api.call_args[0][1] + self.assertEqual(method, "POST") + self.assertIn("/issues", url) +``` + +Checklist when adding a tool test: + +* Patch `mcp_server.api_request` — never hit the network. +* Patch `mcp_server.get_auth_header` to return a fake header — never read the + keychain. +* Assert on the **method, URL, and payload** the tool builds, and on the shape + of the returned payload. +* Cover both `dadeschools` and `prgs` remotes when the tool takes `remote`, and + confirm the correct host/org/repo are used. +* Cover the error path (e.g. `api_request` raising) and confirm the tool + surfaces a clear message without leaking secrets. + +--- + +## 4. How to mock API requests safely + +* **Always patch `mcp_server.api_request`** (or `gitea_auth.api_request` for the + CLI/auth-level tests). No test should open a socket. +* **Always patch the auth lookup** (`get_auth_header` / `get_credentials`) and + return an obviously fake value. Do not put a real token or password in a test, + a fixture, or an environment file. +* Prefer asserting on `mock_api.call_args` (method/URL/payload) over asserting on + a real response body. +* For keychain behavior specifically, see `tests/test_credentials.py`, which + mocks the `git credential fill` subprocess (`Popen`) and the environment. + +--- + +## 5. How to test profile / allowed-operation failures + +The execution-profile model (see +[`gitea-execution-profiles.md`](gitea-execution-profiles.md)) enforces that a +tool may only perform operations in its profile's `allowed_operations`, and that +`forbidden_operations` always override `allowed_operations`. Mutating tools must +**fail closed** when the active profile does not permit the operation. + +When adding or changing a gated tool, add tests that: + +* Configure a profile whose `allowed_operations` does **not** include the + requested operation, and assert the tool refuses **without** calling + `api_request` (assert `mock_api.assert_not_called()`). +* Configure a profile where the operation is in **both** allowed and forbidden, + and assert forbidden wins (still refused). +* Confirm the refusal message names the missing operation and does not include + any secret material. +* Confirm the happy path (operation allowed) still reaches `api_request`. + +The guiding assertion is: **no mutation path may reach `api_request` unless the +profile/allowed-operation check passed first.** + +--- + +## 6. How to test self-review / self-merge gates + +Author-cannot-review and author-cannot-merge are hard safety gates. The merge +path is gated (`gitea_merge_pr`), the legacy review wrapper fails closed on +`merge=True`, and `gitea_submit_pr_review` never merges. Existing coverage lives +in `tests/test_merge_pr.py` and `tests/test_review_pr.py`. + +Patterns to follow (see those files for concrete examples): + +* **Self-merge blocked:** authenticated user == PR author → the tool returns a + refusal and **never calls the merge endpoint** (`mock_api.assert_not_called()` + or assert no `POST .../merge`). +* **Fail-closed inputs:** missing confirmation string, or an unexpected + `expected_head_sha`/changed-file set → refuse before any API call. +* **Legacy wrapper:** `merge=True` on the review wrapper fails closed and points + to the gated workflow, with no API call + (`test_merge_flag_fails_closed_without_api_call`). +* **Self-approval blocked:** authenticated user == PR author → `approve` / + `request_changes` refused. + +Every new gate should have a test proving the mutating endpoint is **not** +reached when the gate should block. + +--- + +## 7. No-secret / no-token regression expectations + +Secrets must never appear in logs, tool return values, audit records, or test +output (see [`safety-model.md`](safety-model.md) §3). The audit module +(`gitea_audit.py`) redacts secret-like keys and value prefixes; see +`tests/test_audit.py`. + +Expectations for new tests: + +* Assert that token/authorization/password fields are replaced with + `gitea_audit.REDACTED` in any structured output or audit record + (`test_redacts_secret_keys`, `test_redacts_nested_and_lists`). +* Assert that credential-looking substrings in free-text (error messages, + reasons) are redacted (`test_redacts_credential_value_prefixes`, + `test_metadata_and_reason_redacted`). +* Never commit a real token/password, even in a fixture. Use obviously fake + values (e.g. `FAKE_AUTH` above). +* When a tool returns identity/profile metadata, assert it contains the + non-secret fields (username, profile name) and **not** the token. + +There is no third-party secret scanner wired into this repo today; secret safety +is enforced by `gitea_audit.redact` plus the regression tests above. A quick +manual sweep before a PR: + +```bash +# Look for accidentally committed credentials in the diff +git diff --cached | grep -nEi "authorization: (basic|bearer)|password|token=[A-Za-z0-9]" || echo "clean" +``` + +--- + +## 8. Unit tests vs. future Docker integration tests + +* **Unit tests (today, default):** fast, fully mocked, no network, no keychain. + This is where the vast majority of coverage lives and where new tests should + go. They must stay fast and must not require credentials. +* **Docker/local-Gitea integration tests (planned, see #66):** opt-in and + skipped by default, gated behind an explicit environment variable and run + against a pinned, disposable Gitea container. They validate real API behavior + (pagination, permissions, label/PR-review endpoints, error payloads) that + mocks cannot prove. They must not require production credentials and must not + leak tokens. + +Rule of thumb: prove **logic and request-shaping** with unit tests; reserve +integration tests for **real-server compatibility**. Do not convert unit tests +into network tests. + +--- + +## 9. Read-only vs. mutating tool expectations + +* **Read-only tools** (e.g. `gitea_whoami`, `gitea_view_*`, `gitea_list_*`, + `gitea_get_profile`): test that they never issue a mutating HTTP method and + never require a mutation gate. Assert the request method is `GET`. +* **Mutating tools** (create/edit/close/label, review, merge, mirror): test that + they (a) pass the profile/allowed-operation gate, (b) honor confirmation and + self-action gates, (c) emit an audit record with the authenticated identity + and outcome, and (d) fail closed — no `api_request` call — when any gate fails. + +Keep this split explicit in test names and assertions so a reviewer can see, per +tool, which category it belongs to and which gates it must respect.