unity-mcp/.claude/prompts/nl-unity-suite-nl.md

203 lines
9.8 KiB
Markdown
Raw Normal View History

Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
# Unity NL Editing Suite — Additive Test Design
You are running inside CI for the `unity-mcp` repo. Use only the tools allowed by the workflow. Work autonomously; do not prompt the user. Do NOT spawn subagents.
**Print this once, verbatim, early in the run:**
AllowedTools: Write,mcp__unity__manage_editor,mcp__unity__list_resources,mcp__unity__read_resource,mcp__unity__apply_text_edits,mcp__unity__script_apply_edits,mcp__unity__validate_script,mcp__unity__find_in_file,mcp__unity__read_console,mcp__unity__get_sha
---
## Mission
1) Pick target file (prefer):
- `unity://path/Assets/Scripts/LongUnityScriptClaudeTest.cs`
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
2) Execute NL tests NL-0..NL-4 in order using minimal, precise edits that build on each other.
3) Validate each edit with `mcp__unity__validate_script(level:"standard")`.
4) **Report**: write one `<testcase>` XML fragment per test to `reports/<TESTID>_results.xml`. Do **not** read or edit `$JUNIT_OUT`.
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
**CRITICAL XML FORMAT REQUIREMENTS:**
- Each file must contain EXACTLY one `<testcase>` root element
- NO prologue, epilogue, code fences, or extra characters
- NO markdown formatting or explanations outside the XML
- Use this exact format:
```xml
<testcase name="NL-0 — Baseline State Capture" classname="UnityMCP.NL-T">
<system-out><![CDATA[
(evidence of what was accomplished)
]]></system-out>
</testcase>
```
- If test fails, include: `<failure message="reason"/>`
- TESTID must be one of: NL-0, NL-1, NL-2, NL-3, NL-4
5) **NO RESTORATION** - tests build additively on previous state.
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
6) **STRICT FRAGMENT EMISSION** - After each test, immediately emit a clean XML file under `reports/<TESTID>_results.xml` with exactly one `<testcase>` whose `name` begins with the exact test id. No prologue/epilogue or fences. If the test fails, include a `<failure message="..."/>` and still emit.
---
## Environment & Paths (CI)
- Always pass: `project_root: "TestProjects/UnityMCPTests"` and `ctx: {}` on list/read/edit/validate.
- **Canonical URIs only**:
- Primary: `unity://path/Assets/...` (never embed `project_root` in the URI)
- Relative (when supported): `Assets/...`
CI provides:
- `$JUNIT_OUT=reports/junit-nl-suite.xml` (precreated; leave alone)
- `$MD_OUT=reports/junit-nl-suite.md` (synthesized from JUnit)
---
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
## Transcript Minimization Rules
- Do not restate tool JSON; summarize in ≤ 2 short lines.
- Never paste full file contents. For matches, include only the matched line and ±1 line.
- Prefer `mcp__unity__find_in_file` for targeting; avoid `mcp__unity__read_resource` unless strictly necessary. If needed, limit to `head_bytes ≤ 256` or `tail_lines ≤ 10`.
- Pertest `system-out` ≤ 400 chars: brief status only (no SHA).
- Console evidence: fetch the last 10 lines with `include_stacktrace:false` and include ≤ 3 lines in the fragment.
- Avoid quoting multiline diffs; reference markers instead.
— Console scans: perform two reads — last 10 `log/info` lines and up to 3 `error` entries (use `include_stacktrace:false`); include ≤ 3 lines total in the fragment; if no errors, state "no errors".
---
## Tool Mapping
- **Anchors/regex/structured**: `mcp__unity__script_apply_edits`
- Allowed ops: `anchor_insert`, `replace_method`, `insert_method`, `delete_method`, `regex_replace`
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
- For `anchor_insert`, always set `"position": "before"` or `"after"`.
- **Precise ranges / atomic batch**: `mcp__unity__apply_text_edits` (nonoverlapping ranges)
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
STRICT OP GUARDRAILS
- Do not use `anchor_replace`. Structured edits must be one of: `anchor_insert`, `replace_method`, `insert_method`, `delete_method`, `regex_replace`.
- For multispot textual tweaks in one operation, compute nonoverlapping ranges with `mcp__unity__find_in_file` and use `mcp__unity__apply_text_edits`.
- **Hash-only**: `mcp__unity__get_sha` — returns `{sha256,lengthBytes,lastModifiedUtc}` without file body
- **Validation**: `mcp__unity__validate_script(level:"standard")`
- **Dynamic targeting**: Use `mcp__unity__find_in_file` to locate current positions of methods/markers
---
## Additive Test Design Principles
**Key Changes from Reset-Based:**
1. **Dynamic Targeting**: Use `find_in_file` to locate methods/content, never hardcode line numbers
2. **State Awareness**: Each test expects the file state left by the previous test
3. **Content-Based Operations**: Target methods by signature, classes by name, not coordinates
4. **Cumulative Validation**: Ensure the file remains structurally sound throughout the sequence
5. **Composability**: Tests demonstrate how operations work together in real workflows
**State Tracking:**
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
- Track file SHA after each test (`mcp__unity__get_sha`) for potential preconditions in later passes. Do not include SHA values in report fragments.
- Use content signatures (method names, comment markers) to verify expected state
- Validate structural integrity after each major change
---
## Execution Order & Additive Test Specs
### NL-0. Baseline State Capture
**Goal**: Establish initial file state and verify accessibility
**Actions**:
- Read file head and tail to confirm structure
- Locate key methods: `HasTarget()`, `GetCurrentTarget()`, `Update()`, `ApplyBlend()`
- Record initial SHA for tracking
- **Expected final state**: Unchanged baseline file
### NL-1. Core Method Operations (Additive State A)
**Goal**: Demonstrate method replacement operations
**Actions**:
- Replace `HasTarget()` method body: `public bool HasTarget() { return currentTarget != null; }`
Unity MCP CI Test Improvements (#452) * Update github-repo-stats.yml * Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over * Revert "Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over" This reverts commit 74d35d371a28b2d86cb7722e28017b29be053efd. * Add fork-only Unity tests workflow and guard upstream run * Move fork Unity tests workflow to root * Fix MCP server install step in NL suite workflow * Harden NL suite prompts for deterministic anchors * update claude haiku version for NL/T tests * Fix CI: share unity-mcp status dir * update yaml * Add Unity bridge debug step in CI * Fail fast when Unity MCP status file missing * Allow Unity local share writable for MCP status * Mount Unity cache rw and dump Editor log for MCP debug * Allow Unity config dir writable for MCP heartbeat/logs * Write Unity logs to file and list config dir in debug * Use available Anthropic models for T pass * Use latest claude sonnet/haiku models in workflow * Fix YAML indentation for MCP preflight step * Point MCP server to src/server.py and fix preflight * another try * Add MCP preflight workflow and update NL suite * Fixes to improve CI testing * Cleanup * fixes * diag * fix yaml * fix status dir * Fix YAML / printing to stdout --> stderr * find in file fixes. * fixes to find_in_file and CI report format error * Only run the stats on the CoPlay main repo, not forks. * Coderabbit fixes.
2025-12-11 06:54:55 +08:00
- Validate.
- Insert `PrintSeries()` method after a unique anchor method. Prefer `GetCurrentTarget()` if unique; otherwise use another unique method such as `ApplyBlend`. Insert: `public void PrintSeries() { Debug.Log("1,2,3"); }`
- Validate that both methods exist and are properly formatted.
- Delete `PrintSeries()` method (cleanup for next test)
- **Expected final state**: `HasTarget()` modified, file structure intact, no temporary methods
### NL-2. Anchor Comment Insertion (Additive State B)
**Goal**: Demonstrate anchor-based insertions above methods
**Actions**:
Unity MCP CI Test Improvements (#452) * Update github-repo-stats.yml * Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over * Revert "Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over" This reverts commit 74d35d371a28b2d86cb7722e28017b29be053efd. * Add fork-only Unity tests workflow and guard upstream run * Move fork Unity tests workflow to root * Fix MCP server install step in NL suite workflow * Harden NL suite prompts for deterministic anchors * update claude haiku version for NL/T tests * Fix CI: share unity-mcp status dir * update yaml * Add Unity bridge debug step in CI * Fail fast when Unity MCP status file missing * Allow Unity local share writable for MCP status * Mount Unity cache rw and dump Editor log for MCP debug * Allow Unity config dir writable for MCP heartbeat/logs * Write Unity logs to file and list config dir in debug * Use available Anthropic models for T pass * Use latest claude sonnet/haiku models in workflow * Fix YAML indentation for MCP preflight step * Point MCP server to src/server.py and fix preflight * another try * Add MCP preflight workflow and update NL suite * Fixes to improve CI testing * Cleanup * fixes * diag * fix yaml * fix status dir * Fix YAML / printing to stdout --> stderr * find in file fixes. * fixes to find_in_file and CI report format error * Only run the stats on the CoPlay main repo, not forks. * Coderabbit fixes.
2025-12-11 06:54:55 +08:00
- Use `find_in_file` with a tolerant anchor to locate the `Update()` method, e.g. `(?m)^\\s*(?:public|private|protected|internal)?\\s*void\\s+Update\\s*\\(\\s*\\)`
- Expect exactly one match; if multiple, fail clearly rather than guessing.
- Insert `// Build marker OK` comment line above `Update()` method
- Verify comment exists and `Update()` still functions
- **Expected final state**: State A + build marker comment above `Update()`
### NL-3. End-of-Class Content (Additive State C)
Unity MCP CI Test Improvements (#452) * Update github-repo-stats.yml * Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over * Revert "Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over" This reverts commit 74d35d371a28b2d86cb7722e28017b29be053efd. * Add fork-only Unity tests workflow and guard upstream run * Move fork Unity tests workflow to root * Fix MCP server install step in NL suite workflow * Harden NL suite prompts for deterministic anchors * update claude haiku version for NL/T tests * Fix CI: share unity-mcp status dir * update yaml * Add Unity bridge debug step in CI * Fail fast when Unity MCP status file missing * Allow Unity local share writable for MCP status * Mount Unity cache rw and dump Editor log for MCP debug * Allow Unity config dir writable for MCP heartbeat/logs * Write Unity logs to file and list config dir in debug * Use available Anthropic models for T pass * Use latest claude sonnet/haiku models in workflow * Fix YAML indentation for MCP preflight step * Point MCP server to src/server.py and fix preflight * another try * Add MCP preflight workflow and update NL suite * Fixes to improve CI testing * Cleanup * fixes * diag * fix yaml * fix status dir * Fix YAML / printing to stdout --> stderr * find in file fixes. * fixes to find_in_file and CI report format error * Only run the stats on the CoPlay main repo, not forks. * Coderabbit fixes.
2025-12-11 06:54:55 +08:00
**Goal**: Demonstrate end-of-class insertions without ambiguous anchors
**Actions**:
Unity MCP CI Test Improvements (#452) * Update github-repo-stats.yml * Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over * Revert "Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over" This reverts commit 74d35d371a28b2d86cb7722e28017b29be053efd. * Add fork-only Unity tests workflow and guard upstream run * Move fork Unity tests workflow to root * Fix MCP server install step in NL suite workflow * Harden NL suite prompts for deterministic anchors * update claude haiku version for NL/T tests * Fix CI: share unity-mcp status dir * update yaml * Add Unity bridge debug step in CI * Fail fast when Unity MCP status file missing * Allow Unity local share writable for MCP status * Mount Unity cache rw and dump Editor log for MCP debug * Allow Unity config dir writable for MCP heartbeat/logs * Write Unity logs to file and list config dir in debug * Use available Anthropic models for T pass * Use latest claude sonnet/haiku models in workflow * Fix YAML indentation for MCP preflight step * Point MCP server to src/server.py and fix preflight * another try * Add MCP preflight workflow and update NL suite * Fixes to improve CI testing * Cleanup * fixes * diag * fix yaml * fix status dir * Fix YAML / printing to stdout --> stderr * find in file fixes. * fixes to find_in_file and CI report format error * Only run the stats on the CoPlay main repo, not forks. * Coderabbit fixes.
2025-12-11 06:54:55 +08:00
- Use `find_in_file` to locate brace-only lines (e.g., `(?m)^\\s*}\\s*$`). Select the **last** such line (preferably indentation 0 if multiples).
- Compute an exact insertion point immediately before that last brace using `apply_text_edits` (do not use `anchor_insert` for this step).
- Insert three comment lines before the final class brace:
```
// Tail test A
// Tail test B
// Tail test C
```
- **Expected final state**: State B + tail comments before class closing brace
### NL-4. Console State Verification (No State Change)
**Goal**: Verify Unity console integration without file modification
**Actions**:
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
- Read last 10 Unity console lines (log/info)
- Perform a targeted scan for errors/exceptions (type: errors), up to 3 entries
- Validate no compilation errors from previous operations
- **Expected final state**: State C (unchanged)
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
- **IMMEDIATELY** write clean XML fragment to `reports/NL-4_results.xml` (no extra text). The `<testcase name>` must start with `NL-4`. Include at most 3 lines total across both reads, or simply state "no errors; console OK" (≤ 400 chars).
## Dynamic Targeting Examples
**Instead of hardcoded coordinates:**
```json
{"startLine": 31, "startCol": 26, "endLine": 31, "endCol": 58}
```
**Use content-aware targeting:**
```json
# Find current method location
find_in_file(pattern: "public bool HasTarget\\(\\)")
# Then compute edit ranges from found position
```
**Method targeting by signature:**
```json
{"op": "replace_method", "className": "LongUnityScriptClaudeTest", "methodName": "HasTarget"}
```
**Anchor-based insertions:**
```json
Unity MCP CI Test Improvements (#452) * Update github-repo-stats.yml * Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over * Revert "Server: refine shutdown logic per bot feedback\n- Parameterize _force_exit(code) and use timers with args\n- Consistent behavior on BrokenPipeError (no immediate exit)\n- Exit code 1 on unexpected exceptions\n\nTests: restore telemetry module after disabling to avoid bleed-over" This reverts commit 74d35d371a28b2d86cb7722e28017b29be053efd. * Add fork-only Unity tests workflow and guard upstream run * Move fork Unity tests workflow to root * Fix MCP server install step in NL suite workflow * Harden NL suite prompts for deterministic anchors * update claude haiku version for NL/T tests * Fix CI: share unity-mcp status dir * update yaml * Add Unity bridge debug step in CI * Fail fast when Unity MCP status file missing * Allow Unity local share writable for MCP status * Mount Unity cache rw and dump Editor log for MCP debug * Allow Unity config dir writable for MCP heartbeat/logs * Write Unity logs to file and list config dir in debug * Use available Anthropic models for T pass * Use latest claude sonnet/haiku models in workflow * Fix YAML indentation for MCP preflight step * Point MCP server to src/server.py and fix preflight * another try * Add MCP preflight workflow and update NL suite * Fixes to improve CI testing * Cleanup * fixes * diag * fix yaml * fix status dir * Fix YAML / printing to stdout --> stderr * find in file fixes. * fixes to find_in_file and CI report format error * Only run the stats on the CoPlay main repo, not forks. * Coderabbit fixes.
2025-12-11 06:54:55 +08:00
{"op": "anchor_insert", "anchor": "(?m)^\\s*(?:public|private|protected|internal)?\\s*void\\s+Update\\s*\\(\\s*\\)", "position": "before", "text": "// comment"}
```
---
## State Verification Patterns
**After each test:**
1. Verify expected content exists: `find_in_file` for key markers
2. Check structural integrity: `validate_script(level:"standard")`
3. Update SHA tracking for next test's preconditions
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
4. Emit a pertest fragment to `reports/<TESTID>_results.xml` immediately. If the test failed, still write a single `<testcase>` with a `<failure message="..."/>` and evidence in `system-out`.
5. Log cumulative changes in test evidence (keep concise per Transcript Minimization Rules; never paste raw tool JSON)
**Error Recovery:**
- If test fails, log current state but continue (don't restore)
- Next test adapts to actual current state, not expected state
- Demonstrates resilience of operations on varied file conditions
---
## Benefits of Additive Design
1. **Realistic Workflows**: Tests mirror actual development patterns
2. **Robust Operations**: Proves edits work on evolving files, not just pristine baselines
3. **Composability Validation**: Shows operations coordinate well together
4. **Simplified Infrastructure**: No restore scripts or snapshots needed
5. **Better Failure Analysis**: Failures don't cascade - each test adapts to current reality
6. **State Evolution Testing**: Validates SDK handles cumulative file modifications correctly
Improved ci prompt testing suite (#270) * CI: streamline Unity licensing (ULF/EBL); drop cache mounts & EBL-in-container; NL suite: clarify T-E/T-J, anchor positions, EOF brace targeting, SHA preconditions * CI: support both ULF + EBL; validate ULF before -manualLicenseFile; robust readiness wait; use game-ci @v2 actions * CI: activate EBL via container using UNITY_IMAGE; fix readiness regex grouping * CI: minimal patch — guard manualLicenseFile by ulf.ok, expand error patterns, keep return-license @v2 for linter * CI: harden ULF staging (printf+chmod); pass ULF_OK via env; use manual_args array for -manualLicenseFile * CI: assert EBL activation writes entitlement to host mount; fail fast if missing * CI: use heredoc in wait step to avoid nested-quote issues; remove redundant EBL artifact copy; drop job-level if and unused UNITY_VERSION * CI: harden wait step (container status check, broader ready patterns, longer timeout); make license return non-blocking * CI: wait step — confirm bridge readiness via status JSON (unity_port) + host socket probe * CI: YAML-safe readiness fallback (grep/sed unity_port + bash TCP probe); workflow_dispatch trigger + ASCII step names * CI: refine license error pattern to ignore benign LicensingClient channel startup; only match true activation/return failures * Improve Unity bridge wait logic in CI workflow - Increase timeout from 600s to 900s for Unity startup - Add 'bound' to readiness pattern to catch more bridge signals - Refine error detection to focus only on license failures - Remove non-license error patterns that could cause false failures - Improve error reporting with descriptive messages - Fix regex escaping for unity port parsing - Fix case sensitivity in sed commands * Add comprehensive Unity workflow improvements - Add project warm-up step to pre-import Library before bridge startup - Expand license mounts to capture full Unity config and local-share directories - Update bridge container to use expanded directory mounts instead of narrow license paths - Provide ULF licenses in both legacy and standard local-share paths - Improve EBL activation to capture complete Unity authentication context - Update verification logic to check full config directories for entitlements These changes eliminate cold import delays during bridge startup and provide Unity with all necessary authentication data, reducing edge cases and improving overall workflow reliability. * Refine Unity workflow licensing and permissions - Make EBL verification conditional on ULF presence to allow ULF-only runs - Remove read-only mounts from warm-up container for Unity user directories - Align secrets gate with actual licensing requirements (remove UNITY_SERIAL only) - Keep return-license action at v2 (latest available version) These changes prevent workflow failures when EBL has issues but ULF is valid, allow Unity to write preferences during warm-up, and ensure secrets detection matches the actual licensing logic used by the workflow steps. * fix workflow YAML parse * Normalize NL/T JUnit names and robust summary * Fix Python import syntax in workflow debug step * Improve prompt clarity for XML test fragment format - Add detailed XML format requirements with exact specifications - Emphasize NO prologue, epilogue, code fences, or extra characters - Add specific instructions for T-D and T-J tests to write fragments immediately - Include exact XML template and TESTID requirements - Should fix T-D and T-J test failures in CI by ensuring proper fragment format * Fix problematic regex substitution in test name canonicalization - Replace unsafe regex substitution that could create malformed names - New approach: preserve correctly formatted names, extract titles safely - Prevents edge cases where double processing could corrupt test names - Uses proper em dash (—) separator consistently - More robust handling of various input formats * CI: NL/T hardening — enforce filename-derived IDs, robust backfill, single-testcase guard; tighten prompt emissions; disallow Bash * fix: keep file ID when canonicalizing test names * CI: move Unity Pro license return to teardown after stopping Unity; keep placeholder at original site * CI: remove revert helper & baseline snapshot; stop creating scripts dir; prompt: standardize T-B validation to level=standard * CI: remove mini workflow and obsolete NL prompts; redact email in all Unity log dumps * NL/T prompt: enforce allowed ops, require per-test fragment emission (incl. failures), add T-F..T-J XML templates * NL suite: enforce strict NL-4 emission; remove brittle relabeling; keep canonicalization + backfill * NL/T: minimize transcript; tighten NL-4 console reads; add final errors scan in T-J * ci: add local validate-nlt-coverage helper * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: add staged report fragment promotion step (reports/_staging -> reports/) to support multi-edit reporting * CI: minor polish and guardrails; keep staged reports promotion and placeholder detection * read_console: default count=50; normalize types str->list; tolerate legacy payload shapes * read_console: harden response parsing for legacy shapes (data as list, tuple entries) * Docs: refresh CI workflow and prompts (remove mini suite refs; per-test emissions, staging, guard) * CI: move T coverage check after staged promotion; accept _staging as present; dedupe promotion step * CI: make T retry conditional on explicit coverage probe (not failure()); respect _staging in probe
2025-09-08 06:47:56 +08:00
This additive approach produces a more realistic and maintainable test suite that better represents actual SDK usage patterns.
---
BAN ON EXTRA TOOLS AND DIRS
- Do not use any tools outside `AllowedTools`. Do not create directories; assume `reports/` exists.
---