Commit Graph

5 Commits (5511a2b8ad16819089e4803f04f518fdb94e5add)

Author SHA1 Message Date
dsarno 711768d064
Async Test Infrastructure & Editor Readiness Status + new refresh_unity tool (#507)
* Add editor readiness v2, refresh tool, and preflight guards

* Detect external package changes and harden refresh retry

* feat: add TestRunnerNoThrottle and async test running with background stall prevention

- Add TestRunnerNoThrottle.cs: Sets editor to 'No Throttling' mode during test runs
  with SessionState persistence across domain reload
- Add run_tests_async and get_test_job tools for non-blocking test execution
- Add TestJobManager for async test job tracking with progress monitoring
- Add ForceSynchronousImport to all AssetDatabase.Refresh() calls to prevent stalls
- Mark DomainReloadResilienceTests as [Explicit] with documentation explaining
  the test infrastructure limitation (internal coroutine waits vs MCP socket polling)
- MCP workflow is unaffected - socket messages provide external stimulus that
  keeps Unity responsive even when backgrounded

* refactor: simplify and clean up code

- Remove unused Newtonsoft.Json.Linq import from TestJobManager
- Add throttling to SessionState persistence (once per second) to reduce overhead
- Critical job state changes (start/finish) still persist immediately
- Fix duplicate XML summary tag in DomainReloadResilienceTests
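
A minimal sketch of the throttled-persistence idea described above, assuming a hypothetical `ThrottledPersister` helper (the commit itself targets Unity's SessionState from C#; the class, its parameters, and the injected clock are illustrative only). Routine updates are written at most once per interval, while critical transitions pass `force=True` and are written immediately:

```python
import time


class ThrottledPersister:
    """Hypothetical helper: persist at most once per interval unless forced."""

    def __init__(self, save, interval=1.0, clock=time.monotonic):
        self._save = save          # callable that actually writes the state
        self._interval = interval  # minimum seconds between routine writes
        self._clock = clock        # injectable clock, handy for testing
        self._last = None          # time of the most recent write

    def persist(self, state, force=False):
        """Write the state; return True if a write happened."""
        now = self._clock()
        recently_saved = (
            self._last is not None and now - self._last < self._interval
        )
        if recently_saved and not force:
            return False  # skip: a write happened less than one interval ago
        self._save(state)
        self._last = now
        return True
```

Progress updates would call `persist(state)`, and job start/finish would call `persist(state, force=True)`.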

* docs: add async test tools to README, document domain reload limitation

- Add run_tests_async and get_test_job to main README tools list
- Document background stall limitation for domain reload tests in DEV readme

* ci: add separate job for domain reload tests

Run [Explicit] domain_reload tests in their own job using -testCategory

* ci: run domain reload tests in same job as regular tests

Combines into single job with two test steps to reuse cached Library

* fix: address coderabbit review issues

- Fix TOCTOU race in TestJobManager.StartJob (single lock scope for check-and-set)
- Store TestRunnerApi reference with HideAndDontSave to prevent GC/serialization issues
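
To illustrate the TOCTOU fix noted above, here is a small sketch in Python rather than the project's C#. `JobManager`, `start_job`, and `finish_job` are hypothetical names, not the actual TestJobManager API. The point is that the "is a job running?" check and the "mark a job as running" write share one lock scope, so two callers cannot both pass the check:

```python
import threading
import uuid


class JobManager:
    """Hypothetical job tracker that allows one running job at a time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current_job_id = None

    def start_job(self):
        # Check and set happen under the same lock acquisition. Releasing
        # the lock between the two steps would reopen the race window.
        with self._lock:
            if self._current_job_id is not None:
                raise RuntimeError("a test job is already running")
            self._current_job_id = uuid.uuid4().hex
            return self._current_job_id

    def finish_job(self):
        with self._lock:
            self._current_job_id = None
```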

* docs: update tool descriptions to prefer run_tests_async

- run_tests_async is now marked as preferred for long-running suites
- run_tests description notes it blocks and suggests async alternative

* docs: update README screenshot to v8.6 UI

* docs: add v8.6 UI screenshot

* Update README for MCP version and instructions for v8.7

* fix: handle preflight busy signals and derive job status from test results

- manage_asset, manage_gameobject, manage_scene now check preflight return
  value and propagate busy/retry signals to clients (fixes Sourcery #1)
- TestJobManager.FinalizeCurrentJobFromRunFinished now sets job status to
  Failed when resultPayload.Failed > 0, not always Succeeded (fixes Sourcery #2)
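
The status rule described above can be sketched as follows. This is an illustrative Python version, assuming a hypothetical `ResultSummary` shape and status strings; the real change lives in the C# TestJobManager:

```python
from dataclasses import dataclass


@dataclass
class ResultSummary:
    """Hypothetical summary of a finished test run."""
    total: int
    passed: int
    failed: int
    skipped: int = 0


def derive_job_status(summary):
    """Mark the job failed if any test failed, otherwise succeeded."""
    return "failed" if summary.failed > 0 else "succeeded"
```

Before the fix, the equivalent of this function returned the success status unconditionally once the run finished.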

* fix: increase HTTP server startup timeout for dev mode

When 'Force fresh server install' is enabled, uvx uses --no-cache --refresh
which rebuilds the package and takes significantly longer to start.

- Increase timeout from 10s to 45s when dev mode is enabled
- Add informative log message explaining the longer startup time
- Show actual timeout value in warning message

* fix: derive job status from test results in FinalizeFromTask fallback

Apply same logic as FinalizeCurrentJobFromRunFinished: check result.Failed > 0
to correctly mark jobs as Failed when tests fail, even in the fallback path
when RunFinished callback is not delivered.
2026-01-03 12:42:32 -08:00
dsarno 35a5c75596
Feature/run tests summary clean (#501)
* Optimize run_tests to return summary by default, reducing token usage by 98%

- Add includeFailedTests parameter: returns only failed/skipped test details
- Add includeDetails parameter: returns all test details (original behavior)
- Default behavior now returns summary only (~150 tokens vs ~13k tokens)
- Make results field optional in Python schema for backward compatibility

Token savings:
- Default: ~13k tokens saved (98.9% reduction)
- With failures: minimal tokens (only non-passing tests)
- Full details: same as before when explicitly requested

This prevents context bloat for typical test runs where you only need
pass/fail counts, while still allowing detailed debugging when needed.
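
A rough sketch of the three response modes described above. The function name, the Python parameter spellings, and the per-test dictionary shape (`"state"` with a `"Passed"` value) are assumptions for illustration, not the tool's actual schema:

```python
def shape_results(results, include_details=False, include_failed_tests=False):
    """Return a summary, optionally with per-test details attached."""
    passed = sum(1 for r in results if r["state"] == "Passed")
    failed = sum(1 for r in results if r["state"] == "Failed")
    summary = {
        "total": len(results),
        "passed": passed,
        "failed": failed,
        "skipped": len(results) - passed - failed,
    }
    response = {"summary": summary}
    if include_details:
        # Original behavior: every test result is returned.
        response["results"] = list(results)
    elif include_failed_tests:
        # Only the non-passing tests (failed or skipped) are returned.
        response["results"] = [r for r in results if r["state"] != "Passed"]
    # Default: summary only, so the "results" key is omitted entirely.
    return response
```

Omitting the `results` key in the default case is why the schema on the Python side needs that field to be optional.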

* Add warning when run_tests filters match no tests; fix test organization

TDD Feature:
- Add warning message when filter criteria match zero tests
- New RunTestsTests.cs validates message formatting logic
- Modified RunTests.cs to append "(No tests matched the specified filters)" when total=0
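
The message-formatting logic above might look roughly like this. Only the appended warning text comes from the commit; the base message wording and the function name are assumptions:

```python
NO_MATCH_WARNING = "(No tests matched the specified filters)"


def format_summary_message(mode, total, passed, failed, skipped):
    """Build a one-line summary and flag runs whose filters matched nothing."""
    # The base wording here is illustrative, not the tool's actual output.
    message = (
        f"{mode} tests completed: {passed}/{total} passed, "
        f"{failed} failed, {skipped} skipped"
    )
    if total == 0:
        message += " " + NO_MATCH_WARNING
    return message
```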

Test Organization Fixes:
- Move MCPToolParameterTests.cs from EditMode/ to EditMode/Tools/ (matches folder hierarchy)
- Fix inconsistent namespaces to MCPForUnityTests.Editor.{Subfolder}:
  - MCPToolParameterTests: Tests.EditMode → MCPForUnityTests.Editor.Tools
  - DomainReloadResilienceTests: Tests.EditMode.Tools → MCPForUnityTests.Editor.Tools
  - Matrix4x4ConverterTests: MCPForUnityTests.EditMode.Helpers → MCPForUnityTests.Editor.Helpers

* Refactor test result message formatting

* Simplify RunTests warning assertions

* Tests: de-flake cold-start EditMode runs

- Make ManageScriptableObjectTests setup yield-based with longer Unity-ready timeout

- Mark DomainReloadResilienceTests explicit to avoid triggering domain reload during Run All
2026-01-01 20:36:45 -08:00
Voon Foo 60a9f66949
Add test filtering to run_tests tool (#462)
2025-12-17 16:59:21 -04:00
Marcus Sanatan 2f50962b58
Harden PlayMode test runs (#396)
* Harden PlayMode test runs

- Guard against starting tests while already in Play Mode.
- Pre-save dirty scenes before PlayMode runs to avoid SaveModifiedSceneTask failures.
- Temporarily disable domain reload during PlayMode tests to keep the MCP bridge alive; restore settings afterward.
- Avoid runSynchronously because it can freeze Unity.

* Handle the fairly common case where we have an empty scene
2025-11-25 17:08:33 -04:00
Marcus Sanatan f2c57ca91e
Add testing and move menu items to resources (#316)
* deps: add tomli>=2.3.0 dependency to UnityMcpServer package

* feat: dynamically fetch package version from pyproject.toml for telemetry

* Add pydantic

* feat: add resource registry for MCP resource auto-discovery

* feat: add telemetry decorator for tracking MCP resource usage

* feat: add auto-discovery and registration system for MCP resources

* feat: add resource registration to MCP server initialization

* feat: add MCPResponse model class for standardized API responses

* refactor: replace Debug.Log calls with McpLog wrapper for consistent logging

* feat: add test discovery endpoints for Unity Test Framework integration

They are not connected yet; we are still working out how to do this neatly.

* Fix server setup

* refactor: reduce log verbosity by changing individual resource/tool registration logs to debug level

* chore: bump mcp[cli] dependency from 1.15.0 to 1.17.0

* refactor: remove Context parameter and add uri keyword argument in resource decorator

The Context parameter doesn't work on our version of FastMCP

* chore: upgrade Python base image to 3.13 and simplify Dockerfile setup

* fix: apply telemetry decorator before mcp.tool to ensure proper wrapping order

* fix: swap order of telemetry and resource decorators to properly wrap handlers
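
Why decorator order matters here can be shown with a self-contained sketch. `register` and `telemetry` below are hypothetical stand-ins for the server's registration and telemetry decorators, not the FastMCP API. A registering decorator stores whatever function it receives, so telemetry must be applied first (closest to the function) for the registered handler to be the instrumented one:

```python
import functools

registry = {}  # stand-in for the server's handler registry
calls = []     # stand-in for recorded telemetry events


def register(func):
    """Store the function exactly as received, then return it unchanged."""
    registry[func.__name__] = func
    return func


def telemetry(func):
    """Wrap the function so each call is recorded."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


@register
@telemetry  # applied first: the registry receives the wrapped handler
def ping():
    return "pong"


@telemetry
@register  # applied first: the registry receives the bare handler
def echo():
    return "echo"
```

Calling `registry["ping"]()` records a telemetry event, while calling `registry["echo"]()` bypasses the wrapper entirely.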

* fix: update log prefixes for consistency in logging methods

* Fix compile errors

* feat: extend command registry to support both tools and resources

* Run get tests as a coroutine because it doesn't return results immediately

This works, but it floods the logs; there may be a better or simpler way.

* refactor: migrate from coroutines to async/await for test retrieval and command execution

* feat: add optional error field to MCPResponse model

* Increased timeout because loading tests can take some time

* Make message optional so error responses that only have success and error don't cause Pydantic errors

* Set max_retries to 5

This connection module needs a review. The retries should use exponential backoff, and we could capture in a structured way why it fails so often.
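
The exponential backoff mentioned above is a suggestion in the commit, not something it implements. One possible shape, assuming a hypothetical `retry_with_backoff` helper and illustrative delay values:

```python
import time


def retry_with_backoff(operation, max_retries=5, base_delay=0.1,
                       max_delay=2.0, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with doubling delays."""
    attempt = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # The delay doubles each attempt and is capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))
            attempt += 1
```

The `sleep` parameter is injectable so the delays can be observed in tests without actually waiting.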

* Use pydantic model to structure the error output

* fix: initialize data field in GetTestsResponse to avoid potential errors

* Don't return path parameter

* feat: add Unity test runner execution with structured results and Python bindings

* refactor: simplify GetTests by removing mode filtering and related parsing logic

* refactor: move test runner functionality into dedicated service interface

* feat: add resource retrieval telemetry tracking with new record type and helper function

* fix: convert tool functions to async and await ctx.info calls

* refactor: reorganize menu item functionality into separate execute and get commands

An MCP resource for retrieval, and a simple command to execute. Because it's a resource, it's easier for the user to see what's in the menu items

* refactor: rename manage_menu_item to execute_menu_item and update tool examples to use async/await

We'll eventually put a section for resources

* Revert "fix: convert tool functions to async and await ctx.info calls"

This reverts commit 012ea6b7439bd1f2593864d98d03d9d95d7bdd03.

* fix: replace tomllib with tomli for Python 3.10 compatibility in telemetry module

* Remove confusing comment

* refactor: improve error handling and simplify test retrieval logic in GetTests commands

* No cache by default

* docs: remove redundant comment for HandleCommand method in ExecuteMenuItem
2025-10-13 11:16:43 -04:00