MCP For Debug Instruments
First steps experimenting with connecting AI to tools
Standard disclaimer I’ll always add to posts that discuss embedded systems: opinions in this article are my own and don’t represent the opinions of my employer.
The Virtuous Cycle of Embedded Code Development
AI code assistance is just beginning to transform embedded code development. The slower adoption pace compared to other software development fields highlights a key difference between embedded systems development in constrained environments and development elsewhere in the industry. To put it possibly too simply: embedded systems devs are too close to the metal for AI.
Unlike mainstream software development that sits comfortably atop layers of abstraction that hide processor and physical-layer complexities, embedded systems engineers must intimately understand their hardware's limitations. These systems directly interface with reality, converting analog signals with ADCs, controlling motors, managing RF communications, and handling countless other physical interactions, all while juggling tight memory budgets and CPU constraints. Moreover, these codebases often need to comply with rigorous communications protocols, safety certifications, and industry standards that leave little room for error.
Embedded systems developers navigate these constraints by employing a superset of the standard debug practices all developers use. While they hit breakpoints, monitor build artifacts, and watch terminal output, those working with real hardware inevitably also use specialized instruments like logic analyzers, spectrum analyzers, and oscilloscopes. Development becomes a virtuous cycle: write code, build and flash it, observe the system's behavior, and refine.
Current AI assistance can support most of these steps, but significant gaps remain in the 'observe' step when instrumentation is required.
The closer to the metal developers get, the more important instrumentation becomes. Consider someone coding a wireless radio stack that must comply with a standard like Bluetooth. In these specialized domains, the relationship isn't "human in the loop" but rather "machine in the loop," where developers strategically incorporate AI code assistance. Attempting to code physical- and link-layer functionality for Bluetooth compliance without a spectrum analyzer will burn mountains of tokens without the code ever coming close to standards compliance.
So how might AI code assist gain access to bench instruments and complete that virtuous cycle? I think the most likely path is for LLMs to interface with instruments through the Model Context Protocol (MCP).
MCP as LLM’s eyes and ears
Anthropic’s release of the MCP spec in November 2024 (https://www.anthropic.com/news/model-context-protocol) proved to be a watershed moment in AI for code development. It’s been less than a year since Anthropic released this standard to the world, and yet MCP already feels like an integral component of AI-based coding.
Sidenote: everything with AI happens at a breakneck pace, even standardization. I’m used to standardization taking years, with brilliant people in meetings for months arguing about the placement of commas in standards documents. Not so with AI.
MCP follows a client/server architecture where the server provides primitives that clients can utilize. These primitives include tools, resources, and prompts as defined by the standard. Through standardized initialization and capabilities expression, MCP servers enable LLM-powered agents to clearly understand the available services and how to interact with them.
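To make that concrete, here is a rough sketch of the core JSON-RPC messages a client exchanges with a server, written as Python dicts. The field names follow the MCP spec; the client name, request IDs, and tool name below are placeholders of my own:

# Sketch of the MCP handshake and tool discovery as JSON-RPC messages.
# Field names follow the MCP spec revision 2024-11-05; the client name,
# request IDs, and tool name are placeholders.
initialize_request = {
    "jsonrpc": "2.0", "id": 1, "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}
# After the server replies with its own capabilities, the client can discover
# and invoke the primitives it exposes:
list_tools_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
call_tool_request = {
    "jsonrpc": "2.0", "id": 3, "method": "tools/call",
    "params": {"name": "get_measurement", "arguments": {}},
}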
While most MCP servers I've seen focus on providing code, documentation, and datasets as resources, the 'tools' primitive type in the MCP standard offers a pathway to integrate bench instruments into embedded code development. Using this standard, an agent should be able to (see the sketch after this list):
Learn the measurement capabilities of a tool and its configuration options
Request measurements on demand
Configure the tool to monitor for asynchronous events and subscribe to updates
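As an illustration, a spectrum-analyzer MCP server might advertise a measurement tool along these lines. The tool name, parameters, and ranges are entirely hypothetical; only the name/description/inputSchema structure comes from the MCP spec:

# Hypothetical tool description for an instrument-backed MCP server.
# Everything here is invented for illustration; only the overall
# name/description/inputSchema layout is dictated by the MCP spec.
capture_spectrum_tool = {
    "name": "capture_spectrum",
    "description": "Run a sweep and return peak frequency and power",
    "inputSchema": {
        "type": "object",
        "properties": {
            "center_hz": {"type": "number", "description": "Center frequency in Hz"},
            "span_hz": {"type": "number", "description": "Sweep span in Hz"},
            "rbw_hz": {"type": "number", "description": "Resolution bandwidth in Hz"},
        },
        "required": ["center_hz", "span_hz"],
    },
}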
Physical limitations still exist, of course. Logic analyzers need physical connections to port pins, spectrum analyzers must be wired into RF paths, and so on. But before exploring these constraints further, let me share what I built.
Mocked-up MCP instrument build
I wanted a simple proof of concept MCP server that functions as a bench instrument. Emphasis on ‘simple’: I built this mainly to understand how MCP servers operate, and how LLM-powered clients can interface with those servers.
The code is available at: https://github.com/parkerdorris/ArduinoButtonMonitorMCP/blob/master/button_monitor.ino
I’ve included my initial PRD for the project for reference, though the project evolved as Cursor built and tested each component.
Pin monitor
I decided to use an Arduino board as a single GPIO pin monitor, with an MCP server wrapped around it. I chose Arduino because I wanted to ensure Cursor could generate embedded code in one attempt. For constrained devices, Arduino provides a clean, simplified hardware abstraction that makes it extremely easy for Cursor to generate functional embedded code.
The Arduino code exposes a few commands over a non-MCP compliant serial interface. Commands include:
STATE, which returns the 0/1 digital state of the pin
SUBSCRIBE/UNSUBSCRIBE, which enables/disables pin monitoring events indicating "RISING" and "FALLING" edges of the pin
Note that there's nothing MCP-specific about any of this. This is just a dead-simple Arduino sketch.
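For a sense of the host side, driving that command set from Python might look like the sketch below. It assumes pyserial, newline-terminated commands, and a particular port, baud rate, and response format; none of those details are taken from the repo:

# Host-side sketch of exercising the Arduino's serial commands with pyserial.
# Port, baud rate, framing, and exact response strings are assumptions.
import serial

ser = serial.Serial("/dev/ttyACM0", 115200, timeout=1)

ser.write(b"STATE\n")                      # one-shot read of the pin
print(ser.readline().decode().strip())     # expected: "0" or "1"

ser.write(b"SUBSCRIBE\n")                  # enable edge-event streaming
try:
    while True:
        line = ser.readline().decode().strip()
        if line:                           # e.g. "RISING" or "FALLING"
            print("edge event:", line)
finally:
    ser.write(b"UNSUBSCRIBE\n")
    ser.close()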
MCP Server
The MCP server is a Python script that manages the interface with the Arduino board. It contains a class that handles the Arduino connection and exposes a WebSocket interface compliant with the MCP standard.
The MCP server implements these functions:
Connect to and disconnect from the instrument
Retrieve the current button state
Subscribe to button edge events
After a client subscribes to edge event detection, the server continuously streams edge events to that client.
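A heavily simplified sketch of that structure is below. It uses the websockets package, an invented port number, and a trimmed-down message format (a real MCP tools/call response wraps results in a content array); the actual script in the repo also manages the serial link to the Arduino:

# Simplified sketch of the server's tool dispatch over WebSocket.
# Port number, message shapes, and state handling are placeholders.
import asyncio, json
import websockets

BUTTON_STATE = 0  # stand-in for a value read from the Arduino over serial

async def handle(ws):
    async for raw in ws:
        msg = json.loads(raw)
        if msg.get("method") == "tools/call":
            name = msg["params"]["name"]
            if name == "get_button_state":
                result = {"state": BUTTON_STATE}
            elif name == "subscribe_button_edges":
                result = {"subscribed": True}  # real server then streams edge events
            else:
                result = {"error": f"unknown tool: {name}"}
            await ws.send(json.dumps({"jsonrpc": "2.0", "id": msg["id"], "result": result}))

async def main():
    async with websockets.serve(handle, "localhost", 8765):
        await asyncio.Future()  # serve until cancelled

asyncio.run(main())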
MCP Client
The client script provides a system prompt to ChatGPT that instructs it to connect to the MCP server and narrate events. The terminal output is divided into two parts: the raw I/O communication between client and server, and ChatGPT's narration of those events.
Upon startup, the client establishes a WebSocket connection with the MCP server and queries the available tool capabilities.
Snippet:
main - INFO - Available tools: {'tools': [{'name': 'get_button_state', 'description': 'Get the current state of the button (0 or 1)', 'inputSchema': {'type': 'object', 'properties': {}, 'required': []}}, {'name': 'subscribe_button_edges', 'description': 'Subscribe to button edge events (RISING/FALLING)', 'inputSchema': {'type': 'object', 'properties': {}, 'required': []}}, {'name': 'connect_arduino', 'description': …
The code then checks the button state and subscribes to events.
When the server sends an edge-detection event, the client echoes the WebSocket message. The LLM then provides a brief 'narration' that assesses the button state within the runtime context.
📡 [MCP] Received button event: RISING at 13:03:51.522
State: 1 | Description: Button rising
2025-08-16 13:03:52,175 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
🤖 At 13:03:51, the button was pressed, changing its state to active.
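Condensed down, the client's event loop amounts to something like the sketch below. The model name, event shape, and prompt wording are assumptions of mine; only the overall receive-then-narrate flow mirrors the actual client:

# Condensed sketch of the client's receive-then-narrate loop.
# Model name, event fields, and prompt text are assumptions.
import asyncio, json
import websockets
from openai import OpenAI

llm = OpenAI()  # reads OPENAI_API_KEY from the environment

async def narrate(uri="ws://localhost:8765"):
    async with websockets.connect(uri) as ws:
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") != "button_event":  # hypothetical event shape
                continue
            reply = llm.chat.completions.create(
                model="gpt-4o-mini",
                messages=[
                    {"role": "system", "content": "Narrate button events in one short sentence."},
                    {"role": "user", "content": json.dumps(event)},
                ],
            )
            print("🤖", reply.choices[0].message.content)

asyncio.run(narrate())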
Does this PoC actually 'P' the 'C'?
To answer my own question: not really. While wrapping an MCP server around an instrument works, and I can set an LLM to studiously monitor and narrate the life of a port pin, I'm still a long way from understanding exactly how that virtuous development cycle can be completed with MCP-enabled bench tools.
For my next step, I plan to create a more realistic developer workspace. I'll connect this pin monitor to a second Arduino functioning as a device under test (DUT). I want to explore whether Cursor/VS Code can interface with the pin monitor as an MCP client (this seems feasible, though I need to conduct more research).
With this mock environment in place, I'll develop an intentionally buggy Arduino sketch for the DUT and leverage both Cursor and the MCP server to debug and fix the issues. Following that, I'll explore creating more practical MCP servers for actual instruments, possibly developing a proof-of-concept that wraps the PyVISA library. The experiment should be illuminating—potentially exciting, potentially frustrating, likely both!
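For reference, the instrument-facing half of such a PyVISA wrapper can be quite small; a minimal sketch is below. The resource address is made up, and SCPI commands beyond the standard *IDN? query vary by instrument:

# Minimal PyVISA sketch an MCP tool handler could call into.
# The resource string is a placeholder; instrument-specific SCPI varies.
import pyvisa

rm = pyvisa.ResourceManager()
inst = rm.open_resource("TCPIP::192.168.1.10::INSTR")  # hypothetical bench instrument
print(inst.query("*IDN?"))  # standard SCPI identification query
# An MCP tool such as a hypothetical 'measure_peak' would wrap
# instrument-specific queries here and return the parsed result.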
Given that embedded systems development is inherently resource-constrained and tethered to real-world physical interactions, I believe this domain will remain human-in-the-loop longer than most software development fields. Through these experiments, I'm attempting to envision what human-AI collaborative embedded development might look like in 5-10 years.
Human/AI collaboration in the virtuous loop will undoubtedly remain essential for embedded systems, even as an increasing portion of the human contribution may shift toward more menial tasks like AI requests to "connect your scope to GPIO 3" or "don't forget to ground your probes."