MCP Node Knowledge Server - Tech Design

Overview

This document outlines the technical design for an MCP (Model Context Protocol) server that gives the workflow agent access to node specification knowledge. The server lets LLMs query detailed information about the available workflow nodes, including their parameters, input/output schemas, examples, and usage patterns.

Problem Statement

The workflow agent needs detailed knowledge of the available node types to generate accurate workflows from natural language descriptions. Currently, this information is stored in the apps/backend/shared/node_specs/ directory but is not easily accessible to the LLM during workflow generation.

Architecture Overview

High-Level Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  Workflow Agent │───▶│ API Gateway     │───▶│ Node Knowledge  │
│   (LangGraph)   │    │  MCP Server     │    │   Repository    │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                              │                        │
                              ▼                        ▼
                    ┌────────────────────┐     ┌─────────────┐
                    │ MCP Tools          │     │ Node Specs  │
                    │ - get_node_types   │     │ Registry    │
                    │ - get_node_details │     │             │
                    │ - search_nodes     │     └─────────────┘
                    └────────────────────┘

Components

MCP Server - Extends existing API Gateway MCP implementation
Node Knowledge Service - Business logic for node spec queries
Node Registry Integration - Interface with existing node spec registry
MCP Tools - Specific tools for node knowledge access

Detailed Design

1. MCP Tools Specification

Tool: `get_node_types`

Purpose: Get all available node types and their subtypes

{
  "name": "get_node_types",
  "description": "Get all available workflow node types and their subtypes",
  "parameters": {
    "type": "object",
    "properties": {
      "type_filter": {
        "type": "string",
        "enum": ["ACTION_NODE", "TRIGGER_NODE", "AI_AGENT_NODE", "FLOW_NODE", "TOOL_NODE", "MEMORY_NODE", "HUMAN_LOOP_NODE", "EXTERNAL_ACTION_NODE"],
        "description": "Filter by node type (optional)"
      }
    }
  }
}

Tool: `get_node_details`

Purpose: Get comprehensive details for specific nodes

{
  "name": "get_node_details",
  "description": "Get detailed specifications for one or more workflow nodes",
  "parameters": {
    "type": "object",
    "properties": {
      "nodes": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "node_type": {"type": "string"},
            "subtype": {"type": "string"}
          },
          "required": ["node_type", "subtype"]
        },
        "description": "List of nodes to get details for"
      },
      "include_examples": {
        "type": "boolean",
        "default": true,
        "description": "Include usage examples in response"
      },
      "include_schemas": {
        "type": "boolean",
        "default": true,
        "description": "Include input/output schemas"
      }
    },
    "required": ["nodes"]
  }
}

Tool: `search_nodes`

Purpose: Search nodes by capabilities or description

{
  "name": "search_nodes",
  "description": "Search workflow nodes by functionality, description, or capabilities",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query (e.g., 'HTTP request', 'send email', 'process data')"
      },
      "max_results": {
        "type": "integer",
        "default": 10,
        "description": "Maximum number of results to return"
      },
      "include_details": {
        "type": "boolean",
        "default": false,
        "description": "Include full node specifications in results"
      }
    },
    "required": ["query"]
  }
}
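
The schema above declares defaults for `max_results` and `include_details`, but a handler that reads raw parameters still has to apply and bound them itself; a negative `max_results`, for instance, would silently drop results when used as a slice bound. The following is a minimal sketch of that normalization, not part of the existing code. The function name `normalize_search_params` and the cap `MAX_RESULTS_CAP` are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple

# Hypothetical upper bound; the design does not specify one.
MAX_RESULTS_CAP = 50


def normalize_search_params(params: Dict[str, Any]) -> Tuple[str, int, bool]:
    """Apply schema defaults and bounds to raw search_nodes parameters."""
    query = str(params.get("query", "")).strip()
    if not query:
        raise ValueError("'query' is required and must be non-empty")

    max_results = params.get("max_results", 10)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_results, bool) or not isinstance(max_results, int):
        raise ValueError("'max_results' must be an integer")
    # Clamp into [1, MAX_RESULTS_CAP] so slicing never sees a negative bound.
    max_results = max(1, min(max_results, MAX_RESULTS_CAP))

    include_details = bool(params.get("include_details", False))
    return query, max_results, include_details
```

Clamping rather than rejecting out-of-range values is a design choice here; a stricter server could raise an error instead.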

2. Service Implementation

Node Knowledge Service

# apps/backend/api-gateway/app/services/node_knowledge_service.py

from typing import List, Dict, Any, Optional
from apps.backend.shared.node_specs.registry import node_spec_registry
from apps.backend.shared.node_specs.base import NodeSpec

class NodeKnowledgeService:
    def __init__(self):
        self.registry = node_spec_registry

    def get_node_types(self, type_filter: Optional[str] = None) -> Dict[str, List[str]]:
        """Get all node types and subtypes, optionally filtered by type."""
        all_types = self.registry.get_node_types()

        if type_filter:
            return {k: v for k, v in all_types.items() if k == type_filter}

        return all_types

    def get_node_details(self, nodes: List[Dict[str, str]],
                        include_examples: bool = True,
                        include_schemas: bool = True) -> List[Dict[str, Any]]:
        """Get detailed specifications for requested nodes."""
        results = []

        for node_req in nodes:
            spec = self.registry.get_spec(node_req["node_type"], node_req["subtype"])
            if spec:
                result = self._serialize_node_spec(spec, include_examples, include_schemas)
                results.append(result)
            else:
                results.append({
                    "node_type": node_req["node_type"],
                    "subtype": node_req["subtype"],
                    "error": "Node specification not found"
                })

        return results

    def search_nodes(self, query: str, max_results: int = 10,
                    include_details: bool = False) -> List[Dict[str, Any]]:
        """Search nodes by description and capabilities."""
        all_specs = self.registry.list_all_specs()
        query_lower = query.lower()

        # Simple substring search: the whole query string must appear in a field to score
        matches = []
        for spec in all_specs:
            score = 0

            # Search in description
            if query_lower in spec.description.lower():
                score += 10

            # Search in parameter names and descriptions
            for param in spec.parameters:
                if query_lower in param.name.lower():
                    score += 5
                if query_lower in param.description.lower():
                    score += 3

            # Search in port names and descriptions
            for port in spec.input_ports + spec.output_ports:
                if query_lower in port.name.lower():
                    score += 3
                if query_lower in port.description.lower():
                    score += 2

            if score > 0:
                matches.append((spec, score))

        # Sort by relevance score and limit results
        matches.sort(key=lambda x: x[1], reverse=True)
        matches = matches[:max_results]

        # Format results
        results = []
        for spec, score in matches:
            if include_details:
                result = self._serialize_node_spec(spec, True, True)
                result["relevance_score"] = score
            else:
                result = {
                    "node_type": spec.node_type,
                    "subtype": spec.subtype,
                    "description": spec.description,
                    "relevance_score": score
                }
            results.append(result)

        return results

    def _serialize_node_spec(self, spec: NodeSpec, include_examples: bool,
                           include_schemas: bool) -> Dict[str, Any]:
        """Convert NodeSpec to serializable dictionary."""
        result = {
            "node_type": spec.node_type,
            "subtype": spec.subtype,
            "version": spec.version,
            "description": spec.description,
            "parameters": [
                {
                    "name": p.name,
                    "type": p.type.value,
                    "required": p.required,
                    "default_value": p.default_value,
                    "description": p.description,
                    "enum_values": p.enum_values,
                    "validation_pattern": p.validation_pattern
                }
                for p in spec.parameters
            ],
            "input_ports": [
                {
                    "name": port.name,
                    "type": port.type,
                    "required": port.required,
                    "description": port.description,
                    "max_connections": port.max_connections,
                    "data_format": {
                        "mime_type": port.data_format.mime_type,
                        "schema": port.data_format.schema,
                        "examples": port.data_format.examples
                    } if port.data_format and include_schemas else None,
                    "validation_schema": port.validation_schema if include_schemas else None
                }
                for port in spec.input_ports
            ],
            "output_ports": [
                {
                    "name": port.name,
                    "type": port.type,
                    "description": port.description,
                    "max_connections": port.max_connections,
                    "data_format": {
                        "mime_type": port.data_format.mime_type,
                        "schema": port.data_format.schema,
                        "examples": port.data_format.examples
                    } if port.data_format and include_schemas else None,
                    "validation_schema": port.validation_schema if include_schemas else None
                }
                for port in spec.output_ports
            ]
        }

        if include_examples and spec.examples:
            result["examples"] = spec.examples

        return result
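
Because `search_nodes` above matches the whole query as a single substring, a multi-word query such as "send email notification" only scores when that exact phrase appears in a field. One possible refinement is to score each query token separately. The sketch below illustrates the idea on descriptions only; `SimpleSpec` is a hypothetical stand-in for `NodeSpec`, the sample descriptions are invented, and the per-token weight of 10 simply mirrors the description weight used above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SimpleSpec:
    """Hypothetical stand-in for NodeSpec with only the fields used here."""
    node_type: str
    subtype: str
    description: str


def score_spec(spec: SimpleSpec, query: str) -> int:
    """Add 10 points for every query token found in the description."""
    text = spec.description.lower()
    tokens = query.lower().split()
    return sum(10 for token in tokens if token in text)


def rank_specs(specs: List[SimpleSpec], query: str,
               max_results: int = 10) -> List[Tuple[SimpleSpec, int]]:
    """Return (spec, score) pairs with a positive score, best first."""
    scored = [(spec, score_spec(spec, query)) for spec in specs]
    matches = [pair for pair in scored if pair[1] > 0]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return matches[:max_results]


# Invented descriptions, for illustration only.
specs = [
    SimpleSpec("EXTERNAL_ACTION_NODE", "EXTERNAL_EMAIL",
               "Send an email through an external provider"),
    SimpleSpec("ACTION_NODE", "HTTP_REQUEST", "Make an HTTP request to a URL"),
]
ranked = rank_specs(specs, "send email notification")
```

With these sample specs, only the email node matches, on the tokens "send" and "email". The same per-token loop could be extended to parameter and port fields with the weights from the original method.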

3. MCP Integration

Enhanced MCP Service

# apps/backend/api-gateway/app/api/mcp/tools.py (enhancement)

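import time
from datetime import datetime, timezone
from typing import Any, Dict

from app.services.node_knowledge_service import NodeKnowledgeService
# MCPTool, MCPToolsResponse, and MCPInvokeResponse are assumed to be defined
# in (or imported by) the existing MCP module.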

class EnhancedMCPService:
    def __init__(self):
        self.node_knowledge = NodeKnowledgeService()

    def get_available_tools(self) -> MCPToolsResponse:
        """Enhanced to include node knowledge tools."""
        tools = [
            # ... existing tools ...
            MCPTool(
                name="get_node_types",
                description="Get all available workflow node types and their subtypes",
                parameters={
                    "type": "object",
                    "properties": {
                        "type_filter": {
                            "type": "string",
                            "enum": ["ACTION_NODE", "TRIGGER_NODE", "AI_AGENT_NODE",
                                   "FLOW_NODE", "TOOL_NODE", "MEMORY_NODE",
                                   "HUMAN_LOOP_NODE", "EXTERNAL_ACTION_NODE"],
                            "description": "Filter by node type (optional)"
                        }
                    }
                },
                category="workflow",
                tags=["nodes", "workflow", "specifications"]
            ),
            MCPTool(
                name="get_node_details",
                description="Get detailed specifications for workflow nodes including parameters, ports, and examples",
                parameters={
                    "type": "object",
                    "properties": {
                        "nodes": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "node_type": {"type": "string"},
                                    "subtype": {"type": "string"}
                                },
                                "required": ["node_type", "subtype"]
                            },
                            "description": "List of nodes to get details for"
                        },
                        "include_examples": {
                            "type": "boolean",
                            "default": True,
                            "description": "Include usage examples"
                        },
                        "include_schemas": {
                            "type": "boolean",
                            "default": True,
                            "description": "Include input/output schemas"
                        }
                    },
                    "required": ["nodes"]
                },
                category="workflow",
                tags=["nodes", "specifications", "details"]
            ),
            MCPTool(
                name="search_nodes",
                description="Search workflow nodes by functionality, description, or capabilities",
                parameters={
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query describing desired functionality"
                        },
                        "max_results": {
                            "type": "integer",
                            "default": 10,
                            "description": "Maximum number of results"
                        },
                        "include_details": {
                            "type": "boolean",
                            "default": False,
                            "description": "Include full specifications"
                        }
                    },
                    "required": ["query"]
                },
                category="workflow",
                tags=["search", "nodes", "discovery"]
            )
        ]
        # ... rest of method

    async def invoke_tool(self, tool_name: str, params: Dict[str, Any]) -> MCPInvokeResponse:
        """Enhanced to handle node knowledge tools."""
        start_time = time.time()

        if tool_name == "get_node_types":
            type_filter = params.get("type_filter")
            result = self.node_knowledge.get_node_types(type_filter)

        elif tool_name == "get_node_details":
            nodes = params.get("nodes", [])
            include_examples = params.get("include_examples", True)
            include_schemas = params.get("include_schemas", True)
            result = self.node_knowledge.get_node_details(
                nodes, include_examples, include_schemas
            )

        elif tool_name == "search_nodes":
            query = params.get("query", "")
            max_results = params.get("max_results", 10)
            include_details = params.get("include_details", False)
            result = self.node_knowledge.search_nodes(
                query, max_results, include_details
            )

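        else:
            # ... existing tool handling ...
            # Fallback so this branch is valid Python; assumed behavior once
            # no existing tool matches.
            raise ValueError(f"Unknown tool: {tool_name}")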

        return MCPInvokeResponse(
            success=True,
            tool_name=tool_name,
            result=result,
            execution_time_ms=round((time.time() - start_time) * 1000, 2),
            timestamp=datetime.now(timezone.utc)
        )
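
The `invoke_tool` method above always reports `success=True`, so an exception raised inside a service call would propagate to the caller instead of becoming a structured error. A minimal sketch of one alternative follows: a dispatch table with error capture. `InvokeResult` is a hypothetical stand-in for `MCPInvokeResponse` (whose full field list is not shown in this document), and the `error` field is an assumption.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class InvokeResult:
    """Hypothetical stand-in for MCPInvokeResponse."""
    success: bool
    tool_name: str
    result: Any = None
    error: Optional[str] = None
    execution_time_ms: float = 0.0


Handler = Callable[[Dict[str, Any]], Any]


def invoke(handlers: Dict[str, Handler], tool_name: str,
           params: Dict[str, Any]) -> InvokeResult:
    """Look up the handler by name and convert failures into error results."""
    start = time.perf_counter()
    result: Any = None
    error: Optional[str] = None
    try:
        handler = handlers.get(tool_name)
        if handler is None:
            raise ValueError(f"Unknown tool: {tool_name}")
        result = handler(params)
    except Exception as exc:  # report any handler failure to the caller
        error = str(exc)
    elapsed_ms = round((time.perf_counter() - start) * 1000, 2)
    return InvokeResult(error is None, tool_name, result, error, elapsed_ms)
```

A dispatch table also keeps the tool list in one place, which may be easier to extend than a growing `if`/`elif` chain.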

Implementation Plan

Phase 1: Core Infrastructure

Create NodeKnowledgeService in apps/backend/api-gateway/app/services/
Update MCP tools in apps/backend/api-gateway/app/api/mcp/tools.py
Add dependency injection for node registry access
Create unit tests for the service
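
The dependency injection and unit test items above can be sketched together. This is an illustration under assumptions, not the planned implementation: `InjectableNodeKnowledgeService` is a hypothetical variant of `NodeKnowledgeService` that receives its registry as a constructor argument, which lets a test pass in a `unittest.mock.Mock` instead of the real registry.

```python
from typing import Any, Dict, List, Optional
from unittest.mock import Mock


class InjectableNodeKnowledgeService:
    """Hypothetical variant of NodeKnowledgeService with an injected registry."""

    def __init__(self, registry: Any) -> None:
        self.registry = registry

    def get_node_types(self, type_filter: Optional[str] = None) -> Dict[str, List[str]]:
        """Same filtering logic as the service method shown earlier."""
        all_types = self.registry.get_node_types()
        if type_filter:
            return {k: v for k, v in all_types.items() if k == type_filter}
        return all_types


def test_get_node_types_filters_by_type() -> None:
    # The mock replaces the real registry, so no node specs need to load.
    registry = Mock()
    registry.get_node_types.return_value = {
        "FLOW_NODE": ["IF", "LOOP"],
        "ACTION_NODE": ["HTTP_REQUEST"],
    }
    service = InjectableNodeKnowledgeService(registry)

    assert service.get_node_types("FLOW_NODE") == {"FLOW_NODE": ["IF", "LOOP"]}
    assert service.get_node_types() == registry.get_node_types.return_value
    assert service.get_node_types("UNKNOWN") == {}
```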

Phase 2: Tool Implementation

Implement get_node_types tool with optional node type filtering (type_filter)
Implement get_node_details tool with comprehensive spec serialization
Implement search_nodes tool with relevance scoring
Add integration tests for MCP endpoints

Phase 3: Optimization & Enhancement

Add caching for frequently accessed node specifications
Implement semantic search using embeddings for better node discovery
Add node compatibility checking tools
Performance optimization for large node registries
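
As one possible shape for the caching item above, the standard library's `functools.lru_cache` can memoize spec lookups by `(node_type, subtype)`. This is a sketch under assumptions: `_FAKE_REGISTRY` is a hypothetical in-memory stand-in for the node spec registry, and the cache size of 256 is arbitrary.

```python
from functools import lru_cache
from typing import Any, Dict, Optional, Tuple

# Hypothetical stand-in for the node spec registry.
_FAKE_REGISTRY: Dict[Tuple[str, str], Dict[str, Any]] = {
    ("ACTION_NODE", "HTTP_REQUEST"): {"description": "placeholder description"},
}


@lru_cache(maxsize=256)
def cached_spec(node_type: str, subtype: str) -> Optional[Dict[str, Any]]:
    """Look up a spec once, then serve repeat requests from the cache.

    The cached dict is shared between callers, so it should be treated
    as read-only.
    """
    return _FAKE_REGISTRY.get((node_type, subtype))


cached_spec("ACTION_NODE", "HTTP_REQUEST")  # miss: reads the registry
cached_spec("ACTION_NODE", "HTTP_REQUEST")  # hit: served from the cache
```

If the registry can be reloaded at runtime, `cached_spec.cache_clear()` would need to be called on reload so stale specs are not served.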

Usage Examples

Example 1: Get all available node types

# MCP Tool call
{
  "tool_name": "get_node_types",
  "params": {}
}

# Response
{
  "ACTION_NODE": ["RUN_CODE", "HTTP_REQUEST", "PARSE_IMAGE", "WEB_SEARCH", "DATABASE_OPERATION", "FILE_OPERATION", "DATA_TRANSFORMATION"],
  "TRIGGER_NODE": ["TRIGGER_MANUAL", "TRIGGER_CRON", "TRIGGER_WEBHOOK", "TRIGGER_CHAT", "TRIGGER_EMAIL", "TRIGGER_FORM", "TRIGGER_CALENDAR"],
  "AI_AGENT_NODE": ["GEMINI_NODE", "OPENAI_NODE", "CLAUDE_NODE"],
  "FLOW_NODE": ["IF", "FILTER", "LOOP", "MERGE", "SWITCH", "WAIT"],
  "TOOL_NODE": ["TOOL_GOOGLE_CALENDAR_MCP", "CALENDAR", "EMAIL", "HTTP"],
  "MEMORY_NODE": ["MEMORY_VECTOR_STORE", "MEMORY_SIMPLE", "MEMORY_DOCUMENT", "MEMORY_BUFFER", "MEMORY_KNOWLEDGE", "MEMORY_EMBEDDING"],
  "HUMAN_LOOP_NODE": ["HUMAN_GMAIL", "HUMAN_SLACK", "HUMAN_DISCORD", "HUMAN_TELEGRAM", "HUMAN_APP"],
  "EXTERNAL_ACTION_NODE": ["EXTERNAL_GITHUB", "EXTERNAL_GOOGLE_CALENDAR", "EXTERNAL_TRELLO", "EXTERNAL_EMAIL", "EXTERNAL_SLACK", "EXTERNAL_API_CALL", "EXTERNAL_WEBHOOK", "EXTERNAL_NOTIFICATION"]
}

Example 2: Get details for specific nodes

# MCP Tool call
{
  "tool_name": "get_node_details",
  "params": {
    "nodes": [
      {"node_type": "ACTION_NODE", "subtype": "HTTP_REQUEST"},
      {"node_type": "AI_AGENT_NODE", "subtype": "OPENAI_NODE"}
    ],
    "include_examples": true
  }
}

# Response includes full specifications with parameters, ports, examples

Example 3: Search for nodes by functionality

# MCP Tool call
{
  "tool_name": "search_nodes",
  "params": {
    "query": "send email notification",
    "max_results": 5
  }
}

# Response returns relevant nodes ranked by relevance score

Security & Performance Considerations

Security

Scope-based access control using existing MCP authentication
No sensitive data (such as credentials or secrets) in node specifications, since they are serialized directly into tool responses
Rate limiting on search operations to prevent abuse
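
The rate limiting item above does not name an algorithm. One common option is a per-client sliding window, sketched below as an assumption rather than the chosen design. The class name, the limits, and the use of a client ID string as the key are all hypothetical; the clock is injectable so the behavior can be tested without sleeping.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_calls per client within the last window_seconds."""

    def __init__(self, max_calls: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock
        self._calls: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        calls = self._calls[client_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

This keeps state in process memory, so it only limits traffic per server instance; a deployment with several instances would need a shared store.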

Performance

Lazy loading of node specifications
Response size optimization with optional detail inclusion
Caching strategy for frequently accessed specifications
Search index for efficient node discovery
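
For the search index item above, one simple structure is an inverted index that maps each token to the set of nodes whose description contains it, built once instead of scanning every spec per query. The sketch below is illustrative only: the function names are hypothetical, and it indexes descriptions alone rather than parameters and ports.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set, Tuple

NodeKey = Tuple[str, str]  # (node_type, subtype)


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(descriptions: Dict[NodeKey, str]) -> Dict[str, Set[NodeKey]]:
    """Map every token to the nodes whose description contains it."""
    index: Dict[str, Set[NodeKey]] = defaultdict(set)
    for key, text in descriptions.items():
        for token in tokenize(text):
            index[token].add(key)
    return index


def lookup(index: Dict[str, Set[NodeKey]], query: str) -> Set[NodeKey]:
    """Return the nodes whose description contains every query token."""
    token_sets = [index.get(token, set()) for token in tokenize(query)]
    return set.intersection(*token_sets) if token_sets else set()
```

Requiring every token to match (intersection) favors precision; taking the union and ranking by match count would favor recall instead.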

Testing Strategy

Unit Tests

NodeKnowledgeService methods with mock node registry
Serialization accuracy for complex node specifications
Search relevance scoring validation
Error handling for invalid requests

Integration Tests

MCP tool invocation end-to-end workflows
Large registry performance with all node types loaded
Authentication and authorization with API keys
Error responses for malformed requests

Monitoring & Observability

Metrics

Tool invocation frequency by tool type
Response times for different query types
Cache hit rates for node specifications
Search query patterns and effectiveness
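
A minimal in-process sketch of the first two metrics above (invocation counts and response times per tool) might look like the following. `ToolMetrics` is a hypothetical helper; a production setup would more likely export these values to whatever metrics backend the API Gateway already uses.

```python
from collections import Counter, defaultdict
from typing import Dict, List


class ToolMetrics:
    """Collect invocation counts and latency samples per tool name."""

    def __init__(self) -> None:
        self.invocations: Counter = Counter()
        self.latencies_ms: Dict[str, List[float]] = defaultdict(list)

    def record(self, tool_name: str, elapsed_ms: float) -> None:
        self.invocations[tool_name] += 1
        self.latencies_ms[tool_name].append(elapsed_ms)

    def average_ms(self, tool_name: str) -> float:
        """Mean latency for a tool, or 0.0 when nothing was recorded."""
        samples = self.latencies_ms.get(tool_name, [])
        return sum(samples) / len(samples) if samples else 0.0
```

The `execution_time_ms` value already computed in `invoke_tool` could be passed to `record` directly.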

Logging

Tool usage tracking with anonymized query patterns
Performance monitoring for slow queries
Error tracking with detailed context
Cache performance metrics
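
The design does not say how query patterns would be anonymized. One option, sketched here as an assumption, is to log a keyed hash of the normalized query so that identical queries can still be grouped without storing their text. The logger name, the function names, and the 16-character truncation are hypothetical.

```python
import hashlib
import hmac
import logging

logger = logging.getLogger("mcp.node_knowledge")  # hypothetical logger name


def anonymize_query(query: str, secret: bytes) -> str:
    """Return a keyed hash of the normalized query text."""
    # Normalize case and whitespace so equivalent queries hash the same.
    normalized = " ".join(query.lower().split())
    digest = hmac.new(secret, normalized.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def log_search(query: str, result_count: int, elapsed_ms: float,
               secret: bytes) -> None:
    """Log a search event without including the raw query."""
    logger.info(
        "search_nodes query_hash=%s results=%d elapsed_ms=%.2f",
        anonymize_query(query, secret), result_count, elapsed_ms,
    )
```

Using a keyed hash (HMAC) rather than a plain digest makes it harder to recover short queries by hashing guesses, provided the secret stays out of the logs.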

Future Enhancements

Semantic Search

Embedding-based search for better node discovery
Natural language queries for complex node requirements
Similarity recommendations for related nodes
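
Embedding-based search typically ranks candidates by the cosine similarity between a query vector and each node's vector. The sketch below shows only that ranking step in plain Python. The vectors in the usage lines are invented two-dimensional toy values; a real embedding model, which this document does not specify, would supply them.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors, or 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec: Sequence[float],
                       node_vecs: Dict[str, Sequence[float]],
                       top_k: int = 3) -> List[Tuple[str, float]]:
    """Return the top_k node names with the highest similarity."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in node_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy vectors for illustration only; not output from a real model.
node_vecs = {"EXTERNAL_EMAIL": [1.0, 0.0], "HTTP_REQUEST": [0.0, 1.0]}
ranking = rank_by_similarity([0.9, 0.1], node_vecs)
```

Here the query vector points mostly along the first axis, so `EXTERNAL_EMAIL` ranks first.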

Advanced Features

Node compatibility validation tools
Workflow template generation based on available nodes
Usage analytics and recommendation engine
Custom node specification support

Conclusion

This MCP Node Knowledge Server design gives workflow agents access to detailed node specifications through three MCP tools: get_node_types, get_node_details, and search_nodes. The implementation reuses the existing API Gateway MCP implementation and node spec registry, adding discovery and query capabilities on top of them.

The phased implementation approach ensures incremental value delivery while maintaining system stability. The design supports future enhancements like semantic search and advanced analytics while providing immediate value through basic node discovery and specification access.
