LangChain / LangSmith MCP Server

LangSmith LLM observability and tracing MCP server

Last Updated: Apr 14, 2026

What is the LangChain / LangSmith MCP Server?

The LangChain / LangSmith MCP server gives AI agents structured, permission-aware access to LangChain / LangSmith through the Model Context Protocol. With 15 pre-built actions, agents can read, create, and update LangChain / LangSmith data on behalf of authorized users.

Willow ships the LangChain / LangSmith MCP server as part of an enterprise control plane. Every call runs behind SSO (Okta, Azure AD), enforces RBAC and least-privilege at runtime, writes to a full audit trail, and integrates with Splunk and Loki for SIEM visibility. Connect from Claude Desktop, Claude Code, Cursor, ChatGPT, VS Code, n8n, or any custom agent. Install once, distribute org-wide, and see exactly how LangChain / LangSmith is being used by every AI agent in your stack.

Tools

list_prompts

Fetch prompts from LangSmith with optional filtering.

Args:
- is_public (str): Filter by prompt visibility - "true" for public prompts, "false" for private prompts (default: "false")
- limit (int): Maximum number of prompts to return (default: 20)

Returns:
- Dict[str, Any]: Dictionary containing the prompts and metadata

get_prompt_by_name

Get a specific prompt by its exact name.

Args:
- prompt_name (str): The exact name of the prompt to retrieve
- ctx: FastMCP context (automatically provided)

Returns:
- Dict[str, Any]: Dictionary containing the prompt details and template, or an error message if the prompt cannot be found

push_prompt

Call this tool when you need to understand how to create and push prompts to LangSmith.

get_thread_history

Retrieve one page of message history for a specific conversation thread. Pagination is character-based: pages are built against a character budget (max_chars_per_page), and long strings are truncated to preview_chars. Supply page_number (1-based) on every call; use the returned total_pages to request further pages.

Args:
- thread_id (str): The unique ID of the thread to fetch history for
- project_name (str): The name of the project containing the thread (format: "owner/project" or just "project")
- page_number (int): 1-based page index (required)
- max_chars_per_page (int): Max character count per page, capped at 30000 (default: 25000)
- preview_chars (int): Truncate long strings to this length with "… (+N chars)" (default: 150)

Returns:
- Dict with result (list of messages), page_number, total_pages, max_chars_per_page, preview_chars; or an error message if the thread cannot be found
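The page-by-page contract described above can be sketched as a small loop. This is a minimal illustration, not part of the server: `collect_all_pages` and the `fetch_page` callable are hypothetical names, and `fetch_page` stands in for however your MCP client invokes the tool with a given `page_number`. The sketch assumes a response without the expected items key signals an error.

```python
from typing import Any, Callable, Dict, List


def collect_all_pages(
    fetch_page: Callable[[int], Dict[str, Any]],
    items_key: str = "result",
) -> List[Any]:
    """Gather items from every page of a character-paginated tool response.

    fetch_page is a hypothetical callable: given a 1-based page number it
    returns the tool's response dict (items plus total_pages).
    """
    items: List[Any] = []
    page_number = 1
    while True:
        page = fetch_page(page_number)
        if items_key not in page:
            # Assumption: a response without the items key is an error payload.
            raise RuntimeError(f"unexpected response on page {page_number}: {page}")
        items.extend(page[items_key])
        total_pages = int(page.get("total_pages", 1))
        if page_number >= total_pages:
            return items
        page_number += 1
```

The same loop should work for fetch_runs by passing `items_key="runs"`, since that tool documents the same page_number and total_pages fields.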

fetch_runs

Fetch LangSmith runs from one or more projects with flexible filters and automatic pagination. All results are paginated by character budget to keep responses manageable. Use page_number and total_pages from the response to iterate through multiple pages.

📤 RETURNS (always paginated)

Dict with: runs, page_number, total_pages, max_chars_per_page, preview_chars. May include _truncated, _truncated_message, _truncated_preview if content exceeded budget.

⚙️ PARAMETERS

- project_name : str (required)
  The project name to fetch runs from. For multiple projects, use JSON array string: '["project1", "project2"]'
- limit : int (required)
  Maximum number of runs to fetch from LangSmith API (capped at 100). These runs are then paginated by character budget into pages.
- page_number : int, default 1
  1-based page index. Use with total_pages from response to iterate through pages.
- trace_id : str, optional
  Return only runs that belong to this trace UUID. Example: "123e4567-e89b-12d3-a456-426614174000"
- run_type : str, optional
  Filter runs by type: "llm", "chain", "tool", "retriever".
- error : str, optional
  "true" for errored runs, "false" for successful runs.
- is_root : str, optional
  "true" for only top-level traces, "false" to exclude roots.
- filter : str, optional
  Filter Query Language (FQL) expression.
  Common fields: id, name, run_type, start_time, end_time, latency, total_tokens, error, tags, feedback_key, feedback_score, metadata_key, metadata_value, execution_order.
  Operators: eq, neq, gt, gte, lt, lte, has, search, and, or, not.
  Examples:
    'gt(latency, "5s")'  # runs > 5 seconds
    'neq(error, null)'  # errored runs
    'has(tags, "beta")'  # tagged "beta"
    'and(eq(name,"ChatOpenAI"), eq(run_type,"llm"))'  # name AND type
    'search("image classification")'  # full-text search
- trace_filter : str, optional
  Filter on the root run of each trace tree. Example: 'and(eq(feedback_key,"user_score"), eq(feedback_score,1))'
- tree_filter : str, optional
  Filter on any run in the trace tree (siblings/children included). Example: 'eq(name,"ExpandQuery")'
- order_by : str, default "-start_time"
  Sort field; prefix "-" for descending. Examples: "-start_time", "latency"
- reference_example_id : str, optional
  Filter runs by dataset example ID.
- max_chars_per_page : int, default 25000
  Max JSON character count per page (capped at 30000). Pagination splits by this budget.
- preview_chars : int, default 150
  Truncate long strings to this length with "… (+N chars)". Keeps responses readable.

🧪 EXAMPLES

1️⃣ Get latest 10 root runs:
    fetch_runs("my-project", limit=10, page_number=1, is_root="true")

2️⃣ Get errored tool runs:
    fetch_runs("my-project", limit=50, page_number=1, run_type="tool", error="true")

3️⃣ Runs > 5s with "experimental" tag:
    fetch_runs("my-project", limit=50, page_number=1, filter='and(gt(latency,"5s"), has(tags,"experimental"))')

4️⃣ All runs for a trace (paginate through pages):
    r = fetch_runs("my-project", limit=50, page_number=1, trace_id="abc-123")
    # Check r["total_pages"] and fetch subsequent pages if needed
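Because the filter expressions and the multi-project form of project_name are plain strings, it can help to build them programmatically rather than by hand. The helpers below are a hypothetical sketch (none of these function names exist in the server); they only assemble strings and an arguments dict in the shapes described above, and they assume FQL accepts JSON-style literals, as the documented examples suggest.

```python
import json
from typing import Any, Dict, List


def fql_call(op: str, field: str, value: Any) -> str:
    """Render one FQL clause, e.g. gt(latency, "5s").

    json.dumps quotes strings, leaves numbers bare, and renders None as null,
    matching the literal styles used in the documented examples.
    """
    return f"{op}({field}, {json.dumps(value)})"


def fql_and(*clauses: str) -> str:
    """Combine clauses with the documented and(...) operator."""
    return "and(" + ", ".join(clauses) + ")"


def build_fetch_runs_args(
    projects: List[str], limit: int, page_number: int = 1, **filters: str
) -> Dict[str, Any]:
    """Assemble arguments for a fetch_runs call (hypothetical helper)."""
    if not projects:
        raise ValueError("at least one project name is required")
    # A single project is passed as-is; several become a JSON array string.
    project_name = projects[0] if len(projects) == 1 else json.dumps(projects)
    args: Dict[str, Any] = {
        "project_name": project_name,
        "limit": min(limit, 100),  # the tool caps limit at 100
        "page_number": page_number,
    }
    args.update(filters)
    return args
```

For example, `fql_and(fql_call("gt", "latency", "5s"), fql_call("has", "tags", "experimental"))` produces the same expression as example 3 above, apart from whitespace after the commas.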

list_projects

List LangSmith projects with optional filtering and detail level control. Fetches projects from LangSmith, optionally filtering by name and controlling the level of detail returned. Can return either simplified project information or full project details. If a dataset ID or name is provided, you don't need to provide a project name.

🧩 PURPOSE

This function provides a convenient way to list and explore LangSmith projects. It supports:
- Filtering projects by name (partial match)
- Limiting the number of results
- Choosing between simplified or full project information
- Automatically extracting deployment IDs from nested project data

⚙️ PARAMETERS

- limit : int, default 5
  Maximum number of projects to return (as string, e.g., "5"). This can be adjusted by agents or users based on their needs.
- project_name : str, optional
  Filter projects by name using partial matching. If provided, only projects whose names contain this string will be returned. Example: `project_name="Chat"` will match "Chat-LangChain", "ChatBot", etc.
- more_info : str, default "false"
  Controls the level of detail returned:
  - `"false"` (default): Returns simplified project information with only essential fields: `name`, `project_id`, and `agent_deployment_id` (if available)
  - `"true"`: Returns full project details as returned by the LangSmith API
- reference_dataset_id : str, optional
  The ID of the reference dataset to filter projects by. Provide either this or `reference_dataset_name`, not both.
- reference_dataset_name : str, optional
  The name of the reference dataset to filter projects by. Provide either this or `reference_dataset_id`, not both.

📤 RETURNS

List[dict]: A list of project dictionaries. The structure depends on `more_info`:

**When `more_info="false"` (simplified):**
```python
[
    {
        "name": "Chat-LangChain",
        "project_id": "787d5165-f110-43ff-a3fb-66ea1a70c971",
        "agent_deployment_id": "deployment-123"  # Only if available
    },
    ...
]
```

**When `more_info="true"` (full details):**
Returns complete project objects with all fields from the LangSmith API, including metadata, settings, statistics, and nested structures.

🧪 EXAMPLES

1️⃣ **List first 5 projects (simplified)**
```python
projects = list_projects(limit="5")
```

2️⃣ **Search for projects with "Chat" in the name**
```python
projects = list_projects(project_name="Chat", limit="10")
```

3️⃣ **Get full project details**
```python
projects = list_projects(limit="3", more_info="true")
```

4️⃣ **Find a specific project with full details**
```python
projects = list_projects(project_name="MyProject", more_info="true", limit="1")
```

🧠 NOTES FOR AGENTS

- Use `more_info="false"` for quick project discovery and listing
- Use `more_info="true"` when you need detailed project information
- The `agent_deployment_id` field is automatically extracted from nested project data when available, making it easy to identify agent deployments
- Projects are filtered to exclude reference projects by default
- The function uses `name_contains` for filtering, so partial matches work

get_billing_usage

Fetch organization billing usage (trace counts) with workspace names inline. Returns metrics from GET /api/v1/orgs/current/billing/usage. Each metric's `groups` is augmented to { workspace_uuid: { "workspace_name": "<name>", "value": <number> } }. If workspace is provided (UUID or display name), only that workspace's entries are included in each metric's groups.

Args:
- starting_on: Start of date range (ISO 8601), e.g. "2025-09-01T00:00:00Z"
- ending_before: End of date range (ISO 8601), e.g. "2025-10-01T00:00:00Z"
- workspace: Optional single workspace UUID or display name to filter to
- on_current_plan: "true" to include only usage on current plan (default)

Returns:
- List of billing metric objects with augmented groups, or dict with "error" key
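To make the augmented `groups` shape concrete, here is a minimal sketch that ranks workspaces by value within one metric. The function name `rank_workspaces` and the sample data are hypothetical; only the `groups`, `workspace_name`, `value`, and `error` keys come from the description above, and any other fields on a metric object are not assumed.

```python
from typing import Any, Dict, List, Tuple, Union


def rank_workspaces(
    response: Union[List[Dict[str, Any]], Dict[str, Any]],
    metric_index: int = 0,
) -> List[Tuple[str, float]]:
    """Return (workspace_name, value) pairs for one metric, largest first."""
    if isinstance(response, dict) and "error" in response:
        # The tool documents a dict with an "error" key on failure.
        raise RuntimeError(str(response["error"]))
    groups = response[metric_index]["groups"]
    rows = [(entry["workspace_name"], entry["value"]) for entry in groups.values()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Hypothetical sample in the documented shape (UUIDs shortened for readability).
sample = [
    {
        "groups": {
            "uuid-a": {"workspace_name": "Research", "value": 1200},
            "uuid-b": {"workspace_name": "Production", "value": 5400},
        }
    }
]
ranked = rank_workspaces(sample)
```

Values are ranked within a single metric rather than summed across metrics, since different metrics may not share a unit.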

list_experiments

List LangSmith experiment projects (reference projects) with mandatory dataset filtering. Fetches experiment projects from LangSmith that are associated with a specific dataset. These are projects used for model evaluation and comparison. Requires either a dataset ID or dataset name to filter experiments.

🧩 PURPOSE

This function provides a convenient way to list and explore LangSmith experiment projects. It supports:
- Filtering experiments by reference dataset (mandatory)
- Filtering projects by name (partial match)
- Limiting the number of results
- Automatically extracting deployment IDs from nested project data
- Returning simplified project information with key metrics (latency, cost, feedback stats)

⚙️ PARAMETERS

- reference_dataset_id : str, optional
  The ID of the reference dataset to filter experiments by. Either this OR `reference_dataset_name` must be provided (but not both).
- reference_dataset_name : str, optional
  The name of the reference dataset to filter experiments by. Either this OR `reference_dataset_id` must be provided (but not both).
- limit : int, default 5
  Maximum number of experiments to return. This can be adjusted by agents or users based on their needs.
- project_name : str, optional
  Filter projects by name using partial matching. If provided, only projects whose names contain this string will be returned. Example: `project_name="Chat"` will match "Chat-LangChain", "ChatBot", etc.

📤 RETURNS

Dict[str, Any]: A dictionary containing an "experiments" key with a list of simplified experiment project dictionaries:

```python
{
    "experiments": [
        {
            "name": "Experiment-Chat-LangChain",
            "experiment_id": "787d5165-f110-43ff-a3fb-66ea1a70c971",
            "feedback_stats": {...},  # Feedback statistics if available
            "latency_p50_seconds": 1.626,  # 50th percentile latency in seconds
            "latency_p99_seconds": 2.390,  # 99th percentile latency in seconds
            "total_cost": 0.00013005,  # Total cost in dollars
            "prompt_cost": 0.00002085,  # Prompt cost in dollars
            "completion_cost": 0.0001092,  # Completion cost in dollars
            "agent_deployment_id": "deployment-123"  # Only if available
        },
        ...
    ]
}
```

🧪 EXAMPLES

1️⃣ **List experiments for a dataset by ID**
```python
experiments = list_experiments(reference_dataset_id="f5ca13c6-96ad-48ba-a432-ebb6bf94528f")
```

2️⃣ **List experiments for a dataset by name**
```python
experiments = list_experiments(reference_dataset_name="my-dataset", limit=10)
```

3️⃣ **Find experiments with specific name pattern**
```python
experiments = list_experiments(
    reference_dataset_id="f5ca13c6-96ad-48ba-a432-ebb6bf94528f",
    project_name="Chat",
    limit=1
)
```

🧠 NOTES FOR AGENTS

- Returns simplified experiment information with key metrics (latency, cost, feedback stats)
- The `agent_deployment_id` field is automatically extracted from nested project data when available, making it easy to identify agent deployments
- Experiments are filtered to include only reference projects (associated with datasets)
- The function uses `name_contains` for filtering, so partial matches work
- You must provide either `reference_dataset_id` OR `reference_dataset_name`, but not both
- Experiment projects are used for model evaluation and comparison across different runs
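Since experiment projects exist for comparison, a typical follow-up is to rank the returned experiments by one of the documented metrics. The sketch below is illustrative only: `rank_by_p50_latency` is a hypothetical helper, the sample values are invented, and it assumes a metric may be missing or null for experiments that have no runs yet.

```python
from typing import Any, Dict, List


def rank_by_p50_latency(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Sort experiments by latency_p50_seconds, fastest first.

    Experiments without a latency value are skipped (assumption: the field
    can be absent or None when no runs have been recorded).
    """
    experiments = response.get("experiments", [])
    measured = [e for e in experiments if e.get("latency_p50_seconds") is not None]
    return sorted(measured, key=lambda e: e["latency_p50_seconds"])


# Hypothetical response in the documented shape.
sample_response = {
    "experiments": [
        {"name": "baseline", "latency_p50_seconds": 1.626, "total_cost": 0.00013005},
        {"name": "shorter-prompt", "latency_p50_seconds": 0.9, "total_cost": 0.0001},
        {"name": "not-run-yet", "latency_p50_seconds": None},
    ]
}
fastest_first = [e["name"] for e in rank_by_p50_latency(sample_response)]
```

The same pattern applies to `total_cost` or `latency_p99_seconds` by changing the sort key.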

list_datasets

Fetch LangSmith datasets. Note: If no arguments are provided, all datasets will be returned.

Args:
- dataset_ids (Optional[str]): Dataset IDs to filter by as JSON array string (e.g., '["id1", "id2"]') or single ID
- data_type (Optional[str]): Filter by dataset data type (e.g., 'chat', 'kv')
- dataset_name (Optional[str]): Filter by exact dataset name
- dataset_name_contains (Optional[str]): Filter by substring in dataset name
- metadata (Optional[str]): Filter by metadata as JSON object string (e.g., '{"key": "value"}')
- limit (int): Max number of datasets to return (default: 20)
- ctx: FastMCP context (automatically provided)

Returns:
- Dict[str, Any]: Dictionary containing the datasets and metadata, or an error message if the datasets cannot be retrieved

list_examples

Fetch examples from a LangSmith dataset with advanced filtering options. Note: Either dataset_id, dataset_name, or example_ids must be provided. If multiple are provided, they are used in order of precedence: example_ids, dataset_id, dataset_name.

Args:
- dataset_id (Optional[str]): Dataset ID to retrieve examples from
- dataset_name (Optional[str]): Dataset name to retrieve examples from
- example_ids (Optional[str]): Specific example IDs as JSON array string (e.g., '["id1", "id2"]') or single ID
- limit (int): Maximum number of examples to return (default: 10)
- offset (int): Number of examples to skip (default: 0)
- filter (Optional[str]): Filter string using LangSmith query syntax (e.g., 'has(metadata, {"key": "value"})')
- metadata (Optional[str]): Metadata to filter by as JSON object string (e.g., '{"key": "value"}')
- splits (Optional[str]): Dataset splits as JSON array string (e.g., '["train", "test"]') or single split
- inline_s3_urls (Optional[str]): Whether to inline S3 URLs: "true" or "false" (default: SDK default if not specified)
- include_attachments (Optional[str]): Whether to include attachments: "true" or "false" (default: SDK default if not specified)
- as_of (Optional[str]): Dataset version tag OR ISO timestamp to retrieve examples as of that version/time
- ctx: FastMCP context (automatically provided)

Returns:
- Dict[str, Any]: Dictionary containing the examples and metadata, or an error message if the examples cannot be retrieved
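Several list_examples arguments are JSON encoded as strings, which is easy to get wrong by hand. The helper below is a hypothetical sketch (not part of the server) that serializes Python values into those string forms and sends only the highest-precedence selector, mirroring the documented order of example_ids, then dataset_id, then dataset_name.

```python
import json
from typing import Any, Dict, List, Optional


def build_list_examples_args(
    dataset_id: Optional[str] = None,
    dataset_name: Optional[str] = None,
    example_ids: Optional[List[str]] = None,
    splits: Optional[List[str]] = None,
    metadata: Optional[Dict[str, Any]] = None,
    limit: int = 10,
    offset: int = 0,
) -> Dict[str, Any]:
    """Assemble list_examples arguments with JSON-string encoding applied."""
    if not (dataset_id or dataset_name or example_ids):
        raise ValueError("provide dataset_id, dataset_name, or example_ids")
    args: Dict[str, Any] = {"limit": limit, "offset": offset}
    # Keep only the selector that would win under the documented precedence.
    if example_ids:
        args["example_ids"] = json.dumps(list(example_ids))
    elif dataset_id:
        args["dataset_id"] = dataset_id
    else:
        args["dataset_name"] = dataset_name
    if splits:
        args["splits"] = json.dumps(list(splits))
    if metadata:
        args["metadata"] = json.dumps(metadata)
    return args


example_args = build_list_examples_args(
    dataset_name="my-dataset", splits=["train", "test"], metadata={"key": "value"}
)
```

Dropping the lower-precedence selectors is a design choice for clarity; the tool itself is documented to accept several and resolve them in order.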

1–10 of 15 tools

Customize Tools

Edit descriptions, modify arguments, select tools, or add new ones

Set Up Your LangChain / LangSmith MCP Server in Minutes

Add the following configuration to your MCP client. Authentication is handled via OAuth. Compatible with Claude Desktop, Claude Code, Cursor, ChatGPT, VS Code, n8n, and any MCP-compatible agent.

Claude Desktop

claude_desktop_config.json

{
  "mcpServers": {
    "willow-langchain-langsmith": {
      "type": "http",
      "url": "https://<org>.mcp-s.com/mcp/mcp/langchain-langsmith"
    }
  }
}

Cursor

.cursor/mcp.json

{
  "mcpServers": {
    "willow-langchain-langsmith": {
      "type": "http",
      "url": "https://<org>.mcp-s.com/mcp/mcp/langchain-langsmith"
    }
  }
}
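If a client config file already lists other MCP servers, pasting the snippet over it would drop them. The sketch below merges the entry into an existing config dict instead. It is a hypothetical helper, not a Willow tool: `add_willow_server` is an invented name, `org` is the value that replaces the `<org>` placeholder in the URL above, and reading or writing the actual file is left to the caller.

```python
import json
from typing import Any, Dict


def add_willow_server(
    config: Dict[str, Any],
    org: str,
    name: str = "willow-langchain-langsmith",
) -> Dict[str, Any]:
    """Return a copy of config with the server entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # preserve existing servers
    servers[name] = {
        "type": "http",
        "url": f"https://{org}.mcp-s.com/mcp/mcp/langchain-langsmith",
    }
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config with one unrelated server already present.
existing = {"mcpServers": {"other-server": {"type": "http", "url": "https://example.com/mcp"}}}
updated = add_willow_server(existing, org="acme")
print(json.dumps(updated, indent=2))
```

The input dict is not modified, so the original config can be kept as a backup before the result is written out.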

Claude Code

CLI

claude mcp add willow-langchain-langsmith --transport http https://<org>.mcp-s.com/mcp/mcp/langchain-langsmith

n8n

HTTP Request Node

{
  "url": "https://<org>.mcp-s.com/mcp/mcp/langchain-langsmith",
  "method": "POST"
}

Or click "Install with Willow" above to set up automatically with SSO and RBAC preconfigured.

Enterprise Governance for LangChain / LangSmith

Willow adds the governance layer that LangChain / LangSmith, like other SaaS tools, doesn't ship out of the box: every call runs behind SSO (Okta, Azure AD), enforces RBAC and least-privilege at runtime, writes to full audit logs, and detects shadow AI usage across your stack. One MCP gateway. Any agent. Every tool.

LangChain / LangSmith MCP Server FAQ

What is the LangChain / LangSmith MCP server?

The LangChain / LangSmith MCP server is a Model Context Protocol implementation that lets AI agents like Claude, Cursor, and ChatGPT read and write LangChain / LangSmith data through a standardized interface. Willow hosts and governs this server so enterprises can roll it out without a security review backlog.

How is Willow's LangChain / LangSmith MCP server different from the official one?

The official LangChain / LangSmith MCP server is scoped to a single user's account and does not include enterprise governance. Willow's version adds SSO, RBAC, audit logging, shadow AI detection, and centralized control over which actions agents can take across the entire org.

Which AI clients work with the LangChain / LangSmith MCP server?

Claude Desktop, Claude Code, Cursor, ChatGPT, VS Code with MCP support, n8n, and any custom agent built with OpenAI Agents SDK, LangChain, Vercel AI SDK, or Anthropic SDK.

Is the LangChain / LangSmith MCP server secure? How does Willow handle authentication?

Every call runs behind your existing SSO (Okta, Azure AD). Per-user OAuth scopes the agent to exactly what that user can do in LangChain / LangSmith, nothing more. No credentials reach the LLM. Every action writes to an audit trail.

Can I limit which LangChain / LangSmith actions agents can take?

Yes. Willow lets you scope agents to specific actions, specific projects, or specific environments. Toggle actions on or off in the dashboard, or enforce policy via infrastructure-as-code through GitHub.

How do I detect shadow LangChain / LangSmith MCP servers in my org?

Willow's browser extension and discovery service surface unmanaged MCP servers, skills, and AI agents across the org. If a developer installed an unapproved LangChain / LangSmith MCP locally, you'll see it.

What does the LangChain / LangSmith MCP server cost?

Pricing depends on org size and deployment model (SaaS, dedicated cloud, self-host). See withwillow.ai/pricing or contact sales for a quote.

How do I install the LangChain / LangSmith MCP server with Willow?

Install via the Willow Connect Panel in one click, paste the JSON snippet above into your Claude Desktop or Cursor config, or run the Claude Code CLI command. SSO and RBAC inherit from your existing Willow setup.