How-To: Establish and Connect to Intelligence Hub’s MCP Server
Activate the Intelligence Hub Internal MCP Server and Create Tooling for AI Agents
What Does This Article Cover?
HighByte Intelligence Hub includes a configurable Model Context Protocol (MCP) server as of version 4.2.0. This article covers the steps necessary to activate the Intelligence Hub MCP server, guidance for creating tools, and options for verifying the server and using it with AI agents.
What is Model Context Protocol?
Model Context Protocol (MCP) is a method by which AI agents – often Large Language Models (LLMs) like ChatGPT – can discover and access additional tools and sources of information. When a model is trained, its knowledge and awareness are finite. However, it can gain access to new context and new information through additional tools – often API endpoints. An MCP server collects information about these APIs and provides it to the AI agent, including:
- Lists of the available tools
- Descriptions of the available tools to inform when they should be used
- The relevant parameters that need to be provided to each tool
HighByte’s MCP server will inform a properly configured agent of the available tools configured in the associated Intelligence Hub.
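As an illustration, the sketch below shows the general shape of the metadata an MCP server advertises for a single tool. The tool name, description, and parameter schema are hypothetical examples, not output captured from Intelligence Hub.

```python
# Illustrative only: the general shape of the tool metadata an MCP server
# advertises to an agent. The tool name, description, and schema below are
# hypothetical examples, not output captured from Intelligence Hub.
example_tool = {
    "name": "GetSecretMessage",  # a Pipeline exposed as an MCP tool
    "description": "Returns the secret message from our friend Kimberly.",
    "inputSchema": {             # JSON Schema describing the expected parameters
        "type": "object",
        "properties": {
            "recipient": {"type": "string"},  # hypothetical required parameter
        },
        "required": ["recipient"],
    },
}
```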
How To Activate the MCP Server
The MCP server is configurable alongside the Intelligence Hub REST Data Server. Navigate to the “Manage” portion of the left-side menu and open the “Settings” module. Scroll down to the “REST Data Server” section. Ensure the REST Data Server is enabled, and additionally enable the MCP server. Set the MCP server port as necessary or accept the default port of 45345.
Manage > Settings > REST Data Server > Enabled > MCP Server & MCP Server Port
Save the settings at the top of the page. The MCP Server is now enabled. This will expose all Pipelines with API triggers as MCP tools. Access to these tools is subject to permissions.
The MCP server is now available at the base URL of the Intelligence Hub, on the configured port, at the /mcp endpoint, using the Streamable HTTP transport. In the example above, this is http://127.0.0.1:45345/mcp.
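As a quick sanity check, the sketch below posts a minimal JSON-RPC initialize request to that endpoint using the requests library. It assumes the default port and that the server accepts a standard Authorization: Bearer header carrying a token obtained from login or an API key (see Tool Permissions below); adjust both to match your configuration.

```python
# Minimal connectivity check for the Intelligence Hub MCP endpoint.
# Assumptions: the default port (45345) and a standard Authorization: Bearer
# header carrying a login token or API key; adjust both as needed.
import requests

MCP_URL = "http://127.0.0.1:45345/mcp"
TOKEN = "<bearer token or API key>"

response = requests.post(
    MCP_URL,
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "Content-Type": "application/json",
        # Streamable HTTP servers may answer with plain JSON or an SSE stream.
        "Accept": "application/json, text/event-stream",
    },
    json={
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # MCP spec revision; adjust if needed
            "capabilities": {},
            "clientInfo": {"name": "connectivity-check", "version": "0.1"},
        },
    },
    timeout=10,
)
print(response.status_code)
print(response.text)
```

A successful response confirms the endpoint is reachable and accepting MCP initialize requests.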
How To Create MCP Tools
When the MCP Server is enabled in HighByte Intelligence Hub, by default all custom API endpoints – all pipelines that have enabled API triggers – are exposed as MCP tools. Access to these tools is subject to permissions. Additional tools may be created with additional API Triggered Pipelines.
Custom API Endpoints
An example pipeline project is available below.
A Pipeline can be created to serve as an MCP tool. All that is required to enable this functionality is an enabled API Trigger. An AI agent may deliver parameters or an event to the Pipeline, based on the prompt given to that agent or on the parameters the tool requires (described below). This event will be injected into the Pipeline.
The Pipeline may then be built as needed to process or fetch any data based on the triggering event, and a return stage may be used to send a response back to the AI agent.
This Pipeline receives a trigger from an AI agent and returns a simple (random) secret message.
The secret message model stage assembles the (random) secret message.
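As a rough illustration of that round trip, the sketch below shows a hypothetical triggering event and response for the secret-message example. The field names and message are invented; the actual shapes depend on the Pipeline's API Trigger parameters and its return stage.

```python
# Hypothetical round trip for the secret-message example. The exact shapes
# depend on the Pipeline's API Trigger parameters and its return stage.
incoming_event = {
    # Parameters the AI agent supplies when it calls the tool; Intelligence Hub
    # injects this event into the Pipeline.
    "recipient": "Kimberly",
}

pipeline_response = {
    # Assembled by the secret message model stage and sent back to the agent
    # by the return stage.
    "secretMessage": "The owl flies at midnight.",
}
```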
Pipeline Names and Descriptions
The Pipeline name and description will also be used by the MCP Server to provide context to the AI agent on when it should use this tool. The description field therefore serves as documentation for both human data engineers and AI agents. When a Pipeline is intended for AI agents, take care to provide a clear, concise description and instructions for using it, including context for its purpose and return value.
Use syntax and methodologies similar to those used when writing prompts for general Large Language Models.
This example shows the description for the Pipeline displayed above.
“Go here to get the secret message from our friend Kimberly.” This provides a concise description of the tool and an instruction that tells an AI agent when and how to use it. It also provides a specific, unique keyword or phrase – “our friend Kimberly” – that may help the AI agent recognize when to use the tool. Such keywords are not strictly required. See Agent Model Parameter Size below for more information.
Necessary Parameters
API-Triggered Pipelines may have parameters added. These parameters must be included in the POST body, but they do not limit the POST body. That is to say, they are a minimum requirement – other additional parameters may also be provided, but if the required parameters are missing, the Pipeline will return an error.
The MCP server will also inform any AI agent configured to use this tool that these parameters are required, but it is up to the AI agent to populate them. Therefore, the naming of these parameters should follow patterns and strategies similar to those used for Pipeline names. Consider how the AI agent will be prompted to use this Pipeline tool:
- What syntax will the human likely use when interacting with the agent?
- How ambiguous or specific is this parameter?
- How important is this parameter?
See Agent Model Parameter Size below for more information.
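To make the requirement concrete, the sketch below calls a tool with its required arguments using the official MCP Python SDK (the mcp package). The tool name GetSecretMessage and the recipient parameter are hypothetical; substitute your Pipeline's name and the parameters defined on its API Trigger. The Authorization: Bearer header is an assumption about your authentication setup (see Tool Permissions below).

```python
# Sketch of calling an Intelligence Hub MCP tool with its required parameters,
# using the official MCP Python SDK (the "mcp" package). The tool name and the
# "recipient" parameter are hypothetical; substitute your Pipeline's name and
# the parameters defined on its API Trigger.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

MCP_URL = "http://127.0.0.1:45345/mcp"
TOKEN = "<bearer token or API key>"  # see Tool Permissions below

async def main() -> None:
    headers = {"Authorization": f"Bearer {TOKEN}"}  # assumed auth scheme
    async with streamablehttp_client(MCP_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Arguments must cover the tool's required parameters; if they are
            # missing, the Pipeline will return an error.
            result = await session.call_tool(
                "GetSecretMessage",
                arguments={"recipient": "Kimberly"},
            )
            print(result.content)

asyncio.run(main())
```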
Tool Permissions
AI agents will access the MCP server with either a bearer token obtained from login (less common in practice) or an API Key. By default, all Pipelines with API Triggers are exposed as MCP tools, but they may be restricted using tags and claims if it is desirable to provide only certain tools to certain agents.
How To Verify MCP Availability
Pipelines with enabled API triggers are automatically exposed as MCP tools after the MCP server is enabled. However, for debugging purposes, it may be beneficial to test the MCP server and confirm the tools are exposed as expected.
MCP Inspector
MCP Inspector is a tool that can connect to an MCP server and display the discoverable tools to the user. It can be launched with the npx utility that is installed with Node.js.
Once running, MCP Inspector is accessed through a graphical web interface. When MCP Inspector is launched, it prints a session token to its console that must be entered when connecting to an MCP server. Once connected to an MCP server with credentials – a bearer token retrieved either from login or by using an API key – MCP Inspector can be used to list the available tools. It will also display each Pipeline's description, which gives context for the tool, and the parameters an agent or user must fill. A test dialogue is also provided.
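If a scripted check is preferred to the Inspector's web interface, the sketch below lists the exposed tools using the official MCP Python SDK (the mcp package). The default port and the Authorization: Bearer header are assumptions; adjust them to your configuration.

```python
# Sketch of listing the tools Intelligence Hub exposes over MCP, using the
# official MCP Python SDK (the "mcp" package), as a scripted alternative to
# MCP Inspector. The default port and Authorization: Bearer header are
# assumptions; adjust to your configuration.
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

MCP_URL = "http://127.0.0.1:45345/mcp"
TOKEN = "<bearer token or API key>"

async def main() -> None:
    headers = {"Authorization": f"Bearer {TOKEN}"}
    async with streamablehttp_client(MCP_URL, headers=headers) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            for tool in tools.tools:
                # The name, description, and required parameters mirror the
                # Pipeline name, description, and API Trigger parameters.
                print(tool.name)
                print(tool.description)
                print(tool.inputSchema.get("required", []))

asyncio.run(main())
```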
How To Use MCP with an Agent
As with MCP Inspector, HighByte does not provide or control any system by which a human can interact directly with an LLM or AI agent. However, there are some common tools in the industry. Common web-hosted LLMs may not give the user an opportunity to use or create a customized agent (a specific invocation of an LLM with special instructions), but some tools are readily available for interacting with LLMs and creating agents.
LibreChat
LibreChat is an open-source application that can be used to interact with public-facing, web-hosted LLMs, or configured to talk to LLMs running privately or locally. LibreChat also has the option to create AI agents – which is necessary in HighByte Intelligence Hub's case – in order to attach an API key and give the agent access to the Intelligence Hub MCP server.
To run locally, LibreChat is usually installed as a set of containers on a platform like Docker. It is not a standalone application; it depends on other services such as MongoDB. LibreChat provides instructions and a compose file that should make organizing all the necessary services straightforward.
Additional Considerations
AI agents based on LLMs are generative and extrapolate by design. They sometimes respond beyond their capabilities, and their output should be treated with appropriate caution.
Agent Model Tool Compatibility
An LLM-based AI agent will need to use an LLM that is capable of using tools. Not all LLMs are able to recognize and act on the need to use an external tool. Verify that your LLM and agent are tool-enabled. This should be specified in the repository where you select your model, or available in your model's specification, depending on the platform used to source it.
Agent Model Parameter Size
LLMs are measured in part by the number of parameters they store and compute with. Many models are released in multiple versions with different parameter counts; for example, Llama3.1:8b and Llama3.1:70b have 8 billion and 70 billion parameters respectively.
A model with fewer parameters will respond more quickly and can run on less powerful hardware. However, a model with more parameters will be “smarter” and able to provide more “reasoned” responses. A larger model is also generally more likely to recognize when it should use a tool in a given situation and to provide it accurate arguments. Resource needs will have to be balanced against capability to find the appropriate model.
Disclaimer
It is entirely possible for a model of any size, at any given point, to fail to recognize that it should use a tool, to provide that tool invalid arguments, or to ignore the tool's response. In these scenarios, it is entirely possible for any model to hallucinate a superficially valid response. Models, agents, tools, and their intended scenarios should be thoroughly tested before being trusted in production, and should not be given autonomy to control any production practice, method, or equipment.
Running on Local Hardware
It may be desirable to run an AI agent on local hardware due to concerns over data security, internet access, token costs, or other factors. HighByte does not control any locally-run models in any way, but tools like Ollama can be used to host local LLMs. Ollama also maintains a library of compatible models, including a list of models that support tools.
Other Resources
Example Project (Warning: Contains Function Block)
Parameters (User Guide) (Knowledge Base)