Hf Mcp
Automate and integrate Hugging Face MCP model and pipeline workflows
Hf Mcp is a community skill for building Model Context Protocol servers that integrate Hugging Face model inference, dataset access, and hub operations into AI agent workflows through a standardized tool interface.
What Is This?
Overview
Hf Mcp provides patterns for creating MCP servers that expose Hugging Face capabilities as discoverable tools for AI agents. It covers model inference endpoints, dataset loading and querying, model card retrieval, space deployment, and hub search operations. The skill enables AI assistants to access the full Hugging Face ecosystem through the Model Context Protocol without requiring direct API knowledge.
Who Should Use This
This skill serves developers building AI agents that need access to Hugging Face models and datasets, platform engineers creating unified tool servers for ML operations, and teams that want AI assistants to search, evaluate, and deploy models through natural language interaction.
Why Use It?
Problems It Solves
Accessing Hugging Face services from AI agents requires writing custom integration code for each capability. Model selection involves manual hub browsing rather than programmatic search based on task requirements. Dataset loading and preprocessing steps must be scripted individually for each use case. Without an MCP layer, switching between AI agent platforms requires reimplementing Hugging Face integrations from scratch.
Core Highlights
Model inference tools run predictions on Hugging Face hosted models through a single tool interface. Dataset tools load, filter, and preview datasets from the Hub without manual download scripts. Search tools query the Hub for models, datasets, and spaces with filtering by task type, library, and popularity. Resource endpoints expose model cards and dataset documentation as readable context for AI clients.
How to Use It?
Basic Usage
from mcp.server.fastmcp import FastMCP
from mcp.types import TextContent
from huggingface_hub import InferenceClient, HfApi
import json

# FastMCP supplies the @server.tool() decorator used below; the low-level
# mcp.server.Server class does not expose one.
server = FastMCP("huggingface-tools")
hf_api = HfApi()
inference = InferenceClient()

@server.tool()
async def search_models(
    query: str, task: str = "", limit: int = 5
) -> list[TextContent]:
    """Search the Hugging Face Hub for models, sorted by downloads."""
    models = hf_api.list_models(
        search=query, task=task or None, limit=limit,
        sort="downloads", direction=-1
    )
    results = [{
        "id": m.id, "task": m.pipeline_tag,
        "downloads": m.downloads, "likes": m.likes
    } for m in models]
    return [TextContent(type="text", text=json.dumps(results, indent=2))]

@server.tool()
async def run_inference(
    model_id: str, text: str
) -> list[TextContent]:
    """Run text generation on a Hugging Face hosted model."""
    result = inference.text_generation(text, model=model_id, max_new_tokens=200)
    return [TextContent(type="text", text=result)]

Real-World Examples
from datasets import load_dataset

@server.tool()
async def preview_dataset(
    dataset_id: str, split: str = "train", rows: int = 5
) -> list[TextContent]:
    """Load and preview rows from a Hugging Face dataset."""
    # Streaming avoids downloading the full dataset just to show a preview.
    ds = load_dataset(dataset_id, split=split, streaming=True)
    samples = []
    for i, row in enumerate(ds):
        if i >= rows:
            break
        samples.append(row)
    return [TextContent(type="text", text=json.dumps(samples, indent=2, default=str))]

@server.tool()
async def get_model_info(
    model_id: str
) -> list[TextContent]:
    """Get detailed information about a model."""
    info = hf_api.model_info(model_id)
    details = {
        "id": info.id, "pipeline_tag": info.pipeline_tag,
        "library_name": info.library_name,
        "downloads": info.downloads, "likes": info.likes,
        "tags": info.tags[:10] if info.tags else []
    }
    return [TextContent(type="text", text=json.dumps(details, indent=2))]

@server.tool()
async def classify_text(
    text: str, model_id: str = "distilbert-base-uncased-finetuned-sst-2-english"
) -> list[TextContent]:
    """Classify text using a Hugging Face model."""
    result = inference.text_classification(text, model=model_id)
    # The client returns dataclass elements, not plain dicts, so convert
    # them before JSON serialization.
    scores = [{"label": r.label, "score": r.score} for r in result]
    return [TextContent(type="text", text=json.dumps(scores, indent=2))]

Advanced Tips
Cache model search results and dataset metadata to reduce Hub API calls during iterative exploration. Use streaming dataset loading for large datasets that would not fit in memory. Implement tool-level rate limiting to stay within Hugging Face API quotas.
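The caching advice above can be sketched with a small in-memory TTL cache. This is an illustrative sketch: the helper name `cached`, the `CACHE_TTL` value, and the cache layout are assumptions, not part of the skill.

```python
import time

# Illustrative TTL cache for Hub search results; `cached` and CACHE_TTL
# are assumed names, not part of the skill's API.
CACHE_TTL = 300  # seconds to reuse a result before re-querying the Hub
_cache: dict = {}

def cached(key, compute):
    """Return the cached value for `key`, recomputing at most once per CACHE_TTL."""
    now = time.time()
    entry = _cache.get(key)
    if entry is not None and now - entry[0] < CACHE_TTL:
        return entry[1]
    value = compute()
    _cache[key] = (now, value)
    return value
```

A tool like `search_models` could then wrap its Hub query as `cached(("search", query, task, limit), lambda: ...)`, so repeated exploration of the same query does not count against API quotas.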
When to Use It?
Use Cases
Build AI assistants that search and evaluate Hugging Face models for specific tasks. Create data exploration agents that preview and analyze datasets through conversation. Develop ML workflow tools that let agents run inference and compare model outputs.
Related Topics
Model Context Protocol specification, Hugging Face Hub API, transformers library, dataset loading patterns, and AI agent tool integration.
Important Notes
Requirements
Python with the mcp, huggingface_hub, and datasets packages installed. A Hugging Face API token for accessing gated models and elevated rate limits. An MCP-compatible client for testing tool interactions.
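One common way to supply the API token mentioned above is to read it from the `HF_TOKEN` environment variable and pass it to both clients explicitly; the variable name is a widely used convention, but any secret-management scheme works.

```python
import os
from huggingface_hub import HfApi, InferenceClient

# Both clients accept a token argument; when HF_TOKEN is unset they fall
# back to anonymous access, with lower rate limits and no gated models.
token = os.environ.get("HF_TOKEN")
hf_api = HfApi(token=token)
inference = InferenceClient(token=token)
```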
Usage Recommendations
Do: use streaming mode for large datasets to avoid memory exhaustion. Implement timeout handling for inference calls that may take longer on large models. Return structured JSON from tools for reliable parsing by AI clients.
Don't: expose model deletion or repository write operations without explicit authorization checks. Load entire large datasets into memory when only a preview is needed. Skip error handling for model inference failures that may occur with incompatible inputs.
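The timeout and error-handling recommendations can be sketched as a small async wrapper. `safe_inference` is a hypothetical helper, not part of the skill; since `InferenceClient` calls are synchronous, a real tool would pass them in via `asyncio.to_thread` so the timeout can take effect.

```python
import asyncio

# Hedged sketch: bound a potentially slow inference call with a timeout and
# convert failures into a structured payload instead of crashing the tool.
# `run_model` is a placeholder for the actual inference coroutine factory.
async def safe_inference(run_model, *, timeout: float = 30.0) -> dict:
    try:
        result = await asyncio.wait_for(run_model(), timeout=timeout)
        return {"ok": True, "result": result}
    except asyncio.TimeoutError:
        return {"ok": False, "error": f"inference timed out after {timeout}s"}
    except Exception as exc:  # e.g. incompatible inputs, model not hosted
        return {"ok": False, "error": str(exc)}
```

Returning the error as structured JSON keeps the failure readable by the AI client, matching the "structured output" recommendation above.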
Limitations
Inference API rate limits restrict the frequency of model predictions. Large model inference may have significant latency on free-tier hosted endpoints. Not all models on the Hub support the Inference API, requiring local deployment for some architectures.