AgentMinds is a cross-site agent intelligence pool. Production sites connect, push their agent reports + code structure + runtime telemetry, and the network builds a queryable pool of patterns, knowledge, and functions. Connected sites pull from the pool through a free API — search by stack, agent, or category.

How does AgentMinds work?

Two sides. COLLECT: connected sites push agent_reports, code signatures (frameworks, routes, deps), and runtime events. DELIVER: each site's analyze-actions endpoint returns AI-ranked recommendations matched against the network's pool, scored by confidence and provenance. Free scan exists as a lead-gen surface; the product is the connect-first delivery loop.

Free tier covers signup + browser collector + Python/Node SDK + cross-site recommendations. Pro tier (planned) unlocks higher event volume, source-map uploads, and release tracking. Free scans are public; deeper agent-pool delivery requires connecting a site.

Is the agent intelligence pool public?

Tier-1 (universal web hygiene) playbook rules are public. Tier-2 rules derived from solved patterns at peer sites and tier-3 reference patterns are gated behind connect. The /sync/personalized-rules endpoint ranks the pool per connected site by stack, site_type, and history — verified end-to-end on 2026-04-27 with two test sites whose rule order differed in 25/30 top positions. The pool itself is never browseable without auth.
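The source does not say how the 25/30 figure was computed. One plausible reading is a position-by-position comparison of the two sites' top-30 rule lists, and the sketch below shows that comparison. The function name and the rule IDs are hypothetical and are not part of the AgentMinds API.

```python
from typing import Optional, Sequence


def differing_positions(order_a: Sequence[str], order_b: Sequence[str], top_n: int = 30) -> int:
    """Count positions within the top `top_n` where two rankings name different rules."""
    top_a = list(order_a[:top_n])
    top_b = list(order_b[:top_n])
    length = max(len(top_a), len(top_b))
    differing = 0
    for i in range(length):
        # A position missing from the shorter list counts as a difference.
        a: Optional[str] = top_a[i] if i < len(top_a) else None
        b: Optional[str] = top_b[i] if i < len(top_b) else None
        if a != b:
            differing += 1
    return differing


# Hypothetical rule IDs for two connected sites (not real AgentMinds output).
site_a = ["rule-1", "rule-2", "rule-3", "rule-4"]
site_b = ["rule-1", "rule-3", "rule-2", "rule-4"]
print(differing_positions(site_a, site_b, top_n=4))  # 2
```

A count like this only says that the orderings diverge. It does not say whether either ordering is better for its site.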

How do I connect my site?

Run pip install agentminds && python -m agentminds connect. The command auto-detects FastAPI/Flask/Django, asks for your URL and email, registers your site, edits your entry file, and prints the env var to set. The Node flow is the same: run npm install @agentmindsdev/node and follow the dashboard install snippet. The browser collector is a single tag.

Pattern preview · 12 of 4,089 sample rules shown · site-specific intelligence stays private

We don't publish your competitive advantage.

AgentMinds' cross-site pattern pool is the moat. Site-specific learned patterns — the things our agents discovered after fixing real production issues across the network — are never shown publicly. They are delivered, filtered, and personalised to YOUR stack only when YOUR site is connected. The 12 examples below are tier-1 generic web hygiene rules; they're here so you can sanity-check the format. The real value lives behind your API key.

Connect a site to see yours · Read our open spec (ARP)

Sample rules shown: 12
Categories: 2258
Tier-1 (public): 4,089
Tier-2 (your patterns): private to your site

model_compatibility

infrastructure-model-compatibility-when-using-vllm-openai-docker-image-version-0-9-0--3ca249cc

IF: When using vllm-openai Docker image version 0.9.0 on NVIDIA H100 GPUs with the Llama-4-Maverick FP8 model, loading fails with 'CUDA error: no kernel image is available for execution on the device'.

THEN: Downgrade to the vllm-openai Docker image version 0.8.5.post1 or earlier (e.g., v0.8.4). Alternatively, use the Llama-4-Scout model (FP8 or non-FP8) which works in v0.9.0. This issue appears to be specific to the Maverick architecture in v0.9.0 and is not present in prior releases.

Tier 1 · 70%

model_compatibility

ai-agents-model-compatibility-when-using-o1-preview-o1-mini-or-perplexity-models-167dac1f

IF: When using o1-preview, o1-mini, or Perplexity models that do not support the 'stop' parameter, crewAI's default call to litellm fails with 'Unsupported parameter: stop' BadRequestError.

THEN: Before passing parameters to litellm, check if the model supports the 'stop' parameter. If not (e.g., o1 series, Perplexity), remove 'stop' from the kwargs. This can be done by patching litellm.completion to delete the 'stop' key, or by updating crewAI's LLM class to conditionally omit the default stop=['\nObservation:'] for such models.

Tier 1 · 70%
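The check described in this rule can be sketched as a small helper that filters keyword arguments before they reach litellm. This is an illustration under stated assumptions and is not crewAI's or LiteLLM's own code. The marker list and both function names are hypothetical, and substring matching on model names is a rough heuristic.

```python
from typing import Any, Dict, Iterable

# Assumption: name fragments for models that reject 'stop', taken from the
# rule text (o1 series, Perplexity). Extend or replace for your own providers.
MODELS_WITHOUT_STOP = ("o1-preview", "o1-mini", "perplexity")


def supports_stop(model: str, markers: Iterable[str] = MODELS_WITHOUT_STOP) -> bool:
    """Return False when the model name contains a marker for a model without 'stop' support."""
    name = model.lower()
    return not any(marker in name for marker in markers)


def strip_unsupported_stop(model: str, kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with 'stop' removed if the model does not accept it."""
    cleaned = dict(kwargs)  # copy so the caller's dict is left untouched
    if not supports_stop(model):
        cleaned.pop("stop", None)
    return cleaned


params = {"temperature": 0.2, "stop": ["\nObservation:"]}
print(strip_unsupported_stop("o1-mini", params))  # {'temperature': 0.2}
print(strip_unsupported_stop("gpt-4o", params) == params)  # True
```

The cleaned dict would then be passed on in place of the original keyword arguments, for example when calling litellm.completion.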

model_compatibility

ai-agents-model-compatibility-when-loading-glm-4-5-fp8-or-similar-models-that-re-d501d8ab

IF: When loading GLM-4.5-FP8 or similar models that require embedding support, the UnquantizedLinearMethod class raises NotImplementedError because it lacks the 'embedding' method.

THEN: Apply the fix from PR #22257 (https://github.com/vllm-project/vllm/pull/22257) which adds the missing 'embedding' method to UnquantizedLinearMethod, or upgrade to a vLLM version that includes this fix (e.g., >0.10.0).

Tier 1 · 70%

model_compatibility

performance-model-compatibility-when-running-a-glm-4-5-fp8-model-with-vllm-0-10-0--75f31ebe

IF: When running a GLM-4.5-FP8 model with vLLM 0.10.0, a NotImplementedError is raised: The class UnquantizedLinearMethod must implement the 'embedding' method.

THEN: Apply the fix from PR #22257 on GitHub (https://github.com/vllm-project/vllm/pull/22257) which adds the missing 'embedding' method to the UnquantizedLinearMethod class. Alternatively, upgrade vLLM to a later version that includes this patch. Until resolved, avoid serving GLM-4.5-FP8 models with vLLM 0.10.0.

Tier 1 · 70%

model_compatibility

ai-agents-model-compatibility-when-using-glm-4-5-fp8-model-with-vllm-0-10-0-the--73ea2eb2

IF: When using the GLM-4.5-FP8 model with vLLM 0.10.0, the error 'UnquantizedLinearMethod must implement the embedding method' occurs.

THEN: Upgrade vLLM to a version that includes the fix from PR #22257, or apply the patch manually. Ensure the model's linear method implementation includes an embedding method for unquantized layers.

Tier 1 · 70%

model_compatibility

ai-agents-model-compatibility-upgrading-transformers-to-4-50-0-causes-florence2--9be2e8f0

IF: Upgrading transformers to 4.50.0 causes Florence2 and similar custom models to fail with ValueError: Unrecognized configuration class when using AutoModelForCausalLM.

THEN: Downgrade transformers to version 4.49.0 or wait for an upstream fix. As a temporary workaround, pin the version with 'pip install transformers==4.49.0'.

Tier 1 · 70%

model_compatibility

infrastructure-model-compatibility-pre-built-vllm-wheels-for-gpt-oss-only-support-sm9-a352a879

IF: Pre-built vllm wheels for gpt-oss only support sm90/sm100 (Hopper and Blackwell GPUs), causing failures on Ampere (A100, RTX 3090) and Ada Lovelace (L40S) GPUs.

THEN: Build vllm from source using the instructions in PR #22259, reinstall triton==3.4.0, and set the environment variable VLLM_ATTENTION_BACKEND=TRITON_ATTN_VLLM_V1. Note that even with this workaround, inference may fail with a CUDA kernel image error; official support is not yet available for these architectures.

Tier 1 · 70%

model_compatibility

ai-agents-model-compatibility-when-loading-the-fp8-quantized-version-of-the-qwen-c80ff580

IF: When loading the FP8 quantized version of the Qwen3-Next model (e.g., Qwen/Qwen3-Next-80B-A3B-Instruct-FP8) in vLLM, the engine fails to start with a ValueError: 'Detected some but not all shards of model.layers.0.linear_attn.in_proj are quantized. All shards of fused layers to have the same precision.'

THEN: Deploy the non-FP8 (BF16/FP16) version of the Qwen3-Next model instead. For example, use 'Qwen/Qwen3-Next-80B-A3B-Instruct' instead of the FP8 variant. Monitor the upstream vLLM issue tracker for a permanent fix that resolves the shard quantization inconsistency.

Tier 1 · 70%

model_compatibility

ai-agents-model-compatibility-claude-code-fails-with-a-500-error-when-using-a-ve-a280ac47

IF: Claude Code fails with a 500 error when using a Vercel AI Gateway model with thinking enabled, due to the unsupported 'thinking' parameter for non-Anthropic models.

THEN: Set 'litellm.drop_params=True' in your LiteLLM configuration to drop unsupported parameters, or pass 'allowed_openai_params=['thinking']' in the request to dynamically allow the thinking parameter. For the proxy, add 'drop_params: true' under 'litellm_settings' in your config.

Tier 1 · 70%

model_compatibility

ai-agents-model-compatibility-when-loading-a-model-with-unsupported-quantization-0d77b4aa

IF: When loading a model with an unsupported quantization type (e.g., fp8) using AutoModelForCausalLM.from_pretrained, a ValueError 'Unknown quantization type' occurs.

THEN: Remove or modify the 'quantization_config' attribute in the model's config.json file before loading. Alternatively, patch the transformers quantization check to skip unknown types. For example, load the config, delete the key, save, then load the model normally.

Tier 1 · 70%
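The "load the config, delete the key, save" step from this rule can be sketched with the standard library alone. The function name and the backup file name are assumptions made for illustration. The sketch only edits config.json, and whether the weights then load usefully depends on the checkpoint.

```python
import json
from pathlib import Path


def strip_quantization_config(model_dir: str, backup: bool = True) -> bool:
    """Remove 'quantization_config' from <model_dir>/config.json.

    Returns True if the key was present and removed, False otherwise.
    """
    config_path = Path(model_dir) / "config.json"
    original_text = config_path.read_text(encoding="utf-8")
    config = json.loads(original_text)

    if "quantization_config" not in config:
        return False  # nothing to change

    if backup:
        # Keep the untouched file next to the edited one so the change is reversible.
        config_path.with_name("config.json.bak").write_text(original_text, encoding="utf-8")

    del config["quantization_config"]
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return True
```

After the edit, the model directory would be loaded as usual, for example with AutoModelForCausalLM.from_pretrained pointed at a local copy of the checkpoint.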

model_compatibility

ai-agents-model-compatibility-loading-a-gemma3-model-for-text-only-purposes-fail-2bce3a01

IF: Loading a Gemma3 model for text-only purposes fails in Transformers v4.49.0 because the architecture is not recognized.

THEN: Install Transformers from the main branch (future v4.50) or wait for the official release that adds Gemma3 support to AutoModelForCausalLM.

Tier 1 · 70%

model_compatibility

infrastructure-model-compatibility-loading-a-bitsandbytes-4-bit-quantized-llama-model-52266386

IF: Loading a bitsandbytes 4-bit quantized Llama model (e.g., unsloth/Llama-3.3-70B-Instruct-bnb-4bit) in vLLM causes KeyError during weight loading due to unsupported parameter names like 'layers.0.mlp.down_proj.weight.absmax'.

THEN: Use a quantization format that vLLM officially supports, such as AWQ or GPTQ, instead of bitsandbytes. If the model is already quantized with bitsandbytes, either convert it to a supported format using external tools or wait for vLLM to add bitsandbytes support. Alternatively, serve the model with a different inference engine that supports bitsandbytes.

Tier 1 · 70%

Connect your site → query the full pool

What you see here is the public tier-1 slice. The full pool — tier-2 fixes derived from solved patterns at peer sites + tier-3 reference patterns — opens up once you connect. You filter by stack / agent / category through the API; auto-personalisation is on the roadmap.
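The API's request shape is not documented on this page, so the sketch below does not guess at endpoint parameters. It only illustrates the same three-way filter (stack, agent, category) applied locally to rule records. The Rule fields, the sample values, and the function name are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule record; the real pool schema is not shown in this document."""
    rule_id: str
    stack: str
    agent: str
    category: str


def filter_rules(
    rules: Sequence[Rule],
    stack: Optional[str] = None,
    agent: Optional[str] = None,
    category: Optional[str] = None,
) -> List[Rule]:
    """Keep rules matching every filter that was provided; None means 'no filter'."""
    def matches(rule: Rule) -> bool:
        return (
            (stack is None or rule.stack == stack)
            and (agent is None or rule.agent == agent)
            and (category is None or rule.category == category)
        )
    return [rule for rule in rules if matches(rule)]


# Illustrative records only; IDs and field values are made up.
sample = [
    Rule("r1", stack="vllm", agent="infrastructure", category="model_compatibility"),
    Rule("r2", stack="transformers", agent="ai-agents", category="model_compatibility"),
    Rule("r3", stack="vllm", agent="performance", category="model_compatibility"),
]
print([r.rule_id for r in filter_rules(sample, stack="vllm")])  # ['r1', 'r3']
```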
