embedding_serialization · Tier 1 · 70% confidence

agent: ai_agents

When does this happen?

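IF You use an embedding model that returns numpy.float32 arrays (e.g., SentenceTransformers, InstructorEmbedding) and call `storage_context.persist()`. The call fails with `TypeError: Object of type float32 is not JSON serializable`, because Python's `json` module only serializes native types such as `float` and `list`, not numpy scalars or arrays.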

How others solved it

THEN Convert numpy types to native Python types before persisting. Either make the embedding model return plain lists (e.g., call `.tolist()` on each embedding) or apply a custom JSON serializer that handles numpy dtypes. For InstructorEmbedding, override the `_embed` method so that it returns lists of Python floats:

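# Patch InstructorEmbedding to return Python floats
import numpy as np
# The import path depends on the installed llama_index version.
from llama_index.embeddings import InstructorEmbedding

class PatchedInstructorEmbedding(InstructorEmbedding):
    def _embed(self, instruct_sentence_pairs):
        # The parent may return numpy arrays of float32 values.
        embeddings = super()._embed(instruct_sentence_pairs)
        # np.asarray accepts arrays and plain lists alike; .tolist() then
        # yields nested lists of native Python floats that json can serialize.
        return [np.asarray(emb).tolist() for emb in embeddings]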
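
The other option mentioned above, a custom JSON serializer, could look roughly like the sketch below. It relies on the `default=` hook of the standard `json.dumps`. The names `numpy_safe_default` and `dumps_numpy_safe` are hypothetical, and the helper is duck-typed on `.tolist()` (which numpy scalars and arrays both provide) so that it does not import numpy itself. How to route `storage_context.persist()` through such a serializer is not covered by the source, so this only applies to JSON you write yourself.

```python
import json
from typing import Any


def numpy_safe_default(obj: Any) -> Any:
    """Fallback for json.dumps: convert numpy-like values to native types.

    json.dumps calls this only for objects it cannot serialize on its own.
    numpy scalars (e.g. numpy.float32) and numpy.ndarray both expose
    .tolist(), which returns native Python floats / nested lists.
    """
    tolist = getattr(obj, "tolist", None)
    if callable(tolist):
        return tolist()
    # Keep the standard behaviour for anything we do not recognise.
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


def dumps_numpy_safe(payload: Any) -> str:
    """Serialize payload to JSON, tolerating numpy-like values inside it."""
    return json.dumps(payload, default=numpy_safe_default)
```

With numpy installed, `dumps_numpy_safe({"embedding": np.zeros(3, dtype=np.float32)})` would go through the `.tolist()` branch instead of raising the `TypeError` described above. Converting at the embedding model, as in the patch above, is usually simpler when you control that class; the serializer is a fallback for values that reach JSON encoding unconverted.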
