storage_serialization · Tier 1 · 70% confidence

infrastructure-storage-serializatio-storagecontext-persist-fails-with-typeerror-object-46ee27c3

agent: infrastructure

When does this happen?

IF StorageContext.persist fails with TypeError: Object of type float32 is not JSON serializable. This occurs when documents contain embeddings (e.g., from SentenceTransformers or InstructorEmbedding) that are numpy arrays or hold numpy float32 values, which Python's json module cannot encode.

How others solved it

THEN convert numpy arrays and other non-JSON-serializable types to native Python types before creating Document objects or before calling persist. For numpy arrays, call .tolist() on the embedding; for numpy scalar types, call .item(). In custom embedding classes such as InstructorEmbedding, modify the _embed method so that it converts each embedding to a Python list before returning.

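# Example fix for Document creation
# (assumes `model` is a loaded SentenceTransformer and `text` is the document string)
embedding = model.encode(text).tolist()  # Convert numpy array to a list of Python floats
document = Document(text=text, embedding=embedding)

# Example fix for InstructorEmbedding (inside the _embed method)
# Convert each numpy embedding to a plain list; leave existing lists untouched
embeddings = [emb.tolist() if hasattr(emb, "tolist") else emb for emb in embeddings]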
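
When numpy values are buried inside metadata or nested structures rather than sitting in a single embedding field, a recursive pass over the data before persisting may be easier than patching each call site. The following is a minimal sketch of that idea, not part of LlamaIndex: `to_native` and `FakeFloat32` are hypothetical names introduced here. It relies on the same duck-typing check as the snippet above (numpy arrays and numpy scalars both expose `tolist()`), and `FakeFloat32` stands in for a numpy scalar only so the sketch runs without numpy installed.

```python
import json
from typing import Any


def to_native(value: Any) -> Any:
    """Recursively replace numpy-like values with plain Python types."""
    # numpy arrays and numpy scalars both expose tolist(), which returns
    # nested lists of Python numbers (or a single Python number).
    if hasattr(value, "tolist"):
        return value.tolist()
    if isinstance(value, dict):
        return {key: to_native(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_native(item) for item in value]
    return value


class FakeFloat32:
    """Hypothetical stand-in for a numpy.float32 scalar (illustration only)."""

    def __init__(self, number: float) -> None:
        self._number = number

    def tolist(self) -> float:
        return float(self._number)


# A payload shaped like a document with an embedding of non-native scalars.
payload = {"text": "hello", "embedding": [FakeFloat32(0.25), FakeFloat32(0.5)]}

clean = to_native(payload)
serialized = json.dumps(clean)  # succeeds once every value is a native type
```

In real use, the same function would be applied to the embedding or metadata before constructing the Document, so that everything reaching persist is already JSON-friendly.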
