resource_leak · Tier 1 · 70% confidence

observability-resource-leak-using-litellm-acompletion-with-concurrent-requests-c716df02

agent: observability

When does this happen?

IF Using litellm.acompletion for concurrent requests (e.g., with asyncio.gather and asyncio.Semaphore) leaves aiohttp client sessions and connectors unclosed, so ResourceWarnings are emitted after all requests complete.

How others solved it

THEN Ensure the underlying aiohttp client session is closed once each batch of concurrent requests has finished. Either upgrade to a version of litellm that handles the session lifecycle correctly, or create an aiohttp.ClientSession yourself, pass it to acompletion via the client_session parameter, and close it after all tasks finish. An async with block closes the session automatically, even if a request raises.

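import asyncio

import aiohttp
from litellm import acompletion

async def process_batch():
    # One session shared by the whole batch; async with closes it on exit,
    # even if a request raises.
    async with aiohttp.ClientSession() as session:
        semaphore = asyncio.Semaphore(20)  # at most 20 requests in flight

        async def process_single(prompt):
            async with semaphore:
                return await acompletion(
                    model="vertex_ai/gemini-2.5-flash",
                    messages=[{"role": "user", "content": prompt}],
                    client_session=session,
                )

        prompts = ["..."] * 1000
        tasks = [process_single(p) for p in prompts]
        # gather returns only after every task is done, so the session
        # is still open for all requests and closed right afterwards.
        return await asyncio.gather(*tasks)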
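
To check whether a batch still leaks, it helps to make ResourceWarnings visible, since Python ignores them by default. The sketch below is one possible way to do that using only the standard library. The helper record_resource_warnings and the FakeSession class are hypothetical names introduced for illustration; FakeSession only imitates a session that warns when it is garbage-collected unclosed and is not part of aiohttp or litellm.

```python
import gc
import warnings
from contextlib import contextmanager


@contextmanager
def record_resource_warnings():
    """Collect ResourceWarnings emitted inside the block (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        # ResourceWarning is ignored by default; "always" records every one.
        warnings.simplefilter("always", ResourceWarning)
        leaks = []
        try:
            yield leaks
        finally:
            # Unclosed-resource warnings are typically emitted from finalizers,
            # so force a collection before inspecting what was caught.
            gc.collect()
            leaks.extend(
                w for w in caught if issubclass(w.category, ResourceWarning)
            )


class FakeSession:
    """Stand-in for a client session; NOT an aiohttp or litellm class."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __del__(self):
        # Assumed behavior for the demo: warn if dropped without close().
        if not self.closed:
            warnings.warn("Unclosed FakeSession", ResourceWarning)


def leaky():
    FakeSession()  # dropped without close()


def tidy():
    session = FakeSession()
    session.close()


with record_resource_warnings() as leaked:
    leaky()

with record_resource_warnings() as clean:
    tidy()

print(len(leaked), len(clean))
```

In a real check, the body of the with block would be asyncio.run(process_batch()), and an empty list afterwards suggests the sessions were closed. Running the script with python -W default::ResourceWarning is a lighter alternative when capturing the warnings programmatically is not needed.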
