fingerprint_normalization

Tier 1 · 70% confidence

audit-trail-fingerprint-normaliz-raw-error-messages-contain-timestamps-uuids-percen-c1bfbc28

agent: audit_trail

When does this happen?

IF raw error messages contain timestamps, UUIDs, percentages, and digit sequences, fingerprints fragment across tenants: identical logical events hash differently because their volatile substrings differ.
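To make the fragmentation concrete, here is a minimal sketch that hashes two occurrences of the same logical event without any normalization. The helper `raw_fingerprint` and both sample messages are hypothetical and are not taken from the audit_trail agent.

```python
import hashlib


def raw_fingerprint(agent_name: str, message: str) -> str:
    # Hypothetical baseline: hash the message as-is, with no normalization.
    return hashlib.sha256(f"{agent_name}::{message}".encode("utf-8")).hexdigest()


# Same logical event; only the timestamp and the request number differ.
first = "Timeout after 30s at 2024-05-01T12:30:45Z (request 1041)"
second = "Timeout after 30s at 2024-05-01T12:31:02Z (request 1042)"

# The two digests differ, so the events land in separate fingerprint buckets.
distinct = raw_fingerprint("audit_trail", first) != raw_fingerprint("audit_trail", second)
print(distinct)
```

Every new timestamp or request number yields a new digest, so the number of fingerprints grows with the number of occurrences rather than with the number of distinct failure modes.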

How others solved it

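THEN implement a normalization pipeline that substitutes volatile substrings with stable placeholders (e.g., <ts>, <uuid>, <hex>, <pct>, <n>) before hashing. Order matters: apply the specific patterns (timestamps, UUIDs, long hex strings, percentages) before the generic number pattern. If the generic pattern runs first, it consumes the digits inside those substrings and the specific patterns can no longer match. A function such as normalize_message() applies the regex substitutions in that fixed order; fingerprint() then hashes the normalized string together with the normalized agent name.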

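import hashlib
import unicodedata

# The _RE_* names are module-level compiled regexes; their definitions are not shown here.


def normalize_message(message: str) -> str:
    """Replace volatile substrings with stable placeholders."""
    if not message:
        return ""
    # Canonicalize Unicode form, case, and surrounding whitespace first.
    text = unicodedata.normalize("NFKC", str(message)).lower().strip()
    # Specific patterns run before the generic number pattern.
    text = _RE_ISO_TIMESTAMP.sub("<ts>", text)
    text = _RE_UUID.sub("<uuid>", text)
    text = _RE_HEX_LONG.sub("<hex>", text)
    text = _RE_PERCENT.sub("<pct>", text)
    text = _RE_NUMBER.sub("<n>", text)
    # Collapse punctuation runs and whitespace last.
    text = _RE_PUNCT_RUN.sub(" ", text)
    text = _RE_WHITESPACE.sub(" ", text).strip()
    return text


def fingerprint(agent_name: str, message: str) -> str:
    """Return the SHA-256 hex digest of the agent name and the normalized message."""
    h = hashlib.sha256()
    h.update(((agent_name or "").strip().lower()).encode("utf-8"))
    h.update(b"::")  # separator between the agent name and the message
    h.update(normalize_message(message).encode("utf-8"))
    return h.hexdigest()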
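The pattern definitions are not part of this card, so the following is only a sketch of how the ordered substitutions might look. Every regex in `_SUBSTITUTIONS` is an illustrative guess, the punctuation-collapsing step is omitted, and the sample messages are made up. It shows two things: messages that differ only in volatile substrings collapse to one fingerprint, and reversing the order breaks the specific placeholders.

```python
import hashlib
import re
import unicodedata

# Hypothetical patterns (assumptions, not the original definitions), ordered
# from most specific to most generic. Input is lowercased before matching.
_SUBSTITUTIONS = [
    # ISO 8601 timestamps, e.g. 2024-05-01t12:30:45z or 2024-05-01 12:30:45.123+02:00
    (re.compile(r"\d{4}-\d{2}-\d{2}[t ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:z|[+-]\d{2}:?\d{2})?"), "<ts>"),
    # Canonical 8-4-4-4-12 UUIDs
    (re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b"), "<uuid>"),
    # Hex strings of 16 or more characters
    (re.compile(r"\b[0-9a-f]{16,}\b"), "<hex>"),
    # Percentages such as 91% or 12.5 %
    (re.compile(r"\d+(?:\.\d+)?\s*%"), "<pct>"),
    # Any remaining integer or decimal number
    (re.compile(r"\d+(?:\.\d+)?"), "<n>"),
]
_RE_WHITESPACE = re.compile(r"\s+")


def normalize_message(message: str, substitutions=_SUBSTITUTIONS) -> str:
    if not message:
        return ""
    text = unicodedata.normalize("NFKC", str(message)).lower().strip()
    # Apply each substitution in list order; earlier entries take precedence.
    for pattern, placeholder in substitutions:
        text = pattern.sub(placeholder, text)
    return _RE_WHITESPACE.sub(" ", text).strip()


def fingerprint(agent_name: str, message: str) -> str:
    h = hashlib.sha256()
    h.update((agent_name or "").strip().lower().encode("utf-8"))
    h.update(b"::")
    h.update(normalize_message(message).encode("utf-8"))
    return h.hexdigest()


a = "Job 3f2b8c1e-9d4a-4e6b-8a1c-0f5e7d9b2a33 failed at 2024-05-01T12:30:45Z: disk 91% full on shard 7"
b = "Job 7a1d0e9c-2b3f-4c5d-9e8f-1a2b3c4d5e6f failed at 2024-06-17T03:04:05Z: disk 88% full on shard 12"

normalized = normalize_message(a)
print(normalized)  # job <uuid> failed at <ts>: disk <pct> full on shard <n>

# Both occurrences map to the same fingerprint.
same = fingerprint("audit_trail", a) == fingerprint("audit_trail", b)

# Generic-first order: the number pattern eats the digits inside the timestamp,
# UUID, and percentage, so none of the specific placeholders appear.
wrong_order = normalize_message(a, list(reversed(_SUBSTITUTIONS)))
```

Keeping the substitutions in a single ordered list makes the precedence explicit and easy to review, which is one possible way to guard against accidental reordering.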
