Normandy.LLM.Json.ContentCleaner (normandy v1.3.0)

Cleans raw LLM output into a parseable JSON string by stripping Markdown code fences and trimming surrounding whitespace. extract_balanced/1 is the fallback for JSON embedded in prose (implemented in the hardening phase).

Functions

clean(content)

Strip code fences and trim. Non-binary content passes through unchanged.
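
To make the behaviour described above concrete, here is a minimal sketch of fence stripping. FenceSketch and its helper drop_info_string/1 are hypothetical names used only for illustration; this is not Normandy's implementation, and the real clean/1 may treat edge cases (nested fences, text outside the fence) differently.

```elixir
defmodule FenceSketch do
  # Hypothetical stand-in for illustration; not Normandy's implementation.
  # The fence marker is built at compile time to keep the source readable.
  @fence String.duplicate("`", 3)

  # Binary content: trim, then unwrap a single surrounding code fence if present.
  def clean(content) when is_binary(content) do
    trimmed = String.trim(content)
    size = byte_size(trimmed)

    if size >= 6 and String.starts_with?(trimmed, @fence) and
         String.ends_with?(trimmed, @fence) do
      trimmed
      |> binary_part(3, size - 6)
      |> drop_info_string()
      |> String.trim()
    else
      trimmed
    end
  end

  # Non-binary content passes through unchanged.
  def clean(content), do: content

  # Drops the first line after the opening fence when it looks like a
  # language tag such as "json" (or is empty); otherwise keeps everything.
  defp drop_info_string(inner) do
    case String.split(inner, "\n", parts: 2) do
      [tag, rest] ->
        if Regex.match?(~r/\A[\w+-]*\z/, String.trim(tag)), do: rest, else: inner

      [_single_line] ->
        inner
    end
  end
end
```

In this sketch a fenced payload tagged json comes back as the bare JSON text, unfenced input is only trimmed, and values such as nil or a map are returned as given.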

extract_balanced(content)

@spec extract_balanced(binary()) :: {:ok, binary()} | :error

Locate the first balanced JSON object/array within surrounding prose.
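
One way to picture "balanced" extraction is a bracket-matching scan that ignores brackets inside JSON string literals. The sketch below is an assumption about the general technique, not Normandy's actual scanner; BalancedSketch, extract/1, and the private helpers are hypothetical names.

```elixir
defmodule BalancedSketch do
  # Hypothetical stand-in for illustration; Normandy's real scanner may differ.
  @spec extract(binary()) :: {:ok, binary()} | :error
  def extract(content) when is_binary(content), do: find_open(content, 0)

  # Walk forward until an opening brace or bracket is found.
  defp find_open(content, pos) when pos >= byte_size(content), do: :error

  defp find_open(content, pos) do
    case :binary.at(content, pos) do
      c when c in [?{, ?[] ->
        case scan(content, pos + 1, [closer(c)], false) do
          {:ok, stop} -> {:ok, binary_part(content, pos, stop - pos)}
          # This opener never balanced; try the next candidate.
          :error -> find_open(content, pos + 1)
        end

      _ ->
        find_open(content, pos + 1)
    end
  end

  # scan/4 tracks a stack of expected closers and whether we are inside a string.
  defp scan(content, pos, _stack, _in_string) when pos >= byte_size(content), do: :error

  defp scan(content, pos, stack, true) do
    case :binary.at(content, pos) do
      # Skip the escaped byte so an escaped quote does not end the string.
      ?\\ -> scan(content, pos + 2, stack, true)
      ?\" -> scan(content, pos + 1, stack, false)
      _ -> scan(content, pos + 1, stack, true)
    end
  end

  defp scan(content, pos, stack, false) do
    case {:binary.at(content, pos), stack} do
      {?\", _} -> scan(content, pos + 1, stack, true)
      {c, _} when c in [?{, ?[] -> scan(content, pos + 1, [closer(c) | stack], false)
      # Last expected closer matched: the region ends just after this byte.
      {c, [c]} -> {:ok, pos + 1}
      {c, [c | rest]} -> scan(content, pos + 1, rest, false)
      # A closer that does not match the expected one: not balanced.
      {c, _} when c in [?}, ?]] -> :error
      _ -> scan(content, pos + 1, stack, false)
    end
  end

  defp closer(?{), do: ?}
  defp closer(?[), do: ?]
end
```

Tracking string state matters because a brace inside a JSON string value would otherwise close the region early. A balanced region is not necessarily valid JSON, so the result still has to be decoded.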

extract_balanced(content, from)

@spec extract_balanced(binary(), non_neg_integer()) ::
  {:ok, binary(), non_neg_integer()} | :error

Like extract_balanced/1, but begins scanning at the byte offset given by from and also returns the byte offset at which the located region starts. This lets a caller iterate over successive balanced regions: when a region fails to decode, pass start + 1 as the next from to try the following one.
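
The retry pattern described above can be sketched as a small loop. RetrySketch and first_decodable/4 are hypothetical names; the extractor and decoder are passed in as functions so the sketch does not assume anything beyond the return shapes in the spec above. It also assumes the extractor returns :error once from runs past the end of the content, which the documentation does not state.

```elixir
defmodule RetrySketch do
  # Hypothetical helper for illustration; not part of Normandy.
  #
  # extract: (binary, non_neg_integer -> {:ok, binary, non_neg_integer} | :error)
  # decode:  (binary -> {:ok, term} | {:error, term})
  def first_decodable(content, extract, decode, from \\ 0) do
    case extract.(content, from) do
      {:ok, region, start} ->
        case decode.(region) do
          {:ok, value} ->
            {:ok, value}

          {:error, _reason} ->
            # The region was balanced but did not decode; resume just past
            # its first byte so the next candidate region is considered.
            first_decodable(content, extract, decode, start + 1)
        end

      :error ->
        # No further balanced region: give up.
        :error
    end
  end
end
```

A caller would presumably pass &Normandy.LLM.Json.ContentCleaner.extract_balanced/2 as the extractor and a JSON decoder with an {:ok, term} | {:error, term} return, such as &Jason.decode/1, as the decoder. Because each retry starts at start + 1, a failed outer region does not hide a valid region nested inside it.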