Data Structuring

Data structuring components transform unstructured LLM responses into organized, structured data formats.

Structured Output

This component transforms LLM responses into structured data formats.

In this example from the Financial Support Parser template, the Structured Output component transforms unstructured financial reports into structured data.

Structured output example

The connected LLM model is prompted by the Structured Output component's Format Instructions parameter to extract structured output from the unstructured text. Format Instructions is utilized as the system prompt for the Structured Output component.

In the Structured Output component, click the Open table button to view the Output Schema table. The Output Schema parameter defines the structure and data types for the model's output using a table with the following fields:

  • Name: The name of the output field.

  • Description: The purpose of the output field.

  • Type: The data type of the output field. The available types are str, int, float, bool, list, or dict. The default is text.

  • Multiple: This feature is deprecated. Currently, it is set to True by default if you expect multiple values for a single field. For example, a list of features is set to True to contain multiple values, such as ["waterproof", "durable", "lightweight"]. Default: True.

The Parse DataFrame component parses the structured output into a template for orderly presentation in chat output. The template receives the values from the output_schema table with curly braces.

For example, the template EBITDA: {EBITDA} , Net Income: {NET_INCOME} , GROSS_PROFIT: {GROSS_PROFIT} presents the extracted values in the Playground as EBITDA: 900 million , Net Income: 500 million , GROSS_PROFIT: 1.2 billion.

Inputs

Name
Display Name
Info

llm

Language Model

The language model to use to generate the structured output.

input_value

Input Message

The input message to the language model.

system_prompt

Format Instructions

Instructions to the language model for formatting the output.

schema_name

Schema Name

The name for the output data schema.

output_schema

Output Schema

Defines the structure and data types for the model's output.

multiple

Generate Multiple

[Deprecated] Always set to True.

Outputs

Name
Display Name
Info

structured_output

Structured Output

The structured output is a Data object based on the defined schema.

structured_output_dataframe

DataFrame

The structured output converted to a DataFrame format.

Usage Notes

  • Schema Definition: Define custom schemas with multiple fields and data types

  • Type Safety: Enforce specific data types for each field in the output

  • Template Integration: Works seamlessly with Parser components for presentation

  • LLM Agnostic: Compatible with any language model component

  • Multiple Formats: Outputs both Data objects and DataFrames for flexibility

Last updated