Processing
Processing components process and transform data within a flow.
The Split Text processing component in this flow splits the incoming data into chunks to be embedded into the vector store component.
The component offers control over chunk size, overlap, and separator, which affect context and granularity in vector store retrieval results.
The component iterates through the input list of data objects, merging them into a single data object. If the input list is empty, it returns an empty data object. If there's only one input data object, it returns that object unchanged. The merging process uses the addition operator to combine data objects.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | A list of data objects to be merged. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| merged_data | Merged Data | |
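The merge behavior described above can be sketched in plain Python. This is a hypothetical, dict-based stand-in for Langflow's Data objects, not the component's actual implementation:

```python
def merge_data(data_list: list[dict]) -> dict:
    """Merge a list of data objects into one, mimicking the component's logic."""
    if not data_list:        # empty input list -> empty data object
        return {}
    if len(data_list) == 1:  # single input -> returned unchanged
        return data_list[0]
    merged = {}
    for item in data_list:   # combine objects pairwise, like the addition operator
        merged = {**merged, **item}
    return merged
```

With dicts, later items overwrite earlier keys on collision; the real component's addition operator governs how conflicting fields combine.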
This component concatenates two text sources into a single text chunk using a specified delimiter.
To use this component in a flow, connect two components that output Messages to the Combine Text component's First Text and Second Text inputs. This example uses two Text Input components.
In the Text fields of both Text Input components, enter some text to combine.
In the Combine Text component, enter an optional Delimiter value. The delimiter character separates the combined texts. This example uses \n\n **end first text** \n\n **start second text** \n to label the texts and create newlines between them.
Connect a Chat Output component to view the text combination.
Click Playground, and then click Run Flow. The combined text appears in the Playground.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| first_text | First Text | The first text input to concatenate. |
| second_text | Second Text | The second text input to concatenate. |
| delimiter | Delimiter | A string used to separate the two text inputs. Defaults to a space. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| message | Message | |
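Functionally, the component is a delimiter join; a minimal sketch (not Langflow's actual code):

```python
def combine_text(first_text: str, second_text: str, delimiter: str = " ") -> str:
    """Concatenate two text inputs with a delimiter (defaults to a space)."""
    return delimiter.join([first_text, second_text])
```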
This example fetches JSON data from an API. The Lambda filter component extracts and flattens the results into a tabular DataFrame. The DataFrame Operations component can then work with the retrieved data.
The API Request component retrieves data with only source and result fields. For this example, the desired data is nested within the result field.
Connect a Lambda Filter to the API request component, and a Language model to the Lambda Filter. This example connects a Groq model component.
In the Groq model component, add your Groq API key.
To filter the data, in the Lambda filter component, in the Instructions field, use natural language to describe how the data should be filtered. For this example, enter:
I want to explode the result column out into a Data object
Avoid punctuation in the Instructions field, as it can cause errors.
To run the flow, in the Lambda Filter component, click the run button.
To inspect the filtered data, in the Lambda Filter component, click the inspect icon. The result is a structured DataFrame.
Add the DataFrame Operations component, and a Chat Output component to the flow.
In the DataFrame Operations component, in the Operation field, select Filter.
To apply a filter, in the Column Name field, enter a column to filter on. This example filters by name.
Click Playground, and then click Run Flow. The flow extracts the values from the name column.
| name |
|------|
| Emily Johnson |
| Michael Williams |
| John Smith |
| ... |
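The flattening the Lambda filter performs here is equivalent to normalizing the nested column with pandas. A sketch, assuming a hypothetical response whose `result` field holds a list of records:

```python
import pandas as pd

# Hypothetical API response with the desired data nested in "result".
response = {
    "source": "example-api",
    "result": [
        {"name": "Emily Johnson", "email": "emily@example.com"},
        {"name": "Michael Williams", "email": "michael@example.com"},
    ],
}

# Explode the "result" list into one row per record; dict keys become columns.
df = pd.json_normalize(response["result"])
print(df["name"].tolist())  # ['Emily Johnson', 'Michael Williams']
```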
| Operation | Description | Required Inputs |
|-----------|-------------|-----------------|
| Add Column | Adds a new column with a constant value | new_column_name, new_column_value |
| Drop Column | Removes a specified column | column_name |
| Filter | Filters rows based on column value | column_name, filter_value |
| Head | Returns first n rows | num_rows |
| Rename Column | Renames an existing column | column_name, new_column_name |
| Replace Value | Replaces values in a column | column_name, replace_value, replacement_value |
| Select Columns | Selects specific columns | columns_to_select |
| Sort | Sorts DataFrame by column | column_name, ascending |
| Tail | Returns last n rows | num_rows |
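Most of these operations correspond to one-line pandas calls; a sketch of a few of them on a toy DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ann", "Bob", "Cal"], "age": [34, 28, 45]})

filtered = df[df["name"] == "Bob"]                 # Filter: rows where column equals value
head = df.head(2)                                  # Head: first n rows
renamed = df.rename(columns={"age": "years"})      # Rename Column
sorted_df = df.sort_values("age", ascending=True)  # Sort
selected = df[["name"]]                            # Select Columns
```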
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| df | DataFrame | The input DataFrame to operate on. |
| operation | Operation | Select the DataFrame operation to perform. Options: Add Column, Drop Column, Filter, Head, Rename Column, Replace Value, Select Columns, Sort, Tail. |
| column_name | Column Name | The column name to use for the operation. |
| filter_value | Filter Value | The value to filter rows by. |
| ascending | Sort Ascending | Whether to sort in ascending order. |
| new_column_name | New Column Name | The new column name when renaming or adding a column. |
| new_column_value | New Column Value | The value to populate the new column with. |
| columns_to_select | Columns to Select | List of column names to select. |
| num_rows | Number of Rows | Number of rows to return (for head/tail). Default: 5. |
| replace_value | Value to Replace | The value to replace in the column. |
| replacement_value | Replacement Value | The value to replace with. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| output | DataFrame | The resulting DataFrame after the operation. |
To view the flow's output, connect a Chat Output component to the Data to DataFrame component.
Send a POST request to the Webhook containing your JSON data. Replace YOUR_FLOW_ID with your flow ID. This example uses the default Langflow server address.
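A sketch of such a request with Python. The `/api/v1/webhook/` path and the `localhost:7860` address are assumptions based on a default local Langflow install, and `YOUR_FLOW_ID` stays a placeholder:

```python
import json

# Hypothetical employee payload; any JSON body works.
payload = {"name": "Emily Johnson", "role": "Engineer"}

# Assumed default local Langflow address and webhook path.
url = "http://localhost:7860/api/v1/webhook/YOUR_FLOW_ID"
body = json.dumps(payload)

# Sending is sketched only; uncomment with a running Langflow server:
# import requests
# requests.post(url, data=body, headers={"Content-Type": "application/json"})
```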
In the Playground, view the output of your flow. The Data to DataFrame component converts the webhook request into a DataFrame, with text and data fields as columns.
Send another employee data object.
In the Playground, this request is also converted to a DataFrame.
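The conversion can be sketched with pandas: each data object becomes one row, its fields become columns, and its text lands in a text column (dicts stand in for Langflow's Data objects here):

```python
import pandas as pd

# Stand-ins for Data objects: a "text" field plus arbitrary data fields.
records = [
    {"text": "first request", "name": "Emily Johnson"},
    {"text": "second request", "name": "Michael Williams"},
]

df = pd.DataFrame(records)
print(df.columns.tolist())  # ['text', 'name']
```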
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data_list | Data or Data List | One or multiple Data objects to transform into a DataFrame. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| dataframe | DataFrame | A DataFrame built from each Data object's fields plus a 'text' column. |
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | Data object to filter. |
| filter_criteria | Filter Criteria | List of keys to filter by. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| filtered_data | Filtered Data | |
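Key-based filtering reduces to a dict comprehension; a minimal sketch (not the component's actual code):

```python
def filter_data(data: dict, filter_criteria: list[str]) -> dict:
    """Keep only the key-value pairs whose key is listed in filter_criteria."""
    return {key: value for key, value in data.items() if key in filter_criteria}
```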
The Filter values component filters a list of data items based on a specified key, filter value, and comparison operator.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| input_data | Input data | The list of data items to filter. |
| filter_key | Filter Key | The key to filter on, for example, 'route'. |
| filter_value | Filter Value | The value to filter by, for example, 'CMIP'. |
| operator | Comparison Operator | The operator to apply for comparing the values. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| filtered_data | Filtered data | The resulting list of filtered data items. |
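The key/value/operator combination can be sketched with Python's operator module. The operator names used here are illustrative assumptions; the component's actual operator list may differ:

```python
import operator

# Hypothetical operator names mapped to comparison functions.
OPERATORS = {
    "equals": operator.eq,
    "not equals": operator.ne,
    "contains": lambda a, b: b in a,
}

def filter_values(items: list[dict], filter_key: str, filter_value, op: str) -> list[dict]:
    """Keep items whose filter_key value satisfies the chosen comparison."""
    compare = OPERATORS[op]
    return [item for item in items if compare(item.get(filter_key), filter_value)]

routes = [{"route": "CMIP"}, {"route": "OTHER"}]
print(filter_values(routes, "route", "CMIP", "equals"))  # [{'route': 'CMIP'}]
```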
This component uses an LLM to generate a Lambda function for filtering or transforming structured data.
This example gets JSON data from the https://jsonplaceholder.typicode.com/users API endpoint. The Instructions field in the Lambda filter component specifies the task: extract emails. The connected LLM creates a filter based on the instructions, and successfully extracts a list of email addresses from the JSON data.
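For the extract emails instruction, the generated function would be something like the one-liner below (a sketch; the actual LLM-generated code varies from run to run):

```python
# Sample records shaped like the jsonplaceholder /users data.
users = [
    {"name": "Leanne Graham", "email": "Sincere@april.biz"},
    {"name": "Ervin Howell", "email": "Shanna@melissa.tv"},
]

# A Lambda function of the kind the LLM might generate for "extract emails".
extract_emails = lambda data: [item["email"] for item in data]
print(extract_emails(users))  # ['Sincere@april.biz', 'Shanna@melissa.tv']
```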
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | The structured data to filter or transform using a Lambda function. |
| llm | Language Model | |
| filter_instruction | Instructions | Natural language instructions for how to filter or transform the data using a Lambda function, such as: Filter the data to only include items where the 'status' is 'active'. |
| sample_size | Sample Size | For large datasets, the number of characters to sample from the dataset head and tail. |
| max_size | Max Size | The number of characters for the data to be considered "large", which triggers sampling by the sample_size value. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| filtered_data | Filtered Data | |
| dataframe | DataFrame | |
This component routes requests to the most appropriate LLM based on OpenRouter model specifications.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| models | Language Models | List of LLMs to route between. |
| input_value | Input | The input message to be routed. |
| judge_llm | Judge LLM | LLM that will evaluate and select the most appropriate model. |
| optimization | Optimization | Optimization preference (quality/speed/cost/balanced). |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| output | Output | The response from the selected model. |
| selected_model | Selected Model | Name of the chosen model. |
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| message | Message | |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | |
This component formats DataFrame or Data objects into text using templates, with an option to convert inputs directly to strings using stringify.
To use this component, create variables for values in the template the same way you would in a Prompt component. For DataFrames, use column names, for example Name: {Name}. For Data objects, use {text}.
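Template application amounts to str.format applied per row; a sketch assuming a pandas DataFrame with a Name column (a simplified stand-in, not the component's implementation):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Emily Johnson", "Michael Williams"]})

template = "Name: {Name}"  # column names become template variables
sep = "\n"                 # rows are joined with the separator

parsed_text = sep.join(template.format(**row) for row in df.to_dict(orient="records"))
print(parsed_text)
```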
To use the Parser component with a Structured Output component, do the following:
Connect a Structured Output component's DataFrame output to the Parser component's DataFrame input.
Connect the File component to the Structured Output component's Message input.
Connect the OpenAI model component's Language Model output to the Structured Output component's Language Model input.
The flow looks like this:
In the Structured Output component, click Open Table. This opens a pane for structuring your table. The table contains the rows Name, Description, Type, and Multiple.
Create a table that maps to the data you're loading from the File loader. For example, to create a table for employees, you might have the rows id, name, and email, all of type string.
In the Template field of the Parser component, enter a template for parsing the Structured Output component's DataFrame output into structured text. Create variables for values in the template the same way you would in a Prompt component. For example, to present a table of employees in Markdown:
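A hypothetical template using the example columns (the component applies the template once per row, joining rows with the separator; substitute your own column names):

```text
| {id} | {name} | {email} |
```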
To run the flow, in the Parser component, click the run button.
To view your parsed text, in the Parser component, click the inspect icon.
Optionally, connect a Chat Output component, and open the Playground to see the output.
For an additional example of using the Parser component to format a DataFrame from a Structured Output component, see the Market Research template flow.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| mode | Mode | Tab selection between "Parser" and "Stringify" modes. "Stringify" converts input to a string instead of using a template. |
| pattern | Template | Template for formatting using variables in curly brackets. For DataFrames, use column names, such as Name: {Name}. For Data objects, use {text}. |
| input_data | Data or DataFrame | The input to parse - accepts either a DataFrame or Data object. |
| sep | Separator | String used to separate rows/items. Default: newline. |
| clean_data | Clean Data | When stringify is enabled, cleans data by removing empty rows and lines. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| parsed_text | Parsed Text | |
This component splits text into chunks based on specified criteria. It's ideal for chunking data to be tokenized and embedded into vector databases.
The Split Text component outputs Chunks or DataFrame. The Chunks output returns a list of individual text chunks. The DataFrame output returns a structured data format, with additional text and metadata columns applied.
In the Split Text component, define your data splitting parameters.
This example splits incoming JSON data at the separator `},` so each chunk contains one JSON object.
The order of precedence is Separator, then Chunk Size, and then Chunk Overlap. If any segment after separator splitting is longer than chunk_size, it is split again to fit within chunk_size.
After chunk_size, Chunk Overlap is applied between chunks to maintain context.
Connect a Chat Output component to the Split Text component's DataFrame output to view its output.
Click Playground, and then click Run Flow. The output contains a table of JSON objects split at `},`.
Clear the Separator field, and then run the flow again. Instead of JSON objects, the output contains 50-character lines of text with 10 characters of overlap.
- First chunk: `"title": "Introduction to Artificial Intelligence""`
- Second chunk: `"elligence", "body": "Learn the basics of Artif"`
- Third chunk: `"s of Artificial Intelligence and its applications"`
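The precedence described above (separator first, then chunk size with overlap) can be sketched as follows. This is a simplified stand-in for illustration, not Langflow's actual splitter:

```python
def split_text(text: str, separator: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split on the separator first; re-split any oversized segment with overlap."""
    segments = text.split(separator) if separator else [text]
    chunks = []
    for segment in segments:
        if len(segment) <= chunk_size:
            chunks.append(segment)
        else:
            # Step forward by chunk_size minus overlap so chunks share context.
            step = chunk_size - chunk_overlap
            for start in range(0, len(segment), step):
                chunks.append(segment[start:start + chunk_size])
    return [chunk for chunk in chunks if chunk]
```

With no separator, a 120-character text at chunk_size 50 and overlap 10 yields chunks of 50, 50, and 40 characters, each chunk repeating the last 10 characters of the previous one.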
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data_inputs | Input Documents | |
| chunk_overlap | Chunk Overlap | The number of characters to overlap between chunks. Default: 200. |
| chunk_size | Chunk Size | The maximum number of characters in each chunk. Default: 1000. |
| separator | Separator | The character to split on. Default: newline. |
| text_key | Text Key | The key to use for the text column (advanced). Default: text. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| chunks | Chunks | |
| dataframe | DataFrame | |
This component dynamically updates or appends data with specified fields.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| old_data | Data | The records to update. |
| number_of_fields | Number of Fields | Number of fields to add (max 15). |
| text_key | Text Key | Key for text content. |
| text_key_validator | Text Key Validator | Validates text key presence. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | |
Legacy components are available to use but no longer supported.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| input_value | Input | Objects to which Metadata should be added. |
| text_in | User Text | |
| metadata | Metadata | Metadata to add to each object. |
| remove_fields | Fields to Remove | Metadata fields to remove. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | List of Input objects, each with added metadata. |
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| number_of_fields | Number of Fields | The number of fields to be added to the record. |
| text_key | Text Key | Key that identifies the field to be used as the text content. |
| text_key_validator | Text Key Validator | If enabled, checks if the given Text Key is present in the given Data. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | |
The ParseData component converts data objects into plain text using a specified template. This component transforms structured data into human-readable text formats, allowing for customizable output through the use of templates.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | The data to convert to text. |
| template | Template | The template to use for formatting the data. It can contain the keys {text}, {data}, or any other key in the data. |
| sep | Separator | The separator to use between multiple data items. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| text | Text | The resulting formatted text string as a Message object. |
This component converts DataFrames into plain text using templates.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| df | DataFrame | The DataFrame to convert to text rows. |
| template | Template | Template for formatting (use {column_name} placeholders). |
| sep | Separator | String to join rows in output. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| text | Text | All rows combined into a single text. |
This component converts and extracts JSON fields using JQ queries.
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| input_value | Input | |
| query | JQ Query | JQ Query to filter the data. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| filtered_data | Filtered Data | |
Inputs

| Name | Display Name | Info |
|------|--------------|------|
| data_list | Data List | List of data to select from. |
| data_index | Data Index | Index of the data to select. |

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| selected_data | Selected Data | |
This component combines multiple data sources into a single unified Data object.
A single Data object containing the combined information from all input data objects.
A Message object containing the combined text.
This component performs operations on DataFrame rows and columns.
To use this component in a flow, connect a component that outputs a DataFrame to the DataFrame Operations component.
This component can perform the following operations on Pandas DataFrames.
This component converts one or multiple Data objects into a DataFrame. Each Data object corresponds to one row in the resulting DataFrame. Fields from the .data attribute become columns, and the .text field (if present) is placed in a 'text' column.
To use this component in a flow, connect a component that outputs Data to the Data to DataFrame component's input. This example connects a Webhook component to convert text and data into a DataFrame.
This component filters a Data object based on a list of keys.
A new Data object containing only the key-value pairs that match the filter criteria.
To use the Lambda filter component, you must connect it to a Language Model component, which the Lambda filter component uses to generate a function based on the natural language instructions in the Instructions field.
The connection port for a Language Model component.
The filtered or transformed Data object.
The filtered data as a DataFrame.
This component converts Message objects to Data objects.
The Message object to convert to a Data object.
The converted Data object.
The resulting formatted text as a Message object.
To use this component in a flow, connect a component that outputs Data to the Split Text component's Data port. This example uses the URL component, which is fetching JSON placeholder data.
The data to split. The component accepts Data or DataFrame objects.
List of split text chunks as Data objects.
List of split text chunks as a DataFrame.
Updated Data objects.
This component modifies metadata of input objects. It can add new metadata, update existing metadata, and remove specified metadata fields. The component works with both Message and Data objects, and can also create a new Data object from user-provided text.
Text input; the value will be in the 'text' attribute of the Data object. Empty text entries are ignored.
This component dynamically creates a Data object with a specified number of fields.
A Data object created with the specified fields and text key.
Data object to filter (Message or Data).
Filtered data as a list of Data objects.
This component selects a single item from a list.
The selected Data object.