File Data

File data components load content from local files and directories and support a range of file formats.

File

This component loads and parses files of various supported formats and converts the content into a Data object. It supports multiple file types and provides options for parallel processing and error handling.

To load a document, follow these steps:

  1. Click the Select files button.

  2. Select a local file or a file loaded with File management, and then click Select file.

The loaded file name appears in the component.

Inputs

| Name | Display Name | Info |
|------|--------------|------|
| path | Files | Path to file(s) to load. Supports individual files or bundled archives. |
| file_path | Server File Path | Data object with a file_path property pointing to the server file or a Message object with a path to the file. Supersedes 'Path' but supports the same file types. |
| separator | Separator | Specify the separator to use between multiple outputs in Message format. |
| silent_errors | Silent Errors | If true, errors do not raise an exception. |
| delete_server_file_after_processing | Delete Server File After Processing | If true, the Server File Path is deleted after processing. |
| ignore_unsupported_extensions | Ignore Unsupported Extensions | If true, files with unsupported extensions are not processed. |
| ignore_unspecified_files | Ignore Unspecified Files | If true, Data with no file_path property is ignored. |
| use_multithreading | [Deprecated] Use Multithreading | Set 'Processing Concurrency' greater than 1 to enable multithreading. This option is deprecated. |
| concurrency_multithreading | Processing Concurrency | When multiple files are being processed, the number of files to process concurrently. Default is 1. Values greater than 1 enable parallel processing for 2 or more files. |
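The interaction between Processing Concurrency, Silent Errors, and Separator can be illustrated with a minimal sketch. This is not the component's actual implementation: `read_text`, `load_files`, and `join_as_message` are hypothetical helpers, and plain UTF-8 reading stands in for real format parsing.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_text(path):
    """Read one file as UTF-8 text (a stand-in for real format parsing)."""
    return Path(path).read_text(encoding="utf-8")


def load_files(paths, concurrency=1, silent_errors=False):
    """Load files, optionally in parallel, returning one result per path.

    With silent_errors=True, a file that fails to load yields None
    instead of raising an exception.
    """
    def safe_read(path):
        try:
            return read_text(path)
        except Exception:
            if silent_errors:
                return None
            raise

    paths = list(paths)
    # Threads are only used with concurrency > 1 and at least 2 files.
    if concurrency > 1 and len(paths) > 1:
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            # pool.map keeps results in the same order as the inputs.
            return list(pool.map(safe_read, paths))
    return [safe_read(p) for p in paths]


def join_as_message(texts, separator="\n\n"):
    """Join successfully loaded texts into a single string."""
    return separator.join(t for t in texts if t is not None)
```

In this sketch, a concurrency of 1 falls back to a simple sequential loop, which mirrors the documented default.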

Outputs

| Name | Display Name | Info |
|------|--------------|------|
| data | Data | Parsed content of the file as a Data object. |
| dataframe | DataFrame | File content as a DataFrame object. |
| message | Message | File content as a Message object. |

Supported File Types

Text and document files:

  • .txt - Text files

  • .md, .mdx - Markdown files

  • .csv - CSV files

  • .json - JSON files

  • .yaml, .yml - YAML files

  • .xml - XML files

  • .html, .htm - HTML files

  • .pdf - PDF files

  • .docx - Word documents

  • .py - Python files

  • .sh - Shell scripts

  • .sql - SQL files

  • .js - JavaScript files

  • .ts, .tsx - TypeScript files

Archive formats (for bundling multiple files):

  • .zip - ZIP archives

  • .tar - TAR archives

  • .tgz - Gzipped TAR archives

  • .bz2 - Bzip2 compressed files

  • .gz - Gzip compressed files
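As a rough illustration of how the extension lists above could drive file handling, the following sketch classifies paths by suffix. The names `classify` and `select_supported` are hypothetical, and raising an error when `ignore_unsupported_extensions` is false is an assumption rather than documented behavior.

```python
from pathlib import Path

# Extensions copied from the lists above.
TEXT_EXTENSIONS = {
    ".txt", ".md", ".mdx", ".csv", ".json", ".yaml", ".yml", ".xml",
    ".html", ".htm", ".pdf", ".docx", ".py", ".sh", ".sql", ".js",
    ".ts", ".tsx",
}
ARCHIVE_EXTENSIONS = {".zip", ".tar", ".tgz", ".bz2", ".gz"}


def classify(path):
    """Return 'text', 'archive', or 'unsupported' based on the final suffix."""
    suffix = Path(path).suffix.lower()
    if suffix in TEXT_EXTENSIONS:
        return "text"
    if suffix in ARCHIVE_EXTENSIONS:
        return "archive"
    return "unsupported"


def select_supported(paths, ignore_unsupported_extensions=True):
    """Keep supported paths; skip or reject the rest.

    Assumption: when ignore_unsupported_extensions is False, an
    unsupported file is treated as an error.
    """
    selected = []
    for path in paths:
        if classify(path) == "unsupported":
            if ignore_unsupported_extensions:
                continue
            raise ValueError(f"Unsupported file extension: {path}")
        selected.append(path)
    return selected
```

Only the last suffix is inspected here, so a name such as `bundle.tar.gz` is treated as an archive because of its `.gz` ending.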

Directory

This component recursively loads files from a directory, with options for file types, depth, and concurrency.

Inputs

| Input | Type | Description |
|-------|------|-------------|
| path | MessageTextInput | Path to the directory to load files from |
| types | MessageTextInput | File types to load (leave empty to load all types) |
| depth | IntInput | Depth to search for files |
| max_concurrency | IntInput | Maximum concurrency for loading files |
| load_hidden | BoolInput | If true, hidden files are loaded |
| recursive | BoolInput | If true, the search is recursive |
| silent_errors | BoolInput | If true, errors do not raise an exception |
| use_multithreading | BoolInput | If true, multithreading is used |

Outputs

| Output | Type | Description |
|--------|------|-------------|
| data | List[Data] | Loaded file data from the directory |
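How the `types`, `depth`, `load_hidden`, and `recursive` inputs might combine can be sketched with the standard library. This is an illustration only: `find_files` is a hypothetical helper, and the meaning given to `depth` here (the number of subdirectory levels searched below the starting directory, with 0 meaning the directory itself only) is an assumption, not the component's documented semantics.

```python
import os
from pathlib import Path


def find_files(root, types=None, depth=0, load_hidden=False, recursive=True):
    """Collect file paths under root, filtered by extension and depth."""
    root = Path(root)
    # Normalize requested types such as "txt" or ".TXT" to "txt".
    wanted = {t.lower().lstrip(".") for t in types} if types else None
    found = []
    for current, dirnames, filenames in os.walk(root):
        level = len(Path(current).relative_to(root).parts)
        if not load_hidden:
            # Prune hidden directories so os.walk does not descend into them.
            dirnames[:] = [d for d in dirnames if not d.startswith(".")]
        if not recursive or level >= depth:
            # Stop descending once the depth limit is reached.
            dirnames[:] = []
        for name in filenames:
            if not load_hidden and name.startswith("."):
                continue
            extension = Path(name).suffix.lower().lstrip(".")
            if wanted is not None and extension not in wanted:
                continue
            found.append(Path(current) / name)
    return sorted(found)
```

For example, `find_files("docs", types=["md", "txt"], depth=2)` would return Markdown and text files from a hypothetical `docs` directory and up to two levels of its subdirectories.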

Usage Notes

  • Batch Processing: Both components support processing multiple files efficiently

  • Format Detection: Automatic file format detection based on file extensions

  • Error Handling: Configurable error handling for robust file processing

  • Performance: Multithreading options for faster processing of large file sets
