Files: Process Many Files
This article provides a starter solution for processing many files that have been written to a directory.
What Does This Article Cover?
In some scenarios, it is necessary to process multiple files that have recently been added to a directory. This article outlines a data pipeline solution designed to handle batches of newly written files, including cases where the original files have a non-standard format and require conversion to CSV.
Intelligence Hub Design Considerations for Processing Many Files Written to a Directory
The solution described here is intentionally straightforward and independent of any particular destination system.
- When a specific destination system is required, create an Intelligence Hub solution that exchanges data directly with that system.
- For example, if the desired destination is a Snowflake data warehouse, Intelligence Hub can be used to create a solution that parses a file and writes directly to a Snowflake table.
- The Intelligence Hub File Connection Directory Input Type can be used to obtain an array of files from a file directory.
- The Include Metadata option can be enabled to include the file name and path, file creation time, file update time, and file size in the payload.
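To make the shape of such a payload concrete, the following Python sketch builds a comparable array of file entries outside Intelligence Hub. It is an illustration only, not the product's implementation: the function name `list_files_with_metadata` and the keys `name`, `path`, `created`, `updated`, and `size` are hypothetical and are not the field names Intelligence Hub emits.

```python
from datetime import datetime, timezone
from pathlib import Path


def list_files_with_metadata(directory):
    """Return one dict per regular file in `directory`, sorted by file name.

    The keys are hypothetical stand-ins for the metadata named in the
    article: file name and path, creation time, update time, and size.
    """
    entries = []
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip subdirectories and other non-file entries
        stat = path.stat()
        entries.append({
            "name": path.name,
            "path": str(path.resolve()),
            # st_ctime is the creation time on Windows, but the time of the
            # last metadata change on most Unix systems.
            "created": datetime.fromtimestamp(stat.st_ctime, tz=timezone.utc).isoformat(),
            "updated": datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc).isoformat(),
            "size": stat.st_size,  # size in bytes
        })
    return entries
```

Calling `list_files_with_metadata("/path/to/directory")` returns a list that a downstream step could iterate over, for example to select only files updated since the previous run.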
Additional Resources