Introducing FileGroups When Using FileLists

When implementing a custom module, you will notice that quite a few modules accept a list of files as an input parameter, usually named FileSource, and may provide the processed files as output in a parameter named FileList or similar. We refer to all of these types of parameters collectively as FileLists. When used as an input parameter, the list can be supplied as a simple list of files with their fully qualified paths, as shown in this example.

C:\MyApplicationData\Incoming\Orders_2021-01-22_09-12-32.json¶
C:\MyApplicationData\Incoming\Orders_2021-01-22_09-46-01.json¶
C:\MyApplicationData\Incoming\Orders_2021-01-22_10-33-18.json¶
C:\MyApplicationData\Incoming\Orders_2021-01-22_11-01-46.json¶

A module accepts this as a valid list of files to process. However, if you look at the output of many of the included modules that produce a list of files, most notably the FileList output parameter of the [Files] Find module, you will notice that the data looks a little different from this simple format. Internally, all the included modules support a more structured definition of this data, which gives certain modules additional functionality. This structured data is known as a FileGroup. It uses a self-describing chunked format in which identifying tags are included directly in the data stream. For example, if the above list of files had been produced by a [Files] Find module, the data would instead look like this example.

[FILEGROUP] [D]C:\MyApplicationData\Incoming\¶
[F]C:\MyApplicationData\Incoming\Orders_2021-01-22_09-12-32.json¶
[F]C:\MyApplicationData\Incoming\Orders_2021-01-22_09-46-01.json¶
[F]C:\MyApplicationData\Incoming\Orders_2021-01-22_10-33-18.json¶
[F]C:\MyApplicationData\Incoming\Orders_2021-01-22_11-01-46.json¶

In this example, the data starts with the identifying tag [FILEGROUP], which tells us what kind of information the stream contains. It is followed by a series of records, each consisting of a tag preceding data that looks very similar to the simple list. The first record carries additional information that wasn’t in the previous example: it uses the [D] tag, which denotes the folder in which all the following records were located. In this example, the data tells us that the Find module started looking for files in this folder. This is usually not relevant for modules that just process individual files, but it benefits modules that have options for working with a hierarchy of folders and the set of files contained within it. You can find more detail about how this can be used in the knowledgebase article linked below.

After the [D] record, the rest of the data looks nearly identical to the original example, except that each record starts with the [F] tag to denote that the record is a file. The number of files is not limited and can continue for as long as needed. We recommend using the same FileGroup structure when supporting lists of files in your own custom modules, and you will need to support it if you want to use the output of the included modules as input to your custom modules. More details about the FileGroup structure and how it benefits various use cases can be found in the following article.

https://kb.jobserver.net/Q100038
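
To illustrate the two formats described above, here is a minimal sketch in Python of how a custom module might accept either a simple list of files or a FileGroup. It is not the product's own parser. The names parse_file_list and ParsedFileList are hypothetical, and the sketch assumes that records are separated by line breaks (a literal ¶ character is also treated as a record terminator), that the [FILEGROUP] tag may appear either on its own line or directly before the first record, and that untagged lines are plain file paths. Consult the linked article for the authoritative format.

```python
from dataclasses import dataclass, field
from typing import List, Optional

HEADER_TAG = "[FILEGROUP]"
FOLDER_TAG = "[D]"
FILE_TAG = "[F]"


@dataclass
class ParsedFileList:
    """Hypothetical container for the result of parsing a FileList value."""
    folder: Optional[str] = None          # value of the [D] record, if any
    files: List[str] = field(default_factory=list)


def parse_file_list(text: str) -> ParsedFileList:
    """Parse a simple file list or a FileGroup into folder and file paths.

    Assumption: records end at a line break; a literal pilcrow is also
    treated as a record terminator so pasted examples parse as well.
    """
    result = ParsedFileList()
    for raw in text.replace("¶", "\n").splitlines():
        record = raw.strip()
        # Drop the identifying header tag, wherever it sits on the line start.
        if record.startswith(HEADER_TAG):
            record = record[len(HEADER_TAG):].strip()
        if not record:
            continue
        if record.startswith(FOLDER_TAG):
            # Folder the following files were found in (last one seen wins).
            result.folder = record[len(FOLDER_TAG):]
        elif record.startswith(FILE_TAG):
            result.files.append(record[len(FILE_TAG):])
        else:
            # Untagged line: treat it as a path from the simple list format.
            result.files.append(record)
    return result


if __name__ == "__main__":
    sample = (
        "[FILEGROUP]\n"
        "[D]C:\\MyApplicationData\\Incoming\\\n"
        "[F]C:\\MyApplicationData\\Incoming\\Orders_2021-01-22_09-12-32.json\n"
        "[F]C:\\MyApplicationData\\Incoming\\Orders_2021-01-22_09-46-01.json\n"
    )
    parsed = parse_file_list(sample)
    print(parsed.folder)
    for path in parsed.files:
        print(path)
```

Because untagged lines fall through to the same file list, a module built this way could take its input from either a hand-written list or the output of an included module without the caller needing to say which format is in use.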