The FileReader component reads files from the file system and sends their contents to the output port. The source file can be either text or binary.

  • Text file
    A text file is read from the specified file in an unstructured fashion and its content is sent in a single message.
  • Binary file
    Binary file contents are read as bytes of data from the source file and are sent in chunks or bundles to the output port of the FileReader component based on the configuration properties of this component.

Points to note

  • The component runs on the peer server, so the file paths and directories specified in the Custom Property Sheet (CPS) must be valid on the machine where the peer server is running. If the component fails over to another peer server, ensure that the same paths are available on the machine where the secondary peer server is running.
  • The unstructured plain text can be transformed into its corresponding XML using the Text2XML component.
  • Number of outgoing messages for an input binary file = ceil (Size of File/ Chunk Size).

Configuration

Interaction Configurations

Business logic configuration details are configured in the Interaction Configurations panel. Figure 1 below illustrates the panel with the Expert Properties view enabled.


Figure 1: Interaction Configurations

Attributes

Use File Reader Configuration from Input

If enabled, the file and directory details are taken from the input message.

Pre Processing XSL Configuration

Pre Processing XSL configuration can be used to transform the request message before processing it. Click the ellipsis button against the property to configure the properties.

Refer to the Pre/Post Processing XSL Configuration section under the Common Configurations page for details regarding Pre Processing XSL configuration and Post Processing XSL configuration (below).

Post Processing XSL Configuration 

Post Processing XSL configuration can be used to transform the response message before sending it to the output port.

Process Message based on a Property

Where all the batch messages in a flow must be processed in sequential order, enable this property to have messages processed based on a specific property.

Refer to the Processing Message based on a Property section to understand how this property works.

Is file Binary?

Specifies whether the input file being read is a binary file or a flat file.

  • If enabled, the contents of the input binary file are read as binary data and are sent to the output port in chunks whose size, in bytes, is specified in the property Chunk size.

    The Chunk Size property (see the section below) is visible only when the Is file Binary? property is enabled.

  • If disabled, the contents of the target file are read in an unstructured fashion and the content is sent to the output port as a single message.

    Two properties, Line Count and Header Count (explained in the following sections), are visible only when the Is file Binary? property is disabled.

Chunk size (bytes)

When the input file being read is binary, you can choose to receive the contents read from the file in chunks of binary data at the output ports. The size of these chunks (in bytes) can be specified in this property. When the chunk size is specified as 0 bytes, the whole file is read in a single run. 

The number of output messages received = ceil (Size of File / Chunk size). The last output message can be identified by the value of the COMPLETE property in its message headers. Refer to Table 1 for information on message headers.

The Chunk Size property is visible only when the Is file Binary? property is enabled.
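The message-count formula for binary files can be checked with a few lines of Python. This is a minimal sketch, not FileReader's implementation: the function name chunk_count is hypothetical, and returning 1 for a chunk size of 0 simply follows the statement that the whole file is then read in a single run.

```python
def chunk_count(file_size: int, chunk_size: int) -> int:
    """Return the expected number of output messages for a binary file.

    Hypothetical helper mirroring ceil(Size of File / Chunk size).
    Assumption: a chunk size of 0 means one message holding the whole file.
    """
    if file_size < 0 or chunk_size < 0:
        raise ValueError("sizes must be non-negative")
    if chunk_size == 0:
        return 1
    # Integer ceiling division avoids floating-point rounding on large files.
    return (file_size + chunk_size - 1) // chunk_size


# A 10,000-byte file read in 4,096-byte chunks:
# two full chunks plus a 1,808-byte remainder.
print(chunk_count(10_000, 4_096))  # 3
```

The last of those messages would be the one carrying the COMPLETE header property.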

FileReader Configuration

Click the Ellipsis button to configure the directories in which different files have to be saved.


Figure 2: FileReader Configuration Attributes

Is configured for different machine?

Specifies whether the Peer Server on which the component is to be launched and Fiorano Studio run on the same machine or on different machines. Enable this property if they run on different machines; disable it if they run on the same machine.

This helps the component to determine the type of dialog to be shown while providing the paths of Source Directory, Working Directory and Error Directory (these directories are described in the next sections below). When both the Peer Server and the Studio are running on the same machine, the paths to the above-specified directories can be chosen from a file dialog with the directory structure of the current machine. Otherwise, the absolute path of Source/Working/Error directories must be specified in the text field.

If enabled, the absolute path must be entered in the text field (Figure 2); if disabled, a file dialog showing the directory structure is used to set the path (Figure 3).


Figure 3: Choosing directory path using File Dialog

Compute Paths relative to Directory

The path of the directory relative to which the paths of Source Directory, Working Directory, Error Directory, and Postprocessing Directory are calculated. By default, this points to the FIORANO_HOME directory. If the paths specified for Source/Working/Error/Postprocessing directories are not absolute, their paths are calculated relative to the directory specified here.

If the path specified for the Source, Working, Error, or Postprocessing directory is absolute, the path specified for Compute Paths relative to Directory is not used in computing the path of that particular directory.

Example:
If the Source/Working/Error/Postprocessing directories are in the /home/fiorano location, their absolute paths can be specified as shown in Figure 4.


Figure 4: With absolute paths

If the Source/Working/Error/Postprocessing directories all branch from the same location, /home/fiorano, that location can be specified in Compute Paths relative to Directory and the individual directories can then be given as paths relative to it.


Figure 5: With Relative paths
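The resolution rule described above (absolute paths are used as they are, relative paths are computed against the base directory) can be illustrated with a short Python sketch. The helper name resolve_directory and the sample directory names are hypothetical; this is not FileReader's code, only an illustration of the rule.

```python
from pathlib import PurePosixPath


def resolve_directory(base_dir: str, configured: str) -> str:
    """Resolve a configured directory against the base directory.

    Hypothetical helper: an absolute configured path is used as-is,
    a relative one is computed relative to base_dir.
    """
    # Joining with an absolute right-hand path discards the left-hand base.
    return str(PurePosixPath(base_dir) / configured)


base = "/home/fiorano"  # value of Compute Paths relative to Directory
print(resolve_directory(base, "source"))          # /home/fiorano/source
print(resolve_directory(base, "/data/incoming"))  # /data/incoming
```

PurePosixPath is used so that the example behaves the same on any operating system.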

File name

The name of the file to be read. A file name pattern can also be provided using the wildcard character *. Multiple patterns are not allowed. All the files in the Source Directory are checked against this pattern, and the matching files are processed.

Example:

  • *.txt includes all the files with a .txt extension.
  • S*.* would include Sample.txt and Service.doc, but not SampleFile.

  • Only a single pattern of names can be specified. Multiple patterns are not supported while specifying the File name.
  • When the component is not in scheduling mode, the file name can be specified in the input message to the component. The name specified in the input message overrides the file name (if any) provided during configuration.
  • When a pattern of file names is specified, there is no guarantee that the matching files will be processed in any specific order.
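The wildcard examples above can be reproduced with Python's fnmatch module. This is only an approximation for illustration: fnmatch also understands ? and [...], which this page does not say FileReader supports, and case-sensitive matching is an assumption here. The helper name matching_files is hypothetical.

```python
from fnmatch import fnmatchcase


def matching_files(names, pattern):
    """Return the names that match a single wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["Sample.txt", "Service.doc", "SampleFile", "notes.txt"]
print(matching_files(names, "*.txt"))  # ['Sample.txt', 'notes.txt']
print(matching_files(names, "S*.*"))   # ['Sample.txt', 'Service.doc']
```

SampleFile is excluded by S*.* because the pattern requires a literal dot somewhere after the leading S.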
Source Directory

The directory which holds the file(s) to be read has to be specified in Source Directory. All the files in this directory whose names match the pattern specified for the File name property will be processed. The files present in the sub-directories are not considered.

An absolute path or a path relative to the directory specified in the Compute Paths relative to Directory property can be provided.

If Is configured for different machine? is disabled, clicking the ellipsis button opens a file dialog as shown in Figure 3, where the directory can be chosen from the file system. Otherwise, the absolute path of the directory must be specified in the text field as shown in Figure 2.

The path provided here should point to an existing directory.

  • The directory specified in Compute Paths relative to Directory property will be used in computing the path only if the path specified here is not absolute.
  • FileReader throws an exception if the specified directory does not exist.
  • If the component is not configured in scheduling mode, the Source Directory can be specified in the input message to the component and the directory specified during configuration, if any, is overridden by the one provided in the input message.
Use Working Directory

Specifies whether a working directory is to be used.

  • If enabled, the files will be moved from the Source Directory to Working directory while FileReader processes the file. The component needs write/modify permissions on the Source Directory to be able to use a working directory. If such permissions are not available, disable this property.
  • If disabled, files won't be moved to the Working directory.

The property Working Directory becomes visible only when Use Working Directory is enabled.

Working Directory

The path of the directory to be used for intermediate processing of files. If preprocessing actions are specified, the working directory is used while processing them. If Is configured for different machine? is disabled, clicking the ellipsis button opens a file dialog as shown in Figure 3, where the directory can be chosen from the file system. Otherwise, the path of the directory must be specified in the text field as shown in Figure 2.

  • This property is visible only when Use Working Directory is enabled.
  • Either an absolute path or a path relative to the directory specified in the Compute Paths relative to Directory can be provided.
  • If this directory doesn't exist, FileReader creates it while processing the input message.
  • FileReader requires write permissions on the Working and Error directories.
Process Pending files in Working Directory

Enable this property if pending files present in the Working directory are to be processed. When it is enabled, for every input request to read a file from a specific directory, the file is searched for in the Working directory in addition to the specified directory. If the component is in scheduling mode, enabling this property processes the files in the Working directory as well.

  • If a file with the same name exists in both the Source directory and Working directory, the file in the Source directory is processed.
  • Enabling this property has no effect if Use Working Directory is disabled.
Error Directory

Path of the directory which should hold the files whose processing has not been successful.
If Is configured for different machine? is disabled, clicking the ellipsis button opens a file dialog as shown in Figure 3, where the directory can be chosen from the file system. Otherwise, the path of the directory must be specified in the text field as shown in Figure 2.

  • Either an absolute path or a path relative to the directory specified in the Compute Paths relative to Directory can be provided.
  • If this directory does not exist, FileReader creates it while processing the input message.
  • FileReader requires write permissions on Working and Error Directories.

Line Count

The input file is read in blocks of lines, with the number of lines per block specified by this property, and each block is sent as output.

The Line Count property is visible only when the Is file Binary? property is disabled.

Header Count

The count of lines at the beginning of the file that must be skipped while reading the file. Used when the starting lines of the files are headers.

The Header Count property is visible only when the Is file Binary? property is disabled.
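How Line Count and Header Count could interact is sketched below in Python. It assumes that header lines are skipped first and that the remaining lines are then grouped into blocks; the function name read_blocks and the sample data are hypothetical, and FileReader's behaviour for a Line Count of zero is not covered here.

```python
from itertools import islice


def read_blocks(lines, header_count, line_count):
    """Yield blocks of at most line_count lines after skipping header_count lines."""
    if line_count <= 0:
        raise ValueError("line_count must be positive in this sketch")
    # Skip the header lines, then keep consuming from the same iterator.
    body = islice(iter(lines), header_count, None)
    while True:
        block = list(islice(body, line_count))
        if not block:
            return
        yield block


lines = ["id,name", "1,Tom", "2,Sam", "3,Dan"]
for block in read_blocks(lines, header_count=1, line_count=2):
    print(block)
# ['1,Tom', '2,Sam']
# ['3,Dan']
```

The last block may be shorter than the configured line count when the number of remaining lines is not an exact multiple of it.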

File Encoding


Figure 6: Different types of File Encoding

The encoding to be used while reading the file. Figure 6 shows the encodings that can be used.

  • ASCII
    A character encoding standard used to represent plain text, based on the English alphabet.
  • Cp1252
    A character encoding of the Latin alphabet (Windows-1252).
  • UTF8
    A variable-length character encoding for Unicode.
  • UTF-16
    Also a variable-length character encoding for Unicode; it maps each character to one or two 16-bit code units.
  • ISO8859_1
    ISO 8859-1, more formally cited as ISO/IEC 8859-1, is part 1 of ISO/IEC 8859, a standard character encoding of the Latin alphabet.
  • EUC_KR
  • EUC_JP
  • EUC_CN
  • EUC_TW
    EUC_KR, EUC_JP, EUC_CN, EUC_TW are multi-byte character encoding systems used for Korean, Japanese, Simplified Chinese, and Traditional Chinese languages respectively. 


    Reading UTF files with a byte order mark (BOM) attached to the beginning of the file may not give the desired result.
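The byte order mark caveat can be seen in a small Python experiment. Python is used purely for illustration (its codec names, such as utf-8, differ from the names listed above), and the sample content is made up; the point is only that a decoder which does not strip the BOM leaves an invisible extra character at the start of the text.

```python
import codecs

# Bytes of a small UTF-8 file that starts with a byte order mark.
raw = codecs.BOM_UTF8 + "id,name\n".encode("utf-8")

plain = raw.decode("utf-8")         # BOM survives as the character U+FEFF
stripped = raw.decode("utf-8-sig")  # BOM is consumed by the decoder

print(repr(plain))     # '\ufeffid,name\n'
print(repr(stripped))  # 'id,name\n'
```

If downstream processing compares or parses the first field, that leading U+FEFF character is one plausible source of the undesired results mentioned above, so removing the BOM before the file is read may be worth considering.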

PreProcessing Command

The script or command to be executed before processing of the file starts. A command can be entered in the text area provided against this property in the CPS. To provide a script file, use the file dialog that opens on clicking the ellipsis button.

By default, the component appends the absolute path of the file that is currently taken up for processing to this script/command, that is, the absolute path of the file would be the first argument to this script/command. More arguments for this command could be specified using the property PreProcessing Arguments.

The final command formed by the FileReader would be:

<PreProcessing Command> + <Absolute path of the file taken up for processing> + <PreProcessing Arguments>

PreProcessing Arguments

Arguments that are passed to the preprocessing script or command. As mentioned in the PreProcessing Command section, the component, by default, appends the absolute path of the file that is currently taken up for processing to the PreProcessing Command. Any other arguments that need to be passed to the PreProcessing Command can be provided here.
The use of PreProcessing Command and PreProcessing Arguments is explained in the sample scenario below.

Sample scenario:
Copying all the files present in the Error directory to a backup location before the processing on a file starts.

Solution:
A batch file copyerrors.bat with the content copy C:\FileReader\ErrorDir %2 is written and placed in C:\. The path of this batch file is specified for PreProcessing Command. The backup location (C:\ProcessingFailures) is specified as the value for PreProcessing Arguments.

Let C:\test.txt be the file picked up for processing. With this configuration, the command formed by FileReader would be C:\copyerrors.bat C:\test.txt C:\ProcessingFailures. Inside the batch file, %2 refers to the second argument (the backup location), so the command finally executed is copy C:\FileReader\ErrorDir C:\ProcessingFailures, which copies all the files present in C:\FileReader\ErrorDir to the backup location C:\ProcessingFailures.
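The way the final command is put together in this scenario can be sketched in Python. The function name build_preprocessing_command is hypothetical and the sketch only models the ordering of the parts (command, file path, extra arguments); how FileReader actually launches the command is not shown on this page.

```python
def build_preprocessing_command(command, file_path, arguments=()):
    """Return the argument list: command, file path, then extra arguments."""
    return [command, file_path, *arguments]


parts = build_preprocessing_command(
    r"C:\copyerrors.bat",        # PreProcessing Command
    r"C:\test.txt",              # file picked up for processing
    [r"C:\ProcessingFailures"],  # PreProcessing Arguments
)
print(" ".join(parts))
# C:\copyerrors.bat C:\test.txt C:\ProcessingFailures
```

The list index lines up with the batch file parameters: parts[1] is what the script sees as %1 and parts[2] is what it sees as %2.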

Postprocessing Action

Action to be taken on the file after it is read successfully. Figure 7 shows the Postprocessing Actions that are allowed.


Figure 7: Postprocessing Actions

  • DELETE
    Delete the file after reading it successfully.
  • MOVE
    Move the file to a different location, specified by the Postprocessing Directory property.
    When MOVE is selected as the Postprocessing Action, two other properties become visible: Postprocessing Directory and Append timestamp?.


    Figure 8: MOVE action
  • NO_ACTION
    Take no action on the file.

Postprocessing Directory

The directory to which files are to be moved after they are read successfully when MOVE is selected as the Postprocessing Action.
If Is configured for different machine? is disabled, clicking the ellipsis button opens a file dialog, as shown in Figure 3, where the directory can be chosen from the file system. Otherwise, the path of the directory must be specified in the text field as shown in Figure 2.


This option is visible only when MOVE is selected as the Postprocessing Action.

Append timestamp?

Specifies if a time stamp has to be appended to the file names after they have been moved to the Postprocessing Directory.

  • If enabled, FileReader adds a time stamp whose format is provided through the Timestamp format property and a counter (if Append counter? is enabled).
  • If disabled, no timestamp is added to the files that are moved to the Postprocessing directory.

This option is visible only when MOVE is selected as the Postprocessing Action.

When the Append timestamp? option is enabled, two more supporting options appear: Timestamp format and Append counter?.

Timestamp format

The format of the time stamp to be appended to the file name can be specified here. The descriptions of the symbols that can be used in the timestamp formats are shown below.


Figure 9: Symbols used in Timestamp format

Example: ddMMyyyy_HHmm.

  • This property is visible only when Append timestamp? is enabled.
  • Avoid using slashes ('/' or '\') in the timestamp format, as they can be misinterpreted as file separators.
  • Special characters that are not allowed in file names should not be included in the timestamp format. (The set of such characters can be platform specific.)

Append counter?

  • If enabled, a counter is appended to the file name of each processed file in addition to the time stamp. Appending a counter to file names ensures that no two files in the Postprocessing directory have the same name. The name of the file would look like <filename><time stamp><counter>.
  • If disabled, no counter is added to the files that have been moved to the Postprocessing directory.
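The naming scheme <filename><time stamp><counter> can be illustrated with a short Python sketch. Several things here are assumptions: the helper name postprocessed_name is hypothetical, the strftime format %d%m%Y_%H%M is taken to be the equivalent of the example format ddMMyyyy_HHmm, and the parts are simply concatenated because this page does not describe any separators or how the counter is chosen.

```python
from datetime import datetime


def postprocessed_name(filename, moved_at, counter=None, fmt="%d%m%Y_%H%M"):
    """Build <filename><time stamp><counter> as described in the text.

    Assumption: fmt is the strftime equivalent of ddMMyyyy_HHmm.
    """
    stamp = moved_at.strftime(fmt)
    suffix = "" if counter is None else str(counter)
    return f"{filename}{stamp}{suffix}"


moved_at = datetime(2018, 10, 26, 14, 5)
print(postprocessed_name("orders.txt", moved_at))     # orders.txt26102018_1405
print(postprocessed_name("orders.txt", moved_at, 1))  # orders.txt26102018_14051
```

Two files moved within the same minute would receive identical timestamps, which is the situation the counter is meant to disambiguate.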

Validate Input

If enabled, the input request sent to the FileReader is validated against the input port XSD of the component. This is an Expert property.

For the Expert Properties Cleanup resources, Target Namespace and Monitoring configuration, refer to the respective sections in the Common Configurations page.

Output XSD

This property is used to set the schema of the output message. If the file content is expected to be XML, setting its schema on the output port using the Output XSD can be useful for applying transformations on the output message. The XSD can be provided using the Schema Editor as shown below.


Figure 10: Schema editor to provide Output XSD

Header Properties

Table 1 shows the descriptions of header properties set by the component on the output message when Flat/Binary files are processed.

| Type of the file processed | Header property | Description |
|---|---|---|
| Flat/Binary | FileName | Name of the file being read. |
| Flat/Binary | FilePath | Path of the directory which holds the source file. |
| Flat/Binary | Size | The size of the file being read. |
| Flat/Binary | START_EVENT | When set to true, indicates that the message is the first record in the set of responses generated for an input message. This property appears only on the first record in the set of responses. |
| Flat/Binary | CLOSE_EVENT | When set to true, indicates that the message is the last record in the set of responses generated for an input message. This property appears only on the last record in the set of responses. |
| Flat/Binary | RECORD_INDEX | A value n for this property indicates that this is the nth response generated for an input message. |
| Flat | FullName | Absolute path of the processed file. |
| Flat | ReadAccess | Indicates whether the processed file is readable. |
| Flat | WriteAccess | Indicates whether the processed file is writable. |
| Flat | Type | File / Directory. |
| Binary | NEW | When set to true, indicates that this is the first chunk of the binary file being read. |
| Binary | COMPLETE | When set to true, indicates that this is the last chunk of the binary file being read. |
| Binary | START_INDEX | The offset of the first byte of the current chunk read. |
| Binary | END_INDEX | The offset of the last byte of the current chunk read. |

Table 1: Header Properties
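A consumer of binary output could use the NEW and COMPLETE header properties to rebuild the original file from its chunks. The Python sketch below is an illustration only: representing each output message as a (headers, payload) pair with boolean header values is a hypothetical simplification, the function name reassemble is made up, and chunks are assumed to arrive in order.

```python
def reassemble(messages):
    """Concatenate chunk payloads from NEW through COMPLETE.

    messages: iterable of (headers, payload) pairs, a hypothetical
    representation of FileReader output messages.
    """
    buffer = bytearray()
    for headers, payload in messages:
        if headers.get("NEW"):
            buffer.clear()  # first chunk of a file: start from scratch
        buffer.extend(payload)
        if headers.get("COMPLETE"):
            return bytes(buffer)  # last chunk: the file is whole
    raise ValueError("no chunk with COMPLETE set was received")


chunks = [
    ({"FileName": "data.bin", "NEW": True}, b"abcd"),
    ({"FileName": "data.bin"}, b"efgh"),
    ({"FileName": "data.bin", "COMPLETE": True}, b"ij"),
]
print(reassemble(chunks))  # b'abcdefghij'
```

START_INDEX and END_INDEX could additionally be used to verify that no chunk is missing, but their exact conventions are not spelled out here, so the sketch does not rely on them.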

Cleanup resources (excluding connection) after each document

This closes all the resources except for the connection used by the microservice after every request. If lower processing time matters more than lower memory usage, disable this property; if lower memory usage matters more, enable it.

For more details, refer to the respective section in the Common Configurations page.

File Processing Delay

The time interval (in milliseconds) between the postprocessing of a matching file and processing the next matching file.

  • This property is helpful when multiple files match the configured file name.
  • A value of zero or less means no delay.

Target Namespace

Target Namespace for the microservice and response XML messages.

For more details, refer to the respective section in the Common Configurations page.

Monitoring Configuration

Refer to the Monitoring Configuration section in the Common Configurations page.

Process Files in Order

If enabled, files will be processed in the order based on the properties that follow.


Figure 11: Property to choose the order of files in which they will be processed

Process Files in Order Based On the last modified time

If the Process Files in Order Based On property is chosen as "LastModifiedTime", files would be processed in the order of last modified time.

Ascending Order

If enabled, files will be processed in the ascending order of the file name (explained below) or the last modified time of the file. If disabled, descending order will be chosen.

Process Files in order based on the File Name

If the Process Files in Order Based On property is chosen as "FileName", files would be processed in order, on the basis of Filenames.


Figure 12: Properties for the property "Process Files in order" as 'File Name'

With the FileName option, in addition to choosing ascending or descending order, a Pattern and a Pattern Value Type can be configured to control how the file names are sorted.

Pattern

Specify a pattern identifying the part of the file name that should be used to determine its order while processing.

Example

  • [0-9]+ means one or more digits.
  • [a-zA-Z]+ means one or more alphabetic letters, lowercase or uppercase.

  • FileReader uses the first occurrence of the pattern specified in the CPS within the file name for ordering.
  • Pattern and Pattern Value Type (explained below) together decide whether sorting is done based on Strings, Dates or Numbers in the file names.
Pattern Value Type

Specify the type of the value matched by the Pattern; file names are processed in order of these values.


Figure 13: Pattern Value Type options

Values are of the following types:

  • Int: To order file names based on the numbers they contain, specify either the Int or the Long pattern value type, depending on the range of numbers matched by the pattern specified in the CPS. See Figure 12.
  • Long: Similarly if the numbers in the filenames to be ordered fall under the range of Long data type, specify Long pattern value type instead of Int.
  • String: Processes in the alphabetical order of the file name.
  • Date: Processes based on the unique date format with which files are named. Format can be defined in the Pattern Value Date Format property (explained below).

Refer to Scenario 2 for an illustration of processing files in order using the pattern and pattern value type.

Ensure that the Pattern property is set to suit the chosen Pattern Value Type.

Pattern Value Date Format

The unique format of the dates with which the file names are defined.


Figure 14: Pattern Value Type options

Example: ddMMMyyyy

  • This property is enabled only if the Pattern Value Type property is "Date".
  • Ensure that the Pattern property is set to complement the date format. For the format ddMMMyyyy, a suitable pattern is [0-9]+[a-zA-Z]+[0-9]+.
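The interplay of Pattern, Pattern Value Type and Pattern Value Date Format can be sketched in Python as follows. This is an illustration under stated assumptions, not FileReader's implementation: the names CONVERTERS and order_files are hypothetical, the patterns are treated as regular expressions, the strptime format %d%b%Y is assumed to correspond to ddMMMyyyy, and raising an error for a file name without a match is a choice made for the sketch.

```python
import re
from datetime import datetime

# Converters for each Pattern Value Type (Int and Long coincide in Python).
CONVERTERS = {
    "Int": int,
    "Long": int,
    "String": str,
    # Assumed equivalent of ddMMMyyyy; %b depends on an English locale.
    "Date": lambda text: datetime.strptime(text, "%d%b%Y"),
}


def order_files(names, pattern, value_type, ascending=True):
    """Sort names by the first match of pattern, converted per value_type."""
    regex = re.compile(pattern)
    convert = CONVERTERS[value_type]

    def key(name):
        match = regex.search(name)  # first occurrence of the pattern
        if match is None:
            raise ValueError(f"{name!r} does not contain {pattern!r}")
        return convert(match.group())

    return sorted(names, key=key, reverse=not ascending)


files = ["Tom24Nov2017", "Sam26Oct2018", "Dan15Dec2018"]
print(order_files(files, "[a-zA-Z]+", "String"))
# ['Dan15Dec2018', 'Sam26Oct2018', 'Tom24Nov2017']
print(order_files(files, "[0-9]+[a-zA-Z]+[0-9]+", "Date"))
# ['Tom24Nov2017', 'Sam26Oct2018', 'Dan15Dec2018']
print(order_files(["part10", "part9"], "[0-9]+", "Int"))
# ['part9', 'part10']
```

The last call shows why the value type matters: compared as numbers, 9 sorts before 10, whereas compared as strings, "10" would sort before "9".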

Elements to Encrypt

Select elements to encrypt in the Output message.

Refer to the Encrypt Decrypt Configuration section in the Common Configurations page for details.

Input and Output

Input

When FileReader is not in scheduling mode, messages can be sent to the input port of the component specifying the file to be read and the location of the file. The schema of the input XML message is shown in Figure 15.


Figure 15: Schema of the input message

  • FileName is the name of the file which is to be read.
  • Directory is the location of the file.

If the values for FileName and Directory are not specified in the input message, the values configured for the CPS properties File Name and Source Directory are used.

Output

The output schema depends on the configuration of the Output XSD property. The schema provided for this property is set directly as the schema on the output port. For more information, refer to Output XSD in the Interaction Configurations section.

An input/output operation may hang indefinitely while waiting for a request to return, for reasons such as network fluctuations affecting a mounted directory. To prevent this indefinite wait, it is recommended to configure Connection Timeout in the runtime arguments. Refer to the Configuring Connection Timeout section in the Common Configurations page.

Testing the Interaction Configurations

Interaction configurations can be tested from the CPS by clicking the Test button. Figures 16 and 17 show the sample input and the corresponding output, respectively.


Figure 16: Sample input


Figure 17: Output produced for sample input shown in Figure 16

Functional Demonstration

Scenario 1

Reading simple text files and displaying the contents.
Configure the FileReader as described in the Configuration and Testing the Interaction Configurations sections, and use the Feeder and Display components to send sample input and check the response, respectively.


Figure 18: Demonstrating scenario 1 with sample input and output

Input Message

Output Message

Contents of the input file appear here.

Scenario 2

Processing files in ascending order based on file name with various pattern value types and corresponding patterns.

Connect the Feeder and Display microservices (as in Figure 18) to send sample input and check the response, respectively.

Ensure that the path, directory and file names of the files are specified in the FileReader Configuration. Files of ABC format (*.abc implies all files of the same format) are saved in a folder named filereader_dir in the D drive.


Figure 19: File Reader configuration

The text inside each file is as shown below, so that each displayed message can be matched to its file name in the Display window.

| Filename | Text |
|---|---|
| Tom24Nov2017 | This is Tom24Nov2017 text for testing. |
| Sam26Oct2018 | This is Sam26Oct2018 text for testing. |
| Dan15Dec2018 | This is Dan15Dec2018 text for testing. |


Figure 20: Files supposed to be processed

Pattern Value Type as Int

Input

Configure the microservice with values set as in Figure 12 and invoke FileReader by clicking Send in the Feeder microservice.

Pattern is specified as below to match the Int pattern value type:

Output

The input texts of the files get displayed in the ascending order of the first occurrence of numbers in the file name (15, 24, 26) - Dan15Dec2018, Tom24Nov2017, Sam26Oct2018.


Figure 21: Messages displayed in the Display window based on the Int pattern value type

Similarly, if the numbers in the file names to be ordered fall within the range of the Long data type, the Long pattern value type can be used.

 

Pattern Value Type as String

Input

Choose Pattern Value Type property as "String" and provide the pattern to match the type.

Pattern is specified as below to match the String pattern value type:


Figure 22: Pattern specified for the String pattern value type

Output

The input texts of the files will be displayed in the ascending order of the starting letter in the file name - Dan15Dec2018, Sam26Oct2018, Tom24Nov2017.


Figure 23: Messages displayed in Display window based on the String pattern value type

Pattern Value Type as Date

Input

Configure the microservice with values set as in Figure 14 and invoke FileReader by clicking Send in the Feeder microservice.

Pattern is specified as below to match the Date format ddMMMyyyy:

Output

The input texts of the files will be displayed in the ascending order of the date in the file name - Tom24Nov2017, Sam26Oct2018, Dan15Dec2018.


Figure 24: Messages displayed in the Display window based on the Date pattern value type

Use Case Scenario

In a revenue control packet scenario, transaction files are read and then transformed.


Figure 25: Revenue Control Packet Scenario

The event process demonstrating this scenario is bundled with the installer. Documentation of the scenario and instructions to run the flow can be found in the Help tab in eStudio.

Useful Tips

Best Practices to read a file from a network drive using the FileReader component:

  • To access a file present in a network drive, FileReader component needs full permissions on that directory. Please enable the option "Allow Network Users to change my files" while sharing a directory.
  • If you do not have permissions to change the files, then in the File Reader Custom Property Sheet you need to disable Use Working Directory property.
  • When the peer server runs as a Windows/Linux service, the network drive may not yet be mounted by the time the peer server starts the file components. In such a case, making the peer server service dependent on the service that mounts the network drive will help.

To understand the service better, refer to the REST Attachments example which demonstrates FileReader service features.
