The FileReader component reads files from the file system and sends their contents to the output port. The source file can be either a text file or a binary file.
- Text file
A text file is read from the specified file in an unstructured fashion, and its content is sent in a single message.
- Binary file
The contents of a binary file are read as bytes from the source file and are sent to the output port of the FileReader component in chunks, based on the configuration properties of the component.
Points to note
- The component runs on the peer server, so the file paths and directories specified in the Custom Property Sheet (CPS) must be valid on the machine where the peer server is running. If the component fails over to another peer server, ensure that the same paths are available on the machine where the secondary peer server is running.
- The unstructured plain text can be transformed into its corresponding XML using the Text2XML component.
- Number of outgoing messages for an input binary file = ceil(Size of File / Chunk Size).
Configuration
Interaction Configurations
Business logic configuration details are configured in the Interaction Configurations panel. Figure 1 below illustrates the panel with Expert Properties view enabled.
Figure 1: Interaction Configurations
Attributes
Use File Reader Configuration from Input
If enabled, the File and Directory details are taken from the input message.
Is file Binary?
Specifies whether the input file being read is a binary file or a flat file.
- If enabled, the contents of the input binary file are read as binary data and are sent to the output port in chunks whose size, in bytes, is specified through the property Chunk size.
- If disabled, the contents of the target file are read in an unstructured fashion and the content is sent to the output port as a single message.
Chunk size (bytes)
When the input file being read is binary, the contents read from the file can be received at the output port in chunks of binary data. The size of these chunks (in bytes) is specified in this property. When the chunk size is specified as 0 bytes, the whole file is read in a single run.
For a non-zero chunk size, the number of output messages received = ceil(Size of File / Chunk size). The last output message received can be identified by the value of the property COMPLETE in its message headers. Refer to Table 1 for information on message headers.
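The message count formula can be checked with a short calculation. The following is a minimal Python sketch, not the component's own code: the function name `expected_message_count` is hypothetical, and treating a chunk size of 0 as "one message for the whole file" is an assumption drawn from the description of the Chunk size property.

```python
def expected_message_count(file_size: int, chunk_size: int) -> int:
    """Return ceil(file_size / chunk_size) using integer arithmetic.

    Assumption: a chunk size of 0 means the whole file is read in a
    single run, so one message is expected.
    """
    if file_size < 0 or chunk_size < 0:
        raise ValueError("sizes must be non-negative")
    if chunk_size == 0:
        return 1
    # Integer form of ceil(a / b); avoids floating-point rounding on large files.
    return (file_size + chunk_size - 1) // chunk_size


# A 10,000-byte file read with 4,096-byte chunks.
print(expected_message_count(10_000, 4_096))
```

A file whose size is an exact multiple of the chunk size produces no extra trailing message under this formula.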
FileReader Configuration
Click the Ellipsis button to configure the directories in which different files have to be saved.
Figure 2: FileReader Configuration Attributes
Is configured for different machine?
Specifies whether the Peer Server on which the component is to be launched and Fiorano Studio are running on the same machine or on different machines. Enable this property if they are running on different machines; disable it if they are running on the same machine.
This setting determines the type of dialog shown when providing the paths of the Source Directory, Working Directory and Error Directory (described in the sections below). When both the Peer Server and the Studio are running on the same machine, the paths to these directories can be chosen from a file dialog that shows the directory structure of the current machine. Otherwise, the absolute paths of the Source/Working/Error directories must be specified in the text field.
Figure 3: Choosing directory path using File Dialog
Compute Paths relative to Directory
The path of the directory relative to which the paths of Source Directory, Working Directory, Error Directory and Postprocessing Directory are calculated. By default, this points to the FIORANO_HOME directory. If the paths specified for Source/Working/Error/Postprocessing directories are not absolute, their paths are calculated relative to the directory specified here.
Example:
If the Source/Working/Error/Postprocessing directories are located in /home/fiorano, their absolute paths can be specified as in the figure.
Figure 4: With absolute paths
If the Source/Working/Error/Postprocessing directories all branch from the same location, /home/fiorano, then /home/fiorano can be specified as the base directory in Compute Paths relative to Directory, and the individual directories can be given as paths relative to it.
Figure 5: With Relative paths
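The resolution rule described above (absolute paths are used as given, all other paths are computed against the base directory) can be sketched as follows. This is an illustration under stated assumptions, not the component's implementation: the function name `resolve_directory` and the directory name `source` are hypothetical, and POSIX-style paths are assumed to match the /home/fiorano example.

```python
from pathlib import PurePosixPath


def resolve_directory(configured: str, base_dir: str) -> str:
    """Resolve a configured directory against the base directory.

    Absolute paths are returned unchanged; relative paths are joined
    onto base_dir (the 'Compute Paths relative to Directory' value).
    """
    path = PurePosixPath(configured)
    if path.is_absolute():
        return str(path)
    return str(PurePosixPath(base_dir) / path)


# Relative path: computed against the base directory.
print(resolve_directory("source", "/home/fiorano"))
# Absolute path: the base directory is ignored.
print(resolve_directory("/data/incoming", "/home/fiorano"))
```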
File name
The name of the file to be read. A file name pattern can also be provided using the wildcard character *. Multiple patterns are not allowed. All the files in the Source Directory are checked against this pattern, and the files that match are processed.
Example:
- *.txt includes all the files with a .txt extension.
- S*.* includes Sample.txt and Service.doc, but not SampleFile.
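The two patterns above can be reproduced with Python's standard `fnmatch` module. This is only a sketch of wildcard matching in general: the helper name `matching_files` is hypothetical, and `fnmatch` also gives special meaning to `?` and `[...]`, which the component may or may not do.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def matching_files(names: Iterable[str], pattern: str) -> List[str]:
    """Return the names that match a single '*' wildcard pattern.

    fnmatchcase is used so that matching is case-sensitive on every
    platform; the component's own case handling is not documented here.
    """
    return [name for name in names if fnmatchcase(name, pattern)]


names = ["Sample.txt", "Service.doc", "SampleFile", "notes.txt"]
print(matching_files(names, "*.txt"))
print(matching_files(names, "S*.*"))
```

SampleFile is excluded by S*.* because the pattern requires a literal dot somewhere after the leading S.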
Source Directory
The directory which holds the file(s) to be read has to be specified in Source Directory. All the files in this directory whose names match the pattern specified for the File name property will be processed. The files present in the sub-directories are not considered.
An absolute path or a path relative to the directory specified in the Compute Paths relative to Directory can be provided.
If Is configured on different machine? is disabled, clicking the ellipsis button opens a file dialog as shown in Figure 3, where the directory can be chosen from the file system.
Otherwise the absolute path of the directory must be specified in the text field as shown in Figure 2.
The path provided here should point to an existing directory.
Use Working Directory
Specifies whether the working directory is to be used.
- If enabled, the files will be moved from the Source Directory to Working directory while FileReader processes the file. The component needs write/modify permissions on the Source Directory to be able to use a working directory. If such permissions are not available, disable this property.
- If disabled, files won't be moved to the Working directory.
Working Directory
The path of the directory which is to be used for intermediate processing of files. If preprocessing actions are specified, the working directory will be used while processing them. If Is configured on different machine? is disabled, clicking the ellipsis button will open a file dialog as shown in Figure 3, where the directory can be chosen from the file system. Otherwise the path of the directory must be specified in the text field as shown in Figure 2.
Process Pending files in Working Directory
Enable if the pending files present in the Working directory are to be processed. When this property is enabled, for every input request to read a file from a specific directory, the file is searched in Working directory in addition to the specified directory. If the component is in scheduling mode, enabling this property processes the files in the Working directory as well.
- If a file with same name exists in both the Source directory and Working directory, the file in the Source directory is processed.
- Enabling this property has no effect if Use Working Directory is disabled.
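The lookup rules above can be summarized in a small decision function. This is a hedged sketch rather than the component's logic: the function name `locate_file` and its parameters are hypothetical, and directory contents are passed in as plain collections of names so that the example does not touch the file system.

```python
from typing import Collection, Optional


def locate_file(
    name: str,
    source_files: Collection[str],
    working_files: Collection[str],
    use_working_directory: bool,
    process_pending: bool,
) -> Optional[str]:
    """Return 'source', 'working', or None for the requested file name."""
    # A file present in the Source directory always takes precedence.
    if name in source_files:
        return "source"
    # The Working directory is searched only when both properties are enabled.
    if use_working_directory and process_pending and name in working_files:
        return "working"
    return None


print(locate_file("a.txt", {"a.txt"}, {"a.txt"}, True, True))
print(locate_file("b.txt", {"a.txt"}, {"b.txt"}, True, True))
print(locate_file("b.txt", {"a.txt"}, {"b.txt"}, False, True))
```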
Error Directory
Path of the directory which should hold the files whose processing has not been successful.
If Is configured on different machine? is disabled, clicking the ellipsis button will open a file dialog as shown in Figure 3 where the directory can be chosen from the file system. Otherwise the path of the directory must be specified in the text field as shown in Figure 2.
Line Count
When this property is set, the input file is read in blocks of lines. The number of lines in each block is specified by this property, and each block is sent as a separate output message.
Header Count
The number of lines at the beginning of the file that must be skipped while reading the file. This is used when the first lines of the file are headers.
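The combined effect of Line Count and Header Count can be illustrated with a short generator. This is a minimal sketch assuming that header lines are skipped first and the remaining lines are then grouped into blocks; the function name `line_blocks` is hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def line_blocks(
    lines: Iterable[str], line_count: int, header_count: int = 0
) -> Iterator[List[str]]:
    """Skip header_count lines, then yield blocks of up to line_count lines."""
    if line_count <= 0:
        raise ValueError("line_count must be positive")
    iterator = iter(lines)
    # Consume and discard the header lines.
    for _ in islice(iterator, header_count):
        pass
    while True:
        block = list(islice(iterator, line_count))
        if not block:
            return
        yield block


rows = ["id,name", "1,alpha", "2,beta", "3,gamma"]
for block in line_blocks(rows, line_count=2, header_count=1):
    print(block)
```

The final block may hold fewer lines than the configured count when the file length is not an exact multiple of it.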
File Encoding
Figure 6: Different types of File Encoding
The encoding to be used while reading the file. The figure above shows all the encodings that can be used.
- ASCII
A character encoding standard used to represent plain text. It is based on the English alphabet.
- Cp1252
A character encoding of the Latin alphabet.
- UTF8
A variable-length character encoding for Unicode.
- UTF-16
Also a variable-length character encoding for Unicode. It maps each character to one or two 16-bit code units.
- ISO8859_1
ISO 8859-1, more formally cited as ISO/IEC 8859-1, is part 1 of ISO/IEC 8859, a standard character encoding of the Latin alphabet.
- EUC_KR
- EUC_JP
- EUC_CN
- EUC_TW
EUC_KR, EUC_JP, EUC_CN, EUC_TW are multi-byte character encoding systems used for Korean, Japanese, Simplified Chinese, and Traditional Chinese languages respectively.
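Why the selected encoding must match the file can be seen by decoding the same bytes in two ways. The sketch below uses Python's own codec names (`cp1252`, `utf-8`), which are an assumption for illustration and are not necessarily spelled the same as the entries in the CPS list; the helper name `decode_text` is hypothetical.

```python
def decode_text(data: bytes, encoding: str) -> str:
    """Decode raw file bytes using the given Python codec name."""
    return data.decode(encoding)


# In Cp1252 the accented letter is stored as the single byte 0xE9.
raw = "café".encode("cp1252")
print(decode_text(raw, "cp1252"))

try:
    # A lone 0xE9 byte is not a valid UTF-8 sequence.
    decode_text(raw, "utf-8")
except UnicodeDecodeError as exc:
    print("wrong encoding:", exc.reason)
```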
PreProcessing Command
The script or command to be executed before processing of the file starts. A command can be entered in the text area provided for this property in the CPS. To provide a script file, use the file dialog that opens when the ellipsis button is clicked.
By default, the component appends the absolute path of the file that is currently taken up for processing to this script / command, that is, the absolute path of the file would be the first argument to this script / command. More arguments for this command could be specified using the property PreProcessing Arguments.
The final command formed by the FileReader would be:
<PreProcessing Command> + <Absolute path of the file taken up for processing> + <PreProcessing Arguments>
PreProcessing Arguments
Arguments that are passed to preprocessing script or command. As mentioned in the PreProcessing Command section, the component, by default, appends the absolute path of the file that is currently taken up for processing to the PreProcessing Command. Any other arguments that need to be passed to the PreProcessing Command can be provided here.
The use of the PreProcessing Command and PreProcessing Arguments properties is explained in the sample scenario below.
Sample scenario:
Copying all the files present in the Error directory to a backup location before processing of a file starts.
Solution:
A batch file copyerrors.bat with content copy C:\FileReader\ErrorDir %2 is written and is placed in C:\. The path of this batch file is specified for PreProcessing Command. The backup location (C:\ProcessingFailures) is specified as the value for PreProcessing Arguments.
Let C:\test.txt be the file picked up for processing. With this configuration, the command formed by FileReader would be C:\copyerrors.bat C:\test.txt C:\ProcessingFailures. Because the batch file uses only its second argument (%2), the copy command finally executed would be copy C:\FileReader\ErrorDir C:\ProcessingFailures, which copies all the files present in C:\FileReader\ErrorDir to the backup location C:\ProcessingFailures.
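The way the final command is assembled in this scenario can be sketched as a simple string join. This is an illustration only: the function name `build_preprocessing_command` is hypothetical, and joining with single spaces, without any quoting of paths that contain spaces, is an assumption.

```python
def build_preprocessing_command(command: str, file_path: str, arguments: str = "") -> str:
    """Join command, file path, and extra arguments in the documented order."""
    parts = [command, file_path]
    extra = arguments.strip()
    if extra:
        # Extra arguments follow the file path, which is always the first argument.
        parts.append(extra)
    return " ".join(parts)


print(
    build_preprocessing_command(
        r"C:\copyerrors.bat", r"C:\test.txt", r"C:\ProcessingFailures"
    )
)
```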
Postprocessing Action
The action to be taken on the file after it is read successfully. The figure below shows the Postprocessing Actions that are allowed.
Figure 7: Postprocessing Actions
- DELETE
Delete the file after reading it successfully.
- MOVE
Move the file to a different location (specified by the property Postprocessing Directory which appears when this MOVE option is selected).
When MOVE is selected as the Postprocessing Action, two other properties become visible: "Postprocessing Directory" and "Append timestamp?".
Figure 8: MOVE action
- NO_ACTION
Take no action on the file.
Postprocessing Directory
The directory to which files are to be moved when they are read successfully, when MOVE is selected as the Postprocessing Action.
If Is configured on different machine? is disabled, clicking the ellipsis button will open a file dialog, as shown in Figure 3, where the directory can be chosen from the file system. Otherwise the path of the directory must be specified in the text field as shown in Figure 2.
Append timestamp?
Specifies if a time stamp has to be appended to the file names after they have been moved to the Postprocessing Directory.
- If enabled, FileReader adds a time stamp whose format is provided through the Timestamp format property and a counter (if Append counter? is enabled).
- If disabled, no timestamp is added to the files that are moved to the Postprocessing directory.
When the Append timestamp? option is enabled, two more supporting options appear: "Timestamp format" and "Append counter?".
Timestamp format
The format of the time stamp to be appended to the file name can be specified here. The descriptions of the symbols that can be used in the time stamp formats are shown below.
Figure 9: Symbols used in Timestamp format
Example: ddMMyyyy_HHmm.
Append counter
- If enabled, a counter is appended to the file name of each processed file in addition to the time stamp. Appending a counter to file names ensures that no two files in the Postprocessing directory have the same name. The name of the file would look like <filename><time stamp><counter>.
- If disabled, no counter is added to the files that have been moved to the Postprocessing directory.
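The naming layout <filename><time stamp><counter> can be illustrated as follows. This is a rough sketch under several assumptions: the three parts are concatenated directly with no separator, the example format ddMMyyyy_HHmm is read as day, month, year, hour and minute (written here as the Python `strftime` equivalent `%d%m%Y_%H%M`), and the function name `postprocessed_name` and file name `orders.csv` are hypothetical.

```python
from datetime import datetime
from typing import Optional


def postprocessed_name(
    filename: str,
    moved_at: datetime,
    strftime_format: str = "%d%m%Y_%H%M",
    counter: Optional[int] = None,
) -> str:
    """Build <filename><time stamp><counter>; the counter part is optional."""
    name = filename + moved_at.strftime(strftime_format)
    if counter is not None:
        name += str(counter)
    return name


moved_at = datetime(2024, 3, 5, 14, 7)
print(postprocessed_name("orders.csv", moved_at))
print(postprocessed_name("orders.csv", moved_at, counter=3))
```

Two files moved within the same minute would receive the same time stamp, which is the case the counter is meant to disambiguate.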
Validate Input
If enabled, the input request sent to the FileReader is validated against the input port XSD of the component. This is an Expert property.
Output XSD
This property is used to set the schema of the output message. If the file content is expected to be an XML, setting its schema on the output port using the Output XSD can be useful for applying transformations on the output message. The XSD can be provided using the Schema Editor as shown below.
Figure 10: Schema editor to provide Output XSD
Header Properties
Table 1 shows the descriptions of header properties set by the component on the output message when Flat/Binary files are processed.
Type of the file processed | Header property | Description |
---|---|---|
Flat/Binary | FileName | Name of the file being read. |
Flat/Binary | FilePath | Path of the directory which holds the source file. |
Flat/Binary | Size | The size of the file being read. |
Flat/Binary | START_EVENT | When set to true, indicates that the message is the first record in the set of responses generated for an input message. |
Flat/Binary | CLOSE_EVENT | When set to true, indicates that the message is the last record in the set of responses generated for an input message. |
Flat/Binary | RECORD_INDEX | A value n for this property indicates that this is the nth response generated for an input message. |
Flat | FullName | Absolute path of the processed file. |
Flat | ReadAccess | Indicates whether the processed file is readable. |
Flat | WriteAccess | Indicates whether the processed file is writable. |
Flat | Type | File / Directory. |
Binary | NEW | When set to true, indicates that this is the first chunk of the binary file being read. |
Binary | COMPLETE | When set to true, indicates that this is the last chunk of the binary file being read. |
Binary | START_INDEX | The offset of the first byte of the current chunk read. |
Binary | END_INDEX | The offset of the last byte of the current chunk read. |
Table 1: Header Properties
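The binary-specific headers in Table 1 can be pictured by listing the values each chunk would carry. The sketch below is an assumption-laden illustration, not the component's behavior: offsets are taken to be zero-based, END_INDEX is taken to be inclusive, a chunk size of 0 is treated as one chunk for the whole file, and the function name `chunk_headers` is hypothetical.

```python
from typing import Dict, Iterator, Union

Header = Dict[str, Union[bool, int]]


def chunk_headers(file_size: int, chunk_size: int) -> Iterator[Header]:
    """Yield NEW, COMPLETE, START_INDEX and END_INDEX for each chunk."""
    if file_size <= 0:
        return
    # Assumption: chunk size 0 means the whole file is sent as one chunk.
    step = chunk_size if chunk_size > 0 else file_size
    for start in range(0, file_size, step):
        end = min(start + step, file_size) - 1  # inclusive offset of the last byte
        yield {
            "NEW": start == 0,
            "COMPLETE": end == file_size - 1,
            "START_INDEX": start,
            "END_INDEX": end,
        }


for header in chunk_headers(file_size=10, chunk_size=4):
    print(header)
```

A consumer that reassembles the file can stop once it sees a message whose COMPLETE value is true.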
Process Files in Order
If enabled, files will be processed in the order based on the properties that follow.
Figure 11: Property to choose the order in which files will be processed
Process Files in Order Based On
- FileName
Files are processed in order of their file names.
- Last Modified Time
Files are processed in order of their last modified time.
Ascending Order
If enabled, files will be processed in ascending order.
Pattern
Files will be processed in order according to the pattern specified.
Pattern Value Type
Values returned after applying pattern on file names. Values can be of the following types:
- Int
- Long
- String
- Date
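Ordering by FileName or Last Modified Time, in ascending or descending order, can be sketched as below. This is an illustration only: the function name `order_files`, the string `"LastModifiedTime"` and the sample data are hypothetical, and pattern-based ordering is left out because the pattern syntax is not described here.

```python
from typing import List, Tuple


def order_files(
    files: List[Tuple[str, float]],
    based_on: str = "FileName",
    ascending: bool = True,
) -> List[str]:
    """Order (name, last_modified) pairs and return the names."""
    if based_on == "FileName":
        key = lambda entry: entry[0]
    elif based_on == "LastModifiedTime":
        key = lambda entry: entry[1]
    else:
        raise ValueError(f"unsupported ordering: {based_on}")
    ordered = sorted(files, key=key, reverse=not ascending)
    return [name for name, _ in ordered]


files = [("b.txt", 30.0), ("a.txt", 20.0), ("c.txt", 10.0)]
print(order_files(files))
print(order_files(files, based_on="LastModifiedTime", ascending=False))
```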
Input and Output
Input
When FileReader is not in scheduling mode, messages specifying the file to be read and its location can be sent to the input port of the component. The schema of the input XML message is shown in Figure 12.
Figure 12: Schema of the input message
- FileName is the name of the file which is to be read.
- Directory is the location of the file.
Output
The output schema depends on the configuration of property Output XSD. Schema provided for this property is directly set as schema on output port. For more information, please refer to Output XSD in Interaction Configurations section.
Testing the Interaction Configurations
Interaction configurations can be tested from the CPS by clicking the Test button. The figures below show a sample input and the corresponding output.
Figure 13: Sample input
Figure 14: Output produced for sample input shown in Figure 13
Functional Demonstration
Scenario 1
Reading simple text files and displaying the contents.
Configure the FileReader as described in the Configuration and Testing sections, and use the Feeder and Display components to send sample input and check the response, respectively.
Figure 15: Demonstrating scenario 1 with sample input and output
Input Message
Output Message
Contents of the input file appear here.
Use Case Scenario
In a revenue control packet scenario, transaction files are read and then transformed.
Figure 16: Revenue Control Packet Scenario
The event process demonstrating this scenario is bundled with the installer.
Documentation of the scenario and instructions to run the flow can be found in the Help tab of the flow when it is opened in Studio.
Useful Tips
Best Practices to read a file from a network drive using the FileReader component:
- To access a file on a network drive, the FileReader component needs full permissions on that directory. Enable the option "Allow Network Users to change my files" while sharing the directory.
- If you do not have permission to change the files, disable the Use Working Directory property in the FileReader Custom Property Sheet.
- If the peer server is run as a Windows/Linux service, the network drive may not yet be mounted when the peer server starts the file components. In such a case, making the peer server service dependent on the service that mounts the network drive will help.