Performance Best Practices

This page describes best practices for import performance.

Import Job Settings

When importing fileshare data stored on ADLS, we strongly recommend providing extracted text and other long text files encoded in UTF-16. Files that are already in UTF-16 do not need to be converted to the target encoding, which saves significant time in both the document and image workflows.

For the document workflow, set FieldMapping.Encoding to UTF-16. For the image workflow, set ImageSettings.ExtractedTextEncoding to UTF-16. With these settings in place, the conversion step is skipped and the files are copied directly in Unicode encoding, which shortens processing time.
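
The settings above only help if the text files are already UTF-16 on disk. As a minimal sketch of preparing them before the import, the following uses standard .NET file APIs to re-encode a text file. The class and method names are hypothetical, the source file is assumed to be UTF-8, and the choice of little-endian UTF-16 with a byte order mark is an assumption rather than a documented requirement of the import service.

```csharp
using System.IO;
using System.Text;

public static class ExtractedTextPreparation
{
    // Hypothetical helper: rewrites a text file as UTF-16.
    // Assumption: the source file is UTF-8; change the read encoding for other inputs.
    public static void ConvertToUtf16(string sourcePath, string targetPath)
    {
        string text = File.ReadAllText(sourcePath, Encoding.UTF8);

        // Encoding.Unicode is .NET's UTF-16 little-endian encoding.
        File.WriteAllText(targetPath, text, Encoding.Unicode);
    }
}
```

Running a conversion like this once, before the job starts, moves the encoding cost out of the import itself.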

C# Builders

// Document workflow: extracted text is read from separate files encoded in UTF-16.
// The *ColumnIndex variables are load file column positions defined elsewhere.
ImportDocumentSettings importDocuments = ImportDocumentSettingsBuilder.Create()
    .WithAppendMode()
    .WithNatives(x => x
        .WithFilePathDefinedInColumn(filePathColumnIndex)
        .WithFileNameDefinedInColumn(fileNameColumnIndex))
    .WithoutImages()
    .WithFieldsMapped(x => x
        .WithField(controlNumberColumnIndex, "Control Number")
        .WithExtractedTextField(extractedTextPathColumnIndex, e => e
            .WithExtractedTextInSeparateFiles(f => f
                .WithEncoding("UTF-16")
                .WithFileSizeDefinedInColumn(fileSizeColumnIndex))))
    .WithoutFolders();

// Image workflow: extracted text that accompanies the images is also UTF-16.
ImportDocumentSettings importImages = ImportDocumentSettingsBuilder.Create()
    .WithAppendMode()
    .WithoutNatives()
    .WithImages(i => i
        .WithAutoNumberImages()
        .WithoutProduction()
        .WithExtractedText(e => e.WithEncoding("UTF-16"))
        .WithFileTypeAutoDetection())
    .WithoutFieldsMapped()
    .WithoutFolders();

FileSizeColumnIndex

Another setting that can improve performance is FieldMapping.FileSizeColumnIndex. When it is configured, file sizes are read from the load file, so they do not have to be calculated separately during the import, which saves processing time.

The FileSizeColumnIndex setting takes effect only if FieldMapping.ContainsFilePath is set to true and FieldMapping.Encoding is set to UTF-16. It applies only to long text fields stored in Data Grid, including Extracted Text.
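
For the sizes to be available in the load file, they have to be written there when the load file is generated. The sketch below shows one way to do that with standard .NET APIs. The class, method, and column order are hypothetical, the delimiter is passed in by the caller, and it assumes the size is expressed in bytes, which the text above does not state.

```csharp
using System.Globalization;
using System.IO;

public static class LoadFileRows
{
    // Hypothetical helper: builds one load file row holding the control number,
    // the extracted text file path, and that file's size.
    // Assumption: the size column holds the file length in bytes.
    public static string BuildRow(string controlNumber, string extractedTextPath, char delimiter)
    {
        long sizeInBytes = new FileInfo(extractedTextPath).Length;

        return string.Join(
            delimiter.ToString(),
            controlNumber,
            extractedTextPath,
            sizeInBytes.ToString(CultureInfo.InvariantCulture));
    }
}
```

The position of the size value in each row would then be the index passed to the file size column setting.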