This page describes best practices for import performance.
Import Job Settings
For better performance when importing fileshare data on ADLS, we highly recommend supplying extracted text and other long text files already encoded in UTF-16. The import can then skip the conversion to the correct encoding, which saves significant time in both the document and image workflows.
For the document workflow, set FieldMapping.Encoding to UTF-16. For the image workflow, set ImageSettings.ExtractedTextEncoding to UTF-16. With these settings in place, no conversion is performed and the files are copied directly in their Unicode encoding, which results in faster processing.
C# Builders
// Document workflow: extracted text is read from separate UTF-16 files,
// and file sizes are taken from a load file column.
ImportDocumentSettings importDocuments = ImportDocumentSettingsBuilder.Create()
    .WithAppendMode()
    .WithNatives(x => x
        .WithFilePathDefinedInColumn(filePathColumnIndex)
        .WithFileNameDefinedInColumn(fileNameColumnIndex))
    .WithoutImages()
    .WithFieldsMapped(x => x
        .WithField(controlNumberColumnIndex, "Control Number")
        .WithExtractedTextField(extractedTextPathColumnIndex, e => e
            .WithExtractedTextInSeparateFiles(f => f
                .WithEncoding("UTF-16")
                .WithFileSizeDefinedInColumn(fileSizeColumnIndex))))
    .WithoutFolders();

// Image workflow: extracted text encoding is set to UTF-16.
ImportDocumentSettings importImages = ImportDocumentSettingsBuilder.Create()
    .WithAppendMode()
    .WithoutNatives()
    .WithImages(i => i
        .WithAutoNumberImages()
        .WithoutProduction()
        .WithExtractedText(e => e.WithEncoding("UTF-16"))
        .WithFileTypeAutoDetection())
    .WithoutFieldsMapped()
    .WithoutFolders();
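The UTF-16 recommendation above only helps if the extracted text files are already stored in that encoding before the import job runs. The following is a minimal sketch of how files could be re-encoded ahead of time using only the .NET base class library. The class ExtractedTextPreparer and its methods are hypothetical helpers, not part of the import SDK, and the sketch assumes that UTF-16 little-endian with a byte order mark (what .NET's Encoding.Unicode produces) is the expected form.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of the import SDK.
public static class ExtractedTextPreparer
{
    // Re-encodes a text file as UTF-16 LE with a BOM (Encoding.Unicode).
    // sourceEncoding is the caller's assumption about the input file; a BOM
    // in the source file, if present, takes precedence over it.
    public static void ConvertToUtf16(string sourcePath, string targetPath, Encoding sourceEncoding)
    {
        using (var reader = new StreamReader(sourcePath, sourceEncoding, true))
        using (var writer = new StreamWriter(targetPath, false, Encoding.Unicode))
        {
            // Copy in chunks so that large text files are not loaded into memory at once.
            var buffer = new char[8192];
            int read;
            while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
            {
                writer.Write(buffer, 0, read);
            }
        }
    }

    // Heuristic check: returns true when the file starts with the UTF-16 LE BOM (FF FE).
    // A UTF-32 LE file starts with the same two bytes, so treat this as a quick
    // pre-check rather than a full encoding detection.
    public static bool HasUtf16LeBom(string path)
    {
        using (var stream = File.OpenRead(path))
        {
            int first = stream.ReadByte();
            int second = stream.ReadByte();
            return first == 0xFF && second == 0xFE;
        }
    }
}
```

A possible usage is to call HasUtf16LeBom for each extracted text file and run ConvertToUtf16 only for the files that fail the check, writing the result to a new path that the load file then references.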
FileSizeColumnIndex
Another setting that can improve performance is FieldMapping.FileSizeColumnIndex. When it is configured, file sizes are read directly from the load file, so the import does not need to calculate them separately, which saves processing time.
The FileSizeColumnIndex setting takes effect only if FieldMapping.ContainsFilePath is set to true and FieldMapping.Encoding is set to UTF-16. This property applies only to long text fields stored in Data Grid, including Extracted Text.
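For FileSizeColumnIndex to be useful, the load file has to contain a column with the size of each referenced text file. The following is a minimal sketch of how such a column could be added while the load file is being generated. The class LoadFileSizeColumn, the delimiter parameter, and the choice to express the size in bytes are assumptions made for illustration; the sketch also uses a naive split that does not handle quoted values containing the delimiter.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper for illustration only; not part of the import SDK.
public static class LoadFileSizeColumn
{
    // Appends the size (in bytes, an assumption) of the file referenced in
    // column pathColumnIndex as a new trailing column of the load file row.
    public static string AppendFileSize(string row, char delimiter, int pathColumnIndex)
    {
        // Naive split: assumes the delimiter never appears inside a value.
        string[] columns = row.Split(delimiter);

        // FileInfo.Length throws FileNotFoundException if the file does not exist.
        long sizeInBytes = new FileInfo(columns[pathColumnIndex]).Length;

        return row + delimiter + sizeInBytes.ToString(CultureInfo.InvariantCulture);
    }
}
```

The zero-based index of the appended column is then the value that would be passed as the file size column index in the job settings.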