Structured Analytics (.NET)

The Structured Analytics API supports the automation of various structured analytics workflows. It provides functionality for running an analysis of a structured analytics set, checking the status of the analysis, retrieving document and set errors, and performing other tasks. It also supports progress indicators and cancellation tokens by providing overloaded methods with these options.

For example, you may want to use the Structured Analytics API to implement a custom workflow for running structured analytics sets. With this API, you can automate the process for the analysis and monitoring of these sets. It provides an alternative to manually performing these tasks through the Relativity UI.

You can also use the Structured Analytics Manager service through REST. However, this service doesn't support cancellation tokens or progress indicators through REST. For more information, see Structured Analytics in REST.

Fundamentals for the Structured Analytics API

The Analytics application includes functionality that you can use to run structured analytics operations. These operations identify differences and similarities between documents added to a structured analytics set. You create a structured analytics set by selecting a saved search with the documents for analysis, the operations that you want to execute, the Analytics server used for this process, and other options. You can select the following structured analytics operations through the Relativity UI:

  • Email threading - groups related email messages and performs other tasks.
  • Textual near duplicate identification - identifies the level of similarity between documents.
  • Language identification - identifies primary and secondary languages in the documents.
  • Repeated content identification - identifies content reused in documents, such as footers.
  • Name normalization - identifies aliases within email headers, such as a proper name, email address, and other items. It then assigns these aliases to entities that are imported into Relativity and linked to document fields. This process simplifies the identification of senders and recipients of email messages, because multiple aliases are frequently used for them.

For more information, see Structured Analytics on the Relativity Documentation site.

Structured Analytics API

The Structured Analytics API includes the following namespaces that contain the methods, classes, and enumerations needed to automate the analysis of structured analytics sets:

For additional information, see the class libraries for the Structured Analytics API at Class library reference.

Guidelines for the Structured Analytics Manager service

Use the following guidelines when working with the Structured Analytics Manager service:

  • Execute only valid operations for the state of a Structured Analytics Set - Only certain operations are valid for the current state of your Structured Analytics Set. See the following examples:
    • A call to the CancelAsync() method is only valid when an analysis is currently running.
    • A call to the RetryErrorsAsync() method is only valid when a Structured Analytics Set is in an errored state.

    To get a list of valid operations, call the GetValidTasksAsync() method. The ValidTaskResult object returned from this call contains a list of valid operations. If you attempt to execute an invalid operation, you receive a 400 status response.

  • Monitor the analysis progress - After making a successful call to the RunAsync() method, call the GetStatusAsync() method to monitor the progress of your analysis.
  • Use Structured Analytics Set RDOs - The Structured Analytics API doesn't expose standard CRUD operations for Structured Analytics Set RDOs. Use the standard Relativity REST API to create Structured Analytics Set RDOs. For REST API information, see Relativity Dynamic Object (RDO), and for the Relativity Services API, see RDO.
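
The first guideline can be sketched as a small guard that checks the list of valid operations before it invokes one. This is a minimal sketch under stated assumptions, not part of the Structured Analytics API: the ValidTaskGuard class and its RunIfValidAsync() method are hypothetical, and the two delegates stand in for service calls such as GetValidTasksAsync() and CancelAsync().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ValidTaskGuard
{
    // Hypothetical helper: invokes 'operation' only when 'wanted' appears in the
    // tasks returned by 'getValidTasks'. Returns false without invoking the
    // operation when the task is not currently valid.
    public static async Task<bool> RunIfValidAsync<TTask>(
        Func<Task<IEnumerable<TTask>>> getValidTasks,
        TTask wanted,
        Func<Task<bool>> operation)
    {
        IEnumerable<TTask> validTasks = await getValidTasks();

        if (validTasks == null || !validTasks.Contains(wanted))
        {
            return false;
        }

        return await operation();
    }
}
```

In practice, the first delegate might return the ValidTasks collection from the ValidTaskResult object, and the second might return the IsSuccess value of an OperationResult. Skipping an operation that isn't valid avoids the 400 status response described above.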

Prerequisites for the Structured Analytics API

Complete the following prerequisites to begin development with the Structured Analytics API:

Code samples for the Structured Analytics API

Review the code samples in the following sections to learn about using the methods available in the Structured Analytics API. These code samples illustrate the following best practices to use when calling methods on the Structured Analytics Manager service:

  • Use helper classes to create the proxy and select an appropriate authentication type. See Relativity API Helpers.
  • Create the Services API proxy within a using block in your code.
  • Call specific methods on the Structured Analytics Manager service within a try-catch block.
  • Use the async/await design pattern. See Basic Services API concepts.
  • Use the logging framework for troubleshooting and debugging purposes. See Log from a Relativity application.

Generate result fields before running an analysis

You can optionally call the RunAnalysisPreparationAsync() method when you want to generate fields for storing results before you run an analysis on a new structured analytics set. For example, you might use the following workflow in this case:

  • Create a new structured analytics set.
  • Call the RunAnalysisPreparationAsync() method to generate fields used to store results from an analysis.
  • Use the fields generated by the RunAnalysisPreparationAsync() method to create views and saved searches. You can also use this process to add a structured analytics set to a template workspace.
  • Call the RunAsync() method to complete the analysis of the structured analytics set, and store results in the fields that you generated.

To generate these result fields, call the RunAnalysisPreparationAsync() method by passing the Artifact ID for the structured analytics set and the Artifact ID for the workspace that contains it. You can also optionally pass ProgressReport or CancellationToken objects to this method.

public void RunSample()
{
    int workspaceId = Constants.WORKSPACE_ID;
    int sasArtifactId = Constants.SAS_ARTIFACT_ID;

    // Create the proxy in a using block so that it is disposed after the call.
    using (IStructuredAnalyticsManager sasManager = ServiceProxyHelper.CreateProxy<IStructuredAnalyticsManager>())
    {
        OperationResult results = sasManager.RunAnalysisPreparationAsync(workspaceId, sasArtifactId).GetAwaiter().GetResult();

        if (results.IsSuccess)
        {
            System.Diagnostics.Debug.WriteLine($"Call to run Analysis Preparation on SAS {sasArtifactId} on workspace {workspaceId} succeeded");
        }
        else
        {
            throw new Exception($"Call to run Analysis Preparation on SAS {sasArtifactId} on workspace {workspaceId} failed.  Error message: {results.Message}");
        }
    }
}

Retrieve valid operations for a structured analytics set

You can use the GetValidTasksAsync() method to retrieve all valid operations that you can run for a specific structured analytics set, such as retrying errors, running an analysis, and others. For a complete list of tasks, see the StructuredAnalyticsTask enumeration in the Structured Analytics API on the Relativity API reference page.

public void RunSample()
{
    int workspaceId = Constants.WORKSPACE_ID;
    int sasArtifactId = Constants.SAS_ARTIFACT_ID;

    // Create the proxy in a using block so that it is disposed after the call.
    using (IStructuredAnalyticsManager sasManager = ServiceProxyHelper.CreateProxy<IStructuredAnalyticsManager>())
    {
        ValidTaskResult results = sasManager.GetValidTasksAsync(workspaceId, sasArtifactId).GetAwaiter().GetResult();

        if (results != null)
        {
            // Select() requires a using directive for System.Linq.
            System.Diagnostics.Debug.WriteLine($"Call to get valid tasks on SAS {sasArtifactId} on workspace {workspaceId} succeeded.  Copy to legacy is legal: {results.CopyToLegacyQualified}, Valid tasks: {string.Join(",", results.ValidTasks.Select(t => t.ToString()))}");
        }
        else
        {
            throw new Exception($"Call to get valid tasks on SAS {sasArtifactId} on workspace {workspaceId} failed.");
        }
    }
}

Run an analysis of a structured analytics set

To analyze documents in a structured analytics set, call the RunAsync() method by passing the Artifact ID for the structured analytics set, the Artifact ID for the workspace that contains it, and an AnalysisSettings object. The properties on this object specify whether all documents are repopulated and updated with the analysis results, or only new documents undergo these processes. You can also optionally pass ProgressReport or CancellationToken objects to this method.

public void RunSample()
{
    int workspaceId = Constants.WORKSPACE_ID;
    int sasArtifactId = Constants.SAS_ARTIFACT_ID;

    AnalysisSettings runAnalysisSettings = new AnalysisSettings();

    // This AnalysisSettings configuration runs the equivalent of a Pre-Smart Ingestion Full Analysis.
    runAnalysisSettings.AnalyzeAll = true;
    runAnalysisSettings.PopulateAll = true;

    // This AnalysisSettings configuration runs the equivalent of a Pre-Smart Ingestion Incremental Analysis.
    //runAnalysisSettings.AnalyzeAll = false;
    //runAnalysisSettings.PopulateAll = false;

    // This AnalysisSettings configuration adds new documents to the staging area, removes documents from
    // the staging area that no longer exist in the target saved search, and runs a full analysis.
    //runAnalysisSettings.AnalyzeAll = true;
    //runAnalysisSettings.PopulateAll = false;

    // This AnalysisSettings configuration is invalid and throws a validation error.
    //runAnalysisSettings.AnalyzeAll = false;
    //runAnalysisSettings.PopulateAll = true;

    // Create the proxy in a using block so that it is disposed after the call.
    using (IStructuredAnalyticsManager sasManager = ServiceProxyHelper.CreateProxy<IStructuredAnalyticsManager>())
    {
        OperationResult results = sasManager.RunAsync(workspaceId, sasArtifactId, runAnalysisSettings).GetAwaiter().GetResult();

        if (results.IsSuccess)
        {
            System.Diagnostics.Debug.WriteLine($"Call to run analysis on SAS {sasArtifactId} on workspace {workspaceId} succeeded");
        }
        else
        {
            throw new Exception($"Call to run analysis on SAS {sasArtifactId} on workspace {workspaceId} failed.  Error message: {results.Message}");
        }
    }
}
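
The four combinations described in the comments of the preceding sample reduce to one rule: PopulateAll can be true only when AnalyzeAll is also true. The following minimal sketch expresses that rule as a local check that you might run before calling RunAsync(). The AnalysisSettingsRules class and its IsSupportedCombination() method are hypothetical names used for illustration. They aren't part of the Structured Analytics API, and the service still performs its own validation.

```csharp
public static class AnalysisSettingsRules
{
    // Hypothetical pre-check that mirrors the combinations listed in the sample:
    // only AnalyzeAll = false with PopulateAll = true is described as invalid.
    public static bool IsSupportedCombination(bool analyzeAll, bool populateAll)
    {
        return analyzeAll || !populateAll;
    }
}
```

For example, you could call AnalysisSettingsRules.IsSupportedCombination(settings.AnalyzeAll, settings.PopulateAll) and skip the service call when it returns false, instead of waiting for the validation error.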

Cancel an analysis

The Structured Analytics Manager service includes the CancelAsync() and CancelAndWaitAsync() methods that you can use to stop an analysis of a structured analytics set. These overloaded methods take the same parameters, but return control to the caller at different times in their execution. They also both support progress monitoring and cancellation tokens.

CancelAsync() method

The CancelAsync() method makes a request to cancel the analysis of a structured analytics set that is currently running. It then returns control immediately to the caller. Use this method if you want to cancel an analysis on a structured analytics set, but aren't interested in waiting for the cancel operation to complete. This approach is useful when you don't intend to perform another operation after the cancel operation finishes.

public async Task<bool> CancelAsync(Client.SamplesLibrary.Helper.IHelper helper, int workspaceId, int sasArtifactId)
{
    OperationResult results = null;
 
    using (IStructuredAnalyticsManager sasManager = helper.GetServicesManager().CreateProxy<IStructuredAnalyticsManager>(ExecutionIdentity.User))
    {
        try
        {
            results = await sasManager.CancelAsync(workspaceId, sasArtifactId);
        }
        catch (ServiceException exception)
        {
            ISampleLogger logger = Client.SamplesLibrary.Logging.Log.Logger.ForContext<IStructuredAnalyticsManager>();
            logger.LogError(exception,
                "StructuredAnalyticsManager CancelAsync failed for Workspace ID {0}, SAS ID {1}", workspaceId, sasArtifactId);
        }
    }

    return results != null && results.IsSuccess;
}

CancelAndWaitAsync() method

The CancelAndWaitAsync() method makes a request to cancel the analysis of a structured analytics set. It then waits until the operation finishes before returning control to the caller. Use this method if you want to implement a workflow that cancels an analysis job, and on the return of the call, immediately uses the RunAsync() method to restart an analysis operation.

public async Task<bool> CancelAndWaitAsync(Client.SamplesLibrary.Helper.IHelper helper, int workspaceId, int sasArtifactId)
{
    OperationResult results = null;
 
    using (IStructuredAnalyticsManager sasManager = helper.GetServicesManager().CreateProxy<IStructuredAnalyticsManager>(ExecutionIdentity.User))
    {
        try
        {
            results = await sasManager.CancelAndWaitAsync(workspaceId, sasArtifactId);
        }
        catch (ServiceException exception)
        {
            ISampleLogger logger = Client.SamplesLibrary.Logging.Log.Logger.ForContext<IStructuredAnalyticsManager>();
            logger.LogError(exception,
                "StructuredAnalyticsManager CancelAndWaitAsync failed for Workspace ID {0}, SAS ID {1}", workspaceId, sasArtifactId);
        }
    }

    return results != null && results.IsSuccess;
}

Check the status of an analysis

Use the GetStatusAsync() method to return a Status object. This object contains information about the state of the job and its current operations. For more information, see Fundamentals for the Structured Analytics API.

public async Task<Status> GetStatusAsync(Client.SamplesLibrary.Helper.IHelper helper, int workspaceId, int sasArtifactId)
{
    Status statusResults = null;
 
    using (IStructuredAnalyticsManager sasManager = helper.GetServicesManager().CreateProxy<IStructuredAnalyticsManager>(ExecutionIdentity.User))
    {
        try
        {
            statusResults = await sasManager.GetStatusAsync(workspaceId, sasArtifactId);
        }
        catch (ServiceException exception)
        {
            ISampleLogger logger = Client.SamplesLibrary.Logging.Log.Logger.ForContext<IStructuredAnalyticsManager>();
            logger.LogError(exception,
                "StructuredAnalyticsManager GetStatusAsync failed for Workspace ID {0}, SAS ID {1}", workspaceId, sasArtifactId);
        }
    }

    return statusResults;
}
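
To monitor an analysis after a successful call to RunAsync(), you can call GetStatusAsync() repeatedly until the job reaches a state that you treat as finished. The following is a minimal polling sketch, not part of the Structured Analytics API. The StatusPolling class and PollUntilAsync() method are hypothetical, the getStatus delegate stands in for a call such as the GetStatusAsync() sample above, and the isFinished predicate is left to the caller because this page doesn't list the properties of the Status object.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StatusPolling
{
    // Hypothetical helper: calls 'getStatus' until 'isFinished' returns true,
    // waiting 'interval' between calls. Throws OperationCanceledException when
    // 'cancellationToken' is signaled.
    public static async Task<TStatus> PollUntilAsync<TStatus>(
        Func<Task<TStatus>> getStatus,
        Func<TStatus, bool> isFinished,
        TimeSpan interval,
        CancellationToken cancellationToken = default(CancellationToken))
    {
        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();

            TStatus status = await getStatus();
            if (isFinished(status))
            {
                return status;
            }

            await Task.Delay(interval, cancellationToken);
        }
    }
}
```

Choose an interval that doesn't flood the service with requests, and pass a cancellation token so that the loop can be stopped if the job never reaches a finished state.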

Retry errors in an analysis

Use the RetryErrorsAsync() method to resolve transient errors that occurred during an analysis.

public async Task<bool> RetryErrorsAsync(Client.SamplesLibrary.Helper.IHelper helper, int workspaceId, int sasArtifactId)
{
    OperationResult results = null;
 
    using (IStructuredAnalyticsManager sasManager = helper.GetServicesManager().CreateProxy<IStructuredAnalyticsManager>(ExecutionIdentity.User))
    {
        try
        {
            results = await sasManager.RetryErrorsAsync(workspaceId, sasArtifactId);
        }
        catch (ServiceException exception)
        {
            ISampleLogger logger = Client.SamplesLibrary.Logging.Log.Logger.ForContext<IStructuredAnalyticsManager>();
            logger.LogError(exception,
                "StructuredAnalyticsManager RetryErrorsAsync failed for Workspace ID {0}, SAS ID {1}", workspaceId, sasArtifactId);
        }
    }

    return results != null && results.IsSuccess;
}

Retrieve analysis errors for a structured analytics set

The GetErrorsAsync() method retrieves set errors. These errors aren't document specific, and they cause an analysis to stop. For example, a set error may be a validation error for a structured analytics set, or it may occur when the Analytics server is disabled or goes down. This method returns a list of StructuredAnalyticsSetError objects. For more information, see Fundamentals for the Structured Analytics API.

public async Task<List<StructuredAnalyticsSetError>> GetSetErrorsAsync(Client.SamplesLibrary.Helper.IHelper helper, int workspaceId, int sasArtifactId, int indexOfFirstError, int maxNumberOfErrorsToReturn)
{
    List<StructuredAnalyticsSetError> results = null;
 
    using (IStructuredAnalyticsManager sasManager = helper.GetServicesManager().CreateProxy<IStructuredAnalyticsManager>(ExecutionIdentity.User))
    {
        try
        {
            results = await sasManager.GetErrorsAsync(workspaceId, sasArtifactId, indexOfFirstError, maxNumberOfErrorsToReturn);
        }
        catch (ServiceException exception)
        {
            ISampleLogger logger = Client.SamplesLibrary.Logging.Log.Logger.ForContext<IStructuredAnalyticsManager>();
            logger.LogError(exception,
                "StructuredAnalyticsManager GetErrorsAsync failed for Workspace ID {0}, SAS ID {1}", workspaceId, sasArtifactId);
        }
    }

    return results;
}

Retrieve document errors for an analysis

The GetDocumentErrorsAsync() method retrieves errors that are associated with the processing of a specific document. For example, this type of error occurs when a document is too large to be extracted or ingested in the Analytics engine, or when the Analytics engine fails to process a document due to corruption. This method returns a list of DocumentError objects.

public async Task<List<DocumentError>> GetDocumentErrorsAsync(Client.SamplesLibrary.Helper.IHelper helper, int workspaceId, int sasArtifactId, int indexOfFirstError, int maxNumberOfErrorsToReturn)
{
    List<DocumentError> results = null;
 
    using (IStructuredAnalyticsManager sasManager = helper.GetServicesManager().CreateProxy<IStructuredAnalyticsManager>(ExecutionIdentity.User))
    {
        try
        {
            results = await sasManager.GetDocumentErrorsAsync(workspaceId, sasArtifactId, indexOfFirstError, maxNumberOfErrorsToReturn);
        }
        catch (ServiceException exception)
        {
            ISampleLogger logger = Client.SamplesLibrary.Logging.Log.Logger.ForContext<IStructuredAnalyticsManager>();
            logger.LogError(exception,
                "StructuredAnalyticsManager GetDocumentErrorsAsync failed for Workspace ID {0}, SAS ID {1}", workspaceId, sasArtifactId);
        }
    }

    return results;
}
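
Both GetErrorsAsync() and GetDocumentErrorsAsync() take the index of the first error and the maximum number of errors to return, so you can retrieve a long error list in pages. The following minimal sketch shows one way to loop over those pages. The ErrorPaging class and FetchAllAsync() method are hypothetical and aren't part of the Structured Analytics API. The sketch assumes that the service returns errors in a stable order and returns a short or empty list after the last error. The starting index is a parameter because this page doesn't state whether the index is zero-based.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ErrorPaging
{
    // Hypothetical helper: calls 'fetchPage(indexOfFirstError, pageSize)' repeatedly
    // and collects the results until a page is null, empty, or shorter than 'pageSize'.
    public static async Task<List<T>> FetchAllAsync<T>(
        Func<int, int, Task<List<T>>> fetchPage,
        int startIndex,
        int pageSize)
    {
        if (pageSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(pageSize));
        }

        var all = new List<T>();
        int index = startIndex;

        while (true)
        {
            List<T> page = await fetchPage(index, pageSize);
            if (page == null || page.Count == 0)
            {
                break;
            }

            all.AddRange(page);

            if (page.Count < pageSize)
            {
                break; // A short page is assumed to be the last one.
            }

            index += page.Count;
        }

        return all;
    }
}
```

The fetchPage delegate could wrap either of the samples above, for example (index, max) => GetDocumentErrorsAsync(helper, workspaceId, sasArtifactId, index, max). Because those samples return null when a ServiceException is logged, the loop treats a null page as the end of the list.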

Work with legacy document fields

As of Relativity 9.5.196.102, Structured Analytics no longer writes results for email threading and textual near duplicate identification to Document fields. Instead, it writes results to fields on the Structured Analytics Results object. However, you can copy these results to the legacy document result fields by calling the RunCopyToLegacyAsync() method. It copies the results from the newly created fields to the legacy document fields used in versions of Relativity before 9.5.196. For more information, see Copy to Legacy Document Fields on the Relativity Documentation site.

Copy to legacy document fields

To copy analysis results to legacy document fields, call the RunCopyToLegacyAsync() method by passing the Artifact ID for the structured analytics set and the workspace that contains it. You can also optionally pass ProgressReport or CancellationToken objects to this method.

public void RunSample()
{
    int workspaceId = Constants.WORKSPACE_ID;
    int sasArtifactId = Constants.SAS_ARTIFACT_ID;

    // Create the proxy in a using block so that it is disposed after the call.
    using (IStructuredAnalyticsManager sasManager = ServiceProxyHelper.CreateProxy<IStructuredAnalyticsManager>())
    {
        OperationResult results = sasManager.RunCopyToLegacyAsync(workspaceId, sasArtifactId).GetAwaiter().GetResult();

        if (results.IsSuccess)
        {
            System.Diagnostics.Debug.WriteLine($"Call to run CopyToLegacy on SAS {sasArtifactId} on workspace {workspaceId} succeeded");
        }
        else
        {
            throw new Exception($"Call to run CopyToLegacy on SAS {sasArtifactId} on workspace {workspaceId} failed.  Error message: {results.Message}");
        }
    }
}

Cancel copy of content to legacy fields

To cancel a copy operation, call the CancelCopyToLegacyAsync() method by passing the Artifact ID for the structured analytics set and the workspace that contains it. You can also optionally pass ProgressReport or CancellationToken objects to this method.

public void RunSample()
{
    int workspaceId = Constants.WORKSPACE_ID;
    int sasArtifactId = Constants.SAS_ARTIFACT_ID;

    // Create the proxy in a using block so that it is disposed after the call.
    using (IStructuredAnalyticsManager sasManager = ServiceProxyHelper.CreateProxy<IStructuredAnalyticsManager>())
    {
        OperationResult results = sasManager.CancelCopyToLegacyAsync(workspaceId, sasArtifactId).GetAwaiter().GetResult();

        if (results.IsSuccess)
        {
            System.Diagnostics.Debug.WriteLine($"Call to cancel CopyToLegacy on SAS {sasArtifactId} on workspace {workspaceId} succeeded");
        }
        else
        {
            throw new Exception($"Call to cancel CopyToLegacy on SAS {sasArtifactId} on workspace {workspaceId} failed.  Error message: {results.Message}");
        }
    }
}