Scaling guidelines for migrated agents

After migrating agents in RelativityOne, you can use the following scaling guidelines to configure them. In RelativityOne, agents run on a scalable and extensible cloud platform, which provides a standard set of resources that they can utilize with each run. Agents with widely varying workload sizes may benefit from dynamic scaling. With scaling, agents can complete their overall workload faster.

After completing the migration process, you can optionally scale your agents dynamically. For more information, see Agent migration checklist.

Scaling fundamentals

You can dynamically set the number of concurrently running agent processes based on the workload of an agent at any given time. The default number of processes is one. Each agent process is given 1 vCPU and 1 GB of RAM, and this allocation is not customizable. You can scale horizontally by defining a Workload Discovery endpoint that provides visibility into the current workload size of an agent.

Scaling use cases

If your agents have widely varying workloads, consider dynamic scaling to ensure that the necessary resources are available for them at any time. For example, the following agents in Relativity have varying workloads:

  • Redaction Agent – This agent may handle a single redaction on a page. At other times, it may apply rules across millions of documents.
  • Branding Agent – This agent may brand a single page, but it might also brand productions for thousands of documents.

These agents have differing needs based on a specific job and benefit from scaling. Other agents that are computationally lightweight or have consistent workloads don't benefit from scaling and shouldn't scale beyond one process.

Get started with scaling

The key component for dynamically scaling an agent is its Workload Discovery endpoint. This endpoint communicates with the Relativity platform, providing information about the amount of work required by an agent for a specific tenant at any given time. The platform checks the Workload Discovery endpoint at the run interval frequency defined by the agent. This process ensures that the appropriate amount of resources is dedicated to the agent at all times. If the endpoint reports that there is no work to complete, no new resources are allocated to the agent, but existing jobs are allowed to complete.

The Workload Discovery endpoint returns workload sizes that decrease over time as work is completed and the pool of remaining tasks is reduced. To ensure consistent work throughput and user experience, Relativity never downscales an already running set of agents if work remains. If more resources are required and a larger workload size is requested, Relativity scales up the agent.
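
The scale-up-only behavior described above can be sketched as a small decision function. This is an illustration only, not Relativity's implementation: the function name `next_scale` is hypothetical, and the sketch assumes that the documented size values are ordered from smallest to largest.

```python
# Sketch only: the size ordering and the decision rule are inferred from the
# description above, not taken from the platform's actual scaling logic.
SIZES = ["None", "One", "S", "M", "L", "XL", "XXL"]  # assumed smallest to largest


def next_scale(current: str, reported: str) -> str:
    """Return the size an agent keeps running at after one discovery check.

    A larger reported size scales the agent up. A smaller reported size
    (including "None") leaves the running set untouched so that existing
    jobs can finish.
    """
    return max(current, reported, key=SIZES.index)
```

For example, an agent running at size L that later reports S stays at L while work remains, and an agent at S that reports L is scaled up to L.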

  • Using the Workload Discovery endpoint also lets you reduce the agent run interval below 60 minutes. For more information about run intervals, see Execution patterns for migrated agents.
  • This endpoint can return the string One in the JSON payload when agent scaling isn't needed. See Creating the Workload Discovery endpoint.
  • Remember to include both the Interfaces and Services DLLs for your Workload Discovery Kepler Service in your RAP solution so that the agent functions properly.

Creating the Workload Discovery endpoint

The Workload Discovery endpoint can be any HTTP endpoint that is accessible from Relativity. Its response must meet the following requirements:

  • A JSON payload with a Size property set to one of the following values: None, One, S, M, L, XL, or XXL.
  • A value accurately indicating the amount of work that the agent currently needs to do.
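
As a language-neutral illustration of this response contract, the following minimal sketch maps a count of pending work items to one of the allowed Size values and serializes the payload. The thresholds and the names `size_for` and `workload_response` are hypothetical. The source does not define how many work items correspond to each size, so choose values that fit your own agent.

```python
import json

# Hypothetical thresholds (inclusive upper bounds). These numbers are
# assumptions for illustration, not values defined by Relativity.
THRESHOLDS = [
    (0, "None"),
    (1, "One"),
    (100, "S"),
    (1_000, "M"),
    (10_000, "L"),
    (100_000, "XL"),
]


def size_for(pending_items: int) -> str:
    """Map the number of pending work items to an allowed Size value."""
    for upper_bound, size in THRESHOLDS:
        if pending_items <= upper_bound:
            return size
    return "XXL"  # anything above the largest threshold


def workload_response(pending_items: int) -> str:
    """Build the JSON body the endpoint returns, with its Size property."""
    return json.dumps({"Size": size_for(pending_items)})
```

With these assumed thresholds, `workload_response(500)` produces a payload whose Size property is M.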

We recommend creating a Kepler API that is included in the Relativity application (RAP) file with your agent. The following code samples illustrate how to create this Kepler API. To download the required SDK, see Relativity.Platform.Agent.WorkloadDiscovery.SDK on the NuGet site.

Associate the Workload Discovery endpoint with an agent

After creating an API, you must associate it with an agent. The following code samples illustrate how to do this through .NET.

Workload discovery best practices

To provide optimum dynamic scaling, the Workload Discovery endpoint should support the following behavior:

  • Return within 2 seconds - Use these endpoints only for a quick check of the amount of work an agent has to do. The agent itself should perform any advanced calculations or other processing.
  • Accurately reflect the workload size - The returned size should always accurately reflect the amount of work the agent has to complete at a given time. Large discrepancies between the returned size and the actual workload result in suboptimal scaling behavior.
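
One way to check the response-time guideline during development is to time the workload check itself. The helper below is a minimal sketch under that assumption. `timed_check` and its `check` parameter are hypothetical names, and the helper only measures the local calculation, not network latency between Relativity and the endpoint.

```python
import time


def timed_check(check, budget_s: float = 2.0):
    """Run a workload check and report whether it finished within the budget.

    `check` is any zero-argument callable that returns a Size value.
    Returns a (size, within_budget) tuple.
    """
    start = time.perf_counter()
    size = check()
    elapsed = time.perf_counter() - start
    return size, elapsed <= budget_s
```

If a check regularly exceeds the budget, that suggests the endpoint is doing work that belongs in the agent itself.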