Build an SLO
This tutorial guides you through creating a Service Level Objective (SLO) in slaOS.
SLOs set specific, measurable targets for your service's performance based on SLIs. Learn how to configure SLOs to establish clear reliability goals for your service.
Event vs Interval based SLOs
SLOs can be categorized into two main types: Event-based and Interval-based. Understanding the difference is crucial for effective service reliability management.
Event-based SLOs
Definition: Measure the success rate of discrete, countable occurrences.
Example: "99.9% of API requests will be served successfully in a rolling 7d window"
Characteristics:
Based on individual events (such as requests, transactions, etc.)
Often used for request/response systems
Typically measured as a ratio of successful events to total events
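As a rough illustration of the ratio described above, the following Python sketch computes event-based compliance from a list of per-event outcomes. The function name `event_slo_compliance` and the sample data are hypothetical and are not part of slaOS.

```python
from typing import Iterable


def event_slo_compliance(events_ok: Iterable[bool]) -> float:
    """Return good events divided by total events for one compliance window."""
    outcomes = list(events_ok)
    if not outcomes:
        raise ValueError("no events in the compliance window")
    # True counts as 1 and False as 0, so sum() gives the number of good events.
    return sum(outcomes) / len(outcomes)


# Hypothetical window: 9,995 successful requests out of 10,000.
outcomes = [True] * 9_995 + [False] * 5
compliance = event_slo_compliance(outcomes)
target = 0.999  # the "99.9% of API requests" objective
target_met = compliance >= target
print(f"compliance={compliance:.4%}, target met: {target_met}")
```

Every event carries equal weight here, so a burst of failures during a busy period counts for more than the same outage during a quiet one.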
Interval-based SLOs
Definition: Measure the proportion of time a service meets a specific criterion.
Example: "99% of all 10-minute intervals in a month will have latency less than 100ms"
Characteristics:
Based on continuous monitoring over time periods
Often used for availability or uptime metrics
Measured as a percentage of time the service meets the defined criteria
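A minimal sketch of the interval-based calculation might look like the following. It assumes that samples arrive as (timestamp, latency) pairs, that each interval is judged by its average latency, and that intervals without samples are simply skipped. These are illustrative choices, not a description of how slaOS computes the result.

```python
from collections import defaultdict
from statistics import mean


def interval_slo_compliance(samples, interval_seconds, threshold_ms):
    """Fraction of non-empty intervals whose average latency is below threshold_ms.

    samples: iterable of (unix_timestamp_seconds, latency_ms) pairs.
    """
    buckets = defaultdict(list)
    for ts, latency_ms in samples:
        # Integer division assigns each sample to a fixed-size interval.
        buckets[int(ts // interval_seconds)].append(latency_ms)
    if not buckets:
        raise ValueError("no samples in the compliance window")
    good = sum(1 for values in buckets.values() if mean(values) < threshold_ms)
    return good / len(buckets)


# Hypothetical samples spread over four 10-minute intervals.
samples = [
    (0, 80), (120, 90),      # interval 0: average 85 ms, good
    (600, 150), (700, 130),  # interval 1: average 140 ms, bad
    (1200, 40),              # interval 2: average 40 ms, good
    (1900, 95),              # interval 3: average 95 ms, good
]
compliance = interval_slo_compliance(samples, interval_seconds=600, threshold_ms=100)
print(compliance)  # 3 good intervals out of 4
```

Each interval counts once regardless of how many samples it contains, which is what distinguishes this from the event-based ratio.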
Select the SLO type that best fits your service and reliability goals. The choice between event-based and interval-based SLOs is both a technical and a business decision. We recommend that tech and business stakeholders evaluate both options together and test them in the slaOS UI before finalizing.
To learn more about event and interval based SLOs, we highly recommend reading Google Cloud's Observability Handbook.
Best practice for raw data points (not pre-aggregated):
These are individual data points like status codes, latencies, or timestamps that haven't been aggregated before ingestion into slaOS.
For Value SLIs (e.g., latency):
Recommendation: Use event-based SLOs
Rationale: Each latency measurement is a discrete event. Event-based SLOs allow you to set objectives like "99% of requests should have a latency < 100ms."
For Percentage SLIs (e.g., status codes for uptime):
Recommendation: Use interval-based SLOs
Rationale: Uptime is typically measured over time periods. Interval-based SLOs allow objectives like "The service should be up 99.9% of the time over a month."
Best practice for pre-aggregated metrics
These are metrics that have been aggregated before ingestion, such as average latency or hourly uptime.
For Value SLIs (e.g., average latency):
Both event-based and interval-based can work, depending on your business goals
Event-based example: "99% of hourly average latencies should be < 100ms"
Interval-based example: "In 99% of 10-minute intervals, the average latency should be < 100ms"
Choose based on whether you care more about overall performance (event-based) or consistent performance over time (interval-based)
For Percentage SLIs (e.g., hourly uptime):
Again, both approaches can work
Event-based example: "The average of hourly uptime measurements over a calendar month should be ≥ 99.5%"
Interval-based example: "All daily (24hr intervals) average uptime measurements over a calendar month should be > 99.5%"
Choose based on whether you want to allow some fluctuation (event-based) or ensure consistent performance every day (interval-based)
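To make the trade-off concrete, the sketch below evaluates the same pre-aggregated hourly uptime data both ways. The data is invented for illustration, and the thresholds follow the two example objectives above (≥ 99.5% for the overall average, > 99.5% for each daily average).

```python
from statistics import mean

# Hypothetical pre-aggregated hourly uptime percentages for three days.
day_1 = [100.0] * 24
day_2 = [100.0] * 22 + [90.0, 90.0]  # a two-hour degradation
day_3 = [100.0] * 24
days = [day_1, day_2, day_3]

target = 99.5

# Event-based: one average across every hourly measurement in the period.
overall_average = mean(value for day in days for value in day)
event_based_met = overall_average >= target

# Interval-based: every 24h interval's average must clear the target.
daily_averages = [mean(day) for day in days]
interval_based_met = all(avg > target for avg in daily_averages)

print(f"overall average: {overall_average:.2f}% -> met: {event_based_met}")
print(f"daily averages: {[round(a, 2) for a in daily_averages]} -> met: {interval_based_met}")
```

With this data the event-based objective is met because the bad hours are averaged away across the period, while the interval-based objective fails because the second day falls below the target on its own.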
Need help defining your SLOs? Contact us at hello@rated.co for consultation.
Create an Event based SLO
Follow these steps to create an Event Service Level Objective (SLO):
Find and click on "Objectives" in the side navigation bar
Click the "+ New SLO" button to open the Create SLO modal
Click on the dropdown to select from your list of active SLIs
Choose "Event" as the type of SLO you'll be building
Depending on the type of SLI you’ve chosen:
If your SLI is a value SLI, you will need to set a benchmark against which each “event” in the SLI will be compared. To set a benchmark, select an operator and a benchmark value
Allowed operators: >, <, >=, <=, =
Allowed benchmark types: numeric, boolean
If your SLI is a percentage SLI, you will need to set an aggregator for your SLI which will be applied to all events within a period.
Allowed aggregators: AVG, MIN, MAX, COUNT, SUM, PERCENTILE
Select the compliance period and target. Click “Continue”.
A name and description will be auto filled for your SLO based on your configuration
Click the "Save" button to create your new event SLO
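For a value SLI, the operator and benchmark chosen in the steps above determine whether each event counts as good. The following sketch shows one way that comparison could be expressed. The names `OPERATORS`, `is_good_event` and `event_compliance` are hypothetical helpers for illustration and do not represent the slaOS API.

```python
import operator

# Maps the operators listed in the steps above to Python comparison functions.
OPERATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "=": operator.eq,
}


def is_good_event(value, op, benchmark):
    """Return True when `value <op> benchmark` holds, e.g. latency <= 1.0."""
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {op!r}")
    return OPERATORS[op](value, benchmark)


def event_compliance(values, op, benchmark):
    """Fraction of events in the compliance period that satisfy the benchmark."""
    values = list(values)
    if not values:
        raise ValueError("no events in the compliance period")
    return sum(is_good_event(v, op, benchmark) for v in values) / len(values)


# Numeric benchmark: latency in seconds compared against 1.0.
latencies_s = [0.2, 0.4, 1.0, 1.3]
print(event_compliance(latencies_s, "<=", 1.0))  # 3 of 4 events are good

# Boolean benchmark: each event is good when it equals True.
print(is_good_event(True, "=", True))
```

The resulting fraction is what would then be compared against the service target selected for the compliance period.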
Examples
For the example implementation, we’ll create an Event SLO with a Latency SLI. This SLO specifies that, over a calendar month, 99.9% of successful requests must have a latency of 1 second or less.
Find and click on "Objectives" in the side navigation bar
Click the "+ New SLO" button to open the Create SLO modal
Select Latency as the SLI and choose Event SLO
Set the operator as ≤ and benchmark as 1 second
Set the time window type as Calendar, period as Monthly and Service target as ≥ 99.9%
The SLO will get the following name: Latency above 99.9% and description Event based calculation over a calendar period of 1 month via the autofill feature
Click "Save"
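The calculation behind this example can be sketched as follows: keep only the events that fall inside the calendar month, count those with latency ≤ 1 second, and compare the ratio with the 99.9% target. The events and the helper `in_calendar_month` are made up for illustration, and the sketch assumes timestamps are in UTC.

```python
from datetime import datetime, timezone


def in_calendar_month(ts: datetime, year: int, month: int) -> bool:
    """True when the timestamp falls inside the given calendar month."""
    return ts.year == year and ts.month == month


# Hypothetical events: (timestamp, latency in seconds).
events = [
    (datetime(2024, 5, 1, 0, 0, tzinfo=timezone.utc), 0.30),
    (datetime(2024, 5, 14, 9, 30, tzinfo=timezone.utc), 0.95),
    (datetime(2024, 5, 31, 23, 59, tzinfo=timezone.utc), 1.40),
    (datetime(2024, 6, 1, 0, 0, tzinfo=timezone.utc), 0.20),  # next month, ignored
]

may_latencies = [lat for ts, lat in events if in_calendar_month(ts, 2024, 5)]
good = sum(1 for lat in may_latencies if lat <= 1.0)  # operator ≤, benchmark 1 second
compliance = good / len(may_latencies)
target_met = compliance >= 0.999  # service target ≥ 99.9%

print(f"{good}/{len(may_latencies)} good events, target met: {target_met}")
```

Unlike a rolling window, a calendar window resets at the month boundary, so the June event does not affect May's result.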
Create an Interval based SLO
Follow these steps to create an Interval Service Level Objective (SLO):
Find and click on "Objectives" in the side navigation bar
Click the "+ New SLO" button to open the Create SLO modal
Click on the dropdown to select from your list of active SLIs
Choose "Interval" as the type of SLO you'll be building
Configure your interval behavior. This includes:
The size of each interval. You can choose a minimum size of 1min and a maximum size of 24h.
How we should treat intervals where no events were received. This can happen when your service is undergoing planned maintenance or during off-peak hours. You can choose to treat those intervals as GOOD, meaning service was good; BAD, meaning service was down/misbehaving; or EXCLUDE, meaning the interval won't be considered as there was nothing to consider.
The aggregation function that you'll want to apply on the interval
Allowed aggregators: AVG, MIN, MAX, COUNT, SUM, PERCENTILE
The benchmark each interval's aggregated result will be compared against
Allowed operators: >, <, >=, <=, =
Allowed benchmark types: numeric, boolean
Select the compliance period and target. Click “Continue”.
A name and description will be auto filled for your SLO based on your configuration
Click the "Save" button to create your new interval SLO
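The empty-interval setting described in the steps above can change the final compliance figure noticeably. The sketch below shows the effect of each option on the same set of interval results. The `EmptyBehavior` enum and `interval_compliance` function are hypothetical names chosen for illustration, not slaOS identifiers.

```python
from enum import Enum


class EmptyBehavior(Enum):
    GOOD = "GOOD"
    BAD = "BAD"
    EXCLUDE = "EXCLUDE"


def interval_compliance(interval_results, empty_behavior):
    """Fraction of good intervals.

    interval_results: list where True means the interval met the benchmark,
    False means it missed, and None means no events were received.
    """
    good = total = 0
    for result in interval_results:
        if result is None:
            if empty_behavior is EmptyBehavior.EXCLUDE:
                continue  # the interval is left out of the calculation entirely
            result = empty_behavior is EmptyBehavior.GOOD
        total += 1
        good += int(result)
    if total == 0:
        raise ValueError("no intervals left to evaluate")
    return good / total


# Hypothetical results for six intervals, two of which received no events.
results = [True, True, None, False, None, True]
for behavior in EmptyBehavior:
    print(behavior.value, round(interval_compliance(results, behavior), 4))
```

For this data the three options give 5/6, 3/6 and 3/4 respectively, so the choice matters most for services with frequent quiet periods.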
Examples
For the example implementation, we’ll create an Interval SLO with a Latency SLI. This SLO specifies that, over a rolling 28d window, 99.5% of all 10 min intervals will have p99 latency less than or equal to 1 second.
Find and click on "Objectives" in the side navigation bar
Click the "+ New SLO" button to open the Create SLO modal
Select Latency as the SLI and choose Interval SLO
Set the interval size as 10min and empty interval behavior as Exclude
Choose p99 as the aggregation function
Set the operator as ≤ and benchmark as 1 second
Set the time window type as Rolling, period as 28 days and Service target as ≥ 99.5%
The SLO will get the following name: Latency above 99.5% and description Calculation every 10 min over a rolling period of 28 days via the autofill feature
Click "Save"
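A rough sketch of the calculation in this example follows. It assumes the samples have already been restricted to the rolling 28-day window, and it uses the nearest-rank method for p99. The percentile method slaOS actually applies is not stated here, so treat that choice, along with the function names, as an assumption.

```python
import math
from collections import defaultdict


def percentile_nearest_rank(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def p99_interval_compliance(samples, interval_seconds=600, benchmark_s=1.0):
    """Fraction of non-empty 10-minute intervals whose p99 latency is <= benchmark_s.

    samples: iterable of (unix_timestamp_seconds, latency_seconds) pairs.
    Intervals with no samples never get a bucket, which mirrors "Exclude".
    """
    buckets = defaultdict(list)
    for ts, latency in samples:
        buckets[int(ts // interval_seconds)].append(latency)
    if not buckets:
        raise ValueError("no samples in the compliance window")
    good = sum(
        1 for values in buckets.values()
        if percentile_nearest_rank(values, 99) <= benchmark_s
    )
    return good / len(buckets)


# Interval 0: one slow request out of 100, so p99 stays at 0.2 s (good).
interval_0 = [(i, 0.2) for i in range(99)] + [(99, 5.0)]
# Interval 1: two slow requests out of 100, so p99 rises to 5.0 s (bad).
interval_1 = [(600 + i, 0.2) for i in range(98)] + [(700, 5.0), (701, 5.0)]

compliance = p99_interval_compliance(interval_0 + interval_1)
target_met = compliance >= 0.995  # service target ≥ 99.5%
print(compliance, target_met)
```

Because p99 ignores the slowest 1% of requests in each interval, a single outlier does not fail an interval, but a second one in the same interval does.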