Configuration YAML

This guide provides a detailed explanation of the configuration YAML file used for the slaOS self-hosted indexer solution. Understanding this configuration is crucial for setting up and customizing your self-hosted indexer.

Configuration File Structure

The configuration is organized into three main sections:

  1. inputs: A list of input configurations

  2. output: Configuration for the output destination

  3. secrets: Configuration for secrets management

Let's explore each section in detail.
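
At the top level, a configuration file therefore has the following shape (all values are placeholders that the sections below explain):

inputs:
  - integration: <integration_type>
    slaos_key: <unique_identifier>
    type: <logs_or_metrics>
    # integration-specific config, filters and offset go here

output:
  type: <rated_or_console>
  # output-specific config goes here

secrets:
  use_secrets_manager: <boolean>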

Section 1: Input

The inputs section defines the data sources for your indexer. It is a list of integration objects; you can run more than one input/integration concurrently, each fully managed.

It specifies which integration to use and the necessary configuration for that integration.

inputs:
  - integration: <integration_type>
    slaos_key: <unique_identifier>
    type: <logs_or_metrics>
    <integration_specific_config>
    filters: <optional_filter_config>
    offset: <offset_config>
Example configuration for CloudWatch logs:

inputs:
  - integration: cloudwatch
    integration_prefix: "cloudwatch_logs_test"
    type: logs
    cloudwatch:
      region: us-east-1
      aws_access_key_id: AKIAXXXXX
      aws_secret_access_key: X/XX+XXXX
      logs_config:
        log_group_name: "/aws/apprunner/prod-rated-api/32cf02da3ba8495f87ad79806b0521e5/application"
        filter_pattern: '{ $.event = "request_finished" }'
    filters:
      version: 1
      log_format: json_dict
      log_example: { }
      fields:
        - key: "status_code"
          value: "22"
          field_type: "integer"
          path: "status_code"
        - key: "organization_id"
          value: "e6bd1f68367b4eee993f247e7301107a"
          field_type: "string"
          path: "user.id"
        - key: "path"
          value: "operators"
          field_type: "string"
          path: "request_route_name"
    offset:
      type: redis
      override_start_from: true
      start_from: 1724803200000
      start_from_type: bigint
      redis:
        host: redis
        port: 6379
        db: 0

Key components

  • integration: Specifies the data source (e.g., cloudwatch, datadog). This determines which integration-specific configuration is required.

  • slaos_key: A unique identifier for the input. This is used to differentiate data submitted to slaOS when multiple integrations are running.

    • Example: If slaos_key is set to "prod_api_cloudwatch", a data point with key "status_code" will be submitted to slaOS as "prod_api_cloudwatch_status_code".

    • Validation: Each slaos_key must be unique across all inputs to avoid conflicts.

    • Context: The slaos_key is mandatory when using more than one integration. It prevents conflicts in data submitted to slaOS by prefixing all data points from this input with the specified prefix; see the two-input sketch after this list.

  • type: Specifies "logs" or "metrics". This determines how the input data is processed and which additional configurations (like filters) are required.

  • filters: Configuration for data filtering. This is only applicable and required for log-type inputs. It defines how log data should be parsed and transformed.

  • offset: Configuration for tracking the last processed position in the data stream. This ensures idempotent operation and allows for efficient data processing, especially after interruptions or for backfills.
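
For instance, a configuration running two integrations side by side would give each input a distinct slaos_key. The sketch below is illustrative; the integration-specific blocks, filters, and offsets are elided:

inputs:
  - integration: cloudwatch
    slaos_key: prod_api_cloudwatch
    type: logs
    # cloudwatch config, filters and offset as shown above
  - integration: datadog
    slaos_key: prod_api_datadog
    type: metrics
    # datadog config and offset

With this setup, a "status_code" data point from the first input is submitted to slaOS as "prod_api_cloudwatch_status_code", while metrics from the second input are prefixed with "prod_api_datadog".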

Filters section

The filters section defines how the indexer processes and transforms input data. This is where you specify the log format and define the fields you want to extract. It is only applicable for log-type inputs and is not needed for metrics.

Structure

filters:
  version: <version_number>
  log_format: <format_type>
  log_example: <example_log_entry>
  fields:
    - key: <field_name>
      path: <json_path>
      field_type: <data_type>

Example

filters:
  version: 1
  log_format: json_dict
  log_example: { "timestamp": "2023-01-01T00:00:00Z", "level": "INFO", "message": "Example log" }
  fields:
    - key: "timestamp"
      path: "timestamp"
      field_type: "timestamp"
    - key: "level"
      path: "level"
      field_type: "string"
      hash: true
    - key: "message"
      path: "message"
      field_type: "string"

For a more detailed explanation of how filters work, please refer to the Filters section.

Offset section

The offset section is responsible for tracking the last processed position in the input data stream. This ensures idempotent operation and allows for efficient data processing.

Structure

offset:
  type: <storage_type>
  override_start_from: <boolean>
  start_from: <start_position>
  start_from_type: <data_type>
  <storage_specific_config>

The override_start_from option is particularly useful for backfills, allowing you to specify a starting point for data processing.

Examples

Redis-backed offset tracking:

offset:
  type: redis
  override_start_from: true
  start_from: 1724803200000
  start_from_type: bigint
  redis:
    host: redis
    port: 6379
    db: 0

Postgres-backed offset tracking:

offset:
  type: postgres
  override_start_from: true
  start_from: 123456789
  start_from_type: bigint
  postgres:
    table_name: offset_tracking
    host: localhost
    port: 5432
    database: postgres
    user: postgres
    password: postgres

slaOS-backed offset tracking:

offset:
  type: slaos
  override_start_from: true
  start_from: 123456789
  start_from_type: bigint
  ingestion_id: ingestion-id
  ingestion_key: ingestion-key
  ingestion_url: https://api.rated.co/v1/ingest
  datastream_filter:
    key: datastream_key
    organization_id: customer_one

The datastream_filter is used to identify the offset related to this specific instance of the indexer. The key is a required field and corresponds to the slaos_key associated with this instance.

We also provide an optional filter parameter on organization_id. This should only be used if you have multiple instances of the indexer using the same key. A typical example is indexing metrics for a resource that serves a particular customer or organization. For values that have been hashed in the filters config, prefix the value with hash: (e.g., hash:value).
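
For example, if organization_id was hashed in the filters configuration, the datastream_filter would reference the hashed value. The key and hash below are illustrative:

offset:
  type: slaos
  # other slaos offset settings as in the example above
  datastream_filter:
    key: prod_api_cloudwatch
    organization_id: "hash:e6bd1f68367b4eee993f247e7301107a"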

Section 2: Output

The output section defines where the processed data should be sent. slaOS supports two output types: rated (for sending data directly to the slaOS ingestion API) and console (for debugging purposes).

output:
  type: rated
  rated:
    ingestion_id: your_ingestion_id
    ingestion_key: your_ingestion_key
    ingestion_url: https://rated.live/v1/ingest

To obtain the ingestion_id and ingestion_key, you need to create an account on the slaOS platform. Once logged in, navigate to the API management section where you can generate and manage your ingestion credentials.

To use console output for debugging, you can configure it like this:

output:
  type: console
  console:
    verbose: true

Section 3: Secrets

The secrets section allows you to use a secrets manager for sensitive configuration values.

secrets:
  use_secrets_manager: true
  

If use_secrets_manager is set to true, any value in the YAML that starts with "secret:" will be resolved using the specified secrets manager (a sketch of this appears after the breakdown below). For example, to configure AWS Secrets Manager as the provider:

secrets:
  use_secrets_manager: true
  provider: aws
  aws:
    region: us-west-2
    aws_access_key_id: fake_access_key
    aws_secret_access_key: fake_secret_key

Let's break down each part of this configuration:

  • use_secrets_manager: true: This enables the use of the secrets manager.

  • provider: aws: This specifies that we're using AWS as our secrets provider.

  • aws: This section contains the configuration specific to AWS:

    • region: us-west-2: The AWS region where your secrets are stored.

    • aws_access_key_id: Your AWS access key ID for accessing the secrets manager.

    • aws_secret_access_key: Your AWS secret access key for accessing the secrets manager.
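
With the secrets manager enabled, sensitive values elsewhere in the configuration can be replaced with secret: references. The sketch below assumes the text after the secret: prefix is the name of a secret stored in your provider; the secret names shown are hypothetical:

inputs:
  - integration: cloudwatch
    slaos_key: prod_api_cloudwatch
    type: logs
    cloudwatch:
      region: us-east-1
      aws_access_key_id: "secret:rated/aws_access_key_id"         # hypothetical secret name
      aws_secret_access_key: "secret:rated/aws_secret_access_key" # hypothetical secret name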

IAM Policy - using AWS Secrets Manager

If you are self-hosting slaOS and using AWS Secrets Manager to store sensitive information like access keys, you need to configure additional IAM permissions.

  1. Create a Secrets Manager Policy: Use the following JSON document to create a policy.

    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "SecretsManagerAccess",
          "Effect": "Allow",
          "Action": [
            "secretsmanager:GetSecretValue",
            "secretsmanager:DescribeSecret"
          ],
          "Resource": "arn:aws:secretsmanager:*:*:secret:*"
        }
      ]
    }
  2. Attach the Policy: Name the policy RatedSecretsManagerAccessPolicy and attach it to the IAM user created for slaOS.

Conclusion

Understanding and properly configuring your self-hosted slaOS indexer is key to effectively processing your data. Always refer to the most up-to-date documentation in our GitHub repository (https://github.com/rated-network/rated-log-indexer) for detailed setup instructions and best practices.

Our GitHub repository maintains an extensive collection of up-to-date and thoroughly tested configuration templates. These templates cover all sections of the indexer configuration and include the latest supported integrations.

Tested input examples can be found in the templates directory of our GitHub repository; offset templates, for instance, live at https://github.com/rated-network/rated-log-indexer/tree/main/templates/offset.

If you encounter any issues or have questions about your configuration, don't hesitate to reach out to our support team at hello@rated.network or consult the community forums.
