Configuration YAML

This guide provides a detailed explanation of the configuration YAML file used for the slaOS self-hosted indexer solution. Understanding this configuration is crucial for setting up and customizing your self-hosted indexer.

GitHub - rated-network/rated-log-indexer: Rated slaOS: from raw logs to actionable SLAs in minutes

Configuration File Structure

The configuration is organized into three main sections:

inputs: A list of input configurations
output: Configuration for the output destination
secrets: Configuration for secrets management

Our GitHub repository maintains an extensive collection of up-to-date and thoroughly tested configuration templates. These templates cover all sections of the indexer configuration and include the latest supported integrations.

Let's explore each section in detail.

Section 1: Input

The inputs section defines the data sources for your indexer. It is a list of integration objects, so you can run more than one input concurrently, each managed by the indexer.

Each entry specifies which integration to use and the necessary configuration for that integration.

inputs:
  - integration: <integration_type>
    slaos_key: <unique_identifier>
    type: <logs_or_metrics>
    <integration_specific_config>
    filters: <optional_filter_config>
    offset: <offset_config>

Example configuration for CloudWatch logs

inputs:
  - integration: cloudwatch
    slaos_key: "cloudwatch_logs_test"
    type: logs
    cloudwatch:
      region: us-east-1
      aws_access_key_id: AKIAXXXXX
      aws_secret_access_key: X/XX+XXXX
      logs_config:
        log_group_name: "/aws/apprunner/prod-rated-api/32cf02da3ba8495f87ad79806b0521e5/application"
        filter_pattern: '{ $.event = "request_finished" }'
    filters:
      version: 1
      log_format: json_dict
      log_example: { }
      fields:
        - key: "status_code"
          value: "22"
          field_type: "integer"
          path: "status_code"
        - key: "organization_id"
          value: "e6bd1f68367b4eee993f247e7301107a"
          field_type: "string"
          path: "user.id"
        - key: "path"
          value: "operators"
          field_type: "string"
          path: "request_route_name"
    offset:
      type: redis
      override_start_from: true
      start_from: 1724803200000
      start_from_type: bigint
      redis:
        host: redis
        port: 6379
        db: 0

Extract from the input template examples in the GitHub repository.

Key components

integration: Specifies the data source (e.g., cloudwatch, datadog). This determines which integration-specific configuration is required.
slaos_key: A unique identifier for the input. This is used to differentiate data submitted to slaOS when multiple integrations are running.
- Example: If slaos_key is set to "prod_api_cloudwatch", a data point with key "status_code" will be submitted to slaOS as "prod_api_cloudwatch_status_code".
- Validation: Each slaos_key must be unique across all inputs to avoid conflicts.
- Context: The slaos_key is mandatory when using more than one integration. It prevents conflicts in data submitted to slaOS by prefixing all data points from this input with the specified prefix.
type: Specifies "logs" or "metrics". This determines how the input data is processed and which additional configurations (like filters) are required.
filters: Configuration for data filtering. This is only applicable and required for log-type inputs. It defines how log data should be parsed and transformed.
offset: Configuration for tracking the last processed position in the data stream. This ensures idempotent operation and allows for efficient data processing, especially after interruptions or for backfills.
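As a rough illustration of the slaos_key rules above, the following Python sketch joins the key and a field name with an underscore, as in the documented example, and flags keys that appear on more than one input. The function names and the second (datadog) input are hypothetical and are not taken from the indexer's code.

```python
from collections import Counter


def prefixed_key(slaos_key: str, field_key: str) -> str:
    """Build the name a data point is submitted under: <slaos_key>_<field_key>."""
    return f"{slaos_key}_{field_key}"


def duplicate_slaos_keys(inputs: list) -> list:
    """Return every slaos_key that is used by more than one input."""
    counts = Counter(item.get("slaos_key") for item in inputs)
    return sorted(key for key, n in counts.items() if key is not None and n > 1)


# Hypothetical inputs list, shaped like the parsed `inputs` section.
inputs = [
    {"integration": "cloudwatch", "slaos_key": "prod_api_cloudwatch", "type": "logs"},
    {"integration": "datadog", "slaos_key": "prod_api_datadog", "type": "metrics"},
]

submitted_name = prefixed_key("prod_api_cloudwatch", "status_code")
duplicates = duplicate_slaos_keys(inputs)
```

Running a check like this before starting the indexer catches a copy-pasted slaos_key early, before two inputs start writing to the same data point names.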

Tested input examples can be found in the inputs template directory of our GitHub repository.

Filters section

The filters section defines how the indexer processes and transforms input data. This is where you specify the log format and define the fields you want to extract. It is only applicable for log-type inputs and is not needed for metrics.

Structure

filters:
  version: <version_number>
  log_format: <format_type>
  log_example: <example_log_entry>
  fields:
    - key: <field_name>
      path: <json_path>
      field_type: <data_type>

Example

filters:
  version: 1
  log_format: json_dict
  log_example: { "timestamp": "2023-01-01T00:00:00Z", "level": "INFO", "message": "Example log" }
  fields:
    - key: "timestamp"
      path: "timestamp"
      field_type: "timestamp"
    - key: "level"
      path: "level"
      field_type: "string"
      hash: true
    - key: "message"
      path: "message"
      field_type: "string"

For a more detailed explanation of how filters work, please refer to the Filters page.
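To make the role of key, path, and field_type more concrete, here is a minimal Python sketch of how a json_dict log entry could be reduced to the configured fields. It assumes that a dotted path such as user.id walks into nested objects and that field_type drives a simple cast; the helper names are hypothetical, and the indexer's actual parsing (including timestamp handling and hashing) may differ.

```python
import json
from typing import Any


def resolve_path(entry: dict, path: str) -> Any:
    """Walk a dotted path such as 'user.id' through nested dictionaries."""
    current: Any = entry
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None  # path not present in this log entry
        current = current[part]
    return current


# Only two types are cast here; anything else is passed through unchanged.
CASTS = {"integer": int, "string": str}


def extract_fields(raw_log: str, fields: list) -> dict:
    """Return {key: value} for every configured field found in the log line."""
    entry = json.loads(raw_log)
    extracted = {}
    for field in fields:
        value = resolve_path(entry, field["path"])
        if value is None:
            continue
        cast = CASTS.get(field["field_type"], lambda v: v)
        extracted[field["key"]] = cast(value)
    return extracted


fields = [
    {"key": "status_code", "path": "status_code", "field_type": "integer"},
    {"key": "organization_id", "path": "user.id", "field_type": "string"},
]
raw = '{"status_code": "200", "user": {"id": "abc123"}, "request_route_name": "operators"}'
result = extract_fields(raw, fields)
```

Fields that are not listed under fields (here, request_route_name) are simply dropped, which is the main reason to keep the fields list in sync with the log format you actually emit.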

Offset section

The offset section is responsible for tracking the last processed position in the input data stream. This ensures idempotent operation and allows for efficient data processing.

Structure

offset:
  type: <storage_type>
  override_start_from: <boolean>
  start_from: <start_position>
  start_from_type: <data_type>
  <storage_specific_config>

The override_start_from option is particularly useful for backfills, allowing you to specify a starting point for data processing.
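When planning a backfill, it can help to compute start_from from a calendar date instead of typing the number by hand. The sketch below assumes that start_from is a Unix timestamp in milliseconds, which matches the value 1724803200000 used in the CloudWatch input example; other integrations may expect a different unit, so treat this as an assumption to verify. The helper name is hypothetical.

```python
from datetime import datetime, timezone


def to_epoch_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return int(moment.timestamp() * 1000)


# Backfill from midnight UTC on 28 August 2024.
backfill_start = datetime(2024, 8, 28, tzinfo=timezone.utc)
start_from = to_epoch_millis(backfill_start)
print(start_from)  # 1724803200000
```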

Examples

offset:
  type: redis
  override_start_from: true
  start_from: 1724803200000
  start_from_type: bigint
  redis:
    host: redis
    port: 6379
    db: 0

offset:
  type: postgres
  override_start_from: true
  start_from: 123456789
  start_from_type: bigint
  postgres:
    table_name: offset_tracking
    host: localhost
    port: 5432
    database: postgres
    user: postgres
    password: postgres

offset:
  type: slaos
  override_start_from: true
  start_from: 123456789
  start_from_type: bigint
  slaos:
    ingestion_id: ingestion-id
    ingestion_key: ingestion-key
    ingestion_url: https://api.rated.co/v1/ingest
    datastream_filter:
      key: datastream_key
      organization_id: customer_one

The datastream_filter is used to identify the offset related to this specific instance of the indexer. The key is a required field and corresponds to the slaos_key associated with this instance.

We also provide an optional organization_id filter parameter. Use it only if multiple instances of the indexer share the same key, which typically happens when indexing metrics for a resource dedicated to a particular customer or organization. For values that have been hashed in the filters config, prefix the value with hash: (e.g., hash:value).

https://github.com/rated-network/rated-log-indexer/tree/main/templates/offset

Section 2: Output

The output section defines where the processed data should be sent. slaOS supports two output types: rated (for sending data to the direct ingestion API) and console (for debugging purposes).

output:
  type: rated
  rated:
    ingestion_id: your_ingestion_id
    ingestion_key: your_ingestion_key
    ingestion_url: https://rated.live/v1/ingest

To obtain the ingestion_id and ingestion_key, you need to create an account on the slaOS platform. Once logged in, navigate to the API management section where you can generate and manage your ingestion credentials.

To use console output for debugging, you can configure it like this:

output:
  type: console
  console:
    verbose: true
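In the examples on this page, the value of type is always paired with a block of the same name (rated or console for output, redis or postgres for offset). Assuming that pattern holds generally, a quick pre-flight check could look like the sketch below. The function name is hypothetical, and the indexer's own validation may behave differently.

```python
from typing import Optional


def missing_type_block(section: dict) -> Optional[str]:
    """Return an error message if `type` has no same-named block, else None."""
    selected = section.get("type")
    if selected is None:
        return "missing 'type'"
    if not isinstance(section.get(selected), dict):
        return f"type is '{selected}' but there is no '{selected}:' block"
    return None


console_output = {"type": "console", "console": {"verbose": True}}
broken_output = {"type": "rated", "console": {"verbose": True}}

ok = missing_type_block(console_output)
problem = missing_type_block(broken_output)
```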

Section 3: Secrets

The secrets section allows you to use a secrets manager for sensitive configuration values.

secrets:
  use_secrets_manager: true

If use_secrets_manager is set to true, any value in the YAML that starts with "secret:" will be resolved using the specified secrets manager. For example:

secrets:
  use_secrets_manager: true
  provider: aws
  aws:
    region: us-west-2
    aws_access_key_id: fake_access_key
    aws_secret_access_key: fake_secret_key

Let's break down each part of this configuration:

use_secrets_manager: true: This enables the use of the secrets manager.
provider: aws: This specifies that we're using AWS as our secrets provider.
aws: This section contains the configuration specific to AWS:
- region: us-west-2: The AWS region where your secrets are stored.
- aws_access_key_id: Your AWS access key ID for accessing the secrets manager.
- aws_secret_access_key: Your AWS secret access key for accessing the secrets manager.
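The "secret:" convention described above can be pictured as a recursive walk over the parsed configuration. The following Python sketch is only an illustration: the fetch callback stands in for a real secrets manager lookup, the secret name rated/ingestion_key is made up, and what exactly follows the "secret:" prefix (a name, a path, or an ARN) is an assumption rather than documented behavior.

```python
from typing import Any, Callable

SECRET_PREFIX = "secret:"


def resolve_secrets(node: Any, fetch: Callable[[str], str]) -> Any:
    """Return a copy of the config with every 'secret:<id>' string replaced."""
    if isinstance(node, dict):
        return {key: resolve_secrets(value, fetch) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_secrets(value, fetch) for value in node]
    if isinstance(node, str) and node.startswith(SECRET_PREFIX):
        return fetch(node[len(SECRET_PREFIX):])
    return node  # numbers, booleans, and plain strings are left untouched


# In-memory stand-in for a secrets manager (hypothetical secret name).
fake_store = {"rated/ingestion_key": "s3cr3t"}

config = {
    "output": {
        "type": "rated",
        "rated": {
            "ingestion_id": "your_ingestion_id",
            "ingestion_key": "secret:rated/ingestion_key",
            "ingestion_url": "https://rated.live/v1/ingest",
        },
    }
}

resolved = resolve_secrets(config, fake_store.__getitem__)
```

Only values that start with the prefix are looked up, so plain values such as ingestion_id pass through unchanged and the YAML file itself never needs to contain the secret.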

IAM Policy - using AWS Secrets Manager

If you are self-hosting slaOS and using AWS Secrets Manager to store sensitive information like access keys, you need to configure additional IAM permissions.

Create a Secrets Manager Policy: Use the following JSON document to create a policy.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "SecretsManagerAccess",
      "Effect": "Allow",
      "Action": [
        "secretsmanager:GetSecretValue",
        "secretsmanager:DescribeSecret"
      ],
      "Resource": "arn:aws:secretsmanager:*:*:secret:*"
    }
  ]
}

Attach the Policy: Name the policy RatedSecretsManagerAccessPolicy and attach it to the IAM user created for slaOS.

Conclusion

Understanding and properly configuring your self-hosted slaOS indexer is key to effectively processing your data. Always refer to the most up-to-date documentation on our GitHub repository for detailed setup instructions and best practices.

If you encounter any issues or have questions about your configuration, reach out to our support team or consult the community forums.

