Monitoring via Micrometer

A few notes on monitoring, written by a frontend engineer.


Feb 13, 2023

Prometheus and Micrometer are a couple of tools that are widely used for monitoring backend applications at scale. They come in especially handy if you are running a Ktor server application.

As an example, you can listen to API calls and send request/response data to Prometheus via Micrometer. Below are some use cases.

Monitoring the number of HTTP requests served (Counter)
Tracking the current memory usage (Gauge)
Measuring the response time of a web service (Timer)
Observing the distribution of database query durations (Distribution)

The following example shows how you can listen to outgoing client requests and capture data.

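import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.Tag
import io.micrometer.core.instrument.binder.okhttp3.OkHttpMetricsEventListener
import java.time.Duration
import okhttp3.OkHttpClient

private fun createHttpClient(
    readTimeout: Duration = Duration.ofMillis(DEFAULT_TIMEOUT_MILLIS),
    connectTimeout: Duration = Duration.ofMillis(DEFAULT_TIMEOUT_MILLIS),
    meterRegistry: MeterRegistry,
    applicationConfig: ApplicationConfig,
): OkHttpClient = OkHttpClient.Builder()
    // Record a timer named "http.outgoing" for every call made by this client
    .eventListener(
        OkHttpMetricsEventListener.builder(meterRegistry, "http.outgoing")
            .tag(Tag.of("application", applicationConfig.applicationName))
            .build(),
    )
    .readTimeout(readTimeout)
    .connectTimeout(connectTimeout)
    .build()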
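The client above only records metrics into a MeterRegistry; Prometheus still needs an endpoint to scrape. As a rough sketch for the Ktor case, assuming Ktor 2.x with its Micrometer metrics plugin and a Micrometer Prometheus registry from the `io.micrometer.prometheus` package (package names differ in other versions), the wiring could look like this. The port and the `/metrics` path are illustrative choices, not requirements.

```kotlin
import io.ktor.server.application.*
import io.ktor.server.engine.*
import io.ktor.server.metrics.micrometer.*
import io.ktor.server.netty.*
import io.ktor.server.response.*
import io.ktor.server.routing.*
import io.micrometer.prometheus.PrometheusConfig
import io.micrometer.prometheus.PrometheusMeterRegistry

fun main() {
    // Registry that can render everything it holds in the Prometheus text format
    val prometheusRegistry = PrometheusMeterRegistry(PrometheusConfig.DEFAULT)

    embeddedServer(Netty, port = 8080) { // port chosen for illustration
        // Let Ktor record its own request metrics into the same registry
        install(MicrometerMetrics) {
            registry = prometheusRegistry
        }
        routing {
            // Endpoint for Prometheus to scrape; the path is an assumption
            get("/metrics") {
                call.respond(prometheusRegistry.scrape())
            }
        }
    }.start(wait = true)
}
```

Passing the same registry instance to `createHttpClient` would make the outgoing-call metrics show up on the same endpoint.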

Other than that, there are a few meter types that you can leverage.

Counters: A counter measures a monotonically increasing value, such as requests served, tasks completed, or errors. It never goes down. You can use counters to track the occurrence of different HTTP status codes: register one counter per status code, distinguished by a tag, and increment the matching counter whenever that status code is encountered.

fun recordHttpStatusCode(statusCode: Int) {
    // One counter per status code: the "status" tag carries the actual code.
    // The registry returns the existing counter if this name/tag pair is already registered.
    meterRegistry.counter("http.status", "status", statusCode.toString()).increment()
}

// Usage
recordHttpStatusCode(200) // Increment counter for HTTP 200 status code
recordHttpStatusCode(404) // Increment counter for HTTP 404 status code
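To see what per-status counters look like once recorded, here is a small self-contained sketch. It uses Micrometer's in-memory `SimpleMeterRegistry` purely for demonstration, and the helper names `countStatus` and `statusCount` are made up for this example.

```kotlin
import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.simple.SimpleMeterRegistry

// Hypothetical helper: one counter per status code, distinguished by the "status" tag.
fun countStatus(registry: MeterRegistry, statusCode: Int) {
    registry.counter("http.status", "status", statusCode.toString()).increment()
}

// Hypothetical helper: reads back the count for a status code (0.0 if never recorded).
fun statusCount(registry: MeterRegistry, statusCode: Int): Double =
    registry.find("http.status")
        .tag("status", statusCode.toString())
        .counter()
        ?.count() ?: 0.0

fun main() {
    val registry = SimpleMeterRegistry() // in-memory registry, used here only for the demo
    listOf(200, 200, 404).forEach { countStatus(registry, it) }
    println("200s: ${statusCount(registry, 200)}, 404s: ${statusCount(registry, 404)}")
}
```

Each distinct tag value becomes its own time series, so it is worth keeping the set of tag values small and bounded.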

Gauges: A gauge measures an instantaneous value that can go up and down, such as temperature, memory usage, or concurrent connections. It is useful when you want to expose a current value, for example the status code of the most recent response, rather than a running total.

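import java.util.concurrent.atomic.AtomicInteger

// Micrometer holds only a weak reference to the gauged object, and registering
// the same gauge name again has no effect, so keep one mutable holder and update it.
private val lastResponseCode = AtomicInteger(0).also {
    meterRegistry.gauge("http_error_code", it)
}

private fun updateGauge(responseCode: Int) {
    lastResponseCode.set(responseCode)
}

// Usage
updateGauge(response.code())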
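A gauge is arguably a more natural fit for the "concurrent connections" case mentioned above. The following is a minimal sketch under that assumption: `InFlightRequests` and the metric name `http.requests.in.flight` are hypothetical, and `SimpleMeterRegistry` is used only so the example runs on its own.

```kotlin
import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.simple.SimpleMeterRegistry
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical tracker for requests that are currently in progress.
class InFlightRequests(registry: MeterRegistry) {
    // This class keeps the strong reference; the registry samples the value on demand.
    private val current = AtomicInteger(0)

    init {
        registry.gauge("http.requests.in.flight", current)
    }

    // Runs the block while counting it as one in-flight request.
    fun <T> track(block: () -> T): T {
        current.incrementAndGet()
        try {
            return block()
        } finally {
            current.decrementAndGet()
        }
    }
}

fun main() {
    val registry = SimpleMeterRegistry()
    val inFlight = InFlightRequests(registry)
    val gauge = registry.get("http.requests.in.flight").gauge()

    inFlight.track { println("during request: ${gauge.value()}") }
    println("after request: ${gauge.value()}")
}
```

Unlike a counter, the value goes back down once the request finishes, which is exactly the behaviour a gauge is meant to capture.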

Timers: A timer measures the duration of events, such as request processing time or task execution time. It provides metrics like count, total time, and, when configured, percentiles. If you want to track the response time for different status codes, tag the timer with the status code.

import java.util.concurrent.TimeUnit

fun recordHttpResponseTime(statusCode: Int, responseTimeMillis: Long) {
    // Tags can be used to filter data in Prometheus
    meterRegistry.timer("http.response.time", "status", statusCode.toString())
        .record(responseTimeMillis, TimeUnit.MILLISECONDS) // unit assumed to be milliseconds
}

// Usage
recordHttpResponseTime(200, 50) // Record a 50 ms response time for HTTP 200 status code
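Distribution summaries: Structurally similar to a timer, a distribution summary tracks the distribution of recorded values and offers finer-grained control over bucket sizes. In Micrometer it is intended for values that are not units of time, such as payload sizes; for durations, a timer is usually the better fit.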
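As a rough illustration, here is a sketch that records response body sizes. It assumes Micrometer 1.5 or later (for `serviceLevelObjectives`); the metric name, the helper `responseSizeSummary`, and the bucket boundaries are made up for the example.

```kotlin
import io.micrometer.core.instrument.DistributionSummary
import io.micrometer.core.instrument.MeterRegistry
import io.micrometer.core.instrument.simple.SimpleMeterRegistry

// Hypothetical metric: response body sizes in bytes, with explicit bucket boundaries.
fun responseSizeSummary(registry: MeterRegistry): DistributionSummary =
    DistributionSummary.builder("http.response.size")
        .baseUnit("bytes")
        .serviceLevelObjectives(1_024.0, 10_240.0, 102_400.0) // boundaries chosen for illustration
        .register(registry)

fun main() {
    val registry = SimpleMeterRegistry() // in-memory registry, used here only for the demo
    val summary = responseSizeSummary(registry)

    summary.record(512.0)
    summary.record(2_048.0)

    println("count=${summary.count()}, total=${summary.totalAmount()}, mean=${summary.mean()}")
}
```

The boundaries become histogram buckets in backends that support them, such as Prometheus.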

Choose the approach that best fits your monitoring and reporting needs. Consider factors such as the granularity of data required, ease of use, and compatibility with your existing monitoring infrastructure.