Lessons Learned from Learning OpenTelemetry

 1 Posted on by

I'm knee-deep in production for Learning OpenTelemetry, releasing in just over a month. This is my second book, so I figured it was a good time to sit down and write up a couple of things I learned while writing this one, if only so when the writing bug gets me again in a year or so I can look back at this post and ask myself if it was really worth it.

Mostly joking, but writing is hard! There's a real balance you need to strike, especially when doing technical-but-not-documentation content.

Continue reading "Lessons Learned from Learning OpenTelemetry"

What Do We Mean When We Talk About OpenTelemetry?

 0 Posted on by

I'm motivated to write this post as a result of several discussions I've had over the past week or so prompted in part by the announcement of Elastic wanting to donate their profiling agent to the OpenTelemetry project. One of the bigger challenges around OpenTelemetry is that you can think of it as a vector. It not only has a shape, it has a direction, and the way you think about the project and what it is has a lot to do with how well you understand that direction. There's the OpenTelemetry of yesterday, the OpenTelemetry of today, and the OpenTelemetry of tomorrow. Let's talk about each of these in turn, so that we can try and build a model of what OpenTelemetry is in a holistic sense.

Continue reading "What Do We Mean When We Talk About OpenTelemetry?"

OTel TIL – What The Heck Is Instrumentation, Anyway?

 1 Posted on by

Ever asked ChatGPT about OpenTelemetry? There's a pretty good chance that what it spits out at you started out as something I wrote, years ago. When the project started, I picked up where I left off maintaining the docs and website for OpenTracing and built the first few versions of opentelemetry.io (seen here in late 2019), including most of its initial documentation, concept pages, and so forth. Little did I realize then that the project would become as large as it did, or that everything I wrote would get repeated across the internet on dozens of other documentation sites, marketing pages, and blogs... and I really did not see those words getting fed into massive language models, thus ossifying a lot of the concepts that I wrote about into point-in-time snapshots of what a lot of words mean. One of these words, and the one I want to dive into, is instrumentation.

Continue reading "OTel TIL – What The Heck Is Instrumentation, Anyway?"

Selling The Vision

10  4 Posted on by

OpenTelemetry can be a difficult project to describe to people, because the gap between what it is today and what it will be tomorrow is very large. It's easy to stare at it from a distance, squint your eyes, and wonder what the hell we're doing over here. The further away you are from the core contributors, maintainers, and weird little observability guys at the center of it all, the harder it is for things to come into focus. There's a few reasons for that, one of which is that I truly think that it isn't a completely shared vision (and that's ok, for reasons I'll get into) -- but the biggest is that the vision really is just that. A vision, one that is going to take years to realize. That vision is what should excite people, but because we're not great at selling it or even describing it, it winds up turning people away.

Continue reading "Selling The Vision"

Parsing Apache2 access logs with the OpenTelemetry Collector

 0 Posted on by

I couldn't find a ton of resources on this, but FYI -- the OpenTelemetry Collector's filelog receiver has a pretty robust regex parser built into it. Want to get your access.log files from Apache? Here's the config.

  filelog/access:
    include: [ /var/log/apache2/access.log ]
    operators:
      - type: regex_parser
        regex: '(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) - - \[(?P<datetime>[^\]]+)] "(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" (?P<status>\d{3}) (?P<size>\d+) "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)'
        timestamp:
          parse_from: attributes["datetime"]
          layout: '%d/%b/%Y:%H:%M:%S %z'
        severity:
          parse_from: attributes["status"]

The documentation for a lot of this stuff is stuck inside the GitHub repositories for the receiver modules, so be sure to check that out if you're looking for a quick reference.

What if we want to go further and turn our attributes into their appropriate semantic conventions? While there's no explicit log conventions for HTTP servers, the Span ones should work for our purposes.

  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - replace_all_patterns(attributes, "key", "method",  "http.request.method")
          - replace_all_patterns(attributes, "key", "status",  "http.response.status_code")
          - replace_all_patterns(attributes, "key", "user_agent", "user_agent.original")
          - replace_all_patterns(attributes, "key", "ip", "client.address")
          - replace_all_patterns(attributes, "key", "path", "url.path")
          - delete_key(attributes, "datetime")
          - delete_key(attributes, "size")

This should be enough to get started, at least, although there's more you might want to do:

  • Add resource attributes for the logical service name (apache, reverse-proxy, etc.)
  • Change up your Apache log format to get more information like the scheme, or time spent serving the request.