Parsing Apache2 access logs with the OpenTelemetry Collector

I couldn't find a ton of resources on this, but FYI -- the OpenTelemetry Collector's filelog receiver has a pretty robust regex parser built into it. Want to get your access.log files from Apache? Here's the config.

  filelog/access:
    include: [ /var/log/apache2/access.log ]
    operators:
      - type: regex_parser
        regex: '(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) - - \[(?P<datetime>[^\]]+)] "(?P<method>\S+) (?P<path>\S+) (?P<protocol>\S+)" (?P<status>\d{3}) (?P<size>\d+) "(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)'
        timestamp:
          parse_from: attributes["datetime"]
          layout: '%d/%b/%Y:%H:%M:%S %z'
        severity:
          parse_from: attributes["status"]

The documentation for a lot of this stuff is stuck inside the GitHub repositories for the receiver modules, so be sure to check that out if you're looking for a quick reference.

What if we want to go further and turn our attributes into their appropriate semantic conventions? While there's no explicit log conventions for HTTP servers, the Span ones should work for our purposes.

  transform:
    error_mode: ignore
    log_statements:
      - context: log
        statements:
          - replace_all_patterns(attributes, "key", "method",  "http.request.method")
          - replace_all_patterns(attributes, "key", "status",  "http.response.status_code")
          - replace_all_patterns(attributes, "key", "user_agent", "user_agent.original")
          - replace_all_patterns(attributes, "key", "ip", "client.address")
          - replace_all_patterns(attributes, "key", "path", "url.path")
          - delete_key(attributes, "datetime")
          - delete_key(attributes, "size")

This should be enough to get started, at least, although there's more you might want to do:

Add resource attributes for the logical service name (apache, reverse-proxy, etc.)
Change up your Apache log format to get more information like the scheme, or time spent serving the request.

Data Bores

0 0 Posted on 11/20/23 by Austin Parker

Sampling is a method to reduce the volume of data processed and stored by observability tools. There’s a variety of methods and algorithms that can be employed to do this, and most observability practices will wind up using a blend of them, but this blog isn’t necessarily about how to implement any individual technique. No, what I’m interested in discussing is the why of sampling, the outcomes that we’re looking for when we implement it, and some of the novel work that I’m seeing around the subject.

Continue reading "Data Bores"

Deploying on Friday the 13th

0 0 Posted on 10/30/23 by Austin Parker

I looked at the significantly more senior engineer sitting across from me in the white-and-startup-blue offices of a former job. Scarcely three years out of college, but with a decade of IT experience under my belt, I dug deep, searching for the endless well of patience that I previously administered to passionate but confused administrative assistants panicking about the location of a Powerpoint file. “Come again?” was the best I could muster.

This was the big leagues, right? I was a Software Engineer now, and I did DevOps, and I was leading a Cloud Transformation – this is what we’re supposed to be doing! Here I was, being yanked back down to earth by a man with over twenty years of professional development experience, balking at learning how YAML worked because… “I wasn’t trained to do that.” In the moment, I demurred, gently guiding him back to the repository of Powershell scripts my team had built to aid in the new workflows we were pushing.

The statement haunted me, though, and it does to this day. I had labored under the impression that developers and engineers were a cut above; The new philosophers of our information age, capable of making these hunks of silicon and glass sing using their minds. The notion that one of them would balk from something like… well, a different configuration file format, in this case, was almost unthinkable. It stuck in the recesses of my mind, like a stray popcorn kernel.

While I can’t admit to knowing exactly what was going on in his mind that day, over time I believe that I’ve identified a ground truth about most people in software, and most teams; It is that, deep down, we are afraid.

Continue reading "Deploying on Friday the 13th"

Observability Cannot Fail, It Can Only Be Failed

0 0 Posted on 08/28/23 by Austin Parker

Being between jobs is a great time to step back, do some self-critique, and engage in light home improvement for fun and or profit. It’s this last pursuit that’s convinced me that if this whole computer thing doesn’t work out, I’m screwed — I don’t have the spirit of a tradesperson in my body. This revelation was prompted by my journey to install laminate flooring in my office, which has until now simply had a bare concrete floor. Originally, I had my heart set on some ‘Luxury Vinyl Planks’ (or LVP), which was not only recommended to me by industrious flooring salespeople, but was available in a variety of delightful colors and patterns.

Sadly, LVP commands a significant price premium, which was unattractive for what’s meant to be, ultimately, a temporary job. We’re going to get the basement finished eventually, with consistent flooring throughout, so why waste the money? Thus, I chose what seemed to be the ‘best’ laminate I could find, purchased all of the accessories and tools that I could find to aid in the installation, and spent hours reading and watching tutorials about it. Thus armed, I cleared out the office, cleaned the floor, and started to place the flooring.

Reader, it may surprise you to learn that this plan went to shit.

Continue reading "Observability Cannot Fail, It Can Only Be Failed"

Observability and the Decentralized Web

0 0 Posted on 08/22/23 by Austin Parker

Continue reading "Observability and the Decentralized Web"