Linux I-Node Monitoring in C++20

The Linux kernel provides a really useful i-node monitoring API. “I-nodes” are information nodes, also sometimes called index nodes, that store information about the objects within a Linux filesystem: files, directories and mount points.

What would it take to set up a super-simple C++ wrapper for Linux’s “inotify” service, one that converts all errors to exceptions and masks the underlying signalling and buffering requirements?

Solution

Introducing the sys::inotify C++20 wrapper:

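#include <bit>
#include <bitset>
#include <cerrno>
#include <cstdint> // uint32_t
#include <cstdio>  // BUFSIZ
#include <map>
#include <string>
#include <system_error>
#include <unordered_map>
#include <utility> // std::make_pair
#include <vector>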

extern "C" {
#include <poll.h>
#include <sys/inotify.h>
#include <unistd.h>
}

namespace sys {
class inotify {
  const int fd_;
  std::unordered_map<int, std::string> wds_;

public:
  inotify() : fd_{inotify_init1(IN_NONBLOCK)} {
    if (fd_ < 0)
      throw std::system_error(errno, std::system_category());
  }

  int add_watch(const std::string &pathname, uint32_t mask = IN_ALL_EVENTS) {
    const int wd = inotify_add_watch(fd_, pathname.c_str(), mask);
    if (wd < 0)
      throw std::system_error(errno, std::system_category());
    wds_.insert(std::make_pair(wd, pathname));
    return wd;
  }

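  // The descriptor has a single owner: a copy would close it twice.
  inotify(const inotify &) = delete;
  inotify &operator=(const inotify &) = delete;

  virtual ~inotify() { close(fd_); }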

  int poll(short events = POLLIN, int timeout = 0) {
    pollfd fds = {fd_, events, 0};
    int rc = ::poll(&fds, 1, timeout);
    if (rc < 0)
      throw std::system_error(errno, std::system_category());
    // poll() returns the number of descriptors with pending events: 1 when
    // this descriptor is ready, 0 on timeout. Answer the returned events, or
    // 0 if the poll timed out.
    return rc == 1 ? fds.revents : 0;
  }

  struct event {
    std::string wd;
    uint32_t mask;
    std::string name;

    auto mask_to_strings() const {
      std::vector<std::string> strings;
      static const std::map<int, std::string> in{
          {std::countr_zero(uint32_t(IN_ACCESS)), "ACCESS"},
          {std::countr_zero(uint32_t(IN_MODIFY)), "MODIFY"},
          {std::countr_zero(uint32_t(IN_ATTRIB)), "ATTRIB"},
          {std::countr_zero(uint32_t(IN_CLOSE_WRITE)), "CLOSE_WRITE"},
          {std::countr_zero(uint32_t(IN_CLOSE_NOWRITE)), "CLOSE_NOWRITE"},
          {std::countr_zero(uint32_t(IN_OPEN)), "OPEN"},
          {std::countr_zero(uint32_t(IN_MOVED_FROM)), "MOVED_FROM"},
          {std::countr_zero(uint32_t(IN_MOVED_TO)), "MOVED_TO"},
          {std::countr_zero(uint32_t(IN_CREATE)), "CREATE"},
          {std::countr_zero(uint32_t(IN_DELETE)), "DELETE"},
          {std::countr_zero(uint32_t(IN_DELETE_SELF)), "DELETE_SELF"},
          {std::countr_zero(uint32_t(IN_MOVE_SELF)), "MOVE_SELF"},
          {std::countr_zero(uint32_t(IN_UNMOUNT)), "UNMOUNT"},
          {std::countr_zero(uint32_t(IN_Q_OVERFLOW)), "Q_OVERFLOW"},
          {std::countr_zero(uint32_t(IN_IGNORED)), "IGNORED"},
          {std::countr_zero(uint32_t(IN_ONLYDIR)), "ONLYDIR"},
          {std::countr_zero(uint32_t(IN_DONT_FOLLOW)), "DONT_FOLLOW"},
          {std::countr_zero(uint32_t(IN_EXCL_UNLINK)), "EXCL_UNLINK"},
          {std::countr_zero(uint32_t(IN_MASK_CREATE)), "MASK_CREATE"},
          {std::countr_zero(uint32_t(IN_MASK_ADD)), "MASK_ADD"},
          {std::countr_zero(uint32_t(IN_ISDIR)), "ISDIR"},
          {std::countr_zero(uint32_t(IN_ONESHOT)), "ONESHOT"}};
      std::bitset<32> bitset{mask};
      while (bitset.any()) {
        auto count = std::countr_zero(bitset.to_ulong());
        auto it = in.find(count);
        strings.push_back(it == in.end()
                              ? std::string("bit") + std::to_string(count)
                              : it->second);
        bitset.reset(count);
      }
      return strings;
    }
  };

  std::vector<event> read() {
    // Align the buffer for inotify_event, as the inotify(7) example does.
    alignas(inotify_event) char buf[BUFSIZ];
    auto len = ::read(fd_, buf, sizeof(buf));
    if (len < 0)
      throw std::system_error(errno, std::system_category());
    std::vector<event> events;
    const inotify_event *event;
    for (auto ptr = buf; ptr < buf + len;
         ptr += sizeof(inotify_event) + event->len) {
      event = reinterpret_cast<decltype(event)>(ptr);
      // A non-zero len means a null-terminated name follows the structure. A
      // zero len means no name at all, so event->name must not be read. The
      // event's total length steps over the name and its terminator.
      events.push_back(inotify::event{
          .wd = wds_[event->wd],
          .mask = event->mask,
          .name = event->len ? std::string(event->name) : std::string()});
    }
    return events;
  }
};
} // namespace sys

Also available as a Gist. A super-simple use case writes events to standard output, as follows. The program’s command-line arguments specify the watched directories. It waits up to one second at a time for kernel notifications and writes the events one per line to standard output.

#include "inotifyxx.hh"
#include <cstdlib> // EXIT_SUCCESS
#include <iostream>

int main(int argc, char **argv) {
  sys::inotify inotify;
  for (int optind = 1; optind < argc; optind++)
    inotify.add_watch(argv[optind]);
  for (;;)
    if (inotify.poll(POLLIN, 1000) & POLLIN)
      for (const auto &event : inotify.read()) {
        std::cout << event.wd << "/" << event.name;
        for (const auto &in : event.mask_to_strings())
          std::cout << " " << in;
        std::cout << std::endl;
      }
  return EXIT_SUCCESS;
}

Invoked as inotifyxx . from a shell, the monitor program prints the following when a new file is created in that directory using touch hello:

./hello CREATE
./hello OPEN
./hello ATTRIB
./hello CLOSE_WRITE

Explanation

Polling deserves some explanation. The wrapper opens the notification descriptor with IN_NONBLOCK, so reading it while the kernel queue is empty does not return zero bytes; the read fails with EAGAIN. In this implementation, the read method turns that failure into a system error exception. Hence polling first is required. Read the notifications only if the poll method returns POLLIN events.
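
As an alternative to throwing on an empty queue, a read helper could treat EAGAIN as "no events yet". The following is a minimal sketch under that assumption; read_if_ready is a hypothetical free function, not part of the wrapper above.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <system_error>

extern "C" {
#include <sys/inotify.h>
#include <unistd.h>
}

// Hypothetical helper: reads from a non-blocking descriptor. Returns the
// number of bytes read, or 0 when the kernel queue is empty (EAGAIN). Any
// other failure still becomes an exception.
std::size_t read_if_ready(int fd, char *buf, std::size_t size) {
  assert(buf != nullptr && size > 0);
  const ssize_t len = ::read(fd, buf, size);
  if (len >= 0)
    return static_cast<std::size_t>(len);
  if (errno == EAGAIN) // same value as EWOULDBLOCK on Linux
    return 0;
  throw std::system_error(errno, std::system_category());
}
```

With a helper like this, a caller can skip the poll and simply retry later, at the cost of spinning if it retries in a tight loop. Polling with a timeout, as the wrapper does, lets the kernel put the process to sleep instead.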

Reading an event requires a buffer of sufficient size. The call to Linux read() will fail with errno 22 (EINVAL) unless the buffer can accommodate at least the next event. Also, note that the inotify_event structure only defines the fixed-length start of an event. Its len field gives the number of bytes that follow for the name, including the null terminator and any padding, and is zero when the event carries no name. Walking through the kernel-supplied event buffer therefore steps over the initial fixed-length structure plus len bytes each time.
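
The buffer walk can be sketched in isolation from the kernel. This is an illustrative version, not the wrapper's own code: parsed_event and parse_events are hypothetical names, and the header is copied out with memcpy so that the sketch does not depend on the buffer's alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

extern "C" {
#include <sys/inotify.h>
}

// Hypothetical plain record for one decoded event.
struct parsed_event {
  int wd;
  std::uint32_t mask;
  std::string name;
};

// Walks a buffer filled by read() on an inotify descriptor. Each record is
// a fixed-length inotify_event followed by header.len bytes of name.
std::vector<parsed_event> parse_events(const char *buf, std::size_t len) {
  std::vector<parsed_event> out;
  std::size_t offset = 0;
  while (offset + sizeof(inotify_event) <= len) {
    inotify_event header;
    std::memcpy(&header, buf + offset, sizeof header);
    // The kernel never returns a partial event, so the name must fit.
    assert(offset + sizeof header + header.len <= len);
    const char *name = buf + offset + sizeof header;
    // strnlen stops at the terminator or at header.len, whichever comes
    // first, so a zero len yields an empty name without reading anything.
    out.push_back({header.wd, header.mask,
                   std::string(name, ::strnlen(name, header.len))});
    offset += sizeof header + header.len;
  }
  return out;
}
```

Bounding the name by len rather than trusting the terminator costs little and keeps the walk inside the buffer even for events that carry no name.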

Converting the event mask to its string representations relies on bit counting and a bit set.

      std::bitset<32> bitset{mask};
      while (bitset.any()) {
        auto count = std::countr_zero(bitset.to_ulong());
        // do something
        bitset.reset(count);
      }

This is one way to iterate the set bits. It skips the zeroes by counting the zero bits from the right to find the least-significant set bit, an operation that typically maps to a single hardware instruction. A C++ std::bitset then resets that bit, and the loop repeats until no set bits are left. The bit set ends up as 0.

Conclusion

Turning errors into exceptions helps in simple use cases.

Should the inotify methods have const qualifiers? Tempting perhaps. The C++ object barely mutates when polling and reading; only the watch-descriptor lookup through operator[] in read() is non-const. However, the sys::inotify interface wraps kernel-space objects that do mutate. Semantically, therefore, polling and reading operations are mutable.

The implementation is not perfect by any means. For one thing, it exposes the underlying Unix headers. Any C++ translation unit that includes the header also includes the low-level headers, a kind of namespace pollution that does not feel ideal in C++ where namespaces carefully segregate the code. In this version the interface itself needs the manifest constants that those headers define, such as IN_ALL_EVENTS and POLLIN. Future work could split the interface and implementation, hiding the bit-mask constants and the low-level header dependencies behind an interface abstraction.
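
One possible shape for such a split is the pointer-to-implementation idiom plus a scoped enumeration that mirrors the kernel constants. The sketch below is only an illustration: watch_mask and the impl structure are hypothetical, only three flags are mirrored, and both halves appear in one listing although they would live in separate header and source files.

```cpp
// --- Public header part: no Unix headers required ---
#include <cstdint>
#include <memory>
#include <string>

namespace sys {
// Hypothetical scoped enumeration mirroring a few IN_* bit masks.
enum class watch_mask : std::uint32_t {
  modify = 0x002,
  create = 0x100,
  remove = 0x200,
};

class inotify {
  struct impl; // defined only in the source file
  std::unique_ptr<impl> impl_;

public:
  inotify();
  ~inotify();
  int add_watch(const std::string &pathname, watch_mask mask);
};
} // namespace sys

// --- Source file part: the only place that sees the kernel headers ---
#include <cassert>
#include <cerrno>
#include <system_error>

extern "C" {
#include <sys/inotify.h>
#include <unistd.h>
}

namespace sys {
// Fail the build if the mirrored values ever drift from the kernel's.
static_assert(static_cast<std::uint32_t>(watch_mask::modify) == IN_MODIFY,
              "modify must match IN_MODIFY");
static_assert(static_cast<std::uint32_t>(watch_mask::create) == IN_CREATE,
              "create must match IN_CREATE");
static_assert(static_cast<std::uint32_t>(watch_mask::remove) == IN_DELETE,
              "remove must match IN_DELETE");

struct inotify::impl {
  int fd;
};

inotify::inotify() : impl_{std::make_unique<impl>()} {
  impl_->fd = inotify_init1(IN_NONBLOCK);
  if (impl_->fd < 0)
    throw std::system_error(errno, std::system_category());
}

inotify::~inotify() { close(impl_->fd); }

int inotify::add_watch(const std::string &pathname, watch_mask mask) {
  assert(impl_ && impl_->fd >= 0);
  const int wd = inotify_add_watch(impl_->fd, pathname.c_str(),
                                   static_cast<std::uint32_t>(mask));
  if (wd < 0)
    throw std::system_error(errno, std::system_category());
  return wd;
}
} // namespace sys
```

The trade-off is an extra allocation and a level of indirection in exchange for a header that pulls in nothing from the operating system.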

Limitations exist, of course. Watch descriptors do not operate recursively. If you create a sub-directory, the monitor reports its creation but will not notify changes within the sub-directory without first adding it to the set of watches. Work for another day.
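
A first step towards recursive monitoring might be to enumerate the existing directory tree and add one watch per directory. The sketch below covers only the enumeration, using the C++17 <filesystem> library; directories_under is a hypothetical helper, and directories created after the scan would still need their own watches.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical helper: answers the root plus every directory beneath it.
// Symbolic links to directories are skipped, so each directory appears
// once and the walk cannot loop.
std::vector<std::string> directories_under(const std::filesystem::path &root) {
  assert(std::filesystem::is_directory(root));
  std::vector<std::string> dirs{root.string()};
  for (const auto &entry :
       std::filesystem::recursive_directory_iterator(root))
    if (std::filesystem::is_directory(entry.symlink_status()))
      dirs.push_back(entry.path().string());
  return dirs;
}
```

Passing each returned path to add_watch would cover the tree as it exists at start-up. Keeping it covered means reacting to events that carry both CREATE and ISDIR by adding a watch for the new directory, and there is an unavoidable window in which files created inside it go unreported.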

C++20 dependencies also exist. The <bit> header does not appear in earlier versions of the standard. This makes the standard countr_zero function unavailable. If using the GCC compiler toolchain, the built-in __builtin_ctzl can easily replace the C++20 standard function; the GCC built-in counts trailing zeroes (CTZ) in an unsigned long and, like the loop above, must only be given a non-zero value.
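
A fallback along those lines could be selected at compile time. The following is a sketch, assuming GCC or Clang when compiling for an earlier standard; trailing_zeros and set_bit_positions are hypothetical names. It works on a 32-bit mask, so it uses __builtin_ctz rather than the unsigned long variant __builtin_ctzl.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>
#if __cplusplus >= 202002L
#include <bit>
#endif

// Counts the zero bits below the least-significant set bit.
inline int trailing_zeros(std::uint32_t value) {
  assert(value != 0); // the built-in is undefined for zero
#if __cplusplus >= 202002L
  return std::countr_zero(value);
#else
  return __builtin_ctz(value); // GCC and Clang built-in
#endif
}

// Answers the positions of the set bits, least significant first.
std::vector<int> set_bit_positions(std::uint32_t mask) {
  std::vector<int> positions;
  while (mask != 0) {
    positions.push_back(trailing_zeros(mask));
    mask &= mask - 1; // clears the least-significant set bit
  }
  return positions;
}
```

Clearing the lowest bit with mask & (mask - 1) stands in for std::bitset::reset here, which keeps the sketch free of any other library dependency.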