Process and socket auditing with osquery

Enabling these auditing features requires additional configuration of osquery. osquery can leverage either BPF or the audit subsystems to record process executions and network connections in near real-time on Linux and macOS systems. Although these auditing features are extremely powerful for recording the activity from a host, they may introduce additional computational overhead and greatly increase the number of log events generated by osquery.

To read more about how event-based tables are created and designed, check out the osquery Table Pubsub Framework. On all supported platforms, process events are abstracted into the process_events table. Similarly, socket events are abstracted into the socket_events table.

To collect process events add a query like:

SELECT * FROM process_events;

to your query schedule, or to a query pack. If BPF is being used, change the table name to bpf_process_events.
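As a rough illustration, a schedule entry for this query could be generated as follows. This is a minimal sketch: the helper name `schedule_config` and the 60-second interval are assumptions made for the example, not values recommended by osquery.

```python
import json


def schedule_config(table="process_events", interval=60):
    """Build a minimal osquery config that schedules a SELECT on an events table."""
    return {
        "schedule": {
            table: {
                "query": f"SELECT * FROM {table};",
                "interval": interval,  # seconds between runs (assumed value)
            }
        }
    }


# Serialize the config so it can be written to an osquery configuration file.
config_text = json.dumps(schedule_config(), indent=2)
print(config_text)
```

Passing `"bpf_process_events"` as the table name produces the equivalent entry for the BPF-based table.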

Enabling these auditing features requires additional configuration of osquery and may have a performance impact. See the OS-specific sections for guidance.

General Troubleshooting

Although some testing of the underlying operating system configuration can be performed via osqueryi, keep in mind that osqueryi and osqueryd operate independently and do not communicate.

The --verbose flag can be really useful when trying to debug a problem.

Examine configuration flags

To verify that osquery's flags are set correctly, you can query the osquery_flags table. For example, on a macOS machine, the following output shows that osquery will process OpenBSM events.

osquery> select * from osquery_flags where name in ("disable_events", "disable_audit");
+----------------+------+---------------------------------------------------+---------------+-------+------------+
| name           | type | description                                       | default_value | value | shell_only |
+----------------+------+---------------------------------------------------+---------------+-------+------------+
| disable_audit  | bool | Disable receiving events from the audit subsystem | true          | false | 0          |
| disable_events | bool | Disable osquery publish/subscribe system          | false         | false | 0          |
+----------------+------+---------------------------------------------------+---------------+-------+------------+
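The same check can be automated once the rows are available as dictionaries (for example, parsed from JSON-formatted query results). The sketch below assumes each row carries string `name` and `value` fields like the columns shown above; the function name `auditing_enabled` is hypothetical.

```python
def auditing_enabled(flag_rows):
    """Return True when neither the audit publisher nor the events system is disabled."""
    values = {row["name"]: row["value"] for row in flag_rows}
    return all(
        values.get(name) == "false" for name in ("disable_audit", "disable_events")
    )


# Rows mirroring the table above (only the columns the check needs).
rows = [
    {"name": "disable_audit", "default_value": "true", "value": "false"},
    {"name": "disable_events", "default_value": "false", "value": "false"},
]
print(auditing_enabled(rows))
```

A missing flag is treated as "not enabled", so an empty result set never reports success by accident.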

Examine event table

osquery keeps state about the events subsystem in the osquery_events table. The events column is of note here.

This example is from a macOS machine with events enabled, but with no events recorded yet. You should try triggering an event and then confirm that the event count is non-zero. If it remains at zero, the problem is likely in how the OS auditing side is configured. See the platform-specific instructions.

osquery> select * from osquery_events;
+-------------------------+-----------------+------------+---------------+--------+-----------+--------+
| name                    | publisher       | type       | subscriptions | events | refreshes | active |
+-------------------------+-----------------+------------+---------------+--------+-----------+--------+
| diskarbitration         | diskarbitration | publisher  | 1             | 0      | 0         | 1      |
| event_tapping           | event_tapping   | publisher  | 1             | 0      | 0         | 0      |
| fsevents                | fsevents        | publisher  | 0             | 0      | 24        | 1      |
| iokit                   | iokit           | publisher  | 1             | 0      | 0         | 1      |
| openbsm                 | openbsm         | publisher  | 9             | 0      | 0         | 0      |
| scnetwork               | scnetwork       | publisher  | 0             | 0      | 0         | 0      |
| disk_events             | diskarbitration | subscriber | 1             | 0      | 0         | 1      |
| file_events             | fsevents        | subscriber | 0             | 0      | 0         | 1      |
| hardware_events         | iokit           | subscriber | 1             | 0      | 0         | 1      |
| process_events          | openbsm         | subscriber | 8             | 0      | 0         | 1      |
| user_events             | openbsm         | subscriber | 1             | 0      | 0         | 1      |
| user_interaction_events | event_tapping   | subscriber | 1             | 0      | 0         | 1      |
| yara_events             | fsevents        | subscriber | 0             | 0      | 0         | 1      |
+-------------------------+-----------------+------------+---------------+--------+-----------+--------+
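To spot subscribers that have subscriptions but have not yet received anything, the rows can be filtered programmatically. This is a sketch under the assumption that the rows are dictionaries with string values matching the columns above; `silent_subscribers` is a hypothetical helper name.

```python
def silent_subscribers(event_rows):
    """List subscribers that hold subscriptions but have recorded zero events."""
    return [
        row["name"]
        for row in event_rows
        if row["type"] == "subscriber"
        and int(row["subscriptions"]) > 0
        and int(row["events"]) == 0
    ]


# A few rows copied from the table above.
sample_rows = [
    {"name": "openbsm", "type": "publisher", "subscriptions": "9", "events": "0"},
    {"name": "file_events", "type": "subscriber", "subscriptions": "0", "events": "0"},
    {"name": "process_events", "type": "subscriber", "subscriptions": "8", "events": "0"},
    {"name": "user_events", "type": "subscriber", "subscriptions": "1", "events": "0"},
]
print(silent_subscribers(sample_rows))
```

Subscribers with zero subscriptions (such as file_events here) are skipped, since no scheduled query is expected to produce events for them.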

Linux process auditing using Audit

On Linux, osquery can use the Audit system to collect and process events. It accomplishes this by monitoring syscalls such as execve() and execveat(). auditd should not be running when using osquery's process auditing, as it will conflict with osqueryd over access to the audit netlink socket. You should also ensure auditd is not configured to start at boot.

The only prerequisite for using osquery's auditing functionality on Linux is that you must use a kernel version that contains the Audit functionality. Most kernels over version 2.6 have this capability.

There is no requirement to install auditd or libaudit. osquery only uses the audit features that exist in the kernel.

A sample log entry from process_events may look something like this:

{
  "action": "added",
  "columns": {
    "uid": "0",
    "time": "1527895541",
    "pid": "30219",
    "path": "/usr/bin/curl",
    "auid": "1000",
    "cmdline": "curl google.com",
    "ctime": "1503452096",
    "cwd": "",
    "egid": "0",
    "euid": "0",
    "gid": "0",
    "parent": ""
  },
  "unixTime": 1527895550,
  "hostIdentifier": "vagrant",
  "name": "process_events",
  "numerics": false
}
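A log entry like the one above can be consumed with a few lines of code. The following sketch assumes each result is logged as one JSON object per line and that column values arrive as strings, as in the sample; the `summarize` helper is hypothetical.

```python
import json
from datetime import datetime, timezone

# A trimmed copy of the sample entry above, as it might appear on one log line.
line = (
    '{"action": "added", "columns": {"uid": "0", "time": "1527895541", '
    '"pid": "30219", "path": "/usr/bin/curl", "cmdline": "curl google.com"}, '
    '"unixTime": 1527895550, "name": "process_events"}'
)


def summarize(entry):
    """Render a one-line summary of a process_events log entry."""
    cols = entry["columns"]
    # Column values are strings, so the epoch time must be converted explicitly.
    when = datetime.fromtimestamp(int(cols["time"]), tz=timezone.utc)
    return f"{when.isoformat()} pid={cols['pid']} uid={cols['uid']} {cols['path']}: {cols['cmdline']}"


summary = summarize(json.loads(line))
print(summary)
```

Note that the event time in `columns.time` differs from the top-level `unixTime`, which records when the result was logged.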

To better understand how this works, let's walk through 4 configuration options. These flags can be set at the command line or placed into the osquery.flags file.

  1. --disable_audit=false: by default this is set to true, which prevents osquery from opening the kernel audit's netlink socket. Setting it to false tells osquery to enable the auditing functionality.
  2. --audit_allow_config=true: by default this is set to false, which prevents osquery from making changes to the audit configuration settings. These changes include adding/removing rules, setting the global enable flags, and adjusting performance and rate parameters. Unless you plan to set all of those things manually, you should set this to true. If you are configuring audit yourself, using a control binary or /etc/audit.conf, osquery may override your settings.
  3. --audit_persist=true: by default this is true, and it instructs osquery to 'regain' the audit netlink socket if another process also accesses it. However, you should do your best to ensure that no other program is running which attempts to access the audit netlink socket.
  4. --audit_allow_process_events=true: this flag indicates that you would like to record process events.
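The four flags above can be collected into a flagfile. The sketch below only renders the file contents, one flag per line; the `render_flagfile` helper is an assumption for illustration, and where the file is written is left to your deployment.

```python
# The four flags discussed above, with the values that enable process auditing.
AUDIT_FLAGS = {
    "disable_audit": "false",
    "audit_allow_config": "true",
    "audit_persist": "true",
    "audit_allow_process_events": "true",
}


def render_flagfile(flags):
    """Render flags as flagfile text, one --name=value entry per line."""
    return "\n".join(f"--{name}={value}" for name, value in flags.items()) + "\n"


flagfile_text = render_flagfile(AUDIT_FLAGS)
print(flagfile_text, end="")
```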

Linux socket auditing using Audit

osquery can also be used to record network connections by enabling socket_events. This table uses the syscalls bind() and connect() to gather information about network connections. It is not automatically enabled when process_events is enabled because it can introduce considerable load on the system.

To enable socket events, use the --audit_allow_sockets flag.

A sample socket_event log entry looks like this:

{
  "action": "added",
  "columns": {
    "time": "1527895541",
    "success": "1",
    "remote_port": "80",
    "action": "connect",
    "auid": "1000",
    "family": "2",
    "local_address": "",
    "local_port": "0",
    "path": "/usr/bin/curl",
    "pid": "30220",
    "remote_address": "172.217.164.110"
  },
  "unixTime": 1527895545,
  "hostIdentifier": "vagrant",
  "name": "socket_events",
  "numerics": false
}
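Because socket events can be numerous, it is often useful to filter them after collection. The sketch below keeps only connect events to non-private addresses; the `is_external_connect` helper and the choice of "non-private" as the filter are assumptions for illustration.

```python
import ipaddress


def is_external_connect(entry):
    """Return True for connect events whose remote address is not private."""
    cols = entry["columns"]
    # bind events and entries without a remote address are not outbound connections.
    if cols.get("action") != "connect" or not cols.get("remote_address"):
        return False
    return not ipaddress.ip_address(cols["remote_address"]).is_private


# Columns taken from the sample entry above.
sample_event = {
    "name": "socket_events",
    "columns": {
        "action": "connect",
        "remote_address": "172.217.164.110",
        "remote_port": "80",
        "path": "/usr/bin/curl",
        "pid": "30220",
    },
}
print(is_external_connect(sample_event))
```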

If you would like to log UNIX domain sockets use the hidden flag: --audit_allow_unix. This will put considerable strain on the system as many default actions use domain sockets. You will also need to explicitly select the socket column from the socket_events table.

Troubleshooting Audit-based process and socket auditing on Linux

There are a few different methods to ensure you have configured auditing correctly.

  1. Ensure you have supplied all of the necessary flags mentioned above, either as command-line arguments or in your flagfile.
  2. Verify auditd is not running, if it is installed on the system.
  3. Run auditctl -s if the binary is present on your system and verify that enabled is not set to zero and that the pid corresponds to an osquery process.
  4. Verify that your osquery configuration has a query to SELECT from the process_events and/or socket_events tables
  5. You may also run auditing using osqueryi as root:
osqueryi --audit_allow_config=true --audit_allow_sockets=true --audit_persist=true --disable_audit=false --events_expiry=1 --events_max=50000 --logger_plugin=filesystem  --disable_events=false
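The auditctl -s check from step 3 can be scripted. This sketch assumes the status output consists of one "key value" pair per line (the exact layout may vary between audit versions), and the sample text and pid below are made up for illustration.

```python
def parse_audit_status(text):
    """Parse 'key value' lines into a dict, keeping only integer values."""
    status = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            status[parts[0]] = int(parts[1])
    return status


def audit_owned_by(status, osquery_pid):
    """Return True when auditing is enabled and the audit pid matches osquery's."""
    return status.get("enabled", 0) != 0 and status.get("pid") == osquery_pid


# Hypothetical status output; real output may contain additional fields.
sample_status = "enabled 1\nfailure 0\npid 4242\nrate_limit 0\nbacklog_limit 4096\n"
status = parse_audit_status(sample_status)
print(audit_owned_by(status, 4242))
```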

If you would like to debug the raw audit events as osqueryd sees them, use the hidden flag --audit_debug. This will print all of the RAW audit lines to osquery's stdout.

NOTICE: Linux systems running journald collect logging data from several sources, including records originating from the kernel audit subsystem (something that osquery enables). To avoid performance problems on busy boxes (especially when osquery event tables are enabled), it is recommended to mask audit logs from entering the journal with the following command: systemctl mask --now systemd-journald-audit.socket.

User event auditing with Audit

On Linux, a companion table called user_events is included that provides several authentication-based events. If you are enabling process auditing it should be trivial to also include this table.

Linux process and socket auditing using BPF

When osquery is running on a recent kernel (>= 4.18), the BPF eventing framework can be used. This event publisher needs to monitor for more system calls to reach feature parity with the Audit-based tables. For this reason, enabling BPF will also enable both the bpf_process_events and bpf_socket_events tables.

In order to start the publisher and enable the subscribers, the following flags must be passed: --disable_events=false --enable_bpf_events=true. The --verbose flag can also be extremely useful when setting up the configuration for the first time, since it emits more debug information when something fails.

The BPF framework will make use of a perf event array and several per-cpu maps in order to receive events and correctly capture strings and buffers. These structures can be configured using the following command line flags:

  • bpf_perf_event_array_exp: size of the perf event array, as a power of two
  • bpf_buffer_storage_size: how many slots of 4096 bytes should be available in each memory pool

Memory usage depends on both:

  1. How many processors are currently online
  2. How many processors can be added by hotswapping

The BPF event publisher uses 6 memory pools, grouping system calls in order to evenly distribute memory usage. Not counting the internal maps used to merge sys_enter/sys_exit events (the size for these maps is rather small), memory usage can be easily estimated with the following formula:

buffer_storage_bytes = memory_pool_count * (bpf_buffer_storage_size * 4096) * possible_cpu_count
perf_bytes = (2 ^ bpf_perf_event_array_exp) * online_cpu_count

The cpu count numbers can be read from the /sys folder:

possible_cpu_count: /sys/devices/system/cpu/possible
online_cpu_count: /sys/devices/system/cpu/online
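The estimate above can be worked through in code. The sketch below assumes the /sys files contain CPU lists such as "0-3" or "0,2-5", and the parameter values in the example (8 slots, exponent 10) are hypothetical rather than osquery's defaults.

```python
MEMORY_POOL_COUNT = 6  # number of memory pools used by the BPF publisher
SLOT_SIZE = 4096  # bytes per buffer storage slot


def count_cpus(cpu_list):
    """Count CPUs in a kernel CPU list string such as '0-3' or '0,2-5'."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        total += int(last or first) - int(first) + 1
    return total


def estimate_bpf_memory(buffer_storage_size, perf_event_array_exp, possible_cpus, online_cpus):
    """Apply the two formulas above and return (buffer_storage_bytes, perf_bytes)."""
    buffer_storage_bytes = MEMORY_POOL_COUNT * (buffer_storage_size * SLOT_SIZE) * possible_cpus
    perf_bytes = (2 ** perf_event_array_exp) * online_cpus
    return buffer_storage_bytes, perf_bytes


# Example: a VM with 4 online CPUs where hotswapping raises the possible count to 128.
possible = count_cpus("0-127")
online = count_cpus("0-3")
buffer_bytes, perf_bytes = estimate_bpf_memory(8, 10, possible, online)
print(possible, online, buffer_bytes, perf_bytes)
```

On a real host, the two strings would be read from /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online. With these example values, the buffer storage term dominates because it scales with the possible CPU count rather than the online one.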

VMware Fusion (and possibly other systems as well) supports CPU hotswapping, raising the possible_cpu_count to 128. This causes a huge increase in memory usage, and it is for this reason that the default settings are rather low.

This problem can be easily fixed by disabling hotswapping. This setting is unfortunately not available through the user interface, so it needs to be changed directly in the .vmx file (vcpu.hotadd=FALSE).

macOS process & socket auditing

osquery supports OpenBSM audit on macOS platforms. To enable it in osquery, you need to set the following command line flags: --disable_audit=false --disable_events=false --audit_allow_config.

On macOS, osquery reads from the OpenBSM audit subsystem. This feature is already enabled on all macOS installations, but the default settings do not audit process execution or the root user. The osquery command line flag --audit_allow_config will make run-time configuration changes to your system audit to enable these features. This is all you need to get up and running.

Alternatively, instead of using the --audit_allow_config flag, you may edit the audit_control file in /etc/security/ for more granular needs. This is optional and considered an "advanced configuration". An example configuration is provided below, but the important flags are: ex, pc, argv, and arge. The ex flag logs exec events, while pc logs exec, fork, and exit. If you don't need fork and exit you may leave the pc flag out; however, getting the parent pid may require fork in the future. If you care about getting the arguments and environment variables, you also need argv and arge. More about these flags can be found here. Note that a reboot of the system might be required for these new flags to take effect. audit -s should restart the audit system, but your mileage may vary.

#
# $P4: //depot/projects/trustedbsd/openbsm/etc/audit_control#8 $
#
dir:/var/audit
flags:ex,pc,ap,aa,lo,nt
minfree:5
naflags:no
policy:cnt,argv,arge
filesz:2M
expire-after:10M
superuser-set-sflags-mask:has_authenticated,has_console_access
superuser-clear-sflags-mask:has_authenticated,has_console_access
member-set-sflags-mask:
member-clear-sflags-mask:has_authenticated
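A quick way to confirm that an audit_control file contains the entries discussed above is to parse its key:value lines. This is a sketch: the helper names are hypothetical, and the check only covers the four entries named in this section (ex, pc, argv, arge).

```python
# Entries this section calls out as important, grouped by audit_control key.
REQUIRED = {"flags": ("ex", "pc"), "policy": ("argv", "arge")}


def parse_audit_control(text):
    """Parse key:value lines into a dict of comma-separated item lists."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(":")
        settings[key] = [item for item in value.split(",") if item]
    return settings


def missing_entries(settings):
    """Return the required key:item pairs that are absent from the settings."""
    return [
        f"{key}:{item}"
        for key, items in REQUIRED.items()
        for item in items
        if item not in settings.get(key, [])
    ]


# A subset of the example configuration above.
sample_control = "#\ndir:/var/audit\nflags:ex,pc,ap,aa,lo,nt\npolicy:cnt,argv,arge\nmember-set-sflags-mask:\n"
print(missing_entries(parse_audit_control(sample_control)))
```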

osquery events optimization

This section provides a brief overview of common and recommended optimizations for event-based tables. These optimizations also apply to the FIM events.

  1. --events_optimize=true: apply optimizations when SELECTing from events-based tables; enabled by default.
  2. --events_expiry: the lifetime of buffered events in seconds, with a default value of 86000.
  3. --events_max: the maximum number of events to store in the buffer before expiring them, with a default value of 1000.

The goal of these optimizations is to protect the running process and the system from performance degradation. By default these are all enabled, which is good for configuration and performance, but may introduce inconsistencies on highly-stressed systems using process auditing.

Optimizations work best when SELECTing often from event-based tables. Otherwise the events remain in a buffered state. When an event-based table is selected within the daemon, the backing storage maintaining event data is cleared according to the --events_expiry lifetime. Setting this value to 1 will auto-clear events whenever a SELECT is performed against the table, minimizing the impact of the buffer.