YARA-based scanning with osquery
YARA is a tool that allows you to find textual or binary patterns inside of files.
There are two YARA-related tables in osquery, which serve very different purposes. The first table, called
yara_events, uses osquery's Events framework to monitor for filesystem changes and will execute YARA when a file
change event fires. The second table, just called yara, is a table for performing an on-demand YARA scan.
In this document, "signature file" is intended to be synonymous with "YARA rule file" (plain-text files commonly
distributed with a .yar or .yara filename extension, although any extension is allowed).
For more information about YARA, check out the documentation.
YARA Configuration
The configuration for osquery is simple. Here is an example config, grouping some YARA rule files from the local filesystem:
{
// Description of the YARA feature.
"yara": {
"signatures": {
// Each key is an arbitrary group name to give the signatures listed
"sig_group_1": [ "/Users/wxs/sigs/foo.yar", "/Users/wxs/sigs/bar.yar" ],
"sig_group_2": [ "/Users/wxs/sigs/baz.yar" ]
},
"file_paths": {
// Each key is a key from file_paths
// The value is a list of signature groups to run when an event fires
// These will be watched for and scanned when the event framework
// fires off an event to the yara_events table
"system_binaries": [ "sig_group_1" ],
"tmp": [ "sig_group_1", "sig_group_2" ]
}
},
// Paths to watch for filesystem events
"file_paths": {
"system_binaries": [ "/usr/bin/%", "/usr/sbin/%" ],
"tmp": [ "/Users/%/tmp/%%", "/tmp/%" ]
}
}
The first thing to notice is the file_paths section, which describes which paths to monitor for changes.
Each key is an arbitrary category name and the value is a list of paths. The path syntax follows the osquery
wildcard rules described on the FIM page. The paths, when expanded by osquery, are monitored for changes and
processed by the file_events table.
The second thing to notice is the yara section, which contains the configuration to use for YARA within osquery. The
yara section contains two keys: signatures and file_paths. The signatures key contains a set of arbitrary key
names, called "signature groups." The value for each group is a list of paths to the signature files that will be
compiled and stored within osquery. These paths must be absolute, not relative. The file_paths key maps the category
name for an event described in the global file_paths section to the signature groups to use when scanning.
For example, when a file in /usr/bin/ or /usr/sbin/ is changed it will be scanned with sig_group_1, which
consists of foo.yar and bar.yar. When a file in /Users/%/tmp/ (recursively) or /tmp/ is changed it will be scanned
with sig_group_1 and sig_group_2, which together consist of all three signature files.
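The relationship between the two file_paths sections and the signature groups can be sketched in a few lines of Python. This is only an illustration of the rules described above, not osquery's own validation logic: the function names check_yara_config and signature_files_for are hypothetical, the config is written as a Python dict because the // comments in the example would not parse as strict JSON, and the absolute-path check assumes POSIX-style paths.

```python
import posixpath

# Same structure as the example config, written as a Python dict.
EXAMPLE_CONFIG = {
    "yara": {
        "signatures": {
            "sig_group_1": ["/Users/wxs/sigs/foo.yar", "/Users/wxs/sigs/bar.yar"],
            "sig_group_2": ["/Users/wxs/sigs/baz.yar"],
        },
        "file_paths": {
            "system_binaries": ["sig_group_1"],
            "tmp": ["sig_group_1", "sig_group_2"],
        },
    },
    "file_paths": {
        "system_binaries": ["/usr/bin/%", "/usr/sbin/%"],
        "tmp": ["/Users/%/tmp/%%", "/tmp/%"],
    },
}


def check_yara_config(config):
    """Return a list of problems found in an osquery-style config dict."""
    problems = []
    yara = config.get("yara", {})
    signatures = yara.get("signatures", {})
    watched = config.get("file_paths", {})

    # Signature files must be given as absolute paths (POSIX paths assumed).
    for group, files in signatures.items():
        for sig_file in files:
            if not posixpath.isabs(sig_file):
                problems.append(f"{group}: signature path is not absolute: {sig_file}")

    # Each yara.file_paths key must name a top-level file_paths category,
    # and each listed value must name a defined signature group.
    for category, groups in yara.get("file_paths", {}).items():
        if category not in watched:
            problems.append(f"{category}: not defined in top-level file_paths")
        for group in groups:
            if group not in signatures:
                problems.append(f"{category}: unknown signature group {group}")
    return problems


def signature_files_for(config, category):
    """List the signature files that apply to one file_paths category."""
    yara = config.get("yara", {})
    files = []
    for group in yara.get("file_paths", {}).get(category, []):
        for sig_file in yara.get("signatures", {}).get(group, []):
            if sig_file not in files:
                files.append(sig_file)
    return files
```

Calling signature_files_for(EXAMPLE_CONFIG, "tmp") returns all three signature files, while "system_binaries" returns only foo.yar and bar.yar, matching the description above.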
Retrieving YARA Rules at Runtime
The default behavior of the yara table is to use YARA rules specified in a file on the osquery host. However, it
might be more convenient to manage your YARA rules in one location and have the yara table fetch those rules
at runtime, rather than having to update (and version-manage) a YARA rules file on every individual osquery host. Your
organization may also treat YARA rules as security-sensitive data, and you may not wish to store that data on the
filesystem of every osquery host.
To allow osquery to fetch YARA rules at runtime, add a signature_urls section to the yara section of your
configuration file. This is an array that can mix full URLs, each pointing to a single YARA rule file, with partial
URLs whose path part is a regex that can match multiple URLs and rules. These entries define which URLs a query is
later allowed to request through the sigurl constraint.
Since the path part of a URL string (the part after the domain) is always parsed as a regex, regex special characters
such as . must be escaped when they are meant literally, for example when specifying a full URL.
Below is a configuration example:
"yara": {
"signature_urls": [
"https://raw.githubusercontent.com/Yara-Rules/rules/master/cve_rules/CVE-2010-0805\\.yar",
"https://raw.githubusercontent.com/Yara-Rules/rules/master/crypto/crypto_signatures\\.yar",
"https://raw.githubusercontent.com/Yara-Rules/rules/master/malware/APT_APT3102\\.yar",
"https://raw.githubusercontent.com/Yara-Rules/rules/devel/CVE_Rules/CVE-.*"
]
}
And a couple of example queries:
# This is valid
SELECT * FROM yara WHERE path="/usr/bin/ls" AND sigurl='https://raw.githubusercontent.com/Yara-Rules/rules/master/cve_rules/CVE-2010-0805.yar';
# This too
SELECT * FROM yara WHERE path="/usr/bin/ls" AND sigurl='https://raw.githubusercontent.com/Yara-Rules/rules/devel/CVE_Rules/CVE-2010-0805.yar';
# This is not allowed
SELECT * FROM yara WHERE path="/usr/bin/ls" AND sigurl='https://raw.githubusercontent.com/Yara-Rules/rules/devel/malware/APT_APT3102.yar';
YARA signature url https://raw.githubusercontent.com/Yara-Rules/rules/devel/malware/APT_APT3102.yar not allowed
Failed to get YARA rule url: https://raw.githubusercontent.com/Yara-Rules/rules/devel/malware/APT_APT3102.yar
Query must specify sig_group, sigfile, or sigrule for scan
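The allow-list behavior shown by these queries can be approximated with the Python sketch below. It rests on assumptions that the text does not confirm: that the scheme and host are compared literally and that the path regex must match the whole path. The helper names split_url and is_allowed are hypothetical, and osquery's actual matching code may differ in detail.

```python
import re

# signature_urls entries as they look after JSON decoding ("\\." becomes "\.").
ALLOWED = [
    r"https://raw.githubusercontent.com/Yara-Rules/rules/master/cve_rules/CVE-2010-0805\.yar",
    r"https://raw.githubusercontent.com/Yara-Rules/rules/master/crypto/crypto_signatures\.yar",
    r"https://raw.githubusercontent.com/Yara-Rules/rules/master/malware/APT_APT3102\.yar",
    r"https://raw.githubusercontent.com/Yara-Rules/rules/devel/CVE_Rules/CVE-.*",
]


def split_url(url):
    """Split 'scheme://host/path' into ('scheme://host', '/path')."""
    scheme_end = url.index("://") + 3
    path_start = url.find("/", scheme_end)
    if path_start == -1:
        return url, ""
    return url[:path_start], url[path_start:]


def is_allowed(sigurl, allowed_entries):
    """Return True if sigurl matches any entry: literal origin, regex path."""
    origin, path = split_url(sigurl)
    for entry in allowed_entries:
        entry_origin, path_pattern = split_url(entry)
        if origin == entry_origin and re.fullmatch(path_pattern, path):
            return True
    return False
```

Under these assumptions, the first two example queries pass the check and the third does not. The sketch also shows why escaping matters: an unescaped . in an entry would accept any character in that position, so CVE-2010-0805.yar written without the backslash would also admit a path ending in CVE-2010-0805Xyar.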
YARA rule strings are omitted from output by default, to prevent disclosure in osquery's results and logs. To include
the YARA rules in the sigrule column, set the enable_yara_string flag to true.
Notes
- Remote YARA rules are retrieved only once and then cached; the cached copy is used until it is stale, as specified
  by the HTTP Last-Modified header in the server's response.
- The osquery agent always validates the HTTPS server certificate of the server providing the YARA signatures, but
  currently has no support for client authentication. YARA rule files must be accessible without authentication.
Continuous monitoring using the yara_events table
With the configuration above in place, you can see the yara_events table in action. While osquery is running, we
execute touch /Users/wxs/tmp/foo in another terminal. Here are the relevant queries to show what was detected:
osquery> SELECT * FROM file_events;
+--------------------+----------+------------+---------+----------------+----------------------------------+------------------------------------------+------------------------------------------------------------------+
| target_path | category | time | action | transaction_id | md5 | sha1 | sha256 |
+--------------------+----------+------------+---------+----------------+----------------------------------+------------------------------------------+------------------------------------------------------------------+
| /Users/wxs/tmp/foo | tmp | 1430078285 | CREATED | 33859499 | d41d8cd98f00b204e9800998ecf8427e | da39a3ee5e6b4b0d3255bfef95601890afd80709 | e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 |
+--------------------+----------+------------+---------+----------------+----------------------------------+------------------------------------------+------------------------------------------------------------------+
osquery> SELECT * FROM yara_events;
+--------------------+----------+------------+---------+----------------+-------------+-------+
| target_path | category | time | action | transaction_id | matches | count |
+--------------------+----------+------------+---------+----------------+-------------+-------+
| /Users/wxs/tmp/foo | tmp | 1430078285 | CREATED | 33859499 | always_true | 1 |
+--------------------+----------+------------+---------+----------------+-------------+-------+
osquery>
The file_events table recorded that a file named /Users/wxs/tmp/foo was created, with the corresponding hashes and
a timestamp. The yara_events table recorded that 1 matching rule (always_true) was found when the file was created.
In this example every file will always have at least one match because we are using a rule which always evaluates to
true. In the next example we issue the same command to create a file in a monitored directory, but after removing the
always_true rule from our signature files.
osquery> SELECT * FROM yara_events;
+--------------------+----------+------------+---------+----------------+-------------+-------+
| target_path | category | time | action | transaction_id | matches | count |
+--------------------+----------+------------+---------+----------------+-------------+-------+
| /Users/wxs/tmp/foo | tmp | 1430078285 | CREATED | 33859499 | always_true | 1 |
| /Users/wxs/tmp/foo | tmp | 1430078524 | CREATED | 33860795 | | 0 |
+--------------------+----------+------------+---------+----------------+-------------+-------+
As you can see, even though no matches were found, a row is still created and stored.
On-demand YARA scanning
The yara table is used for on-demand scanning. With this table you can scan any available file on the filesystem
with any available signature file or signature group from the configuration. In order to scan, the query must supply
constraints that say where to scan and what to scan with.
To determine where to scan, the path constraint must be either a full path to a single file, or a path LIKE with a
wildcard pattern. There is no expansion or recursion with this constraint. Note that you must use LIKE if you want
to use a wildcard pattern.
Once the "where" is out of the way, you must specify the "what" part. This is done through either the sigfile or
sig_group constraint. The sigfile constraint must be an absolute path to a signature file on the filesystem, not a
relative path. The signature file will be compiled only for the execution of this one query and removed afterwards.
The sig_group constraint must name a signature group from your configuration file.
Here are some examples of the yara table in action:
osquery> SELECT * FROM yara WHERE path="/bin/ls" AND sig_group="sig_group_1";
+---------+-------------+-------+-------------+---------+---------+---------+
| path | matches | count | sig_group | sigfile | strings | tags |
+---------+-------------+-------+-------------+---------+---------+---------+
| /bin/ls | always_true | 1 | sig_group_1 | | | |
+---------+-------------+-------+-------------+---------+---------+---------+
osquery> SELECT * FROM yara WHERE path="/bin/ls" AND sig_group="sig_group_2";
+---------+---------+-------+-------------+---------+---------+---------+
| path | matches | count | sig_group | sigfile | strings | tags |
+---------+---------+-------+-------------+---------+---------+---------+
| /bin/ls | | 0 | sig_group_2 | | | |
+---------+---------+-------+-------------+---------+---------+---------+
As you can see in these examples, we scan the same file with two different signature groups and get different results.
osquery> SELECT * FROM yara WHERE path LIKE "/bin/%sh" AND sig_group="sig_group_1";
+-----------+-------------+-------+-------------+---------+----------+----------+
| path | matches | count | sig_group | sigfile | strings | tags |
+-----------+-------------+-------+-------------+---------+----------+----------+
| /bin/bash | always_true | 1 | sig_group_1 | | | |
| /bin/csh | always_true | 1 | sig_group_1 | | | |
| /bin/ksh | always_true | 1 | sig_group_1 | | | |
| /bin/sh | always_true | 1 | sig_group_1 | | | |
| /bin/tcsh | always_true | 1 | sig_group_1 | | | |
| /bin/zsh | always_true | 1 | sig_group_1 | | | |
+-----------+-------------+-------+-------------+---------+----------+----------+
The above illustrates using the path LIKE constraint to scan /bin/%sh with a signature group.
osquery> select * from yara where path LIKE 'C:\tmp\%' and sigfile = "C:\tmp\test.yar.txt";
+------------------------------+-------------+-------+-----------+---------------------+-----------------+------+
| path | matches | count | sig_group | sigfile | strings | tags |
+------------------------------+-------------+-------+-----------+---------------------+-----------------+------+
| C:\tmp\New Text Document.txt | TextExample | 1 | | C:\tmp\test.yar.txt | $text_string:0 | |
| C:\tmp\test.yar.txt | TextExample | 1 | | C:\tmp\test.yar.txt | $text_string:35 | |
+------------------------------+-------------+-------+-----------+---------------------+-----------------+------+
The above is an example of using an absolute path for sigfile combined with path LIKE. Because the sigfile
contains the string its rule is searching for, it has also returned itself as a result.
Tip: you can specify AND count > 0 in your query to return only positive YARA results.
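For scripted scans, the constraint rules above can be captured in a small query builder. The Python sketch below is illustrative only: build_yara_query and sql_quote are hypothetical helpers, the builder accepts exactly one of sig_group or sigfile purely to keep the example simple, and it treats any path containing % as a wildcard pattern.

```python
def sql_quote(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_yara_query(path, sig_group=None, sigfile=None, only_matches=False):
    """Build an on-demand scan query for the yara table."""
    if (sig_group is None) == (sigfile is None):
        raise ValueError("pass exactly one of sig_group or sigfile")

    # A wildcard pattern only works with LIKE; a single file uses '='.
    path_op = "LIKE" if "%" in path else "="

    if sig_group is not None:
        what = "sig_group = " + sql_quote(sig_group)
    else:
        what = "sigfile = " + sql_quote(sigfile)

    query = f"SELECT * FROM yara WHERE path {path_op} {sql_quote(path)} AND {what}"
    if only_matches:
        # Drop rows for files that matched no rule.
        query += " AND count > 0"
    return query + ";"
```

The resulting string can be pasted into osqueryi or sent through whatever query mechanism your deployment uses.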
Inline YARA rules with sigrule
Above, we documented how to query the yara table using YARA signatures specified in a local file or retrieved from a
remote host. YARA rules can also be provided inline with the query, using the hidden column sigrule as a constraint.
YARA rules take the form of 'rule rulename { condition: [whatever] }' and follow the standard YARA rule syntax.
For example:
osquery> select * from yara where path = '/etc/passwd' and sigrule = 'rule always_true { condition: true }';
YARA rules don't have a line-terminating character. To enter a multi-line YARA rule, use newlines. This
even works in osqueryi:
osquery> select * from yara where path LIKE 'C:\tmp\%' and sigrule = 'rule hello_world {
...> strings:
...> $a = "Hello world"
...> condition: $a
...> }';
+------------------------------+-------------+-------+-----------+---------+---------+------+
| path | matches | count | sig_group | sigfile | strings | tags |
+------------------------------+-------------+-------+-----------+---------+---------+------+
| C:\tmp\New Text Document.txt | hello_world | 1 | | | | |
+------------------------------+-------------+-------+-----------+---------+---------+------+
Note: when entering a sigrule inline, be careful to avoid double-quoting the rule and then also double-quoting a
string variable within the rule, as the second " will terminate the rule and cause a syntax error. In the example
above, the sigrule string has been single-quoted so the enclosed string "Hello world" can be double-quoted.
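The quoting pitfall can be demonstrated with a simplified model of how a SQL literal is scanned. The Python sketch below is not osquery's or SQLite's parser; literal_seen_by_sql and quote_sigrule are hypothetical helpers that assume a literal ends at the first closing quote that is not doubled.

```python
RULE = 'rule hello_world { strings: $a = "Hello world" condition: $a }'


def literal_seen_by_sql(text, quote):
    """Return the contents of the literal that opens at text[0].

    The literal ends at the first closing quote; a doubled quote inside
    the literal stands for one literal quote character.
    """
    if not text or text[0] != quote:
        raise ValueError("text must start with the quote character")
    out = []
    i = 1
    while i < len(text):
        if text[i] == quote:
            if i + 1 < len(text) and text[i + 1] == quote:
                out.append(quote)  # escaped quote, keep scanning
                i += 2
                continue
            return "".join(out)  # closing quote found
        out.append(text[i])
        i += 1
    raise ValueError("unterminated literal")


def quote_sigrule(rule):
    """Single-quote a rule for use as a sigrule value, doubling single quotes."""
    return "'" + rule.replace("'", "''") + "'"
```

In this model, wrapping RULE in double quotes yields only the text up to the first inner ", which is not a valid rule, while quote_sigrule(RULE) round-trips the rule unchanged.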
Because allowing arbitrary YARA rules would also make it possible to retrieve arbitrary file data in the strings
column, as a protection, the strings column will default to returning empty unless you also set the hidden flag
enable_yara_string to true (its default is false).
Troubleshooting
YARA compile error
Before a YARA scan is performed, the YARA engine compiles the rule(s). An error here probably indicates an issue with
the YARA rule(s). The first thing to check is whether the same rule can be run with the YARA command-line utility:
yara64.exe myYaraRule.yar fileToScan.foo. This will give you more helpful messages about the compile error. If,
however, the rule works as intended there, then perhaps you've found a bug! Please let the osquery team know, on Slack
or by opening an issue on GitHub.
Error loading YARA rules: 8
At this time, osquery only supports loading plaintext YARA rules/signatures, which it compiles itself at runtime. If
these rules have already been compiled into their binary form (e.g. with the yarac CLI tool), osquery will
generate an error trying to load the rules.
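One quick way to tell whether a signature file has already been compiled is to look at its first bytes. The Python sketch below assumes that compiled rule files written by yarac begin with the four ASCII bytes YARA; this reflects the file header used by YARA releases we are aware of, but treat it as a heuristic rather than a guarantee. The function name looks_compiled is hypothetical.

```python
def looks_compiled(path):
    """Heuristic: return True if the file starts with the bytes b"YARA".

    Assumption: compiled rule files begin with this magic value, while
    plaintext rule files normally start with a comment, an import, or
    the keyword "rule".
    """
    with open(path, "rb") as handle:
        return handle.read(4) == b"YARA"
```

If this returns True for a file listed in your signatures configuration, replace it with the plaintext source of the rules.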