Auditd

What is Audit Daemon (auditd)?

auditd is the userspace component to the Linux Auditing System. It's responsible for writing audit records to the disk. Viewing the logs is done with the ausearch or aureport utilities. Configuring the audit system or loading rules is done with the auditctl utility. During startup, the rules in /etc/audit/audit.rules are read by auditctl and loaded into the kernel. Alternately, there is also an augenrules program that reads rules located in /etc/audit/rules.d/ and compiles them into an audit.rules file. The audit daemon itself has some configuration options that the admin may wish to customize. They are found in the auditd.conf file.

EC2 instances running the audit daemon can stop automatically if auditd is unable to write its log files.

Why would the audit daemon stop my instance if it cannot write logs?

This is mainly a security response. If the system cannot log activity and a compromise happens, there would be no way to account for the actions of nefarious actors. Simply put, if auditd can't log anything to disk, no one should be on the system.

To facilitate this, there are configurable parameters. /etc/audit/auditd.conf has a few options that control what the system does in these situations, some of which can cause a shutdown. The parameters are:

space_left

space_left_action

admin_space_left

admin_space_left_action

disk_full_action

disk_error_action

Below are the definitions for each of the above items according to man 5 auditd.conf:

space_left

   This is a numeric value in megabytes that tells the audit daemon when to perform a configurable action because the system is starting to run low on disk space.

space_left_action

   This parameter tells the system what action to take when the system has detected that it is starting to get low on disk space. Valid values are ignore, syslog, rotate, email, exec, suspend, single, and halt. If set to ignore, the audit daemon does nothing. syslog means that it will issue a warning to syslog. rotate will rotate logs, losing the oldest to free up space. email means that it will send a warning to the email account specified in action_mail_acct as well as sending the message to syslog. exec /path-to-script will execute the script. You cannot pass parameters to the script. The script is also responsible for telling the auditd daemon to resume logging once it has completed its action. This can be done by adding service auditd resume to the script. suspend will cause the audit daemon to stop writing records to the disk. The daemon will still be alive. The single option will cause the audit daemon to put the computer system in single user mode. The halt option will cause the audit daemon to shut down the computer system.

admin_space_left

   This is a numeric value in megabytes that tells the audit daemon when to perform a configurable action because the system is running low on disk space. This should be considered the last chance to do something before running out of disk space. The numeric value for this parameter should be lower than the number for space_left.

admin_space_left_action

   This parameter tells the system what action to take when the system has detected that it is low on disk space. Valid values are ignore, syslog, rotate, email, exec, suspend, single, and halt. If set to ignore, the audit daemon does nothing. syslog means that it will issue a warning to syslog. rotate will rotate logs, losing the oldest to free up space. email means that it will send a warning to the email account specified in action_mail_acct as well as sending the message to syslog. exec /path-to-script will execute the script. You cannot pass parameters to the script. The script is also responsible for telling the auditd daemon to resume logging once it has completed its action. This can be done by adding service auditd resume to the script. suspend will cause the audit daemon to stop writing records to the disk. The daemon will still be alive. The single option will cause the audit daemon to put the computer system in single user mode. The halt option will cause the audit daemon to shut down the computer system.

disk_full_action

   This parameter tells the system what action to take when the system has detected that the partition to which log files are written has become full. Valid values are ignore, syslog, rotate, exec, suspend, single, and halt. If set to ignore, the audit daemon will issue a syslog message but no other action is taken. syslog means that it will issue a warning to syslog. rotate will rotate logs, losing the oldest to free up space. exec /path-to-script will execute the script. You cannot pass parameters to the script. The script is also responsible for telling the auditd daemon to resume logging once it has completed its action. This can be done by adding service auditd resume to the script. suspend will cause the audit daemon to stop writing records to the disk. The daemon will still be alive. The single option will cause the audit daemon to put the computer system in single user mode. The halt option will cause the audit daemon to shut down the computer system.

disk_error_action

   This parameter tells the system what action to take whenever there is an error detected when writing audit events to disk or rotating logs. Valid values are ignore, syslog, exec, suspend, single, and halt. If set to ignore, the audit daemon will not take any action. syslog means that it will issue no more than 5 consecutive warnings to syslog. exec /path-to-script will execute the script. You cannot pass parameters to the script. suspend will cause the audit daemon to stop writing records to the disk. The daemon will still be alive. The single option will cause the audit daemon to put the computer system in single user mode. The halt option will cause the audit daemon to shut down the computer system.

   By default, on Amazon Linux, if the disk has an error or is full, the configured action is SUSPEND. Below are unmodified parameters from an Amazon Linux AMI (ALAMI) 2017.09 instance:

      disk_error_action = SUSPEND
      disk_full_action = SUSPEND
      admin_space_left_action = SUSPEND
      admin_space_left = 50
      space_left_action = SYSLOG
      space_left = 75

As you can see, space_left_action is configured to warn via syslog. You can use "email" as the value instead; however, that value depends on "action_mail_acct", which is detailed below:

action_mail_acct

   This option should contain a valid email address or alias. The default address is root. If the email address is not local to the machine, you must make sure you have email properly configured on your machine and network. Also, this option requires that /usr/lib/sendmail exists on the machine.
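To see which of these actions are in effect on a given instance, the parameters can be read straight from the configuration file. The following is a minimal sketch, not part of the audit package: the function names audit_conf_get and audit_disk_settings are hypothetical, and the default path assumes the standard /etc/audit/auditd.conf location.

```shell
# Print the value of one auditd.conf parameter.
# Usage: audit_conf_get KEY [FILE]   (FILE defaults to /etc/audit/auditd.conf)
audit_conf_get() {
    key=$1
    file=${2:-/etc/audit/auditd.conf}
    # Compare the whole key, so that space_left does not also match
    # space_left_action or admin_space_left. The last occurrence wins.
    awk -F= -v key="$key" '
        /^[ \t]*#/ { next }                     # skip comment lines
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)      # trim the value
            if (k == key) found = v
        }
        END { if (found != "") print found }
    ' "$file"
}

# Print the six disk-related parameters discussed above.
# Usage: audit_disk_settings [FILE]
audit_disk_settings() {
    for key in space_left space_left_action admin_space_left \
               admin_space_left_action disk_full_action disk_error_action; do
        printf '%s = %s\n' "$key" "$(audit_conf_get "$key" "$1")"
    done
}
```

Running audit_disk_settings with no argument on an instance would list the six values from the live configuration; passing a path lets you inspect a copy of the file instead.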

   Additional Information regarding disk actions:

Although the man page says that setting disk_full_action and disk_error_action to SUSPEND only stops auditd from writing to disk and keeps the daemon alive, all indicators suggest that the suspend action does more than that and also powers the machine down. S5 is the ACPI soft-off state, as seen in this message:

[ 16.872478] ACPI: Preparing to enter system sleep state S5

   Further messages may also be visible in /var/log/messages regarding the action auditd has taken:

grep auditd /var/log/messages | grep -i "space"

While you can change this behavior in /etc/audit/auditd.conf and ignore the disk full or disk error conditions, it is not best practice. Best practice is to set up log rotation and log file size limits to help manage the space in /var/log/audit. On RHEL machines that use LVM, /var/log/audit is usually only given 5GB. If you are using SELinux, this can fill up rather quickly due to the constant AVC denial messages if SELinux is not properly configured/used.
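To judge how close an instance is to triggering one of these actions, the free space on the audit log partition can be compared with the two thresholds. This is a rough sketch under stated assumptions: the function names are hypothetical, and the strict "less than" comparison is an assumption of this example, not something taken from the auditd source.

```shell
# Print the free space, in megabytes, on the filesystem that holds DIR.
# Usage: audit_free_mb [DIR]   (DIR defaults to /var/log/audit)
audit_free_mb() {
    # -P keeps each filesystem on one line; -k reports 1024-byte blocks.
    df -Pk "${1:-/var/log/audit}" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Report which threshold, if any, FREE_MB has dropped below.
# Usage: audit_space_state FREE_MB SPACE_LEFT ADMIN_SPACE_LEFT
audit_space_state() {
    free=$1
    space_left=$2
    admin_space_left=$3
    if [ "$free" -lt "$admin_space_left" ]; then
        echo admin_space_left    # admin_space_left_action would apply
    elif [ "$free" -lt "$space_left" ]; then
        echo space_left          # space_left_action would apply
    else
        echo ok
    fi
}
```

With the Amazon Linux defaults shown earlier, audit_space_state "$(audit_free_mb)" 75 50 prints ok while more than 75 MB remain free.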

Possible Resolutions:

There are a few things you can do:

Rotate logs and limit log size

auditd can rotate its own logs, but not compress them. Red Hat offers the following information regarding the rotation and compression of such log files. The same may be applied to CentOS and ALAMI.

   By default, auditd in all versions of Red Hat Enterprise Linux rotates its own log files automatically when they reach a certain size, as determined by the max_log_file setting in auditd.conf (which defaults to 6 megabytes).

   Replacing auto-rotation based on size with auto-rotation based on time

   1. Disable rotation in /etc/audit/auditd.conf so that: max_log_file_action = ignore

   2. Tell auditd to reconfigure itself (applying your changes) by doing one of the following:

         kill -HUP $(pidof auditd)     (any version)
         systemctl reload auditd       (RHEL 7)
         service auditd reload         (RHEL 6 and earlier)

   3. To trigger a rotation manually, auditd needs to receive a USR1 signal. A simple solution for daily rotation is to copy auditd.cron to cron.daily:

            ~]# cp /usr/share/doc/audit-*/auditd.cron /etc/cron.daily

            ~]# chmod +x /etc/cron.daily/auditd.cron

            ~]# cat /etc/cron.daily/auditd.cron

       #!/bin/sh
       ##########
       # This script can be installed to get a daily log rotation
       # based on a cron job.
       ##########
       /sbin/service auditd rotate
       EXITVALUE=$?
       if [ $EXITVALUE != 0 ]; then
           /usr/bin/logger -t auditd "ALERT exited abnormally with [$EXITVALUE]"
       fi
       exit 0
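The configuration change in step 1 can also be scripted. The following is a minimal sketch: audit_conf_set is a hypothetical helper, not something shipped with the audit package, and it assumes the file uses the plain "key = value" layout shown earlier.

```shell
# Set KEY to VALUE in an auditd.conf-style FILE: replace an existing
# "key = value" line, or append one if the key is absent.
# Usage: audit_conf_set KEY VALUE FILE
audit_conf_set() {
    key=$1
    value=$2
    file=$3
    tmp=$(mktemp) || return 1
    awk -F= -v key="$key" -v value="$value" '
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)      # trim the key
            if (k == key) { print key " = " value; seen = 1 }
            else print
        }
        END { if (!seen) print key " = " value }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}

# Hypothetical usage, as root, mirroring steps 1 to 3 above:
#   audit_conf_set max_log_file_action ignore /etc/audit/auditd.conf
#   kill -HUP "$(pidof auditd)"     # re-read the configuration
#   kill -USR1 "$(pidof auditd)"    # rotate the logs once
```

Writing through a temporary file and copying it back with cat keeps the original file's ownership and permissions, which matters for a root-only file such as auditd.conf.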

   Implementing log compression
   auditd does not support log compression; however, it's trivial to update the above script to rename old audit.log.n files and compress them. A working example is provided for demonstration purposes.

   1. Follow the steps above to disable auto-rotation based on size
   2. Replace the previously-created script with the following code:

       #!/bin/bash
       export PATH=/sbin:/bin:/usr/sbin:/usr/bin
       FORMAT="%F_%T"  # Customize timestamp format as desired, per `man date`
                       # %F_%T will lead to files like: audit.log.2015-02-26_15:43:46
       COMPRESS=gzip   # Change to bzip2 or xz as desired
       KEEP=5          # Number of compressed log files to keep

       rename_and_compress_old_logs() {
           for file in $(find /var/log/audit/ -name 'audit.log.[0-9]'); do
               timestamp=$(ls -l --time-style="+${FORMAT}" "${file}" | awk '{print $6}')
               newfile=${file%.[0-9]}.${timestamp}
               # Optional: remove "-v" verbose flag from next 2 lines to hide output
               mv -v "${file}" "${newfile}"
               ${COMPRESS} -v "${newfile}"
           done
       }

       delete_old_compressed_logs() {
           # Optional: remove "-v" verbose flag to hide output
           # xargs -r skips rm entirely when there is nothing to delete
           find /var/log/audit/ -regextype posix-extended -regex '.*audit\.log\..*(xz|gz|bz2)$' | sort -n | head -n -${KEEP} | xargs -r rm -v
       }

       # Compress logs left over from earlier rotations, rotate, then
       # compress the newly rotated log before pruning old archives.
       rename_and_compress_old_logs
       service auditd rotate
       rename_and_compress_old_logs
       delete_old_compressed_logs

   3. Modify the declarations of FORMAT, COMPRESS, and KEEP as desired
   4. Ensure the script is marked executable and set it to be called by cron at desired times (either via a normal cron job or by putting it in cron.daily as demonstrated above)
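Before choosing a value for KEEP, it can help to see which archives the retention rule would remove. The sketch below isolates that rule (sort the names, then drop everything except the newest KEEP) so it can be tried on a list of names without touching any files. The function name audit_expired_archives is hypothetical, and the sketch assumes GNU head, as the script above does.

```shell
# Read archive names on stdin, one per line, and print those that would be
# deleted if only the newest KEEP archives were retained.
# Usage: ls /var/log/audit/audit.log.*.gz | audit_expired_archives KEEP
audit_expired_archives() {
    keep=$1
    # With the %F_%T timestamp format, sorting by name is also sorting by
    # time, so the last KEEP lines are the newest archives.
    # "head -n -N" (GNU) prints everything except the last N lines.
    sort | head -n -"$keep"
}
```

For example, feeding it three archive names with KEEP set to 2 prints only the oldest of the three; feeding it fewer names than KEEP prints nothing.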

Set up CloudWatch Logs

You can use Amazon CloudWatch Logs to monitor, store, and access your log files from Amazon Elastic Compute Cloud (Amazon EC2) instances, AWS CloudTrail, Route 53, and other sources. You can then retrieve the associated log data from CloudWatch Logs.

Expand the EBS volume size

If your current-generation Amazon EBS volume is attached to a current-generation EC2 instance type, you can increase its size, change its volume type, or (for an io1 volume) adjust its IOPS performance, all without detaching it. You can apply these changes to detached volumes as well.

Additional information:

There is a known issue in which the kernel can panic because of the audit option "f" in /etc/audit/audit.rules. This would not cause the system to stop automatically and may be a different issue.

Example:

cat /etc/audit/audit.rules

# This file is automatically generated from /etc/audit/rules.d
-D
-b 8192
-f 1

The f flag sets the action that is performed when a critical error is detected:

  0 -- Silent
  1 -- The error is handled by the kernel log subsystem (printk prints a failure message)
  2 -- Kernel panic in case of a critical error

Example conditions where this flag is consulted include transmission errors to the user-space audit daemon, backlog limit exceeded, and rate limit exceeded.
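To check quickly whether a rules file would put the kernel into panic mode, the last -f line can be extracted and translated. This is a small sketch: audit_failure_mode is a hypothetical helper name, and it only inspects the rules file on disk, not the mode currently loaded in the kernel.

```shell
# Print the failure mode selected by the last "-f N" line in a rules FILE.
# Usage: audit_failure_mode [FILE]   (FILE defaults to /etc/audit/audit.rules)
audit_failure_mode() {
    file=${1:-/etc/audit/audit.rules}
    # Later rules override earlier ones, so keep only the last -f value.
    flag=$(awk '$1 == "-f" { value = $2 } END { print value }' "$file")
    case $flag in
        0) echo silent ;;
        1) echo printk ;;
        2) echo panic ;;
        *) echo unknown ;;   # no -f line, or an unexpected value
    esac
}
```

For the example file above, the function prints printk; a result of panic means a critical audit error would bring the kernel down.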

Notes: In one customer case, the issue was fixed by removing auditd.service from /usr/lib/systemd/system, so that the service can no longer start at boot.

To be able to start the service again, the .service file has to be back in that directory and you have to run "systemctl daemon-reload". The same behavior is shown here with firewalld:

$ sudo systemctl start firewalld.service 
Failed to start firewalld.service: Unit not found.

$ sudo systemctl daemon-reload
$ sudo systemctl start firewalld.service

References: https://superuser.com/questions/513159/how-to-remove-systemd-services