Install and Enable ENA drivers for Nitro

From Dikapedia

How do I install and enable the latest ENA driver for Enhanced Network Support on an Amazon EC2 instance running Red Hat 6/7?
https://aws.amazon.com/premiumsupport/knowledge-center/install-ena-driver-rhel-ec2/

How to launch RHEL 6 on a Nitro instance


Yes, it is possible to run RHEL 6 on Nitro instances (M5, C5, T3). I tested this with AMI ami-0351faf7328fdb373 (RHEL 6.10 - HVM - Red Hat Provided, ENA: no).


1) First, confirm that the NVMe driver is present and check its version with the following command. If the instance has the NVMe driver, the command returns information about the driver.

$ modinfo nvme
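
As a minimal sketch, the same check can be scripted for both drivers that a Nitro instance needs. The helper name have_module is illustrative only and relies on nothing but the exit status of modinfo. Before step 2 is complete, ena is expected to report as missing.

```shell
# have_module NAME: succeed if modinfo can find the module for the running kernel.
# (have_module is a hypothetical helper name used for illustration.)
have_module() {
    modinfo "$1" >/dev/null 2>&1
}

# Report the state of both drivers.
for mod in nvme ena; do
    if have_module "$mod"; then
        echo "$mod: present"
    else
        echo "$mod: missing"
    fi
done
```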


2) Install and Enable ENA: https://aws.amazon.com/premiumsupport/knowledge-center/install-ena-driver-rhel-ec2/

A. Update the kernel and reboot the system so that the latest kernel takes effect:

sudo yum upgrade kernel -y && sudo reboot

B. Install the development packages that match the running kernel, then download and build the ENA driver:

sudo yum install kernel-devel-$(uname -r) gcc git patch rpm-build wget unzip -y
cd /usr/src/
sudo wget https://github.com/amzn/amzn-drivers/archive/master.zip
sudo unzip master.zip
cd amzn-drivers-master/kernel/linux/ena
sudo make

C. Copy the module to the modules directory:

sudo cp ena.ko /lib/modules/$(uname -r)/

D. Regenerate the kernel module dependency map files:

sudo depmod

E. Use the modinfo command to confirm that the ENA module is present:

modinfo ena

The modinfo command output shows the ENA driver information.

Note: Because the build uses the master branch, the ENA driver version that you compile might be newer than the 2.2.11g shown in this example.

filename:       /lib/modules/2.6.32-754.33.1.el6.x86_64/ena.ko
version:        2.2.11g
license:        GPL
description:    Elastic Network Adapter (ENA)
author:         Amazon.com, Inc. or its affiliates
retpoline:      Y
srcversion:     17C7CD1CEAD3F0ADB3A5E5E
alias:          pci:v00001D0Fd0000EC21sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd0000EC20sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00001EC2sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00000EC2sv*sd*bc*sc*i*
alias:          pci:v00001D0Fd00000051sv*sd*bc*sc*i*
depends:        
vermagic:       2.6.32-754.33.1.el6.x86_64 SMP mod_unload modversions 
parm:           debug:Debug level (0=none,...,16=all) (int)
parm:           rx_queue_size:Rx queue size. The size should be a power of 2. Max value is 8K
(int)
parm:           force_large_llq_header:Increases maximum supported header size in LLQ mode to 224  bytes, while reducing the maximum TX queue size by half.
(int)
parm:           num_io_queues:Sets number of RX/TX queues to allocate to device. The maximum value depends on the device and number of online CPUs.
(int)
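
The module in step C is copied into the directory of one specific kernel, so it can be worth confirming that the vermagic line above matches the running kernel. The following is a rough sketch of that comparison; vermagic_matches is a hypothetical helper name, and the only tool behavior assumed is that modinfo -F prints a single field.

```shell
# vermagic_matches VERMAGIC KERNEL: succeed if the first word of the module's
# vermagic string equals the given kernel release.
# (vermagic_matches is a hypothetical helper name used for illustration.)
vermagic_matches() {
    [ "${1%% *}" = "$2" ]
}

# Usage on the instance:
#   vermagic_matches "$(modinfo -F vermagic ena)" "$(uname -r)" \
#       && echo "ena was built for the running kernel"
```

If the check fails after a later kernel upgrade, repeating steps B through D against the new kernel would be the likely fix.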

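F. Append net.ifnames=0 to the kernel lines in /boot/grub/grub.conf to disable predictable network interface naming (the pattern allows for the leading whitespace that kernel lines usually have in grub.conf):

sudo sed -i '/^[[:space:]]*kernel/s/$/ net.ifnames=0/' /boot/grub/grub.conf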
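
Running the sed command twice appends the option twice. A sketch of an idempotent variant follows, assuming kernel lines may be indented; add_ifnames_legacy is a hypothetical helper name, and it takes the file path as an argument so it can be tried on a copy of grub.conf first.

```shell
# add_ifnames_legacy FILE: append net.ifnames=0 to every kernel line in a
# GRUB legacy config, skipping lines that already carry the option.
# (add_ifnames_legacy is a hypothetical helper name used for illustration.)
add_ifnames_legacy() {
    tmp=$(mktemp) || return 1
    # Write the edited copy to a temp file, then overwrite the original in
    # place so its ownership and permissions are preserved.
    sed '/^[[:space:]]*kernel/{/net\.ifnames=0/!s/$/ net.ifnames=0/;}' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root): add_ifnames_legacy /boot/grub/grub.conf
```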

G. Stop the instance.

H. Enable enhanced network support at the instance level. The following example modifies the instance's attribute from the AWS Command Line Interface (AWS CLI).

aws ec2 modify-instance-attribute --instance-id i-xxxxxxxxxxxxxxxxx --ena-support --region xx-xxxxx-x

I. Change the instance type to one of the ENA supported instance types.
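
Steps G through I, plus the start in step J, can be chained from a machine that has the AWS CLI configured. This is only a sketch: enable_ena_and_resize is a hypothetical helper name, the AWS variable is an assumption added so the calls can be printed instead of executed, and the instance ID, Region, and instance type are placeholders supplied by the caller.

```shell
# AWS can be overridden (for example AWS=echo) to print the calls instead of
# running them. This variable is an illustration aid, not an AWS CLI feature.
AWS=${AWS:-aws}

# enable_ena_and_resize INSTANCE_ID REGION INSTANCE_TYPE
# Stops the instance, waits until it is stopped, sets the enaSupport
# attribute, changes the instance type, and starts the instance again.
enable_ena_and_resize() {
    $AWS ec2 stop-instances --instance-ids "$1" --region "$2" &&
    $AWS ec2 wait instance-stopped --instance-ids "$1" --region "$2" &&
    $AWS ec2 modify-instance-attribute --instance-id "$1" --ena-support --region "$2" &&
    $AWS ec2 modify-instance-attribute --instance-id "$1" --instance-type "Value=$3" --region "$2" &&
    $AWS ec2 start-instances --instance-ids "$1" --region "$2"
}

# Dry run that only prints the commands:
#   AWS=echo; enable_ena_and_resize i-xxxxxxxxxxxxxxxxx xx-xxxxx-x m5.large
```

Each call is joined with && so that a failure, such as the instance not reaching the stopped state, prevents the later steps from running.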

J. Start the instance, connect to the instance using SSH, and then run the ethtool command:

ethtool -i eth0
driver: ena
version: 2.4.1g
firmware-version: 
bus-info: 0000:00:05.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no


STEPS TO INSTALL AND ENABLE ENA ON CENTOS7


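1) Prerequisite: If the instance reaches the internet through a proxy, set the proxy environment variables first (the proxy host below is the one used in this environment):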

       # export http_proxy=http://EG5017APROXY.egain.cloud:800/
       # export https_proxy=https://EG5017APROXY.egain.cloud:800/
   Confirm the environment variables:
       # echo $http_proxy
       # echo $https_proxy


2) Install required packages and update:

       # sudo yum --enablerepo=extras install epel-release
       # sudo yum -y install patch dkms kernel-devel perl
       # sudo yum update
       # sudo reboot


3) After the reboot, confirm the running kernel. If the system did not boot into the latest kernel (3.10.0-1160.31.1.el7.x86_64 in this example), set that kernel as the default with the steps that follow.

   Confirm current running kernel:
       # uname -r
   
   PLEASE NOTE: If the running kernel version is already 3.10.0-1160.31.1.el7.x86_64, skip the rest of STEP 3 and proceed to STEP 4. (During our attempt the instance was already running 3.10.0-1160.31.1.el7.x86_64, so we did not need the rest of STEP 3.)
   In the list of kernels, identify kernel version 3.10.0-1160.31.1.el7.x86_64 and make a note of its index:
       # sudo awk -F\' '$1=="menuentry " {print i++ " : " $2}' /boot/grub2/grub.cfg
       0 : CentOS Linux (3.10.0-1160.31.1.el7.x86_64) 7 (Core) <------ We want to use this one (index = 0)
       1 : CentOS Linux (3.10.0-327.36.3.el7.x86_64) 7 (Core)
       2 : CentOS Linux (3.10.0-327.28.3.el7.x86_64) 7 (Core)
       3 : CentOS Linux (3.10.0-327.28.2.el7.x86_64) 7 (Core)
       4 : CentOS Linux (3.10.0-327.10.1.el7.x86_64) 7 (Core)
       5 : CentOS Linux (0-rescue-f32e0af35637b5dfcbedcb0a1de8dca1) 7 (Core)
   Set the default kernel to kernel version 3.10.0-1160.31.1.el7.x86_64:
       # grubby --set-default /boot/vmlinuz-3.10.0-1160.31.1.el7.x86_64
       # grubby --default-index
       0
       # sudo reboot
   Confirm the system is now running the latest kernel:
       # uname -r 
       3.10.0-1160.31.1.el7.x86_64
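
Rather than reading the index off the list by eye, it can be looked up by kernel version. The sketch below reuses the same awk field split as the listing command in this step; menuentry_index is a hypothetical helper name, and it assumes menu entry titles contain the version in parentheses as in the output above.

```shell
# menuentry_index CFG VERSION: print the zero-based index of the first GRUB 2
# menu entry whose title contains "(VERSION)"; return non-zero if none does.
# (menuentry_index is a hypothetical helper name used for illustration.)
menuentry_index() {
    awk -F"'" -v want="($2)" '
        BEGIN { i = 0; found = 0 }
        $1 == "menuentry " {
            if (index($2, want)) { print i; found = 1; exit }
            i++
        }
        END { exit !found }
    ' "$1"
}

# Usage (as root):
#   menuentry_index /boot/grub2/grub.cfg 3.10.0-1160.31.1.el7.x86_64
```

The printed value should agree with what grubby --default-index reports after the default has been set.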


4) Download the ENA driver source, then build and install it with DKMS:

       # sudo su
       # cd /tmp
       # curl -o ena_linux_2.2.9.tar.gz https://codeload.github.com/amzn/amzn-drivers/tar.gz/ena_linux_2.2.9
       # tar zxvf ena_linux_2.2.9.tar.gz
       # mv amzn-drivers-ena_linux_2.2.9 /usr/src/ena-2.2.9
   Create the following file and add the following lines:
       # vi /usr/src/ena-2.2.9/dkms.conf
       PACKAGE_NAME="ena"
       PACKAGE_VERSION="2.2.9"
       AUTOINSTALL="yes"
       REMAKE_INITRD="yes"
       BUILT_MODULE_LOCATION[0]="kernel/linux/ena"
       BUILT_MODULE_NAME[0]="ena"
       DEST_MODULE_LOCATION[0]="/updates"
       DEST_MODULE_NAME[0]="ena"
       CLEAN="cd kernel/linux/ena; make clean"
       MAKE="cd kernel/linux/ena; make BUILD_KERNEL=${kernelver}"
       # dkms add -m ena -v 2.2.9
       # dkms build -m ena -v 2.2.9
       # dkms install -m ena -v 2.2.9
       # dracut -f --add-drivers ena
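
If the file should be created without an interactive editor, a here-document can write the same dkms.conf. This is a sketch only; write_dkms_conf is a hypothetical helper name, and it takes the driver version and target directory as arguments so the same text can be reused for another release.

```shell
# write_dkms_conf VERSION DIR: write DIR/dkms.conf for the ENA driver with
# the same contents as listed in this step.
# (write_dkms_conf is a hypothetical helper name used for illustration.)
write_dkms_conf() {
    # \${kernelver} is escaped so that DKMS, not this shell, expands it.
    cat > "$2/dkms.conf" <<EOF
PACKAGE_NAME="ena"
PACKAGE_VERSION="$1"
AUTOINSTALL="yes"
REMAKE_INITRD="yes"
BUILT_MODULE_LOCATION[0]="kernel/linux/ena"
BUILT_MODULE_NAME[0]="ena"
DEST_MODULE_LOCATION[0]="/updates"
DEST_MODULE_NAME[0]="ena"
CLEAN="cd kernel/linux/ena; make clean"
MAKE="cd kernel/linux/ena; make BUILD_KERNEL=\${kernelver}"
EOF
}

# Usage (as root): write_dkms_conf 2.2.9 /usr/src/ena-2.2.9
```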

5) Confirm that the ENA and NVMe modules are installed:

       # modinfo ena
       filename:       /lib/modules/3.10.0-1160.25.1.el7.x86_64/extra/ena.ko.xz
       version:        2.2.9g
       license:        GPL
       description:    Elastic Network Adapter (ENA)
       author:         Amazon.com, Inc. or its affiliates
       retpoline:      Y
       rhelversion:    7.9
       srcversion:     27F5567B9755BE00C8A08B5
       alias:          pci:v00001D0Fd0000EC21sv*sd*bc*sc*i*
       alias:          pci:v00001D0Fd0000EC20sv*sd*bc*sc*i*
       alias:          pci:v00001D0Fd00001EC2sv*sd*bc*sc*i*
       alias:          pci:v00001D0Fd00000EC2sv*sd*bc*sc*i*
       alias:          pci:v00001D0Fd00000051sv*sd*bc*sc*i*
       depends:        
       vermagic:       3.10.0-1160.25.1.el7.x86_64 SMP mod_unload modversions 
       parm:           debug:Debug level (0=none,...,16=all) (int)
       parm:           rx_queue_size:Rx queue size. The size should be a power of 2. Max value is 8K
       (int)
       parm:           force_large_llq_header:Increases maximum supported header size in LLQ mode to 224 bytes, while reducing the maximum TX queue size by half.
       (int)
       parm:           num_io_queues:Sets number of RX/TX queues to allocate to device. The maximum value depends on the device and number of online CPUs.
       (int)
       # modinfo nvme
       filename:       /lib/modules/3.10.0-1160.25.1.el7.x86_64/kernel/drivers/nvme/host/nvme.ko.xz
       version:        1.0
       license:        GPL
       author:         Matthew Wilcox <willy@linux.intel.com>
       retpoline:      Y
       rhelversion:    7.9
       srcversion:     E7B6047FC28A75C582AC5D0
       alias:          pci:v*d*sv*sd*bc01sc08i02*
       alias:          pci:v00001E0Fd00000007sv00001028sd*bc*sc*i*
       alias:          pci:v0000144Dd0000A824sv00001028sd*bc*sc*i*
       alias:          pci:v0000144Dd0000A822sv*sd*bc*sc*i*
       alias:          pci:v0000144Dd0000A821sv*sd*bc*sc*i*
       alias:          pci:v00001C5Fd00000540sv*sd*bc*sc*i*
       alias:          pci:v00001C58d00000023sv*sd*bc*sc*i*
       alias:          pci:v00001C58d00000003sv*sd*bc*sc*i*
       alias:          pci:v00001BB1d00000100sv*sd*bc*sc*i*


6) Systems that use systemd or udev version 197 or later can rename Ethernet devices, and they do not guarantee that a single network interface is named eth0. This behavior can cause problems connecting to your instance. Check the installed version and, if it is 197 or later, apply the change that follows:

       # rpm -qa | grep -e '^systemd-[0-9]\+\|^udev-[0-9]\+'
       systemd-219-78.el7_9.3.x86_64
   Disable predictable network interface names by adding the net.ifnames=0 option to the GRUB_CMDLINE_LINUX line in /etc/default/grub (keep any options that are already on that line), then regenerate the GRUB configuration:
       # vi /etc/default/grub
       GRUB_CMDLINE_LINUX="net.ifnames=0"
       # grub2-mkconfig -o /boot/grub2/grub.cfg
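
The edit to /etc/default/grub can also be scripted so that existing kernel options are preserved. The following is a sketch under the assumption that GRUB_CMDLINE_LINUX is a single double-quoted line; add_ifnames_grub2 is a hypothetical helper name.

```shell
# add_ifnames_grub2 FILE: append net.ifnames=0 inside the quotes of the
# GRUB_CMDLINE_LINUX line, keeping existing options. Does nothing if the
# option is already present.
# (add_ifnames_grub2 is a hypothetical helper name used for illustration.)
add_ifnames_grub2() {
    if grep -q '^GRUB_CMDLINE_LINUX=.*net\.ifnames=0' "$1"; then
        return 0
    fi
    tmp=$(mktemp) || return 1
    sed 's/^\(GRUB_CMDLINE_LINUX="[^"]*\)"/\1 net.ifnames=0"/' "$1" > "$tmp" &&
        cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (as root):
#   add_ifnames_grub2 /etc/default/grub && grub2-mkconfig -o /boot/grub2/grub.cfg
```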


7) Now stop the instance. From another instance (or any machine with the AWS CLI configured), run the following command to enable the enaSupport attribute on the stopped instance:

       $ aws ec2 modify-instance-attribute --instance-id i-287ca2bd --ena-support --region us-west-2
   And then confirm enaSupport is now set to true:
       $ aws ec2 describe-instances --instance-ids i-287ca2bd --query "Reservations[].Instances[].EnaSupport" --region us-west-2
       [
         true
       ]


8) Next, change the instance type to C5 and start the instance. During my test, the instance booted successfully as a C5 instance type and did not drop into emergency mode. It passed 2/2 status checks, and ethtool confirmed that the ENA module was loaded:

       $ ethtool -i eth0
       driver: ena
       version: 2.2.9g
       firmware-version: 
       expansion-rom-version: 
       bus-info: 0000:00:05.0
       supports-statistics: yes
       supports-test: no
       supports-eeprom-access: no
       supports-register-dump: no
       supports-priv-flags: no
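
For a scripted check, the driver line can be pulled out of the ethtool output. A minimal sketch, with nic_driver as a hypothetical helper name; it reads ethtool -i output on standard input so it can be tested without a network device.

```shell
# nic_driver: read `ethtool -i` output on stdin and print the driver name.
# (nic_driver is a hypothetical helper name used for illustration.)
nic_driver() {
    awk -F': ' '$1 == "driver" { print $2 }'
}

# Usage on the instance:
#   [ "$(ethtool -i eth0 | nic_driver)" = "ena" ] && echo "eth0 uses ENA"
```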


Notes


  • Don't rely on the M5_C5 checker script. It is useful for updating fstab, but not much else.


Dracut Errors

If you see something like this:

[  185.203047] dracut-initqueue[267]: Warning: dracut-initqueue timeout - starting timeout scripts
[  185.203357] dracut-initqueue[267]: Warning: Could not boot.

[  185.426315] dracut-initqueue[267]: Warning: /dev/mapper/rhel-root does not exist
[  185.428699] dracut-initqueue[267]: Warning: /dev/rhel/root does not exist
[  185.431023] dracut-initqueue[267]: Warning: /dev/rhel/swap does not exist
         Starting Dracut Emergency Shell...
Warning: /dev/mapper/rhel-root does not exist
Warning: /dev/rhel/root does not exist
Warning: /dev/rhel/swap does not exist

Generating "/run/initramfs/rdsosreport.txt"


Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stick or /boot after mounting them and attach it to a bug report.

Solution:

  • Run the below command:
$ lsinitrd | grep nvme
  • If nothing is returned, the initramfs does not contain the NVMe driver, so the EBS volumes could not be mounted when the instance started on Nitro.
  • Add nvme (along with ena) to add_drivers+= in /etc/dracut.conf (or in a file under /etc/dracut.conf.d/, such as ena.conf)
  • This step is also in our documents
$ echo 'add_drivers+=" ena nvme "' | sudo tee -a /etc/dracut.conf.d/ena.conf
  • Then rebuild the initramfs:
$ sudo dracut -f -v

You should be good to go
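
After rebuilding, the lsinitrd check can be repeated for both drivers before the instance is moved back to a Nitro type. A sketch follows; missing_boot_drivers is a hypothetical helper name, and it assumes the modules appear in the listing as files named nvme.ko or ena.ko, optionally with a compression suffix.

```shell
# missing_boot_drivers: read an `lsinitrd` listing on stdin and print each of
# nvme and ena whose .ko file does not appear in it.
# (missing_boot_drivers is a hypothetical helper name used for illustration.)
missing_boot_drivers() {
    listing=$(cat)
    for drv in nvme ena; do
        printf '%s\n' "$listing" | grep -q "/$drv\.ko" || echo "$drv"
    done
}

# Usage on the instance (as root):
#   lsinitrd | missing_boot_drivers
```

No output means both drivers were found in the initramfs.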


How to check if ENA support is enabled on an instance


Enhanced networking attribute NOT enabled:

$ aws ec2 describe-instances --instance-ids i-0f0ece5f4b6f0957e --query "Reservations[].Instances[].EnaSupport"
[]

Enhanced networking attribute IS enabled:

$ aws ec2 describe-instances --instance-ids i-0f0ece5f4b6f0957e --query "Reservations[].Instances[].EnaSupport"
[
    true
]
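
The two outputs above can be turned into a simple yes/no answer. This is a sketch that only interprets the JSON shapes shown here (an empty list versus a list containing true); ena_state is a hypothetical helper name.

```shell
# ena_state: read the JSON printed by the describe-instances query on stdin
# and print "enabled" if it contains the word true, otherwise "not enabled".
# (ena_state is a hypothetical helper name used for illustration.)
ena_state() {
    if grep -qw 'true'; then
        echo "enabled"
    else
        echo "not enabled"
    fi
}

# Usage:
#   aws ec2 describe-instances --instance-ids i-0f0ece5f4b6f0957e \
#       --query "Reservations[].Instances[].EnaSupport" | ena_state
```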