How to Optimize Your Cisco Day-0 with POAP and Ansible

This article shows how to save time on Day-0 provisioning of a Cisco Nexus 9000 (N9K) switch using POAP (Power On Auto Provisioning) and Ansible.

Topology Overview

This infrastructure contains:

  • one Cisco Nexus N9Kv
  • one Cisco router to split the mgmt Out-Of-Band (OOB) network used by Nexus (interface mgmt 0)
  • one Linux server running Ubuntu.

The Ubuntu server will run the following services :

  • DHCP server to assign IP addresses and return the path for the tftp server
  • TFTP server to host and deliver the poap python script
  • SCP and / or HTTP server to host and deliver the Nexus images and configuration file

We will use two different networks where the router will be the default gateway for each one:

  • 10.0.1.0/24 – GW .254, used for the OOB network
  • 10.0.2.0/24 – GW .254, used to host administration tools

The server will have the IP address 10.0.2.1/24 and we will use 10.0.1.100/24 for the Nexus 9Kv.
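The addressing plan can be sanity-checked with Python's standard ipaddress module. This is only a minimal sketch: the variable and function names are illustrative, and only the addresses come from the topology above.

```python
import ipaddress

# Addressing plan from the topology; the names are illustrative.
OOB_NET = ipaddress.ip_network("10.0.1.0/24")    # Nexus mgmt0 (OOB) network
ADMIN_NET = ipaddress.ip_network("10.0.2.0/24")  # administration tools network


def gateway(network):
    """Return the .254 address used as default gateway in this lab."""
    return network.network_address + 254


server = ipaddress.ip_interface("10.0.2.1/24")
nexus = ipaddress.ip_interface("10.0.1.100/24")

# The server and the switch sit in different subnets, so broadcasts from
# the switch cannot reach the server directly.
different_subnets = server.network != nexus.network

print("OOB gateway:", gateway(OOB_NET))
print("Admin gateway:", gateway(ADMIN_NET))
print("Different subnets:", different_subnets)
```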

The network services

DHCP server

For the DHCP server, we will use isc-dhcp-server.

You can install this package with the following command:

apt-get install isc-dhcp-server

Then you need to configure the server. In this topology the router acts as a DHCP relay: requests from the N9Kv reach the server as unicast packets relayed from the router's interface address (10.0.1.254).

Router configuration:

interface Ethernet0/1
ip address 10.0.1.254 255.255.255.0
ip helper-address 10.0.2.1
end 

For the server configuration, we need to declare at least the subnet of the server's own network: 10.0.2.0/24. If no subnet declaration matches a network the server is attached to, the server will not start.

We also need to declare a range for the N9Kv in 10.0.1.0/24. For this subnet, we provide:

  • The default gateway (option routers)
  • The tftp server (option tftp-server-name)
  • The file, which should be used on the tftp server (option bootfile-name)

In this file we also create one host entry for our N9Kv to reserve an IP address based on its serial number. In the client identifier, the serial number must be prefixed with \000.

Configuration file:

root@ubuntu:/srv/tftp/poap# cat /etc/dhcp/dhcpd.conf
ddns-update-style none;

option domain-name "lab";
default-lease-time 600;
max-lease-time 7200;

authoritative;

log-facility local7;

subnet 10.0.1.0 netmask 255.255.255.0 {
  range 10.0.1.1 10.0.1.180;
  option routers 10.0.1.254;
  option tftp-server-name "10.0.2.1";
  option bootfile-name "/poap/poap.py";
  ping-check = 1;
}

subnet 10.0.2.0 netmask 255.255.255.0 {
}

host N9K-POAP {
  option dhcp-client-identifier "\00090IFLAUVL3T";
  fixed-address 10.0.1.100;
  option tftp-server-name "10.0.2.1";
  option bootfile-name "/poap/poap.py";
}
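With more than one switch, the host entries can be generated rather than typed by hand. The following is a rough sketch, assuming the same layout as the N9K-POAP entry above; the function name and its parameters are hypothetical.

```python
def dhcp_host_entry(name, serial, address, tftp_server, bootfile):
    """Render a dhcpd.conf host block keyed on the switch serial number."""
    # The serial number carries the \000 prefix required in the
    # client identifier, exactly as in the N9K-POAP entry.
    client_id = "\\000" + serial
    return (
        "host %s {\n"
        '  option dhcp-client-identifier "%s";\n'
        "  fixed-address %s;\n"
        '  option tftp-server-name "%s";\n'
        '  option bootfile-name "%s";\n'
        "}\n"
    ) % (name, client_id, address, tftp_server, bootfile)


entry = dhcp_host_entry(
    "N9K-POAP", "90IFLAUVL3T", "10.0.1.100", "10.0.2.1", "/poap/poap.py"
)
print(entry)
```

The output can be appended to /etc/dhcp/dhcpd.conf before restarting the service.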

Finally, you can start your server :

root@ubuntu:/srv/tftp/poap# service isc-dhcp-server start
root@ubuntu:/srv/tftp/poap# service isc-dhcp-server status
* isc-dhcp-server.service - ISC DHCP IPv4 server
   Loaded: loaded (/lib/systemd/system/isc-dhcp-server.service; enabled; vendor preset: enabled)
   Active: active (running) since Sun 2020-08-23 21:12:52 EEST; 16h ago
     Docs: man:dhcpd(8)
 Main PID: 14027 (dhcpd)
    Tasks: 1
   Memory: 8.5M
      CPU: 563ms
   CGroup: /system.slice/isc-dhcp-server.service
           `-14027 dhcpd -user dhcpd -group dhcpd -f -4 -pf /run/dhcp-server/dhcpd.pid -cf /etc/dhcp/dhcpd.conf

Aug 24 13:22:29 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:22:30 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:22:39 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:22:40 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:22:42 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:22:58 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:22:59 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:23:00 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.
Aug 24 13:23:07 ubuntu systemd[1]: Started ISC DHCP IPv4 server.
Aug 24 13:23:07 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.2.1 from 00:50:00:00:03:00 via ens3: unknown lease 10.0.2.1.

TFTP server

For the TFTP server, we will use atftp. To install it enter the following command :

root@ubuntu:/srv/tftp/poap# apt install atftpd

By default the configuration file is /etc/default/atftpd. There you can set the path where your files are located (the last argument of OPTIONS).

USE_INETD=false
OPTIONS="--tftpd-timeout 300 --logfile /var/log/atftpd.log --retry-timeout 5 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --verbose=7 /srv/tftp"

In our case, we create the directory /srv/tftp and a few subdirectories:

root@ubuntu:/srv/tftp/poap# tree /srv
/srv
|-- ftp
|   `-- welcome.msg
`-- tftp
    `-- poap
        |-- conf
        |   |-- conf.90IFLAUVL3T
        |   |-- conf.90IFLAUVL3T.md5
        |-- nxos.9.3.2.bin
        |-- nxos.9.3.2.bin.md5
        |-- poap.py
        `-- poap.py.md5
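The .md5 files next to the image and the configuration can be produced with a few lines of Python. This is a sketch under one assumption: the layout (checksum first, then the file name) is modeled on md5sum output, which is consistent with the way the poap.log shown in the Troubleshooting section keeps only the text before the first space. The helper names are hypothetical.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 of a file without loading it fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_md5_companion(path):
    """Write <path>.md5 in md5sum-style layout and return its path."""
    md5_path = path + ".md5"
    with open(md5_path, "w") as handle:
        handle.write("%s  %s\n" % (md5_of_file(path), os.path.basename(path)))
    return md5_path


# Demonstration on a throwaway file instead of a real image.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "conf.SERIAL")
    with open(sample, "w") as handle:
        handle.write("hostname demo\n")
    with open(write_md5_companion(sample)) as handle:
        print(handle.read().strip())
```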

The poap.py file can be downloaded from GitHub:

https://github.com/datacenter/nexus9000/blob/master/nx-os/poap/poap.py

Several versions of the script exist. One of them is described below.

SCP server

You also need a server to deliver the Nexus images and the configuration file. Prefer a secure protocol such as SCP or HTTPS.

Here we simply use openssh-server as the SCP server and create one service account: poap.

The home directory of the poap user is /srv/tftp/, as shown by its /etc/passwd entry:

poap:x:1001:1001::/srv/tftp/:

poap.py

Now that the services are ready, we need to prepare the poap.py script.

In this script, we provide the following information:

  • the target NX-OS version
  • the path where the images are located
  • the path where the configurations are located
  • the credentials for SCP
  • the mode used to obtain the config file (in our case, serial_number)

Extract of the poap.py file:

# **** Here are all variables that parametrize this script ****
# These parameters should be updated with the real values used
# in your automation environment

# system and kickstart images, configuration: location on server (src) and target (dst)
n9k_image_version       = "9.3.2" # this must match your code version
image_dir_src           = "/srv/tftp/poap/"  # Sample - /Users/bob/poap
ftp_image_dir_src_root  = image_dir_src
tftp_image_dir_src_root = image_dir_src
n9k_system_image_src    = "nxos.%s.bin" % n9k_image_version
config_file_src         = "/srv/tftp/poap/conf/conf" # Sample - /Users/bob/poap/conf
image_dir_dst           = "bootflash:" # directory where n9k image will be stored
system_image_dst        = n9k_system_image_src
config_file_dst         = "volatile:poap.cfg"
md5sum_ext_src          = "md5"
# Required space on /bootflash (for config and system images)
required_space          = 350000

# copy protocol to download images and config
# options are: scp/http/tftp/ftp/sftp
protocol                = "scp" # protocol to use to download images/config

# Host name and user credentials
username                = "poap" # server account
ftp_username            = "anonymous" # server account
password                = "cisco1234" # password
hostname                = "10.0.2.1" # ip address of ftp/scp/http/sftp server

# vrf info
vrf = "management"
if os.environ.has_key('POAP_VRF'):
    vrf=os.environ['POAP_VRF']

# Timeout info (from biggest to smallest image, should be f(image-size, protocol))
system_timeout          = 2100
config_timeout          = 120
md5sum_timeout          = 120

# POAP can use 3 modes to obtain the config file.
# - 'static' - filename is static
# - 'serial_number' - switch serial number is part of the filename
# - 'location' - CDP neighbor of interface on which DHCPDISCOVER arrived
#                is part of filename
# if serial-number is abc, then filename is $config_file_src.abc
# if cdp neighbor's device_id=abc and port_id=111, then filename is config_file_src.abc.111
# Note: the next line can be overwritten by command-line arg processing later
config_file_type        = "serial_number"
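The three modes can be summarized with a short sketch. It mirrors the comments in the extract rather than the script's actual code, and the function name and its parameters are hypothetical.

```python
def select_config_filename(config_file_src, config_file_type,
                           serial_number=None, device_id=None, port_id=None):
    """Build the config filename as described in the poap.py comments."""
    if config_file_type == "static":
        return config_file_src
    if config_file_type == "serial_number":
        # if serial-number is abc, then filename is $config_file_src.abc
        return "%s.%s" % (config_file_src, serial_number)
    if config_file_type == "location":
        # CDP neighbor device_id and port_id are appended to the name
        return "%s.%s.%s" % (config_file_src, device_id, port_id)
    raise ValueError("unknown config_file_type: %r" % config_file_type)


print(select_config_filename("/srv/tftp/poap/conf/conf", "serial_number",
                             serial_number="90IFLAUVL3T"))
```

Because the dot is added automatically, config_file_src must not end with a dot itself. Otherwise the name contains a double dot and the copy fails, as in the first log of the Troubleshooting section.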

After changing anything in this file, you need to regenerate its md5 checksum with the command given inside the script:

f=poap.py ; cat $f | sed '/^#md5sum/d' > $f.md5 ; sed -i "s/^#md5sum=.*/#md5sum=\"$(md5sum $f.md5 | sed 's/ .*//')\"/" $f
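The same two steps can be written in Python, which may be easier to read than the one-liner. This is a sketch of the equivalent logic, not part of poap.py; the function name is hypothetical and the script is assumed to be UTF-8 text.

```python
import hashlib


def embed_md5(script_text):
    """Return script_text with its #md5sum="..." line refreshed."""
    lines = script_text.splitlines(keepends=True)
    # Step 1: the checksum covers every line except the #md5sum line.
    body = "".join(line for line in lines if not line.startswith("#md5sum"))
    checksum = hashlib.md5(body.encode("utf-8")).hexdigest()
    # Step 2: rewrite the #md5sum line, keeping its original line ending.
    updated = []
    for line in lines:
        if line.startswith("#md5sum="):
            ending = line[len(line.rstrip("\r\n")):]
            line = '#md5sum="%s"%s' % (checksum, ending)
        updated.append(line)
    return "".join(updated)


sample = '#!/bin/env python\n#md5sum="old"\nprint("hello")\n'
print(embed_md5(sample))
```

Running the function twice gives the same result, because the #md5sum line is excluded from the checksum.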

You also need to prepare the configuration file. It must contain at least:

  • the admin credential
  • the IP address for the management interface
  • the default gateway for the management vrf.
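A minimal sketch of how such a file could be generated per switch is shown below. The command lines are a reduced illustration assuming standard NX-OS syntax, not a complete startup configuration; the template, the function name and the password placeholder are hypothetical.

```python
# Reduced illustration of a Day-0 configuration; adapt before use.
MINIMAL_CONFIG = """\
username admin password {admin_password} role network-admin
vrf context management
  ip route 0.0.0.0/0 {gateway}
interface mgmt0
  vrf member management
  ip address {mgmt_ip}/{prefix_len}
"""


def render_minimal_config(admin_password, mgmt_ip, prefix_len, gateway):
    """Fill the template with the three mandatory items from the list above."""
    return MINIMAL_CONFIG.format(
        admin_password=admin_password,
        mgmt_ip=mgmt_ip,
        prefix_len=prefix_len,
        gateway=gateway,
    )


config = render_minimal_config("<admin-password>", "10.0.1.100", 24, "10.0.1.254")
print(config)
```

The result would be saved as conf.<serial number> in the conf directory, next to its .md5 file.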

Another recommendation is to add a second account that can be used to push the post-configuration. In my case, I added an account for Ansible with an SSH key.

If you want to do the same, create a new user dedicated to Ansible on the server, generate an RSA key pair and put the public key in your configuration.

Example:

#adduser ansible
#su - ansible
#ssh-keygen

ansible@ubuntu:/srv/tftp/poap# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/ansible/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /ansible/.ssh/id_rsa.
Your public key has been saved in /ansible/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:9YHPgARV1M1o0OhO/XLPiRqFBnqP+aIY3AXdWxwZqKQ root@ubuntu
The key's randomart image is:
+---[RSA 2048]----+
| .ooo+=.o |
| ..ooo.+ |
| .++=+.o |
| Eoo+==. |
| .S= ++o |
| . . o * o o |
| o . o o o.o.|
| o .. .. .o|
| . .. .o. |
+----[SHA256]-----+

Then read the public key from the following file

ansible@ubuntu:~/ansible$ cat ~/.ssh/id_rsa.pub
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPcNuuldpLiBuII9obYQXxRJKyqDoIsJAEWDo8w7Wk0bF8Y1A/Yba4Ld2a61k9NYUy8BbwF7ra3sRM1sd/lzW4KEsFx0lMq5SFXYBQCeYVSWSstRnuRuspfyQzcGYnPziyolBcDKTpMRekZk3cGD7lWSq32uIKEIW4k5UCywxqXP0RsjlGedtRg2in5tDPn4+qaTGpPRqYN/Cicoivm4SaX4iFtPhTyGdingss9aMMahdSKK4G1EixQnAfTcotY0A409013a1xuiMetBq+wXgCC19mepwwvovWm825q5CH8xTu9JxzvfolXHKNKIeUxFoQo55MNRgte7RNTC0EtYx9 ansible@ubuntu

and add it to your Nexus configuration file. You will then be able to connect to the Nexus 9K over SSH without a password.

[..]
username ansible password 5 ! role network-operator
username ansible role network-admin
username ansible sshkey ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDPcNuuldpLiBuII9obYQXxRJKyqDoIsJAEWDo8w7Wk0bF8Y1A/Yba4Ld2a61k9NYUy8BbwF7ra3sRM1sd/lzW4KEsFx0lMq5SFXYBQCeYVSWSstRnuRuspfyQzcGYnPziyolBcDKTpMRekZk3cGD7lWSq32uIKEIW4k5UCywxqXP0RsjlGedtRg2in5tDPn4+qaTGpPRqYN/Cicoivm4SaX4iFtPhTyGdingss9aMMahdSKK4G1EixQnAfTcotY0A409013a1xuiMetBq+wXgCC19mepwwvovWm825q5CH8xTu9JxzvfolXHKNKIeUxFoQo55MNRgte7RNTC0EtYx9 ansible@ubuntu
username ansible passphrase lifetime 99999 warntime 14 gracetime 3
[..]

POAP process

At this point your Cisco Nexus device is probably up and looping in the POAP prompt.

..2020 Aug 24 08:04:39 %$ VDC-1 %$ %CARDCLIENT-2-FPGA_BOOT_GOLDEN: IOFPGA booted from Golden
2020 Aug 24 08:04:39 %$ VDC-1 %$ %CARDCLIENT-2-FPGA_BOOT_STATUS: Unable to retrieve MIFPGA boot status
..System is coming up … Please wait …
…Starting Auto Provisioning …
2020 Aug 24 08:05:13 %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online
Done
Abort Power On Auto Provisioning yes/skip/no yes - continue with normal setup, skip - bypass password and basic configuration, no - continue with Power On Auto Provisioning [no]:
Abort Power On Auto Provisioning yes - continue with normal setup, skip - bypass password and basic configuration, no - continue with Power On Auto Provisioning[no]: 2020 Aug 24 08:05:23 switch %$ VDC-1 %$ %POAP-2-POAP_INITED: [90IFLAUVL3T-50:00:00:01:00:07] - POAP process initialized
2020 Aug 24 08:05:39 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - USB Initializing Success
2020 Aug 24 08:05:39 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - USB disk not detected

The switch now starts the POAP process. It first requests an IP address; the server answers with the address and additional parameters, mainly the TFTP server address and the script filename.

In the DHCP Discover, the serial number is carried in option 61 (client identifier). The server uses this option to match the reservation for the switch.

The server then offers the reserved IP address together with the gateway, TFTP server and bootfile options.

On the DHCP server side, the requests arrive via 10.0.1.254, the default gateway of the mgmt subnet, where we configured the DHCP relay.

Aug 24 12:22:54 ubuntu dhcpd[14027]: from the dynamic address pool for 10.0.1.0/24
Aug 24 12:22:54 ubuntu dhcpd[14027]: uid lease 10.0.1.1 for client 50:00:00:01:00:00 is duplicate on 10.0.1.0/24
Aug 24 12:22:54 ubuntu dhcpd[14027]: DHCPREQUEST for 10.0.1.100 (10.0.2.1) from 50:00:00:01:00:00 via 10.0.1.254
Aug 24 12:22:54 ubuntu dhcpd[14027]: DHCPACK on 10.0.1.100 to 50:00:00:01:00:00 via 10.0.1.254
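When several switches boot at the same time, it helps to extract the acknowledged leases from the dhcpd log. The following sketch assumes the DHCPACK line format shown above; the pattern and function name are illustrative.

```python
import re

# Matches lines such as:
#   DHCPACK on 10.0.1.100 to 50:00:00:01:00:00 via 10.0.1.254
# An optional "(hostname)" between the MAC address and "via" is tolerated.
ACK_PATTERN = re.compile(
    r"DHCPACK on (?P<ip>\S+) to (?P<mac>[0-9a-fA-F:]{17})"
    r"(?: \([^)]*\))? via (?P<via>\S+)"
)


def parse_dhcpack(line):
    """Return ip, mac and relay of a DHCPACK line, or None if no match."""
    match = ACK_PATTERN.search(line)
    return match.groupdict() if match else None


sample = ("Aug 24 12:22:54 ubuntu dhcpd[14027]: DHCPACK on 10.0.1.100 "
          "to 50:00:00:01:00:00 via 10.0.1.254")
print(parse_dhcpack(sample))
```

A "via" value equal to the router address confirms that the request came through the relay.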

On the Nexus console we can observe the following logs:

2020 Aug 24 08:05:40 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: Recieved DHCP offer from server ip - 10.0.2.1
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ last message repeated 1 time
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Using DHCP, valid information received over mgmt0 from 10.0.2.1
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Assigned IP address: 10.0.1.100
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Netmask: 255.255.255.0
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - DNS Server: 10.0.100.1
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Default Gateway: 10.0.1.254
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Script Server: 10.0.2.1
2020 Aug 24 08:05:48 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Script Name: /poap/poap.py
2020 Aug 24 08:06:00 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - The POAP Script download has started
2020 Aug 24 08:06:00 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - The POAP Script is being downloaded from [copy tftp://10.0.2.1//poap/poap.py bootflash:scripts/script.sh vrf management ]
2020 Aug 24 08:06:05 switch %$ VDC-1 %$ %USER-1-SYSTEM_MSG: SWINIT failed. devid:241 inst:0 - t2usd
2020 Aug 24 08:06:10 switch %$ VDC-1 %$ %POAP-2-POAP_SCRIPT_DOWNLOADED: [90IFLAUVL3T-50:00:00:01:00:07] - Successfully downloaded POAP script file
2020 Aug 24 08:06:10 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - Script file size 20390, MD5 checksum 89f5b64624dcd4c2350dbece6aaf3bab
2020 Aug 24 08:06:10 switch %$ VDC-1 %$ %POAP-2-POAP_INFO: [90IFLAUVL3T-50:00:00:01:00:07] - MD5 checksum received from the script file is 89f5b64624dcd4c2350dbece6aaf3bab
2020 Aug 24 08:06:10 switch %$ VDC-1 %$ %POAP-2-POAP_SCRIPT_STARTED_MD5_VALIDATED: [90IFLAUVL3T-50:00:00:01:00:07] - POAP script execution started(MD5 validated)
2020 Aug 24 08:07:25 switch %$ VDC-1 %$ %ASCII-CFG-2-CONF_CONTROL: System ready

If everything goes well, the switch gets an IP address and downloads the poap.py script over TFTP. The script then checks whether the switch already runs the target image and downloads the configuration over SCP.

Finally, the device reloads.

2020 Aug 24 09:23:20 switch %$ VDC-1 %$ %USER-1-SYSTEM_MSG: SWINIT failed. devid:241 inst:0  - t2usd
2020 Aug 24 09:24:53 switch %$ VDC-1 %$ %ASCII-CFG-2-CONF_CONTROL: System ready
2020 Aug 24 09:25:22 switch %$ VDC-1 %$ %VMAN-2-ACTIVATION_STATE: Successfully activated virtual service 'guestshell+'
2020 Aug 24 09:25:22 switch %$ VDC-1 %$ %VMAN-2-GUESTSHELL_ENABLED: The guest shell has been enabled. The command 'guestshell' may be used to access it, 'guestshell destroy' to remove it.
2020 Aug 24 09:27:35 switch %$ VDC-1 %$ %POAP-2-POAP_SCRIPT_EXEC_SUCCESS: [90IFLAUVL3T-50:00:00:01:00:07] - POAP script execution success
2020 Aug 24 09:27:41 switch %$ VDC-1 %$ %POAP-2-POAP_RELOAD_DEVICE: [90IFLAUVL3T-50:00:00:01:00:07] - Reload device
2020 Aug 24 09:27:51 switch %$ VDC-1 %$ %VMAN-2-ACTIVATION_STATE: Successfully deactivated virtual service 'guestshell+'
2020 Aug 24 09:27:53 switch %$ VDC-1 %$ %PLATFORM-2-PFM_SYSTEM_RESET: Manual system restart from Command Line Interface
[  731.127762] sysrq: SysRq : Resetting
Sysconf checksum failed. Using default values
WARNING:  No BIOS Info found
Sysconf checksum failed. Using default values
Sysconf checksum failed. Using default values
Sysconf checksum failed. Using default values
ATE0Q1&D2&C1S0=1
Standalone chassis
check_bootmode: grub2pxe: grub failed, launch ipxe
Trying to load ipxe
Loading Application:
/Vendor(429bdb26-48a6-47bd-664c-801204061400)/UnknownMedia(6)/EndEntire
cannot load imageFailed to launch ipxe
Came back to grub, now load efi shell
Trying to load efishell
Loading Application:
/Vendor(429bdb26-48a6-47bd-664c-801204061400)/UnknownMedia(6)/EndEntire
cannot load imageFailed to launch shell
Trying to read config file /boot/grub/menu.lst.local from (hd0,4)
 Filesystem type is ext2fs, partition type 0x83

Booting bootflash:/nxos.9.3.2.bin ...
Booting bootflash:/nxos.9.3.2.bin
Trying diskboot
 Filesystem type is ext2fs, partition type 0x83
Image valid
[..]
Installing local RPMS
Patch Repository Setup completed successfully
Bootstrapping via POAP overriding existing startup-config
Creating /dev/mcelog
Starting mcelog daemon
INIT: Entering runlevel: 3
Running S93thirdparty-script...

Populating conf files for hybrid sysmgr ...
Starting hybrid sysmgr ...
done
Netbroker support IS present in the kernel.
done
Executing Prune clis.
Aug 24 09:31:14 %FW_APP-2-FIRMWARE_IMAGE_LOAD_SUCCESS No Firmware needed for Non SR card.
2020 Aug 24 09:31:28  %$ VDC-1 %$  %USER-2-SYSTEM_MSG: <<%USBHSD-2-MOUNT>> logflash: online  - usbhsd
2020 Aug 24 09:31:35  %$ VDC-1 %$  %DAEMON-2-SYSTEM_MSG: <<%ASCII-CFG-2-CONF_CONTROL>> Poap replay /bootflash/poap_replay01.cfg - ascii-cfg[31425]
2020 Aug 24 09:31:53  %$ VDC-1 %$ netstack: Registration with cli server complete
System is coming up ... Please wait ...
....System is coming up ... Please wait ...
2020 Aug 24 09:32:46  %$ VDC-1 %$ %USER-2-SYSTEM_MSG: ssnmgr_app_init called on ssnmgr up - aclmgr
....2020 Aug 24 09:33:06  %$ VDC-1 %$ %USER-0-SYSTEM_MSG: end of default policer - copp
2020 Aug 24 09:33:06  %$ VDC-1 %$ %COPP-2-COPP_NO_POLICY: Control-plane is unprotected.
System is coming up ... Please wait ...
2020 Aug 24 09:33:15  %$ VDC-1 %$ %CARDCLIENT-2-FPGA_BOOT_GOLDEN: IOFPGA booted from Golden
2020 Aug 24 09:33:15  %$ VDC-1 %$ %CARDCLIENT-2-FPGA_BOOT_STATUS: Unable to retrieve MIFPGA boot status
....System is coming up ... Please wait ...
.2020 Aug 24 09:33:47  %$ VDC-1 %$ %ASCII-CFG-2-CONFIG_REPLAY_STATUS: Bootstrap Replay Started.
.2020 Aug 24 09:33:51  %$ VDC-1 %$ %VDC_MGR-2-VDC_ONLINE: vdc 1 has come online
Waiting for box online to replay poap config
2020 Aug 24 09:34:09 switch %$ VDC-1 %$ %ASCII-CFG-2-CONFIG_REPLAY_STATUS: Bootstrap Replay Done.
2020 Aug 24 09:34:31 switch %$ VDC-1 %$ %USER-1-SYSTEM_MSG: SWINIT failed. devid:241 inst:0  - t2usd
2020 Aug 24 09:35:46 switch %$ VDC-1 %$ %ASCII-CFG-2-CONFIG_REPLAY_STATUS: Ascii Replay Started.
2020 Aug 24 09:36:19 switch %$ VDC-1 %$ %ASCII-CFG-2-CONFIG_REPLAY_STATUS: Ascii Replay Done.
2020 Aug 24 09:36:21 switch %$ VDC-1 %$ %ASCII-CFG-2-CONF_CONTROL: System ready
[########################################] 100%
2020 Aug 24 09:36:52 switch %$ VDC-1 %$ %VMAN-2-ACTIVATION_STATE: Successfully activated virtual service 'guestshell+'
2020 Aug 24 09:36:52 switch %$ VDC-1 %$ %VMAN-2-GUESTSHELL_ENABLED: The guest shell has been enabled. The command 'guestshell' may be used to access it, 'guestshell destroy' to remove it.
Copy complete, now saving to disk (please wait)...
Copy complete.
Auto provisioning complete



User Access Verification
switch login:

ANSIBLE

Your switch is UP with the target image and your configuration. Now you are able to continue your setup with ansible.
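Since the switch reloads at the end of POAP, it can be handy to wait until SSH answers before launching Ansible. This is a small sketch using only the standard library; the function name and the default timings are arbitrary choices.

```python
import socket
import time


def wait_for_port(host, port, timeout=600.0, interval=5.0):
    """Return True once a TCP connection succeeds, False after timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful TCP handshake is enough; nothing is sent.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example call for the switch used in this article:
#   wait_for_port("10.0.1.100", 22)
```

An open port only shows that the SSH service is listening; the first playbook run remains the real test.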

For Ansible, we installed the latest version with pip:

pip install ansible
root@ubuntu:/srv/tftp/poap# pip list
[..]

ansible 2.9.12
cffi 1.14.2
cryptography 3.0
ecdsa 0.13
enum34 1.1.10
httplib2 0.9.1
ipaddress 1.0.23
Jinja2 2.8
MarkupSafe 0.23
netaddr 0.7.18
paramiko 1.16.0
pip 20.2.2
pycparser 2.20
pycrypto 2.6.1
PyYAML 3.11
setuptools 20.7.0
six 1.10.0
wheel 0.29.0

For this lab, we have two files :

  • inventory, which contains the variables for your switch
  • play1.yml, which is your simple playbook

ansible@ubuntu:~/ansible$ tree
.
|-- inventory
`-- play1.yml
0 directories, 2 files

ansible@ubuntu:~/ansible$ cat inventory
[N9K]
N9K1 ansible_host=10.0.1.100  ansible_port=22

[N9K:vars]
ansible_user=ansible
ansible_connection=network_cli
ansible_network_os=nxos
ansible_python_interpreter="/usr/bin/env python"

ansible@ubuntu:~/ansible$ cat play1.yml
---
- name: Setup Nexus Devices

  hosts: all
  connection: local
  gather_facts: False

  tasks:
    - name: configure hostname
      nxos_config:
        lines: hostname {{ inventory_hostname }}
        save_when: modified

This playbook sets the hostname: the variable {{ inventory_hostname }} is replaced by the name of the host in the inventory, N9K1.

When we run the playbook, the task reports one change.

The hostname has been changed:

switch login:
User Access Verification
switch login:
User Access Verification
N9K1 login: 2020 Aug 24 09:48:06 N9K1 %$ VDC-1 %$ %COPP-2-COPP_NO_POLICY: Control-plane is unprotected.

Troubleshooting

If you run into issues during POAP, the best option is probably to skip the process and then check the log created in bootflash.

Here is an example where the configuration file name is wrong (note the double dot in conf..90IFLAUVL3T):

root@ubuntu:/srv/tftp/poap# cat conf/poap.log.7_26_15
INFO: Selected config filename (serial_number) : /srv/tftp/poap/conf/conf..90IFLAUVL3T
INFO: free space is 2523696 kB
CLI : terminal dont-ask ; terminal password cisco1234 ; copy scp://poap@10.0.2.1/srv/tftp/poap//nxos.9.3.2.bin.md5 volatile:nxos.9.3.2.bin.md5.poap_md5 vrf management
CLI : show file volatile:nxos.9.3.2.bin.md5.poap_md5 | grep -v '^#' | head lines 1 | sed 's/ .*$//'
INFO: md5sum 76b01ff1d7243ce035c25becd2634d27 (.md5 file)
CLI : show file bootflash:/nxos.9.3.2.bin md5sum
INFO: md5sum 76b01ff1d7243ce035c25becd2634d27 (recalculated)
INFO: Same source and destination images
INFO: Verification passed. (system : 11/4/2019)
INFO: Verification passed.  (system : 11/4/2019)
CLI : terminal dont-ask ; terminal password cisco1234 ; copy scp://poap@10.0.2.1/srv/tftp/poap/conf/conf..90IFLAUVL3T volatile:poap.cfg vrf management
WARN: Copy Failed: "\r\n\nError: no such file
[..]
ERR : aborting
INFO: cleaning up

The following log shows a successful run:

root@ubuntu:/srv/tftp/poap# cat poap.log.8_6_17
INFO: Selected config filename (serial_number) : /srv/tftp/poap/conf/conf.90IFLAUVL3T
INFO: free space is 2523664 kB
CLI : terminal dont-ask ; terminal password cisco1234 ; copy scp://poap@10.0.2.1/srv/tftp/poap//nxos.9.3.2.bin.md5 volatile:nxos.9.3.2.bin.md5.poap_md5 vrf management
CLI : show file volatile:nxos.9.3.2.bin.md5.poap_md5 | grep -v '^#' | head lines 1 | sed 's/ .*$//'
INFO: md5sum 76b01ff1d7243ce035c25becd2634d27 (.md5 file)
CLI : show file bootflash:/nxos.9.3.2.bin md5sum
INFO: md5sum 76b01ff1d7243ce035c25becd2634d27 (recalculated)
INFO: Same source and destination images
INFO: Verification passed. (system : 11/4/2019)
INFO: Verification passed.  (system : 11/4/2019)
CLI : terminal dont-ask ; terminal password cisco1234 ; copy scp://poap@10.0.2.1/srv/tftp/poap/conf/conf.90IFLAUVL3T volatile:poap.cfg vrf management
INFO: Completed Copy of Config File
CLI : terminal dont-ask ; terminal password cisco1234 ; copy scp://poap@10.0.2.1/srv/tftp/poap/conf/conf.90IFLAUVL3T.md5 volatile:conf.90IFLAUVL3T.md5.poap_md5 vrf management
CLI : show file volatile:conf.90IFLAUVL3T.md5.poap_md5 | grep -v '^#' | head lines 1 | sed 's/ .*$//'
INFO: md5sum 97a6fd0ffad10c1986a1c89b0e433ae8 (.md5 file)
CLI : show file volatile:poap.cfg md5sum
INFO: md5sum 97a6fd0ffad10c1986a1c89b0e433ae8 (recalculated)
CLI : show system internal platform internal info | grep box_online | sed 's/[^0-9]*//g'
INFO: Setting the boot variables
CLI : config terminal ; boot nxos bootflash:/nxos.9.3.2.bin
CLI : copy running-config startup-config
CLI : copy volatile:poap.cfg scheduled-config
INFO: Configuration successful
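Both logs can be summarized automatically. The sketch below assumes the line prefixes seen in the two logs above (INFO:, WARN:, ERR); the function name and the keys of the returned dictionary are hypothetical.

```python
def summarize_poap_log(text):
    """Extract the selected config file, warnings and abort status."""
    summary = {"config_file": None, "warnings": [], "aborted": False,
               "double_dot": False}
    for line in text.splitlines():
        if line.startswith("INFO: Selected config filename"):
            # The path follows the second colon of the line.
            path = line.split(":", 2)[2].strip()
            summary["config_file"] = path
            summary["double_dot"] = ".." in path
        elif line.startswith("WARN:"):
            summary["warnings"].append(line[len("WARN:"):].strip())
        elif line.startswith("ERR"):
            summary["aborted"] = True
    return summary


failed_log = """\
INFO: Selected config filename (serial_number) : /srv/tftp/poap/conf/conf..90IFLAUVL3T
INFO: free space is 2523696 kB
WARN: Copy Failed: "\\r\\n\\nError: no such file
ERR : aborting
INFO: cleaning up
"""
print(summarize_poap_log(failed_log))
```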

How to Automate Cisco NXOS infrastructure with Ansible

You manage a lot of network devices, but you are on your own or short on time. Ansible can help you roll out changes across your whole network very quickly, based on your own templates. In this article we will use Cisco Nexus 9K switches.

Say you have a new DNS server or syslog server and need to update a hundred switches. No worries, with Ansible it can be very simple.

First you should create at least two files. The first one will be your inventory and contains your switches. The second will be your playbook.

The first thing is to create a service account for Ansible on your switches. This account can be centralized or local. In the following, I provide my password in cleartext. Of course, this is not recommended, and you should prefer an SSH key.

On my virtual nexus 9k, I only configured my account and my management IP address.

My topology contains :

  • Nexus-1 : IP 10.0.100.99, name: AGR1
  • Nexus-2 : IP 10.0.100.100, name: ACC1
  • Nexus-3 : IP 10.0.100.101, name: ACC2

switch(config-if)# sh run 

!Command: show running-config
!Running configuration last done at: Sat Mar 21 18:28:03 2020
!Time: Sat Mar 21 18:29:45 2020

version 9.3(2) Bios:version  
[..]
username ansible password 5 $5$.FhD0kmO$4PJV/HKJN5ul9aK7160ii.1WQ3s9pjh2QCRL7x7lEU/  role network-admin
username ansible passphrase  lifetime 99999 warntime 14 gracetime 3
ip domain-lookup

[..]
interface mgmt0
  vrf member management
  ip address 10.0.100.100/24
line console
line vty

The inventory file is shown below. Ansible accepts two inventory formats, YAML or INI; this one uses INI. It contains a group named N9K with three switches.

[N9K]
AGR1 ansible_host=10.0.100.99  ansible_port=22
ACC1  ansible_host=10.0.100.100 ansible_port=22
ACC2  ansible_host=10.0.100.101 ansible_port=22

[N9K:vars]
ansible_user=ansible
ansible_password=@ns1b!E.
ansible_connection=network_cli
ansible_network_os=nxos
ansible_python_interpreter="/usr/bin/env python"
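With a hundred switches, the inventory is better generated than typed. The following sketch renders the same INI layout from a dictionary; the function name is hypothetical, and the password is deliberately left out because an SSH key is preferable.

```python
def render_inventory(group, hosts, group_vars):
    """Render an INI inventory with one group and its variables."""
    lines = ["[%s]" % group]
    for name, address in hosts.items():
        lines.append("%s ansible_host=%s ansible_port=22" % (name, address))
    lines.append("")
    lines.append("[%s:vars]" % group)
    for key, value in group_vars.items():
        lines.append("%s=%s" % (key, value))
    return "\n".join(lines) + "\n"


inventory = render_inventory(
    "N9K",
    {"AGR1": "10.0.100.99", "ACC1": "10.0.100.100", "ACC2": "10.0.100.101"},
    {
        "ansible_user": "ansible",
        "ansible_connection": "network_cli",
        "ansible_network_os": "nxos",
    },
)
print(inventory)
```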

The playbook is written in YAML. This first one is very simple and contains one task to configure the switch hostname.

---
- name: Setup Nexus Devices

  hosts: all
  connection: local
  gather_facts: False


  tasks:

    - name: configure hostname
      nxos_config:
        lines: hostname {{ inventory_hostname }}
        save_when: modified

Now I'll verify my playbook before applying the changes. The command uses the option -i to specify which file is used as inventory and --check to simulate the changes.

root@09cf326cc275:/ansible/NXOS# ansible-playbook -i inventory-home playbook-home.yaml --check

PLAY [Setup Nexus Devices] ***********************************************************************************************************************

TASK [configure hostname] ************************************************************************************************************************
[WARNING]: Skipping command `copy running-config startup-config` due to check_mode.  Configuration not copied to non-volatile storage
terpreter_discovery.html for more information.
changed: [ACC1]
changed: [AGR1]
changed: [ACC2]

PLAY RECAP ***************************************************************************************************************************************
ACC1                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ACC2                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
AGR1                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0 

Now I'll run the same command without the --check option, and the Nexus devices are configured. Note that the warning about the skipped copy running-config startup-config is no longer there.

root@09cf326cc275:/ansible/NXOS# ansible-playbook -i inventory-home playbook-home.yaml        

PLAY [Setup Nexus Devices] ***********************************************************************************************************************

TASK [configure hostname] ************************************************************************************************************************

changed: [ACC1]
changed: [AGR1]
changed: [ACC2]

PLAY RECAP ***************************************************************************************************************************************
ACC1                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ACC2                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
AGR1                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
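On a large inventory, reading the recap by eye becomes tedious. The sketch below parses the PLAY RECAP section of captured output, assuming the counter layout shown above; the function names are hypothetical.

```python
import re

# One recap line: "<host> : ok=1 changed=1 unreachable=0 failed=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Return {host: {counter: value}} from ansible-playbook output."""
    results = {}
    in_recap = False
    for line in output.splitlines():
        if line.startswith("PLAY RECAP"):
            in_recap = True
            continue
        if not in_recap:
            continue
        match = RECAP_LINE.match(line.strip())
        if match:
            pairs = (item.split("=") for item in match.group("counters").split())
            results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def problem_hosts(results):
    """List hosts with at least one failed task or unreachable status."""
    return sorted(host for host, counters in results.items()
                  if counters.get("failed", 0) or counters.get("unreachable", 0))


captured = """\
PLAY RECAP ******
ACC1                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
AGR1                       : ok=1    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
"""
print(parse_recap(captured))
print(problem_hosts(parse_recap(captured)))
```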

Fantastic, my Nexus switches have been configured! With the command show accounting log, you can verify the commands pushed by Ansible. In my playbook, I added the line save_when: modified to save the configuration after the changes.

AGR1# show accounting log | last 10
Sat Mar 21 18:45:42 2020:type=stop:id=10.0.100.150@pts/2:user=ansible:cmd=shell terminated because the ssh session closed
Sat Mar 21 18:49:11 2020:type=start:id=10.0.100.150@pts/2:user=ansible:cmd=
Sat Mar 21 18:49:12 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=terminal length 0 (SUCCESS)
Sat Mar 21 18:49:12 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=terminal width 511 (SUCCESS)
Sat Mar 21 18:49:20 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=configure terminal ; hostname AGR1 (SUCCESS)
Sat Mar 21 18:49:26 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=Performing configuration copy.
Sat Mar 21 18:49:36 2020:type=start:id=vsh.bin.13650:user=admin:cmd=
Sat Mar 21 18:49:52 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=copy running-config startup-config (SUCCESS)
Sat Mar 21 18:49:53 2020:type=stop:id=10.0.100.150@pts/2:user=ansible:cmd=shell terminated because the ssh session closed
Sat Mar 21 18:52:35 2020:type=update:id=console0:user=admin:cmd=terminal width 511 (SUCCESS)
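To audit what a given account did, the accounting log can be filtered with a few lines of Python. This sketch assumes the colon-separated field layout shown above (type, id, user, cmd); the names are illustrative.

```python
import re

# <timestamp>:type=<type>:id=<id>:user=<user>:cmd=<command>
ACCOUNTING_LINE = re.compile(
    r"^(?P<when>.+?):type=(?P<type>\w+):id=(?P<id>.*?)"
    r":user=(?P<user>[^:]*):cmd=(?P<cmd>.*)$"
)


def commands_by_user(log_text, user):
    """Return the commands recorded as type=update for one user."""
    commands = []
    for line in log_text.splitlines():
        match = ACCOUNTING_LINE.match(line.strip())
        if match and match.group("user") == user and match.group("type") == "update":
            commands.append(match.group("cmd"))
    return commands


log = """\
Sat Mar 21 18:49:11 2020:type=start:id=10.0.100.150@pts/2:user=ansible:cmd=
Sat Mar 21 18:49:20 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=configure terminal ; hostname AGR1 (SUCCESS)
Sat Mar 21 18:49:52 2020:type=update:id=10.0.100.150@pts/2:user=ansible:cmd=copy running-config startup-config (SUCCESS)
Sat Mar 21 18:52:35 2020:type=update:id=console0:user=admin:cmd=terminal width 511 (SUCCESS)
"""
for command in commands_by_user(log, "ansible"):
    print(command)
```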

Now you can imagine the next step. You can add your syslog server for example.

    - name: configure syslog server
      nxos_config:
        lines:
          - logging server 10.0.100.42 4 use-vrf management facility local7
          - logging timestamp milliseconds
        save_when: modified

Before the change:

AGR1(config)# logging timestamp milliseconds ^C
AGR1(config)# sh logging 

Logging console:                enabled (Severity: critical)
Logging monitor:                enabled (Severity: notifications)
Logging linecard:               enabled (Severity: notifications)
Logging timestamp:              Seconds
Logging source-interface :      disabled
Logging rate-limit:             enabled
Logging server:                 disabled
Logging origin_id :             disabled
Logging RFC :                   disabled
Logging logflash:               enabled (Severity: notifications)
Logging logfile:                enabled
        Name - messages: Severity - notifications Size - 4194304

[..]

After the change:

AGR1(config)# 2020 Mar 21 18:58:48 AGR1 %$ VDC-1 %$  %SYSLOG-2-SYSTEM_MSG: Attempt to configure logging server with: hostname/IP 10.0.100.42,severity 4,port 514,facility local7 - syslogd
AGR1(config)# sh logging 

Logging console:                enabled (Severity: critical)
Logging monitor:                enabled (Severity: notifications)
Logging linecard:               enabled (Severity: notifications)
Logging timestamp:              Milliseconds
Logging source-interface :      disabled
Logging rate-limit:             enabled
Logging server:                 enabled
{10.0.100.42}
        This server is temporarily unreachable
        server severity:        warnings
        server facility:        local7
        server VRF:             management
        server port:            514
Logging origin_id :             disabled
Logging RFC :                   disabled
Logging logflash:               enabled (Severity: notifications)
Logging logfile:                enabled
        Name - messages: Severity - notifications Size - 4194304
[..]
root@09cf326cc275:/ansible/NXOS# ansible-playbook -i inventory-home playbook-home.yaml

PLAY [Setup Nexus Devices] ***********************************************************************************************************************

TASK [configure hostname] ************************************************************************************************************************

ok: [ACC1]
ok: [AGR1]
ok: [ACC2]

TASK [configure syslog server] *******************************************************************************************************************
changed: [ACC1]
changed: [ACC2]
changed: [AGR1]

PLAY RECAP ***************************************************************************************************************************************
ACC1                       : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ACC2                       : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
AGR1                       : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
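If the playbook is launched from a script rather than by hand, the PLAY RECAP lines can be checked automatically. The following is only a minimal sketch: the function names parse_recap and hosts_with_problems are hypothetical, and it assumes the recap format shown in the output above.

```python
import re

# One recap line looks like: "ACC1   : ok=2    changed=1    unreachable=0 ..."
RECAP_LINE = re.compile(r"^(?P<host>\S+)\s*:\s*(?P<counters>(?:\w+=\d+\s*)+)$")


def parse_recap(output):
    """Return {host: {counter: value}} for every PLAY RECAP host line."""
    results = {}
    for line in output.splitlines():
        match = RECAP_LINE.match(line.strip())
        if not match:
            continue  # task headers, "ok: [ACC1]" lines, blank lines, etc.
        pairs = (pair.split("=") for pair in match.group("counters").split())
        results[match.group("host")] = {key: int(value) for key, value in pairs}
    return results


def hosts_with_problems(recap):
    """List hosts that reported at least one failed or unreachable result."""
    return sorted(
        host for host, counters in recap.items()
        if counters.get("failed", 0) or counters.get("unreachable", 0)
    )


sample = """
PLAY RECAP *********
ACC1                       : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
AGR1                       : ok=2    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
"""
recap = parse_recap(sample)
print(recap["AGR1"]["changed"], hosts_with_problems(recap))
```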

Ansible can also be useful to manage your access lists. Imagine you install a new monitoring server and you need to update one entry. This time we will use another module, named nxos_acl.

    - name: configure SNMP-ACCESS-LIST
      nxos_acl:
        name: ACL_SNMP-ReadOnly
        seq: "10"
        action: permit
        proto: udp
        src: 10.0.100.42/32
        dest: any
        state: present

The ACL is now configured on all switches. When a dedicated module exists for a feature, prefer it over the generic nxos_config module.

root@09cf326cc275:/ansible/NXOS# ansible-playbook -i inventory-home playbook-home.yaml

PLAY [Setup Nexus Devices] ***********************************************************************************************************************

TASK [configure hostname] ************************************************************************************************************************

ok: [ACC1]
ok: [AGR1]
ok: [ACC2]

TASK [configure syslog server] *******************************************************************************************************************
changed: [ACC1]
changed: [ACC2]
changed: [AGR1]

TASK [configure SNMP-ACCESS-LIST] ****************************************************************************************************************
changed: [ACC1]
changed: [ACC2]
changed: [AGR1]

PLAY RECAP ***************************************************************************************************************************************
ACC1                       : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ACC2                       : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
AGR1                       : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0 
AGR1(config)# sh ip access-lists ACL_SNMP-ReadOnly

IP access list ACL_SNMP-ReadOnly
        10 permit udp 10.0.100.42/32 any
--
ACC1(config)# sh ip access-lists ACL_SNMP-ReadOnly

IP access list ACL_SNMP-ReadOnly
	10 permit udp 10.0.100.42/32 any
--
ACC2# sh ip access-lists ACL_SNMP-ReadOnly

IP access list ACL_SNMP-ReadOnly
        10 permit udp 10.0.100.42/32 any 

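This module is idempotent: running it again with the same parameters changes nothing. Now we will update the ACL with a second entry. The available parameters are described in the nxos_acl module documentation.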

    - name: configure SNMP-ACCESS-LIST
      nxos_acl:
        name: ACL_SNMP-ReadOnly
        seq: "10"
        action: permit
        proto: udp
        src: 10.0.100.42/32
        dest: any
        state: present

    - name: configure SNMP-ACCESS-LIST
      nxos_acl:
        name: ACL_SNMP-ReadOnly
        seq: "20"
        action: permit
        proto: udp
        src: 10.0.100.43/32
        dest: any
        state: present
root@09cf326cc275:/ansible/NXOS# ansible-playbook -i inventory-home playbook-home.yaml

PLAY [Setup Nexus Devices] ***********************************************************************************************************************

TASK [configure hostname] ************************************************************************************************************************
changed: [ACC1]
changed: [AGR1]
changed: [ACC2]

TASK [configure syslog server] *******************************************************************************************************************
changed: [ACC1]
changed: [ACC2]
changed: [AGR1]

TASK [configure SNMP-ACCESS-LIST] ****************************************************************************************************************
ok: [ACC1]
ok: [AGR1]
ok: [ACC2]

TASK [configure SNMP-ACCESS-LIST] ****************************************************************************************************************
changed: [ACC1]
changed: [AGR1]
changed: [ACC2]

PLAY RECAP ***************************************************************************************************************************************
ACC1                       : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ACC2                       : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
AGR1                       : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
AGR1(config)# sh ip access-lists ACL_SNMP-ReadOnly

IP access list ACL_SNMP-ReadOnly
        10 permit udp 10.0.100.42/32 any 
        20 permit udp 10.0.100.43/32 any

Then we update the last entry with a new IP address.

    - name: configure SNMP-ACCESS-LIST
      nxos_acl:
        name: ACL_SNMP-ReadOnly
        seq: "10"
        action: permit
        proto: udp
        src: 10.0.100.42/32
        dest: any
        state: present

    - name: configure SNMP-ACCESS-LIST
      nxos_acl:
        name: ACL_SNMP-ReadOnly
        seq: "20"
        action: permit
        proto: udp
        src: 10.0.100.44/32
        dest: any
        state: present
root@09cf326cc275:/ansible/NXOS# ansible-playbook -i inventory-home playbook-home.yaml

PLAY [Setup Nexus Devices] ***********************************************************************************************************************

TASK [configure hostname] ************************************************************************************************************************
changed: [ACC1]
changed: [ACC2]
changed: [AGR1]

TASK [configure syslog server] *******************************************************************************************************************
changed: [ACC1]
changed: [AGR1]
changed: [ACC2]

TASK [configure SNMP-ACCESS-LIST] ****************************************************************************************************************
ok: [ACC1]
ok: [AGR1]
ok: [ACC2]

TASK [configure SNMP-ACCESS-LIST] ****************************************************************************************************************
changed: [ACC1]
changed: [AGR1]
changed: [ACC2]

PLAY RECAP ***************************************************************************************************************************************
ACC1                       : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ACC2                       : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
AGR1                       : ok=4    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0 
AGR1(config)# sh ip access-lists ACL_SNMP-ReadOnly

IP access list ACL_SNMP-ReadOnly
        10 permit udp 10.0.100.42/32 any 
        20 permit udp 10.0.100.44/32 any
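
The "ok" and "changed" results of the two ACL tasks in the runs above follow from idempotency: a task only reports a change when the desired entry differs from what is already on the switch. The following toy model illustrates the idea. It is not how nxos_acl is implemented, and the function apply_entry is hypothetical.

```python
def apply_entry(acl, seq, rule):
    """Make sure acl[seq] equals rule; report 'ok' when nothing had to change."""
    if acl.get(seq) == rule:
        return "ok"
    acl[seq] = rule  # create the entry, or overwrite the existing sequence number
    return "changed"


acl = {}  # stands in for the ACL currently configured on one switch
results = [
    apply_entry(acl, 10, "permit udp 10.0.100.42/32 any"),  # first run: created
    apply_entry(acl, 10, "permit udp 10.0.100.42/32 any"),  # same entry again
    apply_entry(acl, 20, "permit udp 10.0.100.43/32 any"),  # new second entry
    apply_entry(acl, 20, "permit udp 10.0.100.44/32 any"),  # entry 20 updated
]
print(results)  # ['changed', 'ok', 'changed', 'changed']
```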

You can now imagine many other scenarios and apply your changes very quickly.

Cisco Nexus 9000 – POAP

In this article, we will discuss how to use POAP to provision multiple switches.

We need a DHCP, TFTP and SCP server. We can also use an HTTP server to deliver the software and the configuration.

POAP Infrastructure:

http://www.cisco.com/c/dam/en/us/td/i/300001-400000/330001-340000/331001-332000/331649.eps/_jcr_content/renditions/331649.jpg

POAP Process:

http://www.cisco.com/c/dam/en/us/td/i/300001-400000/330001-340000/332001-333000/332315.eps/_jcr_content/renditions/332315.jpg

Software used:

  • ISC-DHCP-SERVER – version 4.3.1
  • ATFTPD – version 0.7
  • OPENSSH-server 6.7p1

DHCP configuration example :

Subnet used : 192.168.255.0/24

In the following block, I reserve an IP address for the client XXXXXXXXXX, where XXXXXXXXXX is the serial number of the switch.

In the option dhcp-client-identifier you need to add “\000” before the serial number.

We have to assign the following parameters:

  • IP address
  • Default Gateway
  • IP address TFTP Server
  • Filename
  • DNS server

In the file: /etc/dhcp/dhcpd.conf

option domain-name-servers 192.168.255.254;

subnet 192.168.255.0 netmask 255.255.255.0 {

host switch1 {
 option dhcp-client-identifier "\000XXXXXXXXXX";
 fixed-address 192.168.255.1;
 option routers 192.168.255.254;
 option bootfile-name "/nxos/poap.py";
 option tftp-server-name "192.168.255.200";
 }

}
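
When many switches have to be provisioned, the host entries can be generated instead of being typed by hand. The following is a minimal sketch: the function dhcp_host_entry and the serial number SERIAL12345 are hypothetical, and the default values are taken from the example above.

```python
import ipaddress


def dhcp_host_entry(name, serial, ip, gateway, tftp_server,
                    bootfile="/nxos/poap.py", subnet="192.168.255.0/24"):
    """Build one dhcpd.conf host entry keyed on the switch serial number."""
    # Refuse an address that does not belong to the declared subnet.
    if ipaddress.ip_address(ip) not in ipaddress.ip_network(subnet):
        raise ValueError(f"{ip} is outside {subnet}")
    return "\n".join([
        f"host {name} {{",
        # The serial number must be prefixed with \000 in the client identifier.
        f' option dhcp-client-identifier "\\000{serial}";',
        f" fixed-address {ip};",
        f" option routers {gateway};",
        f' option bootfile-name "{bootfile}";',
        f' option tftp-server-name "{tftp_server}";',
        " }",
    ])


entry = dhcp_host_entry("switch1", "SERIAL12345", "192.168.255.1",
                        "192.168.255.254", "192.168.255.200")
print(entry)
```

The generated text still has to be placed inside the subnet block of /etc/dhcp/dhcpd.conf, and the DHCP server restarted.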

TFTP server:

I kept the default configuration, in the file /etc/default/atftpd

USE_INETD=true
OPTIONS="--tftpd-timeout 300 --retry-timeout 5 --mcast-port 1758 --mcast-addr 239.239.239.0-255 --mcast-ttl 1 --maxthread 100 --verbose=5 /srv/tftp"

In the directory /srv/tftp, I placed the poap.py file downloaded from GitHub (https://github.com/datacenter/nexus9000/blob/master/nx-os/poap/poap.py).

This script is provided by Cisco. You need to customize one part of it, where you enter the following information:

  • The target image
  • The directory for the image and the configuration
  • The method used to download the image and the configuration (here, scp)
  • The credentials of the SCP server
  • The name of the configuration file (here, based on the serial number)

# system and kickstart images, configuration: location on server (src) and target (dst)
n9k_image_version       = "7.0.3.I5.2" # this must match your code version
image_dir_src           = "/srv/tftp/nxos"  # Sample - /Users/bob/poap
ftp_image_dir_src_root  = image_dir_src
tftp_image_dir_src_root = image_dir_src
n9k_system_image_src    = "nxos.%s.bin" % n9k_image_version
config_file_src         = "/srv/tftp/nxos/conf" # Sample - /Users/bob/poap/conf
image_dir_dst           = "bootflash:" # directory where n9k image will be stored
system_image_dst        = n9k_system_image_src
config_file_dst         = "volatile:poap.cfg"
md5sum_ext_src          = "md5"
# Required space on /bootflash (for config and system images)
required_space          = 800000

# copy protocol to download images and config
# options are: scp/http/tftp/ftp/sftp
protocol                = "scp" # protocol to use to download images/config

# Host name and user credentials
username                = "root" # server account
ftp_username            = "anonymous" # server account
password                = "password" # password
hostname                = "192.168.255.200" # ip address of ftp/scp/http/sftp server
config_file_type        = "serial_number"

Next, you need to generate an MD5 checksum of the poap.py script. The following command replaces the second line of the script with the new MD5. If the MD5 is not valid, the POAP process will fail and restart.

f=poap.py ; cat $f | sed '/^#md5sum/d' > $f.md5 ; sed -i "s/^#md5sum=.*/#md5sum=\"$(md5sum $f.md5 | sed 's/ .*//')\"/" $f

The top of poap.py then looks like this:

#!/bin/env python
#md5sum="3b614973cbde2742388b5997228678cd"
# Still needs to be implemented.
# Return Values:
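
For readers who prefer to see the logic of the one-liner spelled out, here is a rough Python equivalent. It is only a sketch: the function embed_md5 is hypothetical, and it works on the script text in memory instead of editing the file in place.

```python
import hashlib


def embed_md5(script_text):
    """Return the script with its #md5sum line set to the MD5 of the other lines."""
    lines = script_text.splitlines(keepends=True)
    # Same idea as sed '/^#md5sum/d': hash the script without its #md5sum line.
    body = "".join(line for line in lines if not line.startswith("#md5sum"))
    digest = hashlib.md5(body.encode("utf-8")).hexdigest()
    # Same idea as sed 's/^#md5sum=.*/.../': rewrite the existing #md5sum line.
    return "".join(
        f'#md5sum="{digest}"\n' if line.startswith("#md5sum=") else line
        for line in lines
    )


sample = '#!/bin/env python\n#md5sum="placeholder"\nprint("poap")\n'
stamped = embed_md5(sample)
print(stamped.splitlines()[1])
```

Because the #md5sum line itself is excluded from the hash, running the function again on its own output gives the same result.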

You also need to generate an md5 for the image:

md5sum nxos.7.0.3.I5.2.bin > nxos.7.0.3.I5.2.bin.md5
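
The same file can be produced from Python, for example as part of a larger preparation script. This is a minimal sketch with a hypothetical function name, write_md5_file. It reads the image in chunks so that a large image does not have to fit in memory, and it writes the digest followed by the file name, as md5sum does.

```python
import hashlib
from pathlib import Path


def write_md5_file(image_path):
    """Write '<image>.md5' next to the image and return the path of that file."""
    image = Path(image_path)
    digest = hashlib.md5()
    with image.open("rb") as handle:
        # Read 1 MiB at a time until the end of the file.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    md5_path = image.with_name(image.name + ".md5")
    md5_path.write_text(f"{digest.hexdigest()}  {image.name}\n")
    return md5_path
```

For the image used here, the call would be write_md5_file("/srv/tftp/nxos/nxos.7.0.3.I5.2.bin").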

Don’t forget to name your configuration file “conf.XXXXXXX”, where XXXXXXX is the serial number of the switch, and to configure the credentials in this file.
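
Before powering on a switch, it can help to check that the files mentioned in this article are in place. The sketch below uses a hypothetical helper, missing_poap_files, and assumes that the image, its .md5 file and the conf.<serial> file all live in the same directory (/srv/tftp/nxos in this setup). It only covers the files named above; your version of poap.py may expect others.

```python
from pathlib import Path


def missing_poap_files(directory, image_version, serial):
    """Return the names of the expected files that are absent from directory."""
    expected = [
        f"nxos.{image_version}.bin",      # system image
        f"nxos.{image_version}.bin.md5",  # checksum generated with md5sum
        f"conf.{serial}",                 # configuration named after the serial number
    ]
    base = Path(directory)
    return [name for name in expected if not (base / name).is_file()]
```

An empty list means all three files were found, for example with missing_poap_files("/srv/tftp/nxos", "7.0.3.I5.2", "XXXXXXX").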