To ensure that the necessary components for Mender are properly integrated, you should use this checklist to verify each of them in turn. You can run this checklist after you have successfully built all components and correctly booted the device.
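This checklist verifies key aspects of the Mender integration: the Mender Client services, the bootloader environment tools, the A/B partition layout, and the bootloader's ability to switch partitions and roll back.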
Please note: during these steps we will be switching between the two partitions without downloading new content to them. If you don't have a bootable OS on the second partition, nothing will boot once we do the switch. To overcome this, you can either perform a regular rootfs update after the initial flashing, or flash the second partition with the same rootfs, before proceeding with the verification.
If full rootfs updates are not required, the Mender Client service check described next is the only validation needed.
The Mender Client consists of a number of components and configuration files. The mender-auth userspace application is responsible for authentication against the Mender Server, and mender-update is responsible for executing the updates.
By default, both run as systemd services.
To verify the correct installation of the service, run the following commands and confirm the output:
systemctl is-active mender-authd
# Output:
# active
systemctl is-enabled mender-authd
# Output:
# enabled
systemctl is-active mender-updated
# Output:
# active
systemctl is-enabled mender-updated
# Output:
# enabled
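The four checks above can be combined into one pass. The following is a minimal sketch, assuming the service names shown above; check_mender_services is a hypothetical helper name and not part of Mender.

```shell
# Sketch: report the state of both Mender services in one pass.
# check_mender_services is a hypothetical helper name, not part of Mender.
check_mender_services() {
    rc=0
    for svc in mender-authd mender-updated; do
        # systemctl exits non-zero for inactive or disabled units,
        # so ignore the exit status and compare the printed state instead.
        active=$(systemctl is-active "$svc" 2>/dev/null) || true
        enabled=$(systemctl is-enabled "$svc" 2>/dev/null) || true
        echo "$svc: $active, $enabled"
        if [ "$active" != "active" ] || [ "$enabled" != "enabled" ]; then
            rc=1
        fi
    done
    return "$rc"
}
# Usage on the device: check_mender_services
```

The function returns a non-zero status if either service is not both active and enabled, so it can be used in a script condition.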
In the remaining verification, we will manually execute steps similar to what the rootfs-image update module does in regular operation to communicate with the bootloader.
To avoid the Mender Client interfering with the manual verification, we recommend disconnecting the device from the internet.
Verify which of the two pairs of commands for manipulating the bootloader environment are executable and available in the path.
The rootfs-image update module uses these commands to set the bootloader environment.
The executables which the rootfs-image update module expects are bootloader specific. By default, GRUB and U-Boot implementations are supported.
Please run the commands below:
grub-mender-grubenv-print
grub-mender-grubenv-set
fw_printenv
fw_setenv
Errors upon calling the set command without arguments are expected and can be ignored. This step only checks that the executables exist.
We will test the setting of variables in the upcoming steps.
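The availability check can also be done without invoking the tools, using the shell's `command -v`. This is a minimal sketch; detect_bootenv_tools is a hypothetical helper name, and it only tests that each pair of names resolves in the path.

```shell
# Sketch: print the pair of bootloader environment tools found in the path.
# detect_bootenv_tools is a hypothetical helper name, not part of Mender.
detect_bootenv_tools() {
    if command -v grub-mender-grubenv-print >/dev/null 2>&1 &&
       command -v grub-mender-grubenv-set >/dev/null 2>&1; then
        echo "grub-mender-grubenv-print grub-mender-grubenv-set"
    elif command -v fw_printenv >/dev/null 2>&1 &&
         command -v fw_setenv >/dev/null 2>&1; then
        echo "fw_printenv fw_setenv"
    else
        echo "none"
        return 1
    fi
}
# Usage on the device: detect_bootenv_tools
```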
For the remaining steps, the GRUB CLI tools will be used, but the verification steps are equivalent when using the U-Boot tools.
In Yocto releases prior to 4.0 kirkstone, the names of the GRUB tools were the same as the U-Boot tools. Make sure to take this into account in the remaining examples on this page.
Redundant (A/B) partitioning is a requirement for full rootfs updates.
These steps will identify the partitions and check if they align with what is in the Mender Client configuration (/var/lib/mender/mender.conf).
By default the Mender Client looks for configuration in two locations. One of those is /var/lib/mender/mender.conf, which in the default case is a link to /data/mender/mender.conf on the persistent partition and doesn't get overwritten during a rootfs update. We recommend keeping a backup of the RootfsPartA/B settings in /var/lib/mender/mender.conf, as it is very rare that you need to change partition names as a result of an update.
Please note that the output can vary depending on the actual device names or if you're using PARTUUIDs.
Partitions as device files
The device name can vary according to the storage type and kernel naming convention.
cat /var/lib/mender/mender.conf | grep RootfsPart
# Output:
# "RootfsPartA": "/dev/nvme0n1p2"
# "RootfsPartB": "/dev/nvme0n1p3"
Partitions as PARTUUIDs
The PARTUUID feature hasn't been tested for U-Boot.
cat /var/lib/mender/mender.conf | grep RootfsPart
# Output:
# "RootfsPartA": "/dev/disk/by-partuuid/bdcae16f-400a-45e3-b5bb-c9512d3f56c1",
# "RootfsPartB": "/dev/disk/by-partuuid/bdcae16f-400a-45e3-b5bb-c9512d3f56c2"
Identify the currently active partition with the following command:
Partitions as device files
mount | grep 'on / '
# Output:
#/dev/nvme0n1p2 on / type ext4 ... ...
On some devices the rootfs is not listed as a block device but as /dev/root or similar. In that case, you can verify it with the following alternative series of commands:
stat -c %D /
# Output:
# 10303
stat -c %t%02T /dev/nvme0n1p2
# Output:
# 10303
The output of the two commands should be identical. This verifies that the correct partition is mounted as the root device when partition A is active.
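For the device-file case, the comparison between the mounted root device and the RootfsPartA/B settings can be sketched as follows. The helper names conf_value and active_slot are hypothetical, and the sketch assumes that mount reports the same device path that is written in mender.conf.

```shell
# conf_value KEY FILE: print the string value of KEY from a mender.conf-style file.
# Hypothetical helper; a simple sed match, not a full JSON parser.
conf_value() {
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$2"
}

# active_slot DEVICE FILE: print A or B depending on which setting equals DEVICE.
active_slot() {
    if [ "$1" = "$(conf_value RootfsPartA "$2")" ]; then
        echo A
    elif [ "$1" = "$(conf_value RootfsPartB "$2")" ]; then
        echo B
    else
        echo unknown
        return 1
    fi
}
# Usage on the device:
#   active_slot "$(mount | grep 'on / ' | awk '{print $1}')" /var/lib/mender/mender.conf
```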
Partitions as PARTUUIDs
dev=$(mount | grep 'on / ' | awk '{print $1}') && echo "$dev $(blkid -s PARTUUID -o value $dev)"
# Output:
# /dev/sda3 bdcae16f-400a-45e3-b5bb-c9512d3f56c2
At the end of this step, we need to identify the partition numbering.
This is because the rootfs-image update module passes only partition numbers to the bootloader and not the whole path as seen in mender.conf.
Partitions as device files
For device files, the partition number is the last number in the device name.
cat /var/lib/mender/mender.conf | grep RootfsPart
# Output: # Comment for clarification
# "RootfsPartA": "/dev/nvme0n1p2" -> Partition A number: 2
# "RootfsPartB": "/dev/nvme0n1p3" -> Partition B number: 3
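For the device-file case, extracting the trailing number can be sketched with sed. part_number is a hypothetical helper name; it strips everything up to and including the last non-digit character of the device path.

```shell
# part_number DEVICE: print the trailing number of a device path.
# Hypothetical helper; applies to device files only, not to PARTUUID paths.
part_number() {
    printf '%s\n' "$1" | sed 's/^.*[^0-9]//'
}
# Usage: part_number /dev/nvme0n1p2
```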
Partitions as PARTUUIDs
For PARTUUID we need to get that mapping from the grub.cfg:
grep 'mender_rootfsa_part=\|mender_rootfsa_uuid=\|mender_rootfsb_part=\|mender_rootfsb_uuid=' /boot/efi/grub-mender-grubenv/grub.cfg
# Output: # Comment for clarification
# mender_rootfsa_part=2
# mender_rootfsb_part=3
# mender_rootfsa_uuid=bdcae16f-400a-45e3-b5bb-c9512d3f56c1 -> Partition A number: 2 (because rootfsa is 2)
# mender_rootfsb_uuid=bdcae16f-400a-45e3-b5bb-c9512d3f56c2 -> Partition B number: 3 (because rootfsb is 3)
It doesn't matter which partition happens to be active in your case. If it's reversed from the example, just adjust the numbers.
We identified edge cases in certain U-Boot board integrations which led to the introduction of the mender_boot_part_hex variable.
To make the verification steps generally applicable, we change both variables in the steps even though they aren't both used in all cases.
We will confirm the bootloader can read the environment and is behaving correctly by manually switching to the inactive partition.
In the previous step, we identified the currently running partition to be nvme0n1p2 and the inactive one to be nvme0n1p3.
Set the bootloader variables manually so it boots from the currently inactive partition on the next reboot.
grub-mender-grubenv-set mender_boot_part 3
grub-mender-grubenv-set mender_boot_part_hex 3
reboot
After the device boots up, verify that you are indeed running on the expected partition:
mount | grep 'on / '
# Output:
#/dev/nvme0n1p3 on / type ext4 ... ...
After completion, return to the previously active partition by adapting the previous steps:
grub-mender-grubenv-set mender_boot_part 2
grub-mender-grubenv-set mender_boot_part_hex 2
reboot
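In the examples above both variables receive the same value because the partition numbers are below 10. The sketch below prints the two commands for an arbitrary partition number, under the assumption (based on the variable name, not confirmed by this page) that mender_boot_part_hex holds the same number in hexadecimal. boot_part_commands is a hypothetical helper name, and it only prints the commands instead of running them.

```shell
# boot_part_commands PART: print the commands that select partition PART.
# Assumption: mender_boot_part_hex carries the same number in hexadecimal.
# Hypothetical helper; it prints the commands and does not execute them.
boot_part_commands() {
    printf 'grub-mender-grubenv-set mender_boot_part %d\n' "$1"
    printf 'grub-mender-grubenv-set mender_boot_part_hex %x\n' "$1"
}
# Usage: boot_part_commands 3
```

Printing instead of executing lets you review the commands before changing the bootloader environment.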
In the Mender state machine workflow the transitional state for the bootloader starts with ArtifactReboot and ends with either ArtifactCommit or ArtifactFailure.
During this verification process, it is expected that the bootloader's environment variables will experience changes - either by the rootfs-image update module or the bootloader itself - and the bootloader is expected to enact conditional logic. For comparison, in the non-transition state, the bootloader's variables remain unchanged.
To notify the bootloader about the switch to the transitional state, we will set the following variables:
grub-mender-grubenv-set upgrade_available 1
grub-mender-grubenv-set bootcount 0
Setting upgrade_available to 1 has multiple side effects:
- The bootloader increases bootcount by 1 for every new boot attempt.
- Setting upgrade_available back to 0 marks the end of the transition state.
With the variables explained above, test a full active partition switch including a reboot:
# In the normal update process, at this point the Mender Client just
# concluded streaming the new version to the inactive partition
# We are currently running the active partition
# Identify active partition
mount | grep 'on / '
# Output:
#/dev/nvme0n1p2 on / type ext4 ... ...
# Start the transition state
grub-mender-grubenv-set upgrade_available 1
grub-mender-grubenv-set bootcount 0
# Switch to the inactive partition
grub-mender-grubenv-set mender_boot_part 3
grub-mender-grubenv-set mender_boot_part_hex 3
reboot
After the reboot the partition changed and the bootcount increased:
# Identify the active partition
mount | grep 'on / '
# Output:
#/dev/nvme0n1p3 on / type ext4 ... ...
grub-mender-grubenv-print bootcount upgrade_available
# Output:
# bootcount=1
# upgrade_available=1
This is as expected. We can now conclude the transition state and confirm that the variables remain unchanged:
grub-mender-grubenv-set upgrade_available 0
grub-mender-grubenv-set bootcount 0
# This is now the stable state which must remain the same after reboots
grub-mender-grubenv-print
# Output:
# bootcount=0
# mender_boot_part=3
# upgrade_available=0
# mender_boot_part_hex=3
reboot
grub-mender-grubenv-print
# Output:
# bootcount=0
# mender_boot_part=3
# upgrade_available=0
# mender_boot_part_hex=3
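The stable-state condition shown above can be checked mechanically. This is a minimal sketch, assuming the name=value output format shown in the examples; env_is_stable is a hypothetical helper name.

```shell
# env_is_stable: read "name=value" lines on stdin and succeed only if
# both bootcount and upgrade_available are 0 (hypothetical helper name).
env_is_stable() {
    vars=$(cat)
    printf '%s\n' "$vars" | grep -qx 'bootcount=0' &&
        printf '%s\n' "$vars" | grep -qx 'upgrade_available=0'
}
# Usage on the device: grub-mender-grubenv-print | env_is_stable && echo stable
```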
To verify the rollback, start the process exactly as in the successful case.
# In the normal update process, at this point the Mender Client just
# concluded streaming the new version to the inactive partition
# We are currently running the active partition
# Identify the active partition
mount | grep 'on / '
# Output:
#/dev/nvme0n1p3 on / type ext4 ... ...
# Start the transition state
grub-mender-grubenv-set upgrade_available 1
grub-mender-grubenv-set bootcount 0
# Switch to the inactive partition
grub-mender-grubenv-set mender_boot_part 2
grub-mender-grubenv-set mender_boot_part_hex 2
To trigger the rollback mechanism, rename the kernel on the inactive partition to break the boot process. This is equivalent to a use case where the new update contains a faulty kernel.
mkdir /mnt/inactive-partition
mount /dev/nvme0n1p2 /mnt/inactive-partition
mv /mnt/inactive-partition/boot/vmlinuz-5.13.0-35-generic /mnt/inactive-partition/boot/vmlinuz-5.13.0-35-generic.backup
umount /mnt/inactive-partition
reboot
The device will attempt to boot; however, it will fail and trigger a rollback to booting from the previously working partition. The bootloader will automatically conclude the transition state in that case:
grub-mender-grubenv-print upgrade_available
# Output:
# upgrade_available=0
And the active partition is still the one we started with.
# Identify the active partition
mount | grep 'on / '
# Output:
#/dev/nvme0n1p3 on / type ext4 ... ...
You can now restore the kernel by renaming it back to its original name:
mkdir -p /mnt/inactive-partition
mount /dev/nvme0n1p2 /mnt/inactive-partition
mv /mnt/inactive-partition/boot/vmlinuz-5.13.0-35-generic.backup /mnt/inactive-partition/boot/vmlinuz-5.13.0-35-generic
umount /mnt/inactive-partition
Please note: the rollback is triggered by an unplanned second reboot that takes place during the transition period. It is not a result of the bootloader detecting a faulty kernel in any way.
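The conditional logic described in this section can be illustrated with a simplified model of a single boot attempt. This is a sketch written from the behavior described on this page, not a copy of the actual bootloader script; simulate_boot and its argument order are hypothetical.

```shell
# Simplified model of one boot attempt; NOT the actual bootloader script.
# simulate_boot is a hypothetical name. Arguments:
#   $1 upgrade_available   $2 bootcount   $3 mender_boot_part
#   $4 partition A number  $5 partition B number
# Prints the resulting "upgrade_available bootcount mender_boot_part".
simulate_boot() {
    upgrade_available=$1
    bootcount=$2
    boot_part=$3
    part_a=$4
    part_b=$5
    if [ "$upgrade_available" = "1" ]; then
        if [ "$bootcount" != "0" ]; then
            # Second boot attempt during the transition: roll back
            # to the other partition and end the transition state.
            if [ "$boot_part" = "$part_a" ]; then
                boot_part=$part_b
            else
                boot_part=$part_a
            fi
            upgrade_available=0
            bootcount=0
        else
            # First boot attempt of the newly selected partition.
            bootcount=1
        fi
    fi
    echo "$upgrade_available $bootcount $boot_part"
}
# Usage: simulate_boot 1 0 2 2 3
```

In this model, the first attempt with upgrade_available set to 1 only raises bootcount. A second attempt while bootcount is still non-zero switches back to the other partition, which matches the rollback observed above.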
If you have successfully followed and verified all the steps, you can confirm that your device has the appropriate partitioning and bootloader integration for complete rootfs updates using Mender.
© 2024 Northern.tech AS