The new Log Insight 8.0 release brings some cool new things: the OS of the whole appliance is now Photon OS 3.0 instead of SUSE (in-place update possible!), the Windows agent is being open-sourced, and there are various improvements to content packs. To read more, you can check out the release notes or the official blog post.
However, when I went ahead and upgraded my Log Insight instance, the upgrade process ran into a timeout. Neither the Web UI nor SSH was reachable anymore. The whole node was dead network-wise, while the appliance itself ran just fine. Sherlock’s conclusion: Something went wrong, obviously.
IMPORTANT: BEFORE you proceed, please note that this is NOT officially supported by VMware! Do this at your own risk. Make sure you have a backup. If unsure, reach out to support first.
Also important to note: I have NOT tested this on a Log Insight setup with an active cluster of multiple nodes.
A colleague suggested checking the network configuration on the appliance. Turns out this was actually a very good tip. As you might spot on the screenshot below, the gateway address is wrong. This appears to be a misbehavior of the upgrade process, which converts the configuration files while the whole operating system is replaced underneath.
Getting it alive
The fix is actually pretty easy: Edit the network configuration file of eth0, correct the gateway and reboot.
- Back up first. Taking a snapshot is the easiest way; otherwise make a copy of the 10-eth0.network file.
- Edit the eth0 network config file:
- Press i to switch to INSERT mode
- Fix the gateway address
- Hit ESC, then enter “:wq” in the bottom command line to save changes (US keyboard layout!)
- Reboot the node using:
- After a few minutes the node should come back online. If you made a snapshot, you can remove it once you have verified that everything is working as expected (snapshots should NOT exist longer than 72 hours!).
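The steps above can also be sketched as a small shell helper. This is a minimal sketch and not a supported procedure: it assumes the file lives at /etc/systemd/network/10-eth0.network (the usual systemd-networkd location on Photon OS) and that the gateway is set on a line starting with `Gateway=`. The function name `fix_gateway` and the address 192.0.2.1 are placeholders made up for illustration, and `sed` stands in for the manual vi edit.

```shell
#!/bin/sh
# Assumption: default systemd-networkd path for eth0 on Photon OS.
NETWORK_FILE="${NETWORK_FILE:-/etc/systemd/network/10-eth0.network}"

# fix_gateway FILE NEW_GATEWAY (hypothetical helper)
# Copies FILE to FILE.bak, then rewrites every "Gateway=" line in FILE
# with NEW_GATEWAY. All other lines are left untouched.
fix_gateway() {
    file="$1"
    new_gw="$2"
    # Keep a copy of the original so the change can be rolled back.
    cp -p "$file" "$file.bak" || return 1
    # Read from the backup and write the corrected content to the original.
    sed "s|^Gateway=.*|Gateway=$new_gw|" "$file.bak" > "$file"
}

# Example usage on the appliance console (192.0.2.1 is a placeholder):
#   grep '^Gateway=' "$NETWORK_FILE"        # inspect the current value
#   fix_gateway "$NETWORK_FILE" 192.0.2.1   # write the correct gateway
#   reboot                                  # restart the node
```

If the result looks wrong, copying the `.bak` file back over the original restores the previous state.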