Log Insight: Node unreachable after upgrade from 4.8 to 8.0

This post was published 5 years 1 month 13 days ago, so the post may be outdated.

The new Log Insight 8.0 release brings some cool new things: the OS of the whole appliance is now Photon OS 3.0 instead of SUSE (in-place upgrade possible!), the Windows agent is being open-sourced, and there are various improvements to content packs. To read more, check out the release notes or the official blog post.

However, when I went ahead and upgraded my Log Insight instance, the upgrade process ran into a timeout. Neither the Web UI nor SSH was reachable anymore. The whole node was dead network-wise, while the appliance itself ran just fine. Sherlock’s conclusion: Something went wrong, obviously.

IMPORTANT: BEFORE you proceed, please note that this is NOT officially supported by VMware! Do this at your own risk. Make sure you have a backup. If unsure, reach out to support first.

Also important to note: I have NOT tested this on a Log Insight setup with an active cluster of multiple nodes.

The issue

A colleague suggested checking the network configuration on the appliance, which turned out to be a very good tip. As you might spot in the screenshot below, the gateway address is wrong. This appears to be a misbehavior of the upgrade process, which converts the configuration files while the whole operating system is replaced underneath.
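Since SSH is down, a check like this has to run from the VM console. The following is a minimal sketch, not something from the post or the KB: the helper names `configured_gateway` and `active_gateway` are made up for illustration, and it assumes the gateway is stored as a `Gateway=` line in the systemd-networkd file named later in this post.

```shell
#!/bin/sh
# Path taken from the post; pass another file as the first argument if needed.
NETWORK_FILE="/etc/systemd/network/10-eth0.network"

# Hypothetical helper: print the value of every Gateway= line in a
# systemd-networkd .network file.
configured_gateway() {
    grep -E '^[[:space:]]*Gateway[[:space:]]*=' "${1:-$NETWORK_FILE}" \
        | sed -E 's/^[^=]*=[[:space:]]*//'
}

# Hypothetical helper: print the default route the kernel is currently
# using (the output may be empty if no default route is set).
active_gateway() {
    ip route show default
}
```

Running `configured_gateway` and `active_gateway` on the console and comparing both against the gateway you expect for that network should show whether the upgrade changed the address.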


Getting it alive

The fix is actually pretty easy: edit the network configuration file of eth0, correct the gateway, and reboot.

  1. Back up first. A snapshot is the easiest option; otherwise make a copy of the 10-eth0.network file.
  2. Edit the eth0 network config file:
    vi /etc/systemd/network/10-eth0.network
  3. Press i to switch to INSERT mode
  4. Fix the gateway address
  5. Hit ESC, then type “:wq” on the command line at the bottom and press Enter to save the changes (US keyboard layout!)
  6. Reboot the node using:
    reboot
  7. After a few minutes the node should come back online. If you made a snapshot, you can remove it once you have verified that everything is working as expected (snapshots should NOT exist longer than 72 hours!).
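If you would rather not edit the file in vi, the backup and edit steps above could be scripted roughly as follows. This is only a sketch under assumptions: `set_gateway` is a hypothetical helper, not part of the post or the KB article, it assumes the gateway is a single-value `Gateway=` line, and `192.0.2.1` in the usage note is a placeholder address.

```shell
#!/bin/sh
# Hypothetical helper: set the Gateway= value in a systemd-networkd
# .network file after keeping a backup copy next to it.
set_gateway() {
    file="$1"
    new_gateway="$2"
    if [ -z "$file" ] || [ -z "$new_gateway" ]; then
        echo "usage: set_gateway FILE NEW_GATEWAY" >&2
        return 1
    fi
    # Keep a copy first. systemd-networkd only reads files ending in
    # .network, so the .bak copy is not picked up as configuration.
    cp -p "$file" "$file.bak" || return 1
    # Rewrite every Gateway= line, reading from the backup and writing
    # to the original file; all other lines are copied unchanged.
    sed -E "s|^([[:space:]]*Gateway[[:space:]]*=).*|\1${new_gateway}|" \
        "$file.bak" > "$file"
}
```

A call would look like `set_gateway /etc/systemd/network/10-eth0.network 192.0.2.1`, followed by the same `reboot` as in step 6. Check the file afterwards before rebooting, since the sketch does not validate the address you pass in.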

Update on November 19, 2019: There is now also an official KB article, #76067, covering this particular issue.

Patrik Kernstock

May I introduce myself? I am Patrik Kernstock, 25 years old, perfectionist, born in Austria and living in Cork, Ireland. Me explained in short: tech and security enthusiast, series and movie junkie. Interested in Linux, container stuff and many software solutions by Microsoft, Veeam and VMware.


3 Comments
Otto Jackson

Thank you for this fix. So simple yet helpful!

Kingtut1010

Thank you so much ! This just helped me out.
