Automate certificate renewal via Let’s Encrypt on Avi/NSX ALB
This time I want to introduce the ability to request the famous Let’s Encrypt certificates on your NSX Advanced Load Balancer (also known as Avi Vantage) using the comprehensive on-board functionality it offers.
I have been improving the script, adding new features and testing it for quite some time now; it was reviewed and merged by Engineering on GitHub on 31 August 2021. One major change was support for the "Virtual Hosting" feature with Let’s Encrypt, along with the option to specify multiple domains for the certificate (SANs).
Support for the script: If you have any trouble or questions regarding the script, I’d recommend raising a GitHub issue. Issues are easier to keep track of and more people might be able to assist.
In the Controller you can manage all your SSL/TLS certificates at one central point. All those certificates can be used across your virtual services, running on different Service Engines. Obviously, before a certificate expires it should be renewed and replaced.
Certificates from the free, well-known public certificate authority Let’s Encrypt are only valid for 90 days, and Let’s Encrypt recommends renewing them about 30 days before they expire. Performing this step manually every couple of months is probably not the most exciting work.
This is where Certificate Management and the "ControlScript" in NSX ALB join the party. By default, this feature initiates a renewal 7 days before the certificate expires. In other words: right before the penultimate certificate expiry notification as configured. (For more information, see Avi’s documentation on "Customizing Notification of Certificate Expiration".)
With this feature, issuing Let’s Encrypt certificates for Virtual Services can be completely automated.
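To make the timing concrete, here is a small sketch of the arithmetic described above. The 90-day lifetime and the 7-day lead time come from the text; the function name renewal_date and the example issue date are only illustrative and not part of NSX ALB.

```python
from datetime import date, timedelta

# Values taken from the text above.
LETS_ENCRYPT_VALIDITY_DAYS = 90   # lifetime of a Let's Encrypt certificate
RENEW_BEFORE_EXPIRY_DAYS = 7      # default lead time of the automatic renewal


def renewal_date(issued_on: date,
                 validity_days: int = LETS_ENCRYPT_VALIDITY_DAYS,
                 renew_before_days: int = RENEW_BEFORE_EXPIRY_DAYS) -> date:
    """Return the day on which the automatic renewal would be triggered."""
    expiry = issued_on + timedelta(days=validity_days)
    return expiry - timedelta(days=renew_before_days)


if __name__ == "__main__":
    # Illustrative issue date only.
    print(renewal_date(date(2021, 8, 31)))
```

With the defaults, a certificate is therefore in use for roughly 83 days before its replacement is requested.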
Takeaway: To debug the certificate renewal it’s really handy to manually trigger the renewal via CLI. For more details, you can give this QuickTip a read.
To set up everything you need, follow the steps below in your Avi Controller’s web interface.
You need to do this once per tenant.
Note: Starting with 21.1 (Release Notes), which is what I use, the web interface has been rebuilt using VMware’s Clarity framework, so previous versions look different. Apart from the visual differences, the steps should be pretty much the same.
Step 1: Check the requirements/disclaimer
- You need a public domain which is also publicly reachable.
- You need a Virtual Service with an Application Profile of type HTTP (so Layer 7). Issuing certificates to virtual services of type L4 does not work, as the script requires HTTP Policy Sets to complete the validation.
Some more notes:
- You should be aware that Let’s Encrypt applies rate limiting based on IP address. If issuing certificates fails, you might want to enable dry_run, investigate the cause and keep retries to a minimum so you do not reach the limit.
- Please read carefully through this guide. Once you have done it a few times, it’s way less complicated than it looks.
- This is a community-supported script, so no official support is provided. If there are issues, please consider raising a GitHub issue.
Step 2: Create ControlScript
- Go to Templates - Scripts - ControlScripts and hit Create. A dialog will open.
- Open the Let’s Encrypt renewal script in a separate tab: raw.githubusercontent.com/avinetworks/devops/master/cert_mgmt/letsencrypt_mgmt_profile.py. Copy its content to your clipboard.
- Pick a name like request_letsencrypt_certificate.
- Paste the script we copied a moment ago into the large text area below.
- Click Save to return to the previous dialog.
Step 3: Create "Certificate Management"
- Go to Templates - Security - Certificate Management and hit Create. A dialog will open.
- Name: pick something like use_letsencrypt (only alphanumeric, underscore, period or hyphen characters are allowed).
- Control Script: select the script named request_letsencrypt_certificate, which we created in the previous step.
- Enable Custom Script Parameters. At a minimum, you need to add the following two values (check the additional parameters in the next step!):
- user for the username used for the API calls. Can be a custom user (recommended), or some admin account.
- password for the password, marked as
- This account needs permissions to manage and change SSL/TLS Certificates via API/WebUI.
- In addition to the parameters above, there are more options you can (and might need to) define. I recommend setting most of these as Dynamic, so they can be changed individually per certificate.
- tenant contains the name of the tenant to be used. If not specified, admin will be used.
- dry_run defaults to False, in which case the production servers are used. If set to True, the staging/test servers of Let’s Encrypt are used, which have different rate-limiting settings.
- disable_check determines whether the token is validated from the Avi Controller first. See the appendix section further below for more details. Usually no change is needed.
- debug defaults to False. If set to True, more debug messages are printed.
- contact can be an e-mail address provided to Let’s Encrypt. Certificate expiry warnings will be sent to this address.
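For reference, the custom script parameters from this step can also be written down as plain data. The following is only a sketch for keeping track of them, not something the Controller consumes: build_params is a hypothetical helper, and the names dry_run and debug as well as the disable_check default are assumptions taken from the script's log output ("dry_run is: False", "disable_check is: False", "Debug enabled.").

```python
from typing import Any, Dict

# Optional parameters and their defaults as described in this step.
# "dry_run", "debug" and the disable_check default are assumed from the
# script's log output, not from an official parameter reference.
DEFAULTS: Dict[str, Any] = {
    "tenant": "admin",       # tenant to be used
    "dry_run": False,        # True = Let's Encrypt staging/test servers
    "disable_check": False,  # True = skip the local token validation
    "debug": False,          # True = print more debug messages
    "contact": None,         # e-mail address for expiry warnings
}


def build_params(user: str, password: str, **overrides: Any) -> Dict[str, Any]:
    """Combine the two required values with the optional ones (hypothetical helper)."""
    if not user or not password:
        raise ValueError("user and password are required")
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    params: Dict[str, Any] = {"user": user, "password": password, **DEFAULTS}
    params.update(overrides)
    return params


if __name__ == "__main__":
    # Placeholder credentials; a first test run against the staging servers.
    print(build_params("letsencrypt-api-user", "change-me", dry_run=True))
```

Starting with dry_run enabled and switching it off once issuing works keeps the number of requests against the production servers low.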
Step 4: Request a certificate
- Go to Templates - Security - SSL/TLS Certificates, click on Create - Application Certificate.
- As a Name you can pick something like
- As the
- As the Certificate Management Profile we pick use_letsencrypt, which we created in the previous step. (Make changes to Dynamic Parameters, if defined and required.)
- Common Name: pick the FQDN the certificate should be issued to. For example:
- Key Size as required. We can go for
- Add any Subject Alternate Name (SAN), if required. These are additional domain names which should be included in the certificate.
- When saved, the script runs in the background and issues your certificate. This might take a few seconds. If it fails, you will see the output of the script with more details. (Note: script output is only displayed starting with 20.1.6.)
If the script succeeds, you will see the newly issued certificate in the list.
Now your certificate is ready to use. Happy certificiating… or so…
I don’t see an error!
If you don’t see any additional errors (e.g. when using versions older than 20.1.6), you can find more logs in the log file /var/lib/avi/log/portal_exception.log on your Avi Controller. To check this log file, log in to your controller via SSH and use less -i /var/lib/avi/log/portal_exception.log or tail -f /var/lib/avi/log/portal_exception.log.
Additionally you might want to set the custom parameter
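If the log file is large, filtering it can be quicker than scrolling. Below is a minimal sketch of such a filter; last_matching_lines is a hypothetical helper, and the sample lines are made up for illustration and are not real Controller output.

```python
from collections import deque
from typing import Iterable, List


def last_matching_lines(lines: Iterable[str], needle: str, limit: int = 20) -> List[str]:
    """Return the last `limit` lines containing `needle`, ignoring case."""
    needle = needle.lower()
    matches: deque = deque(maxlen=limit)  # keeps only the newest matches
    for line in lines:
        if needle in line.lower():
            matches.append(line.rstrip("\n"))
    return list(matches)


if __name__ == "__main__":
    # Made-up sample lines. On the Controller you could pass an open file
    # handle for /var/lib/avi/log/portal_exception.log instead.
    sample = [
        "something unrelated\n",
        "Error from certificate management service: example one\n",
        "another unrelated line\n",
        "Error from certificate management service: example two\n",
    ]
    for entry in last_matching_lines(sample, "certificate management", limit=1):
        print(entry)
```

Because the function accepts any iterable of lines, it works with a file object just as well as with a list.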
How to use RSA and ECDSA?
To issue both an RSA and an ECDSA certificate, you simply create two SSL/TLS Certificates entries and choose the Algorithm accordingly. You can then define both certificates on your
Error: "All 5 internal token verifications failed."
(This also provides more details about the parameter disable_check.)
As described earlier in Step 3, the token verification can be disabled by setting the parameter disable_check to True.
To understand this parameter further, I’ll need to briefly explain the token validation of Let’s Encrypt:
- The ACME standard, which Let’s Encrypt originated, is used to automatically issue certificates and prove domain ownership. Of course, no one wants other (or malicious) people to obtain valid certificates for their domains.
- At first, we tell Let’s Encrypt (or the ACME server in general) for which domain we want a certificate issued. It gives us back a token that Let’s Encrypt expects to see at a certain URL as proof of ownership.
- The script then sets an HTTP Policy on the corresponding Virtual Service to return a specific token string at the URL
- Before the script tells Let’s Encrypt to verify the token, the Avi Controller (the script, to be precise) makes an HTTP call to the above URL and validates the token locally. This keeps us from getting rate-limited too quickly in case validation fails.
- If the validation succeeds, we inform Let’s Encrypt that the token can be verified. On success, the certificate is issued and handed over to the Avi Controller to process. Issuing complete.
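The steps above can be sketched in a few lines. This is not the script's actual code, only an illustration under assumptions: the challenge path and the key authorization format follow the ACME HTTP-01 challenge as specified in RFC 8555 (with the account key thumbprint from RFC 7638), the limit of 5 attempts is inferred from the error message in the heading, and all function names as well as the injected fetch callable are hypothetical.

```python
import base64
import hashlib
import json
from typing import Callable, Dict


def challenge_url(domain: str, token: str) -> str:
    """URL at which an ACME server looks for the HTTP-01 response (RFC 8555)."""
    return f"http://{domain}/.well-known/acme-challenge/{token}"


def jwk_thumbprint(jwk: Dict[str, str]) -> str:
    """RFC 7638 thumbprint of an RSA account key given as a JWK."""
    required = {name: jwk[name] for name in ("e", "kty", "n")}
    canonical = json.dumps(required, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def key_authorization(token: str, jwk: Dict[str, str]) -> str:
    """The string that has to be served at the challenge URL."""
    return f"{token}.{jwk_thumbprint(jwk)}"


def verify_locally(fetch: Callable[[str], str], domain: str, token: str,
                   expected: str, attempts: int = 5) -> bool:
    """Fetch the challenge URL up to `attempts` times before involving the CA."""
    url = challenge_url(domain, token)
    for _ in range(attempts):
        try:
            if fetch(url).strip() == expected:
                return True
        except OSError:
            pass  # network errors simply count as a failed attempt
    return False


if __name__ == "__main__":
    # Dummy key material and a fake fetcher stand in for real HTTP requests.
    jwk = {"kty": "RSA", "e": "AQAB", "n": "dummy-modulus"}
    expected = key_authorization("example-token", jwk)
    served = {challenge_url("our-domain-on-avi.tld", "example-token"): expected}
    print(verify_locally(lambda url: served[url], "our-domain-on-avi.tld",
                         "example-token", expected))
```

Passing the fetcher in as a parameter keeps the sketch runnable without network access; in practice it would be an HTTP GET against the Virtual Service.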
In some setups you might use split-horizon DNS:
- our-domain-on-avi.tld inside your network points to an internal server, directly to the web server.
- our-domain-on-avi.tld from outside your network points to the NSX ALB/Avi Load Balancer.
As the Avi Controller validates the token within the local network, the request never goes through the Load Balancer and therefore never hits the HTTP Policy set by the script, which causes the local validation to fail. By setting disable_check to True we simply bypass this check.
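Whether a setup is affected can be reasoned about by comparing what the Controller and the public internet resolve for the same name. The sketch below assumes you have collected both sets of addresses yourself; should_disable_check is a hypothetical helper and the addresses are documentation examples.

```python
from typing import Iterable


def should_disable_check(controller_view: Iterable[str],
                         public_view: Iterable[str]) -> bool:
    """True if the Controller resolves the name to none of the public addresses.

    In that case the local validation request would bypass the Load Balancer,
    so the local check cannot succeed and disable_check would be needed.
    """
    return not (set(controller_view) & set(public_view))


if __name__ == "__main__":
    # Split-horizon DNS: internal web server vs. public Load Balancer VIP.
    print(should_disable_check(["10.0.0.5"], ["203.0.113.10"]))
    # Same address inside and outside: the local check can stay enabled.
    print(should_disable_check(["203.0.113.10"], ["203.0.113.10"]))
```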
I hope this was useful for some Avi/NSX ALB fans out there!
- 2021-12-27: Added note regarding support on GitHub repo.
Great Article Patrik. I am gonna try this out in my lab soon.
Wondering what config changes you made with ALB to get a dark theme?
I’m using “Dark Reader” (browser extension) for this. Unfortunately there’s no native dark mode in the Avi Controller WebUI.
hi buddy, great script. I am with a problem and I cannot find the solution. I have no programming skills and I am just learning about avi. I share the error I have to see if you can help me: Error from certificate management service: Could not find a VS with fqdn = abc.labs.com.ar. STDOUT - 'Running version 0.9.0 Debug enabled. dry_run is: False disable_check is: False directory_url is https://acme-v02.api.letsencrypt.org/directory Reusing account key. Parsing account key ... Parsing CSR ... Found domains: abc.labs.com.ar Getting directory ... Directory found! Registering account ... Already registered! Creating new order ... Order created! Authorization…
I have not looked in the logs to get the verbose error rapt0r has posted, however, I am getting the message “Error from certificate management service: Could not find a VS with fqdn = domainnamehere” so I think I have the similar issue. We have a glsb and two SE’s creating two vs objects for the site. dns of domainnamehere does resolve to the ip of the vs. I have also tried the disable_check True with same results.
Hi you both, apologies for the delayed response.
Would you mind please trying the suggested fix manually: https://github.com/avinetworks/devops/pull/246. Does this help?
If not, following new PR might help as this allows manually overwriting the VS to be used: https://github.com/avinetworks/devops/pull/249
For additional questions, I’d recommend raising a GitHub issue in the repo there as it’s better to keep track of and more people might see it.