February 27, 2024

NGINX Performance Tuning: Top 5 Tips

Web Infrastructure

NGINX performance tuning is an important exercise for keeping your website fast and highly available. If you are using NGINX as a web server or reverse proxy, for load balancing, and/or for HTTP caching, read this blog for tips on how to tune NGINX to keep it fast and performant.

Why NGINX Performance Tuning Matters

In general, performance tuning is important to maximize the return on investment from a business asset and maintain high levels of availability of the service. For NGINX, tuning will help your site meet and exceed performance benchmarks for speed and latency. 

It is important to remember that NGINX performance tuning is not a one-time exercise; as website or application loads may change from your initial estimates or planning, it may be necessary to reevaluate and make adjustments to maximize performance. 

Before You Tune NGINX

Before you tune any application, you must understand the current baselines for performance and application use case. 

You can start by profiling system resources and looking at the types of content returned. Browser developer tools, system performance tools like dstat (piped to a CSV file that can be visualized later), and load generators like Apache Bench can all be helpful here, as can reviewing the NGINX logs.

One benefit of profiling is the ability to establish key performance and risk indicators. You can see at what level of requests the application is likely to underperform and break SLAs.

It is recommended to use multiple NGINX instances to test tuning results on different hardware (more CPUs, more RAM and disk for caching) and different versions of NGINX. 

Be aware that while SELinux and AppArmor are recommended for hardening, they can add performance overhead. This is something to check during performance testing.

Determine whether your clients will use the HTTP/1.1, HTTP/2, or HTTP/3 protocol. Depending on the type of client application, using HTTP/2 or even HTTP/3 will cut down on round trips and thus improve latency.
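
If you enable the newer protocols, here is a minimal sketch for recent NGINX versions (http2 as a standalone directive requires 1.25.1 or later, and HTTP/3 requires a build with QUIC support; the certificate directives are omitted here for brevity):

server {
    listen 443 ssl;
    listen 443 quic reuseport;                   # HTTP/3 over QUIC
    http2 on;
    http3 on;
    add_header Alt-Svc 'h3=":443"; ma=86400';    # advertise HTTP/3 support to clients
}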

Tip #1: Modify NGINX Worker Processes

By default, most NGINX packages ship with worker_processes auto;, which creates one worker process per hardware CPU core. This allows each worker to be pinned to a CPU core and minimizes context switching during low-level connection management in the Linux kernel.

However, the following configuration directive can be used to tweak the number of worker processes:

worker_processes: Defines the number of worker processes.

The ideal ratio of CPUs to worker processes takes advantage of the way connections and events are fetched from the kernel network stack queues. That said, for some workloads, and on modern processors with very high per-core throughput, the number of worker processes can usefully exceed the number of CPU cores.

In other words, a larger number of worker processes can be a net benefit despite the penalty of context switching between active processes on the CPU. The only way to determine this is benchmarking, but as a rule of thumb: the smaller the traffic per request and the shorter the connection time, the smaller the impact of context switching becomes.
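
As a starting point, here is a minimal sketch of the relevant directives in the main context of nginx.conf (the explicit count and the file descriptor limit are illustrative assumptions to be validated by benchmarking):

worker_processes auto;        # one worker per CPU core
# worker_processes 16;        # or set an explicit count after benchmarking
worker_rlimit_nofile 65535;   # raise the per-worker open file limit for high connection counts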

Tip #2: Manage NGINX Worker Connections

How many connections per worker process (and CPU core) to use should be determined based on content and workload testing.

The following configuration directive sets the number of connections per worker process; the default value is 512:

worker_connections: Sets the maximum number of simultaneous connections that can be opened by a worker process.

Keep in mind that TLS processing increases CPU load, and large numbers of connections using mutual TLS are more likely to saturate a single core. Similarly, worker process CPU usage will increase when content compression is in use. To make sure connections are closed only after all the requested data has been received, keep the lingering_close directive at its default (on).
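
A minimal sketch of the events block (the 4096 value is an illustrative assumption; derive the real number from workload testing):

events {
    worker_connections 4096;   # per worker process; the default is 512
    multi_accept on;           # accept as many new connections as possible at once
}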

Tip #3: Configure NGINX as a Load Balancer

Using NGINX as a frontend load balancer allows efficient load distribution among a group of backend servers. This ensures high availability by sending requests only to backend servers found to be healthy.

The connections are distributed in a weighted round-robin fashion by default. For example, more connections can be routed to a server with faster hardware by assigning it a higher weight. Alternatively, requests can be routed to the backend server with the fewest active connections using the least_conn method.

The open source version of NGINX uses passive health checks to monitor whether a backend service is up and ready to receive connections. With passive health checks, NGINX monitors connections as they happen and retries failed requests on another server. If connections to a backend keep failing, that backend is marked unavailable.

An upstream service is marked unavailable based on the following configurable settings:

  • fail_timeout: Sets the time during which a number of failed attempts must happen for the server to be marked unavailable, and also the time for which the server is marked unavailable (default is 10 seconds).
  • max_fails: Sets the number of failed attempts that must occur during the fail_timeout period for the server to be marked unavailable (default is 1 attempt).

The load balancing is performed using the ngx_http_upstream_module and is configured with its directives. 
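
A minimal sketch of a weighted upstream using the passive health check settings above (hostnames, weights, and thresholds are illustrative assumptions):

upstream backend {
    server app1.example.com weight=3;                        # faster hardware receives more connections
    server app2.example.com max_fails=3 fail_timeout=30s;    # stricter passive health check
    # least_conn;   # uncomment to route to the server with the fewest active connections
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
    }
}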

The next issue to solve in a multi-backend setup is session persistence. In NGINX Plus this is achieved with the sticky directive; with open source NGINX, the ip_hash or hash directives provide session affinity, as sketched below.
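
A sketch of session affinity with ip_hash, which keeps requests from the same client IP on the same backend (hostnames are illustrative assumptions):

upstream backend {
    ip_hash;    # hash on the client address for session persistence
    server app1.example.com;
    server app2.example.com;
}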

Tip #4: Cache and Compress Static Content

Caching can be categorized as server-side and client-side. The fewer round trips needed to get data, the faster it can be loaded or reused.

proxy_cache: Defines the shared memory zone to use for proxy caching. Directives in the same module, such as proxy_cache_path and proxy_cache_key, define the on-disk cache location and the key used to look up cached responses.
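
A minimal proxy caching sketch (the paths, zone name, sizes, and validity times are illustrative assumptions):

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=app_cache:10m max_size=1g inactive=60m;

server {
    location / {
        proxy_cache app_cache;
        proxy_cache_key $scheme$proxy_host$request_uri;
        proxy_cache_valid 200 302 10m;    # cache successful responses for ten minutes
        proxy_pass http://backend;
    }
}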

FastCGI Caching: Micro caching is most useful for dynamic content that does not change very often, like calendar data, RSS feeds, daily statistics, status pages, etc.
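
A microcaching sketch for a FastCGI backend; even a one-second cache can absorb traffic spikes on semi-dynamic pages (the socket path, zone details, and validity are illustrative assumptions):

fastcgi_cache_path /var/cache/nginx/fcgi keys_zone=microcache:10m max_size=100m;

server {
    location ~ \.php$ {
        fastcgi_cache microcache;
        fastcgi_cache_key $scheme$request_method$host$request_uri;
        fastcgi_cache_valid 200 1s;    # micro cache: hold dynamic responses for one second
        fastcgi_pass unix:/run/php-fpm.sock;
        include fastcgi_params;
    }
}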

Client-Side Cache: Web servers have only partial control over client-side caching, but they can send HTTP response headers to the HTTP client with caching recommendations. Common Cache-Control header values:

“Cache-Control: public” – Allows the resource to be cached by the HTTP client and any intermediate proxy. This is the option I recommend.

“Cache-Control: private” – Only cache on the HTTP client side, not on any intermediate proxy.

“max-age” – Value in seconds indicating how long the resource may be cached.

Example: Cache-Control: public, max-age=864000
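
In NGINX, these headers can be set with the add_header directive; here is a sketch for static assets (the location prefix and lifetime are illustrative assumptions):

location /static/ {
    add_header Cache-Control "public, max-age=864000";    # cacheable by clients and proxies for 10 days
}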

Expires – Sets a date after which a cached resource should no longer be considered valid. In NGINX, the expires directive accepts shorthand time values, such as 6M for six months or 2y for two years.

When expires is set, NGINX calculates the Cache-Control: max-age value automatically from the configured expiry and returns it in the HTTP headers.

Each content type can be mapped with its own expiry. Example:

map $sent_http_content_type $expires { 
default            off; 
text/html          epoch; 
text/css            max; 
application/javascript     max; 
~image/    max; 
~font/    max; 

server { 
listen 80 _; 
listen [::]:80 _; 
expires $expires;

If both Expires and Cache-Control: max-age are present in a response, HTTP clients give precedence to max-age.

Gzip response compression: Compressing the response reduces the time spent transmitting content. This improves the user experience and can reduce bandwidth costs on metered connections. Some content types, such as JPEGs, may already be compressed and won't benefit.

Example: 

gzip_types text/plain text/xml text/css text/javascript application/json application/x-javascript application/xml application/xml+rss application/javascript;

Note: When using the SSL/TLS protocol, compressed responses may be subject to BREACH attacks.
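
A fuller gzip sketch (the minimum length and compression level are illustrative assumptions meant to balance CPU cost against bandwidth savings):

gzip on;
gzip_min_length 1024;    # skip responses too small to benefit
gzip_comp_level 5;       # moderate compression; higher levels cost more CPU
gzip_types text/plain text/xml text/css text/javascript application/json
           application/x-javascript application/xml application/xml+rss application/javascript;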

Tip #5: Pay Attention to NGINX Logging and Buffering

Access and error logs are the first stop for troubleshooting on the server side. These logs show configuration errors, resources being accessed, HTTP status codes, and payload size.

Use the buffer or gzip parameter of the access_log directive to enable log buffering (a sketch follows the list below). Once buffering is enabled, the data will be written to the log file:

  • if the next log line does not fit into the buffer; 
  • if the buffered data is older than specified by the flush parameter (1.3.10, 1.2.7); 
  • when a worker process is re-opening log files or is shutting down. 
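
A minimal sketch of buffered (and optionally compressed) access logging (buffer size, compression level, and flush interval are illustrative assumptions):

access_log /var/log/nginx/access.log combined buffer=64k flush=5s;
# or compress buffered log data before writing it out:
access_log /var/log/nginx/access.log combined buffer=64k gzip=5 flush=5s;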

The ngx_http_log_module includes:

  • access_log: access_log logs/access.log combined;

The ngx_core_module includes:

  • error_log: error_log logs/error.log error;

The ngx_http_rewrite_module includes:

  • rewrite_log: rewrite_log off;

The ngx_http_session_log_module includes:

  • session_log: session_log off;

Other Logging Tips

  1. For logging and application performance testing, set up a location block with minimal logging as it would be in production, and another location block with maximum logging for debug purposes; a sketch follows this list. This way some requests can be dissected in detail while others are processed with minimal overhead, conserving server resources.
     
  2. Use remote logging to aggregate information across multiple endpoints in the architecture.
     
  3. Create observability by combining OS and hardware metrics with NGINX logs and request tracing using OpenTelemetry.
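
A sketch of the dual location-block approach from tip 1 above (the paths, URI prefixes, and the debug_fmt log format are illustrative assumptions):

log_format debug_fmt '$remote_addr [$time_local] "$request" $status '
                     'rt=$request_time urt=$upstream_response_time';

server {
    location / {
        access_log /var/log/nginx/access.log combined buffer=64k flush=5s;   # minimal production logging
    }
    location /debug/ {
        access_log /var/log/nginx/debug.log debug_fmt;                       # verbose logging for dissecting requests
    }
}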

Final Thoughts

According to the most recent State of Open Source Report, NGINX was the most used open source web server in 2023, surpassing Apache HTTP Server for the first time. Hopefully these tips will help you tune NGINX for optimal performance. If you need assistance getting started with NGINX, check out my blog on NGINX setup and configuration, or consider NGINX training for your team. 

Get Support For NGINX and All Your OSS 

OpenLogic provides end-to-end enterprise technical support and services for NGINX and more than 400 open source technologies. Our support is backed by SLAs and delivered by experts with 15+ years of open source experience.
