How to Scale a Node.js Application: Strategies and Techniques

Last Update: March 27, 2024
Vivasoft Team

Performance and scalability are achieved through application architecture, not by any particular choice of language or framework. Almost any programming language, such as Java, PHP, C#, or Python, can be used to build a scalable system, but Node.js is an outstanding choice for a few reasons.

Node.js is an asynchronous, event-driven JavaScript runtime designed for building scalable network applications. Its single-threaded, non-blocking I/O model lets one process handle many concurrent connections, which translates into uninterrupted flow, speed, scalability, and easy maintenance.

When making this decision, it’s important to consider the performance, scalability, and maintainability of the language or environment. As a general practice, scale an application before it reaches maximum capacity: this ensures it can keep handling traffic and requests, and it helps prevent performance issues and downtime.

Assume we have a monolithic e-commerce application built in Node.js that handles a certain amount of traffic daily. The application is growing fast, has crossed its threshold, and will reach maximum capacity very soon. Everyone is panicking and asking, “What do we do now?” We need to scale it.

Let’s see what we can do to resolve this situation.

Strategies and Techniques to Scale the Application

Scaling an application can be a challenging task. Here are some strategies and techniques we can take into consideration:

  • Identify the bottlenecks:
Use monitoring and profiling tools like PM2, New Relic, and Datadog to identify which parts of the application are causing performance issues and where the bottlenecks are. 
  • Microservices:
Sometimes we need to change the application’s architecture in order to scale a monolith. One of the best ways is to split it into smaller services (microservices). Because each service can be deployed and scaled independently, the application becomes simpler to maintain and scale.
  • Use a load balancer:
Use a load balancer to distribute incoming traffic across multiple instances of the application. This can help improve the performance and scalability of the application.
  • Use a caching solution:
Use a caching solution, such as Redis, to cache frequently accessed data and reduce the load on the database.
  • Use a reverse proxy:
Use a reverse proxy, such as Nginx, to handle SSL termination and caching static assets.
  • Use a CDN:
Use a CDN to deliver static assets from a location that is closer to the user, which can help reduce the load on the application.
  • Use a container orchestration system:
Use a container orchestration system, such as Kubernetes, Docker Swarm or Amazon ECS, to manage and scale the application. This can help to scale an application by providing features such as automatic scaling, load balancing, self-healing, rolling updates, service discovery, security and management and monitoring processes.

Scalability: Improve Node.js Performance

One of the most dreaded questions is: “Will that scale?” The following is a guideline for growing a Node.js application as its user base grows. Scaling an application too early is more painful than beneficial, so this guide starts simple and scales as the number of users increases.

Scalability in software development means designing solutions that continue to function efficiently with a growing number of users. For businesses, this means that regardless of how big the business gets, the software can handle more users, customers, and requests. Scalability comes down primarily to:

  • using a “divide and conquer” approach, i.e., layering and partitioning resources (computational, storage, communication);
  • managing resources (access control, metering, throttling, etc.);
  • working out the required trade-offs under the CAP theorem.

A scalable application remains stable while adapting to changes, upgrades, overhauls, and resource reduction. Let’s walk through an example to see how it works.
  • Single Server Setup:
Everything is great on a single server, especially with a web server that uses an event model like Nginx. Node.js itself uses an event-driven, non-blocking I/O model: it doesn’t block on a single request but handles many requests concurrently, replying as data from the database or other services becomes available via callbacks or promises. Our Node.js app spends most of its time waiting for the database or file system to respond, and in the meantime it can accept more requests. We have a monolithic application, and that’s fine for now; there is no need to complicate our lives designing for millions of users yet. The main drawback is that deploying changes requires taking the application down while the server updates. The single-server setup is the simplest: everything shares the same resources. Here is what our single-server setup looks like.
  • Vertical Scaling:
Assume our Node.js application now has a growing user base, so many incoming requests are served by our single server. Responses are becoming slower than before, and we are seeing downtime caused by resource exhaustion. We need a bigger box! What we can do for now is “vertical scaling.” Vertical scaling, also referred to as “scaling up,” means adding more power (CPU, RAM, etc.) to our server. While traffic is low (say, below ~5,000 users), vertical scaling is a good choice: its main advantages are cost-effectiveness and less complicated maintenance. Unfortunately, it comes with serious limitations: a higher possibility of downtime, a single point of failure, and hard upgrade limits. One additional benefit of adding more CPU cores is that we can run two instances of our application and load-balance them with Nginx. Multiple instances mean zero-downtime deployments and updates, since we can update one instance at a time. Clustering is also a good option; common tools include the Node.js cluster module, PM2 clustering, and Docker swarm mode. This setup brings several improvements over the previous one:
  • The load balancer handles incoming requests and performs two functions: static file server and reverse proxy. It serves all static files (CSS, JS, images) by itself without touching the web app, and forwards the requests the app needs to resolve; this forwarding is what makes it a reverse proxy.
  • Zero-downtime upgrades.
  • Load Balancer:
A load balancer evenly distributes incoming traffic among the web servers defined in a load-balanced set. It acts as a “traffic cop” sitting in front of our servers, routing client requests across all servers capable of fulfilling them in a way that maximizes speed and capacity utilization and ensures no single server is overworked, which could degrade performance. If a server goes down, the load balancer redirects traffic to the remaining online servers; when a new server is added to the group, the load balancer automatically starts sending requests to it. Using a load balancer with a single-server setup and multiple application instances looks like this:
  • Horizontal Scaling:
After eliminating the previous problems, assume our application now has three times more users and is evolving with new features. At this point, users are having more issues with the application, and the most probable bottleneck is I/O: the database is taking longer to respond. What is the likely solution? Adding more server resources is not effective long term, as we have seen from the disadvantages of vertical scaling. Adding another server, however, is a great addition; this is called “horizontal scaling.” Horizontal scaling, also referred to as “scaling out,” means adding more nodes or machines to our infrastructure to cope with new demand. It is more desirable for large-scale applications because of the limitations of vertical scaling. Its advantages include easier scaling from a hardware perspective, fewer periods of downtime, resilience, fault tolerance, and increased performance; its disadvantages are more complex maintenance and operations and a higher initial cost of server setup. The main reasons to choose horizontal scaling:
  1. The application is unreachable if the single web server goes down.
  2. Many users access the web server simultaneously and it reaches the web server’s load limit.
  3. Users experience slower responses or fail to connect to the server.
Here is a pictorial representation of the servers with a load balancer, scaled out:
  • Database Replication:
Let’s see what happens in this scenario when a failover occurs:
  1. If server 1 goes offline, all the traffic will be routed to server 2. This prevents the application from going offline.
  2. If application traffic grows rapidly and two servers are not enough to handle it, the load balancer handles the problem gracefully (auto-scaling): we only need to add more servers to the web server pool, and the load balancer automatically starts sending requests to them.
Let’s see the improvements we made by horizontal scaling.
  1. No failover issues.
  2. Handle more traffic.
  3. Availability of server.
Did you notice something missing from the figure above? No problem if not: the database. Now that we have multiple application servers, what happens to the database? It’s better to run a separate database server in this scenario: all DB operations go through the dedicated database server, and we don’t have to care which application server issues them. Replication in computing involves sharing information across redundant resources, such as software or hardware components, to improve reliability, fault tolerance, and accessibility. Master-slave replication is a database architecture divided into a master database and slave databases, where the slaves serve as backups for the master. This architecture greatly enhances the reliability of our Node.js application. The figure below shows the flow of database replication. This model benefits our Node.js application with:
  1. Better performance.
  2. Reliability.
  3. High availability.
We might have some questions about this model. What happens if the master database goes down? Do reads and writes happen on all databases? What if all slave databases are down? In short: the master database holds the authoritative data and is where all write requests are performed, while read operations are spread across the slave databases. If a slave database goes offline, its read operations are redirected to the remaining healthy slaves, or temporarily to the master if none are available; once the issue is found, a new slave database replaces the old one. If the master database goes offline, a slave database is promoted to master, and a new slave immediately replaces it for data replication. Now let’s see what our scalable application looks like with a database added. With this setup we have started growing horizontally rather than vertically: we have separated the application from the database and can scale each one with multiple instances. Running the database on a different server than the app has several advantages:
  1. Application and database don’t fight for the same resources.
  2. We can scale each tier (application, DB) independently, as much as we need.
The downside is that this setup is more complicated, and because the application and database are not on the same server, performance issues can arise from network latency or bandwidth limits. To maximize performance, it’s recommended to use private networks with low-latency, high-speed links.
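Applications typically pair master-slave replication with read/write splitting in the data-access layer. Here is a minimal, hypothetical sketch of that routing; the connection objects are stand-ins for real driver pools:

```javascript
// Hypothetical connection handles; in a real app these would be
// driver pools pointing at the master and each replica.
const primary = { name: 'master' };
const replicas = [{ name: 'replica-1' }, { name: 'replica-2' }];

let readIndex = 0;

// Route writes to the master, spread reads across healthy replicas,
// and fall back to the master if no replica is available.
function pickConnection(operation) {
  if (operation === 'write') return primary;
  const healthy = replicas.filter((r) => !r.down);
  if (healthy.length === 0) return primary;
  const replica = healthy[readIndex % healthy.length];
  readIndex += 1;
  return replica;
}
```

With every replica marked down, `pickConnection('read')` falls back to the master, mirroring the failover behavior described above.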
  • Caching:
Our newly scaled-out system still has one of the downsides above: database performance issues due to bandwidth or network latency. What can we do to minimize database calls and scale our application further? Caching. A cache is a temporary storage area that holds the results of expensive responses or frequently accessed data so that subsequent requests are served more quickly. Every time a web page loads, one or more database calls are executed to fetch data, and calling the database repeatedly greatly affects application performance; a cache mitigates this problem. After receiving a request, the web server first checks whether the cache has the response available. If it does, the cached data is sent back to the client; if not, the server queries the database, stores the response in the cache, and sends it back to the client. This caching strategy is called a read-through cache. There are a few things to consider when using a cache system: when to use it, the expiration policy, consistency, the eviction policy, and mitigating failures.

Read More: Start Caching With Python
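The read-through flow described above can be sketched like this. To keep the example self-contained, a Map with timestamps stands in for Redis; a Redis client would follow the same check-then-fill shape. The `fetchUserFromDb` function is a placeholder for a real database query:

```javascript
const cache = new Map();   // in-memory stand-in for Redis
const TTL_MS = 60 * 1000;  // expiration policy: entries live one minute

// Placeholder for a real database query.
async function fetchUserFromDb(id) {
  return { id, name: `user-${id}` };
}

async function getUser(id) {
  const entry = cache.get(id);
  // Cache hit: return immediately, skipping the database.
  if (entry && Date.now() - entry.storedAt < TTL_MS) {
    return entry.value;
  }
  // Cache miss: query the database, then populate the cache (read-through).
  const value = await fetchUserFromDb(id);
  cache.set(id, { value, storedAt: Date.now() });
  return value;
}
```

The TTL implements the expiration policy mentioned above; an eviction policy (e.g., LRU) would cap the cache’s total size.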

  • Content delivery network (CDN):
As our application keeps growing and serving more users, we can take further steps to keep operations smooth. A CDN is a great addition for serving static files. In our multi-server setup, where should static files live? A content delivery network (CDN) solves this by providing storage management for content, distribution of content among edge servers, cache management, delivery of static, dynamic, and streaming content, backup and disaster recovery, and monitoring, performance measurement, and reporting. A CDN allows the quick transfer of the assets needed to load Internet content, including HTML pages, JavaScript files, stylesheets, images, and videos, from a location close to the user, which also reduces the load on the application. For our Node.js application we can serve images from a CDN-backed store (e.g., S3), which improves performance and moves that workload off our servers.
  • Stateless:
A key requirement for scaling out is that the application must be stateless, meaning no state is kept on the web server. We need this because requests are served by whichever server the load balancer chooses, so we can’t know in advance which machine will handle a given user’s operation. A simplified model: Web Browser (has state) <-> Web Server (stateless) <-> Database (has state). HTTP requests can then be processed by any server in our pool, with state data stored in a shared data store and kept out of the web servers. A stateless system is simpler, more robust, and more scalable. To share stateful data we can use a shared store such as a NoSQL database, so data is available to every server. There are many other techniques for making an application stateless; that’s a topic for another day. With all the components above added to our Node.js application, we have leveraged vertical and horizontal scaling, separated the web application from the database, and deployed them across multiple servers. However, we still have a single code base handling all the work for our users’ needs. What if only some specific services need to scale beyond this point? Our e-commerce application has models for products, payments, cart, customers, admin, and orders. Suppose the request log shows that the payment, cart, and order services get the most traffic; in a monolithic application we can’t scale just those models. Adding more features also makes the monolith more complex and harder to understand, forces a restart of the whole application on every deployment, and often lacks flexibility. We can break it into smaller pieces and scale them as needed, going from monolith to microservices for better management of our application.
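Before moving on, the stateless rule above can be sketched as follows: handlers read session state from a shared store by token rather than from server memory, so any instance can serve any request. The in-memory Map here is a stand-in for Redis or a NoSQL database, and the handler shape is our own simplification:

```javascript
// Shared session store; in production this would be Redis or a NoSQL DB
// reachable by every web server — never a per-server variable.
const sharedSessions = new Map();

function saveSession(token, data) {
  sharedSessions.set(token, data);
}

// Any server instance can handle the request: it only needs the token,
// because the state lives in the shared store, not on the server.
function handleRequest(serverName, token) {
  const session = sharedSessions.get(token);
  if (!session) return { server: serverName, status: 401 };
  return { server: serverName, status: 200, user: session.user };
}
```

Because the state lives in the shared store, two different instances (say `server-a` and `server-b`) return the same result for the same session token.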
Microservices became necessary due to the shortcomings of the monolithic pattern of software development.
  • Logging, Metrics, and Automation:
When working with a small application that runs on a few servers, logging, metrics, and automation support are good practices. Now that our Node.js application has grown to serve a large business, investing in those tools is essential. Monitoring error logs helps identify errors and problems in the system; collecting different types of metrics helps us gain business insights and understand the system’s health; and as the system gets big and complex, we need to build or adopt automation tools to improve productivity.
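A lightweight starting point for logging and metrics is to time every request and emit one structured log line; tools like PM2, New Relic, and Datadog build on the same idea. The field names below are our own convention, not any tool’s required format:

```javascript
// Produce one structured log line per request; a log shipper can parse
// these into metrics (request counts, error rates, latency percentiles).
function formatRequestLog(method, url, statusCode, durationMs) {
  return JSON.stringify({
    ts: new Date().toISOString(),
    method,
    url,
    status: statusCode,
    duration_ms: Math.round(durationMs),
  });
}

// Example wiring with the built-in http module:
// const server = http.createServer((req, res) => {
//   const start = process.hrtime.bigint();
//   res.on('finish', () => {
//     const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
//     console.log(formatRequestLog(req.method, req.url, res.statusCode, durationMs));
//   });
//   // ...handle the request...
// });
```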
  • Microservices:
Microservices are a style of service-oriented architecture (SOA) in which the application is structured as an assembly of interconnected services. In a microservice architecture, each application feature is separated from the others, in most cases with its own server and database. Applications built this way are loosely coupled and also referred to as distributed applications. Communication between services can happen in several ways, e.g., HTTP(S), gRPC, or a message broker. In our context, we break the monolithic application into the smaller services we identified for the e-commerce models. Specific services can then be scaled easily without causing any trouble to the others, and if one service goes down it doesn’t harm the overall user experience: only that service is unavailable instead of the whole application. Here is how microservices improve our application’s scalability:
  1. Easier to scale, scale only required services.
  2. Improved productivity.
  3. Improved fault isolation.
  4. Reusability of services.
  5. Simpler to deploy.
  6. Easier migration.
  7. Elasticity.
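When the monolith is split, each service gets its own base address, and callers resolve their peers through some form of service discovery. Here is a minimal, hypothetical registry for the e-commerce services; the hostnames and ports are illustrative assumptions:

```javascript
// Hypothetical service registry; in production this might be DNS,
// Kubernetes services, or a dedicated discovery system like Consul.
const registry = {
  products: 'http://products.internal:3001',
  cart: 'http://cart.internal:3002',
  orders: 'http://orders.internal:3003',
  payments: 'http://payments.internal:3004',
};

// Resolve a service name and path to a full URL for an HTTP call.
function serviceUrl(name, path) {
  const base = registry[name];
  if (!base) throw new Error(`unknown service: ${name}`);
  return `${base}${path}`;
}

// The order service calling the payment service might then look like:
// const res = await fetch(serviceUrl('payments', '/charge'), { method: 'POST' });
```

Centralizing addresses this way means a service can be rescaled or moved without touching its callers: only the registry entry changes.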


Scaling is an iterative process. What we have learned here can help us scale a Node.js application; for further fine-tuning, we can adopt new strategies, for example deploying to multiple data centers depending on where our users are geographically.
