Requests Per Second (RPS)

Requests Per Second (RPS) is a metric that measures how many requests an application or website receives per second, averaged over a chosen time period. It is used to gauge the performance and scalability of an application or website.

Requests Per Second (RPS) formula

The formula for RPS is:

RPS = Total Number of Requests / Total Time (in seconds)

For example, if a website receives 1,000 requests in a minute, the RPS would be:

RPS = 1,000 / 60 = 16.67

This means that the website received an average of 16.67 requests per second in that minute.

As another example, if an application receives 500 requests in 30 seconds, the RPS would be:

RPS = 500 / 30 = 16.67

This means that the application received an average of 16.67 requests per second over those 30 seconds.
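The two calculations above can be sketched in a few lines of Python. The function name requests_per_second is an illustrative choice, not part of any library.

```python
def requests_per_second(total_requests: int, total_seconds: float) -> float:
    """Average RPS over a measurement window (hypothetical helper)."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return total_requests / total_seconds


# The two worked examples from the text.
print(round(requests_per_second(1_000, 60), 2))  # 16.67
print(round(requests_per_second(500, 30), 2))    # 16.67
```

Both windows give the same average even though the totals differ, which is why the length of the measurement window should always be reported alongside the RPS figure.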

The following table projects RPS for an application that starts at 100 requests per second, assuming the request rate grows or shrinks by a fixed percentage every 10 minutes over 1 hour:

RPS Growth Rate	10 min	20 min	30 min	40 min	50 min	60 min
-50%	50	25	12.5	6.25	3.125	1.563
-25%	75	56.25	42.19	31.64	23.73	17.80
0%	100	100	100	100	100	100
25%	125	156.25	195.31	244.14	305.18	381.47
50%	150	225	337.5	506.25	759.38	1139.06
100%	200	400	800	1600	3200	6400
200%	300	900	2700	8100	24300	72900
500%	600	3600	21600	129600	777600	4665600

These projections assume that the rate of growth or decline stays constant throughout the hour and compounds once per 10-minute interval, that the number of actions per user is constant, and that every action generates the same number of requests. In reality, the rate of growth or decline may change over time, the number of actions per user may change, and the number of requests may vary depending on the type of action.
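A minimal sketch of such a compounding projection, assuming the rate is applied once per 10-minute interval and the starting point is 100 RPS. The function name project_rps and its parameters are illustrative assumptions.

```python
def project_rps(start_rps: float, growth_rate: float, intervals: int = 6) -> list[float]:
    """Compound growth_rate once per interval, starting from start_rps."""
    return [start_rps * (1 + growth_rate) ** n for n in range(1, intervals + 1)]


# One row per growth rate; each value is the RPS at the end of an interval.
for rate in (-0.5, -0.25, 0.0, 0.25, 0.5, 1.0):
    row = [round(value, 2) for value in project_rps(100, rate)]
    print(f"{rate:+.0%}", row)
```

Because the rate compounds, even moderate growth per interval adds up quickly: at 25% per interval the load is nearly four times the starting value after six intervals.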

Why it matters

It's important to average RPS over different time windows, such as one second, one minute, or one hour, depending on the nature of the application and the level of traffic. Short windows expose bursts and spikes, while long windows show sustained load. Comparing them allows you to identify patterns and trends in traffic and performance, and it can help you make decisions on scalability and capacity planning.
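As a rough illustration of how the window size changes the picture, the sketch below groups request arrival times into fixed windows and reports the average RPS in each. The function name rps_by_window and the sample timestamps are hypothetical.

```python
from collections import Counter


def rps_by_window(timestamps: list[float], window_seconds: int) -> dict[int, float]:
    """Map each window's start time (in seconds) to the average RPS inside it.

    Windows that received no requests are omitted from the result.
    """
    counts = Counter(int(ts // window_seconds) * window_seconds for ts in timestamps)
    return {start: n / window_seconds for start, n in sorted(counts.items())}


# Hypothetical request arrival times, in seconds since measurement began.
arrivals = [0.1, 0.4, 0.9, 1.2, 1.3, 5.0, 5.5, 9.9]

print(rps_by_window(arrivals, 1))   # per-second view
print(rps_by_window(arrivals, 10))  # ten-second view
```

With these sample values, the one-second view shows a burst of 3 requests in the first second, while the ten-second view smooths the same traffic to an average of 0.8 RPS.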

RPS is a useful metric for understanding the performance of your application or website, as well as its ability to handle traffic. A high sustained RPS shows that the application or website can handle a large number of requests, while a low RPS may indicate either light traffic or a bottleneck that limits throughput.

Typical Requests per Second limits

AWS EC2 instances: Depending on the type of instance and the load balancer configuration, an EC2 instance can handle hundreds or even thousands of RPS. For example, a c5.9xlarge instance might handle up to 8,000 RPS, although the actual figure depends heavily on the workload.

Azure Virtual Machines: Depending on the type of VM, an Azure VM can handle hundreds or even thousands of RPS. For example, a Dv3-series VM might handle up to 7,000 RPS, again depending on the size within the series and the work done per request.

Google Cloud Compute Engine: Depending on the type of instance and the load balancer configuration, a GCE instance can handle hundreds or even thousands of RPS. For example, an n1-standard-8 instance might handle up to 4,800 RPS. As with the other figures, treat this as a rough indication and confirm it with load tests.

AWS Lambda: The maximum RPS for an AWS Lambda function depends on the number of concurrent executions that are allowed and on how long each invocation runs. The default limit is 1,000 concurrent executions per account per Region, and you can request an increase.
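A back-of-the-envelope way to relate a concurrency limit to RPS is to divide it by the average time each request occupies an execution slot. The sketch below assumes steady traffic and a known average duration. The function name estimated_max_rps is illustrative, and other service quotas may cap throughput below this estimate.

```python
def estimated_max_rps(concurrency_limit: int, avg_duration_seconds: float) -> float:
    """Rough upper bound on RPS: concurrent slots / seconds each request holds a slot."""
    if avg_duration_seconds <= 0:
        raise ValueError("avg_duration_seconds must be positive")
    return concurrency_limit / avg_duration_seconds


# Assumed values: 1,000 concurrent executions, 250 ms average invocation time.
print(estimated_max_rps(1000, 0.25))  # 4000.0
```

The same limit supports far fewer requests per second when invocations run longer, so reducing function duration raises the achievable RPS without changing the concurrency limit.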

Azure Functions: The maximum RPS for an Azure Function is dependent on the number of instances and the scale of the App Service Plan.

Google Cloud Functions: The maximum RPS for a Cloud Function depends on the number of instances the function is allowed to scale to and on how many requests each instance can serve at a time.
