In an era of relentless technological advancement, increasingly sophisticated software systems are emerging, accompanied by ever-growing and complex data. To handle large and complex data, modern software systems are often designed using the Microservices model. The Microservices model helps break down the system into smaller services, each of which can have even smaller components.

One of the biggest challenges when building a Microservices system is how to make the services communicate with each other effectively. In this article, we will explore a communication technique between services or components in the Microservices model, known as Polling.

This post is part of my System Design series. If you are interested in system design and software systems, follow the series to keep up with the latest posts!

Some concepts and terminologies

Before delving into the details of Polling, we need to standardize some terms:

  • Communication in a software system takes place between multiple services or between different components.
  • When talking about communication between any two components in the Microservices model, we often distinguish between upstream and downstream. Upstream is the component that returns data, while downstream is the component that receives data from upstream.

💡 An intuitive way to picture upstream and downstream is as the upper and lower parts of a river. The upstream is where the water flows from, and the downstream is where it flows to, so the downstream depends on the upstream for its water. In the same way, the upstream component is the source of the data, and the downstream component depends on it.

  • When applied to real architectures and systems, they can be:

    • Client-Server: The Client (Browser) is downstream, and the Server is upstream.
    • Service-Service: In a Microservice architecture consisting of multiple services, the requester is downstream, and the responder is upstream.
    • Service-Database: A Service sends requests to a Database to retrieve data, where the Service is downstream, and the Database is upstream.

We can see that the concepts of upstream and downstream are quite common and important in designing software systems. They can be applied in various scenarios and contexts, making it easier for us to understand and design systems.

What is Polling?

In simple terms, Polling is a technique where a downstream repeatedly sends requests to an upstream to check for, and retrieve, the latest status or data if available. The Polling technique is commonly used in the following cases:

  • Downstream needs to fetch the latest data from upstream without knowing when that data will be updated.
  • Downstream needs to check the status of upstream without knowing when that status will change.

Based on the nature of polling, there are two common Polling methods: Long Polling and Short Polling.

Short Polling

Short Polling is a Polling method where the downstream sends a request to the upstream to fetch the latest data, and the upstream returns the data immediately, regardless of whether it has changed. The downstream then waits for a configured period (the interval time) before sending the next Polling request.

How it works

sequenceDiagram
    participant D as Downstream
    participant U as Upstream
	Note right of D: interval time = 5s
	loop Every 5 seconds (interval time)
		D->>U: Poll (request) for data
		U->>D: Data (response)
	end

In the diagram above, we can observe the operation between the Upstream and Downstream components as follows:

  • When the downstream needs to fetch data from the upstream, it sends a request to the upstream to retrieve the data.
  • After a certain period, in this example, 5 seconds, the downstream sends another request to the upstream to fetch the latest data, and this process continues until the downstream no longer needs to retrieve data.
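The loop in the diagram can be sketched in a few lines of Python. This is only a minimal illustration, not a production client: `short_poll`, `fetch`, and `max_polls` are hypothetical names, and the upstream is simulated by a plain function instead of a real network call.

```python
import time
from typing import Callable, List


def short_poll(
    fetch: Callable[[], str],
    interval_seconds: float,
    max_polls: int,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call `fetch` once per interval, `max_polls` times in total."""
    responses = []
    for attempt in range(max_polls):
        # The upstream answers immediately, whether or not the data changed.
        responses.append(fetch())
        # Wait one interval before the next poll (no wait after the last one).
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    return responses


if __name__ == "__main__":
    # Simulated upstream (an assumption): the value changes on the third poll.
    values = iter(["v1", "v1", "v2"])
    print(short_poll(lambda: next(values), interval_seconds=0.01, max_polls=3))
```

Note that the second poll returns `v1` again: the downstream paid for a request that brought back nothing new, which is the cost discussed in the disadvantages.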

Advantages

  • Simple and easy to implement: Short Polling does not require complex configuration, making it easy to deploy and use.
  • Easy to tune: The interval time is a single setting on the downstream, so it can be shortened when fresher data is needed or lengthened to reduce load.

Disadvantages

  • Increased latency: Since the downstream only polls once per interval, a change on the upstream is not seen until the next poll, so the data the downstream holds can be stale for up to one interval.
  • Resource consumption: The downstream sends a request every interval even if the data has not changed, so both sides spend resources on requests that return nothing new. Shortening the interval to get fresher data makes this worse, because both upstream and downstream then have to process a larger number of Polling requests.
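To put rough numbers on this trade-off, here is a small back-of-the-envelope sketch. It assumes that a change happens at a uniformly random moment within an interval and it ignores network time; the function name and the `clients` parameter are made up for illustration.

```python
def short_polling_cost(interval_seconds: float, clients: int = 1) -> dict:
    """Estimate request volume and data staleness for Short Polling."""
    return {
        # Every client sends one request per interval, changed data or not.
        "requests_per_hour": clients * 3600 / interval_seconds,
        # A change made right after a poll waits a full interval to be seen.
        "worst_case_staleness_seconds": interval_seconds,
        # Assumption: changes land uniformly at random within an interval.
        "average_staleness_seconds": interval_seconds / 2,
    }


if __name__ == "__main__":
    print(short_polling_cost(5))
    print(short_polling_cost(1, clients=100))
```

With the 5-second interval from the diagram, a single downstream sends 720 requests per hour and sees changes 2.5 seconds late on average. Cutting the interval to 1 second for 100 downstreams brings the average delay down to 0.5 seconds, at the price of 360,000 requests per hour.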

Long Polling

Long Polling is a Polling method where the downstream sends a request to the upstream to fetch the latest data, and the upstream holds the request open instead of answering immediately. It responds only when the data has changed or when a timeout elapses. After the downstream receives the response, it sends the next Polling request.

How it works

sequenceDiagram
	participant D as Downstream
	participant U as Upstream
	Note right of U: timeout = 10s
	loop
		D->>U: Poll (request) for data
		alt Data is changed
			U->>D: Data (response)
		else Timeout
			U->>D: Timeout (response)
		end
	end

The diagram above is more complex than the one for Short Polling. Let's walk through it:

  • When a downstream needs to fetch data from the upstream, it sends a request to the upstream to retrieve the data. The upstream does not immediately return the data to the downstream but waits until the data to be fetched has changed.
  • If the data to be fetched has changed, the upstream returns the data to the downstream. If the data has not changed but the timeout period has elapsed, the upstream notifies the downstream that the timeout has occurred without data change (this prevents the upstream from holding the connection for too long).
  • After the downstream receives a response from the upstream (either data or a timeout), it sends the next Polling request, and this process continues until it no longer needs to retrieve data.
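The steps above can be sketched in Python with a `threading.Condition` standing in for the held connection. This is a simplified in-process model under stated assumptions, not a real HTTP implementation: the `Upstream` class, its version counter, and `downstream_loop` are all hypothetical names introduced for the example.

```python
import threading
from typing import List, Optional, Tuple


class Upstream:
    """Holds a versioned value and lets callers wait for the next change."""

    def __init__(self, value: str) -> None:
        self._value = value
        self._version = 0
        self._changed = threading.Condition()

    def update(self, value: str) -> None:
        with self._changed:
            self._value = value
            self._version += 1
            self._changed.notify_all()  # wake every held poll request

    def poll(self, known_version: int, timeout: float) -> Tuple[int, Optional[str]]:
        """Block until the data is newer than `known_version` or `timeout` elapses.

        Returns (version, value) on change, or (known_version, None) on timeout.
        """
        with self._changed:
            changed = self._changed.wait_for(
                lambda: self._version > known_version, timeout=timeout
            )
            if not changed:
                return known_version, None  # the "Timeout (response)" branch
            return self._version, self._value


def downstream_loop(upstream: Upstream, rounds: int, timeout: float) -> List[str]:
    """Send `rounds` long-poll requests back to back and collect the changes."""
    received = []
    version = 0
    for _ in range(rounds):
        version, value = upstream.poll(version, timeout)
        if value is not None:
            received.append(value)
        # Data or timeout, the next poll is sent right away.
    return received


if __name__ == "__main__":
    upstream = Upstream("v0")
    # Simulate a change arriving while the first poll is being held.
    threading.Timer(0.05, upstream.update, args=("v1",)).start()
    print(downstream_loop(upstream, rounds=2, timeout=0.2))
```

In the demo, the first poll is held until the update arrives and returns `v1`; the second poll sees no further change and ends with a timeout, after which a real downstream would simply poll again.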

Advantages

  • Reduced latency: The upstream responds as soon as the data changes, so the downstream does not have to wait for the next interval to see the update.
  • Reduced resource consumption: While the data is unchanged, the request is simply held open, so the upstream does not need to process a large number of Polling requests that would return nothing new.

Disadvantages

  • Complex to implement: Long Polling is more complex to implement than Short Polling because it requires the upstream to hold the connection until the data to be fetched has changed.
  • Timeout management: Long Polling requires timeout management to prevent the upstream from holding the connection for too long, which can lead to resource exhaustion.

Comparison table

| Short Polling | Long Polling |
| --- | --- |
| Simple and easy to implement | Complex to implement |
| Upstream responds immediately | Upstream responds when the data changes |
| Higher latency | Lower latency |
| Higher resource consumption | Lower resource consumption |

Conclusion

In this article, we have explored the Polling technique, a communication method between services or components. We have learned about two common Polling methods: Short Polling and Long Polling, along with their advantages and disadvantages.

Hopefully, this article has helped you understand the Polling technique better and apply it effectively in your system design. In upcoming articles, we will delve deeper into applying Polling in practical scenarios.

Goodbye, and see you in the next article 🖐