Hello everyone! Uploading data to the internet happens every day: photos to Facebook, videos to YouTube or TikTok, files to Google Drive, and so on. Files range from small to very large, and network connections are never perfectly stable 😅, with jitter, lag, or sudden disconnections 🤯. Have you ever wondered how to build a system that can resume an upload even if the connection is lost midway?

Today, we will explore Tus, a protocol that solves this problem. Tus is an HTTP-based upload protocol that allows for chunked, resumable, and secure data uploads. If the connection is interrupted, the upload continues from where it stopped instead of starting over, which saves bandwidth and upload time.

This is the first article in the Learn Something New series, where I share new knowledge that I have picked up through work experience or from reading books, articles, etc. I hope you will learn many new things from this series. 🎊

1. What is the Tus Protocol?


📌 Tus is an open, HTTP-based protocol for resumable uploads: if the network disconnects or the system fails midway, the upload continues from where it stopped instead of starting over.

Tus was created to bring the following advantages:

  • An upload protocol based on HTTP: HTTP is ubiquitous, so Tus is easy to integrate into existing systems.
  • Multi-platform: Tus works on many different platforms, such as web, mobile, and IoT.
  • High stability: the specification defines how the client and server track upload progress, so an interrupted upload can continue without losing the data already sent.
  • Multi-language support: Tus has libraries for many programming languages, such as Go, Node.js, and Java, which makes it easy to integrate into existing systems.

2. How Tus Works

Since Tus is based on HTTP and is intended solely for uploading data, it defines its own specification for how uploads work. You can find the detailed Tus Protocol specification here. The current stable version of Tus is 1.0.0 (following SemVer), and work on improving the protocol continues within the Internet Engineering Task Force (IETF).

2.1. Operation Diagram

The Tus Protocol defines how a set of HTTP methods (OPTIONS, HEAD, POST, PATCH, DELETE, …) is used to upload data. The basic flow between a Tus client and a Tus server is as follows:

	sequenceDiagram
	participant c as Client
	participant s as TusServer
	participant st as Storage

	c->>+s: OPTIONS (Gather information)
	s-->>-c: 204 No content (Supported versions, extensions, ...)

	c->>s: HEAD (Check the status of the upload)

	alt Upload completed
		s-->>c: 200 OK (Upload-Offset equals the file size, metadata, ...)
	else Upload partially completed
		s-->>c: 200 OK (Upload-Offset to resume from)
	else Upload not found
		s-->>c: 404 Not found

		c->>+s: POST (Create a new upload)
		s->>+st: Create new resource on storage (s3, gcs, ...)
		st-->>-s: Resource created
		s-->>-c: 201 Created (Location, unique upload ID, ...)
	end

	loop Upload chunks until the offset equals the file size
		Note right of c: Read a chunk of data from file
		c->>+s: PATCH (Upload a chunk of data)
		s->>+st: Append data to the resource
		st-->>-s: Data appended
		s-->>-c: 204 No content (New offset, ...)
	end

	opt Termination extension enabled
		c->>+s: DELETE (Optional, remove the upload)
		s->>+st: Remove the resource on storage
		st-->>-s: Resource removed
		s-->>-c: 204 No content (Upload removed)
	end

The diagram above shows a relatively complete set of methods that a Tus client and a Tus server need to implement to upload data.

  • OPTIONS: This method retrieves information from the server, such as the supported protocol versions, extensions, and maximum upload size.
  • HEAD: This method checks the status of an upload. The response carries the Upload-Offset header, which is the number of bytes the server has received so far:
    • If the offset equals the file size, the file has been completely uploaded -> the result is 200 OK -> end.
    • If the offset is smaller than the file size, the file has been partially uploaded -> the client continues with PATCH from that offset.
    • If the upload does not exist, the server returns 404 Not Found -> the client creates a new upload with POST as follows.
  • POST: This method creates a new upload. The server creates a new resource in storage to hold the uploaded data and returns the unique URL of the upload in the Location header. Available only with the Creation extension.
  • PATCH: This method uploads a part of the data, starting at a given offset, and returns the new offset. The server appends the data to the corresponding resource in storage. PATCH is repeated until the data is completely uploaded (offset = file size).
  • DELETE: This method cancels an upload and deletes all uploaded data. Available only with the Termination extension.

This is just a basic description. If you want to understand more about implementing a Tus server based on these methods, you can read the official specification documents.
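
As a rough illustration of the requests described above, the sketch below builds the headers a client might send for the POST, HEAD, and PATCH steps. The header names (Tus-Resumable, Upload-Length, Upload-Metadata, Upload-Offset) and the application/offset+octet-stream content type come from the Tus 1.0.0 specification. The function names and sample values are assumptions made for this example and are not part of any Tus library.

```python
import base64

TUS_VERSION = "1.0.0"  # value of the Tus-Resumable header for protocol 1.0.0


def encode_metadata(metadata: dict) -> str:
    """Encode metadata as comma-separated 'key base64(value)' pairs."""
    pairs = []
    for key, value in metadata.items():
        encoded = base64.b64encode(value.encode("utf-8")).decode("ascii")
        pairs.append(f"{key} {encoded}")
    return ",".join(pairs)


def creation_headers(file_size: int, metadata: dict) -> dict:
    """Headers for the POST request that creates an upload (Creation extension)."""
    return {
        "Tus-Resumable": TUS_VERSION,
        "Content-Length": "0",            # no file data is sent yet
        "Upload-Length": str(file_size),  # total size of the file in bytes
        "Upload-Metadata": encode_metadata(metadata),
    }


def status_headers() -> dict:
    """Headers for the HEAD request that asks the server for the current offset."""
    return {"Tus-Resumable": TUS_VERSION}


def chunk_headers(offset: int, chunk: bytes) -> dict:
    """Headers for a PATCH request that sends one chunk at the given offset."""
    return {
        "Tus-Resumable": TUS_VERSION,
        "Content-Type": "application/offset+octet-stream",
        "Content-Length": str(len(chunk)),
        "Upload-Offset": str(offset),  # must match the offset stored on the server
    }
```

For example, creation_headers(11, {"filename": "hello.txt"}) describes an 11-byte upload, and chunk_headers(0, b"hello") describes the first PATCH request for it. A real client would pass these headers to whatever HTTP library it uses.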

2.2. How does Tus continue uploading data in case of interruptions?

Tus lets us continue an interrupted upload without starting over because the server keeps track of the offset, the number of bytes it has received for each upload. If an interruption occurs, the Tus client asks the server for the current offset with a HEAD request and resumes uploading from that offset. To make this work, the server needs to store the offset together with the uploaded data, and the client needs to remember which upload it was working on (typically the upload URL). Where the client stores this information depends on the platform. For example, on the web, it can be stored in localStorage, sessionStorage, cookies, etc. On mobile, it can be stored in SharedPreferences, Core Data, etc.

Please note that both the client and the server must implement the Tus Protocol specification, including the methods described above. Otherwise, uploads will not work correctly.
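
The resume logic can be sketched without any network at all. The example below is a minimal simulation, assuming a hypothetical in-memory FakeUploadServer whose head and patch methods stand in for the HEAD and PATCH requests; none of these names come from a real Tus library. The dropped connection is simulated before any bytes of the failing chunk are stored.

```python
class FakeUploadServer:
    """Hypothetical in-memory stand-in for a single Tus upload resource."""

    def __init__(self, fail_on_patch: int = 0):
        self.data = bytearray()
        self.patch_calls = 0
        self.fail_on_patch = fail_on_patch  # which PATCH call "drops"; 0 = never

    def head(self) -> int:
        """Like HEAD: report the number of bytes stored so far (Upload-Offset)."""
        return len(self.data)

    def patch(self, offset: int, chunk: bytes) -> int:
        """Like PATCH: append the chunk if the offset matches, return the new offset."""
        self.patch_calls += 1
        if self.patch_calls == self.fail_on_patch:
            raise ConnectionError("simulated network drop")
        if offset != len(self.data):
            raise ValueError("409 Conflict: offset does not match the server")
        self.data.extend(chunk)
        return len(self.data)


def send_remaining(server: FakeUploadServer, payload: bytes, chunk_size: int) -> None:
    """Ask the server for its offset, then send chunks from there to the end."""
    offset = server.head()
    while offset < len(payload):
        chunk = payload[offset:offset + chunk_size]
        offset = server.patch(offset, chunk)


def upload_with_resume(server: FakeUploadServer, payload: bytes,
                       chunk_size: int, max_attempts: int = 3) -> int:
    """Retry after connection errors; return the number of attempts that were needed."""
    for attempt in range(1, max_attempts + 1):
        try:
            send_remaining(server, payload, chunk_size)
            return attempt
        except ConnectionError:
            pass  # the next attempt starts from the offset the server reports
    raise RuntimeError("upload did not finish within the allowed attempts")
```

The key design point is that the client never trusts its own idea of the offset after a failure: every attempt begins by asking the server, so chunks that already arrived are not sent again.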

2.3. Tus Extensions

On top of the core protocol, Tus defines several extensions, such as:

  • Creation: Allows the client to create a new upload without having to upload data immediately.
  • Expiration: Allows the server to set an expiration time for an unfinished upload and report it to the client in the Upload-Expires header. Once the time has passed, the client can no longer continue uploading, and the server may delete the uploaded data.
  • Checksum: Allows the client to send a checksum with each chunk of uploaded data, so the server can verify that the chunk was not altered or corrupted in transit.
  • Concatenation: Allows the client to split a large file into multiple parts, upload them in parallel as separate uploads, and have the server concatenate them into one final upload. This helps optimize the upload process.
  • Termination: Allows the client to terminate an upload and delete the uploaded data. This helps free up storage on the server.

The specification describes these and other extensions in more detail, which you can learn more about here.
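
To make two of these extensions more concrete, here is a small sketch of the header values involved. In the Tus 1.0.0 specification, the Checksum extension uses an Upload-Checksum header of the form "algorithm base64-digest", and the Concatenation extension uses an Upload-Concat header that lists the partial uploads after "final;". The helper names and the example URLs are assumptions made for illustration only.

```python
import base64
import hashlib


def checksum_header(chunk: bytes, algorithm: str = "sha1") -> str:
    """Build an Upload-Checksum value: '<algorithm> <base64 of the raw digest>'."""
    digest = hashlib.new(algorithm, chunk).digest()
    return f"{algorithm} {base64.b64encode(digest).decode('ascii')}"


def checksum_matches(header_value: str, chunk: bytes) -> bool:
    """What a server might do: recompute the digest and compare it with the header."""
    algorithm, _, encoded = header_value.partition(" ")
    expected = base64.b64decode(encoded)
    return hashlib.new(algorithm, chunk).digest() == expected


def final_concat_header(partial_urls: list) -> str:
    """Build an Upload-Concat value that joins partial uploads into a final one."""
    return "final;" + " ".join(partial_urls)
```

A client would attach checksum_header(chunk) to each PATCH request. Each partial upload is created with "Upload-Concat: partial", and the final upload is then created with a value such as final_concat_header(["/files/a", "/files/b"]), where the two URLs are placeholders.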

3. When to Use Tus?

If you are building a system that uploads large amounts of data and needs resumable uploads that don't have to start over after an interruption, Tus is a good choice. Data that has already reached the server is kept when the connection drops, which saves bandwidth and upload time.

So how do you integrate Tus into your system?

Tus is a protocol with its own specification, as mentioned above. The simplest way to integrate Tus into your system is to use a Tus library for the programming language you are using. Tus libraries are available for many programming languages, such as Go, Node.js, and Java. You can learn more about Tus libraries here.

The official website also provides guides. If you are interested, you can refer to them here.

4. Conclusion

Uploading data has long been a challenge for any system, especially today, with large files and high fault tolerance requirements. Tus solves this problem effectively: an interrupted upload continues from where it stopped instead of starting over, which saves bandwidth and upload time.

I hope that through this article, you have understood the Tus Protocol, how Tus works, when to use Tus, and how to integrate Tus into your system.

If you have any questions, don't hesitate to leave a comment below, and we will discuss this interesting topic further. 😄