Note from the Sia Team: this is a guest post from Thomas Grant Bennett, a Sia community contributor. Thank you Thomas for your contribution!
Sia version 1.4.0 is the most significant release since Sia’s public launch in 2015. It is significant in terms of lines of code added, removed, and modified; the number of community contributions; API changes; and new features introduced. This article focuses on some of the front-end and back-end changes, why they are important, and how they fit into the greater Sia vision. It is not a deep dive into any specific feature and should not be considered a comprehensive list of all changes introduced in the Sia v1.4.0 release. Let’s get started!
New .sia file format
The new .sia file format is a monumental change in the Sia code. This new file format is the first step to eliminating many of Sia’s roadblocks to incredibly important features such as seed-based file recovery, sharing of files, partial uploads and downloads, scaling past 5TB of user data, scaling to tens of thousands of files/user, improved upload and download bandwidth, reduced RAM usage, and more. The full effect of the new .sia file format will be proven in future releases, but Sia version 1.4.0 sets the stage for these features to become a reality.
The new .sia file format uses binary blobs to store metadata about a user’s files that have been uploaded to the Sia network. Historically, the .sia files used simple JSON encoding of the metadata; while this made it very easy to manage, it had many limiting consequences. When a change is made to a JSON file the entire file has to be read, modified, and written to disk. The I/O requirements of opening, reading, and re-writing the JSON files became a major bottleneck in Sia’s performance. By using binary blobs Sia can now update portions of the .sia files without reading and writing the entire file. This new workflow dramatically reduces the I/O requirements for modifying .sia files.
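To make the I/O difference concrete, here is a minimal sketch comparing the two approaches. The record layout (fixed-size 16-byte records holding a hypothetical piece index and health value) and all function names are assumptions for illustration only; this is not Sia's actual on-disk format.

```python
import json
import os
import struct
import tempfile

# Hypothetical fixed-size record: two unsigned 64-bit little-endian integers.
RECORD = struct.Struct("<QQ")


def write_all_json(path, records):
    """JSON style: any change means serializing and rewriting every record."""
    with open(path, "w") as f:
        json.dump(records, f)
    return os.path.getsize(path)  # bytes written to disk


def update_one_binary(path, index, record):
    """Binary style: seek to one record and overwrite only its bytes."""
    with open(path, "r+b") as f:
        f.seek(index * RECORD.size)
        f.write(RECORD.pack(*record))
    return RECORD.size  # bytes written to disk


def read_one_binary(path, index):
    """Read a single record without loading the rest of the file."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))


def demo():
    records = [[i, 100] for i in range(1000)]
    with tempfile.TemporaryDirectory() as tmp:
        json_path = os.path.join(tmp, "meta.json")
        bin_path = os.path.join(tmp, "meta.bin")
        with open(bin_path, "wb") as f:
            for r in records:
                f.write(RECORD.pack(*r))
        # Change a single record in both representations.
        records[500][1] = 42
        json_bytes = write_all_json(json_path, records)
        bin_bytes = update_one_binary(bin_path, 500, (500, 42))
        return (json_bytes, bin_bytes,
                read_one_binary(bin_path, 500),
                read_one_binary(bin_path, 499))
```

In this toy setup the JSON path rewrites the whole file to change one value, while the binary path writes only the 16 bytes of the affected record and leaves its neighbors untouched.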
By reducing the I/O requirements of reading, writing, and modifying the .sia files, Sia can do things more efficiently. It can repair files faster, fetch files faster, get portions of files (important for things such as seeking in a video stream), handle more simultaneous uploads and downloads, and reduce the system load. The reduced I/O requirements will enable Sia to scale much larger than previously possible. Exactly how far Sia will scale remains to be proven, but I’m sure the “Sia Test App Community” (STAC) will be running version 1.4.0 through its test suite, giving us some basic benchmarks.
The new .sia file format introduces new metadata fields. In addition to file size, Sia can now keep track of created, modified, and accessed timestamps for files uploaded to the Sia network, similar to what you might expect from a Linux stat command.
The new .sia file format enables Threefish cipher support. Threefish is an encryption cipher that is significantly less CPU intensive than Twofish, which Sia used prior to 1.4.0. This means that users who have historically been CPU bound while uploading and downloading files will see much better performance in the 1.4.0 release. Threefish also enables uploaded files to be partially modified without re-uploading the entire file. The uploaded file modification feature will be released sometime after 1.4.0 but the groundwork for that feature is available with the 1.4.0 release.
Sia now stores its .sia files in a new directory tree, which cleans up the Sia installation directory. This has no impact on most end-users but makes the entire Sia installation cleaner. Sia also no longer stores any .sia files in memory. This should cut down on .sia file corruption during unexpected power loss and reduce the memory requirements for running Sia with a large number of files.
Host blacklist and whitelist feature
Sia version 1.4.0 introduces host blacklist and whitelist features. This enables a user to pass Sia a list of hosts to use (whitelist) or avoid (blacklist). This is significant because it lets users restrict their data to a set of specified hosts instead of letting Sia decide which hosts to use. One possible use case for this feature is to keep your data on the Sia network within a specific geographic region. For example, if you want your data to only use hosts within North America, you could whitelist all available hosts located in North America. With the 1.4.0 release, this feature is only available through the API, although several community developers are working on interfaces that make the host blacklist and whitelist features more accessible to non-technical users and help with host selection and the API commands. More information and use cases for this feature will be published in the coming weeks.
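As a rough sketch of what the North America example could look like from a script, the code below builds (but does not send) an API request. The endpoint path `/hostdb/filtermode`, the `filtermode` and `hosts` fields, and the `Sia-Agent` user agent reflect my reading of the Sia API documentation and should be verified against https://sia.tech/docs before use. The `region` field, the host keys, and the helper names are hypothetical; Sia does not tell you where a host is located, so that information would have to come from your own lookup.

```python
import base64
import json
import urllib.request


def select_hosts_by_region(hosts, regions):
    """Pick public keys of hosts whose (hypothetical) region tag matches."""
    return [h["publickey"] for h in hosts if h["region"] in regions]


def build_filter_request(mode, host_keys, api_password,
                         base_url="http://localhost:9980"):
    """Build a POST request that sets the host filter mode.

    Assumption: the daemon accepts a JSON body with "filtermode" and "hosts"
    at /hostdb/filtermode, authenticated with HTTP basic auth (empty user name).
    """
    if mode not in ("whitelist", "blacklist", "disable"):
        raise ValueError("mode must be whitelist, blacklist, or disable")
    body = json.dumps({"filtermode": mode, "hosts": list(host_keys)})
    token = base64.b64encode((":" + api_password).encode("utf-8"))
    return urllib.request.Request(
        base_url + "/hostdb/filtermode",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "User-Agent": "Sia-Agent",
            "Authorization": "Basic " + token.decode("ascii"),
            "Content-Type": "application/json",
        },
    )
```

Against a running daemon, the request returned by `build_filter_request` could be sent with `urllib.request.urlopen(request)`.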
Better directory support
Sia version 1.4.0 introduces new directory functionality. Users can now create empty directories, query Sia for the contents of an individual directory, and delete entire directory trees in a single command. As users scale to thousands of files they will be able to manage them while maintaining performance given the new directory tree hierarchy. Additional features with directory level support will also be added in upcoming releases.
In addition to the user-facing improvements of better directory support, the Sia daemon will also manage file repairs on a directory level and report file health at the directory level. This will enable more efficient file repairs, and allow a user to more easily see which files are being repaired at any time.
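One way to picture directory-level health is as an aggregate of everything below a directory. The sketch below assumes a convention where a higher health number means a file is more in need of repair and a directory reports the worst value found in its subtree. Both the convention and the `Directory` class are illustrative assumptions, not the daemon's actual data structures.

```python
class Directory:
    """Toy directory node: file healths plus nested subdirectories."""

    def __init__(self, files=None, subdirs=None):
        self.files = files or {}      # file name -> health (higher is worse)
        self.subdirs = subdirs or {}  # directory name -> Directory


def directory_health(d):
    """A directory's health is the worst health found anywhere beneath it."""
    worst = max(d.files.values(), default=0.0)
    for sub in d.subdirs.values():
        worst = max(worst, directory_health(sub))
    return worst


def directories_needing_repair(d, threshold, path="/"):
    """List directories holding files above the threshold.

    Whole subtrees are skipped when their aggregated health is fine, which is
    the efficiency gain of tracking health per directory.
    """
    if directory_health(d) <= threshold:
        return []
    result = [path] if any(h > threshold for h in d.files.values()) else []
    for name, sub in sorted(d.subdirs.items()):
        result += directories_needing_repair(sub, threshold, path + name + "/")
    return result
```

A real implementation would cache the aggregated value on each directory rather than recomputing it on every call as this sketch does.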
Sia users can now restore their files from a backup of the renter folder. This release marks the first step towards true seed-based file recovery, which has long been touted as the biggest feature missing in Sia. More information will be posted as this feature matures.
New API endpoints and siac commands have been added to facilitate easy backups of the renter folder and contract recovery from the consensus. With Sia’s new contract and renter folder backup functionality it should be trivial to back up and restore files automatically in the event of hardware failure.
One possible use case would be to keep Sia’s renter folder backed up to a flash drive. In the event your computer stopped working, you would be able to restore all your files with just your seed and the flash drive backup. In a future release, Sia will automatically back up the renter folder to the Sia network, enabling seed-based file recovery.
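The flash drive idea can be sketched with nothing more than the Python standard library. This is a generic folder archive, not the backup mechanism Sia itself provides through its API and siac; the function names and paths are hypothetical, and the dedicated Sia commands should be preferred where available.

```python
import os
import shutil


def backup_folder(renter_dir, dest_dir, name="renter-backup"):
    """Archive renter_dir into dest_dir and return the archive path.

    dest_dir (for example a mounted flash drive) must not live inside
    renter_dir, otherwise the archive would try to include itself.
    """
    os.makedirs(dest_dir, exist_ok=True)
    base = os.path.join(dest_dir, name)
    return shutil.make_archive(base, "gztar", root_dir=renter_dir)


def restore_folder(archive_path, target_dir):
    """Unpack a previously created archive into target_dir."""
    shutil.unpack_archive(archive_path, target_dir)
```

Copying the folder while the daemon is writing to it may capture an inconsistent state, so a script like this would be safest to run while Sia is stopped.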
Host score improvements
Many aspects of the host score have changed; I’ll mention a few of the key pieces here. The host scoring algorithm is now simplified and centralized in the Sia code. Historically, a host was scored slightly differently depending on where in the code it was being accessed. Now there is a single point at which the host receives its score, and that score is preserved and passed throughout the rest of the Sia modules.
The host scoring algorithm itself has changed. Host collateral now weighs more heavily in the host score, and a host’s IP address now has weight in the host score. Multiple hosts with the same IP address will be penalized, thereby reducing the effectiveness of “hosting farms” and dramatically reducing the chance of a Sybil attack against the Sia network. The host’s age has less weight, and the minimum storage requirements have been relaxed.
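To show how these factors might interact, here is a toy scoring function. The multiplicative formula, the weights, the penalty value, and the field names are all made up for illustration; the real algorithm in the Sia code is more involved and uses different inputs and constants.

```python
from collections import Counter


def score_hosts(hosts, collateral_weight=2.0, age_weight=0.5,
                shared_ip_penalty=0.1):
    """Toy host score: collateral counts heavily, age counts less,
    and hosts sharing an IP address are penalized."""
    ip_counts = Counter(h["ip"] for h in hosts)
    scores = {}
    for h in hosts:
        score = (h["collateral"] ** collateral_weight) * (h["age_days"] ** age_weight)
        if ip_counts[h["ip"]] > 1:
            # More than one host announced from this address.
            score *= shared_ip_penalty
        scores[h["name"]] = score
    return scores
```

Because the score is computed in a single function and returned as a plain mapping, every caller sees the same number for a host, which mirrors the "single point" of scoring described above.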
New RPC structure
The RPC (remote procedure call) structure in Sia has been revamped. Sia uses RPCs to facilitate communication between renters and hosts, making RPCs a critical piece of the Sia code. Most users won’t notice a difference in this back-end functionality, but I thought it was important to mention since it makes Sia more performant and more secure, and it is necessary for some important new features. All necessary RPCs are now encrypted, simplified, and optimized for performance. Sia also added new RPCs that will enable new features such as partial file downloads (more about that in the next section) and seed-based file recovery (expected in version 1.4.2).
Sia has supported some rudimentary download streaming (most commonly used in video streaming) since the 1.3.5 release. With 1.4.0 Sia now supports more robust download streaming (aka partial downloads) and host upload streaming.
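The core idea behind a partial download is that a request for a byte range only needs the pieces of the file that overlap that range. The sketch below maps a byte range to chunk indices; the function name and the tiny chunk size used in the example are assumptions for illustration and do not reflect Sia's actual chunk layout.

```python
def chunks_for_range(offset, length, chunk_size):
    """Return the indices of the chunks that overlap [offset, offset + length)."""
    if offset < 0 or length <= 0 or chunk_size <= 0:
        raise ValueError("offset must be >= 0; length and chunk_size must be > 0")
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return list(range(first, last + 1))
```

With a chunk size of 4 bytes, for instance, a request for 2 bytes starting at offset 7 touches chunks 1 and 2 only, so a video player seeking into the middle of a file would not need to fetch everything before that point.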
Partial downloads are being released in 2 steps:
- Step 1: Hosts will need to install version 1.4.0 which includes the RPCs required for hosts to support partial downloads.
- Step 2: Renters will be able to take advantage of partial downloads starting with version 1.4.1. This two-step rollout is required so that renters are not trying to perform partial downloads from hosts that do not have the required RPCs.
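The reason for the two-step rollout can be expressed as a simple version gate on the renter side. This is a sketch under the assumption that a renter knows each host's version string; the helper names are hypothetical and the real negotiation in Sia may work differently.

```python
def parse_version(version):
    """Turn a dotted version string such as "1.4.0" into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def supports_partial_downloads(host_version, minimum="1.4.0"):
    """A renter should only attempt partial downloads from hosts that
    already ship the required RPCs."""
    return parse_version(host_version) >= parse_version(minimum)
```

Comparing parsed tuples rather than raw strings avoids mistakes such as treating "1.10.0" as older than "1.4.0".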
Upload streaming will enable developers to be more efficient about uploading data to the Sia network. One possible use case for upload streaming is to stream a security camera feed to the Sia network for safekeeping without having to split the feed into individual files before upload.
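The security camera example boils down to consuming a stream of unknown length piece by piece instead of waiting for a complete file. Here is a minimal sketch of that pattern; `send_chunk` is a hypothetical callback standing in for whatever actually uploads the data, and the small chunk size is only for demonstration.

```python
def stream_chunks(source, chunk_size):
    """Yield fixed-size chunks from a file-like object until it is exhausted."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield chunk


def upload_stream(source, send_chunk, chunk_size=4):
    """Forward a stream chunk by chunk without knowing its total size up front.

    Returns the total number of bytes forwarded.
    """
    total = 0
    for index, chunk in enumerate(stream_chunks(source, chunk_size)):
        send_chunk(index, chunk)
        total += len(chunk)
    return total
```

Any object with a `read` method works as the source, so the same loop could wrap a camera feed, a network socket, or an in-memory buffer.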
More details and demonstrations will be released with 1.4.1 since that is when most users will be able to take advantage of the streaming improvements.
Revamped User Interface
The Sia-UI has been completely revamped by the Sia team. Users of the UI will notice a much cleaner interface that adheres to the Sia branding. New animations, bug fixes, and some new functionality have been added as well.
Sia-UI is intended to be an easy to use implementation of Sia that supports most common use cases. The Sia API offers more advanced Sia functionality and greater control over how Sia handles your files.
- Audit of bootstrap nodes: All Sia bootstrap nodes are now on version 1.3.7 or higher.
- Revamped API documentation: Official API documentation now lives at https://sia.tech/docs.
- Ledger Nano S Support: Sia released an official Ledger Nano S app.
- Gateway rate limit: You can now control the rate at which the gateway module serves data to its peers.
- Seed sanity checks: If your seed fails, Sia will attempt to tell you what is wrong with it.