Stability enhancements for Proton Drive transfers
Proton Drive sometimes encounters a high transfer error rate. This can happen in disconnected, interrupted, or low-bandwidth environments, even for files smaller than 30 MB.
When this happens, Proton Drive shows a long list of transfer errors. Proton Drive automatically retries the files, but because it attempts too many transfers at once, the retries can fail again and again.
Keeping the host awake overnight doesn't help. Pausing and resuming sync doesn't help. Quitting Proton Drive and relaunching doesn't help.
Here are some suggestions to help Proton Drive transfer files of all sizes more reliably, even in poor networking environments.
Proton Drive appears to have a basic cap of four concurrent uploads. However, if the user also initiates downloads, the overall bandwidth may be squeezed too tightly, causing transfers to slow to a crawl and fail. Instead of capping downloads separately from uploads, it would be helpful to provide a global maximum transfer count option, defaulting to four. The user could tune this option upward for performance, or down to as low as 1 for reliability. Like uploads, any downloads that would exceed the global transfer limit should be queued rather than executed immediately.
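As a rough sketch of how a shared limit might work (the `maxTransfers` setting and the `transfer` type here are hypothetical, not Proton Drive's actual internals), a single semaphore covering both uploads and downloads queues anything beyond the cap:

```go
package main

import (
	"fmt"
	"sync"
)

// transfer is a stand-in for one upload or download job.
type transfer struct {
	name     string
	isUpload bool
}

func main() {
	// Hypothetical user setting: global maximum concurrent transfers, default 4.
	maxTransfers := 4
	slots := make(chan struct{}, maxTransfers) // shared by uploads AND downloads

	jobs := []transfer{
		{"a.txt", true}, {"b.jpg", true}, {"c.pdf", false},
		{"d.mov", false}, {"e.zip", true}, {"f.iso", false},
	}

	var wg sync.WaitGroup
	for _, job := range jobs {
		wg.Add(1)
		go func(job transfer) {
			defer wg.Done()
			slots <- struct{}{}        // queue here once the global limit is reached
			defer func() { <-slots }() // release the slot when the transfer finishes
			fmt.Println("transferring", job.name)
			// ... perform the actual upload or download here ...
		}(job)
	}
	wg.Wait()
}
```

Because uploads and downloads share one pool of slots, a burst of downloads can no longer starve in-flight uploads of bandwidth; excess work simply waits its turn.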
A smarter implementation would apply (exponential) backoff and retry to the number of concurrent transfers itself, reducing concurrency after failures and cautiously raising it again after successes. That would be welcome later. For the short term, even a simple user control that sets a fixed global concurrent transfer count would also be welcome.
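For illustration only, here is one way the concurrency level could back off after failures and ramp back up after a streak of successes; the type name, thresholds, and limits are all made up for this sketch:

```go
package main

import "fmt"

// adaptiveLimit backs off the concurrency level itself: halve the limit after
// a failed transfer, add one back after a streak of successes. The field
// names and thresholds are illustrative only.
type adaptiveLimit struct {
	current, min, max int
	successStreak     int
}

func (a *adaptiveLimit) onFailure() {
	a.successStreak = 0
	a.current /= 2 // exponential backoff applied to concurrency
	if a.current < a.min {
		a.current = a.min
	}
}

func (a *adaptiveLimit) onSuccess() {
	a.successStreak++
	if a.successStreak >= 5 && a.current < a.max {
		a.current++ // cautious ramp-up after sustained success
		a.successStreak = 0
	}
}

func main() {
	lim := &adaptiveLimit{current: 4, min: 1, max: 8}
	for _, ok := range []bool{false, false, true, true, true, true, true} {
		if ok {
			lim.onSuccess()
		} else {
			lim.onFailure()
		}
		fmt.Println("concurrency limit now", lim.current)
	}
}
```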
Proton Drive appears to queue file uploads in a somewhat haphazard order; it is unclear whether the current order is by timestamp or by filename. Regardless, in my experience it is more reliable to queue files in ascending order of file size, smallest first. Many projects involve legions of small files and only a few large ones. By prioritizing the small files, the user is likely to see most of their files upload successfully, after which Proton Drive can attend to the riskier large files. One of the recurring problems with the current Proton Drive behavior is that it queues large files (GB or TB) early, fails, and then retries the exact same files over and over. By saving the largest files for last, the smaller files clear the queue completely, freeing up valuable bandwidth for the riskiest transfers. A sketch of this ordering follows below.
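A minimal sketch of that ordering, assuming the queue is just a list of path/size pairs (the `queuedFile` type here is illustrative, not Proton Drive's data model):

```go
package main

import (
	"fmt"
	"sort"
)

// queuedFile is a stand-in for one item in the upload queue.
type queuedFile struct {
	path string
	size int64 // bytes
}

func main() {
	queue := []queuedFile{
		{"video/raw-footage.mov", 42 << 30}, // roughly 42 GB
		{"notes/readme.txt", 4 << 10},       // 4 KB
		{"photos/img_001.jpg", 6 << 20},     // 6 MB
	}

	// Smallest files first: most of the queue drains quickly, and the
	// riskiest (largest) transfers run last with the bandwidth to themselves.
	sort.Slice(queue, func(i, j int) bool { return queue[i].size < queue[j].size })

	for _, f := range queue {
		fmt.Printf("upload %s (%d bytes)\n", f.path, f.size)
	}
}
```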
When an upload or download fails, I wonder whether Proton Drive naively retries the entire file rather than just the failed portions. For comparison, other transfer protocols such as rsync and BitTorrent chunk files and resend only what failed. Retrying at the chunk level, with smaller chunks, would significantly decrease the number of transfer failures. Even a drive with an enormous number of large files can eventually succeed over just a few retry iterations, as long as some of the chunks complete on each attempt.
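As a sketch of chunk-level retry under assumed names (the 4 MB chunk size, the `uploadChunk` placeholder, and the simulated failures are all illustrative), only the chunks that failed are attempted again on each pass:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

const chunkSize = 4 << 20 // an illustrative 4 MB chunk size

// uploadChunk is a placeholder for sending one chunk; it fails randomly here
// to simulate a flaky network.
func uploadChunk(index int) error {
	if rand.Intn(3) == 0 {
		return errors.New("simulated network error")
	}
	return nil
}

func main() {
	fileSize := int64(100 << 20) // a 100 MB file, for illustration
	numChunks := int((fileSize + chunkSize - 1) / chunkSize)

	// Track which chunks still need to go; completed chunks are never resent.
	pending := make(map[int]bool, numChunks)
	for i := 0; i < numChunks; i++ {
		pending[i] = true
	}

	for attempt := 1; len(pending) > 0 && attempt <= 10; attempt++ {
		for i := range pending {
			if uploadChunk(i) == nil {
				delete(pending, i) // this chunk is done; retry only the rest
			}
		}
		fmt.Printf("attempt %d: %d of %d chunks still pending\n",
			attempt, len(pending), numChunks)
	}
}
```

Even when the network drops partway through, each attempt shrinks the pending set, so progress is never thrown away.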
Not everyone has a stable, low-latency, high-bandwidth network connection. I like the idea of Proton Drive, and with a bit of work it may approach the reliability of competing cloud storage providers across all kinds of user environments.