At Radial Development Group, we build a wide range of software applications. But even varied projects often have similar technical issues. One common customer request is the ability to upload lots of data to an app at once using a CSV file.
This feature request comes with several challenges. What if the headers in the user’s CSV file don’t match the database fields? What if the server times out while processing the file? What if the CSV file is ten thousand lines long? Ten million?
There are several ways to deal with these problems. The trick is to solve them all while simultaneously constructing a compelling user experience. Here’s how we built a CSV upload feature that is prepared for all of these situations.
Server-side: Asynchronous CSV Uploads
We knew going into the problem that we needed to process the CSV in a separate thread. If we had tried to process the CSV in the same thread that was rendering the next page, it would appear to the user that the screen was frozen until the upload completed.
With small uploads this might not be noticeable, but we needed the app to be able to handle large files gracefully. Furthermore, a large enough upload would fail anyway because the server would time out if we tried to process the file in the same thread as the rest of the app.
Since many of our projects are developed using Ruby on Rails, we decided to use the Sidekiq gem to manage the app’s asynchronous tasks. This meant that we could create a worker that steps through the upload process. Sidekiq then manages the process in a separate thread from the rest of the app.
def perform csv_upload_id listings_update = CSVUpload.find csv_upload_id saved_path = Rails.root.join('tmp', listings_update.id.to_s) listings_update.file.copy_to_local_file(:original, saved_path) row_number = 0 CSV.foreach(saved_path, headers: true) do |row| row_number += 1 create_listing(row, csv_upload_id, row_number) end listings_update.update total_records: row_number end
Once the process is kicked off by the user clicking “Start Upload,” the worker walks through several steps. First, it saves the CSV file to the database, which is how Sidekiq prevents the upload process from eating up a ton of system memory on large uploads. Next, it iterates through each row, returning any errors on a row-by-row basis. And last, it uploads all of the remaining error-free data to the database.
Client-Side: Constant Visual Feedback
All of that was great for the back end of the project, and it meant that the user would no longer see a blank screen while their enormous CSV upload processed.
But it left us with a completely different problem: since the upload was happening asynchronously, the next page would load before the file had fully processed, even if the file was only one line long.
This meant that unless the user refreshed the page, they would be under the impression that the CSV file didn’t upload at all!
We solved this by using a jQuery function to refresh certain elements of the page until the CSV upload completes. This gives the user constant feedback on the status of their upload, so a ten-line CSV file will quickly tell the user the upload is complete, while a much larger file will keep the user informed on the status of their upload.
(function() { var reCompletionText = /<%= completion_string %>/; $(document).ready(reloadUntilComplete); function reloadUntilComplete() { var progressText = $("#progress").text(); if (!reCompletionText.test(progressText)) { $("#live-reload").load("#! #live-reload", function(response, status, xhr) { if (status !== "error" ) setTimeout(reloadUntilComplete, 2000); }); } } })();
It also generates a list of errors beneath the upload status as it processes the file, so the user can receive instant feedback on their upload and fix any mistakes they may have made.
Useful across multiple projects
The brilliant thing about all of this is that our work did not just apply to one project; it applied to two we have now, and will also remain accessible to us for future projects.
This is something we love about our process at Radial: since we work on a wide range of projects, we can really take the time to think through a complex problem once. Once we’ve solve the problem, we can tailor the solution to each project that needs it, rather than building every feature from scratch every time.
It saves us time. It saves our clients money. And it means we can focus on building quality products for each and every customer.