How to test the content of CSV files in end-to-end testing

Posted on November 6, 2017

Hey,

Web applications often let us export data to CSV files. This is a very useful feature, because we can look at the data later or analyze parts of it. From a developer's point of view, however, there is a problem: testing the content of those files. Of course, we should have unit tests that check that the data written is correct. But what about end-to-end testing of the whole functionality?

We can use libraries such as Selenium or Canopy, which let us simulate the interactions a user can perform on the website.
So we can automatically recreate the entire sequence of user actions required to generate the file. However, those libraries do not let us check the content of files. What can we do in this case?

I encountered this problem in my daily work. To solve it, I decided to create a small F# library responsible for opening the file, parsing it, and providing functions that return the data it contains.
At first I decided to use CsvProvider from FSharp.Data. The library code I created is as follows:

The files generated by the application can differ in size, in the number of columns and rows, and in the types of the data. I therefore created a schema file containing all possible headers and a single row of data, which lets CsvProvider infer the column types without having to specify them through its Schema parameter.

Within the library itself, I decided to expose functions responsible for:

Returning all the headers in a file

Returning the values in a specific column
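A minimal sketch of these two functions, assuming an F# script run with `dotnet fsi` and the FSharp.Data NuGet package. The inline sample stands in for the schema file, and the column names (`Id`, `Name`, `Created`) are hypothetical:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Inline sample standing in for the schema file: all headers plus one
// row, from which the provider infers the column types.
// The columns Id, Name and Created are made up for illustration.
type Export = CsvProvider<"Id,Name,Created\n1,Alice,2017-11-06">

// Returns all the headers in a file (empty if the file has none).
let headers (csv: Export) : string list =
    csv.Headers |> Option.defaultValue [||] |> Array.toList

// Returns the values of one column. With the typed provider each
// column is a property on the row, so there is one function per column.
let names (csv: Export) : string list =
    csv.Rows |> Seq.map (fun row -> row.Name) |> Seq.toList

let export = Export.Parse "Id,Name,Created\n7,Bob,2017-11-01\n8,Carol,2017-11-02"
printfn "%A" (headers export)
printfn "%A" (names export)
```

In a real test the data would come from `Export.Load` with the path of the downloaded file rather than from `Export.Parse`.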

Connecting the library to the project responsible for automated testing (implemented in C# and using the Selenium and NUnit libraries) looks like this.

One important thing worth pointing out is the use of the keyword use instead of let when opening the file. With let, the file may still be held open by CsvProvider when it is deleted in the test's TearDown phase, which can lead to a red test result.
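The difference can be sketched as follows, assuming an F# script with the FSharp.Data package. The function name `readHeaders` and the temporary file are hypothetical, and the untyped `CsvFile` is used here only to keep the example short:

```fsharp
#r "nuget: FSharp.Data"
open System.IO
open FSharp.Data

// Reads the headers and releases the file before returning.
let readHeaders (path: string) : string list =
    // `use` disposes the CsvFile (and its underlying reader) when the
    // function exits; with `let` the file could stay open.
    use csv = CsvFile.Load(path)
    csv.Headers |> Option.defaultValue [||] |> Array.toList

// Hypothetical stand-in for a file exported by the application.
let path = Path.Combine(Path.GetTempPath(), "export-sketch.csv")
File.WriteAllText(path, "Id,Name\n1,Alice\n")

let found = readHeaders path
File.Delete path   // what a teardown step would do
printfn "%A" found
```

Any data needed by the assertions has to be copied out (here into a list) before the function returns, because the file is no longer readable once it has been disposed.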
With the code ready to test the application, I ran into a problem. With files of varying sizes, the number of columns returned was correct, but parsing the content of the file (the rows) did not work, because the file generated in the automated test was not exactly the same as the one declared as the “standard”. How to fix this problem? I tried to define a schema

and to ignore the errors in the file
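Those two attempts might have looked roughly like this. It is only a sketch: the column names are hypothetical, and it assumes the attempts used the `Schema` and `IgnoreErrors` static parameters of CsvProvider:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Explicit schema instead of relying on inference from the sample.
type WithSchema =
    CsvProvider<"Id,Name,Created\n1,Alice,2017-11-06",
                Schema = "Id (int), Name (string), Created (date)">

// IgnoreErrors skips rows that have the wrong number of columns or
// that cannot be parsed with the schema, instead of throwing.
type Lenient =
    CsvProvider<"Id,Name,Created\n1,Alice,2017-11-06",
                IgnoreErrors = true>

// The second data row has only two columns, so Lenient skips it.
let lenient = Lenient.Parse "Id,Name,Created\n7,Bob,2017-11-01\n8,Carol"
let keptRows = lenient.Rows |> Seq.length
printfn "%d" keptRows
```

Skipping rows avoids the exception, but it also hides exactly the data a content test is supposed to check.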

But none of these solutions helped.
So I decided that in my case it would be better to use CsvFile instead of CsvProvider. Because of that, I had to modify the library code a little. After the modification, it looks as follows.

As you can see, I had to create a helper function that gives me the index of a column, based on the index of a particular header in the list of headers (the header row).
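A minimal sketch of such a helper, assuming an F# script with the FSharp.Data package. The names `columnIndex` and `columnValues` are hypothetical, not necessarily the ones used in the library:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// Position of a header in the header row, or None if it is absent.
let columnIndex (headers: string[]) (header: string) : int option =
    headers |> Array.tryFindIndex (fun h -> h = header)

// Values of the column under `header`; empty if there is no such header.
let columnValues (headers: string[]) (rows: CsvRow[]) (header: string) : string list =
    match columnIndex headers header with
    | Some i -> rows |> Array.map (fun row -> row.Columns.[i]) |> Array.toList
    | None -> []

let csv = CsvFile.Parse "Id,Name\n1,Alice\n2,Bob"
let headers = csv.Headers |> Option.defaultValue [||]
let rows = csv.Rows |> Seq.toArray   // read the rows once, then reuse them
printfn "%A" (columnValues headers rows "Name")
```

Because the lookup happens at run time, the same functions work for any file, whatever headers it happens to contain.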

Using CsvFile instead of CsvProvider, I was able to handle CSV files with different numbers of columns or different separators. With CsvProvider, this would have required defining n different CSV file types.
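For example, a single function can read exports that differ in separator and column count. This is a sketch under the same assumptions as before (F# script, FSharp.Data package); `parseWith` and the sample data are hypothetical:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// One loader for any export; only the separator changes per file.
let parseWith (separators: string) (text: string) : CsvFile =
    CsvFile.Parse(text, separators = separators)

let comma = parseWith "," "Id,Name\n1,Alice"
let semicolon = parseWith ";" "Id;Name;City\n1;Alice;Oslo"

printfn "%d %d" comma.NumberOfColumns semicolon.NumberOfColumns
```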

Finally, the entire library code looks like this:

You may notice that I have defined a few more methods that allow you to test the file more extensively. They expose information about the separator, the quote character, the number of columns, the number of rows, the range of dates, etc.
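A few of these checks can be sketched like this, assuming an F# script with the FSharp.Data package. The `Created` column and the sample data are hypothetical:

```fsharp
#r "nuget: FSharp.Data"
open System
open System.Globalization
open FSharp.Data

let csv = CsvFile.Parse("Id;Created\n1;2017-11-01\n2;2017-11-06", separators = ";")
let rows = csv.Rows |> Seq.toArray   // read the rows once, then reuse them

let separator = csv.Separators        // separator(s) used to parse the file
let quote = csv.Quote                 // quote character used to parse the file
let columnCount = csv.NumberOfColumns
let rowCount = rows.Length

// Earliest and latest date in the (hypothetical) Created column.
let dates =
    rows
    |> Array.map (fun row ->
        DateTime.Parse(row.GetColumn("Created"), CultureInfo.InvariantCulture))
let dateRange = Array.min dates, Array.max dates

printfn "%s %c %d %d %A" separator quote columnCount rowCount dateRange
```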

In conclusion, using the types available in FSharp.Data I was able to easily write a library that lets me test the contents of CSV files. The library code is also very concise and straightforward, and the functions it exposes give developers and testers a lot of options when writing tests. In the project I work on, we have not limited ourselves to testing CSV files with CsvFile: we also decided to test RSS feeds with XmlProvider. So it seems that the types provided by FSharp.Data give us a lot of possibilities for processing the content of files, RSS feeds and web pages (using HtmlProvider).