Export to UTF-8 CSV using Data Preparation Tool Free

Hello

We are using the Data Preparation Free tool to manage Excel files that will then be imported into a marketing automation tool called Marketo.

Unfortunately, when we export the data from the Data Preparation tool using CSV with UTF-8 encoding, any Latin characters with accents or diaereses (the double dots on top, as in ü) come out incorrectly. When we do the same thing with MS Excel, they all come out fine.

What are we doing wrong when exporting data from the Data Preparation Free tool?

Re: Export to UTF-8 CSV using Data Preparation Tool Free

Hi Axel,

That is what I assumed: it's actually an Excel issue. Excel doesn't automatically detect the encoding of CSV files and assumes Windows-1252, hence the garbled characters you see when opening the CSV file generated by Data Prep. If you open the CSV file in Notepad++ or any other decent text editor, you will see that the file generated by Data Prep is correctly encoded in UTF-8.
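
To illustrate the mismatch described above, here is a minimal Python sketch of what happens when UTF-8 bytes are decoded as Windows-1252. The function name `misread_as_cp1252` is made up for this example, and the sketch only imitates the decoding step; it makes no claim about how Excel is implemented internally.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    This imitates a reader that assumes Windows-1252 for a UTF-8 file.
    errors="replace" covers the few byte values cp1252 leaves undefined.
    """
    utf8_bytes = text.encode("utf-8")
    return utf8_bytes.decode("cp1252", errors="replace")


if __name__ == "__main__":
    # Each accented character becomes two unrelated characters.
    for sample in ["é", "ü", "señor"]:
        print(sample, "->", misread_as_cp1252(sample))
```

Every accented letter takes two bytes in UTF-8, and a Windows-1252 reader shows each byte as its own character, which is why one letter turns into two wrong ones while plain ASCII text is unaffected.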

Note that you can correctly display a UTF-8 encoded file in Excel, though the procedure is anything but user-friendly. See https://www.youtube.com/watch?v=GcYt1mJbwk4 for instance (it is not on Excel 2016, but the sequence is essentially the same). It has been this way forever, so it is unlikely to be fixed by Microsoft.
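
If the file can be post-processed outside Data Prep, one commonly used workaround is to write the CSV with a UTF-8 byte order mark (BOM), which Excel generally uses as a hint that the file is UTF-8. The sketch below is an assumption-laden illustration: the helper name `to_csv_bytes_with_bom` is invented here, and whether the downstream tool (Marketo in this thread) accepts a BOM should be checked separately.

```python
import csv
import io


def to_csv_bytes_with_bom(rows):
    """Serialize rows to CSV and encode as UTF-8 with a leading BOM.

    Python's "utf-8-sig" codec prepends the bytes EF BB BF when encoding.
    """
    buffer = io.StringIO(newline="")  # let the csv module control line endings
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode("utf-8-sig")


if __name__ == "__main__":
    data = to_csv_bytes_with_bom([["name", "city"], ["Zoë", "Málaga"]])
    # Hypothetical output file name, used only for this example.
    with open("export_with_bom.csv", "wb") as handle:
        handle.write(data)
```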

Good news: in the next Data Prep release (planned for January 2018), you will be able to select the encoding when exporting to CSV. So you'll be able to export in Windows-1252 so that Excel can read it correctly natively.
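
Until that option is available, the same conversion can be done on the exported file with a small script. This is a sketch only, not a Data Prep feature: the function name `reencode` and the file names are hypothetical, and characters that do not exist in Windows-1252 are replaced with `?` rather than preserved.

```python
def reencode(data: bytes, src: str = "utf-8", dst: str = "cp1252") -> bytes:
    """Decode bytes from the source encoding and re-encode to the target.

    errors="replace" substitutes "?" for characters the target encoding
    cannot represent, so the conversion is lossy for such characters.
    """
    return data.decode(src).encode(dst, errors="replace")


if __name__ == "__main__":
    import os

    # Hypothetical file names, used only for this example.
    source_path = "export_utf8.csv"
    if os.path.exists(source_path):
        with open(source_path, "rb") as handle:
            converted = reencode(handle.read())
        with open("export_cp1252.csv", "wb") as handle:
            handle.write(converted)
```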

Re: Export to UTF-8 CSV using Data Preparation Tool Free

Good news: in the next Data Prep release (planned for January 2018), you will be able to select the encoding when exporting to CSV. So you'll be able to export in Windows-1252 so that Excel can read it correctly natively.

Hello @gvaznunes, I have exactly the same problem in Talend Open Studio. Do you know if there is another way to solve it there?

Right now the field on the left is the one extracted directly from the table (into a CSV file) from the Oracle DB with CP-1252 encoding, and the field on the right is the one extracted by the Talend job, both with UTF-8 encoding and with the custom encoding CP-1252.

Just a little bit of background about my Talend job: before the extraction to the CSV file, I use a REST call (tRESTClient) via the Application Server to fill the table in the destination database. The data are stored with the correct encoding in the source DB, but when I check the destination DB the data are already stored incorrectly. Is it possible that the Application Server uses a different encoding?
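
One way to test the suspicion that text was mis-decoded somewhere along the chain is to try reversing the mistake on a sample value read from the destination table. The sketch below assumes the specific case of UTF-8 bytes that were read as Windows-1252; the function name `try_undo_mojibake` is invented for this example, and it says nothing about what the Application Server actually does.

```python
def try_undo_mojibake(text: str) -> str:
    """Try to reverse a UTF-8-read-as-Windows-1252 mistake.

    If the text round-trips cleanly (encode as cp1252, decode as UTF-8),
    return the repaired text; otherwise return the input unchanged.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    for value in ["Ã©tÃ©", "été"]:
        print(repr(value), "->", repr(try_undo_mojibake(value)))
```

If a stored value changes into the expected text after this round trip, that suggests one step in the pipeline decoded UTF-8 data as Windows-1252 before writing it.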

Re: Export to UTF-8 CSV using Data Preparation Tool Free

Hello, I'm here again with the same issue. I checked in detail what happens in my job and saw that the REST call on the Application Express works properly: the extracted data are in the correct encoding.