Import CSV .ParseNumbers behaves different

I have a .csv file with 3 lines:

#serial-number;device-position;device-identification;created;value-data-count;manufacturer;version;device-type;access-number;status;signature;fabrication-no,,inst-value,0,0,0;volume,m3,inst-value,0,0,0
0016021700;A-1001-VM1 [052000480] [1];20107866;2022-01-31 00:00:00;00;GWF;60;warm water (30°C-90°C);186;0;0;20107866;9,308
0016021700;A-1001-VM1 [052000480] [1];20107866;2022-02-01 00:00:00;00;GWF;60;warm water (30°C-90°C);210;0;0;20107866;9,370

This topic concerns the last value in line-2 and line-3 of the csv:

9,308 and 9,370

If I use v47.0.1345 and I import the csv with “.ParseNumbers=False” the output is the same as the input:

9,308 and 9,370

If I now do the same with the latest release 47.0.1398 the output is:

09/01/0308 00:00:00 and 09/01/0370 00:00:00

The last result is not correct.

Can you fix this?

Hi Ton,

Try setting the new CsvLoadOptions.ParseDates property to false as well.

Or, you can specify exactly what data formats you want to parse, for example:

var options = new CsvLoadOptions(CsvType.SemicolonDelimited);
options.ParseNumbers = false;
options.DateTimeFormats = new string[] { "yyyy-MM-dd hh:mm:ss" };

var workbook = ExcelFile.Load("input.csv", options);
workbook.Save("output.xlsx");

This way the values in the “created” column will still end up being parsed as DateTime values, but those last digits will remain as they were.

I hope this helps.

Regards,
Mario

I could set .ParseDates to false…but what about if the csv contains dates that I want to treat as dates?

And how can a value of “9,308” be interpreted as a date? Doesn’t the default “DateTimeFormats” setting following system defaults? If so, it shouldn’t treat this value as a date!?

Also, why did it work correctly in 47.0.1345 but doesn’t work correctly in 47.0.1398?

I’m afraid if I set ParseDates to False I’ll run into new problems in the future…

BR, Ton.

Please take a look at my second suggestion, the one that sets the DateTimeFormats.

Yes, the default uses the system defaults and it is parsing this value as a date.
Please try the following:

string value = "9,308";
var style = DateTimeStyles.None;
var culture = CultureInfo.InvariantCulture;

Console.WriteLine(DateTime.TryParse(value, culture, style, out DateTime result));
Console.WriteLine(result);

Because the support for reading DateTime values from CSV (and HTML) files was added in the current latest release, see the release notes:
https://www.gemboxsoftware.com/spreadsheet/downloads/bugfixes.html

What kind of problems?

This property didn’t exist before, so if you set it to false it should behave exactly how it previously behaved when it didn’t exist.

Regards,
Mario

Hi Mario,

I’ll add some options to let the client set the .ParseDates option together with the dateformat…they won’t be happy because this means they have to manually edit a lot of workflows which process csv files. In that regard this new option breaks backwards compatibility…because:

if this is a new option and the behaviour of the “old” version is comparable to “.ParseDates=False” then I think it would have been better to let .ParseDates be “False” by default…it seems the default is now “True”…if it was .False by default I would not have run into this issue and my clients wouldn’t need to update all their workflows…

BR, Ton.

Also…it seems that you might get a lot of “false positives” when strings like

9,3
8,13
7,234
etc.

…will be converted to a “DateTimeFormat” when no custom format is set; in that case the default system format will cause all these values to be interpreted as dates…which is not a good thing I think!

I think it would be safer to only let .ParseDates=True work if a valid datetimeformat has been set…and not use the system default…

Hi Ton,

I agree with you, the ParseDates property should have been false by default to keep the previous behavior.
Unfortunately, the debate we internally had has concluded it in another way.

Basically, we were hoping that keeping both parse properties at the same default would be more intuitive. Having one enabled and one disabled by default seems like it could result in confusion.

Nevertheless, I don’t understand why you need to add some options for your clients?
You said that version 47.0.1345 worked for you, so why don’t you just set that new property to false and thus have the same workflow that you previously did…

Regards,
Mario

Hi Ton,

These “false positives” will only occur when you explicitly specify ParseNumbers to false.
I’m currently not sure if that is a more common use case than having it as default.

But anyway, to me this indicates that the user is looking to read all the CSV values as strings, so setting the ParseDates to false will probably be the desired action as well.

Regards,
Mario

Nevertheless, I don’t understand why you need to add some options for your clients?
You said that version 47.0.1345 worked for you, so why don’t you just set that new property to `false` and thus have the same workflow that you previously did…

The answer to that is simple: when I upgraded to the latest release I did not make any changes to my code…which should not be necessary when just updating to a new release.

To get the desired behaviour I would haveto change my code to make sure .ParseDates=False…(ofcourse) I did not do that because I didn’t know that was necessary. The result is that I released an update of my own software which caused the problem for the client(s).

Ofcourse in a new release of my own software I will make sure .ParseDates=False…but the damage was already done in case of some clients.

I that respect I don’t understand the decision to make “.ParseDates=True” the default setting.

Hi Ton,

I apologize for the inconvenience and the bad experience you had with the update.

To be honest this was probably not an expected scenario, if it was the chosen default would probably end up being false.

For now, I believe the ParseDates property will remain as it is unless we receive more complaints like this.

Regards,
Mario

Ton,

I wanted to add something, I think we found the solution that will satisfy both parties.

We’ll change the default DateTimeFormats property to include the formats that we’ll parse by default. We will choose only “reasonable” formats, so those common number values should not end up being parsed.

I hope this will work for you.

Regards,
Mario

Hi Ton,

Please try again with this latest bugfix version:
https://www.gemboxsoftware.com/spreadsheet/nightlybuilds/GBS47v1401.zip

Or this latest NuGet package:
Install-Package GemBox.Spreadsheet -Version 47.0.1401-hotfix

Note, the CsvLoadOptions.DateTimeFormats is now of IList<string> type and by default it includes some ~30 commonly used date formats.

I hope this helps.

Regards,
Mario