
One of my coworkers found a data-integrity bug that had been around for 11 years.

He and his team lead had no idea how such a bug slipped through 11 years of peer QA across all the edits to that module/activity.

It was as simple as converting to string and back and hitting an edge case on internationalization with commas vs periods.




My favourite decimal separator i18n bug was in a product where we were reading a time dilation factor from a config file. In the default shipping file (which basically nobody ever changed) it was specified as "1.0". We would then read this using (this is in .NET) Double.Parse(), which honors the user's locale if none is specified, so for all users in Europe, and probably elsewhere, this was interpreted as 10 ('.' is a grouping construct and more or less ignored). So the software was running 10x slower than expected.
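
A minimal sketch of that failure mode, written in Python rather than .NET. The function parse_number and its parameters are hypothetical names made up for illustration, not the .NET API; it only mimics a lenient parser that drops grouping separators before converting:

```python
def parse_number(text: str, decimal_sep: str, group_sep: str) -> float:
    """Lenient parse: drop grouping separators, then normalize the decimal separator."""
    cleaned = text.replace(group_sep, "")        # grouping characters are simply ignored
    cleaned = cleaned.replace(decimal_sep, ".")  # float() expects '.' as the decimal point
    return float(cleaned)


config_value = "1.0"

# Conventions the config file was written with: '.' decimal, ',' grouping.
as_written = parse_number(config_value, decimal_sep=".", group_sep=",")

# Conventions of a user whose locale uses ',' decimal and '.' grouping.
as_german = parse_number(config_value, decimal_sep=",", group_sep=".")

print(as_written, as_german)  # prints: 1.0 10.0
```

The same string yields two different numbers depending only on which separators the parser assumes, which is why config files usually want a fixed, locale-independent format.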


A . would be weird here as well (and we had our share of issues with early online banking systems: think "I want to pay 10 $currency, let's put 10.00" while the bank uses the . as a delimiter for thousands, i.e. 1.000,00, so you meant 10,00).

That said, my favorite localization issues stem from CH/Switzerland. Numbers are formatted as 1'000.00 there (someone correct me), which is especially odd:

1) That's a weird decimal separator. DE uses a , and as far as I know FR uses a , as well. I _assume_ IT uses a , - so why is this a . in CH?

2) The ' really kills a lot of naive parsers and causes quite a bit of frustration in other fields (think OCR - some overly fat 1'000 might turn into a 11000 with bad luck/crappy image quality or (marginally better) lead to 1000 where means 'I think here's a character, but no clue what that might be')


1) This is actually not that clear cut. Both , and . are used and it depends on context which is used where (usually within the same business sector the guidelines are consistent).

2) Actually, I find it much more confusing that people constantly use . and , for two completely different functions and don't get confused about it. Mistaking a . for a , and vice versa is quite easy.

And that doesn't even consider the likely confusion for people coming from countries with different rules.

OTOH a ' is visually highly distinct from both a , and a . so it's always clear that it can't be the decimal separator.


Different currency has different conventions?


I'm confused.

Yes, different conventions exist (in this case the thread drifted to bugs based on locales, so obviously we're talking about different conventions here). I .. don't understand what you're saying.

(and my original post is missing an asterisk in two places, as in 1?000 as OCR result for example - messed up the formatting in the middle of the night)


Sorry - my point was counter to yours about DE, FR and IT having the same conventions, so it's odd CH doesn't too. They all use the same currency, whereas CH doesn't, hence why there'd be a different one (?). E.g. we use £1,000.00 in the UK.


DE, FR and IT have had the same currency for only the last 15 years or so. One would expect them to use the same conventions because of the history and relationships between CH and all of these countries.


I shipped a program with the exact same bug! And I've worked on at least one website with the same bug. ASP.NET uses HTTP headers to determine locale and will parse/format based on it. Which means if someone used the currency specifier for some reason, all of a sudden they're displaying the wrong prices!

Or, if the website reads some string from the back end, like say a config value "DefaultCreditLimit", the user might be able to modify the interpretation of it.

I've come to the conclusion that this automatic behavior that tries to guess at the format required is simply wrong. If a program wants user-specific formatting, it should opt-in.
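
One way to sketch the opt-in idea, in Python rather than .NET. Everything here is an assumption for illustration: the function name format_amount, the locale tags, and the small SEPARATORS table are hypothetical, and a real program would take separators from a proper locale database:

```python
from typing import Optional

# Hypothetical (decimal, grouping) separators per locale tag, for illustration only.
SEPARATORS = {
    "en-US": (".", ","),
    "de-DE": (",", "."),
    "de-CH": (".", "'"),
}


def format_amount(value: float, locale_tag: Optional[str] = None) -> str:
    """Format with two decimals; use locale separators only when the caller asks."""
    if locale_tag is None:
        # Default: fixed, locale-independent form, safe to store and parse back.
        return f"{value:.2f}"
    decimal_sep, group_sep = SEPARATORS[locale_tag]
    # Python's ',' format option always groups with ',' and uses '.' for decimals.
    whole, frac = f"{value:,.2f}".split(".")
    return whole.replace(",", group_sep) + decimal_sep + frac
```

With this shape, format_amount(1234.5) gives the plain "1234.50", and the user-facing form only appears when the call site names a locale explicitly, e.g. format_amount(1234.5, "de-DE").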


Satellite control software: you enter tiny time intervals using scientific notation to command very short thruster bursts for realignment. User A, the usual guy, always uses lowercase 'e', no problem. One day he goes on holiday, user B takes over, and on his first day enters a value using uppercase 'E'. The s/w is not expecting this, and so interprets 7.6E-3 seconds as 7.6 seconds (e.g., exact numbers unknown).

Fortunately, after the predictable mass panic, they were eventually able to relocate the satellite.


> The s/w is not expecting this, and so interprets 7.6E-3 seconds as 7.6 seconds

This sounds extremely sloppy, especially for satellite control software. Anything not fitting exactly the expected format should lead to an error so the operator can immediately correct it, not be silently ignored.
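
A rough sketch of what "reject anything unexpected" could look like, in Python. The accepted grammar and the name parse_interval_seconds are assumptions for illustration; nothing is known about the actual satellite software's input format:

```python
import re

# Assumed grammar: digits, optional fraction, optional exponent with 'e' or 'E'.
_INTERVAL = re.compile(r"[0-9]+(\.[0-9]+)?([eE][+-]?[0-9]+)?")


def parse_interval_seconds(text: str) -> float:
    """Return the interval in seconds, or raise if the whole input does not match."""
    if _INTERVAL.fullmatch(text) is None:
        # Fail loudly so the operator can correct the entry immediately,
        # instead of silently using only the part that happened to parse.
        raise ValueError(f"not a valid interval: {text!r}")
    return float(text)
```

Here "7.6e-3" and "7.6E-3" both parse to the same value, while something like "7.6x-3" raises an error rather than being read as 7.6.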


Are you running FxCop? Specifically CA1305: Specify IFormatProvider (http://msdn.microsoft.com/en-us/library/ms182190.aspx) should help with this class of bugs - or at least make the bug easier for reviewers to catch (CultureInfo.CurrentCulture stands out in config parsing code).


Those are the ones which scare me the most.. when you find them, they're potentially devastating, and you haven't hit them.

Another I've seen is the potential landmine that can go off at ANY time and then one day, 5 years in the future, it hits you at 2AM and you can't figure out WTF is happening.


Right now I'm working on our company's new commissions payment system. Thanks to you I'm going to have nightmares tonight.



