I am looking for suggestions on how to handle a csv file that is being created, then uploaded by our customers, and that may have a comma in a value, like a company name.
Some of the ideas we are looking at are: quoted Identifiers (value “,” values “,”etc) or using a | instead of a comma. The biggest problem is that we have to make it easy, or the customer won’t do it.
There’s actually a spec for CSV format, RFC 4180 and how to handle commas:
Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.
http://tools.ietf.org/html/rfc4180
So, to have values foo
and bar,baz
, you do this:
foo,"bar,baz"
Another important requirement to consider (also from the spec):
If double-quotes are used to enclose fields, then a double-quote
appearing inside a field must be escaped by preceding it with
another double quote. For example:
"aaa","b""bb","ccc"