Mailing List Archive

OT: how does excel find commas within fields of a csv file?
To clean up csv files I use excel's find/replace to swap the commas
occurring within fields for something benign. How does this magic work?
Different character sets within the same file?

Is it possible to do this with shell scripting?
Re: OT: how does excel find commas within fields of a csv file? [ In reply to ]
On 2/27/24 20:54, Adam Carter wrote:
> To clean up csv files I use excel's find/replace to swap the commas
> occurring within fields for something benign. How does this magic
> work? Different character sets within the same file?
>
> Is it possible to do this with shell scripting?
Once Excel (or LibreOffice) reads in a csv file, the commas are no
longer present, and it just searches within the cells.  It might be
possible for a shell script to do it, but you need to parse the file to
distinguish any commas separating the fields from commas within the
fields.  I'm sure there are plenty of utilities to do this, but it's
certainly not trivial.
Re: OT: how does excel find commas within fields of a csv file? [ In reply to ]
On 28/02/2024 02:17, Jack wrote:
> On 2/27/24 20:54, Adam Carter wrote:
>> To clean up csv files I use excel's find/replace to swap the commas
>> occurring within fields for something benign. How does this magic
>> work? Different character sets within the same file?
>>
>> Is it possible to do this with shell scripting?
> Once Excel (or LibreOffice) reads in a csv file, the commas are no
> longer present, and it just searches within the cells.  It might be
> possible for a shell script to do it, but you need to parse the file to
> distinguish any commas separating the fields from commas within the
> fields.  I'm sure there are plenty of utilities to do this, but it's
> certainly not trivial.
>
The other thing is, look up the definition (such as there is) of CSVs.
Special characters (such as commas) can be quoted. Standard practice as
far as I can tell, is that any cell containing a comma will be
double-quoted, and the quotes are stripped on import.

The other trick I learnt is that to prevent Excel mangling text, you
precede it with a single quote - for example I want eg "+7" in a cell,
so I have to enter '+7.

Cheers,
Wol
Re: OT: how does excel find commas within fields of a csv file? [ In reply to ]
> The other thing is, look up the definition (such as there is) of CSVs.
> Special characters (such as commas) can be quoted. Standard practice as
> far as I can tell, is that any cell containing a comma will be
> double-quoted, and the quotes are stripped on import.
>

Thanks - looks like quoting is the answer.
Re: OT: how does excel find commas within fields of a csv file? [ In reply to ]
On Wed, Feb 28, 2024 at 6:25?PM Adam Carter <adamcarter3@gmail.com> wrote:
>>
>> The other thing is, look up the definition (such as there is) of CSVs.
>> Special characters (such as commas) can be quoted. Standard practice as
>> far as I can tell, is that any cell containing a comma will be
>> double-quoted, and the quotes are stripped on import.
>
>
> Thanks - looks like quoting is the answer.

It might not be something you want to deal with but pretty much
every Python data analysis and machine learning package has
functions for reading and writing CSV files.

- Mark