Mailing List Archive

file names
Hi,

since in Windows you can type anything for a file name, and even
in Unix, I wonder whether it would be feasible to clean things up to pure ASCII?
Has anyone done that?
1. convert UTF-8 to ASCII
2. translate space,',;,& to -,c,:,a

This is not a complete list, however. And perhaps wrong, too. :)

Regards, Zdravko
Re: file names [ In reply to ]
On May 20, 2011, at 4:59 AM, Zdravko Balorda wrote:

> HI,
>
> since in Windows you can type anything for a file name, and even
> in Unix, I wonder would it be feasible to clean up things to pure ascii?
> Anyone did that?
> 1. convert UTF8 to ascii
> 2. translate space,',;,& to -,c,:,a
>
> This is not complete list, however. And perhaps wrong, too. :)

If we did this, someone would complain that we weren't preserving their characters. So I don't think it should be limited in any way (except maybe to prevent unprintable characters).

Best,

David
Re: file names [ In reply to ]
It's interesting that you bring this up now; I have recently encountered
a few users whose uploaded media still have the Windows drive letter in
the file name (e.g. C:\myresume.pdf). Something about the escaping of
the backslash causes the file to fail during a publish, but really I'd
rather not have the drive letter stuff in the file name in the first
place. I haven't been able to reproduce this yet, in any case.

Is this what Zdravko is talking about?

On 5/20/2011 11:32 AM, David E. Wheeler wrote:
> On May 20, 2011, at 4:59 AM, Zdravko Balorda wrote:
>
>> HI,
>>
>> since in Windows you can type anything for a file name, and even
>> in Unix, I wonder would it be feasible to clean up things to pure ascii?
>> Anyone did that?
>> 1. convert UTF8 to ascii
>> 2. translate space,',;,& to -,c,:,a
>>
>> This is not complete list, however. And perhaps wrong, too. :)
> If we did this, someone would complain that we weren't preserving their characters. So I don't think it should be limited in any way (except maybe to prevent unprintable characters).
>
> Best,
>
> David
>
Re: file names [ In reply to ]
On May 20, 2011, at 12:19 PM, Nick Legg wrote:

> It's interesting that you bring this up now; I have recently encountered a few users whose uploaded media still have the Windows drive letter in the file name (e.g. C:\myresume.pdf). Something about the escaping of the backslash causes the file to fail during a publish, but really I'd rather not have the drive letter stuff in the file name in the first place. I haven't been able to reproduce this yet, in any case.

This should be protected against by this code:

https://github.com/bricoleurs/bricolage/blob/master/lib/Bric/App/Callback/Profile/Media.pm#L852

Might be worthwhile to put that in for every browser, frankly.

> Is this what Zdravko is talking about?

I don't believe so, no.

Best,

David
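
For illustration only: a minimal Python sketch of the idea discussed above, keeping just the final component of a client-supplied path such as C:\myresume.pdf. It is not a port of the linked Bricolage code, and the function name strip_client_path is hypothetical. It assumes the server may run on Unix while the browser sends a Windows-style path, so it uses ntpath, which understands drive letters as well as both separators.

```python
import ntpath


def strip_client_path(filename: str) -> str:
    """Return only the last path component of an uploaded file name.

    ntpath applies Windows path rules on any platform, so the drive
    letter, backslashes, and forward slashes are all handled.
    """
    return ntpath.basename(filename)


if __name__ == "__main__":
    print(strip_client_path(r"C:\myresume.pdf"))           # myresume.pdf
    print(strip_client_path(r"C:\Users\me\docs\cv.pdf"))   # cv.pdf
    print(strip_client_path("plain.pdf"))                  # plain.pdf
```

Applying this to every upload, rather than only to some browsers, leaves names without a path unchanged.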
Re: file names [ In reply to ]
David E. Wheeler wrote:
> On May 20, 2011, at 12:19 PM, Nick Legg wrote:
>
>> It's interesting that you bring this up now; I have recently encountered a few users whose uploaded media still have the Windows drive letter in the file name (e.g. C:\myresume.pdf). Something about the escaping of the backslash causes the file to fail during a publish, but really I'd rather not have the drive letter stuff in the file name in the first place. I haven't been able to reproduce this yet, in any case.
>
> This should be protected against by this code:
>
> https://github.com/bricoleurs/bricolage/blob/master/lib/Bric/App/Callback/Profile/Media.pm#L852
>
> Might be worthwhile to put that in for every browser, frankly.
>
>> Is this what Zdravko is talking about?
>
> I don't believe so, no.
>
> Best,
>
> David

I was thinking of making file names "slug compliant". Everyone should be fine with that.
Regards, Zdravko.
Re: file names [ In reply to ]
On May 24, 2011, at 5:39 AM, Zdravko Balorda wrote:

>> This should be protected against by this code:
>> https://github.com/bricoleurs/bricolage/blob/master/lib/Bric/App/Callback/Profile/Media.pm#L852
>> Might be worthwhile to put that in for every browser, frankly.
>>> Is this what Zdravko is talking about?
>> I don't believe so, no.
>
> I was thinking on making filenames "slug compliant". Everyone should be fine with that.

Not sure what that means…

Best,

David
Re: file names [ In reply to ]
David E. Wheeler wrote:
> On May 24, 2011, at 5:39 AM, Zdravko Balorda wrote:
>
>>> This should be protected against by this code:
>>> https://github.com/bricoleurs/bricolage/blob/master/lib/Bric/App/Callback/Profile/Media.pm#L852
>>> Might be worthwhile to put that in for every browser, frankly.
>>>> Is this what Zdravko is talking about?
>>> I don't believe so, no.
>> I was thinking on making filenames "slug compliant". Everyone should be fine with that.
>
> Not sure what that means…
>

To translate the file name first to ASCII, then restrict it to only a-z, 0-9, - or _, as is required for a slug?

Zdravko
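
For illustration only: a rough Python sketch of the two steps described above (fold to ASCII, then keep only a-z, 0-9, - and _). The function name slugify_filename, the choice to keep the extension, the "-" replacement character, and the "file" fallback are all assumptions, not anything Bricolage does. Note that the ASCII step is lossy: characters with no ASCII decomposition are simply dropped, which is the kind of loss raised earlier in the thread.

```python
import re
import unicodedata


def _clean(part: str) -> str:
    # Step 1: decompose accented characters (e.g. "é" -> "e" + accent)
    # and drop everything that is not ASCII.
    ascii_part = (
        unicodedata.normalize("NFKD", part)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Step 2: lowercase, then collapse each run of characters outside
    # a-z, 0-9, "-" and "_" into a single "-".
    cleaned = re.sub(r"[^a-z0-9_-]+", "-", ascii_part.lower())
    return cleaned.strip("-")


def slugify_filename(filename: str) -> str:
    """Reduce a file name to slug characters, keeping its extension."""
    stem, dot, ext = filename.rpartition(".")
    if not dot or not stem:
        # No extension at all, or a leading-dot name such as ".htaccess".
        stem, ext = filename, ""
    stem = _clean(stem) or "file"  # assumed fallback if nothing survives
    ext = _clean(ext)
    return f"{stem}.{ext}" if ext else stem


if __name__ == "__main__":
    print(slugify_filename("Žurnal & novice.PDF"))     # zurnal-novice.pdf
    print(slugify_filename("my résumé (final).pdf"))   # my-resume-final.pdf
```

Two different uploads can map to the same slug, so a real implementation would also need some way to keep names unique.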
Re: file names [ In reply to ]
On May 24, 2011, at 10:09 PM, Zdravko Balorda wrote:

> To translate filename first to ascii, then to only a-z, 0-9, - or _, as it is requested for slug?

I think the MEDIA_UNIQUE_FILENAME bricolage.conf directive will get you that.

Best,

David