Mailing List Archive

Handling backspace chars in a string...
I'm in the position of having to process strings with arbitrary
numbers of backspace and newline characters in them. The backspaces
actually get put in the string, so I have to handle removing the
characters that are backspaced over. Currently I'm doing it
something like this (hastily retyped mostly from memory, so forgive any
small errors and/or typos):

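ktr = 0
i = len(string)
while ktr < i:
    if string[ktr] == '\b':
        bBegin = ktr
        backs = 0
        # count the run of consecutive backspaces (stop at end of string)
        while ktr < i and string[ktr] == '\b':
            backs = backs + 1
            ktr = ktr + 1
        if backs > (ktr - backs - 1):  # backs >= prior chars
            # the run erases everything before it
            string = string[bBegin+backs:]
            ktr = 0
        else:
            # drop the run plus the same number of chars before it
            string = string[:bBegin-backs] + string[ktr:]
            ktr = bBegin - backs
        i = len(string)
    ktr = ktr + 1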

This just looked rather messy to me -- I was curious if anyone knows a
better way?
Handling backspace chars in a string... [ In reply to ]
[Purple]
> I'm in the position of having to process strings with arbitrary
> numbers of backspace and newline characters in them.

Your code doesn't appear to care about newlines one way or t'other. Do you
<wink>?

> The backspaces actually get put in the string, so I have to handle
> removing the characters that are backspaced over.
> ... [rather sprawling string + indexing code] ...
> This just looked rather messy to me -- I was curious if anyone knows a
> better way?

Assuming "better" means "less messy" here, lists support appending and
deleting quite naturally and efficiently; like

def stripbs(sin):
    import string
    sout = []
    for ch in sin:
        if ch == '\b':
            del sout[-1:]  # a nop if len(sout) == 0
        else:
            sout.append(ch)
    return string.join(sout, '')

This essentially treats the input string as a sequence of opcodes for a
stack machine, where "\b" means "pop" and anything else means "push me!".
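
For readers on current Python 3, where string.join no longer exists, here is a minimal sketch of the same stack idea using "".join instead. The name strip_backspaces is made up for this illustration and is not from the thread.

```python
def strip_backspaces(text):
    """Apply each '\\b' in text as a 'pop'; every other character is a 'push'."""
    stack = []
    for ch in text:
        if ch == "\b":
            if stack:          # popping an empty stack is a no-op
                stack.pop()
        else:
            stack.append(ch)
    return "".join(stack)


print(repr(strip_backspaces("ab\b\b\bc")))   # 'c'
print(repr(strip_backspaces("cat\br")))      # 'car'
```

The third backspace in the first example finds the stack already empty, so it is simply ignored.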

don't-use-any-indices-and-you-can't-screw-'em-up<wink>-ly y'rs - tim
Handling backspace chars in a string... [ In reply to ]
bwizard@bga.com (Purple) said:
> I'm in the position of having to process strings with arbitrary
> numbers of backspace and newline characters in them. The backspaces
> actually get put in the string, so I have to handle removing the
> characters that are backspaced over.
>
> [one implementation given]
>
> This just looked rather messy to me -- I was curious if anyone knows
> a better way?

Here's one possibility. It uses a regular expression substitution
to replace <any character> + <backspace> with the empty string.
(Note: don't use a raw string for the re; r".\b" will find a character
which is before a word break.) When done, it removes all the
leading backspaces.


import re
# Don't use raw strings here: "\b" must be the backspace character.
# (?!\b) keeps "." from matching a backspace itself, so "ab\b\b\bc"
# becomes "c" rather than "ac".
char_backspace = re.compile("(?!\b).\b")
any_backspaces = re.compile("\b+")

def apply_backspaces(s):
    while 1:
        t = char_backspace.sub("", s)
        if len(s) == len(t):
            # remove any backspaces which may start a line
            return any_backspaces.sub("", t)
        s = t


>>> apply_backspaces("\bQ\b\bAndqt\b\brew Dalkt\br\be")
'Andrew Dalke'


You mentioned something about containing newlines. By default, the
"." re pattern doesn't match a \n, so the above code acts like a
normal tty, and doesn't remove the \n if followed by a backspace. This
is likely the right thing. That's also why I delete any leftover
backspaces, because
"this\n\bthat"

should be the same string as
"this
that"
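
The two points above (raw versus non-raw "\b", and "." not matching a newline) can be checked with a short sketch like the following. The variable names are invented for this illustration.

```python
import re

backspace_pair = re.compile(".\b")    # non-raw string: "\b" is the backspace character
word_boundary = re.compile(r".\b")    # raw string: \b is the regex word-boundary assertion

# The raw pattern matches even when no backspace is present.
print(backspace_pair.search("hello"))             # None
print(word_boundary.search("hello").group())      # o

# "." does not match a newline, so the newline survives the backspace after it.
text = "this\n\bthat"
unchanged = backspace_pair.sub("", text)
print(unchanged == text)                          # True
print(repr(re.sub("\b+", "", unchanged)))         # 'this\nthat'
```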

Andrew Dalke
dalke@acm.org
Handling backspace chars in a string... [ In reply to ]
On Mon, 26 Apr 1999 00:45:32 GMT, "Tim Peters" <tim_one@email.msn.com>
wrote:

>[Purple]
>> I'm in the position of having to process strings with arbitrary
>> numbers of backspace and newline characters in them.
>
>Your code doesn't appear to care about newlines one way or t'other. Do you
><wink>?

I didn't post the code for that bit as it seems easy enough to take
care of processing those with something like
    map(string.strip, string.split(stringWithNewlines, "\n"))

Unless there's a better way to do that too? :)

>> The backspaces actually get put in the string, so I have to handle
>> removing the characters that are backspaced over.
>> ... [rather sprawling string + indexing code] ...
>> This just looked rather messy to me -- I was curious if anyone knows a
>> better way?
>
>Assuming "better" means "less messy" here, lists support appending and
>deleting quite naturally and efficiently; like
>
[code using lists snipped]

>This essentially treats the input string as a sequence of opcodes for a
>stack machine, where "\b" means "pop" and anything else means "push me!".
>
>don't-use-any-indices-and-you-can't-screw-'em-up<wink>-ly y'rs - tim

I may well go that route... Is it any slower or faster to do this
using lists rather than counting up the backspaces and slicing around
the bits that need to be snipped?
Handling backspace chars in a string... [ In reply to ]
[Purple]
> I didn't post the code for that bit [newlines] as it seems easy enough
> to take care of processing those with something like
>     map(string.strip, string.split(stringWithNewlines, "\n"))
>
> Unless there's a better way to do that too? :)

I don't know what you're trying to do. What that code *does* is break the
string into chunks as separated by newlines, strip leading and trailing
whitespace of all kinds from each resulting chunk, and leave the result as
a list of strings. That doesn't sound like what you *wanted* to do, but if
it is I can't think of a better way to do that <wink>.
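
To make that description concrete, here is a small sketch in current Python 3, where string.strip and string.split have become str methods. The sample input is invented.

```python
text_with_newlines = "  first line \n\tsecond\n\n third  "

# Python 3 spelling of map(string.strip, string.split(text, "\n"))
chunks = [chunk.strip() for chunk in text_with_newlines.split("\n")]
print(chunks)   # ['first line', 'second', '', 'third']
```

Note that the empty line survives as an empty string, and that tabs are stripped along with spaces.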

> ...
> [code using lists snipped]
> I may well go that route... Is it any slower or faster to do this
> using lists rather than counting up the backspaces and slicing around
> the bits that need to be snipped?

from time import clock

How big are the strings? What's the expected distribution of backspaces?
Etc. The list code is worst-case linear-time (in practice although not in
theory); the string slicing worst-case quadratic (although "acts linear"
until strings reach a platform-dependent size). There's no definitive
answer to which is faster without a strong characterization of your data.
Don't bother telling me, time it <wink>.
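
One way to do that timing in current Python 3 is sketched below with the standard timeit module. Everything here is an assumption made for illustration: strip_with_slices is a simplified stand-in for the slicing approach (not the code originally posted), and the sample string is a made-up workload, so the numbers only mean something once real data is substituted.

```python
import timeit


def strip_with_list(text):
    """Stack approach: '\\b' pops, anything else pushes."""
    out = []
    for ch in text:
        if ch == "\b":
            del out[-1:]       # no-op when out is empty
        else:
            out.append(ch)
    return "".join(out)


def strip_with_slices(text):
    """Slicing approach: cut out one character/backspace pair at a time."""
    pos = text.find("\b")
    while pos != -1:
        # Drop the backspace and, if there is one, the character before it.
        # Each pass builds a new string, hence the quadratic worst case.
        text = text[:max(pos - 1, 0)] + text[pos + 1:]
        pos = text.find("\b", max(pos - 1, 0))
    return text


# Made-up workload: ordinary text with a backspace after every few characters.
sample = "abcd\b" * 2000

# Both approaches should agree before their speed is compared.
assert strip_with_list(sample) == strip_with_slices(sample)

for func in (strip_with_list, strip_with_slices):
    seconds = timeit.timeit(lambda: func(sample), number=5)
    print(f"{func.__name__}: {seconds:.4f} s for 5 runs")
```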

in-the-end-is-it-faster-to-run-or-walk?-yes-ly y'rs - tim