Mailing List Archive

Re: Cutting slices
On 3/5/23 17:43, Stefan Ram wrote:
> The following behaviour of Python strikes me as being a bit
> "irregular". A user tries to chop off sections from a string,
> but does not use "split" because the separator might become
> more complicated so that a regular expression will be required
> to find it. But for now, let's use a simple "find":
>
> |>>> s = 'alpha.beta.gamma'
> |>>> s[ 0: s.find( '.', 0 )]
> |'alpha'
> |>>> s[ 6: s.find( '.', 6 )]
> |'beta'
> |>>> s[ 11: s.find( '.', 11 )]
> |'gamm'
> |>>>
>
> . The user always inserted the position of the previous find plus
> one to start the next "find", so he uses "0", "6", and "11".
> But the "a" is missing from the final "gamma"!
>
> And it seems that there is no numerical value at all that
> one can use for "n" in "string[ 0: n ]" to get the whole
> string, isn't it?
>
>

I would agree with 1st part of the comment.

Just noting that string[11:], string[11:None], as well as string[11:16]
work ... as well as string[11:324242]... lol..
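
A quick check of those end values, as a minimal sketch using the string from the original post: an omitted end or None means "to the end", and an end beyond the string is clamped to its length.

```python
s = 'alpha.beta.gamma'

# Omitting the end, or passing None, slices through to the end of the string.
tail_open = s[11:]
tail_none = s[11:None]

# len(s) is 16, so 16 is the numeric end that yields the whole tail.
tail_len = s[11:len(s)]

# An end beyond the string is clamped to len(s) rather than raising an error.
tail_big = s[11:324242]

print(tail_open, tail_none, tail_len, tail_big)
```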
--
https://mail.python.org/mailman/listinfo/python-list
Re: Cutting slices
On 06/03/2023 11.59, aapost wrote:
> On 3/5/23 17:43, Stefan Ram wrote:
>>    The following behaviour of Python strikes me as being a bit
>>    "irregular". A user tries to chop off sections from a string,
>>    but does not use "split" because the separator might become
>>    more complicated so that a regular expression will be required
>>    to find it. But for now, let's use a simple "find":
>> |>>> s = 'alpha.beta.gamma'
>> |>>> s[ 0: s.find( '.', 0 )]
>> |'alpha'
>> |>>> s[ 6: s.find( '.', 6 )]
>> |'beta'
>> |>>> s[ 11: s.find( '.', 11 )]
>> |'gamm'
>> |>>>
>>
>>    . The user always inserted the position of the previous find plus
>>    one to start the next "find", so he uses "0", "6", and "11".
>>    But the "a" is missing from the final "gamma"!
>>    And it seems that there is no numerical value at all that
>>    one can use for "n" in "string[ 0: n ]" to get the whole
>>    string, isn't it?
>>
>>
>
> I would agree with 1st part of the comment.
>
> Just noting that string[11:], string[11:None], as well as string[11:16]
> work ... as well as string[11:324242]... lol..

To expand on the above, answering the OP's second question: the numeric
value is len( s ).

If the repetitive process is required, try a loop like:

>>> start_index = 11  # to cure the issue raised

>>> try:
...     s[ start_index:s.index( '.', start_index ) ]
... except ValueError:
...     s[ start_index:len( s ) ]
...
'gamma'
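
The try/except above handles a single section. Wrapped in an actual loop it might look like the sketch below, where chop() is a hypothetical helper name and the separator is assumed to be a non-empty string.

```python
def chop(s, sep='.'):
    """Yield the sections of s between occurrences of sep.

    Hypothetical helper; sep is assumed to be non-empty.
    """
    start = 0
    while True:
        try:
            end = s.index(sep, start)
        except ValueError:
            # No further separator: the last section runs to len(s).
            yield s[start:len(s)]
            return
        yield s[start:end]
        # Resume the search just past the separator that was found.
        start = end + len(sep)


sections = list(chop('alpha.beta.gamma'))
print(sections)
```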


However, if the objective is to split, then use the function built for
the purpose:

>>> s.split( "." )
['alpha', 'beta', 'gamma']

(yes, the OP says this won't work - but doesn't show why)


If life must be more complicated, but the next separator can be
predicted, then split()'s close relative is partition().
NB: both split() and partition() can be used on the sub-strings produced
by an earlier split() or ..., ie there may be no reason to work strictly
from left to right.
- I can't really help further with this, because the information above
only shows multiple "." characters, and not how multiple separators
might be interpreted.
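
For illustration only, a minimal sketch of partition() and rpartition() on the string from the original post. Both always return a 3-tuple, so a missing separator does not need a -1 test.

```python
s = 'alpha.beta.gamma'

# partition() splits at the first separator and always returns a 3-tuple:
# (text before, the separator itself, text after).
head, sep, rest = s.partition('.')

# When the separator is absent, the whole string comes back as the first item
# and the other two items are empty strings.
last, sep_missing, remainder = 'gamma'.partition('.')

# rpartition() works from the right-hand end instead.
front, _, tail = s.rpartition('.')

print(head, rest, last, front, tail)
```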


A straight-line approach might be to use maketrans() and translate() to
convert all the separators to a single character, eg white-space, which
can then be split using any of the previously-mentioned methods.
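
A minimal sketch of that idea, assuming a made-up sample string with three different separator characters and fields that contain no white-space themselves:

```python
s = 'alpha.beta;gamma,delta'  # made-up sample with three separator characters

# Map each separator character to a single space...
table = str.maketrans('.;,', '   ')
normalised = s.translate(table)

# ...then split on white-space as usual.
parts = normalised.split()
print(parts)
```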


If the problem is sufficiently complicated and the OP is prepared to go
whole-hog, then the Python Standard Library's tokenize module or various
parser libraries may be worth consideration...

--
Regards,
=dn
Re: Cutting slices
On 05/03/2023 22:59, aapost wrote:
> On 3/5/23 17:43, Stefan Ram wrote:
>>    The following behaviour of Python strikes me as being a bit
>>    "irregular". A user tries to chop off sections from a string,
>>    but does not use "split" because the separator might become
>>    more complicated so that a regular expression will be required
>>    to find it. But for now, let's use a simple "find":
>> |>>> s = 'alpha.beta.gamma'
>> |>>> s[ 0: s.find( '.', 0 )]
>> |'alpha'
>> |>>> s[ 6: s.find( '.', 6 )]
>> |'beta'
>> |>>> s[ 11: s.find( '.', 11 )]
>> |'gamm'
>> |>>>
>>
>>    . The user always inserted the position of the previous find plus
>>    one to start the next "find", so he uses "0", "6", and "11".
>>    But the "a" is missing from the final "gamma"!
>>    And it seems that there is no numerical value at all that
>>    one can use for "n" in "string[ 0: n ]" to get the whole
>>    string, isn't it?
>>
>>
>
The final `find` returns -1 because there is no separator after 'gamma'.
So you are asking for
    s[ 11 : -1]
which correctly returns 'gamm'.
You need to test for this condition.
Alternatively you could ensure that there is a final separator:
    s = 'alpha.beta.gamma.'
but you would still need to test when the string was exhausted.
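
As a minimal sketch of that test, using the string from the original post:

```python
s = 'alpha.beta.gamma'

end = s.find('.', 11)   # no '.' at or after index 11, so find() returns -1
truncated = s[11:end]   # same as s[11:-1]: stops before the last character

# Test for the "not found" result and fall back to the end of the string.
if end == -1:
    end = len(s)
whole = s[11:end]

print(truncated, whole)
```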
Best wishes
Rob Cliffe
Re: Cutting slices
On 2023-03-06 00:28, dn via Python-list wrote:
> On 06/03/2023 11.59, aapost wrote:
>> On 3/5/23 17:43, Stefan Ram wrote:
>>>    The following behaviour of Python strikes me as being a bit
>>>    "irregular". A user tries to chop off sections from a string,
>>>    but does not use "split" because the separator might become
>>>    more complicated so that a regular expression will be required
>>>    to find it. But for now, let's use a simple "find":
>>> |>>> s = 'alpha.beta.gamma'
>>> |>>> s[ 0: s.find( '.', 0 )]
>>> |'alpha'
>>> |>>> s[ 6: s.find( '.', 6 )]
>>> |'beta'
>>> |>>> s[ 11: s.find( '.', 11 )]
>>> |'gamm'
>>> |>>>
>>>
>>>    . The user always inserted the position of the previous find plus
>>>    one to start the next "find", so he uses "0", "6", and "11".
>>>    But the "a" is missing from the final "gamma"!
>>>    And it seems that there is no numerical value at all that
>>>    one can use for "n" in "string[ 0: n ]" to get the whole
>>>    string, isn't it?
>>>
>>>
>>
>> I would agree with 1st part of the comment.
>>
>> Just noting that string[11:], string[11:None], as well as string[11:16]
>> work ... as well as string[11:324242]... lol..
>
> To expand on the above, answering the OP's second question: the numeric
> value is len( s ).
>
> If the repetitive process is required, try a loop like:
>
> >>> start_index = 11  # to cure the issue raised
>
> >>> try:
> ...     s[ start_index:s.index( '.', start_index ) ]
> ... except ValueError:
> ...     s[ start_index:len( s ) ]
> ...
> 'gamma'
>
Somewhat off-topic, but...

When there was a discussion about a None-coalescing operator, I thought
that it would've been nice if .find and .rfind returned None instead of -1.

There have been times when I've wanted to find the next space (or
whatever) and have it return the length of the string if absent. That
could've been accomplished with:

s.find(' ', pos) ?? len(s)

Other times I've wanted it to return -1. That could've been accomplished
with:

s.find(' ', pos) ?? -1

(There's a place in the re module where .rfind returning -1 is just the
right value.)

In this instance, slicing with None as the end is just what's wanted.
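
Python has no ?? operator, so the following is only a sketch of the same idea. find_or_none() is a hypothetical helper, not part of the str API, and a conditional expression stands in for the coalescing step.

```python
def find_or_none(s, sub, start=0):
    """Like str.find, but return None rather than -1 when sub is absent.

    Hypothetical helper, not part of the str API.
    """
    pos = s.find(sub, start)
    return None if pos == -1 else pos


s = 'alpha.beta.gamma'
middle = s[6:find_or_none(s, '.', 6)]    # end is 10
last = s[11:find_or_none(s, '.', 11)]    # end is None, i.e. slice to the end

# Without a ?? operator, a conditional expression supplies the fallback.
pos = find_or_none(s, ' ', 0)
end = len(s) if pos is None else pos

print(middle, last, end)
```

Using "pos or len(s)" as the fallback would be a mistake here, because a match at position 0 is falsy and would be treated as "not found".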

Ah, well...
Re: Cutting slices
On 6/03/23 11:43 am, Stefan Ram wrote:
> A user tries to chop off sections from a string,
> but does not use "split" because the separator might become
> more complicated so that a regular expression will be required
> to find it.

What's wrong with re.split() in that case?

--
Greg
RE: Cutting slices
I am not commenting on the technique or why it was chosen, just on the part
where the last search looks for a non-existent period:

s = 'alpha.beta.gamma'
...
s[ 11: s.find( '.', 11 )]

What should "find" do if it hits the end of a string without finding the
period you claim is a divider?

Could that be why gamma got truncated?

Unless you can arrange for a terminal period, maybe you can reconsider the
approach.


Re: Cutting slices
On 05.03.23 at 23:43, Stefan Ram wrote:
> The following behaviour of Python strikes me as being a bit
> "irregular". A user tries to chop off sections from a string,
> but does not use "split" because the separator might become
> more complicated so that a regular expression will be required
> to find it.

OK, so if you want to use an RE for splitting, can you not use
re.split()? It basically works like the built-in splitting in AWK.

>>> s='alphaAbetaBgamma'
>>> import re
>>> re.split(r'A|B|C', s)
['alpha', 'beta', 'gamma']
>>>
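
The same call copes with separators that vary in length. The sample string and pattern below are made up for illustration; the last two lines sketch how the find-and-slice style could be kept with a compiled pattern, whose search() accepts a start position and returns None when nothing matches.

```python
import re

s = 'alpha . beta;gamma ,delta'  # made-up sample input

# Separator: one of '.', ';' or ',' with optional white-space on either side.
separator = re.compile(r'\s*[.;,]\s*')
parts = separator.split(s)
print(parts)

# The find-and-slice style also works with a compiled pattern:
# search() takes a start position and returns None when nothing matches.
match = separator.search(s, 0)
first = s[:match.start()] if match else s
```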


Christian
--
https://mail.python.org/mailman/listinfo/python-list