Mailing List Archive

regular expression question: last occurrence
Hello all,

I want to get the _last_ occurrence of some string in some other string. What
I want exactly is to take the current directory (for example,
/home/gerrit/linuxgames/links/)
and then put everything after the last "/linuxgames" in a string. But when
someone is in the directory "/home/gerrit/linuxgames/cvs/linuxgames", my current
regexp is getting /cvs/linuxgames, and that's not what I want. Now, I have:

import os, re

currentdir = os.getcwd()
mo = re.search('/linuxgames', currentdir)
eind = mo.end()
subdir = currentdir[eind:] + '/'

But that doesn't solve my problem.

kind regards,
Gerrit.


--
The Dutch Linuxgames website. De Nederlandse Linuxgames pagina.
Everything about games on Linux. Alles over spelletjes onder Linux.
Site address: http://linuxgames.nl.linux.org
Personal homepage: http://nl.linux.org/~gerrit/
Re: regular expression question: last occurrence
On Sat, 12 Jun 1999, Gerrit Holl wrote:
> currentdir = os.getcwd()
> mo = re.search('/linuxgames', currentdir)
> eind = mo.end()
> subdir = currentdir[eind:] + '/'
>
> But that doesn't solve my problem.

print os.path.basename(currentdir)
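
For comparison, here is a small sketch of what os.path.basename returns for the two directories from the question. It uses posixpath (the POSIX flavour of os.path) so the result is the same on any platform, and it is written for Python 3. The helper name last_component is made up for this illustration. Note that basename yields the final path component, which is not the same thing as everything after the last "/linuxgames".

```python
import posixpath  # POSIX flavour of os.path, so results match on any OS


def last_component(path):
    """Return the final component of a POSIX-style path (hypothetical helper)."""
    return posixpath.basename(path)


# os.getcwd() normally returns a path without a trailing slash.
print(last_component("/home/gerrit/linuxgames/links"))
print(last_component("/home/gerrit/linuxgames/cvs/linuxgames"))
# A trailing slash makes basename return an empty string.
print(repr(last_component("/home/gerrit/linuxgames/links/")))
```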

Oleg.
----
Oleg Broytmann Netskate/Inter.Net.Ru phd@emerald.netskate.ru
Programmers don't die, they just GOSUB without RETURN.
Re: regular expression question: last occurrence
[Gerrit Holl]
> I want to get the _last_ occurrence of some string in some other
> string.

The best thing is to use string.rfind ("reverse find"):

>>> import string
>>> string.rfind("abcabcabc", "abc")
6
>>> string.rfind("abcabcabc", "xyz")
-1
>>>

Note that (like string.find, its forward-searching relative) it returns -1
if it can't find the string it's looking for.

This is faster and much more obvious than regexp tricks.
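
In Python 3 the string module no longer has these functions; the same search is a method on the string itself. A minimal sketch, assuming a Python 3 interpreter (the variable names are only for illustration):

```python
text = "abcabcabc"

# str.rfind returns the start index of the last occurrence, or -1.
last = text.rfind("abc")
missing = text.rfind("xyz")
print(last, missing)

# str.rindex is the same search, but raises ValueError instead of returning -1.
try:
    text.rindex("xyz")
    raised = False
except ValueError:
    raised = True
print(raised)
```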

> What I want exactly is to take the current directory (for example,
> /home/gerrit/linuxgames/links/) and then put everything after the last
> "/linuxgames" in a string.

>>> have = "/home/gerrit/linuxgames/links/"
>>> want = "/linuxgames"
>>> i = string.rfind(have, want)
>>> have[i+len(want):]
'/links/'
>>>

> But when someone is in the directory
> "/home/gerrit/linuxgames/cvs/linuxgames",
> my current regexp is getting /cvs/linuxgames, and that's not what I want.

So what you want there is an empty string? Continuing the above,

>>> have = "/home/gerrit/linuxgames/cvs/linuxgames"
>>> i = string.rfind(have, want)
>>> have[i+len(want):]
''
>>>
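
One thing to watch with the slicing above: if the wanted string is absent, rfind returns -1 and have[i+len(want):] quietly slices from the wrong place instead of failing. A hedged sketch of a guard, written for Python 3. The function name after_last and the choice to return None when nothing is found are assumptions, not part of the original suggestion:

```python
def after_last(have, want):
    """Return the part of `have` after the last `want`, or None if absent."""
    i = have.rfind(want)
    if i == -1:
        # Without this check, the slice below would start at len(want) - 1.
        return None
    return have[i + len(want):]


print(repr(after_last("/home/gerrit/linuxgames/links/", "/linuxgames")))
print(repr(after_last("/home/gerrit/linuxgames/cvs/linuxgames", "/linuxgames")))
print(repr(after_last("/home/gerrit/other", "/linuxgames")))
```

Whether a missing marker should give None, an empty string, or an exception depends on the caller; None is just one reasonable choice.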


> Now, I have:
>
> currentdir = os.getcwd()
> mo = re.search('/linuxgames', currentdir)
> eind = mo.end()
> subdir = currentdir[eind:] + '/'
>
> But that doesn't solve my problem.

Right, each part of a regexp matches at the leftmost position possible:

>>> m = re.search("a", "aaaaaaaaaaaa")
>>> m.span()
(0, 1)
>>>

The way to *trick* it is to stick ".*" at the front:

>>> m = re.search(".*a", "aaaaaaaaaaaa")
>>> m.span()
(0, 12)
>>>

First the ".*" part matches at the leftmost position possible (which is the
start of the string!). Then the other obscure part of regexps kicks in:
each part of a regexp matches the *longest* string possible such that the
*rest* of the regexp is still able to match. So, above, ".*" matches the
first 11 "a"s, and the "a" in the regexp matches the last "a" in the string.
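
To see that split directly, capturing groups can report which part of the string each piece of the pattern consumed. A small sketch (Python 3; the parentheses are added for illustration only and do not change what the pattern matches):

```python
import re

# Group 1 is the greedy ".*", group 2 is the literal "a" that follows it.
m = re.search("(.*)(a)", "a" * 12)

print(m.span(1))  # the part taken by ".*"
print(m.span(2))  # the part taken by the final "a"
```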

This is subtle! That's why I recommend string.rfind <wink>. Still, you can
trick regexps into working for this:

>>> pattern = re.compile(".*/linuxgames")
>>> for current in ("/home/gerrit/linuxgames/links/",
                    "/home/gerrit/linuxgames/cvs/linuxgames"):
        mo = pattern.search(current)
        eind = mo.end()
        subdir = current[eind:] + '/'
        print current, "->", subdir

/home/gerrit/linuxgames/links/ -> /links//
/home/gerrit/linuxgames/cvs/linuxgames -> /
>>>
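
The same trick in Python 3 syntax, as a rough sketch rather than a drop-in replacement. The helper name subdir_after_last is hypothetical; re.escape is used so the marker is matched literally, and the result is normalised to end in exactly one "/" (an assumption about the desired output, to avoid the "/links//" seen above).

```python
import re


def subdir_after_last(current, marker="/linuxgames"):
    """Return what follows the last `marker` in `current`, ending in one '/'.

    Returns None when `marker` does not occur at all.
    """
    # The greedy ".*" pushes the match for `marker` as far right as possible.
    mo = re.search(".*" + re.escape(marker), current)
    if mo is None:
        return None
    return current[mo.end():].rstrip("/") + "/"


for path in ("/home/gerrit/linuxgames/links/",
             "/home/gerrit/linuxgames/cvs/linuxgames"):
    print(path, "->", subdir_after_last(path))
```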

regexps-the-international-corrupter-of-youth-ly y'rs - tim