Mailing List Archive

Some pattern annoyance
Hi folks,

I just used Structural Pattern Matching quite intensively and I'm
pretty amazed by the new possibilities.

But see this code, trying to implement Mark Pilgrim's regex
algorithm for Roman literals with SPM:

With constants, I can write

    match seq:
        case "M", "M", "M", "M", *r:
            return 4 * 1000, r

But if I want to use abbreviations by assignment, this is no longer
possible, and I have to write something weird like:

    M = "M"
    match seq:
        case a, b, c, d, *r if M == a == b == c == d:
            return 4 * 1000, r

So what is missing seems to be a notion of const-ness, which
could be dynamically deduced. Am I missing something?

--
Christian Tismer-Sperling :^) tismer@stackless.com
Software Consulting : http://www.stackless.com/
Strandstraße 37 : https://github.com/PySide
24217 Schönberg : GPG key -> 0xFB7BEE0E
phone +49 173 24 18 776 fax +49 (30) 700143-0023
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5MKBWCSVYZKR3S7OVY6KBF6FE7WYB5LC/
Code of Conduct: http://python.org/psf/codeofconduct/
Re: Some pattern annoyance [ In reply to ]
> On 2 Aug 2023, at 12:03, Christian Tismer-Sperling <tismer@stackless.com> wrote:
>
> So what is missing seems to be a notion of const-ness, which
> could be dynamically deduced. Am I missing something?

Try asking for help at https://discuss.python.org/
This list is not for help or ideas; also, it's basically dead.

Barry
Re: Some pattern annoyance [ In reply to ]
On 02.08.23 13:23, Barry wrote:
> Try asking for help at https://discuss.python.org/
> This list is not for help or ideas; also, it's basically dead.


Thanks, Barry.
I thought this list would always stay intact as an alternative
to the web things. How sad!

Cheers -- Chris

Re: Some pattern annoyance [ In reply to ]
Christian Tismer-Sperling writes:

> I thought this list would always stay intact as an alternative
> to the web things. How sad!

The list is alive. You got an immediate answer, did you not? It's
just that almost all of the people who are engaged with discussion
every day have found alternative platforms more productive.

Partly because that's where the other discussants are (the network
externality is undeniably powerful), and partly (I believe) because
effective use of email is a skill that requires effort to acquire.
Popular mail clients are designed to be popular, not to make that
expertise easy to acquire and exercise. Clunky use of email makes
lists much less pleasant for everyone than they could be.

I guess that's sad (I am, after all, a GNU Mailman developer), but
it's reality.

Steve
Re: Some pattern annoyance [ In reply to ]
On Wed, 2 Aug 2023 at 15:24, Stephen J. Turnbull <turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:

> Partly because that's where the other discussants are (the network
> externality is undeniably powerful), and partly (I believe) because
> effective use of email is a skill that requires effort to acquire.
> Popular mail clients are designed to be popular, not to make that
> expertise easy to acquire and exercise. Clunky use of email makes
> lists much less pleasant for everyone than they could be.
>
> I guess that's sad (I am, after all, a GNU Mailman developer), but
> it's reality.
>

Personally, I'm sad because some people whose contributions I enjoy (you
being one of them :-)) didn't move to Discourse. But like you say, it's how
things are.

Christian - you can make named constants using class attributes (or an
enum):

    class A:
        M = "M"

    match seq:
        case A.M, A.M, A.M, A.M, *r:
            return 4*1000, r

Basically, the "names are treated as variables to assign to" rule doesn't
apply to attributes.

I'm not sure how helpful that is (it's not particularly *shorter*) but I
think the idea was that most uses of named constants in a match statement
would be enums or module attributes. And compromises had to be made.

Cheers,
Paul
Re: Some pattern annoyance [ In reply to ]
On 02.08.23 18:30, Paul Moore wrote:
> Christian - you can make named constants using class attributes (or an
> enum):
>
> class A:
>     M = "M"
>
> match seq:
>     case A.M, A.M, A.M, A.M, *r:
>         return 4*1000, r
>
> Basically, the "names are treated as variables to assign to" rule
> doesn't apply to attributes.

Thanks a lot, everybody!

I have tried a lot now. Using classes is more readable
but, funnily, slower! The clumsy if-guards felt slow but aren't.

Then I even generated functions with everything as constants,
and now the SPM version in fact slightly outperforms the regex!

But in the end I found an even faster, correct algorithm
by a different approach, which now ends this story :)

Going to the Discourse site now.

Cheers -- Chris
Re: Some pattern annoyance [ In reply to ]
Hi Chris,

Nice to see you on the list.

While this is definitely off-topic, I trust I might be given license by the
list's few remaining readers to point out that the match-case construct is
for _structural_ pattern matching. As I wrote in the latest Nutshell:
"Resist the temptation to use match unless there is a need to analyse the
_structure_ of an object."

I don't believe it's accidental that match-case sequence patterns won't
match str, bytes or bytearray objects - regexen are the tool already
optimised for that purpose, so it's quite impressive that you are
managing to approach the same level of performance!

Kind regards,
Steve


On Wed, 2 Aug 2023 at 18:26, Christian Tismer-Sperling <tismer@stackless.com>
wrote:

> Then I generated functions even, with everything as constants,
> and now the SPM version in fact out-performs the regex slightly!
Re: Some pattern annoyance [ In reply to ]
Hi Steve,

Yes, I am well aware that this regex example is not well suited for SPM.
This was a proof of concept. Pushing things to the extreme is my
way of understanding things deeply, so this was something I needed.

For some reason, I love and hate regex. I hate it because it is
unpythonic, char only and ugly. I love it because it is fast, and by the
use of the verbose flag also quite readable.

But getting rid of regex in favor of something even more capable is
a long-standing wish that is not yet fulfilled, because the nature of
both features is (still) pretty different.

I would love to have similar building blocks as in regex, but with a
pythonic syntax, and extending the basic string matching to general
objects. At the moment I don't see this in SPM because basic
flexible patterns are missing. The only flexible thing in sequences is
the star operator, but in my example it is always used up by the need
for an open end in the pattern. This is something that might improve.

As a drive-by, while looking into the Pilgrim algorithm for Roman
literals, I found a faster algorithm by chance :)
Not only is my SPM craziness now really faster than the regex
solution, but I also found something better, based on the `toRoman`
part of Pilgrim's algorithm :D

Starting from one of the basic algorithms on the internet, which are
fast but incomplete, this is much faster than using regex:

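    import roman  # provides toRoman

    def from_roman_fastest(numeral):
        # from_roman_numeral (the fast, unchecked converter) and
        # InvalidRomanNumeralError are assumed to be defined elsewhere.
        if numeral == 'N':
            return 0
        num = from_roman_numeral(numeral)
        # Round trip: re-encode the number and compare with the input.
        cmp = roman.toRoman(num)
        if numeral != cmp:
            raise InvalidRomanNumeralError(f"Invalid Roman numeral: {numeral}")
        return num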

This follows the old observation "Listening is much harder than talking",
so this algorithm does not try a complex solution, but uses a simple one
and checks if the input string was correctly reconstructed.
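
The same round-trip idea as a self-contained sketch. Everything here is a
stand-in written for illustration: `to_roman`, `from_roman_unchecked` and
`from_roman_checked` are hypothetical names, not the helpers used above and
not the API of the `roman` package, and `ValueError` replaces the custom
exception:

```python
_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}
_PAIRS = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
          (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
          (5, "V"), (4, "IV"), (1, "I")]


def to_roman(num):
    """Canonical encoder ("talking"): greedily emit the largest symbol."""
    out = []
    for value, symbol in _PAIRS:
        while num >= value:
            out.append(symbol)
            num -= value
    return "".join(out)


def from_roman_unchecked(numeral):
    """Fast, permissive decoder: subtract a symbol if a larger one follows."""
    total = 0
    for cur, nxt in zip(numeral, numeral[1:] + " "):
        value = _VALUES[cur]  # KeyError for characters that are not symbols
        total += -value if _VALUES.get(nxt, 0) > value else value
    return total


def from_roman_checked(numeral):
    """Decode, then accept only if re-encoding reproduces the input."""
    try:
        num = from_roman_unchecked(numeral)
    except KeyError:
        raise ValueError(f"Invalid Roman numeral: {numeral}") from None
    if num <= 0 or to_roman(num) != numeral:
        raise ValueError(f"Invalid Roman numeral: {numeral}")
    return num
```

The permissive decoder happily turns "IIII" into 4, but re-encoding 4 gives
"IV", so the mismatch exposes the malformed input without any explicit
grammar check.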

Cheers -- Chris


On 02.08.23 22:30, Steve Holden wrote:
> I don't believe it's accidental that match-case sequence patterns won't
> match str, bytes or bytearray objects - regexen are the tool already
> optimised for that purpose, so it's quite impressive that you are
> managing to approach the same level of performance!
