Mailing List Archive: try vs. has_key()

try vs. has_key()

Apr 22, 1999, 3:48 PM

Post #1 of 20 (4009 views)

I've seen roughly half the people here doing

try:
dict[key].append(foo)
except:
dict[key]=[foo]

with the other half doing

if dict.has_key(key):
dict[key].append(foo)
else:
dict[key]=[foo]

Can people explain their preferences?
--
--- Aahz (@netcom.com)

Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het

"You often don't really understand the problem until after the first
time you implement a solution." - Eric S. Raymond

try vs. has_key() [ In reply to ]

Apr 22, 1999, 5:16 PM

Post #2 of 20 (3987 views)

My experience shows that throwing an exception is slower.

Aahz Maruch <aahz@netcom.com> wrote in message
news:aahzFAM4oJ.M7M@netcom.com...
> I've seen roughly half the people here doing
>
> try:
> dict[key].append(foo)
> except:
> dict[key]=[foo]
>
> with the other half doing
>
> if dict.has_key(key):
> dict[key].append(foo)
> else:
> dict[key]=[foo]
>
> Can people explain their preferences?
> --
> --- Aahz (@netcom.com)
>
> Hugs and backrubs -- I break Rule 6 <*>
http://www.rahul.net/aahz/
> Androgynous poly kinky vanilla queer het
>
> "You often don't really understand the problem until after the first
> time you implement a solution." - Eric S. Raymond

try vs. has_key() [ In reply to ]

dalke at bioreason

Apr 22, 1999, 6:15 PM

Post #3 of 20 (3989 views)

Aahz Maruch <aahz@netcom.com> asked:
> Can people explain their preferences [for try/except over has_key]?

As I recall, has_key is a relatively new addition to Python
so older code (and people who learned from older code) will
use try/except.

I usually use has_key because coming from a shop where most
people have C experience, it's easier to explain that than using
what it in essense the side effect of exceptions.

Also, I believe Barry Warsaw did a benchmark on the tradeoffs of
when you do one over the other. The exception case of a try/except
has some pretty high overhead, while the has_key call is slightly
worse than the non-exception case of the try/except. Barry's
results showed that if 5% of the cases or less were "exceptional"
then try/except is faster, otherwise, use has_key. Alas, I cannot
find the URL for that paper.

Andrew
dalke@acm.org

try vs. has_key() [ In reply to ]

gmcm at hypernet

Apr 22, 1999, 6:55 PM

Post #4 of 20 (3997 views)

Aahz asks:

> I've seen roughly half the people here doing
>
> try:
> dict[key].append(foo)
> except:
> dict[key]=[foo]
>
> with the other half doing
>
> if dict.has_key(key):
> dict[key].append(foo)
> else:
> dict[key]=[foo]
>
> Can people explain their preferences?

I have done both. Option 1 requires slightly less typing, but is only
better when you (in practice) have a dict with a small number of keys
and rather longish lists. (In Python, "try" is damned near free, and
"except" is a lot cheaper than, say, C++'s "catch", but still costs a
good deal more than has_key.)

Conscientious practice of option 2, of course, allows you to look St.
Peter in the eye and demand entrance without fear of contradiction...

- Gordon

try vs. has_key() [ In reply to ]

Apr 22, 1999, 7:47 PM

Post #5 of 20 (3988 views)

Personally I really like the new-ish "get" method of dictionaries, which lets you write

dict[key] = dict.get(key,[]) + [foo]

I don't know about performance implications but it's a nice terse construct.

try vs. has_key() [ In reply to ]

bwarsaw at cnri

Apr 22, 1999, 8:31 PM

Post #6 of 20 (3993 views)

>>>>> "AD" == Andrew Dalke <dalke@bioreason.com> writes:

AD> Barry's results showed that if 5% of the cases or
AD> less were "exceptional" then try/except is faster, otherwise,
AD> use has_key. Alas, I cannot find the URL for that paper.

<http://www.python.org/~bwarsaw/ecshort.pdf> (sorry, I lost the
on-line version of the longer paper)

I did the tests as part of my presentation at IPC6, against Python
1.5. While the numbers and exact trade-off might be different now, I
believe the general recommendations are still valid. If you expect to
get a hit on the key less than ~90% of the time, use has_key(). The
one pattern where you might be better off with try/except would be if
you're using a dictionary as a cache, where you'll only get a miss
once.

Note that since then, dictionaries grew a get() method, which will
usually be better than both has_key() and try/except, depending on the
application. E.g.:

dict[counterkey] = dict.get(counterkey, 0) + 1

With only one dictionary access and no exceptions, this is about as
fast as you're gonna get.

-Barry

try vs. has_key() [ In reply to ]

Apr 22, 1999, 8:31 PM

Post #7 of 20 (3990 views)

From: "Barry A. Warsaw" <bwarsaw@cnri.reston.va.us>
Subject: Re: try vs. has_key()

>>>>> "AD" == Andrew Dalke <dalke@bioreason.com> writes:

AD> Barry's results showed that if 5% of the cases or
AD> less were "exceptional" then try/except is faster, otherwise,
AD> use has_key. Alas, I cannot find the URL for that paper.

<http://www.python.org/~bwarsaw/ecshort.pdf> (sorry, I lost the
on-line version of the longer paper)

I did the tests as part of my presentation at IPC6, against Python
1.5. While the numbers and exact trade-off might be different now, I
believe the general recommendations are still valid. If you expect to
get a hit on the key less than ~90% of the time, use has_key(). The
one pattern where you might be better off with try/except would be if
you're using a dictionary as a cache, where you'll only get a miss
once.

Note that since then, dictionaries grew a get() method, which will
usually be better than both has_key() and try/except, depending on the
application. E.g.:

dict[counterkey] = dict.get(counterkey, 0) + 1

With only one dictionary access and no exceptions, this is about as
fast as you're gonna get.

-Barry

--
|Fidonet: UUCP 2:500/3.1
|Internet: UUCP@p1.f3.n500.z2.hccfido.hcc.nl
|
| Standard disclaimer: The views of this user are strictly his own.

try vs. has_key() [ In reply to ]

mzadka at geocities

Apr 23, 1999, 4:13 AM

Post #8 of 20 (3988 views)

Note:
This article is a non-commercial advertisement for the ``get'' method
of dictionary objects.
Brought to you by the object None and the method .append.

On Thu, 22 Apr 1999, Darrell wrote:

> My experience shows that throwing an exception is slower.
>
> Aahz Maruch <aahz@netcom.com> wrote in message
> news:aahzFAM4oJ.M7M@netcom.com...
> > I've seen roughly half the people here doing
> >
<snipped try/except idiom for updating a dictionary>
<snipped has_key idiom for updating a dictionary>

It depends on the expected hit/miss ratio.

If you have many hits, few misses -- use the first
Few hits, many misses -- use the second.

Best way is to use
(for example, counting)

d={}
for word in words:
d[word]=d.get(word, 0)+1

Or, for logging:
d={}
for word in words:
first_two=word[:2]
d[first_two]=d.get(first_two, []).append(word)

Unfortunately, few people seem to know about the ``get'' method, which
is really good.
From the docs:

a.get(k[, f]) the item of a with key k (4)
(4)
Never raises an exception if k is not in the map, instead it
returns f. f is optional, when not provided and k is not in the
map, None is returned.

This makes dictionary types behave in a Perl-hash-like manner, which is
sometimes a good thing.

Note that this idiom is (I think) more efficient, and shorter.
--
Moshe Zadka <mzadka@geocities.com>.
QOTD: What fun to me! I'm not signing permanent.

try vs. has_key() [ In reply to ]

tismer at appliedbiometrics

Apr 23, 1999, 6:16 AM

Post #9 of 20 (3996 views)

Aahz Maruch wrote:
>
> I've seen roughly half the people here doing
>
> try:
> dict[key].append(foo)
> except:
> dict[key]=[foo]
>
> with the other half doing
>
> if dict.has_key(key):
> dict[key].append(foo)
> else:
> dict[key]=[foo]
>
> Can people explain their preferences?

Well, besides the other good answers, there's some more.

From a design point of view, both versions have their place.
The has_key version reflects the expectation of many missing keys,
and it is not exceptional. The try..except version suggests
that you usually will find a key, while a non-existant key
is exceptional.
I think, the dict.get variant is just a more efficient
compromize which is somewhere between.
It is also a matter of taste.

But on execution speed, there are the measures of Barry's paper,
and he is still right, although I would need measures for dict.get
as well.

Note that a loop running over dict.get calls is still a bit
slower than indexing the dict, due to some internal method
access overhead. It even applies when the dict.get method
is assigned to a local variable. Indexing is the fastest
way to access a dict element when you expect no key errors.

I have no exact measures yet, but will provide some, soon.

To optimize things if you can expect a very small number of
key failures, a different approach can help to get the
last cycles squeezed out of the code:
If it is possible for your algorithm, you can put the
try..except clause outside of some loop.
This saves a statement and the exception clause setup as well.
Your code must be changed quite heavily, since it has to
restart the loop where it failed, but there are a number of
cases where I need this kind of optimization.

Here an example, which doesn't try to look well-designed,
but run as fats as possible. I recommend to provide
a second, more readable version of the function for
documentation.

Assuming you have code which runs in a loop and tries to collect
new elements as your code does, here a good, "nearly fast"
version, one with "get", and a faster, ugly one:

def count_many(dict, itemlist):
for key, value in itemlist:
if dict.has_key(key):
dict[key].append(value)
else:
dict[key] = [value]

def count_many_get(dict, itemlist):
for key, value in itemlist:
lis = dict.get(key)
if lis :
lis.append(value)
else:
dict[key] = [value]

def count_many_fast(dict, itemlist):
while 1:
try:
k = 1
for key, value in itemlist:
dict[key].append(value) ; k = k+1
break
except KeyError:
dict[key] = [value]
itemlist = itemlist[k:]

results = """
# given a list of 1000 tuples, and appending
# this 1000 times, we measure

>>> d={} ; print timing(count_many, (d, list) ,1000)[0]
17.8
>>> d={} ; print timing(count_many_get, (d, list) ,1000)[0]
16.15
>>> d={} ; print timing(count_many_fast, (d, list) ,1000)[0]
13.73
>>>
"""

ciao - chris

--
Christian Tismer :^) <mailto:tismer@appliedbiometrics.com>
Applied Biometrics GmbH : Have a break! Take a ride on Python's
Kaiserin-Augusta-Allee 101 : *Starship* http://starship.python.net
10553 Berlin : PGP key -> http://wwwkeys.pgp.net
PGP Fingerprint E182 71C7 1A9D 66E9 9D15 D3CC D4D7 93E2 1FAE F6DF
we're tired of banana software - shipped green, ripens at home

try vs. has_key() [ In reply to ]

Apr 23, 1999, 6:44 AM

Post #10 of 20 (3986 views)

Thanks for all the responses, it's a lot clearer now.

As a side note, people who cc their posts by e-mail should stick the
phrase "[posted and e-mailed]" at the top; it's disconcerting when one
doesn't know whether something is intended to be private, and e-mail
almost always arrives faster than news.
--
--- Aahz (@netcom.com)

Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het

"You often don't really understand the problem until after the first
time you implement a solution." - Eric S. Raymond

try vs. has_key() [ In reply to ]

cfelling at iae

Apr 27, 1999, 12:05 PM

Post #11 of 20 (4003 views)

Hai Moshe Zadka,

sorry to react so late, but i'm a newbie to python, so it took me some time
to realise that i really don't understand what you are doing:)

In article <Pine.SUN.3.95-heb-2.07.990423140345.21577A-100000@sunset.ma.huji.ac.il> you wrote:
> Note:
> This article is a non-commercial advertisement for the ``get'' method
> of dictionary objects.
> Brought to you by the object None and the method .append.

> d={}
> for word in words:
> d[word]=d.get(word, 0)+1

> Or, for logging:
> d={}
> for word in words:
> first_two=word[:2]
> d[first_two]=d.get(first_two, []).append(word)

this is the statement where i get lost. and when i try it with python 1.5.1
it doesn't work either. As far as i understand the append function it doesn't
return anything as it is just a shorthand for an assignment which to my
knowledge even in python 1.5.2 doesn't return values. or am i missing
something here? nonetheless, it looks great, and i sure hope it works too.

--
groetjes, carel

try vs. has_key() [ In reply to ]

Apr 27, 1999, 4:17 PM

Post #12 of 20 (3990 views)

In article <7g51q6$1pt$1@vvs.superst.iae.nl>,
Carel Fellinger <cfelling@iae.nl> wrote:
>In article <Pine.SUN.3.95-heb-2.07.990423140345.21577A-100000@sunset.ma.huji.ac.il> you wrote:
>>
>> d={}
>> for word in words:
>> first_two=word[:2]
>> d[first_two]=d.get(first_two, []).append(word)
>
>this is the statement where i get lost. and when i try it with python 1.5.1
>it doesn't work either. As far as i understand the append function it doesn't
>return anything as it is just a shorthand for an assignment which to my
>knowledge even in python 1.5.2 doesn't return values. or am i missing
>something here? nonetheless, it looks great, and i sure hope it works too.

It appears that the second parameter for get() (to specify a value
different from None when the key is not found) is new to 1.5.2. In
1.5.1 you still have to do the following:

d={}
for word in words:
first_two=word[:2]
if d.get(first_two) is None :
d[first_two]=[word]
else :
d[first_two].append(word)

(I think, without testing the code)
--
--- Aahz (@netcom.com)

Hugs and backrubs -- I break Rule 6 <*> http://www.rahul.net/aahz/
Androgynous poly kinky vanilla queer het

Hi! I'm a beta for SigVirus 2000! Copy me into your .sigfile!

try vs. has_key() [ In reply to ]

Apr 28, 1999, 4:16 AM

Post #13 of 20 (3990 views)

aahz@netcom.com (Aahz Maruch) writes:
> It appears that the second parameter for get() (to specify a value
> different from None when the key is not found) is new to 1.5.2. In
> 1.5.1 you still have to do the following:

Not quite.

mstenber@melkki ~>python
Python 1.5.1 (#1, Nov 10 1998, 13:21:44) [GCC 2.7.2.3] on linux2
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> {}.get("foo","bar")
'bar'
>>>

-Python 1.5-ly yours, Markus

--
"Schizophrenia, like the Unicorn, is described in various ways by
various people. And its elimination, like the capture of the Unicorn,
appears to require some special conditions, not all of which have yet
been specified. Finally, whether or not those conditions exist, belief
in their existence has had significant consequences."
- Schizophrenia, Behavioural Aspects (Salzinger, 1972)

try vs. has_key() [ In reply to ]

William.H.Duquette at jpl

Apr 28, 1999, 7:58 AM

Post #14 of 20 (3992 views)

On Fri, 23 Apr 1999 14:13:59 +0300, Moshe Zadka
<moshez@math.huji.ac.il> wrote:

>d={}
>for word in words:
> first_two=word[:2]
> d[first_two]=d.get(first_two, []).append(word)
>
>Unfortunately, few people seem to know about the ``get'' method, which
>is really good.

This doesn't seem to work. For example, here's a python
interpreter session:

>>> d = {}
>>> a = 'Foo'
>>> d[a] = d.get(a, []).append('Bar')
>>> d
{'Foo': None}
>>>

I'd have expected to see {'Foo': 'Bar'}, but that's not what I get.

I'm using Python 1.5.2, by the way.

Will Duquette
--------------------------------------------------------------------------
Will Duquette, JPL | William.H.Duquette@jpl.nasa.gov
But I speak only | http://eis.jpl.nasa.gov/~will (JPL Use Only)
for myself. | It's amazing what you can do with the right tools.

try vs. has_key() [ In reply to ]

akuchlin at cnri

Apr 28, 1999, 8:11 AM

Post #15 of 20 (3990 views)

William H. Duquette writes:
>>>> d = {}
>>>> a = 'Foo'
>>>> d[a] = d.get(a, []).append('Bar')
>>>> d
>{'Foo': None}
>I'd have expected to see {'Foo': 'Bar'}, but that's not what I get.

The .append() method only returns None, not the list you've
just appended to.

>>> L = []
>>> print L.append(2)
None
>>> L
[2]

You'd want something like:

dummy = d[a] = d.get(a, [])
dummy.append('Bar')

--
A.M. Kuchling http://starship.python.net/crew/amk/
When I originally designed Perl 5's OO, I thought about a lot of this stuff,
and chose the explicit object model of Python as being the least confusing. So
far I haven't seen a good reason to change my mind on that.
-- Larry Wall, 27 Feb 1997 on perl5-porters

try vs. has_key() [ In reply to ]

Apr 28, 1999, 8:13 AM

Post #16 of 20 (3993 views)

>>>>> "AM" == Aahz Maruch <aahz@netcom.com> writes:

AM> It appears that the second parameter for get() (to specify a
AM> value different from None when the key is not found) is new to
AM> 1.5.2. In 1.5.1 you still have to do the following:

That's not correct. The optional second argument for get has been
around since get was introduced, which was before the release of 1.5
final. (You check the CVS log for dictobject.c for confirmation :-).
If you use get, you can always specify a default value.

This thread seems to have included a lot of code that misuses
dictionaries and other built-in types, which is probably why you
concluded that get changed between 1.5.1 and 1.5.2.

I think the confusion started when the following bits of code were
posted:

d={}
for word in words:
d[word]=d.get(word, 0)+1

The above part is just fine -- an efficient way to count words.

d={}
for word in words:
first_two=word[:2]
d[first_two]=d.get(first_two, []).append(word)

This second bit doesn't work because the append method on list objects
returns None. As a result, the first time a new value for first_two
appears None will be assigned to that key in the dictionary. The
second time that value of first_two shows up, None will be returned by
d.get. Then the code will raise an AttributeError, because None
doesn't have an append method.

The following code would be correct:

d={}
for word in words:
first_two=word[:2]
d[first_two]= temp = d.get(first_two, [])
temp.append(word)

It would be less efficient, however, than the following:

d={}
for word in words:
first_two=word[:2]
if d.has_key(first_two):
d[first_two].append(word)
else:
d[first_two] = [word]

(And as Barry pointed out earlier in the thread, depending on how
frequently you expect has_key to return true, you may want to use
try/except instead. See http://www.python.org/~bwarsaw/ecshort.pdf.)

Jeremy

try vs. has_key() [ In reply to ]

William.H.Duquette at jpl

Apr 28, 1999, 3:23 PM

Post #17 of 20 (3991 views)

On Wed, 28 Apr 1999 14:58:13 GMT, William.H.Duquette@jpl.nasa.gov
(William H. Duquette) wrote:

>>>> d = {}
>>>> a = 'Foo'
>>>> d[a] = d.get(a, []).append('Bar')
>>>> d
>{'Foo': None}
>>>>
>
>I'd have expected to see {'Foo': 'Bar'}, but that's not what I get.

As someone pointed out, append() returns None, explaining
the result I got. However, the following code works just fine:

d = {}
a = 'foo'

# Append an entry to d[a], whether it has been initialized or not:
d[a] = d.get(a, [])
d[a].append('Bar')

Will

--------------------------------------------------------------------------
Will Duquette, JPL | William.H.Duquette@jpl.nasa.gov
But I speak only | http://eis.jpl.nasa.gov/~will (JPL Use Only)
for myself. | It's amazing what you can do with the right tools.

try vs. has_key() [ In reply to ]

Apr 29, 1999, 7:44 AM

Post #18 of 20 (3988 views)

>>>>> "WD" == William H Duquette <William.H.Duquette@jpl.nasa.gov> writes:

WD> As someone pointed out, append() returns None, explaining the
WD> result I got. However, the following code works just fine:
d = {}
a = 'foo'

# Append an entry to d[a], whether it has been initialized or not:
d[a] = d.get(a, [])
d[a].append('Bar')

However, you're doing two dictionary lookups when only one is
necessary. It would be more efficient to store the results of the get
in a temporary variable -- temp = d[a] = d.get(a, []) -- and use the
temporary variable for the append.

Jeremy

try vs. has_key() [ In reply to ]

jepler at inetnebr

Apr 29, 1999, 6:13 PM

Post #19 of 20 (3988 views)

On Wed, 28 Apr 1999 11:13:28 -0400 (EDT), Jeremy Hylton
<jeremy@cnri.reston.va.us> wrote:
> d={}
> for word in words:
> first_two=word[:2]
> d[first_two]=d.get(first_two, []).append(word)
>
>This second bit doesn't work because the append method on list objects
>returns None. As a result, the first time a new value for first_two
>appears None will be assigned to that key in the dictionary. The
>second time that value of first_two shows up, None will be returned by
>d.get. Then the code will raise an AttributeError, because None
>doesn't have an append method.
>
>The following code would be correct:
>
> d={}
> for word in words:
> first_two=word[:2]
> d[first_two]= temp = d.get(first_two, [])
> temp.append(word)

what about
d[first_two] = d.get(first_two, [])+[word]
? Or is list construction and addition going to be enough more expensive
than the function call to make this a lose as well?

Jeff

try vs. has_key() [ In reply to ]

Apr 30, 1999, 8:08 AM

Post #20 of 20 (3989 views)

>>>>> "JE" == Jeff Epler <jepler@inetnebr.com> writes:

>>The following code would be correct:
>>
>> d={}
>> for word in words:
>> first_two = word[:2]
>> d[first_two]= temp = d.get(first_two, [])
>> temp.append(word)

JE> what about d[first_two] = d.get(first_two, [])+[word] ? Or is
JE> list construction and addition going to be enough more expensive
JE> than the function call to make this a lose as well?

Yeah. Concatenating two lists results in a new list being created
every time (and you're already creating a new list containing the
value of word). Two list allocs is definitely more expensive that an
append.

Jeremy