Mailing List Archive

Using PHP's assert in MediaWiki code
Dear all,

should we allow using PHP's assert [1] in MediaWiki code?

It would allow us to formulate and automatically verify conditions
about code, while at the same time providing readable documentation of
code for free.

Possible example use cases would be:
- automatically verifiable documentation of code's intent
- guarding against logic pitfalls like forgetting to set a variable in
all branches of switches, if/else cascades
- guarding against using uninitialized variables
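
As a rough sketch of the second use case (signLabel() is a made-up
function for illustration, not MediaWiki code), an assert placed after
an if/else cascade both documents and checks that every branch set the
variable:

```php
<?php
// Hypothetical example function, not taken from MediaWiki.
function signLabel( $n ) {
    if ( $n > 0 ) {
        $label = 'positive';
    } elseif ( $n < 0 ) {
        $label = 'negative';
    } else {
        $label = 'zero';
    }
    // Documents the intent: every branch above must have set $label.
    // The check only runs while assertion checking is active.
    assert( isset( $label ) );
    return $label;
}

print signLabel( -3 ) . "\n";
```

If someone later adds a branch and forgets to set $label, the assert
flags it during development instead of letting an unset variable leak
into the caller.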

What do you think?


Kind regards,
Christian


P.S.: For typical MediaWiki use cases, PHP's assert is even faster
than throwing exceptions behind 'if'-guards.


[1] http://php.net/manual/en/function.assert.php
(Not to be confused with PHPUnit's assertion functions, which solve a
different problem.)




--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Gruendbergstrasze 65a Email: christian@quelltextlich.at
4040 Linz, Austria Phone: +43 732 / 26 95 63
Fax: +43 732 / 26 95 63
Homepage: http://quelltextlich.at/
---------------------------------------------------------------
Re: Using PHP's assert in MediaWiki code
On 18/03/12 20:37, Christian Aistleitner wrote:
> Dear all,
>
> should we allow using PHP's assert [1] in MediaWiki code?
>
> It would allow us to formulate and automatically verify conditions
> about code, while at the same time providing readable documentation of
> code for free.
>
> Possible example use cases would be:
> - automatically verifiable documentation of code's intent
> - guarding against logic pitfalls like forgetting to set a variable in
> all branches of switches, if/else cascades
> - guarding against using uninitialized variables
>
> What do you think?

We use exceptions for that.

> P.S.: For typical MediaWiki use cases, PHP's assert is even faster
> than throwing exceptions behind 'if'-guards.

That's funny, for me "if" is about 10 times faster than assert() in
the non-throwing case. Micro-optimisation in PHP usually revolves
around minimising the number of function calls, since a function call
is relatively complex and expensive compared to other opcodes.

-- Tim Starling


_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Re: Using PHP's assert in MediaWiki code
+1 to what Tim said. I effectively said as much about a week ago when this
was brought up on IRC.

I'd also add that the behavior of assertions varies based on configuration,
which is confusing at best. Unlike MWExceptions, which are all handled the
same.
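
To make the configuration point concrete, here is a minimal sketch
(probeFailingAssert() is a hypothetical helper, not MediaWiki code, and
the 'exception' branch assumes PHP 7 or later, where AssertionError
exists). The same failing assert can end in three different ways
depending on the ini settings in effect:

```php
<?php
// Hypothetical probe, not MediaWiki code: reports what a failing
// assert() does under the configuration currently in effect.
function probeFailingAssert() {
    try {
        // '@' hides the warning that some configurations emit.
        $result = @assert( false );
    } catch ( AssertionError $e ) {
        // PHP 7+ with assert.exception enabled.
        return 'exception';
    }
    // true:  assertion checking is switched off, nothing was checked.
    // false: the check ran and failed without throwing.
    return $result ? 'not checked' : 'returned false';
}

print probeFailingAssert() . "\n";
```

Which of the three strings gets printed depends entirely on the
server's settings (and a configuration with assert.bail enabled would
abort the script instead), not on the code itself.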

-Chad
Re: Using PHP's assert in MediaWiki code
Hi Tim,

on Mon, Mar 19, 2012 at 09:09:58AM +1100, Tim Starling wrote:
> On 18/03/12 20:37, Christian Aistleitner wrote:
> > Dear all,
> >
> > should we allow using PHP's assert [1] in MediaWiki code?
> >
> > It would allow us to formulate and automatically verify conditions
> > about code, while at the same time providing readable documentation of
> > code for free.
> >
> > Possible example use cases would be:
> > - automatically verifiable documentation of code's intent
> > - guarding against logic pitfalls like forgetting to set a variable in
> > all branches of switches, if/else cascades
> > - guarding against using uninitialized variables
> >
> > What do you think?
>
> We use exceptions for that.

Yes, this was the motivation for my email.

'If'-guards are fine. Just as exceptions are. They are excellent tools
for conditions that /typically/ hold at run-time--but eventually they
might fail. In such a case, we want to do classical error handling.
It's the right tool for the job.

We can of course decide to keep using if-guards/exceptions when
modelling conditions that /unconditionally and always/ hold.
However, PHP introduced asserts some 12 years back for just this and
only this use case. It's a proven tool.
assert is tailored for conditions that /unconditionally and
always/ hold. So why not allow this standard tool in our toolbox?

Due to this narrower use case, assert comes with some benefits over
if-guards/exceptions in terms of code readability and quality:
- We can turn off checking the conditions on production machines, to
lower the impact.
- assert's syntax shows the condition that holds. [1]
- asserts produce good error messages without condition duplication. [2]
- asserts clearly stand out in code. [3]
- asserts just add the bare necessities to the code and do not clutter
up code so much
- asserts are less code to write.



> > P.S.: For typical MediaWiki use cases, PHP's assert is even faster
> > than throwing exceptions behind 'if'-guards.
>
> That's funny, for me "if" is about 10 times faster than assert() in
> the non-throwing case.

Have you tried real world examples?

Consider for example

$this->isOpen() && $this->mConn

This is a typical condition one could add in many places of
DatabaseMysql.php. For this condition asserts are ~16% faster [4].

For this real-world example, the fact that assert takes the condition
as a string (hence unevaluated while checking is turned off) outweighs
the penalty due to the function call.

But speed is just in the "P.S.". assert's real benefit would be
improved readability, as pointed out above.


Kind regards,
Christian



[1] if-guards show the negated condition. Hence, when reading the
code, you have to mentally negate the condition again before actually
knowing what has to hold.


[2] An

    assert( 'condA && condB' );

would correspond to

    if ( ! condA || ! condB ) {
        throw new MWException( 'condA && condB was violated' );
    }

Hence, if e.g. condA changes, with asserts we just change condA once
and we are done.
For if-guards/exceptions, we have to adapt both occurrences of
condA. This is somewhat error-prone, and it is easier for the condition
and its message to drift apart.


[3] if-guards/exceptions look like normal code. Hence, you have to
mentally reparse it again and again and detect them. Typically IDEs
cannot help or highlight only those guards that document code.

IDEs can easily detect and understand asserts. Even REs can find them ;)


[4] Please verify the number yourself. It was obtained by the attached
assert_test.php. The output for me was:

RUNS: 10, ITERATIONS: 1000000
assert: 1.818
ifGuard: 2.148
assert: 1.798
ifGuard: 2.151
assert: 1.795
ifGuard: 2.162
assert: 1.798
ifGuard: 2.148
assert: 1.801
ifGuard: 2.154
assert: 1.800
ifGuard: 2.134
assert: 1.788
ifGuard: 2.140
assert: 1.790
ifGuard: 2.141
assert: 1.791
ifGuard: 2.146
assert: 1.797
ifGuard: 2.141
total: assert: 17.976
total: ifGuard: 21.464
assert is ~16% faster than ifGuard




Re: Using PHP's assert in MediaWiki code
Hi Chad,

on Sun, Mar 18, 2012 at 07:18:01PM -0400, Chad wrote:
> I'd also add that the behavior of assertions varies based on configuration,
> which is confusing at best.

Being able to vary based on configuration actually is a feature. An
essential one. It lowers assert's impact on performance.
But there is no need to mess with configuration. asserts work out of
the box.
You are only given the possibility to turn them off.

The same holds true for the very software MediaWiki is built on. The
software uses and relies on asserts, but gives you the possibility to
turn assertion checking off.
* MySQL uses asserts [1].
* PHP uses asserts [2].

asserts and the possibility to turn them on and off are not confusing
there.

But MySQL and PHP are not the only adopters of asserts.
Take just about any quality software.
The source code takes advantage of asserts (e.g.: LibreOffice [3]).

But it's not only practical software engineering. The literature is
also strongly in favor of using asserts:
In books: E.g.: S. McConnell. Code Complete [4]
In papers: E.g.: G. Kudrjavets, N. Nagappan, T. Ball. Assessing the
Relationship between Software Assertions and Code Quality: An
Empirical Investigation [5]
In talks: E.g.: T. Hoare. Assert early, assert often [6]

Kind regards,
Christian




[1] E.g.: ./mysql-5.1.59/regex/engine.c:199--206 in the MySQL 5.1.59
tarball:

-----8<-----BEGIN-----8<-----
        assert(dp == NULL || dp == endp);
        if (dp != NULL)    /* found a shorter one */
            break;

        /* despite initial appearances, there is no match here */
        NOTE("false alarm");
        start = m->coldp + 1;    /* recycle starting later */
        assert(start <= stop);
-----8<-----END-----8<-----

And here you clearly see what asserts buy you. With just this
snippet of code, the first assert tells you what to expect from
“dp” at this point. At development time, this contract is
automatically checked and a breach thereof is signalled.
On production systems, the asserts are deactivated and are
ignored.

[2] E.g.: main/streams/memory.c:86--97 in the PHP 5.3.9 tarball:

-----8<-----BEGIN-----8<-----
static size_t php_stream_memory_read(php_stream *stream, char *buf, size_t count TSRMLS_DC)
{
    php_stream_memory_data *ms = (php_stream_memory_data*)stream->abstract;
    assert(ms != NULL);

    if (ms->fpos + count >= ms->fsize) {
        count = ms->fsize - ms->fpos;
        stream->eof = 1;
    }
    if (count) {
        assert(ms->data!= NULL);
        assert(buf!= NULL);
-----8<-----END-----8<-----

Again, the asserts tell you what to expect from “ms” etc.

[3] E.g.: sc/source/core/data/markdata.cxx:241--243 in the
libreoffice-calc 3.4.4.2 tarball:

-----8<-----BEGIN-----8<-----
    if ( bMultiMarked )
    {
        DBG_ASSERT(pMultiSel, "bMultiMarked, aber pMultiSel == 0");
-----8<-----END-----8<-----

(At this point, you see the German StarOffice roots of
LibreOffice/OpenOffice.org. The German “aber” means “but” in
English. So the assertion message would be
bMultiMarked, but pMultiSel == 0
in English)

[4] isbn:9780735619678

[5] http://research.microsoft.com/pubs/70290/tr-2006-54.pdf

[6] http://research.microsoft.com/en-us/people/thoare/assertearlyassertoften.ppt
Be sure to read the notes within the ppt.



Re: Using PHP's assert in MediaWiki code
On 19/03/12 21:43, Christian Aistleitner wrote:
> Being able to vary based on configuration actually is a feature.
> An essential one. It lowers assert's impact on performance. But
> there is no need to mess with configuration. asserts work out of
> the box. You are only given the possibility to turn them off.

The ability to turn off asserts in C is damaging to system security
and stability, and is part of C's toxic culture of trading off program
correctness for negligible performance improvements.

There are cases where it does make sense to optimise for every last
clock cycle, but such cases are very rare in modern programming.

In another post:
> Have you tried real world examples?

[...]

> function funcAssert() {
>     assert( '$this->isOpen() && $this->mConn' );
>     $this->mConn++;
> }

[...]

> assert_options( ASSERT_ACTIVE, 0 );

Yeah, very clever. Look, I have a test case where assert() is faster
as well:

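<?php
function foo() {
    sleep(100);
    return true;
}

// With checking switched off, the string passed to assert() is never evaluated.
assert_options( ASSERT_ACTIVE, 0 );
$t = microtime(true);
assert('foo()');
print (microtime(true) - $t) . "\n";
$t = microtime(true);
// The if-guard always calls foo(), so it always pays for the sleep.
if (!foo()) { throw new Exception('assert!'); }
print (microtime(true) - $t) . "\n";
?>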

Wow, assert() is 17 million times faster than if() in this case! We
should really use assert()!

My previous test of assert() involved a case where the assert() and
the if() were doing roughly the same thing. In such cases, if() is
faster, because it is not a function call.

> The same holds true for the very software MediaWiki is built on.
> The software uses and relies on asserts, but gives you the
> possibility to turn assertion checking off.
> * MySQL uses asserts [1].
> * PHP uses asserts [2].
>
> asserts and the possibility to turn them on and off is not
> confusing there.
>
> But MySQL and PHP are not the only adopters of asserts. Just take
> about any quality software. The source code takes advantage of
> asserts (e.g.: Libreoffice [3])
>
> But it's not only practical software engineering. Literature is
> also strongly in favor of using asserts as well:
> In books: E.g.: S. McConnell. Code Complete [4]
> In papers: E.g.: G. Kudrjavets, N. Nagappan, T. Ball. Assessing the
> Relationship between Software Assertions and Code Quality: An
> Empirical Investigation [5]
> In talks: E.g.: T. Hoare. Assert early, assert often [6]

assert() is better than nothing. It's not better than exceptions and
unit tests, especially not in PHP.

assert() in PHP shares very little in common with assert() in C. In
C, assert() is an empty macro by default. In PHP, by default it
raises a warning. PHP doesn't have macros, so to simulate the C
performance feature, you have to put the source code inside a
string, hiding it from automated source analysis and maintenance
tools, and breaking syntax highlighting. I don't think you can
defend the PHP feature with references that talk about the C feature.

-- Tim Starling


Re: Using PHP's assert in MediaWiki code
Hi Tim,

On Mon, Mar 19, 2012 at 11:20:53PM +1100, Tim Starling wrote:
> > assert_options( ASSERT_ACTIVE, 0 );
>
> [ unmotivated ranting ]

I was talking about performance on production servers.
Obviously.
Where else would performance matter?

And yes. On production servers, one typically turns checking
assertions off.

asserts are used to catch logic errors. You want to do that during
/development/. They are a tool for development and documentation.

Asserts are not just another nice way to burn cycles on production
systems :D

I doubt that the PHP and MySQL binaries used in production were built
with assertion checking enabled, were left unstripped, ...

> My previous test of assert() involved a case where the assert() and
> the if() were doing roughly the same thing.

'assert' and 'if' are not designed to do the same thing ... So why
should we care to cripple one to simulate the other?

I'd much rather compare how the available tools get the required job done.
And for conditions that /always/ hold, assert would be the right tool.

> assert() is better than nothing. It's not better than exceptions and
> unit tests, especially not in PHP.

PHP's assert and unit tests have nothing to do with each other. They
are orthogonal tools.

Either way MediaWiki's stance on the issue is clear now:
To MediaWiki, PHP's assert is evil.

Fair enough!

Kind regards,
Christian


