Mailing List Archive

[NASL2] Auto-conversion with + and -
My philosophical problem with the undefined value led me to look
more closely at the + and - operators.
The code is ugly, so I decided to rewrite it in a simpler way.
The big question is: what do we do when we have heterogeneous
arguments?
For +, the (current) result is a string as soon as one argument is a
string.
e.g. "x" + 3 = "x3" and 3 + "x" = "3x"
This also means that "123" + 7 = "1237"
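
As a rough illustration of the rule just described, here is a small Python
model (not NASL, and not the interpreter's actual code). The function name
nasl_add_current is made up for this sketch; it only assumes what the message
states: the result is a string as soon as one argument is a string.

```python
def nasl_add_current(a, b):
    """Model of the current '+' rule: any string operand forces concatenation."""
    if isinstance(a, str) or isinstance(b, str):
        # The non-string operand is converted to its decimal text form.
        return str(a) + str(b)
    # Two integers: ordinary addition.
    return a + b


print(nasl_add_current("x", 3))    # x3
print(nasl_add_current(3, "x"))    # 3x
print(nasl_add_current("123", 7))  # 1237
print(nasl_add_current(120, 3))    # 123
```

The third call shows the surprising case: a numeric-looking string is never
treated as a number.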

For -, the behavior is identical, although I am not sure it is a good
idea, because "string subtraction" is a rather special operation.

Maybe later we should just print a warning and return an error (null),
but I suspect that this would break a couple of scripts.

--
mailto:arboi@alussinan.org
GPG Public keys: http://michel.arboi.free.fr/pubkey.txt
http://michel.arboi.free.fr/ http://arboi.da.ru/
FAQNOPI de fr.comp.securite : http://faqnopi.da.ru/
RE: [NASL2] Auto-conversion with + and -
When a programmer chooses an operator, she will mean either addition or
string concatenation, without any ambiguity in mind. It's pretty
unfortunate if the language doesn't capture this information. Since you
have chosen '+' to mean both string concatenation and addition, you need to
disambiguate which operation is meant at run time (it has to be done at run
time since the disambiguation depends on operand types, which are only known
at run time). Seems like a pretty odd problem since the programmer knew
what she meant in the first place.

When you combine ambiguity of operators with ambiguity of data types (either
through automatic type conversion or undefined values or both), you end up
with too many difficult questions to answer.

I would recommend disambiguating the meaning of your operators. Use '+' to
mean addition and something else to mean string concatenation. Then a
simpler and more natural set of rules can be constructed.

Examples ('.' is the string concatenation operator):

"x" + 3 generates a runtime error
"x" . 3 produces "x3" (3 is converted to "3")
"123" + "7" produces 130 ("123" converted to 123 and "7" converted to 7)
"123" . "7" produces "1237"

Etc.
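
A minimal Python sketch of these rules, for illustration only. The names
to_int, add and concat are hypothetical stand-ins for '+' and '.', and the
choice of TypeError as the "runtime error" is an assumption of this sketch.

```python
def to_int(value):
    """Convert an operand of '+' to an integer, or fail loudly."""
    if isinstance(value, int):
        return value
    try:
        return int(value)
    except ValueError:
        raise TypeError(f"cannot use {value!r} as a number") from None


def add(a, b):
    """'+' always means addition; numeric strings are converted."""
    return to_int(a) + to_int(b)


def concat(a, b):
    """'.' always means concatenation; numbers are converted to text."""
    return str(a) + str(b)


print(concat("x", 3))      # x3
print(add("123", "7"))     # 130
print(concat("123", "7"))  # 1237
try:
    add("x", 3)
except TypeError as err:
    print("runtime error:", err)
```

Because each operator has a single meaning, the conversion direction follows
from the operator and never from the operand types.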

-Jim

> -----Original Message-----
> From: owner-nessus-devel@list.nessus.org
> [mailto:owner-nessus-devel@list.nessus.org]On Behalf Of Michel Arboi
> Sent: Saturday, February 08, 2003 12:02 PM
> To: nessus-devel@list.nessus.org
> Subject: [NASL2] Auto-conversion with + and -
> [...]
Re: [NASL2] Auto-conversion with + and -
"Jim Cervantes" <jim.cervantes@verizon.net> writes:

> When a programmer chooses an operator, she will mean either addition or
> string concatenation, without any ambiguity in mind.

One way to do it could be to force the types of the arguments with int()
or string().
Unfortunately, string() does more than converting its arguments to
ASCII: it interprets escape sequences in "impure strings".

> I would recommend disambiguating the meaning of your operators. Use '+' to
> mean addition and something else to mean string concatenation.

But many scripts would break then.
string() could be used to concatenate strings, apart from the problem with
"impure" strings. Maybe we could just add a "strcat" function?
(This is simpler than adding a Perl-like dot operator.)

> "123" + "7" produces 130

This can be done with int("123") + int("7")

Here is my proposal, for the moment: I add a BIG warning when + or -
arguments are converted. We wait for a while, and if all the current
scripts are OK, we remove the "auto conversion" and return an error.
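
The two phases of this proposal could be modelled roughly as follows. This is
a Python sketch, not the interpreter's code: the function add_transitional,
its strict flag, and the script/line parameters are all hypothetical, and
None stands in for the NASL null error value.

```python
import sys


def add_transitional(a, b, strict=False, script="<script>", line=0):
    """Model of '+' during the transition away from auto-conversion."""
    mixed = isinstance(a, str) != isinstance(b, str)
    if not mixed:
        # Same kind on both sides: plain addition or plain concatenation.
        return a + b
    if strict:
        # Phase 2: auto-conversion removed, the operation yields an error.
        return None
    # Phase 1: keep the old result, but complain loudly.
    print(f"({script}) type conversion (int -> str) for operator + "
          f"at or near line {line}", file=sys.stderr)
    return str(a) + str(b)


print(add_transitional("x", 3, script="demo.nasl", line=12))  # x3, plus a warning
print(add_transitional("x", 3, strict=True))                  # None
print(add_transitional(2, 3))                                 # 5
```

Running the existing scripts in phase 1 and collecting the warnings gives the
list of plugins that must be fixed before switching to phase 2.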
Re: [NASL2] Auto-conversion with + and -
"Jim Cervantes" <jim.cervantes@verizon.net> writes:

> I would recommend disambiguating the meaning of your operators. Use '+' to
> mean addition and something else to mean string concatenation. Then a
> simpler and more natural set of rules can be constructed.

Got at least two scripts that trigger the warning :-(

[11214](nntp_info.nasl) Horrible type conversion (int -> str) for operator + at or near line 190
[12055](ntalk_detect.nasl) Horrible type conversion (int -> str) for operator + at or near line 166
Re: [NASL2] Auto-conversion with + and -
On 8 Feb 2003, Michel Arboi wrote:

> Unfortunately, string() does more than converting its arguments to
> ASCII: it interprets escape sequences in "impure strings".

BTW: what happens when you use + to concatenate a pure string and an
impure string? Will the result be pure? Impure? Will the impure argument
be auto-purified?

I think the whole concept of pure/impure strings creates considerably more
problems than it solves. Is there any script that (really) needs to work
with impure strings (I mean doing things like x = "n"; y = "\" + x; z =
string(y))? I doubt there is any.

I suggest getting rid of impure strings completely and interpreting escape
sequences in string literals. I even volunteer to examine all scripts and
double every backslash that should be interpreted as a backslash. :)

> Here is my proposal, for the moment: I add a BIG warning when + or -
> arguments are converted. We wait for a while, and if all the current
> scripts are OK, we remove the "auto conversion" and return an error.

I agree.

--Pavel Kankovsky aka Peak [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."
Re: [NASL2] Auto-conversion with + and -
"Pavel Kankovsky" <peak@argo.troja.mff.cuni.cz> writes:

> BTW: what happens when you use + to concatenate a pure string and an
> impure string? Will the result be pure? Impure?

Pure

> Will the impure argument be auto-purified?

No. 'a'+"\n" will give 'a\\n' (a + backslash + n)
If you want to "purify" it, use the string function.
BTW, I am adding a strcat function. Maybe we'll get rid of this +
operator (we'll have to do something with - too)
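
To make the 'a'+"\n" case concrete, here is a small Python model. It is only
a sketch: interpret_escapes is a hypothetical stand-in for what string() does
to an impure string, and the set of escapes it handles is an assumption.

```python
# Escape sequences this sketch understands (an assumed, partial list).
ESCAPES = {"n": "\n", "t": "\t", "r": "\r", "\\": "\\"}


def interpret_escapes(raw):
    """Turn backslash sequences into the characters they name."""
    out = []
    i = 0
    while i < len(raw):
        if raw[i] == "\\" and i + 1 < len(raw) and raw[i + 1] in ESCAPES:
            out.append(ESCAPES[raw[i + 1]])
            i += 2
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


impure = r"\n"         # what the double-quoted literal holds: backslash, n
joined = "a" + impure  # '+' copies the bytes without interpreting them

print(len(joined))                     # 3
print(len(interpret_escapes(joined)))  # 2
print(interpret_escapes(joined) == "a\n")  # True
```

The concatenated value still contains a literal backslash; only the explicit
interpretation step turns it into a newline.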

> I think the whole concept of pure/impure strings creates considerably more
> problems than it solves.

So do I. That's why I implemented "pure strings". But we cannot
get rid of "impure strings" without rewriting many plugins. That's why
I used a different separator for pure strings (single quote instead of
double quote)

> Is there any script that (really) needs to work with impure strings

It's easier to declare regex patterns with "impure strings" than with
pure strings. That's the only example I can think of.
e.g. ereg_replace(string: s, pattern: "A(.*)B", replace: "\1");

> I suggest getting rid of impure strings completely and interpreting escape
> sequences in string literals.

Too much work for too little gain, IMHO.
And do not forget that when Nessus 1.4.x is out, there will be a period
when we'll have to maintain two flavors of scripts: some for the old
1.2 parser, and others for the new one.

I cannot see a painless way to do it. I suppose that this period will
have to be as short as possible.
But considering the fact that Debian still ships the obsolete 1.0.x
Nessus, I am pessimistic.
Re: [NASL2] Auto-conversion with + and -
On 9 Feb 2003, Michel Arboi wrote:

> BTW, I am adding a strcat function. Maybe we'll get rid of this +
> operator

> (we'll have to do something with - too)

String - is weird. It appears to be pretty useful at first glance, but
AFAIK 9 times out of 10 it is used 1. together with strstr() to extract a
given part of the string (in a rather cumbersome way), 2. together with
ereg_replace() to split a string into parts (cumbersome as well), 3. to
remove trailing newlines.

Ad 1. can be solved with a single ereg_replace().
Ad 2. can be solved with eregmatch().
Ad 3. can be solved with chomp().
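
Uses 1 and 3 can be illustrated with a Python model. This is a sketch under
assumptions: str_sub models string "-" as "remove the first occurrence of the
right operand", strstr is modelled after the C function (the tail starting at
the first match), and Python's re and rstrip stand in for ereg_replace() and
chomp().

```python
import re


def str_sub(s, t):
    """Assumed model of string '-': drop the first occurrence of t from s."""
    return s.replace(t, "", 1)


def strstr(s, t):
    """Tail of s starting at the first occurrence of t, or None."""
    i = s.find(t)
    return s[i:] if i >= 0 else None


line = "Server: Apache\r\n"

# Use 1: extract the part before ": " by subtracting the tail.
print(repr(str_sub(line, strstr(line, ": "))))     # 'Server'
# The same result with a single regular expression.
print(repr(re.match(r"([^:]*):", line).group(1)))  # 'Server'

# Use 3: subtraction removes the FIRST newline, not the trailing one.
print(repr(str_sub("a\r\nb\r\n", "\r\n")))         # 'ab\r\n'
print(repr("a\r\nb\r\n".rstrip("\r\n")))           # 'a\r\nb'
```

The last two lines show why a dedicated function is safer for trimming:
subtraction only behaves like a trim when the newline occurs exactly once.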

> > I think the whole concept of pure/impure strings creates considerably more
> > problems than it solves.
>
> So do I. That's why I implemented "pure strings". But we cannot
> get rid of "impure strings" without rewriting many plugins. That's why
> I used a different separator for pure strings (single quote instead of
> double quote)

Can we declare them obsolete and avoid them in (new) scripts, please? :)

Perhaps the interpreter could have something like "pragma(NASLpre2)" to
mark scripts relying on obsolete features of the language. (Or the exact
opposite marking new scripts not relying on them but I think it is better
to make it default to the new (and presumably better) style.)

> > Is there any script that (really) needs to work with impure strings
>
> It's easier to declare regex patterns with "impure strings" than with
> pure strings. That's the only example I can think of.
> e.g. ereg_replace(string: s, pattern: "A(.*)B", replace: "\1");

This means "strings where backslashes are never interpreted" rather than
impure strings with magical backslashes.

--Pavel Kankovsky aka Peak [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."
Re: [NASL2] Auto-conversion with + and -
Pavel Kankovsky <peak@argo.troja.mff.cuni.cz> writes:

> Ad 2. can be solved with eregmatch().

While trying to rewrite webmirror.nasl, I ran into a strange bug. I
suspect that some regex libraries limit the size of the matched
substrings.
Anyway, we probably need something like strstr() that returns an
index, or maybe a simple "str_remove" function that will do what the
"-" operator does.

> Can we declare them obsolete and avoid them in (new) scripts, please? :)

OK.

> This means "strings where backslashes are never interpreted" rather than
> impure strings with magical backslashes.

Note that the problem with "pure"/"impure" strings only exists with string()
and display().
strcat does the same job as string, but does not interpret its
arguments.