Mailing List Archive

on the vsl format - VIP23 (VSL refactoring) design sketch
Hi,

I spent some time pondering the VSL format; here's what I have at the moment.

First of all, a recap of the existing format. In short:

* a record can be an end- or wrapmarker. These are not relevant in the current
context

* batch records are containers for individual records as produced by the VSLb*
functions on a vsl buffer

* records are tag, length, vxid and payload

endmarker:

| 31 ... 24 | 23 ... 0 |
| 0xfe | 0x454545 |

wrapmarker:

| 31 ... 24 | 23 ... 0 |
| 0xfe | 0x575757 |

batch:

| 31 ... 24 | 23 ... 0 |
| 0xff | zero |

| 31 ... 0 |
| len |

<record><record>...

record:

| 31 ... 24 | 23 ... 2 | 1 .. 0 |
| tag | len >> 2 | zero |

| 31 | 30 | 29... 0 |
| B | C | vxid |

| (len bytes) |
| log |

B = BACKENDMARKER
C = CLIENTMARKER
actually those are contained in the vxid passed to VSL
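
To make the record layout recapped above concrete, here is a minimal C sketch of decoding the two header words. The struct and function names are hypothetical and only for illustration; the bit positions are taken from the diagrams above, not from the Varnish sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; only the bit layout comes from the recap above. */
struct rec_hdr {
	unsigned tag;		/* word 0, bits 31..24 */
	uint32_t len;		/* payload bytes; word 0, bits 23..2 hold len >> 2 */
	unsigned backend;	/* B: word 1, bit 31 */
	unsigned client;	/* C: word 1, bit 30 */
	uint32_t vxid;		/* word 1, bits 29..0 */
};

static struct rec_hdr
rec_decode(const uint32_t *p)
{
	struct rec_hdr h;

	assert(p != NULL);
	h.tag = p[0] >> 24;
	/* Bits 1..0 are zero here, so masking bits 23..2 yields len directly. */
	h.len = p[0] & 0x00fffffcU;
	h.backend = (p[1] >> 31) & 1U;
	h.client = (p[1] >> 30) & 1U;
	h.vxid = p[1] & 0x3fffffffU;
	return (h);
}
```

Because the two low bits of the first word are always zero in this layout, they are free to carry a format indicator without changing how len is read.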


There are a number of VSL Tags for which this format can remain unchanged,
because they either contain exactly one string or are relevant only for
debugging/tracing purposes.

PROPOSAL:

For the binary VSL records we already agreed that fixed length fields always
come first. Because we have the length and know the size of the fixed part, we
could put in another STRING or header as follows:

fixed:

| 31 ... 24 | 23 ... 2 | 1 .. 0 |
| tag | len >> 2 | 1 0 |

| 31 | 30 | 29... 0 |
| B | C | vxid |

<DBL><INT>...[<STRING|HDR>]

HDR: <UINT8><STRING><STRING>


This would accommodate exactly one additional STRING or, with one extra byte to
indicate the header name length, one additional header.

I would also propose to indicate this format by setting one of the lower bits of
the length.

This format would accommodate more VSL Tags where exactly one string/header is
logged.
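
As a rough illustration of the HDR field, the following C sketch splits it into name and value. It rests on assumptions that the proposal does not settle: the UINT8 is taken to be the header name length, neither string is NUL-terminated, and the caller already knows the exact number of HDR bytes (how that interacts with the 4-byte granularity of len is left open). All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view onto a HDR field: <UINT8><STRING><STRING>. */
struct hdr_view {
	const char *name;
	size_t name_len;
	const char *value;
	size_t value_len;
};

/*
 * b points at the first HDR byte (after the fixed-length fields),
 * n is the number of HDR bytes. Assumption: strings are not
 * NUL-terminated and the value runs to the end of the field.
 * Returns 0 on success, -1 if the name length does not fit.
 */
static int
hdr_split(const uint8_t *b, size_t n, struct hdr_view *out)
{
	assert(out != NULL);
	if (b == NULL || n < 1 || (size_t)b[0] > n - 1)
		return (-1);
	out->name = (const char *)(b + 1);
	out->name_len = b[0];
	out->value = out->name + out->name_len;
	out->value_len = n - 1 - out->name_len;
	return (0);
}
```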

The question then is whether, for additional variable data, we should actually
embed it in one VSL record. We could also use additional VSL records, under the
condition that such "continuation records" are written within the same Batch:

variable:

like record, but
- field instead of tag
- (p[0] & 3) == 3
- no vxid, data starts at p[1]

| 31 ... 24 | 23 ... 2 | 1 .. 0 |
| field | len >> 2 | 1 1 |

| (len bytes) |
| log |

The advantage would be that we get the full 24-bit length per string.
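
A reader of such a log would have to tell the formats apart from the first word alone. The C sketch below shows one way to do that, assuming that 0xfe and 0xff are never used as tag or field values; the enum and function names are hypothetical. Note that the wrapmarker's low two bits are also 1 1 (0x57 & 3 == 3), so the markers have to be tested before the low bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum rec_kind {
	REC_END,
	REC_WRAP,
	REC_BATCH,
	REC_CLASSIC,	/* low bits 0 0: tag, len, vxid, payload */
	REC_FIXED,	/* low bits 1 0: fixed fields, optional STRING/HDR */
	REC_CONT,	/* low bits 1 1: continuation, no vxid */
	REC_UNKNOWN	/* low bits 0 1: not assigned in this sketch */
};

static enum rec_kind
rec_classify(const uint32_t *p)
{
	assert(p != NULL);
	/* Markers first: 0xfe575757 would otherwise look like REC_CONT. */
	if (p[0] == 0xfe454545U)
		return (REC_END);
	if (p[0] == 0xfe575757U)
		return (REC_WRAP);
	if ((p[0] >> 24) == 0xff)
		return (REC_BATCH);
	switch (p[0] & 3U) {
	case 0:
		return (REC_CLASSIC);
	case 2:
		return (REC_FIXED);
	case 3:
		return (REC_CONT);
	default:
		return (REC_UNKNOWN);
	}
}

/* Only meaningful for REC_CLASSIC, REC_FIXED and REC_CONT. */
static const uint32_t *
rec_payload(const uint32_t *p)
{
	/* Continuation records carry no vxid word, so data starts at p[1]. */
	return (rec_classify(p) == REC_CONT ? p + 1 : p + 2);
}
```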

But my main motivation was actually pondering a different function interface,
for which this format would be advantageous because the caller would write
directly to VSL. Here's a taste of this suggestion, but it is not fully cooked
yet. Feedback welcome anyway:

struct SLTS_XXX t[1];

VSLS_Open(t, vsl, SLT_XXX, vxid);
t->fixed->dbl = 0.5;
t->fixed->int = 42;
VSLS_printf(t, t->var->brz, "%s_%s", bar, baz);
VSLS_printf(t, t->var->foo, "%s/%s", bar, baz);
VSLS_Close(t);

Nils


--

** * * UPLEX - Nils Goroll Systemoptimierung

Scheffelstraße 32
22301 Hamburg

tel +49 40 28805731
mob +49 170 2723133
fax +49 40 42949753

xmpp://slink@jabber.int.uplex.de/

http://uplex.de/
Re: on the vsl format - VIP23 (VSL refactoring) design sketch
On Fri, Apr 12, 2019 at 5:08 PM Nils Goroll <nils.goroll@uplex.de> wrote:
>
> PROPOSAL:
>
> For the binary VSL records we already agreed that fixed length fields always
> come first. Because we have the length and know the size of the fixed part, we
> could put in another STRING or header as follows:

The prefix, which may be optional, and by nature is supposed to be the
first field, and coincidentally is of variable length, is a problem.

Although if we reserve one more bit for a prefix flag we can make it
the last field. Upon truncation though, not having the prefix could be
a big deal.

_______________________________________________
varnish-dev mailing list
varnish-dev@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
Re: on the vsl format - VIP23 (VSL refactoring) design sketch
On 12/04/2019 17:57, Dridi Boukelmoune wrote:
> The prefix, which may be optional, and by nature is supposed to be the
> first field, and coincidentally is of variable length, is a problem.

I do not understand why. The VSL client code which matches the prefix can do so
against the string field as before, it is just not the start of the record payload.

Re: on the vsl format - VIP23 (VSL refactoring) design sketch
On Fri, Apr 12, 2019 at 6:27 PM Nils Goroll <nils.goroll@uplex.de> wrote:
>
> On 12/04/2019 17:57, Dridi Boukelmoune wrote:
> > The prefix, which may be optional, and by nature is supposed to be the
> > first field, and coincidentally is of variable length, is a problem.
>
> I do not understand why. The VSL client code which matches the prefix can do so
> against the string field as before, it is just not the start of the record payload.

The prefix is too important as an optional field to move to the last
position and jeopardize its presence because of vsl_reclen.

Dridi
Re: on the vsl format - VIP23 (VSL refactoring) design sketch
--------
In message <6b8d3465-2245-ba3a-7c80-4b76200782e9@uplex.de>, Nils Goroll writes:

>For the binary VSL records we already agreed that fixed length fields always
>come first. Because we have the length and know the size of the fixed part, we
>could put in another STRING or header as follows:

Well, agreed and agreed...

If we go with the "printf-format" model, and we wait a bit before we
rearrange the order of fields, we can do A/B testing, and I think
that is a very good trade-off for maybe not gaining quite as much
performance up front.

Then once we have switched over and the dust has settled, we can
start to optimize field order.

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.