Mailing List Archive

VSC self-documentation
I have been prototyping the extensible VSC-stuff for some days and
this is a progress report of sorts.

The major issue with extensible VSCs is that they need to document
themselves so varnishstat(1) and other VSM users can understand them.

The layout of a VSM segment follows directly from that:

{
	uint64_t offset_to_doc;	// N * 8
	uint64_t counters[N];
	char doc[];
}
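
As a rough illustration of that layout (not code from the prototype), the
C sketch below builds such a segment in heap memory and finds the doc text
again. The names struct vsc_seg, vsc_seg_new(), vsc_seg_ncounters() and
vsc_seg_doc() are made up for this example, and it assumes offset_to_doc
counts bytes from the start of counters[], which is only one reading of
the "N * 8" comment.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name; the fields follow the layout in the mail. */
struct vsc_seg {
	uint64_t	offset_to_doc;	/* N * 8 */
	uint64_t	counters[];	/* N counters, doc text follows */
};

/* Assumption: offset_to_doc is measured from the start of counters[]. */
static size_t
vsc_seg_ncounters(const struct vsc_seg *seg)
{
	assert(seg != NULL);
	return ((size_t)(seg->offset_to_doc / sizeof(uint64_t)));
}

/* Return the NUL-terminated doc text stored after the counters. */
static const char *
vsc_seg_doc(const struct vsc_seg *seg)
{
	assert(seg != NULL);
	return ((const char *)seg->counters + seg->offset_to_doc);
}

/* Build a zeroed segment on the heap, standing in for a real VSM mapping. */
static struct vsc_seg *
vsc_seg_new(size_t n, const char *doc)
{
	size_t doclen = strlen(doc) + 1;
	struct vsc_seg *seg;

	seg = calloc(1, sizeof *seg + n * sizeof(uint64_t) + doclen);
	assert(seg != NULL);
	seg->offset_to_doc = n * sizeof(uint64_t);
	memcpy((char *)seg->counters + seg->offset_to_doc, doc, doclen);
	return (seg);
}
```

A reader that knows nothing about the counter set can call
vsc_seg_ncounters() to learn N and vsc_seg_doc() to get the description,
which is the property the layout is meant to provide.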

The documentation should have 1+N parts, a "head" which explains what
this set of counters is about, and a "stanza" for each counter.

The 'stanza' tells the type (counter/gauge/bitmap), the level
(info/diag/debug) and maybe also unit of this counter.

It must have a "oneliner" and optionally also a multi-paragraph
"long" explanation.

And then we run into the how do we encode that in the VSM segment
thing, and after looking at all the alternatives I could imagine,
I end up with JSON ... again, again, again.

I have a small, 500 LOC, json parser in C we can use, so that is
not going to drag in a dependency.
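
To make the stanza description concrete, here is a hedged C sketch of the
fields listed above and one possible JSON rendering. Nothing here is from
the actual prototype: the struct, the enum names, the JSON key names and
vsc_stanza_json() are all hypothetical, the sample strings are assumed to
need no JSON escaping, and the optional unit and long text are left out of
the output for brevity.

```c
#include <assert.h>
#include <stdio.h>

enum vsc_type  { VSC_COUNTER, VSC_GAUGE, VSC_BITMAP };
enum vsc_level { VSC_INFO, VSC_DIAG, VSC_DEBUG };

/* One "stanza": the per-counter part of the documentation. */
struct vsc_stanza {
	const char	*name;
	enum vsc_type	type;
	enum vsc_level	level;
	const char	*unit;		/* optional, may be NULL */
	const char	*oneliner;	/* required */
	const char	*longdoc;	/* optional, may be NULL */
};

static const char * const type_names[] = { "counter", "gauge", "bitmap" };
static const char * const level_names[] = { "info", "diag", "debug" };

/*
 * Render the required fields of one stanza as a JSON object.
 * Returns what snprintf() returns; no JSON escaping is attempted.
 */
static int
vsc_stanza_json(char *buf, size_t len, const struct vsc_stanza *st)
{
	assert(st != NULL && st->name != NULL && st->oneliner != NULL);
	return (snprintf(buf, len,
	    "{\"name\":\"%s\",\"type\":\"%s\",\"level\":\"%s\","
	    "\"oneliner\":\"%s\"}",
	    st->name, type_names[st->type], level_names[st->level],
	    st->oneliner));
}
```

The full doc would then be a "head" object plus an array of N such
stanzas, stored as the doc text of the segment.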

If we're going to bite the JSON bullet, the other place it makes
sense to use it is for the vcc_if.c::Vmod_Spec data structure
which explains the contents of a VMOD to VCC.

Given that we are going to expand what VMODs can contain, that
"minilanguage" needs expansion too, so all in all, I think we
are over the threshold where JSON makes sense.

... but I still feel dirty using JSON to pass data structures
from one C-program to another, or as it may be, from a python
program to another.

Poul-Henning

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

_______________________________________________
varnish-dev mailing list
varnish-dev@varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
Re: VSC self-documentation
> I have a small, 500 LOC, json parser in C we can use, so that is
> not going to drag in a dependency.
>
> If we're going to bite the JSON bullet, the other place it makes
> sense to use it is for the vcc_if.c::Vmod_Spec data structure
> which explains the contents of a VMOD to VCC.
>
> Given that we are going to expand what VMODs can contain, that
> "minilanguage" needs expansion too, so all in all, I think we
> are over the threshold where JSON makes sense.
>
> ... but I still feel dirty using JSON to pass data structures
> from one C-program to another, or as it may be, from a python
> program to another.

There are other solutions for this kind of descriptors, and you could
probably get a msgpack [1] subset implementation with the same-ish
amount of code and get the benefits of actual typing (int, string,
etc) and easier parsing.

Dridi

[1] http://msgpack.org/

Re: VSC self-documentation
On Thu, May 18, 2017 at 12:15 AM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> And then we run into the how do we encode that in the VSM segment
> thing, and after looking at all the alternatives I could imagine,
> I end up with JSON ... again, again, again.
>
> I have a small, 500 LOC, json parser in C we can use, so that is
> not going to drag in a dependency.

In case JSON ends up being a thing, I recently contributed to a JSON
project, https://github.com/skeeto/pdjson. It's a streaming parser,
and is the only C JSON parser that passes / fails all the tests as
expected in https://github.com/nst/JSONTestSuite
(http://seriot.ch/parsing_json.php). It's ~700 LOC with header files,
and supports custom allocators.

Unsure whether strict correctness is required, but it certainly can't
hurt (especially if folks end up building e.g. VMODs on top of it).

--dho

Re: VSC self-documentation
--------
In message <CABoVN9AnwjDGXhpyJ1Z=zUC3W4G+mzobK7VEBB6bzVFwcPMerg@mail.gmail.com>
, Dridi Boukelmoune writes:

>> ... but I still feel dirty using JSON to pass data structures
>> from one C-program to another, or as it may be, from a python
>> program to another.
>
>There are other solutions for this kind of descriptors, and you could
>probably get a msgpack [...]

Yes, or CBOR, XDR, N different ASN.1's or ...

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: VSC self-documentation
--------
In message <CADe=ujbko6ohS7L=gYj0AhnS1vdqSa5xpFznAg2Us=G=Ee1CFg@mail.gmail.com>, "Devon H. O'Dell" writes:

>In case JSON ends up being a thing, [...]

>Unsure whether strict correctness is required, but it certainly can't
>hurt (especially if folks end up building e.g. VMODs on top of it).

Thus inspired I just tested my code[1] against the testsuite, and after
one small tweak[2] it passes all the Yes/No tests.

The "i_" and transform testcases get various results: I don't
transform numbers to C numeric types, so I don't find all the ieee64
overflows, and I'm not being anal about unicode either.

> It's a streaming parser [...] It's ~700 LOC with header files,
> and supports custom allocators.

That's not a bad payback for a couple hundred lines of code.

For the specific case of VCC/VMOD "symbol tables" and VSC (and VSL?)
descriptions, the JSON won't be streaming, and I don't see why
varnishstat/log would ever need custom allocators.

For this prototyping run I'll hang onto my own code - primarily
because it is already written in "varnish style" and uses vqueue
etc, but once I get it working I'll circle back and try pdjson also.

> In case JSON ends up being a thing, [...]

... we will have to decide if we expose the "raw" JSON parser
in libvarnishapi, or if it is just used internally to implement the
VSM/VSC/VSL APIs we expose.

Like VGZ, I think I am inclined to not expose it, in order to not
grow our APIs beyond our support-capacity.

Poul-Henning

[1] http://phk.freebsd.dk/misc/json.c

[2] In my check for control-chars in strings I forgot that char is signed.


Re: VSC self-documentation
On Fri, May 19, 2017 at 1:08 AM, Poul-Henning Kamp <phk@phk.freebsd.dk> wrote:
> --------
> In message <CADe=ujbko6ohS7L=gYj0AhnS1vdqSa5xpFznAg2Us=G=Ee1CFg@mail.gmail.com>, "Devon H. O'Dell" writes:
>
>>In case JSON ends up being a thing, [...]
>
>>Unsure whether strict correctness is required, but it certainly can't
>>hurt (especially if folks end up building e.g. VMODs on top of it).
>
> Thus inspired I just tested my code[1] against the testsuite, and after
> one small tweak[2] it passes all the Yes/No tests.
>
> The "i_" and transform testcases get various results: I don't
> transform numbers to C numeric types, so I don't find all the ieee64
> overflows, and I'm not being anal about unicode either.

Cool! The unspecified stuff is wiggly; I didn't care much one way or
the other on that, either.

> [2] In my check for control-chars in strings I forgot that char is signed.

Except when it isn't! Whether "plain" char is signed or unsigned is
implementation-defined (§6.2.5p3, §6.3.1.1p3). (But I guess the fix
was making the thing unsigned char, so it doesn't matter anyway :)).

--dho

Re: VSC self-documentation
--------
In message <CADe=ujYbWs09K_UgsB+A6+u97OnTf3xR9HYk1pwKoNNr99EBjQ@mail.gmail.com>
, "Devon H. O'Dell" writes:

>> [2] In my check for control-chars in strings I forgot that char is signed.
>
>Except when it isn't! Whether "plain" char is signed or unsigned is
>implementation-defined

Yeah, compliments to the ISO-C people for that bit of insanity...

--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

Re: VSC self-documentation
Looks pretty decent. Definitely makes sense to have some kind of JSON
support in Varnish, especially on the parsing and reading end. I too had to
implement my own JSON lib just because the options out there just don't hit
all requirements; libraries generally optimize for 1 thing and somehow do
everything else poorly, like navigation vs performance or strictness vs
performance. What I got is ~1600 LOC, but it has very strict JSON grammar
parsing. Very fast on both ends too: parsing is done in 1 pass (streamed)
and the result is a nice search index which acts very close to a hashtable
when searching. In fact, the search index gets stored into cache (see below).

I could definitely see open sourcing this in the form of a VMOD. Yes, I
said VMOD. Check out the following VCL. Sorry for the hijack, and yes, I'm
digging up object support in VCL again :)

---
import std;
import request;
import objcore;
import json;
import types;

sub vcl_recv {
    new json_req = request.new();

    if (req.url == "/") {
        json_req.copy_headers();
        json_req.set_host(local.ip);
        json_req.set_port(std.port(local.ip));
        json_req.set_url("/some/data.json");

        # Tell the VMOD that if this request goes back into Varnish,
        # it should safely grab the objcore, if possible
        # (there is a more elegant way to do this
        # but I'd rather get the point across)
        json_req.want_objcore();

        json_req.send();
    }
}

sub vcl_deliver {
    new oc = objcore.new();
    new json_doc = json.new();
    new json_field = types.string();

    if (json_req.sent()) {
        json_req.wait();

        # Load the objcore from the json request
        oc.get_reference(json_req.get_objcore());

        if (!oc.is_valid()) {
            return(synth(401));
        }

        # Similar to ESI data
        json_doc.load_index(oc.get_attribute("json"));

        if (!json_doc.is_valid()) {
            # Parse the response text into JSON
            json_doc.load_text(json_req.get_body());
            # Write back the index into cache
            oc.set_attribute("json", json_doc.get_index());
        }

        json_field.set(json_doc.get("some_json_field"));

        # We now read a JSON field from the body of an object in cache
        # Future lookups will hit the JSON search index directly
    }
}
---

Basically, make a request back to Varnish, grab the objcore, and then
load/store/cache JSON into it. That JSON could have been a cache miss, in
which case it gets cached, or it could have been a 404; this all plays
really nicely with how Varnish works. A client can even request that JSON
back out (minus the search index, obviously). Don't get all up in arms, this
all works; this is pretty much how Edgestash works over at Varnish
Software :) Also, remember, the point of VCL is to allow people to write
what they want, so if someone wants to write the above VMODs/VCL, good for
them.

So ya, if I can write a JSON vmod object, then that would be a good case to
release JSON code. The flipside is that with objects, JSON parsing is just
the tip of the iceberg in terms of providing a nice standard library with
all the things devs expect, plus interop with Varnish internals, like
objcore above.

--
Reza Naghibi
Varnish Software

On Fri, May 19, 2017 at 10:48 AM, Poul-Henning Kamp <phk@phk.freebsd.dk>
wrote:

> --------
> In message <CADe=ujYbWs09K_UgsB+A6+u97OnTf3xR9HYk1pwKoNNr99EBjQ@
> mail.gmail.com>
> , "Devon H. O'Dell" writes:
>
> >> [2] In my check for control-chars in strings I forgot that char is
> signed.
> >
> >Except when it isn't! Whether "plain" char is signed or unsigned is
> >implementation-defined
>
> Yeah, compliments to the ISO-C people for that bit of insanity...