Mailing List Archive: _step question: I really want to understand something...

skinkie at xs4all

Aug 3, 2008, 5:57 AM

Post #1 of 6 (2055 views)

Alvaro,

Could you *please* explain this to me again? I have asked this many
times, but I still have questions about it when I am implementing
plugins. Let's take this problem:

ret_t
cherokee_handler_server_info_step (cherokee_handler_server_info_t *hdl,
                                   cherokee_buffer_t              *buffer)
{
    if (hdl->buffer.len > 0) {
        cherokee_buffer_add_buffer (buffer, &hdl->buffer);
        cherokee_buffer_move_to_begin (&hdl->buffer, buffer->size);
        if (hdl->buffer.len == 0)
            return ret_eof_have_data;

        return ret_ok;
    } else
        return ret_eagain;
}

vs

ret_t
cherokee_handler_server_info_step (cherokee_handler_server_info_t *hdl,
                                   cherokee_buffer_t              *buffer)
{
    cherokee_buffer_add_buffer (buffer, &hdl->buffer);
    return ret_eof_have_data;
}

Why is implementation 1 not used by default? Why doesn't implementation 2
work with, for example, Java when sending large documents? It would mean
the world to me if you elaborated on these examples and maybe the
'chunked' variant.

Stefan

_step question: I really want to understand something... [ In reply to ]

Aug 3, 2008, 8:18 AM

Post #2 of 6 (2016 views)

On 3 Aug 2008, at 14:57, Stefan de Konink wrote:

> if (hdl->buffer.len > 0) {
>     cherokee_buffer_add_buffer (buffer, &hdl->buffer);
>     cherokee_buffer_move_to_begin (&hdl->buffer, buffer->size);
>     if (hdl->buffer.len == 0)
>         return ret_eof_have_data;

This snippet does not make much sense, for a number of reasons:

- cherokee_buffer_add_buffer() appends the whole buffer, so it
wouldn't make sense to move the rest to the beginning, because you
already know that it has been added completely. If you wanted to clean
the buffer, a call to cherokee_buffer_clean() would have the same result.

- The call to cherokee_buffer_move_to_begin() is also wrong because
of the buffer->size parameter. A buffer has three properties: ->buf, ->len
and ->size. The first one holds a pointer to the char *, the second
stores the length of the information, and the last one stores the
allocated memory size (which will always be equal to or greater than
->len).

In this case, the correct call would be:
cherokee_buffer_move_to_begin (&hdl->buffer, buffer->len);
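
To make the difference between ->len and ->size concrete, here is a minimal sketch using a stand-in type. Everything prefixed with demo_ is hypothetical and written for this illustration only; it is not Cherokee's buffer implementation, and the 64-byte growth policy is an arbitrary assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable buffer; NOT Cherokee's real type. */
typedef struct {
    char   *buf;   /* pointer to the data                   */
    size_t  len;   /* bytes of valid data                   */
    size_t  size;  /* bytes allocated, always >= len        */
} demo_buffer_t;

/* Append n bytes, growing the allocation in 64-byte blocks (arbitrary). */
static int
demo_add (demo_buffer_t *b, const char *txt, size_t n)
{
    if (b->len + n + 1 > b->size) {
        size_t  new_size = ((b->len + n + 1) / 64 + 1) * 64;
        char   *p        = realloc (b->buf, new_size);

        if (p == NULL)
            return -1;
        b->buf  = p;
        b->size = new_size;
    }

    memcpy (b->buf + b->len, txt, n);
    b->len += n;
    b->buf[b->len] = '\0';

    assert (b->len < b->size);   /* the allocation always covers the data */
    return 0;
}

/* Drop the first pos bytes and shift the remainder to the front. */
static void
demo_move_to_begin (demo_buffer_t *b, size_t pos)
{
    if (pos == 0)
        return;

    if (pos >= b->len) {         /* asked to drop everything (or more) */
        b->len = 0;
        if (b->buf != NULL)
            b->buf[0] = '\0';
        return;
    }

    memmove (b->buf, b->buf + pos, b->len - pos);
    b->len -= pos;
    b->buf[b->len] = '\0';
}
```

After adding 11 bytes, len is 11 while size is 64 in this sketch, so passing size where len is meant asks the buffer to drop more data than it holds.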

> } else
> return ret_eagain;

If there is nothing in the buffer, I would say that you should rather
return ret_eof. (If your handler is collecting data asynchronously,
that could be different though.)

> vs
>
> ret_t
> cherokee_handler_server_info_step (cherokee_handler_server_info_t *hdl,
>                                    cherokee_buffer_t *buffer)
> {
>     cherokee_buffer_add_buffer (buffer, &hdl->buffer);
>     return ret_eof_have_data;
> }

In this case, the handler does know that there will only be a single
call to the _step() method. That's why it copies all the information
that it has previously built and returns ret_eof_have_data (the last
data package before closing the connection; it is like ret_ok + ret_eof).

A more generic implementation of this method, following the direction
that you pointed out with your example, would be:

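ret_t
cherokee_handler_example_step (cherokee_handler_example_t *hdl,
                               cherokee_buffer_t          *buf)
{
    size_t n;

    /* Nothing left to send */
    if (cherokee_buffer_is_empty (&hdl->prebuilt))
        return ret_eof;

    /* Send at most 1024 bytes per call; the last piece may be shorter */
    n = (hdl->prebuilt.len < 1024) ? hdl->prebuilt.len : 1024;

    /* cherokee_buffer_add() takes a char pointer and a length */
    cherokee_buffer_add (buf, hdl->prebuilt.buf, n);
    cherokee_buffer_move_to_begin (&hdl->prebuilt, n);

    /* That was the last piece: data plus end-of-file in one return */
    if (cherokee_buffer_is_empty (&hdl->prebuilt))
        return ret_eof_have_data;

    return ret_ok;
}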

This method would iterate over an internal buffer, sending chunks of
up to 1 KB, until the buffer is empty.
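
The iteration can be tried outside the server with a small simulation. This is only a sketch: the demo_ types, return codes and the driver loop are hypothetical stand-ins that mirror the names used in this thread, and the loop is an assumption about how a caller might react to each return value, not Cherokee's actual connection code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; they mirror the thread but are not Cherokee code. */
typedef enum { demo_ok, demo_eof, demo_eof_have_data } demo_ret_t;

typedef struct {
    const char *data;  /* prebuilt response body, not yet sent */
    size_t      len;   /* bytes still unsent                   */
} demo_source_t;

#define DEMO_CHUNK 1024

/* Copy at most DEMO_CHUNK bytes into out (capacity >= DEMO_CHUNK)
 * and report how many bytes were copied through *written. */
static demo_ret_t
demo_step (demo_source_t *src, char *out, size_t *written)
{
    size_t n;

    *written = 0;
    if (src->len == 0)
        return demo_eof;

    n = (src->len < DEMO_CHUNK) ? src->len : DEMO_CHUNK;
    memcpy (out, src->data, n);
    src->data += n;
    src->len  -= n;
    *written   = n;

    /* Last piece: hand over the data and signal the end in one return */
    return (src->len == 0) ? demo_eof_have_data : demo_ok;
}

/* Keep calling demo_step() while it returns demo_ok.
 * Returns the number of calls made; *total receives the bytes produced. */
static size_t
demo_drain (demo_source_t *src, size_t *total)
{
    char        out[DEMO_CHUNK];
    size_t      calls = 0;
    size_t      n;
    demo_ret_t  ret;

    *total = 0;
    do {
        ret = demo_step (src, out, &n);
        *total += n;
        calls++;
    } while (ret == demo_ok);

    return calls;
}
```

With a 2500-byte body this makes three calls (1024, 1024 and 452 bytes); an empty body ends after a single call that produces no data.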

> Why is implementation 1 not used by default? Why doesn't
> implementation 2 work with, for example, Java when sending large
> documents? It would mean the world to me if you elaborated on these
> examples and maybe the 'chunked' variant.

Well, if you knew the response length in advance it would be better if
you added the Content-Length header. In case you do not know it, then
the chunked encoding is the way to go.

The only thing you will have to do in order to reply to a request with
a chunked response is to add the encoding header in the _add_headers()
method, and then use cherokee_buffer_add_buffer_chunked() instead of
cherokee_buffer_add_buffer(). The rest is exactly the same.
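
For reference, this is roughly what chunked framing looks like on the wire in HTTP/1.1: each piece is sent as its length in hexadecimal, CRLF, the data, CRLF, and the body ends with a zero-length chunk. The sketch below is standalone and its demo_ helpers are hypothetical; it is not the code behind cherokee_buffer_add_buffer_chunked(), only an illustration of the format.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Frame one piece of body data as an HTTP/1.1 chunk:
 *   <length in hex>\r\n<data>\r\n
 * Returns the number of bytes written to out, or 0 if out is too small.
 * The output is raw bytes and is not NUL-terminated. */
static size_t
demo_frame_chunk (char *out, size_t out_size, const char *data, size_t len)
{
    int head;

    assert (out != NULL && data != NULL);

    head = snprintf (out, out_size, "%zx\r\n", len);
    if (head < 0 || (size_t) head + len + 2 > out_size)
        return 0;

    memcpy (out + head, data, len);
    memcpy (out + head + len, "\r\n", 2);
    return (size_t) head + len + 2;
}

/* A chunked body is terminated by a zero-length chunk. */
static size_t
demo_frame_last (char *out, size_t out_size)
{
    static const char last[] = "0\r\n\r\n";

    if (sizeof last - 1 > out_size)
        return 0;

    memcpy (out, last, sizeof last - 1);
    return sizeof last - 1;
}
```

The response must also carry a "Transfer-Encoding: chunked" header so the client knows to parse the body this way.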

Good luck!.. and do not hesitate to ask anything :-)

--
Greetings, alo.
http://www.alobbs.com/

_step question: I really want to understand something... [ In reply to ]

skinkie at xs4all

Aug 3, 2008, 8:45 AM

Post #3 of 6 (2017 views)

Alvaro Lopez Ortega wrote:
> On 3 Aug 2008, at 14:57, Stefan de Konink wrote:
>
>> if (hdl->buffer.len > 0) {
>>     cherokee_buffer_add_buffer (buffer, &hdl->buffer);
>>     cherokee_buffer_move_to_begin (&hdl->buffer, buffer->size);
>>     if (hdl->buffer.len == 0)
>>         return ret_eof_have_data;
>
> This snippet does not make much sense, for a number of reasons:

[snip]

The above one fixed my 'java' issues, but I'll replace it with your variant.

>> vs
>>
>> ret_t
>> cherokee_handler_server_info_step (cherokee_handler_server_info_t *hdl,
>> cherokee_buffer_t *buffer)
>> {
>> cherokee_buffer_add_buffer (buffer, &hdl->buffer);
>> return ret_eof_have_data;
>> }
>
> In this case, the handler does know that there will only be a single
> call to the _step() method. That's why it copies all the information
> that it has previously built and returns ret_eof_have_data (the last
> data package before closing the connection; it is like ret_ok + ret_eof).

Could you please elaborate how the handler 'knows' this? I mean, if we
have a client that doesn't respect the content header, it will still
chunk won't it?

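> A more generic implementation of this method, following the direction
> that you pointed out with your example, would be:
>
> ret_t
> cherokee_handler_example_step (cherokee_handler_example_t *hdl,
>                                cherokee_buffer_t          *buf)
> {
>     size_t n;
>
>     /* Nothing left to send */
>     if (cherokee_buffer_is_empty (&hdl->prebuilt))
>         return ret_eof;
>
>     /* Send at most 1024 bytes per call; the last piece may be shorter */
>     n = (hdl->prebuilt.len < 1024) ? hdl->prebuilt.len : 1024;
>
>     /* cherokee_buffer_add() takes a char pointer and a length */
>     cherokee_buffer_add (buf, hdl->prebuilt.buf, n);
>     cherokee_buffer_move_to_begin (&hdl->prebuilt, n);
>
>     /* That was the last piece: data plus end-of-file in one return */
>     if (cherokee_buffer_is_empty (&hdl->prebuilt))
>         return ret_eof_have_data;
>
>     return ret_ok;
> }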

Ok.

> This method would iterate over an internal buffer, sending chunks of
> up to 1 KB, until the buffer is empty.
>
>> Why is implementation 1 not used by default? Why doesn't implementation 2
>> work with, for example, Java when sending large documents? It would mean
>> the world to me if you elaborated on these examples and maybe the
>> 'chunked' variant.
>
> Well, if you knew the response length in advance it would be better if
> you added the Content-Length header. In case you do not know it, then
> the chunked encoding is the way to go.

Will adding the Content-Length change anything for the sending out of
data? Because I calculate the content length header from my buffer
length, thus it is present.

> The only thing you will have to do in order to reply to a request with
> a chunked response is to add the encoding header in the _add_headers()
> method, and then use cherokee_buffer_add_buffer_chunked() instead of
> cherokee_buffer_add_buffer(). The rest is exactly the same.

You always make it sound easy :D

> Good luck!.. and do not hesitate to ask anything :-)

Thanks for your wise answers :)

Stefan

_step question: I really want to understand something... [ In reply to ]

Aug 3, 2008, 9:22 AM

Post #4 of 6 (2019 views)

On 3 Aug 2008, at 17:45, Stefan de Konink wrote:
> Alvaro Lopez Ortega wrote:
>> On 3 Aug 2008, at 14:57, Stefan de Konink wrote:
>>
>>> ret_t
>>> cherokee_handler_server_info_step (cherokee_handler_server_info_t *hdl,
>>>                                    cherokee_buffer_t *buffer)
>>> {
>>>     cherokee_buffer_add_buffer (buffer, &hdl->buffer);
>>>     return ret_eof_have_data;
>>> }
>>
>> In this case, the handler does know that there will only be a single
>> call to the _step() method. That's why it copies all the information
>> that it has previously built and returns ret_eof_have_data (the last
>> data package before closing the connection; it is like ret_ok + ret_eof).
>
> Could you please elaborate how the handler 'knows' this? I mean, if we
> have a client that doesn't respect the content header, it will still
> chunk won't it?

In the case of the server_info handler, it knows that it is a single
package of known length because it has just built it. Server_info
builds a reply HTML page with the server information, so each time it
receives a request it builds the HTML content, and when it's time to
iterate to send the content, it pushes it all at once. In this case,
it is fine to do that because we know that the content is no longer
than a few hundred bytes. In case it were a few Kbytes, it would have
to iterate (copy a piece and return ret_ok, until everything is sent).

>>> Why is implementation 1 not used by default? Why doesn't
>>> implementation 2 work with, for example, Java when sending large
>>> documents? It would mean the world to me if you elaborated on these
>>> examples and maybe the 'chunked' variant.
>>
>> Well, if you knew the response length in advance it would be better
>> if you added the Content-Length header. In case you do not know it,
>> then the chunked encoding is the way to go.
>
> Will adding the Content-Length change anything for the sending out of
> data? Because I calculate the content length header from my buffer
> length, thus it is present.

No, the Content-Length header is only interesting for the client
(browser).

The header is needed if you want to allow the client to use Keep-Alive
connections. Well, actually you need to either send the Content-Length
or use Chunked encoding; otherwise, the server will have to close the
connection when your handler returns ret_eof.

In this case, if you know the length, the best option is to add the
Content-Length header, and then iterate sending the content until you
return ret_eof_have_data or ret_eof. That will allow the server to use
Keep-Alive.
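
The rule above can be written down as a small decision function. This is a sketch under assumptions: demo_add_framing_header() and its parameters are hypothetical, and it only models the choice between Content-Length, chunked encoding and closing the connection; a real server takes more into account before keeping a connection open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: pick the header that tells the client where the
 * body ends. body_len < 0 means "length not known in advance".
 * Returns true if the connection could stay open after the response. */
static bool
demo_add_framing_header (char *out, size_t out_size,
                         long body_len, bool client_is_http11)
{
    assert (out != NULL && out_size > 0);

    if (body_len >= 0) {
        /* Known length: the client can count bytes */
        snprintf (out, out_size, "Content-Length: %ld\r\n", body_len);
        return true;
    }

    if (client_is_http11) {
        /* Unknown length: chunked encoding marks the end of the body */
        snprintf (out, out_size, "Transfer-Encoding: chunked\r\n");
        return true;
    }

    /* Neither option: the end of the body is the end of the connection */
    snprintf (out, out_size, "Connection: close\r\n");
    return false;
}
```

Chunked encoding is an HTTP/1.1 feature, which is why the sketch falls back to closing the connection for older clients.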

>> Good luck!.. and do not hesitate to ask anything :-)
>
> Thanks for your wise answers :)

you're very welcome!

--
Greetings, alo.
http://www.alobbs.com/

_step question: I really want to understand something... [ In reply to ]

skinkie at xs4all

Aug 3, 2008, 9:39 AM

Post #5 of 6 (2015 views)

Alvaro Lopez Ortega wrote:
> In the case of the server_info handler, it knows that it is a single
> package of known length because it has just built it. Server_info builds
> a reply HTML page with the server information, so each time it receives
> a request it builds the HTML content, and when it's time to iterate to
> send the content, it pushes it all at once. In this case, it is fine to
> do that because we know that the content is no longer than a few hundred
> bytes. In case it were a few Kbytes, it would have to iterate (copy a
> piece and return ret_ok, until everything is sent).

Here you assume that the client *will* fetch the response in one go
because it is small enough?

>>>> Why is implementation 1 not used by default? Why doesn't
>>>> implementation 2 work with, for example, Java when sending large
>>>> documents? It would mean the world to me if you elaborated on these
>>>> examples and maybe the 'chunked' variant.
>>>
>>> Well, if you knew the response length in advance it would be better if
>>> you added the Content-Length header. In case you do not know it, then
>>> the chunked encoding is the way to go.
>>
>> Will adding the Content-Length change anything for the sending out of
>> data? Because I calculate the content length header from my buffer
>> length, thus it is present.
>
> No, the Content-Length header is only interesting for the client (browser).
>
> The header is needed if you want to allow the client to use Keep-Alive
> connections. Well, actually you need to either send the Content-Length
> or use Chunked encoding; otherwise, the server will have to close the
> connection when your handler returns ret_eof.
>
> In this case, if you know the length, the best option is to add the
> Content-Length header, and then iterate sending the content until you
> return ret_eof_have_data or ret_eof. That will allow the server to use
> Keep-Alive.

Ok.

Stefan

_step question: I really want to understand something... [ In reply to ]

Aug 3, 2008, 11:28 AM

Post #6 of 6 (2019 views)

On 3 Aug 2008, at 18:39, Stefan de Konink wrote:

> Alvaro Lopez Ortega wrote:
>> In the case of the server_info handler, it knows that it is a single
>> package of known length because it has just built it. Server_info
>> builds a reply HTML page with the server information, so each time it
>> receives a request it builds the HTML content, and when it's time to
>> iterate to send the content, it pushes it all at once. In this case,
>> it is fine to do that because we know that the content is no longer
>> than a few hundred bytes. In case it were a few Kbytes, it would have
>> to iterate (copy a piece and return ret_ok, until everything is sent).
>
> Here you assume that the client *will* fetch the response in one go
> because it is small enough?

That cannot be predicted. Actually, it does not even matter. The
client could try to read 10 TB, or it could read the response byte
by byte, but in either case it wouldn't make any difference from the
server's point of view. That is the 'magic' of buffered sockets:
both the server and the client side are buffered.

So, in this case, it is likely that - being a small response - the
client will receive it in a single step, although that is something
completely unpredictable (it depends on many factors: the MTU, how the
client OS buffers, and even how the client software reads from the
socket).
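
The point can be observed locally with a POSIX socket pair: the sender writes once, and the reader is free to pick up the data in pieces of any size. This is a sketch, assuming a POSIX system and a message small enough to fit in the socket buffer; demo_reads_needed() is a hypothetical helper, and over a real network the number of reads can differ from run to run.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg with a single write() on one end of a socket pair, then read
 * it back on the other end asking for at most `piece` bytes per read().
 * Stores the received text in out and returns the number of read() calls
 * that delivered data, or -1 on error. */
static int
demo_reads_needed (const char *msg, size_t piece, char *out, size_t out_size)
{
    int     fds[2];
    int     reads = 0;
    size_t  len   = strlen (msg);
    size_t  got   = 0;

    assert (out != NULL);
    if (piece == 0 || len >= out_size)
        return -1;
    if (socketpair (AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    /* "Server" side: one write, then close to signal the end of the stream */
    if (write (fds[0], msg, len) != (ssize_t) len) {
        close (fds[0]);
        close (fds[1]);
        return -1;
    }
    close (fds[0]);

    /* "Client" side: reads in pieces of its own choosing */
    for (;;) {
        size_t  room = out_size - 1 - got;
        ssize_t n    = read (fds[1], out + got, (room < piece) ? room : piece);

        if (n < 0) {
            close (fds[1]);
            return -1;
        }
        if (n == 0)
            break;               /* end of stream */
        got += (size_t) n;
        reads++;
    }

    close (fds[1]);
    out[got] = '\0';
    return reads;
}
```

The writing side does the same thing in both cases; only the reader's choice of piece size changes how many reads it takes to collect the same bytes.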

Cheers!

--
Greetings, alo.
http://www.alobbs.com/