Mailing List Archive

How to de-serialize json?
Following up on our recent stimulating discussion on adding an API to an
application, I wonder if someone can help me understand something:

Catalyst uses HTTP::Body to parse body content. It currently handles these
request content types:

our $TYPES = {
'application/octet-stream' => 'HTTP::Body::OctetStream',
'application/x-www-form-urlencoded' => 'HTTP::Body::UrlEncoded',
'multipart/form-data' => 'HTTP::Body::MultiPart',
'multipart/related' => 'HTTP::Body::XFormsMultipart',
'application/xml' => 'HTTP::Body::XForms'
};

But, Catalyst::Controller::DBIC::API and Catalyst::Controller::REST both use
Catalyst::Action::Deserialize.

My question is this: why use an action class instead of extending
HTTP::Body to deserialize the content? Isn't it HTTP::Body's job to decode
the body based on the content-type of the request?

I'm just wondering if I'm missing some important reason why these other
request content types are handled differently.

Seems like HTTP::Body is the correct place to do all decoding. Decoded
JSON, for example, would just end up in $c->req->params and controllers
could be oblivious to the encoding of the request (similar to how we don't
really care how params are decoded if the body is x-www-form-urlencoded or
form-data). True, could end up with a request parameter that is a hashref,
but I don't see anything wrong with that as long as parameters are validated
correctly.
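
A minimal sketch of the idea in plain Perl, to make the proposal concrete. It does not use HTTP::Body; the %DECODER_FOR table, decode_form, unescape and decode_body are hypothetical names invented for illustration, and JSON::PP stands in for whichever JSON module an application prefers. The point is only that a decoder chosen by content type can hand the controller the same structure either way:

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

# Hypothetical decoder table keyed by media type (not HTTP::Body's real API).
my %DECODER_FOR = (
    'application/json'                  => sub { decode_json( $_[0] ) },
    'application/x-www-form-urlencoded' => \&decode_form,
);

# Undo form escaping: '+' means space, %XX is a hex-encoded byte.
sub unescape {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ tr/+/ /;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;
    return $text;
}

# Minimal form decoder: split into pairs, then into key and value.
sub decode_form {
    my ($body) = @_;
    my %params;
    for my $pair ( split /[&;]/, $body ) {
        my ( $key, $value ) = map { unescape($_) } split /=/, $pair, 2;
        $params{$key} = $value;
    }
    return \%params;
}

# Choose a decoder from the media type, ignoring parameters such as charset.
sub decode_body {
    my ( $content_type, $body ) = @_;
    my ($media_type) = lc($content_type) =~ m{^\s*([^;\s]+)};
    my $decoder = defined $media_type ? $DECODER_FOR{$media_type} : undef;
    die "No decoder registered for '$content_type'\n" unless $decoder;
    return $decoder->($body);
}

my $from_form = decode_body( 'application/x-www-form-urlencoded',
    'name=Bill&city=San+Jose' );
my $from_json = decode_body( 'application/json; charset=utf-8',
    '{"name":"Bill","city":"San Jose"}' );
```

Both calls return a hashref with the same keys, so code that reads $from_form->{city} would not need to know which encoding arrived. A JSON body that decodes to an array, or to nested data, would not fit a flat parameter model this neatly.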

So, why different approaches to decoding request body content?



--
Bill Moseley
moseley@hank.org
Re: How to de-serialize json? [ In reply to ]
On Sat, Jan 23, 2010 at 10:16 AM, Bill Moseley <moseley@hank.org> wrote:
> Following up on our recent stimulating discussion on adding an API to an
> application, I wonder if someone can help me understand something:
>
> Catalyst uses HTTP::Body to parse body content.  It currently handles these
> request content types:
>
> our $TYPES = {
>     'application/octet-stream'          => 'HTTP::Body::OctetStream',
>     'application/x-www-form-urlencoded' => 'HTTP::Body::UrlEncoded',
>     'multipart/form-data'               => 'HTTP::Body::MultiPart',
>     'multipart/related'                 => 'HTTP::Body::XFormsMultipart',
>     'application/xml'                   => 'HTTP::Body::XForms'
> };
>
> But, Catalyst::Controller::DBIC::API and Catalyst::Controller::REST both use
> Catalyst::Action::Deserialize.
>
> My question is this:  why use an action class instead of extending
> HTTP::Body to deserialize the content?  Isn't it HTTP::Body's job to decode
> the body based on the content-type of the request?
>
> I'm just wondering if I'm missing some important reason why these other
> request content types are handled differently.
>
> Seems like HTTP::Body is the correct place to do all decoding.  Decoded
> JSON, for example, would just end up in $c->req->params and controllers
> could be oblivious to the encoding of the request (similar to how we don't
> really care how params are decoded if the body is x-www-form-urlencoded or
> form-data).   True, could end up with a request parameter that is a hashref,
> but I don't see anything wrong with that as long as parameters are validated
> correctly.
>
> So, why different approaches to decoding request body content?
>
>

Well, I never really equated deserialization to decoding, so my answer
may not be fully satisfactory.

If I assume that decoding is synonymous with de-serialization, it
makes more sense. At first thought, I just don't think they're that
similar, though. Maybe in implementations (comparing JSON to HTTP
POST parameters) it is, but in the case of HTTP::Body decoding a
base64-encoded JPEG, it isn't at all.

From a behavior standpoint, having a POST/PUT'd JSON segment that ends
up in ->params would be maddening to me. They aren't parameters, not
even in the loosest of the RFC interpretations.

I can appreciate wanting to increase the reusability, and having a
deserialization component in HTTP::Body (which, in turn could be used
for Form, etc).

If HTTP::Body could support this, Catalyst::Action::REST wouldn't be
tremendously different: it has the Deserialize action so you can
specify arbitrary deserialization schemes (after the body is decoded).
You'd still need this behavior and still have the action.

Not a bad idea, those are my thoughts on it... and to summarize in one
sentence: it does seem like a good idea that could end up with a lot
of hacking and not a lot of practical savings.

-J

_______________________________________________
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/
Re: How to de-serialize json? [ In reply to ]
On Sat, Jan 23, 2010 at 1:40 PM, J. Shirley <jshirley@gmail.com> wrote:

>
>
> If I assume that decoding is synonymous with de-serialization, it
> makes more sense. At first thought, I just don't think they're that
> similar, though. Maybe in implementations (comparing JSON to HTTP
> POST parameters) it is, but in the case of HTTP::Body decoding a
> mime64-encoded JPEG, it isn't at all.
>


With a jpeg I assume the content type would be form-data (that included an
upload in the form) where the file ends up in $req->uploads, not as a
request parameter.

I find decoding requests analogous to Views. In my apps controllers take
input (params, arguments and uploads) and the result is data in the stash.
Then the View has the job of serializing (normally to HTML via template,
but no reason it can't be JSON or anything else). In fact I have many
controller actions that are used for both normal full-page HTTP requests and
AJAX requests. So, similar to how the controller action does not know or
care what view is going to be used, the controller action doesn't know or
care how the request is serialized over the wire. That's how I picture it.



>
> From a behavior standpoint, having a POST/PUT'd JSON segment that ends
> up in ->params would be maddening to me. They aren't parameters, not
> even in the loosest of the RFC interpretations.
>

I'm trying to understand that point of view. Why is that maddening? If you
have a request serialized as JSON, would $req->parameters go unused and
the deserialized request end up some place else, say as
$req->data?

I have an XMLRPC API to an application. I implemented it by creating an
HTTP::Body subclass that parses the XMLRPC XML body. The <method> ends up
mapped to an action, the <params> ends up as body parameters, and <base64>
elements end up as uploads. As a result existing controller actions can be
used for both XMLRPC requests and for normal web requests. All I have to do
to expose the action in the API is add XMLRPC( $method_name ) as an action
attribute.

Catalyst::Engine hard-codes HTTP::Body. I think it would be more flexible
if the body parser class could be a config option (to allow easy
sub-classing), similar to how the request class can be defined, but
it's not that difficult to set up now. Just have to add the content type
to the $HTTP::Body::TYPES hashref.

Thanks for your comments, I appreciate the feedback.


--
Bill Moseley
moseley@hank.org
Re: How to de-serialize json? [ In reply to ]
Excerpts from Bill Moseley's message of Sat Jan 23 19:45:28 -0500 2010:
> On Sat, Jan 23, 2010 at 1:40 PM, J. Shirley <jshirley@gmail.com> wrote:
>
> >
> >
> > If I assume that decoding is synonymous with de-serialization, it
> > makes more sense. At first thought, I just don't think they're that
> > similar, though. Maybe in implementations (comparing JSON to HTTP
> > POST parameters) it is, but in the case of HTTP::Body decoding a
> > mime64-encoded JPEG, it isn't at all.
> >
>
>
> With a jpeg I assume the content type would be form-data (that included an
> upload in the form) where the file ends up in $req->uploads, not as a
> request parameter.

That assumption may hold true for *your* applications, but he didn't say
anything about a form or even a web browser. It's perfectly reasonable,
especially in the context of REST APIs, to talk about non-form-based request
bodies. (I've written actions that accepted PUT requests with a content-type
of application/vnd.ms-excel, for example.)

hdp.

Re: How to de-serialize json? [ In reply to ]
On Sat, Jan 23, 2010 at 5:39 PM, Hans Dieter Pearcey <
hdp.perl.catalyst.users@weftsoar.net> wrote:

> Excerpts from Bill Moseley's message of Sat Jan 23 19:45:28 -0500 2010:
>
> > With a jpeg I assume the content type would be form-data (that included
> an
> > upload in the form) where the file ends up in $req->uploads, not as a
> > request parameter.
>
> That assumption may hold true for *your* applications, but he didn't say
> anything about a form or even a web browser. It's perfectly reasonable,
> especially in the context of REST APIs, to talk about non-form-based
> request
> bodies. (I've written actions that accepted PUT requests with a
> content-type
> of application/vnd.ms-excel, for example.)
>

But, that's a different content type. I assumed form-data. So, in this
case in my "HTTP::Body deserialization layer" approach, I'd thus add:

$HTTP::Body::TYPES->{'application/vnd.ms-excel'} = 'My::ExcelParser';

which would result in $req->uploads having an upload object for the
spreadsheet. My::ExcelParser would probably be a thin sub-class of
something like My::Upload. Then the same controller would work with both a
web request with an upload field or this REST request and expect to find the
upload object in $req->uploads.



--
Bill Moseley
moseley@hank.org
Re: How to de-serialize json? [ In reply to ]
Excerpts from Bill Moseley's message of Sat Jan 23 21:47:00 -0500 2010:
> On Sat, Jan 23, 2010 at 5:39 PM, Hans Dieter Pearcey <
> hdp.perl.catalyst.users@weftsoar.net> wrote:
>
> > Excerpts from Bill Moseley's message of Sat Jan 23 19:45:28 -0500 2010:
> >
> > > With a jpeg I assume the content type would be form-data (that included
> > an
> > > upload in the form) where the file ends up in $req->uploads, not as a
> > > request parameter.
> >
> > That assumption may hold true for *your* applications, but he didn't say
>
> But, that's a different content type. I assumed form-data. So, in this

You said: What about extending HTTP::Body, e.g. to decode JSON into body_params?

jshirley said: Ugh. Also, what about (non-param-like) things like jpegs?

You said: Well, they'd be file uploads.

I said: You might like that, but you can't assume everyone would, and the
request might not even have a form content-type.

You said: Well, they'd be file uploads.

Me, writing this message: ???

As far as I can tell, you missed the point of my message, which makes me wonder
if I've missed the point of yours. Are you talking about a set of conventions
you'd like to be able to build for your own use on top of HTTP::Body, or a set
of conventions that you expect everyone will want and so should be built into
HTTP::Body, or something else entirely?

hdp.

Re: How to de-serialize json? [ In reply to ]
On Sat, Jan 23, 2010 at 7:20 PM, Hans Dieter Pearcey <
hdp.perl.catalyst.users@weftsoar.net> wrote:

>
> As far as I can tell, you missed the point of my message, which makes me
> wonder
> if I've missed the point of yours. Are you talking about a set of
> conventions
> you'd like to be able to build for your own use on top of HTTP::Body, or a
> set
> of conventions that you expect everyone will want and so should be built
> into
> HTTP::Body, or something else entirely?
>

I thought you were saying that the request might not be a normal form
posting, and I was saying only that HTTP::Body can support that, too.
I was not suggesting everyone should use one method over another.

HTTP::Body seems (to me) like the natural place to deserialize. Yet, the
REST modules I cited use an action class to deserialize. Thus, I was
wondering if there was a specific reason for that approach that I had not
understood. That's really all.





--
Bill Moseley
moseley@hank.org
Re: How to de-serialize json? [ In reply to ]
On Sun, Jan 24, 2010 at 8:01 AM, Bill Moseley <moseley@hank.org> wrote:
>
>
> On Sat, Jan 23, 2010 at 7:20 PM, Hans Dieter Pearcey
> <hdp.perl.catalyst.users@weftsoar.net> wrote:
>>
>> As far as I can tell, you missed the point of my message, which makes me
>> wonder
>> if I've missed the point of yours.  Are you talking about a set of
>> conventions
>> you'd like to be able to build for your own use on top of HTTP::Body, or a
>> set
>> of conventions that you expect everyone will want and so should be built
>> into
>> HTTP::Body, or something else entirely?
>
> I thought you were saying that the request might not be a normal form
> posting, and I was saying only that HTTP::Body can support that, too.
> I was not suggesting everyone should use one method over another.
> HTTP::Body seems (to me) like the natural place to deserialize.  Yet, the
> REST modules I cited use an action class to deserialize.  Thus, I was
> wondering if there was a specific reasons for that approach that I had not
> understood.  That's really all.

I cannot claim to understand all the concerns here, but to add my two
cents: it sounds like this deserialisation thing is not something
specific to Catalyst, and now that other frameworks and libraries
are gaining ground, it would make sense to put that logic into something
reusable across them.


--
Zbigniew Lukasiak
http://brudnopis.blogspot.com/
http://perlalchemy.blogspot.com/

Re: How to de-serialize json? [ In reply to ]
On Tue, Jan 26, 2010 at 1:49 AM, Zbigniew Lukasiak <zzbbyy@gmail.com> wrote:
> On Sun, Jan 24, 2010 at 8:01 AM, Bill Moseley <moseley@hank.org> wrote:
>>
>>
>> On Sat, Jan 23, 2010 at 7:20 PM, Hans Dieter Pearcey
>> <hdp.perl.catalyst.users@weftsoar.net> wrote:
>>>
>>> As far as I can tell, you missed the point of my message, which makes me
>>> wonder
>>> if I've missed the point of yours.  Are you talking about a set of
>>> conventions
>>> you'd like to be able to build for your own use on top of HTTP::Body, or a
>>> set
>>> of conventions that you expect everyone will want and so should be built
>>> into
>>> HTTP::Body, or something else entirely?
>>
>> I thought you were saying that the request might not be a normal form
>> posting, and I was saying only that HTTP::Body can support that, too.
>> I was not suggesting everyone should use one method over another.
>> HTTP::Body seems (to me) like the natural place to deserialize.  Yet, the
>> REST modules I cited use an action class to deserialize.  Thus, I was
>> wondering if there was a specific reasons for that approach that I had not
>> understood.  That's really all.
>
> I cannot claim to understand all the concerns here - but to add my two
> cents: it sounds like this deserialisation thing is not something
> specific to Catalyst and now with other frameworks and libraries
> gaining grounds - it would make sense to put that logic into something
> reusable across them.


I'm all for reusable code, but in no way should HTTP::Body start
taking this behavior by default. I'm not really that sure how
effective it is, anyway.

decode_json( $c->req->body ); is just not that hard :)


-J

Re: How to de-serialize json? [ In reply to ]
Zbigniew Lukasiak wrote:
> I cannot claim to understand all the concerns here - but to add my two
> cents: it sounds like this deserialisation thing is not something
> specific to Catalyst and now with other frameworks and libraries
> gaining grounds - it would make sense to put that logic into something
> reusable across them.

You mean like Data::Serializer, which is what C::A::REST uses?

Cheers
t0m


Re: How to de-serialize json? [ In reply to ]
On Tue, Jan 26, 2010 at 6:33 AM, J. Shirley <jshirley@gmail.com> wrote:

>
>
> I'm all for reusable code, but in no way should HTTP::Body start
> taking this behavior by default. I'm not really that sure how
> effective it is, anyway.
>

No, I was not suggesting that would be the default (although I'm not sure
why not handling other serializations by default is a bad idea). Not sure
what you mean by "effective".


> decode_json( $c->req->body ); Is just not that hard :)
>

Of course it's not that hard. Of course, this isn't hard, either: [1]

map { s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg; $_ } split /[&;=]/, $c->req->body;

I see those as similar operations. The request is serialized in both cases.

But, one should not have to worry about adding low-level details like that
to application code when using an elegant web framework. ;)


No big deal. I was just curious why the HTTP::Body approach was not used in
the existing REST/RPC modules, as that was already the place used by
Catalyst to de-serialize the body. I thought maybe there was a reason I
might not have understood, which is why I asked.

[1] Or whatever the correct approach is, and apologies to Damian for the
map.

--
Bill Moseley
moseley@hank.org
Re: How to de-serialize json? [ In reply to ]
On Wed, Jan 27, 2010 at 7:33 AM, Bill Moseley <moseley@hank.org> wrote:
>
>
> On Tue, Jan 26, 2010 at 6:33 AM, J. Shirley <jshirley@gmail.com> wrote:
>>
>>
>> I'm all for reusable code, but in no way should HTTP::Body start
>> taking this behavior by default.  I'm not really that sure how
>> effective it is, anyway.
>
> No, I was not suggesting that would be the default (although I'm not sure
> why not handling other serializations by default is a bad idea).  Not sure
> what you mean by "effective".
>
>>
>> decode_json( $c->req->body ); Is just not that hard :)
>
> Of course it's not that hard.  Of course, this isn't hard, either:  [1]
>
> map { s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg; $_ } split /[&;=]/,
> $c->req->body;
>
> I see those as similar operations.  The request is serialized in both cases.
>
> But, one should not have to worry about adding low-level details like that
> to application code when using an elegant web framework. ;)
>
>
> No big deal.  I was just curious why the HTTP::Body approach was not used in
> the existing REST/RPC modules, as that was already the place used by
> Catalyst to de-serialize the body.  I thought maybe there was a reason I
> might not understood, which is why I asked.
>
> [1] Or whatever the correct approach is, and apologies to Damian for the
> map.
>

(Really not trying to beat a dead horse, but I think this thread does
have some useful thoughts in it so I'd like to just summarize)

Well, I think that Data::Serializer is what you are after. As far as
putting that feature directly in HTTP::Body, I'm not really sure what
it gains you since deserializing is not something you always want to
do unless your application can specify it. Decoding, however, is
something you do want regardless of application.

And if your application specifies it, then it seems reasonable to me
to put it in the framework layer. It seems like you could probably
have a request trait that could either automatically or lazily
deserialize things, and that would make everybody happy. So,
something like Catalyst::TraitFor::Request::Deserialize would be
useful. Alternatively, you could also write this as a Plack
middleware.
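
To illustrate the lazy variant, here is a rough sketch in plain Perl. It is not a real Catalyst request trait: My::LazyRequest and its body and data methods are hypothetical names, the class only handles JSON, and JSON::PP is an arbitrary choice of decoder. The raw body stays untouched, and deserialization happens only if something asks for it:

```perl
use strict;
use warnings;

package My::LazyRequest;    # hypothetical stand-in for a request class
use JSON::PP qw(decode_json);

sub new {
    my ( $class, %args ) = @_;
    my $self = {
        content_type => $args{content_type},
        body         => $args{body},
    };
    return bless $self, $class;
}

# The raw body is always available, never modified.
sub body { return $_[0]{body} }

# Deserialize on first access only, then reuse the cached result.
sub data {
    my ($self) = @_;
    return $self->{data} if exists $self->{data};
    die "Cannot deserialize $self->{content_type}\n"
        unless $self->{content_type} =~ m{^application/json\b}i;
    return $self->{data} = decode_json( $self->{body} );
}

package main;

my $req = My::LazyRequest->new(
    content_type => 'application/json',
    body         => '{"items":[1,2,3]}',
);

my $raw   = $req->body;                         # no decoding happens here
my $count = scalar @{ $req->data->{items} };    # decoded on demand
```

An application that only relays or stores the body would call body and never pay for decoding, while one that wants structured data would call data.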

-J

Re: How to de-serialize json? [ In reply to ]
On 27 Jan 2010, at 15:33, Bill Moseley wrote:
>
> No big deal. I was just curious why the HTTP::Body approach was not
> used in the existing REST/RPC modules, as that was already the place
> used by Catalyst to de-serialize the body. I thought maybe there
> was a reason I might not understood, which is why I asked.

HTTP::Body isn't really structured for this - you can (and I _do_ in
one of my apps) add or override the content type handlers.

However as it's class data, this is perl interpreter wide - which
means that two different applications with conflicting requirements
can't exist in the same mod_perl interpreter - not awesome.
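
The class-data problem can be shown with a small self-contained sketch. Shared::Parser is a hypothetical stand-in for a module that keeps its handler table in a package variable, and AppOne and AppTwo are made-up applications loaded into the same interpreter. None of this is HTTP::Body's own code:

```perl
use strict;
use warnings;

package Shared::Parser;    # hypothetical module holding class data
our $TYPES = { 'application/json' => 'Shared::Parser::Default' };

sub handler_for { return $TYPES->{ $_[1] } }

package AppOne;            # hypothetical first application
sub setup { $Shared::Parser::TYPES->{'application/json'} = 'AppOne::JSONBody' }

package AppTwo;            # hypothetical second application
sub setup { $Shared::Parser::TYPES->{'application/json'} = 'AppTwo::RawBody' }

package main;

AppOne->setup;
AppTwo->setup;    # silently replaces the handler AppOne registered

# Both applications now see the same, last-written handler.
my $seen_by_both = Shared::Parser->handler_for('application/json');

# 'local' can limit an override to one dynamic scope, but every caller
# has to remember to do it.
sub handler_with_override {
    my ( $type, $class ) = @_;
    local $Shared::Parser::TYPES->{$type} = $class;
    return Shared::Parser->handler_for($type);
}

my $scoped = handler_with_override( 'application/json', 'AppOne::JSONBody' );
my $after  = Shared::Parser->handler_for('application/json');
```

Whichever application loads last wins for the whole interpreter. The local workaround restores the previous value when the sub returns, which is why $after matches $seen_by_both again.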

Given that the serialization/deserialization isn't hard as is (as
noted elsewhere in the thread), I guess that the (potential) overlap
isn't great enough to be worth trying to do something about this.

Cheers
t0m


Re: How to de-serialize json? [ In reply to ]
On Wed, Jan 27, 2010 at 1:16 PM, Tomas Doran <bobtfish@bobtfish.net> wrote:

>
> On 27 Jan 2010, at 15:33, Bill Moseley wrote:
>
>>
>> No big deal. I was just curious why the HTTP::Body approach was not used
>> in the existing REST/RPC modules, as that was already the place used by
>> Catalyst to de-serialize the body. I thought maybe there was a reason I
>> might not understood, which is why I asked.
>>
>
> HTTP::Body isn't really structured for this - you can (and I _do_ in one of
> my apps) add or override the content type handlers.
>
> However as it's class data, this is perl interpreter wide - which means
> that two different applications with conflicting requirements can't exist in
> the same mod_perl interpreter - not awesome.
>

I see. So you are saying that Content-Type: application/json might need to
be deserialized differently in different applications that share the same
interpreter? Obviously, json could decode into an array ref so that could
not be mapped to request parameters. Is that what you mean? Or that an
application might want the raw json?

--
Bill Moseley
moseley@hank.org
Re: How to de-serialize json? [ In reply to ]
On 28 Jan 2010, at 00:14, Bill Moseley wrote:

> Or that an application might want the raw json?

Or that one app depends on JSON::XS doing the decoding whilst one
depends on JSON.pm..

Cheers
t0m


Re: How to de-serialize json? [ In reply to ]
Excerpts from Bill Moseley's message of Wed Jan 27 19:14:29 -0500 2010:
> I see. So you are saying that Content-Type: application/json might need to
> be deserialized differently in different applications that share the same
> interpreter? Obviously, json could decode into an array ref so that could
> not be mapped to request parameters. Is that what you mean? Or that an
> application might want the raw json?

Well, we know from this thread that the last one, at least, is true; you would
like it to go into body_params and I wouldn't.

hdp.

Re: How to de-serialize json? [ In reply to ]
On Wed, Jan 27, 2010 at 4:14 PM, Bill Moseley <moseley@hank.org> wrote:

> I see.  So you are saying that Content-Type: application/json might need to
> be deserialized differently in different applications that share the same
> interpreter?  Obviously, json could decode into an array ref so that could
> not be mapped to request parameters.  Is that what you mean?  Or that an
> application might want the raw json?
>

I actually have two applications that expect and need the raw data,
rather than the deserialized request.

In one case, it's a simple relay mechanism (unreliable system POSTs
JSON feed, and reliable system then rebroadcasts it) similar to a
pubsub model. It is much faster to just relay the data rather than
deserialize.

Another case is storing the data in something like MogileFS. I don't
necessarily want to deserialize it but I do need the mime-type from
the uploads. I tend to be the type that stores data as I receive it,
then cleans it separately as needed.

Half-way contrived examples, but based off of real world stuff I have
done/am doing.

-J
