Mailing List Archive

Some small questions on streaming expressions
Hello all,

I’m trying to reindex from a collection to a new collection with a different schema, using streaming expressions. I can’t use REINDEXCOLLECTION directly, because I need to process documents a bit.

I have spent hours on 3 simple, related things without figuring them out, so forgive me for just asking.

1) Is there a way to duplicate the value of a field of an incoming tuple into two fields?
I tried the select expression:
select(
echo("Hello"),
echo as echy, echo as echee
)

But when I use the same field twice, only the last “as” takes effect; the value is not copied to both fields:
{
  "result-set": {
    "docs": [
      {
        "echee": "Hello"
      },
      {
        "EOF": true,
        "RESPONSE_TIME": 0
      }
    ]
  }
}

I accomplished this by using leftOuterJoin, with the exact same stream on the left and right, joining it on itself with different field names. But this has the penalty of executing the same stream twice. That is no problem for small streams, but in my case there will be a couple hundred million tuples coming from the stream.


2) Is there a way to “feed” one stream’s output to two different streams? Like feeding the output of a stream source to two different stream decorators without executing the same stream twice?
3) Does the “let” stream hold its entire content in memory when a stream is assigned to a variable, or does it stream continuously too? If not, I imagine it can be used for my question 2.


I’m glad that Solr has streaming expressions.

--ufuk yilmaz

Re: Some small questions on streaming expressions
Your first example looks like a bug to me. This may be a workaround for you:

select(echo("Hello"),
echo as blah,
lower(echo) as blah1)

Returns:

{
  "result-set": {
    "docs": [
      {
        "blah": "Hello",
        "blah1": "hello"
      },
      {
        "EOF": true,
        "RESPONSE_TIME": 0
      }
    ]
  }
}

The string manipulation function works properly, but the straight
mapping does not.
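
One possible explanation for the difference, offered as an assumption rather than something confirmed in this thread: if select keeps its plain "field as alias" mappings in a map keyed by the source field name, a second alias for the same field would overwrite the first, while evaluator expressions such as lower(echo) would be tracked separately. The Python sketch below only models that behaviour; apply_select and its data structures are hypothetical and are not Solr code.

```python
def apply_select(tuple_in, field_aliases, evaluators):
    """Toy model of a select-like projection (hypothetical, not Solr's code).

    field_aliases: list of (source_field, alias) pairs, in the order written.
    evaluators: list of (function, source_field, alias) triples.
    """
    # Assumption: plain mappings live in a dict keyed by the source field,
    # so a later alias for the same field replaces the earlier one.
    mapping = {}
    for source, alias in field_aliases:
        mapping[source] = alias

    out = {alias: tuple_in[source] for source, alias in mapping.items()}

    # Evaluators are kept in their own list, so they never collide with
    # the plain mappings above.
    for func, source, alias in evaluators:
        out[alias] = func(tuple_in[source])
    return out


incoming = {"echo": "Hello"}

# Models: select(echo("Hello"), echo as echy, echo as echee)
lost = apply_select(incoming, [("echo", "echy"), ("echo", "echee")], [])

# Models: select(echo("Hello"), echo as blah, lower(echo) as blah1)
kept = apply_select(incoming, [("echo", "blah")], [(str.lower, "echo", "blah1")])

print(lost)
print(kept)
```

Under this assumption the first call keeps only "echee", matching the output reported in the question, and the second keeps both fields, matching the workaround.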

Your second question: can we split a stream's output into two streams?
Currently only the let expression does this, I believe.

But, to your third question, the let expression does not stream, it's all
in memory. The let expression is designed for vector math over samples or
aggregations (time series).
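
The trade-off can be illustrated outside Solr. In the Python sketch below, which is an analogy only (source is a made-up stand-in for a stream), a materialized list can be read by two consumers but holds every tuple in memory, whereas a generator yields tuples one at a time and is exhausted after the first pass.

```python
def source(n):
    """Hypothetical stand-in for a stream source: yields n tuples lazily."""
    for i in range(n):
        yield {"id": i}


# Materialized, like a variable holding a full result: reusable, but all
# tuples are resident in memory at once.
materialized = list(source(3))
first_pass = [t["id"] for t in materialized]
second_pass = [t["id"] * 10 for t in materialized]

# Streaming: only one tuple is held at a time, but a second consumer
# finds nothing left to read.
streaming = source(3)
stream_first = [t["id"] for t in streaming]
stream_second = [t["id"] for t in streaming]

print(first_pass, second_pass)
print(stream_first, stream_second)
```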

So, right now I don't think we have a way to split a stream and operate
over it with a different set of streams.
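
If the processing can move to the client, one workaround is to read the stream once and hand each tuple to several consumers in the same pass. The sketch below assumes tuples arrive as Python dicts from some reader; fan_out, copy_field, and the sample consumers are hypothetical names, and fetching from Solr and indexing into the new collection are deliberately left out.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Doc = Dict[str, object]


def copy_field(docs: Iterable[Doc], source: str, targets: List[str]) -> Iterator[Doc]:
    """Lazily copy one field's value into several new fields, one tuple at a time."""
    for doc in docs:
        out = {k: v for k, v in doc.items() if k != source}
        for target in targets:
            out[target] = doc[source]
        yield out


def fan_out(docs: Iterable[Doc], consumers: List[Callable[[Doc], None]]) -> int:
    """Read the stream once and pass every tuple to each consumer."""
    count = 0
    for doc in docs:
        for consume in consumers:
            consume(doc)
        count += 1
    return count


# Sample data standing in for tuples read from the source collection.
incoming = iter([{"echo": "Hello"}, {"echo": "World"}])

upper_sink: List[str] = []
length_sink: List[int] = []

processed = fan_out(
    copy_field(incoming, "echo", ["echy", "echee"]),
    [
        lambda d: upper_sink.append(str(d["echy"]).upper()),
        lambda d: length_sink.append(len(str(d["echee"]))),
    ],
)

print(processed, upper_sink, length_sink)
```

Because both functions work tuple by tuple, memory use stays bounded regardless of stream size. In a real reindex the consumers would presumably batch documents and send them to the target collection.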

Joel Bernstein
http://joelsolr.blogspot.com/


On Sat, Feb 27, 2021 at 4:42 PM ufuk yılmaz <uyilmaz@vivaldi.net.invalid>
wrote:

> Hello all,
>
> I’m trying to reindex from a collection to a new collection with a
> different schema, using streaming expressions. I can’t use
> REINDEXCOLLECTION directly, because I need to process documents a bit.
>
> I couldn’t figure out 3 simple, related things for hours so forgive me if
> I just ask.
>
> 1. Is there a way to duplicate the value of a field of an incoming
> tuple into two fields?
>
> I tried the select expression:
>
> select(
> echo("Hello"),
> echo as echy, echo as echee
> )
>
> But when I use the same field twice, only the last “as” takes effect, it
> doesn’t copy the value to two fields:
>
> {
>   "result-set": {
>     "docs": [
>       {
>         "echee": "Hello"
>       },
>       {
>         "EOF": true,
>         "RESPONSE_TIME": 0
>       }
>     ]
>   }
> }
>
> I accomplished this by using leftOuterJoin, with same exact stream in left
> and right, joining on itself with different field names. But this has the
> penalty of executing the same stream twice. It’s no problem for small
> streams but in my case there will be a couple hundred million tuples coming
> from the stream.
>
> 2. Is there a way to “feed” one stream’s output to two different
> streams? Like feeding output of a stream source to two different stream
> decorators without executing the same stream twice?
> 3. Does the “let” stream hold its entire content in memory when a
> stream is assigned to a variable, or does it stream continuously too? If
> not, I imagine it can be used for my question 2.
>
> I’m glad that Solr has streaming expressions.
>
> --ufuk yilmaz