Hello all,
I’m trying to reindex from a collection to a new collection with a different schema, using streaming expressions. I can’t use REINDEXCOLLECTION directly, because I need to process documents a bit.
I couldn’t figure out 3 simple, related things after hours of trying, so forgive me for just asking.
1) Is there a way to duplicate the value of a field of an incoming tuple into two fields?
I tried the select expression:
select(
  echo("Hello"),
  echo as echy,
  echo as echee
)
But when I use the same field twice, only the last “as” takes effect, it doesn’t copy the value to two fields:
{
  "result-set": {
    "docs": [
      {
        "echee": "Hello"
      },
      {
        "EOF": true,
        "RESPONSE_TIME": 0
      }
    ]
  }
}
I accomplished this by using leftOuterJoin with the exact same stream on the left and right, joining the stream on itself with different field names. But this has the penalty of executing the same stream twice. That’s no problem for small streams, but in my case a couple hundred million tuples will be coming from the stream.
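For reference, my workaround looked roughly like this (the collection and field names here are placeholders, not my real schema; both sides must be sorted on the join key, and note that the same search runs twice):

leftOuterJoin(
  select(
    search(myCollection, q="*:*", fl="id,title", sort="id asc", qt="/export"),
    id,
    title as title_copy1
  ),
  select(
    search(myCollection, q="*:*", fl="id,title", sort="id asc", qt="/export"),
    id as joinId,
    title as title_copy2
  ),
  on="id=joinId"
)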
2) Is there a way to “feed” one stream’s output into two different streams? That is, feeding the output of a single stream source into two different stream decorators without executing the same stream twice?
3) Does the “let” stream hold its entire content in memory when a stream is assigned to a variable, or does it stream continuously too? If it holds everything in memory, I imagine it could be used for my question 2.
I’m glad that Solr has streaming expressions.
--ufuk yilmaz
Sent from Mail for Windows 10