I'm not sure if this approach works on CosmosDB, but it may well be the only approach you can realistically take with Gremlin that doesn't involve multiple Gremlin requests and extra processing. I use this sample graph for demonstration:
g = TinkerGraph.open().traversal()
g.addV().property(id,'A').as('a').
addV().property(id,'B').as('b').
addV().property(id,'C').as('c').
addV().property(id,'D').as('d').
addV().property(id,'E').as('e').
addV().property(id,'F').as('f').
addE('next').from('a').to('b').
addE('next').from('b').to('c').
addE('next').from('b').to('d').
addE('next').from('c').to('e').
addE('next').from('d').to('e').
addE('next').from('e').to('f').iterate()
The approach involves the use of group(), where you essentially form a new grouping of traversed objects each time you loop through the repeat():
gremlin> g.V('A').
......1> group('m').by(constant(-1)).
......2> repeat(out().group('m').by(loops())).
......3> cap('m')
==>[-1:[v[A]],0:[v[B]],1:[v[C],v[D]],2:[v[E],v[E]],3:[v[F],v[F]]]
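To make the shape of that map easier to reason about, here is a minimal plain-Python model of the same bookkeeping. It is only a sketch of the idea, not Gremlin and not how TinkerPop implements group(). The EDGES dict and the name group_by_depth are made up for illustration; the keys mirror the traversal above, with -1 for the start vertex and the loop counter for each pass through repeat().

```python
# Hypothetical adjacency list for the sample graph:
# A->B, B->C, B->D, C->E, D->E, E->F.
EDGES = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["F"], "F": []}


def group_by_depth(start):
    """Group every traverser by depth, keeping duplicates as the traversal does."""
    groups = {-1: [start]}  # models group('m').by(constant(-1))
    frontier = [start]
    loops = 0               # models loops(), which starts at 0
    while True:
        # Models out(): one entry per path, so E appears twice (via C and via D).
        frontier = [w for v in frontier for w in EDGES[v]]
        if not frontier:
            break
        groups[loops] = frontier  # models group('m').by(loops())
        loops += 1
    return groups


print(group_by_depth("A"))
```

Because the model follows paths rather than distinct vertices, E and F each show up twice, just as in the console output.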
That gives you the structure of the data you want to process. Now you just need to make sure you terminate the repeat() as early as possible:
gremlin> g.V('A').
......1> group('m').by(constant(-1)).
......2> until(cap('m').select(values).unfold().count(local).sum().is(gt(3))).
......3> repeat(out().group('m').by(loops())).
......4> cap('m')
==>[-1:[v[A]],0:[v[B]],1:[v[C],v[D]]]
In the above example, we look at "m" in the until() and count all the vertices collected so far. When that count exceeds our max, in this case "3", we quit. At that point we may or may not have collected more than we needed. In this example we did (four vertices against a limit of three), so the excess needs to be thrown away. You technically need all but the last grouping to satisfy your limit, but unfortunately "all but last" is not easy with Gremlin. I ended up with this approach, which grabs the last grouping and then uses it as a filter against the result. Note that we get two results because traversing to the next level would exceed our limit of "3" results total:
gremlin> g.V('A').
......1> group('m').by(constant(-1)).
......2> until(cap('m').select(values).unfold().count(local).sum().is(gt(3))).
......3> repeat(out().group('m').by(loops())).
......4> cap('m').
......5> select(values).as('v').
......6> tail(local).as('e').
......7> select('v').unfold().
......8> where(P.neq('e')).
......9> unfold()
==>v[A]
==>v[B]
Note that when we bump the limit from "3" to "4" the result changes: traversing to the next level adds 2 more vertices, which brings the total to 4 but does not exceed it.
gremlin> g.V('A').
......1> group('m').by(constant(-1)).
......2> until(cap('m').select(values).unfold().count(local).sum().is(gt(4))).
......3> repeat(out().group('m').by(loops())).
......4> cap('m').
......5> select(values).as('v').
......6> tail(local).as('e').
......7> select('v').unfold().
......8> where(P.neq('e')).
......9> unfold()
==>v[A]
==>v[B]
==>v[C]
==>v[D]
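The "stop once the total exceeds the limit, then discard the level that overshot" logic can be sketched in plain Python as follows. This is a model of the idea rather than a translation of the Gremlin; EDGES and collect_within_limit are hypothetical names. The branch that handles running out of graph before the limit is reached is an assumption of this sketch, since the traversals above do not demonstrate that case.

```python
# Hypothetical adjacency list for the sample graph:
# A->B, B->C, B->D, C->E, D->E, E->F.
EDGES = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["F"], "F": []}


def collect_within_limit(start, limit):
    """Collect whole levels from start, dropping the level that exceeds limit."""
    levels = [[start]]
    frontier = [start]
    # Models until(... is(gt(limit))): the check runs before each pass of the loop.
    while sum(len(level) for level in levels) <= limit:
        frontier = [w for v in frontier for w in EDGES[v]]
        if not frontier:
            # Assumption: the graph ran out before the limit was exceeded,
            # so nothing overshot and every level is kept.
            return [v for level in levels for v in level]
        levels.append(frontier)
    # The last level pushed the total past the limit, so discard it.
    return [v for level in levels[:-1] for v in level]


print(collect_within_limit("A", 3))
print(collect_within_limit("A", 4))
```

With a limit of 3 the level holding C and D overshoots and is dropped; with a limit of 4 it fits exactly, and the following level is the one that gets discarded.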
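This next example shows that duplicates are not taken into account: with a limit of "5", v[E] is reached through both v[C] and v[D] and is counted twice, so the total jumps to 6 and that level is discarded. It's not clear from your use case what's expected there (or if this will even work), but hopefully this provides enough structure for you to at least form the traversal you're looking for: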
gremlin> g.V('A').
......1> group('m').by(constant(-1)).
......2> until(cap('m').select(values).unfold().count(local).sum().is(gt(5))).
......3> repeat(out().group('m').by(loops())).
......4> cap('m').
......5> select(values).as('v').
......6> tail(local).as('e').
......7> select('v').unfold().
......8> where(P.neq('e')).
......9> unfold()
==>v[A]
==>v[B]
==>v[C]
==>v[D]
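If each vertex should instead count only once, the counting would change as in the plain-Python sketch below. It only models the intended arithmetic under that assumption; it is not Gremlin, and whether and where a dedup() step would achieve the same thing in the traversal is not shown here. EDGES and collect_within_limit_dedup are hypothetical names.

```python
# Hypothetical adjacency list for the sample graph:
# A->B, B->C, B->D, C->E, D->E, E->F.
EDGES = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": ["F"], "F": []}


def collect_within_limit_dedup(start, limit):
    """Like the level-by-level collection, but each vertex is counted once."""
    seen = {start}
    levels = [[start]]
    frontier = [start]
    while sum(len(level) for level in levels) <= limit:
        next_frontier = []
        for v in frontier:
            for w in EDGES[v]:
                if w not in seen:  # skip vertices already reached by another path
                    seen.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
        if not frontier:
            # Assumption: the graph ran out before the limit was exceeded,
            # so every level is kept.
            return [v for level in levels for v in level]
        levels.append(frontier)
    # The last level pushed the total past the limit, so discard it.
    return [v for level in levels[:-1] for v in level]


print(collect_within_limit_dedup("A", 5))
```

Under this counting, E contributes 1 rather than 2, so a limit of 5 is reached exactly at E's level and only F's level is discarded.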
Thanks to Kelvin Lawrence for suggesting the general approach I've taken in this answer.