If I perform a side-effecting/mutating operation on individual data structures specific to each member of a lazy sequence using map, do I need to (a) call doall first, to force realization of the original sequence before performing the imperative operations, or (b) call doall to force the side effects to occur before I map a functional operation over the resulting sequence?
I believe that no doalls are necessary when there are no dependencies between elements of any sequence, since map can't apply a function to a member of a sequence until the functions from the maps that produced that sequence have been applied to the corresponding element of the earlier sequence. Thus, for each element, the functions will be applied in the proper order, even though one of the functions produces side effects that a later function depends on. (I know that I can't assume that any element a will have been modified before element b is, but that doesn't matter.)
Is this correct?
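To make the claim concrete, here is a minimal sketch of what I mean. Everything in it is a hypothetical stand-in: make-item, step1, step2!, step3 and call-log are illustration names, and an atom plays the role of the mutable matrix. It is not my real code, just the same shape of pipeline.

```clojure
;; Hypothetical stand-ins: an atom per element plays the role of the mutable matrix.
(def call-log (atom []))

(defn make-item [id]
  {:id id :cell (atom 0)})

(defn step1 [item]
  ;; functional update, like assoc'ing vec1
  (swap! call-log conj [:step1 (:id item)])
  (assoc item :x (* 10 (:id item))))

(defn step2! [item]
  ;; destructive update of the per-element cell, based on the result of step1
  (swap! call-log conj [:step2 (:id item)])
  (reset! (:cell item) (inc (:x item)))
  item)

(defn step3 [item]
  ;; functional update that reads the mutated cell
  (swap! call-log conj [:step3 (:id item)])
  (assoc item :y @(:cell item)))

(def result
  (doall (->> (map make-item (range 3))
              (map step1)
              (map step2!)
              (map step3))))

(mapv :y result)
;; => [1 11 21]
```

Each :y reflects the mutation made by step2! for the same element. The order of entries in call-log across different elements may vary (for example because of chunking), but for any single element the entries appear as step1, step2, step3.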
That's the question, and if it's sufficiently clear, then there's no need to read further. The rest describes what I'm trying to do in more detail.
My application has a sequence of defrecord structures ("agents"), each of which contains some core.matrix vectors (vec1, vec2) and a core.matrix matrix (mat). Suppose that, for the sake of speed, I decide to modify the matrix destructively rather than functionally.
The program performs the following three steps on each of the agents by calling map three times, once per step:
- Update a vector vec1 in each agent, functionally, using assoc.
- Modify a matrix mat in each agent based on the preceding vector (i.e. the matrix will retain a different state).
- Update a vector vec2 in each agent using assoc, based on the state of the matrix produced by step 2.
For example, where persons is a sequence, possibly lazy (EDIT: Added outer doalls):
    (doall
      (->> persons
           (map #(assoc % :vec1 (calc-vec1 %)))              ; update vec1 from person
           (map update-mat-from-vec1!)                       ; modify mat based on state of vec1
           (map #(assoc % :vec2 (calc-vec2-from-mat %)))))   ; update vec2 based on state of mat
Alternatively:
    (doall
      (map #(assoc % :vec2 (calc-vec2-from-mat %))           ; update vec2 based on state of mat
           (map update-mat-from-vec1!                        ; modify mat based on state of vec1
                (map #(assoc % :vec1 (calc-vec1 %)) persons))))  ; update vec1 from person
Note that no agent's state depends on the state of any other agent at any point. Do I need to add doalls?
EDIT: Overview of answers as of 4/16/2014:
I recommend reading all of the answers given, but it may seem as if they conflict. They don't, and I thought it might be useful if I summarized the main ideas:
(1) The answer to my question is "Yes": If, at the end of the process I described, one causes the entire lazy sequence to be realized, then what is done to each element will occur in the correct order of steps (1, 2, 3). There is no need to apply doall before or after step 2, in which each element's data structure is mutated.
(2) But: This is a very bad idea; you are asking for trouble in the future. If at some point you inadvertently end up realizing all or part of the sequence at a time other than what you originally intended, the later steps could read values from the data structure that were put there at the wrong time--at a time other than what you expect. The step that mutates a per-element data structure won't happen until a given element of the lazy seq is realized, so if you realize it at the wrong time, you could get the wrong data in later steps. This could be the kind of bug that is very difficult to track down. (Thanks to @A.Webb for making this problem very clear.)
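A small sketch of why the timing matters, again with hypothetical names (cells, mutated, before, after) and atoms standing in for the per-element mutable data. It only illustrates that the mutation is deferred until realization; it is not taken from any of the answers.

```clojure
;; Three independent mutable cells, one per element.
(def cells (mapv (fn [_] (atom 0)) (range 3)))

;; A lazy map whose function mutates each cell. Defining it runs nothing yet.
(def mutated (map (fn [c] (swap! c inc) c) cells))

;; Anything that reads the cells now still sees the old state.
(def before (mapv deref cells))
;; before => [0 0 0]

;; Only realizing the lazy seq triggers the mutations.
(dorun mutated)

(def after (mapv deref cells))
;; after => [1 1 1]
```

Code that reads the cells between the definition of mutated and its realization gets the old values, which is the kind of timing dependence point (2) warns about.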