What data structure, exactly, are deques in C++?

Question

Welcome To Ask or Share your Answers For Others

What data structure, exactly, are deques in C++?

posted Oct 6, 2021 in Technique[技术] by 深蓝 (71.8m points)

What data structure, exactly, are deques in C++?

Is there a specific data structure that a deque in the C++ STL is supposed to implement, or is a deque just this vague notion of an array growable from both the front and the back, to be implemented however the implementation chooses?

I used to always assume a deque was a circular buffer, but I was recently reading a C++ reference here, and it sounds like a deque is some kind of array of arrays. It doesn't seem like it's a plain old circular buffer. Is it a gap buffer, then, or some other variant of growable array, or is it just implementation-dependent?

UPDATE AND SUMMARY OF ANSWERS:

It seems the general consensus is that a deque is a data structure such that:

the time to insert or remove an element should be constant at beginning or end of the list and at most linear elsewhere. If we interpret this to mean true constant time and not amortized constant time, as someone comments, this seems challenging. Some have argued that we should not interpret this to mean non-amortized constant time.
"A deque requires that any insertion shall keep any reference to a member element valid. It's OK for iterators to be invalidated, but the members themselves must stay in the same place in memory." As someone comments: This is easy enough by just copying the members to somewhere on the heap and storing T* in the data structure under the hood.
"Inserting a single element either at the beginning or end of a deque always takes constant time and causes a single call to a constructor of T." The single constructor of T will also be achieved if the data structure stores T* under the hood.
The data structure must have random access.

It seems no one knows how to get a combination of the 1st and 4th conditions if we take the first condition to be "non-amortized constant time". A linked list achieves 1) but not 4), whereas a typical circular buffer achieves 4) but not 1). I think I have an implementation that fulfills both below. Comments?

We start with an implementation someone else suggested: we allocate an array and start placing elements from the middle, leaving space in both the front and back. In this implementation, we keep track of how many elements there are from the center in both the front and back directions, call those values F and B. Then, let's augment this data structure with an auxiliary array that is twice the size of the original array (so now we're wasting a ton of space, but no change in asymptotic complexity). We will also fill this auxiliary array from its middle and give it similar values F' and B'. The strategy is this: every time we add one element to the primary array in a given direction, if F > F' or B > B' (depending on the direction), up to two values are copied from the primary array to the auxiliary array until F' catches up with F (or B' with B). So an insert operation involves putting 1 element into the primary array and copying up to 2 from the primary to the auxiliary, but it's still O(1). When the primary array becomes full, we free the primary array, make the auxiliary array the primary array, and make another auxiliary array that's yet 2 times bigger. This new auxiliary array starts out with F' = B' = 0 and having nothing copied to it (so the resize op is O(1) if a heap allocation is O(1) complexity). Since the auxiliary copies 2 elements for every element added to the primary and the primary starts out at most half-full, it is impossible for the auxiliary to not have caught up with the primary by the time the primary runs out of space again. Deletions likewise just need to remove 1 element from the primary and either 0 or 1 from the auxiliary. So, assuming heap allocations are O(1), this implementation fulfills condition 1). We make the array be of T* and use new whenever inserting to fulfill conditions 2) and 3). Finally, 4) is fulfilled because we are using an array structure and can easily implement O(1) access.

question from:https://stackoverflow.com/questions/8627373/what-data-structure-exactly-are-deques-in-c

与恶龙缠斗过久,自身亦成为恶龙；凝视深渊过久,深渊将回以凝视…

1 Reply

深蓝 · Answer 1 · 2021-10-06T05:13:08+0000

(Making this answer a community-wiki. Please get stuck in.)

First things first: A deque requires that any insertion to the front or back shall keep any reference to a member element valid. It's OK for iterators to be invalidated, but the members themselves must stay in the same place in memory. This is easy enough by just copying the members to somewhere on the heap and storing T* in the data structure under the hood. See this other StackOverflow question " About deque<T>'s extra indirection "

(vector doesn't guarantee to preserve either iterators or references, whereas list preserves both).

So let's just take this 'indirection' for granted and look at the rest of the problem. The interesting bit is the time to insert or remove from the beginning or end of the list. At first, it looks like a deque could trivially be implemented with a vector, perhaps by interpreting it as a circular buffer.

BUT A deque must satisfy "Inserting a single element either at the beginning or end of a deque always takes constant time and causes a single call to a constructor of T."

Thanks to the indirection we've already mentioned, it's easy to ensure there is just one constructor call, but the challenge is to guarantee constant time. It would be easy if we could just use constant amortized time, which would allow the simple vector implementation, but it must be constant (non-amortized) time.

Categories

What data structure, exactly, are deques in C++?

What data structure, exactly, are deques in C++?

Please log in or register to add a comment.

Please log in or register to reply this article.

1 Reply

Please log in or register to add a comment.

Just Browsing Browsing

Most popular tags