Ticket #26 (closed enhancement: wontfix)

Opened 3 years ago

Last modified 3 years ago

Vector should bit pack unboxed boolean vectors

Reported by: ezyang Owned by:
Priority: major Milestone:
Version: Keywords:
Cc:

Description

As mentioned by dons in his Migrating from uvector to vector blog post, Bool arrays don't appear to be bit packed. They should be.

Change History

Changed 3 years ago by rl

  • status changed from new to closed
  • resolution set to wontfix

Actually, the decision not to use a bit-packed representation is a very deliberate one. There are several reasons for this.

Firstly, this is usually a trade-off between size (bit-packed arrays use less memory) and speed (accessing individual elements is slower). Usually, and 8x increase in the size of Bool arrays is not really a concern but the performance hit is. There are, of course, many cases where the opposite is true and a bit-packed representation is faster sometimes because of cache/memory effects but in my experience, bit packing is usually not worth it.

Secondly, bit-packed arrays don't fit very well into the current framework which always reads and writes individual elements. An operation like zipWith (&&) would be horribly inefficient with a bit-packed representation. It would be possible to optimise this but it's a lot of work.

Lastly, the (currently undocumented) intention is to guarantee that writing array elements of primitive types is an atomic operation. This is really important for multithreaded programs and I would really like Bool to be treated as a primitive type. This rules out a bit-packed representation.

As a final data point, vector<bool> uses a packed representation in C++ and that decision seems to be widely regarded as a mistake.

I'm closing the ticket although, of course, it would be great to also provide bit-packed arrays. But in my opinion, they are really a different data structure and, incidentally, much harder to implement efficiently.

Note: See TracTickets for help on using tickets.