Ticket #75 (closed enhancement: wontfix)

Opened 16 months ago

Last modified 16 months ago

Pack unboxed Bool vectors tightly

Reported by: ds Owned by:
Priority: major Milestone:
Version: Keywords:
Cc:

Description

Hello,

we currently have:

newtype instance Vector Bool = V_Bool (P.Vector Word8)

but apparently, only one Bool is stored in each Word8, instead of the possible 8. Is this deliberately so?

Change History

Changed 16 months ago by rl

  • status changed from new to closed
  • resolution set to wontfix

Yes, this is quite deliberate. There two reasons for this.

Firstly, most vector operations are implemented in terms of reading and writing individual elements. This would be horrendously slow to do for individual bits (it's already quite slow for bytes). There is some work being done on this but that'll take a while. But even if/when this is solved, something like zipWith (&&) would still be very slow, you'd have to use specialised collective operations. In general a bit-packed representation uses less memory but provides much slower access to individual elements. This is not the trade-off that unboxed and storable vectors make in general so doing this just for Bool would be quite inconsistent, IMO.

Secondly, a bit-packed representation doesn't allow parallel access to individual elements, for obvious reasons. AFAIK, on all platforms that GHC runs on (basically x86 and Sparc), byte and half-word writes are essentially atomic so concurrent writes to different elements of an unboxed or storable vector don't interfere with each other. This is a very nice property to have and DPH, for instance, relies on it.

That said, a separate generic packed vector type would be very nice to have.

Changed 16 months ago by ds

Good points; thanks for explaining.

FWIW, I'm probably biased by my current use case, where I have a large number of small bitvectors, and the work consists of virtually nothing but aggregate operations.

Ended up making a specialized class for that (with instances Word, Word64 and Vectors of these, and choice of instance based on a known-at-runtime size, which is often <= 64).

Note: See TracTickets for help on using tickets.