| | 1 | |
| | 2 | = DRAFT! Not yet submitted! = |
| | 3 | |
| | 4 | = Proposal: Add binary 0.5.0.2 to the Haskell Platform = |
| | 5 | |
| | 6 | Proposal Author: Ian Lynagh |
| | 7 | |
| | 8 | Maintainer: Lennart Kolmodin, Don Stewart |
| | 9 | |
| | 10 | == Introduction == |
| | 11 | |
| | 12 | This is a proposal for the [http://hackage.haskell.org/package/binary binary] package to be included in the next |
| | 13 | major release of the Haskell Platform. |
| | 14 | |
| | 15 | Everyone is invited to review this proposal, following the standard |
| | 16 | procedure for proposing and reviewing packages. |
| | 17 | |
| | 18 | http://trac.haskell.org/haskell-platform/wiki/AddingPackages |
| | 19 | |
| | 20 | Review comments should be sent to the libraries mailing list by |
| | 21 | January 31st. |
| | 22 | |
| | 23 | == Credits == |
| | 24 | |
| | 25 | The following individuals contributed to the review process: <no-one, yet!> |
| | 26 | |
| | 27 | == Abstract == |
| | 28 | |
| | 29 | The 'binary' package provides efficient, pure binary serialisation using |
| | 30 | lazy !ByteStrings. |
| | 31 | |
| | 32 | Haskell values may be encoded to and decoded from binary formats, written to |
| | 33 | disk as binary, or sent over the network. |
| | 34 | |
| | 35 | The binary format can either be an externally defined format, or |
| | 36 | binary's internal default format may be used if you wish only to |
| | 37 | serialise and deserialise from a Haskell program. |
| | 38 | |
| | 39 | Documentation and tarball from the hackage page: |
| | 40 | |
| | 41 | http://hackage.haskell.org/package/binary |
| | 42 | |
| | 43 | Main development repo: |
| | 44 | |
| | 45 | darcs get http://code.haskell.org/binary/ |
| | 46 | |
| | 47 | Active branches: |
| | 48 | |
| | 49 | darcs get http://www.haskell.org/~kolmodin/code/binary-push |
| | 50 | |
| | 51 | darcs get http://www.haskell.org/~kolmodin/code/binary-push-unpacked |
| | 52 | |
| | 53 | All package requirements are met. |
| | 54 | |
| | 55 | == Rationale == |
| | 56 | |
| | 57 | `binary` provides basic functionality not yet available in the Haskell Platform. |
| | 58 | |
| | 59 | `binary` has 193 [http://bifunctor.homelinux.net/~roel/cgi-bin/hackage-scripts/revdeps/binary-0.5.0.2#direct direct reverse dependencies], including `Agda`, `hxt`, `Pugs`, `SHA` and `tar`. It is also used by GHC, which currently ships its own renamed copy because `binary` is not in the HP. |
| | 60 | |
| | 61 | == The API == |
| | 62 | |
| | 63 | The API is broken up into four pieces: |
| | 64 | |
| | 65 | * The main interface, for serialising and deserialising values: |
| | 66 | |
| | 67 | http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary.html |
| | 68 | |
| | 69 | * Functions for implementing serialisation for datatypes: |
| | 70 | |
| | 71 | http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Put.html |
| | 72 | |
| | 73 | * Functions for implementing deserialisation for datatypes: |
| | 74 | |
| | 75 | http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Get.html |
| | 76 | |
| | 77 | * An internal type used for constructing !ByteStrings incrementally: |
| | 78 | |
| | 79 | http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Builder.html |
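
As a rough illustration of how the `Data.Binary.Put` and `Data.Binary.Get` modules can describe an externally defined format, here is a minimal sketch. The `Header` type and its layout (a 4-byte big-endian magic number followed by a 2-byte big-endian version) are hypothetical and invented for this example; only the `putWord*be`, `getWord*be`, `runPut` and `runGet` functions come from the package.

{{{
import Data.Binary.Get (Get, getWord16be, getWord32be, runGet)
import Data.Binary.Put (Put, putWord16be, putWord32be, runPut)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word16, Word32)

-- Hypothetical fixed-layout header (an assumption for illustration):
-- 4-byte big-endian magic number, then 2-byte big-endian version.
data Header = Header { magic :: Word32, version :: Word16 }
  deriving (Eq, Show)

-- Write the fields in the order and byte order the format prescribes.
putHeader :: Header -> Put
putHeader (Header m v) = do
  putWord32be m
  putWord16be v

-- Read the fields back in the same order.
getHeader :: Get Header
getHeader = do
  m <- getWord32be
  v <- getWord16be
  return (Header m v)

main :: IO ()
main = do
  let h     = Header 0xCAFEBABE 1
      bytes = runPut (putHeader h)
  print (L.unpack bytes)               -- [202,254,186,190,0,1]
  print (runGet getHeader bytes == h)  -- True
}}}

Because the byte order is chosen explicitly per field, the same approach should carry over to other fixed layouts, while the `Binary` class remains the more convenient route when only Haskell programs read the data.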
| | 80 | |
| | 81 | Here is an example of the basic functionality, from the haddock docs: |
| | 82 | |
| | 83 | To serialise a custom type, an instance of Binary for that type is |
| | 84 | required. For example, suppose we have a data structure: |
| | 85 | |
| | 86 | {{{ |
| | 87 | > data Exp = IntE Int |
| | 88 | > | OpE String Exp Exp |
| | 89 | > deriving Show |
| | 90 | }}} |
| | 91 | |
| | 92 | We can encode values of this type into bytestrings using the |
| | 93 | following instance, which proceeds by recursively breaking down the |
| | 94 | structure to serialise: |
| | 95 | |
| | 96 | {{{ |
| | 97 | > instance Binary Exp where |
| | 98 | > put (IntE i) = do put (0 :: Word8) |
| | 99 | > put i |
| | 100 | > put (OpE s e1 e2) = do put (1 :: Word8) |
| | 101 | > put s |
| | 102 | > put e1 |
| | 103 | > put e2 |
| | 104 | > |
| | 105 | > get = do t <- get :: Get Word8 |
| | 106 | > case t of |
| | 107 | > 0 -> do i <- get |
| | 108 | > return (IntE i) |
| | 109 | > 1 -> do s <- get |
| | 110 | > e1 <- get |
| | 111 | > e2 <- get |
| | 112 | > return (OpE s e1 e2) |
| | 113 | }}} |
| | 114 | |
| | 115 | Note how we write an initial tag byte to indicate each variant of the |
| | 116 | data type. |
| | 117 | |
| | 118 | We can simplify the writing of 'get' instances using monadic |
| | 119 | combinators: |
| | 120 | |
| | 121 | {{{ |
| | 122 | > get = do tag <- getWord8 |
| | 123 | > case tag of |
| | 124 | > 0 -> liftM IntE get |
| | 125 | > 1 -> liftM3 OpE get get get |
| | 126 | }}} |
| | 127 | |
| | 128 | To serialise this to a bytestring, we use 'encode', which packs the |
| | 129 | data structure into a binary format, in a lazy bytestring: |
| | 130 | |
| | 131 | {{{ |
| | 132 | > > let e = OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2)) |
| | 133 | > > let v = encode e |
| | 134 | }}} |
| | 135 | |
| | 136 | Here 'v' is a binary encoded data structure. To reconstruct the |
| | 137 | original data, we use 'decode': |
| | 138 | |
| | 139 | {{{ |
| | 140 | > > decode v :: Exp |
| | 141 | > OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2)) |
| | 142 | }}} |
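
For readers who want to try the excerpts above outside GHCi, the pieces can be assembled into one module roughly as follows. This is a sketch rather than the package's own documentation: the imports, the `Eq` instance, the use of `putWord8` for the tag, and the fallback case for unknown tags are additions made here so that the module compiles on its own and the round trip can be checked.

{{{
import Control.Monad (liftM, liftM3)
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord8)
import Data.Binary.Put (putWord8)

data Exp = IntE Int
         | OpE String Exp Exp
  deriving (Eq, Show)   -- Eq is added here only to compare results

instance Binary Exp where
  -- One tag byte per constructor, followed by the fields in order.
  put (IntE i)      = putWord8 0 >> put i
  put (OpE s e1 e2) = putWord8 1 >> put s >> put e1 >> put e2

  get = do
    tag <- getWord8
    case tag of
      0 -> liftM IntE get
      1 -> liftM3 OpE get get get
      -- Not in the original excerpt: reject tags we never write.
      _ -> fail ("Exp: unknown tag " ++ show tag)

main :: IO ()
main = do
  let e = OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2))
  print (decode (encode e) == e)   -- round trip: prints True
}}}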
| | 143 | |
| | 144 | The lazy !ByteString that results from 'encode' can be written to |
| | 145 | disk, and read from disk using Data.!ByteString.Lazy IO functions, |
| | 146 | such as hPutStr or writeFile: |
| | 147 | |
| | 148 | {{{ |
| | 149 | > > writeFile "/tmp/exp.txt" (encode e) |
| | 150 | }}} |
| | 151 | |
| | 152 | And read back with: |
| | 153 | |
| | 154 | {{{ |
| | 155 | > > readFile "/tmp/exp.txt" >>= return . decode :: IO Exp |
| | 156 | > OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2)) |
| | 157 | }}} |
| | 158 | |
| | 159 | We can also directly serialise a value to and from a Handle, or a file: |
| | 160 | |
| | 161 | {{{ |
| | 162 | > > v <- decodeFile "/tmp/exp.txt" :: IO Exp |
| | 163 | > OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2)) |
| | 164 | }}} |
| | 165 | |
| | 166 | And write a value to disk: |
| | 167 | |
| | 168 | {{{ |
| | 169 | > > encodeFile "/tmp/a.txt" v |
| | 170 | }}} |
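
The GHCi lines above only type-check if `writeFile` and `readFile` are the `Data.ByteString.Lazy` versions, since the Prelude ones work on `String`. A small standalone sketch of both routes, assuming a qualified import and two hypothetical file names in the system temporary directory, might look like this:

{{{
import Data.Binary (decode, decodeFile, encode, encodeFile)
import qualified Data.ByteString.Lazy as L
import System.Directory (getTemporaryDirectory)
import System.FilePath ((</>))

main :: IO ()
main = do
  tmp <- getTemporaryDirectory
  -- Hypothetical file names, chosen only for this example.
  let viaBytesPath  = tmp </> "binary-example-bytes.bin"
      viaHelperPath = tmp </> "binary-example-helper.bin"
      value         = [(1, "one"), (2, "two")] :: [(Int, String)]

  -- Explicit route: encode to a lazy ByteString, then write and read it.
  L.writeFile viaBytesPath (encode value)
  fromBytes <- fmap decode (L.readFile viaBytesPath) :: IO [(Int, String)]
  print (fromBytes == value)

  -- Helper route: encodeFile and decodeFile do both steps in one call.
  encodeFile viaHelperPath value
  fromHelper <- decodeFile viaHelperPath :: IO [(Int, String)]
  print (fromHelper == value)
}}}

Separate files are used for the two routes because lazy reads may keep a file open for longer than expected, which can interfere with rewriting the same path.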
| | 171 | |
| | 172 | == Design decisions and random facts == |
| | 173 | |
| | 174 | * The interface is pure, modulo IO helper functions for (de)serialising directly to files |
| | 175 | * Built on top of lazy !ByteString |
| | 176 | * Uses CPP extension |
| | 177 | * When building with GHC, uses !MagicHash and !UnboxedTuples extensions |
| | 178 | * Uses !FlexibleContexts extension for this instance: |
| | 179 | instance (Binary i, Ix i, Binary e, IArray UArray e) => Binary (UArray i e) where |
| | 180 | * The implementation is entirely Haskell (no additional C code or libraries). |
| | 181 | * The package provides a !QuickCheck testsuite and some benchmarks. |
| | 182 | * The package adds no new dependencies to the HP. |
| | 183 | * The package builds with the Simple Cabal build type. |
| | 184 | * There is no existing functionality for binary serialisation in the HP. |
| | 185 | * All but one of the exports have haddock docs, and many have complexity annotations. |
| | 186 | * The code is -Wall clean. |
| | 187 | |
| | 188 | == Open issues == |
| | 189 | |
| | 190 | 1. There is currently work on redesigning the parsing interface |
| | 191 | to support incremental parsing. The work is taking place in |
| | 192 | the `binary-push` and `binary-push-unpacked` branches, and |
| | 193 | the changes are in the `Data.Binary.Get` module. |
| | 194 | We may wish to accept the package with this change, rather |
| | 195 | than adding it in its current form. |
| | 196 | |
| | 197 | == Notes == |
| | 198 | |
| | 199 | The implementation consists of 4 modules. |
| | 200 | The modules are under 2000 lines, under 1000 of which are actual code. |