Changes between Initial Version and Version 1 of Proposals/binary

Timestamp:
11/04/10 18:12:54 (4 years ago)
Author:
igloo (IP: 81.241.228.230)
= DRAFT! Not yet submitted! =

= Proposal: Add binary 0.5.0.2 to the Haskell Platform =

Proposal Author: Ian Lynagh

Maintainer: Lennart Kolmodin, Don Stewart

== Introduction ==

This is a proposal for the [http://hackage.haskell.org/package/binary binary] package to be included in the next
major release of the Haskell platform.

Everyone is invited to review this proposal, following the standard
procedure for proposing and reviewing packages.

    http://trac.haskell.org/haskell-platform/wiki/AddingPackages

Review comments should be sent to the libraries mailing list by
January 31st.

== Credits ==

The following individuals contributed to the review process: <no-one, yet!>

== Abstract ==

The 'binary' package provides efficient, pure binary serialisation using
lazy !ByteStrings.

Haskell values may be encoded to and decoded from binary formats, written to
disk as binary, or sent over the network.

The binary format can either be an externally defined format, or
binary's internal default format may be used if you wish only to
serialise and deserialise from a Haskell program.

Documentation and tarball from the hackage page:

    http://hackage.haskell.org/package/binary

Main development repo:

    darcs get http://code.haskell.org/binary/

Active branches:

    darcs get http://www.haskell.org/~kolmodin/code/binary-push

    darcs get http://www.haskell.org/~kolmodin/code/binary-push-unpacked

All package requirements are met.

== Rationale ==

`binary` provides basic functionality not yet available in the Haskell Platform.

`binary` has 193 [http://bifunctor.homelinux.net/~roel/cgi-bin/hackage-scripts/revdeps/binary-0.5.0.2#direct direct reverse dependencies], including `Agda`, `hxt`, `Pugs`, `SHA` and `tar`. It is also used by GHC, although GHC's copy is currently renamed, because `binary` is not in the HP.

== The API ==

The API is broken up into four pieces:

 * The main interface, for serialising and deserialising values:

    http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary.html

 * Functions for implementing serialisation for datatypes:

    http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Put.html

 * Functions for implementing deserialisation for datatypes:

    http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Get.html

 * An internal type used for constructing !ByteStrings incrementally:

    http://hackage.haskell.org/packages/archive/binary/0.5.0.2/doc/html/Data-Binary-Builder.html
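
The `Data.Binary.Put` and `Data.Binary.Get` modules listed above are also the route to an externally defined format, where the byte layout is dictated by a specification rather than by the `Binary` class defaults. The following is a minimal sketch under stated assumptions: the `Header` type and its layout (a 2-byte tag followed by a 4-byte length, both big-endian) are hypothetical and chosen only for illustration.

```haskell
import Data.Binary.Get (Get, getWord16be, getWord32be, runGet)
import Data.Binary.Put (Put, putWord16be, putWord32be, runPut)
import qualified Data.ByteString.Lazy as L
import Data.Word (Word16, Word32)

-- Hypothetical fixed-layout header, not part of the binary package:
-- a 2-byte tag followed by a 4-byte length, both big-endian.
data Header = Header { hdrTag :: Word16, hdrLen :: Word32 }
  deriving (Eq, Show)

-- Write the fields in exactly the order and width the layout requires.
putHeader :: Header -> Put
putHeader (Header t n) = do
  putWord16be t
  putWord32be n

-- Read the fields back in the same order.
getHeader :: Get Header
getHeader = do
  t <- getWord16be
  n <- getWord32be
  return (Header t n)

-- Run the serialiser to obtain a lazy ByteString.
encodeHeader :: Header -> L.ByteString
encodeHeader = runPut . putHeader

-- Run the parser over a lazy ByteString.
decodeHeader :: L.ByteString -> Header
decodeHeader = runGet getHeader
```

Because the layout is spelled out with explicit-width, explicit-endianness primitives, the encoded bytes do not depend on how the default `Binary` instances happen to represent `Word16` or `Word32`.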

Here is an example of the basic functionality, from the haddock docs:

    To serialise a custom type, an instance of Binary for that type is
    required. For example, suppose we have a data structure:

    {{{
    > data Exp = IntE Int
    >          | OpE  String Exp Exp
    >    deriving Show
    }}}

    We can encode values of this type into bytestrings using the
    following instance, which proceeds by recursively breaking down the
    structure to serialise:

    {{{
    > instance Binary Exp where
    >       put (IntE i)          = do put (0 :: Word8)
    >                                  put i
    >       put (OpE s e1 e2)     = do put (1 :: Word8)
    >                                  put s
    >                                  put e1
    >                                  put e2
    >
    >       get = do t <- get :: Get Word8
    >                case t of
    >                     0 -> do i <- get
    >                             return (IntE i)
    >                     1 -> do s  <- get
    >                             e1 <- get
    >                             e2 <- get
    >                             return (OpE s e1 e2)
    }}}

    Note how we write an initial tag byte to indicate each variant of the
    data type.

    We can simplify the writing of 'get' instances using monadic
    combinators:

    {{{
    >       get = do tag <- getWord8
    >                case tag of
    >                    0 -> liftM  IntE get
    >                    1 -> liftM3 OpE  get get get
    }}}
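
The fragments quoted from the haddock docs leave out their imports and do not say what happens on an unrecognised tag byte. The sketch below assembles them into one loadable module. The `Eq` instance, the catch-all `fail` branch, and the `roundTrip` and `sample` names are additions made here for illustration and are not part of the quoted documentation.

```haskell
import Control.Monad (liftM, liftM3)
import Data.Binary (Binary (..), decode, encode)
import Data.Binary.Get (getWord8)
import Data.Binary.Put (putWord8)

data Exp = IntE Int
         | OpE String Exp Exp
  deriving (Eq, Show)  -- Eq is added here so round trips can be compared

instance Binary Exp where
  -- One tag byte per constructor, followed by the constructor's fields.
  put (IntE i)      = putWord8 0 >> put i
  put (OpE s e1 e2) = putWord8 1 >> put s >> put e1 >> put e2

  get = do
    tag <- getWord8
    case tag of
      0 -> liftM IntE get
      1 -> liftM3 OpE get get get
      -- Assumption: reject unknown tags instead of leaving the case partial.
      _ -> fail ("Exp: unknown tag " ++ show tag)

-- Encode to a lazy ByteString and decode straight back.
roundTrip :: Exp -> Exp
roundTrip = decode . encode

sample :: Exp
sample = OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2))
```

Loading this in GHCi and evaluating `roundTrip sample == sample` is a quick way to check that `put` and `get` agree on the format.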

    To serialise this to a bytestring, we use 'encode', which packs the
    data structure into a binary format, in a lazy bytestring:

    {{{
    > > let e = OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2))
    > > let v = encode e
    }}}

    Where 'v' is a binary encoded data structure. To reconstruct the
    original data, we use 'decode':

    {{{
    > > decode v :: Exp
    > OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2))
    }}}

    The lazy !ByteString that results from 'encode' can be written to
    disk, and read from disk using Data.!ByteString.Lazy IO functions,
    such as hPutStr or writeFile:

    {{{
    > > writeFile "/tmp/exp.txt" (encode e)
    }}}

    And read back with:

    {{{
    > > readFile "/tmp/exp.txt" >>= return . decode :: IO Exp
    > OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2))
    }}}

    We can also directly serialise a value to and from a Handle, or a file:

    {{{
    > > v <- decodeFile  "/tmp/exp.txt" :: IO Exp
    > OpE "*" (IntE 7) (OpE "/" (IntE 4) (IntE 2))
    }}}

    And write a value to disk:

    {{{
    > > encodeFile "/tmp/a.txt" v
    }}}
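
In a compiled module the `writeFile` and `readFile` calls above have to be the `Data.ByteString.Lazy` versions, since the Prelude functions of the same name work on `String`. The sketch below shows both file-based routes side by side with a qualified import. The function names `viaByteString` and `viaHelpers` are hypothetical, and a list of `Int` stands in for the `Exp` type so that the block needs no extra instance.

```haskell
import Data.Binary (decode, decodeFile, encode, encodeFile)
import qualified Data.ByteString.Lazy as L

-- Route 1: encode to a lazy ByteString, then use the lazy ByteString
-- IO functions to write and read the file.
viaByteString :: FilePath -> [Int] -> IO [Int]
viaByteString path xs = do
  L.writeFile path (encode xs)
  bytes <- L.readFile path
  return (decode bytes)

-- Route 2: let the package's IO helpers handle the file directly.
viaHelpers :: FilePath -> [Int] -> IO [Int]
viaHelpers path xs = do
  encodeFile path xs
  decodeFile path
```

Note that `L.readFile` reads lazily, so the result of `viaByteString` should be consumed before the same file is written again.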

== Design decisions and random facts ==

 * The interface is pure, modulo IO helper functions for (de)serialising directly to files.
 * Built on top of lazy !ByteString.
 * Uses the CPP extension.
 * When building with GHC, uses the !MagicHash and !UnboxedTuples extensions.
 * Uses the !FlexibleContexts extension for this instance:
   `instance (Binary i, Ix i, Binary e, IArray UArray e) => Binary (UArray i e) where`
 * The implementation is entirely Haskell (no additional C code or libraries).
 * The package provides a !QuickCheck testsuite and some benchmarks.
 * The package adds no new dependencies to the HP.
 * The package builds with the Simple Cabal build type.
 * There is no existing functionality for binary serialisation in the HP.
 * All but one of the exports have haddock docs, and many have complexity annotations.
 * The code is `-Wall` clean.
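
As a small illustration of the `UArray` instance mentioned in the list above, the sketch below round-trips an unboxed array through `encode` and `decode`. It assumes the `array` package is available alongside `binary`; the names `roundTripArray` and `sampleArray` are made up for this example.

```haskell
import Data.Array.Unboxed (UArray, listArray)
import Data.Binary (decode, encode)
import Data.Word (Word8)

-- The instance applies here because Int is an index type with a Binary
-- instance, and Word8 has both Binary and IArray UArray instances.
roundTripArray :: UArray Int Word8 -> UArray Int Word8
roundTripArray = decode . encode

-- A four-element unboxed array indexed from 0 to 3.
sampleArray :: UArray Int Word8
sampleArray = listArray (0, 3) [10, 20, 30, 40]
```

The constraint `IArray UArray e` in the instance head mentions a concrete type constructor rather than only type variables, which is the reason the instance needs `FlexibleContexts`.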

== Open issues ==

   1. There is currently work on redesigning the parsing interface
      to support incremental parsing. The work is taking place in
      the `binary-push` and `binary-push-unpacked` branches, and
      the changes are in the `Data.Binary.Get` module.
      We may wish to accept the package with this change, rather
      than adding it in its current form.
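
For orientation only, here is a sketch of what incremental parsing looks like in later releases of `binary` (0.6 and up), which postdate this proposal. It is not taken from the `binary-push` branches and their interface may have differed. It uses `runGetIncremental`, `pushChunk` and `pushEndOfInput` from `Data.Binary.Get` in those later releases; the helper name `feedChunks` is hypothetical.

```haskell
import Data.Binary.Get
  (Decoder (..), Get, pushChunk, pushEndOfInput, runGetIncremental)
import qualified Data.ByteString as S

-- Feed strict chunks to a parser one at a time, as they might arrive
-- from a socket, and stop as soon as the parser finishes or fails.
feedChunks :: Get a -> [S.ByteString] -> Either String a
feedChunks g = go (runGetIncremental g)
  where
    go (Done _ _ x)   _        = Right x   -- finished; leftover chunks are ignored
    go (Fail _ _ msg) _        = Left msg
    go d              (c : cs) = go (pushChunk d c) cs
    go d              []       = finish (pushEndOfInput d)

    -- After end of input the decoder should be Done or Fail.
    finish (Done _ _ x)   = Right x
    finish (Fail _ _ msg) = Left msg
    finish (Partial _)    = Left "decoder still wants input after end of input"
```

With this shape the parser never needs the whole input in memory at once, which is the property the redesign described above is aiming for.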

== Notes ==

The implementation consists of 4 modules.
Together the modules come to under 2000 lines, of which under 1000 are actual code.