Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#11318 closed bug (invalid)

Data.Text.length allocates one closure per character

Reported by: bgamari Owned by:
Priority: normal Milestone:
Component: Compiler Version: 7.10.3
Keywords: Cc:
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case:
Blocked By: Blocking:
Related Tickets: #11284 Differential Rev(s):
Wiki Page:

Description

In #11284 I noticed that map Data.Text.length applied to a list of Text values would result in code which would allocate one closure per character due to lack of lambda lifting.

It turns out it's even worse than this.

Consider this example (inspired by #11284),

module Hi2 (hello) where

import qualified Data.Text as T

hello :: Int
hello = T.length hi

hi :: T.Text
hi = T.pack "hello"

When compiled with GHC 7.10.3 with -O, the following simplified Core is produced,

hello :: Int
hello =
  case unpackCString# "hello"#
  of _ { Text dt_a33D dt1_a33E dt2_a33F ->
  let {
    a_a33C :: Int#
    a_a33C = +# dt1_a33E dt2_a33F } in
  letrec {
    $wloop_length_s3bt :: Int# -> Int# -> Int#
    $wloop_length_s3bt =
      \ (ww_s3bk :: Int#) (ww_s3bo :: Int#) ->
        case tagToEnum# @ Bool (>=# ww_s3bo a_a33C) of _ {
          False ->
            case indexWord16Array# dt_a33D ww_s3bo of r#_a35n { __DEFAULT ->
            case tagToEnum# @ Bool (geWord# r#_a35n (__word 55296)) of _ {
              False -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 1);
              True ->
                case tagToEnum# @ Bool (leWord# r#_a35n (__word 56319)) of _ {
                  False -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 1);
                  True -> $wloop_length_s3bt (+# ww_s3bk 1) (+# ww_s3bo 2)
                }
            }
            };
          True -> ww_s3bk
        }; } in
  case $wloop_length_s3bt 0 dt1_a33E of ww_s3bs { __DEFAULT ->
  I# ww_s3bs
  }
  }

Even this simple example produces a length function which allocations

Change History (2)

comment:1 Changed 4 years ago by bgamari

Resolution: invalid
Status: newclosed

The premise of this ticket is totally wrong: we aren't allocating one closure per character, we are allocating one closure per call. This is totally reasonable. Apologies for the noise.

comment:2 Changed 4 years ago by Ben Gamari <ben@…>

In 933abfa7/ghc:

rel-notes: Add note about UndecidableSuperClasses and #11762

Test Plan: Read it

Reviewers: austin, kosmikus

Reviewed By: kosmikus

Subscribers: thomie

Differential Revision: https://phabricator.haskell.org/D2111

GHC Trac Issues: #11318, #11762
Note: See TracTickets for help on using tickets.