Opened 3 years ago

Last modified 7 months ago

#13002 patch bug

:set -O does not work in .ghci file

Reported by: George Owned by: RolandSenn
Priority: normal Milestone: 8.10.1
Component: GHCi Version: 8.0.1
Keywords: RecompilationCheck Cc: dfeuer, nh2
Operating System: Unknown/Multiple Architecture: Unknown/Multiple
Type of failure: Runtime performance bug Test Case: T13002
Blocked By: Blocking:
Related Tickets: #8635 #9370 Differential Rev(s): https://gitlab.haskell.org/ghc/ghc/merge_requests/480
Wiki Page:

Description (last modified by George)

{-# OPTIONS_GHC -Wall #-}

module Foo where
        
testFromTo :: Int -> Int
testFromTo n = length ([0..(10^n)] :: [Int])
 cat ~/.ghci
:set +s
:set -fobject-code 
:set -O
bash-3.2$ touch Foo.hs
bash-3.2$ ghci
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /Users/gcolpitts/.ghci
Prelude> :load Foo
:load Foo
[1 of 1] Compiling Foo              ( Foo.hs, Foo.o )
Ok, modules loaded: Foo (Foo.o).
(0.15 secs,)
Prelude Foo> testFromTo 5
testFromTo 5
100001
(0.02 secs, 8,885,888 bytes)
Prelude Foo> :quit
:quit
Leaving GHCi.
bash-3.2$ touch Foo.hs
bash-3.2$ ghci -fobject-code -O
GHCi, version 8.0.1: http://www.haskell.org/ghc/  :? for help
Loaded GHCi configuration from /Users/gcolpitts/.ghci
Prelude> :load Foo
:load Foo
[1 of 1] Compiling Foo              ( Foo.hs, Foo.o )
Ok, modules loaded: Foo (Foo.o).
(0.15 secs,)
Prelude Foo> testFromTo 5
testFromTo 5
100001
(0.02 secs, 98,400 bytes)

While supplying -fobject-code -O as an argument to ghci seems like an easy workaround; it isn't feasible, as far as I know, when using emacs thus setting priority to normal rather than low.

Attachments (2)

Foo-if (3.6 KB) - added by dfeuer 21 months ago.
The interface produced with -O on the command line
Foo-if-leak (3.6 KB) - added by dfeuer 21 months ago.
The interface produced *without* -O on the command line

Download all attachments as: .zip

Change History (22)

comment:1 Changed 2 years ago by George

Architecture: x86_64 (amd64)Unknown/Multiple
Operating System: MacOS XUnknown/Multiple

comment:2 Changed 2 years ago by George

Type of failure: OtherRuntime performance bug

In the context of ghci, the ignored -O in the .ghci file results in a runtime performance bug. Thus I changed the type of failure to runtime performance bug (in the context of ghci which is the component).

comment:3 Changed 2 years ago by George

Bug is present in 8.2.1-rc1 also.

Note this is about the behavior of these flags when there is no .o file before starting ghci. This bug says nothing about the behavior of these flags when there is a .o file before starting ghci. For that see #13604

Last edited 2 years ago by George (previous) (diff)

comment:4 Changed 2 years ago by bgamari

Milestone: 8.2.1

It would be nice if we could get this fixed for 8.2.

comment:5 Changed 2 years ago by bgamari

Milestone: 8.2.18.4.1

Bump off to 8.4.

comment:6 Changed 2 years ago by bgamari

Keywords: RecompilationCheck added

comment:7 Changed 2 years ago by George

Description: modified (diff)

comment:8 Changed 21 months ago by dfeuer

Milestone: 8.4.18.0.3
Resolution: worksforme
Status: newclosed

I'm not at all convinced that this bug is valid at all. I suspect the timing difference you saw was some sort of insignificant artifact; perhaps the extra command line options are affecting GC timings. I tested with a version that includes an utterly bogus RULES pragma:

{-# OPTIONS_GHC -Wall #-}

module T13002 where

testFromTo :: Int -> Int
testFromTo n = length ([0..(10^n)] :: [Int])
{-# NOINLINE testFromTo #-}

test :: Int -> Int
test x = testFromTo x

{-# RULES
"bogus" forall x. testFromTo x = 12
 #-}

This produces semantically different test functions depending on whether optimizations are enabled or not, and the change occurs as expected when I add -O to the .ghci file. It works for me under 8.0, 8.2, and HEAD. So I'm going to close this for now; if you disagree, I suggest you add a more robust test case.

comment:9 Changed 21 months ago by dfeuer

That is, GHC's processing of its own command line could affect GC timings in some tiny way that is enough to lead to the difference you experienced. Just to make really sure, I turned on -ddump-simpl and saw that the core produced when loading the test file had all the signs of having been optimized (e.g., strictness annotations, unfoldings, and worker/wrapper pairs).

comment:10 Changed 21 months ago by dfeuer

Cc: dfeuer added

comment:11 Changed 21 months ago by George

Resolution: worksforme
Status: closednew

Thanks for taking the time to investigate, I should have been more explicit, the bug is not about the small timing difference; it is about the large allocation difference: 8,885,888 bytes vs 98,400 bytes. That is not insignificant right? I think it is the difference between fusion occurring and not occurring. I agree that the ddump-simpl output seems to show that it is not a bug but the allocation difference indicates to me that it is not optimizing i.e.doing list fusion. I thought it worth double checking. If you are still not convinced feel free to close and I will try to come up with a more robust test case or convince myself that I am wrong.

Last edited 21 months ago by George (previous) (diff)

comment:12 Changed 21 months ago by dfeuer

I'm guessing that the allocation difference is caused by what GHCi does before it loads the module, and quite possibly even before it loads the .ghci file. How could I check this?

Last edited 21 months ago by dfeuer (previous) (diff)

comment:13 Changed 21 months ago by dfeuer

No, my last comment was bogus. The .o file for the one that allocates a lot of memory is larger! ~The .hi files are identical. Very strange.~

Last edited 21 months ago by dfeuer (previous) (diff)

comment:14 Changed 21 months ago by dfeuer

No, no, no. I'm wrong again. The .hi files are only the same for my modified version. The originals are different. I'll attach them.

Changed 21 months ago by dfeuer

Attachment: Foo-if added

The interface produced with -O on the command line

Changed 21 months ago by dfeuer

Attachment: Foo-if-leak added

The interface produced *without* -O on the command line

comment:15 Changed 16 months ago by bgamari

Milestone: 8.0.38.6.1

Ticket retargeted after milestone closed

comment:16 Changed 15 months ago by bgamari

Milestone: 8.6.18.8.1

These won't be fixed for 8.6, bumping to 8.8.

comment:17 Changed 9 months ago by osa1

Milestone: 8.8.18.10.1

Bumping milestones of low-priority tickets.

comment:18 Changed 7 months ago by RolandSenn

Owner: set to RolandSenn

I debugged this ticket and found the following:

The Issue

GHC has a dynamic flag Opt_IgnoreInterfacePragmas or -fignore-interface-pragmas. The setting of this flag is dependent on the optimization flag -O. -O0 will set the Opt_IgnoreInterfacePragmas flag On, -O1 and -O2 will set it Off. If the Opt_IgnoreInterfacePragmas flag is On, then GHC(i) does not store optimization data (eg inline pragmas or rewrite rules) from the interface files.

See compiler/iface/LoadIface.hs:loadInterface. The field containing the rules is eps_rule_base, the one containig the pragmas is eps_PTE both in the ExternalPackageState (EPS) record.

The default optimization is -O0, so Opt_IgnoreInterfacePragmas is On. During startup, GHCi processes the interface GHC.Base containing the fold/build rule. As Opt_IgnoreInterfacePragmas is On, GHCi does not store any rules in the EPS record.

Only after this, we get the possibility to set the optimization flag to -O1 either in the .ghci file or with a :set command. But it's already too late, the fold/build rule wasn't stored, the simplifier doesn't know the rule and cannot optimize the loop: GHCi needs 8800000 bytes to process the example in this ticket.

If we specify -O1 as a command line parameter everything is fine: Opt_IgnoreInterfacePragmas is Off, ghci stores the fold/build rule, the simplifier knows and uses the rule, and processing our example needs less than 100000 bytes!

The History

This is an old, well known and heavily discussed issue. See tickets #8635 and #9370. The simplest of the proposed solutions was just to store all the information from the interface files independently of the Opt_IgnoreInterfacePragmas flag. Then @richardfung provided a patch Phab:D2485 for #9370. Unsurprisingly the validation run showed some stats error, because storing all the optimization needed more space.

@simonmar commented:

I care a lot about unoptimized compile-time performance - couldn't this make things worse by forcing GHC to read all the interface pragmas from interface files, even with -O0?

However, @simonmar also made two suggestions how to solve this issue. The first one was:

Lazily load the pragma info, so that it doesn't cost anything if we don't use it. The simplifier should use Opt_IgnoreInterfacePragmas to decide whether to use the pragma info from external Ids or not.

After this, @richardfung abandonded his patch.

The Proposed Solution:

I think the above suggestion of @simonmar is the way to go:

  1. If the Opt_IgnoreInterfacePragmas flag is set On, we continue to ignore the optimization data in the interface files. However if we ignore the optimization data, we store the module name in a new list, probably also in the ExternalPackageState (EPS) record.
  2. <Somewhere in the code>™: If the Opt_IgnoreInterfacePragmas is set Off and there are module entries in the new list, we reload and reprocess the interface files, ignore everything but the optimization data previously excluded by Opt_IgnoreInterfacePragmas. We add this information to the EPS record. Then we clear the new list. (I still have to investigate where to put this functionality...)
  3. We use wrapper functions to access the fields eps_rule_base and eps_PTE: If Opt_IgnoreInterfacePragmas is Off, the wrapper functions return all the data, else they return the initial values, without any optimization data from the interface files.

All comments are welcome!

Last edited 7 months ago by RolandSenn (previous) (diff)

comment:19 Changed 7 months ago by RolandSenn

Differential Rev(s): https://gitlab.haskell.org/ghc/ghc/merge_requests/480
Status: newpatch
Test Case: T13002

comment:20 Changed 7 months ago by nh2

Cc: nh2 added

Subscribing since #12847 was marked as a duplicate of this.

Note: See TracTickets for help on using tickets.