Opened 6 years ago
Closed 3 years ago
#8082 closed bug (wontfix)
Ordering of assembly blocks affects performance
Reported by: | jstolarek | Owned by: | |
---|---|---|---|
Priority: | normal | Milestone: | |
Component: | Compiler (NCG) | Version: | 7.6.3 |
Keywords: | | Cc: | gidyn, simonmar |
Operating System: | Linux | Architecture: | x86_64 (amd64) |
Type of failure: | Runtime performance bug | Test Case: | |
Blocked By: | | Blocking: | |
Related Tickets: | | Differential Rev(s): | |
Wiki Page: | | | |
Description
During my work on #6135 I noticed that the performance of the reverse-complem benchmark in nofib depends heavily on the order in which assembly blocks are laid out. With my patches I am consistently getting an 18-20% speed-up. In theory my patches should not affect the performance of existing programs, but for some reason they change the ordering of the generated assembly blocks. With one of the earlier versions of my patch I noticed that the kahan benchmark suffered a 16% performance hit, and again the only difference I could find in the generated assembly was the ordering of blocks. I investigated kahan in more depth, and it turned out that the difference results from the way Core is generated: between HEAD and my patches, a worker function had its three parameters passed in a different order. I did not investigate this for reverse-complem because its Core is considerably larger, but I could spend some time on it if it might be relevant.
Attachments (2)
Change History (10)
Changed 6 years ago by
Attachment: | reverse-complem-HEAD.asm added |
---|
Changed 6 years ago by
Attachment: | reverse-complem-bool-primops.asm added |
---|
Assembly generated with my patches
comment:1 Changed 6 years ago by
It is best to view attached assembly dumps with:
diff -y --suppress-common-lines reverse-complem-HEAD.asm reverse-complem-bool-primops.asm | less
Or some other visual diff program.
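When the dumps are large, it can help to check mechanically whether two of them contain the same blocks in a different order before reading the diff. The following is only a rough sketch, not part of GHC or nofib: the helper names `block_labels` and `compare_layout` are made up for illustration, and it assumes that every block starts on a line holding nothing but a label followed by a colon. It also assumes that labels are comparable between the two dumps, which may not hold if the compiler assigned different unique names in each build.

```python
import re

# Assumption: a block starts at a line consisting solely of a label such as
# "_c1Xy:" or "Main_zdwgo_info:". Local ".L..." labels are matched as well.
LABEL = re.compile(r"^([A-Za-z_.$][\w.$]*):\s*$")


def block_labels(asm_text):
    """Return the block labels in the order they appear in the dump."""
    labels = []
    for line in asm_text.splitlines():
        match = LABEL.match(line.strip())
        if match:
            labels.append(match.group(1))
    return labels


def compare_layout(asm_a, asm_b):
    """Summarise which blocks are unique to each dump and whether the
    blocks they share appear in a different order."""
    a, b = block_labels(asm_a), block_labels(asm_b)
    common = set(a) & set(b)
    # Restrict both sequences to the shared labels before comparing order.
    order_a = [label for label in a if label in common]
    order_b = [label for label in b if label in common]
    return {
        "only_in_a": [label for label in a if label not in common],
        "only_in_b": [label for label in b if label not in common],
        "reordered": order_a != order_b,
    }
```

To use it, read both attachments into strings (for example with `open(path).read()`) and pass them to `compare_layout`. A result with `reordered` set to `True` and both `only_in_*` lists empty would be consistent with a pure layout change.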
comment:2 Changed 5 years ago by
Cc: | gidyn simonmar added |
---|
comment:3 Changed 5 years ago by
Type of failure: | None/Unknown → Runtime performance bug |
---|
comment:4 Changed 5 years ago by
I can confirm that this can cause serious wobbles. When working on #10137, I changed the use of < to >= when generating if-then-else trees, and some numbers went up and some went down, without any guidance as to which one is better:
Min             -0.1%  -0.0%  -3.2%  -3.2%   0.0%
Max             +0.0%   0.0%  +4.4%  +4.3%  +3.3%
Geometric Mean  -0.0%  -0.0%  +0.3%  +0.3%  +0.1%
Looks like without dynamic tracing, this problem is not easily solved.
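To illustrate why swapping < for >= changes nothing semantically yet still perturbs layout, here is a small toy model. It is not GHC's code generator: the tree representation, the function names and the naive "then branch first" layout rule are all assumptions made for this sketch.

```python
def build_tree(alts, use_ge=False):
    """Build a decision tree over sorted (value, label) alternatives.

    Leaves are labels (strings); inner nodes are tuples of the form
    (op, pivot, then_branch, else_branch).
    """
    if len(alts) == 1:
        return alts[0][1]
    mid = len(alts) // 2
    pivot = alts[mid][0]
    low = build_tree(alts[:mid], use_ge)
    high = build_tree(alts[mid:], use_ge)
    if use_ge:
        # Same test, negated: the two branches swap places.
        return (">=", pivot, high, low)
    return ("<", pivot, low, high)


def evaluate(tree, x):
    """Follow the comparisons for scrutinee x and return the chosen label."""
    while isinstance(tree, tuple):
        op, pivot, then_branch, else_branch = tree
        taken = x < pivot if op == "<" else x >= pivot
        tree = then_branch if taken else else_branch
    return tree


def layout(tree):
    """Hypothetical layout rule: emit the 'then' branch before the 'else'."""
    if not isinstance(tree, tuple):
        return [tree]
    _, _, then_branch, else_branch = tree
    return layout(then_branch) + layout(else_branch)
```

For four alternatives labelled A to D, both trees select the same label for every input, but under this layout rule the blocks come out as A, B, C, D with < and as D, C, B, A with >=. Which order runs faster depends on which branches are hot, which the compiler cannot see statically.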
comment:5 Changed 5 years ago by
Could that be a branch prediction issue? (Plus perhaps a matter of memory locality in the instruction cache?)
comment:8 Changed 3 years ago by
Resolution: | → wontfix |
---|---|
Status: | new → closed |
I don't really see what we can do about this. There are numerous reasons why code layout affects performance (e.g., see this talk from the LLVM developers meeting) and tackling most of them is Very Hard.
We can continue to chat on this ticket, but I'm going to close it as it's not really actionable as-is.
Assembly generated by HEAD