Opened 7 months ago

Last modified 7 months ago

#16354 new bug

LLVM Backend generates invalid assembly.

Reported by: AndreasK
Owned by:
Priority: high
Milestone:
Component: Compiler (LLVM)
Version: 8.9
Keywords: CodeGen
Cc:
Operating System: Windows
Architecture: Unknown/Multiple
Type of failure: None/Unknown
Test Case:
Blocked By:
Blocking:
Related Tickets:
Differential Rev(s):
Wiki Page:


module Main where

import System.Random
import qualified Data.Vector as V

g1 = mkStdGen 0 :: StdGen
v2 = take 10000 $ randoms g1 :: [Int]

main :: IO ()
main = do

$ /e/ghc_regSpill/inplace/bin/ghc-stage2.exe Repro.hs -O2 -fllvm -O2
Loaded package environment from C:\ghc\msys64\home\Andi\tmp\minmax\unpred\.ghc.environment.x86_64-mingw32-8.7.20190215
[1 of 1] Compiling Main             ( Repro.hs, Repro.o )
C:\\ghc\\msys64\\tmp\\ghc385768_0\\ghc_6.s: Assembler messages:
C:\\ghc\\msys64\\tmp\\ghc385768_0\\ghc_6.s:354: Error: junk at end of line, first unrecognized character is `,'
`gcc.exe' failed in phase `Assembler'. (Exit code: 1)

I've reproduced this with GHC 8.4/8.6 and HEAD using LLVM 7.0.1 on Windows.

It only occurs with -O2.

The error comes from directives of this form:

	.section	.rdata,"dr",discard,__xmm@000000000000004e0000000000000000
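
To check whether a given assembly file contains directives of this form, something like the following could be used. This is only a sketch: the function name find_comdat_sections is made up for illustration, and the pattern only covers the "discard" variant shown above.

```shell
# Hypothetical helper (not part of GHC or LLVM): list the lines of an assembly
# file whose .section directive carries the extra "discard,<symbol>" fields
# seen in this ticket. Prints "<line number>:<line>" for each match and
# returns non-zero when nothing matches (standard grep behaviour).
find_comdat_sections() {
    grep -nE '^[[:space:]]*\.section[[:space:]]+[^,]+,"[^"]*",discard,' "$1"
}
```

For example, running find_comdat_sections on the .s file named in the assembler error should report the line the assembler complains about.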

Attachments (1)

Repro.ll (81.0 KB) - added by AndreasK 7 months ago.

Change History (12)

comment:1 Changed 7 months ago by AndreasK

As far as I can tell, the issue is that this directive format is only supported for ELF.

Assuming that is the reason, hopefully we just need to pass an appropriate flag to LLVM.

comment:2 Changed 7 months ago by AndreasK

Operating System: Unknown/Multiple → Windows

Changed 7 months ago by AndreasK

Attachment: Repro.ll added

comment:3 Changed 7 months ago by AndreasK

Steps to reproduce:

Andi@Horzube MINGW64 ~/tmp/minmax/llvm
$ opt -O3 Repro.ll > opt.bc

Andi@Horzube MINGW64 ~/tmp/minmax/llvm
$ llc opt.bc -o llc.s -O3

Andi@Horzube MINGW64 ~/tmp/minmax/llvm
$ as llc.s
llc.s: Assembler messages:
llc.s:367: Error: junk at end of line, first unrecognized character is `,'

comment:4 Changed 7 months ago by AndreasK

The issue here is that, for the target "x86_64-unknown-windows", LLVM generates .section directives which the assembler does not support for PE output.

As far as I can tell, however, "x86_64-unknown-windows" (or one of its synonyms) IS the proper target, so this seems to be an issue with LLVM.

comment:5 Changed 7 months ago by AndreasK

$ llc --version
  LLVM version 7.0.1
  Optimized build.
  Default target: x86_64-w64-mingw32
  Host CPU: skylake

  Registered Targets:
    aarch64    - AArch64 (little endian)
    aarch64_be - AArch64 (big endian)
    amdgcn     - AMD GCN GPUs
    arm        - ARM
    arm64      - ARM64 (little endian)
    armeb      - ARM (big endian)
    bpf        - BPF (host endian)
    bpfeb      - BPF (big endian)
    bpfel      - BPF (little endian)
    hexagon    - Hexagon
    lanai      - Lanai
    mips       - Mips
    mips64     - Mips64 [experimental]
    mips64el   - Mips64el [experimental]
    mipsel     - Mipsel
    msp430     - MSP430 [experimental]
    nvptx      - NVIDIA PTX 32-bit
    nvptx64    - NVIDIA PTX 64-bit
    ppc32      - PowerPC 32
    ppc64      - PowerPC 64
    ppc64le    - PowerPC 64 LE
    r600       - AMD GPUs HD2XXX-HD6XXX
    sparc      - Sparc
    sparcel    - Sparc LE
    sparcv9    - Sparc V9
    systemz    - SystemZ
    thumb      - Thumb
    thumbeb    - Thumb (big endian)
    x86        - 32-bit X86: Pentium-Pro and above
    x86-64     - 64-bit X86: EM64T and AMD64
    xcore      - XCore

comment:6 Changed 7 months ago by bgamari

Status: new → upstream

Reported upstream as #40824.

comment:7 Changed 7 months ago by AndreasK

$ as --version
GNU assembler (GNU Binutils) 2.30
Copyright (C) 2018 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or later.
This program has absolutely no warranty.
This assembler was configured for a target of `x86_64-w64-mingw32'.

Windows 10 Pro - Build 17763 (64 Bit)

comment:8 Changed 7 months ago by AndreasK

Going by the answers upstream our options are:

  • Switch to the LLVM assembler, llvm-mc, for the LLVM backend.
  • Hope binutils implements the LLVM extensions eventually.
  • Go back to LLVM 6.

comment:9 Changed 7 months ago by bgamari

I don't think it would be wise to let this issue hold us back on an old LLVM release. However, the other options aren't much better. Ideally we would just use llvm-mc, but this is easier said than done since we invoke the assembler through gcc. Using "as" directly is complicated by the fact that we would then be responsible for running cpp ourselves.

comment:10 Changed 7 months ago by AndreasK

The problem with llvm-mc is that we can get assembly with CPP directives in it:

8:52 PM <AndreasK> bgamari: our generated assembly contains CPP directives?
8:55 PM <bgamari> no, but a TH splice may have added an assembler source file that does

We could invoke gcc on the generated assembly just for CPP and then pass the result on to as or llvm-mc, depending on the backend used.

<AndreasK> bgamari: We could pipe assembly through the gcc CPP, and then pipe the result through as/llvm-mc depending on backend though right?
9:00 PM <bgamari> AndreasK, yes
9:00 PM <bgamari> we already have support for invoking the preprocessor
9:00 PM <bgamari> so really this is likely just a matter of plumbing
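
At the command level, the plumbing discussed here might look roughly like the following. This is only a sketch and not GHC's actual driver code: the function names, the DRY_RUN variable and the intermediate file name are made up for illustration, and the triple is simply the default target reported by llc --version in comment 5.

```shell
# Hypothetical sketch: preprocess the assembly with gcc, then hand the result
# to the assembler that matches the backend.
# Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

assemble() {
    backend=$1            # "llvm" or "native"
    src=$2                # assembly input, possibly containing CPP directives
    obj=$3                # object file to produce
    pre="${obj%.o}.pp.s"  # preprocessed assembly (illustrative name)

    # Step 1: run only the C preprocessor over the assembly source.
    run gcc -E -x assembler-with-cpp "$src" -o "$pre" || return 1

    # Step 2: pick the assembler based on the backend.
    case "$backend" in
        llvm)   run llvm-mc -triple=x86_64-w64-mingw32 -filetype=obj "$pre" -o "$obj" ;;
        native) run as "$pre" -o "$obj" ;;
        *)      echo "unknown backend: $backend" >&2; return 1 ;;
    esac
}
```

Running DRY_RUN=1 assemble llvm Repro.s Repro.o prints the two commands without executing them, which makes it easy to compare against what the driver currently passes to gcc.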

comment:11 Changed 7 months ago by AndreasK

Status: upstream → new

Given that LLVM will likely keep things specific to their assembler when it suits them, I don't think we can count on this being resolved through upstream changes.

So I'm reopening this. Just needs someone to dive into the plumbing.
