`mark_start()`/`mark_end()` sometimes break autovectorization #30

Seelengrab · 2022-12-02T14:22:41Z

Adding `mark_start()` to the tight inner loop here:

```julia
@inline function scorep1(opp, me)
    isdraw = opp == me
    iswin  = (opp+0x1 == me) | (me+0x2 == opp)
    me + (0x3*isdraw) + (0x6*iswin)
end

@inline function scorep2(opp, target)
    mychoice = mod1(opp + mod1(target+0x1, 0x3), 0x3)
    mychoice + 0x3*(target-0x1)
end

solve(file::String) = solve(read(file))
function solve(data, f::F=scorep1) where F
    l = length(data)
    acc = UInt16(0)
    @inbounds @simd for idx in 1:4:l
        opp = data[idx + 0] - UInt8('A') + 0x1
        me  = data[idx + 2] - UInt8('X') + 0x1
        acc += f(opp, me)
    end
    acc
end
```

breaks vectorization pretty badly. The generated code goes from happily using lots of `xmm` registers to only using `eax` & friends. I just wanted to know how much performance was still left on the table, which is kind of hard to do when the tool itself breaks the vectorization. I don't yet know how to fix this, so this issue is just here for tracking the problem in general, but it ought to be possible to have our cake & eat it too here.
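For anyone wanting to reproduce the "`xmm` vs. `eax`" observation, here is a minimal sketch that dumps the native code for the unmarked `solve` and lists the SIMD registers it mentions. `native_asm` and `simd_registers` are hypothetical helper names made up for this example (not part of any package), the register regex only makes sense on x86, and the result will depend on the Julia version and CPU.

```julia
using InteractiveUtils  # provides code_native

# Scoring function and loop copied from the snippet above (without markers).
@inline function scorep1(opp, me)
    isdraw = opp == me
    iswin  = (opp+0x1 == me) | (me+0x2 == opp)
    me + (0x3*isdraw) + (0x6*iswin)
end

function solve(data, f::F=scorep1) where F
    l = length(data)
    acc = UInt16(0)
    @inbounds @simd for idx in 1:4:l
        opp = data[idx + 0] - UInt8('A') + 0x1
        me  = data[idx + 2] - UInt8('X') + 0x1
        acc += f(opp, me)
    end
    acc
end

# Hypothetical helper: capture the native code of `f` for `types` as a String.
native_asm(f, types) = sprint(io -> code_native(io, f, types; debuginfo=:none))

# Hypothetical helper: sorted list of the distinct x86 SIMD registers
# (xmm/ymm/zmm) that appear anywhere in the assembly text.
simd_registers(asm) =
    sort!(unique(String(m.match) for m in eachmatch(r"\b[xyz]mm\d+\b", asm)))

asm  = native_asm(solve, (Vector{UInt8}, typeof(scorep1)))
regs = simd_registers(asm)
println(isempty(regs) ? "no SIMD registers found" :
                        "SIMD registers used: " * join(regs, ", "))
```

Since this loop is integer-only, any `xmm`/`ymm`/`zmm` register showing up is a reasonable hint that the loop was vectorized; for floating-point code the same check would be misleading, because scalar float arithmetic also uses `xmm` registers.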


Robert-j7 · 2025-02-05T21:02:18Z

I tried a few random things: adding a nop `@asmcall`, calling `@llvm.donothing()`, and replacing `@simd` with `LLVMLoopInfo`. None of them worked. Then I found this in the LLVM documentation:

> However, this interferes with optimizations like loop vectorization and may have an impact on the code generated.
> This is because the `__asm` statements are seen as real code having important side effects, which limits how the code around them can be transformed.
> If users want to make use of inline assembly to emit markers, then the recommendation is to always verify that the output assembly is equivalent to the assembly generated in the absence of markers.

How should we go about implementing #31?
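The "verify with and without markers" recommendation quoted above could be sketched roughly like this. Note that `fake_marker` is a hypothetical stand-in (an opaque, side-effecting call), not the package's real `mark_start()`, and `llvm_ir` / `vector_type_count` are helper names made up for this example. Whether the stand-in actually blocks vectorization depends on the Julia and LLVM versions, so the sketch only reports the counts instead of assuming an outcome.

```julia
using InteractiveUtils  # provides code_llvm

# Hypothetical stand-in for a marker: a non-inlined call with a visible
# side effect, which the optimizer has to keep inside the loop.
const MARKS = Ref(0)
@noinline fake_marker() = (MARKS[] += 1; nothing)

# Reference loop without any marker.
function sum_plain(xs::Vector{UInt8})
    acc = UInt16(0)
    @inbounds @simd for i in eachindex(xs)
        acc += xs[i]
    end
    acc
end

# Same loop with the stand-in marker placed in the loop body.
function sum_marked(xs::Vector{UInt8})
    acc = UInt16(0)
    @inbounds @simd for i in eachindex(xs)
        fake_marker()
        acc += xs[i]
    end
    acc
end

# Capture the optimized LLVM IR of `f` for `types` as a String.
llvm_ir(f, types) = sprint(io -> code_llvm(io, f, types; debuginfo=:none))

# Count occurrences of integer vector types such as `<8 x i16>` in the IR.
vector_type_count(ir) = length(collect(eachmatch(r"<\d+ x i\d+>", ir)))

counts = (plain  = vector_type_count(llvm_ir(sum_plain,  (Vector{UInt8},))),
          marked = vector_type_count(llvm_ir(sum_marked, (Vector{UInt8},))))
println("vector-typed IR values: plain = ", counts.plain,
        ", marked = ", counts.marked)
```

A replacement marker for #31 would pass this kind of check if the marked variant keeps roughly the same number of vector-typed values as the plain one. Comparing the IR is only a proxy here; a stricter check would diff the `code_native` output of both variants.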

Seelengrab added the bug label Dec 2, 2022

Seelengrab mentioned this issue Dec 2, 2022

Find a replacement for @asmcall for mark_end()/mark_start() #31

Open
