[Fun] Fwd: [Emc-developers] More trouble for LinuxCNC's memory model on ARM

Artur Kozubski artkocoder at gmail.com
Wed, 22 Jul 2015, 07:32:11 CEST


---------- Forwarded message ----------
From: Jeff Epler <jepler at unpythonic.net>
Date: Wed, Jul 22, 2015 at 2:16 AM
Subject: [Emc-developers] More trouble for LinuxCNC's memory model on ARM
To: emc-developers at lists.sourceforge.net


A year ago I wrote about how ARM's memory model is not strong enough to
reliably transport double-precision floats across HAL pins:
    http://mid.gmane.org/20140702141237.GB65254%40unpythonic.net

Today after looking at rare ARM-only buildbot failures, some of us
researched the ARM memory model a bit more, and found some unfortunate
assumptions that seem to hold up on x86 but not on ARM.

You can find the lengthy PDF document "ARM Architecture Reference Manual
ARMv7-A and ARMv7-R edition" with your favorite search engine.  Down in
appendix G.2.2 is a nice section that explains the observed failures.
All of them seem to happen only on ARM, and the impact is that
halsampler sometimes prints 0 instead of the expected value.

    Weakly-ordered message passing problem
        P1:
            STR R5, [R1]        ; set new data
            STR R0, [R2]        ; send flag indicating data ready

        P2:
            WAIT([R2]==1)       ; wait on flag
            LDR R5, [R1]        ; read new data

        In the absence of barriers, an end result of P2: R5=0 is
        permissible

The fix is to use "barrier instructions": "DMB [ST]" on the writer side
and "DMB" on the reader side.  ("DMB [ST]" seems to mean '"DMB" or
"DMB ST"'; "DMB" is the strongest barrier, and "DMB ST" is a specific
kind of weaker barrier.)

It appears that the gcc built-in function __sync_synchronize will
generate the required instruction on ARM.  On x86 it generates the odd
instruction 'lock orl $0, (%esp)', and on x86_64 (or x86 with
-march=pentium4) the 'mfence' instruction, which causes a small
performance hit and, as far as I know, is not necessary.  In particular,
it's not required in this case according to this summary of the Intel
SDM in http://www.cl.cam.ac.uk/~pes20/weakmemory/cacm.pdf :
    Example 8-1 Stores are not reordered with other stores

        Proc 0                          Proc 1
    MOV [x] <- 1                     mov EAX <- [y]
    MOV [y] <- 1                     mov EBX <- [x]

    Forbidden final state: Proc 1:EAX=1 and Proc 1:EBX=0
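
As a rough illustration of where the barriers would go, here is a
minimal sketch in C of the message-passing pattern from the ARM example,
using the __sync_synchronize built-in mentioned above.  The struct and
function names (mailbox, mailbox_publish, mailbox_try_consume) are
hypothetical and are not taken from the LinuxCNC sources; the real
streamer/sampler code is organized differently.

```c
#include <stdint.h>

/* Hypothetical shared region; names are illustrative, not LinuxCNC's. */
struct mailbox {
    volatile int32_t data;   /* payload: [R1] in the ARM example */
    volatile int32_t ready;  /* flag:    [R2] in the ARM example */
};

/* Writer (P1): store the data, issue a barrier, then raise the flag. */
static void mailbox_publish(struct mailbox *m, int32_t value)
{
    m->data = value;
    __sync_synchronize();    /* data store is ordered before flag store */
    m->ready = 1;
}

/* Reader (P2): returns 1 and fills *out if the flag is set, else 0. */
static int mailbox_try_consume(struct mailbox *m, int32_t *out)
{
    if (!m->ready)
        return 0;
    __sync_synchronize();    /* flag load is ordered before data load */
    *out = m->data;
    return 1;
}
```

Both sides use the full barrier here for simplicity.  On x86 this costs
more than strictly needed, as discussed above, so a real fix might want
a per-architecture macro instead of calling the built-in directly.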

We identified some locations in LinuxCNC where these barriers definitely
need to be added:
    streamer/sampler
    halscope
    task/motion
    nml shared memory regions
    mutex operations (may already be right)
but probably we will not immediately identify all such places and fix
them all.  I hope to work up a branch this weekend for further testing,
particularly if I can reproduce the behavior on my ARM board (which is
the odroid u3, same as in the buildbot farm).  If it's not too invasive,
I'll propose it for inclusion in 2.7, but I'm likely to make it
cumulative with my rework of streamer/sampler since it centralizes what
was 2 distinct sets of code before.

... since I first tried to send this message, sourceforge had their
little meltdown and I did further research.

First, I did targeted testing on my ARM and found that with a new test I
coded up, the sampler bug showed up on average more than once per
minute; and that with the addition of barriers it went way down to zero,
or at least less than once per 16 hours.  I have not placed this work on
a tree on git.linuxcnc.org yet, but I plan to rework the basic fix for
streamer/sampler *not* on top of the experimental library-ized streamer
but in a way that is suitable for 2.7.  I also now believe that nml
shared memory regions are safe, due to use of OS mutexes which should
already contain the required barriers.

Jeff




-- 
Regards,
Artur Kozubski