#linuxcnc-devel | Logs for 2014-06-18

[06:36:12] <skunkworks> logger[psha],
[06:36:13] <logger[psha]> skunkworks: Log stored at http://psha.org.ru/irc/%23linuxcnc-devel/2014-06-18.html
[06:57:54] <micges1> skunkworks: second half of spikes in jerk tp is harder than I thought, no wonder they weren't fully fixed
[07:00:11] <skunkworks> heh
[07:24:36] <skunkworks> if it was easy - everyone would be doing it
[07:30:37] <micges-dev> yeah
[07:33:25] <archivist> what is easy to person A can be difficult for B, where B finds something else easy that A finds hard
[07:34:02] <Tom_itx> C sits and waits for A & B to finish
[07:34:17] <micges-dev> haha
[07:35:43] <archivist> D needs poking because he can if he wants to
[10:54:57] <cradek> seb_kuzminsky: fixing all the prototypes (missing consts) doesn't help my crash
[10:55:07] <cradek> seb_kuzminsky: however adding a single printk at the beginning of the function fixes it
[10:55:38] <cradek> tpSetTermCond: tp = f90d1024, cond = 2
[10:57:44] <KGB-linuxcnc> Chris Radek master c99d3be linuxcnc src/emc/tp/tp.h Fix many prototypes * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=c99d3be
[10:57:44] <KGB-linuxcnc> Chris Radek master 4b2203b linuxcnc src/emc/tp/tp.h Remove unused prototype * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=4b2203b
[11:00:42] <seb_kuzminsky> oh god
[11:01:36] <seb_kuzminsky> i compiled linux-3.4.55-rtai and the rtai-modules on precise, with gcc 4.6, you're building linuxcnc on wheezy, with gcc 4.7
[11:01:50] <seb_kuzminsky> i wonder if there's an abi incompatibility? where's jepler when we need him
[11:03:48] <seb_kuzminsky> is any other thread monkeying with the TP_STRUCT *tp pointer?
[11:12:41] <jepler> I'm here
[11:13:15] <jepler> cradek: find the .o (not the .ko) where tpItCrashes is defined, and objdump -d it
[11:13:47] <jepler> also the .o where tpItCrashes is called .. objdump -d it too
[11:14:02] <jepler> -d -r probably, so I can see where the call to tpItCrashes is
[11:14:05] <jepler> and pastebin it all
[11:15:03] <jepler> is the kernel cradek is running built on wheezy, or is it the package that was intended for precise?
[11:15:10] <cradek> native wheezy build
[11:15:14] <cradek> oh the kernel
[11:15:28] <cradek> umm I think it's intended for precise
[11:15:44] <cradek> am I doing a stupid?
[11:16:39] <Tom_itx> only if you get caught
[11:16:51] <jepler> well, seb_kuzminsky points out that it may mean you are building kernel modules with a different compiler than used for the kernel itself
[11:17:21] <cradek> that is true
[11:19:13] <skunkworks> can you go back to 4.6?
[11:20:26] <cradek> jepler: http://paste.ubuntu.com/7664345/
[11:21:49] <jepler> well the disassembly really does have fldl (%edi)
[11:23:42] <seb_kuzminsky> my wheezy kernel packages are nearly ready
[11:23:56] <seb_kuzminsky> only 1-9 more weeks
[11:24:01] <jepler> the use of %edi doesn't make any sense to me
[11:24:06] <jepler> regparm uses EAX, EDX, ECX
[11:24:26] <jepler> the calling code appears to do the expected: push the value of that double parameter on the stack
[11:24:33] <seb_kuzminsky> is (%edi) the double?
[11:25:14] <jepler> %edi is normally a callee-saved register
[11:28:12] <cradek> http://www.marshut.com/isznht/dc-hitting-a-compiler-bug-or-undefined-behavior.html
[11:28:35] <cradek> I think this is a userland app seeing the same thing with gcc4.7.2
[11:29:20] <jepler> nothing I know makes fldl (%edi) make any sense
[11:31:40] <jepler> can you see if you can figure out how to capture the gcc commandline that was used to build command.o?
[11:32:10] <seb_kuzminsky> make V=1
[11:32:11] <cradek> there are a zillion fldl (%edi)
[11:32:24] <cradek> sure
[11:32:31] <jepler> some could make sense, if %edi has a meaningful value
[11:33:08] <jepler> emcmotAioWrite seems to have the same problem
[11:33:36] <jepler> anywhere the parameter list is int, double
[11:34:23] <jepler> tpInitBlendArcFromPrev is an example where fldl (%edi) makes sense, because %edi is a calculated address
[11:35:25] <cradek> jepler: http://paste.ubuntu.com/7664435/
[11:48:35] <skunkworks_> skunkworks: test
[11:50:12] <jepler> g++-4.7 -m32 -mregparm=3 -mpreferred-stack-boundary=2 -Os
[11:50:23] <jepler> this seems to be the minimum set of flags to get the inexplicable fldl (%edi)
[11:50:48] <cradek> g++?
[11:50:56] <jepler> my bad, gcc
[11:55:49] <cradek> you think removing -Os will make it all work?
[11:56:46] <jepler> or try -mpreferred-stack-boundary=4
[11:56:58] <jepler> you can't change -mregparm, that's part of ABI
[11:57:18] <jepler> -Os you could change (deleting it gives you -O2)
[12:01:03] <jepler> you can "test" it by looking at the disassembly, if you hope to save a reboot
[12:01:14] <cradek> ok, building now
[12:03:03] <cradek> heh, without -Os it fails to build
[12:03:04] <cradek> Linking linuxcnc.so
[12:03:04] <cradek> ../lib/liblinuxcnchal.so.0: undefined reference to `_rt_shm_alloc'
[12:03:05] <cradek> ../lib/liblinuxcnchal.so.0: undefined reference to `rt_shm_free'
[12:03:19] <jepler> oh, you changed that flag in a place it applies to userspace
[12:03:50] <jepler> so you'd better use -O or -O2 instead of just deleting it
[12:04:53] <cradek> -O2: rtapi/vsnprintf.h:487:7: error: expected identifier or ‘(’ before ‘__extension__’
[12:05:09] <jepler> debian's gcc 4.6 also gives fldl(%edi) with the same flags
[12:05:26] <jepler> that one I don't know about
[12:05:59] <jepler> #ifdef strsep / #undef strsep / #endif
[12:06:28] <jepler> before the implementation of strsep
[12:06:42] <jepler> probably we should only implement strsep if __KERNEL__
[12:06:45] <jepler> or whatever the check is
[12:07:23] <jepler> though <linux/string.h> has strsep, so maybe we should totally dump our implementation
[12:22:19] <cradek> removing (the right) -Os seems to fix it - doing a full build before I push that
[12:22:30] <cradek> or is it the right thing to do?
[12:26:20] <cradek> http://paste.ubuntu.com/7664669/
[12:28:37] <cradek> yay, it runs now
[12:29:36] <cradek> http://pastie.org/9302636
[12:31:42] <seb_kuzminsky> i never understood why we used -Os in the first place
[12:31:54] <seb_kuzminsky> but i also dont understand why turning it off fixes this panic
[12:31:54] <KGB-linuxcnc> Chris Radek cradek/wheezy-fixcrash 60d2432 linuxcnc src/Makefile src/Makefile.inc.in Fix crash on debian wheezy + gcc4.7.2 * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=60d2432
[12:32:31] <Tom_itx> -O has been a problem with AVR-gcc at times too
[13:32:07] <jepler> seb_kuzminsky: the theory is that in RT code, every cache line of code is probably a cache miss, so minimizing code size is best
[13:56:42] <cradek> looks like buildbot says that change doesn't break anything...
[14:16:13] <skunkworks_> Yay
[14:33:41] <cradek> seb_kuzminsky: should I make that change in 2.6? I am worried that it might be miscompiled in the same way, but we haven't (yet) encountered it.
[15:02:41] <cradek> I wish during a conflict resolution, I could easily find the commit log message (and full diff) for both sides
[15:07:40] <skunkworks_> we got a new laser here at work - it is using gecko drives in it...
[15:07:57] <cradek> cool
[15:08:06] <skunkworks_> it was a surprise
[15:08:40] <skunkworks_> oh - you saw it when you took the tour.. (it had just been delivered..) (I am sure you remember...) ;)
[15:09:13] <skunkworks_> looked like a big metal box ;)
[15:09:19] <cradek> yeah I hardly saw anything, only that
[15:14:32] <cradek> mmm hmm, the git is strong with this one
[15:20:40] <KGB-linuxcnc> Chris Radek joints_axes5 7d0d8ce linuxcnc New branch with 161 commits pushed, 183 files changed, 8812(+), 5092(-) since master/4b2203b
[15:21:40] <cradek> ^ this is rebased onto our brave-new-world master
[15:27:07] <seb_kuzminsky> yay for ja5
[15:27:28] <seb_kuzminsky> if we had perf tests, we'd know how big a win -Os is vs -O2
[15:27:50] <cradek> counting the time of the reboot?
[15:27:59] <cradek> pretty sure -Os is a loser for me
[15:28:25] <seb_kuzminsky> for you, on master, for sure
[15:29:20] <cradek> sorry, my actual nonsnarky point was that if part of 2.6 is possibly miscompiled with -Os, and it seems like it probably is, we should not risk it, even if we haven't seen the crash yet.
[15:29:36] <seb_kuzminsky> yeah your point is well taken
[15:29:44] <seb_kuzminsky> i just don't know the impact
[15:30:12] <seb_kuzminsky> i guess 2.6.0 is a ways off yet (gotta sort the kernel first), so maybe we can get some runtime on this new CFLAG before the release
[15:31:01] <cradek> so you want me to do it?
[15:31:08] <seb_kuzminsky> gulp
[15:31:09] <seb_kuzminsky> yes
[15:33:28] <KGB-linuxcnc> Chris Radek 2.6 5bbab25 linuxcnc src/Makefile src/Makefile.inc.in Fix crash on debian wheezy + gcc4.7.2 * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=5bbab25
[15:33:28] <KGB-linuxcnc> Chris Radek master a91a55a linuxcnc src/Makefile Merge branch '2.6' * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=a91a55a
[15:34:40] <seb_kuzminsky> thx
[15:34:50] <seb_kuzminsky> and thanks to you and jepler for figuring it out
[15:36:11] <cradek> I'm glad it happened to me, and I'm glad jepler is smart enough to figure stuff like that out
[15:36:36] <seb_kuzminsky> i wonder why it affects you, but no one else (yet)
[15:37:39] <cradek> I bet a small number of people are running recent master on wheezy+rtai
[15:37:46] <cradek> um, maybe just me
[15:38:35] <cradek> but jepler found gcc4.6 miscompiles it too, which sure makes it weirder
[15:38:47] <cradek> I don't know
[15:39:47] <cradek> battery's about gone, bbl.
[15:39:50] <cradek> too much compiling.
[15:46:17] <seb_kuzminsky> seeya
[15:46:19] <seb_kuzminsky> thanks again
[15:53:52] <mozmck> I heard somewhere that -Os was not the best thing to use.
[15:54:29] <micges1> -
[15:54:42] <mozmck> I think it was when I was having some problems in my embedded stuff, someone told me that it could generate some strange and bad code...
[15:54:51] <micges1> -O2 not best, -Os not best to use
[15:55:20] <micges1> seems that for the last few years optimisation hasn't been the best part of gcc
[15:55:59] <mozmck> I've been using -Os in some projects, and -O3 in others...
[15:56:31] <mozmck> all arm-cortex-m series.
[15:57:24] <micges1> since I started using -O0 for all projects they just work fine; before that there were many gotchas like this one
[15:58:32] <mozmck> I needed -Os in one project. Anything else made the program too big to fit on the chip :) It works fine.
[16:01:45] <micges1> I agree, -Os is critical on embedded projects
[16:01:53] <KGB-linuxcnc> Francis Tisserant v2.5_branch d156ecc linuxcnc docs/src/config/ini_config_fr.txt docs/src/gui/axis_fr.txt docs/src/hal/basic_hal_fr.txt docs/src/hal/comp_fr.txt French doc update and cleaning * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=d156ecc
[16:01:53] <KGB-linuxcnc> Francis Tisserant v2.5_branch ea8e59a linuxcnc docs/html/gcode_fr.html docs/src/gcode/gcode_fr.txt French doc update: Document G5,G5.1,G5.2,G5.3 NURBS/spline commands * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=ea8e59a
[16:22:11] <KGB-linuxcnc> cradek/wheezy-fixcrash 60d2432 linuxcnc . branch deleted * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=60d2432
[16:52:07] <memleak> There is always Clang
[17:04:18] <seb_kuzminsky> the buildbot runs clang on every push, it didn't find anything wrong with the new tp
[17:04:33] <seb_kuzminsky> oh, you mean we should consider building linuxcnc with clang instead of with gcc?
[17:57:06] <cradek> andypugh: what is the thing that looks exactly like what it looks like?
[17:57:53] <andypugh> It’s odd. Nobody has recognised it yet. I
[17:58:43] <andypugh> http://www.ltwerner.com/wwii/images/leeenfield-bolthandle.gif
[17:58:47] <cradek> maybe you overestimate the extent to which it looks like what it is
[17:59:06] <cradek> oh ok, yeah that was my guess
[18:01:17] <andypugh> More specifically: https://plus.google.com/photos/108164504656404380542/albums/6016758997842138209?authkey=CJ6K36Pg5_T5ew
[18:02:10] <andypugh> Which involves lenses and LEDs rather than lead and gunpowder
[18:03:07] <cradek> cool
[18:04:19] <andypugh> I don’t think the guy I am building it for is expecting an actual bolt/trigger mechanism where the firing pin operates the microswitch. But that is what he is getting :-)
[18:04:55] <cradek> sounds like the plans were a bit underspecified...
[18:05:09] <andypugh> He only wanted me to thread the three tubes. But it started to look like fun :-)
[18:05:44] <andypugh> (He provided the stock)
[18:07:43] <andypugh> The stock has a “knob” at the bottom that I haven’t seen before. (I have only really shot old Lee-Enfields). I wondered what it was for until I tried a standing “shot” and it’s perfect for the palm of a hand where the elbow is wedged on your hip.
[18:08:43] <andypugh> (I quite like shooting, I don’t particularly like shooters, they scare me.)
[18:09:08] <cradek> I can sure understand both those feelings
[20:57:11] <jepler> two more notes about fldl (%edi): I think I may have over-minimized the set of cflags that led to the problem. some versions load edi with an appropriate value then use fldl (%edi)
[20:57:26] <jepler> though I don't know why; it's a longer sequence than fldl 0x8(%ebp)
[20:57:42] <jepler> second note, it doesn't seem to happen with gcc 4.7.3 (debian testing)
[20:57:54] <jepler> so it may be a fixed problem, just not fixed in debian stable :-/
[21:00:00] <jepler> also I think I was mistaken when I said I saw the same problem in debian gcc 4.6.3
[21:07:16] <jepler> aha, here's the same problem in the gcc bugzilla: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55940
[21:07:33] <jepler> bad code generated using %edi to access a parameter
[21:07:58] <jepler> for some reason triggered with -m32 -Os -mpreferred-stack-boundary=2
[21:09:05] <jepler> oh and -mregparm=3
[21:09:16] <jepler> and yes, they believe it's fixed in 4.7.3, which agrees with my testing
[21:09:23] <jepler> cradek: congrats, it really was a gcc ug!
[21:09:25] <jepler> er, bug
[21:09:29] <cradek> yay, I guess
[21:09:33] <cradek> thanks for figuring it all out
[21:09:38] <jepler> heh I was invoked by name
[21:09:41] <jepler> I had to defend my honor
[21:09:42] <jepler> bbl
[22:57:36] <memleak> seb_kuzminsky, it was a suggestion for those having problems building linuxcnc with gcc