#linuxcnc-devel | Logs for 2013-09-28

[00:00:44] <jared6> anybody still awake? having a problem with stepconf on a BBB
[00:00:59] <jared6> gives me the no gnome.ui error
[02:31:42] <mhaberler> cradek: around?
[02:32:21] <mhaberler> anyway.. re inverse kins only & jog
[02:32:55] <mhaberler> I checked in master; there is no inverse-only kins; a few have KINEMATICS_BOTH (need start value)
[02:33:09] <mhaberler> that does say somebody doesnt come up with one eventually
[02:33:48] <mhaberler> I have checked in the deltatau docs how they deal with the issue that fwd kins may be expensive (eg if an iterative solution is required)
[02:35:01] <mhaberler> what they do is call fwd kins only in a non-rt context and explicitly at program start
[02:39:49] <mhaberler> re coord jog: seems micges is adding teleop jog, the old EMCMOT_JOG_* commands seem to be intended for joint-jog only - leaving the question: fuzz EMCMOT_JOG_* for coord jog too (rather no), introduce EMCMOT_COORD_JOG_ commands, or wait to see what comes out of ja3; I need some status update what the state of affairs with the new jogging code actually is before I go about it, dont want parallel invention here
[02:41:37] <mhaberler> s/does say/does not say/
[02:46:51] <mhaberler> anyway, I think the idea 'you need a fwd kins, but it need not be RT' is an interesting one for the future; I see two ways to go about that: 1. for now, do an expensive fwdkins in say a python hal module which gets triggered from motion control; 2. if motion control becomes the userland part of a user/RT dual comp it shouldnt be an issue to do it in motion control handler to start with
[07:25:24] <alex_joni> cradek: really long string + 2 plastic cups would do the trick
[07:25:41] <alex_joni> then it's just a matter of shouting 1 or 0
[07:26:01] <archivist> half
[09:15:37] <JT-Shop> anyone know where Axis File > Properties is calculated?
[12:16:43] <tjtr33> cradek, hello, you mentioned a vismach for Michael to use. where can i find it?
[12:17:12] <cradek> git.linuxcnc.org/joints_axes3, sim/axis/rdelta
[12:17:14] <cradek> mhaberler: ^^
[12:17:22] <tjtr33> thx!
[12:17:34] <mhaberler> ah, ok
[12:17:40] <mhaberler> was just looking for a config
[12:18:14] <cradek> machine on, home all, $, jog away
[12:21:17] <mhaberler> works fine
[12:22:03] <mhaberler> how will you handle the wheel thing? separate pins ? adapt to pose naming?
[12:22:34] <cradek> suspect each axis will have a wheel-counts pin, just like each joint does
[12:23:07] <cradek> and a wheel jog enable pin
[12:23:14] <cradek> micges would know better
[12:24:37] <andypugh> Have each gui change the name of the pins and re-wire the HAL automatically. Or some better solution. :-)
[12:25:14] <mhaberler> btw did you read back, I posted some postscriptum to our discussion: http://emc.mah.priv.at/irc/%23linuxcnc-devel/2013-09-28.html#07:16:35
[12:26:26] <cradek> I don't understand what you mean about COORD_JOG
[12:26:44] <andypugh> Is it reasonable just to _require_ a fwd kins?
[12:26:57] <cradek> I have not tested inverse only in ja3 (or forever in non-ja3) but I'd like to not add any new obstacles to the ability to have inverse only
[12:27:20] <cradek> andypugh: why would you want to require it?
[12:28:02] <cradek> if you have a machine you home and switch to world mode and leave it there, you simply don't need them, AND they can be very hard to write
[12:28:09] <andypugh> To save us the trouble of coding to allow inverse only. (This is the un-royal "us", meaning "you")
[12:28:15] <cradek> haha
[12:28:29] <cradek> we already allow it (but whether or not it's buggy I have no idea)
[12:29:01] <andypugh> What do we do with it?
[12:29:33] <cradek> the idea is if you ever switch back to joint mode and move a joint, you have to re-home before you can get back to world mode
[12:29:46] <cradek> that's the ONLY limitation it causes (should cause)
[12:30:01] <andypugh> I am wondering if a fwd kins that is only valid at or around the axis homed positions is viable.
[12:30:35] <cradek> sort of -- you have to give an initial world position corresponding to the joints-just-homed position
[12:30:42] <andypugh> Anyway, I don't have a horse in this race, so I will stop speculating.
[12:31:04] <cradek> and I'll bbl
[12:34:01] <mhaberler> well it's about the type confusion in NML jog commands - it was named EMC_AXIS_JOG_* and was meant for a joint; it is just renamed to EMC_JOG_* but the message doesnt carry a discriminant whether it is intended to jog a joint or an axis
[12:34:41] <mhaberler> it is used for both world and joint jogs though
[12:35:26] <andypugh> From the point of view of wiring up a machine, you might well want switch position 1 to jog Joint 0 or Axis X depending on machine state.
[12:36:35] <andypugh> I appear to have managed to miss the meeting...
[12:36:53] <mhaberler> yessir
[12:41:04] <archivist> I want random geared axes, so I can do coordinated jogs for gear motions and simpler loops in the g code
[12:41:09] * archivist ducks
[12:43:58] <archivist> am already thinking of axes 6 and 7 for my machine but they may be manual for a while
[12:45:34] <andypugh> I am probably going to add a kins-supported but manual B axis to my mill.
[12:45:42] <andypugh> So I can drill slanted holes.
[12:46:07] <CaptHindsight> is that what those curved drill bits are for?
[12:47:24] <archivist> when I am rolling the blank with A the B needs to follow at an inverse and differing amount +- an offset for slot width
[14:01:24] <tjtr33> cradek hello, how to try rdelta cfg?
[14:01:27] <tjtr33> I used "git branch --track ja3 origin/joints_axes3" then "git checkout ja3" then make
[14:01:28] <tjtr33> I see "LINUXCNC - 2.6.0~pre joints_axes" in terminal running linuxcnc, but no sim/rdelta
[14:24:09] <andypugh> tjtr33: Is the file missing from the configs folder? Or just from the config picker?
[14:24:22] <andypugh> Did you . ./rip-environment?
[14:27:04] <tjtr33> yes missing, yes cd ~/linuxcnc-dev |. scripts/rip<tab>| linuxcnc
[14:27:37] <tjtr33> i see that the git cmds did NOT get lineardelta nor rotarydelta src files in src/emc/kins
[14:28:00] <tjtr33> so step 1 was wrong ( getting the src s )
[14:28:42] <tjtr33> i used these notes http://linuxcnc.org/dapper/emc2/emc2/index.php/italian/forum/10-advanced-configuration/26910-hexapod-controls?limitstart=0
[14:29:15] <tjtr33> they are yours, no finger pointing meant, its just what i used
[14:29:32] <tjtr33> hell i'd be lost even more without those notes
[14:34:20] <tjtr33> sorry missing both ways, not in file directory, not in cfg picker
[14:34:58] <andypugh> what does git branch say?
[14:35:55] <tjtr33> "Branch ja3 set up to track remote branch joints_axes3 from origin."
[14:36:20] <tjtr33> for cmd "git branch --track ja3 origin/joints_axes3"
[14:36:41] <andypugh> No, I meant, literally "git branch"
[14:37:37] <tjtr33> * ja3 <lf> master <lf>
[14:38:12] <tjtr33> (i think the char in between was lf (000a)
[14:38:49] <andypugh> Yes, the * shows the branch you are on.
[14:38:58] <andypugh> git log
[14:40:56] <tjtr33> http://pastebin.com/2YaZffVs
[14:41:15] <tjtr33> i guess im in some shell now
[14:41:51] <andypugh> press q
[14:42:19] <tjtr33> thx
[14:42:36] <andypugh> You seem to be a long way behind in the git history.
[14:42:55] <andypugh> what does git pull say?
[14:43:41] <tjtr33> well wait a sec... git checkout ja3 says "Your branch is behind 'origin/joints_axes3' by 340 commits, and can be fast-forwarded."
[14:44:05] <tjtr33> thats what it says NOW versus i just did this 30 min ago so dont see why its so far behind
[14:44:17] <tjtr33> i dont need to know why, but what should i do
[14:44:45] <andypugh> git pull
[14:44:46] <tjtr33> will try git pull
[14:44:49] <tjtr33> yep
[14:45:16] <tjtr33> now make?
[14:46:00] <tjtr33> yes the pull mentions lineardeltakins and rotarydeltakins
[14:46:36] <andypugh> You probably ought to make, yes.
[14:47:01] <andypugh> The new kins need to be compiled.
[14:47:14] <tjtr33> yes mid compile now
[14:50:02] <tjtr33> sudo make setuid | cd | cd linuxcnc-dev | . scripts/rip<tab> | linuxcnc now rdelta is there thx andy!
[14:51:03] <tjtr33> no idea why i'd be so far behind 30 minutes after checkout
[14:54:50] <tjtr33> ah well sim_rdelta.hal:1: Can't find module 'rotarydeltakins' in /home/tomp/linuxcnc-dev/rtlib Shutting down and cleaning up LinuxCNC...
[14:55:53] <tjtr33> true its not in that dir
[14:57:13] <tjtr33> src/emc/kinematics has the .c & .cc & .h files but no .o ( was not built )
[15:00:46] <tjtr33> must be something like make cfg to tell build about new stuff
[15:12:51] <andypugh> Yes, did you do ./configure?
[15:12:54] <tjtr33> nope those deltakins are not getting built despite ./autogen.sh and ./configure then make ( no .o )
[15:13:13] <tjtr33> sorry i was composing ^^^ when you asked
[15:14:33] <andypugh> You can do it by hand if you want. comp --install deltakins.c
[15:15:42] <tjtr33> ok cd src/emc/kinematics ...
[15:16:03] <andypugh> Odd that they are not being built, they exist in the makefile
[15:17:01] <tjtr33> comp --install rotary deltakins.c comp --install lineardeltakins.c
[15:18:04] <tjtr33> sim_rdelta.hal:4: execv(rotarydelta): No such file or directory these names are not the names expected??
[15:20:50] <andypugh> I am not sure. I haven't tried it
[15:21:32] <tjtr33> i had no space in the actual cmdline "tomp@lptp-1004snd:~/linuxcnc-dev/src/emc/kinematics$ comp --install rotarydeltakins.c"
[15:24:56] <andypugh> Does comp look to be succeeding?
[15:26:27] <tjtr33> yes http://pastebin.com/Rzfnv3Ny
[15:26:38] <tjtr33> is it a name clash?
[15:28:19] <tjtr33> i got rtlib/rotarydeltaskins.ko while error is "sim_rdelta.hal:4: execv(rotarydelta): No such file or directory" note: no kins suffix
[15:28:36] <andypugh> There is a slight possibility that it is a sudo make setuid problem.
[15:28:37] <alex_joni> execv(rotarydelta) doesn't sound like a kins
[15:28:55] <alex_joni> can you post the sim_rdelta.hal ?
[15:28:57] <tjtr33> i did exec sudo make setuid
[15:29:30] * alex_joni wonders if it's not a vismach thingie missing, not the actual kins
[15:30:16] <tjtr33> http://pastebin.com/VVwb7f3Y
[15:30:52] <tjtr33> ( btw: alex, sign me up for your new BBB bob )
[15:31:04] <alex_joni> well, it seems I was partly right
[15:31:15] <alex_joni> loadusr -W rotarydelta
[15:31:17] <alex_joni> on line 4
[15:31:32] <alex_joni> that means load the non-rt executable called rotarydelta
[15:31:35] <tjtr33> ok my guess top
[15:31:38] <tjtr33> too
[15:31:54] <alex_joni> no idea what that is though.. but I bet jepler should ;)
[15:32:28] <andypugh> I think that is the Vismach model
[15:33:00] <alex_joni> tjtr33: try commenting it out, if you don't need it for now
[15:33:10] <alex_joni> tjtr33: I sent you a pm for the BBB bob
[15:33:26] <andypugh> alex_joni: Using GTL2000 ?
[15:33:32] <alex_joni> andypugh: yeah
[15:33:41] <andypugh> It's like magic isn't it?
[15:34:45] <alex_joni> sure feels like :D
[15:35:10] <tjtr33> alex, wont that stomp on all the pins listed right afterwards? rotarydelta.joint0 et al ?
[15:36:57] <alex_joni> hmm.. you're probably right
[15:37:22] <tjtr33> hehe i was sim_rdelta.hal:23: Pin 'rotarydelta.joint0' does not exist
[15:37:52] <alex_joni> yeah, just comment out the second half
[15:38:12] <alex_joni> net pfr rotarydeltakins.platformradius # => rotarydelta.pfr
[15:39:08] <tjtr33> hey lemme ask cradek when he's about ( and mha sez it worked for him somehow )
[15:39:17] <tjtr33> and you're right there should be vismach code
[15:39:40] <tjtr33> oh, truncate the names ,, i can try that
[15:40:23] <tjtr33> oh you mean disconnect the signals.... hmm
[15:40:26] <alex_joni> keep the net connected to the kins only
[15:40:38] <alex_joni> and not try to connect them to the vismach model, which didn't loat
[15:40:40] <alex_joni> load
[15:40:56] <alex_joni> my guess would be that vismach didn't build for some reason.. python issue maybe?
[15:41:03] <alex_joni> but that should be in the build log
[15:41:08] <alex_joni> or the output from configure
[15:41:30] <alex_joni> but since it's after 11 pm here, and I have a 16h drive tomorrow.. I'm saying good luck & good night ;)
[15:42:20] <tjtr33> that runs, just no vismach thank you
[15:52:26] <tjtr33> the submakefile is present and has rotarydeltakins in build list, rotarydelta.py is present
[16:00:13] <andypugh> What happens if you try halrun / loudusr rotarydelta?
[16:00:46] <andypugh> (err, loadusr, not loud user)
[16:05:26] <tjtr33> in there now, there is no ~/linuxcnc-dev/bin/rotarydelta ( thats where the vismach should be )
[16:06:10] <tjtr33> halcmd: <stdin>:1: execv(rotarydelta): No such file or directory
[16:06:34] <tjtr33> ^^^ just halrun then loadusr rotatydelta
[16:06:50] <tjtr33> rotarydelta
[16:06:54] <andypugh> can you find the file?
[16:07:05] <tjtr33> doesnt exist afaict
[16:08:55] <andypugh> src/hal/user_comps/vismach/rotarydelta.py
[16:09:05] <tjtr33> like i said before, the src exists and is listed in submakefile, but doesnt get built into the bin dir ( where all other vismach models go )
[16:09:38] <andypugh> It's a python file, you should just be able to run it.
[16:10:03] <tjtr33> tomp@lptp-1004snd:~/linuxcnc-dev$ ls src/hal/user_comps/vismach/rotarydelta.py ->> src/hal/user_comps/vismach/rotarydelta.py exists
[16:10:24] <tjtr33> just copy to bin sans .py ?
[16:10:47] <andypugh> Well, it's worth a try
[16:11:10] <andypugh> Not sure why this is all so hard though.
[16:11:25] <andypugh> I wonder if perhaps it is better to make clean then start again.
[16:13:14] <tjtr33> yep, i forced it by copying sans .py and set it executable and now it bitches ImportError: No module named rotarydeltakins
[16:13:16] <tjtr33> ( not ready for prime time )
[16:13:46] <tjtr33> i bet i could force it to work but you're right, should not be this hacked
[16:14:14] <andypugh> pcw_home: What is the correct hex representation of a 1khz reference clock of a hm2dpll on a 6i25? I am not sure of the units of the frequency, or of the clocklow (for example)
[16:14:20] <tjtr33> off to clean up this munged system now
[16:16:30] <tjtr33> woops, hey thank you andy
[16:24:39] <jepler> 6a281a6 rtapi_math: fix isfinite implementation for pre-4.4 gcc
[16:25:03] <jepler> when I am at this commit (tip of ja3) I can run rdelta and ldelta sample configs
[16:25:43] <jepler> I didn't have to do anything tricky
[16:25:57] <jepler> I did run into an unrelated problem that I mailed the list about, which is why I wasn't saying anything back when it was .. more relevant
[16:26:00] <jepler> bbl
[16:26:38] <pcw_home> clocklow is in integer Hz
[16:27:38] <pcw_home> BaseRateDDS := round(TwoToThe42*((BaseRate)/(DPLLBaseClock/PreScale)));
[16:28:13] <pcw_home> base rate being 1000 for a 1 KHz servo thread
[16:30:40] <andypugh> pcw_home: Yes, I have that in the regmap. What I am unclear on is frequency or time, and the units used.
[16:31:17] <pcw_home> all frequency and in hz
[16:31:49] <andypugh> Initially the DPLL appears to be triggering every 33 seconds. Which is a bit slow :-)
[16:32:22] <andypugh> Did you have any luck with the Fanuc? Did you notice that I pushed a new version of the branch?
[16:32:53] <pcw_home> I didnt have a chance to try your fix yet (fanuc encoders etc are at work)
[16:35:51] <pcw_home> 33 seconds is pretty close to 32768*1/1000
[16:36:43] <andypugh> Dmesg says that the HM2DPLL Timer 1 pin is on P2-12. The Manual says P2-12 is a Gnd... Who is right?
[16:37:26] <jepler> I'd bet on the manual
[16:37:28] <mhaberler> jepler: stunning analysis.. didnt even think of static members default dtors
[16:37:38] <jepler> mhaberler: ain't C++ grand!
[16:38:00] <jepler> it's in the class of "obvious in retrospect" weird crashing bugs
[16:38:08] <pcw_home> both (after the hdr26 transitions to a DB25)
[16:38:09] <mhaberler> yeah, right ;)
[16:38:17] <jepler> now the challenge is to do better than my so-called "fix"
[16:38:53] <pcw_home> manual has hdr pins dmesg has DB25 pins
[16:39:00] <mhaberler> what about dumping the static members and make that a proper class - the dtor could do it in proper sequence
[16:39:07] <mhaberler> my gut feeling is you're right
[16:39:19] <mhaberler> that pythis dtor is the issue
[16:39:50] <mhaberler> it is a one time effort - explicit member initialisation
[16:40:04] <mhaberler> but we get multiple instances per address space for free
[16:40:25] <mhaberler> (for a given amount of hours counted as 'free')
[16:41:00] <mhaberler> that would give controlled sequencing of the dtor and maybe even make it debuggable
[16:41:19] <jepler> I don't know what happens when you take the "static" off of that declaration
[16:41:25] <jepler> I assume it doesn't build anymore
[16:41:36] <mhaberler> you lose default initialisation, that is all
[16:42:01] <mhaberler> that moves to the ctor and a humungous initializer list
[16:42:22] <mhaberler> I think it was convenience - in particular interp_array.cc
[16:42:40] <pcw_home> so DDS for 1 KHz is 0x7DD4414.A70C774DC167F6D4C (0x7DD4415)
[16:42:41] <pcw_home> if prescale is one
[16:43:16] <mhaberler> but the dtor would delete pythis last thing and then it shouldnt matter anymore
[16:43:21] <pcw_home> is (assuming 33.333 MHz clocklow)
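The rounded DDS value quoted by pcw_home can be reproduced from the formula he gives at 16:27:38. This is only a minimal sketch, not hostmot2 driver code: the function name base_rate_dds is hypothetical, and 33333333 Hz is an assumed exact value for the "33.333 MHz" clocklow.

```cpp
#include <cstdint>

// Sketch of BaseRateDDS := round(2^42 * BaseRate / (DPLLBaseClock / PreScale)).
// Rates are in Hz, prescale is a plain divisor. The function name is hypothetical.
// Integer arithmetic is used throughout; adding half the divisor before the
// division rounds to nearest. The product fits in 64 bits for rates of this size.
constexpr std::uint64_t base_rate_dds(std::uint64_t base_rate_hz,
                                      std::uint64_t dpll_base_clock_hz,
                                      std::uint64_t prescale)
{
    return ((std::uint64_t{1} << 42) * base_rate_hz * prescale
            + dpll_base_clock_hz / 2) / dpll_base_clock_hz;
}

// 1 kHz servo thread, assumed 33333333 Hz clocklow, prescale of one.
static_assert(base_rate_dds(1000, 33333333, 1) == 0x7DD4415,
              "matches the rounded value quoted in the log");
```

The unrounded quotient is about 131941396.65, which is the 0x7DD4414.A7... figure above; rounding to nearest gives 0x7DD4415.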
[16:43:58] <mhaberler> just removing the static kwd will make the initializer statements in interp_array.cc fail to compile
[16:44:29] <jepler> and interpmodule is quite grumpy
[16:44:42] <mhaberler> grumpy..
[16:45:31] <mhaberler> meaning hard to understand or adapt? (yes and yes)
[16:46:05] <jepler> /usr/src/linuxcnc-rotarydelta/src/emc/rs274ngc/rs274ngc_interp.hh|628 col 9| error: invalid use of non-static data member ‘Interp::_setup’
[16:46:08] <jepler> /usr/src/linuxcnc-rotarydelta/src/emc/rs274ngc/interpmodule.cc|274 col 40| error: from this location
[16:46:14] <jepler> .def_readwrite("stack_index", Interp::_setup.stack_index) // error stack pointer, beyond last entry
[16:47:17] <jepler> and so on for every Interp::_setup use
[16:47:38] <mhaberler> post the change you did? let me try that here
[16:48:16] <jepler> no, trying to remove the 'static' from _setup
[16:50:07] <jepler> hold on, I had a stray & left
[16:50:20] <jepler> no that's not it
[16:50:25] <cradek> what are you guys talking about?
[16:50:33] <jepler> how do you form a pointer-to-member-of-nested-struct?
[16:51:01] <jepler> > Yes, it is forbidden. You are not the first to come up with this perfectly logical idea. In my opinion this is one of the obvious "bugs"/"omissions" in the specification of pointers-to-members in C++, but apparently the committee has no interest in developing the specification of pointers-to-members any further (as is the case with most of the "low-level" language features).
[16:51:07] <jepler> http://stackoverflow.com/questions/1929887/is-pointer-to-inner-struct-member-forbidden
[16:51:18] <jepler> so .. interpmodule will unfortunately need a thorough reaming :-/
[16:54:08] <mhaberler> anything other than POD arrays is straightforward to adapt
[16:54:26] <mhaberler> fixed size arrays are hell to expose in python
[16:54:49] <mhaberler> templates 'r us
[16:55:42] <mhaberler> but at the time I was afraid of changing this to vector
[16:58:19] <jepler> none of the .def_readwrites that refer to Interp::_setup are arrays
[16:59:33] <mhaberler> meaning it should be straightforward (ha)
[17:03:29] <mhaberler> did you include interp_internal.hh in rs274ngc_interp.hh to get it to compile?
[17:05:18] <jepler> yes
[17:05:25] <jepler> which is also unfortunate
[17:09:17] <jepler> well anyway I think for each of those properties you "just" need to write an explicit getter and an explicit setter and refer to them in an .add_property
[17:09:37] <mhaberler> the _setup members will need thin wrappers - noisy but trivial
[17:09:45] <mhaberler> right
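The "explicit getter and explicit setter" idea can be sketched in plain C++. The types below are simplified stand-ins that only borrow the names Interp, _setup and stack_index from the discussion; the wrapper function names are hypothetical, and the boost::python call in the closing comment is an assumption about how such wrappers would be registered, not the actual interpmodule.cc change.

```cpp
// Simplified stand-ins for the real types; only the shape matters here.
struct setup_struct {
    int stack_index = 0;
};

class Interp {
public:
    setup_struct _setup;   // non-static in this sketch
};

// Not valid C++: a pointer-to-member cannot reach into a nested struct.
// int Interp::*p = &Interp::_setup.stack_index;

// Thin wrappers instead: one explicit getter and one explicit setter
// per exposed field, each taking the owning object as first argument.
int get_stack_index(const Interp &interp)
{
    return interp._setup.stack_index;
}

void set_stack_index(Interp &interp, int value)
{
    interp._setup.stack_index = value;
}

// With boost::python, wrappers of this shape would presumably replace the
// .def_readwrite line, roughly:
//   .add_property("stack_index", &get_stack_index, &set_stack_index)
```

This is noisy, since every _setup field needs its own pair, but each wrapper is trivial and none of them depends on _setup being static.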
[17:10:35] <mhaberler> anyway, I think it's rather the question how to proceed on this; I would actually prefer to take the opportunity to get rid of static members
[17:10:47] <jepler> the way you have talked before about the trouble static _setup causes for you, I had assumed the problem with getting rid of it ran deep
[17:10:52] <jepler> it's in comparatively new code though
[17:11:11] <jepler> interpmodule is not in 2.5, right?
[17:11:29] <mhaberler> no, master only - it's about 18 months there
[17:11:58] <jepler> OK
[17:12:19] <mhaberler> lots of surprises for 2.5 addicts in master ;)
[17:12:28] <jepler> as long as it doesn't have to be fixed in 2.5 by such an invasive change
[17:12:39] <mhaberler> no
[17:13:54] <jepler> can you observe the crash before changing anything?
[17:13:58] <mhaberler> it would be good to know beforehand if dtor sequencing (ie pythis last) actually fixes the problem
[17:14:07] <jepler> I haven't looked at what the lifetime of Interp objects is; if they are collected after exit() then the same problem will exist
[17:14:13] <mhaberler> you mean an axis segfault during shutdown?
[17:14:17] <jepler> yes
[17:14:48] <jepler> Py_Finalize will be called before exit() and the destructors of global data will be after exit()
[17:14:50] <mhaberler> I have heard of it; I might have seen it
[17:15:04] <mhaberler> aja
[17:15:06] <jepler> well make sure you see it, so you can be confident it's fixed post your changes
[17:15:28] <jepler> I worry because I didn't look at all the Interps we declare, I bet some of them are global so you're back to the same problem
[17:15:39] <mhaberler> static ctors/dtors are a stupidity which should be wiped
[17:15:47] <jepler>
[17:15:47] <jepler> int Py_IsInitialized()
[17:15:47] <jepler> Return true (nonzero) when the Python interpreter has been initialized, false (zero) if not. After Py_Finalize() is called, this returns false until Py_Initialize() is called again.
[17:16:12] <jepler> you can write magic in the destructor of setup: if !Py_IsInitialized() { silently discard the reference, if boost::python lets you }
[17:16:22] <mhaberler> right
[17:16:33] <mhaberler> not sure it will
[17:17:00] <mhaberler> but I was there before; the fix was a wrapper which redefines the dtor for that pointer
[17:17:07] <mhaberler> let me see
[17:20:43] <jepler> Py_INCREF(pythis.ptr());
[17:20:56] <jepler> this would be the hacky way to do it: by incrementing the reference count, you force it to leak
[17:21:34] <jepler> but we're trying to be free of hacks here :-/
[17:22:13] <jepler> new(&pythis) bp::object(NULL);
[17:22:14] <jepler> but we're trying to be free of hacks here :-/
[17:22:21] <jepler> OK, I'm going to stop and leave you to decide what's best.
[17:24:40] <mhaberler> ha
[17:27:00] <mhaberler> does the crash go away if you do a Py_INCREF(pythis.ptr()) ?
[17:27:37] <mhaberler> and - what config did you get the crash with ?
[17:28:58] <mhaberler> what puzzles me is that the pythis dtor should only delete the wrapper class instance, not the interp instance (was this actually the problem?)
[17:29:29] <jepler> I was in ja3 running configs/sim/axis/rdelta.ini
[17:29:37] <jepler> just start and then close axis after a few seconds
[17:29:49] <mhaberler> g.l.o:joints_axes3 ?
[17:29:59] <jepler> right
[17:30:10] <mhaberler> that is based on master?
[17:30:19] <jepler> master has been merged into it fairly recently
[17:31:13] <mhaberler> oh, July, I see
[17:31:23] <jepler> I didn't actually try on master branch
[17:31:34] <mhaberler> should be the same for that problem
[17:31:48] <jepler> that was what I thought, if my diagnosis was accurate
[17:32:00] <tjtr33> andypugh, make clean; make; sudo make setuid all good now got the vismach up thanks
[17:33:20] <mhaberler> 'close axis' meaning delete window or File->Quit?
[17:33:21] <jepler> running --enable-simulator but again it shouldn't be relevant
[17:33:32] <jepler> window manager close action (alt-f4) is what I did
[17:33:45] <jepler> should do the same thing as file>quit
[17:33:52] <jepler> I sort of think that maybe this started affecting me when I upgraded to debian unstable
[17:34:00] <jepler> that's why I wanted to be sure you would see it
[17:34:08] <jepler> anyway, terminal prints that axis crashed and it's logged in dmesg..
[17:34:19] <jepler> [526616.119200] axis[6119]: segfault at 98 ip 000000000046d598 sp 00007fffa9803f90 error 6 in python2.7[400000+23a000]
[17:34:56] <mhaberler> I get none on 10.04 with boost 1.40, python 2.6.5
[17:35:35] <mhaberler> gut feeling: boost version - what do you have?
[17:36:17] <jepler> libboost-python-dev
[17:36:37] <jepler> errrrr
[17:36:39] <jepler> that's a lie, hold on
[17:36:41] <jepler> (wrong system)
[17:36:48] <jepler>
[17:38:06] <mhaberler> I have a wheezy vm with python 2.7.3 and boost 1.49.0-3, that might do
[17:38:10] <jepler> so yeah, it could be a boost::python difference too
[17:47:26] <mhaberler> I dont get that segfault on that wheezy platform, sim build of joints_axes3
[17:48:40] <mhaberler> gcc version is 4.7.2-5
[18:03:27] <mhaberler> well I dont know how to reproduce this one; last option is a non-VM;)
[18:04:17] <mhaberler> x86 or amd64?
[18:08:58] <jepler> oh amd64
[18:09:17] <jepler> if you have something you want me to test, let me know
[18:09:36] <jepler> it happens at least 50% of the time for me, maybe more; with my initial "fix" it happened 0 in 10 times
[18:14:37] <mhaberler> ok, will try on amd64
[18:27:34] <mhaberler> amd64/wheezy: nada
[18:32:03] <mhaberler> 20 runs - nothing, both x86/10.04, amd64/wheezy
[18:38:57] <jepler> I guess only jessie shows it then
[19:19:10] <cradek> I don't see crash on exit, joints_axes3, amd64 precise
[19:19:36] <cradek> I think I *have* seen it on this machine, but it won't do it for me now
[19:26:10] <mhaberler> I dont deny it's there; I have seen it in logs sent from users
[19:26:32] <mhaberler> so I will ask them
[19:26:44] <andypugh> I see this on occasion: "Sep 26 23:58:11 dn2800 kernel: [101996.649996] ldpreload_main[17923]: segfault at c ip 40333057 sp bf922720 error 4 in libstdc++.so.6.0.13[40277000+e9000]"
[19:27:05] <mhaberler> this is something else
[19:27:12] <mhaberler> its the result of the configure test
[19:27:56] <jepler> typedef struct setup_struct
[19:27:56] <jepler> {
[19:27:56] <jepler> + ~setup_struct() {
[19:27:56] <jepler> + assert(!pythis || Py_IsInitialized());
[19:27:56] <jepler> + }
[19:28:00] <mhaberler> s/the/a/ which tests whether the LD_PRELOAD workaround is still needed, so its harmless
[19:28:18] <jepler> I think this assertion is equivalent to what I'm claiming is the Python-required invariant
[19:28:34] <jepler> it fires for me, of course..
[19:29:23] <mhaberler> hm, I wonder what causes the Py_Finalize
[19:29:45] <mhaberler> sure its hidden implicitly in layer 23, library 42
[19:30:15] <jepler> when your executable is python, it is called when python is exiting
[19:30:49] <mhaberler> any chance you can breakpoint both Py_Finalize and ~setup_struct()?
[19:31:35] <jepler> what will cause 'pythis' to get a value in the first place?
[19:31:39] <jepler> I guess I should grep
[19:31:50] <mhaberler> Interp.init()
[19:32:10] <mhaberler> nevermind. The killer is this stupid static class initialisation. Sequencing of ?tors becomes pretty much uncontrollable if its used.
[19:32:17] <mhaberler> get rid of them and we be fine
[19:33:00] <mhaberler> change linking order..surprise surprise
[19:33:17] <mhaberler> this is a no-no technique, no matter who came up with it
[19:34:17] <mhaberler> we can fix this one, but its working on the wrong goal - we should get rid of statics in Interp, and static class ctors/dtors
[19:35:53] <mhaberler> no amount of fiddling will change this for Python modules using interp if the exit hander gets to do something erratic
[19:36:15] <jepler> http://pastebin.com/AE95iccc
[19:36:42] <jepler> if you make _setup not static, you still have a problem if an Interp object has static storage duration
[19:37:25] <mhaberler> well that should be true for any C++ class used from Python modules - it should go away
[19:37:51] <jepler> I thought maybe gcodemodule had a static-duration Interp but it does not. (it has a pointer)
[19:37:55] <mhaberler> great trace - it shows the Python exit is first, then the dtor
[19:38:21] <mhaberler> not sure this is enough if you have static members
[19:38:50] <mhaberler> I think the class pointer dont matter if you have a dtor for a static member
[19:39:53] <mhaberler> that gdb log establishes the sequence, whatever the cause is why it doesnt show on some platforms - it might be just a change in the ordering how dtors are called
[19:40:19] <jepler> Py_Finalize is not called from a destructor
[19:40:23] <jepler> it's called from Py_Main
[19:40:27] <mhaberler> right
[19:41:33] <mhaberler> it seems for some reason on some platforms the dtors are called after Py_Main exits
[19:41:55] <mhaberler> if we tie the dtor to the module unload handler we be fine
[19:43:08] <mhaberler> actually that be an interesting try: call setup_struct::~setup_struct() from the gcodemodule exit handler (but I dimly remember this is another can of worms)
[19:44:02] <mhaberler> can we establish if the exit handler is called before or after Py_Finalize?
[19:44:23] <mhaberler> I mean atexit(3)
[19:45:01] <jepler> If Py_Finalize is in Py_Main don't we know it's before atexit handlers are called?
[19:45:12] <mhaberler> is it? I'm not sure
[19:46:20] <jepler> in python, main() ends with: return Py_Main(argc, argv);
[19:46:31] <mhaberler> another 'fix' would be to explicitly provide a Interp dtor method in gcodemodule - I dont like it, but if it works it is another piece in the puzzle
[19:47:24] <mhaberler> right, exit handlers are too late: post return of main(). http://www.cplusplus.com/reference/cstdlib/atexit/
[19:48:21] <mhaberler> by then Py_Finalize() happened already
[19:48:55] <jepler> you can provide a destructor for setup_struct as I did above to explicitly test for the invariant violation, but after that exits it'll call the destructors for the members including pythis which is where the problem occurs
[19:49:23] <mhaberler> not if you call it explicitly before Py_Finalize()
[19:49:50] <mhaberler> that is just a wart added by this static ctor usage thing
[19:50:10] <mhaberler> I mean from Python by a method exported from gcodemodule.cc
[19:50:50] <mhaberler> Py_Finalize wasnt planned to work with stupid c++ usage in extensions ;)
[19:51:56] <mhaberler> I think what happens in the pythis dtor is just the Python wrapper being destroyed, but that seems to assume Py_Finalize wasnt called yet
[19:52:50] <mhaberler> if you do a boost::python::object(something), the C++ wrapper class for something gets instantiated
[19:53:57] <mhaberler> interpmodule+py*cc take care not to accidentally instantiate or destroy the interp proper - I dont believe the interp dtor being called from _there_ is at work; its rather the wrapper dtor
[19:54:03] <jepler> (maybe I should be saying "contract", not "invariant". stupid technical terms)
[19:55:19] <mhaberler> I tried something similar with an atexit() handler in a Python module, and I dimly remember the result was similar - too late
[19:56:43] <jepler> typedef struct setup_struct
[19:56:44] <jepler> {
[19:56:44] <jepler> + ~setup_struct() {
[19:56:44] <jepler> + if(!Py_IsInitialized()) new(&pythis) boost::python::object();
[19:56:44] <jepler> + }
[19:57:14] <jepler> that basically nulls out the internal PyObject* in the problem situation
[19:57:30] <jepler> so that the dtor of pythis does nothing
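The effect of that destructor can be illustrated without Python or boost. In this sketch every name is hypothetical: Ref stands in for boost::python::object, runtime_alive stands in for Py_IsInitialized(), and late_releases counts the releases that would touch an already finalized interpreter.

```cpp
#include <new>

bool runtime_alive = true;   // stand-in for Py_IsInitialized()
int late_releases = 0;       // releases attempted after "finalize"

// Stand-in for boost::python::object: releases its referent on destruction.
struct Ref {
    void *obj = nullptr;
    Ref() = default;
    ~Ref() { if (obj && !runtime_alive) ++late_releases; }
};

// No user-written destructor: pythis is always released.
struct Plain {
    Ref pythis;
};

// Same idea as the patch above: if the runtime is already gone, overwrite
// pythis with an empty Ref so that the member destructor, which still runs
// after this body, finds nothing to release. The old reference is leaked
// on purpose.
struct Guarded {
    Ref pythis;
    ~Guarded() { if (!runtime_alive) new (&pythis) Ref(); }
};

// Destroys a Holder after the runtime has been "finalized" and reports
// how many late releases that caused.
template <class Holder>
int late_releases_for()
{
    static int referent;
    late_releases = 0;
    runtime_alive = true;
    {
        Holder holder;
        holder.pythis.obj = &referent;
        runtime_alive = false;   // "Py_Finalize" runs while holder still exists
    }                            // holder is destroyed here
    runtime_alive = true;
    return late_releases;
}
```

Here late_releases_for<Plain>() reports one late release and late_releases_for<Guarded>() reports none, which mirrors the crash going away while the underlying ordering problem stays in place.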
[19:57:46] <mhaberler> does this change what you see?
[19:58:10] <mhaberler> that hypothesis 'wrapper dtor must happen before Py_Finalize()' can be tested - I would bet it uses Python memory alloc or rather free
[19:58:29] <jepler> no crashes in 5 runs with that
[19:58:54] <mhaberler> ha
[20:00:13] <mhaberler> actually there might be more than one boost::python::object reference in _setup (possibly in the call stack or the remap stack)
[20:00:54] <jepler> I sure am not testing any situation like that
[20:01:16] <jepler> I feel like this is pointing back to my band-aid fix, which really won't make things much worse when you've had time to find a right fix
[20:02:05] <jepler> back at tip of ja3, 5 / 5 runs crashed at exit so it's nearly 100% for me
[20:02:14] <mhaberler> no, and it is cosmetic to start with because at that point Python has no damage potential left
[20:02:34] <jepler> do you mind if I go ahead and push the band-aid, then?
[20:02:40] <mhaberler> no, not at all
[20:02:52] <mhaberler> to master I assume?
[20:03:03] <jepler> right
[20:03:29] <mhaberler> we know the causality, we have a, uh, repair, but there's not that much upside from beautifying it
[20:03:47] <mhaberler> perfect
[20:03:58] <jepler> alright, thanks for your time and thoughts..
[20:04:04] <mhaberler> thanks - I think I learned something
[20:05:37] <mhaberler> at some point we need to converge on a license tagging method.. I still have hesitations to follow your suggestions, but I'd rather put that off to tomorrow or so, it's 2:50 here and my license discussion battery is in the red zone ;)
[20:06:38] <jepler> hmm this is not good .. in master branch, axis is segfaulting at start :-/
[20:06:43] <jepler> more to look into I guess
[20:06:59] <mhaberler> actually the reason why pythis is wrapped this way was another consequence of interp lifetime issues;
[20:07:04] <mhaberler> post your fix?
[20:09:29] <mhaberler> hm, I havent touched rs274_pre.cc and interpmodule.cc since 4/13 which is before the merge in ja3
[20:09:46] <jepler> my new crash is something totally .. other
[20:10:00] <mhaberler> great ;)
[20:10:16] <jepler> but the crash site is code that hasn't been touched in years
[20:10:25] <mhaberler> being?
[20:10:59] <jepler> segmentation fault in XChangeProperty called from seticon in seticon.c
[20:11:29] <mhaberler> very weird
[20:21:10] <mhaberler> actually.. if I understand dtors right you could do a null dtor like ~setup_struct() {} with the same result; I would think that overrides default destruction of all _setup members including pythis
[20:21:29] <jepler> that is contrary to my understanding of C++
[20:22:18] <mhaberler> you mean static members are still destroyed even if an explicit dtor doesnt?
[20:23:00] <mhaberler> I thought an explicit dtor prevents default dtor construction?
[20:23:32] <jepler> the destructors of all base objects are still called
[20:23:48] <mhaberler> oh, class hierarchy
[20:24:47] <jepler> I am not sure I'm using the right terms again
[20:24:51] <jepler> I think I meant member, not base
[20:24:53] <mhaberler> I get the idea
[20:25:11] <jepler> the destructor of the member pythis will be called regardless of whether setup_struct which contains it has a non-default destructor
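The point about member destruction can be checked with a small sketch. `Member` and `Holder` are hypothetical names standing in for `pythis` and `setup_struct`. The user-provided destructor body of `Holder` never mentions the member, yet the member's destructor still runs after that body finishes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which destructors run.
static std::vector<std::string> events;

// Hypothetical stand-in for the pythis member.
struct Member {
    ~Member() { events.push_back("member dtor"); }
};

// Hypothetical stand-in for setup_struct.
struct Holder {
    Member m;
    // User-provided destructor that does nothing to m.
    ~Holder() { events.push_back("holder dtor body"); }
};

// Creates and destroys one Holder, then returns the recorded events.
std::vector<std::string> destroy_holder() {
    events.clear();
    { Holder h; }
    return events;
}
```

Here `destroy_holder()` yields "holder dtor body" followed by "member dtor", so an empty `~setup_struct() {}` alone would not suppress destruction of `pythis`.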
[20:26:24] <mhaberler> while discussing warts.. here somebody was in a similar situation.. lifetime extension needed: http://www.hafizpariabi.com/2008/01/using-custom-deallocator-in.html
[20:26:56] <mhaberler> leaks 'r us
[20:28:01] <mhaberler> I guess you'd call it redefining the dtor of the base object
[20:32:52] <jepler> you'd think that over the years I would have looked into why I never got the axis icon on 64-bit systems, wouldn't you
[20:34:03] <mhaberler> I think I leave you with rearranging the deck chairs for today ;)
[20:38:51] <KGB-linuxcnc> jepler master 33380c4 linuxcnc src/emc/ rs274ngc/interp_array.cc rs274ngc/rs274ngc_interp.hh * interp: avoid violating Python runtime contract
[20:38:52] <KGB-linuxcnc> jepler master 88eec4b linuxcnc lib/python/rs274/icon.py * axis: fix the icon
[20:40:02] <jepler> chair configuration: relatively satisfying
[20:48:55] <KGB-linuxcnc> andy ssi-fanuc-biss-dpll ef066b2 linuxcnc src/ (8 files in 2 dirs)
[20:48:56] <KGB-linuxcnc> Further tidying up and introduction of the HM2DPLL module to allow pre-triggering
[20:48:56] <KGB-linuxcnc> andy ssi-fanuc-biss-dpll 405e482 linuxcnc src/hal/drivers/mesa-hostmot2/ hm2_dpll.c hostmot2.h pins.c * HM2DPLL seems to be working now
[20:49:53] <andypugh> Gah! This was meant to be an early night, a busy day tomorrow of moving fresher students about by vintage fire engine.
[20:50:37] <andypugh> Goodnight chaps.
[21:18:57] <linuxcnc-build_> build #1349 of lucid-amd64-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-amd64-sim/builds/1349 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:18:57] <linuxcnc-build_> build #1346 of lucid-i386-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-i386-sim/builds/1346 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:18:57] <linuxcnc-build_> build #1349 of lucid-i386-realtime-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-i386-realtime-rip/builds/1349 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:18:58] <linuxcnc-build_> build #1350 of hardy-amd64-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/hardy-amd64-sim/builds/1350 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:18:58] <linuxcnc-build_> build #1345 of lucid-rtai-i386-clang is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-rtai-i386-clang/builds/1345 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:18:58] <linuxcnc-build_> build #1348 of hardy-i386-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/hardy-i386-sim/builds/1348 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:18:59] <linuxcnc-build_> build #1346 of hardy-i386-realtime-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/hardy-i386-realtime-rip/builds/1346 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:57] <linuxcnc-build> build #430 of precise-amd64-rtpreempt-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-rtpreempt-rip/builds/430 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:57] <linuxcnc-build> build #1349 of precise-amd64-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-sim/builds/1349 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:58] <linuxcnc-build> build #1145 of precise-amd64-sim-clang is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-sim-clang/builds/1145 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:58] <linuxcnc-build> build #456 of precise-x86-xenomai-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-x86-xenomai-rip/builds/456 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:58] <linuxcnc-build> build #430 of precise-amd64-xenomai-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-xenomai-rip/builds/430 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:59] <linuxcnc-build> build #550 of precise-i386-realtime-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-i386-realtime-rip/builds/550 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:59] <linuxcnc-build> build #446 of precise-x86-rtpreempt-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-x86-rtpreempt-rip/builds/446 blamelist: Jeff Epler <jepler@unpythonic.net>
[21:58:59] <linuxcnc-build> build #1344 of checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/checkin/builds/1344 blamelist: Jeff Epler <jepler@unpythonic.net>
[22:39:01] <linuxcnc-build_> build #1350 of lucid-amd64-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-amd64-sim/builds/1350 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[22:39:02] <linuxcnc-build_> build #1347 of lucid-i386-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-i386-sim/builds/1347 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[22:39:02] <linuxcnc-build_> build #1350 of lucid-i386-realtime-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-i386-realtime-rip/builds/1350 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[22:39:02] <linuxcnc-build_> build #1351 of hardy-amd64-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/hardy-amd64-sim/builds/1351 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[22:39:03] <linuxcnc-build_> build #1346 of lucid-rtai-i386-clang is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-rtai-i386-clang/builds/1346 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[22:39:03] <linuxcnc-build_> build #1349 of hardy-i386-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/hardy-i386-sim/builds/1349 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[22:39:03] <linuxcnc-build_> build #1347 of hardy-i386-realtime-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/hardy-i386-realtime-rip/builds/1347 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:01] <linuxcnc-build> build #431 of precise-amd64-rtpreempt-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-rtpreempt-rip/builds/431 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:01] <linuxcnc-build> build #1350 of precise-amd64-sim is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-sim/builds/1350 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:01] <linuxcnc-build> build #1146 of precise-amd64-sim-clang is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-sim-clang/builds/1146 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:02] <linuxcnc-build> build #457 of precise-x86-xenomai-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-x86-xenomai-rip/builds/457 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:02] <linuxcnc-build> build #431 of precise-amd64-xenomai-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-xenomai-rip/builds/431 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:03] <linuxcnc-build> build #551 of precise-i386-realtime-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-i386-realtime-rip/builds/551 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:03] <linuxcnc-build> build #447 of precise-x86-rtpreempt-rip is complete: Failure [failed git] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-x86-rtpreempt-rip/builds/447 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>
[23:19:04] <linuxcnc-build> build #1345 of checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/checkin/builds/1345 blamelist: dummy, Andy Pugh <andy@bodgesoc.org>