#linuxcnc-devel | Logs for 2016-07-29

[00:01:52] <KGB-linuxcnc> Sebastian Kuzminsky seb/2.6/task-revert 4cdbbf1 linuxcnc src/emc/task/emctask.cc Revert "Task: fix a recent "surprise motion on abort" bug" * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=4cdbbf1
[00:01:53] <KGB-linuxcnc> Sebastian Kuzminsky seb/2.6/task-revert c7fafc2 linuxcnc src/emc/task/emctaskmain.cc Revert "task: Fix serial number handling after 516deaef" * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=c7fafc2
[00:01:53] <KGB-linuxcnc> Sebastian Kuzminsky seb/2.6/task-revert 4b9f8da linuxcnc (5 files in 3 dirs) Revert "Task: add drain_interp_list" * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=4b9f8da
[00:01:55] <KGB-linuxcnc> Sebastian Kuzminsky seb/2.6/task-revert 8fffa9d linuxcnc src/emc/task/emctaskmain.cc Revert "Task: simplify handling of emcCommand" * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=8fffa9d
[00:01:59] <KGB-linuxcnc> Sebastian Kuzminsky seb/2.6/task-revert 9e530d8 linuxcnc src/emc/rs274ngc/rs274ngc_pre.cc tests/motion-logger/basic/expected.builtin-startup.in tests/motion-logger/mountaindew/expected.motion-logger Revert "interp: reset Interp and Canon state on Abort" * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=9e530d8
[00:02:04] <KGB-linuxcnc> Sebastian Kuzminsky seb/2.6/task-revert 1e809f9 linuxcnc tests/motion-logger/startup-gcode-abort/skip tests/statbuffer-g5x-abort/skip skip the tests that fail without the task fixes * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=1e809f9
[00:05:41] <linuxcnc-build> Hey! build 0000.checkin #4437 is complete: Success [build successful]
[00:05:41] <linuxcnc-build> Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/4437
[02:23:53] <linuxcnc-build> build #550 of 4017.5.deb-wheezy-armhf is complete: Failure [failed shell_3] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/4017.5.deb-wheezy-armhf/builds/550 blamelist: Sebastian Kuzminsky <seb@highlab.com>
[07:05:39] <jepler> on my odroid I got the module-loading/pid failure after just 357 iterations
[07:15:06] <jepler> 2400+ iterations before uspace-plus merge and no failure
[07:15:11] <jepler> so it seems likely that it's me somehow
[09:26:15] <jepler> on the odroid, ~29000 iterations and no failure, using a master ref from before uspace-plus
[09:26:18] <jepler> booooo
[09:30:05] <skunkworks> isn't that good?
[09:30:27] <skunkworks> or you know there should be a failure and it isn't failing?
[09:30:40] <jepler> I don't understand why it's failing
[09:31:01] <jepler> but the evidence shows it's my fault, and that's why I say booooo
[09:31:08] <skunkworks> ah
[09:31:20] * skunkworks points finger at jepler
[09:32:40] <jepler> also, with a failure frequency of 1-in-300-ish, it's hard to be confident you've fixed it just by running a test..
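A rough way to put numbers on that remark: if each iteration fails independently with some probability, the chance of a run of clean iterations happening by luck can be computed directly. The sketch below is an illustration only. The function name `passes_needed`, the independence assumption, and the 95% confidence target are assumptions, not anything stated in the channel.

```python
import math


def passes_needed(failure_rate, confidence):
    """Smallest n such that n consecutive passes would happen with
    probability <= (1 - confidence) if the bug were still present.

    Assumes every iteration fails independently with `failure_rate`.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # Assumed inputs: the "1-in-300-ish" rate from the log, 95% confidence.
    print(passes_needed(1 / 300, 0.95))
```

Under those assumptions roughly 900 clean iterations are needed before a passing run says much, which is why a few hundred passes prove little while thousands of passes without a failure are fairly convincing.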
[09:50:10] <jepler> 546108fa69a5ac1998e69e49e22f0398959a4888 is the first bad commit
[09:51:02] <jepler> (a result I believe even if I don't get why yet)
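The phrase "is the first bad commit" is what `git bisect` prints when it finishes. The log does not show how the search was driven, so the following is only a sketch of the idea for an intermittent failure: binary search over an ordered commit list, repeating the test several times per commit before calling it good. The names `first_bad_commit`, `flaky_is_bad`, `run_once`, and `runs` are hypothetical.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit.

    `commits` is ordered oldest to newest and the newest one is assumed
    to be bad; once a commit is bad, all later ones are assumed bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]


def flaky_is_bad(run_once, runs):
    """Wrap a single test run so a commit counts as bad if any of
    `runs` repetitions fails. `run_once(commit)` returns True on pass."""
    def is_bad(commit):
        return any(not run_once(commit) for _ in range(runs))
    return is_bad
```

With a rare failure, a commit judged good after too few repetitions can send the search the wrong way, so the repetition count matters more than the search itself.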
[09:54:12] <jepler> 4 float IN 0 pid.11.Dgain
[09:54:13] <jepler> 4 float IN 0 pid.11Note: Using POSIX non-realtime
[09:54:13] <jepler> .14.FF2
[09:54:13] <jepler> 4 float IN 0 pid.14.Igain
[09:54:17] <jepler> they're both message ordering problems
[10:27:42] <KGB-linuxcnc> Jeff Epler master e996a30 linuxcnc src/rtapi/uspace_rtapi_app.cc uspace: avoid use of message queue in main thread * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=e996a30
[10:41:03] <mozmck> jepler: is this message queue new with uspace-plus? what is it exactly?
[10:41:11] <jepler> mozmck: yes, it is.
[10:41:45] <jepler> mozmck: of course, uspace traces its lineage all the way back to sim, in which it is fine to do anything at any time without regard to whether it keeps realtime performance
[10:42:05] <mozmck> makes sense
[10:42:08] <jepler> mozmck: one of these things is rtapi_print_msg, which simply called fprintf
[10:42:36] <jepler> this continued to work OK on uspace with PREEMPT-RT kernel, because PREEMPT-RT is awesome
[10:42:56] <jepler> but for RTAI LXRT and probably for Xenomai posix-skin, it is no longer okay to call fprintf from realtime code
[10:43:22] <mozmck> I see. LXRT is a posix skin, I presume?
[10:43:25] <jepler> so this new thing is an in-memory, lockless queue: rtapi_print_msg puts a message into it, and then non-realtime code takes each message out and actually prints it
[10:44:38] <jepler> but the printing part is actually in a thread of its own, so the ordering of messages changed compared to what it was before, including interleaving a message printed by halcmd with one printed by rtapi_app
[10:45:27] <jepler> so it led to testsuite failures, because we depend on e.g., the message "Note: Using POSIX non-realtime" to not interrupt right in the middle of a line of output from "halcmd show pin"
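The hand-off jepler describes can be sketched as follows. This is an analogy, not the uspace code: Python's `queue.Queue` uses locks internally, whereas the queue in the log is described as lockless, and the names `run_demo`, `rt_print_msg`, and `printer` are hypothetical stand-ins.

```python
import queue
import threading


def run_demo():
    messages = queue.Queue()  # stand-in for the in-memory message queue
    printed = []              # stand-in for the terminal / stderr

    def rt_print_msg(text):
        # "Realtime" side: only enqueue, never do I/O here.
        messages.put(text)

    def printer():
        # Non-realtime side: drain the queue and do the actual printing.
        while True:
            text = messages.get()
            if text is None:  # sentinel: no more messages
                break
            printed.append(text)

    worker = threading.Thread(target=printer)
    worker.start()
    for i in range(3):
        rt_print_msg(f"message {i}")
    messages.put(None)
    worker.join()
    return printed


if __name__ == "__main__":
    print(run_demo())
```

Messages that pass through the queue keep their relative order, but when they reach the terminal depends on when the printer thread runs. Anything another process writes to the same terminal in the meantime can land in the middle, which is the interleaving shown in the pasted output above.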
[10:46:01] <jepler> RTAI LXRT is a weird combination of things, it's partly pthread(-style) APIs and partly rtai-specific APIs
[10:46:31] <mozmck> I think I get the gist of what you're saying - thanks for taking the time to explain!
[10:47:12] <seb_kuzminsky> threads are hard
[10:47:14] <jepler> each new rtos implementation is fairly small, under 200 lines each, so I encourage you to take a look if you want to know what it required to support rtai in uspace
[10:47:28] <jepler> 138 non-blank lines in src/rtapi/uspace_rtai.cc
[10:48:20] <mozmck> I'll try and do that.
[10:48:26] <jepler> so for instance you'll see that we use pthread_create to create a thread (portable pthread API call) but then we use rt_make_hard_real_time to turn that thread into a realtime thread, and rt_task_make_periodic_relative_ns to make it execute with a specific period...
[10:49:12] <KGB-linuxcnc> Sebastian Kuzminsky 2.6 1e809f9 linuxcnc fast forward * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=1e809f9
[10:55:05] <seb_kuzminsky> i plan to merge the revert of the task-fiasco into 2.7 and release 2.7.6 today or tomorrow
[10:55:34] <seb_kuzminsky> that will stop the flow of regressions i caused, and will include hm2-eth, and whatever else got pushed to 2.7 while i was running around with my hair on fire
[10:56:00] <seb_kuzminsky> i'll try to make 2.6.13 too
[10:56:43] <seb_kuzminsky> then i'll be gone for two weeks, until mid-august sometime, camping without reliable wifi, so i won't be able to respond to RM-type things (or anything)
[10:56:59] <jepler> seb_kuzminsky: thanks for the warning, and thanks for putting humpty dumpty together again
[10:57:12] <jepler> and we know all your efforts are well-motivated
[10:57:24] <seb_kuzminsky> thanks
[10:57:31] <jepler> also, have a blast
[10:57:41] <seb_kuzminsky> my heart's in the right place, if only the same was true of my brain
[11:03:13] <skunkworks> seb_kuzminsky, darn - but good
[11:03:48] <seb_kuzminsky> yeah, i'm pretty bummed about that revert :-/
[11:04:03] <seb_kuzminsky> it's clearly the right answer, i can't delude myself any longer
[11:04:15] <seb_kuzminsky> i'll try to do better in master, where that kind of invasive change belongs
[11:05:48] <KGB-linuxcnc> Sebastian Kuzminsky 2.7 f016d19 linuxcnc (5 files in 4 dirs) Merge remote-tracking branch 'origin/2.6' into 2.7 * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=f016d19
[11:15:52] -linuxcnc-github:#linuxcnc-devel- [linuxcnc] jepler opened issue #127: buildbot: uspace debs don't include RTAI LXRT support yet https://github.com/LinuxCNC/linuxcnc/issues/127
[11:17:36] -linuxcnc-github:#linuxcnc-devel- [linuxcnc] jepler opened issue #128: tests, packaging: make it possible to package the tests and run them from an installed system https://github.com/LinuxCNC/linuxcnc/issues/128
[11:33:27] <mozmck> Is there any way to tell when a file has finished loading in gremlin?
[11:52:34] -linuxcnc-github:#linuxcnc-devel- [linuxcnc] joekline9 commented on issue #68: ... https://github.com/LinuxCNC/linuxcnc/issues/68#issuecomment-236226684
[11:58:30] -linuxcnc-github:#linuxcnc-devel- [wlo] jepler pushed 1 new commit to master: https://github.com/LinuxCNC/wlo/commit/3b4fb8fc95b854ad168411c35de2e3e9c8509dcb
[11:58:30] -linuxcnc-github:#linuxcnc-devel- wlo/master 3b4fb8f Jeff Epler: new showcase...
[11:58:37] <KGB-wlo> push to master branch: http://linuxcnc.org/
[12:01:49] -linuxcnc-github:#linuxcnc-devel- [linuxcnc] joekline9 commented on issue #68: As far as leadin goes. Some cnc lathes like to run in rev (M4) when threading. You then must start the thread in a groove and feed Z+. In this case leadin should be short as possible. I'd say less than 2 or 3 mm at 400 to 500 RPM. https://github.com/LinuxCNC/linuxcnc/issues/68#issuecomment-236228994
[12:13:42] <KGB-linuxcnc> Sebastian Kuzminsky master 2ebd5e7 linuxcnc Merge remote-tracking branch 'origin/2.7' * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=2ebd5e7
[12:35:52] <jepler> (shouldn't metric people have to use radians per second or something?)
[12:36:13] <seb_kuzminsky> gradians
[13:54:59] <linuxcnc-build> build #552 of 4017.5.deb-wheezy-armhf is complete: Failure [failed shell_3] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/4017.5.deb-wheezy-armhf/builds/552 blamelist: Sebastian Kuzminsky <seb@highlab.com>
[14:23:28] <linuxcnc-build> build #2786 of 1404.rip-wheezy-rtpreempt-amd64 is complete: Failure [failed runtests] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1404.rip-wheezy-rtpreempt-amd64/builds/2786 blamelist: Sebastian Kuzminsky <seb@highlab.com>
[14:39:26] <jepler> .. a blank line missing from expected output !?
[15:10:34] <linuxcnc-build> build #4441 of 0000.checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/4441 blamelist: Sebastian Kuzminsky <seb@highlab.com>
[15:40:22] <linuxcnc-build> build #2586 of 1400.rip-wheezy-i386 is complete: Failure [failed compile runtests] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1400.rip-wheezy-i386/builds/2586 blamelist: Sebastian Kuzminsky <seb@highlab.com>
[15:44:24] <jepler> and now a different rando failure
[15:44:34] <jepler> that's about messages printed by a free-running component afaict
[15:48:14] <linuxcnc-build> build #4442 of 0000.checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/4442 blamelist: Sebastian Kuzminsky <seb@highlab.com>
[17:03:39] <seb_kuzminsky> jepler: the missing blank line i think indicates that an mdi call to the M100 shell script got lost
[17:04:38] <seb_kuzminsky> the t0 tests run a ton of MDI to shell scripts to inspect interpreter numbered parameters
[17:04:56] <seb_kuzminsky> ... via linuxcncrsh, fed by a shell script | nc
[17:05:08] <seb_kuzminsky> not how i would write that test today...
[17:07:22] <seb_kuzminsky> the first failure was in 2.7, the second one in master
[17:13:06] <seb_kuzminsky> the second failure (in master including both uspace++ and the task-revert) does a bunch of 'loadusr; show pin; unload', of a comp that printf's a bunch of debug output, but the debug output didn't show up (wtf)
[17:24:21] <mozmck> I'm trying to make my own copy of HAL_Gremlin for some modification (yuck I know), and it is telling me my class has no attribute 'lathe_option'
[17:25:31] <mozmck> Thing is I basically just copied the hal_gremlin file to my own directory and changed the name. lathe_option is a member of gremlin which my class is derived from. After telling me that it goes ahead and runs and all seems to work?
[17:26:25] <mozmck> I'm not sure how to troubleshoot it. I commented out the code that used it and it then complained about another variable.
[18:11:16] <mozmck> *grumble* - I'm not sure I really like python
[18:26:18] <andypugh> mozmck: I know what you mean. Lots of really useful and easy to use libraries, but it’s hard to know if it is doing anything like what you really want.
[18:31:18] <jthornton> I like python but hate convoluted libraries
[18:33:36] <seb_kuzminsky> python is one of those languages where you really have to read the docs and understand the design, in order to make sense of its behaviors
[18:33:44] <mozmck> Well, the python in linuxcnc is a whole bunch of convoluted libraries ;-)
[18:33:55] <seb_kuzminsky> yeah :-/
[18:34:04] <mozmck> seb_kuzminsky: yeah, I need to do that more.
[18:34:20] <seb_kuzminsky> that reminds me, somewhere i have a branch that adds a bunch of docstrings to the linuxcnc.so python module, i should dig that up...
[18:35:00] <jthornton> I've not figured out how to read the python docs yet, they just don't make sense. All auto generated from the code and in greek or something
[18:35:25] <mozmck> This problem is almost like it takes too long to start up the gremlin and it thinks it doesn't have member variables it really has! After it is running it works!
[20:32:42] <mozmck> Well, I was not able to subclass HAL_Gremlin either because I got the same errors. Not sure but I think it has to do with loading it after the GUI is loaded instead of it being loaded from the xml file?
[20:33:57] <mozmck> I did figure out how to do what I wanted with two lines added to HAL_Gremlin though.
[20:35:40] <mozmck> When opening a file I save the block delete state, set it to true, then when the signal is emitted that gremlin has the file loaded I set block delete back to its previous state. Kind of a hack for now but it gets the tool size reasonable for some of the large files with lots of G38.2 followed by G92.
[21:25:58] <jepler> that's terrible but I'm happy it works for you.
[21:26:11] <mozmck> yeah, I feel kinda the same.
[21:27:21] <mozmck> I was not intending to do it this way, but I wrote a little filter program to add / to all the G38.2 and G92 lines for running in a sim config, then figured out block delete would help the display as well.
[21:32:06] <mozmck> It would be really nice if there could be a way to give the backplot a list of codes to ignore. G38.2, G38.3, G92, M*
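The filter program mozmck mentions is not shown in the log. A minimal sketch of that kind of filter, assuming one G-code block per line and ignoring comments, might look like the following. The function name `add_block_delete` and the regular expression are assumptions for illustration, and only G38.2 and G92 are handled, as in the description.

```python
import re

# Matches G38.2 or a bare G92 (not G92.1, G92.2, ...). Case-insensitive.
# The lookbehind avoids matching inside a longer word.
CODES_TO_SKIP = re.compile(r"(?<![A-Za-z])G(38\.2|92)(?![\d.])", re.IGNORECASE)


def add_block_delete(lines):
    """Prefix lines containing G38.2 or G92 with the block delete
    character '/', leaving lines that already start with '/' alone."""
    out = []
    for line in lines:
        stripped = line.lstrip()
        if CODES_TO_SKIP.search(stripped) and not stripped.startswith("/"):
            out.append("/" + line)
        else:
            out.append(line)
    return out


if __name__ == "__main__":
    program = ["G0 X0", "G38.2 Z-10 F100", "G92 Z0", "G1 X5 F200"]
    print("\n".join(add_block_delete(program)))
```

With block delete switched on, the prefixed lines are skipped, which is how the same marking ends up helping the preview as well as the sim run.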