#linuxcnc-devel | Logs for 2015-08-08

[08:42:03] <jepler> cradek: the difference is that "sequence number 3.000000" is not printed in random-without-t0 ?
[08:42:24] <jepler> .. on precise, only?
[08:43:14] <jepler> and on 2.6 branch only?
[09:14:48] <cradek> there's a later one missing a different one
[09:36:44] <jepler> 57 test iterations on wheezy/uspace/vanilla and no failure
[09:36:53] <jepler> frustratingly, it's a ~1-minute test
[09:40:18] <KGB-linuxcnc> Sebastian Kuzminsky 2.7 f21902a linuxcnc Merge remote-tracking branch 'origin/hy-vfd' into 2.7 * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=f21902a
[09:40:51] <jepler> I'll look at it more this afternoon
[09:40:56] <jepler> seb_kuzminsky: yay
[10:44:02] <pcw_home> Hmm I wouldn't jog a 25 M/M machine with a USB keyboard...
[10:47:09] <pcw_home> At least unless there was a working hal watchdog for USB interface aliveness
[10:47:50] <mozmck> Unfortunately, that's about the only reasonable option for most machines our customers build.
[10:48:32] <mozmck> and it has never been a problem in mach3 ;)
[10:53:56] <mozmck> A wheel where you have to select the axis you want to jog would slow things down a good bit during setup.
[11:04:07] <pcw_home> A joystick might be better but a KB continuous jog can go bad if the USB system decides to take a hike in the middle of the jog
[11:05:38] <mozmck> That's true. I would tend to agree with emcPT though, that probably 95% of linuxcnc machines use keyboard jogging.
[11:06:10] <archivist> a lot of newer boxes are usb keyboard too
[11:06:30] <mozmck> I wonder how one could make a watchdog component? Almost seems like you would have to write a driver that sat under the keyboard driver?
[11:07:29] <mozmck> Or maybe there is a way to intercept traffic and watch it. I don't know if the USB traffic would actually stop in a case like this or not though?
[11:09:00] <pcw_home> seems like the low level USB driver would have some kind of connection status info
[11:11:34] <mozmck> That may be. I'll put it on my list to look at.
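The watchdog idea discussed above can be sketched independently of USB or HAL. The following is only a minimal illustration of the timeout logic, assuming some (hypothetical, not identified in this discussion) source of "device is alive" events calls `beat()`; the class name, the timeout value and the injected clock are all assumptions made for the example, not an existing LinuxCNC component.

```python
import time


class HeartbeatWatchdog:
    """Reports unhealthy when no heartbeat arrived within timeout_s seconds."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock            # injectable so the logic can be tested
        self.last_beat = clock()      # treat construction time as the first beat

    def beat(self):
        # Called by whatever evidence of liveness is available (hypothetical).
        self.last_beat = self.clock()

    def ok(self):
        # True while the most recent beat is recent enough.
        return (self.clock() - self.last_beat) <= self.timeout_s


# Demonstration with a fake clock so no real waiting is needed.
now = [0.0]
wd = HeartbeatWatchdog(timeout_s=0.5, clock=lambda: now[0])
now[0] = 0.4
print(wd.ok())    # True: 0.4 s since the last beat
now[0] = 1.0
print(wd.ok())    # False: 1.0 s since the last beat, jog should be stopped
```

In a real setup the `ok()` result would presumably gate jogging; how to obtain trustworthy liveness information from the USB stack is exactly the open question in the conversation.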
[11:12:57] <pcw_home> Ha! the specification for maximum response time to setup packets is 5 seconds
[11:25:20] <jthornton> by golly I made them expand all... now contract all should be a piece of pie
[12:03:35] <jepler> The message "sequence number 6.000000" is printed by the M100 script in tests/t0/subs/M100
[12:03:56] <jepler> so the test behavior says there's now some way that a M100 invocation can be lost
[12:04:56] <jepler> I should probably run this test in a system I'm artificially putting heavy loading on, as I did to finally get the smoking gun for the timed-out changing modes test...
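A lost M100 invocation shows up as a gap in the "sequence number N.000000" lines. As a rough sketch of how such gaps could be spotted in captured test output: the message format is taken from the lines quoted in this log, while the function name, the sample text and the expected range are assumptions for illustration only.

```python
import re

# Matches lines such as "sequence number 3.000000" and captures the integer part.
SEQ_RE = re.compile(r"sequence number (\d+)\.\d+")


def missing_sequence_numbers(output, expected):
    """Return the expected sequence numbers that never appear in output."""
    seen = {int(m.group(1)) for m in SEQ_RE.finditer(output)}
    return [n for n in expected if n not in seen]


sample = (
    "sequence number 1.000000\n"
    "sequence number 2.000000\n"
    "sequence number 4.000000\n"
)
print(missing_sequence_numbers(sample, range(1, 5)))  # [3]
```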
[12:06:44] <jepler> > A fatal flaw in the system is that people can easily pose as real funeral directors
[12:08:54] <Tom_itx> jthornton, yay!
[12:16:15] <cradek> jepler: crap, that's not good
[12:36:35] <cradek> emcSystemCmd: abandoning process %d, running ``%s''\n
[12:36:45] <cradek> jepler: if you turn on TASK_ISSUE debug you might get a diagnostic
[12:36:51] <cradek> there are several of them
[12:38:42] <pcw_home> Is there some way for HAL to know if the GUI is healthy, like a GUI heartbeat or some such?
[12:39:04] <cradek> nope
[12:41:38] <pcw_home> I noticed Axis exits if you close the controlling terminal but HAL cheerfully continues
[12:42:15] <cradek> AXIS exiting should cause all of linuxcnc to exit; that's the usual way to shut down
[12:43:47] <pcw_home> at least on my system (2.7 uspace under Ubuntu 14.04) that does not seem to be the case
[12:44:53] <cradek> are you sure AXIS is exiting?
[12:46:28] <pcw_home> not visible on screen or via top,ps
[12:47:31] <pcw_home> only piece left running is rtapi_app
[12:50:25] <cradek> that doesn't seem right :-/
[12:51:01] <pcw_home> it's entertaining
[12:51:02] <pcw_home> halcmd still works fine
[12:55:46] <cradek> jepler: stupidly, because they are errors, not debug messages
[13:22:16] <JT-Shop> Tom_itx, I got the Expand All / Collapse All figured out
[13:26:42] <Tom_itx> yippee!
[13:26:54] <mozmck> pcw_home: 5 second max response time! But you shouldn't be getting setup packets - unless the device gets reset I guess?
[13:28:26] <pcw_home> Right, but it can happen with a bad enough noise issue (re-enumeration)
[13:29:56] <mozmck> Yes, I see. I'm a bit familiar with noise issues.
[13:31:03] <mozmck> Thinking of that, do you know where I can get bulk usb cables with ferrites on them? I use printer type cables - A to B
[13:31:22] <jepler> cradek: hm ok
[13:31:40] <jepler> on iteration 8 of the stressed system, I had 4 consecutive lines missing
[13:32:26] <jepler> so at least I can reproduce it..
[13:33:14] <jepler> turning on all debug and running again
[13:33:56] <pcw_home> Not off hand
[13:33:57] <pcw_home> We get our parallel cables from monoprice
[13:36:06] <pcw_home> I guess you could also use a big enough bead to slip over the B connector end
[13:36:19] <pcw_home> (of a standard cable)
[13:42:30] <pcw_home> or use an ADuM3160 :-)
[13:44:38] <jepler> nothing too informative in debug
[13:45:08] <jepler> reverting the sequence number fix and running the test again. I may not know what's going on yet, but I can find out whether I just introduced it
[13:45:24] <jepler> aaaannnddd with git revert 301cc84f8727cba3cca3323e04deb9dccaa261a8 it reproduced
[13:45:33] <cradek> amazing
[13:46:19] <cradek> the two commits were about sequence numbers. the new error is missing sequence number things. and ... it's a coincidence
[13:46:36] <cradek> I will never understand software
[13:46:58] <jepler> let me reset back to 2.6.8 and test again
[13:48:28] <KGB-linuxcnc> John Thornton 2.7 e3f81f8 linuxcnc docs/src/index.tmpl Docs: add expand/collapse all button * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=e3f81f8
[13:49:10] <jepler> failures at 2.6.8 as well
[13:49:44] <jepler> test procedure: in one terminal, repeatedly 'make && make clean' a huge software project to generate lots of memory pressure & disk access; in another terminal, repeatedly run the test in question
[13:49:53] <jepler> it probably helps to have a hard disk drive and not ssd
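The "repeatedly run the test in question" half of that procedure could be scripted roughly as follows. This is a sketch only: the command passed in is a placeholder (the actual test invocation is not given in this log), and the background load from 'make && make clean' still has to be started separately in another terminal.

```python
import subprocess
import sys


def count_failures(cmd, iterations):
    """Run cmd repeatedly and count how many runs exit with a nonzero status."""
    failures = 0
    for _ in range(iterations):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Placeholder command that always succeeds; substitute the real test runner.
always_ok = [sys.executable, "-c", "pass"]
print(count_failures(always_ok, 3))  # 0
```

Counting failures over many iterations matters here because the failure is intermittent: 57 clean runs were reported earlier on an unloaded machine, while the loaded one failed by iteration 8.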
[13:49:53] <jepler> afk
[14:25:20] <skunkworks> so are these errors popping up because the build bot is getting overloaded?
[14:27:26] <cradek> maybe, but they're still bugs
[14:35:33] <skunkworks> right
[14:36:13] <skunkworks> just seems more and more odd bugs are showing up related to timing
[14:38:53] <skunkworks> pcw_home: we order a lot of cables from monoprice. They are so inexpensive for what you get.
[14:40:13] <skunkworks> uh oh.. http://www.cnczone.com/forums/novakon-systems/278468-tormach.html
[14:44:31] <pcw_home> Yeah they are inexpensive but still good quality (we tore the parallel cables apart to check construction)
[14:51:33] <skunkworks> jepler: your patch fixed the vismach problem here also
[14:56:13] <jepler> skunkworks: thanks for testing
[15:01:33] <pcw_home> multiple hm2_eth cards up for about a week no issues (I forget exactly when I started)
[15:34:33] <seb_kuzminsky> jepler: please push the vismach fix, thanks
[15:35:00] <seb_kuzminsky> the missing mdi output has been failing on 2.6 and later for a long time, though the frequency seems to have gone up lately
[15:35:13] <seb_kuzminsky> i can repro it here too, running the test in a constrained vm over and over
[15:42:07] <KGB-linuxcnc> John Thornton 2.7 e8ee08c linuxcnc docs/src/index.tmpl docs/src/index_es.tmpl Docs: add expand/collapse all button * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=e8ee08c
[15:42:08] <KGB-linuxcnc> John Thornton 2.7 8adceba linuxcnc docs/src/config/stepper.txt Docs: update to match current locations of sample files * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=8adceba
[15:42:08] <KGB-linuxcnc> John Thornton 2.7 e5ebed2 linuxcnc docs/src/getting-started/running-linuxcnc.txt Docs: move anchor to correct place * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=e5ebed2
[15:42:26] <jthornton> the list is getting real short now
[15:45:18] <jepler> OK, I am starting to have a handle on the t0 test failure
[15:45:54] <jepler> I started from a position of not trusting linuxcncrsh, so I wrote my own using python linuxcnc module as the ui
[15:46:10] <jepler> first I just had it M100, 10000 times; it succeeded twice in a row in getting all 10000 lines of output
[15:46:46] <jepler> then I added 't1m6' to it. task repeatedly logs 'Requested tool 1 not found in the tool table' and only a few lines are logged to output
[15:47:31] <jepler> so I suspect that it has something to do with queued mdi behavior after an abort
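jepler's script itself is not shown in this log. A driver of that general shape, using the Python `linuxcnc` module as the UI, might look something like the sketch below. The `mode`, `mdi` and `wait_complete` calls are part of that module's `command` object; the "M100 P<n>" line format, the helper names and the choice of whether to wait between commands are assumptions for illustration.

```python
def send_mdi_lines(command, lines, wait_each=True):
    """Send each MDI line; optionally wait for completion before the next one."""
    for line in lines:
        command.mdi(line)
        if wait_each:
            command.wait_complete()


def run_against_linuxcnc(count):
    # Imported here because the module only exists where LinuxCNC is installed.
    import linuxcnc

    c = linuxcnc.command()
    c.mode(linuxcnc.MODE_MDI)
    c.wait_complete()
    # Assumed line format: the P word carries the sequence number to M100.
    send_mdi_lines(c, ["M100 P%d" % n for n in range(count)])


class RecordingCommand:
    """Stand-in for linuxcnc.command that only records what was sent."""

    def __init__(self):
        self.sent = []

    def mdi(self, line):
        self.sent.append(line)

    def wait_complete(self):
        pass


fake = RecordingCommand()
send_mdi_lines(fake, ["M100 P0", "T1 M6", "M100 P1"])
print(fake.sent)  # ['M100 P0', 'T1 M6', 'M100 P1']
```

Sending with `wait_each=False` is probably closer to the situation that exposes the problem, since queued MDI commands are what the suspicion above is about.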
[15:48:23] <seb_kuzminsky> sweet
[15:48:36] <seb_kuzminsky> i too have concluded it's task, not linuxcncrsh, that's misbehaving
[15:49:30] <jepler> iiuc we now have a queue between UI and task of NML commands that have been sent but not processed
[15:49:49] <jepler> so we have task talking to io to find out the result of whether the tool is there
[15:49:59] <jepler> in the meantime it can read some variable number of MDIs out of the NML command queue
[15:50:31] <jepler> when io says "no tool 1", task aborts (I'm assuming; I didn't look) and discards the MDIs it had read off the command queue, but not the ones that remain in the command queue
[15:50:56] <jepler> so the test needs to wait for the abort that follows a failed T-, if it is intentionally sending failing T- calls
[15:51:10] <jepler> or something
[15:51:32] <jepler> hm but 2.6 doesn't have the queue
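The hypothesis sketched in the preceding messages can be illustrated with a toy model. This is not how task is implemented (jepler says he had not looked, and notes that 2.6 lacks the queue); it only shows how "discard what was already read, keep what is still queued" would make a variable number of M100 lines vanish. The function name, the `prefetch` parameter and the rule that any command starting with "T" fails are all assumptions of the model.

```python
from collections import deque


def run_task(commands, prefetch):
    """Toy model: a failing tool change aborts and drops the read-ahead MDIs."""
    queue = deque(commands)
    executed = []
    while queue:
        current = queue.popleft()
        if current.startswith("T"):
            # While waiting for io's answer, task has already read up to
            # `prefetch` further commands off the queue.
            n = min(prefetch, len(queue))
            discarded = [queue.popleft() for _ in range(n)]
            # io answers "no tool 1": the abort drops `discarded`, but the
            # commands still sitting in the queue survive and run afterwards.
            continue
        executed.append(current)
    return executed


cmds = ["M100 P1", "T1 M6", "M100 P2", "M100 P3", "M100 P4"]
print(run_task(cmds, prefetch=2))  # ['M100 P1', 'M100 P4']
print(run_task(cmds, prefetch=0))  # ['M100 P1', 'M100 P2', 'M100 P3', 'M100 P4']
```

In this model the number of lost lines depends on how far task read ahead before io answered, which would fit a failure that varies with system load.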
[15:52:07] <seb_kuzminsky> 344 commits in 2.7 since 2.7.0~pre6
[15:52:08] <seb_kuzminsky> crikey
[16:09:10] <jepler> that was april? another prerelease seems in order
[16:11:42] <seb_kuzminsky> yep
[16:29:02] <KGB-linuxcnc> Jeff Epler 2.6 db71c5a linuxcnc lib/python/vismach.py vismach: work around a bug in mesa * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=db71c5a
[16:42:47] <seb_kuzminsky> thanks jepler
[16:47:14] <seb_kuzminsky> mozmck: you around?
[18:43:51] <mozmck> seb_kuzminsky: I'm around now
[19:11:57] <andypugh> There were no taxis at the station when I returned from a day of machining in London. (Fixing the fire engine). So I walked the 2.5 miles home. It was harder work than normal, I just weighed my backpack and it was 32lb. A set of Burnerd EC collets, lathe tools and inserts.
[19:21:35] <andypugh> Ah, wrong channel….
[20:01:03] <jepler> seb_kuzminsky: this commit looks weird, I didn't look whether a later commit reverted it
[20:01:06] <jepler> Subject: [Emc-commit] 2.7: Create README.md
[20:11:38] <jepler> ah OK
[20:11:38] <jepler> diff --git a/README.md b/src/hal/user_comps/huanyang-vfd/README.md
[20:11:39] <jepler> similarity index 100%
[20:11:39] <jepler> rename from README.md
[20:11:39] <jepler> rename to src/hal/user_comps/huanyang-vfd/README.md
[20:58:31] <seb_kuzminsky> jepler: git sure stinks at making renames obvious
[21:37:35] <KGB-linuxcnc> hy-vfd d4d282e linuxcnc . branch deleted * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=d4d282e
[22:24:19] <pcw_home> jthornton: around?
[22:24:20] <pcw_home> There's a weird out of place paragraph in the INI Configuration section of the docs