#linuxcnc-devel Logs

May 21 2020

08:47 AM Centurion-Dan2 is now known as Centurion_Dan
09:25 AM pcw_home: Ohh thats a fairly nasty bug in master...
09:25 AM skunkworks: pcw_home: nasty bug?
09:26 AM pcw_home: on the user list: Horváth Csaba
09:28 AM skunkworks: huh. Odd.
09:42 AM pcw_home: I just rebuilt master and it seems to be fixed
10:02 AM pcw_home: oops, not so fast its doing the same thing again
10:03 AM skunkworks: so you hit escape and it runs for a bit after letting up>
10:03 AM skunkworks: ?
10:04 AM pcw_home: it does what looks like a random rapid to who knows where
10:05 AM skunkworks: oh
10:05 AM skunkworks: yeck
10:47 AM pcw_home: Yay Fedex "found" our $18000 package after about a Month of phone calls
10:52 AM skunkworks: wow - that must be a load off.
10:54 AM pcw_home: It is, I really did not relish the idea of rebuilding those cards
11:17 AM mozmck: pcw_home: at least they found them! We lost more than that with UPS and they never found nor would their insurance cover it.
11:18 AM mozmck: That run after hitting escape sounds similar to a problem we had when using the feature to run a subroutine on ABORT
11:23 AM mozmck: https://github.com/LinuxCNC/linuxcnc/issues/579
11:23 AM skunkworks: mozmck: I thought that too - but I thought it was related to subs.. or gcode remap?
11:24 AM mozmck: Hi skunkworks, yes, using the ON_ABORT_COMMAND.
11:24 AM mozmck: Where is the thread on the current issue?
11:26 AM skunkworks: it is on the mailing list
11:27 AM mozmck: Ah, I see it.
11:29 AM mozmck: Sounds like the same kind of behavior. The problem with ON_ABORT_COMMAND was the queue not being cleared properly, so it would actually jump ahead to where it was reading ahead and run the last line from readahead.
01:09 PM pcw_home: I got another error with esc (on the LinuxCNC splash code): "Unknown oword number Position Relative Actual"
01:12 PM pcw_home: emc/task/emctask.cc 69: interp_error: Unknown oword number
01:40 PM rmu|w: I get expected behaviour on current build
01:47 PM pcw_home: I don't, I still get random moves some times with master
01:47 PM pcw_home: (when you hit esc)
01:48 PM pcw_home: and a 1.2" negative Z motion which would cause pain and suffering on a real mill
01:50 PM pcw_home: and this : emc/task/emctask.cc 69: interp_error: Unknown oword number on the LinuxCNC splash screen occasionally on hitting esc
01:55 PM rmu|w: i'm on 8756d78d648d77c68952d1737f632d2eaea83cae on the rpi
01:55 PM rmu|w: will test tomorrow with the "real" machines
01:56 PM rmu|w: pcw_home: what user interface do you use? axis?
01:56 PM pcw_home: Yes this is axis
02:05 PM pcw_home: http://freeby.mesanet.com/crazy_motion.png
02:06 PM pcw_home: that diagonal motion was dos after esc was pressed...
02:06 PM pcw_home: done
02:18 PM rmu|w: i increased feedrate and now it indeed seems to do a rapid after pressing esc
02:18 PM pcw_home: it seems to be random
02:19 PM pcw_home: sometimes its OK, sometimes I get the oword complaint, sometimes I get random motion
02:20 PM pcw_home: (note the last z position on the gcode and the DRO in the .png...)
02:22 PM pcw_home: (even with less than 100% feedrate)
02:30 PM rmu|w: very strange. with the torus test, i got a diagonal move over the whole square and a realtime delay after that
02:32 PM rmu|w: something is very wrong
02:32 PM pcw_home: yeah, a loose wire somewhere
02:36 PM rmu|w: i can only trigger it while doing a downward z move, then it cancels the move, display freezes for 2-3s, and it feeds to the same position (233.4/190.0/0)
02:37 PM rmu|w: most of the time
02:37 PM pcw_home: I could not cause any similar behaviour in 2.8
02:40 PM pcw_home: Since I am only machining imaginary materials I don't think I have ever hit esc before
02:40 PM rmu|w: even the DRO tab in axis freezes for some seconds
02:40 PM rmu|w: i hit it all the time on a real machine ;)
02:42 PM pcw_home: I dont see any delay (but this is running on a fairly fast machine)
02:44 PM rmu|w: 120000 line torus on rpi4
02:44 PM pcw_home: The other thing I see is that sometimes the feedrate gets set to 0 on esc (and no crazy move)
02:45 PM skunkworks: git bisect!
02:46 PM pcw_home: ahh, just got a following error on the crazy move
02:52 PM pcw_home: Yeah git bisect, I guess start at 2.9 and 2.8
02:52 PM rmu|w: branch point of 2.8 from master
02:56 PM pcw_home: right, how would I find that?
02:57 PM rmu|w: no idea
02:57 PM rmu|w: i'm puzzled by g92, g92z40 sets z to -1056something
02:58 PM rmu|w: ah, inches
02:59 PM rmu|w: https://stackoverflow.com/questions/1527234/finding-a-branch-point-with-git
02:59 PM rmu|w: diff -u <(git rev-list --first-parent 2.8 ) \
02:59 PM rmu|w: <(git rev-list --first-parent master) | \
02:59 PM rmu|w: sed -ne 's/^ //p' | head -1
03:01 PM rmu|w: it says 66f9da3b12bfd4e0a8095fa85fa83cdfd78d136d
03:02 PM rmu|w: but that is wrong
03:03 PM rmu|w: hmm
03:40 PM Centurion-Dan2 is now known as Centurion_Dan
04:11 PM rmu|w: pcw_home: 2.8 started with commit efc3491786730f47418a0b3023a70e1b2d1de4cd
04:13 PM pcw_home: thanks for digging that up, I may try git bisect tomorrow (time to go to my 1/2 day of work)
04:16 PM rmu|w: i'm looking at task issue debug output and trying to find something
04:16 PM rmu|w: bisecting on the pi doesnt make much sense
04:17 PM pcw_home: no, that would be pretty painful
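
[Editor's note] For readers following the bisect idea above, here is a minimal sketch of how such a run could be scripted. The helper name `bisect_between` and the idea of an automated test command are assumptions for illustration only; in the log the Esc problem was triggered by hand and only intermittently, so a real bisect would likely be answered manually. `git bisect start` takes the bad revision first and then the good one, and if the good revision is not an ancestor of the bad one, git tests their merge base first, so an exact branch point is not strictly required.

```shell
# Hypothetical helper, not part of LinuxCNC: bisect between two revisions.
# Usage: bisect_between <good-rev> <bad-rev> <test-command> [args...]
# The test command must exit 0 for a good tree and 1 for a bad one
# (git treats exit code 125 as "skip this commit").
bisect_between() {
    good=$1
    bad=$2
    shift 2
    # git bisect start takes the bad revision first, then the good one.
    git bisect start "$bad" "$good" >&2 || return 1
    git bisect run "$@" >&2
    status=$?
    # While the bisect session is still open, refs/bisect/bad names the
    # first bad commit that was found.
    [ "$status" -eq 0 ] && git rev-parse refs/bisect/bad
    git bisect reset >&2
    return "$status"
}
```

Without a scripted test, the same session can be driven by hand: `git bisect start master 2.8`, rebuild and try Esc several times at each step, answer with `git bisect good` or `git bisect bad`, and finish with `git bisect reset`.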
04:24 PM rmu|w: it is issuing EMC_TASK_ABORT, followed by EMC_TASK_PLAN_SYNCH, followed by EMC_TRAJ_PLAN_LINEAR_MOVE, that seems to be the crazy move
04:25 PM rmu|w: followed by EMC_TRAJ_SET_TERM_CONDITION and EMC_TRAJ_SET_TELEOP_ENABLE
04:32 PM Centurion-Dan2 is now known as Centurion_Dan
04:37 PM pcw_home: I wonder if that gets wallpapered over by setting the feedrate to 0
04:51 PM rmu|w: motion id seems to make a jump
04:51 PM rmu|w: if it doesn't do the crazy move, the EMC_TRAJ_PLAN_LINEAR_MOVE after EMC_TASK_PLAN_SYNCH is not there
04:57 PM rmu|w: motion id corresponds to line number, so in my case, the interp jumps to ca. +1200 lines in the g-code, and executes the moves there
04:57 PM rmu|w: coordinates where machine comes to stop match those moves
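
[Editor's note] The message sequence described above (abort, then plan synch, then a linear move) can be searched for mechanically. The following is a rough sketch that assumes the task debug output has been saved to a text file with one message name per line; the function name `find_move_after_abort` and that file layout are assumptions, not something the log confirms. The message names are taken from the chat as written, and the pattern also accepts `EMC_TRAJ_LINEAR_MOVE` in case the output spells it without `PLAN_`.

```shell
# Hypothetical helper: read task debug output on stdin and print every
# linear move that directly follows EMC_TASK_ABORT and EMC_TASK_PLAN_SYNCH.
# Lines without an EMC_ message name are ignored and do not break the sequence.
# Exit status is 0 if at least one such move was found, 1 otherwise.
find_move_after_abort() {
    awk '
        /EMC_TASK_ABORT/ { state = 1; next }
        state == 1 && /EMC_TASK_PLAN_SYNCH/ { state = 2; next }
        state == 2 && /EMC_TRAJ_(PLAN_)?LINEAR_MOVE/ {
            print NR ": " $0
            found = 1
        }
        # Any other EMC_ message ends the abort sequence being tracked.
        /EMC_[A-Z_]+/ { state = 0 }
        END { exit (found ? 0 : 1) }
    '
}
```

A possible use would be `find_move_after_abort < task-debug.log`, where `task-debug.log` is a hypothetical capture file; each hit is printed with its line number so it can be compared against the motion id jump mentioned above.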
05:04 PM rmu|w: this line https://github.com/LinuxCNC/linuxcnc/blob/master/src/emc/task/emctaskmain.cc#L2592 is suspicious, probably something needs to be done differently for the case interpState = EMC_TASK_INTERP_IDLE
05:04 PM rmu|w: but this stuff is 15 years old
07:31 PM cerna: When I want to look if push to master by some SHA passed all tests, all I need to check is the 0000.checkin? Find it in http://buildbot.linuxcnc.org/buildbot/json/builders/0000.checkin/builds/_all?as_text=1 - if it is not there, then it is too old?
08:42 PM mozmck: rmu|w: that's exactly the behavior we saw with the move after abort if using the ON_ABORT_COMMAND option. Sounds like this is doing it without that option?
09:20 PM jepler: cerna: not quite, because failures in packaging don't reflect on the status of 0000.checkin
09:37 PM Tom_L: jepler, i did a successful 2.8 build on the rpi. your img file seems quite good
09:37 PM jepler: nice!
09:37 PM Tom_L: after dependency hell of course :)
09:38 PM Tom_L: built a deb and installed and tested 2.8
09:39 PM Tom_L: then i found out buildbot built them already
09:39 PM seb_kuzminsky: dependencies shouldn't be hellish, http://linuxcnc.org/docs/2.8/html/code/building-linuxcnc.html#Satisfying-Build-Dependencies
09:39 PM Tom_L: no it wasn't just alot of them
09:40 PM Tom_L: but the procedure works good on it
09:40 PM Tom_L: thanks all of you for the efforts!
09:54 PM seb_kuzminsky: it's actually wrong that the rpi4 debs are where they are on wlo, they shouldn't be comingled with the debian packages, raspbian is a different distribution with different architecture-specific compile flags
09:56 PM Tom_L: i wondered if it was a buster takeoff or something of it's own
09:57 PM seb_kuzminsky: it's the buster source debs, built with different compiler flags