#linuxcnc-devel Logs
Mar 08 2023
03:25 AM linuxcnc-build2: Build [#566](http://buildbot2.highlab.com/buildbot/#builders/11/builds/566) of `10-rip.debian-12-bookworm-rtpreempt-amd64` failed.
03:45 AM linuxcnc-build2: Build [#567](http://buildbot2.highlab.com/buildbot/#builders/11/builds/567) of `10-rip.debian-12-bookworm-rtpreempt-amd64` completed with warnings.
05:20 AM -!- #linuxcnc-devel mode set to +v by ChanServ
08:38 AM seb_kuzminsky: CaptHindsight[m]: thanks for the link, that looks interesting
08:57 AM -!- #linuxcnc-devel mode set to +v by ChanServ
11:31 AM CaptHindsight[m]: seb_kuzminsky: https://forum.linuxcnc.org/27-driver-boards/46770-driver-firmware-pcb-for-pi-rp2040-pio-i-e-an-easy-to-configure-fpga-like-card but no code was ever posted so Remora would still need to be ported to the 2040
12:05 PM -!- #linuxcnc-devel mode set to +v by ChanServ
02:09 PM cradek: https://paste.debian.net/hidden/8e119141/
02:10 PM cradek: This is the same failure to start I asked about a while back. After several retries it will always start, but it fails maybe half the time. I did eventually get it to run under strace (which is hard because of the way it switches between UIDs) but under strace it always starts successfully.
02:11 PM cradek: is there an obvious next step?
02:13 PM cradek: this is the busywait6 branch rebased onto a newer 2.9 (I don't think that's important but I can sure push it to github if someone wants to see it)
02:16 PM pcw--home: "board fails HM2 registration" without a more specific complaint is not something I have ever seen
02:17 PM pcw--home: Sounds like maybe dropped data during discovery
02:17 PM cradek: fwiw here's the exact code I'm running: https://github.com/cradek/linuxcnc/tree/busywait6
02:18 PM pcw--home: Can you try with stock LinuxCNC?
02:18 PM cradek: ifconfig doesn't show anything dropped (I don't know if that matters) https://paste.debian.net/hidden/8d666fcd/
02:20 PM cradek: unfortunately I don't think I can run it without this: https://github.com/cradek/linuxcnc/commit/68c73f29e4c3863a65b7cde8f6492065dbb178ad
02:21 PM pcw--home: inconsistent "failed registration" really points to missing or corrupted IDROM data
02:22 PM cradek: should I read the IDROM a bunch of times with mesaflash?
02:26 PM cradek: `mesaflash --readhmid --addr 10.0.0.2 --device 7i80hd-16` worked dozens of times in a row
02:28 PM pcw--home: Yeah if that works (and the md5 of the result is the same) that would tend to rule out a flaky 7I80
02:29 PM cradek: oh I didn't check the result, I just looked for errors. one sec
02:29 PM pcw--home: (though the driver may not read the IDROM the same way)
02:30 PM cradek: yeah, identical output every time
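[Editor's sketch, not part of the log: the check discussed above (run the same read many times and compare the md5 of each result) could be automated roughly as follows. The helper name `distinct_output_digests` is hypothetical. The `mesaflash` arguments in the commented usage are copied from cradek's 02:26 PM message and apply to his setup only.]

```python
import hashlib
import subprocess
import sys


def distinct_output_digests(cmd, runs):
    """Run cmd `runs` times and return the set of MD5 digests of its stdout.

    A set with exactly one element means every run produced identical output.
    check=True raises CalledProcessError if any run exits non-zero.
    """
    digests = set()
    for _ in range(runs):
        result = subprocess.run(cmd, capture_output=True, check=True)
        digests.add(hashlib.md5(result.stdout).hexdigest())
    return digests


# Example usage (assumes mesaflash is installed and the board is reachable):
# cmd = ["mesaflash", "--readhmid", "--addr", "10.0.0.2", "--device", "7i80hd-16"]
# digests = distinct_output_digests(cmd, 50)
# print(f"{len(digests)} distinct output(s) in 50 runs")
```

[As pcw--home notes at 02:29 PM, identical output here only tends to rule out a flaky board; the driver may read the IDROM differently from mesaflash.]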
02:31 PM pcw--home: This is the very first time I have heard of this kind of error
02:31 PM cradek: thanks, that's good to know
02:32 PM pcw--home: so that's why I am a bit suspicious of the branch you are running
02:32 PM cradek: I will have to dig then
02:32 PM cradek: yeah that's definitely reasonable
02:32 PM cradek: (but once it starts, it runs perfectly)
02:32 PM cradek: I will see what it takes to get it to run on exactly 2.9
02:33 PM cradek: I see there's been more work on busywait6 since my snapshot too
02:33 PM pcw--home: If it helps I will reserve a 7I80HDT for a board swap check
02:34 PM pcw--home: (Can't make plain 7I80s anymore)
02:34 PM pcw--home: maybe in a year
02:36 PM cradek: I appreciate that
02:36 PM cradek: I'll see what I can figure out here in the next couple days and let you know
02:38 PM pcw--home: If I get a chance (maybe next week) I could try the busywait branch also
02:39 PM cradek: it makes the latency *very* good on my hardware. but before you bother maybe we should see if seb thinks it's time for wider testing yet
02:40 PM seb_kuzminsky: can you turn up the debug level and get it to fail? there might be a separate debug level for hm2...
02:41 PM cradek: let me see if I can figure out how
02:42 PM cradek: looks like it just uses rtapi_print_msg
02:44 PM seb_kuzminsky: ok good
02:47 PM cradek: https://paste.debian.net/hidden/cefeee00/
02:53 PM pcw--home: Dispatch latency (what the latency test measures) is usually not the issue with Ethernet, it's network latency
02:55 PM pcw--home: sudo chrt 99 ping -i .001 -q 10.10.10.10 (run for an hour or so) will give you an idea of network latency
02:57 PM cradek: ok, I'll run it while I go get some lunch
03:39 PM cradek: --- 10.0.0.2 ping statistics ---
03:39 PM cradek: 2522497 packets transmitted, 2522496 received, 3.96433e-05% packet loss, time 2525021ms
03:39 PM cradek: rtt min/avg/max/mdev = 0.069/0.089/0.163/0.003 ms
03:43 PM seb_kuzminsky: 163 µs doesn't seem bad for a round trip but what do i know
03:54 PM cradek: when it fails, it just stops after [pastebin] line 43: https://paste.debian.net/hidden/df9c2cab/
04:05 PM JT-Shop_ is now known as JT-Shop
04:43 PM cradek: argh, I added a bunch of HM2_DBG to narrow it down, and it disappeared
04:43 PM cradek: ... so I'm running memtest now
04:57 PM cradek: ... which made it through a pass with no problems
04:58 PM Tom_L: on older machines one of the first things i do is reseat all the cards/memory
05:14 PM seb_kuzminsky: this is a near-new raspberry pi 400 iirc
05:19 PM seb_kuzminsky: cradek: annoying that the driver doesn't say anything helpful when it falls over :-(
05:24 PM Tom_L: ahh
05:31 PM cradek: No, it's an old amd64
06:17 PM seb_kuzminsky: oops, ok
06:57 PM rene-dev56 is now known as rene-dev5
08:14 PM pcw--home: 163 usec is really good
08:26 PM cradek: I guess next I'll remove my debug prints and see if it breaks again, and if it does, and it probably will, ... that sucks