#linuxcnc-devel Logs

Apr 22 2020

12:08 AM seb_kuzminsky: well this is not good:
12:12 AM seb_kuzminsky: menu "Virtio drivers"
12:12 AM seb_kuzminsky: depends on !IPIPE
12:12 AM seb_kuzminsky: so this RTAI patch disables virtio
12:12 AM seb_kuzminsky: i switched the VM from virtio disk to simulated sata and it booted
04:15 AM andypugh: The kernel runs well under VMware.
05:17 AM andypugh: seb_kuzminsky: Do I need to re-generate the packages to get the file names right? Or have you re-constructed them?
05:42 AM andypugh: jepler: It seems that my dodgy “fix” for the memset worked: http://buildbot.linuxcnc.org/buildbot/builders/1635.rip-buster-rtpreempt-i386/builds/111/steps/compile/logs/warnings%20%2855%29
05:43 AM andypugh: Only docs warnings left now.
09:01 AM seb_kuzminsky: andypugh: i renamed the ones i needed, not sure about the others
09:02 AM seb_kuzminsky: andypugh: my VM sort of froze up while running runtests in a loop overnight
09:02 AM andypugh: I wonder if that is the VM or the kernel?
09:03 AM andypugh: My test PC has up-times of days, but typically not running realtime.
09:08 AM andypugh: Thinking about it, I have occasionally seen a lockup during runtests. I wonder if loading/unloading realtime is slightly squiffy?
09:09 AM andypugh: (I have definitely had LinuxCNC live and running for days)
09:26 AM rene_dev_: andypugh the test failure sounds a bit like undefined behavior. I will run the clang analyzer again, and see if it finds something
09:27 AM jepler: The other theory that occurred to me about the tool errors is, it seemed to report that an earlier loaded tool number was still loaded. I don't have a grasp anymore of how data flows among task, tool, and UI; but could it be that due to the enlarged buffers for tool data, things are taking more time than expected to get pushed around?
09:28 AM andypugh: It does feel like a possible race condition
09:28 AM jepler: so more like a race condition among these programs that run independently, than undefined behavior that tools like clang, ubsan, and valgrind would find
09:28 AM andypugh: It might even only show up in runtests, where the tool number is polled immediately.
09:29 AM jepler: and the buildbot tests run on a pretty heavily loaded machine, since all those are VMs within the same system and they're all compiling or running tests at once
09:29 AM andypugh: I can’t recall if the HAL pin is correct. If HAL is correct and userspace is momentarily not, that might not be an issue.
09:30 AM andypugh: Gene is seeing it on a standalone Pi4, I think.
09:30 AM jepler: I should look at the actual test code with this hypothesis in mind
09:30 AM jepler: instead of flapping my gums / keyboard
09:31 AM rmu|w: "undefined behaviour" should always do the same if it doesn't depend on uninitialized data or some race condition
09:31 AM jepler: oh I thought you said P4 and I was thinking, that's seriously old computing but that's Gene I guess :)
09:35 AM seb_kuzminsky: my bridgeport runs on a P4 :-)
09:35 AM seb_kuzminsky: pentium 4, single core
09:37 AM andypugh: I have finally silenced the xhc-whb04b-6 driver docs build.
09:37 AM jepler: excellent. how's the output look?
09:37 AM andypugh: Now just the “missing section” ones. Though I am puzzled.
09:37 AM andypugh: http://buildbot.linuxcnc.org/buildbot/builders/1630.rip-stretch-rtpreempt-amd64/builds/1700/steps/compile/logs/warnings%20%2818%29
09:37 AM jepler: 2.8 right?
09:38 AM andypugh: Yes.
09:38 AM seb_kuzminsky: andypugh: this "sort of locks up during realtime unload" problem is why i could never get rtai 5.0 on linux 4.4 to be stable
09:38 AM andypugh: xhc docs now online: http://linuxcnc.org/docs/2.8/html/man/man1/xhc-whb04b-6.1.html
09:38 AM seb_kuzminsky: it happens for me both on VMs and on bare metal
09:38 AM andypugh: With the new kernel?
09:41 AM seb_kuzminsky: your new rtai kernel freezes my buster VM, i haven't tried it on bare metal yet
09:41 AM seb_kuzminsky: my old rtai 5.0/linux 4.4 kernel locked up both my VMs and my baremetal machines
09:41 AM seb_kuzminsky: i'm going to try your kernel on bare metal next
09:48 AM andypugh: I also have a 4.19.114 version to try.
09:51 AM andypugh: (Whilst a lockup exiting realtime is a bad thing, it shouldn’t lead to broken machines and faulty parts)
09:52 AM rene_dev_: jepler then there is still a bug.
09:52 AM rene_dev_: rmu|w not on 10 different os, with 10 different compilers
09:53 AM jepler: rene_dev_: yeah, even "just" a flaky test is a bug that should be addressed.
09:54 AM rmu|w: rene_dev_: no, but on the same machine/same os/same compiler it should
09:54 AM jepler: since it has a negative effect on getting code changes vetted and packages built
09:54 AM rene_dev_: rmu|w not if it's like the bug in halrun, where it depends on uninitialized variables
09:55 AM rene_dev_: jepler the values I used are from tormach, so they are well tested
09:55 AM jepler: If the behavior change is "99.9% of the time, the changed tool is visible in 20ms" to "90% of the time, the changed tool is visible in 20ms" then the solution is probably to change the test to wait longer before checking. But I stress that this is just an untested hypothesis that I think may be worth looking into
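If the hypothesis above holds, one way to make such a test tolerant of a loaded machine is to poll for the expected state up to a deadline instead of checking once after a fixed delay. The following is only a sketch of that idea, not the actual runtests code: `wait_for`, its parameters, and the `status` object in the usage note are hypothetical names chosen for illustration.

```python
import time


def wait_for(predicate, timeout=2.0, interval=0.01,
             clock=time.monotonic, sleep=time.sleep):
    """Poll predicate() until it returns a true value or `timeout` seconds pass.

    Returns True as soon as the predicate succeeds, False on timeout.
    The predicate is always evaluated at least once.
    """
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A test could then call something like `wait_for(lambda: status.tool_in_spindle == 2, timeout=5.0)` (with `status` standing in for whatever the test already reads) and fail only when the timeout expires, so a fast machine is not slowed down and a busy one gets more time.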
09:55 AM rene_dev_: jepler does this increase reduce the mem available for hal pins?
09:56 AM rene_dev_: because I read reports of limited hal pins, and I wonder if it's related
09:56 AM jepler: "this" being the change in the number of tools? No, that comes from a separate limit
09:56 AM jepler: maybe it's something else like that UI creates pins for every pocket and now there are more of them?
09:57 AM andypugh: There is another test that sometimes fails too:
09:57 AM andypugh: :: /home/buildslave/emc2-buildbot/wheezy-i386/rip-wheezy-i386/build/tests/motion-logger/startup-gcode-abort
09:58 AM rmu|w: "out of memory" was probably me with gmoccapy. i have a ridiculous amount of IO pins (3 mesa 7i90 on sserial, 216 IO pins) on that machine.
09:58 AM andypugh: startup-gcode-abort doesn’t look like a real bug, there is just sometimes an extra couple of lines of output. Though you have to wonder _why_ that is.
09:59 AM andypugh: Max mesa IO pins per board is something like 2944.
10:03 AM rmu|w: fact is, i had to increase hal memory since upgrading to recent 2.8, but only when using gmoccapy as GUI, not with axis or the newfangled QT stuff
10:16 AM jepler: The doc warnings: asciidoc: WARNING: axis.txt: line 1249: missing section: [sect5]
10:17 AM jepler: mean that the section nesting is too deep for the style or output type
10:18 AM jepler: It's actually written as 4 levels ("====" sets off the header) but there's a directive somewhere else which "pushes down" the headers so it can be included in a larger document, like ":leveloffset: 1"
10:19 AM jepler: https://groups.google.com/forum/#!msg/asciidoc/66CogF4v4yY/apXdU9AB_X8J
10:22 AM jepler: so basically the fix would be to restructure the docs so "====" isn't used?
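A rough way to find the headings that trigger this kind of warning is to scan a file for titles whose effective level, after any level offset, is deeper than the backend supports. The sketch below rests on stated assumptions: a one-line title with n leading "=" counts as level n-1, the backend defines nothing deeper than sect4, and `:leveloffset:` appears as a plain integer attribute line. The function name `too_deep_headings` is made up for illustration and is not part of asciidoc.

```python
import re

# One-line section titles such as "=== Title" (assumed: n '=' means level n-1).
HEADING = re.compile(r"^(={1,6})\s+\S")
# Attribute entry such as ":leveloffset: 1" (assumed: absolute integer only).
OFFSET = re.compile(r"^:leveloffset:\s*(-?\d+)\s*$")


def too_deep_headings(lines, max_level=4, offset=0):
    """Return (line_number, effective_level) for headings deeper than max_level."""
    problems = []
    for number, line in enumerate(lines, start=1):
        match = OFFSET.match(line)
        if match:
            offset = int(match.group(1))
            continue
        match = HEADING.match(line)
        if match:
            level = len(match.group(1)) - 1 + offset
            if level > max_level:
                problems.append((number, level))
    return problems
```

Running it over a document's lines, with `offset` set to whatever the including document applies, lists the headings that would need to be flattened or restructured.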
10:24 AM wellindex: andypugh, I tried to make a pull request from the linuxcnc github but I can't see how to make it find my github repo! Anyway, here is a link to a patch in that repo. I would use Dial, but if you change the scale by double click it automatically changes the out pin and the machine jumps to that new output. I think it is a dangerous option in Dial: https://github.com/ALatSMT/pyvcp.jogwheel-options/commit/60b840f042ce342646c739d6377278184f0c41ee
10:28 AM andypugh: wellindex: Go to the LinuxCNC github, then press the “New pull request” button and then click the “compare across forks” link.
10:29 AM andypugh: jepler: Aha! I hadn’t thought to google “Section 5”, I had only tried “section missing”
10:30 AM jepler: https://gist.github.com/9a5d89df95157c8b931d230f44a445fd this took care of the warnings in axis.txt -- do you want to elaborate it or shall I?
10:31 AM andypugh: I can do it.
10:31 AM andypugh: I have everything open
10:31 AM jepler: awesome, thank you!
10:33 AM andypugh: I have a feeling that I have spent three days doing something that Pavel could have done in 3 hours. Not that I have seen anything of psha for years.
10:48 AM seb_kuzminsky: micges got back to me about mesaflash, he says: "Do what you need to do to make it ok. I can add you as admin to the project, atm I'm very away from this"
10:49 AM seb_kuzminsky: so i think i'll move his mesaflash repo to the linuxcnc group on github, then we'll merge in the changes from jt's repo
10:51 AM seb_kuzminsky: and somehow try to communicate with the 20(!) people who have forked micges' repo to let them know about the move
10:51 AM jepler: when you "move" a repo in github, it gets a redirect
10:51 AM jepler: that's different than forking it or just creating a new repo
10:52 AM jepler: [repo] -> settings -> options -> danger zone -> transfer ownership
10:53 AM jepler: you'd transfer to the linuxcnc organization, change hats, and accept the transfer
10:54 AM seb_kuzminsky: thanks!
10:55 AM jepler: I think you have all the permissions on the linuxcnc organization side, just make sure you transfer it into the correct linuxcnc. I think organization linuxcnc-org is for the website, while organization linuxcnc is for everything else.
10:55 AM jepler: I can help with porting the changes from the jt repo, let me know if you want my time
11:02 AM seb_kuzminsky: whaa? there are two linuxcncs??
11:03 AM jepler: maybe linuxcnc-org is not used
11:03 AM seb_kuzminsky: wlo is a repo under the linuxcnc org
11:03 AM jepler: I'm not sure
11:03 AM jepler: okay then linuxcnc-org is .. nothing
11:03 AM jepler: or me squatting on a name, maybe
11:03 AM jepler: forget I mentioned it
11:03 AM jthornton: seb_kuzminsky, what do I need to do on my end?
11:04 AM seb_kuzminsky: jthornton: for now, nothing
11:04 AM jthornton: ok, that's very easy to do
11:04 AM jthornton: and yippie and thanks
11:05 AM jepler: seb_kuzminsky: happy to see you around, by the way
11:05 AM seb_kuzminsky: when micges gives me admin access to his mesaflash repo i'll move it to our linuxcnc group, then jepler and/or i will merge the changes that make up the first few commits of your repo into linuxcnc/mesaflash, and i'll make a new release and put new debs on wlo
11:06 AM jthornton: awesome
11:06 AM seb_kuzminsky: jepler: it's nice to be back! it's fun working together with everyone here again
11:06 AM seb_kuzminsky: jthornton: oh, and after that's done i think you should delete your repo, to reduce potential confusions
11:07 AM jthornton: yes, I assumed that
11:07 AM seb_kuzminsky: (thumbs up)
11:07 AM jepler: afk, time to stretch
11:07 AM wellindex: andypugh, thanks, I didn't know I had to fork. I had imported. Got it!
12:19 PM Tom_L: when mesaflash finally lands would someone please post a link to it?
12:20 PM Tom_L: andypugh, docs look much better
12:24 PM andypugh: Woohoo! Two (so far) completely clean, green builds: http://buildbot.linuxcnc.org/buildbot/grid
12:26 PM JT-Shop: awesome
12:30 PM andypugh: But some of them have 115 errors….
12:34 PM Tom_L: details....
12:36 PM andypugh: Right, as a celebration, I think I am going to hop on my bicycle and breathe some fresh air. I think I last left the computer chair on Saturday :-)
12:36 PM JT-Shop: I wish the lube for my crank would show up so I can get my bicycle back together
01:25 PM andypugh: Hmm, these compiler errors are clearly wrong: http://buildbot.linuxcnc.org/buildbot/builders/1650.rip-buster-rtpreempt-rpi4/builds/68/steps/compile/logs/warnings%20%286%29
01:31 PM andypugh: jepler: seb_kuzminsky I have a concern…
01:32 PM andypugh: I no longer have _much_ of a concern.
01:33 PM andypugh: I see now that it is a Pi, so PCI Mesa cards are not needed.
01:33 PM andypugh: And the redefinition of rtapi_inl to 0 won’t cause any problems, other than the compiler moaning.