#linuxcnc-devel Logs
May 08 2020
10:46 AM -!- #linuxcnc-devel mode set to +v by ChanServ
10:50 AM seb_kuzminsky: i added the new missing build dependencies to buster-python3, the next build should pass i think
10:50 AM seb_kuzminsky: linuxcnc-build: force build --branch=master 1660.buster-python3
10:50 AM linuxcnc-build: no such builder '1660.buster-python3'
10:50 AM seb_kuzminsky: linuxcnc-build: force build --branch=master 1660.rip-buster-python3
10:50 AM linuxcnc-build: build #17 forced
10:50 AM linuxcnc-build: I'll give a shout when the build finishes
10:53 AM jepler: seb_kuzminsky: thanks!
10:56 AM pcw_home: http://freeby.mesanet.com/hostmot2.9 new modules added
11:24 AM linuxcnc-build: Hey! build 1660.rip-buster-python3 #17 is complete: Warnings [8warnings compile]
11:24 AM linuxcnc-build: Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1660.rip-buster-python3/builds/17
11:57 AM dwrobel: seb_kuzminsky: 1660 build succeeded, while it shouldn't have. it compiles html documentation and without https://github.com/LinuxCNC/linuxcnc/pull/842 it should have failed if there were no python/python2 interpreter installed.
11:58 AM seb_kuzminsky: that machine has both python2 and python3 installed
11:58 AM jepler: I don't think you can install a debian without python2 at all
11:58 AM dwrobel: but that will give us false positives
11:59 AM dwrobel: you can have both python2 and python3 but please check if you can avoid the unversioned /usr/bin/python?
12:00 PM jepler: dwrobel: no, debian python-minimal provides /usr/bin/python as a symlink to python2, it's unconditional and debian will probably break all over the place if you try to change it
12:00 PM jepler: if you have documentation on how to do what you're saying on debian based OSes I'm happy to learn better though
12:02 PM dwrobel: i'm on fedora where unversioned /usr/bin/python is a separate optional package - that's where I'm testing it
12:06 PM dwrobel: i checked it on docker "FROM docker.io/debian:buster" and there is no /usr/bin/python, so at first glance the system can work without it.
12:41 PM jepler: huh attempting to remove python-minimal from my desktop system wants to remove 300+ packages
12:43 PM jepler: libboost-python1.67-dev depends python-dev depends python depends python-minimal, so pulling in dev packages will pull it in
12:52 PM dwrobel: debian:unstable might look more promising but it would require some changes like: https://gist.github.com/dwrobel/fa831d179930e0035fb8f1a662fe7f0c
01:01 PM rene_dev_: jepler I changed python to python3 on buster using update-alternatives, and python2 and 3 linuxcnc still work fine
01:02 PM seb_kuzminsky: on my buster, /usr/bin/python is not managed by update-alternatives
01:02 PM rene_dev_: it isnt, but you can make it manage it :D
01:02 PM seb_kuzminsky: isure
01:02 PM seb_kuzminsky: *sure
01:02 PM rene_dev_: didnt seem to break anything
01:03 PM seb_kuzminsky: it's not a supported way of setting up a system, or the debian maintainers would have done it that way to start with, but it's a good and interesting datapoint that it didn't break your system :-)
01:04 PM rene_dev_: I know, I only wanted to test linuxcnc related stuff, to make sure it works both ways
01:07 PM seb_kuzminsky: cool
02:39 PM andypugh: Does it matter if HAL doesn’t free a couple of pins / params? My feeling is that it doesn’t matter as the shared memory area is handled inside RTAPI and the whole thing gets freed as a block on exit?
04:00 PM jepler: as far as I'm aware, the kinds of memory leaks that valgrind diagnoses are all freed when a program exits. In the case of HAL, things created in the HAL shared memory segment would be freed when halrun / linuxcnc session finishes and the shared memory segment is deleted.
04:23 PM Tom_itx is now known as Tom_L
04:48 PM andypugh: I tried commenting-out the whole thread freeing
04:48 PM andypugh: Nothing changed
04:49 PM andypugh: (and i still got the RTAI crash after 8629 iterations)
04:50 PM andypugh: I wonder if anyone has tried 10,000 iterations of load-unload with preempt-rt?
04:51 PM andypugh: (and, that said, I have managed 10,000 with RTAI. It’s proving a slow thing to diagnose)
04:56 PM andypugh: So far, I have found that exiting here in hal_create_thread() has managed 10,000 cycles all three times I have tried it: https://github.com/LinuxCNC/linuxcnc/blob/c0bdc74a649927994deb80b761b4feb02f165e0e/src/hal/hal_lib.c#L1928
04:56 PM andypugh: But exiting here: https://github.com/LinuxCNC/linuxcnc/blob/c0bdc74a649927994deb80b761b4feb02f165e0e/src/hal/hal_lib.c#L1928
04:56 PM andypugh: Has failed 2 times out of three.
04:58 PM andypugh: (this is a (halrun / loadrt threads / exit) loop)
04:59 PM andypugh: Does anyone have a handy preempt-rt physical machine to try this on?
05:06 PM andypugh: jepler: This wasn’t something found by valgrind, it was an oddity (an if (0)) that I found in the LinuxCNC code
05:07 PM andypugh: My worry is that in paring-down my test case to the minimum set I have found something that crashes after 6000 cycles that is not the original problem.
05:09 PM andypugh: seb_kuzminsky: found a test case that crashed after 600 or so cycles. I have pared it down to something that is crashing in (generally) < 10,000 cycles. But is that the same issue?
05:27 PM Tom_dev: andypugh, Linux buster 4.19.0-6-rt-amd64 #1 SMP PREEMPT RT Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
05:27 PM Tom_dev: if you care to step me thru what you need i could try it
05:28 PM andypugh: https://pastebin.ubuntu.com/p/yKHrBWskkj/
05:29 PM andypugh: Save that and make it executable. Run it. Wait an hour.
05:30 PM Tom_dev: what file extension?
05:35 PM Tom_dev: line 7: realtime: command not found
05:36 PM Tom_dev: (realtime start0
05:36 PM Tom_dev: (realtime start)
05:37 PM Tom_dev: oh i likely need to change the path to rip-environment
05:39 PM Tom_dev: halcmd: hal_init() failed: -22
05:39 PM Tom_dev: NOTE: 'rtapi' kernel module must be loaded
05:39 PM Tom_dev: RTAPI requires the real-time kernel 4.19.114-rtai to run.
05:39 PM andypugh: Hmm?
05:40 PM Tom_dev: the error i get
05:40 PM andypugh: That’s weird
05:41 PM Tom_dev: this ssd has more than one kernel on it. i booted from the one i posted
05:41 PM Tom_dev: if that matters
05:41 PM andypugh: I think perhaps you need to _remove_ the rip-environment from the script
05:41 PM Tom_dev: what is a comment line in a .sh file?
05:41 PM Tom_dev: i'll comment it out
05:42 PM Tom_dev: if i do that i'll get the realtime start error probably
05:42 PM andypugh: Just nuke it from orbit. It’s the only way to be sure.
05:43 PM Tom_dev: starting pass 34
05:43 PM Tom_dev: ./andy_run_test_mod.sh: line 6: realtime: command not found
05:43 PM Tom_dev: Note: Using POSIX realtime
05:43 PM Tom_dev: Note: Using POSIX realtime
05:43 PM Tom_dev: realtime start is on 6
05:44 PM andypugh: It managed 34 passes then couldn’t find realtime any more?
05:44 PM Tom_dev: no i stopped it
05:45 PM Tom_dev: i can let it run if you want
05:45 PM andypugh: Does “halrun” work?
05:45 PM Tom_dev: tom@buster:~$ halrun -U
05:45 PM Tom_dev: Note: Using POSIX realtime
05:46 PM Tom_dev: brb
05:47 PM andypugh: I don’t know enough to know if “realtime start” ought to work on preempt-rt
05:49 PM Tom_dev: well if you don't we're both in trouble
05:49 PM rene_dev_: I have never heard about realtime start
05:49 PM rene_dev_: I can test it on my box if you want
05:51 PM Tom_dev: well at least you know what you're looking at :)
05:56 PM Tom_dev: i'm running it without the realtime start line
06:10 PM andypugh: I don’t know if that works.
06:13 PM Tom_dev: yeah me either
06:13 PM Tom_dev: it's done ~9k iterations though
06:15 PM Tom_dev: starting pass 10000
06:15 PM Tom_dev: Note: Using POSIX realtime
06:15 PM Tom_dev: Note: Using POSIX realtime
06:24 PM Tom_dev: it is loading and unloading it
06:24 PM Tom_dev: i ran it manually to check if 'threads' was still loaded
06:25 PM Tom_dev: so if that _was_ a good test, it passed
06:26 PM Tom_dev: tom@buster:~$ halcmd -v loadrt threads
06:26 PM Tom_dev: Note: Using POSIX realtime
06:26 PM Tom_dev: <commandline>:0: Component 'threads' ready
06:26 PM Tom_dev: <commandline>:0: Program '/usr/bin/rtapi_app' started
06:26 PM Tom_dev: <commandline>:0: Realtime module 'threads' loaded
06:27 PM Tom_dev: tom@buster:~$ halcmd -v loadrt threads
06:27 PM Tom_dev: threads: already exists
06:27 PM Tom_dev: but halrun -U clears it, then it will reload
06:27 PM Tom_dev: seems valid
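The manual check above (load threads, clear it with halrun -U, then reload) can be sketched as a repeatable loop. This is only an illustrative sketch and not the script from the pastebin link: the function name stress_loop and its three arguments are hypothetical, and the default commands are simply the ones quoted in this log.

```sh
#!/bin/sh
# Illustrative sketch only; NOT the script from the pastebin link.
# stress_loop and its arguments are hypothetical names.
#   $1: number of passes            (default: 10000)
#   $2: command that loads threads  (default: halcmd loadrt threads)
#   $3: command that unloads HAL    (default: halrun -U)
stress_loop() {
    sl_passes="${1:-10000}"
    sl_load="${2:-halcmd loadrt threads}"
    sl_unload="${3:-halrun -U}"
    sl_pass=1
    while [ "$sl_pass" -le "$sl_passes" ]; do
        echo "starting pass $sl_pass"
        # The command strings are left unquoted on purpose so that they
        # split into a command name plus its arguments.
        $sl_load || { echo "load failed on pass $sl_pass" >&2; return 1; }
        $sl_unload || { echo "unload failed on pass $sl_pass" >&2; return 1; }
        sl_pass=$((sl_pass + 1))
    done
    echo "completed $sl_passes passes"
}
```

Calling `stress_loop 10000` on a machine with halcmd and halrun on the PATH would roughly mirror the run above, while `stress_loop 3 true true` is a dry run that exercises only the loop logic. A hard freeze like the RTAI ones reported later would show up as the last "starting pass" line with nothing after it.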
06:31 PM Tom_L: i'll run it with the rtai kernel
06:32 PM Tom_dev: Linux buster 4.19.114-rtai #1 SMP PREEMPT Tue Apr 28 23:46:05 CDT 2020 x86_64 GNU/Linux
06:34 PM andypugh: The preempt looks good.
06:37 PM Tom_L: having problems running it on rtai. says it's not loaded
06:38 PM Tom_L: rtapi kernel module not loaded
06:40 PM andypugh: put realtime start back
06:40 PM Tom_dev: i did, i ran the original one you posted
06:40 PM andypugh: maybe RTAI needs the rip-environment?
06:42 PM Tom_dev: how do i load it?
06:45 PM andypugh: Well, the script had that in, initially
06:45 PM Tom_dev: yeah that didn't work
06:46 PM andypugh: I don’t know enough about your setup to know what is going wrong.
06:46 PM Tom_dev: tom@buster:~$ realtime start
06:46 PM Tom_dev: insmod: ERROR: could not insert module /usr/realtime/modules/rtai_hal.ko: Invalid module format
06:47 PM Tom_dev: yeah i'm not sure either
06:47 PM Tom_dev: way above my paygrade
06:47 PM andypugh: That sounds like perhaps your RTAI, LinuxCNC and kernel don’t match
06:47 PM andypugh: I would guess that LinuxCNC won’t work either?
06:47 PM Tom_dev: it was one of my early builds but it shows up like it's loaded
06:47 PM Tom_dev: right
06:48 PM Tom_dev: lcnc won't start either
06:48 PM Tom_dev: i can try it on the other pc
06:48 PM Tom_dev: i think it's ok
06:49 PM Tom_dev: but it was the one with the horrible latency
06:52 PM Tom_dev1: starting pass 1
06:52 PM Tom_dev1: starting pass 2
06:52 PM Tom_dev1: starting pass 3
06:52 PM Tom_dev1: starting pass 4
06:52 PM Tom_dev1: starting pass 5
06:52 PM Tom_dev1: starting pass 6
06:53 PM Tom_dev1: starting pass 7
06:53 PM Tom_dev1: starting pass 8
06:53 PM Tom_dev1: starting pass 9
06:53 PM Tom_dev: woops
06:54 PM Tom_dev: gonna take a lot longer on that pc
06:55 PM Tom_dev: froze after pass 259
06:56 PM Tom_dev: try #2
06:58 PM Tom_dev1: Linux buster 4.19.114-rtai-amd64 #1 SMP Thu Apr 30 23:42:09 CDT 2020 x86_64 GNU/Linux
06:59 PM andypugh: 269 is closer to what I was getting with the full test.
07:00 PM andypugh: I need to back-track to see what I did to manage tens of thousands
07:00 PM Tom_dev: locked after 208 this time
07:00 PM Tom_dev: asrock Q1900
07:01 PM andypugh: Yes, seems that you have the problem on one PC with RTAI and not on another with pre-empt.
07:01 PM Tom_dev: yes
07:01 PM andypugh: Though it’s a shame that it isn’t the same PC.
07:01 PM Tom_dev: i could swap ssd and test it on this one
07:01 PM Tom_dev: it's the one i built it on
07:01 PM andypugh: But does fit the pattern of RTAI broken and preempt not
07:02 PM Tom_dev: gimme a couple and i'll test rtai on this one
07:04 PM Tom_dev: running
07:07 PM Tom_L: mb gigabyte i5
07:08 PM Tom_L: some passes take longer to run
07:08 PM andypugh: Yes, odd that
07:08 PM Tom_L: as did on the asrock 1900
07:08 PM Tom_L: but it's up past 600
07:09 PM Tom_L: 700
07:11 PM Tom_L: 1000
07:16 PM andypugh: Guessing it crashed and took you off the internet?
07:17 PM Tom_L: 1760
07:17 PM Tom_L: no
07:17 PM Tom_L: i'm on another pc here anyway
07:17 PM Tom_L: 1850
07:18 PM Tom_L: 2k
07:19 PM Tom_L: wonder what the difference is
07:19 PM Tom_L: want me to try the preempt-rt on the asrock q1900 as well?
07:21 PM Tom_L: started stalling around 2350 but it's still running
07:23 PM Tom_shop: asrock Q1900 test
07:23 PM Tom_shop: Linux buster 4.19.0-6-rt-amd64 #1 SMP PREEMPT RT Debian 4.19.67-2+deb10u2 (2019-11-11) x86_64 GNU/Linux
07:28 PM Tom_L: ok the gigabyte i5 froze at 3384
07:29 PM Tom_shop: asrock is up to 850 on preempt-rt
07:33 PM andypugh: I wish it was more predictable, it is not being easy to research
07:33 PM andypugh: Anyway, I have been staring at a screen all day, time to sleep I think.
07:33 PM Tom_L: i'm not sure how much memory the i5 has. that may be a contributing factor
07:38 PM linuxcnc-build: build #20 of 1660.rip-buster-python3 is complete: Failure [4failed compile] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1660.rip-buster-python3/builds/20 blamelist: Rene Hopf <renehopf@mac.com>, Chris Morley <chrisinnanaimo@hotmail.com>
08:35 PM Tom_shop: andypugh, fwiw results of preempt-rt on the asrock q1900:
08:35 PM Tom_shop: starting pass 10000
08:35 PM Tom_shop: Note: Using POSIX realtime
08:35 PM Tom_shop: Note: Using POSIX realtime
09:12 PM memfrob: I think I fixed the crash with abs, should solve the build bot issue: https://github.com/NTULINUX/RTAI/commit/e5b522c36ffb21d91b3169269a7b3599082650f4
09:13 PM memfrob: Problem: That breaks the RTAI testsuite, it hangs on "cleaning up"
09:14 PM memfrob: rmmod hangs trying to unload rtai_sched -- At least now I know where I'm looking.
09:15 PM memfrob: Great, latency-test isn't working now
09:15 PM memfrob: So RTAI testsuite and LinuxCNC latency-test are broken but the bash loop is fine..
09:17 PM memfrob: Fix one thing, break two things
09:19 PM Tom_L: how would you get the kernel test suite to be a part of this one: https://github.com/NTULINUX/RTAI
09:19 PM memfrob: kernel test suite?
09:20 PM Tom_L: doesn't seem to be there
09:20 PM memfrob: what kernel test suite?
09:20 PM Tom_L: well jepler was asking about it the other day
09:20 PM Tom_L: in the /kern/ directory iirc
09:21 PM memfrob: that repository in terms of kernel source is just IPIPE, the conversion to RTAI, the Kconfig tweaks, and don't taint on RTAI load
09:21 PM memfrob: I'd have to know exactly what he means
09:22 PM memfrob: there is no all-for-one kernel testsuite although the phoronix test suite is pretty nice.
09:23 PM Tom_L: i just noticed it was in the download from rtai.org rtai5.2.tar.bz2
09:23 PM Tom_L: but not in that link i posted
09:23 PM memfrob: Are you referring to showroom ?
09:23 PM Tom_L: i know little about this stuff
09:23 PM memfrob: Oh, the RTAI kernel test suite instead of the user testsuite
09:23 PM memfrob: the kernel one always crashes for me
09:23 PM Tom_L: yes
09:23 PM memfrob: so I grabbed the user one
09:24 PM memfrob: I can't test the kernel one on my end so I didn't put it in there
09:24 PM Tom_L: :)
09:24 PM memfrob: I reverted that commit btw, made things worse rather than better
09:24 PM memfrob: It's been broken for me for years
09:24 PM memfrob: Does it work for you?
09:24 PM Tom_L: i did get a good install finally but it crashes
09:25 PM memfrob: the kernel one?
09:25 PM Tom_L: i think so
09:25 PM memfrob: yeah it's a POS.
09:25 PM Tom_L: you're asking things i'm guessing answers to :)
09:25 PM Tom_L: the one andy's been working with
09:26 PM Tom_L: i ran his crash test on a couple pcs here this evening
09:27 PM memfrob: I need to find the magic value for msleep
09:27 PM memfrob: There's only a few instances of msleep and changing them around really changes the behavior, a lot.
09:28 PM memfrob: I don't know C enough to know exactly what I'm doing and why the results are so huge.
09:28 PM memfrob: 5 microseconds vs 50 microseconds, whole thing blows up but the crash test succeeds. Absolutely not a clue why :)
09:29 PM memfrob: A difference of 45 microseconds shouldn't break the launch window of latency-test for example
09:30 PM Tom_L: i posted results of andy's test here fwiw
09:30 PM memfrob: If one problem can get fixed while breaking two others, then the other two bugs can be fixed too. At least now I know where to look but it'd take someone like jepler to fix it the right way.
09:34 PM linuxcnc-build: build #6816 of 0000.checkin is complete: Failure [4failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/6816 blamelist: Rene Hopf <renehopf@mac.com>, Chris Morley <chrisinnanaimo@hotmail.com>
09:46 PM memfrob: You think if I keep trying numbers between 0 and 50 I'll find the number that works? lol
09:51 PM Tom_L: 50 compiles later...
09:52 PM memfrob: That's what ccache is for!