#linuxcnc-devel Logs
Apr 25 2019
09:25 AM mozmck: The problem I see with a 3.16.x rtai kernel is that it is (I believe) as old as the wheezy kernel. The people fussing about wheezy being EOL aren't going to be any happier with a newer distribution with an ancient kernel I suspect.
09:31 AM JT-Shop2: let them eat cake then...
09:34 AM Roguish: mozmck: at some point there is an EOL for everything. For old installations, that people don't want to mess with, ok, turn off the updates and leave 'em alone. For new installation, use something 'current', not necessarily bleeding edge but is stable and has some lifetime to go. For the 'bleeding edge' folks, let 'em do what they want.
09:35 AM Roguish: Trying to make everyone happy without making anyone upset is just impossible. And a pretty big waste of valuable support time.
09:36 AM JT-Shop2: yea I'm happy that the CHNC is running Ubuntu 10.04 well I was until I found QtPyVCP now I gotta upgrade it to debian 9
09:37 AM JT-Shop2: I still don't see how a livecd is tied to a release, I know it's nice to have one
09:42 AM CaptHindsight: mozmck: there is no stable RTAI for > 3.16
09:43 AM Roguish: CaptHindsight: is there any reason that RTAI needs to be continued? real functional reason?
09:44 AM CaptHindsight: just for those that want to use a LPT for stepping and preempt_rt is not fast enough or for applications that just need faster
09:46 AM CaptHindsight: we kept RTAI going for several years when the rtai.org dev just didn't have time anymore
09:46 AM Roguish: Ok. makes sense. Though personally I think LPT should die. Sort of like DOS.......
09:46 AM CaptHindsight: but he seems to resent help and most RTAI users are the cheapest of the cheap
09:47 AM CaptHindsight: even the guberment military contractors
09:47 AM CaptHindsight: nothing wrong with LPT ports for stepping
09:48 AM CaptHindsight: I'd rather see the cloud die or M$ or a ton of other slime
09:48 AM Roguish: with you on 'the cloud'.
09:49 AM CaptHindsight: back doored factory firmware is high on my list
09:50 AM CaptHindsight: but you'll probably get your LPT wish vs any of my desires
09:50 AM CaptHindsight: LPT never hurt anybody
09:54 AM Roguish: maybe the Linuxcnc should be forked. one LPT/RTAI branch and one preempt/modern branch. don't really know. I just see an awful lot of support time devoted to getting someone's IBM XT running
09:55 AM Roguish: maybe I'm just too cynical.
09:56 AM mozmck: Roguish: I personally don't care about EOL on the distribution. But this whole renewed effort to get stretch out with a newer rtai kernel is due to people complaining about wheezy going EOL
09:58 AM mozmck: Yeah, LPT is nice because you get cheap and fast IO.
10:01 AM CaptHindsight: how cheap are the FPGA's with hardware PCI?
10:01 AM Roguish: Ok, well EOL doesn't mean it quits running. just turn off the updates..... Heck, I have a computer still running XP. no big deal...... wish it were NT..
10:02 AM JT-Shop2: try and reinstall XP
10:03 AM mozmck: I think people now are worried about security. But EOL also doesn't mean it automatically gets hacked - and for a machine control? Just take it off the 'net or put it behind a good firewall.
10:04 AM Roguish: security????? they're machine controllers. keep 'em off the internet for goodness sakes. that is the lamest excuse possible.
10:04 AM Roguish: seriously.
10:04 AM mozmck: CaptHindsight: from mesa it looks like the 5i25 is $89
10:05 AM mozmck: Roguish: that's my thought as well, but some people...
10:06 AM Roguish: no sympathies here. gotta tell people to get real.
10:06 AM mozmck: https://github.com/LinuxCNC/linuxcnc/issues/576
10:06 AM CaptHindsight: budget conscious users with new hardware that just want stepping
10:06 AM CaptHindsight: they are the ones that need a newer kernel
10:07 AM mozmck: I think you can still get a workable LPT card for $10 - $15?
10:08 AM CaptHindsight: I was one of the few that used RTAI with Mesa cards and still needed fast threads for machine vision
10:09 AM CaptHindsight: why we kept RTAI alive
10:09 AM Roguish: that's like worrying about latency when the machine is made of plywood and 80/20 ......
10:10 AM CaptHindsight: but it was disheartening to see that the plywood and 8020 users were the ones besides us using it
10:11 AM Roguish: the budget conscious sure don't mind spending lots of support hours, especially when it freeeeee.
10:11 AM CaptHindsight: but there are a few threads about forking LCNC to use cheap ARM's to do the RT
10:12 AM CaptHindsight: yes, they don't value their time, and especially not yours
10:13 AM CaptHindsight: paid support was near non-existent, big OEM's would take months to budget a few $K
10:13 AM Roguish: is there still a 'board of directors' for linuxcnc?
10:14 AM CaptHindsight: and still want free help
10:15 AM CaptHindsight: or we contracted for a RTAI package for this hardware and 32b, can you send us a package for another completely different distro and with 64b support?
10:16 AM CaptHindsight: "Linux engineers" that couldn't seem to build anything on their own
10:42 AM CaptHindsight: imx8 PCIe can act as an endpoint but it's not a cheap ARM soc
10:43 AM CaptHindsight: price is higher than a spartan6
02:53 PM andypugh: CaptHindsight: I have done a few more tests on this PC / RTAI
02:53 PM andypugh: Booting from the Wheezy Live CD gives apparent stability and good latency
02:57 PM andypugh: insmod of rtai_hal.ko and rtai_sched.ko manually causes no problems and nothing odd in dmesg
02:59 PM andypugh: andypugh@RM-one:~$ cat /proc/cpuinfo
02:59 PM andypugh: processor : 0
02:59 PM andypugh: vendor_id : GenuineIntel
02:59 PM andypugh: model name : Intel(R) Core(TM) i3-3220T CPU @ 2.80GHz
02:59 PM andypugh: cpu cores : 2
03:01 PM andypugh: [ 292.891916] RTAI[hal]: compiled with gcc version 6.3.0 20170516 (Debian 6.3.0-18) .
03:01 PM andypugh: [ 293.155396] RTAI[hal]: mounted (IPIPE-NOTHREADS, IMMEDIATE (INTERNAL IRQs DISPATCHED), ISOL_CPUS_MASK: 0).
03:01 PM andypugh: [ 293.155402] SYSINFO: CPUs 4, LINUX APIC IRQ 33025, TIM_FREQ 87297998, CLK_FREQ 2793536000, CPU_FREQ 2793536000
03:01 PM andypugh: [ 293.155406] RTAI_APIC_TIMER_IPI: RTAI DEFINED 33026, VECTOR 33027; LINUX_APIC_TIMER_IPI: RTAI DEFINED 33025, VECTOR 33025
03:01 PM andypugh: [ 293.155410] TIMER NAME: lapic; VARIOUSLY FOUND APIC FREQs: 87297998, 87297998, 0
03:01 PM andypugh: [ 300.533206] RTAI[malloc]: global heap size = 2097152 bytes, <BSD>.
03:01 PM andypugh: [ 300.580500] , kstacks pool size = 524288 bytes.
03:01 PM andypugh: [ 300.580506] RTAI[sched]: hard timer type/freq = APIC/87297998(Hz); default timing: oneshot; linear timed lists.
03:01 PM andypugh: [ 300.580522] RTAI[sched]: Linux timer freq = 250 (Hz), TimeBase freq = 2793536000 hz.
03:01 PM andypugh: [ 300.580525] RTAI[sched]: timer setup = 19 ns, resched latency = 3924 ns.
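The `RTAI[sched]` timing line in the dmesg paste above recurs later in the log with different values, so it can be handy to pull the two numbers out for comparison. A minimal sketch, assuming the line keeps the exact wording shown in this log; the function name `parse_sched_timing` is made up for illustration and is not part of RTAI:

```python
import re

# Pattern copied from the wording of the RTAI[sched] line as it appears in this log.
SCHED_TIMING = re.compile(
    r"RTAI\[sched\]: timer setup = (\d+) ns, resched latency = (\d+) ns"
)

def parse_sched_timing(line):
    """Return (timer_setup_ns, resched_latency_ns), or None if the line does not match."""
    match = SCHED_TIMING.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Because the pattern is searched rather than anchored, it also works on lines that carry a chat timestamp or a syslog prefix in front of the kernel message.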
03:04 PM andypugh: I see some compiler warnings compiling RTAI
03:04 PM andypugh: calibration_helper.c: In function ‘main’:
03:04 PM andypugh: calibration_helper.c:134:45: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
03:04 PM andypugh: rt_thread_create((void *)user_calibrator, (void *)loops, 0);
03:05 PM andypugh: home/andypugh/RTAI/base/math/e_asin.c: In function ‘__ieee754_asin’:
03:05 PM andypugh: home/andypugh/RTAI/base/math/e_asin.c:83:8: warning: this ‘else’ clause does not guard... [-Wmisleading-indentation]
03:05 PM andypugh: } else
03:05 PM andypugh: and
03:06 PM andypugh: home/andypugh/RTAI/base/math/k_rem_pio2.c: In function ‘__kernel_rem_pio2’:
03:06 PM andypugh: home/andypugh/RTAI/base/math/k_rem_pio2.c:170:6: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
03:06 PM andypugh: for(j=0,fw=0.0;j<=jx;j++) fw += x[j]*f[jx+i-j]; q[i] = fw;
03:39 PM CaptHindsight: andypugh: back
03:40 PM CaptHindsight: andypugh: since the live cd works I'll or memleak will send you a new kernel config
03:41 PM andypugh: Worth noting that LiveCD is i686 not x86_64
03:50 PM jthornton: hmmm wonder why?
04:11 PM mozmck: I know back in '10 we did only i686 because it worked on all hardware and there was a lot of 32-bit only hardware still.
04:11 PM mozmck: We also had some problems with 64 bit - at least on multi-core AMD systems.
04:14 PM memfrob: Hi andypugh, just to test and debug, can I make a specialized kernel config for you to try?
04:15 PM memfrob: If you send me your dmesg, lsmod, and lspci -k output, I can make a kernel tuned specific to your hardware -- This will cut down the compile time by a large margin too so recompiles won't take long at all.
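The three outputs requested above (`dmesg`, `lsmod`, `lspci -k`) can be gathered into one text report for pasting. A minimal sketch, assuming Python 3.7 or later and that the three tools are on the PATH; the helper names `run_command` and `build_report` are made up for illustration:

```python
import subprocess

# The three commands asked for in the log above.
COMMANDS = [["dmesg"], ["lsmod"], ["lspci", "-k"]]

def run_command(argv):
    """Run one command and return its output, or a note if it cannot be started."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError as exc:  # for example, the tool is not installed
        return f"(could not run: {exc})\n"
    return result.stdout + result.stderr

def build_report(commands=COMMANDS, runner=run_command):
    """Concatenate the output of each command under a header naming it."""
    sections = []
    for argv in commands:
        sections.append(f"===== {' '.join(argv)} =====\n{runner(argv)}")
    return "\n".join(sections)
```

Calling `print(build_report())` and redirecting to a file gives a single paste. On some systems `dmesg` is restricted to root, in which case its section will contain the error text instead of the kernel log.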
04:16 PM andypugh: Yes, though at the moment (for fun) I am compiling a vanilla 4.9.80 kernel
04:16 PM andypugh: But I can interrupt that
04:17 PM memfrob: No that's fine, go ahead.
04:17 PM memfrob: With the current 3.16.52 RTAI kernel you have running right now, I know it seems a bit silly but do you have a second machine with a way to read off the serial port of the trouble system?
04:18 PM memfrob: If not (or too much work) you can try my suggestion yesterday of simply loading rtai_hal.ko and rtai_sched.ko and run dmesg in time to see if you get any errors.
04:18 PM andypugh: I tried that, no errors
04:19 PM andypugh: Give me a moment
04:19 PM memfrob: This is the first time I've seen someone have trouble with my RTAI tree, but no trouble on the live CD.
04:19 PM andypugh: Ah, can’t do what I thought, as the machine is currently booted from the preempt-rt kernel
04:20 PM memfrob: You can send the output of those commands through any kernel as long as it supports all your hardware.
04:20 PM andypugh: OK
04:22 PM andypugh: https://pastebin.com/LJerVsjJ
04:22 PM andypugh: It’s a Core i3 cpu, if that matters
04:24 PM andypugh: Interestingly latency is 17k servo 18k base with LiveCD Wheezy RTAI and 10k / 6k with Stretch preempt RT.
04:24 PM memfrob: Did you apply all 7 patches against 3.16.52 ?
04:25 PM andypugh: Yes
04:25 PM andypugh: (And all took cleanly)
04:26 PM memfrob: OK I'll go through it all.
04:26 PM andypugh: The kernel appears OK in normal use now, but things go pretty wrong when I run RTAI
04:29 PM memfrob: Hmm, the kernel is perfectly stable until the RTAI modules load?
04:30 PM andypugh: Hard to say. I haven’t tried to do much other than run RTAI
04:30 PM memfrob: Oh, CPU group scheduler is on and a few other things.
04:30 PM andypugh: And it seems stable with rtai_hal and rtai_sched loaded (but probably not doing anything)
04:31 PM memfrob: They load???
04:31 PM andypugh: Seem to
04:31 PM memfrob: With no problems? Can you post dmesg output after you loaded rtai_hal and rtai_sched under the 3.16.52 RTAI kernel?
04:32 PM andypugh: OK. Let me stop this compile and boot into the 3.16 kernel
04:34 PM memfrob: Watchdog is on too.. I missed a few things.
04:34 PM memfrob: I think this new config will fix it all.
04:35 PM memfrob: Is it possible for you to turn off EDAC?
04:35 PM memfrob: (in BIOS)
04:36 PM andypugh: https://pastebin.com/ZC0VzgZL
04:37 PM memfrob: WARNING: CPU: 0 PID: 0 at kernel/sched/idle.c:175 cpu_startup_entry+0x52f/0x540()
04:37 PM memfrob: Tickless idle was on too.
04:38 PM memfrob: So yeah, I missed a few things. The debian config needs several options throughout disabled or toggled to different options.
04:38 PM andypugh: I looked in the BIOS (it’s a fancy one with a graphical interface and search) and “edac” “eda” and “dac” all come up blank
04:39 PM memfrob: ok.
04:39 PM andypugh: Is it a problem that /proc/cpuinfo says 2 CPUs and RTAI is configured for 4 ?
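One possible explanation for the 2 versus 4 discrepancy, offered as an assumption rather than a diagnosis: the `cpu cores` field in `/proc/cpuinfo` counts physical cores per package, while the number of `processor` entries counts logical CPUs, and with hyper-threading enabled a two-core part shows four logical CPUs. The paste above only shows the `processor : 0` entry, so the full file would need checking. A small sketch that reports both numbers (the function name is made up for illustration):

```python
def summarize_cpuinfo(text):
    """Return (logical_cpus, cores_per_package) from /proc/cpuinfo-style text.

    logical_cpus is the number of "processor" entries; cores_per_package is
    the first "cpu cores" value found, or None if the field is absent.
    """
    logical = 0
    cores = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor blocks
        key = key.strip()
        if key == "processor":
            logical += 1
        elif key == "cpu cores" and cores is None:
            cores = int(value.strip())
    return logical, cores
```

On the machine itself this could be run as `summarize_cpuinfo(open("/proc/cpuinfo").read())`. If it reports four logical CPUs and two cores, the `CPUs 4` in the RTAI SYSINFO line would simply be the logical count.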
04:41 PM andypugh: Anything else to look at in the BIOS while I am here? Virtualisation / Power states / AHCI ?
04:41 PM memfrob: Virtualization and power states should all be off. AHCI is fine.
04:41 PM memfrob: ACPI is another story :P
04:43 PM andypugh: I can try turning that off or on (or off and then on again :-)
04:46 PM andypugh: ACPI is already off, as far as I can see
05:03 PM memfrob: Ok, this is the new generic config: https://pastebin.com/EY6tP0nw
05:04 PM memfrob: Needs initrd -- not tuned for your hardware but this one should work on all systems.
05:06 PM andypugh: When you say “needs initrd” what do you mean? Would I notice if the previous attempt lacked it?
05:07 PM memfrob: When a Linux kernel (the bzImage itself) does not have your specific ATA/PATA/SATA/IDE controller built-in (Y) but rather as a module (M) you'll get a kernel panic saying that it can't find your root device.
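The built-in (Y) versus module (M) distinction described above can be checked directly in a kernel `.config` file. A minimal sketch, assuming the standard `.config` line formats (`CONFIG_FOO=y`, `CONFIG_FOO=m`, `# CONFIG_FOO is not set`); the helper names are made up, and which options matter for booting depends on the machine's actual storage controller and root filesystem:

```python
def option_state(config_text, option):
    """Return the value of a CONFIG_ option ('y', 'm', ...) or 'n' if unset or absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return "n"
    return "n"

def modular_boot_options(config_text, boot_options):
    """List the boot-critical options that are built as modules.

    If this list is non-empty, the kernel needs an initial ramdisk
    carrying those modules in order to find the root device.
    """
    return [opt for opt in boot_options if option_state(config_text, opt) == "m"]
```

For example, passing the contents of `.config` with `["CONFIG_ATA", "CONFIG_SATA_AHCI"]` (an illustrative pair, not a complete list) returns whichever of the two are set to `m`.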
05:08 PM andypugh: OK. I would have noticed that :-)
05:08 PM memfrob: Considering you already have a 3.16.52 kernel booted with RTAI enabled with those drivers as modules, you made an initial ramdisk to go with it.
05:08 PM memfrob: I just want to make it clear when I send configs because I've had customers in the past say my kernel didn't boot because of a step they missed on their end.
05:08 PM andypugh: Did you add -RTAI in the config? I see it after loading your config and I am wondering if I am using your config or have loaded the old one
05:09 PM memfrob: I added it.
05:09 PM andypugh: OK
05:09 PM memfrob: You won't need the nomodeset line btw, I disabled nouveau to fix that, and it also allowed me to disable WMI.
05:10 PM memfrob: WMI has a mind of its own and can cause problems similar to the one you're having.
05:10 PM memfrob: Radeon and Intel DRM drivers only.
05:10 PM memfrob: (and displaylink if anyone is using those USB monitors)
05:11 PM memfrob: If it still doesn't work, I'll go through making a system-tuned kernel config from scratch, just for you.
05:12 PM andypugh: just checking, when I exit menuconfig is your config saved to .config, or does “make” remember that it is using your config
05:12 PM andypugh: memfrob: The aim here is to have a generic kernel for an alternative ISO
05:12 PM memfrob: It is best to save it manually to .config on top of whatever it wants to do on its own.
05:12 PM andypugh: If we need a custom-tuned one, that’s not worth bothering with
05:13 PM memfrob: Yes, I understand that. But if your system, personally, doesn't work with my RTAI tree, I need to know, because to my knowledge, my RTAI tree is rock solid stable.
05:14 PM andypugh: OK, so the custom config would be to prove that my PC system is to blame?
05:14 PM andypugh: Compiling now, I will give you the news in an hour or more
05:14 PM memfrob: Yes, but I sent you the generic one.
05:15 PM memfrob: Since you needed the nomodeset line for your system, I disabled nouveau in case others have nvidia cards too that have problems with 3.16.52
05:15 PM andypugh: Is this a concern? Just scrolled past
05:15 PM andypugh: kernel/sched/core.c: In function ‘context_switch’:
05:15 PM andypugh: kernel/sched/core.c:2414:3: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
05:15 PM andypugh: if (unlikely(__ipipe_switch_tail()))
05:16 PM memfrob: Support more hardware by disabling hardware in config, sounds counter-intuitive..
05:16 PM memfrob: No, that's just GCC throwing a warning about indentation.
05:17 PM memfrob: Will be back in 15 min, let me know how it goes!
05:17 PM andypugh: It will take more than 15 mins!
06:36 PM andypugh: The kernel has compiled.
06:36 PM andypugh: I recompiled RTAI, but still seem to get:
06:36 PM andypugh: insmod: ERROR: could not insert module /usr/realtime/modules/rtai_hal.ko: Invalid module format
06:44 PM andypugh: OK, I seem to have fixed that (repeated make install and make modules_install)
06:47 PM andypugh: (incidentally, I still need “nomodeset”)
06:59 PM andypugh: Maybe there is a clue here?
06:59 PM andypugh: I logged in to the machine with ssh and set up a tail -f on the kernel log.
06:59 PM memfrob: you need to rebuild RTAI against your new kernel
06:59 PM memfrob: sorry I was afk.
06:59 PM andypugh: Then I started the testsuite on the actual PC
06:59 PM andypugh: https://pastebin.com/TMDbxGHK
07:00 PM memfrob: RTAI modules (rtai_hal.ko etc) must always be in line with your RTAI kernel. You can't rebuild the kernel and not RTAI otherwise you'll get all kinds of errors.
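An "Invalid module format" error from insmod usually points to a module built against a different kernel than the one running. One quick check is to compare the module's `vermagic` string (from `modinfo -F vermagic`) with `uname -r`. A minimal sketch, assuming both tools are installed; the helper names are made up, and the version strings in the usage note are example values only. A matching release is necessary but not sufficient, since the other vermagic flags (SMP, modversions and so on) must also agree:

```python
import subprocess

def vermagic_release(vermagic):
    """The kernel release is the first whitespace-separated field of vermagic."""
    fields = vermagic.split()
    return fields[0] if fields else ""

def module_matches_kernel(vermagic, kernel_release):
    """True if the module was built for the given kernel release string."""
    return vermagic_release(vermagic) == kernel_release

def check_module(path):
    """Compare a module file on disk against the running kernel."""
    vermagic = subprocess.run(
        ["modinfo", "-F", "vermagic", path],
        capture_output=True, text=True, check=True,
    ).stdout
    release = subprocess.run(
        ["uname", "-r"], capture_output=True, text=True, check=True,
    ).stdout.strip()
    return module_matches_kernel(vermagic, release)
```

For instance, `module_matches_kernel("3.16.52-RTAI SMP mod_unload modversions", "3.16.52-RTAI")` is true, while the same vermagic against a stock Debian release string would be false. On the affected machine, `check_module("/usr/realtime/modules/rtai_hal.ko")` would run the comparison against the installed module.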
07:01 PM andypugh: I have done that
07:01 PM memfrob: Ok, did the testsuite work?
07:01 PM andypugh: (RTAI) make clean, make menuconfig make make install
07:01 PM memfrob: Sounds right.
07:02 PM andypugh: The testsuite seems to freeze, then when I Ctrl-C that terminal it says something about CANNOT FIND MAILBOX
07:03 PM andypugh: Then some info about the calibration testsuite
07:03 PM andypugh: and nothing after # timer mode is oneshot
07:04 PM memfrob: CANNOT FIND MAILBOX .. are you loading rtai_hal.ko and rtai_sched.ko by hand before trying to run the testsuite?
07:05 PM memfrob: The testsuite should handle all of the module loading by itself.
07:08 PM andypugh: I just tried booting with nolapic, but the result is the same
07:09 PM memfrob: Because you need nomodeset I'm not sure you built the kernel right as this is an nvidia card.
07:09 PM andypugh: No, just sudo bash /usr/realtime/testsuite/run
07:10 PM andypugh: Well, the kernel has certainly built differently, as it now uncompresses the kernel whereas it mentioned ramdisk before
07:11 PM memfrob: I thought the system you sent me the info about has nvidia graphics, but I don't see any nvidia card in lspci.
07:11 PM andypugh: But it is more than likely that I have messed up somewhere
07:12 PM andypugh: Anyway, thanks for your efforts
07:12 PM andypugh: But I think it is time for me to accept defeat for tonight
07:13 PM memfrob: Ok, and because the decompress message came up, it is in fact the new kernel.
07:13 PM memfrob: CANNOT FIND MAILBOX means the testsuite isn't loading the modules properly which means there's probably a mismatch somewhere.
07:14 PM andypugh: Apr 26 01:04:30 RM-one kernel: RTAI[sched]: timer setup = 69 ns, resched latency = 3873 ns.
07:14 PM andypugh: Apr 26 01:04:30 RM-one kernel: LXRT releases PID 1047 (ID: display).
07:14 PM andypugh: packet_write_wait: Connection to fe80::42a5:efff:fe05:a291%en1 port 22: Broken pipe
07:15 PM andypugh: LXRT?
07:15 PM memfrob: That message is fine.
07:15 PM memfrob: What does bash --version say?
07:16 PM andypugh: GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
07:16 PM memfrob: Ok, just double checking.
07:17 PM memfrob: I re-wrote all the ancient 200+ line RTAI scripts with features from either Bash 4.4 or 4.3
07:17 PM andypugh: After trying to run the testsuite the machine still accepts mouse movement and clicks, but very sluggishly. I can bring up the boot menu, but restart just freezes and I have to use the power button.
07:17 PM memfrob: Oh nevermind, the only line that does any of that is the setsmi script.
07:19 PM memfrob: Well tomorrow I'll send you a suggestion to try and maybe we can see why it's failing then.
07:19 PM andypugh: Maybe I should run the dmesg tail on the same PC, maybe networking goes down first?
07:24 PM andypugh: I actually see slightly less on the actual machine, (Don’t see the LXRT message)
07:26 PM memfrob: Because it can't find the mailbox, something else is wrong
07:26 PM andypugh: I didn’t see that message this time…
07:27 PM memfrob: Because it's inconsistent that's also a problem.
07:29 PM memfrob: I have no idea what's going on, honestly. The only way to find out would be to load the testsuite by hand, without the scripts, and see exactly what's happening.
07:30 PM andypugh: I can try that, but tomorrow
07:31 PM memfrob: Sounds good, because it's almost acting like the real-time code itself is what is killing it, and the testsuite itself isn't broken.
07:31 PM memfrob: Generally with crashes, it crashes when you load rtai_sched -- I've never seen a case of this before.
07:31 PM andypugh: Well, at least it isn’t a boring problem :-)
07:32 PM memfrob: Yeah it's actually quite interesting. Well, good night andypugh!
07:32 PM andypugh: Goodnight