#linuxcnc-devel Logs
Jun 30 2020
08:10 AM rmu|w: what is the opinion on non-ASCII characters in filenames and comments of g-code files? names of tools in tool tables? currently, characters with 8th bit set trigger all kinds of errors
08:14 AM rmu|w: this is probably something to fix after python2->python3 transition
03:04 PM andypugh: rmu|w: I think if the OS supports a user’s language then we should try to. But I can see that our interpreter probably can’t handle unicode G-code…
03:08 PM andypugh: I am mainly asking jepler this, but the question goes to anyone familiar with C:
03:08 PM andypugh: Does this loop in RTAI do something that I am missing?
03:08 PM andypugh: http://svn.savannah.gnu.org/viewvc/rtai/vulcano/base/sched/sched.c?revision=107&view=markup
03:08 PM andypugh: Line 2581
03:09 PM andypugh: Maybe it is some linked-list magic?
03:11 PM jepler: andypugh: I am surprised that most of the operations in that loop are on a single piece of data, not one for each "CPU"
03:12 PM andypugh: As far as I can see all it really does is rt_linux_task.runnable_on_cpus = RTAI_NR_CPUS…
03:13 PM jepler: When the loop is all done, the structure is initialized. Notably, rt_linux_task.runnable_on_cpus = cpuid will be the highest cpuid in the system.
03:14 PM jepler: and a couple of global(?) arrays have a bunch of their elements pointed at this single rt_linux_task structure: rt_smp_current[cpuid] = &rt_linux_task; rt_smp_fpu_task[cpuid] = &rt_linux_task;
03:14 PM andypugh: Assuming that RTAI_NR_CPUS is true.
03:15 PM jepler: yeah surely it must be
03:15 PM andypugh: Is there any chance it comes from the menuconfig?
03:15 PM jepler: "We're inserting greater technology for greater capability"
03:15 PM andypugh: Need to track it down, I guess
03:15 PM jepler: RTAI_NR_CPUS?
03:16 PM andypugh: Yes.
03:17 PM andypugh: Probably is: #define RTAI_NR_CPUS CONFIG_RTAI_CPUS
03:17 PM jepler: #ifdef CONFIG_SMP
03:17 PM jepler: #define RTAI_NR_CPUS CONFIG_RTAI_CPUS
03:19 PM andypugh: Now, Alec says that should be _at least_ as many CPUS as there are, and Paolo says that it should be _exactly_ the number (and with hyperthreading off)
03:19 PM andypugh: I have been configuring with 8 (on Alec’s advice)
03:23 PM andypugh: I don’t know if this matters, I am still trying to understand the code. But I am definitely seeing the system crash at or around the time that rtai_sched loads.
03:23 PM jepler: this is with the vulcano tree?
03:23 PM andypugh: Yes.
03:23 PM andypugh: (That loop is the same in NTULINUX though)
03:24 PM jepler: base/include/rtai_schedcore.h:#define rt_linux_task (rt_smp_linux_task[cpuid])
03:24 PM jepler: ahah here's the thing about that loop, there's indexing of a global array hidden by a macro
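The point about the macro can be sketched in a few lines of C. This is a simplified mock-up, not the RTAI source: the struct layout, the value 4 for RTAI_NR_CPUS, and the helper names init_linux_tasks and check_linux_tasks are assumptions for illustration. Only the macro definition and the assignments inside the loop follow the lines quoted in the chat.

```c
#include <assert.h>

/* Assumption: in RTAI this comes from CONFIG_RTAI_CPUS; 4 is arbitrary here. */
#define RTAI_NR_CPUS 4

/* Hypothetical, heavily reduced task structure. */
struct task {
    int runnable_on_cpus;
    int state;
};

static struct task rt_smp_linux_task[RTAI_NR_CPUS];
static struct task *rt_smp_current[RTAI_NR_CPUS];
static struct task *rt_smp_fpu_task[RTAI_NR_CPUS];

/* As quoted from rtai_schedcore.h: the macro silently picks up whatever
 * variable named cpuid is in scope at the point of use. */
#define rt_linux_task (rt_smp_linux_task[cpuid])

/* Looks like it writes one structure over and over, but each pass
 * actually touches a different element of rt_smp_linux_task[]. */
static void init_linux_tasks(void)
{
    int cpuid;

    for (cpuid = 0; cpuid < RTAI_NR_CPUS; cpuid++) {
        rt_linux_task.runnable_on_cpus = cpuid;
        rt_linux_task.state = 1; /* placeholder for a "ready" state */
        rt_smp_current[cpuid] = &rt_linux_task;
        rt_smp_fpu_task[cpuid] = &rt_linux_task;
    }
}

/* Sanity check: every per-CPU pointer refers to its own array slot. */
static void check_linux_tasks(void)
{
    int cpuid;

    for (cpuid = 0; cpuid < RTAI_NR_CPUS; cpuid++) {
        assert(rt_smp_current[cpuid] == &rt_smp_linux_task[cpuid]);
        assert(rt_smp_fpu_task[cpuid] == &rt_smp_linux_task[cpuid]);
        assert(rt_linux_task.runnable_on_cpus == cpuid);
    }
}
```

Calling init_linux_tasks() followed by check_linux_tasks() from a test program passes, which shows that the loop initializes one structure per configured CPU rather than rewriting a single one. It also means a compiler could not hoist the assignments out of the loop, since the target changes on every pass.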
03:25 PM rmu|w: andypugh: would be interesting to look at generated assembly. optimizer will move rt_linux_task initialization that repeats out of the loop if there is really no side-effect
03:26 PM andypugh: There is a function rtai_cpuid(). I wonder what that returns?
03:27 PM rmu|w: regarding the unicode stuff, that is mostly a python problem. interpreter should not care about 8bit characters in comments.
03:27 PM andypugh: Do you have a reliable way to grep for a function definition rather than use?
03:27 PM jepler: no, grep + secondary human filtering
03:28 PM jepler: rtai_cpuid probably returns the ID of the CPU that the code was most recently running on.
03:29 PM andypugh: Ah, not a function anyway: #define rtai_cpuid() hal_processor_id()
03:30 PM andypugh: Ah, no, that’s ARM, x86 uses #define rtai_cpuid() ipipe_processor_id()
03:31 PM rmu|w: https://github.com/MaskRay/ccls https://github.com/MaskRay/emacs-ccls works pretty well IME ("go to definition")
03:32 PM andypugh: But I am using Geany :-) (Geany can “go to definition” too, but only looks in open files)
03:34 PM rmu|w: ccls implements the language server protocol (from microsoft?), maybe geany supports that?
03:36 PM rmu|w: ccls uses an actual c-compiler (llvm) to parse the code, you can feed it actual compiler arguments so all symbols etc. are identical to what the real compiler sees.
03:37 PM andypugh: I am probably allowing myself to get distracted. It is probably not ideal that task->runnable_on_cpus != ipipe_processor_id() ever if the menuconfig over-estimates the number of CPUs. But if that was a problem it wouldn’t be this subtle a problem.
03:37 PM jepler: if rt_smp_linux_task[999] is runnable only on CPU 999 it probably doesn't matter much
03:38 PM jepler: nor should it ever run
03:38 PM jepler: but yeah in general we need for things to still work properly if the number of max CPUs configured is not equal to the number of present CPUs.
03:38 PM jepler: the rest of linux does
03:40 PM rmu|w: hmm. i should read all the backlog before writing something... forget the stuff with looking at asm code. i hate macros.
03:50 PM andypugh: jepler: Any ideas of ways to make sure that a printk makes it to the log? As I am monitoring via ssh I suspect that execution and logging are not synchronous. Maybe some strategic pauses would be informative.
03:52 PM jepler: you mean, assuming you're going to crash the whole computer in a few microseconds? ugh, good luck
03:53 PM andypugh: Yes, exactly that..
03:54 PM andypugh: I have filled __rtai_lxrt_init with printk() and see a varying number of them in the log.
03:55 PM andypugh: I have two theories. 1) That logging is inexact 2) That a realtime thread is starting early and calling a null pointer…
03:55 PM rmu|w: andypugh: https://www.kernel.org/doc/html/latest/networking/netconsole.html
03:57 PM andypugh: I am doing something a _bit_ like that, as I have a remote ssh terminal running tail -f /var/log/kern.log
03:58 PM andypugh: Which, interestingly, displays stuff that is not in the kern.log file after reboot.
04:00 PM andypugh: I guess I _could_ run an Ethernet cable from the Mac, up the stairs and to the Linux test machine.
04:01 PM andypugh: (I do have one long enough)
04:02 PM rmu|w: netconsole doesn't go through userspace AFAIK, so could continue to work while userspace is completely dead
04:04 PM andypugh: I am fairly sure that it isn’t just userspace that is dead with these crashes
05:15 PM andypugh: rmu|w: I don’t seem to get anything through netconsole
05:15 PM andypugh: I do get: iMac:~ andypugh$ nc -lu 6666
05:15 PM andypugh: [ 1180.586355] netconsole-setup: Test log message to verify netconsole configuration.
05:15 PM andypugh: So the link is there, but nothing is sent to it.
05:18 PM MarkusBec_ is now known as MarkusBec
05:47 PM andypugh: answer: sudo dmesg -n 8
06:07 PM andypugh: jepler: Did you find any answer to the Pi kernel installation “unable to make backup link of './boot/System.map-4.19.71-rt24-v7l+' before installing new version: Operation not permitted” problem?
08:44 PM jepler: andypugh: last time I suggested that this command let me complete an upgrade from one version of my kernel to another: dpkg-divert --list | grep rpikernelhack | while read _ _ a _ _ _ _; do dpkg-divert --package rpikernelhack --no-rename --remove $a; done
08:45 PM jepler: and said it appeared that something could leave the diverts around but I have not discovered what does that besides interrupting a package installation.
08:45 PM jepler: after running that command, re-run the installation of the new kernel
08:45 PM jepler: as root