#linuxcnc-devel Logs
Jun 03 2019
12:13 AM CaptHindsight: meant nano not micro
12:10 PM seb_kuzminsky: memfrob: awesome!
12:10 PM seb_kuzminsky: i merged the yapps2 thing into 2.8, and merged 2.8 into master
01:02 PM seb_kuzminsky: oops, there's a missing build dependency on python-yapps, i'll add that
01:50 PM memfrob: Wow, so changing the core from 1 to 0 really hurts latency, over a million nanosecs right off the bat with this current setup, move it to CPU 1 and it goes away
01:56 PM memfrob: I'm not really good at C programming so this code doesn't compile, and I have absolutely no idea why: http://dpaste.com/079KMJB
02:02 PM memfrob: So instead, I need duplicating code: http://dpaste.com/1YKYH7F Final snippet: http://dpaste.com/1Z7F2HG
02:05 PM memfrob: What's the best way to do this so this can get merged upstream? Two instances of this statement: if(rt_cpu_number != -1) is the only way it builds.
02:08 PM rmu: memfrob: what are you trying to accomplish. linuxcnc "detects" isolcpus just fine as it is.
02:08 PM memfrob: rmu, I'm trying to get LinuxCNC to run on the last core, not the first one.
02:08 PM rmu: memfrob: use isolcpus=3
02:09 PM rmu: (in case of quadcore)
02:09 PM memfrob: rmu: LinuxCNC does not detect if isolcpus=3 is set and will not use 3, it will use 0.
02:12 PM memfrob: Although that's actually not a bad idea if LinuxCNC could grep the kernel command line somehow or you could specify a core on the ./configure line. I was told that there was a discussion in #linuxcnc about how LinuxCNC will already use the last core, however if that is the intended behavior, it is not correct.
02:12 PM rmu: memfrob: are you sure? did you inspect with e.g. taskset or have a look in /proc
02:13 PM memfrob: Yes, I am sure.
02:19 PM rmu: *searching my system*
02:20 PM rmu: just a moment
02:24 PM memfrob: All of the logic for finding the core is in find_rt_cpu_number(); and there is nothing in there about isolcpus
02:25 PM rmu: no, but isolcpus affects what getaffinity returns
02:26 PM memfrob: http://man7.org/linux/man-pages/man2/sched_setaffinity.2.html
02:26 PM memfrob: "the only way to schedule processes onto the isolated CPUs is via sched_setaffinity() or the cpuset(7) mechanism."
02:29 PM memfrob: CPU_SET(rt_cpu_number, &cpuset); -> const static int rt_cpu_number = find_rt_cpu_number(); -> sched_setaffinity and sched_getaffinity functions
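The get-affinity, pick-a-CPU, set-affinity chain described above can be sketched in a few lines. This is only an illustration of the mechanism and does not claim to reproduce LinuxCNC's find_rt_cpu_number(); the helper names pick_rt_cpu and pin_current_thread and the "highest allowed CPU" policy are assumptions made for the example. Note that with isolcpus set, a process's default mask normally excludes the isolated CPUs, so a real implementation would have to request them explicitly, as the man page quote says.

```python
import os


def pick_rt_cpu(allowed):
    """Return the highest-numbered CPU in `allowed`, or 0 if the set is empty.

    Hypothetical policy for illustration; not taken from LinuxCNC.
    """
    return max(allowed, default=0)


def pin_current_thread(cpu):
    """Restrict the calling thread to one CPU and return the resulting mask."""
    os.sched_setaffinity(0, {cpu})      # pid 0 means "the calling thread"
    return os.sched_getaffinity(0)


if __name__ == "__main__" and hasattr(os, "sched_getaffinity"):
    allowed = os.sched_getaffinity(0)   # CPUs this thread may currently use
    cpu = pick_rt_cpu(allowed)
    print("allowed:", sorted(allowed), "-> pinning to", cpu)
    print("now allowed:", sorted(pin_current_thread(cpu)))
```

os.sched_getaffinity and os.sched_setaffinity are thin wrappers around the Linux syscalls of the same name, so the sketch only runs on Linux.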
02:30 PM memfrob: The latency is horrible on core 0 but great on the isolated CPUs. I have CPU 0 for kernel timekeeping as you cannot use that core for nohz_full, so when core 0 is used, I can tell right away by the latency.
02:31 PM memfrob: I can do this a million times over again, go from hard-coding it to anywhere but 0, and then reverting my changes to mainline LinuxCNC, and my change will break latency at worst to 15 microseconds, whenever I use the tree as-is, latency is horrible.
02:31 PM memfrob: /s/break/bring
02:31 PM rmu: hmm
02:32 PM rmu: maybe i patched exactly that on my raspberry pi
02:32 PM memfrob: My current kernel command line is nohz_full=1,2,3 isolcpus=1,2,3 rcu_nocbs=1,2,3 idle=poll
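The earlier idea of reading the isolated CPUs from the kernel command line could look roughly like the sketch below. It is a sketch under assumptions: the function names are made up for the example, only plain CPU numbers and ranges such as 1-3 are understood, and any other token inside the isolcpus value is silently ignored.

```python
def parse_cpu_list(text):
    """Parse a CPU list such as "1,2,3" or "0-2,5" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-", 1)
            if lo.isdigit() and hi.isdigit():
                cpus.update(range(int(lo), int(hi) + 1))
        elif part.isdigit():
            cpus.add(int(part))
        # anything else (for example a flag word) is ignored in this sketch
    return cpus


def isolated_cpus(cmdline):
    """Return the CPUs named by the first isolcpus= token, or an empty set."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == "isolcpus" and sep:
            return parse_cpu_list(value)
    return set()


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as f:
            print("isolated CPUs:", sorted(isolated_cpus(f.read())))
    except OSError:
        print("/proc/cmdline is not readable on this system")
```

Called on the command line quoted above, isolated_cpus() yields CPUs 1, 2 and 3.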
02:33 PM memfrob: cores 1-3 work amazing, core 0 is broken for real-time.
02:33 PM rmu: so your find_rt_cpu_number returns 0 even in case you set isolcpus?
02:40 PM memfrob: This is weird.
02:41 PM memfrob: I applied this patch to vanilla linuxcnc: http://dpaste.com/2FKDYNQ and it says CPU 3, but if I change it to CPU 3 myself, latency is fine.
02:43 PM memfrob: Hold on..
02:44 PM rmu: hmm
02:44 PM memfrob: I think my system is in a weird state now.
02:45 PM rmu: i'm also in a weird state now because all my realtime test systems booted a non-realtime-kernel
02:45 PM rmu: strange
02:45 PM rmu: *blaming unattended-upgrade*
02:47 PM memfrob: All my cores are going nuts, need to restart I guess.
02:49 PM rmu: if you specify isolcpus=1,2,3, everything will run on core 0
02:49 PM rmu: i would only isolate core 3 for realtime stuff
02:49 PM rmu: that is enough
02:50 PM rmu: usually you don't need to run linuxcnc realtime threads in parallel
02:50 PM memfrob: Note: Using POSIX non-realtime -- If you're using PREEMPT_RT it shouldn't say that, Maybe recompiling linuxcnc over and over again without running make clean finally broke the tree lol
02:52 PM rmu: perhaps you booted into the wrong kernel? or you need to execute "make setuid"?
02:52 PM memfrob: Same kernel, trying this again.
02:53 PM rmu: so i just checked on my pi test system with preempt rt, 4 cores, isolcpus=3, rtapi properly sets cpu affinity to core 3
02:53 PM memfrob: Ok I got it all back.
02:53 PM rmu: for realtime threads
02:54 PM CaptHindsight: last night having scheduling on core 0 and real time threads on another core (1-3) worked best, but this is all WIP
02:55 PM rmu: conventional wisdom is to run realtime stuff on one core and not on core 0
02:56 PM memfrob: pid 10864's (rtapi_app) current affinity list: 0-3
02:56 PM memfrob: my new kernel command line is nohz_full=3 isolcpus=3 rcu_nocbs=3
02:56 PM rmu: memfrob: get the pid of the realtime thread
02:56 PM CaptHindsight: rmu: just saying what the test results were showing
02:56 PM memfrob: what's that called?
02:57 PM rmu: e.g. look in /proc/10864/task
02:57 PM memfrob: 10864 10865 10867 10868
02:58 PM rmu: check those pids with taskset
03:01 PM memfrob: I restarted the latency test. ls /proc/11448/task -> pid 11449's current affinity list: 0-2 -> pid 11451's current affinity list: 3 -> pid 11452's current affinity list: 3
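The manual check above (list /proc/PID/task, then run taskset on each thread ID) can be scripted. The following is a minimal sketch, assuming Linux and Python's os.sched_getaffinity, which accepts a thread ID as well as a process ID; the helper names are invented for the example and the output format only imitates taskset's wording.

```python
import os


def format_cpu_list(cpus):
    """Format a set of CPU numbers as a compact list, e.g. {0, 1, 2} -> "0-2"."""
    ordered = sorted(cpus)
    ranges = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the current run while the next CPU number is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        ranges.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(ranges)


def thread_affinities(pid):
    """Map each thread ID of `pid` to the set of CPUs it is allowed to run on."""
    result = {}
    for name in os.listdir(f"/proc/{pid}/task"):
        tid = int(name)
        try:
            result[tid] = os.sched_getaffinity(tid)
        except OSError:
            continue  # the thread exited between listing and querying
    return result


if __name__ == "__main__" and os.path.isdir("/proc/self/task"):
    for tid, cpus in sorted(thread_affinities(os.getpid()).items()):
        print(f"pid {tid}'s current affinity list: {format_cpu_list(cpus)}")
```

To inspect a running rtapi_app, pass its PID to thread_affinities() instead of os.getpid().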
03:01 PM memfrob: vanilla linuxcnc. so maybe having isolcpus=1,2,3 broke the logic.
03:01 PM memfrob: time to test that now.
03:02 PM rmu: then it should be 0/3/3
03:05 PM rmu: ps -eaF
03:05 PM memfrob: Alright now the vanilla tree works as-is. It wasn't doing this last night at all.
03:06 PM memfrob: I spent hours testing it too.
03:06 PM rmu: column PSR shows core executing the thread
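The same "which core last ran this thread" value that ps shows as PSR is also available as field 39 (processor) of /proc/PID/task/TID/stat, according to proc(5). A small sketch for reading it follows; the function names are assumptions for the example, and it shows where the thread last ran, not where it is allowed to run.

```python
import os


def last_cpu_from_stat(stat_line):
    """Extract field 39 ("processor") from one line of a /proc stat file."""
    # The command name (field 2) is in parentheses and may contain spaces,
    # so split on the last ")" before tokenising the remaining fields.
    rest = stat_line.rsplit(")", 1)[1].split()
    return int(rest[36])  # rest[0] is field 3, so field 39 is rest[36]


def threads_last_cpu(pid):
    """Map each thread ID of `pid` to the CPU it last executed on."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for name in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{name}/stat") as f:
                result[int(name)] = last_cpu_from_stat(f.read())
        except OSError:
            continue  # the thread exited while we were reading
    return result


if __name__ == "__main__" and os.path.isdir("/proc/self/task"):
    for tid, cpu in sorted(threads_last_cpu(os.getpid()).items()):
        print(f"tid {tid} last ran on CPU {cpu}")
```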
03:08 PM rmu: so whatever the issue was it seems it was something else ;)
03:13 PM memfrob: Yeah and if it's ever running on the wrong core again for whatever reason, I know how to fix it.
03:14 PM memfrob: Well thank you rmu for helping me debug. With your raspberry pi, what were you referring to?
03:15 PM rmu: it's one of my linuxcnc test systems
03:15 PM rmu: the only one that is currently powered and has a known ip address...
03:15 PM memfrob: I meant in terms of my patch, you noticed something funny with the core logic too?
03:17 PM rmu: i was in this particular rabbit hole myself, but after some tinkering around i found out that stock linuxcnc rtapi uspace code is fine as it is
03:17 PM rmu: it is a bit fuzzy, that was 2 years ago
03:18 PM CaptHindsight: maybe trying to change core affinity without rebooting gets wacky
03:20 PM mozmck: Here is the script to find the last core and any cores which share cache with it: ISOLCPUS=`lscpu -p | tac | awk -F, '{ if(FNR==1) {lastcore=$2; cpu=$1; cpu0=$1} else if ($2==lastcore && cpu0>1) {cpu = $1 "," cpu} } END {if(cpu0>0) {print "isolcpus=" cpu} }'`
03:20 PM mozmck: Then this to add it to the grub command line: sed -i "/^GRUB_CMDLINE_LINUX_DEFAULT=/ s/\"$/ $ISOLCPUS\"/" /etc/default/grub
03:22 PM mozmck: I have never seen a problem with linuxcnc not putting the realtime stuff on the last core. I've used isolcpus a lot - but haven't tried rcu_nocbs. I'll have to look into that more.
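For readers who find the awk one-liner above hard to follow, here is a sketch of the same logic written out step by step. It is a reading of the pipeline, not mozmck's code: the function name is made up, and it assumes the default `lscpu -p` layout in which column 1 is the logical CPU and column 2 is the core ID, so "sharing" here means logical CPUs on the same physical core as the last CPU.

```python
def isolcpus_from_lscpu(lscpu_text):
    """Build an "isolcpus=..." string from `lscpu -p` output.

    Mirrors the awk script: take the last listed CPU, add every other CPU
    with the same core ID (only when the last CPU number is above 1), and
    return an empty string when the last CPU is CPU 0.
    """
    rows = []
    for line in lscpu_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split(",")
        rows.append((int(fields[0]), fields[1]))
    if not rows:
        return ""
    last_cpu, last_core = rows[-1]
    if last_cpu <= 0:
        return ""  # single-CPU machine: nothing sensible to isolate
    cpus = [last_cpu]
    if last_cpu > 1:
        cpus = [cpu for cpu, core in rows if core == last_core]
    return "isolcpus=" + ",".join(str(c) for c in cpus)


if __name__ == "__main__":
    sample = "\n".join([
        "# CPU,Core,Socket,Node,,L1d,L1i,L2,L3",
        "0,0,0,0,,0,0,0,0",
        "1,1,0,0,,1,1,1,0",
        "2,0,0,0,,0,0,0,0",
        "3,1,0,0,,1,1,1,0",
    ])
    print(isolcpus_from_lscpu(sample))
```

The sample describes a hypothetical two-core machine with two logical CPUs per core, where CPUs 1 and 3 share the last core.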
03:26 PM mozmck: seb_kuzminsky: thanks for doing the 2.8 branch and buildmaster/slave stuff! I was out all day yesterday and am getting caught up now.
04:20 PM memfrob: seb_kuzminsky, I want to thank you as well for the yapps fix, working great.
04:53 PM seb_kuzminsky: mozmck, memfrob: sure thing! it feels good to be doing something in linuxcnc again :-)
04:53 PM seb_kuzminsky: i've got a little bit of tweaking left to do, then i'll return to lurk mode
05:25 PM CaptHindsight: I forget, what is the status of absolute encoders with LCNC, still to be developed?
05:27 PM CaptHindsight: https://www.sick.com/media/docs/7/07/607/Technical_information_HIPERFACE_DSL_Implementation_en_IM0056607.PDF at least there is a spec for the protocol
08:59 PM memfrob: Hi seb_kuzminsky so if I understand correctly, LinuxCNC has two different pre-release branches? Or is 2.8 beta now?
09:00 PM memfrob: How are you going to keep track of what should be in 2.8 vs master? Is master bleeding edge and 2.8 is dev-stable?
10:02 PM memfrob: brb