#linuxcnc-devel | Logs for 2014-11-12

[08:58:03] <seb_kuzminsky> it happened again
[08:58:11] <seb_kuzminsky> on run #5846
[08:58:43] <seb_kuzminsky> the halui mdi test uses a full linuxcnc stack with a custom python gui (not using lui), and halui (using lui)
[08:59:20] <seb_kuzminsky> the test gui sets the task mode to one of manual/auto/mdi, then pokes the halui pin that causes halui to send an mdi command (switching mode to mdi first, if needed)
[09:00:32] <seb_kuzminsky> the behavior is: i see the python test-ui send its task-mode command, then i see halui send *its* task-mode command, then the halui task-mode command to switch back
[09:00:45] <seb_kuzminsky> the mdi command between the two halui mode-setting commands is missing
[09:01:12] <seb_kuzminsky> the second mode-setting command has a serial number indicating that lui thought it sent a command in between, but task never saw it
[09:01:54] <seb_kuzminsky> halui uses the "wait for received" technique, so sending an nml message to task doesn't return out of lui until task has echoed the serial number back via status
[09:05:02] <seb_kuzminsky> halui doesn't call lui_set_wait_mode(), so it gets the default wait mode of lui_command_wait_mode_received
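The "wait for received" handshake described above can be sketched roughly as follows. This is an illustration only, not the real liblinuxcnc-ui API: `FakeTask`, `send_and_wait_received`, and `echo_serial_number` here are hypothetical stand-ins for the NML command/status round-trip.

```python
import time

class FakeTask:
    """Stand-in for task: echoes back the serial of the last command it saw.
    (In the real system, task reads the command NML buffer asynchronously
    and reports the serial via the status buffer.)"""
    def __init__(self):
        self.echo_serial_number = 0
    def write_command(self, serial):
        self.echo_serial_number = serial

def send_and_wait_received(task, serial, timeout=1.0, poll=0.01):
    """Send a command, then block until task echoes our serial number back
    through status, or give up after `timeout` seconds."""
    task.write_command(serial)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if task.echo_serial_number >= serial:
            return True   # task acknowledged receipt
        time.sleep(poll)
    return False          # command was never echoed -- like the lost MDI above

task = FakeTask()
print(send_and_wait_received(task, 42))
```

The bug report amounts to: the wait returned as if the serial had been echoed, yet task never acted on the command in between.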
[09:06:02] <seb_kuzminsky> during this time, the python test-ui is carefully not touching the command nml buffer
[09:06:46] * seb_kuzminsky sideeyes task
[09:22:42] <jepler> seb_kuzminsky: surely you aren't allowed to blame task until you make the same failure without lui
[09:28:49] <seb_kuzminsky> you're probably right
[10:42:02] <tinkerer> ANSCD: Philae has landed!!
[10:42:02] <tinkerer> http://new.livestream.com/esa/cometlanding
[10:51:01] <seb_kuzminsky> sweet
[11:10:43] <skunkworks> write tmax is still 399288
[12:52:10] <mozmck> Is it still good to disable hyperthreading for best latency? I'm using a dell gx620 single core pentium 4.
[12:52:47] <seb_kuzminsky> mozmck: i'm really not sure
[12:53:07] <mozmck> I guess I can just try and see.
[12:53:38] <seb_kuzminsky> i'd say, use the lstopo program from the hwloc-nox package to see which vCPUs share cache
[12:54:19] <seb_kuzminsky> then either use isolcpus to isolate the vCPU you want for rtai and all the vCPUs that might mess with the same cache
[12:55:13] <seb_kuzminsky> i guess if you disable hyperthreading that'll modify the cpu topology that lstopo reports, and it'll indicate a different set of vcpus to isolate, possibly just a single one
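Besides lstopo, the sibling information is also exposed under sysfs at `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`, in the kernel's cpulist format. A small sketch of parsing that format (the parser is the only logic shown; the example values are illustrative):

```python
def parse_cpu_list(s):
    """Expand a sysfs cpulist like '0,2-3' into a list of vCPU numbers.
    On a hyperthreaded machine cpu0's thread_siblings_list might read
    '0,4', meaning vCPUs 0 and 4 share a core (and its cache) -- so an
    isolcpus= line for rtai would want to name both."""
    cpus = []
    for part in s.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus

print(parse_cpu_list("0,4"))   # -> [0, 4]
print(parse_cpu_list("0-3"))   # -> [0, 1, 2, 3]
```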
[13:01:29] <mozmck> I just disabled it, and the latency dropped in half
[13:01:54] <mozmck> was about 25000 for the base thread, is now about 9600
[13:03:58] <seb_kuzminsky> wow, that's great!
[13:04:03] <seb_kuzminsky> are you using isolcpus too?
[13:04:16] <mozmck> no, it's a single core.
[13:04:38] <seb_kuzminsky> does "nproc" currently say 1?
[13:04:44] <seb_kuzminsky> (with hyperthreading off)
[13:04:54] <mozmck> never heard of nproc, but yes it does
[13:04:59] <seb_kuzminsky> heh
[13:05:03] <seb_kuzminsky> ok, that makes sense then
[13:05:24] <seb_kuzminsky> with hyperthreading on your machine has two vcpus (on its one core), sharing cache
[13:05:43] <mozmck> jumped to 11800 now, still not too bad.
[13:06:02] <seb_kuzminsky> the non-rtai vcpu is competing with the rtai vcpu for cache space, evicting rtai code, introducing increased memory latency for rtai, which shows up as increased scheduling latency
[13:06:40] <mozmck> that makes sense.
[13:06:47] <seb_kuzminsky> with hyperthreading off there is only one vcpu, so there's half the executable code running and competing for cache space, so your latency is better
[13:54:55] <skunkworks> On the systems I have tested - disabling hyperthreading improves latency
[14:01:29] <seb_kuzminsky> that makes sense
[14:02:13] <seb_kuzminsky> disabling hyperthreading has a similar cache-pressure-reducing effect as using isolcpus to isolate all the vcpus that share the cache
[14:02:23] <seb_kuzminsky> plus also reducing the pressure on the l3 cache since there's less code running
[14:02:51] <seb_kuzminsky> i'd expect the effect to be significant in the absence of proper isolcpus, and small in the presence of correct isolcpus
[19:21:11] <seb_kuzminsky> grr
[19:21:16] <seb_kuzminsky> crash, you stupid thing
[19:21:19] <seb_kuzminsky> stop working so well
[19:25:46] <PCW> waiting for a random error?
[19:26:02] <seb_kuzminsky> yeah.... waiting for a race condition to bite
[19:27:54] <PCW> On my first computer I made a DRAM expansion memory board and it had a soft error about once every 2 hours
[19:27:55] <PCW> I would think I had fixed it with my latest change and then BEEP BEEP, very frustrating
[19:28:38] <PCW> finally figured it out, but it took days