#linuxcnc-devel | Logs for 2014-01-30

[00:06:17] <memleak> cannot home while shared home directory is closed?
[00:19:25] <KGB-linuxcnc> John Morris zultron/ubc3-dev 50582a1 linuxcnc tests/bitops.0/test.sh tests/bitops.0/test.sh: change include dir to /include from /src/rtapi * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=50582a1
[00:19:25] <KGB-linuxcnc> Michael Haberler zultron/ubc3-dev ba5ad00 linuxcnc src/rtapi/rtapi_msgd.c msgd: use synchronous signal delivery with signalfd() * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=ba5ad00
[00:19:25] <KGB-linuxcnc> Michael Haberler zultron/ubc3-dev bffa1d9 linuxcnc src/rtapi/rtapi_msgd.c rtapi/msgd: explicitly shut down rtapi_app if exiting * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=bffa1d9
[00:19:27] <KGB-linuxcnc> John Morris zultron/ubc3-dev c52a41f linuxcnc tests/toolchanger/toolno-pocket-differ/shared-test.sh toolchanger test: make race condition less likely by increasing sleep * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=c52a41f
[01:25:54] <linuxcnc-build> build #38 of deb-precise-xenomai-binary-amd64 is complete: Failure [failed apt-get-update shell_1] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-xenomai-binary-amd64/builds/38 blamelist: dummy, Sebastian Kuzminsky <seb@highlab.com>, Michael Haberler <git@mah.priv.at>, John Morris <john@zultron.com>
[01:28:58] <linuxcnc-build> build #38 of deb-precise-rtpreempt-binary-x86 is complete: Failure [failed apt-get-update shell_2] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-rtpreempt-binary-x86/builds/38 blamelist: dummy, Sebastian Kuzminsky <seb@highlab.com>, Michael Haberler <git@mah.priv.at>, John Morris <john@zultron.com>
[01:29:18] <linuxcnc-build> build #38 of deb-precise-rtpreempt-binary-amd64 is complete: Failure [failed apt-get-update shell_2] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-rtpreempt-binary-amd64/builds/38 blamelist: dummy, Sebastian Kuzminsky <seb@highlab.com>, Michael Haberler <git@mah.priv.at>, John Morris <john@zultron.com>
[01:36:59] <zultron> Hmm, this time around, same test fails, this time on fc19-32-rtpreempt (last was fc20-32-xenomai).
[01:37:51] <zultron> And this time, there *were* lines missing from the 'gcode-output' file, despite a 5 second sleep; maybe I made a mistake last time.
[02:01:12] <seb_kuzminsky> i meant a wait command inside linuxcncrsh
[02:03:06] <seb_kuzminsky> it supposedly accepts 'set wait done', but i dont think i ever got that to work
[02:21:46] <linuxcnc-build> build #1339 of deb-lucid-rt-binary-i386 is complete: Failure [failed shell_3] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-lucid-rt-binary-i386/builds/1339 blamelist: dummy, Sebastian Kuzminsky <seb@highlab.com>, Michael Haberler <git@mah.priv.at>, John Morris <john@zultron.com>
[03:24:39] <memleak> sweeet latency test running for a few hours now with CPU benchmarks running and staying at 40us
[03:24:50] <memleak> preempt_rt now
[08:28:55] <skunkworks> PCW, http://imagebin.org/289943
[08:29:06] <skunkworks> that is on the old atom
[08:30:17] <skunkworks> seems to peak at 200k writes - 500k reads
[08:53:52] <skunkworks> I switched over to a asus/amd motherboard. It locks up with the delays remarked out. Really crappy read times also.
[09:32:45] <skunkworks> that hardware doesn't play nice.
[09:33:03] <skunkworks> the servo thread is taking over 1ms
[09:33:53] <skunkworks> latency runs good though... Put in a pro 100 card and still get long servo thread times
[09:40:06] <micges> skunkworks: those times are in cycles, what cpu Hz is this?
[09:44:45] <linuxcnc-build> build #39 of deb-precise-xenomai-binary-amd64 is complete: Failure [failed apt-get-update shell_3] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-xenomai-binary-amd64/builds/39 blamelist: dummy, Sebastian Kuzminsky <seb@highlab.com>, Michael Haberler <git@mah.priv.at>, John Morris <john@zultron.com>
[09:48:54] <skunkworks> oh - really?
[09:49:14] <skunkworks> I was wondering why I wasn't getting realtime errors when linuxcnc was running...
[09:50:11] <micges> yeah me too so I checked
[09:51:36] <micges> so if you have 2GHz then 1M cycles is about 50% so not so bad
[09:51:44] <micges> but could be better :)
[09:53:57] <skunkworks> no - on this system (different intel base) the read time is taking 1.2 ms (if it is in ns?)
[09:55:12] <skunkworks> micges, are you saying this is not ns? http://imagebin.org/289943
[09:55:31] <skunkworks> (511us)
[09:58:46] <micges> yes I'm sure those are cpu clocks
[10:00:10] <skunkworks> huh
[10:02:15] <skunkworks> so a 2.4 ghz machine with a read time averaging about 1200000?
[10:03:42] <micges> 500us
[10:04:02] <skunkworks> huh
[10:04:50] <skunkworks> I think i have had this problem before... :)
[10:05:13] <micges> hold on
[10:06:43] <skunkworks> well - it has to be something odd - as linuxcnc is not throwing a realtime delay error.. And if it was in ns - it is way over 1ms..
[10:10:15] <skunkworks> but the writes are taking 18000 whatevers
[10:18:05] <micges> if my calcs are ok your read function took 50us not 500us
[10:18:55] <micges> and write took 1us
[10:22:21] <skunkworks> wow
[10:23:14] <micges> I will double check it, seems that we all missed that those are cycles not ns
[10:25:09] <skunkworks> so 511800 on a 1.8ghz?
[10:30:04] <seb_kuzminsky> nginx includes 4-star code: https://github.com/nginx/nginx/blob/master/src/core/ngx_cycle.h#L38
[10:30:39] <micges> haha
[10:33:24] <skunkworks> I get 500us if I calculated right?
[10:33:54] <cradek> that it's void**** is extra funny
[10:34:34] <skunkworks> cradek - up for 15 minutes so far - dad found 2 open resistors..
[10:34:43] <skunkworks> on the drives
[10:34:46] <skunkworks> *drive
[10:36:40] <cradek> skunkworks: yay!
[10:43:48] <micges> skunkworks: correct again, on 1GHz cpu 500k cycles would be 500us, so on 1,8 GHz cpu it will be about 300us
[10:44:21] <micges> and 18000 on 1,8GHz would be about 10us
[10:44:57] <micges> (I'm sick and seems that drugs removed my basic math skills,,,)
[10:45:06] <micges> time to rest, bbl
[10:50:03] <skunkworks> heh
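The cycles-versus-nanoseconds arithmetic above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, the CPU is assumed to run at its nominal clock speed, and the 1 ms servo period is an assumption taken from the discussion rather than read from any config.

```python
def cycles_to_us(cycles, cpu_hz):
    """Convert a count of CPU clock cycles to microseconds."""
    return cycles / cpu_hz * 1e6


def servo_load(cycles, cpu_hz, period_ns=1_000_000):
    """Fraction of the servo period used (1 ms period is an assumption)."""
    return cycles_to_us(cycles, cpu_hz) * 1000.0 / period_ns


# Figures quoted in the log, on a 1.8 GHz CPU:
print(cycles_to_us(511_800, 1.8e9))   # read time, roughly 284 us ("about 300us")
print(cycles_to_us(18_000, 1.8e9))    # write time, roughly 10 us

# 1M cycles on a 2 GHz CPU is about half of a 1 ms servo period:
print(servo_load(1_000_000, 2e9))
```

Read this way, a reported value of around 1.2 million is not an overrun of a 1 ms thread on a multi-GHz machine, which would explain why no realtime delay error was raised.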
[11:14:02] <cradek> have any of you tried the wheezy live cd?
[11:14:56] <cradek> I'd prefer for us to be able to use wheezy instead of precise for our next cd, but we really should not give up the live feature
[11:19:44] <seb_kuzminsky> i've used the wheezy netinst usb image to install, but not as a live system
[11:19:48] <seb_kuzminsky> and i havent tried to customize it
[11:21:27] <cradek> I'm about to install the rtai kernel (from our precise/base repo) on wheezy and see what happens
[11:25:09] <cradek> well if the install ever finishes
[11:25:17] <seb_kuzminsky> a person named Birchy reported in private email to me that our rtai kernel deb installs and runs fine on wheezy, and gives good latency
[11:25:24] <cradek> the ubuntu install-from-live is sure fast
[11:25:36] <cradek> that's good to hear
[11:25:51] <seb_kuzminsky> i guess we should put the rtai debs in the new wheezy dist in our repo too, if it works
[11:26:04] <cradek> I'll let you know my results today
[11:26:08] <seb_kuzminsky> cool
[11:26:19] <cradek> well, on one piece of hardware, anyway
[11:35:19] <skunkworks> ok - so to double check - the time, max, max-time is in clock cycles?
[11:36:26] <cradek> huh, my new install doesn't boot
[11:36:35] <skunkworks> yeck
[11:37:03] <cradek> in grub I can see it installed 3.2.0-something-pae, but all I get is a blinking cursor when it boots that
[11:38:16] <skunkworks> PCW, what did you end up with for pid on your config?
[11:41:48] <cradek> aha, turning ACPI on in the bios fixed it (it was hanging at PnP probe)
[11:57:24] <skunkworks> pcw_home, Hey.. Did you know that time and timemax was clock cycles? I didn't
[11:57:48] <skunkworks> it just happend to be on my atom board it was 'close' to actual ns...
[12:02:29] <skunkworks> pcw_home, this didn't make sense.. http://imagebin.org/289976
[12:02:41] <cradek> the rtai kernel from precise/base boots, but the linuxcnc debs from buildbot/precise/master-rt will not install
[12:03:25] <skunkworks> 1272924 was way more than a 1ms base period but I was not getting any realtime errors.
[12:05:45] <skunkworks> but if I did my math right - that is 530us on a 2.8ghz machine
[12:06:20] <skunkworks> cradek, you need to give us more information if we are going to help you... ;)
[12:06:57] <seb_kuzminsky> lol @ skunkworks
[12:06:59] <cradek> well they weren't meant to, I just thought I might get lucky
[12:07:27] <cradek> incompatible boost-python version and libc6 version
[12:07:32] <seb_kuzminsky> cradek: like thsi? https://www.youtube.com/watch?v=5NV6Rdv1a3I
[12:08:04] <skunkworks> heh - those french!
[12:12:06] <pcw_home> I didn't know that, so that makes things quite a bit better
[12:12:08] <pcw_home> I also think there maybe two read cycles that can be combined
[12:12:09] <pcw_home> (if this is the case the read times can be almost halved)
[12:15:10] <pcw_home> ( and those numbers match the ping numbers if the CPU clock speed is factored in)
[12:15:35] <skunkworks> I think I ran into that problem before - ns vs clock cycles.. (I had assumed it was ns...) I can not find a reference to it in the docs.
[12:16:26] <skunkworks> the write times are really low then :)
[12:16:34] <pcw_home> seems like a public parameter like read time should be in ns
[12:16:55] <pcw_home> (just one FP scale op)
[12:21:20] <skunkworks> could you do it in hal? or can't you hook parameters to say a scale component
[12:23:21] <pcw_home> I always thought you could not 'net' parameters but I could be wrong
[12:23:35] <skunkworks> that is my feeling
[12:24:15] <skunkworks> pcw_home, what did you end up with for pidff settings?
[12:24:47] <pcw_home> FF1=1.000 FF2=0 P=50
[12:25:15] <pcw_home> (pix maxerror = .0003 or something)
[12:25:26] <skunkworks> ok. those give me following errors at anything above 50ipm
[12:25:29] <pcw_home> s/pix/pid/
[12:27:26] <pcw_home> you may have to adjust FF1 a tiny amount
[12:27:27] <pcw_home> you also need to set pid.X.error-previous-target true
[12:31:01] <skunkworks> error-previous-target is set to true. I will play with ff1
[12:31:44] <pcw_home> you may need to look at a plot to see what's going on
[12:32:18] <skunkworks> sure - what do you usually look at?
[12:33:37] <pcw_home> ferror when jogging
[12:33:43] <skunkworks> are you saying pid max error is .0003 or following error?
[12:34:17] <skunkworks> pid max error is 0
[12:35:15] <skunkworks> never mind - it is set to .0005
[12:43:59] <pcw_home> Until the DPLL is used for read sampling you will need to set the ferror to maxvel * max jitter
[12:47:45] <pcw_home> Hmm thats interesting max jog speed is set to 60 IPM but i can jog at full speed (1200 IPM) with the axis buttons
[12:50:02] <seb_kuzminsky> the cycles-vs-ns confusion in thread and function runtime reporting has bitten me too occasionally
[12:50:48] <seb_kuzminsky> after 78919de4a1c5ef9e495dc25ae27f43ff1e54b7e0, should we switch it to report ns instead?
[12:50:54] <skunkworks> pcw_home, you have the feed rate override set to 3000%
[12:51:08] <skunkworks> or whatever your config is set to.
[12:52:05] <pcw_home> ahh didn't know that affects jogging as well
[12:53:05] <skunkworks> So - I get an error of .012 http://imagebin.org/289981
[12:53:16] <skunkworks> jogging x to 1200 ipm
[12:53:51] <skunkworks> with p=50 ff1=1
[12:54:11] <skunkworks> god - linuxcnc is cool
[12:54:22] <pcw_home> looks like your FF1 needs adjustment (servo thread period off)
[12:54:38] <skunkworks> so - like 1.0001 and see?
[12:55:01] <pcw_home> if you plot the velocity also its easier to see
[12:55:20] <micges> seb_kuzminsky: yes I think we should try to have all possible in ns
[12:55:25] <pcw_home> calibrate makes this easy
[12:59:36] <pcw_home> skunkworks: just like tuning a velocity mode servo
[12:59:38] <pcw_home> (though FF1 of 1.000 worked for me , you may have a significant time base error in your servo thread)
[13:04:54] <pcw_home> stepgen at 1200 IPM:
[13:04:55] <pcw_home> http://imagebin.ca/v/1AdtzsuVn5Il
[13:10:11] <pcw_home> Note that the jitter causes apparent ferrors proportional to the jitter times the velocity
[13:10:12] <pcw_home> Actual errors will be much smaller due to the fact that the control system is
[13:10:14] <pcw_home> almost a perfect velocity mode servo so 99.99% of control is FF1
[13:10:54] <pcw_home> (and corrections via the P term are limited by the max error term)
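The rule of thumb given above (following error limit of at least maxvel times max jitter) is easy to put numbers on. A minimal sketch, assuming velocity in inches per minute and jitter in microseconds; the 100 us jitter figure is a placeholder for illustration, not a measurement from this log.

```python
def apparent_ferror(vel_ipm, jitter_us):
    """Apparent following error (inches) from sampling-time jitter.

    The position is sampled up to jitter_us early or late, so at a
    given velocity the reading is off by velocity * time error.
    """
    vel_ips = vel_ipm / 60.0            # inches per minute -> inches per second
    return vel_ips * jitter_us * 1e-6   # microseconds -> seconds


# 1200 IPM jog with a hypothetical 100 us of read-time jitter:
print(apparent_ferror(1200, 100))
```

With those assumed inputs the apparent error comes out at about two thousandths of an inch, so the ferror limit has to scale up with both the jog speed and the jitter.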
[13:15:11] <skunkworks> this is what I get just straight p and ff1 http://imagebin.org/289986
[13:15:45] <skunkworks> (50 and 1)
[13:16:29] <skunkworks> doesn't that look funky?
[13:17:34] <pcw_home> look like you need more FF1
[13:17:49] <pcw_home> (or less actually)
[13:18:48] <pcw_home> that does show the correction slew rate limiting nicely
[13:19:35] <pcw_home> (and linear to exponential when you get below the PID maxerror limit)
[13:21:02] <cradek> looks like you're accelerating about twice as fast as the stepgen can
[13:21:43] <pcw_home> Pretty sure he just needs to reduce FF1 a bit (timebase difference)
[13:22:04] <skunkworks> if I decrease the ff1 to .999 the first spike gets larger the second spike gets smaller
[13:22:58] <skunkworks> that error is about .003"
[13:23:08] <pcw_home> Is ff2 0?
[13:23:14] <skunkworks> yes
[13:23:30] <pcw_home> very strange
[13:23:58] <pcw_home> you saw my plot, FF1=1 FF2=0 P=50
[13:25:38] <skunkworks> this is with .999 ff1 http://imagebin.org/289989
[13:26:23] <skunkworks> I also checked hal to make sure ff2 was 0
[13:31:07] <skunkworks> cradek, I increased the stepgen acc headroom to 500 from 360 - no real change - the axis is set to 300in/sec/sec
[13:31:44] <pcw_home> try freeby.mesanet.com/7i76e.zip
[13:31:45] <pcw_home> (you will have to change the card name/ip address/mac address)
[13:32:01] <pcw_home> thats what i have thats working
[13:32:35] <pcw_home> (with unchanged ubc3-7i80)
[13:32:57] <skunkworks> is that the right url?
[13:33:27] <pcw_home> I'm suspicious of removing the rtapi delay since it has such profound effects
[13:33:56] <skunkworks> pcw_home, I had to put the delays back in - they locked up a computer I was testing
[13:34:21] <skunkworks> got it
[13:34:41] <pcw_home> thats the problem i had as well
[13:39:46] <pcw_home> on the previous ubc-7i80 version (that you had the watchdog issues with)
[13:39:48] <pcw_home> I had to add FF2 = servo period. I think this was because it ended up using the
[13:39:49] <pcw_home> previous cycle's read data. Looks like you have a related problem (or an ini/hal file difference)
[13:42:40] <skunkworks> well - I get the same trace
[13:43:11] <skunkworks> spikes to about .003ish"
[13:44:07] <pcw_home> sounds like late data
[13:44:15] <pcw_home> 1 cycle late
[13:44:32] <skunkworks> so the driver might still be reading late?
[13:44:58] <pcw_home> sure looks like it
[13:45:25] <skunkworks> let me put this back on the atom.. for grins.. I thought that worked better.
[13:48:25] <pcw_home> you can probably partially fix it with FF2 but the fact that theres a difference
[13:48:27] <pcw_home> (with identical ini/hal files) indicates a problem
[13:54:15] <pcw_home> I added a tiny bit of FF2 on mine (.00012) I think this corrects for the
[13:54:17] <pcw_home> fact that the updated velocity is always a bit late so you might try that
[13:54:43] <pcw_home> (maybe yours is slower)
[13:56:39] <pcw_home> it may just be that you need a bit of FF2
[13:58:21] <pcw_home> so maybe nothing really wrong
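Why a little FF2 helps when the data is one cycle late can be seen in a toy simulation. This is a sketch under loud assumptions, not the hal pid component: P is set to 0, FF1 is 1, the "stepgen" is modelled as an ideal velocity device that executes the previous cycle's command, and the acceleration (300 in/s^2), top speed (20 in/s, i.e. 1200 IPM) and 1 ms period are taken from the discussion only as plausible inputs.

```python
def worst_following_error(ff2, dt=0.001, accel=300.0, vmax=20.0, steps=200):
    """Peak |commanded - actual| position for a ramp-then-cruise move.

    Toy model: output = FF1 * cmd_velocity + FF2 * cmd_acceleration with
    FF1 = 1 and no P term; the device applies each output one cycle late.
    """
    vel = 0.0        # commanded velocity
    prev_vel = 0.0   # commanded velocity of the previous cycle
    cmd_pos = 0.0    # commanded position
    pos = 0.0        # actual position
    prev_out = 0.0   # output computed in the previous cycle
    worst = 0.0
    for _ in range(steps):
        vel = min(vmax, vel + accel * dt)      # trapezoidal velocity profile
        cmd_pos += vel * dt
        cmd_acc = (vel - prev_vel) / dt
        prev_vel = vel
        out = 1.0 * vel + ff2 * cmd_acc        # feed-forward only
        pos += prev_out * dt                   # device runs last cycle's command
        prev_out = out
        worst = max(worst, abs(cmd_pos - pos))
    return worst


print(worst_following_error(0.0))     # lag of velocity * delay at cruise speed
print(worst_following_error(0.001))   # FF2 equal to the one-cycle delay
```

In this model the uncompensated error grows to velocity times delay, and setting FF2 equal to the delay removes almost all of it, because the integral of FF2 times acceleration is FF2 times velocity. A real setup has a P term, a max error clamp and a delay that is not exactly one period, so the useful FF2 value has to be found from plots, as done above.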
[14:08:03] <skunkworks> pcw_home, http://imagebin.org/289991
[14:08:19] <skunkworks> same computer
[14:09:47] <micges> pcw_home: probably delay should be left in but decreased to let say 2us
[14:09:48] <pcw_home> ahh network must be a bit slower than mine
[14:10:25] <pcw_home> I think i tried 1 usec but it locked up
[14:10:45] <micges> oh
[14:11:43] <pcw_home> There seems to be trouble with logging, skunkworks: did you find a fix
[14:11:48] <pcw_home> ?
[14:12:04] <pcw_home> I only see the watchdog I/O in the log
[14:12:31] <skunkworks> pcw_home, http://imagebin.org/289991
[14:12:31] <micges> do you have latest source from git?
[14:12:40] <skunkworks> same computer
[14:12:55] <skunkworks> just added ff2 to .00025
[14:13:06] <pcw_home> so its ok just needs a bit of FF2
[14:13:21] <pcw_home> Yes from this morning
[14:13:35] <pcw_home> (git pull from this morning)
[14:13:47] <skunkworks> I am running from yesterday morning - Should I update?
[14:14:11] <pcw_home> I just have trouble with the log
[14:14:51] <micges> skunkworks: no
[14:14:59] <skunkworks> ok
[14:15:12] <micges> pcw_home: you have debug=1 in loadrt line?
[14:15:41] <skunkworks> oh - pcw_home what trouble are you having with log?
[14:16:13] <pcw_home> Yes debug=1, logging works but I think its dropped data
[14:22:33] <cradek> http://timeguy.com/cradek-files/emc/wheezy-rtai.png
[14:22:51] <cradek> I built and installed a master deb, and it all works
[14:25:09] <pcw_home> despite mhaberler's reservations about preempt_rt/kernel networking,
[14:25:11] <pcw_home> it actually seems usable. I have not seen any big delays at all despite a
[14:25:12] <pcw_home> month of uptime
[14:26:10] <pcw_home> Thats great latency! (and here we are figuring out how to live with 300 usec latencies)
[14:27:39] <cradek> well it's rtai...
[14:27:46] <skunkworks> and it isn
[14:28:01] * cradek hands skunkworks a '
[14:28:08] <skunkworks> and it isn't 300us latencies.. It is network read/write times I think
[14:28:27] <skunkworks> cradek, that is very nice!
[14:28:53] <skunkworks> the latency test seems to stay under 100us
[14:29:32] <skunkworks> cradek, what hardware?
[14:29:54] <pcw_home> Well i have peaks of 500 usec or so from a baseline 200 usec network read time
[14:29:56] <pcw_home> Preempt_rt thread latencies are about 80 usec worst case on this machine
[14:30:20] <cradek> skunkworks: a weird slightly old server-class motherboard
[14:30:28] <cradek> skunkworks: model name: Pentium(R) Dual-Core CPU E6600 @ 3.06GHz
[14:30:36] <skunkworks> nice
[14:30:38] <cradek> it reports this, but it's nonsense if you look up what an E6600 is
[14:30:46] <cradek> so I don't really know what it is
[14:31:10] <cradek> it has two e1000s, and serial and parallel ports on the motherboard, so that makes me happy
[14:31:26] <skunkworks> nice
[14:32:09] <skunkworks> so how hard to make a livecd?
[14:32:36] <cradek> I don't know
[14:33:12] <cradek> probably similar to but not quite as weird as ubuntu (that's the summary of all my experiences with debian wheezy)
[14:34:41] <skunkworks> heh
[14:36:38] <skunkworks> how long is wheezy life?
[14:51:32] <cradek> I can't seem to find that information
[14:51:44] <cradek> I think it's as long as the ubuntu LTS (and was more recently released)
[14:54:34] <cradek> > The security team tries to support a stable distribution for about one year after the next stable distribution has been released, except when another stable distribution is released within this year.
[14:55:15] <skunkworks> pcw_home, with the ff2 set to .00025 it has been running through the splash screen
[14:56:59] <cradek> skunkworks: so probably until early to mid 2016
[14:57:17] <skunkworks> wow - nice
[14:57:44] <cradek> so only 3-3.5 years, when ubuntu supposedly guaranteed 5, except that was always a lie
[14:58:33] <cradek> really 5 is too long. new hardware always stops things working in that much time.
[14:59:36] <skunkworks> cradek, is the hardware you are testing near you - or is it off in some distant location?
[14:59:54] <skunkworks> (can you actually touch it)
[15:00:01] <cradek> it's behind me
[15:00:04] <skunkworks> ok
[15:00:25] <cradek> it's up to 7k now that the screen has powered off and on
[15:00:42] <pcw_home> skunkworks: yeah I suspect it is useable as-is on some hardware and with jitter tolerant tuning
[15:00:43] <pcw_home> (a 1 KHz velocity mode servo should be OK as well)
[15:01:27] <skunkworks> pcw_home, if the readtimes can be cut in half - then 2khz servo threads should maybe be possible
[15:04:08] <pcw_home> 2KHz works on this machine (until I watch a flash video)
[15:04:23] <skunkworks> that was pretty cool - adding just a bit of ff2 I could see the spikes slowly get smaller
[15:05:02] <skunkworks> on the k&t it seemed like I was just flailing around ;)
[15:05:40] <skunkworks> pcw_home, does it throw a real time delay?
[15:06:44] <pcw_home> Yes, or sserial gets a "hey I haven't finished the last transaction" error
[15:07:41] <skunkworks> well - the few systems I have had this on - no realtime delayse
[15:07:50] <skunkworks> delays
[15:08:27] <pcw_home> a real machine is much more complex than the stepgen (since its an ideal velocity mode device, no delays or inertia)
[15:08:29] <pcw_home> so tuning is more involved
[15:08:50] <skunkworks> right
[15:09:09] <skunkworks> it is all in the digital world..
[15:09:19] <skunkworks> andypugh, My 7i80 works!
[15:09:48] <andypugh> I thought it always did?
[15:10:10] <skunkworks> andypugh, my 7i80 now works with linuxcnc!
[15:10:20] <andypugh> I thought it always did?
[15:10:48] <skunkworks> no - there was an issue with the 7i80 and the watchdog with rt_preempt
[15:11:27] <skunkworks> I had it working with rtnet - but that was even more of a pain
[15:11:27] <pcw_home> micges fixed the bug
[15:11:56] <pcw_home> thats the thing with preempt_rt, simple but slow
[15:12:11] <skunkworks> pcw_home, how much is involved with getting it down to 1 read?
[15:12:52] <pcw_home> Not sure
[15:12:56] <pcw_home> bbl
[15:14:45] <micges> skunkworks: I'll add error checking today and I'll test some more on 7i76e with stepgen and encoder, then I'll check what we can do to get down to one read per cycle
[15:15:14] <skunkworks> micges, awesome.
[15:26:21] <skunkworks> bbl
[15:36:40] <micges> mhaberler: hi
[15:36:48] <mhaberler> hi micges
[15:37:28] <mhaberler> only found this mmap'd packet ring stuff today, looks promising - read _without_ a syscall seems possible
[15:39:17] <micges> we will see how much 'faster' we need after I'll get down to 1 eth read per cycle
[15:39:26] <micges> now it's 2 reads
[15:40:15] <micges> I'm getting to it now
[15:41:46] <micges> this 'worst case udp' doc is very interesting
[15:42:45] <PCW> for 1K servo thread it seems usable now (with jitter tolerant configs) DPLL will help here also
[15:43:06] <mhaberler> I played a bit with ftrace and it's pretty easy to see where time gets lost; it's more a filtering exercise
[15:43:39] <mhaberler> that might be easier with a standalone minimal program instead of all of RTAPI, rtapi_app ff
[15:43:57] <PCW> yeah going through the whole network stack is quite a maze
[15:44:45] <mhaberler> putting rtapi_app under ftrace will create 130dB+ noise
[15:45:12] <mhaberler> well maybe not if just 1 thread running
[15:45:56] <mhaberler> I'm wary the servo cycle time window is getting a bit cramped with these time lags
[15:47:31] <PCW> 1KHz seem fine as is and when we get to one read there's plenty of headroom
[15:48:35] <PCW> but something faster would be needed for torque loops and higher performance machines
[15:51:40] <PCW> and theres still the possibility of going to a 2 packet system (only one read one write per cycle versus current 3 packet system)
[15:53:04] <mhaberler> why two reads? does the 7i80 send 2 frames?
[15:55:09] <micges> wd is checked in separate packet atm
[15:55:44] <mhaberler> ah
[15:56:06] <mhaberler> any chance to piggyback wd on every turnaround?
[15:56:13] <PCW> because wd was always a separate function
[15:56:51] <mhaberler> is that a spontanous packet by the 7i80 or is that a reply to a request?
[15:57:35] <PCW> the 7I80 is a pure slave, no spontaneous packets
[15:57:58] <PCW> (well except for bootp)
[15:58:01] <mhaberler> ah; it's been a while since I read the proto spec
[15:58:04] <mhaberler> sure ;)
[15:58:30] <mhaberler> I hope we can drag Nicholas McGuire into advising on this, he really knows the pitfalls well
[15:59:23] <mhaberler> does the wd checking need to be in the servo cycle? I think that could be farmed out
[15:59:50] <PCW> we normally only have a servo thread
[16:00:50] <mhaberler> sure. The response should be pretty instantaneous, but technically there's no reason this couldnt be in a plain non-rt thread even
[16:01:27] <mhaberler> I'm not suggesting its a good idea; it might just not be needed to add to the servo cycle time budget
[16:02:14] <PCW> the wd stuff just needs to get included in the one packet en-queuing
[16:02:16] <PCW> (hopefully without impacting the hal file separate WD function syntax)
[16:02:17] <mhaberler> normal and wd replies use the same udp port I assume?
[16:02:39] <mhaberler> any way to direct the wd reply to a different port?
[16:02:59] <mhaberler> then the wd read could be handled by another thread than servo
[16:03:12] <PCW> Yes they just need to get merged (since the per packet overhead is large but the per data item overhead is small)
[16:03:30] <mhaberler> so you do that on the firmware side?
[16:04:38] <PCW> no, the driver juts needs to merge the watchdog I/O with the rest
[16:04:44] <PCW> just
[16:05:10] <mhaberler> the driver.. hm2_eth.c I assume?
[16:05:18] <PCW> yes
[16:05:20] <mhaberler> not down the stack, right
[16:05:21] <mhaberler> ah
[16:05:50] <mhaberler> it's interesting to see the EtherCAT round trip times; that's almost an order faster
[16:05:58] <PCW> its just a historical accident that the WD had its own function
[16:06:02] <mhaberler> aja
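The point made above, that per-packet overhead is large while per-item overhead is small, can be written as a simple cost model. A rough sketch: the function name and every number here are placeholders chosen for illustration, not measurements from hm2_eth or the 7i80.

```python
def cycle_io_time_us(packets, items, per_packet_us, per_item_us):
    """Toy cost model: fixed cost per round trip plus a small cost per data item."""
    return packets * per_packet_us + items * per_item_us


# Hypothetical figures: 100 us per round trip, 0.1 us per register, 50 registers.
separate = cycle_io_time_us(packets=2, items=50, per_packet_us=100.0, per_item_us=0.1)
merged = cycle_io_time_us(packets=1, items=50, per_packet_us=100.0, per_item_us=0.1)

print(separate, merged, merged / separate)
```

As long as the per-packet term dominates, folding the watchdog read into the main packet cuts the read time to a little over half, which matches the "almost halved" expectation stated earlier in the log.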
[16:06:42] <PCW> well if you write a driver for specific hardware, low latency is easy
[16:08:09] <micges> hmm
[16:08:27] <micges> I've merged wd into tram packet
[16:08:40] <micges> now I have 200k cycles read time
[16:08:41] <mhaberler> well if testing the mmap/ring method still shows this high times that would certainly point to the driver; but an ftrace should give pretty exact time lags from syscall to packet enqueue and from packet in to read complete
[16:09:11] <micges> on 2.8GHz
[16:09:21] <PCW> I'm pretty sure it's not the driver but the kernel getting in the way
[16:09:41] <mhaberler> I'm away from hardware but the packet-tx-ring.c etc codes compiled and seemed to at least start (vbox)
[16:09:54] <PCW> (in the cases that work) Some drivers are hopeless
[16:10:33] <micges> that's ~70us
[16:10:39] <PCW> micges thats great!
[16:10:45] <mhaberler> right, the net stack does all sorts of non-deterministic stuff; Nicholas says the showstopper is everything which needs memory allocation, and there's a lot of that going on (skbufs etc)
[16:10:53] <mhaberler> ha
[16:11:45] <micges> 40k cycles write time
[16:11:51] <mhaberler> well if we can shave off some of that by raw sockets etc it might even mean higher servo rates are possible
[16:12:01] <micges> that's 15us
[16:12:38] <mhaberler> very good already!
[16:12:42] <PCW> it may be that the current large delay spikes are memory allocation related
[16:12:48] <mhaberler> yes
[16:13:00] <PCW> but still OK at 1 KHz
[16:13:01] <mhaberler> that was his point - looks great most of the time, then - dang
[16:13:35] <PCW> well I've beat on this pretty hard for a month or so and have not had any oops
[16:14:09] <PCW> but obviously could be better
[16:14:14] <mhaberler> like 'unexpected rt delay' you mean?
[16:14:43] <PCW> yes (or sserial not complete errors)
[16:15:24] <mhaberler> still.. downstream more complex kins might stress the servo budget again; maybe one should test with a non-trivkins setup
[16:16:19] <mhaberler> actually it occurs to me ftrace could well be used to instrument motion (at least kthreads)
[16:17:41] <micges> there is some room in driver to speed it up
[16:17:59] <micges> but it should be as fast as possible
[16:18:00] <mhaberler> the nice part about ftrace is - very low overhead if not used
[16:18:15] <mhaberler> anyway, thats a different theme
[16:18:57] <seb_kuzminsky> jepler: are you around?
[16:19:48] <micges> finally times are more sane now, I'll push changes later into ubc3-7i80
[16:22:10] <PCW> Great!
[16:23:52] <PCW> I think there's still a name bug in 7i76e pin names (two dots together)
[16:24:52] <micges> with sserial - yes there is
[16:25:36] <PCW> ahh that's right, it's a sserial bug with longer than 4 char names
[16:26:45] <mhaberler> I just read up on this ring/mmap code. It seems even the tx side does not push down the frame through skbufs in the sendto(); what it does is actually notify the kernel of a ringbuffer address containing a valid packet, meaning skbufs are not involved, this seems to go directly to the driver thereafter
[16:28:11] <mhaberler> it does actually send packets on vbox, so this might be a viable route
[16:34:54] <micges> will it work under xenomai userland realtime?
[16:35:55] <micges> current driver sockets approach won't work because of domain switch (realtime->not realtime)
[16:36:48] <micges> though I must check latest driver and see how it's improved
[16:40:45] <mhaberler> xenomai: I need to ask on the list
[16:41:26] <micges> thanks
[16:41:54] <mhaberler> I _think_ the sendto() will cause a domain switch; but since it's just a notify maybe we can talk them into something rt compati
[16:42:05] <mhaberler> them into doing something rt compatible
[16:44:46] <mhaberler> when I'm back I'll do some tests on real hardware, see if that brings an improvement; if it's worth it, we can take it to the xeno list; but I'll ask anyway if somebody tried
[16:45:13] <mhaberler> in theory rx path should work, since no system call is involved after setup, just manipulation in shm
[16:45:21] <mhaberler> tx path is the dubious one
[16:45:43] <andypugh> puzzled..
[16:46:08] <andypugh> man setenv / getenv works. But trying to use them gives "command not found"
[16:46:32] <andypugh> (Idoo)
[16:46:33] <mhaberler> wrong shell?
[16:46:38] <andypugh> (Udoo)
[16:46:49] <andypugh> Possibly...
[16:47:13] <mhaberler> setenv isnt a bash command; csh methinks
[16:47:22] <mhaberler> export FOO=fooval
[16:47:47] <mhaberler> getting anywhere with the udoo xenomai kernel?
[16:48:29] <andypugh> Yeah, just trying to figure out isolcpus and u-Boot
[16:48:41] <mhaberler> ha
[16:49:02] <andypugh> Paul Corner rudely suggested I should learn how
[16:49:07] <mhaberler> do these guys have any plans to upstream their board support?
[16:49:09] <mhaberler> haha
[16:49:23] <andypugh> There was a word there that I could have left out, I suppose.
[16:49:30] <mhaberler> got the wrong badge, hm
[16:50:47] <andypugh> printenv works.
[17:00:41] <KGB-linuxcnc> Dave unified-build-candidate-3 b45ce81 linuxcnc (12 files in 4 dirs) Supply (c) & license notices for K9 config files. * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=b45ce81
[17:10:15] <mhaberler> PCW: see http://queue.acm.org/detail.cfm?id=2103536, search for TX_RING. this compares raw sockets with PACKET_TX/RX_RING to netmap. Now netmap is still way ahead, but the packet rate of the former is still pretty decent ;)
[17:11:21] <mhaberler> netmap skips skbufs, but thats obviously the only real difference
[17:13:44] <mhaberler> looks like acceptable compromise between generic hardware+driver reuse at 1.8mio packets/s
[17:14:13] <mhaberler> also turns out this feature is used by wireshark, so it's going to stay around
[17:14:35] <mhaberler> (or rather likely)
[17:29:01] <KGB-linuxcnc> Michael Geszkiewicz ubc3-7i80 77425c3 linuxcnc src/hal/drivers/mesa-hostmot2/hm2_eth.c src/hal/drivers/mesa-hostmot2/hostmot2.c src/hal/drivers/mesa-hostmot2/hostmot2.h src/hal/drivers/mesa-hostmot2/watchdog.c hm2: read watchdog with same packet like rest of data from board * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=77425c3
[17:29:01] <KGB-linuxcnc> Michael Geszkiewicz ubc3-7i80 c3ff92b linuxcnc src/hal/drivers/mesa-hostmot2/hm2_eth.c hm2_eth: enable debug logging in enqueue_write function * http://git.linuxcnc.org/?p=linuxcnc.git;a=commitdiff;h=c3ff92b
[17:29:28] <micges> PCW: ^^
[17:29:34] <micges> skunkworks: ^^
[17:32:00] <micges> current timings on 2.8 GHz: 95us read, and two 15us writes per thread cycle
[17:41:51] <skunkworks> wow
[17:42:01] <micges> both times are 10% better without logging
[17:42:02] <skunkworks> I will test it tomorrow
[17:50:49] <skunkworks> micges: the functions work the same? Don't need to change anything in the configs?
[17:51:54] <micges> no config changes
[17:54:05] <skunkworks> neat
[18:02:10] <seb_kuzminsky> micges: i like your watchdog-via-tram change
[18:05:52] <micges> thanks
[18:05:59] <PCW> mhaberler, I looked at netmap a while ago and it looks good
[18:06:32] <mhaberler> right, same trick just a tad faster, which I don't think is relevant here
[18:06:44] <mhaberler> at the cost of custom drivers
[18:08:56] <PCW> Thanks for that commit micges , I'll have to try it out later today when I get a chance
[18:10:33] <PCW> the hm2_spi stuff should probably mimic the hm2_eth since the address overhead can be reduced the same way
[18:11:31] <linuxcnc-build> build #39 of deb-precise-rtpreempt-binary-x86 is complete: Failure [failed apt-get-update shell_2] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-rtpreempt-binary-x86/builds/39 blamelist: Dave <dave@CalypsoVentures.com>
[18:11:41] <linuxcnc-build> build #39 of deb-precise-rtpreempt-binary-amd64 is complete: Failure [failed apt-get-update shell_2] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-rtpreempt-binary-amd64/builds/39 blamelist: Dave <dave@CalypsoVentures.com>
[18:12:22] <PCW> mhaberler: I'll take a look at the PACKET_TX/RX_RING, generic drivers would be much better
[18:42:12] <linuxcnc-build> build #1730 of lucid-rtai-i386-clang is complete: Failure [failed compile] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-rtai-i386-clang/builds/1730 blamelist: Michael Geszkiewicz <micges@wp.pl>
[18:51:13] <linuxcnc-build> build #40 of deb-precise-xenomai-binary-amd64 is complete: Failure [failed apt-get-update shell_3] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-precise-xenomai-binary-amd64/builds/40 blamelist: Dave <dave@CalypsoVentures.com>
[19:00:05] <brianmorel99> Using google there was a message from the buildbot about not defining _FORTIFY_SOURCE when using CNC. Was this option added to the configure script?
[19:02:27] <brianmorel99> Sorry, gcc. I normally grep and remove it from the .in files, but I have to redo it after each pull.
[19:02:55] <linuxcnc-build> build #813 of precise-amd64-xenomai-rip is complete: Failure [failed apt-get-update compile] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-amd64-xenomai-rip/builds/813 blamelist: Michael Geszkiewicz <micges@wp.pl>
[19:05:35] <brianmorel99> "real 5m31.424s" with make , "real 1m28.992s" with make -j8
[19:08:28] <linuxcnc-build> build #934 of precise-i386-realtime-rip is complete: Failure [failed compile] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-i386-realtime-rip/builds/934 blamelist: Michael Geszkiewicz <micges@wp.pl>
[19:18:33] <linuxcnc-build> build #1340 of deb-lucid-rt-binary-i386 is complete: Failure [failed shell_3] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/deb-lucid-rt-binary-i386/builds/1340 blamelist: Dave <dave@CalypsoVentures.com>
[19:24:15] <linuxcnc-build> build #839 of precise-x86-xenomai-rip is complete: Failure [failed apt-get-update compile] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/precise-x86-xenomai-rip/builds/839 blamelist: Michael Geszkiewicz <micges@wp.pl>
[19:25:57] <linuxcnc-build> build #1734 of lucid-i386-realtime-rip is complete: Failure [failed compile] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/lucid-i386-realtime-rip/builds/1734 blamelist: Michael Geszkiewicz <micges@wp.pl>
[19:30:09] <skunkworks> pcw, well - have you tried it? hmm - there are some failures...
[19:33:25] <PCW> works for me
[19:34:37] <PCW> at least on AMD 350E 8139 and 8169/8111
[19:39:32] <memleak> be interesting to see RT-net working with lo ethernet controller
[19:39:58] <memleak> ifconfig lo 192.168....?
[19:40:16] <PCW> well that would drop the wire time
[19:45:59] <skunkworks> how is the thread time?
[19:49:15] <PCW> I don't recall what it was before on the AMD but the nominal value has gone down to about 250 usec from I think about 400
[19:49:31] <PCW> peak maybe 450 usec
[19:52:12] <PCW> crap, the "can't find package Img" problem
[19:53:49] <linuxcnc-build> build #1734 of checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/checkin/builds/1734 blamelist: Michael Geszkiewicz <micges@wp.pl>
[20:03:13] <PCW> nominal value has gone down to about 150 usec (forgot the silly CPU speed)
[20:06:07] <skunkworks> pcw, you have to install - tkdev-img or something like that..
[20:07:01] <skunkworks> sudo apt-get install libtk-img
[20:07:20] <PCW> yeah i remembered
[20:07:57] <skunkworks> (forgot I had installed sim here)
[20:12:32] <PCW> the d525 is faster than the AMD 350E (325 or so usec worst case read time)
[20:12:34] <PCW> but not much torture yet
[20:45:40] <PCW> 2 kHz servo thread on the D525 -->7I80
[20:50:08] <skunkworks> yeay
[20:50:09] <skunkworks> yay
[20:52:56] <PCW> will try on faster machine at home
[20:53:49] <skunkworks> can't wait to try it.
[20:54:26] <PCW> And I think mhaberler's suggestion of using PACKET_RX_RING is a great idea
[20:55:01] <skunkworks> I don't know what that means
[20:56:57] <skunkworks> using the same realtime/nic driver - just a different way to send/receive data?
[21:58:00] <pcw_home> a more direct way to get at the data
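
[editor's note] For readers who, like skunkworks, have not met PACKET_RX_RING: it is a Linux AF_PACKET socket option that asks the kernel to place received frames in a ring buffer which the process then maps with mmap(), so frames can be read from shared memory instead of being copied out one recv() call at a time. The C sketch below only illustrates how such a ring might be described and requested. It is not code from hm2_eth; the helper names make_ring_request and open_rx_ring are hypothetical, the sizes are up to the caller, and opening the socket normally needs root or CAP_NET_RAW.

```c
#include <assert.h>
#include <arpa/inet.h>        /* htons */
#include <linux/if_ether.h>   /* ETH_P_ALL */
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_RX_RING */
#include <stddef.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: fill in a ring description.
 * Returns 0 if the geometry is consistent, -1 otherwise.
 * Checks: blocks are whole pages, frames are aligned and large
 * enough to hold the per-frame header, and a frame fits in a block. */
static int make_ring_request(unsigned block_size, unsigned block_nr,
                             unsigned frame_size, struct tpacket_req *req)
{
    long page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || block_size == 0 || block_nr == 0)
        return -1;
    if (block_size % (unsigned)page != 0)
        return -1;
    if (frame_size <= TPACKET_HDRLEN || frame_size % TPACKET_ALIGNMENT != 0)
        return -1;
    if (frame_size > block_size)
        return -1;

    req->tp_block_size = block_size;
    req->tp_block_nr   = block_nr;
    req->tp_frame_size = frame_size;
    /* The frame count must equal frames-per-block times block count. */
    req->tp_frame_nr   = (block_size / frame_size) * block_nr;
    return 0;
}

/* Hypothetical helper: open a raw packet socket and request the RX ring.
 * Returns the socket fd, or -1 on failure (for example, no privileges).
 * The caller would then mmap() tp_block_size * tp_block_nr bytes on the fd
 * and bind() the socket to one interface. */
static int open_rx_ring(const struct tpacket_req *req)
{
    assert(req != NULL);

    int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
    if (fd < 0)
        return -1;
    if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

As mhaberler notes above, this path still goes through the stock NIC driver and skbufs, which is what separates it from netmap.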