#linuxcnc-devel | Logs for 2013-06-29

[09:18:37] <zultron> Ha, seb_kuzminsky, I was thrown off by Google. They've got a bug in their calendar.
[09:18:59] <Tom_itx> http://www.timeanddate.com/worldclock/converter.html
[09:19:17] <Tom_itx> for you who are trying to figure out what time the meeting starts, it's 1600 GMT
[10:06:25] <memleak> Hello all!
[10:07:00] <memleak> jepler: I know you were working on the pagefault issue with PREEMPT_RT. Just so you know, Lars Segerlund is actively working to fix the issue and get PREEMPT_RT working.
[10:07:02] <cradek> seb_kuzminsky: I had the same problem. I know I'm -5 or -6 but sometimes get confused about which is which
[10:07:28] <memleak> He did mention there was a lot of overhead cruft, linking issues, and many other things.
[10:08:28] <memleak> "I think I might have a clue, it could be so that we are using a lib
[10:08:28] <memleak> without mlocked pages from a realtime context ! So far it's just a
[10:08:28] <memleak> hunch ... but I'll look into it.
[10:08:28] <memleak> If this is so, a build that is statically linked would perform
[10:08:28] <memleak> better, so it could be a linking problem.
[10:08:29] <memleak> This is only a guess so far, but I haven't found anything obvious."
[10:14:01] <jepler> we can't static link, we depend on dlopen for loadrt
[10:19:47] <jepler> anyway I remain skeptical that the original problem is page faults. until a deadline-missed message is printed, ps says there are zero faults in the realtime thread of rtapi while running 'latency-test 1ms 1ms'
[10:19:51] <jepler> $ ps -L -o pid,tid,min_flt,maj_flt 25448
[10:19:54] <jepler> PID TID MINFL MAJFL
[10:19:56] <jepler> 25448 25448 8890 0
[10:19:59] <jepler> 25448 25449 0 0
[10:20:01] <jepler> this is rtos-master-v0 + dlopen RTLD_NOW fix
[10:20:53] <memleak> I've noticed that too.. Well that complicates things I guess.. :/
[10:22:17] <jepler> unless it's ps which is lying. even after provoking this:
[10:22:20] <jepler> ERROR: Missed scheduling deadline for task 1 [1 times]
[10:22:20] <jepler> Now is 206104.930071115, deadline was 206104.091605354
[10:22:20] <jepler> Absolute number of pagefaults in realtime context: 6
[10:22:35] <jepler> ps is still reporting 0 page faults in thread 25449
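The per-thread MINFL/MAJFL columns that `ps -L` prints above can also be read directly from procfs. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name `thread_faults` is hypothetical and not part of rtapi or LinuxCNC. It reads the minflt and majflt fields (fields 10 and 12 of the `stat` line) for each thread of a process.

```python
import os


def thread_faults(pid):
    """Return {tid: (minflt, majflt)} for every thread of pid (Linux only)."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        with open(f"{task_dir}/{tid}/stat") as f:
            stat = f.read()
        # Field 2 (comm) is parenthesised and may contain spaces, so split
        # only the text after the last ')'. That text starts at field 3.
        fields = stat[stat.rindex(")") + 2:].split()
        # minflt is field 10 and majflt is field 12, i.e. indexes 7 and 9 here.
        result[int(tid)] = (int(fields[7]), int(fields[9]))
    return result


if __name__ == "__main__":
    # Illustration only: inspect this interpreter's own threads.
    print("TID MINFL MAJFL")
    for tid, (minflt, majflt) in sorted(thread_faults(os.getpid()).items()):
        print(tid, minflt, majflt)
```

Pointing it at the rtapi process ID instead of `os.getpid()` would give the same kind of table as the `ps` invocation, which makes it easy to poll the realtime thread's counters in a loop while provoking a missed deadline.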
[10:24:53] <memleak> Do you have anything conclusive or is it just too confusing? Because I've been stuck on it for weeks, and I've been working on it full-time.
[10:26:30] <jepler> I am only a dabbler
[10:26:41] <jepler> I have not run the -rt kernel on hardware I believe has good latency anyway
[10:27:29] <memleak> I don't think it's related to hardware at all, I've tried it with 3 different systems.
[10:27:48] <memleak> All with normally good latency...
[10:28:51] <jepler> anyway as I understand it your experience is different than mine
[10:29:00] <jepler> I get roughly the same results from latency-test 1ms 1ms and from cyclictest
[10:29:16] <jepler> if I understand right your results are different (latency-test worse than cyclictest by way more than 2x)
[10:30:02] <skunkworks> memleak: have you tried idle=poll in the kernel line?
[10:30:36] <memleak> jepler: only in incredibly rare cases it does.
[10:31:06] <skunkworks> the system I am playing with xenomai - it took 2 video cards and idle=poll for the latency to be consistent. <10us
[10:31:20] <memleak> jepler: most of the time it's around 200,000 - 300,000 nanoseconds with PREEMPT_RT
[10:31:40] <memleak> skunkworks, xenomai with linuxcnc hardly compiles without a bunch of changes to the linker flags.
[10:32:20] <memleak> LinuxCNC is even further away with trying to compile it against RTAI
[10:32:41] <memleak> The submakefiles need a lot of work.
[10:33:09] <skunkworks> memleak: well - following the directions here is pretty painless running xenomai with linuxcnc - I have done it a few times.... http://wiki.linuxcnc.org/cgi-bin/wiki.pl?NewRTInstall
[10:33:22] <skunkworks> *quite a few times
[10:33:39] <memleak> make dies in LinuxCNC
[10:33:48] <memleak> ^with that guide
[10:35:24] * skunkworks just did it yesterday...
[10:35:45] <memleak> I have to hardcode several linker flags in the submakefiles then even after it compiles, I get terrible latency. Haven't tried idle=poll / poll=idle in the kernel command line but i still get the same issue with PREEMPT_RT
[10:36:04] <memleak> (i.e. the pagefault error gets thrown)
[10:36:28] <memleak> Works fine for about 2 minutes, then all of a sudden, 200,000 nanoseconds.
[10:36:58] <skunkworks> You should post your errors with the xenomai and linuxcnc. I think mhaberler would like to know.
[10:37:49] <memleak> I did, he laughed at my hard-coded linker flags and said I did it wrong, then disappeared.
[10:38:42] <memleak> Still isn't fixed in upstream, several weeks have gone by.
[10:39:16] <memleak> I'll post the errors to the mailing list though when I get a chance.
[10:39:41] <skunkworks> sounds good
[10:40:11] <memleak> Alright, take care all!
[10:40:25] <memleak> jepler: I'll pass your info along to Lars.
[10:40:47] <memleak> logger[psha], link
[10:40:47] <logger[psha]> memleak: Log stored at http://psha.org.ru/irc/%23linuxcnc-devel/2013-06-29.html
[10:52:40] <jepler> zultron: when you wrote rtapi_get_pagefault_count were you aware of RUSAGE_THREAD ?
[10:56:53] <mhaberler> I think that is 1:1 copy & paste from Michael Buesch or Lars Segerlund code
[10:57:37] <jepler> OK, 'git gui blame' attributed it to John Morris
[10:57:53] <zultron> Yeah, what Michael said
[10:58:58] <jepler> I wonder whether RUSAGE_THREAD should be used there, so that the number of pagefaults in this realtime thread can be reported, not just the numbers in all threads (*including the non-realtime thread*) minus a baseline number
[10:59:26] <jepler> it's quite expected to have page faults in the main thread at least during setup times
[10:59:49] <jepler> .. like when an additional component is loadrt'd
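The distinction jepler draws here, counting faults for one thread rather than for the whole process minus a baseline, can be sketched with `getrusage`. This is an illustration in Python, not the rtapi code (which is C, where the equivalent call is `getrusage(RUSAGE_THREAD, &usage)` with `_GNU_SOURCE` defined); it assumes Linux 2.6.26 or later, and the helper names are hypothetical.

```python
import resource


def faults_in_this_thread():
    """Return (minor, major) page-fault counts for the calling thread only."""
    usage = resource.getrusage(resource.RUSAGE_THREAD)
    return usage.ru_minflt, usage.ru_majflt


def measure(work):
    """Run work() and return the (minor, major) faults it caused in this thread."""
    before = faults_in_this_thread()
    work()
    after = faults_in_this_thread()
    return after[0] - before[0], after[1] - before[1]


if __name__ == "__main__":
    # Touching freshly allocated memory usually causes minor faults.
    minor, major = measure(lambda: bytearray(16 * 1024 * 1024))
    print("minor faults:", minor, "major faults:", major)
```

Because the counters are per-thread, faults taken by another thread (for example the main thread during a `loadrt`) do not show up in the realtime thread's numbers, which is the property the process-wide count lacks.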
[11:01:29] <zultron> memleak, sorry you're experiencing so many build issues. I believe I've pretty much sorted all those out in my UB branch, but that hasn't been merged into Michael's kitchen sink branch yet.
[11:03:02] <zultron> I'll probably start the merge in about a week once I fix non-RIP builds.
[11:42:09] <jepler> agenda item: a limit on the number of persons named "tom", "john" and "jon" shall be set
[11:42:23] <cradek> snrk
[11:45:55] <zultron> :P
[11:49:42] <CaptHindsight> zultron: where is your UB branch located? memleak has only been working off of Michael's kitchen sink branch
[11:50:57] <memleak> zultron: which git tree has your fixes in it so I can review them?
[11:51:03] <memleak> Sorry I missed your comments!
[11:52:18] <memleak> I know you have some pre-packaged .debs but those won't help me much.
[11:52:32] <CaptHindsight> memleak: I asked the same question seconds before you reappeared here :)
[11:52:47] <memleak> Whoops..
[11:53:14] <CaptHindsight> most are in the meeting channel now
[11:58:01] <memleak> Looks like it might be this branch: http://git.mah.priv.at/gitweb?p=emc2-dev.git;a=shortlog;h=refs/heads/dynload-rtapi-common-shm-ub-wip
[12:00:28] <CaptHindsight> memleak: I'll be around if you need to leave, or you can check the logs
[12:01:38] <memleak> I see tons of changes to Submakefiles and header files that should fix the compiling errors at least without the need to hardcode variables in a quick and dirty way.
[12:06:35] <memleak> As for the pagefault / latency issues I don't see anything yet in the tree. Still looking.
[12:07:12] <zultron> memleak, CaptHindsight, I maintain my branch on github. I'll send it after the meeting, 'k?
[12:08:04] <CaptHindsight> zultron: great! it's great!
[13:11:20] <CaptHindsight> https://github.com/zultron/linuxcnc/tree/dynload-rtapi-common-shm-ub-wip
[13:11:46] <CaptHindsight> ok, so that's the branch memleak just found
[13:13:15] <zultron> Good deal. From #linuxcnc just now:
[13:13:16] <zultron> <zultron> CaptHindsight, let me know if you guys need instructions compiling it, but the base case is quite simple: ./configure && make
[13:13:20] <zultron> <zultron> (prepended by ./autogen.sh, of course)
[13:15:25] <CaptHindsight> zultron: thanks
[13:15:51] <zultron> memleak, the branch at git.mah.priv.at may not always be up to date; I tell mhaberler to pull from github now and then, which can take a while.
[13:16:08] <memleak> It's the -wip one right?
[13:16:10] <CaptHindsight> micges: we were having a discussion about this in here :)
[13:16:13] <zultron> Yes, -wip
[13:16:35] <zultron> But use the github remote, not git.mah.priv.at.
[13:16:50] <memleak> Of course, since that one is more up to date.
[13:17:17] <memleak> Did you spend a lot of time fixing the linking errors or was it pretty straightforward?
[13:17:42] <zultron> Frankly, I didn't have a lot of issues, except maybe the -llxrt one you ran into.
[13:17:52] <memleak> Ok. Hmm.
[13:18:03] <memleak> If I have any compiling errors do you want me to let you know?
[13:18:21] <zultron> I had a huge number of problems with the various CFLAGS, and ended up spending a few days sorting that all out.
[13:18:43] <zultron> Yes, I'd like to know, but I can do just limited hand-holding. ;)
[13:18:58] <memleak> CFLAGS that may have broken SSE math support?
[13:19:32] <memleak> i.e. compiling errors with sin/cos functions w/ RTAI
[13:19:54] <zultron> No, CFLAGS that contained linker args (and LDFLAGS that contained compiler flags), per-flavor CFLAGS defined in configure.in, Makefile, various Submakefiles, etc.
[13:19:56] <memleak> RTAI switched to libm recently for 2.6+ kernels
[13:20:08] <zultron> Just lots of mess.
[13:20:13] <memleak> The RTAI math support library has been dropped for 2.6.0+
[13:20:31] <zultron> Ah, well I haven't tried the new RTAI, so I may depend on your help for that.
[13:20:53] <memleak> Ok :) I have my own RTAI repo that I keep maintained much more often than upstream.
[13:21:10] <zultron> I saw the shabbyx repo, is that what you're talking about?
[13:21:24] <memleak> Yeah, I've basically taken full control over his :P
[13:22:02] <memleak> Same SSE compiling error though with upstream RTAI.
[13:22:04] <zultron> It looks like Paolo is not too interested in integrating the changes that Shabby's collected, huh?
[13:22:19] <memleak> Indeed, hence why I joined teams with Shabby.
[13:22:34] <memleak> Paolo is weird sometimes.. >_>
[13:22:59] <zultron> Yes. I fear that will be the downfall of his project.
[13:23:13] <memleak> zultron: did you touch anything in your ub-wip branch in regards to pagefaults?
[13:23:27] <zultron> Why are you putting so much work into RTAI with this bleak outlook?
[13:23:57] <zultron> No, I haven't. Keep me informed of your and Lars's advances!
[13:24:17] <memleak> zultron: Because RTAI has been a trainwreck, I'm just picking up the pieces.
[13:24:23] <memleak> I'll keep you informed!
[13:24:32] <zultron> I've been working purely on the build system.
[13:24:32] <micges> zultron: give link for Shabby changes?
[13:24:42] <memleak> Somebody has to do it... And I'm offering.
[13:24:52] <zultron> memleak, got that link handy for micges?
[13:25:07] <memleak> To be honest, once LinuxCNC works without pagefaults and has good low latency, I probably won't have time to work on RTAI anymore.
[13:25:14] <memleak> Rather move to PREEMPT_RT kernel code.
[13:25:15] <zultron> Well, that's what I'm asking: why do you think someone needs to fix RTAI?
[13:25:16] <memleak> Much cleaner.
[13:25:30] <zultron> Ah, it's the pagefaults issue? Got it.
[13:25:34] <memleak> https://github.com/ShabbyX/RTAI
[13:25:40] <micges> thanks
[13:25:54] <memleak> I'm not fixing RTAI due to pagefaults. I'm fixing it because the code is awful.
[13:26:36] <zultron> Then my question remains: why? I can show you a lot of awful code, but there's no point in fixing it for lots of reasons.
[13:26:48] <memleak> And I was hoping to fix it up enough to use for a business project and do a lot of development with but turns out I really just need to drop RTAI and become an expert with PREEMPT_RT
[13:27:03] <micges> I've heard that all rtai code is awful
[13:27:04] <memleak> Hey I gotta go to an event right now, I'll be back in a few hours.
[13:27:11] <memleak> micges: you heard right!
[13:27:13] <memleak> Bye all!
[13:27:14] <zultron> Bye then!
[13:27:47] <zultron> The RTAI build/install system certainly has problems that drove me batty for a day.
[13:27:50] <memleak> sorry about the bad timing..
[13:27:58] <memleak> zultron: we'll keep in touch!
[13:28:03] <zultron> sure! see you
[14:56:32] <Dave911> I missed the meeting, but read everything. A couple of things have been tabled, putting those issues in limbo. Waiting a month to have the next meeting seems excessively long. Why is the passage of a month required before an up and down vote on tabled issues?
[15:16:25] <KGB-linuxcnc> Kim v2.5_branch db8ebf6 linuxcnc docs/src/code/Code_Notes.txt * Docs: update an outdated paragraph
[15:27:31] <cradek> Dave911: IMO the idea is that the time will let people discuss and think about it
[15:28:25] <cradek> I think "up and down vote" is the wrong way of thinking about the issues that didn't have suitable consensus
[15:29:47] <cradek> the time between meetings is for people to work together to decide what's best and come to agreement
[15:29:54] <cradek> the meeting just formalizes the result
[15:30:58] <cradek> I see the tabled items as "no clear answer yet", not as "limbo"
[15:31:41] <ssi> the votes are easy for the easy stuff
[15:31:46] <ssi> the hard stuff is gonna take some discussion
[15:31:54] <cradek> yes
[15:31:59] <ssi> my concern is that that discussion won't happen in between, and it'll just get bounced month to month
[15:32:27] <cradek> if nobody bothers to cheerlead their idea and build support, then we will continue to have the status quo
[15:32:29] <ssi> or there'll be a discussion, but then at the meeting some dissenters will pop up who didn't participate in the discussion
[15:33:28] <cradek> the onus is on the person who's trying to propose a new action: to build support. I think that's mostly a feature.
[15:34:02] <ssi> you're probably right
[15:34:29] <cradek> I do kind of wonder what happens if someone shows up and votes no on everything just to be a pain - we'll cross that bridge if we come to it I guess
[15:35:33] <cradek> I think if someone is doing that, it will be apparent to everyone
[15:35:56] <ssi> I'm not so much talking about sandbagging like that
[15:36:12] <ssi> I more mean if there's some discussion but someone doesn't participate, but suddenly at the meeting they're gung-ho against
[15:36:54] <cradek> when d rogge proposed this scheme, he said it's a feature that the people who care are the ones who bother to show up
[15:37:12] <cradek> I hope that's what we find in the long run, too
[15:38:26] <cradek> I can imagine being sure that a proposal is bad, and reading the discussion but not being convinced (and being tired of explaining why I think it's bad). In that case, from the outside, my vote would appear like you say. I'm not sure it's a bug in the process, though...? Hard to say.
[15:38:52] <ssi> true
[15:39:10] <cradek> someone today proposed that a NO voter be asked to explain why not
[15:39:16] <ssi> and I guess this is where the "what constitutes a passing vote" argument comes in
[15:39:29] <cradek> yeah, that's the hard thing isn't it
[15:39:59] <ssi> and like I said earlier, for some things it's less important than others
[15:40:07] <ssi> "I think we should implement foo"
[15:40:13] <ssi> you can say no, and thats fine, you don't have to implement it
[15:40:19] <ssi> but "I think we should move our repo"
[15:40:31] <ssi> well everyone's affected by it however it turns out
[15:40:39] <ssi> so the yardstick potentially should be different
[15:40:47] <ssi> I don't know how to generalize that though
[15:40:49] <cradek> I bet for more important things, more agreement is needed
[15:40:55] <ssi> right
[15:42:11] <cradek> I feel slightly out of my league talking about what the ideal process would be -- I'd rather talk about proposals
[15:42:24] <cradek> so I guess I'll wait and see what others say about the process
[15:42:30] <ssi> I imagine it'll evolve
[15:42:32] <cradek> all I know is it seemed to work today
[15:42:34] <cradek> right exactly
[15:46:19] <cradek> that's nice - now on SF our bugs get their own integers
[15:59:16] <Tom_itx> voting no for valid reasons is different than just voting no to be an ass
[15:59:44] <Tom_itx> are the proposals presented supposed to be backed with a solution?
[15:59:55] <Tom_itx> or is it more of a wish list?
[16:01:06] <Tom_itx> i sorta see two different patterns. those with a wish list and those that are more hard core that want core changes made
[16:01:52] <Tom_itx> and i feel i haven't been active in it long enough to propose anything
[16:02:31] <Tom_itx> but i see a few things that would be nice
[20:13:48] <mhaberler> CaptHindsight, memleak: I strongly discourage and disapprove of the use of unannounced and unreleased branches which you seem to randomly pull from here and there and hope to achieve something which I do not understand. Those are work in progress _by us_ and we will let you know when they are ready for consumption. Would you please therefore stop the practice, in particular I have no intent to discuss any contents of same before
[20:13:48] <mhaberler> I, or Jon, or Charles consider them done. If you need an RTOS-branch, this is for you: http://git.mah.priv.at/gitweb?p=emc2-dev.git;a=shortlog;h=refs/heads/rtos-integration-preview3-merged-into-master . Thank you.
[20:59:48] <memleak> I have issues with that branch too but ok..
[21:58:45] <CaptHindsight> at least we're all on the same team :)
[22:01:07] <CaptHindsight> I'll have to catch him before he comments and runs again
[22:01:40] <CaptHindsight> I must have missed something that's causing this drama
[23:56:10] <KGB-linuxcnc> chrisinnanaimo master aea5a59 linuxcnc lib/python/gladevcp/hal_sourceview.py * gladevcp -add search, undo and redo functionality to hal_sourceview
[23:56:10] <KGB-linuxcnc> chrisinnanaimo master 9433e4b linuxcnc (5 files in 2 dirs) * gscreen -add code and widgets for search,undo and redo of Gcode edits