#linuxcnc-devel Logs

Nov 01 2023

12:25 AM linuxcnc-build: build #10388 of 0000.checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/10388 blamelist: Phillip Carter <phillc54@users.noreply.github.com>
02:10 AM linuxcnc-build2: Build [#1792](http://buildbot2.highlab.com/buildbot/#builders/14/builds/1792) of `10-rip.debian-10-buster-rtpreempt-amd64` failed.
02:17 AM linuxcnc-build2: Build [#1873](http://buildbot2.highlab.com/buildbot/#builders/13/builds/1873) of `00-checkin` failed.
03:21 AM linuxcnc-build: build #3570 of 1660.rip-buster-python3 is complete: Failure [failed compile runtests] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1660.rip-buster-python3/builds/3570 blamelist: petterreinholdtsen <pere-github@hungry.com>, Hosted Weblate <hosted@weblate.org>
03:35 AM rigid: rmu: generally it's RCS (realtime-control system architecture) sitting on top of CMS (communication management system) sitting on top of NML (neutral messaging language) which is all part of the libnml dir. zeromq + flatbuffers doesn't implement all of that and a lot of things would need to be rebuilt.
03:53 AM rmu: rigid: where did you find HTTP in there?
03:53 AM rmu: what do you think is missing?
04:00 AM rigid: rmu: https://github.com/usnistgov/rcslib/blob/master/configure.ac#L387
04:01 AM rmu: what is needed is 1) a method to serialize/deserialize messages, 2) a robust IPC mechanism that supports "broadcasting" status updates and request/reply RPC mechanism
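
A minimal sketch of the two requirements rmu lists here, assuming JSON as a stand-in for the serializer and a hypothetical in-process `Bus` class as a stand-in for the transport. None of these names come from LinuxCNC, zeromq or flatbuffers.

```python
import json
from collections import defaultdict


def encode(msg_type, payload):
    """Serialize a message to bytes (JSON stands in for a real wire format)."""
    return json.dumps({"type": msg_type, "payload": payload}).encode("utf-8")


def decode(raw):
    """Deserialize bytes back into (type, payload)."""
    obj = json.loads(raw.decode("utf-8"))
    return obj["type"], obj["payload"]


class Bus:
    """Hypothetical in-process stand-in for an IPC transport."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks
        self._handlers = {}                    # request type -> handler

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        """Broadcast: every subscriber receives its own decoded copy."""
        raw = encode(topic, payload)
        for callback in self._subscribers[topic]:
            callback(*decode(raw))

    def register(self, msg_type, handler):
        self._handlers[msg_type] = handler

    def request(self, msg_type, payload):
        """Request/reply: exactly one handler answers, the caller gets the reply."""
        received_type, received_payload = decode(encode(msg_type, payload))
        reply = self._handlers[received_type](received_payload)
        return decode(encode(received_type + ".reply", reply))[1]


bus = Bus()
seen_a, seen_b = [], []
bus.subscribe("status", lambda topic, payload: seen_a.append(payload))
bus.subscribe("status", lambda topic, payload: seen_b.append(payload))
bus.publish("status", {"interp_state": "READING"})

bus.register("set_mode", lambda p: {"ok": p["mode"] in ("manual", "auto", "mdi")})
reply = bus.request("set_mode", {"mode": "auto"})
```

Both subscribers see the status update, while the `set_mode` request is answered once and only the caller receives the reply.
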
04:01 AM rigid: like for example the "state machine" (e.g. EMC_TASK_INTERP::READING) that's currently implemented using the shmem way provided by RCS
04:01 AM rmu: that is not the RCS in linuxcnc
04:03 AM rmu: hmm. not sure what you mean with the "state machine"
04:04 AM rigid: what's unclear? there is state management
04:04 AM rigid: that'd need to change when you switch from buffers to packets
04:04 AM rigid: or you need a translation layer that turns multiple messages into a single buffer
04:05 AM rmu: this state machine is running in task
04:05 AM rmu: one process
04:06 AM rigid: I'm pretty sure states are scattered all over the place and use the RCS as transport
04:07 AM rmu: no
04:07 AM rigid: you need to provide the state to the application somehow and it needs to be synced
04:08 AM rmu: you can observe that state via emc status and that gets transported but you can influence only via RPC messages
04:08 AM rmu: how that emc status reaches the app, via RCS/shared mem or tcp or via zeromq will not make a difference
04:09 AM rigid: why would a reader want to influence the state of another element?
04:10 AM rigid: it makes a huge difference when rewriting the code. the buffer -> packet paradigm change needs a change of approach.
04:10 AM rmu: hmm. then i don't understand what you mean with "implemented using the shmem way provided by RCS"
04:10 AM rmu: no not really
04:12 AM rmu: this https://github.com/LinuxCNC/linuxcnc/blob/master/src/emc/nml_intf/emc_nml.hh would need some massaging, but from the user API POV it could probably be kept mostly like it is now
04:13 AM rigid: RCS provides you with a piece of memory (iirc it's NML::get_address())... you can read/write that like local memory. But if you additionally trigger an update for the buffer this memory belongs to, RCS will sync that memory with all configured/connected elements.
04:13 AM rmu: so i can keep the EMC_TRAJ_STAT message with all the members intact, put the interp state into the task.interpState member, and send that message to a broadcast EMC Status topic
04:13 AM rigid: state is currently stored/synced there.
04:14 AM rmu: AFAIK you need to trigger that update from the receiving side
04:15 AM rmu: i.e. request it
04:15 AM rmu: so i'm not sure what we are talking about, if you want emc status updates, you shouldn't care if you read that out of shared memory or out of a struct some message queue is handing down to you
04:16 AM rmu: performance will not be a problem
04:17 AM rigid: how do you provide that struct to all components of your application? emcStatus is not only read/written by functions that get it handed
04:17 AM rigid: you need shmem without a major rewrite
04:17 AM rmu: emcstatus is only written by task i would hope
04:19 AM rmu: all applications would need to adapt of course, that means adapt python module used by GUIs and replace shcom with something adequate.
04:20 AM rmu: application would subscribe to topic "interp status change" and wouldn't need to poll the whole emc status struct just to receive interp status updates
04:20 AM rmu: for example
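
The per-topic subscription rmu describes could look roughly like this. It is only a sketch: `StatusPublisher` and the field names are hypothetical, and the publisher simply diffs each new status against the previous one so subscribers hear about a field only when it changes.

```python
from collections import defaultdict

_MISSING = object()  # sentinel so that a first value of None still counts as a change


class StatusPublisher:
    """Publishes a per-field topic only when that field's value changes."""

    def __init__(self):
        self._last = {}
        self._subscribers = defaultdict(list)

    def subscribe(self, field, callback):
        self._subscribers[field].append(callback)

    def update(self, status):
        for field, value in status.items():
            if self._last.get(field, _MISSING) != value:
                self._last[field] = value
                for callback in self._subscribers[field]:
                    callback(value)


pub = StatusPublisher()
interp_events = []
pub.subscribe("interp_state", interp_events.append)

pub.update({"interp_state": "IDLE", "feedrate": 1.0})
pub.update({"interp_state": "IDLE", "feedrate": 0.5})     # feedrate changed, interp did not
pub.update({"interp_state": "READING", "feedrate": 0.5})  # interp changed
```

The subscriber is called twice (for `IDLE` and `READING`) even though three full status updates were produced.
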
04:22 AM rigid: i'd guess you'd basically reinvent the wheel by solving everything that RCS solved using zeromq+flatbuffers, creating a new standard on the way.
04:22 AM rigid: but well... probably only a POC would show
04:23 AM rmu: zeromq and flatbuffers is the "reinvent the wheel" part, all already done and there
04:23 AM rmu: yes i'm working on a POC to "exfiltrate" emc status into an external app
04:24 AM rmu: with the current system you only have channels, you can't really "send message to all" or "send message to app <...>"
04:25 AM rmu: e.g. error messages, only one reader gets it
04:26 AM rigid: yes, it's buffer based with readers/writers... not packet based with senders and receivers. why might NIST have chosen that way? hm...
04:26 AM rmu: this stuff is >25 years old. the world was different then
04:27 AM rigid: that's a big misconception. you mean they didn't know packet based protocols back then?
04:27 AM rmu: now getting an alert on your phone if something is wrong on your machine should be trivial, back then that would be a major headache (integrating a pager service or something like that)
04:28 AM rigid: i haven't quite figured out how to write to multiple readers but I'm sure it's possible without code changes.
04:28 AM rmu: even figuring out if a new message has been received isn't really trivial and uses some hacks like comparing message serial numbers at the user-end of the API
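
The serial-number comparison rmu mentions can be illustrated with a toy model. `StatusBuffer` and `PollingReader` are hypothetical stand-ins, not libnml classes; the point is only that the reader has to remember the last serial it saw to tell "new" from "unchanged".

```python
class StatusBuffer:
    """Stand-in for a shared buffer: the writer overwrites it and bumps a serial."""

    def __init__(self):
        self.serial = 0
        self.data = None

    def write(self, data):
        self.data = data
        self.serial += 1


class PollingReader:
    """Detects fresh data by comparing the buffer serial with the last one seen."""

    def __init__(self, buf):
        self._buf = buf
        self._last_serial = 0

    def poll(self):
        """Return the data if it is new, or None if nothing changed since last poll."""
        if self._buf.serial == self._last_serial:
            return None
        self._last_serial = self._buf.serial
        return self._buf.data


buf = StatusBuffer()
reader = PollingReader(buf)

first = reader.poll()    # nothing written yet
buf.write("error: limit switch")
second = reader.poll()   # new serial, data is returned
third = reader.poll()    # same serial, nothing new
```
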
04:29 AM linuxcnc-build2: Build [#1803](http://buildbot2.highlab.com/buildbot/#builders/11/builds/1803) of `10-rip.debian-12-bookworm-rtpreempt-amd64` failed.
04:29 AM rigid: nah, it would have been pretty easy back then. you'd connect your POCSAG service to RS232 and parse log output. (or connect to your NML buffers)
04:30 AM rmu: back then your typical shop running such hacks wouldn't even have had internet
05:09 AM rigid: i thought you meant real, physical pager devices
05:12 AM rigid: also not sure why anyone would want to integrate that to the (realtime) machine messaging. It could be done with some script employing emcsh (which is horrible for 2023 standards btw.)
05:13 AM rigid: *is in a horrible state for 2023 standards
05:16 AM rigid: but what do I tell you... the code seems like every past contributor introduced their own debug log message handling :)
05:18 AM linuxcnc-build: build #10390 of 0000.checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/10390 blamelist: petterreinholdtsen <pere-github@hungry.com>, Hosted Weblate <hosted@weblate.org>
05:18 AM rigid: I didn't look at 100% of the code but actually now thinking of it, the libnml/ folder gave me the least "WTFs" and the most "OHOKs"
05:23 AM rmu: the RCS in linuxcnc doesn't carry realtime messages
05:24 AM rigid: hence the parentheses.
05:25 AM rigid: it could by design (and in practice with minor changes). even remotely, although it's not recommended.
05:25 AM rmu: error message is something for human consumption so why not send it to a end-user-device
05:27 AM rigid: sure, I don't say you shouldn't send it to an end-user-device.
05:27 AM rigid: But you wouldn't send a tweet directly from your microcontroller in an industrial application as well, would you?
05:28 AM rmu: that's a question of definition, the kind of "hard" realtime that is needed to control the hardware was not possible with RCS ever (on linux)
05:28 AM rigid: "pluck the code into the firmware layer, connect it to the net and off you go, YOLO" :)
05:28 AM rmu: i would not send a tweet, no, but i would send something to my element / matrix thingy
05:29 AM rigid: yep, but imo that shouldn't be handled by a core component but rather by an UI that glues to the protocol
05:30 AM rmu: it doesn't belong in the GUI IMHO
05:30 AM rigid: i like the unix principle of apps that can communicate. as said, i'd do it with emcrsh without side effects even if everything is crashing
05:31 AM rmu: there are a bunch of different GUIs and part of the mess we have is that everybody has to reinvent the wheel somewhat
05:31 AM rmu: you can't get error messages reliably into multiple endpoints
05:32 AM rigid: yep, that's the result of a few antipatterns. I mean, how many implementations of emc/shcom.cc are there?
05:33 AM rigid: i'm pretty sure you can. even cleanly without peeking. I almost finished my project and after that, I'll look into it.
05:34 AM rigid: I also wonder about the best way to transfer a gcode file completely, before running it. that'd be nice. but it's not top priority for me rn
05:34 AM rmu: so now we have the messaging between task and the realtime-stuff, the messaging between gui and task (NML) and then you would add another layer to push errors from GUI to end user device
05:35 AM rmu: the gcode-file-stuff is tricky
05:35 AM rmu: https://github.com/LinuxCNC/linuxcnc/issues/2490
05:36 AM rmu: there is no synchronisation with error messages, whoever reads the message consumes it
05:37 AM rigid: no, i'd use the messaging between gui and task and use an existing (G)UI to introduce the least amount of new code possible (just what's needed for matrix/twitter/etc.)
05:37 AM rigid: ah, there's an issue on that. tnx
05:38 AM rmu: thats putting another thing on top of gui.
05:38 AM rmu: a solution for specific gui but not general gui-agnostic solution
05:38 AM rigid: i wouldn't use axis. but if I would: why not just use axis-remote to get errors?
05:39 AM rigid: don't see what this issue has to do with it
05:39 AM rmu: it's just an example
05:39 AM rigid: for why I need to connect to NML to pull out error messages from linuxcnc?
05:40 AM rmu: same thing with tools, offsets, variables, multiple interpreters etc...
05:40 AM rigid: not sure if axis-remote can get error messages. it should. but emcsh can
05:41 AM rigid: it should be the command line shell for linuxcnc where I'd plug my script into
05:41 AM rigid: others might prefer using python bindings to get error messages
05:41 AM rmu: the point is only one endpoint can get error messages, if you have multiple endpoints, it is more or less pseudo-random who gets a message
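
The behaviour rmu describes is what any single shared queue does, and a fan-out is one way around it. The sketch below uses only Python's `queue` module; `FanOut` is a hypothetical helper, not something that exists in LinuxCNC.

```python
import queue


def drain(q):
    """Read everything currently in a queue without blocking."""
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            return items


# One shared queue: whoever reads first consumes the message.
shared = queue.Queue()
shared.put("joint 0 following error")
gui_a = drain(shared)
gui_b = drain(shared)   # nothing left for the second endpoint


class FanOut:
    """One queue per endpoint; the sender copies each message into all of them."""

    def __init__(self):
        self._queues = []

    def attach(self):
        q = queue.Queue()
        self._queues.append(q)
        return q

    def send(self, msg):
        for q in self._queues:
            q.put(msg)


errors = FanOut()
inbox_a = errors.attach()
inbox_b = errors.attach()
errors.send("joint 0 following error")
fan_a = drain(inbox_a)
fan_b = drain(inbox_b)
```

With the shared queue only `gui_a` sees the error; with the fan-out both endpoints do.
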
05:41 AM rigid: but certainly don't handle NML messages directly
05:42 AM rigid: well, I'm not sure if it's possible to have multiple endpoints that have the same name. Might be.
05:42 AM rigid: *process name in nml cfg
05:43 AM rmu: did you try it
05:44 AM rigid: no
05:44 AM rigid: still working on emcrsh and other stuff
05:46 AM rigid: gcode file handling would be easy, if the UI would send every command line-by-line
05:47 AM rmu: as i said that's tricky
05:47 AM rigid: but that'd suck. the job shouldn't abort when the GUI crashes.
05:47 AM rmu: drip-feeding
05:47 AM rmu: hmm. not sure. i probably would prefer the machine to ESTOP if you can't control it any more
05:47 AM rmu: from the GUI
05:48 AM rmu: there are macros in external files
05:48 AM rmu: there are remaps
05:49 AM rigid: yeah, that might be desirable. but if you have multiple GUIs, one disconnecting should not halt everything.
05:49 AM rigid: i'd probably use some watchdogs to do ESTOP triggering in certain conditions.
05:50 AM rigid: ...probably also just a running instance of emcsh just receiving "set estop on" :-)
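
A heartbeat watchdog of the kind rigid hints at might be sketched like this. The `Watchdog` class is hypothetical and the clock is injectable so the example runs deterministically; in a real setup `on_expire` would issue whatever ESTOP request the installation uses (rigid's example is `set estop on` through the remote shell).

```python
import time


class Watchdog:
    """Trips once if no heartbeat arrives within timeout_s seconds."""

    def __init__(self, timeout_s, on_expire, clock=time.monotonic):
        self._timeout = timeout_s
        self._on_expire = on_expire
        self._clock = clock
        self._last_kick = clock()
        self.tripped = False

    def kick(self):
        """Call whenever a heartbeat arrives from the supervised client."""
        self._last_kick = self._clock()

    def check(self):
        """Call periodically; fires on_expire once when the heartbeat is overdue."""
        if not self.tripped and self._clock() - self._last_kick > self._timeout:
            self.tripped = True
            self._on_expire()
        return self.tripped


now = [0.0]          # fake clock so the example does not need to sleep
events = []
wd = Watchdog(2.0, lambda: events.append("estop"), clock=lambda: now[0])

now[0] = 1.5
wd.kick()            # heartbeat received at t=1.5
now[0] = 3.0
ok_state = wd.check()     # 1.5 s since the last heartbeat: still fine
now[0] = 4.0
late_state = wd.check()   # 2.5 s since the last heartbeat: trips
wd.check()                # already tripped, callback is not fired again
```
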
05:51 AM rmu: with multiple GUIs you will probably have one "master" gui. whatever. details.
05:52 AM rmu: at the moment you can't really use 2 GUIs at the same time, everything is a mess, errors, tools, offsets, preview, backplot etc...
05:52 AM rmu: how does a remote gui get at the g-code loaded by another gui
05:52 AM rmu: etc
05:53 AM rmu: this "g-code file is modified without the gui noticing" problem that https://github.com/LinuxCNC/linuxcnc/commit/3db35073cab2d054724ea2d4f26f7cc440a3ef59 tried to fix for axis
05:53 AM rigid: it doesn't. it gets active_commands[] attribute
05:57 AM rigid: i'm really not sure yet on how to do it properly. maybe introduce another buffer, or just add some new loaded_commands[] attribute or patching into the read-line function. i have to look deeper into it.
05:57 AM rigid: i could imagine multiple ways. best would be to do it transparently, so GUIs don't need patches.
05:58 AM linuxcnc-build: build #3571 of 1660.rip-buster-python3 is complete: Failure [failed compile runtests] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1660.rip-buster-python3/builds/3571 blamelist: Petter Reinholdtsen <pere@hungry.com>
06:01 AM rmu: rigid: do you know this http://static.mah.priv.at/public/tutorial/machinetalk-tutorial.pdf
06:01 AM rmu: page 4, "NML bug list"
06:02 AM rmu: I don't really agree on all points
06:03 AM linuxcnc-build: build #3973 of 1640.rip-buster-rtpreempt-amd64 is complete: Failure [failed compile runtests] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1640.rip-buster-rtpreempt-amd64/builds/3973 blamelist: Petter Reinholdtsen <pere@hungry.com>
06:03 AM rigid: "so uninterpreted end2end message contents not possible" that's simply not true, just change "xdr" in nml config to "ascii" (something like that, need to look it up)
06:04 AM rigid: even our libnml implementation provides multiple ways of encoding iirc
06:04 AM rmu: no support for common interaction patterns like remote procedure call or publish/subscribe
06:04 AM rmu: no routing support in presence of multiple consumers/multiple producers
06:05 AM rmu: no clear concept of how to indicate 'message consumed'
06:05 AM rmu: consumption semantics
06:05 AM rmu: unmaintained and unused outside linuxCNC
06:05 AM rigid: some bullet points boil down to linuxcnc not using a code generator where it was meant to do
06:05 AM rmu: low level of abstraction
06:06 AM rmu: no support for encryption and authentication
06:06 AM rmu: no abstraction to support message-based interaction between RT components
06:06 AM rmu: those are all painfully valid
06:07 AM rigid: yeah pub/sub is missing but why would I not configure everything beforehand? i don't see what that would solve beyond being more plug'n'playish. Which admittedly is nice to have. Wouldn't argue against that.
06:08 AM rigid: missing encryption/authentication is still not uncommon in industry applications. tunnels are used for that.
06:08 AM rigid: and it's still pretty nice if it's optional
06:08 AM rigid: i.e. if you can turn it off
06:09 AM rigid: i don't get "no abstraction to support message-based interaction between RT components" ... that's not part of NMLs job, is it?
06:11 AM rigid: i mean, I agree... there should be some simple way for every app to simply do something like "machines[0].emcStatus.motion.spindles[0].forward = 1" and not having to construct a message
06:12 AM rigid: but why would the rcs/nml part implement that. that's emc/nml_intf territory
06:12 AM rmu: no that particular example is not needed ever
06:12 AM rmu: status is one way
06:14 AM rmu: the interface between non-realtime and realtime world is currently split between HAL and task, task uses NML to communicate with non-rt-world and you don't want to put another messaging layer on top of that
06:15 AM rigid: "no complete EMC NML bindings available even today for say Python" to me sounds like "why can't I plug my word processor into the hdd driver?"
06:16 AM rmu: not really. replace python with javascript. you would need a translation layer if you wanted to write a browser-based GUI
06:16 AM rigid: i wouldn't necessarily put another messaging layer on top of that. but another abstraction layer (a clean, well defined one that's used uniformly from everywhere)
06:17 AM rmu: ?
06:17 AM rmu: currently the interface to the outside world is HAL and emctask
06:17 AM rigid: a browser based GUI would probably better work similar to a glade gui with the GUI app being a server for clients, not the client directly
06:18 AM rmu: the browser gui wants to send "cycle run" to emctask directly not via some other protocol, additional layer or proxy
06:18 AM rigid: i mean, PLCs running web interfaces actually exist... but it's not a good idea
06:19 AM rigid: that's like asking the browser app to connect to unix domain sockets just because your backend uses that for IPC
06:19 AM rmu: because PLCs often are not a good idea in the first place and PLC firmware is notoriously buggy with things internet
06:19 AM rigid: of course you want to separate concerns
06:20 AM rmu: the GUI is the browser part, executing GUI commands is task
06:20 AM rmu: no need to add a translator
06:20 AM rigid: NML http tunneling seems to be more of an edge case where you need firewalls/proxies between components that you can't control
06:20 AM linuxcnc-build: build #10391 of 0000.checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/10391 blamelist: Petter Reinholdtsen <pere@hungry.com>
06:21 AM rmu: this browser thing won't happen over NML
06:21 AM rigid: you would also run a webfrontend from your postgresql process, wouldn't you? :)
06:21 AM rigid: exactly
06:21 AM rigid: it won't happen over NML
06:22 AM rigid: but how to do it without translator?
06:22 AM rmu: i don't run a webfrontend from postgresql process, but i'm sure you can talk directly to postgres from javascript
06:22 AM rmu: and don't need to load some javascript translator service into postgres
06:22 AM rigid: i'm sure you can. but i'm also sure that you never-ever-ever should do that. not from a browser at least.
06:22 AM rmu: https://node-postgres.com/
06:23 AM rmu: that depends. maybe for some sql development tool that would make sense.
06:23 AM rmu: --> electron
06:23 AM rigid: this is stuff that runs js in the backend. not the browser. it's used with a "translator" to serve a webserver
06:24 AM rmu: not necessarily
06:24 AM rigid: oh... you like electron
06:24 AM rigid: i think it's very necessary
06:24 AM rmu: i wouldn't say i like electron but it is there and you can't ignore it
06:25 AM rmu: postgres admin tool as electron app would talk directly to postgres
06:25 AM rmu: and is more or less directly a chrome browser
06:25 AM rigid: even having a postgresql listening on a non-dedicated network interface is bad practice i'd say
06:26 AM rmu: ssh tunnel
06:27 AM rmu: whatever. i wouldn't expose the port to the unfiltered internet for sure
06:27 AM rmu: thats beside the point
06:29 AM rmu: boundary between gui and machine is emctask.
06:30 AM rmu: and it shouldn't be that convoluted like it is now to talk to emctask.
06:30 AM rigid: sadly, it's not the only single boundary
06:31 AM rigid: yes, there should be a single clean API to "talk to emctask"
06:31 AM rigid: ...for GUIs.
06:31 AM rmu: emctask should have a clean api
06:31 AM rigid: maybe even higher level than emctask
06:32 AM rmu: not some additional layer between gui and emctask
06:33 AM rigid: dunno, zero-effort highlevel APIs are nice.
06:34 AM rigid: it would be different if emctask would be the only linuxcnc interface GUIs have to touch.
06:36 AM rmu: currently it's not possible to present a clean API to guis, they have to talk that NML stuff to task, look at hal, look at file system, and deal with tools. on top of that there are evil interactions with the interpreter doing its own thing when to open and read param file / tool table
06:37 AM rigid: I agree. some work would be needed.
06:37 AM rmu: the HAL stuff will need to grow a real IPC interface, task will have to get rid of NML, tool table will need additional work
06:38 AM rigid: "grow a real IPC interface" ... you mean another IPC interface.
06:39 AM rigid: I think it'd be quite trivial to integrate HAL into NML (at least with io present). Maybe even arbitrarily usable so that you could config HAL stuff via NML.
06:39 AM rigid: the latter maybe not so trivial
06:39 AM rmu: you would need a bunch of messages, there is no request/reply thing in nml, so that would be a very buggy and painful experience
06:40 AM rmu: nobody did it in the last 20 years
06:40 AM rigid: tooltable is a mess. it's a shame if DB_PROGRAM goes
06:41 AM rigid: "no request/reply thing in nml"? wdym?
06:41 AM rmu: as long as the IPC situation is not sorted it will not get a standard API but stay a mess
06:41 AM rmu: you can't really do a remote procedure call
06:41 AM rmu: nobody knows who gets the answer
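
What "knowing who gets the answer" usually means in a message-based RPC is a reply address plus a request id carried with every call. The sketch below is a hypothetical in-process model (the `Server` and `Client` classes are invented for illustration) of that correlation idea, not a description of NML or of any proposed replacement.

```python
import itertools
from collections import deque


class Server:
    """Answers each request on the inbox the request says to reply to."""

    def __init__(self, handler):
        self._handler = handler
        self.inbox = deque()

    def step(self):
        while self.inbox:
            reply_to, request_id, payload = self.inbox.popleft()
            reply_to.append((request_id, self._handler(payload)))


class Client:
    _next_id = itertools.count(1)  # shared counter so ids are unique across clients

    def __init__(self, server):
        self._server = server
        self._inbox = deque()
        self.replies = {}          # request id -> reply

    def call(self, payload):
        """Send a request tagged with an id and this client's own reply inbox."""
        request_id = next(Client._next_id)
        self._server.inbox.append((self._inbox, request_id, payload))
        return request_id

    def collect(self):
        while self._inbox:
            request_id, reply = self._inbox.popleft()
            self.replies[request_id] = reply


server = Server(lambda tool_no: {"tool": tool_no, "known": tool_no in (1, 2)})
gui_a, gui_b = Client(server), Client(server)

id_a = gui_a.call(1)
id_b = gui_b.call(7)
server.step()
gui_a.collect()
gui_b.collect()
```

Each client ends up with exactly the reply to its own request, which is the property rmu says is missing.
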
06:42 AM rigid: well, i can work with NML as current IPC solution, until another *working* one comes along.
06:42 AM rmu: you are welcome to put in the effort but i'm pretty sure that would be a waste of time
06:42 AM rigid: it's easy. it seriously lacks a code generator tho. and it's C/C++ but well...
06:44 AM rigid: well, i already put in some effort for a project. I'd not put any effort in beyond that if the stuff is removed in the near future.
06:45 AM rmu: predictions are hard, especially those concerning the future
06:45 AM rigid: although "near future" in terms of the "remove NML" mantra could mean "another decade" :)
06:45 AM rmu: i hope not
06:45 AM rigid: i'd rather fix the stuff we have first until there's at least a working PoC
06:45 AM rigid: i wonder what brickwalls machinekit ran into
06:46 AM rigid: or what compromises it had to make
06:46 AM rmu: they didn't replace anything, just added things on top of existing stuff
06:47 AM rigid: rmu: i didn't say "replace" but "remove". and there has been stuff removed.
06:47 AM rmu: you can't just remove NML, it has to be replaced with something else
06:47 AM rigid: (if stuff is replaced, I would see no problem at all)
06:48 AM rigid: well, you can cut out NML and do stuff differently. Of course you'd miss out all the features/checks/controls
06:49 AM rmu: no you would need some sort of IPC in any case
06:49 AM rigid: what about unscheduled shmem access YOLO?
06:50 AM rmu: like now?
06:50 AM rigid: mompls
06:53 AM rigid: rmu: iirc that's basically what happens when not using ./configure --enable-toolnml
06:57 AM rigid: what exactly happened to emc/iotask/*? i can't find any reference to what happened
07:00 AM rmu: toolnml?
07:00 AM rmu: you can't do anything with toolnml
07:00 AM rigid: why not?
07:01 AM rmu: iotask you need to ask rene, i think he moved the remaining stuff into task
07:02 AM rmu: toolnml is just a view of some version of the tooltable
07:02 AM rmu: that bloats status message
07:02 AM rigid: hm, I would think io and task are separate concerns
07:03 AM rigid: some version? you don't think it's synced with the NML master model?
07:03 AM rmu: there is no "master model", there is what the interpreter uses
07:04 AM rmu: i'm not sure that the emcstatus view of that is a bug-free reflection
07:04 AM rigid: hm, i didn't understand the mix of emcioStatus.tool and the separate tool NML buffers that once were in linuxcnc.nml
07:04 AM rmu: not sure there is anything to understand there
07:05 AM rmu: you would have to dig into old code and probably not find very much there
07:05 AM rigid: rmu: there is a master model. it's the buffer of the process that has a "1" in the "master" column of linuxcnc.nml
07:06 AM rigid: interestingly masters can be servers or clients
07:07 AM rmu: are you sure the interpreter communicates its idea of tool table to emctask?
07:08 AM rmu: even in case of tool db program etc...?
07:08 AM rmu: that's what i meant
07:08 AM rigid: no, I haven't played with it and since there is still high fluctuation I postponed tooltable. I also don't really need it right now.
07:10 AM rigid: still DB_PROGRAM is awesome. certainly not perfect but it already covers a gazillion use cases.
07:11 AM rigid: rmu: also see how it uses simple line based pipe IPC so DB programs don't need to talk to linuxcnc directly
07:12 AM rmu: why would the DB program talk to linuxcnc
07:12 AM rmu: linuxcnc talks to DB program
07:12 AM rmu: DB program answers
07:12 AM rmu: not the other way round
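
The general shape of a line-based request/answer exchange over a pipe, with the direction rmu describes (the controller asks, the DB program answers), can be sketched as follows. The `get`/`put` commands and the reply format are made up for illustration and are NOT the actual DB_PROGRAM protocol; `io.StringIO` stands in for the two pipe ends.

```python
import io


def serve(requests, replies, tools):
    """Answer one reply line per request line. Commands here are hypothetical."""
    for line in requests:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "get" and len(parts) == 2:
            diameter = tools.get(parts[1])
            if diameter is None:
                replies.write("err unknown-tool\n")
            else:
                replies.write(f"ok {diameter}\n")
        elif parts[0] == "put" and len(parts) == 3:
            tools[parts[1]] = float(parts[2])
            replies.write("ok\n")
        else:
            replies.write("err bad-request\n")


tools = {"1": 6.0}                      # tool number -> diameter
out = io.StringIO()                     # stands in for the DB program's stdout
serve(io.StringIO("get 1\nput 2 3.175\nget 2\nget 9\n"), out, tools)
lines = out.getvalue().splitlines()
```

Because the exchange is plain text, one line in and one line out, the DB program can be written in any language that can read stdin and write stdout.
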
07:13 AM rigid: I didn't mean to imply direction
07:14 AM rmu: if you edit tool in GUI it usually writes into tool.tbl directly
07:14 AM rigid: yeah, that blows majorly
07:14 AM rmu: there is no remote interface to edit tool table
07:15 AM rmu: and i'm pretty sure there exist races with tool.tbl similar to g-code files
07:15 AM rigid: don't you get one with --enable-toolnml ?
07:15 AM rmu: so GUI and interpreter are not in sync
07:15 AM rmu: no, you don't get to edit that, it is just a copy of whatever in emcstatus
07:16 AM rigid: hm... it would be awesome if we had a layer that does atomic shmem IPC to sync models... hmmm :)
07:16 AM rigid: yeah, the backend sync to tool.tbl might be missing
07:16 AM rigid: I wonder if that's also true for DB_PROGRAM
07:17 AM rmu: whatever your "backend" is
07:17 AM rigid: anyway, that'd be a bug, not a design issue
07:17 AM rigid: either tool.tbl or whatever DB_PROGRAM is specified
07:17 AM rmu: the bug is that nobody felt it was worth the effort to do that via NML
07:17 AM rmu: because of a bunch of NML problems
07:18 AM rmu: least one is code generator, worse ones are unclear semantics which you seem to ignore
07:19 AM rigid: afais, it was done via NML before until dngarret made it optional in 2021
07:19 AM rigid: what semantics are unclear?
07:19 AM rmu: show me the code path how a new tool would be added via NML
07:19 AM rmu: in the old code
07:20 AM rigid: i didn't say that codepath isn't missing. i could perfectly imagine that no one ever bothered editing tool.tbl by hand or using their own way to add tools via DB_PROGRAM. Should be trivial to add, tho.
07:21 AM rigid: same with hal pins/signals etc.
07:22 AM rigid: quite verbose without codegeneration, tho.
07:22 AM rmu: unclear semantics is there is no clear way to make a RPC and get an answer back
07:26 AM rigid: not sure what you mean. RW buffers can be read/written arbitrarily. Maybe there really is a misconception on the packet vs. buffer paradigm difference
07:27 AM rigid: RCS can be and is used for packet messages, but *_STAT and underlying RPC stuff uses buffers, if I got that right.
07:28 AM rmu: i don't understand what you mean with this packet vs buffer thing
07:30 AM rigid: in our case, a packet is a piece of structured data with a sender & (multiple) receiver. A buffer is an unstructured abstract piece of memory that "magically" is always in sync.
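
The distinction rigid draws can be made concrete with two toy classes. Both are hypothetical models written for this illustration: `Buffer` keeps only the latest value, while `PacketChannel` delivers every message to each connected receiver.

```python
from collections import deque


class Buffer:
    """Latest-value semantics: a write replaces the contents."""

    def __init__(self):
        self.value = None

    def write(self, value):
        self.value = value

    def read(self):
        return self.value


class PacketChannel:
    """Message semantics: every send is queued for each receiver, in order."""

    def __init__(self):
        self._inboxes = []

    def connect(self):
        inbox = deque()
        self._inboxes.append(inbox)
        return inbox

    def send(self, packet):
        for inbox in self._inboxes:
            inbox.append(packet)


buf = Buffer()
chan = PacketChannel()
inbox = chan.connect()

for state in ("IDLE", "READING", "WAITING"):
    buf.write(state)    # overwrites the previous state
    chan.send(state)    # queues a separate packet per state

latest = buf.read()
history = list(inbox)
```

A reader of the buffer that looks only after the loop sees `WAITING` and nothing else; the packet receiver sees all three transitions.
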
07:31 AM rmu: in HAL you have memory that is always in sync, but nowhere else
07:31 AM rigid: emcStatus is kept in sync as well as emcCommand and emcError
07:32 AM rigid: of course you have to query updates
07:32 AM rmu: you have to request updates to status
07:33 AM rmu: command is into task, so what is in sync there? error is out of task queuedepth 1 and not multiple receiver
07:33 AM rigid: yes. but you don't have to handle "status-update" packets. it's just *poof* - your model is synced.
07:33 AM rmu: you have to poll somewhere if the update has arrived
07:34 AM rigid: yeah, i'll look into that multiple receiver issue
07:35 AM rigid: yes, with realtime stuff polling is often the preferred way. although I guess a blocking implementation would be doable.
07:36 AM rigid: basically you don't care if stuff arrives. you could read back & validate status tho
07:38 AM rigid: hm. the initial problem might also be, that codegenerators were chosen to tackle missing reflection in the first place
07:38 AM rigid: but I guess they had their reasons
07:43 AM rmu: there is a code-generator somewhere
07:43 AM rmu: java program
07:44 AM rigid: yeah. it's horrible. and fullblown. This thing generates ADA code!! :)
07:44 AM rigid: what linuxcnc needs would be MUCH more simple.
07:53 AM geshwin: Good enough
07:53 AM geshwin: i dominate this channel
07:54 AM geshwin: There’s only 26here
07:54 AM geshwin: Let’s see who’s where
07:54 AM geshwin: andypugh
07:54 AM geshwin: alex_joni
07:54 AM geshwin: i will call you all out
07:54 AM geshwin: face the nature call
07:54 AM geshwin: Connor
07:54 AM rmu: geshwin: kindergarden is over in the other channel
07:55 AM rigid: kickban plz
07:55 AM geshwin: Oust you then first
07:55 AM geshwin: spam specter you both
07:55 AM geshwin: spewter
07:55 AM geshwin: rigid
07:55 AM geshwin: none of your business
07:55 AM geshwin: keep like a still stone serve you good
07:55 AM geshwin: rmu
07:56 AM geshwin: you spelt kindergarten in a wrong alphabet
07:56 AM geshwin: it is t rather than d
07:56 AM geshwin: maybe it’s you that should be ousted there
07:56 AM geshwin: rm
07:56 AM rigid: how to get /ignore'd in 20 seconds
07:56 AM geshwin: rmu
07:56 AM geshwin: rmu
07:56 AM geshwin: no way little kid
07:56 AM JT-Cave: oh I see
07:56 AM geshwin: rigid
07:57 AM geshwin: JT-Cave
07:57 AM geshwin: You are never allowed to reply in this channel
07:57 AM geshwin: raise your hands first
08:05 AM -!- #linuxcnc-devel mode set to +o by ChanServ
08:05 AM JT-Cave: well hell it's gone before I could ban it
08:08 AM -!- #linuxcnc-devel mode set to -o by ChanServ
08:08 AM linuxcnc-build2: Build [#1797](http://buildbot2.highlab.com/buildbot/#builders/14/builds/1797) of `10-rip.debian-10-buster-rtpreempt-amd64` completed with warnings.
08:10 AM linuxcnc-build2: Build [#1798](http://buildbot2.highlab.com/buildbot/#builders/7/builds/1798) of `10-rip.debian-11-bullseye-rtpreempt-amd64` completed with warnings.
08:14 AM linuxcnc-build2: Build [#1804](http://buildbot2.highlab.com/buildbot/#builders/11/builds/1804) of `10-rip.debian-12-bookworm-rtpreempt-amd64` completed with warnings.
08:28 AM linuxcnc-build2: Build [#1878](http://buildbot2.highlab.com/buildbot/#builders/13/builds/1878) of `00-checkin` completed successfully.
09:40 AM rmu: so the bot situation is under control again
10:27 AM rmu: you can take a look, it is completely unfinished and rough but it does do something. https://github.com/rmu75/cockpit
10:29 AM mozmck: Thanks. It looks like it was somewhat limited in layout capabilities so I didn't look too much further. I've used FLTK for a lot of utilities.
10:29 AM rmu: i should probably add a screenshot
10:30 AM mozmck: There is one guy working on replacing NML with OpenDDS, but it is so large it may not work on boards like the RPI which makes me not like the idea.
10:31 AM mozmck: I'm not a fan of bloat, and seems like everything is getting bigger and slower with little actual improvement.
10:32 AM rmu: OpenDDS is a behemoth
10:33 AM rmu: somebody pointed to the repository, it is empty
10:33 AM rmu: no activity in 2 or 3 years so i think that effort was abandoned
10:34 AM mozmck: https://github.com/auto-mation-assist/LinuxCnc-OpenDDS-Work
10:35 AM mozmck: Yeah, no activity this year anyhow.
10:36 AM rmu: zeromq seems lightweight enough, drawback is there is no real standard RPC mechanism on top
10:37 AM mozmck: Maybe we could simply improve RCS/NML?
10:40 AM rmu: screenshot https://unfoo.net/~robert/Screenshot_cockpit.png
10:40 AM rmu: i fear it is not so simple
10:41 AM rmu: rene advocates getting rid of nml
10:51 AM mozmck: It seems getting rid of nml is not simple either. Machinekit was planning to do that but I think they just added zeromq alongside nml and never actually replaced nml
10:51 AM rmu: yes they added a bunch of stuff with protobuf and zeromq
10:52 AM rmu: i think it would be nice to have a nice clean network transparent API to task, the tool table, and to hal
10:53 AM rmu: ideally it should be possible to talk to those apis from C/C++, python, javascript and other languages "out of the box"
10:57 AM rmu: https://capnproto.org/ looks promising
11:03 AM mozmck: Hmm, that does look interesting. By the author of protobuf too!
11:05 AM rmu: in use at cloudflare, so that will not go away
11:06 AM mozmck: "Cap’n Proto features an RPC system that implements time travel such that call results are returned to the client before the request even arrives at the server!"
11:07 AM rmu: messages with space-like separation
11:10 AM rmu: that promise pipelining seems to be a useful feature
11:10 AM mozmck: yeah, reading that now
11:16 AM mozmck: Looks like zeromq/nanomsg would not be needed with capn
11:21 AM rmu: how many of these message queuing standards/libraries/programs are there
11:23 AM rmu: didn't know nanomsg, that also looks very interesting
11:34 AM rmu: hmm. another sibling https://nng.nanomsg.org/
12:09 PM pere: rmu: there are hundreds of message queueing implementations.
02:09 PM rigid: mozmck: I honestly don't care if NML or something else if only it's done smoothly with deprecation notice and without cutting features. I'm not 100% sure but I fear hitting brickwalls at some point where lots of effort was spent. Wouldn't surprise me when some users come along years after the transition (there are users who upgrade in 10-year-cycles).
02:10 PM rigid: Hence I lean a bit towards cleaning things up first to maintain more/better exit strategies.
02:11 PM rigid: I mean, replacing the development documentation alone is a huge load of work. This endeavour should be thoroughly planned and the results thoroughly tested.
02:16 PM rmu: there isn't that much documentation specific to NML
02:18 PM rigid: don't forget the wiki, the forum and of course https://www.nist.gov/el/intelligent-systems-division-73500/rcs-real-time-control-systems-architecture, https://www.nist.gov/ctl/smart-connected-systems-division/networked-control-systems-group/nml-programmers-guide-c-version etc.
02:18 PM rigid: i'd say it's a significant amount of work to document it all, maybe providing examples so that future contributors can find what they find now
02:20 PM rmu: the nist links don't really apply
02:22 PM rmu: that part that is covered there is generic (and outdated), zeromq etc... also come with generic documentation
02:33 PM rigid: the nist link covers the industry standard. where does libnml differ from it besides array messages?
02:34 PM rigid: since we use a re-implementation of rcslib, at least it _should_ apply
02:35 PM rigid: rmu: speaking of NML usage, you might be able to answer this if you might find the time: https://github.com/LinuxCNC/linuxcnc/pull/2531#issuecomment-1787230029
02:36 PM rigid: pere: if you could help me testing that PR, I'd give it a shot
02:39 PM rmu: emctask doesn't receive status messages AFAIK
02:39 PM rmu: it sends them out
02:39 PM pere: rigid: as in #2531?
02:39 PM rmu: that thing is to report status, not to mutate
02:39 PM rigid: yes and it's read-only for the receivers. but it could receive them
02:39 PM rigid: pere: yep
02:40 PM rigid: pere: btw. were you able to sort out the CI issue?
02:40 PM pere: you want to test my code?
02:40 PM rigid: pere: without connecting an actual physical worklight, yes
02:40 PM pere: rigid: have not had time to look at the CI issue, no.
02:40 PM rmu: no that would be a gross violation of all and everything
02:40 PM rigid: rmu: why?
02:40 PM rmu: what do you do if somebody modified something else
02:40 PM rmu: it doesn't make sense
02:41 PM rigid: i mean, another buffer for that (ioStat, ioError, ioCmd) would be preferable, but in theory it should work
02:41 PM pere: rigid: is it not enough to see the relevant pin changing?
02:41 PM rmu: it's like making the tachometer in the car an optional accelerator pedal
02:41 PM rigid: rmu: it makes a lot of sense to have r/w buffers and not having to write 1000 LOC for this
02:41 PM rigid: yes, exactly. NML can do that.
02:41 PM rigid: move the dial, car keeps speed
02:42 PM pere: rigid: what do you need help with?
02:42 PM rigid: rmu: otoh, why would worklight readers set the state of the worklight?
02:42 PM rigid: or vice versa
02:42 PM pere: because it can detect that the bulb is dead?
02:42 PM rigid: pere: how can I test it with simulation? is there a test case I could use?
02:43 PM pere: not that I am aware of. guess an existing one could be easily modified.
02:43 PM rigid: you'd signal a dead bulb over the on-off state bit but NOT have R/W status buffer? :)
02:43 PM pere: the goal of the patch is to provide a standardized pin for work light, to allow guis to automatically handle it, similar to how coolant is handled.
02:44 PM rmu: coolant pin is needed because you control it from g-code
02:44 PM pere: rigid: 'turn on' versus 'is on', I guess.
02:44 PM rmu: coolant pin in motion or wherever it lives
02:44 PM rigid: yeah, and it's basically always read/written by the same components
02:44 PM rmu: worklight pin should be in halui or somewhere like that
02:45 PM rigid: pere: there would be another "health" attribute
02:45 PM rigid: otherwise you'd be ambiguous
02:46 PM rigid: hm... everything should be controllable from g-code
02:46 PM pere: seem to be at least two discussions going on here. I do not have the bandwidth at the moment to participate in more than one. I am happy to help with testing my patch, and leave the rest of the discussion to you people. :)
02:47 PM rigid: ok. gonna finish up my linuxcncrsh pr first anyway. i'll come back to you.
02:47 PM rigid: or not, if rmu convinces me that using bidirectional buffers for that really doesn't make sense
02:47 PM rmu: there is nothing to discuss re. emcStatus, at least not with me ;)
02:48 PM rigid: wouldn't even need to be bidirectional actually. just don't use RPC commands
02:49 PM rigid: well, modern RCS in intelligent systems seems to use NML differently than EMC originally did. and I wonder if we shouldn't take advantage of the atomic-shmem interface we're already using
02:49 PM rigid: instead of preparing & handling message packets manually
02:50 PM rigid: but as said, I might miss some pitfall here
02:51 PM rigid: pere: oh btw, weren't you the one who asked about other usages of the old RCS architecture?
02:51 PM pere: yes. was not aware there were any.
02:52 PM rigid: iirc.. seems it's widely used in standards: vehicle domain (4D/RCS), manufacturing domain (ISAM) and space domain (NASREM)
02:52 PM rmu: rigid: you are just quoting BS buzzwords from 15 year old papers
02:53 PM rigid: "buzzwords"? It's the names of the standards.
02:53 PM rigid: or are you surprised that 15 year old papers are widely used nowadays?
02:53 PM rigid: you don't seem to understand what "industry standard" means :)
02:54 PM rmu: what about 4D/RCS is an industry standard
02:55 PM rmu: look at that
02:55 PM rmu: https://en.wikipedia.org/wiki/4D-RCS_Reference_Model_Architecture
02:55 PM rigid: That one is a reference model implementation, right.
02:56 PM rigid: i'd say it's safe to call it some kind of standard: "The National Institute of Standards and Technology's (NIST) Intelligent Systems Division (ISD) has been developing the RCS reference model architecture for over 30 years. 4D/RCS is the most recent version of RCS developed for the Army Research Lab Experimental Unmanned Ground Vehicle program..."
02:57 PM rigid: https://en.wikipedia.org/wiki/Real-time_Control_System also is interesting. it's what we use.
03:00 PM rigid: hm... maybe it'd make sense to look into ISAM and how things are done there
03:00 PM pere: rigid: thanks. how is rcs related to nml?
03:02 PM rigid: pere: rcs sits on top of nml sits on top of cms. It uses nml message buffers + rpc calls for communication/syncing.
03:02 PM rigid: i can't find the nice diagram rn
03:02 PM pere: so 4D/RCS is using NML?
03:03 PM rigid: https://www.nist.gov/sites/default/files/styles/2800_x_2800_limit/public/images/el/isd/cs/appstrut.gif
03:03 PM rigid: good question
03:03 PM rigid: but since it uses rcslib, I think it's safe to say yes
03:04 PM rmu: rcs in linuxcnc amounts to 1000 LOC
03:04 PM rigid: yeah, but there isn't much more to it. rcslib mostly is code generation, tests, dev/debug tools etc.
03:04 PM rmu: and more or less only contains stuff related to print
03:05 PM rigid: rcs/nml itself is quite lightweight. we're only missing array message support
03:05 PM rigid: ...which would be trivial to add afais. no one ever needed it.
03:06 PM rmu: i don't understand your obsession with RCS
03:07 PM rigid: what obsession? you are repeating the "remove NML" mantra. I'm just dealing with what I got.
03:08 PM rigid: i've written it in the GH discussion about this: I think we need to get rid of the notion that NML is old and outdated. It's totally unfounded. That's all. No obsession.
03:08 PM rigid: just lack of alternatives that fit the niche that RCS fills.
03:09 PM rigid: zeromq+flatbuffers are NOT realtime control protocols, let alone design frameworks. There is A LOT that needs to be added, as I said.
03:11 PM rigid: rmu: "you" meaning not only you alone but the general dev community
03:11 PM rmu: there is no point discussing any further. nml/rcs as it is used in linuxcnc doesn't have any realtime properties and there are none needed really
03:12 PM rmu: i'm not the "general dev community", i made some bugfixes and triaged some issues that is all
03:13 PM rigid: the stateless design decision to "loop through update messages, always reflecting current state" actually is a major realtime prerequisite. it's important to understand that when starting with something like zmq
03:13 PM rmu: part of what keeps me from doing more is the complete mess around nml, hal, mot, interpreter, canon and tcl/tk
03:14 PM rigid: rmu: i just meant, i didn't mean to sound like only you are saying "nml esse delendam"
03:14 PM rigid: part of what keeps me from doing more is the total lack of quality. even the code formatting gives me eye cancer.
03:15 PM rmu: clang-format and black. but it will make a big mess
03:15 PM rigid: i'd not accept that from a customer tbh without noting it. says a lot about standards.
03:16 PM rigid: yeah "a big mess" nails it.
03:16 PM rigid: but the foundations are solid
03:17 PM rigid: i mean, it's foss. I don't expect every contributor to deliver perfect results. But it just can't be all "move fast and break things" without ever fixing stuff
03:18 PM rmu: one thing i would be interested in hacking on is the trajectory stuff, but it is running in the realtime part, and moving the compute intensive stuff into userspace isn't possible because of the lack of a proper IPC mechanism
03:18 PM rmu: the mess i meant is you break rebasing from old branches etc... if you clang-format all code
03:18 PM rigid: isn't realtime needed for trajectory planning by definition?
03:19 PM rigid: i got you
03:19 PM rmu: no
03:19 PM rigid: hm, interesting
03:20 PM rmu: you shouldn't run out of path in the realtime part, but apart from that it's not that critical
03:20 PM rigid: so it doesn't really matter if your trajectory planning locks up for a long time?
03:20 PM rigid: i think that's a vital part
03:20 PM rmu: it can happen anyway because the interpreter could "lock up"
03:20 PM rmu: no new gcode lines -> planner runs out of things to do
03:21 PM rmu: and stops feed
03:21 PM rigid: it basically means "trajectory planning needs to guarantee to return within a certain time slice" which means realtime computing
03:21 PM Tom_L: similar to dnc on a regular cnc
03:21 PM rigid: that's basically true for every component except low-level step-signal generation or sensor level reads
03:21 PM Tom_L: where you're feeding a large program via serial etc
03:21 PM rmu: no, the machine is safe in the sense that it has to be possible to stop within planned segments
03:21 PM rigid: ...which most of you guys do with hardware not running any linuxcnc code
03:22 PM rmu: the TP now obeys that, but optimization of path (blending etc..) doesn't have to happen in the 1ms "hard" realtime part
03:23 PM rmu: where you are limited by your deadline how much effort you can put into it etc...
03:24 PM rigid: well, I think realtime is also needed to e.g. keep GUIs in sync but I guess it depends on the definition.
03:25 PM rmu: in this context the realtime part is the part running in kernel resp. SCHED_FIFO
03:25 PM rigid: ...but if you just want to have realtime to keep the machine in a safe state, it makes even less sense to run the GUI inside the realtime environment
03:25 PM rmu: all the rest is non-realtime
03:25 PM rigid: kernel? you don't use uspace?
03:25 PM rmu: you have things completely confused
03:28 PM rigid: i just think it's nice to run task on a different host than motion in a real networked realtime environment. just ensuring "to stop within planned segments" is a low bar for a realtime system.
03:28 PM rmu: that doesn't make sense
03:29 PM rmu: you would need special hardware for that
03:29 PM rigid: certainly not in every case but in some cases, yes. I've seen ethernet over PCI used for that.
03:30 PM rmu: that works in limited cases like completely separate network and talking to an FPGA
03:30 PM rigid: i guess 99% of linuxcnc users are fine with simple UDP via GbE
03:31 PM rigid: nah, you can have quite a large network
03:31 PM rmu: 99% linuxcnc users don't need to split the hairy parts of the machine controller between 2 hosts
03:31 PM rigid: pretty much depends on your cycle time requirements.
03:31 PM rmu: why would you do that
03:31 PM rigid: https://en.wikipedia.org/wiki/Fourth_Industrial_Revolution
03:33 PM rmu: task/motion is the wrong boundary
03:34 PM rigid: 99% of the users are fine with a free chinese ripoff from DeskProto they get for free with their hardware. linuxcnc shines with flexibility and hackability for edge cases in education/science/industry.
03:34 PM rigid: i think it's one of many boundaries it'd be cool to break. I could think of use cases.
03:35 PM rigid: "break"... it's already there. just needs to be implemented.
03:38 PM rigid: maybe I can come up with a draft. currently focusing on some small things first.
03:47 PM rmu: https://github.com/LinuxCNC/linuxcnc/blob/master/src/emc/motion/command.c and https://github.com/LinuxCNC/linuxcnc/blob/master/src/emc/motion/usrmotintf.cc
03:47 PM rmu: that's the interface between realtime and non-realtime parts
03:47 PM rmu: no NML there
03:47 PM rmu: no RCS there
03:48 PM rmu: no network transparency
03:48 PM rigid: yet...
03:48 PM rmu: the realtime part will have to stay C
03:48 PM rigid: maybe even multiple HAL layers on different hosts would be nice.
03:48 PM rmu: and compile in kernel mode
03:49 PM rmu: i think you are just trolling
03:50 PM rigid: nice ad hominem. look at my pull requests. you are the one fantasizing about imaginary web GUIs.
03:55 PM rmu: i'm not fantasizing, that stuff is happening, just not with linuxcnc
04:08 PM rigid: a lot of questionable stuff is happening. if someone needs a webUI for linuxcnc, they could implement it right away no problem. the backend server just needs to run on the same frickin' host as the stepgen.
04:09 PM rigid: which I _think_ already should be possible right now since GTK3 provides a gdk HTML5 backend.
04:10 PM rigid: but why would I ever include the gazillion lines of code from a browser into my cnc toolchain...
04:11 PM rigid: you sound like joe from marketing ;)
04:16 PM rigid: "now let me demo how I control this 300 ton gantry crane with linuxcnc from my nintendo switch webbrowser!"
04:22 PM rmu: https://www.theverge.com/2017/9/19/16333376/us-navy-military-xbox-360-controller
04:35 PM rigid: I found the reason error messages can't go to multiple receivers: "If queuing is enabled on this buffer this read will remove the message from the queue so that other processes that are reading from this buffer will see the next message on the queue and potentially miss this one."
04:35 PM rigid: and
04:35 PM rigid: B emcError SHMEM localhost 8192 0 0 3 16 1003 TCP=5005 xdr queue <---
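A minimal Python sketch of the queued-read behaviour quoted above. This is not libnml code: `QueuedBuffer` and the message text are hypothetical stand-ins, assuming only what the quoted documentation says, namely that a read on a queued buffer removes the message so a second reader never sees it.

```python
from collections import deque


class QueuedBuffer:
    """Hypothetical stand-in for a queued buffer: reading removes the message."""

    def __init__(self):
        self._queue = deque()

    def write(self, msg):
        # Writers append to the tail of the queue.
        self._queue.append(msg)

    def read(self):
        # Readers pop from the head; None means nothing is pending.
        return self._queue.popleft() if self._queue else None


buf = QueuedBuffer()
buf.write("joint 0 following error")

first_reader = buf.read()   # the reader that wins the race gets the message
second_reader = buf.read()  # every other reader finds the queue empty
```

With a single shared queue, whichever process reads first consumes the error, which matches the "only one receiver sees it" symptom discussed here.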
04:56 PM rmu: so remove it then and try it
04:56 PM pere: rigid: seb_kuzminsky said something similar a while back. I guess each receiver will need a separate queue, and some sort of publisher/subscriber pattern is needed to spread the messages?
05:03 PM rmu: reminds me of greenspun's tenth rule
05:06 PM rigid: pere: separate queue would work but I think the original intention was to have a stateless design: everyone polls and if you don't get the error message, you miss it. if you get it, you show it.
05:07 PM rigid: i wanna see an implementation of linuxcnc in lisp. especially the HAL driver part...
05:07 PM rigid: lisp coders & engineering. an iconic duo.
05:09 PM rmu: adding all the stuff from nanomsg/zeromq/whathaveyou to cms that is missing doesn't make much sense IMHO
05:22 PM pere: rigid: "everyone polls" seem like a rather unstable arrangement.
05:23 PM rigid: pere: it's the principle design of a shm IPC?
05:23 PM rigid: i don't see a problem. it's stateless and realtimy.
05:24 PM pere: and a race condition just waiting to blow up. :)
05:24 PM rigid: concurrency is handled
05:25 PM rigid: or what kind of race conditions do you mean?
05:25 PM rmu: does anybody know why all the stuff that goes into interplist are nml messages
05:25 PM rmu: that doesn't leave the process
05:26 PM rmu: and the way from interpreter into that queue is a rather convoluted one for no real reason
05:26 PM rigid: pere: in this case it's just "that's the current state. here's the last error message. if you disconnected before getting the previous one, you never get it."
05:26 PM rigid: rmu: what does the git log say?
05:28 PM rigid: so it turns out, all tests pass with "queue" removed from the emcError buffer. there you go. multiple receivers.
05:29 PM pere: rigid: the kind where a subsystem does something, fails to get the error message and ends up waiting forever for feedback on the failed request.
05:29 PM rigid: per design, I wouldn't queue error messages in NML. Rather do it in the GUI or using a logging mechanism.
05:29 PM rigid: Pleeease let me clean up the current error/debug/logging mess so I can just use syslog...
05:30 PM rigid: pere: errors only go in one direction. subsystem -> user. or don't they?
05:30 PM rmu: "Base set of files committed so that work can progress on emc2"
05:30 PM pere: rigid: I do not really know the inner working of linuxcnc, so no idea. :)
05:30 PM rmu: that is what git log says
05:30 PM rmu: jan25 2004
05:30 PM rigid: pere: if your subsystem gets disrupted you got a problem. then another subsystem in a higher layer should send an errormessage upstream.
05:31 PM rigid: rmu: great. nothing better than meaningless commit messages :)
05:32 PM rigid: rcs_print_* functions already support syslog iirc
05:32 PM rmu: that was a svn message back then that got converted
05:32 PM rmu: initial import of the stuff into version control
05:50 PM rmu: without queue you don't "deliver" error messages and i think they can get lost if they are generated quickly enough
05:52 PM rigid: a test case for that would be good
05:54 PM rmu: it also applies to debug and text messages in g-code i think
05:56 PM rmu: yes that is losing consecutive (dbg, ) and (msg, ) messages
05:56 PM rmu: race condition
06:16 PM rigid: the race condition is what you got now. the winner of the race gets the message.
06:16 PM rigid: why would you like to queue text/debug messages at protocol level?
06:16 PM rigid: when would you want to receive all the messages you previously missed although they are outdated?
06:17 PM rigid: ...say, after a physical disconnect/reconnect.
06:21 PM rigid: I wouldn't argue against a bit more flexibility tho... like, why are we hardcoding the "xemc" string all over the place?
06:26 PM rmu: rigid: without queue the message sending doesn't wait for anybody to read the last message
06:28 PM rmu: so say i have something like "(msg, measured tool length for tool #<_toolno>)" and "(msg, tool length #<_some value>)" it will eat the first message
06:28 PM rmu: possibly eat it
06:28 PM rmu: try it
06:30 PM rmu: what you want is a queue where you stuff in your messages on one end and forget about them
06:30 PM rmu: and on the other end you can subscribe to it and every subscriber gets its own copy
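A minimal Python sketch of the fan-out queue rmu describes: the sender pushes and forgets, and every subscriber gets its own copy. The `FanOut` class and its method names are hypothetical, and this is an in-process illustration only, not an NML, zeromq or nanomsg API. The subscriber names are the process names mentioned in this discussion.

```python
from collections import deque


class FanOut:
    """Hypothetical publisher that keeps one private queue per subscriber."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        # Each subscriber gets its own queue, so reads never steal from others.
        queue = deque()
        self._subscribers[name] = queue
        return queue

    def publish(self, msg):
        # Fire and forget: copy the message into every subscriber's queue.
        for queue in self._subscribers.values():
            queue.append(msg)


errors = FanOut()
axis_queue = errors.subscribe("xemc")
cockpit_queue = errors.subscribe("cockpit")

errors.publish("measured tool length for tool 3")
errors.publish("tool length 42.5")

axis_seen = list(axis_queue)
cockpit_seen = list(cockpit_queue)
```

Both subscribers end up with both messages in order, regardless of how slowly either one drains its queue.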
06:38 PM rigid: maybe those two messages should be a single message. consider an operator display that only shows the last message.
06:38 PM rigid: if you like to queue to multiple receivers on protocol level, you need to use configurable process names.
06:38 PM rmu: so you tell the user "don't send messages too fast because we can't guarantee delivery then"
06:38 PM rigid: won't work if every receiver is called "xemc"
06:40 PM rigid: I tell the user "sorry we didn't implement stuff correctly". or maybe "you can't use a slow remote process if you're generating messages very quickly"
06:42 PM rigid: oh, do we have "write_if_read"?
06:44 PM rmu: those messages are generated faster than any gui can react
06:44 PM rmu: no preemption between
06:44 PM rmu: it's like strcpy() followed by strcpy()
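A minimal Python sketch of the "strcpy() followed by strcpy()" effect, assuming an unqueued buffer that simply holds the latest message. `LatestOnlyBuffer` and the message strings are hypothetical and do not model the actual libnml implementation.

```python
class LatestOnlyBuffer:
    """Hypothetical unqueued buffer: each write overwrites the previous one."""

    def __init__(self):
        self._msg = None
        self._writes = 0  # counts writes so a reader could notice it missed some

    def write(self, msg):
        # No waiting for a reader: the old content is simply replaced.
        self._msg = msg
        self._writes += 1

    def peek(self):
        # Non-destructive read of the current content and the write count.
        return self._writes, self._msg


buf = LatestOnlyBuffer()
buf.write("measured tool length for tool 3")
buf.write("tool length 42.5")  # arrives before any GUI had a chance to poll

writes, seen = buf.peek()
```

A reader that polls after both writes sees only the second message; the write counter shows that two writes happened, but the first text is gone.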
06:45 PM rmu: queue and different process names doesn't work
06:45 PM rigid: interesting: "This may not work as expected with remote processes unless "CONFIRM_WRITE" is added to the corresponding BufferLine in the NML file"
06:46 PM rmu: process isn't remote
06:46 PM rigid: it works for me. why wouldn't different process names and queued buffer work?
06:47 PM rigid: yeah, local processes are not a problem.
06:47 PM rmu: error messages arrive only on one listener
06:48 PM rigid: not if they are sent to all configured emcError buffers
06:48 PM rmu: there is only one buffer
06:48 PM rigid: nah, check NML::new debug message output
06:49 PM rigid: every process gets a copy of every buffer
06:49 PM rigid: anyway, i guess fan out is doable with NML
06:54 PM rmu: whatever you see in your debug messages it doesn't work
06:56 PM rigid: right. you said the same thing about remote GUI processes.
06:58 PM rigid: i wonder how emcStatus works unqueued. aren't there a lot of missed status changes in everyday practice?
07:01 PM rmu: you mean like in your laser rays?
07:01 PM rmu: with "it doesn't work" i mean i tried it
07:01 PM rmu: just now
07:02 PM rmu: and your definition of "works" is debatable
07:03 PM rigid: same as your definition of "doesn't work" probably
07:03 PM rigid: at least remote UI works for me (currently). i don't need HAL stuff and can work around file upload. so, works.
07:04 PM rmu: two processes, axis and a second one, connecting as xemc (probably) and second name, error messages appear only in one
07:05 PM rigid: i doubt "second name" is respected while "xemc" still is hardcoded everywhere.
07:06 PM rmu: im using "cockpit" in the second process
07:07 PM rigid: if you did everything right, I wonder why "emcsvr" and "xemc" processes both receive error messages but your "cockpit" process doesn't
07:08 PM rmu: i don't know
07:08 PM rigid: even with queue.. hm
07:08 PM rigid: maybe that line in the docs about missed messages only applies to a single NML buffer instance
07:09 PM rmu: linuxcncsvr doesn't do anything except own the channels
07:09 PM rigid: there is one linuxcncsvr process for every remote process' buffer
07:10 PM rigid: it should be possible to have emc own the channels and not run emcsvr at all if you don't need remote processes.
07:11 PM rmu: i think there is a reason for that, probably some race conditions
07:12 PM rigid: "NML servers must be run for each buffer that will be accessed by remote processes. They read and write to the buffer in the same way as local processes on the behalf of remote processes."
07:12 PM rigid: LOCAL shm doesn't need it.
07:12 PM rigid: ...by design. not sure about implementation quirks.
07:18 PM rigid: oh, btw. I'd say test cases for things like lost error messages or working process name handling would be vital when introducing something like zeromq+flatbuffers. somewhat of a first step.
07:23 PM rmu: it's already working
07:25 PM rigid: you mean "f**k regression YOLO"?
07:26 PM rmu: welcome back