#linuxcnc-devel Logs
Feb 05 2022
09:31 AM smoe: andypugh: Was offline and missed bits of this thread - there does not seem to be much of a problem finding candidates for DIV0 crashes. I admit I have no idea how dramatic that is - is there an emergency stop of all motors? Just for fun I thought I should find such an instance myself in the source code and the first bit that came up was a
09:37 AM smoe: division by cos(angle) without any apparent-to-me checks in "d->r_ref*cos(M_PI/d->fmultiplier)/cos((divisor/2 - fmod(uangle,divisor))*TO_RAD)" of src/hal/components/eoffset_per_angle.comp +317 .
09:37 AM smoe: I would not know how to fix this (error return codes?) but if any such pointer to potential DIV0s would be of help then I would be prepared to look into this (and learn).
09:40 AM andypugh: The ones I was looking at were assuming that the angle of an arc can never be zero (which on the face of it seems a sensible assumption). In most other places in the trajectory planner there is an explicit check, for example: https://github.com/LinuxCNC/linuxcnc/blob/master/src/libnml/posemath/_posemath.c#L343 checks the value and returns sensible defaults if division by zero is imminent.
09:41 AM andypugh: The DIV0 that I found was, eventually, handled by other code that gave a sensible answer when presented with the NaN (though whether that was by design is not clear):
09:43 AM andypugh: Because any comparison with a NaN fails, this check here printed an (incorrect) error message (if debugging was enabled) and returned an answer of zero (which was fine)
09:43 AM andypugh: https://github.com/LinuxCNC/linuxcnc/blob/master/src/libnml/posemath/_posemath.c#L98
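The NaN behaviour mentioned above can be reproduced in isolation. The sketch below is not the code at the linked line; `within_limit`, `exceeds_limit` and `value_or_zero` are hypothetical helpers that only show why a range check falls through to its "out of range" branch when handed a NaN.

```c
#include <math.h>
#include <stdbool.h>

/* True when |x| <= limit. Every ordered comparison with NaN is false,
   so a NaN input yields false here. */
static bool within_limit(double x, double limit)
{
    return fabs(x) <= limit;
}

/* True when |x| > limit. A NaN input yields false here as well, so NaN
   is neither "within" nor "exceeding" the limit. */
static bool exceeds_limit(double x, double limit)
{
    return fabs(x) > limit;
}

/* Returns x when it passes the range check, otherwise 0.0. A NaN fails
   the check and therefore comes back as 0.0. */
static double value_or_zero(double x, double limit)
{
    if (within_limit(x, limit)) {
        return x;
    }
    return 0.0;
}
```

If the intent is to treat NaN as its own case rather than rely on this fall-through, `isnan()` from `<math.h>` can be tested first.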
12:36 PM smoe: I am a bit rusty, but this code from posemath looks a bit like how I would handle this in bash:
12:36 PM smoe: int pmRotZyxConvert(PmRotationVector const * const r, PmEulerZyx * const zyx)
12:36 PM smoe: {
12:36 PM smoe: PmRotationMatrix m;
12:36 PM smoe: int r1, r2;
12:36 PM smoe: r1 = pmRotMatConvert(r, &m);
12:36 PM smoe: r2 = pmMatZyxConvert(&m, zyx);
12:36 PM smoe: return pmErrno = r1 || r2 ? PM_NORM_ERR : 0;
12:36 PM smoe: }
12:36 PM smoe: The || is an operator that operates and returns a boolean, but according to https://en.cppreference.com/w/c/language/operator_precedence it is evaluated after the ternary ?: which (I apologize upfront if my rust is already crumbling, I once used to be good at this) I would read as
12:36 PM smoe: return pmErrno = r1 || (r2 ? PM_NORM_ERR : 0);
12:36 PM smoe: when what I presume to be intended is
12:36 PM smoe: return pmErrno = (r1 || r2) ? PM_NORM_ERR : 0;
12:36 PM smoe: If I could have someone quickly peer-reviewing me over this then I would go for a respective PR as there are quite a few functions affected.
02:41 PM -!- #linuxcnc-devel mode set to +v by ChanServ
03:15 PM -!- #linuxcnc-devel mode set to +v by ChanServ
03:51 PM andypugh: I think you are probably right, except that it might work anyway?
03:51 PM andypugh: http://codepad.org/oW3jsN6j
03:53 PM andypugh: I switched the logic, and it still seems to be an “or” function.
03:53 PM andypugh: http://codepad.org/V8AZ37pw
03:54 PM andypugh: If I change the brackets, it seems to go wrong.
03:54 PM andypugh: http://codepad.org/7Izc7Ont
03:55 PM andypugh: Switch the outputs, and it looks OK again.
03:55 PM andypugh: http://codepad.org/8IGl3yoR
03:55 PM andypugh: smoe: ^^^^
03:59 PM andypugh: Hang on, that page says that || is evaluated _before_ ?: Which matches what we are seeing.
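The precedence question can be checked with a few small functions. In C, `||` binds more tightly than `?:`, so the unparenthesised form parses the same way as the `(r1 || r2) ? ... : 0` form. The sketch below uses `EXAMPLE_NORM_ERR` as a stand-in with an arbitrary nonzero value; it is not the real `PM_NORM_ERR` definition, and the function names are made up for illustration.

```c
/* Stand-in error code with an arbitrary nonzero value; an assumption,
   not the definition of PM_NORM_ERR from posemath.h. */
#define EXAMPLE_NORM_ERR (-3)

/* The expression as it appears in the quoted function. */
static int as_written(int r1, int r2)
{
    return r1 || r2 ? EXAMPLE_NORM_ERR : 0;
}

/* Explicit grouping that matches how C parses the line above. */
static int or_first(int r1, int r2)
{
    return (r1 || r2) ? EXAMPLE_NORM_ERR : 0;
}

/* The other grouping: the result of || is always 0 or 1, so the
   error code itself is lost even though the truth value is the same. */
static int ternary_first(int r1, int r2)
{
    return r1 || (r2 ? EXAMPLE_NORM_ERR : 0);
}
```

With these definitions, `as_written` and `or_first` agree for every input, while `ternary_first` returns 1 instead of the error code whenever either argument is nonzero.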
04:31 PM linuxcnc-build: build #2147 of 1640.rip-buster-rtpreempt-amd64 is complete: Failure [failed compile runtests] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/1640.rip-buster-rtpreempt-amd64/builds/2147 blamelist: andypugh <andy@bodgesoc.org>
04:33 PM andypugh: Runtest: 237 tests run, 236 successful, 1 failed + 0 expected
04:33 PM andypugh: Failed: /home/buildslave/emc2-buildbot/buster-rtpreempt-amd64/rip-buster-rtpreempt-amd64/build/tests/hm2-idrom
04:33 PM andypugh: Pretty sure that’s not me?
04:33 PM linuxcnc-build: build #8546 of 0000.checkin is complete: Failure [failed] Build details are at http://buildbot.linuxcnc.org/buildbot/builders/0000.checkin/builds/8546 blamelist: andypugh <andy@bodgesoc.org>
04:33 PM Lcvette[m]: andypugh: 😲
04:34 PM Lcvette[m]: you broke?
04:34 PM Lcvette[m]: /o\
05:07 PM seb_kuzminsky: andypugh: i think it's not your fault, i've seen that surprising failure occasionally
05:07 PM seb_kuzminsky: i just added some more debug logging to that test, next time it happens we'll hopefully get a clue