#linuxcnc-devel Logs

Jul 01 2018

04:37 AM rmu: it should be possible to attach from gdb to the running rtapi_app, to at least get a better backtrace
08:21 AM jepler: https://sourceforge.net/p/emc/mailman/message/32635937/ shows how to use valgrind and gdb on rtapi_app
08:21 AM jepler: or at least shows that I claim I made it possible in a non-obvious way
01:26 PM JT-Shop: what is a github pull request?
05:27 PM andypugh: I am stumped
05:28 PM andypugh: https://github.com/LinuxCNC/linuxcnc/blob/andypugh/7i65/src/hal/drivers/mesa_7i65.comp#L226
05:29 PM andypugh: If I comment-out lines 229 to 236 then the initialisation phase does not core-dump.
05:30 PM andypugh: But I don’t see anything particularly different with the AD7329 lines compared to the ADS754 or CPLD lines.
05:32 PM andypugh: (Originally the 8 AD7329 lines were done as a loop through an array, but I changed that in case arrays of pointers to pointers were difficult in some way.)
05:32 PM andypugh: So now there isn’t much (if any) difference between the handling of the AD7329 and the ADS754 or CPLD.
05:40 PM pcw_home: I have no idea what is going on here, but the order of r += hm2_allocate_bspi_tram(name); looks odd
05:45 PM jepler: pcw_home: those functions return a negative number in the case of failure and zero(?) in the case of success, so it's a way to check for failure anywhere along the way without having an 'if' every other line...
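A minimal sketch of the accumulate-then-check idiom jepler describes above, assuming every step returns 0 on success and a negative value on failure. The names step_ok, step_fail and run_setup are hypothetical stand-ins, not hostmot2 functions.

```c
#include <stdio.h>

/* Hypothetical setup steps: each returns 0 on success, -1 on failure. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }

/* Runs three steps and accumulates their return codes.
 * Because every step returns either 0 or a negative value, the sum is
 * negative if and only if at least one step failed.
 * Note that later steps still run after an earlier failure; nothing
 * short-circuits. */
int run_setup(int fail_second)
{
    int r = 0;
    r += step_ok();
    r += fail_second ? step_fail() : step_ok();
    r += step_ok();
    if (r < 0) {
        fprintf(stderr, "setup failed (%d step(s) reported an error)\n", -r);
        return -1;
    }
    return 0;
}
```

The trade-off is that the sum says only that something failed, not which call failed or with which errno, and it relies on no step ever returning a positive value.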
05:45 PM jepler: andypugh: I notice this asymmetry but I don't know if it's relevant
05:45 PM jepler: r = hm2_register_tram_write_region(hm2,hm2->bspi.instance[i].addr[chan], sizeof(rtapi_u32),wbuff);
05:45 PM jepler: r = hm2_register_tram_read_region(hm2,hm2->bspi.instance[i].addr[0], sizeof(rtapi_u32),rbuff);
05:45 PM jepler: in hm2_tram_add_bspi_frame
05:45 PM jepler: the second one doesn't refer to 'chan' at all
05:47 PM jepler: (normally we encourage internal functions to return negative errno values in the case of error, but this function is either 0 or -1)
05:47 PM andypugh: Let me investigate.
05:48 PM andypugh: That sounds like it could be the problem.
05:48 PM andypugh: (Once again I am in awe of jepler’s ability to look at code and see stuff.)
05:48 PM jepler: JT-Shop: https://help.github.com/articles/about-pull-requests/ should get you the basics
05:49 PM jepler: "When pushing commits to a pull request, don't force push. Force pushing can corrupt your pull request." hm I have followed this advice zero times ever...!
05:50 PM jepler: afk
05:50 PM jepler: andypugh: good luck with your crash
05:55 PM andypugh: Well, using [chan] instead of [0] doesn’t fix the crash. So now I am wondering if past-me knew something that current me doesn’t, and perhaps the 0 is correct.
06:16 PM pcw_home: I don't think the read address important (the write address determines the SPI channel descriptor used but all read data is from the same FIFO)
06:17 PM pcw_home: s/address important/address is important/
06:56 PM andypugh: That would explain why I had a zero rather than chan, I guess.
06:57 PM andypugh: It looked too odd to be an accident, really.
07:14 PM andypugh: OK, time to retire puzzled for another night
08:53 PM cradek: chan->addr[j] = chan->base_address + j * sizeof(rtapi_u32);
08:56 PM cradek: I got distracted looking at all these sizeofs and I can't tell if they're right. this pointer math looks suspicious to me.
08:56 PM cradek: er wait that's not a pointer type at all, it's a rtapi_u16
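A short sketch of the distinction cradek lands on, assuming the address fields are 16-bit integer byte offsets and the registers are 32 bits wide. struct chan_sketch and both functions are hypothetical, and uint16_t and uint32_t stand in for rtapi_u16 and rtapi_u32.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical miniature of a channel descriptor: the addresses are
 * plain 16-bit integers (byte offsets), not pointers. */
struct chan_sketch {
    uint16_t base_address;
    uint16_t addr[4];
};

/* Integer arithmetic: the index must be scaled by the register size
 * explicitly, so multiplying by sizeof is correct here. */
void fill_addrs(struct chan_sketch *c)
{
    for (size_t j = 0; j < 4; j++)
        c->addr[j] = (uint16_t)(c->base_address + j * sizeof(uint32_t));
}

/* Pointer arithmetic: p + j already advances by j * sizeof(*p) bytes,
 * so an extra multiplication by sizeof would overshoot. */
size_t byte_offset_of_element(const uint32_t *p, size_t j)
{
    return (size_t)((const char *)(p + j) - (const char *)p);
}
```

With an integer base address the explicit j * sizeof(...) scaling is needed; the same expression applied to a uint32_t pointer would step four times too far.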
10:02 PM jepler: hostmot2.h:774: int conf_flag[16];
10:02 PM jepler: if (hm2->bspi.instance[i].conf_flag[chan] != true){
10:02 PM jepler: it's almost certainly unrelated, but ... never compare == true, particularly when the other type is not actually a boolean
10:02 PM jepler: as written, every value but '1' stored in the 'int' conf_flag[chan] will be treated like false, and only the value '1' will cause the body of the if to be entered
10:02 PM jepler: but that's just pedantry, probably
10:03 PM Tom_L: == is 'exactly equal' isn't it?
10:05 PM jepler: if (hm2->bspi.instance[i].conf_flag[chan] != true){
10:05 PM jepler: whoops https://emergent.unpythonic.net/files/sandbox/equaltrue.c
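A small sketch of how the quoted != true test behaves on an int flag, next to a plain truthiness test. This is not the content of the linked equaltrue.c; both function names are hypothetical.

```c
#include <stdbool.h>

/* Mirrors the quoted pattern: an int flag compared against true.
 * true has the value 1, so this really tests flag != 1, and the body
 * would be entered for 0, 2, -1 and every other value except 1. */
int body_entered_compare(int flag)
{
    return flag != true;
}

/* Plain truthiness test: the body would be entered only when the flag
 * is 0, and any nonzero value counts as "set". */
int body_entered_truthy(int flag)
{
    return !flag;
}
```

The two agree for 0 and 1 and differ for every other value: a flag of 2 enters the body in the first form but not in the second.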
10:08 PM jepler: reading andy's pastebin https://pastebin.ubuntu.com/p/bJtPsmVVNF/ a "corrupted unsorted chunks" error is something that happens subsequent to the real memory misuse problem, so the fact that the earlier 'HM2_PRINT_NO_LL("Here 42\n");' statements executed doesn't mean that none of them are from after when the problem is created that eventually causes free() to fail.
10:09 PM jepler: getting it to run under valgrind remains my best suggestion, it stands a good chance of localizing to the specific line where the problem occurs. That git commit message outlines how I believe it should work
10:09 PM jepler: https://sourceforge.net/p/emc/mailman/message/32635937/
10:09 PM jepler: you would sudo env RTAPI_UID=`id -u` valgrind bin/rtapi_app and then run your lines to crash in halcmd
10:09 PM jepler: you can gdb rtapi_app similarly