#linuxcnc-devel Logs
May 09 2017
07:16 AM jepler: sigh I am becoming a drive-by bug reporter in npm modules I don't even use
07:17 AM jepler: not making friends on github this way, I'm sure
07:20 AM jthornton: I had to power cycle the forum this morning
07:22 AM jepler: thanks jt
07:24 AM jthornton: you're welcome
09:46 AM skunkworks: Just printed a file from the web interface. seems to work. I could have the printer in the garage and print to it ;)
09:48 AM skunkworks: For being in the tech field - gene doesn't seem very lucky with tech
09:57 AM jepler: skunkworks: did the upload via web interface take long?
09:58 AM skunkworks: it was slow. 6.8mb probably took a minute or more
09:58 AM jepler: hm
09:58 AM jepler: I'd almost rather swap the micro sd card around then
09:59 AM skunkworks: heh
09:59 AM skunkworks: I don't know if it is something I am doing though. It also could be that I am pretty far away from the router. (multiple sources say make sure it is very close to the router)
10:00 AM jepler: my experience uploading via USB was that it was slooowwww
10:01 AM jepler: I don't know what my file sizes were, but I'd guess it was much slower than 6MB/minute
10:01 AM jepler: it may have been going at an emulated 1 megabit/second serial rate, not sure
10:01 AM skunkworks: what is the hardware? Is it an arduino?
10:01 AM skunkworks: (I really don't know what is inside the thing)
10:02 AM jepler: It has an STM32 (does most stuff) and an ESP8266 (does wifi and lcd screen)
10:02 AM skunkworks: I have not tried usb.
10:03 AM jepler: STM32 is an embedded ARM, far too small to run Linux
10:03 AM jepler: but at least it is 32 bits and I think has floating point math hardware
10:03 AM skunkworks: neat
10:04 AM skunkworks: The AppImage just ran when I made it executable. Is that how it is supposed to work? Or is there some sort of install mechanism?
10:04 AM jepler: yes that's how appimage works apparently
10:04 AM skunkworks: huh
10:06 AM jepler: re: forum I've upgraded the forum to their new "monitoring" and set up an alert that will e-mail me when the CPU usage is high. Each time I've looked, the CPU usage is reported at 100% while the forum is unresponsive, so hopefully I'll hear about it sooner now.
10:06 AM jepler: I still don't see anything abnormal in the logs when this happens
10:07 AM jepler: apache - dmesg - mysql logs all look normal right up until the system goes away entirely
10:08 AM jepler: I could make the forum reboot daily or weekly but I have no idea whether that will fix anything
10:09 AM jepler: rebooting causes <2 minutes downtime I think, so I wouldn't feel bad doing it at 2AM US Central time or something
10:09 AM jepler: thoughts?
10:10 AM JT-Shop: it seems to lock up randomly is there an error log?
10:11 AM jepler: there are no useful errors logged anywhere I've thought to look
10:11 AM JT-Shop: I figured you did
10:16 AM jepler: OK, I've configured the forum to reboot daily at 3AM eastern time (forum's local timezone is apparently US eastern, *shrug*), which is 0700GMT in the summer and 0800GMT in the winter I think
10:17 AM jepler: before this, the last manual reboots via the control panel were 9 and 19 days ago, then 2 and 3 months ago
10:18 AM jepler: reboots from the unix side don't show up in the list
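[Editor's sketch] The local-to-GMT arithmetic for a 3AM US Eastern reboot can be checked with a few lines of Python. This assumes fixed offsets of UTC-4 for Eastern daylight time and UTC-5 for Eastern standard time; the function name and the sample dates are illustrative, not anything taken from the forum's configuration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets for US Eastern time (no tz database lookup).
EDT = timezone(timedelta(hours=-4), "EDT")  # summer
EST = timezone(timedelta(hours=-5), "EST")  # winter

def reboot_time_utc(year, month, day, tz, hour=3):
    """Return the UTC datetime of a reboot scheduled at `hour` local time."""
    local = datetime(year, month, day, hour, 0, tzinfo=tz)
    return local.astimezone(timezone.utc)

summer = reboot_time_utc(2017, 7, 1, EDT)
winter = reboot_time_utc(2017, 12, 1, EST)
print(summer.strftime("%H%M"), winter.strftime("%H%M"))  # 0700 0800
```

Because US Eastern is behind GMT, the winter reboot lands one hour later in GMT, not earlier.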
10:47 AM seb_kuzminsky: that's so sad
11:09 AM jepler: not sure which part is "so" sad, but it's sure irritating if the forum is going to go down in flames on the weekly for no obvious reason
11:11 AM archivist: rebooting is sad
11:12 AM seb_kuzminsky: yeah periodic therapeutic reboots are sad. Less sad than app unavailability though, so i'm glad you set it up
11:14 AM cradek: do you know anything about the failure?
11:14 AM seb_kuzminsky: he listed what he knows up above
11:14 AM seb_kuzminsky: pegged cpus but nothing in the logs
11:14 AM cradek: oh rtfs
11:14 AM seb_kuzminsky: scrollback?
11:14 AM archivist: missed a log?
11:15 AM jepler: unfortunately when I switched that droplet to the new monitoring just earlier today, the old monitoring data is gone
11:15 AM jepler: (sigh)
11:16 AM cradek: maybe if you learn about it right away you can get a backtrace or strace or something
11:16 AM jepler: in the past I've learned about it hours after the fact; at that point the system is not ssh'able and nothing will appear on the virtual console that is available in the digitalocean control panel
11:17 AM seb_kuzminsky: my uninformed first guess for those symptoms would be a memory leak and swap thrash
11:18 AM jepler: fwiw the system is 1GB RAM + 6GB swap and soon after reboot there is some swap used but not much by my standards (66MB)
11:19 AM jepler: swap usage is not part of the monitoring data
11:19 AM seb_kuzminsky: do you know how the forum process memory footprint changes? it might be interesting to run ps with the right flags on it, once a minute, writing out to a file
11:20 AM seb_kuzminsky: is i/o load part of the monitoring data? it is in kvm/libvirt, but i dont know what digital ocean uses
11:20 AM jepler: yes
11:20 AM archivist: any useless cron jobs
11:20 AM jepler: I want to say that there is often an increase in disk utilization before it crashes, which would tend to line up with seb_kuzminsky's theory
11:21 AM seb_kuzminsky: are you swapping to a file or a partition?
11:21 AM jepler: a file, because until recently that's all there was
11:21 AM archivist: heavy usage by a spider can cause problems
11:21 AM seb_kuzminsky: i have a DO account, but i don't think i have login credentials to the forum VM
11:21 AM seb_kuzminsky: if you give me an account i can set up the monitoring i was talking about and keep an eye on it
11:22 AM jepler: seb_kuzminsky: you should have the credentials to log in as root with ssh. the key installed is "seb@dub"
11:22 AM seb_kuzminsky: oh, heh
11:22 AM seb_kuzminsky: i think i still have the private part of that key backed up on tape somewhere
11:23 AM jepler: I can put an additional or alternate key, /msg me if you want
11:23 AM seb_kuzminsky: i found it!
11:23 AM seb_kuzminsky: but do i remember my passphrase?
11:23 AM seb_kuzminsky: i dooooo
11:24 AM seb_kuzminsky: hmm, but forum.linuxcnc.org rejected it
11:25 AM jepler: May 9 11:54:15 forum sshd[10083]: input_userauth_request: invalid user seb [preauth]
11:25 AM jepler: try ssh root@forum.linuxcnc.org
11:25 AM jepler: there are no user accounts
11:26 AM seb_kuzminsky: derp
11:26 AM seb_kuzminsky: thanks
11:27 AM jepler: apache logs from just before loss of service don't look unusual vis. bot activity
11:31 AM archivist: pile of users and run out of mysql connections?
11:32 AM jepler: MaxRequestWorkers may be too high
11:32 AM archivist: insufficient memory configured for mysql to be quick so more connections take up memory
11:33 AM seb_kuzminsky: ok, it's running
11:33 AM seb_kuzminsky: bbl
11:34 AM jepler: if I understand correctly, seb's monitoring will tell us what eats all the memory, so we can wait for that info before changing anything
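[Editor's sketch] The log does not show the script seb actually installed. A minimal logger in the spirit of "run ps once a minute, writing out to a file" might look like the following. The log path, the one-minute interval, the top-5 cutoff, and all function names are assumptions made for illustration; `ps -eo pid,rss,comm` is the standard procps invocation for PID, resident set size in KiB, and command name.

```python
import subprocess
import time
from datetime import datetime

def parse_ps(output):
    """Parse `ps -eo pid,rss,comm` output into (pid, rss_kib, command) tuples."""
    rows = []
    for line in output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) == 3:
            rows.append((int(parts[0]), int(parts[1]), parts[2]))
    return rows

def top_by_rss(rows, n=5):
    """Return the n processes with the largest resident set size."""
    return sorted(rows, key=lambda r: r[1], reverse=True)[:n]

def snapshot():
    """Run ps once and return the biggest memory users."""
    out = subprocess.run(["ps", "-eo", "pid,rss,comm"],
                         capture_output=True, text=True, check=True).stdout
    return top_by_rss(parse_ps(out))

def log_forever(path="/var/tmp/memlog.txt", interval=60):
    """Append a timestamped snapshot to `path` every `interval` seconds."""
    while True:
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(path, "a") as f:
            for pid, rss, comm in snapshot():
                f.write(f"{stamp} {pid} {rss} {comm}\n")
        time.sleep(interval)
```

Calling `log_forever()` from a root shell (or a systemd unit) would leave a file that still holds the last readings before the machine stops responding, which is the point when the box is not ssh'able after the fact.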
11:34 AM jepler: like deleting the forum and setting the backup of it on fire
11:35 AM archivist: is the mysql slow query log enabled
11:35 AM jepler: I have never enabled such a thing
11:36 AM jepler: there's nothing of consequence logged by mysql
11:37 AM jepler: # ls -l /var/log/mysql/error.log
11:37 AM jepler: -rw-r----- 1 mysql adm 0 May 9 06:42 /var/log/mysql/error.log
11:37 AM archivist: sounds like a default config
11:38 AM archivist: some stoopid OSes redirect the mysql errors to syslog
11:39 AM jepler: an older log does have some entries, so it's not that
11:39 AM jepler: 170509 6:16:44 InnoDB: Database was not shut down normally!
11:40 AM jepler: 170509 6:16:45 InnoDB: 5.5.55 started; log sequence number 103088151141
11:40 AM jepler: but then this log got rotated shortly after startup
11:40 AM jepler: the old log has no entries from before the crash
11:40 AM archivist: power cycling will give that
11:43 AM jepler: *shrug* it's the easiest tool we have available when this happens, since the system just comes back up and gets back to serving users
11:44 AM skunkworks: jepler, turn the reboot off?
11:45 AM skunkworks: Huh - I can load a model into cura that locks the whole computer up
11:46 AM skunkworks: trying a 3.something kernel
11:48 AM skunkworks: hmm - nope. locks up.
11:48 AM jepler: .. hold on, I'm going to try to break flo
11:51 AM skunkworks_: jepler, this one locks up my cura 2.5.0 http://www.thingiverse.com/thing:1307100/#files
11:51 AM jepler: skunkworks_: interesting / yuck
11:52 AM jepler: skunkworks_: let me not try that right this second
11:52 AM skunkworks: heh
11:55 AM jepler: so yes, by stressing the server remotely it can get driven to large swap usage
11:55 AM jepler: the memory usage was almost exclusively apache2, not mysqld
11:56 AM jepler: I'm going to tune mpm_prefork's MaxRequestWorkers down, because they will exceed available RAM+SWAP
11:58 AM jepler: MaxRequestWorkers 150 -> 25, and also enabling mod_reqtimeout
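[Editor's sketch] The reasoning behind lowering MaxRequestWorkers is simple arithmetic: with mpm_prefork each worker is a separate process, so the worker count times the per-worker memory must fit in what the machine has. The 1 GB RAM figure is from the log; the 300 MiB reserved for the OS and mysqld and the 30 MiB per apache2 worker are hypothetical numbers chosen only to illustrate the calculation, not measurements from the forum.

```python
def max_workers(total_ram_mib, reserved_mib, per_worker_mib):
    """Largest worker count whose combined footprint fits in RAM without swapping."""
    usable = total_ram_mib - reserved_mib
    if usable <= 0 or per_worker_mib <= 0:
        return 0
    return usable // per_worker_mib

def worst_case_mib(workers, per_worker_mib):
    """Memory needed if every worker is busy at once."""
    return workers * per_worker_mib

# 1024 MiB RAM (from the log); 300 MiB reserved and 30 MiB/worker are assumptions.
print(max_workers(1024, 300, 30))   # 24
print(worst_case_mib(150, 30))      # 4500
```

In practice the per-worker figure would come from observing apache2 RSS under load, for example from the memory logger's output.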
12:22 PM seb_kuzminsky: nice!
12:35 PM skunkworks: 2.4 locks up too.. wonder if it is a machine setting
12:37 PM skunkworks: nope. well that is just weird.
12:39 PM seb_kuzminsky: a userspace app that locks up the whole computer? that's not right
12:39 PM seb_kuzminsky: is it running on a linux box?
12:41 PM skunkworks: yes
12:42 PM skunkworks: can't ctrl-alt f1 or anything
12:43 PM skunkworks: jessie
12:59 PM pcw_mesa: does it use all of memory?
01:13 PM skunkworks: wouldn't think so.. But I could check
01:13 PM skunkworks: it has 8gb iirc
01:13 PM skunkworks: or maybe 16
01:38 PM jepler: skunkworks: whatever you figure out, we'll use that information to fix the forum, so keep troubleshooting
01:39 PM skunkworks: Will do!
01:47 PM seb_kuzminsky: jepler: do you want me to stop that memory logger script? or is it still useful?
01:47 PM jepler: seb_kuzminsky: I think it's still useful
01:48 PM seb_kuzminsky: seems like you found 1 smoking gun and fixed it
01:48 PM seb_kuzminsky: but maybe there are more
01:48 PM jepler: I changed 3 things all at once
01:48 PM seb_kuzminsky: ok i'll leave it
01:48 PM seb_kuzminsky: heh
01:48 PM jepler: bad idea
01:48 PM jepler: I mean, changing 3 things at once is a bad idea
01:50 PM seb_kuzminsky: i wonder if suavesteve (the guy on the mailing list with the libmodbus 3.1 patch) will update his patch, or if maybe Evan Foss will, or if i should just do it
01:50 PM seb_kuzminsky: i guess i'll wait a little longer
01:50 PM jepler: is the libmodbus in question compatible with GPLv2-only license?
01:51 PM seb_kuzminsky: not sure
01:52 PM seb_kuzminsky: it's by the same guy/project who did the libmodbus 3.0 that we're using, i assumed they didn't change their license between 3.0 and 3.1 but i didn't check
01:55 PM jepler: ok
01:56 PM jepler: libmodbus 2.0.0 was LGPL3 and GPL3, which are both incompatible with GPLv2-only. apparently the project has changed since then and it's LGPL v2.1+ now
01:56 PM jepler: http://libmodbus.org/documentation/ states LGPL v2.1+, while only this old release note says something about LGPL3 http://libmodbus.org/2008/libmodbus-200-slaves-to-our-machines-is-out/
01:57 PM jepler: so I was remembering something real but out of date
01:57 PM jepler: sorry for getting all alarmed
01:58 PM seb_kuzminsky: the libmodbus deb "copyright" file says LGPL 2.1+ for everything except the tests, which are LGPL 3+
01:58 PM seb_kuzminsky: sorry, the tests are GPL 3+, not LGPL
01:58 PM seb_kuzminsky: that's for 3.0.6
02:22 PM jepler: weird the web page says "the licence of programs in the tests directory is BSD 3-clause."
02:22 PM jepler: so confuse
02:22 PM seb_kuzminsky: very contradict
02:22 PM jepler: but we don't link the tests so it doesn't matter one way or another
02:23 PM seb_kuzminsky: yay, it's tuesday, that means hackspace tonight
02:24 PM jepler: woo
02:24 PM jepler: I don't think I can drive there in time though
02:24 PM seb_kuzminsky: it might still be going by the time you get here
02:27 PM seb_kuzminsky: somebody brought in a "Protomat S62" pcb routing machine, but they dont want to gut it and run linuxcnc on it, and i dont want to try to figure out the usb connection to its windows app, so it's just gathering dust
02:27 PM seb_kuzminsky: also we dont have the windows app
02:28 PM seb_kuzminsky: does anyone have experience with the 'cyclictest' program from the rt-preempt folks? https://rt.wiki.kernel.org/index.php/Cyclictest
02:29 PM seb_kuzminsky: i'm seeing surprisingly good latency from 4.9.10 vanilla from kernel.org
02:29 PM jepler: interesting
02:29 PM jepler: I've installed and run it but I'm not a pro at understanding the numbers or all those commandline switches
02:29 PM seb_kuzminsky: worst case latencies are up in the milliseconds, but they're very, very rare
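[Editor's sketch] For readers unfamiliar with what a tool like cyclictest measures: it repeatedly sleeps until a deadline and records how late each wakeup was. The toy below shows that idea in Python. It is not cyclictest and will report far worse numbers than a real-time C program would; the function names, interval, and loop count are all illustrative assumptions.

```python
import time

def measure_wakeup_latency(interval_s=0.001, loops=1000):
    """Sleep until successive absolute deadlines; return each wakeup's overshoot in microseconds."""
    interval_ns = int(interval_s * 1e9)
    samples = []
    deadline = time.monotonic_ns() + interval_ns
    for _ in range(loops):
        remaining = deadline - time.monotonic_ns()
        if remaining > 0:
            time.sleep(remaining / 1e9)
        # How far past the deadline did we actually wake up?
        samples.append((time.monotonic_ns() - deadline) / 1000.0)
        deadline += interval_ns
    return samples

def summarize(samples):
    """Min, average, and max latency, the same headline numbers cyclictest prints."""
    return {
        "min_us": min(samples),
        "avg_us": sum(samples) / len(samples),
        "max_us": max(samples),
    }
```

Running `summarize(measure_wakeup_latency())` on an idle and then a loaded machine shows why the maximum, rather than the average, is the figure that matters for real-time use: rare outliers barely move the average.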
02:31 PM skunkworks: I have had good latencies with 4.9.1
02:31 PM skunkworks: 0
02:32 PM seb_kuzminsky: 0 latency, wow!
02:32 PM skunkworks: 4.9.10 ;)
02:32 PM seb_kuzminsky: :-)
02:33 PM seb_kuzminsky: spacex just test fired the center core of the falcon heavy, that's exciting
02:39 PM skunkworks: jepler, when you get a chance try loading something large into 2.5.0.. Your laptop is really close to mine.
03:04 PM jepler: skunkworks: I will when I am at home.
03:04 PM jepler: skunkworks: fwiw I've never had a whole-computer lockup running cura 2.3.1
03:05 PM skunkworks: Thanks!
03:06 PM jepler: except that it eats all the settings going between 2.3.1 and 2.5.0, I'd suggest you try installing the 2.3.1 deb..
03:20 PM skunkworks: jepler, don't worry about it. switching from nouveau to nvidia fixed it
03:20 PM skunkworks: wonder what that does to my rt_preempt performance :)
03:21 PM jepler: skunkworks: glad you figured out a workaround
03:28 PM skunkworks: jepler, surprisingly not bad... http://electronicsam.com/images/KandT/testing/Screenshot-67.png
03:31 PM skunkworks: bbl