#linuxcnc-devel Logs
May 24 2017
10:56 AM skunkworks: is there a way to figure out what drive in an array (mdadm) a logical block is assigned to? Is that even a correct question?
10:57 AM skunkworks_: I keep getting this Buffer I/O error on dev md127, logical block 2531262587, lost async page write
10:57 AM skunkworks_: same block
10:58 AM archivist: time to panic
10:59 AM skunkworks: again?
10:59 AM archivist: bad blocks are supposed to be marked bad
11:00 AM skunkworks: that is what I thought
11:00 AM skunkworks: There was one drive that had some really bad smart data. I pulled it from the raid. Still getting the error.
11:01 AM archivist: and raids are supposed to be smarter
11:01 AM skunkworks: that is what I thought ;)
11:04 AM Tom_itx: meh
11:05 AM archivist: run out of bad block storage space?
11:05 AM Tom_itx: if it did, he's got bigger problems
11:12 AM seb_kuzminsky: skunkworks: it's strange that you're getting io errors from md but not from the underlying device
11:26 AM skunkworks_: https://pastebin.com/xe219Rj6
11:26 AM skunkworks_: the top half was running fsck -> bottom half is running a rsync
11:27 AM skunkworks_: (after mounting)
11:28 AM skunkworks_: I did have some hardware issues initially - because now I am only getting the above errors - before I was getting some pretty wicked resets and such.
11:30 AM skunkworks_: Maybe replacing the drive that I failed - and rebuilding the array will fix it.
11:42 AM Tom_itx is now known as Tom_L
11:43 AM seb_kuzminsky: jepler: flo went 1.9 GB into swap (25 apaches @ 170 MB each), recovered and ran fine after that
11:43 AM seb_kuzminsky: i'm going to go out on a limb and say your apache parallelism cap fixed the forum crashing problem
11:45 AM jepler: seb_kuzminsky: thanks for keeping an eye on it
11:45 AM jepler: it's too bad after all these years a linux box can still swap itself into unusability
11:50 AM jepler: cradek: did you ever attempt to measure the accuracy of your Rubidium time standards?
11:51 AM archivist: accuracy or stability
11:51 AM jepler: I probably mean stability
11:52 AM jepler: I'm geektrapped by Ed Nisley's recent articles about trying to select tuning values for an AD9850 DDS under the limitations of the AVR/Arduino environment, no double precision floats
11:53 AM archivist: I have a very old quartz standard made in the 1960s, compared that to an off-air standard
11:54 AM jepler: .. I developed an algorithm that lets you enter the measured oscillator frequency in Hz and get the correct tuning value for a desired output frequency in Hz or milliHz
11:55 AM archivist: I was locking counters and sources at 18ghz and seeing how many cycles I was out
11:55 AM jepler: of course these direct from china boards have uncompensated crystal oscillators, so the fact that you can only specify f_osc to the nearest 1Hz is not a real limitation
11:57 AM jepler: I was wondering how .. extraordinary .. the input clock would have to be before it would be plausible to want to specify a fractional f_osc , e.g., 125'000'015.667Hz instead of 125'000'016Hz
11:58 AM archivist: I wonder how dds can be noise free
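For reference, the AD9850's output frequency is f_out = tuning_word * f_osc / 2^32, so a tuning value of the kind jepler describes can be computed with integer arithmetic alone. The sketch below is not jepler's algorithm (the log does not show it). It is a minimal Python illustration in which the function name and the choice of milliHz units are assumptions. Python's unbounded integers sidestep the AVR's lack of double-precision floats rather than solving it.

```python
PHASE_BITS = 32  # the AD9850 uses a 32-bit frequency tuning word


def tuning_word(f_out_millihz: int, f_osc_millihz: int) -> int:
    """Nearest tuning word for f_out = word * f_osc / 2**32.

    Hypothetical helper; both frequencies are integers in milliHz.
    """
    numerator = f_out_millihz << PHASE_BITS
    # Adding half the divisor before floor division rounds to nearest.
    return (numerator + f_osc_millihz // 2) // f_osc_millihz


# 10 MHz out from a nominal 125 MHz oscillator.
nominal = tuning_word(10_000_000_000, 125_000_000_000)

# Same output, oscillator entered as 125'000'016 Hz
# versus 125'000'015.667 Hz.
whole_hz = tuning_word(10_000_000_000, 125_000_016_000)
fractional_hz = tuning_word(10_000_000_000, 125_000_015_667)

print(nominal, whole_hz, fractional_hz)
```

One count of the tuning word is about 0.029 Hz of output (125 MHz / 2^32), so at 10 MHz out a third of a hertz in f_osc shifts the result by roughly one count at most.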
12:01 PM cradek: jepler: I listened to its 10MHz on the shortwave vs. WWV and didn't hear beating, but I'm not sure that's a valid test - would I even hear it?
12:04 PM cradek: I think no
12:04 PM cradek: it might make the "signal" fade out and in (RF amplitude changes slowly) but that's how shortwave works anyway
12:06 PM archivist: if you have an off-air source you can lock to WWV, and a signal gen locked to the rubidium, you can measure short term
12:11 PM archivist: must get some batteries and power up the xtal see if I can get it to the accuracy I did 10 years ago
12:13 PM jepler: timekeeping is a geek trap for sure
12:15 PM archivist: very true
12:16 PM archivist: the standard I have is of the same type as in the older GMT clock http://www.collection.archivist.info/archive/DJCPD/PD/2002/2002_07_02_BHI_Greenwich_Meantime_Clock/P1010033.JPG
12:16 PM archivist: it's in the middle just below the clock with a dial
12:17 PM archivist: double oven with monitoring
12:18 PM cradek: jepler: I do still have one of them, but have never tested it. the other one is in that clock that I assume is still running but I don't have
12:29 PM jepler: this is the fancy accurate timekeeping gizmo I have. too bad it's no longer being sold https://partiallystapled.com/pages/laureline-gps-ntp-server.html
12:29 PM jepler: $ ntpq -n -c peers
12:29 PM jepler: remote refid st t when poll reach delay offset jitter
12:29 PM jepler: ==============================================================================
12:29 PM jepler: *10.0.2.61 .GPS. 1 u 13 16 377 0.177 0.038 0.021
12:46 PM archivist: seems you can make a "work a like" with a pi and gps addon
01:34 PM jepler: I was reading the same thing earlier
01:55 PM skunkworks: jepler, http://electronicsam.com/images/matsuura/20170524_130222.jpg
01:55 PM skunkworks: works great
02:01 PM jepler: skunkworks: nice
02:01 PM jepler: I was just saying to cradek earlier that I haven't touched my wiring problems
05:21 PM seb_kuzminsky: jepler: i'm logged in to flo, and it's slow even though it's not in swap
05:21 PM seb_kuzminsky: vmstat suggests the slowness is associated with block writes (not swap)
05:22 PM jepler: seb_kuzminsky: "slow" in the web interface or otherwise?
05:22 PM seb_kuzminsky: iotop -o -a suggests, surprisingly, that a lot of the writes are from apache, with mysql in second place
05:22 PM seb_kuzminsky: jepler: both in the web interface and in the shell
05:22 PM seb_kuzminsky: a monitoring script i'm running at home got 10+ second page load latencies
05:23 PM seb_kuzminsky: i don't know what apache is writing (next step might be to use strace to look for open())
05:24 PM seb_kuzminsky: so i'm guessing the system gets slow enough that requests start piling up, and apache spins up more clones, and the machine goes into swap, and things suck until our active users leave disappointed
05:25 PM seb_kuzminsky: so maybe: 25 apaches seems to top out at using about 1 GB RAM and 2 GB swap. how about we try limiting it to 1/3 of 25 instead? 8 apaches max, see if that works better
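The arithmetic behind that proposal can be sketched as follows. This is a rough estimate only: it treats seb_kuzminsky's "1 GB RAM and 2 GB swap" as 1024 MB and 2048 MB, assumes memory use scales linearly with the number of apache processes, and ignores pages shared between them. The function name is made up for illustration.

```python
def per_process_mb(ram_mb: float, swap_mb: float, workers: int) -> float:
    """Average footprint per apache process (hypothetical helper)."""
    return (ram_mb + swap_mb) / workers


# 25 apaches topping out at about 1 GB RAM plus 2 GB swap.
footprint = per_process_mb(1024, 2048, 25)

# "1/3 of 25", rounded down to a whole number of processes.
capped_workers = 25 // 3

# Estimated total for the capped configuration, in MB.
estimate_mb = footprint * capped_workers

print(capped_workers, round(footprint, 1), round(estimate_mb))
```

Under those assumptions 8 apaches come out at a little under 1 GB in total, which is about the size of the machine's RAM.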
05:25 PM cradek: seems like the only thing apache would write to on a block device is its log file
05:26 PM jepler: cradek: it's running php in the apache process
05:26 PM jepler: so it does whatever php is doing
05:26 PM cradek: oh hm
05:26 PM seb_kuzminsky: the log file doesn't look totally crazy
05:27 PM seb_kuzminsky: "heavy load" for flo is a few hundred requests per minute
05:27 PM seb_kuzminsky: it drops to single-digit requests/minute when it's swap thrashing...
05:27 PM seb_kuzminsky: we should call it slo instead
05:27 PM jepler: "s" for "sorry"?
05:28 PM seb_kuzminsky: "shouldn't've"
05:28 PM jepler: 263342 6956 -rw-r--r-- 1 www-data www-data 7122305 May 24 17:57 /var/www/html/cache/com_modules/25b5df813c2ce4100ccc448a4f9cb020-cache-com_modules-536f404bd35243cfab6ec504a36b7a75.php
05:28 PM jepler: there's some huge "cache" file that it writes frequently
05:28 PM jepler: 263342 540 -rw-r--r-- 1 www-data www-data 549438 May 24 17:58 /var/www/html/cache/com_modules/25b5df813c2ce4100ccc448a4f9cb020-cache-com_modules-536f404bd35243cfab6ec504a36b7a75.php
05:31 PM jepler: joomla's doc on the various caches is .. less than illuminating https://docs.joomla.org/Cache
05:32 PM jepler: that file mostly increases in size, [Wed May 24 17:31:09.380505 2017] [:error] [pid 12390] [client 71.218.151.236:39148] PHP Notice: unserialize(): Error at offset 1974235 of 1974239 bytes in /var/www/html/libraries/joomla/cache/controller.php on line 176
05:32 PM jepler: [Wed May 24 17:31:23.628751 2017] [:error] [pid 12390] [client 54.196.107.247:49118] PHP Notice: unserialize(): Error at offset 917462 of 917471 bytes in /var/www/html/libraries/joomla/cache/controller.php on line 176
05:32 PM jepler: [Wed May 24 17:31:27.881041 2017] [:error] [pid 12391] [client 66.249.66.231:44045] PHP Notice: unserialize(): Error at offset 5655802 of 5656543 bytes in /var/www/html/libraries/joomla/cache/controller.php on line 176
05:32 PM jepler: err
05:32 PM jepler: there are also many errors of this type in the apache error log
05:32 PM seb_kuzminsky: yeah, there's a ton of those errors
05:32 PM jepler: but they've always been there
05:33 PM seb_kuzminsky: what do you think of the idea of reducing the number of apaches while we try to figure out that problem?
05:33 PM cradek: so it's caching something but the caching is busted
05:33 PM seb_kuzminsky: yeah, the frequency of that error doesn't seem to covary much with the system slowness
05:43 PM jepler: anyway I changed it to 8 just now
05:44 PM seb_kuzminsky: we'll see how that works
05:46 PM jepler: .. I can also disable caching ..
05:46 PM seb_kuzminsky: i wonder if that will make it faster or slower?
05:46 PM jepler: I dunno
05:47 PM jepler: I did an 'apache2 restart' to make sure the new setting took, and the front page loads (curl https://forum.linuxcnc.org > /dev/null) are faster than before
05:47 PM seb_kuzminsky: yay
05:48 PM seb_kuzminsky: it's funny how doing less is more efficient
05:48 PM JT-Shop: Yippie!
05:48 PM jepler: ah there goes a slow one
05:48 PM jepler: but I'm routinely getting <.5s locally, while I was generally seeing >2s locally before
05:48 PM jepler: of course it could also be an effect of apache being fresh...
05:49 PM seb_kuzminsky: when i tried it (N=1) the front page and the "list of threads" pages were fast, and the "list of posts in the thread" page was slow
05:50 PM jepler: are you logged in?
05:50 PM seb_kuzminsky: hmm, my shell on flo is not responding...
05:50 PM jepler: mine is
05:51 PM seb_kuzminsky: mine is too now
05:51 PM * seb_kuzminsky shrugs
05:51 PM jepler: > [ 0.000000] Reserving 128MB of memory at 768MB for crashkernel (System RAM: 1023MB)
05:51 PM jepler: we can't possibly want this
05:51 PM seb_kuzminsky: heh, nice catch
05:51 PM seb_kuzminsky: agreed, we don't want that
05:52 PM seb_kuzminsky: every meg counts, on that machine :-P
05:53 PM jepler: apparently kexec-tools put that there
05:53 PM jepler: and it's done in such a way that it keeps doubling itself up
05:53 PM jepler: default/grub.d/kexec-tools.cfg:GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=384M-:128M"
05:53 PM jepler: default/grub:GRUB_CMDLINE_LINUX_DEFAULT="crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M"
05:54 PM jepler: OK, I'm going to bounce the whole thing now...
05:54 PM seb_kuzminsky: +1
05:54 PM jepler: I don't think it increased the available memory
05:54 PM jepler: # cat /proc/cmdline
05:54 PM jepler: BOOT_IMAGE=/boot/vmlinuz-3.13.0-117-generic root=UUID=050e1e34-39e6-4072-a03e-ae0bf90ba13a ro crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M
05:55 PM seb_kuzminsky: if you say it enough times it'll happen
05:57 PM jepler: I bet it's kexec again
05:57 PM jepler: .. bouncing again
05:58 PM seb_kuzminsky: yay!!
05:58 PM jepler: yup, kexec was dutifully copying the old /proc/cmdline to the new kernel
05:58 PM jepler: bypassed kexec, and the grub setting took effect
05:59 PM jepler: so now we have about 13% more RAM I think
05:59 PM seb_kuzminsky: now we might be able to afford *9* apaches!
05:59 PM jepler: was 886744, now 1017812 (KiB I assume)
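A quick check of those two figures, assuming (as jepler does) that both are in KiB:

```python
before_kib = 886744
after_kib = 1017812

gained_kib = after_kib - before_kib

# The crashkernel reservation was 128 MB, i.e. 128 * 1024 KiB.
reserved_kib = 128 * 1024

# The gain expressed against the old total and against the new total.
gain_vs_before = gained_kib / before_kib * 100
gain_vs_after = gained_kib / after_kib * 100

print(gained_kib, reserved_kib, round(gain_vs_before, 1), round(gain_vs_after, 1))
```

The gain is within a few KiB of the 128 MB crashkernel reservation. It is close to 15% of the old total and close to 13% of the new one, so the "about 13%" above matches the second reading.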
06:08 PM jepler: seb_kuzminsky: it's really helpful to me that you've taken an interest in this
06:08 PM jepler: seb_kuzminsky: I was happy to ignore it (and for a long time it was happening less frequently, it seemed)
06:08 PM seb_kuzminsky: maybe more users recently?
06:11 PM jepler: that would be a nice reason, right up until we need the $20/month level of RAM and CPU ...
06:13 PM andypugh: i don’t see any evidence of a lot more subscribers.
06:13 PM andypugh: Though that is not the same as more viewers.
06:13 PM jepler: andypugh: neat stuff on your blog as usual
06:14 PM andypugh: It feels like a bit of a cheat, one casting and an expensive component. But it’s a very nice component.
06:14 PM jepler: andypugh: except for the expense, there's no reason you shouldn't have nice things
06:15 PM andypugh: And it cost about the same as a year at the $20 rate, so if we hit that mark, I volunteer to pay a year.
06:15 PM jepler: that's kind of you
06:16 PM andypugh: Nobody else gets to have an opinion on how I spend my money. I like that, most of the time.
06:16 PM jepler: it's a bit irritating on the technical level to let "someone else" pay for the forum, you have to go through the administrative portion of the digitalocean web interface and thence to paypal
06:16 PM seb_kuzminsky: on Tuesday May 24 2016 there were 146k accesses, on Tuesday May 23 2017 there were 219k
06:17 PM jepler: wow look at seb_kuzminsky with the stats
06:17 PM jepler: (wait this thing has been up over a year? time flies)
06:17 PM jepler: seb_kuzminsky: do you have amazing log-dicing scripts or do you just open-code all these queries with awk in a shell?
06:18 PM seb_kuzminsky: /root/count-requests on flo
06:18 PM andypugh: This forum I use a lot has a (permanently empty) donations bar: http://hmvf.co.uk/forumvb/index.php
06:18 PM seb_kuzminsky: it's a python script that parses the access.log a little bit
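The actual /root/count-requests script is not shown in the log. A minimal sketch of what a per-day request counter over an Apache access.log could look like, assuming the common/combined log format, is given below. The function name and the sample lines are invented for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the date part of Apache's common/combined log timestamp,
# e.g. [24/May/2017:17:31:09 -0600]
TIMESTAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}):")


def count_requests_per_day(lines):
    """Return a Counter mapping date -> number of logged requests."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.search(line)
        if match:
            # %b expects English month abbreviations (C/English locale).
            day = datetime.strptime(match.group(1), "%d/%b/%Y").date()
            counts[day] += 1
    return counts


# Made-up sample lines using documentation-range addresses.
sample = [
    '203.0.113.5 - - [23/May/2017:06:17:01 -0600] "GET / HTTP/1.1" 200 5120',
    '203.0.113.9 - - [23/May/2017:06:17:02 -0600] "GET /forum HTTP/1.1" 200 812',
    '198.51.100.7 - - [24/May/2017:17:31:09 -0600] "GET / HTTP/1.1" 200 5120',
    'not a log line',
]

for day, total in sorted(count_requests_per_day(sample).items()):
    print(day, total)
```

To run it against a real file, pass the open file object in place of `sample`, since iterating over a file yields its lines.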
06:18 PM andypugh: Hmm, now my catholic-atheist guilt is kicking in
06:20 PM jepler: seb_kuzminsky: and it just happens that we keep a year of logs I guess
06:20 PM seb_kuzminsky: 52 weeks, yeah, looks like
06:20 PM jepler: near enough
06:20 PM seb_kuzminsky: + or - 1, as is customary in computer science
06:20 PM seb_kuzminsky: :w
06:21 PM seb_kuzminsky: gah, it's 2017 and we don't have think-to-focus window managers yet?
06:22 PM jepler: it's not some weird beek emoji I'm just unfamiliar with?
06:22 PM andypugh: (I am not actually a particular fan of military vehicles, it just happens to be where all the folk I met at University who rebuild pre-1920s solid tyred trucks have found a strange home. I do recommend the build-logs in the 2 stickies and the 1908 and 1914 Dennis lorries.)
06:25 PM andypugh: jepler: That makes so much more sense with “beak” I was wondering if “beek” was a word I was too old to grok.
06:26 PM jepler: oh yes beak
06:26 PM seb_kuzminsky: he's just been inhaling too much vaporized plastic lately, i hear spelling is the first thing to go
06:26 PM jepler: [beek] Scot. and North England. verb (used with or without object) 1. to bask or warm in the sunshine or before a fireplace, stove, or bonfire.
06:30 PM jepler: afk