Citadel - Build Plan and Log

ctag

Patron
Joined
Jun 16, 2017
Messages
225
OK, a month later and I'm finally getting back to the system.

Upgraded to 11.2-U5. Everything looks good so far.

I need to respond to that bug ticket I had about the scrolling...

I saw this on the kernel log yesterday:
2019-09-03_22-37.png


And then got this as an email:
bns-citadel.local kernel log messages:
> ugen6.2: <CP1000PFCLCD CRDA103BJ1> at usbus6 (disconnected)
> ugen6.2: <CP1000PFCLCD CRDA103BJ1> at usbus6

-- End of security output --

I'm curious whether there's something I need to do to fix my UPS configuration. Sometimes I get half a dozen of those disconnected alerts in a single daily email.
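To see how often the UPS actually drops off the bus, the disconnect lines can be tallied from a saved copy of the kernel log. This is only a rough sketch: the function name and the regular expression are made up for illustration, and the pattern is based solely on the two log lines quoted above.

```python
import re
from collections import Counter

# Matches kernel log lines such as:
#   ugen6.2: <CP1000PFCLCD CRDA103BJ1> at usbus6 (disconnected)
DISCONNECT_RE = re.compile(
    r"(ugen\d+\.\d+): <([^>]+)> at (usbus\d+) \(disconnected\)"
)


def count_disconnects(log_text):
    """Return a Counter mapping device description -> number of disconnect lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = DISCONNECT_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


sample = """\
> ugen6.2: <CP1000PFCLCD CRDA103BJ1> at usbus6 (disconnected)
> ugen6.2: <CP1000PFCLCD CRDA103BJ1> at usbus6
"""
print(count_disconnects(sample))
```

Only the "(disconnected)" line is counted; the reattach line that follows it is ignored.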
 

ctag

I'm trying to fix up my rsync backup tasks, since they stopped working when I switched to using `.local` for my home network.

I think I have it all set up right. I can SSH from the NAS to my desktop computer with a keypair. But the FreeNAS GUI says "Unknown OpenSSH private key algorithm".
 

Attachment: 1568122082225.png

ctag

If I dig into /etc/crontab and take the old rsync task, update it to have .local for the hostname, and run it, it appears to work. So whatever is keeping me from saving the config in the GUI appears to be some sort of validation on the webpage?

1568123188713.png
 

ctag

Disabling "Validate remote path" on the rsync task allows me to save the task, and the hostname is appropriately updated. But now clicking "Run now" has no effect, and nothing is written to the system log even though the "Rsync task started" popup appears...
 

onceler

Cadet
Joined
Sep 11, 2017
Messages
9
Using the .local. domain for local networks causes problems, as it is used by mDNS. See https://tools.ietf.org/html/rfc6762#appendix-G

If you have a registered domain (e.g. example.com), it's best to use a subdomain of that (e.g. local.example.com). If not, pick another TLD that is unlikely to be registered globally (the above link suggests .intranet, .internal, .private, .corp, .home, or .lan)
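A quick way to spot hostnames that would collide with mDNS is to look at the last label of the name. A minimal sketch, where the function name is hypothetical and the only rule applied is the one from RFC 6762 (names under .local are reserved for mDNS):

```python
def uses_mdns_tld(hostname):
    """Return True if the name ends in .local, which RFC 6762 reserves for mDNS."""
    # Drop an optional trailing dot and compare case-insensitively.
    labels = hostname.rstrip(".").lower().split(".")
    return labels[-1] == "local"


for name in ("bns-citadel.local", "nas.local.example.com"):
    print(name, uses_mdns_tld(name))
```

A subdomain such as local.example.com is fine, because only the final label matters here.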
 

ctag

Ah, dang it. I was trying to do things the "right" way! Thank you for letting me know.

I own a domain, and previously had tried using it for external and internal resolution, but that seemed to cause problems. I'm not sure how to set up a subdomain for local use, but I'll look into it.

Aaaand I got this email this morning. Oh no!
2019-09-11_11-28.png
 

ctag

Woke up to this email, I don't have time to investigate until later today:


FreeNAS @ bns-citadel.local

New alert:
* Boot Pool Status Is DEGRADED: One or more devices are faulted in response to IO failures.

The following alert has been cleared:
* Boot Pool Status Is DEGRADED: One or more devices are faulted in response to persistent errors. Sufficient replicas exist for the pool to continue functioning in a degraded state.

Current alerts:
* New feature flags are available for volume main-pool. Refer to the "Upgrading a ZFS Pool" subsection in the User Guide "Installing and Upgrading" chapter and "Upgrading" section for more instructions.
* Boot Pool Status Is DEGRADED: One or more devices are faulted in response to IO failures.
 

ctag

I'm away from home right now, and I can't seem to reach the web UI or SSH into the machine through my VPN. SSH reaches "connected" and then hangs.

But I can reach the jails... Weird.
 

ctag

I had to leave for a trip Friday, and before heading out the door I noticed that the NAS was inaccessible from web or SSH, so I powered it down. Today I turned it back on, and it came up with an email:
New alert:
* Boot Pool Status Is ONLINE: One or more devices has experienced an unrecoverable error. An attempt was made to correct the error. Applications are unaffected.

I looked around a bit and ran `zpool status -v`
bns-citadel# zpool status -v
  pool: freenas-boot
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: resilvered 122M in 0 days 00:01:32 with 0 errors on Mon Sep 30 11:13:44 2019
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            da7p2     ONLINE       0     0     0
            da6p2     ONLINE       0     0     4

errors: No known data errors

The little USB flash drives that this thing boots from don't have SMART capabilities (as far as I know at least) so I'm just going to clear the error and carry on. Is that the right thing to do here?
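Before clearing anything, it can help to record which device had nonzero counters so a repeat offender stands out later. A rough sketch that scans saved `zpool status` output; the function name is made up, and it assumes plain integer counters (rows where zpool abbreviates large counts, e.g. with a K suffix, would be skipped):

```python
def devices_with_errors(status_text):
    """Return {device: (read, write, cksum)} for rows with any nonzero counter."""
    flagged = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Device rows have five columns: NAME STATE READ WRITE CKSUM
        if len(fields) != 5:
            continue
        name, _state, *counters = fields
        # Skip the header row and any prose line that happens to have five words.
        if not all(c.isdigit() for c in counters):
            continue
        read, write, cksum = (int(c) for c in counters)
        if read or write or cksum:
            flagged[name] = (read, write, cksum)
    return flagged


sample = """\
NAME STATE READ WRITE CKSUM
freenas-boot ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
da7p2 ONLINE 0 0 0
da6p2 ONLINE 0 0 4

errors: No known data errors
"""
print(devices_with_errors(sample))
```

Run against the output above, this flags only da6p2, the mirror member with four checksum errors.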
 

ctag

Oh, and I found a server someone had thrown out on the curb.

IMG_20190929_172302.jpg

2x Xeon E5-2640 @ 2.5GHz and 12G of DDR3-8400 RAM.

It's running memtest right now.
 

ctag

More memory errors.

bns-citadel.local kernel log messages:
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080381
> arp: 192.168.13.23 moved from 02:ad:10:00:05:0a to f0:4d:a2:30:14:44 on epair1b

-- End of security output --
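The Status value in these MCA lines can be unpacked by hand. A minimal sketch, assuming the IA32_MCi_STATUS bit layout from Intel's Software Developer's Manual (bit 63 valid, bit 61 uncorrected, bits 52:38 corrected error count, low 16 bits the MCA error code); the function and dictionary names are invented for illustration.

```python
# Labels for the MMM field of a memory controller error code. "RD" matches the
# kernel log above; the other labels are this sketch's own wording.
REQUEST_TYPES = {0: "generic", 1: "RD", 2: "WR", 3: "address/command", 4: "scrub"}


def decode_mca_status(status):
    """Decode a few architectural fields of an IA32_MCi_STATUS value."""
    fields = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "misc_valid": bool((status >> 59) & 1),
        "addr_valid": bool((status >> 58) & 1),
        "corrected_count": (status >> 38) & 0x7FFF,
        "mca_code": status & 0xFFFF,
    }
    code = fields["mca_code"]
    # Memory controller errors follow the pattern 000F 0000 1MMM CCCC,
    # where F (bit 12) is a filtering flag that is masked out here.
    if (code & 0xEF80) == 0x0080:
        channel = code & 0xF
        fields["memory_request"] = REQUEST_TYPES.get((code >> 4) & 0x7, "??")
        # Channel 0xF means "not specified", which the log prints as "??".
        fields["memory_channel"] = None if channel == 0xF else channel
    return fields


decoded = decode_mca_status(0x8C0000400001009F)
for name, value in decoded.items():
    print(f"{name}: {value}")
```

For the value in the log, this reports a valid, corrected (not uncorrected) memory read error with a count of 1 and no channel specified, which lines up with the "COR (1) RD channel ?? memory error" text.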
 

ctag

Following up on post #144 about Rsync tasks.

I went back and added --log-file=/mnt/main-pool/bkup/anarch/root_log.txt to the extra rsync options, and then re-ran the task. The log file had errors that seem to indicate SSH failed to connect. Then I read that "RSA" is the only supported SSH key type in the rsync task, so I created a new RSA key on the FreeNAS box, and configured it to log me into my desktop (anarch, named like "An Archlinux Box" not "Anarchy! Bwahaha!"). Anyway, I still couldn't get the UI to save the task, since it still returned "unknown OpenSSH algorithm" so I unchecked "Validate remote path" and saved it.

Rsync log error:
2019/10/03 10:51:22 [13558] rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
2019/10/03 10:51:22 [13558] rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]

Now the rsync task appears to be running successfully when I manually start it from the UI, and I see the output being saved to the separate log file.
 

ctag

bns-citadel.local kernel log messages:
> arp: 192.168.13.23 moved from 02:ad:10:00:05:0a to f0:4d:a2:30:14:44 on epair1b
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080180
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080180
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080480
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080280
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080686
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000081282

-- End of security output --
 

ctag

bns-citadel.local kernel log messages:
> pid 72967 (php), uid 80: exited on signal 11
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080181

-- End of security output --
 

ctag

Yesterday
bns-citadel.local kernel log messages:
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000081181
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080182

-- End of security output --

And today
bns-citadel.local kernel log messages:
> MCA: Bank 8, Status 0x8c0000400001009f
> MCA: Global Cap 0x0000000000001c09, Status 0x0000000000000000
> MCA: Vendor "GenuineIntel", ID 0x206c2, APIC ID 0
> MCA: CPU 0 COR (1) RD channel ?? memory error
> MCA: Address 0x2a9c7880
> MCA: Misc 0x10000000080380
> ugen5.2: <CP1000PFCLCD CRDA103BJ1> at usbus5 (disconnected)
> ugen5.2: <CP1000PFCLCD CRDA103BJ1> at usbus5
> pid 98481 (php), uid 80: exited on signal 11

-- End of security output --
 

ctag

Upgraded the base OS and a few jails to 11.3-RELEASE yesterday. So far so good.

I really appreciate all the work that's been put into making the UI more friendly, it seems much more approachable now.
 

ctag

So I was still getting the "Cannot validate remote path" error on my rsync tasks, and found that removing my ed25519 key cures the error... Which seems both silly and frustrating.
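Since the validation seems to trip on non-RSA keys, one way to check which keys are lying around is to read the algorithm name off the front of each .pub file. A small sketch with hypothetical function names; the key strings below are truncated placeholders, not real keys, and the "RSA only" rule is just what I read about the rsync task, not something confirmed here.

```python
def public_key_algorithm(pubkey_line):
    """Return the algorithm field of an OpenSSH public key line, e.g. 'ssh-rsa'."""
    fields = pubkey_line.split()
    # A .pub file line is "<algorithm> <base64 key> [comment]".
    if len(fields) < 2:
        raise ValueError("not an OpenSSH public key line")
    return fields[0]


def is_rsa(pubkey_line):
    """True if the public key line declares the ssh-rsa algorithm."""
    return public_key_algorithm(pubkey_line) == "ssh-rsa"


# Truncated placeholder keys for illustration only.
keys = [
    "ssh-rsa AAAAB3NzaC1yc2E... user@example",
    "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5... user@example",
]
for key in keys:
    print(public_key_algorithm(key), is_rsa(key))
```

This only handles plain .pub lines; authorized_keys entries that start with options would need extra parsing.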
 

ctag

Now my rsync tasks are failing with:
Code:
Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 349, in run
    await self.future
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 388, in __run_body
    rv = await self.middleware.run_in_thread(self.method, *([self] + args))
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/run_in_thread.py", line 10, in run_in_thread
    return await self.loop.run_in_executor(self.run_in_thread_executor, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.7/site-packages/middlewared/utils/io_thread_pool_executor.py", line 25, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 964, in nf
    return f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/rsync.py", line 609, in run
    f'rsync command returned {cp.returncode}. Check logs for further information.'
middlewared.service_exception.CallError: [EFAULT] rsync command returned 12. Check logs for further information.


And a log of:
Code:
rsync: This rsync lacks old-style --compress due to its external zlib.  Try -zz.
rsync error: syntax or usage error (code 1) at main.c(1578) [server=3.1.3]
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [Receiver=3.1.3]


The new log download feature in 11.3 is nice, and that quickly led to the workaround: uncheck "Compress" and add "-zz" to the CLI options.
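The same swap can be sketched for anyone assembling rsync options in a script rather than in the UI. This is an illustration only: the function name is made up, and it handles just standalone `-z` and `--compress`, not clustered short options like `-avz`.

```python
def use_new_compress(options):
    """Replace standalone -z / --compress with -zz in a list of rsync options."""
    rewritten = []
    for opt in options:
        if opt in ("-z", "--compress"):
            opt = "-zz"
        # Avoid emitting -zz twice if it was already present.
        if opt == "-zz" and "-zz" in rewritten:
            continue
        rewritten.append(opt)
    return rewritten


print(use_new_compress(["-a", "-z", "--delete"]))
```

Other options pass through untouched and keep their order.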
 