FreeBSD GEOM mirror: fixing a degraded UFS mirror

This is a very old FreeBSD box that predates my move to ZFS; it still runs UFS on gmirror, with one mirror per GPT partition. After years of in-place upgrades, one leg of the /usr mirror (mirror/p5) dropped out because of stale metadata, not a bad disk. SMART on the re-added drive is clean (0 reallocated sectors, pending sectors, or CRC errors), but it has ~127,901 power-on hours (~14.6 years), so I let the mirror rebuild online and will plan a proactive replacement.

Symptoms

gmirror status shows only one active component for p5:

# gmirror status
     Name    Status  Components
mirror/p2  COMPLETE  ada0p2 (ACTIVE)
                     ada1p2 (ACTIVE)
mirror/p3  COMPLETE  ada0p3 (ACTIVE)
                     ada1p3 (ACTIVE)
mirror/p4  COMPLETE  ada0p4 (ACTIVE)
                     ada1p4 (ACTIVE)
mirror/p5  DEGRADED  ada1p5 (ACTIVE)
mirror/p6  COMPLETE  ada0p6 (ACTIVE)
                     ada1p6 (ACTIVE)
mirror/p7  COMPLETE  ada0p7 (ACTIVE)
                     ada1p7 (ACTIVE)
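
With this many mirrors it is easy to miss the one line that differs. A minimal sketch of a filter that prints only the mirrors whose status is not COMPLETE; degraded_mirrors is a hypothetical helper name, not a gmirror subcommand, and it assumes the column layout shown above:

```shell
#!/bin/sh
# degraded_mirrors: read `gmirror status` output on stdin and print the
# name of every mirror whose Status column is not COMPLETE.
# (Hypothetical helper; assumes the three-column layout shown above.)
degraded_mirrors() {
    awk '$1 ~ /^mirror\// && $2 != "COMPLETE" { print $1 }'
}

# On the live system:  gmirror status | degraded_mirrors
# Here it is fed an excerpt of the listing above:
degraded_mirrors <<'EOF'
     Name    Status  Components
mirror/p4  COMPLETE  ada0p4 (ACTIVE)
                     ada1p4 (ACTIVE)
mirror/p5  DEGRADED  ada1p5 (ACTIVE)
mirror/p6  COMPLETE  ada0p6 (ACTIVE)
                     ada1p6 (ACTIVE)
EOF
```

For the excerpt above this prints mirror/p5. Continuation lines (the second component of each mirror) are skipped because their first field does not start with mirror/.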

Mounts confirm /usr is on mirror/p5:

# mount
/dev/mirror/p2 on / (ufs, local, soft-updates, journaled soft-updates)
devfs on /dev (devfs)
/dev/mirror/p4 on /var (ufs, local, soft-updates, journaled soft-updates)
/dev/mirror/p5 on /usr (ufs, local, soft-updates, journaled soft-updates)
/dev/mirror/p6 on /home (ufs, local, soft-updates, journaled soft-updates)
/dev/mirror/p7 on /data (ufs, local, soft-updates, journaled soft-updates)
fdescfs on /dev/fd (fdescfs)

Inspect the mirror instance

Note: with gmirror you use the instance name (p5) for control commands; the device path is /dev/mirror/p5.

# gmirror list p5
Geom name: p5
State: DEGRADED
Components: 2
Balance: round-robin
Slice: 4096
Flags: NONE
GenID: 1
SyncID: 16
ID: 2312021832
Type: AUTOMATIC
Providers:
1. Name: mirror/p5
   Mediasize: 128849018368 (120G)
   Sectorsize: 512
   Mode: r1w1e1
Consumers:
1. Name: ada1p5
   Mediasize: 128849018880 (120G)
   Sectorsize: 512
   Stripesize: 0
   Stripeoffset: 27917370368
   Mode: r1w1e1
   State: ACTIVE
   Priority: 0
   Flags: NONE
   GenID: 1
   SyncID: 16
   ID: 3431532136
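
The listing reports Components: 2 but shows only one consumer, which suggests the mirror is still configured for a member that is not attached. A rough sketch that computes that difference from the output of gmirror list for a single mirror; missing_components is a hypothetical helper, and it assumes the field layout shown above:

```shell
#!/bin/sh
# missing_components: read `gmirror list NAME` output (one mirror only)
# on stdin and print configured components minus attached consumers.
# (Hypothetical helper; assumes the layout shown above.)
missing_components() {
    awk '
        /^Components:/                  { want = $2 }
        /^Consumers:/                   { in_consumers = 1; next }
        in_consumers && /^[0-9]+\. Name:/ { have++ }
        END                             { print want - have }
    '
}

# On the live system:  gmirror list p5 | missing_components
```

For the listing above this would print 1: two configured components, one attached consumer. The provider entry (mirror/p5) is not counted because it appears before the Consumers: line.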

Verify the partition exists and matches

Both disks have identical GPT layouts; ada0p5 does exist and matches ada1p5 in size/start:

# gpart show -p ada0
=>        34  1953525101    ada0  GPT  (932G)
          34         128  ada0p1  freebsd-boot  (64K)
         162     8388608  ada0p2  freebsd-ufs   (4.0G)
     8388770     8388608  ada0p3  freebsd-swap  (4.0G)
    16777378    37748736  ada0p4  freebsd-ufs   (18G)
    54526114   251658240  ada0p5  freebsd-ufs   (120G)
   306184354    52428800  ada0p6  freebsd-ufs   (25G)
   358613154  1509949440  ada0p7  freebsd-ufs   (720G)
  1868562594    84962541          - free -      (41G)

# gpart show -p ada1
=>        34  1953525101    ada1  GPT  (932G)
          34         128  ada1p1  freebsd-boot  (64K)
         162     8388608  ada1p2  freebsd-ufs   (4.0G)
     8388770     8388608  ada1p3  freebsd-swap  (4.0G)
    16777378    37748736  ada1p4  freebsd-ufs   (18G)
    54526114   251658240  ada1p5  freebsd-ufs   (120G)
   306184354    52428800  ada1p6  freebsd-ufs   (25G)
   358613154  1509949440  ada1p7  freebsd-ufs   (720G)
  1868562594    84962541          - free -      (41G)
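
As a cross-check, the sector counts from gpart can be converted to bytes and compared with the gmirror list output. This is only a sketch of the arithmetic, using values copied from the listings above and the 512-byte sector size they report:

```shell
#!/bin/sh
# Values copied from the gpart and gmirror output above.
SECTOR=512
P5_START=54526114      # first sector of p5 (gpart show)
P5_SECTORS=251658240   # length of p5 in sectors (gpart show)

part_bytes=$((P5_SECTORS * SECTOR))
echo "partition size : $part_bytes"                # consumer Mediasize
echo "partition start: $((P5_START * SECTOR))"     # consumer Stripeoffset
# gmirror keeps its metadata in the last sector of each component, so
# the mirror device is one sector smaller than the partition.
echo "mirror size    : $((part_bytes - SECTOR))"   # mirror/p5 Mediasize
```

The three results (128849018880, 27917370368 and 128849018368) match the consumer Mediasize, the Stripeoffset and the mirror/p5 Mediasize reported by gmirror list p5, so ada0p5 is large enough to rejoin the mirror.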

Fix: forget stale member, then reinsert

When I first tried to reinsert the missing partition, I hit:

gmirror: Not all disks connected.

That usually means the mirror still remembers a disconnected consumer. The fix is to forget and then re-insert:

# Tell gmirror to forget any disconnected members of p5
gmirror forget p5

# Clear metadata on the partition we're adding (harmless if none present)
gmirror clear /dev/ada0p5   # If it says "Invalid argument", it simply had no metadata

# Insert the missing member using the *instance* name
gmirror insert p5 /dev/ada0p5

Result: rebuild starts immediately.

# gmirror status
     Name    Status  Components
mirror/p5  DEGRADED  ada1p5 (ACTIVE)
                     ada0p5 (SYNCHRONIZING, 1%)

Monitoring rebuild progress

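Follow the system log (the kernel logs these events with a GEOM_MIRROR prefix, so a case-sensitive grep for "gmirror" would miss them):

tail -F /var/log/messages | grep --line-buffered GEOM_MIRROR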

Periodic status:

sh -c 'while :; do clear; date; gmirror status; sleep 5; done'

When finished you should see:

mirror/p5  COMPLETE  ada0p5 (ACTIVE)
                     ada1p5 (ACTIVE)
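
If you would rather not watch the screen, the same check can be scripted. A minimal sketch, assuming the gmirror status layout shown earlier; mirror_state and wait_for_mirror are hypothetical helper names, and the loop does not handle a mirror that disappears entirely:

```shell
#!/bin/sh
# mirror_state NAME: read `gmirror status` output on stdin and print the
# Status column for mirror/NAME. (Hypothetical helper.)
mirror_state() {
    awk -v dev="mirror/$1" '$1 == dev { print $2 }'
}

# wait_for_mirror NAME [SECONDS]: poll `gmirror status` until the mirror
# reports COMPLETE, sleeping SECONDS (default 60) between checks.
wait_for_mirror() {
    while [ "$(gmirror status | mirror_state "$1")" != "COMPLETE" ]; do
        sleep "${2:-60}"
    done
    echo "mirror/$1 is COMPLETE"
}

# Usage on the live system:
#   wait_for_mirror p5 30
```

Only the functions are defined here; nothing runs until wait_for_mirror is called on a host that has gmirror.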

Optional: drive health (SMART)

Install the proper tools and check both drives:

pkg install -y smartmontools
smartctl -a /dev/ada0
smartctl -a /dev/ada1

In my case, the key SMART fields were clean (no reallocated or pending sectors, no CRC errors). Power-on hours were high (126,905, or about 14.5 years), so I'll plan a proactive replacement even though the disk is currently healthy.
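
To pull out just those fields instead of reading the full report, something like the following can help. It is a sketch under assumptions: smart_raw and hours_to_years are hypothetical helpers, the attribute names are the ones commonly reported for ATA disks and vary by vendor, and some drives report Power_On_Hours in a format other than a plain number:

```shell
#!/bin/sh
# smart_raw ATTRIBUTE: read `smartctl -A` output on stdin and print the
# RAW_VALUE (10th column) of the named attribute. (Hypothetical helper;
# attribute names vary by drive vendor.)
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# hours_to_years HOURS: convert power-on hours to years (8760 h/year).
hours_to_years() {
    awk -v h="$1" 'BEGIN { printf "%.1f\n", h / 8760 }'
}

# Usage on the live system:
#   smartctl -A /dev/ada0 > /tmp/ada0.smart
#   for a in Reallocated_Sector_Ct Current_Pending_Sector UDMA_CRC_Error_Count; do
#       echo "$a: $(smart_raw "$a" < /tmp/ada0.smart)"
#   done
hours_to_years 126905
```

The last line prints 14.5 for the power-on hours quoted above.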

Takeaways

SMART on ada0 (the side I re-added) is clean: 0 reallocated, pending, or uncorrectable sectors and 0 CRC errors. However, it has ~127,901 power-on hours (~14.6 years) and sits around 52 °C, so although the rebuild completed fine, I should schedule a preemptive replacement.