This is a very old FreeBSD box that predates my move to ZFS; it still runs UFS on gmirror across multiple GPT partitions. After years of in-place upgrades, one leg of the /usr mirror (mirror/p5) dropped because of stale metadata, not a bad disk. SMART on the re-added drive is clean (0 reallocated sectors, 0 pending sectors, 0 CRC errors), but it has ~127,901 power-on hours (~14.6 years), so I let the mirror rebuild online and will plan a proactive replacement.
Symptoms
gmirror status shows only one active component for p5:
# gmirror status
Name Status Components
mirror/p2 COMPLETE ada0p2 (ACTIVE)
ada1p2 (ACTIVE)
mirror/p3 COMPLETE ada0p3 (ACTIVE)
ada1p3 (ACTIVE)
mirror/p4 COMPLETE ada0p4 (ACTIVE)
ada1p4 (ACTIVE)
mirror/p5 DEGRADED ada1p5 (ACTIVE)
mirror/p6 COMPLETE ada0p6 (ACTIVE)
ada1p6 (ACTIVE)
mirror/p7 COMPLETE ada0p7 (ACTIVE)
ada1p7 (ACTIVE)
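With this many mirrors it is easy to miss the one degraded line, so here is a minimal sketch of a filter that prints only the mirrors that are not COMPLETE. The function name degraded_mirrors is my own, and it assumes the column layout shown above (mirror name first, status second).

```shell
# Hypothetical helper: read `gmirror status` output on stdin and print
# "name status" for every mirror whose status is not COMPLETE.
# Continuation lines (extra components) do not start with "mirror/",
# so they are skipped; seen[] avoids printing a mirror twice.
degraded_mirrors() {
    awk '$1 ~ /^mirror\// && $2 != "COMPLETE" && !seen[$1]++ { print $1, $2 }'
}

# On the FreeBSD host:
#   gmirror status | degraded_mirrors
```

No output means every mirror reported COMPLETE.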
Mounts confirm /usr is on mirror/p5:
# mount
/dev/mirror/p2 on / (ufs, local, soft-updates, journaled soft-updates)
devfs on /dev (devfs)
/dev/mirror/p4 on /var (ufs, local, soft-updates, journaled soft-updates)
/dev/mirror/p5 on /usr (ufs, local, soft-updates, journaled soft-updates)
/dev/mirror/p6 on /home (ufs, local, soft-updates, journaled soft-updates)
/dev/mirror/p7 on /data (ufs, local, soft-updates, journaled soft-updates)
fdescfs on /dev/fd (fdescfs)
Inspect the mirror instance
Note: with gmirror you use the instance name (p5) for control commands; the device path is /dev/mirror/p5.
# gmirror list p5
Geom name: p5
State: DEGRADED
Components: 2
Balance: round-robin
Slice: 4096
Flags: NONE
GenID: 1
SyncID: 16
ID: 2312021832
Type: AUTOMATIC
Providers:
1. Name: mirror/p5
Mediasize: 128849018368 (120G)
Sectorsize: 512
Mode: r1w1e1
Consumers:
1. Name: ada1p5
Mediasize: 128849018880 (120G)
Sectorsize: 512
Stripesize: 0
Stripeoffset: 27917370368
Mode: r1w1e1
State: ACTIVE
Priority: 0
Flags: NONE
GenID: 1
SyncID: 16
ID: 3431532136
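One detail in that listing is worth a quick sanity check: the mirror provider is slightly smaller than its consumer. gmirror keeps its metadata in the last sector of each component, so the difference should be exactly one sector. A small sketch of the arithmetic, using the two Mediasize values above (the variable names are mine):

```shell
# Values copied from the `gmirror list p5` output above.
provider_bytes=128849018368   # Mediasize of mirror/p5
consumer_bytes=128849018880   # Mediasize of ada1p5
sector_bytes=512              # Sectorsize

# Number of sectors the mirror gives up on each component.
echo $(( (consumer_bytes - provider_bytes) / sector_bytes ))   # prints 1
```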
Verify the partition exists and matches
Both disks have identical GPT layouts; ada0p5 does exist and matches ada1p5 in size and start:
# gpart show -p ada0
=> 34 1953525101 ada0 GPT (932G)
34 128 ada0p1 freebsd-boot (64K)
162 8388608 ada0p2 freebsd-ufs (4.0G)
8388770 8388608 ada0p3 freebsd-swap (4.0G)
16777378 37748736 ada0p4 freebsd-ufs (18G)
54526114 251658240 ada0p5 freebsd-ufs (120G)
306184354 52428800 ada0p6 freebsd-ufs (25G)
358613154 1509949440 ada0p7 freebsd-ufs (720G)
1868562594 84962541 - free - (41G)
# gpart show -p ada1
=> 34 1953525101 ada1 GPT (932G)
34 128 ada1p1 freebsd-boot (64K)
162 8388608 ada1p2 freebsd-ufs (4.0G)
8388770 8388608 ada1p3 freebsd-swap (4.0G)
16777378 37748736 ada1p4 freebsd-ufs (18G)
54526114 251658240 ada1p5 freebsd-ufs (120G)
306184354 52428800 ada1p6 freebsd-ufs (25G)
358613154 1509949440 ada1p7 freebsd-ufs (720G)
1868562594 84962541 - free - (41G)
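Comparing two listings by eye works, but it can also be scripted. The following is a rough sketch, assuming the `gpart show -p` column order shown above (start, size, name, type). The helper name normalize_layout is hypothetical; it drops the disk-specific part of each partition name so the two disks can be diffed directly.

```shell
# Hypothetical helper: read `gpart show -p <disk>` output on stdin and print
# "pN start size type" for each partition row. The header row ("=> ...")
# and free-space rows (name column is "-") are skipped.
normalize_layout() {
    awk '$1 ~ /^[0-9]+$/ && $3 != "-" {
        name = $3
        sub(/^.*p/, "p", name)      # ada0p5 -> p5
        print name, $1, $2, $4
    }'
}

# On the FreeBSD host:
#   gpart show -p ada0 | normalize_layout > /tmp/ada0.layout
#   gpart show -p ada1 | normalize_layout > /tmp/ada1.layout
#   diff /tmp/ada0.layout /tmp/ada1.layout && echo "layouts match"
```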
Fix: forget stale member, then reinsert
When I first tried to reinsert the missing partition, I hit:
gmirror: Not all disks connected.
That usually means the mirror still remembers a disconnected consumer. The fix is to forget and then re-insert:
# Tell gmirror to forget any disconnected members of p5
gmirror forget p5
# Clear metadata on the partition we're adding (harmless if none present)
gmirror clear /dev/ada0p5 # If it says "Invalid argument", it simply had no metadata
# Insert the missing member using the *instance* name
gmirror insert p5 /dev/ada0p5
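Since these commands change mirror membership, I find it useful to preview them first. Below is a minimal sketch of a wrapper around the same three steps. The function name reinsert_member and the RUN variable are my own invention, not part of gmirror. By default it only echoes the commands; setting RUN to an empty string runs them for real.

```shell
# Hypothetical wrapper around the forget / clear / insert sequence above.
# RUN unset (or RUN=echo): print the commands without running them.
# RUN="": actually run them (as root, on the FreeBSD host).
reinsert_member() {
    mirror=$1       # gmirror instance name, e.g. p5
    provider=$2     # partition to add back, e.g. /dev/ada0p5
    run=${RUN-echo}

    $run gmirror forget "$mirror" || return 1
    # A failure here is expected when the partition carries no metadata.
    $run gmirror clear "$provider" || echo "note: gmirror clear failed on $provider" >&2
    $run gmirror insert "$mirror" "$provider"
}

# Preview:   reinsert_member p5 /dev/ada0p5
# For real:  RUN="" reinsert_member p5 /dev/ada0p5
```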
Result: rebuild starts immediately.
# gmirror status
Name Status Components
mirror/p5 DEGRADED ada1p5 (ACTIVE)
ada0p5 (SYNCHRONIZING, 1%)
Monitoring rebuild progress
Follow the system log (the kernel logs these events with a GEOM_MIRROR prefix, so match case-insensitively):
tail -f /var/log/messages | grep -i mirror
Periodic status:
sh -c 'while :; do clear; date; gmirror status; sleep 5; done'
When finished you should see:
mirror/p5 COMPLETE ada0p5 (ACTIVE)
ada1p5 (ACTIVE)
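If you only want the rebuild percentage rather than the full status table, a small filter does the job. This is a sketch that assumes the "(SYNCHRONIZING, N%)" format shown in the status output earlier; sync_progress is a name I made up.

```shell
# Hypothetical helper: read `gmirror status` output on stdin and print
# "component percent" for each component that is still synchronizing.
sync_progress() {
    awk '/SYNCHRONIZING/ {
        pct = $NF                   # e.g. "1%)"
        gsub(/[^0-9]/, "", pct)     # keep digits only
        print $(NF-2), pct "%"
    }'
}

# On the FreeBSD host, poll until nothing is synchronizing any more:
#   while gmirror status p5 | grep -q SYNCHRONIZING; do
#       gmirror status p5 | sync_progress
#       sleep 30
#   done
```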
Optional: drive health (SMART)
Install the proper tools and check both drives:
pkg install -y smartmontools
smartctl -a /dev/ada0
smartctl -a /dev/ada1
In my case, the key SMART fields were clean (no reallocated or pending sectors, no CRC errors). Power-on hours were high (126,905, or about 14.5 years), so I’ll plan a proactive replacement even though the disk is currently healthy.
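The full smartctl -a report is long, so here is a rough sketch that pulls out just the attributes mentioned above and converts power-on hours to years. It assumes the usual ATA attribute table from `smartctl -A`, with the raw value in the tenth column and the common attribute IDs 5, 9, 197, 198 and 199; some drives report these differently. The function name smart_summary is hypothetical, and 8766 is simply hours per average year (365.25 days).

```shell
# Hypothetical helper: read `smartctl -A /dev/adaX` output on stdin and
# print "ATTRIBUTE_NAME=raw_value" for the attributes of interest.
# For attribute 9 (power-on hours) the value is also shown in years.
smart_summary() {
    awk '$1 ~ /^(5|9|197|198|199)$/ {
        line = $2 "=" $10
        if ($1 == 9)
            line = line sprintf(" (%.1f years)", $10 / 8766)
        print line
    }'
}

# On the FreeBSD host:
#   smartctl -A /dev/ada0 | smart_summary
```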
Takeaways
SMART on ada0 (the side I re-added) is clean: 0 reallocated, pending, or uncorrectable sectors and 0 CRC errors. However, it has ~127,901 power-on hours (~14.6 years) and sits around 52 °C, so although the rebuild completed fine, I should schedule a preemptive replacement.