F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

CPU, memory, fan, process, services status, system log, temperature
Mascott
Posts: 9
Joined: 03 Mar 2023, 03:17


Post by Mascott »

Hi, I've been having some pretty significant issues with my new F5-221 NAS, which I upgraded to 10 GB of memory. I configured the system with 3x16 TB drives in RAID 5 with BTRFS, originally running a version of TOS 5.0. I enabled the snapshot feature and left the options on their defaults, figuring I would configure them later if they caused issues.

After a few days of usage, the system started to act very sluggish, with regularly high power utilization, eventually becoming unusable. The system would no longer respond to connection requests, and the TOS web interface showed only a white screen. It would respond to requests either over Windows SMB or through SSH, and the power button was no longer sending any commands to the machine. I had no choice but to pull the power and try again, with the same results. High utilization from btrfs-cleaner was all that specifically stood out, but something not shown in the list of processes was also trying to use a significant amount of memory. I tried shutting down and rebooting from the TOS web interface, but the system wouldn't respond to this.

After 15-20 minutes, the machine would no longer be accessible from the web interface or load the TOS system, so options could not be configured and saved. (I'm not sure if the options were ever actually saved after this state was entered.) I left the system running for a few days with no change, eventually plugging in a monitor to find that the system had entered a kernel panic state and was deadlocked on memory, allocating 33 GB of memory. The system would also not respond to shutdown and reboot commands entered directly via the console.

I sat on this for a few days and yesterday tried another approach: I pulled the 3x16 TB drives and installed the latest TOS 5.1 on 2x4 TB drives, configured as ext4 in RAID 1. I configured the system, disabling snapshots entirely, and it ran smoothly. I then shut down the system, rebooted, inserted the 3x16 TB drives, and found that while the system did boot into the latest 5.1, no desktop icons appeared, so I couldn't check on the processes from the web interface. When I ran top over a direct connection, I found the same btrfs-cleaner process using high amounts of CPU, though it was not the one using the memory. After ~15 minutes the system was no longer accessible, but drive activity continued. I tried manually killing btrfs-cleaner, but that doesn't seem to have had any effect. I left it running for ~18 hours and found that when it froze it was trying to allocate 99 GB of memory somewhere.

Because I can't access the TOS Control Panel but can keep an SSH session open for a few minutes, what are my options for ensuring that my BTRFS drives are safe, or for configuring the system on the 2x4 TB drives so that the BTRFS drives won't lock it up when I plug them in?
Mascott
Posts: 9
Joined: 03 Mar 2023, 03:17

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by Mascott »

Quick view on what the top output looks like - it's not universally 100% CPU but it is absolutely universally capping out at 100 GiB Memory.
[Image: screenshot of top output]
TMSupport
TerraMaster Team
Posts: 2314
Joined: 13 Dec 2019, 15:15

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by TMSupport »

Btrfs file system defragmentation has been disabled in the new TOS version. You can reinstall TOS with the 3x16 TB drives.
To contact our team, please send email to following addresses, remember to replace (at) with @
Technical team: support(at)terra-master.com (for technical support)
Service team: service(at)terra-master.com (for purchasing, return, replacement, RMA service)
Mascott
Posts: 9
Joined: 03 Mar 2023, 03:17

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by Mascott »

TMSupport wrote: 03 Mar 2023, 15:28 Btrfs file system defragmentation has been disabled in the new TOS version. You can reinstall TOS with the 3x16 TB drives.
Hi, when I boot up and access the system with the 2x4TB Drives installed in addition to the 3x16 TB drives it is running 5.1.24. In addition, I'm not able to reinstall TOS onto those drives when I'm unable to launch any applications on the web interface, as no icons load when accessing it for the first time. Is there a way to do that reinstall through SSH?
Mascott
Posts: 9
Joined: 03 Mar 2023, 03:17

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by Mascott »

Quick update here: I booted the NAS with only the 3x16 TB drives installed and it booted into TOS 5.1.24. However, I'm still not seeing any desktop icons and the system can't be used normally.
homeone.jpg
hometwo.jpg
TMroy
TerraMaster Team
Posts: 2598
Joined: 10 Mar 2020, 14:04
China

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by TMroy »

Your system appears to be running sluggishly. Some process must be consuming all system resources, so the system is too busy to respond. My initial suspicion is that your file system is faulty and is stuck in a file system check that has not been able to complete. Do you know how to log in over SSH and check the running processes from the command line?
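
For anyone unsure how to do this, here is a minimal sketch of checking processes and memory from an SSH session. It assumes a standard Linux userland on the NAS (`top` with batch mode, `head`, `grep`, `/proc/meminfo`). The function names `top_snapshot` and `mem_summary` are made up for illustration and are not TOS tools.

```shell
# Hypothetical helper: one batch-mode pass of top, trimmed to the first lines
# (the header plus the first few processes). The default of 20 is arbitrary.
top_snapshot() {
    top -b -n 1 | head -n "${1:-20}"
}

# Hypothetical helper: print the kernel's own memory counters, which can show
# memory pressure even when no single process appears to own the memory.
# Reads /proc/meminfo unless another file is given as the first argument.
mem_summary() {
    grep -E '^(MemTotal|MemFree|MemAvailable|Committed_AS):' "${1:-/proc/meminfo}"
}
```

Running `top_snapshot; mem_summary` right after login and again a few minutes later gives two points to compare before the session stalls.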
Mascott
Posts: 9
Joined: 03 Mar 2023, 03:17

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by Mascott »

TMroy wrote: 04 Mar 2023, 12:11 Your system appears to be running sluggishly. Some process must be consuming all system resources, so the system is too busy to respond. My initial suspicion is that your file system is faulty and is stuck in a file system check that has not been able to complete. Do you know how to log in over SSH and check the running processes from the command line?
The top output shows that btrfs-cleaner accounts for the high CPU utilization, but no process seems to be responsible for the 99 GB of virtual memory, which is what seems to eventually crash the system. I'm still able to read data off the disks for a period of time after a hard reboot: after the last one I was able to pull ~100 GB off over SMB, and it was not corrupted or affected in any noticeable way. All SMART data reads healthy when hotplugging in the drives.

I don't know if it was the utilization, but trying to run a mkdir command wound up never completing, so it does seem to be having trouble writing to the BTRFS volume. (All commands, including top and dir, take a significant amount of time to complete.)

My thought here is that it originally had something to do with the snapshots feature that BTRFS has - I had enabled it and left the default settings, which are 100 snapshots max, and running once every hour. I started seeing this issue about 4-5 days after the initial installation, which would be when it started trying to delete snapshots. I've disabled snapshots in the new install of 5.1.24, but the snapshots are still there on the drives that were running 5.0.
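
To check whether a snapshot backlog is plausible, here is a rough sketch that counts the snapshots still present on the volume. It assumes the volume is mounted at `/Volume1` and that the installed btrfs-progs supports `btrfs subvolume list -s`, which restricts the listing to snapshots. `count_snapshots` and the `BTRFS_BIN` override are hypothetical names used only for illustration.

```shell
# Hypothetical helper: count snapshots on a mounted btrfs volume.
# `btrfs subvolume list -s` prints one line per snapshot, each starting "ID ".
# BTRFS_BIN (an assumption, handy for dry testing) overrides the btrfs binary.
count_snapshots() {
    volume="${1:-/Volume1}"
    "${BTRFS_BIN:-btrfs}" subvolume list -s "$volume" | grep -c '^ID '
}
```

A count near the 100-snapshot default would be consistent with the cleanup theory, though it would not prove it.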

Because I was able to read from the volume, and I can fully boot into the system when only the 2x4TB ext4 drives are installed, is there a way that I can hotplug and mount the BTRFS drives in a read-only state in order to evacuate the data? That's my main concern at the moment.
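
Here is a minimal sketch of what a read-only mount could look like. It rests on assumptions that are not confirmed for TOS: that the three drives assemble into a single md array and that the btrfs file system sits directly on it. The device `/dev/md0`, the mount point `/mnt/recovery`, and the helper name `ro_mount_plan` are placeholders; `cat /proc/mdstat` or `btrfs filesystem show` should reveal the real device. The helper only prints the commands so they can be reviewed before anything touches the disks.

```shell
# Hypothetical helper: print (do not run) the commands for a read-only mount.
# Both defaults are placeholders, not values confirmed for TOS.
ro_mount_plan() {
    device="${1:-/dev/md0}"
    mountpoint="${2:-/mnt/recovery}"
    echo "mkdir -p $mountpoint"
    # -o ro asks the kernel to mount the file system read-only.
    echo "mount -t btrfs -o ro $device $mountpoint"
}
```

After checking the output, the two lines can be run by hand as root and the data copied off with whatever tool is available (`cp -a`, `rsync`, or SMB as before). If the drives are hotplugged into a running TOS install, TOS may try to mount the volume read-write on its own, which this sketch does not prevent.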
Mascott
Posts: 9
Joined: 03 Mar 2023, 03:17

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by Mascott »

Is there any further help that can be offered on this issue?
TMSupport
TerraMaster Team
Posts: 2314
Joined: 13 Dec 2019, 15:15

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by TMSupport »

Please try to log in to the SSH terminal and disable the btrfs quota.


btrfs quota disable /Volume1
disable_quota.png
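
Here is a small sketch for confirming that the command took effect. It assumes `btrfs qgroup show` exits with an error once quotas are off, which is the usual btrfs-progs behavior but is not confirmed for the TOS build. `quota_state` and the `BTRFS_BIN` override are hypothetical names for illustration.

```shell
# Hypothetical helper: report whether quotas look enabled on a mounted volume.
# Relies on `btrfs qgroup show` failing when quotas are disabled (assumption).
# BTRFS_BIN can point at a different binary; it defaults to `btrfs`.
quota_state() {
    volume="${1:-/Volume1}"
    if "${BTRFS_BIN:-btrfs}" qgroup show "$volume" >/dev/null 2>&1; then
        echo "enabled"
    else
        echo "disabled-or-unreadable"
    fi
}
```

`quota_state /Volume1` printing `disabled-or-unreadable` after the command above is the hoped-for result. The same output also appears if the volume cannot be read at all, so it is worth confirming that the volume is still mounted.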
wonkie
Posts: 9
Joined: 28 Jun 2022, 22:02

Re: F5-221 - High Resource Utilization, System Not Accessible After ~15 Minutes

Post by wonkie »

I had the exact same issue and spent a few weeks trying every suggestion and could never resolve it. I was able to take all of my drives and mount them in Linux and recover my data and move it to a different server. Good Luck.