Thrash-protect - a Linux utility to prevent thrashing

in #linux, 5 years ago

Back in the day, Windows was a rather unstable system that frequently required reboots, while Linux was rock steady. We were proud of it, and we used to brag about our uptimes - I've encountered servers with a handful of years of uninterrupted uptime (though from a sysadmin's point of view it's considered bad practice not to reboot a system from time to time, and today the mantra is that any complex system should achieve uninterrupted uptime through sufficient redundancy and by avoiding single points of failure, not by avoiding reboots at all costs - but I digress).

There is one grievance I've always had with Linux, though: when memory usage gets so heavy that swap is being actively used (which can happen by accident, even with only a small amount of swap space installed), one may end up "thrashing" - all resources are spent moving things between memory and disk, and everything hangs. Technically the computer hasn't crashed, but one never knows whether it will take minutes, hours or years before one can log in again, so a reboot is often the only way out. I'd say it's a design bug, and I can't understand why nobody has bothered to do anything about it.
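
One rough way to see whether a box is actively swapping is to sample the kernel's cumulative swap counters twice and compare. The sketch below assumes the `pswpin` and `pswpout` counters in `/proc/vmstat`; the helper names are made up for illustration, and this is not necessarily how thrash-protect itself detects trouble.

```python
def parse_swap_counters(vmstat_text):
    """Return (pswpin, pswpout) from the text of /proc/vmstat.

    Both are cumulative page counts since boot; missing fields count as 0.
    """
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in ("pswpin", "pswpout"):
            counters[parts[0]] = int(parts[1])
    return counters.get("pswpin", 0), counters.get("pswpout", 0)


def swap_activity(before, after):
    """Pages swapped in and out between two (pswpin, pswpout) samples."""
    return after[0] - before[0], after[1] - before[1]


def sample(path="/proc/vmstat"):
    """Read the current swap counters (Linux only)."""
    with open(path) as f:
        return parse_swap_counters(f.read())
```

Calling `sample()` twice a second or so apart and passing both results to `swap_activity()` gives the number of pages moved in each direction during that interval. Sustained high numbers in both directions at once are a hint that the machine is thrashing.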

Well, eventually I did. First I spent two or three days writing some C code in 1998, offline on my laptop, and almost had something working when the laptop suffered a terrible disk crash. Then, some years ago, I had a situation at work where servers in a cluster were frequently going down due to thrashing. I couldn't allow that to happen, yet turning off swap was not an option, so the project was reborn as a "prototype" in Python. It's a user-space utility (as opposed to kernel-space) that works a bit like ABS brakes - it halts the applications that are swapping the most, for short periods at a time. I could go into the details ... but I've already done so in the README, so there's no point in repeating them here :-) I think thrash-protect is such a great idea that I run it on all boxes I'm responsible for.
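
To give a flavour of the ABS idea without repeating the README, here is a minimal sketch, not the actual thrash-protect code. It assumes that the growth in a process's major page fault count (field 12 of `/proc/<pid>/stat`) is a usable proxy for "swapping the most", and that halting is done with `SIGSTOP` followed by `SIGCONT`. The function names and the half-second pause are hypothetical.

```python
import os
import signal
import time


def parse_majflt(stat_text):
    """Extract the major page fault count from /proc/<pid>/stat text."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so split on the LAST ')' before splitting on whitespace.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 12 (majflt) is rest[9].
    return int(rest[9])


def worst_offender(before, after):
    """Return the pid whose major fault count grew the most, or None.

    `before` and `after` map pid -> majflt from two successive samples.
    """
    deltas = {pid: after[pid] - before[pid] for pid in after if pid in before}
    if not deltas:
        return None
    pid = max(deltas, key=deltas.get)
    return pid if deltas[pid] > 0 else None


def pause_briefly(pid, seconds=0.5):
    """Stop a process for a short while, then always let it continue."""
    os.kill(pid, signal.SIGSTOP)
    try:
        time.sleep(seconds)
    finally:
        os.kill(pid, signal.SIGCONT)
```

A loop around these would sample all processes, pick the worst offender, pause it briefly and repeat, so that no single process is frozen for long while the rest of the system gets room to breathe.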

Contributing to open source is often not very rewarding - one usually gets nothing except complaints about things that don't work as they should - which is actually better than nothing, because it means the software is being used by others. Today I got an email in my inbox that made me happy ...

Date: Sat, 3 Nov 2018 21:17:34 -0400
From: --- [email protected]
To: [email protected]
Subject: your thrash-protect utility

Hi,

I just wanted to say THANK YOU very much for your Thrash-protect utility!

Running on a Fedora 28 workstation install, and oh my God it has made ALL
the difference between pitch black night and blazing blindingly bright
daylight!

I like to browse with lots of pages in open in lots of tabs in more than
one browser, all at once.

And I already had a huge swap partition enabled, but the machine could
never make use of it before an out of control forever thrash-loop occurred
to the point of having no option but to power off (when I would go and
browse a huge bunch of pages at once).

And NOW, with thrash-protect, I haven't seen a single instance of out of
control disk activity happen again, at all, (with a huge number of pages
opened) instead all of the disk activity is now actually useful and always
completes properly!.

Thank YOU ! ! (I don't have much money, But I would be glad to give a
small donation if that would be desirable.)

Linux never really had proper memory management.
As in, the program interface makes each program look like it is running on its own machine with all the memory.

However, not all pages are created equal. There are pages that really should never be moved to swap.

My solution, if ever i get the machinery, is to create a linux processor.
Or, basically, design a microprocessor that would work seamlessly with linux kernels.

In doing so, it would probably be very easy to make this same microprocessor run I/O for the USB/UART, or the disk drives (like the Amiga).

Then, you have this processor that runs the linux kernel, and only the kernel so it never leaves the runtime. It is always instantly available, and can more actively monitor other processes based on usage, while those processes are running.

Further, if i do this right, i can make many processor machines. If these processors can handle actual loads... which is hard to tell, because Mozilla needs a lot of resources... can it run on multiple RISC processors simultaneously, or we still need a large main processor?

But, if the multi-processor setup works, then i would like to array them like chip on glass, or chips on wafers. Make the individual processors simple enough to be made by less than cutting-edge wafer etching machines, and then gang them up in post-production. This may also work out better for heat.

Somehow Steemit had unfollowed you, like it has with others that I didn't want to "unfollow"! Glad to see you're still alive :) I sent you some emails :P
