I should add that this isn’t the first time this has happened, but it is the first time since I reduced the RAM allocation for PostgreSQL in the configuration file. I was sure that was the problem, but I guess not. It had been almost a full week without any usage spikes or service interruptions of this kind, but all of a sudden my RAM and CPU are maxing out again at regular intervals. When this occurs, the instance is unreachable until the issue resolves itself, which seems to take 5-10 minutes.

On the seven-day graph, the usage spikes only started today, and they are far above my idle usage.

I thought the issue was something to do with Lemmy periodically fetching some sort of remote data and slamming the database, which is why I reduced the RAM allocation for PostgreSQL to 1.5 GB instead of the full 2 GB. As you can see in the above graph, my idle resource utilization is really low. Since it’s probably cut off from the image, I’ll add that my disk utilization is currently 25-30%. Everything seemed to be in order for basically an entire week, but this problem showed up again.

Does anyone know what is causing this? Clearly, something is happening that is loading the server more than usual.

  • babbiorsetto@lemmy.orefice.win
    1 year ago

    Here’s an update. I set up atop on my VPS and waited until the issue occurred again. Here’s the atop log from the event.

    ATOP - ip-172-31-7-27   2023/07/22  18:40:02   -----------------   10m0s elapsed
    PRC | sys    9m49s | user  12.66s | #proc    134 | #zombie    0 | #exit      3 |
    CPU | sys      99% | user      0% | irq       0% | idle      0% | wait      0% |
    MEM | tot   957.1M | free   49.8M | buff    0.1M | slab   95.1M | numnode    1 |
    SWP | tot     0.0M | free    0.0M | swcac   0.0M | vmcom   2.4G | vmlim 478.6M |
    PAG | numamig    0 | migrate    0 | swin       0 | swout      0 | oomkill    0 |
    PSI | cpusome  63% | memsome  99% | memfull  88% | iosome   99% | iofull    0% |
    DSK |         xvda | busy    100% | read  461505 | write    171 | avio 1.30 ms |
    DSK |        xvda1 | busy    100% | read  461505 | write    171 | avio 1.30 ms |
    NET | transport    | tcpi    2004 | tcpo    1477 | udpi       9 | udpo      11 |
    NET | network      | ipi     2035 | ipo     1521 | ipfrw     20 | deliv   2015 |
    NET | eth0    ---- | pcki    2028 | pcko    1500 | si    4 Kbps | so    1 Kbps |
    
        PID SYSCPU USRCPU  VGROW  RGROW  RDDSK  WRDSK  CPU CMD            
         41  5m17s  0.00s     0B     0B     0B     0B  53% kswapd0        
          1 21.87s  0.00s     0B -80.0K   1.2G     0B   4% systemd        
      21681 20.28s  0.00s     0B   4.0K   4.2G     0B   3% lemmy          
        435 18.00s  0.00s     0B 392.0K 163.1M     0B   3% snapd          
      21576 17.20s  0.00s     0B     0B   4.2G     0B   3% pict-rs        
    

    The culprit seems to be kswapd0, the kernel’s page-reclaim thread: it used 5m17s of system CPU in the 10-minute sample trying to free memory, even though there is no swap space to move pages to.

    I’ve set the memory swappiness (vm.swappiness) to 0 on the system for now, and I’ll check whether that makes a difference.
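
The figures in the atop sample above can be cross-checked with a little arithmetic. The sketch below is only a sanity check, not a diagnosis: every number is copied from the log, the function names are made up for illustration, and it assumes that atop's `M` and `G` suffixes are binary units (MiB and GiB).

```python
# Figures copied by hand from the atop sample above (one 10-minute interval).
ELAPSED_S = 10 * 60              # header: "10m0s elapsed"
KSWAPD_SYS_S = 5 * 60 + 17       # kswapd0 SYSCPU: "5m17s"
MEM_TOTAL_MIB = 957.1            # MEM tot
VM_COMMITTED_MIB = 2.4 * 1024    # SWP vmcom "2.4G" (assumption: G means GiB)
DISK_READS = 461505              # DSK xvda read
DISK_WRITES = 171                # DSK xvda write
AVG_IO_MS = 1.30                 # DSK xvda avio


def kswapd_cpu_share() -> float:
    """Fraction of the interval that kswapd0 spent in system CPU time."""
    return KSWAPD_SYS_S / ELAPSED_S


def commit_ratio() -> float:
    """Committed virtual memory relative to physical RAM."""
    return VM_COMMITTED_MIB / MEM_TOTAL_MIB


def disk_busy_seconds() -> float:
    """Total time spent servicing I/O: request count times average time per request."""
    return (DISK_READS + DISK_WRITES) * AVG_IO_MS / 1000


if __name__ == "__main__":
    print(f"kswapd0 CPU share: {kswapd_cpu_share():.0%}")
    print(f"committed / physical memory: {commit_ratio():.2f}x")
    print(f"disk busy: {disk_busy_seconds():.1f} s of {ELAPSED_S} s")
```

The first value matches the 53% that atop reports for kswapd0, and the disk time comes out at roughly the full 600 seconds, which agrees with `busy 100%`. The committed memory is about 2.5 times the physical RAM with no swap configured. Committed memory is not the same as memory in use, so this proves nothing on its own, but it would be consistent with the kernel repeatedly dropping and re-reading file-backed pages, which may be why lemmy and pict-rs each show 4.2G of disk reads.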