How to measure memory usage in Linux

Whether you are a system administrator or a developer, sometimes you need to consider the use of memory in GNU/Linux processes and programs. Memory is a critical resource, and when memory is limited and processes consume a lot of RAM, the kernel can run out of memory (OOM). In this state Linux activates an OOM killer kernel process that attempts to recover the system by terminating one or more low-priority processes. Which processes the system kills is unpredictable, so although the OOM killer may keep the server from going down, it can cause problems in the delivery of services that should stay running.
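As a quick illustration, on kernels recent enough to expose these files you can inspect the "badness" score the OOM killer consults when picking a victim, and bias it per process (a sketch, assuming /proc/self paths are readable on your kernel):

```shell
# Inspect the OOM badness score the kernel has assigned to the
# current shell; higher scores are killed first.
cat /proc/self/oom_score

# The adjustment value ranges from -1000 to 1000; writing -1000
# to it (as root) exempts a process from the OOM killer entirely.
cat /proc/self/oom_score_adj
```

Raising a daemon's oom_score_adj makes it a preferred victim; lowering it protects it when memory runs short.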

In this article we'll look at three utilities that report information about the memory used on a GNU/Linux system. Each has strengths and weaknesses, with accuracy being their Achilles' heel. I'll use CentOS 6.4 as my demo system, but these programs are available on any Linux distribution.

ps

ps displays information about active processes and lets you choose which fields to show. For the purposes of this article I'll focus on how to display information about memory usage. ps shows the percentage of memory used by each process or task running on the system, so you can easily identify memory-hogging processes.

Running ps aux shows every process on the system. Typical output looks something like this:

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 19228 1488 ? Ss 18:59 0:01 /sbin/init
root 2 0.0 0.0 0 0 ? S 18:59 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 18:59 0:00 [migration/0]
...
...
root 742 0.0 0.0 0 0 ? S 19:00 0:00 [ext4-dio-unwrit]
root 776 0.0 0.0 0 0 ? S 19:00 0:00 [kauditd]
root 785 0.0 0.0 0 0 ? S 19:00 0:00 [flush-253:0]
root 939 0.0 0.0 27636 808 ? S<sl 19:00 0:00 auditd
root 955 0.0 0.0 255416 1624 ? Sl 19:00 0:00 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
root 1080 0.0 0.1 78720 3272 ? Ss 19:00 0:00 /usr/libexec/postfix/master
postfix 1088 0.0 0.1 78800 3236 ? S 19:00 0:00 pickup -l -t fifo -u
postfix 1089 0.0 0.1 78972 3284 ? S 19:00 0:00 qmgr -l -t fifo -u
root 1090 0.0 0.0 117244 1420 ? Ss 19:00 0:01 crond
root 1103 0.0 0.0 56760 1680 ? Ss 19:00 0:00 login -- root 
root 1105 0.0 0.0 4060 572 tty2 Ss+ 19:00 0:00 /sbin/mingetty /dev/tty2
root 1107 0.0 0.0 4060 576 tty3 Ss+ 19:00 0:00 /sbin/mingetty /dev/tty3
root 1109 0.0 0.0 4060 572 tty4 Ss+ 19:00 0:00 /sbin/mingetty /dev/tty4
root 1111 0.0 0.0 4060 572 tty5 Ss+ 19:00 0:00 /sbin/mingetty /dev/tty5
root 1116 0.0 0.0 4060 568 tty6 Ss+ 19:00 0:00 /sbin/mingetty /dev/tty6
root 1129 0.0 0.0 19400 952 ? Ss 19:01 0:00 /usr/sbin/anacron -s
root 1135 0.0 0.1 108296 1932 tty1 Ss+ 19:14 0:00 -bash
root 1205 0.0 0.0 9116 688 ? Ss 19:15 0:00 dhclient eth0
root 1234 0.0 0.2 97864 3912 ? Ss 19:16 0:00 sshd: root@pts/0 
root 1238 0.0 0.0 108300 1904 pts/0 Ss 19:18 0:00 -bash
root 1283 0.0 0.0 64116 1152 ? Ss 19:20 0:00 /usr/sbin/sshd
root 18990 7.0 0.0 110224 1160 pts/0 R+ 19:32 0:00 ps aux

If you are searching for memory hogs, you probably want to sort the output. The --sort argument takes key values that indicate how you want to order the output. For instance, ps aux --sort -rss sorts by resident set size, which represents the non-swapped physical memory that each task uses. However, RSS can be misleading and may show a higher value than the real one when pages are shared, for example by several threads or by dynamically linked libraries.

You can also sort by vsz, the virtual set size, but it does not reflect the actual amount of memory used by applications; rather, it is the amount of address space reserved for them, which includes the RSS value. You usually won't want to use it when searching for processes that eat memory.

ps aux alone isn't enough to tell you if a process is thrashing, but if your system is thrashing, it will help you identify the processes that are experiencing the biggest hits.
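Putting this together, here are a couple of one-liners for hunting memory hogs with ps (the column list passed to -o is just one reasonable choice, not the only one):

```shell
# Top five memory consumers by resident set size (plus the header line):
ps aux --sort=-rss | head -n 6

# The same ranking, showing only the columns that matter for a
# memory investigation:
ps -eo pid,user,%mem,rss,vsz,comm --sort=-rss | head -n 6
```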

top

The top command displays a dynamic real-time view of system information and the running tasks managed by the Linux kernel. Its memory stats include total, used, and free physical memory and swap, along with the buffer and cache sizes. Type top at the command line to see a constantly updated stats page:

top - 19:56:33 up 56 min, 2 users, load average: 0.00, 0.00, 0.00
Tasks: 67 total, 1 running, 66 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.4%us, 1.7%sy, 0.2%ni, 88.7%id, 5.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 1922680k total, 851808k used, 1070872k free, 19668k buffers
Swap: 4128760k total, 0k used, 4128760k free, 692716k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 
1 root 20 0 19228 1488 1212 S 0.0 0.1 0:01.29 init 
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd 
3 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 
4 root 20 0 0 0 0 S 0.0 0.0 0:00.17 ksoftirqd/0 
5 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0 
6 root RT 0 0 0 0 S 0.0 0.0 0:00.01 watchdog/0 
7 root 20 0 0 0 0 S 0.0 0.0 0:01.27 events/0 
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cgroup 
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khelper 
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 netns 
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 async/mgr 
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 pm 
....

In top memory is mapped as VIRT, RES, and SHR:

  • VIRT is the virtual size of a process, which is the sum of the memory it is actually using, memory it has mapped into itself (for instance a video card's RAM for the X server), files on disk that have been mapped into it (most notably shared libraries), and memory shared with other processes. VIRT represents how much memory the process is able to access at the present moment.
  • RES is the resident size, which is an accurate representation of how much actual physical memory a process is consuming. (This number corresponds directly to top's %MEM column.) This amount will virtually always be less than the VIRT size, since most programs depend on the C library.
  • SHR indicates how much of the VIRT size is actually sharable, so it includes memory and libraries that could be shared with other processes. In the case of libraries, it does not necessarily mean that the entire library is resident. For example, if a program only uses a few functions in a library, the whole library is mapped and counted in VIRT and SHR, but only the parts of the library file that contain the functions being used are actually loaded in and counted under RES.
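Inside an interactive top session you can press Shift+M to sort the task list by %MEM instead of CPU; for scripts, batch mode takes a one-shot snapshot:

```shell
# One non-interactive snapshot of top; the first dozen lines cover
# the summary area plus the top few tasks.
top -b -n 1 | head -n 12
```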

Some of these numbers can be a little misleading. For instance, if you have a website that uses PHP, and in particular php-fpm, you could see something like:

top - 14:15:34 up 2 days, 12:38, 1 user, load average: 0.97, 1.03, 0.93
Tasks: 124 total, 1 running, 123 sleeping, 0 stopped, 0 zombie
Cpu(s): 4.9%us, 0.3%sy, 0.0%ni, 94.6%id, 0.0%wa, 0.0%hi, 0.1%si, 0.1%st
Mem: 1029508k total, 992140k used, 37368k free, 150404k buffers
Swap: 262136k total, 2428k used, 259708k free, 551500k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6695 www-data 20 0 548m 307m 292m S 0 30.6 8:06.55 php-fpm
6697 www-data 20 0 547m 306m 292m S 0 30.4 7:59.64 php-fpm
6691 www-data 20 0 547m 305m 291m S 2 30.4 8:04.96 php-fpm
6689 www-data 20 0 547m 305m 291m S 2 30.3 8:07.55 php-fpm
6696 www-data 20 0 540m 298m 292m S 1 29.7 8:13.43 php-fpm
6705 www-data 20 0 540m 298m 292m S 0 29.7 8:17.24 php-fpm
6699 www-data 20 0 540m 298m 291m S 4 29.7 8:07.39 php-fpm
6701 www-data 20 0 541m 297m 289m S 0 29.6 7:59.87 php-fpm
6700 www-data 20 0 540m 297m 290m S 0 29.5 8:09.92 php-fpm
6694 www-data 20 0 541m 296m 288m S 2 29.5 8:05.18 php-fpm
6707 www-data 20 0 541m 296m 288m S 0 29.5 8:09.40 php-fpm
6692 www-data 20 0 541m 296m 289m S 0 29.5 8:14.23 php-fpm
6706 www-data 20 0 541m 296m 289m S 3 29.5 8:07.59 php-fpm
6698 www-data 20 0 541m 295m 288m S 4 29.4 8:04.85 php-fpm
6704 www-data 20 0 539m 295m 289m S 2 29.4 8:13.58 php-fpm
6708 www-data 20 0 540m 295m 288m S 1 29.4 8:14.27 php-fpm
6802 www-data 20 0 540m 295m 288m S 3 29.3 8:11.63 php-fpm
6690 www-data 20 0 541m 294m 287m S 3 29.3 8:14.54 php-fpm
6693 www-data 20 0 539m 293m 287m S 2 29.2 8:16.33 php-fpm
6702 www-data 20 0 540m 293m 286m S 0 29.2 8:12.41 php-fpm
8641 www-data 20 0 540m 292m 285m S 4 29.1 6:45.87 php-fpm
8640 www-data 20 0 539m 291m 285m S 2 29.0 6:47.01 php-fpm
6703 www-data 20 0 539m 291m 285m S 2 29.0 8:17.77 php-fpm

Is it possible that all these processes use around 30 percent of the total memory of the system? Yes, it is, because they use a lot of shared memory, and this is why you cannot simply add up the %MEM values of all the processes to see how much of the total memory they use.
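To see the overcounting in action, you could naively sum the RSS of every php-fpm worker (a sketch; it assumes a php-fpm pool is running on the machine):

```shell
# Sum the resident set sizes of all php-fpm workers. Because the
# workers share most of their pages, this total counts the same
# physical memory many times over and can even exceed installed RAM.
ps -C php-fpm -o rss= | awk '{ total += $1 } END { print total " kB (overcounted)" }'
```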

smem

While you'll find ps and top in any distribution, you probably won't find smem until you install it yourself. This command reports physical memory usage, taking shared memory pages into account. In its output, unshared memory is reported as the unique set size (USS). Shared memory is divided evenly among the processes that share that memory. The USS plus a process's proportion of shared memory is reported as the proportional set size (PSS).

USS and PSS include only physical memory usage. They do not include memory that has been swapped out to disk.
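A hypothetical worked example makes the arithmetic concrete: imagine two processes that each hold 2,000 kB of unique pages and map the same 10,000 kB shared library (the numbers are invented purely for illustration):

```shell
# Hypothetical figures for one of the two processes.
unique=2000 shared=10000 nprocs=2
echo "USS = ${unique} kB"                         # unique pages only
echo "PSS = $(( unique + shared / nprocs )) kB"   # 2000 + 10000/2 = 7000
echo "RSS = $(( unique + shared )) kB"            # shared counted in full
```

Summing RSS over both processes gives 24,000 kB, while summing PSS gives 14,000 kB, which matches the physical memory actually in use (2,000 + 2,000 + 10,000).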

To install smem under Debian/Ubuntu Linux, type the following command:

$ sudo apt-get install smem

There is no smem package in the standard repository for CentOS or other Red Hat-based Linux distributions, but since smem is a single Python script, you can install it by hand with the following commands:

# cd /tmp
# wget http://www.selenic.com/smem/download/smem-1.3.tar.gz
# tar xvf smem-1.3.tar.gz
# cp /tmp/smem-1.3/smem /usr/local/bin/
# chmod +x /usr/local/bin/smem

Once it's installed, type smem on the command line to get output like this:

PID User Command Swap USS PSS RSS 
1116 root /sbin/mingetty /dev/tty6 0 76 110 568 
1105 root /sbin/mingetty /dev/tty2 0 80 114 572 
1109 root /sbin/mingetty /dev/tty4 0 80 114 572 
1111 root /sbin/mingetty /dev/tty5 0 80 114 572 
1107 root /sbin/mingetty /dev/tty3 0 84 118 576 
939 root auditd 0 336 388 808 
1205 root dhclient eth0 0 564 571 688 
1103 root login -- root 0 532 749 1680 
1090 root crond 0 704 784 1420 
1 root /sbin/init 0 736 813 1488 
1238 root -bash 0 380 856 1924 
1283 root /usr/sbin/sshd 0 676 867 1152 
1135 root -bash 0 392 868 1932 
426 root /sbin/udevd -d 0 948 973 1268 
955 root /sbin/rsyslogd -i /var/run/ 0 996 1069 1628 
1080 root /usr/libexec/postfix/master 0 984 1602 3272 
1089 postfix qmgr -l -t fifo -u 0 1032 1642 3284 
1234 root sshd: root@pts/0 0 1772 2328 3912 
19319 postfix pickup -l -t fifo -u 0 2376 2738 3276 
19352 root python ./smem 0 5756 6039 6416 

As you can see, for each process smem shows four interesting fields:

  • Swap – The swap space used by that process.
  • USS – The amount of unshared memory unique to that process; think of it as unique memory. It does not include shared memory, so it underreports the amount of memory a process uses, but this column is helpful when you want to ignore shared memory. It indicates how much RAM would be freed immediately if the process exited.
  • PSS – This is the most valuable column. It adds to the unique memory (USS) each process's proportion of shared memory: every shared region is divided evenly among the processes sharing it. This gives you an accurate representation of how much physical memory each process is really using, with shared memory truly represented as shared. Think of it as physical memory.
  • RSS – Resident Set Size, the amount of shared plus unshared memory used by each process. If processes share memory, this overreports actual usage, because the same shared memory is counted once for every process that maps it. It is therefore an unreliable number, especially when high-memory processes have many forks.
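A few smem invocations worth knowing, using its standard options:

```shell
# Sort processes by PSS, the most faithful per-process figure:
smem -s pss

# Aggregate usage per user instead of per process:
smem -u

# Report each column as a percentage of physical memory:
smem -p
```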

Now what?

Each of these memory utilities has some pros and cons. ps and top can be useful, but you have to understand what the numbers they show mean. smem is the rookie here, but it shows the most useful information about your programs, and with the -u flag it reports the total memory used by each user, which is handy on multiuser systems.

Now that you have the tools to discover what's eating up your memory, what should you do about it?

If you are a developer and you have found that your program is at fault, that's good news! You can work on the code and use a debugger to find out which function, call, or procedure is using all that memory.

If the process or program that eats up most of your memory is a daemon, such as Apache, MySQL, or nginx, you can search online for information that explains how to tweak the parameters of that daemon to save RAM.

When your uber-optimized Java web app becomes so popular that your server can't serve all your users, sometimes the only thing to do is add more RAM. This should be your last resort, after you have tried everything else. If it comes to that, don't be sad: it means your application is a big success!

Helpful resources

  • Understanding memory usage on Linux
  • OOM Killer
  • Linux memory management
  • Thread about Linux memory


This work is licensed under a Creative Commons Attribution 3.0 Unported License.

Comments

I highly recommend using 'atop' for troubleshooting process resource usage (atop -mM -a 2 for example). 
 
IMO, it is better than 'top/htop' and provides more accurate accounting of process resource usage on the system, in this case memory usage.
Posted @ Monday, September 30, 2013 4:30 PM by RoseHosting.com
The proc virtual filesystem provides lots of information. 
 
# cat /proc/meminfo 
MemTotal: 3957888 kB 
MemFree: 1040908 kB 
Buffers: 82660 kB 
Cached: 648964 kB 
SwapCached: 77220 kB 
Active: 1526980 kB 
Inactive: 925348 kB 
Active(anon): 1087576 kB 
Inactive(anon): 648096 kB 
Active(file): 439404 kB 
Inactive(file): 277252 kB 
Unevictable: 20220 kB 
Mlocked: 20220 kB 
SwapTotal: 7812092 kB 
SwapFree: 7333020 kB 
Dirty: 120 kB 
Writeback: 0 kB 
AnonPages: 1712340 kB 
Mapped: 62260 kB 
Shmem: 11576 kB 
Slab: 299284 kB 
SReclaimable: 258072 kB 
SUnreclaim: 41212 kB 
KernelStack: 4208 kB 
PageTables: 38964 kB 
NFS_Unstable: 0 kB 
Bounce: 0 kB 
WritebackTmp: 0 kB 
CommitLimit: 9791036 kB 
Committed_AS: 4430788 kB 
VmallocTotal: 34359738367 kB 
VmallocUsed: 372848 kB 
VmallocChunk: 34359358456 kB 
HardwareCorrupted: 0 kB 
AnonHugePages: 0 kB 
HugePages_Total: 0 
HugePages_Free: 0 
HugePages_Rsvd: 0 
HugePages_Surp: 0 
Hugepagesize: 2048 kB 
DirectMap4k: 3893248 kB 
DirectMap2M: 217088 kB 
 
And also a detailled memory map of each process: 
 
# cat /proc/25099/maps  
00400000-00462000 r-xp 00000000 08:01 3548969 /usr/bin/xterm 
00661000-00662000 r--p 00061000 08:01 3548969 /usr/bin/xterm 
00662000-0066a000 rw-p 00062000 08:01 3548969 /usr/bin/xterm 
0066a000-0066d000 rw-p 00000000 00:00 0  
007c1000-00874000 rw-p 00000000 00:00 0 [heap] 
7f120cee7000-7f120ceec000 r-xp 00000000 08:01 3542997 /usr/lib/x86_64-linux-gnu/libXfixes.so.3.1.0 
7f120ceec000-7f120d0eb000 ---p 00005000 08:01 3542997 /usr/lib/x86_64-linux-gnu/libXfixes.so.3.1.0 
7f120d0eb000-7f120d0ec000 r--p 00004000 08:01 3542997 /usr/lib/x86_64-linux-gnu/libXfixes.so.3.1.0 
7f120d0ec000-7f120d0ed000 rw-p 00005000 08:01 3542997 /usr/lib/x86_64-linux-gnu/libXfixes.so.3.1.0 
7f120d0ed000-7f120d0f6000 r-xp 00000000 08:01 3543005 /usr/lib/x86_64-linux-gnu/libXcursor.so.1.0.2 
7f120d0f6000-7f120d2f5000 ---p 00009000 08:01 3543005 /usr/lib/x86_64-linux-gnu/libXcursor.so.1.0.2 
7f120d2f5000-7f120d2f6000 r--p 00008000 08:01 3543005 /usr/lib/x86_64-linux-gnu/libXcursor.so.1.0.2 
7f120d2f6000-7f120d2f7000 rw-p 00009000 08:01 3543005 /usr/lib/x86_64-linux-gnu/libXcursor.so.1.0.2 
7f120d2f7000-7f120d5c8000 r--p 00000000 08:01 3545092 /usr/lib/locale/locale-archive 
7f120d5c8000-7f120d5cc000 r-xp 00000000 08:01 786658 /lib/x86_64-linux-gnu/libuuid.so.1.3.0 
7f120d5cc000-7f120d7cb000 ---p 00004000 08:01 786658 /lib/x86_64-linux-gnu/libuuid.so.1.3.0 
... 
 
 
 
Posted @ Tuesday, October 01, 2013 9:58 AM by Stephane
The memory map can also be displayed in a more readable form (with the size) using the pmap tool.  
It can be combined with pgrep to find the process id: 
 
# pmap $(pgrep -f thunderbird) 
19522: /usr/lib/thunderbird/thunderbird 
0000000000400000 72K r-x-- /usr/lib/thunderbird/thunderbird 
0000000000612000 4K r---- /usr/lib/thunderbird/thunderbird 
0000000000613000 4K rw--- /usr/lib/thunderbird/thunderbird 
00007fa31a600000 1024K rw--- [ anon ] 
00007fa31f3f8000 24K r-x-- /usr/lib/x86_64-linux-gnu/libnotify.so.4.0.0 
00007fa31f3fe000 2048K ----- /usr/lib/x86_64-linux-gnu/libnotify.so.4.0.0 
00007fa31f5fe000 4K r---- /usr/lib/x86_64-linux-gnu/libnotify.so.4.0.0 
00007fa31f5ff000 4K rw--- /usr/lib/x86_64-linux-gnu/libnotify.so.4.0.0 
... 
total 1139436K 
Posted @ Tuesday, October 01, 2013 10:11 AM by Stephane
As a sys admin of Linux Freedom I monitor the server with another tool called htop. 
 
If you have a cpu with multiple cores then this will also breakdown the load of each core. 
 
You'll have to download this as it's not installed by default on any distro.
Posted @ Tuesday, October 01, 2013 6:26 PM by Jonathan
What about "free" or "free -m"? Look at +-buffers/cache to see what is really used and available minus buffers and cache.
Posted @ Tuesday, October 01, 2013 7:46 PM by Anon Y Mous
In this article I've not included the command free because it gives only a simple view of the total memory usage. 
 
It's for sure the first command I use when connecting to a server with problems, but it doesn't help when searching for the memory hog :) 
 
On the other hand, the proc filesystem and pmap give a lot of information, and all that detail can be overwhelming. 
 
Thanks for the feedback. 
 
Riccardo
Posted @ Wednesday, October 02, 2013 4:56 AM by Riccardo Capecchi