Found at: republic.circumlunar.space:70/~katolaz/phlog/20190125_sort.txt

   Sorting and counting   
==========================

Imagine that you are administering a small unix system and you want to
know how many processes each user is currently running, with the list
sorted in decreasing order of number of processes. The following
one-liner:

  $ ps aux | cut -d " " -f 1 | tail -n +2 | sort | uniq -c | sort -rn  

does the trick. Let's dissect it to understand how it works.

The command ps(1) can list all the processes currently running on your
system, together with the name of the user to whom each process belongs:

  $ ps aux
  USER      PID  %CPU %MEM    VSZ   RSS TT  STAT STARTED        TIME COMMAND
  root       11 100.0  0.0      0    16  -  RNL  29Nov18 69599:49.33 [idle]
  root        0   0.0  0.0      0   240  -  DLs  29Nov18     0:13.81 [kernel]
  root        1   0.0  0.0   5424   128  -  ILs  29Nov18     0:01.03 /sbin/init --
  root        2   0.0  0.0      0    16  -  DL   29Nov18     0:00.00 [crypto]
   ...........
   ...........
  uwu     30154   0.0  0.8  11680  8004 27  I+   08:27       0:00.03 /usr/local/bin/lua52 /usr/local/bin/telem.lua uwu
  uwu     27175   0.0  0.5   8520  5320 28  Is   06:35       0:00.03 -zsh (zsh)
  uwu     27178   0.0  0.5   8188  5220 28  S+   06:35       0:58.73 lua /usr/local/bin/odlli (lua52)
  $

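That is a fairly long list, but user names appear in the first column,
with other fields separated by (a variable number of) spaces. For the
moment we just need user names, so cut(1) comes in handy. With a single
space as delimiter, the first field is everything that precedes the
first space on each line, which is exactly the user name: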

  $ ps aux | cut -d " " -f 1
  USER
  root
  root
  root
  root
   ...........
   ...........
  uwu
  uwu
  uwu
  $

Notice that the first line contains "USER" which is not a real user name
(it's just part of the header added by ps(1)), so we will need to get
rid of it using the command tail(1):

  $ ps aux | cut -d " " -f 1 | tail -n +2
  root
   ...........
  uwu
  $
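
To try these first two stages without depending on the live process
list, one can feed them a small saved sample. The sketch below is only
an illustration: the helper name users_only and the three-process sample
are made up for this example and are not taken from a real system.

```shell
# Hypothetical helper: keep only the user column of "ps aux"-style text
# read from standard input, and drop the header line.
users_only() {
    cut -d " " -f 1 | tail -n +2
}

# Made-up sample standing in for real "ps aux" output.
sample='USER   PID COMMAND
root     1 /sbin/init
root     2 [crypto]
uwu  27175 -zsh'

# Prints one user name per process: root, root, uwu.
printf '%s\n' "$sample" | users_only
```

On a real system the same helper would be used as "ps aux | users_only".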

Now, each user name appears in that list a number of times equal to the
number of processes currently run by the user. How do we count these
occurrences? The trick is to use sort(1) and uniq(1). The command
sort(1) can sort a file (or a list of lines provided as input), and by
default it enforces a lexicographical order:

  $ ps aux | cut -d " " -f 1 | tail -n +2 | sort 
  _dhcp
  _pflogd
  bbs
  bbs
  bbs
  ben
  ben
  ben
   ...........
  slugmax
  slugmax
  slugmax
  spring
  uwu
  uwu
  uwu
  uwu
  $

The command uniq(1) will remove contiguous repetitions of each line
given on input:

  $ ps aux | cut -d " " -f 1 | tail -n +2 | sort | uniq
  _dhcp
  _pflogd
  bbs
  ben
  cleber
  irc
  katolaz
  leeb
  lntl
  nobody
  postfix
  root
  slugmax
  spring
  uwu
  $
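
Since uniq(1) only collapses lines that are next to each other, the
sort(1) step before it is what makes the result meaningful. A minimal
sketch on made-up input (the three names below are just an example)
shows the difference:

```shell
# Made-up input: "uwu" appears twice, but not on adjacent lines.
names='uwu
root
uwu'

# Without sorting, uniq finds no adjacent duplicates and keeps all
# three lines.
unsorted=$(printf '%s\n' "$names" | uniq)

# After sorting, the two "uwu" lines are adjacent and collapse into one.
sorted=$(printf '%s\n' "$names" | sort | uniq)

printf 'without sort:\n%s\n' "$unsorted"
printf 'with sort:\n%s\n' "$sorted"
```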

Notice that this is just the list of users in the system currently
owning at least one running process, which is not exactly what we were
after. However, the option '-c' of uniq(1) can do the job, since it
counts how many contiguous repetitions of the same line were found:

  $ ps aux | cut -d " " -f 1 | tail -n +2 | sort | uniq -c
     1 _dhcp
     1 _pflogd
     3 bbs
     4 ben
     5 cleber
     1 irc
    22 katolaz
    10 leeb
     3 lntl
     1 nobody
     3 postfix
    56 root
    12 slugmax
     1 spring
     8 uwu
  $

This means that user _dhcp has 1 running process, user cleber has 5
running processes, user root has 56 running processes, and so on. We are
almost there. We just need to sort the resulting list according to the
numbers appearing at the beginning of each line. This is done by using
sort(1) again, with the option '-n':

  $ ps aux | cut -d " " -f 1 | tail -n +2 | sort | uniq -c | sort -n
     1 _dhcp
     1 _pflogd
     1 irc
     1 nobody
     1 spring
     3 bbs
     3 lntl
     3 postfix
     4 ben
     5 cleber
     8 uwu
    10 leeb
    12 slugmax
    23 katolaz
    50 root
  $
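
The reason '-n' matters is that plain sort(1) compares lines character
by character, so "10" and "12" come before "3". The sketch below uses a
made-up, unpadded list of counts (not real uniq(1) output) to show the
two orderings side by side:

```shell
# Made-up counts in the "count name" shape, without leading padding.
counts='10 leeb
3 bbs
12 slugmax'

# Default ordering compares characters: "1..." sorts before "3...".
lexical=$(printf '%s\n' "$counts" | sort)

# With -n the leading number is compared by its numeric value.
numeric=$(printf '%s\n' "$counts" | sort -n)

printf 'sort:\n%s\n' "$lexical"
printf 'sort -n:\n%s\n' "$numeric"
```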

If you want the list to be sorted in descending order of number of
processes, you just need to reverse the ordering, which can be done by
passing the option '-r' to sort(1):

  $ ps aux | cut -d " " -f 1 | tail -n +2 | sort | uniq -c | sort -rn
    50 root
    23 katolaz
    12 slugmax
    10 leeb
     8 uwu
     5 cleber
     4 ben
     3 postfix
     3 lntl
     3 bbs
     1 spring
     1 nobody
     1 irc
     1 _pflogd
     1 _dhcp
  $
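
If the pipeline is needed often, it could be wrapped in a small shell
function. What follows is a sketch under a few assumptions: the name
count_by_user is made up, the sample input is invented, and the final
awk(1) step is an addition that is not part of the one-liner (it only
strips the padding that uniq(1) puts in front of each count, which
differs between implementations).

```shell
# Hypothetical wrapper around the pipeline: reads "ps aux"-style text
# on standard input and prints "count user" lines, largest count first.
count_by_user() {
    cut -d " " -f 1 | tail -n +2 | sort | uniq -c | sort -rn |
        awk '{ print $1, $2 }'   # extra step: normalise the padding
}

# Made-up sample standing in for real "ps aux" output.
sample='USER   PID COMMAND
root     1 /sbin/init
uwu  27175 -zsh
root     2 [crypto]
irc    812 ircd
uwu  27178 lua
root    11 [idle]'

printf '%s\n' "$sample" | count_by_user
```

On a live system the function would be fed directly from ps(1), as in
"ps aux | count_by_user".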

This is the one-liner we had at the beginning of this post. The result
indicates that I should probably close some of the screens I am not
using... :P

  -+-+-+-

Most of the tools we have seen here were forged by the ancient dwarven
blacksmiths at Murray Hill, in the Eastern Lands, and have survived
pretty much unmodified in the unix environment for ages. In particular:

sort(1) appeared in UNIXv2 (March 1972)
uniq(1) appeared in UNIXv3 (February 1973)
tail(1) appeared in UNIXv7 (January 1979)

Some other tools, instead, were created in the Eastern Lands and
readjusted and perfected by the sapient master craftsmen of the West. In
particular: 

ps(1)   appeared in UNIXv4 (November 1973), although the syntax for
        options that we have used here comes from early versions of 
        BSD2.x (ca 1979-1980)

