I run a VPS that hosts numerous sites. While trying to pin down the root cause of sporadic hard lockups and runaway memory usage, I settled on a somewhat inefficient (yet very handy) one-liner for watching page requests across every site as they come in. I run it inside a tmux session over a PuTTY SSH connection; two other panes run iftop (network traffic) and watch --interval=0.1 iostat -m (real-time disk I/O).
An aside: tmux is like screen on speed: way more extensible and SO MUCH EASIER TO USE. I highly recommend you give it a try if you're a command-line warrior. There are some very useful tutorials to help you get up to speed; google "tmux tutorial". Hawk Host's two-parter has some good stuff in it.
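For anyone wanting to recreate the three-pane layout, here's a rough sketch. The function name start_monitor and the default session name are made up for illustration, and it assumes tmux, iftop and iostat are all installed (iftop usually needs root).

```shell
# Hypothetical helper: start a detached tmux session with three stacked panes.
start_monitor() {
    session="${1:-monitor}"   # session name is an arbitrary choice

    # Pane 1: a plain shell, left free for the log-watching command.
    tmux new-session -d -s "$session"

    # Pane 2: live network traffic.
    tmux split-window -t "$session" 'iftop'

    # Pane 3: disk I/O in MB/s, refreshed every 0.1 seconds.
    tmux split-window -t "$session" 'watch --interval=0.1 iostat -m'

    # Even out the pane heights, then attach to the session.
    tmux select-layout -t "$session" even-vertical
    tmux attach-session -t "$session"
}
```

Run start_monitor (optionally with a session name) from a normal shell, not from inside an existing tmux session.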
To accomplish this I'm taking advantage of the fact that DirectAdmin (which by default provides a base of Apache 2, MySQL and PHP 5) stores its httpd access logs in a common folder: /var/log/httpd/domains/<virtualhost>.log. I'm combining the tail command with grep and a pattern that matches any one of several strings. It's not perfect: I occasionally have to Ctrl+C and restart the command when it stalls, but it does everything I need.
Here's my command:
tail /var/log/httpd/domains/*.log -f -n 50 | grep -e "GET / HTTP/1\|GET /2011/\|GET /2010/\|GET /2009/\|GET /2012/\|.php HTTP/1\|.html HTTP/1\|.mp3 HTTP/1"
To break it down:
- tail = invoke tail
- /var/log/httpd/domains/*.log = read all files ending in .log from the path /var/log/httpd/domains/ (a relative path, or none at all, could be used if you invoke the command in a closer folder)
- -f = follow the files: keep them open and print new lines live as they're appended
- -n 50 = read the last 50 lines from each file (I recommend you set a large scrollback buffer if you want to specify more!)
- | = pipe symbol, used to feed one command's output into another - in this case, to perform further processing on tail's raw output
- grep = invoke grep
- -e = treat the next argument as the search pattern (it's the capital -E that makes grep behave like "egrep")
- the long string inside double quotes (actually several match strings, separated with escaped pipe characters) = only display lines containing at least one of these strings
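Notes: When giving grep several alternative match strings in one pattern, you need a backslash directly before each pipe symbol: \|. By default grep uses basic regular expressions, where a bare | is just a literal pipe character and the escaped form \| means "or" (strictly speaking, a GNU grep extension). If you leave the backslash off, the strings either side of the pipe get glued into one long string containing a literal |, which will never match (derp). With grep -E (egrep) it's the other way round: a bare | means "or". Also note that an unescaped . matches any character, so write \.php if you want to insist on a literal dot.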
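To see the two pipe styles side by side, here's a small self-contained test. The log lines are invented for illustration (real access-log lines are longer), and the \| form assumes GNU grep.

```shell
# Made-up sample lines standing in for access-log entries.
sample='1.2.3.4 - - "GET / HTTP/1.1" 200
1.2.3.4 - - "GET /style.css HTTP/1.1" 200
1.2.3.4 - - "GET /2011/05/post/ HTTP/1.1" 200
1.2.3.4 - - "GET /index.php HTTP/1.1" 200'

# Basic regex (grep's default): the escaped \| means "or".
bre_matches=$(printf '%s\n' "$sample" | grep -e 'GET / HTTP/1\|GET /2011/\|\.php HTTP/1')

# Extended regex (-E): a bare | means "or", so no backslash before the pipes.
ere_matches=$(printf '%s\n' "$sample" | grep -E 'GET / HTTP/1|GET /2011/|\.php HTTP/1')

# Both variables hold the same lines; only the style.css request is dropped.
printf '%s\n' "$ere_matches"
```

Either style works; -E just saves typing backslashes once the pattern gets long.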
This accomplishes exactly what I need on this box: I can see requests to site roots, requests to WordPress-based sites running with rewritten URLs (see the year-based URLs), plus any PHP, HTML or MP3 files. You can expand upon this to your heart's content, but the command as given should work quite nicely. You *will* have to do further work on the pattern if you want to match sites whose rewritten URLs contain nothing to show that a request is for a page of content, but that's beyond the scope of this wee article. Hope you find it useful!
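As one way of expanding on it, here's a sketch that folds the year and file-extension matches into a single extended regex and wraps the whole thing in a function. The names pattern, watch_requests and log_dir are made up for illustration. The -F and --line-buffered options are only guesses at reducing the stalls (for example if log rotation is the culprit); I haven't confirmed what actually causes them.

```shell
# A tighter take on the original pattern, as one extended regex.
# 20(09|1[0-2]) covers 2009 to 2012; \. insists on a literal dot.
pattern='GET / HTTP/1|GET /20(09|1[0-2])/|\.(php|html|mp3) HTTP/1'

# Hypothetical wrapper; log_dir defaults to the DirectAdmin log folder.
watch_requests() {
    log_dir="${1:-/var/log/httpd/domains}"
    # -F follows by file name, so a rotated log gets re-opened.
    # --line-buffered makes grep flush every matching line straight away.
    tail -n 50 -F "$log_dir"/*.log | grep --line-buffered -E "$pattern"
}
```

Call watch_requests on its own for the default folder, or pass a different log directory as the first argument. Stop it with Ctrl+C as before.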
---
Related reading:
Dayid's screen and tmux cheat sheet (Chris: VERY USEFUL!)
http://www.dayid.org/os/notes/tm.html
Hawk Host's tmux article: Part 1 | Part 2 (Chris: VERY USEFUL!)
Mutelight: Practical tmux
http://mutelight.org/articles/practical-tmux
Googly-oogly...
http://www.google.co.uk/search?q=connect+to+defunct+tmux+session
SU: Why do I have multiple tmux processes?
http://superuser.com/questions/259154/why-do-i-have-multiple-tmux-processes