Optimizing Performance Notes

Posted by John
on Sunday, 16 November 2008

Two weeks ago I attended a conference on website performance at the Sun UK offices. It was quite an eye-opener; here's a bit of what I learnt.

Sharding

When your site's performance is lacking, the usual idea is to upgrade the hardware: buy bigger and more powerful servers. However, another approach is to use a technique called sharding.

Basically you analyze which users pull the biggest load off your database and separate them into shards, so rather than User A hitting the main server they hit a shard of that data server, and the same goes for User B. In essence you move the people requesting the most out to individual boxes rather than the main one. Employ an army of cheap Linux boxes to create this modified farm, and balance the data requests strategically so the boxes are evenly hit.

It's a federated model: groups of users are stored together on their own shard boxes.

So if one box goes down, the others still operate. The work is shared out among your virtual server farm, you get more write performance and you reduce the bottleneck. You also work out where your main draw is and share that out, so one guy isn't left doing all the work.

There are some disadvantages in going this way but it's a good start in solving a potential problem as your site grows.
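
To make the routing idea concrete, here's a minimal sketch. Note it uses plain hash-based routing, which is simpler than analyzing who your heaviest users are, and everything in it is an assumption for illustration: the four shard host names, the shard_for_user helper and the choice of cksum as a cheap hash.

```shell
#!/bin/sh
# Hypothetical shard hosts; replace with your own boxes.
SHARDS="db-shard-0 db-shard-1 db-shard-2 db-shard-3"

# Print the shard host that owns the given user key.
# cksum gives a stable CRC, so the same user always maps to the same shard.
shard_for_user() {
    user=$1
    count=$(echo $SHARDS | wc -w)
    hash=$(printf '%s' "$user" | cksum | cut -d' ' -f1)
    index=$((hash % count))
    # Walk to the chosen entry in the shard list.
    set -- $SHARDS
    shift "$index"
    echo "$1"
}

shard_for_user alice
shard_for_user bob
```

The catch with this simple scheme is that changing the number of shards remaps most users, which is one of the disadvantages to plan for.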

Load Balancing

You can also employ Linux's Native Kernel Load Balancer to help (Google this), plus there are two packages available for the O/S to help in this area:

Clustering

Clustering may also be a good thing to look at. It's like sharding but simpler: you build a farm of servers and load balance the users across them. The first user hits box 1, the next user hits box 2, and so on; once each box has a user you go back to box 1 and add another, balancing the load out (round-robin).
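
The box 1, box 2, back to box 1 rotation can be sketched in a few lines of shell. This is only an illustration of the idea, not how a real load balancer is configured; the box names and the assign_box helper are made up.

```shell
#!/bin/sh
# Hypothetical pool of web boxes.
BOXES="box1 box2 box3"
next=0   # index of the box that gets the next user

# Print the box for the next user and advance the counter.
# Call it directly (not inside $(...)) so the counter update persists.
assign_box() {
    set -- $BOXES
    count=$#
    shift "$next"
    echo "$1"
    next=$(( (next + 1) % count ))
}

for user in ann ben cat dan; do
    printf '%s -> ' "$user"
    assign_box
done
```

With three boxes and four users, the fourth user wraps around to box1 again.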

CDN

Content Delivery Network: if you've got enough money in the budget, you might be able to employ a company like Akamai or BitGravity to serve your videos and media to customers, rather than your boxes taking the brunt of this less complex work.

Amazon is also offering a new service in this area, currently in beta. But be warned: none of these services are cheap if you've got a big hit count.

General Stuff

You could also try to make friends with your hoster. Who knows, you might get along well and they might even take an interest in what you're doing, helping out along the way with extra bandwidth or advice.

Pick your quick wins and work on those first. If you can get the performance up with a couple of hours focusing on the rough edges of your site, that'll give you breathing room to focus on the core before things get difficult.

Latency kills Facebook apps. The majority of their users are US-based, so it helps to have a box in their country so the traffic doesn't have to trek across the oceans to grab your data. Think of all those Doom deathmatches you played as a kid; when the server was closer to home it really helped in getting that high score.

Master-slave relationships are simple to implement and maintain; multi-master is a headache.

SATA drives are bad for random writes; SAS (Serial Attached SCSI) discs are much better suited to this. Grab some 2.5" 15k ones.

Make sure you set an Expires header on the data you serve. This makes it cacheable, reducing the time it takes to grab your HTML, gif and jpg files from your box. If your style sheets don't change much, you can use this header to tell your clients' browsers not to grab a new copy on every request.
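
A quick way to see whether a response actually carries the header is to pull it out of the response headers. Here's a small sketch; expires_header is a hypothetical helper of mine that reads raw HTTP headers on stdin.

```shell
#!/bin/sh
# Print the value of the Expires header found in HTTP headers on stdin.
# Returns non-zero if no Expires header is present.
expires_header() {
    # Header names are case-insensitive; drop the carriage returns first.
    value=$(tr -d '\r' | sed -n 's/^[Ee][Xx][Pp][Ii][Rr][Ee][Ss]:[[:space:]]*//p')
    if [ -n "$value" ]; then
        echo "$value"
    else
        return 1
    fi
}
```

Assuming curl is installed, you could feed it a live response with something like curl -sI http://example.com/style.css | expires_header, where the URL is just a placeholder for one of your own files.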

Also, if you've got abundant memory, consider employing RAM disks to keep the most-used apps in memory rather than constantly paging to disk.

Beginning NGINX

Posted by John
on Wednesday, 06 February 2008

In this series of articles I'll explain how to install and set up the super light and fast NGINX web server on your Linux box and get it to host Rails apps, and maybe a little extra.

First off let's install NGINX

To install the latest copy of NGINX you're gonna need to build from source, so make sure you install the build-essential package (which includes gcc). To do this run...

sudo aptitude install build-essential

Now you've got the GCC compiler installed you can build from source, so let's download the latest copy of NGINX...

First some dependencies,

sudo aptitude install libpcre3 libpcre3-dev libpcrecpp0 libssl-dev zlib1g-dev

Now the bad boy himself,

mkdir -p ~/sources
cd ~/sources/
wget http://sysoev.ru/nginx/nginx-0.5.35.tar.gz
tar -zxvf nginx-0.5.35.tar.gz
cd nginx-0.5.35/

Now configure the source,

./configure --sbin-path=/usr/local/sbin --with-http_ssl_module

It'll finish with a summary of locations like,

nginx path prefix: "/usr/local/nginx"
nginx binary file: "/usr/local/sbin"
nginx configuration file: "/usr/local/nginx/conf/nginx.conf"

Write these down before you continue, very important!

Now build,

make

and install,

sudo make install

Running NGINX

As those last summary lines told us, nginx lives in -> /usr/local/sbin/nginx, so let's go start it,

sudo /usr/local/sbin/nginx

Now if you navigate to your box's IP address you should see a fancy 'welcome to nginx' message. Wahey! You have it installed.

Final part, startup scripts

I'm very thankful to PickledOnion over at Slicehost.com for providing the next script.

First off let's create an init script so we can start it more nicely and NGINX will start on reboot, so...

sudo nano /etc/init.d/nginx

And now copy & paste this init script into nano...

#! /bin/sh

### BEGIN INIT INFO
# Provides:          nginx
# Required-Start:    $all
# Required-Stop:     $all
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: starts the nginx web server
# Description:       starts nginx using start-stop-daemon
### END INIT INFO

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/local/sbin/nginx
NAME=nginx
DESC=nginx

test -x $DAEMON || exit 0

# Include nginx defaults if available
if [ -f /etc/default/nginx ] ; then
        . /etc/default/nginx
fi

set -e

case "$1" in
  start)
        echo -n "Starting $DESC: "
        start-stop-daemon --start --quiet --pidfile /usr/local/nginx/logs/$NAME.pid \
                --exec $DAEMON -- $DAEMON_OPTS
        echo "$NAME."
        ;;
  stop)
        echo -n "Stopping $DESC: "
        start-stop-daemon --stop --quiet --pidfile /usr/local/nginx/logs/$NAME.pid \
                --exec $DAEMON
        echo "$NAME."
        ;;
  restart|force-reload)
        echo -n "Restarting $DESC: "
        start-stop-daemon --stop --quiet --pidfile \
                /usr/local/nginx/logs/$NAME.pid --exec $DAEMON
        sleep 1
        start-stop-daemon --start --quiet --pidfile \
                /usr/local/nginx/logs/$NAME.pid --exec $DAEMON -- $DAEMON_OPTS
        echo "$NAME."
        ;;
  reload)
        echo -n "Reloading $DESC configuration: "
        start-stop-daemon --stop --signal HUP --quiet --pidfile /usr/local/nginx/logs/$NAME.pid \
                --exec $DAEMON
        echo "$NAME."
        ;;
  *)
        N=/etc/init.d/$NAME
        echo "Usage: $N {start|stop|restart|reload|force-reload}" >&2
        exit 1
        ;;
esac

exit 0

Yes it's a monster, I've copied it over to my server so you can grab it at..

Now let's use it with...

sudo chmod +x /etc/init.d/nginx
sudo /usr/sbin/update-rc.d -f nginx defaults

You should now see...

Adding system startup for /etc/init.d/nginx ...
   /etc/rc0.d/K20nginx -> ../init.d/nginx
   /etc/rc1.d/K20nginx -> ../init.d/nginx
   /etc/rc6.d/K20nginx -> ../init.d/nginx
   /etc/rc2.d/S20nginx -> ../init.d/nginx
   /etc/rc3.d/S20nginx -> ../init.d/nginx
   /etc/rc4.d/S20nginx -> ../init.d/nginx
   /etc/rc5.d/S20nginx -> ../init.d/nginx

Now NGINX will start up on reboot and you can run these commands to control it better.

Start

sudo /etc/init.d/nginx start

Stop

sudo /etc/init.d/nginx stop

Restart

sudo /etc/init.d/nginx restart

Next up I'll put together the nginx scripts I use myself, which should help you out a lot when hosting your site with this great tool.

Final Note

On a later note, there may be times when you change your NGINX .conf files and restart NGINX, yet it doesn't seem to have picked up your latest config changes. I get this myself sometimes.

What you can do is brute-force kill the NGINX process and restart it. First run...

ps aux | grep nginx

which will list the NGINX processes along with their process ids. Then kill the master process (sudo is needed because it was started as root) with...

sudo kill [processid]

and now start NGINX from fresh,

sudo /etc/init.d/nginx start
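
One possible cause of a restart not picking up your changes is a mistake in the config itself, so a cautious habit is to test the config before restarting. Here's a minimal sketch, assuming the binary and init script paths used in this article. check_and_restart is a hypothetical helper name of mine; nginx -t is the binary's built-in config test.

```shell
#!/bin/sh
# Paths from this article; override them if your install differs.
NGINX_BIN=${NGINX_BIN:-/usr/local/sbin/nginx}
INIT_SCRIPT=${INIT_SCRIPT:-/etc/init.d/nginx}

# Restart nginx only if the config passes nginx's own syntax test.
check_and_restart() {
    if "$NGINX_BIN" -t; then
        "$INIT_SCRIPT" restart
    else
        echo "Config test failed, not restarting." >&2
        return 1
    fi
}
```

Save it to a file, source it, then run check_and_restart as root. If the test fails, nginx reports what it didn't like and the running server is left alone.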

.Conf Templates

For your info, and especially for those of you who already have working NGINX setups, I've put example config files in my downloads area, with direct links here....

Happy Anniversary, We're 5 years old!

Posted by John
on Wednesday, 12 September 2007

Looking back over the server logs it suddenly dawned on me: my pride and joy is now 5 years old. Party!!!

That’s some History…

It’s been a rollercoaster ride. This site’s first incarnation appeared on a very early version of Blogger, before Google bought them out. Then it ran under the guise of wolfsclaw.com on a UK hoster running PHP 4 and Textile, then moved to TextDrive with PHP 5 and WordPress for a while, then over to DreamHost, then MediaTemple Grid-Server → full DV box.

Now finally it’s sitting on its own custom Linux VPS server under Xen virtualisation, running Ruby on Rails with professional-grade deployment and load-balancing nodes. All made possible by SliceHost.

It’s been quite a ride but I’ve enjoyed it and have no intention of letting up just yet.

Why do it?

I originally started the blogging idea as a means of putting myself out there, finding new friends, colleagues and quasi-self-promotion. But it’s thankfully grown away from that and now whenever I solve something difficult or write up some crackin’ solution it goes here.

I work on the notion that if it’s helpful to me then it’s probably helpful to someone else out there, and if I end up making someone’s day then that’s thanks enough.

Still no Ads!

Yep, I’m still sticking with my policy of no ads or annoying popups on the site, financing it out of my own pocket for the greater good.

Hackers go Bye Bye!

And having had no major break-ins or takedowns in its history, life’s been mighty nice.

I’ve implemented load balancing on the web server to keep the Digg effect from taking the site down, and I have a nice pool of bandwidth and processor resources, plus nightly disk imaging, to keep things secure and bullet-proof.

Numbers?

If you’re interested, the logs suggest around 3,000 visitors a day, averaging 60,000 a month. Google Analytics tracks the details for me and sends a graphical chart each month; so far 200 goal achievements per month can’t be bad!

Looking forward to the next 5!

Take care all,

‘Goodnight and Good Luck’