Sunday, February 8, 2009

introduction

I'm Anthony, and this is my scalability blog. I have a few projects I want to write about as I work on them - maybe I can connect with other people doing similar work, and we can help each other solve problems.

I am primarily interested these days in building web-scale distributed systems using pure Ruby, MySQL, and Memcached. I have two big opportunities to push the scalability envelope:

1) I am tasked at the company I work for (http://www.mogreet.com/) with building out and scaling up the SMS/MMS video delivery platform I architected and wrote (with a lot of help from Blake, and some pointers from Aber). This involves buying a lot of hardware - servers, databases, load balancers - to run more of the existing secret_sauce nodes; secret_sauce is the Mogreet application server and messaging gateway, written in pure Ruby. Over the next few months I'll be designing a setup for redundant, high-availability MySQL servers, and working out an application-server-level load balancing solution, perhaps from the nice folks at F5.

2) My personal project, proceeding slowly on weekends, is the p-server. The p-server is a different architectural approach to scaling up a pure Ruby app server, although it could also be deployed as a cloud of cheap nodes. The basic idea of the p-server is to move the relevant portions of the web server into the app server - rather than vice versa - and use Ruby threads to manage a large number (1,000? 10,000? 100,000?) of concurrent HTTP requests, which will generally be RESTful queries that the p-server parses and responds to. The p-server is meant to be a real-time game server, so it has to handle large numbers of concurrent players who are all interacting with the system via different user interfaces: social network apps, web pages, iPhones, etc.
The p-server has a producer-consumer thread model, where each incoming request is passed to a consumer thread, which runs for a limited number of passes or a limited period of time.
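Here's a rough sketch of the idea (not the real p-server code - the port, pool size, pass limit, and process_pass handler are just placeholders): a plain TCPServer accept loop acts as the producer, and a fixed pool of worker threads act as the consumers.

require 'socket'
require 'thread'

PORT        = 8080    # assumed listen port
NUM_WORKERS = 50      # assumed size of the consumer thread pool
MAX_PASSES  = 100     # assumed cap on passes per request

queue = Queue.new

# Hypothetical stand-in for the p-server's per-pass request logic;
# returns true when the request is finished.
def process_pass(request_line)
  true
end

# Consumers: long-lived threads pull sockets off the queue and work on
# each request for at most MAX_PASSES passes.
NUM_WORKERS.times do
  Thread.new do
    loop do
      client = queue.pop                    # blocks until a request arrives
      request_line = client.gets            # e.g. "GET /players/42 HTTP/1.1"
      MAX_PASSES.times { break if process_pass(request_line) }
      client.print "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok"
      client.close
    end
  end
end

# Producer: the accept loop lives inside the app server and hands every
# new connection to the queue, instead of going through a separate web server.
server = TCPServer.new(PORT)
loop { queue << server.accept }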
These consumer threads reuse connections to any number (n) of MySQL databases, each wrapped in a Memcached distributed memory cache. Initial tests (on a laptop) show that the prior architecture I developed for secret_sauce can handle about 20 connections a second per node when performing logic that involves database access (and it is massively distributed); the new p-server architecture handles 400-500 connections a second. And there is plenty of room for improvement: a faster Memcached client, keep-alive support, etc.
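And a rough sketch of how a consumer thread might wrap a MySQL lookup in Memcached - this assumes the memcache-client and mysql gems, and the hosts, credentials, players table, and key scheme are made up for illustration:

require 'memcache'
require 'mysql'

CACHE = MemCache.new(['10.0.0.1:11211', '10.0.0.2:11211'])              # assumed cache hosts
DB    = Mysql.real_connect('db-host', 'game_user', 'secret', 'game_db') # assumed credentials

def find_player(player_id)
  key = "player:#{player_id}"
  cached = CACHE.get(key)
  return cached if cached                   # cache hit: no database round trip
  row = DB.query("SELECT * FROM players WHERE id = #{player_id.to_i}").fetch_hash
  CACHE.set(key, row, 300)                  # cache miss: keep it warm for 5 minutes
  row
end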
My plan is to make the p-server available to other people as open source software as soon as it's together enough, and documented enough, to be useful. The p-server relies on MySQL and Memcached, so I'll probably put it all together in a little software distro, compiled for Mac OS X and FreeBSD, to make it easy for people to try out.
