How I made this site

Over the past week I've been writing a web server. I'm going to be honest: it's not great, and I wouldn't recommend that you use it, but it's mine, and this blog post details how it works.

When I was designing swebs (a Simple WEB Server) I had a few constraints in mind:

  • It must work
  • It shouldn't be vulnerable to slowloris attacks
  • It should be at least bearably performant

A slowloris attack is a type of denial-of-service attack where you open many, many connections to a web server, each one sending data extremely slowly. The web server may then allocate resources for each individual connection, eventually hitting a limit so that other users can't view the site. The Apache web server is particularly vulnerable to this, as each new connection spawns a new thread. Without going into too much detail about how operating systems work, threads are computationally expensive to create and run, so having hundreds of them on a server can really hurt its performance.

The solution that nginx (pronounced engine X) uses, and which I decided to steal, is to just keep track of all the connections yourself. Threads are handled by the operating system, and thread creation is very expensive: structures have to be set up and the kernel has to be notified, all of which can be circumvented by just keeping track of the current connections in memory. There are several runner threads to utilize all of the available CPU power, plus a main thread whose only job is to accept connections. Whenever it accepts a connection, it tells the least busy thread about it, and that thread handles all requests from that connection.

In the main thread:

//NOTE: This is old code, not used anymore. It is also released under the
//GNU General Public License v3.

int *pending = calloc(processes - 1, sizeof(int));
if (pending == NULL)
	exit(EXIT_FAILURE);
//Each runner thread has an id, and pending[id] shows the number of connections
//that thread is handling

int *schedule = malloc(2 * sizeof(int));
if (schedule == NULL)
	exit(EXIT_FAILURE);
schedule[0] = -1;
//schedule[0] is the thread that should take the next connection
//schedule[1] is the file descriptor of the next connection

pthread_t *threads = malloc(sizeof(pthread_t) * (processes - 1));
if (threads == NULL)
	exit(EXIT_FAILURE);
for (int i = 0; i < processes - 1; i++) {
	RunnerArgs *args = malloc(sizeof(RunnerArgs));
	if (args == NULL)
		exit(EXIT_FAILURE);
	args->site = site;
	args->pending = pending;
	args->schedule = schedule;
	args->id = i;
	pthread_create(threads + i, NULL,
	       (void*(*)(void*)) runServer, args);
}
//create the runner threads

for (;;) {
	fsync(fd);
	//I have no idea how this works, but for some reason the server broke
	//when I didn't have this
	if (schedule[0] == -1) {
		int newfd = accept(fd, (struct sockaddr *) &addr,
				       &addrlen);
		//accept a connection
		if (newfd < 0)
			exit(EXIT_FAILURE);
		int flags = fcntl(newfd, F_GETFL);
		if (fcntl(newfd, F_SETFL, flags | O_NONBLOCK))
			exit(EXIT_FAILURE);
		//make it nonblocking
		int lowestThread = 0;
		int lowestCount = pending[0];
		for (int i = 1; i < processes - 1; i++) {
			if (pending[i] < lowestCount) {
				lowestThread = i;
				lowestCount = pending[i];
			}
		}
		//get the least busy thread
		schedule[1] = newfd;
		schedule[0] = lowestThread;
		//give that thread the connection
	}
}

In the runner threads:

//Again, this is old code released under the GNU General Public License v3.

void *runServer(RunnerArgs *args) {
	Sitefile *site = args->site;
	//the website layout is stored within a Sitefile
	int *pending = args->pending;
	int *schedule = args->schedule;
	int id = args->id;

	Connection *connections = NULL;
	Connection *last = NULL;
	//Connections are processed in a queue, which is really just a linked
	//list where we add to the end and read from the beginning.
	for (;;) {
		if (schedule[0] == id) {
			//If a connection is available to accept
			Connection *newconn = newConnection(schedule[1]);
			assert(newconn != NULL);
			//get that connection

			if (last == NULL)
				connections = newconn;
			else
				last->next = newconn;
			last = newconn;
			//add it to the queue
			pending[id]++;
			schedule[0] = -1;
			//and increase the pending connections
		}

		Connection *prev = NULL;
		Connection *iter = connections;
		while (iter != NULL) {
			//go through all the connections
			if (updateConnection(iter, site)) {
			//updateConnection() will get all new data from that
			//connection and process it. It returns 1 (true) if the
			//connection should be killed
				if (iter == last)
					last = prev;
				Connection *old = iter;
				iter = iter->next;
				freeConnection(old);
				if (prev == NULL)
					connections = iter;
				else
					prev->next = iter;
				pending[id]--;
				//if that connection should be killed, kill it.
			}
			else {
				prev = iter;
				iter = iter->next;
			}
		}
	}
	return NULL;
}

You can see that the only threads being created are the runner threads. Everything else is handled manually. There is a problem with this code, though, and it has to do with performance.

In the main thread, most of the time nothing gets done. When someone creates a new connection to the web server, it "wakes up" with the accept() system call, handles it, then goes back to sleep. In the runner threads, however, even when no work is getting done at all, the threads are still constantly checking to see if there's more work to do. This means that whatever CPU the runner thread is running on will be operating at 100% load 100% of the time, even if nothing's happening. The solution to this is to somehow wait until either a new connection is available or until an existing connection has sent data.

In UNIX systems, the poll() system call waits until any one of several file descriptors is available for reading. In layman's terms, it means that we can wait until any of our connections has data to read. This still doesn't solve the problem of waiting for a new connection, but it's a start. In UNIX, it's also possible to create a pipe, where writing to one end allows reading from the other end. If we create a pipe from the main thread to each of the runner threads, and for each new connection send the runner a message through that pipe, each runner can wait until either one of its connections has data available to read or there is a new connection available.
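As a minimal standalone sketch of this mechanism (this isn't swebs code, just an illustration), here's a program that uses poll() to sleep until either a pipe or standard input, standing in for a client connection, becomes readable:

#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
	int notify[2];
	//notify[0] is the read end, notify[1] is the write end
	if (pipe(notify))
		exit(EXIT_FAILURE);

	struct pollfd fds[2];
	fds[0].fd = notify[0];
	fds[0].events = POLLIN;
	fds[1].fd = STDIN_FILENO;
	//stdin stands in for a client connection here
	fds[1].events = POLLIN;

	int newfd = 42;
	//pretend the main thread is handing us a new file descriptor
	if (write(notify[1], &newfd, sizeof(newfd)) != sizeof(newfd))
		exit(EXIT_FAILURE);

	//poll() sleeps in the kernel until at least one fd is readable,
	//so no CPU time is wasted busy waiting
	if (poll(fds, 2, -1) < 0)
		exit(EXIT_FAILURE);

	if (fds[0].revents & POLLIN) {
		int received;
		if (read(notify[0], &received, sizeof(received))
				!= sizeof(received))
			exit(EXIT_FAILURE);
		printf("main thread sent us fd %d\n", received);
	}
	if (fds[1].revents & POLLIN)
		printf("a connection has data to read\n");
	return 0;
}

The key point is that poll() sleeps in the kernel, so a runner thread costs nothing while its connections are idle.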

In the main thread:

//This is still old code released under the GNU General Public License v3,
//though it's slightly newer than the previous snippet.
int *pending = calloc(processes - 1, sizeof(int));
int (*notify)[2] = malloc(sizeof(int[2]) * (processes - 1));
if (pending == NULL || notify == NULL)
	exit(EXIT_FAILURE);
//instead of a schedule, we have several pipes
pthread_t *threads = malloc(sizeof(pthread_t) * (processes - 1));
if (threads == NULL)
	exit(EXIT_FAILURE);

for (int i = 0; i < processes - 1; i++) {
	if (pipe(notify[i]))
		exit(EXIT_FAILURE);
	//create the pipes for the runner threads
	RunnerArgs *args = malloc(sizeof(RunnerArgs));
	if (args == NULL)
		exit(EXIT_FAILURE);
	args->site = site;
	args->pending = pending;
	args->notify = notify[i][0];
	args->id = i;
	pthread_create(threads + i, NULL,
		       (void*(*)(void*)) runServer, args);
}
//create the runner threads

for (;;) {
	fsync(fd);
	//Still have no clue why this works, it's not in the new code.
	int newfd = accept(fd, (struct sockaddr *) &addr,
			&addrlen);
	//accept connections (blocking)
	if (newfd < 0)
		exit(EXIT_FAILURE);
	int flags = fcntl(newfd, F_GETFL);
	if (fcntl(newfd, F_SETFL, flags | O_NONBLOCK))
		exit(EXIT_FAILURE);
	//make the connection nonblocking
	int lowestThread = 0;
	int lowestCount = pending[0];
	for (int i = 1; i < processes - 1; i++) {
		if (pending[i] < lowestCount) {
			lowestThread = i;
			lowestCount = pending[i];
		}
	}
	//get the least busy thread
	if (write(notify[lowestThread][1], &newfd, sizeof(newfd))
			!= sizeof(newfd))
		exit(EXIT_FAILURE);
	//send that thread the new connection
}

In the runner threads:

//This code has been slightly modified from the original commit that I took it
//from (the original commit was broken), but it's still GPLed. This code is also
//untested.
void *runServer(RunnerArgs *args) {
	Sitefile *site = args->site;
	int *pending = args->pending;
	int notify = args->notify;
	int id = args->id;

	int allocConns = 100;
	struct pollfd *fds = malloc(sizeof(struct pollfd) * allocConns);
	Connection *connections = malloc(sizeof(Connection) * allocConns);
	assert(fds != NULL);
	assert(connections != NULL);
	fds[0].fd = notify;
	fds[0].events = POLLIN;
	int connCount = 1;
	//connections are no longer linked lists, and are now stored in a
	//parallel array to fds. Note that connections are 1 indexed, as fds[0]
	//is the connection to the main thread.

	for (;;) {
		if (poll(fds, connCount, -1) < 0)
			exit(EXIT_FAILURE);
		//wait for either a message from the main thread or one of the
		//connections to have data

		for (int i = 1; i < connCount; i++) {
			//go through all the connections
			if (updateConnection(connections + i, site)) {
				connCount--;
				freeConnection(connections + i);
				memcpy(fds + i, fds + connCount,
						sizeof(struct pollfd));
				memcpy(connections + i, connections + connCount,
						sizeof(Connection));
				pending[id]--;
				i--;
				//if a connection should be removed, remove it
				//by moving the last item to the removed
				//position, then recheck this index since it
				//now holds a different connection
			}
		}

		if (fds[0].revents & POLLIN) {
			//if there's a new connection to accept
			if (connCount >= allocConns) {
				allocConns *= 2;
				struct pollfd *newfds = realloc(fds,
					sizeof(struct pollfd) * allocConns);
				if (newfds == NULL)
					exit(EXIT_FAILURE);
				fds = newfds;

				Connection *newconns = realloc(connections,
					sizeof(Connection) * allocConns);
				if (newconns == NULL)
					exit(EXIT_FAILURE);
				connections = newconns;
			}
			int newfd;
			if (read(notify, &newfd, sizeof(newfd)) != sizeof(newfd))
				exit(EXIT_FAILURE);
			fds[connCount].fd = newfd;
			fds[connCount].events = POLLIN;

			if (newConnection(newfd, connections + connCount))
				exit(EXIT_FAILURE);
			connCount++;
			pending[id]++;
			//accept it
		}
	}
	return NULL;
}

The code has gotten a lot more complicated since then, so I chose this version for simplicity. This version also had some bugs that I noticed while commenting the new runner thread, so those have been fixed.

I'd be remiss if I said that getting a working web server was the whole story. This website is hosted at my house, where my Dad's sites are also hosted, and we only have a single IP address for multiple websites. The HTTP protocol states that the hostname should be sent as part of each request, so the only way to differentiate the sites is to parse HTTP requests and proxy them to the right device. Luckily, nginx saves the day again, as nginx can be used as a reverse proxy. We have 3 IP addresses:

  • 192.168.50.61 (reverse-proxy, runs nginx)
  • 192.168.50.60 (natechoe.dev, receives requests from nginx, runs swebs)
  • 192.168.50.57 (my Dad's sites, runs Windows server for some reason idk)

These are all hosted on Hyper-V on Windows Server (my Dad's choice, not mine; my VM runs Debian).
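For reference, a reverse proxy like this only takes a few lines of nginx configuration. The sketch below is hypothetical (it's not my actual config, dads-site.example is a made-up name, and I'm assuming plain port 80), but it shows the idea: nginx matches the Host header of each request against the server_name directives and forwards the request to the right machine.

#A hypothetical sketch of the reverse proxy configuration, not the real one.
server {
	listen 80;
	server_name natechoe.dev;
	location / {
		#forward requests for natechoe.dev to the VM running swebs
		proxy_pass http://192.168.50.60;
		proxy_set_header Host $host;
	}
}

server {
	listen 80;
	server_name dads-site.example; #made-up name
	location / {
		proxy_pass http://192.168.50.57;
		proxy_set_header Host $host;
	}
}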