Switched to epoll, but need reliability!

When I started this project I thought I had a decent understanding of what I needed (and I still kind of do). What I didn’t understand was how quickly I’d need certain systems in place. Last time I posted I had authentication working, but the code wasn’t really “doing” anything. I’ve now come to realize I need to build reliability into my UDP protocol now instead of later.

Also during testing, I quickly realized blocking sockets were a no-go. My server’s unit tests basically locked up because the server was blocked in recvfrom and I couldn’t signal it to stop.

So I switched to a non-blocking socket with epoll. This was of course after I re-read my old bible, Unix Network Programming. It was actually a good refresher, even if the examples are somewhat dated. Instead of select/poll we now use epoll, kqueue, or I/O completion ports.

Implementing epoll wasn’t too bad (after finding examples on the internet). It’s a three-step process: set the socket to non-blocking, initialize epoll with its own descriptor, then call epoll_wait in the server networking thread.

Here’s how we initialize the epoll descriptor:

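int Socket::EpollInit(const int MaxEpollSize)
{
	struct epoll_event EpollEvent = {};
	// The size argument is only a hint on modern kernels, but it must be greater than zero
	EpollFileDescriptor = epoll_create(MaxEpollSize);
	if (EpollFileDescriptor < 0)
	{
		return EpollFileDescriptor;
	}

	auto Ret = Bind();
	if (Ret < 0)
	{
		return Ret;
	}

	// Watch the socket for readable data. data.fd is user data that epoll_wait
	// hands back with each event; the server loop compares it to the epoll descriptor.
	EpollEvent.events = EPOLLIN;
	EpollEvent.data.fd = EpollFileDescriptor;
	return epoll_ctl(EpollFileDescriptor, EPOLL_CTL_ADD, SocketFileDescriptor, &EpollEvent);
}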

Then in the server loop, we call epoll_wait and wait for events to come in, setting a timeout of 10ms:

while (Running.load())
{
	auto Ready = epoll_wait(Socket.GetEPollFD(), Events, MaxEpollSize, TimeoutMs);
	if (Ready < 0)
	{
		if (errno == EINTR)
		{
			// Interrupted by a signal, not a real failure
			continue;
		}
		std::error_code ec(errno, std::system_category());
		fmt::print("Server::Listen exiting {}\n", ec.message());
		return;
	}
	else if (Ready == 0)
	{
		// Timeout, loop around and re-check Running
		continue;
	}
	else
	{
		for (int i = 0; i < Ready; i++)
		{
			// Make sure this fd matches our server epoll fd
			if (Events[i].data.fd == Socket.GetEPollFD())
			{
				struct sockaddr_in ClientAddr;
				std::memset(&ClientAddr, 0, sizeof(ClientAddr));
				// Size the buffer to match the length passed to Receive;
				// the vector constructor already zero-fills it
				std::vector<unsigned char> PacketBuffer(BufferSize);
				auto PacketLen = Socket.Receive(ClientAddr, PacketBuffer.data(), BufferSize);
				if (PacketLen <= 0)
				{
					continue;
				}
				ProcessPacket(PacketBuffer, PacketLen, ClientAddr);
			}
		}
	}
}

ProcessPacket at this point looks at the first byte of the message to determine what kind of message it is. I plan on having multiple layers of these kinds of packet headers; this first byte is only for encrypted messages. After decrypting, messages are put into the queue for processing along with the type of game packet they are. The game packet will also include an opcode to determine the purpose of the message.
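
To make the first-byte check concrete, here is a minimal sketch of that outer dispatch. The enum, the function name ClassifyPacket, and the byte values are all placeholders I made up for this example; the real header values aren’t listed in this post.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical outer header values, for illustration only.
enum class OuterType : std::uint8_t
{
	Unknown = 0x00,
	Handshake = 0x01,
	Encrypted = 0x02,
};

// Reads the first byte of a received datagram and maps it to an OuterType.
// Empty packets, bad lengths, and unrecognized values all map to Unknown
// so the caller can simply drop them.
OuterType ClassifyPacket(const std::vector<unsigned char> &Buffer, const std::size_t PacketLen)
{
	if (PacketLen == 0 || PacketLen > Buffer.size())
	{
		return OuterType::Unknown;
	}
	switch (Buffer[0])
	{
	case 0x01:
		return OuterType::Handshake;
	case 0x02:
		return OuterType::Encrypted;
	default:
		return OuterType::Unknown;
	}
}
```

Anything that comes back as Unknown gets dropped before any decryption work is done, which keeps garbage datagrams cheap to reject.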

I could now process the user’s authenticated packet, which includes their public key and a signature from our AuthServer. From here we encrypt a symmetric key using their public key, with the assumption that the client will have the game server’s public key baked into the game client.

    /**
     * @brief Encrypts the GeneratedKey with the UsersPublicKey or returns nullptr if unable to encrypt
     *
     * @param UsersPublicKey
     * @param GeneratedKey
     * @param OutputLen Set to the total output size: nonce + MAC + key
     * @return UniqueUCharPtr
     */
    UniqueUCharPtr Cryptor::EncryptKey(const unsigned char *UsersPublicKey, const SharedUCharPtr GeneratedKey, size_t &OutputLen) const
    {
        unsigned char Nonce[crypto_box_NONCEBYTES];
        OutputLen = crypto_box_NONCEBYTES + crypto_box_MACBYTES + KeySize();

        // Allocate the full output up front: [Nonce | MAC + encrypted key]
        auto CipherText = std::make_unique<std::vector<unsigned char>>(OutputLen);

        randombytes_buf(Nonce, sizeof(Nonce));

        // Write the Nonce at the front of the buffer, copying in place so the
        // vector's size stays equal to OutputLen
        std::memcpy(CipherText->data(), Nonce, sizeof(Nonce));

        auto CipherAfterNonce = CipherText->data() + crypto_box_NONCEBYTES;
        if (crypto_box_easy(CipherAfterNonce, GeneratedKey->data(), KeySize(), Nonce,
                            UsersPublicKey, TestGameServerPrivateKey.data()) != 0)
        {
            return nullptr;
        }
        return CipherText;
    }

We can now send this encrypted key back to the client so both sides share an encryption key for our game messages…

What I quickly realized is that I need to implement some layer of reliability here. We need to know that the client actually received this key. So we will need some sort of message acknowledgement system. And that’s where a reliability layer on top of UDP comes into play.
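
I haven’t written this yet, but the core of an acknowledgement system is small: tag each outgoing message with a sequence number, remember it until the other side acks that number. A rough sketch follows; the class and method names are invented for this example and the wire format is left out entirely.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch: remembers messages that were sent but not yet acknowledged.
class PendingAcks
{
public:
	// Stores the payload under a fresh sequence number and returns that number.
	// The caller would write the number into the outgoing packet header.
	// Note: the counter wraps after 65536 messages; a real version has to
	// deal with a wrapped number colliding with a still-pending one.
	std::uint16_t Track(std::vector<unsigned char> Payload)
	{
		const std::uint16_t Sequence = NextSequence++;
		Pending[Sequence] = std::move(Payload);
		return Sequence;
	}

	// Called when the peer acknowledges a sequence number.
	// Returns true if that message was still pending.
	bool Acknowledge(const std::uint16_t Sequence)
	{
		return Pending.erase(Sequence) > 0;
	}

	// Number of messages still waiting for an ack.
	std::size_t Outstanding() const
	{
		return Pending.size();
	}

private:
	std::uint16_t NextSequence = 0;
	std::unordered_map<std::uint16_t, std::vector<unsigned char>> Pending;
};
```

For the key exchange above, the server would Track the encrypted key message and only treat the client as ready for encrypted game traffic once Acknowledge returns true for it.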

I looked at ENet and a newer header-only version of ENet called zpl enet, and of course there is reliable.io from gafferongames. Someone also reached out on Mastodon suggesting QUIC. All of these seem great, but I’m here to learn, so I figure why not write my own?

Another reason I want to write my own is that I suspect I can build at least part of this reliability layer into the Entity Component System (ECS) I’m using. The player will have a NetworkComponent (or series of them) that I can use as part of my normal server loop to determine if things need to be re-sent.
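
To give an idea of what I mean, here is a very rough sketch of a component plus a per-tick check. None of this exists yet: NetworkComponent’s fields, PendingMessage, and CollectResends are guesses at a shape, not tied to any particular ECS library.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record of one message that has not been acknowledged yet.
struct PendingMessage
{
	std::uint16_t Sequence;
	std::uint64_t LastSentMs;
};

// Hypothetical component, one per connected player entity.
struct NetworkComponent
{
	std::vector<PendingMessage> Unacked;
};

// Hypothetical system step, run once per server tick. Returns the sequence
// numbers whose last send is at least ResendAfterMs old and stamps them with
// NowMs, as if they had just been re-sent.
// Assumes NowMs comes from a monotonic clock, so NowMs >= LastSentMs.
std::vector<std::uint16_t> CollectResends(NetworkComponent &Component, const std::uint64_t NowMs, const std::uint64_t ResendAfterMs)
{
	std::vector<std::uint16_t> Due;
	for (auto &Message : Component.Unacked)
	{
		if (NowMs - Message.LastSentMs >= ResendAfterMs)
		{
			Due.push_back(Message.Sequence);
			Message.LastSentMs = NowMs;
		}
	}
	return Due;
}
```

The appeal is that resends become just another pass over components in the normal server loop, rather than a separate timer system living inside the socket code.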

As you can see, I went from ‘send UDP packets back and forth to get something working’ to:

  • Implemented a symmetric encryption system for packets
  • Implemented a non-blocking I/O server loop
  • Implemented an asymmetric + signature verification system to verify and pass the symmetric encryption key to the client
  • Rewrote the code like 4x because I still have no idea what’s best: std::vector, std::array, unsigned char *, and so on.

So now I guess I need to start thinking about how to add reliability into an ECS, both of which I’ve never worked with before! Fun times!