UDP Reliability Part 4 (Resending)

This will be my last post on UDP Reliability for a while, even though it's technically incomplete. Honestly, I've done enough work on it that I need to move on to something else to keep my interest in this project up!

I left off wanting the game system(s) not to care how packets were sent, and I've mostly achieved this. I moved from a reliability endpoint to a reliability 'peer', where a Peer is a client (or server) that can have multiple channels (for now, just two). One channel is for unreliable messages (movement) and the other is for reliable messages (actions that need server authority).
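
To make the shape concrete, here's a minimal sketch of how the channels hang together. The Reliability::Reliable value and the Endpoints map are what the resend code later in this post indexes into; the rest of the layout is illustrative, not the actual source:

#include <map>
#include <memory>

class ReliableEndpoint { /* wraps a reliable.io endpoint for one channel */ };

enum class Reliability
{
	Unreliable, // movement and anything else we can afford to drop
	Reliable    // actions that need server authority
};

class ReliablePeer
{
	// One endpoint per channel; for now just the two above.
	std::map<Reliability, std::unique_ptr<ReliableEndpoint>> Endpoints;
};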

This logic is in the reliable_peer.cpp file. The peer class also handles resending packets on the reliable channel. There are a few caveats, however.

reliable.io doesn’t actually handle resending!

reliable.io is good at what it does, but it really only helps with fragmenting messages and tracking message acknowledgements. It should probably be renamed to ack.io or something.
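
For reference, the endpoint surface it gives you boils down to send/receive plus ack queries; note that nothing in here is resend-shaped. (Paraphrased from reliable.h; check your copy for the exact signatures.)

// The core of reliable.io's endpoint API, paraphrased from reliable.h.
struct reliable_endpoint_t *reliable_endpoint_create(struct reliable_config_t *config, double time);
void reliable_endpoint_send_packet(struct reliable_endpoint_t *endpoint, uint8_t *packet_data, int packet_bytes);
void reliable_endpoint_receive_packet(struct reliable_endpoint_t *endpoint, uint8_t *packet_data, int packet_bytes);
uint16_t *reliable_endpoint_get_acks(struct reliable_endpoint_t *endpoint, int *num_acks);
void reliable_endpoint_clear_acks(struct reliable_endpoint_t *endpoint);
void reliable_endpoint_update(struct reliable_endpoint_t *endpoint, double time);
float reliable_endpoint_rtt(struct reliable_endpoint_t *endpoint);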

Actually dealing with resending and timeouts has to be done by the implementor (waves, dat me).
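
Concretely, that means keeping a record of every reliable packet sent until it's acknowledged. Here's a minimal sketch of the per-packet bookkeeping; the field names mirror what the resend code below reads, but the exact layout is illustrative:

#include <cstddef>
#include <cstdint>
#include <memory>

struct SentPacketEntry // illustrative name
{
	uint16_t Sequence = 0;      // reliable.io sequence number for this packet
	double Time = 0.0;          // when we last sent it, for RTT/timeout checks
	bool Acked = false;         // set once the peer acknowledges it
	std::size_t MessageLen = 0; // payload size; decides whether we fragment
	std::unique_ptr<uint8_t[]> Payload; // kept around so we can resend
};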

This is handled in the Listener class, which runs our network socket thread; the following is called on every pass of the loop while waiting for incoming messages:

	const std::chrono::duration<double> DeltaTime = std::chrono::steady_clock::now() - StartTime;
	ProcessReliableMessages(DeltaTime);
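
For context, the surrounding loop looks something like this; everything except the DeltaTime calculation and the ProcessReliableMessages call is paraphrased:

#include <chrono>

void Listener::Run()
{
	const auto StartTime = std::chrono::steady_clock::now();

	while (Running)
	{
		// Block briefly for an incoming datagram and handle it if one arrived.
		if (Socket.Receive(Buffer, ReceiveTimeout)) // illustrative socket API
		{
			HandleMessage(Buffer);
		}

		// Every pass, give the reliability layer a chance to ack/resend.
		const std::chrono::duration<double> DeltaTime = std::chrono::steady_clock::now() - StartTime;
		ProcessReliableMessages(DeltaTime);
	}
}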

The server iterates over its connected/registered peers and resends any outstanding messages:

void Server::ProcessReliableMessages(std::chrono::duration<double> DeltaTime)
{
	for (auto &Peer : Peers)
	{
		Peer.second->ResendReliablePackets(Logger, Crypto, DeltaTime.count());
	}
}

Whereas the client just resends to the server:

void Client::ProcessReliableMessages(std::chrono::duration<double> DeltaTime)
{
	if (!Peer)
	{
		return;
	}
	
	auto Resent = Peer->ResendReliablePackets(Logger, Crypto, DeltaTime.count());
	if (Resent > 0)
	{
		Logger.Info("Resent {} packets", Resent);
	}
}

Note that we call it with a delta time so we can time out messages. This resend stage also handles message acknowledgements and clears out old messages:

int ReliablePeer::ResendReliablePackets(Logging::Logger &Log, crypto::Cryptor &Crypto, const double Time) const
{
	int Resent = 0;
	int TotalAcks = 0;
	ReliableEndpoint &Endpoint = *Endpoints.at(Reliability::Reliable).get();
	Endpoint.Update(Time);
	auto Acks = Endpoint.GetAcks(TotalAcks);
	if (TotalAcks > 0)
	{
		for (auto Ack : Acks)
		{
			if (Ack == 0)
			{
				continue;
			}
			ProcessAck(Endpoint, Ack);
		}
	}
	
	Endpoint.ClearAcks();

	float RTT = Endpoint.GetRTT();

	// We can now re-send any un-ack'd packets
	auto Data = Endpoint.GetSentPackets()->GetEntries();
	for (auto &&Message : *Data)
	{
		// Juuuuust in case.
		if (!Message || Message->Acked)
		{
			continue;
		}

		// Don't resend if the time since this packet went out is still below the peer's RTT.
		auto TimeDiff = Time - Message->Time;
		if (TimeDiff < RTT) // may want to reduce this a bit so we more aggressively retry? Need to test later
		{
			continue;
		}
		
		// Delete this message if we are above the resend threshold.
		if (TimeDiff > DefaultConfig.ResendThreshold)
		{
			Endpoint.GetSentPackets()->RemoveWithCleanup(Message->Sequence);
			continue;
		}

		if (Message->MessageLen > DefaultConfig.FragmentAbove)
		{
			Endpoint.SendFragmentedMessage(Log, Crypto, Message.get());
			Resent++;
		}
		else
		{
			Endpoint.SendMessage(Log, Crypto, Message.get());
			Resent++;
		}
	}
	return Resent;
}
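
ProcessAck isn't shown here; all it needs to do is mark the matching entry so the resend loop skips it and the cleanup pass can reap it. A sketch, assuming the sent-packet buffer supports lookup by sequence number (the Find call is illustrative):

void ReliablePeer::ProcessAck(ReliableEndpoint &Endpoint, uint16_t Ack) const
{
	// Look up our bookkeeping entry for this sequence, if we still hold it.
	if (auto *Entry = Endpoint.GetSentPackets()->Find(Ack)) // Find() is illustrative
	{
		Entry->Acked = true; // the resend loop now skips it; cleanup removes it later
	}
}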

The only catch is that we must not resend a message until the time since it was sent exceeds the RTT, otherwise we'd resend packets continually!

Done, but not done

While it seems like all the major functionality for reliability is in place, there is one major design flaw: I end up resending the entire series of fragmented packets. This is a huge waste, but I don't have the bookkeeping in place to acknowledge individual fragments, and I just don't feel like adding it right now. So if we don't get an acknowledgement, we blast out all the fragments again.
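
For completeness, fixing this would mean tracking acknowledgements per fragment rather than per message, something like a bitmask alongside each sent entry. A sketch of the bookkeeping that's missing (none of this exists yet):

#include <cstdint>

// Hypothetical per-message fragment tracking.
struct FragmentAcks
{
	int FragmentCount = 0;  // how many fragments the message was split into
	uint32_t AckedMask = 0; // bit N set once fragment N has been acknowledged

	void MarkAcked(int Fragment) { AckedMask |= (1u << Fragment); }

	bool FullyAcked() const
	{
		// True once all FragmentCount low bits are set (assumes <= 32 fragments).
		return AckedMask == ((FragmentCount >= 32) ? ~0u : ((1u << FragmentCount) - 1u));
	}
};

With that in place, the resend loop could retransmit only the fragments whose bits are still clear instead of blasting out the whole series.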

I've now moved on to implementing and changing how our input/movement works. I hope to have some concrete results soon, where movement is actually calculated on the server and we can finally send state snapshots/updates back to connected clients!