Hard Light Productions Forums

Modding, Mission Design, and Coding => FS2 Open Coding - The Source Code Project (SCP) => Topic started by: chief1983 on September 01, 2015, 04:20:53 pm

Title: In-Game Network Code Discussion
Post by: chief1983 on September 01, 2015, 04:20:53 pm
So, we've tossed around ideas like replacing the netcode with RakNet for a while, but no one has yet stepped forward to say that they think they can single-handedly spearhead the replacement.  Maybe that's the wrong approach though.  What I'd like to do is maybe get some sort of 'analysis by committee' going, where anyone interested checks in on this thread or whatever and we try to get a group understanding and documentation of what's going on.  If we were ultimately going to use RakNet, the idea would probably be to start with replacing the in-game network code module, and since that seems to be the part of the code that, until fixed, renders any other efforts towards multiplayer kind of moot, I think it's the best place to start.

But let's assume for a minute we don't need to use RakNet.  I've been theorizing lately about this.  We did not one, but two go_fasters for the graphics code, which both found some low hanging fruit and serious drains on the framerate, and fixed them in a series of individual enhancements, which led to massive gains.  Many of these were fixes for features that weren't optimized to begin with, or were optimizations on the original framework.  I think something similar might be possible for the netcode.  FS2 multiplayer was popular back in the day, so (someone correct me if I'm wrong) it had to have been playable to some extent.  But what we see today seems to me like something no one would have put up with.  If that's the case, then perhaps the changes over time to the game have actually degraded the behavior, and we need to figure out where and how, and try to speed it back up.  Even if that's not completely the case, I think we can still start with analyzing the current netcode performance, be it with enhanced logging, tracing, debugging, whatever, and attempt to identify some of the areas where the netcode might be having hiccups.

For instance, I ping my dedicated server, and see times around 35ms.  In game, FS2 reports my latency as 250-300ms.  What on earth?  Is this just because of a different transport layer, or is the code really injecting that much overhead just to ping?  If that's the case, is there similar overhead in game?  There's really no excuse for that if that's the case.
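
One way to reason about that gap, as a rough sketch only: if the in-game ping is sent and answered only when each side's main loop gets around to processing network packets (an assumption, not something confirmed from the FSO source), then each end can add up to one frame of delay on top of the wire time. The function name and the frame rates below are made up for illustration.

```python
def app_ping_bounds_ms(wire_rtt_ms, client_fps, server_fps):
    """Return (best, worst) app-level ping in milliseconds.

    Hypothetical model: each side only services the network once per
    frame, so a packet can sit unread for up to one full frame.
    """
    client_frame_ms = 1000.0 / client_fps
    server_frame_ms = 1000.0 / server_fps
    best = wire_rtt_ms
    # Worst case: the request waits a full server frame before it is read,
    # and the reply waits a full client frame before it is read.
    worst = wire_rtt_ms + server_frame_ms + client_frame_ms
    return best, worst


best, worst = app_ping_bounds_ms(35.0, client_fps=60, server_fps=30)
print(f"{best:.1f} ms to {worst:.1f} ms")
```

With these made-up frame rates the worst case comes out at about 85 ms, so under this model frame polling alone would not account for 250-300ms; logging send and receive timestamps on both ends would be the way to find out where the rest goes.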

Also, from what I've seen, packets have had to grow to handle things like additional SEXPs and other expanded information over time.  Could these larger packets be slowing down client-server communication significantly at this point, resulting in significant degradation compared to retail?  If that's the case, perhaps we need to analyze how we send that kind of data and determine if there are ways to shrink the packets.

Are packets compressed?  If not, we have zlib already for png, could we use zlib to compress larger packets?  I think most modern computers can compress/decompress fast enough to make this a worthwhile consideration to get faster data transmission.
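
A minimal sketch of the idea, in Python rather than FSO's C++ and using an invented payload: compress only packets above a size threshold, and prefix one flag byte so the receiver knows whether to decompress. The flag layout, the 128-byte threshold, and the function names are all assumptions for illustration, not anything taken from the existing netcode.

```python
import zlib


def maybe_compress(payload: bytes, min_size: int = 128) -> bytes:
    """Return the payload with a 1-byte flag: 0 = raw, 1 = zlib-compressed."""
    if len(payload) >= min_size:
        packed = zlib.compress(payload, 1)  # level 1 favors speed over ratio
        if len(packed) < len(payload):
            return b"\x01" + packed
    # Small or incompressible packets are sent as-is.
    return b"\x00" + payload


def decode(data: bytes) -> bytes:
    """Undo maybe_compress() on the receiving side."""
    flag, body = data[0], data[1:]
    return zlib.decompress(body) if flag == 1 else body


small = b"\x10\x20\x30"
# Invented, repetitive sample data standing in for a large SEXP packet.
large = b'(when (is-destroyed-delay 0 "Alpha 1") (send-message ...))' * 8

for payload in (small, large):
    wire = maybe_compress(payload)
    assert decode(wire) == payload
    print(len(payload), "->", len(wire))
```

Small packets are left alone because zlib's own header and checksum can make them larger than the original. Whether real FSO packets are repetitive enough to benefit is something only measurements on captured traffic would show.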

We could get a network logging/debugging branch going on GitHub to coordinate our efforts, if anyone has any ideas for a good place to start putting some logging facilities in.

So what say you all?  Anyone else here interested in a grassroots team effort towards network code refactoring?
Title: Re: In-Game Network Code Discussion
Post by: AdmiralRalwood on September 01, 2015, 04:39:59 pm
I don't understand just about any portion of the multiplayer code, but I'm always willing to help debug it.
Title: Re: In-Game Network Code Discussion
Post by: chief1983 on September 01, 2015, 04:45:24 pm
And that's what I'm getting at, I don't think anyone here really does, but it really does warrant the team's attention, so I was thinking that maybe instead of one person, the team could try to make a group effort to start untangling it.  Even simple things like identifying entry points and code flow and getting other coders all on the same page might help lead to better understanding of the whole by multiple team members.  And if anyone has any friends who might have an interest in net code, maybe send them our way.  I'm almost to the point of considering getting a bounty together for getting multiplayer in a state that FotG could be enjoyable, among other things.
Title: Re: In-Game Network Code Discussion
Post by: Cyborg17 on September 01, 2015, 05:56:53 pm
Well, I got a small glimpse at the netcode while fixing (the impressively still unimplemented) Campaign description update behavior.  In case this helps: most of the netcode revolves around numerous functions which each either send a packet, or interpret a received packet, or both with a check on whether or not you're the host (or sometimes the standalone server).  There's a dedicated function for writing the headers for packets before they're sent out and I believe one for sending them out.  Though I don't currently remember the function, I believe there's one which decides what kind of packet it is by the header.

I actually didn't see any functions that don't revolve around packets.  I may be able to look at it again later, but I'm in the middle of a move.
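
The pattern described above (one function writes a header, and another reads the header and hands the packet to the right handler) can be sketched roughly like this. It is a Python illustration, not the FSO code: the packet type values, the 3-byte header layout, and every function name here are hypothetical.

```python
import struct

# Hypothetical packet type IDs; not the real FSO values.
PING, CHAT = 0x01, 0x02


def build_packet(ptype: int, body: bytes) -> bytes:
    """Write a 1-byte type and a 2-byte body length, then the body."""
    return struct.pack("<BH", ptype, len(body)) + body


def handle_ping(body: bytes) -> str:
    (timestamp,) = struct.unpack("<I", body)
    return f"ping t={timestamp}"


def handle_chat(body: bytes) -> str:
    return "chat " + body.decode("utf-8")


# Dispatch table: packet type -> function that interprets that packet.
HANDLERS = {PING: handle_ping, CHAT: handle_chat}


def process_packet(data: bytes) -> str:
    """Read the header, then pass the body to the matching handler."""
    ptype, length = struct.unpack_from("<BH", data, 0)
    body = data[3:3 + length]
    handler = HANDLERS.get(ptype)
    if handler is None:
        return f"unknown packet type {ptype:#x}"
    return handler(body)


print(process_packet(build_packet(PING, struct.pack("<I", 1234))))
print(process_packet(build_packet(CHAT, b"hello")))
```

Finding the real equivalent of `process_packet` in the codebase would give a single place to hang per-packet-type logging.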
Title: Re: In-Game Network Code Discussion
Post by: karajorma on September 01, 2015, 07:59:10 pm
My biggest problem with dealing with the netcode is I don't know how to debug network code. So I spend inordinate amounts of time starting up a multiplayer match on my LAN and hoping I can figure out whether the issue is client or server side so I can make sure that machine is the one I'm running MSVC on.
Title: Re: In-Game Network Code Discussion
Post by: z64555 on September 01, 2015, 08:13:54 pm
Debug builds have some network-related DCFs that can be accessed through the debug console (Shift+Enter anywhere). One notable DCF that I remember is one that simulates latency.

I'm guessing that to debug the network code by oneself, you'd set up a standalone server session and then boot up a client session on the same PC. I'm not sure how effective this is with the current state of FSO, however.
Title: Re: In-Game Network Code Discussion
Post by: niffiwan on September 01, 2015, 10:48:16 pm
I've found the "attach a debugger" method to have issues with clients timing out. Other methods like the debug console or liberal sprinklings of mprintf's tend to work better, but have their own limitations for certain classes of problem.  In general though I feel that on a LAN there aren't many lag issues, and trying to find someone 'across the internet' to test/debug with is extremely hard especially when you take different timezones, schedules and RL into account.
Title: Re: In-Game Network Code Discussion
Post by: chief1983 on September 01, 2015, 11:13:13 pm
Glad there's some interest here.  Standalone servers running on a cheap VPS can be helpful in that situation.  Since we can run them on *nix now, they can be on any *nix server, and have GDB attached if necessary, but the lag should at least be closer to real life for the local client.  You don't even need another player for many multiplayer missions, but the communications would still go through the standalone I believe.  I have a retail standalone server running on occasion in fact, under GDB with AddressSanitizer running to catch any issues that crop up.  Anyone can feel free to use it for testing.
Title: Re: In-Game Network Code Discussion
Post by: niffiwan on September 02, 2015, 01:59:15 am
Is your server with ASAN+gdb this one?  "SCP WebUI Test Server"

Also, based on what z64555 said, I had a super quick look, and these look like the relevant debug console lag controls:

Code:
$ git grep DCF | egrep -w 'lag|netd'
code/network/multi.cpp:DCF(netd, "change netgame debug flags (Mulitplayer)")
code/network/multilag.cpp:DCF(lag, "Sets the lag base value (Muliplayer)")
code/network/multilag.cpp:DCF(lag_min, "Sets the lag min value (Multiplayer)")
code/network/multilag.cpp:DCF(lag_max, "Sets the lag max value (Multiplayer)")
code/network/multilag.cpp:DCF(lagloss, "Help provider for the lag system (Multiplayer)")
code/network/multilag.cpp:DCF(lag_streak, "Sets the duration of lag streaks (Multiplayer)")
code/network/multilag.cpp:DCF(lag_bad, "Lag system shortcut - Sets for 'bad' lag simulation (Multiplayer)")
code/network/multilag.cpp:DCF(lag_avg, "Lag system shortcut - Sets for 'average' lag simulation (Multiplayer)")
code/network/multilag.cpp:DCF(lag_good, "Lag system shortcut - Sets for 'good' lag simulation (Multiplayer)")
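
For anyone unfamiliar with what a lag simulator like this typically does, here is a generic sketch in Python. It is not based on reading multilag.cpp: the class, its parameters, and the uniform-random delay are all assumptions, loosely named after the `lag_min` and `lag_max` commands above. Each outgoing packet is held in a queue and released once a random delay between the minimum and maximum has passed.

```python
import heapq
import random


class LagSimulator:
    """Holds packets and releases them after a random delay (hypothetical)."""

    def __init__(self, lag_min_ms, lag_max_ms, seed=None):
        self.lag_min_ms = lag_min_ms
        self.lag_max_ms = lag_max_ms
        self._rng = random.Random(seed)
        self._queue = []  # heap of (release_time_ms, sequence, packet)
        self._seq = 0     # tie-breaker so packets themselves are never compared

    def send(self, now_ms, packet):
        """Queue a packet with a delay drawn from [lag_min_ms, lag_max_ms]."""
        delay = self._rng.uniform(self.lag_min_ms, self.lag_max_ms)
        heapq.heappush(self._queue, (now_ms + delay, self._seq, packet))
        self._seq += 1

    def poll(self, now_ms):
        """Return every packet whose release time has passed."""
        ready = []
        while self._queue and self._queue[0][0] <= now_ms:
            ready.append(heapq.heappop(self._queue)[2])
        return ready


sim = LagSimulator(lag_min_ms=100, lag_max_ms=200, seed=1)
sim.send(0, b"a")
sim.send(10, b"b")
print(sim.poll(50))   # nothing has waited the minimum 100 ms yet
print(sim.poll(300))  # both packets are past their maximum delay
```

Because each packet gets its own delay, packets can come out in a different order than they went in, which is also what happens on a real network.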
Title: Re: In-Game Network Code Discussion
Post by: chief1983 on September 02, 2015, 07:25:17 am
Yup that's the one.
Title: Re: In-Game Network Code Discussion
Post by: chief1983 on September 02, 2015, 12:01:55 pm
We may want to fix those function descriptions too, dunno what 'Mulitplayer' and 'Muliplayer' are, although Mulitplayer does remind me of the Vandals' Ape Drape song.
Title: Re: In-Game Network Code Discussion
Post by: jr2 on September 02, 2015, 02:05:42 pm
Would it be possible to make regular-ish builds that collect more logging data, and ask people to use those for multi matches?

You could even have them auto-send the logs to your email maybe (assuming you made that clear to the testers, and maybe put a status upon exiting that it's sending the logs).

Just an idea, not sure if that would help.  But if so, put it in the announcements, etc.
Title: Re: In-Game Network Code Discussion
Post by: chief1983 on September 02, 2015, 02:10:44 pm
Possibly, but I don't think we're at that point yet.  I think just getting a handle on the tools we have, and wrapping our heads around what we do know, would be more useful currently.  I already learned some things here.

Another thing to keep in mind is that multi has its own log, in data/multi.log.  One trick for getting a handle on what code runs when might be to watch that log during the game and identify what messages get written, and then we could find those messages in the code.  Might help with getting the per-frame flow understood.
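
The log-to-code lookup could be partly automated. The sketch below is a hypothetical helper, not an existing tool: given one message copied from the log, it walks a source tree and reports lines containing a string literal whose fixed text (the part before the first `%` format specifier) appears in that message. The function name, the 8-character minimum, and the `fake_log` call in the demo are all invented for illustration.

```python
import os
import re
import tempfile


def find_log_sources(message, source_root, extensions=(".cpp", ".h")):
    """Return (path, line_number) pairs for string literals whose fixed
    text (before the first '%') appears in the given log message."""
    hits = []
    literal = re.compile(r'"((?:[^"\\]|\\.)*)"')
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    for text in literal.findall(line):
                        # The log holds expanded values, so compare only
                        # the text before the first format specifier.
                        prefix = text.split("%")[0].strip()
                        # Skip very short literals to limit false matches.
                        if len(prefix) >= 8 and prefix in message:
                            hits.append((path, lineno))
    return hits


# Demo against a throwaway file; fake_log is a made-up function name.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "example.cpp"), "w", encoding="utf-8") as fh:
        fh.write('void f() {\n  fake_log("Player joined from %s", addr);\n}\n')
    print(find_log_sources("Player joined from 10.0.0.5", root))
```

Run over a whole log, something like this would give a first rough map of which source lines fire during a frame, with the usual caveat that short or generic messages will produce false matches.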
Title: Re: In-Game Network Code Discussion
Post by: JGZinv on September 02, 2015, 09:59:52 pm
Not that I'm a coder, but would it be possible to host a shared folder, or even something akin to a dropbox, so someone could pull the multi.log in near realtime while several people play?
Would that be of use?
Title: Re: In-Game Network Code Discussion
Post by: chief1983 on September 03, 2015, 11:17:53 am
Not sure how much help that would be, compared to simpler ideas.  But I am willing to allow SSH access to the standalone server on the SCP account I set up, which runs the standalone and can access the multi.log on it.  So anyone using it would have access to their client and the standalone at least.  Other players' logs could be uploaded after the fact.
Title: Re: In-Game Network Code Discussion
Post by: niffiwan on September 03, 2015, 06:46:16 pm
I had a thought - it should be fairly simple to test the theory that SCP additions have slowed down the netcode.  Could someone run up retail multi and see how the lag differs from the current FSO version?
Title: Re: In-Game Network Code Discussion
Post by: jg18 on September 18, 2015, 04:08:30 am
Good idea, niffiwan.

Honestly, I don't think we'll know whether to try to optimize the existing netcode or just replace it with RakNet until we know how optimizable the existing netcode is, which requires a deep understanding of it. The prerequisites for understanding it include a basic, if not moderately deep, understanding of both socket programming in general and the two networking libraries that FSO uses, Winsock and Unix sockets. That's quite a bit to learn.

If we decide the existing netcode is worth optimizing then a good first step for readability/maintainability is to port the messy blend of Winsock/Unix sockets to a cross-platform socket library like Boost Asio (http://www.boost.org/doc/libs/1_59_0/doc/html/boost_asio.html).

After I finish the SquadWar support I can dig into the netcode some more and hopefully get some sort of documentation going.