Client performance rant

Postby loftar » Tue May 09, 2017 6:49 pm

Since it is a point that is... commonly brought up, so to speak, and also since I've been pondering it a fair bit lately, I thought I'd write a bit about current reflections on client performance.

In my mind, there are three main, independent areas where client performance should be improved, so I'll write about each independently.

1. Animated meshes
There's been a lot of talk lately about extremely poor performance in larger fights. While I still haven't been able to reproduce anything close to the extreme levels that people have been talking about (1-2 FPS in a fight with a few tens of people), it seems safe to say that it's primarily a matter of animating the player characters involved. Despite my lack of ability to reproduce those particular situations, I have on the other hand noted myself that large numbers of livestock are a big part of slowdowns in complex village scenes, so it seems like an area of particular urgency anyway, and I figure fixing one should have a good chance of fixing the other as well.

There have been ideas floating around that animated meshes can be fixed by combining several animated meshes into one big, combined vertex buffer, so that they can be submitted with fewer draw calls; but in my profiling, the draw calls seem to be a fairly small part of the problem of animated meshes anyway, the vast majority of CPU time rather being spent in updating the actual vertices, which indicates to me that the vertex processing should be transferred to the GPU instead, along with instanced rendering to reduce the draw call overhead as well. I've long been confounded about instancing animated meshes, however, since the animation data that needs to be passed for each instance is far too big to fit into the vertex data.

However, I recently found an article on the subject by nVidia, from quite a few years ago. The model used in the article is far too simplistic to be applied to Haven (no multiple or blended animations, for instance), but the technique of passing data in textures inspired me a bit, and I believe I now have a fairly thought out idea about how animated meshes could be instanced by the Haven client, where the client still combines the animations and calculates the bone matrices on the CPU, and passes that data in a texture to the vertex shader.
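To make the idea of passing bone matrices in a texture a bit more concrete, here is a minimal sketch in plain Python. It is not the Haven client's code and not the code from the nVidia article; the layout (one texture row per instance, three RGBA texels per bone holding a 3x4 matrix) and all names such as pack_bone_texture and fetch_bone are assumptions made for illustration. The CPU side packs the already-blended bone matrices, and skin_vertex mimics what a vertex shader would do with texel fetches.

```python
from typing import List, Sequence

# A 3x4 affine bone matrix: three rows of four floats (the last row 0,0,0,1 is implied).
Matrix3x4 = List[List[float]]

TEXELS_PER_BONE = 3  # assumption: one RGBA texel stores one matrix row


def translation(tx: float, ty: float, tz: float) -> Matrix3x4:
    """Hypothetical helper: a bone matrix that only translates."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz]]


def pack_bone_texture(instances: Sequence[Sequence[Matrix3x4]]) -> List[List[List[float]]]:
    """CPU side: one texture row per instance, three RGBA texels per bone."""
    texture = []
    for bones in instances:
        row = []
        for matrix in bones:
            for matrix_row in matrix:
                row.append(list(matrix_row))  # four floats become one RGBA texel
        texture.append(row)
    return texture


def fetch_bone(texture, instance_id: int, bone_index: int) -> Matrix3x4:
    """Shader side: three texel fetches rebuild one bone matrix."""
    base = bone_index * TEXELS_PER_BONE
    row = texture[instance_id]
    return [row[base], row[base + 1], row[base + 2]]


def skin_vertex(texture, instance_id, position, bone_indices, weights):
    """Shader side: linear blend skinning of a single vertex."""
    x, y, z = position
    out = [0.0, 0.0, 0.0]
    for bone, weight in zip(bone_indices, weights):
        m = fetch_bone(texture, instance_id, bone)
        for i in range(3):
            out[i] += weight * (m[i][0] * x + m[i][1] * y + m[i][2] * z + m[i][3])
    return out


# One instance with two bones: an identity bone and a bone shifted by 2 along x.
tex = pack_bone_texture([[translation(0.0, 0.0, 0.0), translation(2.0, 0.0, 0.0)]])
skinned = skin_vertex(tex, 0, (1.0, 1.0, 1.0), (0, 1), (0.5, 0.5))
```

With this layout the per-instance vertex attributes would only need to carry an instance index, while the bulky animation data lives in the texture; that is the part that sidesteps the "too big to fit into the vertex data" problem.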

It should be said that said technique is only applicable to skeletal animations (like players or livestock) rather than mesh animations (like beehives or hearth fires). I do have some embryo of an idea on how to instance the latter as well, but it is not at all as thought out yet. It should also be said that the mentioned technique requires the ability of the GPU to do texture fetches in the vertex shader, a capability that is available in all semi-modern GPUs, but I have noticed some people still using pretty ancient hardware to play Haven that is probably lacking said ability, such as GeForce 6xxx- or 7xxx-series cards (as distinct from 6xx or 7xx, mind you), so they may be left out in the cold after such an update.

2. General rendering pipeline
The base of the current rendering system is that every visible object is iterated over and allowed to draw itself. I have, since a fairly good while back, realized the alternative possibility of instead having objects register with a rendering system at creation time, and maintaining a tighter coupling so that some information can be saved and reused from frame to frame. Originally, however, such a system would have only produced rather minor improvements, so it didn't seem worth the extra complexity of such a system, or the fairly large rewrite it would have entailed. However, over time, and especially lately, reasons for reconsidering that have been cropping up, and I'm now fairly convinced that such a rewrite should be done in fairly short order. Among the reasons are these:
  • Since instanced rendering was implemented more efficiently, I've noticed that there is an undesirable effect from having to recalculate the entire instancing data buffers every time an object in the instanced batch changes. While the current implementation of instancing has made it very CPU-efficient to draw, for instance, very large fields of crops, there's a large recalculation that has to be made every time any single crop object is added or removed from it, which leads to poor performance when moving around in villages. If more persistent information were kept, it would hopefully allow only the smallest possible deltas to the instancing buffers to be effected instead, which would change the rendering scaling factor from O(number of objects on screen) to O(number of instanced batches on screen), which starts to seem worth getting worked up about.
  • As a corollary, it would be nice if such a system could be formulated so that changes can be effected in parallel from the rendering loop itself, so that rendering wouldn't have to be blocked by objects popping in and out, leading to less stuttering.
  • I've also been reading up about Vulkan lately, which has been quite an interesting and enlightening experience, and it seems obvious to me that Vulkan is the future over OpenGL. Rendering in Vulkan would be greatly helped by being able to preserve data from frame to frame, as that would allow reusing pipeline-state objects from frame to frame instead of having to recalculate them, which could be worth quite a lot.
  • Also, and almost most important of all, I learned only quite recently that one is able to reuse command buffers in Vulkan from frame to frame, which is quite a fascinating proposal indeed, since it may allow improving the rendering scaling factor even further from O(number of instanced batches) to O(number of changed objects for a particular frame). If such a system could be efficiently implemented, it almost holds the promise of virtually removing CPU overhead in rendering.
I've been spending a bit of time lately trying to formulate a rendering system that would allow for the aforementioned changes. I'm not quite there yet, but more and more pieces are falling into place. Either way, it's a fairly large rewrite, and I feel that the things in this section should probably be implemented prior to implementing the animation optimizations of the prior section, so that I don't just have to rewrite those anyway when the rendering system changes.
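As an illustration of the "smallest possible deltas" idea from the list above, here is a rough Python sketch of an instanced batch that keeps its data between frames. It is only one way such a thing could be done, not the client's actual design; the class InstanceBatch and its methods are made-up names. Removing an object moves the last instance into the freed slot, so only that one slot needs to be re-uploaded instead of the whole buffer being recalculated.

```python
class InstanceBatch:
    """Hypothetical persistent batch: a CPU mirror of a GPU instance buffer."""

    def __init__(self, floats_per_instance):
        self.stride = floats_per_instance
        self.data = []       # flat per-instance data, `stride` floats per slot
        self.slot_of = {}    # object id -> slot index
        self.id_at = []      # slot index -> object id
        self.dirty = set()   # slots changed since the last flush

    def add(self, obj_id, values):
        """Append one instance; only the new slot becomes dirty."""
        assert len(values) == self.stride
        slot = len(self.id_at)
        self.id_at.append(obj_id)
        self.slot_of[obj_id] = slot
        self.data.extend(values)
        self.dirty.add(slot)

    def remove(self, obj_id):
        """Swap-remove: fill the hole with the last instance, then shrink."""
        slot = self.slot_of.pop(obj_id)
        last = len(self.id_at) - 1
        s = self.stride
        if slot != last:
            moved = self.id_at[last]
            self.data[slot * s:(slot + 1) * s] = self.data[last * s:(last + 1) * s]
            self.id_at[slot] = moved
            self.slot_of[moved] = slot
            self.dirty.add(slot)
        self.id_at.pop()
        del self.data[last * s:]
        self.dirty.discard(last)  # the last slot no longer exists

    def flush(self):
        """Return the slots that would need re-uploading, then clear the set."""
        slots = sorted(self.dirty)
        self.dirty.clear()
        return slots


# A field of 1000 crops, each with three floats of instance data.
batch = InstanceBatch(3)
for crop in range(1000):
    batch.add(crop, [float(crop), 0.0, 0.0])
initial_upload = batch.flush()   # first frame: everything is new

batch.remove(10)                 # one crop gets harvested
changed = batch.flush()          # only the slot that received the moved instance
```

The per-frame work then depends on how many objects changed, not on how many are in the batch. The same bookkeeping could presumably also tell a Vulkan-style renderer which recorded command buffers are still valid, but that part is not sketched here.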

3. GPU overhead
All the aforementioned topics only touch on CPU overhead in rendering, but GPU overhead is also a very important factor, especially for users of integrated GPUs. Unfortunately, reasoning about GPU usage is much harder, especially since I have been utterly incapable of getting any GPU profiler working. I've been trying a fairly large number of GPU profilers, even going so far as to install Windows on a separate system to see if nVidia's Visual Studio-based tools would work, but all of them keep failing for one reason or another. Intel's tools only work for Direct3D and not for OpenGL (sigh), nVidia's previous profiling tools are no longer supported and the licensing server is down, preventing them from starting (double sigh), their new Linux-based tools crash because of Java, and I couldn't even try the Visual Studio-based tools because they didn't support the GPU I had available (who knows if they would even have worked if I had had a newer GPU to try with).

I have some guesses as to optimizations that could potentially be meaningful to improve GPU rendering time, but all those guesses are too uncertain to seem worth trying before profiling (especially as many of them are fairly large changes). Thus, so far, work in this area is still... ongoing. I do feel it is a slightly lesser priority anyway, because the CPU overhead is more commonly the limiting factor, but if CPU usage is improved, GPU overhead may very soon start becoming the limiting factor instead.

It's true that there are some things that could be done that would improve the GPU situation regardless, such as introducing settings to reduce graphical complexity, but I feel that for most scenes, that just shouldn't be required anyway, so that non-degrading optimization should come first. Not sure how to proceed yet.


So there's that. It's just so you know, I guess.
"Object-oriented design is the roman numerals of computing." -- Rob Pike
loftar
 
Posts: 6741
Joined: Fri Apr 03, 2009 7:05 am

Re: Client performance rant

Postby azrid » Tue May 09, 2017 8:16 pm

If these changes improve performance greatly it will be the best update Haven has ever had.
azrid
 
Posts: 1637
Joined: Mon Oct 17, 2011 11:33 pm

Re: Client performance rant

Postby sMartins » Wed May 10, 2017 6:26 am

Idk if you read my post on the last patch announcement... in short, I discovered that using custom Nvidia profiles for Haven (such as antialiasing, anisotropic filtering, etc.) causes my GPU to step up from 20% to 100% full load and start to overheat, while losing a lot of FPS, 15-30...
sMartins wrote:Here is the proof of what I was talking about in my previous post:
as soon as I switch on 8x antialiasing in the Nvidia Control Panel, this happens when playing Hafen. Pretty weird to me... and probably related somehow to optimization issues or drivers, etc.
Image
Default settings, no filters:
Image
Log file:
Without antialiasing on the NVCP:
U Temperature [°C] , Fan Speed (%) [%] , Fan Speed (RPM) [RPM] , Memory Used [MB] , GPU Load [%] , Memory Controller Load [%] , Video Engine Load [%] , Bus Interface Load [%] , Power Consumption [% TDP] , PerfCap Reason [] , VDDC [V] ,
2017-04-20 19:54:08 ,              135.0   ,                162.0   ,               27.0   ,              34   ,                1516   ,            492   ,         27   ,                       22   ,                   0   ,                    1   ,                     8.4   ,              16   , 0.8620   ,
2017-04-20 19:54:09 ,              135.0   ,                162.0   ,               26.0   ,              34   ,                1518   ,            492   ,         27   ,                       22   ,                   0   ,                    1   ,                     8.5   ,              16   , 0.8620   ,
2017-04-20 19:54:10 ,              135.0   ,                162.0   ,               27.0   ,              34   ,                1514   ,            492   ,         25   ,                       20   ,                   0   ,                    0   ,                     8.4   ,              16   , 0.8620   ,
2017-04-20 19:54:11 ,              135.0   ,                162.0   ,               26.0   ,              34   ,                1513   ,            492   ,         23   ,                       20   ,                   0   ,                    1   ,                     8.4   ,              16   , 0.8620   ,
2017-04-20 19:54:12 ,             1113.4   ,               1752.8   ,               29.0   ,              34   ,                1505   ,            510   ,         36   ,                       18   ,                   0   ,                    1   ,                    33.1   ,              16   , 1.0180   ,
2017-04-20 19:54:13 ,             1113.4   ,               1752.8   ,               30.0   ,              34   ,                1502   ,            510   ,         36   ,                       18   ,                   0   ,                    1   ,                    32.4   ,              16   , 1.0180   ,
2017-04-20 19:54:14 ,             1113.4   ,               1752.8   ,               30.0   ,              34   ,                1496   ,            512   ,         37   ,                       18   ,                   0   ,                    1   ,                    32.8   ,              16   , 1.0180   ,
2017-04-20 19:54:15 ,             1113.4   ,               1752.8   ,               29.0   ,              34   ,                1495   ,            513   ,         18   ,                        4   ,                   0   ,                    1   ,                    26.1   ,              16   , 1.0180   ,
2017-04-20 19:54:16 ,             1113.4   ,               1752.8   ,               29.0   ,              34   ,                1497   ,            513   ,         21   ,                        5   ,                   0   ,                    1   ,                    26.1   ,              16   , 1.0180   ,
2017-04-20 19:54:17 ,              899.0   ,               1752.8   ,               29.0   ,              34   ,                1501   ,            515   ,         24   ,                        5   ,                   0   ,                    1   ,                    25.8   ,              16   , 1.0180   ,
2017-04-20 19:54:18 ,              899.0   ,               1752.8   ,               29.0   ,              34   ,                1500   ,            110   ,          0   ,                        1   ,                   0   ,                    0   ,                    22.7   ,              16   , 1.0180   ,

With antialiasing 8x:
U Temperature [°C] , Fan Speed (%) [%] , Fan Speed (RPM) [RPM] , Memory Used [MB] , GPU Load [%] , Memory Controller Load [%] , Video Engine Load [%] , Bus Interface Load [%] , Power Consumption [% TDP] , PerfCap Reason [] , VDDC [V] ,
2017-04-20 19:59:00 ,             1328.9   ,               1752.8   ,               69.0   ,              69   ,                2188   ,            789   ,         99   ,                       30   ,                   0   ,                    1   ,                    83.9   ,               4   , 1.2060   ,
2017-04-20 19:59:01 ,             1328.9   ,               1752.8   ,               69.0   ,              69   ,                2194   ,            789   ,         97   ,                       30   ,                   0   ,                    1   ,                    83.8   ,               4   , 1.2060   ,
2017-04-20 19:59:02 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2194   ,            789   ,         97   ,                       30   ,                   0   ,                    1   ,                    85.3   ,               4   , 1.2060   ,
2017-04-20 19:59:03 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2195   ,            789   ,        100   ,                       30   ,                   0   ,                    1   ,                    86.1   ,               4   , 1.2060   ,
2017-04-20 19:59:04 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2197   ,            789   ,        100   ,                       30   ,                   0   ,                    1   ,                    85.0   ,               4   , 1.2060   ,
2017-04-20 19:59:05 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2195   ,            789   ,         96   ,                       29   ,                   0   ,                    1   ,                    85.7   ,               4   , 1.2060   ,
2017-04-20 19:59:06 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2193   ,            789   ,         97   ,                       29   ,                   0   ,                    1   ,                    85.2   ,               4   , 1.2060   ,
2017-04-20 19:59:07 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2196   ,            789   ,         96   ,                       29   ,                   0   ,                    1   ,                    85.0   ,               4   , 1.2060   ,
2017-04-20 19:59:08 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2201   ,            791   ,        100   ,                       30   ,                   0   ,                    1   ,                    85.8   ,               4   , 1.2060   ,
2017-04-20 19:59:09 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2203   ,            791   ,         99   ,                       30   ,                   0   ,                    1   ,                    86.0   ,               4   , 1.2060   ,
2017-04-20 19:59:10 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2207   ,            791   ,         99   ,                       30   ,                   0   ,                    1   ,                    86.4   ,               4   , 1.2060   ,
2017-04-20 19:59:11 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2212   ,            791   ,         99   ,                       30   ,                   0   ,                    1   ,                    86.3   ,               4   , 1.2060   ,
2017-04-20 19:59:12 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2215   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.1   ,               4   , 1.2060   ,
2017-04-20 19:59:13 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2208   ,            791   ,         99   ,                       30   ,                   0   ,                    1   ,                    86.2   ,               4   , 1.2060   ,
2017-04-20 19:59:14 ,             1328.9   ,               1752.8   ,               70.0   ,              69   ,                2216   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.4   ,               4   , 1.2060   ,
2017-04-20 19:59:15 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2215   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.2   ,               4   , 1.2060   ,
2017-04-20 19:59:16 ,             1328.9   ,               1752.8   ,               70.0   ,              70   ,                2217   ,            791   ,        100   ,                       31   ,                   0   ,                    1   ,                    86.1   ,               4   , 1.2060   ,
2017-04-20 19:59:17 ,             1328.9   ,               1752.8   ,               70.0   ,              70   ,                2220   ,            791   ,         99   ,                       30   ,                   0   ,                    1   ,                    86.7   ,               4   , 1.2060   ,
2017-04-20 19:59:18 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2222   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.3   ,               4   , 1.2060   ,
2017-04-20 19:59:19 ,             1328.9   ,               1752.8   ,               70.0   ,              70   ,                2217   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.3   ,               4   , 1.2060   ,
2017-04-20 19:59:20 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2223   ,            791   ,         99   ,                       30   ,                   0   ,                    1   ,                    86.2   ,               4   , 1.2060   ,
2017-04-20 19:59:21 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2228   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.5   ,               4   , 1.2060   ,
2017-04-20 19:59:22 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2227   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.3   ,               4   , 1.2060   ,
2017-04-20 19:59:23 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2233   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.0   ,               4   , 1.2060   ,
2017-04-20 19:59:24 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2234   ,            791   ,        100   ,                       31   ,                   0   ,                    1   ,                    86.8   ,               4   , 1.2060   ,
2017-04-20 19:59:25 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2230   ,            791   ,        100   ,                       31   ,                   0   ,                    1   ,                    86.1   ,               4   , 1.2060   ,
2017-04-20 19:59:26 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2239   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.5   ,               4   , 1.2060   ,
2017-04-20 19:59:27 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2238   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.8   ,               4   , 1.2060   ,
2017-04-20 19:59:28 ,             1328.9   ,               1752.8   ,               71.0   ,              70   ,                2244   ,            791   ,         99   ,                       31   ,                   0   ,                    1   ,                    86.4   ,               4   , 1.2060   ,
2017-04-20 19:59:29 ,             1328.9   ,               1752.8   ,               70.0   ,              70   ,                2238   ,            792   ,         82   ,                       26   ,                   0   ,                    1   ,                    67.8   ,               4   , 1.2060   ,
2017-04-20 19:59:30 ,             1328.9   ,               1752.8   ,               68.0   ,              70   ,                2248   ,            792   ,         78   ,                       25   ,                   0   ,                    1   ,                    65.9   ,               4   , 1.2060   ,
2017-04-20 19:59:31 ,             1328.9   ,               1752.8   ,               68.0   ,              70   ,                2247   ,            792   ,         67   ,                       21   ,                   0   ,                    1   ,                    73.8   ,               4   , 1.2060   ,

I hope that could be helpful for you.
Out of curiosity I'd like to know what's the reason behind OpenGL instead of Direct3D....if there is a short answer...See you
Default Client, Best Client!
sMartins
 
Posts: 2056
Joined: Wed Nov 11, 2015 10:21 pm
Location: Italy

Re: Client performance rant

Postby Granger » Wed May 10, 2017 7:23 am

sMartins wrote:Out of curiosity I'd like to know what's the reason behind OpenGL instead of Direct3D....if there is a short answer...See you

Short answer: No 'windows only'.

UI performance (in terms of how it performs regarding usability): when rewriting, please add the ability to scale the UI to the user's liking, with an extra slider for the font size (in addition to the general size of the UI).
Please check Announcements, Game Guides, GLoSQ before posting questions, search is your friend.
Follow the Bug Report HOWTO to ease solving your issue and, in general, The Rules.
Granger
 
Posts: 7118
Joined: Mon Mar 22, 2010 2:00 pm

Re: Client performance rant

Postby svino » Wed May 10, 2017 12:05 pm

Very interesting to read what you think about performance and optimization, loftar. Thanks for sharing your thoughts.
Some of these changes sound like big rewrites, but I think that the game would be so much better with better performance.
svino
 
Posts: 207
Joined: Mon Jun 06, 2011 4:09 am

Re: Client performance rant

Postby shubla » Wed May 10, 2017 7:16 pm

Client should just be rewritten.
I mean, only things which are broken/bad.

So most of it.
I got my own HnH custom client check it out.
Purus Pasta
shubla
 
Posts: 7742
Joined: Sun Nov 03, 2013 11:26 am
Location: Finland

Re: Client performance rant

Postby loftar » Thu May 11, 2017 1:40 am

sMartins wrote:Idk if you read my post on the last patch announcement... in short, I discovered that using custom Nvidia profiles for Haven (such as antialiasing, anisotropic filtering, etc.) causes my GPU to step up from 20% to 100% full load and start to overheat, while losing a lot of FPS, 15-30...

That doesn't sound particularly strange at all to me, though. Higher MSAA levels are usually the way to kill even the most powerful GPU.
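As a back-of-the-envelope illustration of why higher MSAA levels are expensive, the sketch below estimates the raw size of a multisampled framebuffer. The resolution (1920x1080) and the formats (4 bytes of color and 4 bytes of depth/stencil per sample) are assumptions picked for the example, not values taken from the log above, and real drivers use compression and other tricks, so treat the numbers as a rough upper bound rather than a measurement.

```python
def framebuffer_bytes(width, height, samples, bytes_color=4, bytes_depth=4):
    """Raw storage for a framebuffer that keeps color and depth for every sample."""
    return width * height * samples * (bytes_color + bytes_depth)


# Assumed example: a 1920x1080 window, without MSAA and with 8x MSAA.
no_aa = framebuffer_bytes(1920, 1080, 1)
msaa8 = framebuffer_bytes(1920, 1080, 8)

print(f"no MSAA: {no_aa / 1e6:.1f} MB, 8x MSAA: {msaa8 / 1e6:.1f} MB")
```

Under these assumptions the buffer that has to be written, read and finally resolved each frame grows by a factor of eight, so a large jump in GPU load from a forced 8x setting is at least plausible.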

sMartins wrote:Out of curiosity I'd like to know what's the reason behind OpenGL instead of Direct3D....if there is a short answer...See you

Because then I wouldn't be able to play my own game? :)

Not to mention, just, why would anyone pick Direct3D over OpenGL? OpenGL has portability, Direct3D has no other advantages to compete with that. I don't really see why anyone would choose Direct3D these days.

shubla wrote:Client should just be rewritten.

I see little point in just writing the same thing all over again. ^^
loftar
 
Posts: 6741
Joined: Fri Apr 03, 2009 7:05 am

Re: Client performance rant

Postby sMartins » Thu May 11, 2017 3:21 am

loftar wrote:That doesn't sound particularly strange at all to me, though. Higher MSAA levels are usually the way to kill even the most powerful GPU.


Well... I guess some performance loss is normal when using filters, etc., but when I tested Haven, even with my char standing still and doing nothing, as soon as I opened the game (with the 8x filter) I heard my fan spinning like crazy; that's how I realized something was not totally right. Never heard my PC so loud before.
I mean, I can set custom Nvidia filters on heavier games, and while I can lose some FPS, etc., my GPU doesn't start to overheat like that, with a constant 100% load. It's the first time that happened to me, so I reported it.
Idk, maybe it's time to clean my PC :D (and I'll do it before the summer), but currently it's not happening to me with other games.
P.S. That screenshot you see is from less than 5 minutes in the game doing nothing.

loftar wrote:Because then I wouldn't be able to play my own game? :)

Not to mention, just, why would anyone pick Direct3D over OpenGL? OpenGL has portability, Direct3D has no other advantages to compete with that. I don't really see why anyone would choose Direct3D these days.


I see... I was just thinking of other games that I played. I remember some of them having the option for both OpenGL and Direct3D, and usually the performance was better with Direct3D... but it's probably only a matter of how hard the devs work on optimization for one or the other API.
sMartins
 
Posts: 2056
Joined: Wed Nov 11, 2015 10:21 pm
Location: Italy

Re: Client performance rant

Postby iamahh » Fri May 12, 2017 1:57 pm

this reminds me of creator of LOVE MMO
https://www.youtube.com/watch?v=f90R2taD1WQ

the technology seems awesome
iamahh
 
Posts: 963
Joined: Sat Dec 12, 2015 8:23 pm

Re: Client performance rant

Postby sMartins » Fri May 12, 2017 5:28 pm

iamahh wrote:this reminds me of creator of LOVE MMO
https://www.youtube.com/watch?v=f90R2taD1WQ

the technology seems awesome


WoW :o
sMartins
 
Posts: 2056
Joined: Wed Nov 11, 2015 10:21 pm
Location: Italy
