Lagspike saga

Announcements about major changes in Haven & Hearth.

Re: Lagspike saga

Postby loftar » Sat Mar 02, 2019 5:28 am

DatSheep wrote:Did you do the AB BA switch mentioned?

No, I don't have physical access to the server, but I have to admit I don't really see the point even if I did have. What are you proposing would happen if they were physically switched between slots?
"Object-oriented design is the roman numerals of computing." -- Rob Pike
loftar
 
Posts: 8926
Joined: Fri Apr 03, 2009 7:05 am

Re: Lagspike saga

Postby loftar » Sat Mar 02, 2019 5:30 am

Actually, speaking of swapping drives, it does appear that what has happened now is that the drives have in fact switched in terms of which one is causing problems. I guess that rules out hardware issues. (Unfortunately, since that would have been an easy fix if it had been true.)
loftar
 
Posts: 8926
Joined: Fri Apr 03, 2009 7:05 am

Re: Lagspike saga

Postby Lord_of_War » Sat Mar 02, 2019 5:43 am

https://en.wikipedia.org/wiki/Muntzing If the problem drive switched try without RAID.
One death is a tragedy; one million is a statistic. -- Joseph Stalin #1 Forum troll You mad bro?
"My soul is terrifying" "Some men just want to watch the world burn"
http://www.youtube.com/watch?v=5f2M9t_tEhQ
Lord_of_War
 
Posts: 1668
Joined: Wed Jul 17, 2013 2:53 am
Location: A long time ago in a land far away

Re: Lagspike saga

Postby Ferinex » Sat Mar 02, 2019 5:55 am

Lord_of_War wrote:https://en.wikipedia.org/wiki/Muntzing If the problem drive switched try without RAID.


If data redundancy is considered a non-functional requirement, then 2 drives in RAID is already the minimum required for the system. I think you've shed light on a different problem though, which is that the prod environment doesn't closely resemble his test environment, so he is not able to do this sort of experimentation without causing downtime. Renting a test environment that matches the prod environment is, of course, one thing for which they probably need funding.
i guess they never miss huh
Ferinex
 
Posts: 1040
Joined: Sun May 31, 2009 9:05 am
Location: Miami

Re: Lagspike saga

Postby MagicManICT » Sat Mar 02, 2019 6:11 am

Lord_of_War wrote:https://en.wikipedia.org/wiki/Muntzing If the problem drive switched try without RAID.

I'm not a strong Linux guy (yet), but doesn't this require rebuilding the drive structure? It at least does in Windows and can take some time, and it's not likely doable with the system in a running state.
Opinions expressed in this statement are the authors alone and in no way reflect on the game development values of the actual developers.
MagicManICT
 
Posts: 18437
Joined: Tue Aug 17, 2010 1:47 am

Re: Lagspike saga

Postby TheNater » Sat Mar 02, 2019 6:23 am

loftar wrote:
DatSheep wrote:Did you do the AB BA switch mentioned?

No, I don't have physical access to the server, but I have to admit I don't really see the point even if I did have. What are you proposing would happen if they were physically switched between slots?


The AB-BA swap would help determine whether it was a software or hardware issue like you mentioned here:

"There are two SSD's.....One of these is acting strangely: Monitoring the I/O to it, it exhibits periods where the drive just seems to be "idle" ------- The question, then, is what causes this, and I'm currently trying to figure out if it's a hardware or software issue."

I mentioned this since it's an easy thing to check if the server is in front of you (i.e. physical access). Since it's in a datacenter and you're not present, not so much. It's just something I would try after the update if I was in front of the servers.
TheNater
 
Posts: 44
Joined: Mon Jan 24, 2011 12:10 am

Re: Lagspike saga

Postby loftar » Sat Mar 02, 2019 6:28 am

TheNater wrote:The AB-BA swap would help determine whether it was a software or hardware issue like you mentioned here:

I don't really see how it would determine that. It seems the only thing it might determine would be if there were something wrong with the physical PCIe connection, rather than the drive.
loftar
 
Posts: 8926
Joined: Fri Apr 03, 2009 7:05 am

Re: Lagspike saga

Postby TheNater » Sat Mar 02, 2019 6:30 am

Basically, if you swapped the drives and the problem follows the 'drive' then it's hardware. If it stays the same, it's going to be a software issue.
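
The swap reasoning above can be sketched as a small decision function. This is only an illustration, not anything run on the actual server: the slot names and the `interpret_swap` helper are hypothetical, and it folds in loftar's caveat that a fault which stays with the slot could be the physical connection just as well as software.

```python
from enum import Enum
from typing import Optional


class Suspect(Enum):
    DRIVE = "the drive itself"
    SLOT_OR_SOFTWARE = "the slot/connection or the software stack"
    INCONCLUSIVE = "inconclusive"


def interpret_swap(stalling_slot_before: str,
                   stalling_slot_after: Optional[str]) -> Suspect:
    """Interpret an AB-BA swap of two drives between two slots.

    stalling_slot_before: slot whose drive stalled before the swap.
    stalling_slot_after:  slot whose drive stalls after the swap,
                          or None if no stall was observed afterwards.
    """
    if stalling_slot_after is None:
        # The problem did not reappear, so the swap tells us nothing yet.
        return Suspect.INCONCLUSIVE
    if stalling_slot_after != stalling_slot_before:
        # The stall moved along with the physical drive.
        return Suspect.DRIVE
    # The stall stayed in the same slot even though the drive changed.
    return Suspect.SLOT_OR_SOFTWARE


# Hypothetical slot names, purely for illustration.
print(interpret_swap("slot0", "slot1").value)
print(interpret_swap("slot0", "slot0").value)
```

Note that the swap alone cannot separate a bad connection from a software issue, which is why both share one outcome here.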
TheNater
 
Posts: 44
Joined: Mon Jan 24, 2011 12:10 am

Re: Lagspike saga

Postby TheNater » Sat Mar 02, 2019 6:35 am

loftar wrote:
TheNater wrote:The AB-BA swap would help determine whether it was a software or hardware issue like you mentioned here:

I don't really see how it would determine that. It seems the only thing it might determine would be if there were something wrong with the physical PCIe connection, rather than the drive.


Sorry if I double-posted. You are correct, but I still think it would be beneficial information to have. It's just something I would do in my regular troubleshooting if I was right there in front of the equipment. It's just one of those "why not" things if you're in front of the server.

EDIT: Just posting in case people can't read - THE SERVER IS NOT IN FRONT OF JORB OR LOFTAR TO MAKE CHANGES.
TheNater
 
Posts: 44
Joined: Mon Jan 24, 2011 12:10 am

Re: Lagspike saga

Postby Lord_of_War » Sat Mar 02, 2019 6:38 am

There are diagnostic tests for drives done by placing an artificial load on them. However, if the issue is in the software stack, they won't find anything. Could redundancy be done asynchronously with COW?
Lord_of_War
 
Posts: 1668
Joined: Wed Jul 17, 2013 2:53 am
Location: A long time ago in a land far away
