The Good, The Bad, and the League: 11/30 - 12/13
_Your semi-weekly dose of server problem-os, NA League news, and other!_
Clash was here, and there it went! The end-of-the-year patch is incoming, the pre-season is shaking out (for better or worse), and All-Stars is over. Also, here are some problems that happened…
* **Mac voice still not working properly** Bug grouped with other Mac bugs. More .plist fun (I personally know nothing about Macs, so it’s basically Greek to me).
**Server Stuff:**
* **Riot Direct router reboots(12/2, ~10 minutes)** Automated alerting notifies the NOC that a lot of game alerts are going off for no discernible reason. Several minutes later, the network recovers without outside intervention. Post-problem investigation finds that a router decided to reboot (not the TV show, which was cool back in the day) on its own.
* **Core service flatlines, causing problems(12/2, ~66 minutes)** Automated alerting notifies the NOC that a core service is reporting 0 processing power. The NOC escalates to the on-call LP as Compensation mode is enabled. Engineers restart the affected service and verify the correct processing power is allocated to it after the restart. Comp mode is then disabled as the service stabilizes.
* **All-Stars page not loading in LCU(12/4, ~42 minutes)** A Rioter reports to the NOC that the All-Stars tab within the client isn’t loading. NOC verifies they can’t open it either, and pings the proper team. Before the team gets knee-deep in coding a fix, the problem disappears and all affected parties are able to access the tab without issue.
* **Preferences not resetting(12/7, ~4 hours)** Automated alerting notifies the NOC that one core process isn’t completing a self-loop properly. Escalation and investigation show that one of two core machines is causing the overall problem. Engineers reboot both machines and watch to see if the self-loop finishes properly (which it eventually does).
* **Disabled Practice Tool(12/7, ~205 minutes)** NOC is notified by automated alerting that CPU usage across game servers is approaching warning limits. To ensure the limit isn’t exceeded, the NOC disables practice tool games briefly, alleviating the load slightly. Once the CPU usage falls back to normal levels, the NOC re-enables practice tool and removes all ticker updates.
* **Loot Outage(12/7, ~222 minutes)** Automated alerting notifies the NOC of a possible problem with Loot. Engineers confirm there are issues with Loot, and the NOC disables the service so it can be triaged. Investigation traces the problem to a failure to load balance properly. Future load balancing work is added to a post-mortem, and Loot is re-enabled. Engineers verify all pending Loot was delivered properly.
* **Loot Outage(12/8, ~3 hours)** See above
* **Loot Delay(12/13, ~5 minutes)** See above
* **Custom game player minimum set to 5(12/7, ~201 minutes)** NOC is notified by automated alerting that CPU usage across game servers is approaching warning limits. To ensure the limit isn’t exceeded, the NOC sets custom games to a minimum of 5 players, alleviating the load slightly. Once the CPU usage falls back to normal levels, the NOC restores the normal custom game minimum and removes all ticker updates.
* **CPU Mitigation Steps(12/8, ~2 hours)** Reports filter in that the CPU usage is above the ideal threshold. In response, the NOC disables the practice tool, sets minimum custom players to 10 per game, and starts throttling non-ranked queues. Once the CPU usage falls below the red line, the NOC returns all queues to normal.
* **CPU Mitigation Steps(12/8, ~2 hours)** See above
* **Club Tag and Chat not loading(11/27, ~7 hours)** Rioter reports filter into the NOC about chat/club tags not loading properly. NOC escalates to the appropriate Riot team while also setting up tickers across Riot Regions. A Rioter discovers that a certificate has expired, causing the problems. The expired certificate is updated, fixing the problem.
* **Match History is down(12/10, ~25 minutes)** The NOC is notified by Rioters that Match History isn’t populating properly. Investigation shows a config update caused the core service to fail to update. After flipping a specific switch, Match History starts working again.
* **Aphelios’ ability videos in LCU not working(12/11, ~22 hours)** Player Support notifies the NOC that various Aphelios ability videos are not working inside of the LCU. NOC starts to troubleshoot, eventually escalating to appropriate Riot teams. Investigation shows that because Aphelios has no E, there’s no E video, leading to problems with displaying the R video. Look at him go, confusing people in AND out of game. A fix is to be deployed with patch 10.1.
* **LCU Home tab failing to load(12/12, ~65 minutes)** An update pushed to the client propagates with incorrect info in a header. This incorrect info causes the Home tab within the LCU to break. Investigation and escalation lead to the discovery of the owner and code causing the problem. Reverting the update fixes the issue, and a future update will not include the incorrect code.
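
For the curious, the CPU mitigation entries above (disable practice tool, raise the custom game player minimum, throttle non-ranked queues) read like a tiered policy. Here is a minimal sketch of that idea in Python. Everything specific in it is an assumption for illustration: the threshold percentages, the default custom game minimum of 1, and the names `Mitigations` and `plan_mitigations` are all hypothetical, and this is not how the NOC tooling actually works.

```python
from dataclasses import dataclass

# Hypothetical thresholds (percent CPU across game servers).
# The post never states the real numbers.
WARNING_PCT = 80.0
RED_LINE_PCT = 90.0


@dataclass(frozen=True)
class Mitigations:
    """Switches the NOC is described as flipping during CPU pressure."""
    practice_tool_enabled: bool = True
    custom_game_min_players: int = 1  # assumed default, not from the post
    throttle_non_ranked: bool = False


def plan_mitigations(cpu_pct: float) -> Mitigations:
    """Pick a mitigation tier for a fleet-wide CPU usage percentage."""
    if cpu_pct >= RED_LINE_PCT:
        # Heaviest tier, as in the 12/8 entries: practice tool off,
        # custom games need 10 players, non-ranked queues throttled.
        return Mitigations(False, 10, True)
    if cpu_pct >= WARNING_PCT:
        # Lighter tier, as in the 12/7 entries: practice tool off,
        # custom games need 5 players.
        return Mitigations(False, 5, False)
    # Below the warning limit: everything back to normal.
    return Mitigations()
```

Calling `plan_mitigations(85.0)` under these made-up thresholds would pick the lighter tier, and anything under the warning limit returns the defaults, which mirrors the "return all queues to normal" step.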
**Game Stuff:**
* **Legend of the Poro King not granting Event points(12/12, ~9 hours)** NOC activates the Night & Dawn event pass and enables Legend of the Poro King. Reports filter in that the Legend of the Poro King mode is granting no Event points to Event pass owners. Investigation shows a single variable has been subbed out incorrectly within a config setting. A micropatch rolls out to fix the variable, so that playing Legend of the Poro King grants Event points.