
Last Epoch News

1.0 Launch Retrospective

Hello Travelers!

Today’s blog is a little different from our usual fare. As most of you know, Last Epoch launched on February 21, and the reception has been amazing. In the first week after launch, over 1.4 million of you logged in to play Last Epoch. At our peak, we had just under 265,000 players all roaming Eterra simultaneously. That’s good enough for the 39th highest all-time concurrent player count recorded on Steam, and we’re humbled by your support and enthusiasm for the game.

There’s plenty of cause for celebration, but let’s not ignore the obvious: Last Epoch’s launch was pretty rough for the majority of you who play online. Your patience and positivity have been amazing, but it was obviously not the launch experience that you or we had hoped for.

Now that the initial launch woes are behind us, it’s time to reflect on the experience. What happened? We put a strong emphasis on testing our servers and infrastructure ahead of time, so what did we get wrong? Last Epoch’s backend team is here to give you a bit of a recap of what went on during the launch.

[h2]How Our Game Works[/h2]

Let’s begin with a quick explanation of how our game works when played online. When you boot up Last Epoch and enter the game, what you see as a player is relatively straightforward; first you log in, then you select your character, and then you join a game server. Behind the scenes, though, what you’ve just done is communicate with and move through half a dozen online services. These services log you in, provide you with game data, and give you a server to connect to so you can play the game. You connect to this game server, but then the server itself goes out and talks to even more services to authenticate you, load your character data, and check things like your party membership.
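
To make that flow a little more concrete, here is a minimal sketch of the same sequence of calls, written in Go. Every service, function, and address here is a hypothetical placeholder for illustration only, not Last Epoch’s actual API.

[code]
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the data each backend service returns.
type Session struct{ Token string }
type Character struct{ Name string }
type GameServer struct{ Address string }

// Each function represents one round trip to a separate backend service.
func login(user string) (Session, error)             { return Session{Token: "t-" + user}, nil }
func fetchCharacters(s Session) ([]Character, error) { return []Character{{Name: "Traveler"}}, nil }
func matchServer(s Session, c Character) (GameServer, error) {
	return GameServer{Address: "game-eu-west:7777"}, nil
}

// The game server itself calls further services before letting you play:
// it authenticates your session, loads your character, and checks your party.
func serverSideChecks(s Session, c Character) error {
	if s.Token == "" {
		return errors.New("invalid session")
	}
	return nil
}

func main() {
	sess, _ := login("traveler42")
	chars, _ := fetchCharacters(sess)
	srv, _ := matchServer(sess, chars[0])
	if err := serverSideChecks(sess, chars[0]); err != nil {
		fmt.Println("connection rejected:", err)
		return
	}
	fmt.Println("connected to", srv.Address, "as", chars[0].Name)
}
[/code]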

These supporting services are the “backend” of our game, and without them our game doesn’t work. Some services are more important than others, but as a general rule, most services are required for the game to function. The good news is these services are all pretty resilient. Behind the scenes, each service is not one program but many copies of the same program. If one of the copies breaks down, the other copies keep working. Crucially, if a service is overloaded, we can fix that by just deploying more copies of the program. If our services are designed properly, we can handle any number of players; all we need to do is throw more copies at the problem.
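
As a rough illustration of why interchangeable copies scale, here is a minimal Go sketch (hypothetical, not our actual code): because the handler keeps no state of its own and reads everything from shared storage, a load balancer can send each request to whichever copy is least busy, so adding copies adds capacity.

[code]
package main

import (
	"fmt"
	"net/http"
)

// sharedStore stands in for a database or cache that every replica can reach;
// in a real deployment this would live outside the service process.
var sharedStore = map[string]string{"traveler42": "End of Time"}

func main() {
	// No replica-local state is read or written here, so any copy of this
	// service can answer any request.
	http.HandleFunc("/zone", func(w http.ResponseWriter, r *http.Request) {
		player := r.URL.Query().Get("player")
		zone, ok := sharedStore[player]
		if !ok {
			http.Error(w, "unknown player", http.StatusNotFound)
			return
		}
		fmt.Fprintln(w, zone)
	})
	http.ListenAndServe(":8080", nil)
}
[/code]

When a service can be written in this shape, scaling it becomes an operational knob rather than an engineering project.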

(This design applies to our game servers, too; Last Epoch has never once “run out” of game servers for players because they’re all interchangeable, and we can create as many new ones as we need. The closest we ever come to running out of servers is when we get a spike in players and need to wait a short amount of time for extra ones to boot up).

[h2]Preparing For Launch[/h2]

The “designing our services properly” bit is the hard part. For some of our services, it’s very challenging to design them in a way that actually lets us scale just by throwing more copies at the problem. If you were around for the launch of patch 0.9.0, our first multiplayer release, you saw our game go down as soon as we crossed 40,000 players. Why? The service that matches players to servers had a design flaw where it started slowing down when there were many servers available. Once about 40,000 players joined servers, the performance of the matcher got so bad that it didn’t matter how many copies we threw at the problem - every copy would crash under the strain.
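
To illustrate one common shape such a flaw takes (a hedged Go sketch, not the actual matcher code): if every match request scans the entire server fleet, the cost of each request grows with the fleet, and adding more copies of the matcher doesn’t help because every copy repeats the same full scan. Keeping a ready queue of servers with open slots keeps the per-request cost constant no matter how large the fleet gets.

[code]
package main

import "fmt"

type server struct {
	id      int
	players int
}

// Flawed shape: O(number of servers) of work for every single match request.
func matchByScan(fleet []server) *server {
	var best *server
	for i := range fleet {
		if fleet[i].players < 8 && (best == nil || fleet[i].players < best.players) {
			best = &fleet[i]
		}
	}
	return best
}

// Scalable shape: servers with free slots wait in a ready queue, so a match
// costs the same whether the fleet has ten servers or ten thousand.
type matcher struct{ ready []int } // IDs of servers with open slots

func (m *matcher) match() (int, bool) {
	if len(m.ready) == 0 {
		return 0, false
	}
	id := m.ready[0]
	m.ready = m.ready[1:]
	return id, true
}

func main() {
	fleet := []server{{id: 1, players: 8}, {id: 2, players: 3}, {id: 3, players: 0}}
	if s := matchByScan(fleet); s != nil {
		fmt.Println("scan picked server", s.id)
	}

	m := &matcher{ready: []int{3, 2}}
	if id, ok := m.match(); ok {
		fmt.Println("queue picked server", id)
	}
}
[/code]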

The backend team spent much of the time between 0.9.0 and 1.0.0 hunting down and fixing these sorts of design flaws. Some flaws could be fixed easily, others required entirely new services to handle our players’ data in a way that could actually scale. We were given a blank check to do whatever was needed to support our launch, so the only obstacle was time.

By the week before launch, we seemed to be in a reasonably comfortable spot. Everything was built, and we’d performed multiple rounds of load testing on our entire backend to ensure it could handle the volume of requests we expected at launch. The results were promising, and we hadn’t pulled any punches on the testing either. We were as ready as we were ever going to be for launch.
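
For context, a load test at its simplest is just many concurrent clients hammering an endpoint and counting failures. The sketch below is a toy version of that idea in Go; the URL and numbers are placeholders, and our real tests were considerably more involved.

[code]
package main

import (
	"fmt"
	"net/http"
	"sync"
	"time"
)

func main() {
	const workers = 100          // concurrent simulated players
	const requestsPerWorker = 50 // requests each simulated player sends

	var wg sync.WaitGroup
	var mu sync.Mutex
	failures := 0
	start := time.Now()

	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < requestsPerWorker; i++ {
				resp, err := http.Get("http://localhost:8080/zone?player=traveler42")
				if err != nil {
					mu.Lock()
					failures++
					mu.Unlock()
					continue
				}
				if resp.StatusCode != http.StatusOK {
					mu.Lock()
					failures++
					mu.Unlock()
				}
				resp.Body.Close()
			}
		}()
	}
	wg.Wait()

	fmt.Printf("%d requests, %d failures, %s elapsed\n",
		workers*requestsPerWorker, failures, time.Since(start))
}
[/code]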

[h2]The Morning Of Launch[/h2]

For something like a game launch, “readiness” is mostly about having a plan. You can have confidence that you’ve tested and prepared, you can have confidence that you know how your services work, but you can’t have confidence that nothing will go wrong. Instead, you plan for what to do if something unanticipated happens.

On the morning of launch day, we went to scale up our server matching service to the numbers we used in our tests, and to our great surprise, it refused to spin up more than half the copies we asked for. Server matching is a critical service, and in our testing, it needed a high number of copies to handle all the players we expected, so getting stuck at half capacity was a serious problem.

This wasn’t even the only pre-launch hiccup. In a case of unfortunate timing, our service host had an incident the night before - still ongoing at launch time - that affected us in a way that prevented us from deploying changes to any of our backend services using the tools we had relied on for months. Our ability to fix our services was killed at the same time one of our services needed fixing.

We had workarounds for these problems, but they were not quick fixes. We were going to need to break apart our deployment tools and move our services around manually, but this was not something we could sneak in before the doors opened to all our players. Minutes before launch, we estimated that we could handle maybe 120,000 - 150,000 players before things started to fail, and we crossed our fingers that we’d be able to resolve our issues before the player count crept too high.

Well, you know how that went.


[h2]The Next 5 Days[/h2]


What unfolded over the next five days was a blur of emergency fixes and risk management. As it so happened, our first two pre-launch problems were only the tip of the iceberg.

In software, you sometimes run into a problem called “cascading failure.” When different parts of a software system rely on each other, an error in one part can cause errors in all the other parts as well. This can make it look like the entire system is failing even though only one part actually is. Finding the root cause of the failure is very difficult when everything is failing all at once.

The server matcher problem had caused a cascading failure in our systems. When players fail to connect to a server, they usually just try again, meaning our struggling server matcher had to deal with 2-5x the number of requests it would normally need to deal with. In many cases, players would get through the first half of server matching but would fail the second half, meaning servers would bring themselves online and then shut down again because a player never connected. Servers booting up and shutting down put pressure on our other services, and so some of those also started to fail.
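
One common way to blunt that kind of retry amplification, shown here as a generic Go sketch rather than a description of what our client actually does, is exponential backoff with jitter: failing clients wait a randomized, growing amount of time before trying again, so they don’t all hit the struggling service in lockstep.

[code]
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// withBackoff retries do() up to attempts times, doubling the wait after each
// failure and sleeping a random fraction of it ("full jitter").
func withBackoff(attempts int, do func() error) error {
	delay := 500 * time.Millisecond
	for i := 0; i < attempts; i++ {
		if err := do(); err == nil {
			return nil
		}
		// Random sleep up to the current delay keeps thousands of failing
		// clients from retrying at the same instant.
		time.Sleep(time.Duration(rand.Int63n(int64(delay))))
		delay *= 2
	}
	return errors.New("gave up after retries")
}

func main() {
	calls := 0
	err := withBackoff(5, func() error {
		calls++
		if calls < 3 {
			return errors.New("matcher overloaded")
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
[/code]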

When we fixed the server matcher, some other services continued to fail because they had trouble recovering from the chaos. Our deployment tools still required attention, so fixing these other services was a slow and manual process. To clear up the backlog, we needed to scale some of our services way past what would have been needed had the game been working smoothly. This brought with it new challenges since cloud services have some built-in soft caps that we would never have hit under normal circumstances, and working around those caps took time and, in some cases, code changes. We identified and cleared away many of these caps before launch, but we hit new ones as we scrambled to rearrange our backend.

You may be wondering, if the problem was recovering from too many players, why did we not simply have some downtime, or at least turn on player queues, to alleviate the pressure? The answer is that we did, but the problems ran deeper than the server matcher. At various points during the launch, we brought down our services, and many of you found yourselves in long queues as we struggled to keep up. Inevitably, once we started letting you all back in, we would run into problems again, and we could not clearly see what those problems were until we scaled everything up so high that the services stayed online and operational even while strained by failures elsewhere.

Sprinkled in with our deployment woes, we had a couple of genuine code problems in our services. One of them - one of the few examples where we straight up overestimated our ability to scale - was a bottleneck in how quickly we could process requests for a single town in a single region of the world. In Last Epoch, once you reach the “End of Time” town, you will always load into that town when you enter the game from character select. We knew ahead of time that this bottleneck existed, but we underestimated what would happen when we suddenly fixed a broken game and hundreds of thousands of players all tried to access the same town at the same time, in the same part of the world. We thought that the server matcher would be a little slow at first, and then it would start to fix itself as more and more people got into the game. What happened instead was that it took so long to get into the game that almost no one actually succeeded, so they quit and tried again, over and over, and the problem never tapered off. This was not the kind of problem we could fix just by adding more copies of the service, so it took some emergency problem solving.
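
For a sense of the kind of mitigation a hotspot like that calls for, here is a hedged Go sketch of one general technique, per-town request batching; this is an illustration only, not the fix we actually shipped. All pending requests for the same town are drained and served together, so the expensive per-town work is paid once per batch instead of once per player.

[code]
package main

import (
	"fmt"
	"time"
)

type request struct {
	player string
	done   chan string // receives the assigned town instance
}

// townWorker drains everything queued for one town and serves it as a batch.
func townWorker(town string, queue <-chan request) {
	for {
		batch := []request{<-queue} // block until at least one request arrives
	drain:
		for {
			select {
			case r := <-queue:
				batch = append(batch, r)
			default:
				break drain
			}
		}
		// One expensive "prepare the town" step covers the whole batch.
		instance := fmt.Sprintf("%s-%d", town, time.Now().UnixNano())
		for _, r := range batch {
			r.done <- instance
		}
	}
}

func main() {
	queue := make(chan request, 1024)

	// Queue several players before the worker starts so the batching is visible.
	players := []string{"alice", "bob", "cara"}
	reqs := make([]request, 0, len(players))
	for _, p := range players {
		r := request{player: p, done: make(chan string, 1)}
		reqs = append(reqs, r)
		queue <- r
	}

	go townWorker("end-of-time", queue)
	for _, r := range reqs {
		fmt.Println(r.player, "->", <-r.done)
	}
}
[/code]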

Each time we found a problem and fixed it, we immediately saw improvement, but this allowed even more players to enter the game and play, which would uncover the next problem, and so on.

[h2]The Final Fix[/h2]

By Sunday, we had managed to fix, deploy, and scale our services to the point that most of our backend was handling over 200,000 players just fine, even through flurries of retries and errors. Yet amidst all the chaos, there was still some strange behavior happening on the game servers that was causing problems. During our periods of stability, when the game was up, players were able to connect to game servers, but their connections would often time out once they got there.

Every time a player joins a game server, the game server checks to see if the player is in a party. This is a simple operation, and in all of our testing, we saw that this check completed very quickly, even under heavy load. Yet our logs were telling us that checks for a player’s party were taking up to a minute, sometimes even longer.

Over the first four days, we made a number of changes to the party service to alleviate pressure. Each fix helped for a while, but inevitably, it always slowed down again until players could no longer join servers. On day five, with all the other backend problems solved, we were able to get a more precise look at the party errors, and the culprit was a single, innocent-looking line of code. A single line of code that was supposed to be the most efficient request in our entire party service but instead ended up consuming all of the service’s resources under heavy load, slowing the entire service to a crawl.

It took about an hour on Sunday afternoon to rearrange the party data so we could check over 200,000 players’ party memberships without bringing down the service. We deployed the fix, the game came back up, and it’s been online ever since.
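
As a loose illustration of why that kind of data rearrangement matters (a generic Go sketch, not our actual schema or query), answering “which party is this player in?” by scanning every party gets more expensive as the party list grows, while a player-to-party index answers the same question in constant time.

[code]
package main

import "fmt"

type party struct {
	id      int
	members []string
}

// Slow under load: every membership check walks every party.
func findPartyByScan(parties []party, player string) (int, bool) {
	for _, p := range parties {
		for _, m := range p.members {
			if m == player {
				return p.id, true
			}
		}
	}
	return 0, false
}

// Cheap at any scale: an index keyed by player, kept up to date on join/leave.
type partyIndex map[string]int

func (idx partyIndex) find(player string) (int, bool) {
	id, ok := idx[player]
	return id, ok
}

func main() {
	parties := []party{{1, []string{"alice", "bob"}}, {2, []string{"cara"}}}
	fmt.Println(findPartyByScan(parties, "cara"))

	idx := partyIndex{"alice": 1, "bob": 1, "cara": 2}
	fmt.Println(idx.find("cara"))
}
[/code]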

[h2]Lessons Learned[/h2]

This blog post is 2,000 words long, and there is still a whole lot more we could say. Internally, we have been cataloging and planning for ways we can improve, and we want to ensure that our processes moving forward include the lessons we have learned from the launch.

First, we learned the hard way that our internal tooling for deploying our services was not robust enough on launch day. Our tools were too brittle (breaking when certain services went down) and too inflexible (too many manual adjustments needed in an emergency). When the system came under strain, we couldn’t deploy our fixes quickly, and we usually had to cause additional downtime to do it. Had our deploy tooling been stronger, we could have gotten to a stable state much more quickly. Our top priority on the team right now is improving our tooling so we can effectively respond to situations like these.

Second, our services themselves could be more flexible. We had to make many changes over the course of the launch that should have been simple configuration changes but instead required a full redeploy, which turned simple fixes into long, risky operations. This weakness was identified ahead of time and has now become a top priority to improve.

Third, we need to do a better job of anticipating how player behavior affects our backend. Our testing was designed to simulate how and when our services would break, but we needed to spend more time considering how the conditions would continue to change once things started failing. Now that we’ve seen what happens during a fraught launch - how players put pressure on services differently than when everything is working - we’ll be able to incorporate that data into future tests.

(As an aside, our testing effort, in general, was a huge success. Despite how it may look from our launch struggles, our testing identified many other critical issues leading all the way up to the week before launch, and without those efforts, we might still be trying to fix the game to this day. Even though we’re post-launch, we plan to continue incorporating load testing into our regular development cadence going forward.)

[h2]Thank You[/h2]

With the launch behind us, we’re all very thankful to you players for showing so much passion for the game, despite the rocky start.

EHG started from a group of gamers hoping to make the ARPG they wanted to play, and now EHG is a group of gamers hoping to make the game YOU want to play. Your passion, enthusiasm and, when deserved, criticism have continued to encourage our teams to deliver that game and push the definition of what an ARPG can be, should be, and will be. Our team could not have made Last Epoch what it is today without you, and we will endeavor to keep making the game you all deserve.

Here's to a bright future, Travelers.

Last Epoch Patch 1.0.4 Notes

Changes


  • Players can no longer stun themselves. An example of this was using the Signet of Agony node in Bone Curse
  • Replaced the XP Tome sound effect in Echo of a World
  • Flagged more prophecy rewards as "rare/valuable" (which animates them to rotate and makes them slightly larger than other stars in the Constellation)
  • Increased the favor cost multiplier for Glyph of Despair Prophecies
  • You can no longer gain experience while in the grace period (the period of invulnerability after arriving in a new area)
  • Added missing name in Graveyard
  • Overhauled terrain, spawners, and shrines in the Hidden Oasis Echo Map to improve visuals and performance and fix hideable issues with palms
  • Overhauled visuals and added Scene Variants for Lightless Pits Echo Map


Bug Fixes


[h2]Skills & Passives[/h2]

  • Fixed a bug where Upheaval's "Master of the Totem" didn't buff Tempest Strike Totem damage and armor
  • Fixed a bug where Storm Crow's "Arborist" didn't buff Tempest Strike Totem with Flat Spell / Melee Lightning Damage
  • Fixed a bug where Swipe's "Avatar of Stone" didn't buff Tempest Strike Totem, Warcry Totem & Upheaval Totem with Flat Melee / Spell Damage
  • Fixed a bug where Storm Totem's "Fulgrite Core" wasn't providing Flat Spell Lightning damage based on character Shock Chance
  • Fixed a bug where Tempest Strike's "Heorot Arsenal" node was not providing stats to the Cold Projectile that Tempest Totem shoots when each Tempest has been removed in the skill tree.
  • Fixed a bug where Tempest Strike Totem, Warcry Totem, and Upheaval Totem were not benefiting from Spell Lightning Damage for Existing Totems (such as the Omen of Thunder Unique Item)
  • Fixed a bug where directly transitioning from channeling Warpath to channeling Rebuke while at negative mana would not start mana regeneration.
  • Fixed a bug where Fury Leap would play the landing animation in mid-air on long-range casts of Fury Leap
  • Fixed a bug where the Acolyte's Wraith's weapons would drift during animation transitions
  • Fixed a bug where Mirage hits on Puncture would not grant stacks of Bleeding Fury
  • Fixed an issue with Acid Flask's "Alchemist Gift" throwing animation where the trap would spin in place then teleport to its destination
  • Fixed a bug where Acid Flask's "Alchemist Gift" node did not have a throwing sound
  • Fixed a bug where Healing Hands would not get a Fire tag when taking the Searing Light or Skyfall nodes
  • Fixed a bug where the Spell Tag would be removed from Healing Hands with Seraph Blade when you also took Skyfall
  • Fixed a bug where Ballista's "Armed Construction" node was giving 1% increased radius per dexterity instead of its listed 1% increased area per dexterity


[h2]Visuals[/h2]

  • Fixed a bug where the sword in the Fallen Ronin set was deforming in a weird way on the Primalist
  • Fixed Terrain and floating vegetation in several scenes


[h2]UI[/h2]

  • Fixed a bug where some Game Guide pages couldn't be linked in chat
  • Several Localization updates


[h2]Enemies[/h2]
  • Fixed a bug where Void Despair would spawn in place for a brief second before starting its emerging animation
  • Fixed a bug where the Idol (big worm) could become stuck after knocking guards off their platform in Last Refuge Outskirts


[h2]Controller[/h2]

  • Fixed a bug where advancing dialog with a controller would skip pages if there were multiple.
  • Fixed a bug where menu options could still be selected with a controller while invisible/inactive on the Death Screen. This could close the respawn menu, preventing you from respawning.


[h2]Other[/h2]

  • Fixed a bug where your stats could become out of sync with the server. This could result in issues such as the client believing you had less movement speed than you did.
  • Fixed a bug where Prophecies could be re-rolled on login in multiplayer
  • Fixed a bug where you could not return to character select while in-game on an offline character
  • Fixed a bug where Vsync was not applied during Splash Screens (shown before the login screen)
  • Fixed a bug with pickup range reliability in Multiplayer
  • Fixed some instances of Gates blocking paths in the Preserved Sanctuary Echo map
  • Fixed a bug which caused the game to freeze after switching inputs on the Skills panel


Known Issues


  • When exiting a transform such as Werebear Form, your stat sheet may display your mana regeneration incorrectly. This is only a display issue.
  • Gambler’s Fallacy and Soul Gambler’s Fallacy state they do not work with channeling skills
    • This was wording only and functionality has remained unchanged.
    • This is a change we are intending for 1.1, and the description change got through early. To be transparent, we do intend to make this change, though we plan to go over Disintegrate at the same time, providing it with buffs to compensate so it stands better on its own instead of relying on this singular item interaction.

Last Epoch dev plans "two big changes" to policy after player survey

The most overpowered builds in Last Epoch can certainly be a lot of fun, and developer Eleventh Hour Games has thus far maintained a policy of allowing 'broken' builds to run through a full seasonal cycle. It's long been a point of discussion, with ARPG rivals Diablo 4 and Path of Exile also balancing their own policies on the issue. Following a community survey, however, the Last Epoch team is making a couple of "big changes" to how it approaches balancing for builds that are highly overperforming as a result of bugs.





Mid-Cycle Balance Survey Recap

Hello Travelers,

In between attempting to save the citizens of Eterra, and eradicating them, almost 70,000 of you have joined in with your voices to discuss mid-cycle balance changes. The turnout for this survey has been fantastic, and it comes on top of the many discussions here on the forums and on other platforms. We want to thank you all for the enthusiasm that has driven this topic forward.

Today, we want to share the results of the survey, as well as the decisions which have come as a result of the feedback.



To help with reading the results below:

  • Scale:
    • 1 - Strongly Disagree
    • 2 - Disagree
    • 3 - I have no opinion on this topic
    • 4 - Agree
    • 5 - Strongly Agree
  • When we first started the survey, the scale was in reverse. We received immediate feedback about this and swapped it before too many results came in. While the results below do not reflect this change, we made it very early into the survey, so it had minimal impact on the results


[h2]OVERPERFORMING BUGS[/h2]



With over 74% of all survey votes crushing the other options, and written feedback saying much the same, the community’s stance is fairly clear: we should be fixing bugs which cause skills or items to highly overperform, and we will be doing so. This is a change from our previous stance of not altering balance mid-cycle. Now, when a bug is the cause, we will push these fixes out in mid-cycle patches.

[h2]MILDLY OVERPERFORMING BUGS[/h2]



In the case of a bug resulting in a build, skill, or item mildly overperforming, the stance is much less clear. The written feedback makes it fairly clear why: what counts as “mildly” overperforming? It’s a vague categorization, so it’s left up to each individual’s interpretation.

We have decided in this case to use case-by-case discretion, based on the feedback we’re seeing in the community and on how large a power shift the bug causes. So we may or may not fix bugs which are ‘mildly’ overperforming mid-cycle, and we will discuss them as they arise.

[h2]OVERPERFORMING BALANCE[/h2]



On the other hand, if a build is overperforming but it’s not caused by a bug, the feedback has largely been weighted towards “do not change”. While not quite as one-sided as with bugs, this is still a fairly strong sentiment from the community, with over 57% voting not to make these changes. It also matches our existing stance of not taking too many steps to alter balance mid-cycle. As such, we’ll be avoiding balance changes which are not bug related, even if a build, skill, or item is highly overperforming.

[h2]MILDLY OVERPERFORMING BALANCE[/h2]



As one might expect, as non-bug-related overperformance becomes less impactful, the desire for changes becomes even lower. The responses were weighted quite heavily towards no changes. We agree with this stance and will not be making mid-cycle balance changes which are not bug related and which only result in a skill, build, or item mildly overperforming.

[h2]MID-CYCLE LEADERBOARD RESET[/h2]



In the event that we release a change or bug fix for something that was causing an item, skill, or build to overperform, the desire for a leaderboard reset has been quite mixed. We discussed this a fair amount and have made the following determination: we will not reset leaderboards in this instance; instead, we will add information to each entry to indicate when it occurred. The goal is to make the information available to identify entries which may have used a build that has since changed.

We decided against a mark or icon on the entry indicating it was an overperforming build, as we didn’t want these to appear as a “mark of shame”. We felt this was the best way to allow competitive players to continue competing on the leaderboards without taking away other players’ previous hard work on their builds, even if those builds were overperforming.

[h2]PARTIAL LEADERBOARD RESET[/h2]



While the above answer also addresses this question, for consistency we want to show the results of all of your votes here.

[h2]NOTIFICATIONS VIA PUBLIC POSTS[/h2]



To everyone’s surprise, it looks like almost everyone agrees: there is a very strong desire for notifications about upcoming balance-altering bug fixes or changes. As we’ve been showing this past week since feedback on the survey started coming in, we fully agree with this, and we will start trying to provide more of a heads-up when these changes are coming.

That said, we will still reserve the right to not provide information regarding an upcoming change if doing so would result in players rushing to take advantage of it and causing severe issues, or if we could release the fix almost as fast as the notification that the fix is coming.

So for these changes: if we feel we can release the information about the change in full, we’ll do so. Otherwise, we may try to be more vague (as with the Infernal Shade infinite damage bug) to limit the impact before the change lands, or we may withhold the information completely if the issue regularly crashes servers when it’s exploited, to minimize its impact until we can get it fixed.

[h2]CONCLUSION[/h2]

This round of discourse with the community has resulted in some great changes to our stances that we’re quite happy with. The two big changes are:

  • We will release fixes mid-cycle for bugs which result in an item, skill, or build highly overperforming
  • We will add leaderboard functionality displaying which specific patch or timespan an entry occurred during.


Once again, we’d like to thank everyone for your involvement in Last Epoch and for taking the time to make your voices heard. It’s with all of your feedback that we’ve made Last Epoch as great as it is, and it’s only by continuing to work with the community and listening to feedback that we’ll continue to improve.

Until next time, may RNG be with you Travelers!

Last Epoch Hotfix 1.0.3.2 Notes

Bug Fixes


  • Fixed a bug where you could enter a Quest Echo regardless of Stability or Quest completion state
  • Fixed a bug where a Lost Cache could be opened more than once in multiplayer
  • Fixed an issue where if multiple characters shared the same name, only the first character with that name was showing on the character selection page.
  • Fixed a bug where killing some Timeline Quest Bosses too quickly would prevent the Quest state from progressing, leaving the Timeline Quest unable to be completed
  • Fixed a bug where the benefit from Infernal Shade’s Purgatory could persist through multiple casts of Infernal Shade
  • Fixed an issue where the Delete prompt on the character select screen would not block clicks, allowing players to change the selected character being deleted.
  • Moved the online/offline switch to be behind the Delete prompt fade on the Character Selection screen to prevent changing characters while having a delete prompt up.
  • Fixed a bug where logging into specific characters would result in “LE-52 Lost Connection to Game Server” while other characters would connect without issue


[h2]Update - 07:47 PM CT[/h2]

  • Fixed a resulting bug which prevented players from skipping the first two Timeline Quests in Empowered Timelines