My Factorio Megabase Melted the Server at 40 Hours. Here's How I Keep Co-op Worlds Alive Now.
There's a specific kind of pain that only factory game players understand. You've spent 40 hours building a Factorio base with three friends. The main bus is perfect. The train network is a work of art. You just launched your first rocket. And then the server starts to struggle. UPS drops from 60 to 35. Belts stutter. Inserters freeze mid-grab. Two minutes later, everyone gets disconnected, and the server goes silent.
That's what happened to our group on a Saturday afternoon. We'd been planning the session all week. Everyone blocked out four hours. And the server just gave up because the factory had grown past what the hardware could handle, and nobody was watching the warning signs.
Satisfactory and Factorio are the most CPU-intensive multiplayer games I've ever hosted. They aren't like FPS servers where 64 players mostly generate network traffic. These games simulate entire industrial systems, every tick, forever. Every belt, every inserter, every assembling machine, every logistics bot. The server does real physics and real math on thousands of entities simultaneously. When the factory grows, the server load grows with it. There's no cap except hardware limits.
After losing that Saturday session (and the save file getting corrupted because the crash happened mid-autosave), I decided to actually figure out how to monitor these servers properly. Not just "is the port open," but "is this thing about to melt down, and can I do something before game night?"
Why Factory Servers Are Different From Other Game Servers
I've hosted Minecraft, Valheim, Terraria, and a few other game servers over the years. Factory games are a completely different animal. Here's why:
The load grows over time. A fresh Factorio world uses barely any CPU. A 40-hour megabase can saturate the core that runs the simulation. The server that ran fine on day one can be completely inadequate by week three. This is the opposite of most game servers, where the load is roughly constant regardless of how long players have been playing.
UPS is everything. Factorio targets 60 updates per second (UPS). When the server can't keep up, the entire game slows down for everyone. It's not like lag in an FPS where you might rubber-band for a second. The whole world literally moves in slow motion. Belts crawl. Production halves. It's miserable to play on a server running at 30 UPS when you're used to 60.
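To put numbers on that slowdown, here's a tiny sketch. It assumes game speed scales linearly with UPS against the 60 UPS target, which matches the "production halves at 30 UPS" behavior described above. The function names are mine, purely for illustration.

```python
TARGET_UPS = 60  # Factorio's target update rate


def game_speed(ups: float) -> float:
    """Fraction of normal speed the world runs at for a given UPS."""
    return min(ups, TARGET_UPS) / TARGET_UPS


def in_game_minutes(real_minutes: float, ups: float) -> float:
    """Minutes of factory time simulated during a stretch of real time."""
    return real_minutes * game_speed(ups)


# A four-hour session at 30 UPS only simulates two hours of factory time.
print(in_game_minutes(240, 30))  # 120.0
```

That's the real cost of a struggling server: the group shows up for four hours and the factory only gets two.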
Save files are huge and fragile. A late-game Factorio save can be hundreds of megabytes. Satisfactory saves can be even larger. These files are written to disk periodically, and if the server crashes during a save operation, the file can get corrupted. That means losing everything since the last good backup. In our case, the last good backup was from the previous day, so we lost about 6 hours of four people's work.
Mods multiply the risk. Factorio's mod ecosystem is enormous. We were running about 15 mods on our server. Mods add entities, scripts, and overhead. They can conflict with each other after updates. One bad mod update and the server won't even start, or worse, it starts but corrupts the save.
Co-op sessions are scheduled. This is the part that makes downtime hit harder. Our group picks a time, usually Saturday afternoon, and everyone commits to it. Work schedules, family stuff, time zones (one of our players is in a different country). When the server dies during a scheduled session, you can't just say "we'll play tomorrow." The next available slot might be a week away. Every minute of downtime during game time is disproportionately expensive in terms of group frustration.
The Problems I've Actually Seen
Let me go through the specific issues I've run into hosting these games. Not theoretical stuff. Actual things that broke and cost us play time.
Memory exhaustion
This is the most common killer. Our Factorio server started on a VPS with 8 GB of RAM. It was fine for the first 20 hours. By hour 35, the server process was using 7.2 GB. By hour 40, the kernel's out-of-memory (OOM) killer stepped in and nuked the process. No graceful shutdown. No final save. Just dead.
The fix was upgrading to 16 GB, but the real fix was monitoring. If I'd been tracking memory trends, I would have seen the climb weeks before it hit the ceiling and upgraded proactively instead of reactively.
CPU bottleneck and UPS death
Factorio's main simulation loop is famously close to single-threaded. One fast core matters more than many slow cores. Our base hit a point where that one core couldn't keep up, and UPS dropped below 40. The server was technically "up" and accepting connections, but the game was unplayable. Players would join, see the slideshow, and leave.
Response time monitoring helped here. When the server starts lagging behind, its response to connection probes gets slower. A port check that normally returns in 5ms suddenly taking 200ms is an early warning that the server is under heavy load.
Mod update broke everything
One of our players updated a mod locally, which triggered a mod update on the server when they connected. The updated mod had a conflict with another mod in our pack. Server crashed on the next tick. It crashed on every restart attempt too, because the save file now referenced the new mod version.
The monitoring alert for the port going down fired within two minutes. Without it, we wouldn't have known until someone tried to connect, which could have been hours later (or worse, at the start of our scheduled session).
Hosting provider network blip
We were on a budget game hosting provider. One evening their network had issues for about 20 minutes. The server process was running fine, the save was fine, but no player could connect because the network was unreachable. Ping monitoring caught this immediately. Without it, we would have assumed the game server crashed and spent 20 minutes SSHing in to investigate a problem that wasn't on our end.
Save corruption after crash
I mentioned this earlier. Server crashed during an autosave. The save file was half-written and unreadable. Factorio couldn't load it. We had to roll back to the previous day's backup, which cost us about six hours of play. Six hours of four people's work, gone.
Now I run autosave backups every 30 minutes with a 24-hour retention window. And monitoring alerts tell me the moment the server goes down, so I can check whether the save is intact and trigger a manual backup of the good state before trying to restart.
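A minimal sketch of that backup routine, assuming the save is a zip archive (Factorio saves are; Satisfactory's are not, so you'd drop the integrity check there). The function names and file layout are hypothetical, and you'd run it every 30 minutes from cron or a systemd timer.

```python
import shutil
import time
import zipfile
from pathlib import Path
from typing import Optional

RETENTION_SECONDS = 24 * 60 * 60  # keep 24 hours of history


def save_is_intact(save_path: Path) -> bool:
    """True if the zip opens and every member passes its CRC check."""
    try:
        with zipfile.ZipFile(save_path) as archive:
            return archive.testzip() is None
    except (zipfile.BadZipFile, OSError):
        return False


def backup_save(save_path: Path, backup_dir: Path) -> Optional[Path]:
    """Copy the save under a timestamped name, then prune old copies.

    Returns the new backup path, or None if the save failed the check.
    """
    if not save_is_intact(save_path):
        return None  # never overwrite good history with a broken file
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{save_path.stem}-{stamp}{save_path.suffix}"
    shutil.copy(save_path, target)  # plain copy, so the backup's mtime is the copy time

    cutoff = time.time() - RETENTION_SECONDS
    for old in backup_dir.glob(f"{save_path.stem}-*{save_path.suffix}"):
        if old.stat().st_mtime < cutoff:
            old.unlink()
    return target
```

The integrity check is there because a backup job can fire while the server is mid-autosave. Skipping that run is better than archiving a half-written file.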
The "server is up but nobody can find it" problem
Satisfactory has a server browser, and Factorio has its own server list. Both use a separate query mechanism (often a separate port) from the actual game connection. Our Satisfactory server's game port was fine, but the query port (15777) had stopped responding. Players looking at the server browser couldn't see our server. Direct connect still worked, but nobody thought to try it. They just assumed the server was down.
A port monitor on the query port would have caught this instantly. This is a case where monitoring just the game port isn't enough. You need to monitor every port that matters for the player experience.
What to Monitor and How
Here's the monitoring setup I use now for our factory game servers, using UptyBots:
Game port
- Satisfactory: UDP 7777 for game traffic
- Factorio: UDP 34197 (default)
- Check interval: every 2-3 minutes
- This is your primary "is the server alive" check
Query port
- Satisfactory: 15777 for server browser queries
- Factorio: uses the same port for game and queries by default
- If players can't see your server in the browser, this is usually why
RCON / admin port
- If you use remote administration, monitor it separately
- Losing admin access during an emergency is the worst possible timing
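If you want to spot-check an admin port yourself, a plain TCP connect is enough, since RCON runs over TCP. This sketch times the connect and returns the latency. The function name is mine and the host and port in the comment are placeholders. It won't work for the UDP game ports above: UDP has no handshake, so those need a game-aware query rather than a bare connect.

```python
import socket
import time
from typing import Optional


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> Optional[float]:
    """Return the TCP connect time in milliseconds, or None if unreachable."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000
    except OSError:  # refused, timed out, unreachable, DNS failure
        return None


# Example (placeholder address and port):
# latency = tcp_probe("203.0.113.10", 27015)
# print("down" if latency is None else f"up in {latency:.1f} ms")
```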
Ping (ICMP)
- Baseline network reachability check
- Catches hosting provider outages and network-level issues
- If ping dies, the problem is below the game server level
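Put together, the ping, game port, and query port checks tell you which layer to look at first. Here's a sketch of that triage logic. The labels are hypothetical, and the "icmp-filtered" case is my own addition for hosts that drop ping while the game still works.

```python
def diagnose(ping_ok: bool, game_port_ok: bool, query_port_ok: bool) -> str:
    """Map three check results to the layer that most likely failed."""
    if not ping_ok and not game_port_ok:
        return "network"        # host unreachable: check the hosting provider first
    if not game_port_ok:
        return "game-server"    # host is up, process is not answering: check logs
    if not query_port_ok:
        return "query-port"     # playable via direct connect, invisible in the browser
    if not ping_ok:
        return "icmp-filtered"  # everything works, ping is just being dropped
    return "healthy"


print(diagnose(ping_ok=True, game_port_ok=True, query_port_ok=False))  # query-port
```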
Response time trends
- Track how fast the server responds to connection probes over time
- Rising response times often predict crashes and UPS drops
- Set alerts for response times well above your normal baseline, such as a check that usually returns in 5ms starting to take 200ms
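One simple way to define "above baseline" is to compare each new reading against the median of recent ones. This is a sketch of that idea, not how any particular monitoring tool does it. The factor of 3 and the 10-sample minimum are arbitrary starting points I picked for illustration.

```python
from statistics import median


def is_slow(history_ms, latest_ms, factor=3.0, min_samples=10):
    """Flag latest_ms when it exceeds factor times the median of recent history."""
    if len(history_ms) < min_samples:
        return False  # not enough data to trust a baseline yet
    return latest_ms > factor * median(history_ms)


recent = [5, 6, 5, 4, 5, 7, 5, 6, 5, 5]  # made-up readings in milliseconds
print(is_slow(recent, 200))  # True
print(is_slow(recent, 9))    # False
```

The median keeps one odd spike in the history from dragging the baseline up.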
Multi-Region Testing Matters
If your group has players in different countries (ours does), multi-region monitoring is worth setting up. UptyBots checks from multiple geographic locations, which means you'll catch regional issues that only affect some of your players.
We had a case where our server was perfectly reachable from within the US but our European player couldn't connect for two hours due to a routing issue. Without multi-region monitoring, we'd have spent those two hours troubleshooting his local network when the problem was actually upstream.
Notifications: Getting the Alert Where It Matters
For game servers, Discord is usually where your community lives. UptyBots supports webhook alerts, which means you can pipe notifications directly into a Discord channel. When the server goes down, everyone who needs to know finds out immediately, right where they're already chatting.
I also set up Telegram notifications for myself as the admin. That way I get a ping on my phone even if I'm not looking at Discord. Email is my third channel, mainly as a paper trail.
The key is having at least two channels. If Discord's webhook is delayed (it happens), Telegram catches it. If my phone is on silent, the email is there when I check later.
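The monitoring service sends these alerts for you, but it helps to know what a Discord webhook message looks like in case you want your own scripts (a backup job, say) to post to the same channel. This is a sketch under that assumption. The function names and user agent string are hypothetical, and the webhook URL is whatever Discord gives you when you create the webhook.

```python
import json
import urllib.request


def build_alert(server: str, status: str, detail: str = "") -> dict:
    """Build the JSON body for a Discord webhook message."""
    text = f"[{status.upper()}] {server}"
    if detail:
        text += f": {detail}"
    return {"content": text[:2000]}  # Discord caps message content at 2000 characters


def send_alert(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        # Explicit user agent, in case the default urllib one gets rejected.
        headers={"Content-Type": "application/json", "User-Agent": "factory-alerts/0.1"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status  # Discord normally answers 204 No Content


print(build_alert("factorio-main", "down", "game port not responding"))
```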
Game-Specific Tips
Satisfactory
- Default game port: UDP 7777, query port: 15777
- Save files grow large with elaborate builds. Budget storage accordingly.
- Coffee Stain pushes updates that sometimes break server compatibility. After any game update, check that the server comes back online.
- Player count has a bigger impact on Satisfactory server performance than factory size, in my experience. Four players is already pushing it on modest hardware.
- The modding scene is growing but still smaller than Factorio's. Fewer mods means fewer compatibility headaches.
Factorio
- Default port: UDP 34197
- UPS (updates per second) is the single most important metric. Anything below 60 means the game is running in slow motion.
- The headless server is mature and reliable when configured properly. Most crashes I've seen are mod-related or resource-related, not bugs in the base game.
- The mod ecosystem is massive. Pin your mod versions. Don't let auto-updates happen on a production server.
- Large saves take noticeable time to write. Autosave can cause brief stutters that players notice but aren't actually crashes.
- The netcode is sensitive to latency and packet loss. Players on bad connections will desync frequently.
- Factorio uses a lockstep simulation model, which means all clients must stay in sync. If one player has a bad connection, it can affect everyone.
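On pinning mod versions: one low-tech way to catch drift is to compare what's in the server's mods folder against a list you maintain by hand. The sketch below assumes mods are stored as zip files named like name_1.2.3.zip, which is how my server's mods directory looks. The function names and the mod names in the example are made up.

```python
from pathlib import Path


def installed_mods(mods_dir: Path) -> dict:
    """Map mod name -> version from files named like name_1.2.3.zip."""
    found = {}
    for archive in mods_dir.glob("*.zip"):
        # Mod names can contain underscores, so split on the last one only.
        name, _, version = archive.stem.rpartition("_")
        if name:
            found[name] = version
    return found


def version_drift(pinned: dict, installed: dict) -> dict:
    """Return {mod: (pinned_version, installed_version)} for every mismatch."""
    drift = {}
    for mod in sorted(set(pinned) | set(installed)):
        want, have = pinned.get(mod), installed.get(mod)
        if want != have:
            drift[mod] = (want, have)
    return drift


# Example with hypothetical mods:
pinned = {"example_belts": "1.1.0", "example-trains": "0.3.1"}
installed = {"example_belts": "1.2.0", "example-trains": "0.3.1"}
print(version_drift(pinned, installed))  # {'example_belts': ('1.1.0', '1.2.0')}
```

A missing or unexpected mod shows up with None on one side of the pair, so the same check covers both cases.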
Best Practices I've Landed On
After a year of running factory servers and dealing with every problem on this list at least once, here's what I'd tell anyone starting out:
- Monitor the game port and the query port separately. They can fail independently, and both matter.
- Set up automated backups every 30 minutes. Keep at least 24 hours of history. Test restoration by actually loading a backup save periodically. Don't just assume it works.
- Don't update mods on the main server without testing first. Run a staging instance, load the save, apply the mod update, see what happens. Only then push to the server everyone plays on.
- Schedule daily server restarts. Factory servers accumulate memory bloat over long running periods. A clean restart clears leaked memory and resets state. I restart ours at 5 AM when nobody is playing.
- Overspec your RAM. If you think you need 8 GB, get 16. Factory game memory usage only goes up. Underspeccing RAM is the single most common cause of crashes I've dealt with.
- Watch response time trends over weeks, not hours. A factory server that's fine today might be struggling in three weeks as the base grows. Trending helps you upgrade before you crash.
- Communicate with your group. When something breaks, post in Discord immediately. Players are way more forgiving when the admin is transparent about what happened and what's being done.
- Document your recovery process. Write down exactly what to do when the server crashes, including how to restore from backup, how to roll back a mod update, and who has SSH access. When it's 11 PM and the server is dead, you don't want to be figuring this out from scratch.
- Use alerts wisely. Don't alert on every single check failure. Set a threshold of 2-3 consecutive failures to avoid false alarms from network blips. But for scheduled game nights, consider lowering the threshold or shortening the check interval so you hear about problems sooner.
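The consecutive-failure rule from that last point is easy to picture as a small state machine. This is a sketch of the idea, not a description of how any monitoring service implements it. The class name and return values are hypothetical.

```python
class FailureGate:
    """Raise an alert only after `threshold` consecutive failed checks."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0
        self.alerting = False

    def record(self, check_ok: bool):
        """Feed one check result; return "down", "recovered", or None."""
        if check_ok:
            self.failures = 0
            if self.alerting:
                self.alerting = False
                return "recovered"
            return None
        self.failures += 1
        if self.failures >= self.threshold and not self.alerting:
            self.alerting = True
            return "down"
        return None


gate = FailureGate(threshold=3)
checks = [False, True, False, False, False, False, True]
print([gate.record(ok) for ok in checks])
# [None, None, None, None, 'down', None, 'recovered']
```

The single failure at the start is a blip and gets swallowed. Only the run of three in a row fires, and it fires once, not on every failed check after that.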
Protecting Scheduled Sessions
This is the thing that matters most to my group, and probably to yours too. We play on Saturday afternoons. That's the window. If the server is dead during that window, we lose a week.
Here's what I do to protect our sessions:
- Pre-session health check. About 30 minutes before game time, I glance at the monitoring dashboard. Is the port up? Are response times normal? Any alerts in the last 24 hours?
- Fresh restart before the session. I restart the server 15 minutes before start time to clear any accumulated memory bloat and ensure a clean state.
- Manual backup right before starting. If the server does crash during the session, the most I'll lose is the time since that manual backup.
- Monitoring at tighter intervals during game time. I bump check frequency up during our play window so I get alerted faster if something goes wrong.
- A fallback plan. If the main server is down and can't be recovered quickly, I have a backup save and a procedure to spin up a temporary instance. We've used it once. It's not fun, but it's better than canceling game night.
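The "tighter intervals during game time" step can be as simple as picking the check interval from the clock. In this sketch the play window (Saturday, 14:30 to 19:30, so it covers the pre-session check) and the 60-second tight interval are placeholders for whatever your group uses. The 180-second default matches the 2-3 minute interval I use the rest of the week.

```python
from datetime import datetime, time

PLAY_DAY = 5  # Monday is 0, so 5 is Saturday
PLAY_START, PLAY_END = time(14, 30), time(19, 30)  # placeholder window, server local time


def check_interval_seconds(now: datetime) -> int:
    """Return a tight interval inside the play window, a relaxed one outside it."""
    in_window = now.weekday() == PLAY_DAY and PLAY_START <= now.time() < PLAY_END
    return 60 if in_window else 180


print(check_interval_seconds(datetime(2024, 6, 1, 15, 0)))  # Saturday afternoon: 60
print(check_interval_seconds(datetime(2024, 6, 3, 15, 0)))  # Monday afternoon: 180
```

If your players span time zones, decide up front which clock the window is defined in.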
Real Scenarios That Monitoring Saved
- Saturday morning crash. Server OOMed at 9 AM, six hours before our session. Monitoring alerted me immediately. I upgraded RAM, restored the latest backup, and had the server running by noon. Without monitoring, I wouldn't have known until 3 PM when everyone tried to connect.
- Mod conflict after Factorio update. Factorio pushed a minor update. One mod in our pack wasn't compatible. Server crashed on restart. Port monitoring caught it within 2 minutes. I pinned the game version, rolled back the mod, and we were back online in 15 minutes.
- Silent query port failure on Satisfactory. The game port was fine, but the query port died. Players couldn't find the server in the browser. Port monitoring on 15777 caught it. I restarted the server and the query port came back. Without the separate monitor, we'd have been debugging this during game time.
- Hosting provider maintenance. Our hosting provider did unscheduled maintenance at 2 AM. Ping monitoring detected the outage. When I woke up, I saw the alert, checked the timeline, and confirmed the server came back on its own. No action needed, but I knew exactly what happened.
Frequently Asked Questions
What is the default Satisfactory server port?
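Satisfactory uses UDP 7777 for game traffic by default, and 15777 was the query port on the builds I hosted. Releases from 1.0 onward consolidated onto port 7777 (both UDP and TCP), so check the dedicated server documentation for your version. Custom hosts may use different ports.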
What is the default Factorio server port?
Factorio uses UDP 34197 by default. Custom hosts and dedicated server configurations may use other ports.
How often should I back up the save file?
Every 30 minutes minimum for active servers. Keep at least 24 hours of backup history so you can roll back through multiple saves if the most recent ones are corrupted. Test your restoration process regularly.
How much RAM does a factory server need?
For Factorio with mods and a growing factory, plan for 8-16 GB minimum. Vanilla servers can get by with less early on, but the usage grows over time. Satisfactory dedicated servers also want 8-16 GB. I always recommend buying more than you think you need, because running out of memory mid-session is the worst way to find out you were wrong.
Can UptyBots monitor a self-hosted home server?
Yes, as long as your server is reachable from the public internet with proper port forwarding. UptyBots connects to your server's public IP and port to verify availability. If you're behind CGNAT or don't have a public IP, you'll need to use a tunneling service first.
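A quick way to tell whether you're behind CGNAT is to look at the WAN address your router reports. If it falls in 100.64.0.0/10 (the shared address space reserved for carrier-grade NAT), port forwarding alone won't make the server reachable. Here's a small sketch of that check. The function name and labels are mine.

```python
import ipaddress

CGNAT_RANGE = ipaddress.ip_network("100.64.0.0/10")  # shared address space, RFC 6598


def classify_wan_address(address: str) -> str:
    """Classify the WAN address a router reports as cgnat, private, or public."""
    ip = ipaddress.ip_address(address)
    if ip in CGNAT_RANGE:
        return "cgnat"    # the ISP is sharing one public IP among many customers
    if ip.is_private:
        return "private"  # likely double NAT behind another router
    return "public"       # port forwarding should work


print(classify_wan_address("100.72.1.5"))   # cgnat
print(classify_wan_address("192.168.1.1"))  # private
print(classify_wan_address("8.8.8.8"))      # public
```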
Wrap Up
Satisfactory and Factorio servers aren't like other game servers. The CPU demands grow with the factory. The save files are large and fragile. The sessions are scheduled and hard to reschedule. A crash doesn't just mean "restart and reconnect." It can mean corrupted saves, lost progress, and a group of friends who blocked out four hours of their weekend for nothing.
Monitoring won't prevent every crash. But it catches problems early, gives you time to respond before game night, and protects the hours your group has invested in your shared worlds. UptyBots handles port monitoring, ping, response time tracking, and multi-channel alerts, which covers everything I need to keep our factory servers healthy.
Your megabase deserves better than dying quietly at 3 AM with nobody watching.