How to handle Discourse backups?

tanner · February 20, 2014, 12:00am

How should we handle Discourse backups? The two options we’re considering are to use only GitHub to store our files (without passwords, etc.), or to use GitHub plus AWS S3 or something similar. We’d be backing up just /var/www/discourse. I’d prefer the latter, because I’m not a fan of having a single place to store our backups. In addition, GitHub has no SLA of any sort, while S3 is designed for 99.999999999% data durability.

Thoughts?

yousef · February 20, 2014, 12:17am

Removing passwords would mean we can’t completely automate the backups, and
ideally the backups are compressed and encrypted.

tanner · February 20, 2014, 12:28am

Right, forgot about that. I personally think our best route would be something like this script, which can handle encryption and then sync to S3.
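
For illustration, here is a minimal sketch of what a compress, encrypt, and sync job along those lines could look like. It is not the script mentioned above. The bucket name, passphrase file, and function names are hypothetical, and it assumes `tar`, `gpg` (2.1 or later), and the AWS CLI are installed on the server.

```python
import subprocess
from pathlib import PurePosixPath


def build_backup_commands(source_dir, work_dir, bucket, passphrase_file, stamp):
    """Return the encrypted archive path and the commands for one backup run.

    All arguments are illustrative; nothing here is specific to Discourse.
    """
    source = PurePosixPath(source_dir)
    archive = PurePosixPath(work_dir) / f"{source.name}-{stamp}.tar.gz"
    encrypted = PurePosixPath(f"{archive}.gpg")
    commands = [
        # 1. Compress the directory into a gzip'd tarball.
        ["tar", "-czf", str(archive), "-C", str(source.parent), source.name],
        # 2. Encrypt it symmetrically; the passphrase is read from a file
        #    so the job can run unattended.
        ["gpg", "--batch", "--yes", "--pinentry-mode", "loopback",
         "--passphrase-file", str(passphrase_file),
         "--symmetric", "--cipher-algo", "AES256",
         "--output", str(encrypted), str(archive)],
        # 3. Copy the encrypted archive to the (hypothetical) S3 bucket.
        ["aws", "s3", "cp", str(encrypted), f"s3://{bucket}/{encrypted.name}"],
    ]
    return encrypted, commands


def run_backup(commands):
    """Run each command in order, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Building the command list separately from running it makes it easy to print the commands for a dry run before putting the job in cron.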

tad · February 20, 2014, 9:48am

I’m not sure that we need to be backing up anything that’s in a git repo.
If the servers need restoring, you do git pull. If for some crazy reason (which I’ve yet to find) GitHub suddenly crashes and loses all of our stuff, we do git remote add , git push.

You mention passwords. Do you mean the passwords for the DB?

tanner · February 20, 2014, 2:51pm

Before this turns into another argument that leads nowhere, maybe @mrz or
@wdowling can chime in with some advice?

logan · February 20, 2014, 3:52pm

I see where tad is coming from. We ideally shouldn’t be modifying any core files of Discourse, so a checkout of a previous version is generally enough to restore. But we should definitely be backing up our config files somewhere, as they are obviously not stored in Discourse’s official repo.

I guess I could play devil’s advocate and say, “Hey, we don’t control Discourse’s repository! What if it gets completely borked and we can’t restore a previous version?” In that case, to play it safe, we could back up our entire installation for redundancy. I’d like some more input on whether that would be a necessary thing.

tad · February 20, 2014, 4:00pm

Config files should be in a git repo, and we should have a fork of Discourse (with our configs).
Passwords should be stripped, and we can add them back in using something similar to what JP uses: an S3 bucket containing files with the passwords, and a shell script that downloads them and echoes them to where they are needed. We can execute the script using Puppet.
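
As a rough sketch of the "add them back in" step (not JP's actual setup): assume the committed config uses `$PLACEHOLDER` markers and the secrets file fetched from S3 holds `KEY=value` lines. Both formats, and the names below, are assumptions made for the example. Downloading the file from the bucket is left out.

```python
from string import Template


def parse_secrets(text):
    """Parse KEY=value lines, skipping blank lines and # comments."""
    secrets = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=".
        key, _, value = line.partition("=")
        secrets[key.strip()] = value.strip()
    return secrets


def render_config(template_text, secrets):
    """Fill $PLACEHOLDER markers in a password-free config template.

    substitute() raises KeyError when a placeholder has no secret, so a
    missing password fails loudly instead of producing a broken config.
    A literal "$" in the template must be written as "$$".
    """
    return Template(template_text).substitute(secrets)
```

The template stays in git with no passwords in it, and only the rendered file on the server contains the real values.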

I can understand that redundancy is important, but we could go forever on adding more redundancy. Think about it this way:
Our Git Repo = Mirror of web01 = Mirror of web02

Remember that every time we add another node, that is yet another mirror.

wdowling · February 20, 2014, 5:03pm

I like the idea of storing our configs in Git. Is there a way to take snapshots of the server images in AWS?

logan · February 20, 2014, 5:15pm

Yup. However, restoring from them requires creating a new instance, according to @yousef.

wdowling · February 20, 2014, 10:07pm

That’s fine and expected - pretty much the same as in HP.

tanner · February 20, 2014, 10:08pm

It’s also best if the server is shut down first, according to Amazon; otherwise the snapshot may not properly copy everything over.
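
A minimal sketch of that stop, snapshot, start sequence, assuming a boto3 EC2 client is passed in as `ec2` (for example `boto3.client("ec2")`). The function name and the instance and volume IDs are hypothetical, and nothing here is what we actually run.

```python
def snapshot_stopped_instance(ec2, instance_id, volume_id, description):
    """Stop the instance, snapshot one EBS volume, then start it again.

    `ec2` is assumed to be a boto3 EC2 client; the IDs are supplied by
    the caller.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # Block until the instance has fully stopped, so nothing is still
    # writing to the volume when the snapshot is taken.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    try:
        snapshot = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    finally:
        # Bring the server back up even if the snapshot request failed.
        ec2.start_instances(InstanceIds=[instance_id])
    return snapshot["SnapshotId"]
```

Restoring would still mean creating a new volume or instance from the returned snapshot ID, as noted above.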
