Accidents happen even to the best of us. A few weeks ago, before writing this post, I've accidently
DROPped a database on the production server. Certainly a very clumsy manipulation, but hey, clumsiness is part of the game, right ?
Anyway. I had to restore the database one way or another. For those who follow my RSS feed or read me since a while will certainly remind themselves that I use Barman (Article in French) for my database backups.
Getting the backup
Fortunately, the affected application was screaming in scandal from not being able to connect to the database, therefore I've been able to get the time of the incident.
To get a backup from the databse, nothing simplier or nearly:
barman recover \ --target-time "date" \ --remote-ssh-command "user@host" \ server \ backup \ /path/of/restauration
I think it goes without saying to make sure that sufficient disk space to accomodate the whole data folder of PostgreSQL and assign the right permissions on the folder which will contain your backup.
Configuring the temporary PostgreSQL server
What you get is the complete database server, configuration and other databases. However, there's of course a few things to tweak because PostgreSQL, since version 12, refuses to start with a
recovery.conf file, even empty.
Path variables like:
will have to be changed.
Commenting or changing the path of
external_pid_file and pointing
stats_temp_directory to an existing folder with the right permissions should do the trick. A folder created on the go will suffice.
Finally, because we don't have a snakeoil certificate to our disposal, at least I don't have, we can deactivate SSL.
Finally, we can start the PostgreSQL server on another port:
postgres -p port -D /path/to/data
Recovering the desired database
If everything goes well, PostgreSQL shouldn't display any errors and should replay the transaction log, then stop to the specified target time.
We'll be able to connect to this very temporary server to verify if the choosen database is present:
# Connect psql -p port -h host -U user postgres # List databases \l
If everything is here, perfect. If it's not the case, restart from square zero with a new copy. Moving back
recovery_target_time will cause an error because the database cluster is a bit too new.
Finally, we can recover the desired database with a bit of
pg_dump -p port -h host -U user database > base.sql
and restore it on the production server which has been still doing its thing:
sudo -u postgres psql < base.sql
Et voilà. Everything should happen without a hiccup. We can stop the temporary server and delete its data folder; we don't need it anymore.
The end word
A process that should have taken less time that it has taken on my side. Now that I've done once and documented it for the others and myself, I think the next "oopsie" should be fixed in less time. Should be, he said.
Hoping the next one will happen in a while.