Restoring a database from Barman

Accidents happen even to the best of us. A few weeks ago, before writing this post, I've accidently DROPped a database on the production server. Certainly a very clumsy manipulation, but hey, clumsiness is part of the game, right ?

Anyway. I had to restore the database one way or another. For those who follow my RSS feed or read me since a while will certainly remind themselves that I use Barman (Article in French) for my database backups.

Getting the backup

Fortunately, the affected application was screaming in scandal from not being able to connect to the database, therefore I've been able to get the time of the incident.

To get a backup from the databse, nothing simplier or nearly:

barman recover \
    --target-time "date" \
    --remote-ssh-command "user@host" \
    server \
    backup \
    /path/of/restauration

I think it goes without saying to make sure that sufficient disk space to accomodate the whole data folder of PostgreSQL and assign the right permissions on the folder which will contain your backup.

Configuring the temporary PostgreSQL server

What you get is the complete database server, configuration and other databases. However, there's of course a few things to tweak because PostgreSQL, since version 12, refuses to start with a recovery.conf file, even empty.

Path variables like:

data_directory
hba_file;
ident_file

will have to be changed.

Commenting or changing the path of external_pid_file and pointing stats_temp_directory to an existing folder with the right permissions should do the trick. A folder created on the go will suffice.

Finally, because we don't have a snakeoil certificate to our disposal, at least I don't have, we can deactivate SSL.

{% warning(title="Mind the bikes") %} This installation is temporary. Deactivating the SSL function in this environment will not be an issue. However, don't do it in a production environment !

I think this warning is implicit but I prefer pointing things out how they should be. {% end %}

Finally, we can start the PostgreSQL server on another port:

postgres -p port -D /path/to/data

Recovering the desired database

If everything goes well, PostgreSQL shouldn't display any errors and should replay the transaction log, then stop to the specified target time.

We'll be able to connect to this very temporary server to verify if the choosen database is present:

# Connect
psql -p port -h host -U user postgres

# List databases
\l

If everything is here, perfect. If it's not the case, restart from square zero with a new copy. Moving back recovery_target_time will cause an error because the database cluster is a bit too new.

Finally, we can recover the desired database with a bit of pg_dump

pg_dump -p port -h host -U user database > base.sql

and restore it on the production server which has been still doing its thing:

sudo -u postgres psql < base.sql

Et voilà. Everything should happen without a hiccup. We can stop the temporary server and delete its data folder; we don't need it anymore.

The end word

A process that should have taken less time that it has taken on my side. Now that I've done once and documented it for the others and myself, I think the next "oopsie" should be fixed in less time. Should be, he said.

Hoping the next one will happen in a while.