When time is critical, exclude large files from your Postgres restore and save the world at the same time.
Imagine you’re a super secret spy. You’ve infiltrated the enemy’s datacenter and you need to dump their database to save the world from being destroyed by the evil villain’s satellite death rays. While fighting off ninjas, robot assassins, and a very fat Scottish overlord, you manage to dump the entire database to a secure thumbdrive. Then you make your escape in your pedal helicopter.
To save the world you need to restore the dumped file to your own database and edit the passcodes to the death ray satellites. Time is critical, you only have 10 minutes before all the major cities of the world are destroyed. You now have two options.
Restore the entire database to your Postgres server with the command
pg_restore -d postgres://localhost/postgres evil-plans-database.dump Restoring Database ############............50% complete
SORRY, You failed, all the major cities of the world have been destroyed by the evil villain’s death ray. 💀
While dumping the database and fighting off the ninjas, robot assassins, and the fat Scottish overload, you noticed that there are some really big tables that only hold log information
event_log. These tables are huge, yet all the other tables are relatively small Look out! DUCK!! 💥PUNCH💥 💥KICK💥 . That was close, you were almost hit by a flying ninja banana 🍌.
I know what you’re thinking “Postgres restore doesn’t have an exclude table option, only pg_dump has the
This is true, but
pg_restore does has an option for listing the tables in a dump file
-l --list List the table of contents of the archive. The output of this operation can be used as input to the -L option. -L list-file --use-list=list-file Restore only those archive elements that are listed in list-file, and restore them in the order they appear in the file. list-file is normally created by editing the output of a previous -l operation. Lines can be moved or removed, and can also be commented out by placing a semicolon (;) at the start of the line.
It’s saying we can print out the tables that are in the database dump file, edit them (or comment them out with a
;), and use that to tell pg_restore what tables to restore.
Using the power of
grep, we can remove the lines that mention the tables we want to exclude. Our new restore command is a bit longer, but let’s give it a try.
pg_restore -O -x -L <(pg_restore -l ./evil-plans-database.dump | grep -ivE 'TABLE DATA public (event_log|search_log)') -d postgres://localhost/postgres ./evil-plans-database.dump Restoring Database ########################100% complete 🎉
You Just Saved The World From Certain Destruction!
In this new command we are doing the following
pg_restore -l ./evil-plans-database.dump: Dumping the tables that are in the db dump file
grep -ivE 'TABLE DATA public (event_log|search_log): Piping the list into grep and excluding lines that match our table names
pg_restore -O -x -L <(PREVIOUS_COMMAND) -d postgres://localhost/postgres ./evil-plans-database.dump: Piping the updated list to
-Lflag of our restore command.
We found a way to restore a Postgres database while excluding certain large tables in the process. It’s been a good day for heroes, take a break and enjoy a nice cold drink.
Did this help you out? It took me a few days to piece together all this information together, I hope this saves you some time (who knows, maybe the future me will be thankful I wrote this down). Let me know your thoughts. firstname.lastname@example.org