How mysqldump works
I was looking for performance impact of running mysqldump on a server when I realized that there is no good article which describes how mysqldump actually works. And That was a good enough reason for me to look into it’s pretty (Not that pretty) code. So let’s get into it.
There are a lot of options you can run it with, I will just explain few of them in this, but first let’s see what you came here to see, how it actually dumps all that data:
I have shrank the code to make it easier to understand:
Consider this simple scenario where one wants to dump all the databases, now depending on what options are set, here are the steps that would generally be followed:
- Find all the database that exist
- For each database, find the tables inside and their column names
- Run either SELECT INTO OUTFILE query, or SELECT * FROM query
- Save the results into an output file honouring the options provided (this one looked complicated)
And below is the table dump function:
Now ofcourse it’s not that simple, there are a ton of things that I ate for the sake of simplicity:
- Locking of tables: opt_lock_all_tables
- Doesn’t dump replication metadata tables
- Exclude certain tables from dump
- — single-transaction: Save snapshots before starting the dumps process and handle their lifecycle
- some more
This was all for today, please do let me know if you find it helpful or if you want me to write on something.