Database backup/restore in Magento

Backup/Restore usually means lots of different operations, for example, creating a DB file for a further complete restoring.
This action helps when the DB has some critical failure, but it isn’t usable in most others cases. Due to the events that take place between the moment of backup making and the moment of restoring, some data lost after complete restoring. It also affects the data that doesn’t need to be changed.

That’s why it is reasonable to do partial backup instead of a complete one.

This process isn’t simple, that’s why coding implemented for its realization. When you do coding, it is quite useful to analyze the cases to make more flexible and precise backups/restore. But partial backup has the disadvantage – it takes more time. Especially, it is true for Magento 1.x because analyzing collections and entities updates is quite a slow process.
That’s why in some cases it can be reasonable to do backups and restore via direct operations with the database. But you need to use the database’s facilities and some “manual” work as well.

I described one of the variants below. It was made by the working directly with the database tabs.

Magento backup/restore case

The database had lots of outdated products. They were disabled and during the optimization process, there was a decision to delete them.
Products deleting is a quite simple process because Magento database deletes products correctly without affecting the important products.

See SQL command:

DELETE FROM `catalog_product_entity` WHERE `entity_id` IN (....);

It will delete the products from the database but will save the info about the purchases. The entities connection will be destroyed and the order editing won’t be available, but the purchase info will be saved in a full volume.
So, this is what has been done.

Before deleting the backup was done. After some period of time, this backup became in-demand because of new circumstances revealed. There was a decision to get back outdated products.

Initial data:

current database without products that have been deleted;

database file with the products that need to be deleted.

The task:

Restore the products via SQL request.

Conditions:

Since the last backup, there were changes in the database – 1000+ products were added.

Fortunately, their IDs do not conflict with the deleted products IDs.

But SKUs conflict. Some of the new products have the same SKUs as the deleted products.

After discussing the issue we revealed that the new products with coinciding SKUs are equal to the products that were deleted. They were automatically imported from others systems and got new IDs.
The obstacle was in 2-step import: restore the data and deliver it to the production server.

Data import steps

All operations with the database require a source that’s why it is needed a space to deploy backup file. It can be deployed to the separate database. It can require a lot of space but there is no difficulty to find this space on a local machine.

Or it can be deployed to the current (local copy of a target database) as separate tabs that can be created by demand. It isn’t always convenient to create such tabs from the file. It is at the discretion of the developer and depends on a task’s conditions.

First, prepare the data for the tab catalog_product_entity. For this purpose, we need to delete all lines with the same entity_id in the backup file. We do not do anything with these products in the current database.

In this case, we rewrite IDs in the new database to the IDs from the backup file (82033 – the last ID from Bakup file). Of course, you can do the opposite and rewrite IDs to the new ones. But it is more convenient to restore the database with the old IDs.

After fixing IDs for coinciding SKUs, it is ok to repeat the operation of deleting the same lines because we don’t need the products that were restored via IDs changing.
But keep in mind that sometimes the deleted products can contain the data that the new ones don’t have. Do not do deleting in this case because the old data can be added to the current database a little bit later.

When IDs get back to the original values, it is time to restore required products from the backup. Use regular operator INSERT .. SELECT without any modifications or filters because the database contains only lines that aren’t in the current database.

It is ok to use the option of the request IGNORE, to move records via INSERT IGNORE .. SELECT. Keep in mind that this option can do badly if there are duplicating original keys. It will ignore the insert errors. It would be great to check what the operator work on before running any operator (INSERT/UPDATE). As usual, there is no difficulty to transform the operator INSERT into SELECT that will bring conflicting lines.

To avoid using INSERT and missing conflict run the command:

SELECT * FROM `catalog_product_entity` WHERE `entity_id` IN (SELECT entity_id FROM `catalog_product_entity_backup`);

The same command has to be done for every attribute tab. I do not see reasons to do the introductory checking, and IGNORE option resolves possible conflicts.

Do not add the line value_id in the command.

The reason is that these IDs can be occupied. We don’t need their values that’s why it is better to let the system creating new IDs for the inserted rows.
If you have done the check before and there is on conflict of rows IDs, adding the line alue_id to the command can be even useful for the further changes delivery to the production.

You can add warehouse data at this step but it is better to do reindex and let Magento update different tabs by the products as well as create warehouse records for these products.

Then you need to restore data about products connections with websites and some others data via the simple commands:

catalog_product_website

catalog_category_product

catalog_product_option (all tabs set)

and others

It will good to check others tabs of the database. Often, there are products links in the tabs of the extensions, and there is no connection FOREIGN KEY between products and tabs. That’s why IDs in such tabs can be non-updated and provide conflicts.

If you haven’t done attributes and warehouse update, it is a good time to do this.

Restoring connections for config and others complex products

Check transferring of all records from the tabs catalog_product_link_xxx and catalog_product_super_xxx. After this, copy the records of the tab catalog_product_relation. These tabs contain the connections (FOREIGN KEY) for the inner system and for the others tabs.

Use the command INSERT IGNORE .. SELECT twice for the whole set to copy the data with minimum efforts. The records that won’t be copied at first try will be copied at the second time when dependencies restored.

You can disable checking external key but it will be hard to control the database and you can get broken data structure. It will more efficient to inspect storage subsystem and run commands in the order that won’t cause errors in checking external keys.

For example, the tabs order of the subsystem catalog_product_link_xxx:

catalog_product_link_type

catalog_product_link

catalog_product_link_attribute

catalog_product_link_attribute_int

catalog_product_link_attribute_decimal

catalog_product_link_attribute_varchar

If to run the records transferring commands in this order, there won’t be conflicts inside the system.
The restore process can be considered as completed at this step. After reindex, the products data will be available for all Magento parts.

What issues can you face with?

It is useful to add an extra line to the tab catalog_product_entity. The line can be named, for example, old_entity_id. Save all original IDs in this line at the first command. It will give an opportunity to restore original products IDs at any moment in case of some unexpected results or to clean the database from the trash. This line can secure in case of importing data into some tabs.

But do not add such lines to the database on a production server.

How to move data to a production server

There is no reason to repeat all the actions on your production server. The simplest way to do this is to prepare a full dump of the required tabs via SQL. Change commands INSERT to INSERT IGNORE in the text editor to prevent errors appearing while a try to duplicate inserting.

As a rule, the utilities for the dump preparing have an option of adding IGNORE.

But it will be more professional to do dump in the way that will provide including in the dump only products that aren’t on the production server.

The way to do this is to “sacrifice” the local database and “spoil” it by leaving in it only the products that have been created by the restoring process.
If there was added the extra line for old IDs in the product tab before the restoring, everything that wasn’t affected by the restoring can be deleted via the command:

DELETE FROM `catalog_product_entity` WHERE `old_entity_id` > 0;

The line old_entity_id will be empty for restored products.

If there wasn’t added such extra line, you can do the same in this way:

DELETE FROM `catalog_product_entity` WHERE `entity_id` NOT IN (SELECT entity_id FROM `catalog_product_entity_backup`);

Due to Magento using FOREIGN KEY with the option ON DELETE CASCADE almost everywhere, the database server will delete all records from the main products tab as well as all connected data.

Another issue that you have to pay attention to while creating the dump of several tabs for further importing to the production database: while the restore was on the preparation stage, there can be added new data, for example, to the attributes tab. In this case, a part of the data from the local database won’t be transferred because of the initial keys conflicts.

There are 2 ways to solve this issue:

do not specify the record ID in the dump for the attributes tab. In this case, the ID will be created with the value required by the database server while importing. (this method is preferable)

enlarge the length of an ID of the records from local database. Its length has to be larger than the length of the ID in the same tab in the production database.

And finally, do not modify tabs while reindexing or conducting similar process. It can cause slowdowns or even rejects by the database deadlocks.

The summer's over but not the #summervibes! Mobecls team invites you to dive into the summer once again and join a perfect #teambuilding event full of warm feelings, nice weather and high spirits.
https://t.co/nyPqUaUDIB

Some shots from #MM19NYC The event was absolutely marvelous. Wonderful time, amazing people.
Thank @magemojo @MarshaMarshaMa_ and everyone (a pizza guy too) for making this event so great.
Waiting for new meetings!

Mobecls Ltd. is a development company specializing in Magento that was created in 2010 in Minsk, Belarus by experienced developers, marketers, and SEO-specialists. We united our skills and knowledge to create new kind of service – intelligent development for Magento. It means that we provide development service, built specifically for your business, which takes into account your business model and market changes. Knowledge in different e-commerce spheres makes it possible to provide complex solutions with exceptionally high rate of success and impact on your business. We have more than 90 000 hours of resolving issues and solved more than 8 000 projects. We never stop to enrich our knowledge base as well as share experience with our clients.