This week, I learned some of the dangers cascading associations in Doctrine, and how they can very easily (and accidentally, of course) delete your entire database.
CASCADE in SQL
In standard SQL, you can have changes in parent-child relationships be cascaded from the parent to the child.
CREATE TABLE parent ( id INT PRIMARY KEY ); CREATE TABLE child ( child_id INT PRIMARY KEY, parent_id INT, FOREIGN KEY parent_id REFERENCES parent (id) ON UPDATE RESTRICT ON DELETE CASCADE );
Here is a quick example. The child table (conveniently labeled “child”) has a column that is intended to reference the primary key of the parent table. Of course, there is no requirement that you reference the primary key. As long as it is a unique key in the parent table, it can be used in a foreign key in the child table.
The important part is at the end: “ON UPDATE … ON DELETE …”. This creates a policy for when the parent table is changed. This is an important point. The foreign key is in the child table, but the cascade policy only takes effect when something happens to the parent table.
- ON UPDATE: creates a policy of what happens when a row in the parent table is changed. Or, more specifically, when a column in parent table that is part of the foreign key is changed.
- ON DELETE: creates a policy of what happens when a row in the parent table is deleted.
And the available policies are:
- RESTRICT: as the name implies, it restricts changes to the parent table. If you try and update or delete a row that would result in an orphaned row in the child table, it raises an error, and stops you from making the change.
- CASCADE: when a change is made in the parent table, that change is cascaded to the child table. So if the parent table’s column is updated, the referencing column in the child table is updated as well. Also, if a row in the parent table is deleted, any rows in the child table that reference it are also deleted.
- SET NULL: (only for delete) when a parent row is deleted, the referencing columns in the child table are set to NULL. Of course, this requires those columns to be nullable.
These policies are useful in cases where database entities follow a strict parent-child relationship. For example, if you have a table for song artists and another table for all of their songs, changing the name of an artist should also change that artist’s name on all of their songs. (This is assuming you are using a natural foreign key.)
Or similarly, if an artist is suing your company and wants to be removed from your database, deleting their entry from the artist table should simultaneously delete all of their songs from the songs table.
In the end, what type of cascade behavior you use depends on your workflow.
<cascade> in Doctrine
Doctrine, being an ORM, has similar cascade functionality as SQL. In fact, one might even think they are one and the same.
Wrong. They are very different.
Excerpting from the Doctrine documentation:
Cascade operations are performed in memory. That means collections and related entities are fetched into memory, even if they are still marked as lazy when the cascade operation is about to be performed. However this approach allows entity lifecycle events to be performed for each of these operations.
However, pulling objects graph into memory on cascade can cause considerable performance overhead, especially when cascading collections are large. Makes sure to weigh the benefits and downsides of each cascade operation that you define.
To rely on the database level cascade operations for the delete operation instead, you can configure each join column with the onDelete option. See the respective mapping driver chapters for more information.
As the documentation says, SQL cascading and Doctrine cascading are separate, although both can be implemented in an object definition.
Here’s an example of a configuration for Doctrine cascading:
<doctrine-mapping> <entity name="parent"> <id name="id" /> <one-to-many field="children" target-entity="child" mapped-by="parent_id"> <cascade><cascade-remove /></cascade> </one-to-many> </entity> <entity name="child"> <id name="child_id" /> <many-to-one field="parent_id" target-entity="parent" inversed-by="children"> <join-column name="parent_id" referenced-column="id" /> </many-to-one> </entity> </doctrine-mapping>
Note that, as mentioned in the documentation, you can mix both
Doctrine and SQL cascading using the
properties of the
<join-column> element. In this scenario, the SQL
cascade policy will act first, thus allowing for fewer
queries. However, in databases without CASCADE support, the cascade
policy will still be enforced via the Doctrine policy.
The Difference Between the Two
There is something critical to note here: SQL cascade operates on keys (or indexes), Doctrine cascade operates on entities.
As proof of this, find the difference between the SQL code and the
Doctrine configuration. In the SQL,
ON DELETE CASCADE is put in the
child table’s definition, whereas in Doctrine,
put in the parent entity’s definition.
In Doctrine, an operation on one entity cascades to other entities, and the cascade policy is defined on that entity. But in SQL, every child table determines whether it or not it will subscribe to events of the parent table. This is a result of the policy being defined on the foreign key, which itself is part of the affected table’s definition.
But don’t be confused: even in Doctrine, the cascade policy is a property of each individual parent-child association. It is almost functionally equivalent to SQL cascade policies. The only difference is the physical location in the configuration where the policy is defined. And if you set the policy in the wrong place, it will not go in the direction you want it to.
In other words, if you do not think carefully about where the cascade policy needs to be, you may be accidentally deleting entities!