Recover from Database Corruption
As good of a database Postgres is, unfortunately data corruption can still happen (e.g. a disk is failing, or an unsafe shutdown occurred). Should this happen, this document is intended to help you recover from the situation.
To execute this procedure, you’ll need to be running GreenArrow 4.274.0 or later.
Recover a database with invalid unique indexes
GreenArrow uses a variety of unique indexes to track data. An example of this usage is in Studio’s subscribers tables. For mailing lists that are not using an alternate primary key, we have a unique index on the lowercased email address. If multiple subscribers somehow get into the mailing list (usually due to a database corruption), this would invalidate this unique index.
In cases like this, we recommend restoring from a backup and repairing the database corruption during the restore process.
To do this, you’ll follow the Restoring Backups procedure as normal, but STOP before
running step 16 (this is the step that calls for you to run
hvmail_ini_postgres --restore-from-backup .).
In its place, you’ll do the following:
hvmail_ini_postgres --restore-from-backup . --wait-before-creating-indexesto begin restoring the Postgres database.
- Wait for the above command to print
schema and data loaded, type "continue" to continue restore.
- In another terminal/shell, connect to Postgres with the command
/var/hvmail/postgres/bin/psql -U greenarrow greenarrow.
- Repair the database corruption. If your database corruption is in your subscriber tables, we provide a demonstration of how to do this in the “Create and execute stored procedure for repairing subscriber tables” section below.
- Type continue in the original terminal/shell to proceed with creation of indexes.
- Resume the Restoring Backups procedure, skipping step 16 and moving onto step 17.
Create and execute stored procedure for repairing subscriber tables
This query will create a stored procedure for repairing subscriber tables:
CREATE OR REPLACE FUNCTION s_remove_duplcate_emails_from_subscribers_table( in_table_name varchar ) RETURNS BOOLEAN AS $$ BEGIN EXECUTE ' CREATE TEMPORARY TABLE zzz_subscribers_table_fix AS SELECT mailing_list_id AS "mailing_list_id", LOWER(email) AS "lower_email", MIN(id) AS "id_to_keep" FROM ' || in_table_name || ' GROUP BY 1,2 HAVING COUNT(1) > 1; '; CREATE INDEX zzz_subscribers_table_fix_idx ON zzz_subscribers_table_fix (mailing_list_id, lower_email); EXECUTE ' DELETE FROM ' || in_table_name || ' aa WHERE EXISTS ( SELECT 1 FROM zzz_subscribers_table_fix bb WHERE aa.mailing_list_id = bb.mailing_list_id AND LOWER(aa.email) = bb.lower_email AND aa.id <> bb.id_to_keep ); '; DROP TABLE zzz_subscribers_table_fix; RETURN true; END $$ LANGUAGE plpgsql;
It can be executed on individual subscriber tables like this:
Or it can be executed on all subscriber tables like this:
SELECT s_remove_duplcate_emails_from_subscribers_table(aa.subscribers_table_name) FROM ( ( SELECT 's_subscribers' AS subscribers_table_name ) UNION ALL ( SELECT DISTINCT subscribers_table_name FROM s_mailing_lists WHERE subscribers_table_name IS NOT NULL ) ) aa;
After completing the repair of all subscriber tables, you can remove the stored procedure:
DROP FUNCTION s_remove_duplcate_emails_from_subscribers_table(varchar);