Convert YAML column data to JSONB in Rails and Postgres

Introduction

My team had wanted to take advantage of the incredibly useful JSON features provided by Postgres 9.4+ in working with data that our rails application was storing as YAML in a text column. We had installed a very useful gem called audited that was responsible for keeping track of “when” & “what” changed in our ActiveRecord records. The “what” data was, unfortunately, being stored as YAML in a text column and makes querying this data more difficult than it needs to be. Here is a sample of that “what” data:

---pilot:-false-true

And this is what I want it to be like:

{"pilot":[false,true]}

Notably, the authors of the audited gem included the option to install it such that this data is stored in JSONB (default is YAML).

Lately, we have been interested in building more advanced queries to search by the content of this “what” data. As a result, we aimed to create a migration that can transform the data safely back and forth from YAML to JSONB (reversible for good measure).

The Result

Here is a migration file that we ended up generating that converts the audited_changes column (that “what” data) on the audits table from YAML to JSONB. You can utilize this snippet of code to transform your YAML column to JSONB.

Disclaimer - Migrate at your own risk!

classChangeAuditedChangesFromYamlToJson<ActiveRecord::Migrationdefupupdated_records=fetch_audit_records.mapdo|r|{id: r['id'],audited_changes: escape_sql(JSON.dump(YAML.safe_load(r['audited_changes'])))}endremove_column:audits,:audited_changesadd_column:audits,:audited_changes,:jsonbTransaction.transactiondoupdated_records.eachdo|ur|ActiveRecord::Base.connection.execute("""
UPDATE audits
SET audited_changes = '#{ur[:audited_changes]}'::jsonb
WHERE id = #{ur[:id]}
""")endendenddefdownupdated_records=fetch_audit_records.mapdo|r|{id: r['id'],audited_changes: escape_sql(YAML.dump(JSON.parse(r['audited_changes'])))}endremove_column:audits,:audited_changesadd_column:audits,:audited_changes,:textTransaction.transactiondoupdated_records.eachdo|ur|ActiveRecord::Base.connection.execute("""
UPDATE audits
SET audited_changes = '#{ur[:audited_changes]}'::text
WHERE id = #{ur[:id]}
""")endendendprivate## Returns the id & audit_changes column of the audits table.# @return [PG::Result]deffetch_audit_recordsActiveRecord::Base.connection.execute('SELECT id, audited_changes FROM audits')end## Returns the SQL escaped version of the string provided # For example, a "'" needs to be escaped by doubling it.# @param [String] the string to be converted# @return [String] the escaped version of the string provideddefescape_sql(string)returnstring.gsub("'","''")endend

Conclusion

It took some trial and error and navigating through some ‘gotchas’ to arrive at our resultant migration. For instance, we had to escape single-quote characters by adding another single-quote next to every instance (See the escape_sql method in the code snippet). Nonetheless, we are happy with the results and excited to start building queries with those super awesome JSON features given to us by Postgres.