Jekyll2019-02-01T01:55:29+00:00https://maxwellholder.com/feed.xmlMaxwell HolderBlog for the web developer (and pastry enthusiast) Maxwell HolderMaxwell HolderRegex anchors in Ruby and PostgreSQL2019-01-31T22:24:37+00:002019-01-31T22:24:37+00:00https://maxwellholder.com/2019/01/31/ruby-postgresql-regex-anchors<p>Regular expressions (regex) in Ruby and PostgreSQL differ in a slightly confusing way when it comes to the meaning of <code class="highlighter-rouge">\Z</code>.</p>
<h2 id="ruby">Ruby</h2>
<p>A <a href="https://batsov.com/articles/2013/12/04/regexp-anchors-in-ruby/">common pitfall</a> for people learning Ruby is using <code class="highlighter-rouge">^</code> and <code class="highlighter-rouge">$</code> in regular expressions thinking that they apply to the entire string when they actually apply to the beginning and end of a <em>line</em>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>irb(main):001:0&gt; "hello\nworld".match?(/^hello$/)
=&gt; true
</code></pre></div></div>
<p>To match the beginning and end of the entire string in Ruby you should instead use <code class="highlighter-rouge">\A</code> and <code class="highlighter-rouge">\z</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>irb(main):002:0&gt; "hello\nworld".match?(/\Ahello\z/)
=&gt; false
irb(main):003:0&gt; "hello".match?(/\Ahello\z/)
=&gt; true
</code></pre></div></div>
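<p>The difference matters most when a regex is used for validation. A minimal sketch, assuming a hypothetical username check (the constant names and the sample input are illustrative only):</p>

```ruby
# Hypothetical validation patterns: same body, different anchors.
LINE_ANCHORED = /^[a-z]+$/       # anchors to any single line of the input
STRING_ANCHORED = /\A[a-z]+\z/   # anchors to the whole input

# Multi-line input whose first line looks like a valid username.
input = "admin\n<script>"

line_result = LINE_ANCHORED.match?(input)     # the first line alone satisfies ^...$
string_result = STRING_ANCHORED.match?(input) # the whole string must be lowercase letters

puts "line-anchored:   #{line_result}"
puts "string-anchored: #{string_result}"
```

<p>The line-anchored pattern accepts the input because one of its lines matches, while the string-anchored pattern rejects it.</p>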
<h2 id="postgresql">PostgreSQL</h2>
<p>Postgres also has <a href="https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-POSIX-REGEXP">regular expression matching</a> but it works in “non-newline-sensitive” mode by default:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>postgres=# select e'hello\nworld' ~ '^hello$';
?column?
----------
f
(1 row)
</code></pre></div></div>
<p><strong>Sidenote:</strong> Postgres will not interpret <code class="highlighter-rouge">\n</code> inside a string as a newline by default. To use C-style escapes, you have to <a href="https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE">prefix the string with an E</a>: <code class="highlighter-rouge">e'hello\nworld'</code></p>
<p>To get the same behavior as Ruby, you have to change to “newline-sensitive” mode by adding a <a href="https://www.postgresql.org/docs/current/functions-matching.html#POSIX-METASYNTAX">special prefix</a> <code class="highlighter-rouge">(?n)</code>:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>postgres=# select e'hello\nworld' ~ '(?n)^hello$';
?column?
----------
t
(1 row)
</code></pre></div></div>
<p>In this mode, you can use <code class="highlighter-rouge">\A</code> and <code class="highlighter-rouge">\Z</code> (note: uppercase, unlike Ruby) to match the beginning and end of the string:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>postgres=# select e'hello\nworld' ~ '(?n)\Ahello\Z';
?column?
----------
f
(1 row)
postgres=# select e'hello\nworld' ~ '(?n)\Ahello\nworld\Z';
?column?
----------
t
(1 row)
</code></pre></div></div>
<p>So Ruby and Postgres have <code class="highlighter-rouge">\z</code> and <code class="highlighter-rouge">\Z</code> to match the end of a string, respectively.</p>
<h2 id="the-confusing-bit">The confusing bit</h2>
<p>Unfortunately, in addition to <code class="highlighter-rouge">\z</code>, Ruby also has <code class="highlighter-rouge">\Z</code> which will match the end of string ignoring a single newline at the end:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>irb(main):001:0&gt; "hello\nworld".match?(/\Ahello\nworld\Z/)
=&gt; true
irb(main):002:0&gt; "hello\nworld\n".match?(/\Ahello\nworld\Z/)
=&gt; true
irb(main):003:0&gt; "hello\nworld\n\n".match?(/\Ahello\nworld\Z/)
=&gt; false
irb(main):004:0&gt; "hello\nworld\n\n".match?(/\Ahello\nworld\n\Z/)
=&gt; true
</code></pre></div></div>
<p>This doesn’t seem very useful to me since you could just use <code class="highlighter-rouge">\n?\z</code> instead, which seems more obvious.</p>
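<p>As a rough check of that claim, the sketch below runs both forms against the same three inputs used above (the constant names are made up for illustration):</p>

```ruby
# Compare Ruby's \Z with the more explicit \n?\z on the same inputs.
WITH_BIG_Z = /\Ahello\nworld\Z/
WITH_OPTIONAL_NEWLINE = /\Ahello\nworld\n?\z/

subjects = ["hello\nworld", "hello\nworld\n", "hello\nworld\n\n"]

# Each entry is [result with \Z, result with \n?\z] for one subject.
comparison = subjects.map do |subject|
  [WITH_BIG_Z.match?(subject), WITH_OPTIONAL_NEWLINE.match?(subject)]
end

comparison.each_with_index do |(big_z, optional_newline), index|
  puts "#{subjects[index].inspect}: \\Z=#{big_z} \\n?\\z=#{optional_newline}"
end
```

<p>For a yes/no check the two agree on these inputs. The matched text can differ, though: <code class="highlighter-rouge">\Z</code> is zero-width, while <code class="highlighter-rouge">\n?\z</code> consumes the trailing newline.</p>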
<p>My guess is that this was provided to match the behavior in Perl as described <a href="https://www.regular-expressions.info/anchors.html">here</a>:</p>
<blockquote>
<p>Because Perl returns a string with a newline at the end when reading a line from a file, Perl’s regex engine matches $ at the position before the line break at the end of the string even when multi-line mode is turned off. Perl also matches $ at the very end of the string, regardless of whether that character is a line break. So <code class="highlighter-rouge">^\d+$</code> matches <code class="highlighter-rouge">123</code> whether the subject string is <code class="highlighter-rouge">123</code> or <code class="highlighter-rouge">123\n</code>.</p>
</blockquote>
<p>Personally, I would probably avoid using this at all since it’s likely to be confused for <code class="highlighter-rouge">\z</code>.</p>
<h2 id="in-summary">In summary</h2>
<p>To match the beginning and end of an entire <em>string</em>:</p>
<ul>
<li>Ruby: use <code class="highlighter-rouge">\A</code> and <code class="highlighter-rouge">\z</code> (probably best to avoid <code class="highlighter-rouge">\Z</code>)</li>
<li>Postgres: use <code class="highlighter-rouge">^</code> and <code class="highlighter-rouge">$</code> (in the default “non-newline-sensitive” mode)</li>
</ul>
<p>To match the beginning and end of a <em>line</em>:</p>
<ul>
<li>Ruby: use <code class="highlighter-rouge">^</code> and <code class="highlighter-rouge">$</code></li>
<li>Postgres: use <code class="highlighter-rouge">^</code> and <code class="highlighter-rouge">$</code> but change to “newline-sensitive” mode</li>
</ul>
Use ERB in your .rspec file to skip requiring byebug on CI2019-01-14T22:24:37+00:002019-01-14T22:24:37+00:00https://maxwellholder.com/2019/01/14/dot-rspec-erb-require-byebug<p>You might already be using RSpec’s <code class="highlighter-rouge">.rspec</code> file to <a href="https://makandracards.com/makandra/36897-stop-writing-require-spec_helper-in-every-spec">avoid</a> having to add the <code class="highlighter-rouge">require 'spec_helper'</code> line at the top of all your spec files.</p>
<p>But did you know that RSpec supports <a href="https://relishapp.com/rspec/rspec-core/v/2-0/docs/configuration/read-command-line-configuration-options-from-files#using-erb-in-.rspec">ERB in the <code class="highlighter-rouge">.rspec</code> file</a>?</p>
<p>Which means you can make your <code class="highlighter-rouge">.rspec</code> file this:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--require spec_helper
&lt;%= "--require byebug" unless ENV["CI"] %&gt;
</code></pre></div></div>
<p>in order to:</p>
<ul>
<li>always load <a href="https://github.com/deivid-rodriguez/byebug">byebug</a> when running tests on your local machine so you don’t have to add <code class="highlighter-rouge">require 'byebug'</code> every time you need to start a debugger with <code class="highlighter-rouge">byebug</code></li>
<li>skip loading <code class="highlighter-rouge">byebug</code> during your <a href="https://en.wikipedia.org/wiki/Continuous_integration">Continuous Integration</a> (CI) builds (assuming your CI server sets the <code class="highlighter-rouge">CI</code> environment variable like Travis CI <a href="https://docs.travis-ci.com/user/environment-variables/#default-environment-variables">does</a>)</li>
<li>fail your CI builds fast whenever you accidentally leave in a call to <code class="highlighter-rouge">byebug</code> (instead of hanging waiting for input until eventually timing out)</li>
</ul>
Database migration strategies2019-01-09T22:24:37+00:002019-01-09T22:24:37+00:00https://maxwellholder.com/2019/01/09/database-migration-strategies<p>Most web applications use a database to persist state.
Since the database is separate from the application and changes cannot be made to both simultaneously, there are various strategies for keeping them in sync.</p>
<h2 id="fully-coupled-releases">Fully-coupled releases</h2>
<p>Many web applications are deployed by:</p>
<ol>
<li>stopping the application</li>
<li>running any database migrations</li>
<li>deploying and starting the new application</li>
</ol>
<p>If a problem is discovered during the deploy, the database change must be rolled back and the old version of the application must be redeployed.</p>
<p>This is a simple approach that does not require backwards- or forwards-compatibility between the old and new application and database versions since it is assumed that both the application and database will be running either the old or new versions.</p>
<p>A downside to this approach is that it requires the application to go offline during deploys, causing some period of unavailability.</p>
<h2 id="partially-coupled-releases">Partially-coupled releases</h2>
<p>In order to provide zero-downtime deployments, another strategy exists involving <a href="https://martinfowler.com/bliki/BlueGreenDeployment.html">blue-green deployment</a>.</p>
<h3 id="pre-deploy-migrations">Pre-deploy migrations</h3>
<p>An example of such an approach could involve multiple application servers running against a single database instance where during a deploy:</p>
<ol>
<li>the database is migrated to the new version</li>
<li>the new application is deployed to application servers in groups and bounced, so that at any one time some servers are running the old application code and others the new application code</li>
<li>until eventually all application servers are running the new application code</li>
</ol>
<p>Since multiple application servers are used and at any point in time at least some are running, there is no period of unavailability.</p>
<p>This approach requires the new database version to be backwards-compatible with the old application version (which would be equivalent to saying the old application version must be forwards-compatible with the new database version) since any server could be running either the old or new application code.</p>
<p>This approach does <em>not</em> require the old database version to be forwards-compatible with the new application version (which would be equivalent to saying the new application version does not have to be backwards-compatible with the old database version). Because the database migration always runs before the new code is deployed, any application server running the new code can assume the database changes have already been made.</p>
<p>However this also means that if a problem is discovered after the deploy and the database needs to be rolled back, the application <em>must also</em> be rolled back first.</p>
<p>A downside to this approach is that it still requires database migrations to be run prior to application deploys, which means long-running database migrations can hold up application deploys.</p>
<h3 id="post-deploy-migrations">Post-deploy migrations</h3>
<p>An alternative to this approach could be to reverse the order and do database migrations at the end of deploys, which would invert the backwards- and forwards-compatibility.
That is, it would require the new application version to be backwards-compatible with the old database version (but not require the new database version to be backwards-compatible with the old application version).</p>
<p>This also means that if a problem is discovered after the deploy and the application needs to be rolled back, the database <em>must also</em> be rolled back.</p>
<p>However, the problem of database migrations holding up a deploy still applies in either case; with post-deploy migrations it would just happen later in the release.</p>
<h3 id="why-most-everyone-does-pre-deploy-migrations">Why most everyone does pre-deploy migrations</h3>
<p>This alternative is not usually chosen since additive changes are more common.
Consider the example of adding a new column.</p>
<p>In the pre-migration approach, the database change happens first and then the new application code is deployed.
This means the new application code can then assume that the column will be present (though of course the old application code still needs to work with and without the new column but this is usually easily done e.g. by selecting specific columns instead of <code class="highlighter-rouge">select *</code>).</p>
<p>In the post-migration approach, the new application code cannot assume the column is present.
This would require the new application code to conditionally handle either case, adding complexity, or more likely, spreading out the change over two releases (adding the column in the first release and then using it in the second).
The trade-off is that the old application code need not work with the new column since it would never be run with it present.</p>
<p>The benefit of the post-migration approach is more apparent when we consider the example of removing a column (ignoring the specific issues that stem from some frameworks like Ruby on Rails caching columns that necessitate <a href="https://github.com/rails/rails/pull/21720">ignoring columns</a> prior to dropping them even when the application no longer uses the column).</p>
<p>In the pre-migration approach, the column is dropped before the new code is deployed.
This would require the old code to work without the column that is about to be removed, which means either conditionally handling both cases (again adding complexity) or, more commonly, splitting the change across two releases.</p>
<p>In the post-migration approach, the new code that removes all usages of the column is deployed before the column is dropped so it can be safely dropped as no running apps will attempt to use it.</p>
<p>If you had to pick between having either pre- or post-migrations only, it’s more common to choose pre-migrations: they let additive changes ship in a single release, and additive changes are more common because applications tend to grow over time.</p>
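<p>The add-a-column and remove-a-column cases can be sketched with a deliberately simplified model. It assumes compatibility just means “every column the application references exists in the schema”; the method names, column names, and the model itself are hypothetical and ignore things like constraints or data backfills:</p>

```ruby
require "set"

# Simplified, hypothetical model: an application version is compatible with a
# schema when every column it references exists in that schema.
def compatible?(app_columns, schema_columns)
  app_columns.to_set.subset?(schema_columns.to_set)
end

# During a rolling deploy one side is already new while the other is still old.
# Pre-deploy migrations pair the old app with the new schema;
# post-deploy migrations pair the new app with the old schema.
def single_release_ok?(strategy, old_app:, new_app:, old_schema:, new_schema:)
  case strategy
  when :pre_deploy  then compatible?(old_app, new_schema)
  when :post_deploy then compatible?(new_app, old_schema)
  else raise ArgumentError, "unknown strategy: #{strategy}"
  end
end

adding_column = {
  old_app: %w[id email], new_app: %w[id email nickname],
  old_schema: %w[id email], new_schema: %w[id email nickname]
}
removing_column = {
  old_app: %w[id email legacy_flag], new_app: %w[id email],
  old_schema: %w[id email legacy_flag], new_schema: %w[id email]
}

results = {
  add_pre: single_release_ok?(:pre_deploy, **adding_column),
  add_post: single_release_ok?(:post_deploy, **adding_column),
  remove_pre: single_release_ok?(:pre_deploy, **removing_column),
  remove_post: single_release_ok?(:post_deploy, **removing_column)
}

results.each { |scenario, ok| puts "#{scenario}: #{ok}" }
```

<p>In this model, adding a column fits in one release only with a pre-deploy migration, and removing one fits in one release only with a post-deploy migration, which mirrors the trade-off described above.</p>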
<h3 id="both-pre--and-post-deploy-migrations">Both pre- and post-deploy migrations</h3>
<p>Given the trade-offs to these approaches, there is of course another approach possible: to run some migrations before the new code is deployed and some after.</p>
<p>This approach often requires building out some kind of migration tagging to allow running pre-migrations at one point during a release and post-migrations at another since most database migration tooling (e.g. Active Record migrations) is built with the assumption that running migrations means running all previously unrun migrations.</p>
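<p>A minimal sketch of what such tagging might look like, written in plain Ruby rather than against any real migration tool. The <code class="highlighter-rouge">Migration</code> struct, its <code class="highlighter-rouge">phase</code> field, and <code class="highlighter-rouge">pending_for</code> are assumptions for illustration, not part of Active Record:</p>

```ruby
# Hypothetical migration record; :phase is the tag (:pre or :post).
Migration = Struct.new(:version, :name, :phase, keyword_init: true)

# Returns the unrun migrations for one phase, oldest first.
def pending_for(migrations, phase:, applied_versions:)
  migrations
    .select { |migration| migration.phase == phase }
    .reject { |migration| applied_versions.include?(migration.version) }
    .sort_by(&:version)
end

migrations = [
  Migration.new(version: 3, name: "drop_legacy_flag", phase: :post),
  Migration.new(version: 1, name: "create_users", phase: :pre),
  Migration.new(version: 2, name: "add_nickname", phase: :pre)
]
applied_versions = [1] # versions already recorded as run

before_deploy = pending_for(migrations, phase: :pre, applied_versions: applied_versions)
after_deploy = pending_for(migrations, phase: :post, applied_versions: applied_versions)

puts "before deploy: #{before_deploy.map(&:name).join(', ')}"
puts "after deploy:  #{after_deploy.map(&:name).join(', ')}"
```

<p>A release script would then run the first list before deploying the new code and the second list after it.</p>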
<p>There is also conceptual overhead required in having to think through whether any particular migration should be run as a pre-migration or as a post-migration.</p>
<p>It also complicates thinking about how to handle rollbacks, e.g. if a post-migration fails, does that mean you should also roll back the application and the pre-migration?</p>
<p>This approach also still has the problem of long-running migrations holding up an app release.</p>
<h2 id="decoupled-releases">Decoupled releases</h2>
<p>In all of the previously mentioned approaches, performing the database migration at a fixed point in a release, alongside the application deploy, removes the need to maintain either backwards- or forwards-compatibility (though not both, except in the fully-coupled case).</p>
<p>Of course, it is possible to forgo this affordance and instead strive for both backwards- and forwards-compatibility even when not strictly necessary, removing the problem of having to roll back the application and database together.</p>
<p>However once you do that, you might as well consider one final possible approach: to fully decouple application and database changes so that either can occur in isolation.</p>
<p>A simple way of implementing this approach while keeping a single release pipeline would be to have each release contain either application changes or database changes, but not both.</p>
<p>This would require maintaining both backwards- and forwards-compatibility, which adds some conceptual overhead, but also provides confidence for rollbacks.</p>
<p>Each change could also be rolled back in isolation since at any point:</p>
<ol>
<li>when deploying the application, the database must support both the old and new application versions</li>
<li>when migrating the database, the application must support both the old and new database versions</li>
</ol>
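<p>These two conditions can be checked mechanically in a toy model. The sketch below assumes a release plan is a list of snapshots of what is live, and that compatibility means the application only references existing columns; <code class="highlighter-rouge">State</code>, <code class="highlighter-rouge">single_change?</code>, and <code class="highlighter-rouge">safe_decoupled_plan?</code> are hypothetical names:</p>

```ruby
require "set"

# Hypothetical snapshot of what is live after a release.
State = Struct.new(:app_columns, :schema_columns) do
  # The running application only references columns that exist.
  def compatible?
    app_columns.to_set.subset?(schema_columns.to_set)
  end
end

# A decoupled release changes the application or the database, never both.
def single_change?(before, after)
  app_changed = before.app_columns != after.app_columns
  schema_changed = before.schema_columns != after.schema_columns
  app_changed ^ schema_changed
end

# Every state must be compatible, so stepping forward or back one release is safe.
def safe_decoupled_plan?(states)
  states.all?(&:compatible?) &&
    states.each_cons(2).all? { |before, after| single_change?(before, after) }
end

add_column_plan = [
  State.new(%w[id email], %w[id email]),
  State.new(%w[id email], %w[id email nickname]),          # database-only release
  State.new(%w[id email nickname], %w[id email nickname])  # application-only release
]
coupled_plan = [
  State.new(%w[id email], %w[id email]),
  State.new(%w[id email nickname], %w[id email nickname])  # both changed at once
]

decoupled_ok = safe_decoupled_plan?(add_column_plan)
coupled_ok = safe_decoupled_plan?(coupled_plan)
puts "decoupled plan safe: #{decoupled_ok}"
puts "coupled plan safe:   #{coupled_ok}"
```

<p>Because every intermediate state in the decoupled plan is compatible, rolling back any single release lands on a state that already worked.</p>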
<p>If some additional work is taken to separate the release pipelines, this approach could provide the capability to deploy application changes at any time, whether a database migration is currently running or not.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Each approach outlined above has trade-offs. As coupling decreases, the conceptual overhead required to understand the necessary backwards- and/or forwards-compatibility quickly increases.</p>
<p>It’s often assumed that the work necessary to decouple application changes from database changes is not worth pursuing until reaching a certain scale, but as soon as they are partially decoupled (for zero-downtime deployment), it becomes essential to consider the need for backwards- and forwards-compatibility. I believe that it may prove to be worth it to expend the additional effort to fully decouple application and database changes early on to establish good habits and provide confidence for rollbacks.</p>
<p>Whichever approach you take, it’s important to understand the possible problems that can emerge from the fundamental problem of the database and application not existing as a single entity that can be changed atomically.</p>
<h2 id="prior-work">Prior work</h2>
<p>The following blog posts greatly helped me formulate my thoughts more concretely on this subject:</p>
<ol>
<li>Philip Potter - <a href="http://www.philandstuff.com/2018/04/04/keep-database-deploys-separate.html">Keep database deploys separate</a></li>
<li>Michael Brunton-Spall - <a href="http://www.brunton-spall.co.uk/post/2014/05/06/database-migrations-done-right/">Database Migrations Done Right</a></li>
<li>Philip I. Thomas - <a href="https://blog.staffjoy.com/dont-migrate-databases-automatically-5039ab061365">Don’t Migrate Databases Automatically</a></li>
</ol>
Silencing tzinfo-data Bundler warnings2019-01-04T22:24:37+00:002019-01-04T22:24:37+00:00https://maxwellholder.com/2019/01/04/tzinfo-data-warning<p>If you’ve ever created a new <a href="https://rubyonrails.org/">Rails</a> app and then run <code class="highlighter-rouge">bundle install</code> on a Unix-like system, you’ve probably seen this warning:</p>
<blockquote>
<p>The dependency tzinfo-data (&gt;= 0) will be unused by any of the platforms Bundler is installing for. Bundler is installing for ruby but the dependency is only for x86-mingw32, x86-mswin32, x64-mingw32, java. To add those platforms to the bundle, run <code class="highlighter-rouge">bundle lock --add-platform mingw, mswin, x64_mingw, jruby</code>.</p>
</blockquote>
<p>This is a classic example of a noisy warning.</p>
<p>Bundler is trying to be helpful and telling you that you have a gem in your Gemfile that it’s not going to install because it’s marked as only being for certain platforms.</p>
<p>Usually, this is a good thing: it can be <a href="https://github.com/bundler/bundler/pull/5003">confusing</a> if Bundler skips gems listed in your Gemfile without telling you why.</p>
<p>In this case, it’s not so helpful.</p>
<p>Rails includes this gem because it needs to be able to do timezone conversions and the library it uses (<a href="https://tzinfo.github.io/">tzinfo</a>) depends on having timezone definitions available.</p>
<p>On Unix-like systems, these are usually provided by the system itself so the tzinfo gem will just use those.
Windows, however, does not provide these definitions so the tzinfo-data gem needs to be included to provide them instead.</p>
<p>If you’re using a Unix-like system, you can safely ignore these warnings but since they happen every time you run <code class="highlighter-rouge">bundle install</code>, they can get very annoying and it’s useful to know how to turn them off.</p>
<h2 id="how-to-silence-these-warnings">How to silence these warnings</h2>
<p>Fortunately, Bundler (versions &gt;= 1.17.0) has an <a href="https://bundler.io/v1.17/bundle_config.html">option</a> <code class="highlighter-rouge">disable_platform_warnings</code> for silencing these warnings.</p>
<p>You can set it for a specific app by running:</p>
<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bundle config <span class="nt">--local</span> disable_platform_warnings <span class="nb">true</span>
</code></pre></div></div>
<p>Every developer has to do this individually since the <code class="highlighter-rouge">.gitignore</code> that Rails generates excludes the <code class="highlighter-rouge">.bundle</code> directory where this config is stored (which is <a href="https://stackoverflow.com/questions/6963496/why-does-rails-ignore-bundle-by-default">intentional</a> since most of the options used for configuring Bundler have to do with the local system e.g. gem installation options specific to a machine).</p>
<p>If you leave off the <code class="highlighter-rouge">--local</code> option, the warnings will be silenced globally for the current machine, regardless of which project you’re in.</p>
<p>This is preferable to <a href="https://github.com/tzinfo/tzinfo-data/issues/12#issuecomment-279554001">other ways</a> to avoid seeing this warning since it keeps the app working in Windows and doesn’t install unneeded dependencies on non-Windows systems.</p>