Batch email sending

Nov 7, 2016

Our application sends out LOTS of emails, and our clients need to control both when the emails are sent and their exact content. A previous post describes how we first attempted to solve this. We later switched to the SendGrid bulk sending API to avoid making an individual API call for every email.

The problem with this approach is that a newsletter might go to 100 users or 100K users. The process runs sequentially, so one large send can delay all the others. It is also best to pass email addresses to SendGrid in reasonably sized chunks (say, 100 at a time).
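Splitting a recipient list into fixed-size chunks is a one-liner in Ruby. A minimal sketch, assuming a hypothetical user_ids array and a chunk size of 100 (both are illustrative values, not taken from the application):

```ruby
# Hypothetical recipient IDs; in the real app these would come from the database.
user_ids = (1..250).to_a
chunk_size = 100 # assumed value; tune it to what the bulk API handles comfortably

# each_slice yields consecutive groups of at most chunk_size elements.
chunks = user_ids.each_slice(chunk_size).to_a

puts chunks.map(&:size).inspect # prints [100, 100, 50]
```

The last chunk is simply shorter, so no padding or special-casing is needed.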

The first step is to break up each newsletter send into a separate job so that they can run in parallel.
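A rough plain-Ruby sketch of this fan-out, with no Rails required. The array stands in for the background queue, the hashes stand in for newsletter records, and the SendNewsletterJob name is a hypothetical placeholder; in a Rails app the queue push would be a perform_later call on an ActiveJob class.

```ruby
# Stand-in for a background queue. In Rails this role is played by ActiveJob,
# and `queue << ...` would be a `perform_later` call on a job class.
def enqueue_newsletter_jobs(newsletters, queue)
  newsletters.each do |newsletter|
    # One job per newsletter, so a huge send no longer blocks a small one.
    queue << { job: "SendNewsletterJob", newsletter_id: newsletter[:id] }
    # The status is flipped right after enqueueing, before any email goes out.
    newsletter[:status] = :sent
  end
  queue
end

newsletters = [
  { id: 1, status: :scheduled },
  { id: 2, status: :scheduled }
]
queue = enqueue_newsletter_jobs(newsletters, [])

puts queue.size                                  # prints 2
puts newsletters.map { |n| n[:status] }.inspect  # prints [:sent, :sent]
```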

One problem with this approach is the call to newsletter.update(status: :sent). At that point the emails have not actually been sent to the users; the jobs are merely queued. What we really want is to run every sending job and update the newsletter status only when the last job completes.

We need to record the IDs of all individual jobs in the batch. I like using Redis for storing this kind of ephemeral data. For a unique list of IDs, a Redis SET is a good data structure.
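A sketch of the bookkeeping, under stated assumptions: FakeRedisSets is a hypothetical in-memory stand-in that borrows the sadd and scard method names from the redis-rb client so the example runs without a Redis server (its return values are not meant to match the real client). The register_batch helper and the "newsletter-42" key format are also made up for illustration.

```ruby
require "set"
require "securerandom"

# In-memory stand-in that mimics the SET commands used here (SADD, SCARD).
class FakeRedisSets
  def initialize
    @sets = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Adding the same member twice leaves the set unchanged, as in Redis.
  def sadd(key, member)
    @sets[key].add(member.to_s)
  end

  def scard(key)
    @sets[key].size
  end
end

# Record one job ID per chunk of recipients under the batch key.
def register_batch(store, batch_id, user_ids, chunk_size: 100)
  user_ids.each_slice(chunk_size).map do |chunk|
    job_id = SecureRandom.uuid # random UUID used as the job's ID in this sketch
    store.sadd(batch_id, job_id)
    { job_id: job_id, user_ids: chunk }
  end
end

store = FakeRedisSets.new
batch_id = "newsletter-42" # hypothetical key format
jobs = register_batch(store, batch_id, (1..250).to_a)

puts store.scard(batch_id) # prints 3
```

Because a SET ignores duplicates, registering the same job ID twice (for example on a retry) does not inflate the count of outstanding jobs.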

Now, when each sending job completes, it can remove its own job ID from Redis and check whether any other jobs are left.

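# app/jobs/send_newsletter_user_group_job.rb
class SendNewsletterUserGroupJob < ApplicationJob
  after_perform :batch_tasks

  def perform(newsletter, users_ids, batch_id)
    ...
  end

  private

  def batch_tasks
    # the callback receives no parameters, so read them from the job's arguments
    newsletter, _users_ids, batch_id = arguments
    # remove own ID
    SEND_NEWSLTTER_BATCH.srem(batch_id, job_id)
    # check if other IDs are present
    if SEND_NEWSLTTER_BATCH.scard(batch_id) == 0
      # this was the last job in the batch, so the newsletter is fully sent
      newsletter.update(status: :sent)
      SEND_NEWSLTTER_BATCH.del(batch_id)
    end
  end
end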

We can now consolidate our jobs so SendNewslettersJob calls SendNewsletterUserGroupJob directly.
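The consolidated flow can be simulated end to end in plain Ruby. This is only a sketch: hashes and a Set stand in for the newsletter records, the job queue, and the Redis SET, and the enqueue_group_jobs and complete_job helpers are hypothetical names, not part of the application.

```ruby
require "set"
require "securerandom"

# The parent step creates one group job per chunk directly and records each
# job ID under the newsletter's batch key.
def enqueue_group_jobs(newsletters, pending, chunk_size: 100)
  newsletters.flat_map do |newsletter|
    batch_id = "newsletter-#{newsletter[:id]}" # hypothetical key format
    newsletter[:user_ids].each_slice(chunk_size).map do |chunk|
      job_id = SecureRandom.uuid
      pending[batch_id] << job_id
      { job_id: job_id, batch_id: batch_id, newsletter: newsletter, user_ids: chunk }
    end
  end
end

# What each group job does once its chunk is sent: remove its own ID and,
# if nothing is left in the batch, mark the newsletter as sent.
def complete_job(job, pending)
  pending[job[:batch_id]].delete(job[:job_id])
  return unless pending[job[:batch_id]].empty?

  job[:newsletter][:status] = :sent
  pending.delete(job[:batch_id])
end

pending = Hash.new { |hash, key| hash[key] = Set.new }
newsletter = { id: 7, status: :scheduled, user_ids: (1..250).to_a }
jobs = enqueue_group_jobs([newsletter], pending)

jobs[0..-2].each { |job| complete_job(job, pending) }
puts newsletter[:status] # prints scheduled

complete_job(jobs.last, pending)
puts newsletter[:status] # prints sent
```

The status changes only after the final chunk finishes, regardless of the order in which the group jobs complete.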