One tool that was featured rather prominently was the Penn Phonetics Lab Forced Aligner (P2FA). This tool takes a recording of speech and a transcription of that speech as input, and returns a word- and phone-level alignment of the transcription to the audio (please see the P2FA page for more details).

Of course, once you have a large corpus of time-aligned transcriptions, the ideal next step is an automated analysis of the acoustic data. In fact, this is the goal of the FAAV project, which focuses on analyzing vowel formant data. The most recent version of our code to automatically analyze vowels is hosted here: https://github.com/JoFrhwld/FAAV/tree/master/extractFormants

However, for most purposes, no automated method of acoustic analysis exists yet. For example, if you wanted to study -ing ~ -in variation, or TD deletion, you would first have to build a classifier, which would require some hand-coded data anyway.

Documentation

So, I've written an interactive Praat script that allows you to rather flexibly define segments to search for, narrow down the search to specific word and segmental contexts, and define segmental contexts to exclude, as well as a list of stop words. Given an audio file and the output of P2FA or FAAValign, the script will search for the specified contexts, play them, and allow you to enter a code. It will then write your code to the output file along with other important information about the token. This output can be used for analysis on its own, or as training data for a classifier.

Setup
Open a LongSound file and a TextGrid in Praat. These two objects must have the same name. Next, open handCoder.praat. To run the script, select Run > Run from the script window's menu.

Defining the Search
A dialogue box will open, allowing you to define segments to search for, and refinements of the search context. The default settings are for coding TD deletion.

You can understand these settings this way:

Search objects with these names.

Send output to this file.

Search for T and D.

Restrict the search to word final contexts.

The segment must be preceded by a consonant.

No restriction on following context.

Exclude segments preceded by R.

Exclude segments followed by T, D, TH, DH, JH, and CH.

Exclude AND.

Play a window of 3 words preceding and following the word the segment is in.

There is no default code.
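
To make the logic of these default settings concrete, here is a rough Python sketch of the same filter. It is not the Praat script itself, only an illustration under several assumptions of mine: the alignment has already been read into a list of (word, phones) pairs, phones use ARPAbet labels where vowels carry a stress digit, the preceding segment is looked up within the same word, and the following segment is the first phone of the next word. All names here (find_tokens, context_window, and so on) are hypothetical.

```python
# Default TD-deletion settings, mirrored from the list above.
TARGETS = {"T", "D"}
EXCLUDE_PRECEDING = {"R"}
EXCLUDE_FOLLOWING = {"T", "D", "TH", "DH", "JH", "CH"}
STOP_WORDS = {"AND"}
WINDOW = 3  # words of context on each side


def is_consonant(phone):
    # Assumption: vowels end in a stress digit (e.g. "AE1"); consonants do not.
    return bool(phone) and not phone[-1].isdigit()


def find_tokens(words):
    """Return (word_index, word, segment) for each word-final T/D that passes the filter.

    `words` is a list of (word_label, [phones]) pairs in utterance order.
    """
    hits = []
    for i, (word, phones) in enumerate(words):
        # Skip stop words, and words too short to have a preceding segment.
        if word.upper() in STOP_WORDS or len(phones) < 2:
            continue
        segment, preceding = phones[-1], phones[-2]
        if segment not in TARGETS:
            continue
        # Must be preceded by a consonant, but not by an excluded one.
        if not is_consonant(preceding) or preceding in EXCLUDE_PRECEDING:
            continue
        # Following segment: first phone of the next word, if there is one.
        has_next = i + 1 < len(words) and words[i + 1][1]
        following = words[i + 1][1][0] if has_next else None
        if following in EXCLUDE_FOLLOWING:
            continue
        hits.append((i, word, segment))
    return hits


def context_window(words, index, n=WINDOW):
    """Labels of the words to play: n before and n after the target word."""
    return [w for w, _ in words[max(0, index - n): index + n + 1]]


example = [
    ("I", ["AY1"]),
    ("MISSED", ["M", "IH1", "S", "T"]),
    ("AND", ["AE1", "N", "D"]),
    ("KEPT", ["K", "EH1", "P", "T"]),
    ("THE", ["DH", "AH0"]),
    ("OLD", ["OW1", "L", "D"]),
    ("CARD", ["K", "AA1", "R", "D"]),
    ("BAT", ["B", "AE1", "T"]),
]

tokens = find_tokens(example)
print(tokens)  # [(1, 'MISSED', 'T'), (5, 'OLD', 'D')]
```

In this made-up example, AND is dropped as a stop word, KEPT because the following segment is DH, CARD because the preceding segment is R, and BAT because the preceding segment is a vowel.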

This is what the settings for -ing or str- coding would look like.

Coding
As the script runs, it will play the stretch of audio surrounding each segment that meets the search criteria. Then, the coding window will open. It contains two fields: one for codes, and one for comments. After you enter a code and any comments, pressing Enter or clicking Continue moves on to the next segment.

Output
The output of this script is a tab-delimited file with the following pieces of data for each segment.