Saturday, February 12, 2011

Q: A new software for analyzing survey data

Q is a new software package from Australia that specializes in the analysis of market research surveys. It is designed for analysts who primarily work with survey data. I test-drove the professional edition of Q and found it to be a welcome addition.

I downloaded Q from http://www.q-researchsoftware.com/download.aspx and obtained a free 30-day trial license, which offers full functionality, including importing one’s own data for analysis. I imported a few data sets into Q without any hassle. Q reads SPSS as well as CSV files.

Q has a tabbed GUI with four tabs. The opening tab is called Table, which by default presents the summary statistics of the first numeric variable in the data set. The second tab, Variables and Questions, is the most important because here each variable is assigned a type (categorical, continuous, etc.), which then determines what analyses can be performed with that variable.

Since Q is specifically designed for analysing survey data, variables are also designated by the type of answer solicited in the actual survey instrument. Thus variables are categorized as ‘pick one’ in instances where respondents were presented with multiple choices from which they had to pick one. Similarly, other types include ‘pick any’, ‘number grid’, and ‘ranking’. The type ‘experiment’ refers to variables that capture conjoint analysis data in a stated preference choice experiment.

The Data tab presents a tabular view of the underlying data, and the Notes tab allows one to review notes related to the data set. The Q GUI also informs the analyst if the data were filtered or weighted using a weight variable. The filtering option allows simple as well as complex filters that may involve one or more variables.

The first variable in my data set was the ID variable that identified each respondent in the data set. Q by default computed descriptive statistics for the ID variable. Since I was not particularly interested in determining the average value for the ID variable, I selected a categorical variable from my data, and Q quickly displayed the frequency table as percentages. When I changed the Summary option and instead opted for another categorical variable, Q quickly displayed a crosstab between the two variables.
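For readers who want to see what these two summaries amount to outside Q, here is a minimal Python sketch. The responses and the function names are hypothetical and are not part of Q; the code only illustrates a frequency table expressed as percentages and a two-way crosstab of counts.

```python
from collections import Counter

# Hypothetical survey responses: (respondent id, preferred brand, trip purpose)
responses = [
    (1, "Airline A", "business"),
    (2, "Airline B", "holiday"),
    (3, "Airline A", "holiday"),
    (4, "Airline A", "business"),
]


def frequency_percentages(values):
    """Return {category: percentage of respondents} for one categorical variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


def crosstab_counts(rows, cols):
    """Return a Counter keyed by (row category, column category)."""
    return Counter(zip(rows, cols))


# The ID column is skipped on purpose: averaging identifiers is meaningless.
brands = [r[1] for r in responses]
purposes = [r[2] for r in responses]

brand_pct = frequency_percentages(brands)
table = crosstab_counts(brands, purposes)
```

Because `table` is a `Counter`, a combination that never occurs in the data simply reads as 0.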

For stated preference data, Q uses the wide data format, where each respondent is represented in a single row. Most econometrics software uses the long format for stated preference data because it spares one from managing a large number of variables. Consider working with stated preference data about the choice of airlines, where the survey respondents are presented with five alternatives (i.e., airlines). Let us further assume that each alternative is identified by three attributes, such as airfare, flight duration, and in-flight amenities. Let us also assume that each attribute, such as airfare, has three levels: low, medium, and high. And lastly, let us assume that each respondent is presented with five sets of choices, which Q refers to as tasks.

This experiment will generate 225 variables (5 alternatives x 3 attributes x 3 levels x 5 tasks), which are sometimes hard to manage, though Q offers a sophisticated environment in the Variables and Questions tab to set up the experiment as described above. Also, when the raw stated preference data are recovered from computerized survey instruments that automatically populate the database, such as Sawtooth or web-based survey tools, the data are already in wide format, which Q can handle with ease.
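The variable count and the difference between the two layouts can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column naming scheme, the one-indicator-column-per-level coding, and the `wide_to_long` helper are all hypothetical and do not describe how Q or Sawtooth actually store the data. The respondent's chosen alternative is left out to keep the sketch short.

```python
from itertools import product

# Hypothetical design matching the airline example in the text.
ALTERNATIVES = ["air1", "air2", "air3", "air4", "air5"]
ATTRIBUTES = ["airfare", "duration", "amenities"]
LEVELS = ["low", "medium", "high"]
TASKS = [1, 2, 3, 4, 5]

# Wide format, assuming one 0/1 indicator column per task, alternative,
# attribute, and level (the coding implied by the 5 x 3 x 3 x 5 count).
wide_columns = [
    f"t{task}_{alt}_{attr}_{level}"
    for task, alt, attr, level in product(TASKS, ALTERNATIVES, ATTRIBUTES, LEVELS)
]


def wide_to_long(respondent_id, wide_row):
    """Turn one respondent's wide row into one long row per (task, alternative)."""
    long_rows = []
    for task, alt in product(TASKS, ALTERNATIVES):
        row = {"id": respondent_id, "task": task, "alternative": alt}
        for attr in ATTRIBUTES:
            # The level shown is the one whose indicator column equals 1.
            row[attr] = next(
                level
                for level in LEVELS
                if wide_row[f"t{task}_{alt}_{attr}_{level}"] == 1
            )
        long_rows.append(row)
    return long_rows


# Hypothetical respondent who saw every attribute at its "low" level.
example_wide = {col: int(col.endswith("_low")) for col in wide_columns}
example_long = wide_to_long(1, example_wide)
```

In this sketch the single wide row with 225 columns becomes 25 long rows (5 tasks x 5 alternatives), each carrying only the three attribute columns plus identifiers.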

Strengths

Q offers certain unique features that are not available in other software. When it presents crosstabs, it adds arrows to indicate statistical significance: a blue arrow represents positive significance and a red arrow negative significance. In choice experiments, the software presents the estimated coefficients for the explanatory variables in tabular format. A crosstab between airline brand type and trip purpose is presented in Figure 2. The significance test reported by Q does not test the null hypothesis that the estimated coefficient equals 0. Instead, Q tests the estimated coefficient for one trip purpose against its complement. Notice the last row, which shows a blue upward arrow for the price coefficient for business travellers against a red downward arrow for the price coefficient for holiday travellers. The colouring suggests that the two coefficients are statistically different from each other, with business travellers appearing to be less price-sensitive than holiday travellers.

While most econometrics software allows one to test whether the estimated coefficients for different groups are statistically different from each other, the process requires several extra steps. Q, on the other hand, does it on the fly.
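To give a sense of the extra steps involved elsewhere, the following Python sketch compares two coefficients with a simple z-test. It assumes the two estimates come from independent samples, and the coefficient values and standard errors are made up for illustration; they are not taken from Q's output, and this is not necessarily the test that Q itself runs.

```python
import math


def coefficient_difference_z(b1, se1, b2, se2):
    """z statistic for H0: b1 == b2, assuming the two estimates are independent."""
    return (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)


def two_sided_p_value(z):
    """Two-sided p-value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical price coefficients for business and holiday travellers.
z = coefficient_difference_z(b1=-0.2, se1=0.1, b2=-0.8, se2=0.15)
p = two_sided_p_value(z)
```

A small `p` would indicate that the two groups differ in price sensitivity. If the groups were estimated jointly, the covariance between the two coefficients would also have to enter the denominator.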

Another unique feature is called Banner, which allows complex crosstabs involving more than two categorical variables. For instance, suppose one wants to determine whether the preference for a particular brand differs by gender, age group, and country of residence. Q would present these differences in a single table, whereas most other statistical analysis software would generate multiple crosstabs. Furthermore, Q permits one to aggregate categories by clicking on the output in a crosstab, thus eliminating the need to first recode the variable.
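The idea of a banner table can be sketched in Python as several crosstabs placed side by side under a common set of rows. The records and the `banner_table` function below are hypothetical; they show the general layout, not Q's implementation.

```python
from collections import Counter

# Hypothetical respondents with a brand preference and two banner variables.
records = [
    {"brand": "A", "gender": "F", "country": "AU"},
    {"brand": "A", "gender": "M", "country": "AU"},
    {"brand": "B", "gender": "F", "country": "CA"},
    {"brand": "A", "gender": "F", "country": "CA"},
]


def banner_table(rows, row_var, banner_vars):
    """Column percentages of row_var within each category of each banner variable."""
    row_cats = sorted({r[row_var] for r in rows})
    table = {row_cat: {} for row_cat in row_cats}
    for var in banner_vars:
        cell_counts = Counter((r[row_var], r[var]) for r in rows)
        column_totals = Counter(r[var] for r in rows)
        for col_cat in sorted(column_totals):
            for row_cat in row_cats:
                share = 100.0 * cell_counts[(row_cat, col_cat)] / column_totals[col_cat]
                table[row_cat][f"{var}={col_cat}"] = share
    return table


banner = banner_table(records, "brand", ["gender", "country"])
```

Each row of `banner` holds one brand, and the columns run across all categories of gender and then country, so the comparisons that would otherwise need two separate crosstabs sit in one table.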

Q is also well integrated with Excel. I was able to export tables from Q into Excel with a simple mouse click. Q automatically formatted the table in Excel and generated a graph from it in a separate sheet. This simplifies sharing results with colleagues who may not work with Q but can review the results in MS Excel. Q also comes with a free version that allows one to review Q’s output.

Weaknesses

Q is distinct from other software in many ways. However, some of its features are idiosyncratic and do not conform to the intuitions that most analysts have accumulated by working with other software, such as SPSS and Excel. One key distinction is Q’s nomenclature. Q calls a simple regression model with a categorical explanatory variable a ‘split cell experiment’. The main disadvantage is that most new users of Q, who may have worked with similar software or taken courses in statistics or market research, will have had no exposure to Q’s terminology, which therefore has to be learnt afresh.

Q supports a point-and-click environment and does not generate a log or a syntax file. This makes reproducing results or repeating an analysis more cumbersome. The developers may want to include this feature in a later version.

Q is rather expensive for the analytics it offers. Q Professional costs $1,499 per license. A transferable license costs three times as much. Within advanced analytics, Q supports OLS, generalized least squares, logit models, cluster analysis, and principal component analysis. There is a whole host of other advanced econometric tools available in competing software that costs much less.

Final word

Whereas Q has many unique features, most of its advanced core competencies are readily available in other software, such as SPSS, Stata, EViews, and R. It does stand out in offering advanced data management capabilities for survey data. Q will be a preferred tool for those market researchers who rely more on cross tabulations. For others who subject their data to advanced econometrics, such as nested logit models, testing for self-selection biases, or post-estimation tests, Q offers a rather restricted set of tools.

About Me

I am an associate professor at the Ted Rogers School of Management at Ryerson University. I am the author of Getting Started with Data Science: Making Sense of Data with Analytics.
My academic interests are analytics, housing and transport markets in urban contexts. My other interests are South Asian culture, politics, and economics.