Here is the code for generating the necessary data. I focus on non-dismissed officers and use a first name list which originates from a list provided by a German computer magazine (c’t) and maintained by Matthias Winkelmann. This is not perfect but the best that I can do, given the data. I eliminate names that have duplicate gender classifications.
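The step of eliminating ambiguous names can be sketched in base R as follows. This is only a minimal illustration: the data frame `names_df`, its columns `first_name` and `gender`, and the toy entries are all assumptions and not the actual name list used for the post.

```r
# Toy name list (hypothetical); the real list has many more entries.
names_df <- data.frame(
  first_name = c("Andrea", "Andrea", "Anna", "Peter", "Kim", "Kim", "Anna"),
  gender     = c("F", "M", "F", "M", "F", "M", "F"),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows so that repeated identical entries
# are not mistaken for conflicting classifications.
names_df <- unique(names_df)

# Count the number of distinct gender labels per first name.
n_genders <- tapply(
  names_df$gender, names_df$first_name,
  function(g) length(unique(g))
)

# Keep only names that carry exactly one gender classification.
unambiguous <- names(n_genders)[n_genders == 1]
clean_names <- names_df[names_df$first_name %in% unambiguous, ]
```

In this toy example, "Andrea" and "Kim" are listed under both genders and therefore drop out, while "Anna" and "Peter" remain.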

This does not look good for Baden-Wuerttemberg. Also, the female officer share is larger in former East Germany than in former West Germany (17.8 % vs. 16.4 %; a Chi-square test is highly significant). Without further analysis I can only guess that regions with a significant tourist industry (Northern Mecklenburg-Vorpommern, Southern Bavaria) tend to have a higher share of female officers, while areas with a significant agricultural sector (e.g., Northern Brandenburg) have lower female representation.
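For readers who want to run such a comparison themselves, here is a rough sketch of a Chi-square test on a 2x2 table of region by gender. The counts below are made up and chosen only so that the shares match the percentages quoted above; they are not the actual sample sizes, so the resulting test statistic says nothing about the real data.

```r
# Hypothetical counts (NOT the real data): rows are regions,
# columns are the number of female and male officers.
counts <- matrix(
  c(1640, 10000 - 1640,
     890,  5000 -  890),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    region = c("West", "East"),
    gender = c("female", "male")
  )
)

# Female officer share by region.
shares <- counts[, "female"] / rowSums(counts)

# Chi-square test of independence between region and gender.
test <- chisq.test(counts)
print(shares)
print(test)
```

How significant the difference turns out to be depends heavily on the number of officers in each region, which is why the toy counts should be replaced with the real ones.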

Can we say something when we look at a more granular level (PLZs instead of two digit PLZ regions)? Here I limit the data to PLZs that list at least 30 officers.
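The 30-officer cutoff could be applied along these lines. Again, this is a sketch under assumptions: the data frame `officers`, its columns `plz` and `female`, and the toy rows are hypothetical stand-ins for the real officer data.

```r
# Toy officer-level data (hypothetical): one row per officer.
officers <- data.frame(
  plz = c(rep("10115", 40), rep("79098", 35), rep("18055", 10)),
  female = c(
    rep(c(TRUE, FALSE), times = c(8, 32)),
    rep(c(TRUE, FALSE), times = c(5, 30)),
    rep(c(TRUE, FALSE), times = c(3, 7))
  ),
  stringsAsFactors = FALSE
)

# Number of officers and female share per PLZ.
n_officers   <- tapply(officers$female, officers$plz, length)
female_share <- tapply(officers$female, officers$plz, mean)

plz_stats <- data.frame(
  plz          = names(n_officers),
  n            = as.vector(n_officers),
  female_share = as.vector(female_share),
  stringsAsFactors = FALSE
)

# Keep only PLZs that list at least 30 officers.
plz_stats <- plz_stats[plz_stats$n >= 30, ]
```

The cutoff trades coverage for stability: shares computed from a handful of officers are too noisy to interpret.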

Joachim Gassen

The ‘ExPanDaR’ package offers a toolbox for interactive exploratory data analysis (EDA). You can read more about it here. The ‘ExPanD’ shiny app allows you to customize your analysis to some extent but often you might want to continue and extend your analysis with additional models and visualizations that are not part of the ‘ExPanDaR’ package.
Thus, I am currently developing an option to export the ‘ExPanD’ data and analysis to an R Notebook.

Interactive EDA is nice but customized interactive EDA is even nicer. To celebrate the new CRAN version of my ‘ExPanDaR’ package I prepare a customized variant of ‘ExPanD’ to explore the U.S. EPA data on fuel economy. Our objective is to develop an interactive display that guides the reader on how to explore the fuel economy data in an intuitive way.
First, let’s load the packages and the data from EPA’s web page.

Overview
Today, I used a shiny app to run a classroom experiment in the first class of my introductory cost accounting course. I uploaded code, data and materials to GitHub so that everybody can reuse them to construct similar experiments and, of course, to replicate the results from our experiment.
The experiment tests whether cost allocation (variable cost or full cost) affects pricing decisions in a simple one-product pricing setting.

I am an applied economist working in the area of accounting and corporate transparency. I work with observational data a lot, meaning with data that is already available and not under my control. Whenever I set sail to design a test, there are a lot of decisions to make: Which sample should I use? What is the appropriate time frame? How do I define my dependent and independent variables? What is the functional relation that I expect between dependent and independent variables?

Last week, the German NGO Open Knowledge Foundation Deutschland e.V. made German Trade Register data available via the project OffeneRegister.de, together with the British NGO opencorporates. In my last blog post I checked the general accessibility of the data. In this quick follow-up post I follow an idea inspired by a tweet by Johannes Filter to map the gender balance of German corporate officers.
Here is the code for generating the necessary data.

Last week, the German NGO Open Knowledge Foundation Deutschland e.V. made German Trade Register data available via the project OffeneRegister.de, together with the British NGO opencorporates. While the data from the German Trade Register is publicly available in principle, retrieving it is a case-by-case activity and very cumbersome (try for yourself if you like). The data provided by OffeneRegister.de instead comes with an easy-to-navigate API and, what is even more convenient, is available for bulk download (either as a JSON file or as a SQLite database file).

As the year is drawing to a close, why not spend some of the free time exploring your email data using R and the tidyverse? When I learned that Mac OS Mail stores its internal data in a SQLite database file I was hooked. A quick dive into your email archive might uncover some of your old acquaintances. Let’s take a peek.
Obviously, the below is only applicable when you are a regular user of the Mail app for Mac OS.

Exploratory data analysis is important, everybody knows that. With R, it is also easy. Below you see three lines of code that allow you to interactively explore the Preston Curve, the prominent association of country level real income per capita with life expectancy.
```r
install.packages("ExPanDaR")
library(ExPanDaR)
wb <- read.csv("https://joachim-gassen.github.io/data/wb_condensed.csv")
ExPanD(wb, cs_id = "country", ts_id = "year")
```
After running these three lines of code (OK, four if you have to install the ExPanDaR package first), a shiny window will open, allowing you to explore a country-year panel of World Bank data and looking something like this.

The awesome blog post by Tyler Morgan-Wall on 3d printing maps with his rayshader package rekindled an old desire of mine: Sometimes I would like to touch data. I am a big fan of data visualization and being able to add a third dimension and this haptic feel to the mix was just too much for me to let this idea pass.
While Tyler keeps teasing us with references to an upcoming rayshader update that will allow the 3d mapping of ggplot output, I could not wait for this to hit GitHub.

Did you ever want to do a quick exploratory pass on a panel data set? Did you ever wish to give somebody (e.g., a reviewer or a fellow researcher) the opportunity to explore your data and your findings but could not provide your raw data? Do you get bored of producing the same tables and figures over and over again for your panel data project? If your answer to one of the questions above is yes, then the new ExPanDaR R package might be worth a look.