Text classification is an important task for publishers. It allows editorial teams to identify overperforming and underperforming topics, spot changes in readership preferences, and manage overall content production in a strategic, long-term way.

In my experience at news and evergreen publishers (CNN, Men’s Health, Runner’s World, Bicycling, etc.), I had never seen such an engaged audience until I got to Fusion Media Group. The Pareto principle applies overall across media companies: 20% of audiences consume 80% of stories. But what I’m seeing is a smaller…

Every day, the Kinja team works to iterate on the platform Fusion Media Group uses for its network of media sites. As part of our job on the data team, we provide product managers and developers with information to help them create new features that improve users’ experiences, enhance editorial’s storytelling tools…

The day has come when Josh Laurito, OG founder of the FMG data blog, lover of eggs and logs, is leaving us. To say goodbye, former and present Gawker/GMG/FMG employees have come together to share their memories of Josh, get a few last jabs in, and send him off in true Kinja fashion.

We love A/B testing here at Fusion Media Group. We’ve come a long way since 2015, when we first started testing: in the past three years we’ve grown the testing culture from none at all to A/B tests being a key and necessary part of the product development process. Experiments allow us to measure and be confident in…

Here at Fusion Media Group, we have fully embraced DoubleClick for Publishers’ custom key-values. We use them in part as they are intended: for improved targeting and forecasting around things DFP doesn’t track natively (e.g., article topics or how many pages a user has seen so far in their visit). But we’ve also been…

We just started using Anaconda, rather than pip/virtualenv, to manage dependencies in the codebase for our data warehouse. The combination of pip and virtualenvs with requirements.txt files has served us well, but we switched because conda is more standard for analytics work, and because it’s by far the easiest way to…

Around the end of the year, the data team gets a lot of requests from the editorial teams to pull the posts that had the highest traffic that year. Usually, they’re looking to figure out which of the posts they wrote that year did the best, but there are always posts that were published in previous years that gain…

Just like last semester and the year before that, my students in City University of New York’s data visualization class (IS608) have finished their final projects. You can check out all of them (from this year and previous years) here.

Do you ever find yourself using the same formula, day in and day out, and thinking, “Wow, I wish Excel just had a function for this”? Turns out, you can create your own functions in Excel to do whatever you want (for the most part)!

This week, we got A/B testing working on our AMP pages. While we really like AMP, we had more trouble than I expected due to some rough edges and components of the AMP documentation that I think are slightly misleading, so I figured I would write down my experiences here.