A personal view on statistics in earth sciences

Tag Archives: Statistics

And here we go! Another EGU started today, my fifth all in all. It is always nice to spend a few days in Vienna at the start of spring, and this year too the Austrian capital is offering us some sun to begin with. Again I took the night train to Vienna, this time with a massive delay (5 hours).

After the reception last night, the conference really got going today with the first sessions. I started with an interesting time series analysis session. The talks jumped between topics, some quite mathematical, some very detailed in their methodology. Next stop was the big data and machine learning session, which was really busy. Far too many scientists wanted to get in, which is understandable given how hyped these topics currently are.

After lunch it was time for the sea-level sessions, with great talks. As this is still a topic of much interest to me, I always enjoy a good session on it. For me the last presentation session of the day was the one on dynamical extremes, before the day closed with the poster session.

The conference centre

So all in all a quite statistical day to start with; the next day will certainly look a bit different. My contributions come up on Wednesday with a poster and on Thursday with a talk, both again on quite statistical topics, but more on that closer to the middle of the week.

Time runs fast; that is true for everyone. From time to time, at the end of the year or on birthdays, you look back at what the last year has brought you: the good things, the bad and of course what you want to achieve in the future. Five years ago today I finished my PhD, so it is a good time to do the same.

So what has happened since? I have worked my way into two completely new topics: palaeo-sea-level reconstruction and long-term, especially seasonal, climate prediction. In that time I have published three first-author papers and four as a co-author, have lived abroad for more than two years and have been to numerous conferences in North America and Europe. I have been active in teaching, have co-supervised students and have learned many more skills. In short, I have worked as a scientist and am in my second post-doc phase.

That is a long list, but as always, under the pressure scientists are under today, you always want more. Hopping between topics has proven a challenge for me: it cost time to adapt, and changing institutions always requires care not to start completely from scratch. The first station, sea-level research, turned out to be a surprise for me, as I really enjoyed working in the interdisciplinary environment. As a meteorologist, looking at topics based in oceanography is sometimes quite strange. Ideas are different even when the methodology, the maths and physics, is in many respects the same. I focused on simple models and data assimilation and am quite happy with the results. The second station, seasonal climate prediction, has been more challenging than expected. Going from simple to complex models and working on real meteorology was new for me, and I needed time to adapt. But finally things seem to be coming to fruition and I am positive about the future.

During the past five years I have specialised even more in the field I started in during my PhD: the development and application of new statistical methodology. I have now done this in many different fields and am on course to cover, in publications, the three main fields of statistical data analysis in geosciences: data analysis and data assimilation are done, and verification will hopefully follow in the upcoming year. And with this we come to the future. What are the aims for the next years?

The main aim is of course to publish more, to get even better at teaching and to get better at doing science and research. Currently there are four first-author papers in the final phase before submission, so I hope this will become a successful year in this area. I took my biggest steps in teaching last year, so I am now working on consolidating them. New challenges will of course be to apply for funding and to work even more in the supervision of students. All this will hopefully lead to the next steps and of course to some new collaborations.

So all in all, I am quite happy with the last five years. I have learned a lot and hopefully I will be able to learn even more in the next years. Science is still a lot of fun for me and I am on track for it to stay that way. So bring on the next five years and let us see where we end up.

History is important: it explains to us how we got to where we are and influences our future more than many would admit. This is true in life, but also in science. In lecturing we usually teach concepts and methodologies, many of them developed in the last five centuries, and each was developed against a background. This background tells us a lot about why these methodologies gained the importance they have nowadays, and only when we understand it do we understand why they are so prominent compared to other methodologies, which we do not necessarily teach nowadays. Nevertheless, we usually keep any mention of this background quite brief and, if at all, some words about it can be found in books. But is this the right way?

From time to time you listen to a talk and, at a moment when you do not expect any surprises, you hear the argument that statistics is objective. It is often used to strengthen other arguments and to forestall doubts about them. Much too often statistics is given a credibility it does not deserve, and so the occasions on which it is usefully applied get devalued. With this in mind, one thing has to be clear: Statistics is subjective… always!

Quite often the words about objectivity are used in haste, often also by scientists who should know better. Many arguments for the objectivity of statistics come from the past, when frequentist statistics was the norm and its application to nearly every problem was seen as appropriate and therefore objective. But many forget that by using frequentist statistics they make a choice. A choice of assumptions that many learnt years ago and have long forgotten.

Be it an assumption of normal distributions, of stationarity or of ergodicity. Let’s be honest, those are never fulfilled, but they are always accepted indirectly by choosing a standard methodology. And when you doubt that you have a choice of statistical methodology, the answer is that in nearly every case you do. You do not have to go all the way to Bayesian statistics; there are many steps in between. The start is usually to think about the assumptions you are currently making in your methodology and then to go the extra step and think about what happens when one of them fails.
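To make this a bit more concrete, here is a minimal sketch in Python (standard library only) of what dropping one assumption can look like. It computes an interval for a mean twice: once with the usual normal-theory formula and once with a percentile bootstrap, which does not assume normality but still treats the values as independent draws. The data are synthetic and skewed on purpose, and the function names `normal_ci` and `bootstrap_ci` are made up for this illustration. It is one possible in-between step, not a recommendation for any particular dataset.

```python
import random
import statistics


def normal_ci(sample, z=1.96):
    """Approximate 95% interval for the mean, assuming a normal sampling distribution."""
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / len(sample) ** 0.5
    return mean - z * sem, mean + z * sem


def bootstrap_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean.

    No normality assumed, but resampling with replacement still treats
    the values as independent and identically distributed.
    """
    rng = random.Random(seed)
    n = len(sample)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = means[int(alpha / 2 * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Synthetic, strongly skewed sample (an assumption made purely for illustration).
data_rng = random.Random(42)
sample = [data_rng.lognormvariate(0.0, 1.0) for _ in range(30)]

print("normal-theory interval:", normal_ci(sample))
print("bootstrap interval:    ", bootstrap_ci(sample))
```

The normal-theory interval is symmetric around the mean by construction, while the bootstrap interval is free to be asymmetric. The point is not that one of them is right, but that the choice between them is an assumption you make, whether you notice it or not.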

Most important is that we start to teach students that statistics is based on assumptions. For me it is a game of assumptions, and usually you have a lot of freedom in making them. That does not mean that standard methods should not be taught; indeed, for everyone who works in geoscience a foundation in statistical techniques is necessary. But it is equally important that we make clear that each methodology has its disadvantages and that alternatives exist. They are not necessarily easy to calculate, but they are certainly worth at least a few seconds of thought.

Nearly two months ago I wrote about how the preparations for the lecture in the new term had started; now it is time to wrap them up, as the term starts next week. So what have I achieved up to now? Well, nearly all lectures are prepared. I have one left to do, but this will be done nearer to the actual lecture, because I need one for a bit of wiggle room in the middle (in case I am too slow or I see that the students do not get used to my concepts). I have also managed to come up with ideas for, and prepare, most of the practical sheets the students have to do. So far I am quite happy with that, but I will only see in the active phase whether this really works out as planned.

Part of an academic job is to lecture. I am very lucky that this duty is part of my obligations, as I really like doing it. In the past I have mainly assisted in teaching or tutored in various lectures, but next term I will get my own lecture to plan and give in full. I will get important assistance on one or two lectures, as my schedule requires me to be away on some dates, but apart from that I will have to fill the four hours a week myself. The topic will be in a statistical area and so closer to my core expertise than the lecturing I have done up to now, which was mainly in the physical areas of climate science.

In the upcoming months I will write some posts about this topic, my experience of preparing the lectures and my thoughts about concepts. Of course I will not talk about the actual lectures, as students should never have to fear being put on the spot. As the topic of the lecture will be the basics of statistics, the posts will be not so much about the actual topics, but about how to present them and how to make it an interesting learning experience for the students.

As there are another two months to go, I have started to prepare the first lectures. All in all there will be roughly 15 weeks to fill, partly with predefined content and partly with practicals. The German system sets a fixed number of hours the student should work on any lecture, and in my case this number works out as 12 hours per week. That is a lot, because even after subtracting the four contact hours there are eight hours left. So it will be a balancing act to get enough material into the lectures while explaining it in a way that a generally unloved topic can be understood. Statistics is for many students like maths, and in applied physics courses like meteorology, oceanography or geophysics that is usually not very popular. Every student's life usually begins with one to two years of mathematical studies, mostly not very connected to the rest of the curriculum, and so the next step with a mostly quite dry topic like statistics is expected to be the same. And unfortunately, therein lies a problem. When you teach statistics too much from the applied side, you do not give context to the maths lectures given before, and it will get harder for the students to get into statistics properly in the future (so not only as an auxiliary subject, but as a real tool which is comfortable to handle). On the other hand, when you do it too mathematically, it is just another hated maths subject. Balancing in the middle is certainly an aim, but not really realistic to achieve.

I am looking forward to this experience, but am also aware that all my planning and thoughts might not work out and that it may end up as a struggle for the students and myself. That is a challenge, and I like challenges.

The new year has started and in recent weeks two new papers with myself in the author list have been published. Together they cover a wide spectrum, and my contribution was in both cases something I would classify as statistical assistance. Therefore, I will keep my comments brief here and just quickly introduce the topics.

The first paper is about the sea-level height at Bermuda roughly 70,000 years ago. It is mainly a geological paper and focusses on the evidence from speleothems, which indicates that sea level there was higher than today at that time. That is important, because many other places in the world had lower than modern sea level at that time. A plot in the later part of the paper shows that the difference between locations in the Caribbean can be up to 30-40 m. This can be explained with glacial isostatic adjustment (GIA) modelling, and the paper is therefore a good help for calibrating those models better.

The second paper focusses on the hindcast skill of two decadal forecasting systems for the Atlantic meridional overturning circulation (AMOC). It shows that both systems have significant hindcast skill in predicting the AMOC up to five years in advance, while an uninitialised model run has not. The time series for evaluating the systems are still quite short, but the extensive statistics in the paper allow the reader to follow transparently the argument for why the systems have this capability.
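For readers who wonder what "significant skill" on a short time series can mean in practice, here is a small sketch in Python. It is explicitly not the method of the paper: it uses a plain correlation between hindcast and observation as the skill measure and a simple permutation test for significance, both chosen by me only for illustration. The series are synthetic, and `pearson` and `permutation_p_value` are hypothetical helper names. Note that shuffling ignores autocorrelation, which real climate series usually have, so a test like this tends to be too optimistic.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(hindcast, observed, n_perm=5000, seed=0):
    """One-sided p-value from shuffling the observations.

    Counts how often a random pairing correlates at least as well as the
    real one. Assumes exchangeable (not autocorrelated) values.
    """
    rng = random.Random(seed)
    actual = pearson(hindcast, observed)
    shuffled = list(observed)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pearson(hindcast, shuffled) >= actual:
            count += 1
    return (count + 1) / (n_perm + 1)


# Synthetic example: a short "observed" series and a noisy "hindcast" of it.
data_rng = random.Random(1)
observed = [data_rng.gauss(0.0, 1.0) for _ in range(12)]
hindcast = [o + data_rng.gauss(0.0, 1.0) for o in observed]

print("correlation skill:", pearson(hindcast, observed))
print("permutation p-value:", permutation_p_value(hindcast, observed))
```

With only a dozen values, the correlation and its p-value change noticeably from one synthetic draw to the next, which is exactly why short evaluation periods call for careful and transparent statistics.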