SAS® and software development best practice. Hints, tips, & experience of interest to a wide range of SAS practitioners. Published by Andrew Ratcliffe's RTSL.eu, guiding clients to knowledge since 1993

There's a BI evolutionary path that runs from simple, static reporting on historic data (often delivered with spreadsheets) through to real-time predictive analytics embedded into front-office transactional systems. Many suppliers who claim to offer BI systems barely get off the ground on the BI flight to delivering real value to the enterprise.

All those products that offer sexy, shiny, slick graphics with animated 2.5D fuel gauges that make your historic data look exciting but don't begin to tell you about where you're headed are flattering to deceive. If you're considering implementing a BI solution, make sure your chosen software will give you the headroom to grow the value that the solution delivers. Don't box yourself in with a sexy solution that ultimately offers no real intelligence.

An oft-overlooked option for PROC MEANS (and PROC SUMMARY) is COMPLETETYPES. It tells MEANS to create all possible combinations of the values of the classification variables, even if some of those combinations don't exist in the data. Add PRELOADFMT to the CLASS statement and MEANS will also create combinations from values that are defined in the variable's format but don't exist anywhere in your input data. This can be very useful in presenting what appears to be a more complete picture of the input data and can be equally useful in presenting a consistent layout amongst a group of reports (or regularly produced reports).
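As a minimal sketch of the two options working together (the format $REGFMT, the data set WORK.SALES and its variables are all invented for illustration, not taken from a real project):

```sas
/* Hypothetical format listing every region we want to see in the report */
proc format;
  value $regfmt
    'N' = 'North'
    'S' = 'South'
    'E' = 'East'
    'W' = 'West';
run;

/* Hypothetical sample data: no rows at all for West, and no North/B rows */
data work.sales;
  input region $ product $ amount;
  datalines;
N A 10
S A 20
S B 30
E B 40
;
run;

proc means data=work.sales noprint nway completetypes;
  class region / preloadfmt;   /* take region values from the format      */
  class product;               /* take product values from the data       */
  format region $regfmt.;      /* PRELOADFMT needs a format on the variable */
  var amount;
  output out=work.summary sum=total;
run;
```

With these assumptions, WORK.SUMMARY should hold a row for every region in the format crossed with every product in the data, with _FREQ_ of zero and a missing TOTAL for the combinations that have no input rows. Dropping COMPLETETYPES and PRELOADFMT should leave only the combinations actually present.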

Tuesday, 23 February 2010

Alongside our series of posts on creating Gantt charts in Excel for the purpose of managing small to medium sized projects, a discussion on "what is a project?" might be useful. Most of us feel we understand the general usage of the term "project", but what does it mean in the context of Project Management?

The Cambridge Advanced Learner's Dictionary defines the noun project as "a piece of planned work or an activity which is finished over a period of time and intended to achieve a particular aim". The key attributes are a) having an aim or purpose, and b) being able to define a start and end date/time. These attributes make projects distinct from ongoing operational, live, production or business as usual (BAU) activities.

The existence of the end date is an important element of the project, and a lot of work goes into agreeing the end date and then making sure the project is delivered/completed by the end date. Plans are drawn-up, often including Gantt charts.

In the previous post in this series I described how to use Conditional Formatting to create a neat and simple Gantt chart alongside a simple Excel-based project plan. In this post I’ll describe how to use dates in addition to the day numbers that were featured in the previous post. The picture alongside (right) shows the result from today's post.

As with the previous case, I’m going to describe a quick and simple method. This method also takes weekends into account as non-working days. We ended the last post with what you see alongside (left).

So, let’s begin by adding the date for day 1 into cell F1 (I'm typing “22/2” to represent 22nd February). It’s not readable in the small width of the cell, so we’ll go to the Format Cells window (you can use Ctrl-1 to get there quickly) and select text orientation as 90 degrees. Then, to get the date format that we want, we’ll stay in the Format Cells window and specify a custom number format of “dd-mmm (ddd)”. If the height of row 1 doesn’t automatically increase for you, just do it manually. You should have a result like this:

Having successfully conducted his long-titled course in the USA, Netherlands, and online, my friend Sunil Gupta is considering the possibility of running it in one or two cities in India. While the schedule is not yet finalised, the favoured period for the two-day class is around April end/early May in Bangalore and maybe Hyderabad.

You might also like to know that Sunil is presenting his "Preparing SAS Programmers for the Pharmaceutical Industry (An Introduction)" course as a pre-conference course at SAS Global Forum (SGF) 2010. There may still be places available.

Wednesday, 17 February 2010

Systems Seminar Consultants' newsletter (named The Missing Semicolon) is always a good read, so I was pleased to get notification of the Winter 2010 issue last week. Featuring a mixture of topics, this issue seems to focus on writing good documentation (program documentation and system documentation). Please don't view this as a switch-off topic! Read the articles and you'll better understand the benefits that properly targeted and focused documentation offers.

However, I do strongly disagree with the author's rule of adding a comment to every line of code. Programming standards always give rise to a strong degree of discussion, but in my opinion slavishly putting comments onto every line of code doesn't add anything to the reader's knowledge of the code. Indeed, in the example code given, the vast majority of on-the-line comments are stating the obvious. Comments should describe what is not obvious in the code - that typically means describing what blocks of code are doing and/or why a particular approach was taken (and why other approaches were considered but discarded).

The issue also offers a review of The Little SAS Book (by Lora Delwiche and Susan Slaughter, whom I featured yesterday), and a nice tip regarding the INFILE statement's MISSOVER option.

I recommend you hop over to Systems Seminar Consultants' publications page and a) sign-up for a free subscription, and b) take some time to browse through the archive of issues.

For any piece of work other than the smallest, it’s worth planning. Planning doesn’t have to mean creating a huge monster in Microsoft Project - I find that Microsoft Excel (or similar) is often sufficient (and a lot more accessible to the team). This post (and the series of posts that follow) describes how to quickly and efficiently create an adequate plan for small to medium sized projects.

I don't expect all developers to be expert project managers, but I do expect my team members to understand the role of the project manager, to know how to work to a plan, and to focus on delivery. And I do expect developers to run their own (small to medium sized) projects from time-to-time.

A project plan can consist of just a list of tasks (preferably with start and end dates) together with the name of the person who will complete the task, but this can be made to communicate a lot more if you can deliver a Gantt Chart too. The name “Gantt Chart” sounds challenging to anybody who hasn’t met one before, but actually it’s a rather simple format that most people are familiar with (often without knowing the name). Gantt Charts can contain a lot of detail and embellishment, but I’m going to describe how to create a simple yet communicative chart very quickly.

Tuesday, 16 February 2010

I'm a keen follower of Susan Slaughter's books (in conjunction with Lora Delwiche) and her Avocet Solutions web site. The web site is very nicely structured and contains a wealth of solid information. Last week, the Little SAS Book Series featured an article about informats and Enterprise Guide 4.2. The article highlighted useful, user-friendly features of EG 4.2's Data Grid, but also warned of the fact that said Data Grid ignores informats.

Reflecting further on the unconference & BarCamp format of the Analytics Camp NC event that I mentioned last week, whereby sessions are proposed and scheduled each day by the attendees and based upon pitches from the potential speakers, I realised that this is a good means of giving feedback to potential speakers and thereby encouraging new speakers.

It's a little daunting to write a paper and send it off to some anonymous conference organiser in the hope that you might be seen to offer something of interest to fellow conference attendees. And I've recently seen at first-hand how conference organisers can be dismissive of those whose papers are not selected (to the extent of not even bothering to tell them that their paper was not selected). To get some constructive criticism out of them, in order to do a better job next time around, can be like getting blood out of a stone. People who have had papers accepted for conferences on previous occasions will not be put off by such behaviour; however, for a first-timer the anonymous rejection can easily put them off ever submitting a paper again.

By contrast, the atmosphere at Analytics Camp seems to have been very informal and welcoming. It sounds like just the sort of atmosphere where a novice might be tempted to propose a topic and be given positive encouragement to proceed with their idea.

I continue to warm to the unconference & BarCamp ideas and ideals. More importantly, if you're organising a conference, please be sure your section chairs show respect and offer encouragement for all of those who take the time and effort to prepare a paper and submit it. For conferences to thrive they need a regular influx of new thoughts and ideas; don't stifle and discourage first-timers.

Wednesday, 10 February 2010

I noticed a lot of tweeting last weekend with hashtags of sas and acampnc. I managed to figure-out that there was some kind of informal, analytics event in North Carolina named Analytics Camp NC. I took the time this week to find out more. Seems it was a useful event, and an interesting format too.

Angela Hall (she of the SAS-BI blog and latterly a Technical Architect at SAS) has offered two very informative posts: Designing Dashboards Successfully (answering the question "What should all dashboards have to make them useful and successful?") and Content Analytics "All Abouts" on text analytics. Thanks Angela. Other follow-up articles are listed on the home page of the Analytics Camp web site. Social media is a growing area of organisations' marketing plans, and it's clear that there's a lot of growing interest in the area of social media analytics, i.e. tracking readers, followers, fans, and (most importantly) buyers.

The Analytics Camp web site offers further information about the objectives and organisation of the event:

Tuesday, 9 February 2010

I wrote recently about the paucity of 3rd-party add-ins for Enterprise Guide. I finished the post by wondering aloud whether there were more that I hadn't found. Well, I've had no reports of any 3rd-party add-ins, but in Chris Hemedinger's SAS Dummy blog he summarised a good range of SAS-supplied add-ins that are available for free download.

Whilst they are largely unsupported, they certainly provide some jolly useful functionality. And the source code is provided so you can support them yourself.

Monday, 8 February 2010

At the risk of stumbling over the description of another "new" function, I discovered the DIVIDE function alongside my discovery of the IFC/IFN functions. This is definitely new in V9.2; confirmed by the What's New in the Base SAS 9.2 Language web page.

It caught my eye because it handles those division scenarios where SAS normally issues notes or warnings to the log, e.g. divide by zero. In many cases you would want to be warned of missing values and divide by zero. There are some cases where you do not, yet ordinarily you need to create long-winded conditional coding around your division in order to avoid the messages. Well, with the DIVIDE function you don't.
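To make the contrast concrete, here is a small sketch (the variable names are my own and purely illustrative) that sets the traditional guarded division alongside a call to DIVIDE:

```sas
data _null_;
  numerator   = 10;
  denominator = 0;

  /* Traditional approach: guard the division by hand to keep the log clean */
  if denominator in (0, .) then guarded = .;
  else guarded = numerator / denominator;

  /* DIVIDE approach: no guard needed, no division-by-zero note in the log */
  direct = divide(numerator, denominator);

  put guarded= direct=;
run;
```

As I understand the documentation, DIVIDE returns a special missing value for a division by zero (.I for a positive numerator) rather than an ordinary missing value, so check what your downstream code expects before swapping it in wholesale.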

It seems my recent post on the IFC and IFN functions caused a fair bit of interest. Not least in further related functions.

Firstly I must own-up to my mistake of originally stating IFC and IFN were new with 9.2. A number of correspondents pointed-out that they were available from the beginning of V9. Plus, I have it on good authority that they were experimental in V8.2 (albeit possibly with different names). They were implemented primarily to provide a logical construct that could be used interchangeably in both SQL and data step code. They happen to be a lot shorter than SQL's case/when/else construct too! My thanks to the little ex-SAS birdie and the correspondent who passed the message along.

Secondly, The SAS Plumber started a thread on news:comp.soft-sys.sas. In response, Data _Null_ pointed-out that the third conditional value only gets returned when the value of the first parameter is actually missing; hence my description of the functionality was incorrect. Jason Secosky pointed-out the same thing in comments on the original post. And Ron Fehd highlighted the safe, traditional option of using the SELECT statement.
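A quick way to see the behaviour for yourself is to run IFN with and without the optional fourth argument across true, false and missing conditions. This is just a sketch with made-up values, alongside the SELECT alternative:

```sas
data _null_;
  length via_select $7;
  do flag = 1, 0, .;
    /* Three arguments: value-when-true, value-when-false */
    three_args = ifn(flag, 100, 200);

    /* Four arguments: the last one is used only when flag is missing */
    four_args = ifn(flag, 100, 200, 300);

    /* The traditional SELECT alternative, spelling out each case */
    select;
      when (missing(flag)) via_select = 'missing';
      when (flag)          via_select = 'true';
      otherwise            via_select = 'false';
    end;

    put flag= three_args= four_args= via_select=;
  end;
run;
```

My expectation is that the fourth argument shows up only on the iteration where FLAG is missing, and that with three arguments a missing condition is treated as false.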

Finally, Jack Hamilton suggested a look at CHOOSEC and CHOOSEN. These were new to me too - I've clearly been walking around with my eyes closed (and bumped into something, causing the poor quality of the original IFC/IFN post!). The syntax for both is:

CHOOSEx (index-expression, selection-1 <,...selection-n>)

The CHOOSEx function uses the value of index-expression to select from the arguments that follow. For example, if index-expression is three, CHOOSEx returns the value of selection-3. If the first argument is negative, the function counts backwards from the end of the list of arguments, and returns that value.
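By way of illustration, a short sketch with invented values, showing one positive and one negative index:

```sas
data _null_;
  length day $9;

  /* Positive index: pick the third item in the list */
  day = choosec(3, 'Monday', 'Tuesday', 'Wednesday', 'Thursday');

  /* Negative index: count back from the end of the list */
  last = choosen(-1, 10, 20, 30);

  put day= last=;   /* expect day=Wednesday last=30 */
run;
```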

In his comments to the original posting, Jack also mentioned that IFC/IFN are also available in the macro language via %SYSFUNC, and so provide a primitive IF/THEN/ELSE mechanism in open code. Jack highlighted the example in the sascommunity.org wiki.
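As a rough sketch of the idea (this is my own example, not the one from the wiki, and the macro variable ROWS is hypothetical), note the explicit %EVAL to turn the comparison into 1 or 0 before IFC sees it:

```sas
/* Hypothetical macro variable holding a row count */
%let rows = 0;

/* Open code: no macro definition needed to branch on a condition */
%put %sysfunc(ifc(%eval(&rows > 0), Data found so reporting can proceed, No data found so nothing to report));
```

I've kept commas out of the two message texts because a comma would be read as an argument separator; macro quoting would be needed for anything more elaborate.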

So, I now know four functions that I'd not heard of just a few short weeks ago. And I've made a note to read the documentation more carefully before posting next time! My thanks to all of you who contributed to the discussion.

Friday, 5 February 2010

The Clipper 09-10 Round the World Yacht Race, featuring UK-based SAS consultant Andy Phillips onboard the 68-foot Team Finland, continues to offer incidents and accidents. As race 5 was building to a close-fought contest between a number of yachts, one of Team Finland's rivals, Cork, struck a submerged reef in the Java Sea, leaving its crew to hurriedly launch the life rafts and paddle to the nearby small island of Gosong Mampango. Thankfully nobody was hurt. All 16 crew were subsequently evacuated to two sister yachts, Team Finland and California.