Creating calculations, hierarchies, and groups

Practice creating calculated fields, building and modifying hierarchies, and defining new data groups within the refine interface. Understand the importance of the "rows" field, and demonstrate how to add or remove columns from the refined data set.

- [Instructor] Before we dive into our analysis, let's practice working with calculations, hierarchies, and groups to finish refining our data set. So we'll go ahead and close our column properties and scroll to the left-most column to get started. The first thing I'd like to do here is drill into my column list and enable the Rows column, which was automatically created for us, by simply giving it a check mark to the left of the name. Rows will be a valuable tool that we can use to give us a count of observations under specific criteria.
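As a rough illustration of what a row count "under specific criteria" means, here is a minimal Python sketch. It is not how Watson Analytics works internally; the records, the field names, and the count_rows helper are all hypothetical and exist only to show the idea.

```python
# Hypothetical survey records; the field names are illustrative assumptions,
# not the actual column names in the course data set.
records = [
    {"origin_state": "CA", "departure_delay_min": 12},
    {"origin_state": "CA", "departure_delay_min": 0},
    {"origin_state": "NY", "departure_delay_min": 45},
]


def count_rows(rows, predicate):
    """Count the observations (rows) that satisfy a criterion."""
    return sum(1 for row in rows if predicate(row))


# Each count plays the role of the Rows field filtered by one criterion.
delayed = count_rows(records, lambda r: r["departure_delay_min"] > 0)
from_ca = count_rows(records, lambda r: r["origin_state"] == "CA")
print(delayed, from_ca)
```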

If I scroll down, it also looks like we have some work to do with the auto-generated hierarchies here, since Watson is grouping Origin States with Destination Cities and vice versa. If you look closely, you'll see why this is happening, which is because the word origin in the Origin City column is missing the first I. So Watson didn't recognize that it should be paired with the associated Origin State column, which is spelled correctly. But not to worry, all we need to do is select the hierarchy and we can modify it as we see fit.

So let's go ahead and select Destination State Origin City, remove the Origin City level. We'll add a new level for Destination City and then give our hierarchy a meaningful name. So let's call it Destination State slash City. Now we're essentially structuring our data set such that a user can drill down into a particular destination state to reveal the destination cities within that state, a feature which can also be referred to as a drill path.
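To make the drill path idea concrete, here is a small Python sketch of a two-level hierarchy. It is only an analogy for what the hierarchy provides in the tool: the sample flights, the field names, and the build_drill_path function are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical rows; the field names are illustrative assumptions.
flights = [
    {"destination_state": "TX", "destination_city": "Austin"},
    {"destination_state": "TX", "destination_city": "Dallas"},
    {"destination_state": "TX", "destination_city": "Austin"},
    {"destination_state": "WA", "destination_city": "Seattle"},
]


def build_drill_path(rows, parent_key, child_key):
    """Map each parent value to row counts for each child value beneath it."""
    tree = defaultdict(lambda: defaultdict(int))
    for row in rows:
        tree[row[parent_key]][row[child_key]] += 1
    # Convert to plain dicts so the result is easy to print and compare.
    return {parent: dict(children) for parent, children in tree.items()}


drill = build_drill_path(flights, "destination_state", "destination_city")

# Top level: the states. Drilling into one state reveals its cities.
print(sorted(drill))
print(drill["TX"])
```

Swapping in origin field names for the two keys would give the matching origin hierarchy.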

So let's go ahead and repeat that same process for the origin hierarchy as well. So we'll remove Destination City, we'll add a level for Origin City, and then we'll name this hierarchy Origin State slash City. Last but not least, we can select the check marks to include both of the new fields in our data set. Now let's say we'd like to calculate the total travel time, which we can define as the departure delay plus the total flight time.

To do this, all we need to do is select Calculation, give it a name, let's call it Total Travel Time, and insert the columns and operators that we need. In this case it's pretty simple. We'll select the A and search for departure delay in minutes. We'll keep the plus operator and then click on B to pull in flight time in minutes to finish this off.
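The calculation itself is just A plus B applied to every row. A minimal Python sketch of the same derived field follows; the sample values and the field names departure_delay_min, flight_time_min, and total_travel_time_min are assumptions for illustration, not the real column names.

```python
# Hypothetical rows; the field names are illustrative assumptions.
flights = [
    {"departure_delay_min": 15, "flight_time_min": 120},
    {"departure_delay_min": 0, "flight_time_min": 95},
]


def add_total_travel_time(rows):
    """Return copies of the rows with a derived total_travel_time_min field."""
    return [
        {
            **row,
            # A (departure delay) + B (flight time), as in the calculation dialog.
            "total_travel_time_min": row["departure_delay_min"] + row["flight_time_min"],
        }
        for row in rows
    ]


refined = add_total_travel_time(flights)
print([row["total_travel_time_min"] for row in refined])
```

The function returns new rows rather than changing the originals, which mirrors the idea of keeping the source data intact while refining a copy.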

And once we've clicked done, our new field populates here at the bottom of our column list, Total Travel Time. Finally, let's build out an example data group to categorize shopping spend amounts. We can simply choose the Data Group option, click on year flight date to select the field that we're interested in, which in this case is shopping amount at airport, and customize how we want to bucket or group these values.

Let's keep things pretty simple and create three distinct buckets, and we can customize the thresholds for those who spend less than 100 dollars, between 100 and 300 dollars, or over 300. Finally, I can change my bucket labels to low, medium, and high. And finally name the data group as a whole. In this case, we'll call it Shopping Spend Level.
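The bucketing logic behind a data group like this can be sketched in a few lines of Python. This is an illustration only: the spend_level function and its parameter names are hypothetical, and it assumes that amounts of exactly 100 and exactly 300 both fall into the medium bucket, which the narration does not specify.

```python
def spend_level(amount, low_max=100, medium_max=300):
    """Label a shopping amount as Low, Medium, or High.

    Assumption: the boundary values 100 and 300 both count as Medium.
    """
    if amount < low_max:
        return "Low"
    if amount <= medium_max:
        return "Medium"
    return "High"


# Hypothetical shopping amounts, in dollars.
amounts = [25, 100, 180, 300, 450]
labels = [spend_level(a) for a in amounts]
print(labels)
```

Keeping the thresholds as parameters makes it easy to adjust the buckets later, much like editing the thresholds in the data group dialog.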

Now if I click done and scroll all the way to the right of my table, I can see the group that I just created. And now if we hide the column list, we can confirm that our shopping spend labels are populating correctly. So anyone who spent less than a hundred dollars shows up with a label of low. Anyone who spent between 100 and 300 shows a label of medium, and anyone who spent over 300 shows a label of high.

Last but not least, let's go ahead and do a save as at the top of our screen, so that we can preserve our original data set as well. And we'll call this one Airline Satisfaction Survey Refined. Because I'm using a professional version of Watson Analytics, I have the option to save this in a shared or personal folder. In this case, I'm going to keep it in my personal assets and click the Save button.

Author

Released

10/4/2016

IBM Watson Analytics (WA) is a smarter solution for data analytics, providing automated discovery and visualization features that you can't find anywhere else. Learn how to use this powerful yet easy-to-use cloud-based tool for data science and business analytics. Chris Dutton shows how to use features such as machine learning and predictive analytics—minimizing the learning curve and maximizing your proficiency with Watson's tools. Discover how to import and refine data from local or cloud-based sources; build new calculations, hierarchies, and data groups on the fly; and leverage cognitive starting points, natural language queries, and dynamic insights. Find out how to auto-detect trends and correlations and generate predictive models and decision trees quickly. Plus, learn how to create and share visualizations, dashboards, and infographics to bring your insights to life.