
Companies like Google and IBM are opening up services through APIs that will allow you to do things like check if an image contains adult/violent content, check to see what mood a face on a picture is in, or detect the language a piece of text is written in. Artificial Intelligence as a Service as it were (or maybe Machine Learning as a Service would be more appropriate).

So imagine building your product on top of these services. What happens if they start asking you to pay? Or if they censor particular types of input? Or if they stop existing? Where are the open alternatives that you can host yourself?

For anyone who likes logical Lego, the availability of these plug and play services means that in many cases you don’t have to worry about the base technology, at least to get a simple demo running. Instead, the creativity comes in the orchestration of services, and putting them together in interesting ways in order to do useful things with them…

It has been a few months since I attended SxSW in Austin. Time to do a bit of reflection and see which things have stuck with me as major takeaways and trends to remember.

Let me start by saying that going there has changed the way I think about learning and technology in many tacit ways that are hard to describe. That must have something to do with the techno-optimism, the incredible scale/breadth and the inclusive atmosphere. I will definitely make it a priority to go there again. The following things made me think:

Teaching at scale

One thing that we are now slowly starting to understand is how to do things at scale. Virtualized technology allows us to cooperate and collaborate in groups that are orders of magnitude larger than groups coming together in a physical space. The ways of working inside these massive groups are different too.

Wikipedia was probably one of the first sites that showed the power of doing things at this new scale (or was it Craigslist?). Now we have semi-commercial platforms like WordPress.com or hyper-commercial platforms like Facebook that are leveraging the same type of affordances.

The teaching profession is now catching on too. From non-commercial efforts like MOOCs and the Peer 2 Peer University, to initiatives springing from major universities (Stanford’s AI course, Udacity, Coursera, MITx), to the now heavily endowed Khan Academy: all have found ways to scale a pedagogical process from a classroom full of students to audiences of tens of thousands if not hundreds of thousands. They have even become mainstream news, with Tom Friedman writing about them in the New York Times (conveniently forgetting to mention the truly free alternatives).

I don’t see any of this in Corporate Learning Functions yet. The only way we currently help thousands of staff learn is through non-facilitated e-learning modules. That paradigm is now 15-20 years old and has not taken on board any of the lessons that the net has taught us. Soon we will all agree that this type of e-learning is mostly ineffective and thus ultimately also inefficient. The imperative for change is there. Events like the Jams that IBM organizes are just the beginning of new ways of learning at the scale of the web.

Small companies creating new/innovative practices

The future of how we will soon all work is already on view in many small companies around the world. Automattic blew my mind with their global fully distributed workforce of slightly over a hundred people. This allows them to truly only hire the best people for the job (rather than the people who live conveniently close to an office location). All these people need to start being productive is a laptop with an Internet connection.

Automattic has also found a way to make sure that people feel connected to the company and stay productive: they ask people to share as much as possible about what they are doing (they call it “oversharing”; I would call it narrating your work). There are some great lessons there for small global virtual teams in large companies.

The smallest company possible is a company of one. A few sessions at SxSW focused on “free radicals”. These are people who work in ever-shifting small project groups and often aren’t bound to a particular location. These people live what Charles Handy, in The Elephant and The Flea, called a portfolio lifestyle. They are obviously not on a career track with promotions; instead they get their feedback, discipline and refinement from the meritocratic communities and co-working spaces they work in.

Personally I am wondering whether it is possible to become a free radical in a large multinational. Would that be the first step towards a flatter, less hierarchical and more expertise-based organization? I for one wouldn’t mind stepping outside of my line (and out of my silo) and finding my own work on the basis of where I can add the most value for the company. I know this is already possible in smaller companies (see the Valve handbook for an example). It will be hard for big enterprises to start doing this, but I am quite sure we will all end up there eventually.

Hyperspecialization

One trend that is very recognizable for me is hyperspecialization. When I made my first website around 2000, I was able to quickly learn everything there was to know about building websites. There were a few technologies and their scope was limited. Now the level of specialization in the creation of websites is incredible. There is absolutely no way anybody can be an expert in a substantial part of the total field. The modern-day renaissance man just can’t exist.

Transaction costs are going down everywhere. This means that integrated solutions and companies/people who can deliver things end-to-end are losing their competitive edge. As a client I prefer to buy each element of what I need from a niche specialist, rather than get it in one go from somebody who does an average job. Topcoder has made this a core part of their business model: each project that they get is split up into as many pieces as possible and individuals (free radicals again) bid on the work.

Let’s assume that this trend towards specialization will continue. What would that mean for the Learning Function? One thing that would become critical is your ability to quickly assess expertise. How do you know that somebody who calls themselves an expert really is one? What does this mean for competency management? How will this affect the way you build up teams for projects?

Evolution of the interface

Everybody was completely focused on mobile technology at SxSW. I couldn’t keep track of the number of new apps I saw presented. Smartphones and tablets have created a completely new paradigm for interacting with our computers. We have all become enamoured with touch interfaces and have bought into the idea that a mobile operating system contains apps and an app store (with what I like to call the matching “update hell”).

Some visionaries were already talking about what lies beyond the touch-based interface and apps (e.g. Scott Jenson and Amber Case). More than one person talked about how location and other context-creating attributes of the world will allow our computers to be much smarter in what they present to us. Rather than us starting an app to get something done, it will be the world that will push its apps onto us. You don’t have to start the app with the public transport schedule anymore; instead you will be shown the schedule as soon as you arrive at the bus stop. You don’t start Shazam to capture a piece of music, but your phone will just notify you of what music is playing around you (and probably what you could be listening to if you were willing to switch channels). Social cues will become even stronger and this means that cities become the places for what someone called “coindensity” (a place with more serendipity than other places).

This is likely to have profound consequences for the way we deliver learning. Physical objects and locations will have learning attached to them and this will get pushed to people’s devices (especially when the system knows that your certification has expired or that you haven’t dealt with this object before). You can see vendors of Electronic Performance Support Systems slowly moving in this direction. They are waiting for the mobile infrastructure to be there. The one thing we can start doing from today is to make sure we geotag absolutely everything.
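To make the idea concrete: once everything is geotagged, the client-side logic can be as simple as a distance check. A minimal sketch in Python follows; the objects, coordinates and radii are of course made up for illustration.

```python
# Illustrative sketch (all locations, objects and radii invented):
# once learning content is geotagged, a mobile client can decide
# what to push with a simple distance check. haversine_m returns
# the distance between two lat/lon points in metres.
from math import radians, sin, cos, asin, sqrt

def haversine_m(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000 * asin(sqrt(a))  # mean Earth radius in metres

# Hypothetical geotagged learning objects: (content, lat, lon, radius in m)
tagged = [
    ("Pump maintenance video", 51.9244, 4.4777, 50),
    ("Bus stop timetable", 51.9225, 4.4892, 30),
]

def nearby_content(lat, lon):
    return [name for name, t_lat, t_lon, radius in tagged
            if haversine_m(lat, lon, t_lat, t_lon) <= radius]

print(nearby_content(51.9244, 4.4777))  # standing right next to the pump
```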

One step further are brain-computer interfaces (commanding computers with pure thought). Many prototypes already exist and the first real products are now coming to market. There are many open questions, but it is fascinating to start playing with the conceptual design of how these tools would work.

Storytelling

Every time I go to any learning-related conference I come back with the same thought: I should really focus more on storytelling. At SxSW there was a psychologist making this point again. She talked about our tripartite brain and how the only way to engage with the “older” (I guess she meant Limbic) parts of our brain is through stories. Her memorable quote for me was: “You design for people. So the psychology matters.”

Just before SxSW I had the opportunity to spend two days at the amazing Applied Minds. They solve tough engineering problems, bringing ideas from concept to working prototype (focusing on the really tough things that other companies are not capable of doing). What was surprising is that about half of their staff has an artistic background. They realise the value of story. I’m convinced there is a lot to be gained if large engineering companies took their diversity statements seriously and started hiring writers, architects, sculptors and cineasts.

Open wins again

Call it confirmation bias (my regular readers know I always prefer “open”), but I kept seeing examples at SxSW where open technology beats closed solutions. My favourite example was around OpenStreetMap: companies have been relying on Google Maps to help them out with their mapping needs. Many of them are now starting to realise how limiting Google’s functionality is and what kind of dependence it creates for them, and are switching to OpenStreetMap. Examples include Yahoo (Flickr), Apple and Foursquare.

Maybe it is because Google is straddling the line between creating more value than they capture and not doing that: I heartily agree with Tim O’Reilly’s and Doc Searls’s statements at SxSW that free customers will always create more value than captured ones.

There is one place where open doesn’t seem to be winning currently and that is in the enterprise SaaS market. I’ve been quite amazed with the mafia-like way in which Yammer has managed to acquire its customers: it gives away free accounts and puts people in a single network with other people in their domain. Yammer maximizes the virality and tells people they will get more value out of Yammer if they invite their colleagues. Once a few thousand users are in the network, large companies have three options:

1. Don’t engage with Yammer and let people just keep using it without paying for it. This creates unacceptable information risks and liability. Not an option.

2. Tell people that they are not allowed to use Yammer. This is possible in theory, but would most likely enrage users, plus any network blocks would need to be very advanced (blocking Yammer emails so that people can’t use their own technology to access Yammer). Not a feasible option.

3. Bite the bullet and pay for the network. Companies are doing this in droves. Yammer is acquiring customers straight into a locked-in position.

SaaS-based solutions are outperforming traditional IT solutions. Rather than four releases a year (if you are lucky), these SaaS-based offerings release multiple times a day. They keep adding new functionality based on their customers’ demands. I have an example of where a SaaS-based solution was a factor 2000 faster in implementation (2 hours instead of 6 months) and a factor 5000 cheaper ($100 instead of $500,000) than the enterprise IT way of doing things. The solution was likely better too. Companies like Salesforce are trying very hard to obsolete the traditional IT department. I am not sure how companies could leverage SaaS without falling into another lock-in trap though.

Resource constraints as an innovation catalyst

One lesson that I learned during my trip through the US is that affluence is not a good situation to innovate from. Creativity comes from constraints (this is why Arjen Vrielink and I kept constraining ourselves in different ways for our Parallax series). The African Maker “Safari” at SxSW showed what can become possible when you combine severe resource constraints with regulatory whitespace. Make sure to subscribe to Makeshift Magazine if you are interested in seeing more of these types of inventions and innovations.

I believe that many large corporations have too much budget in their teams to be really innovative. What would it mean if you wouldn’t cut the budget by 10% every year, but cut it by 90% instead? Wouldn’t you save a lot of money and force people to be more creative? In a world of abundance we will need to limit ourselves artificially to be able to deliver to our best potential.

Education ≠ Content

There are precious few people in the world who have a deep understanding of education. My encounter with Venture Capitalists at SxSW talking about how to fix education did not end well. George Siemens was much more eloquent in the way that he described his unease with the VCs. Reflecting back I see one thing that is most probably at the root of the problem: most people still equate education/learning with content. I see this fallacy all around me: it is the layperson’s view on learning. It is what drives people to buy Learning Content Management Systems that can deliver to mobile. It is why we think that different Virtual Learning Environments are interchangeable. This is why we think that creating a full curriculum of great teachers explaining things on video will solve our educational woes. Wrong!

My recommendation would be to stop focusing on content altogether (as an exercise in constraining yourself). Who will create the first contentless course? Maybe Dean Kamen is already doing this. He wanted more children with engineering mindsets. Rather than creating lesson plans for teachers he decided to organise a sport- and entertainment-based competition (I don’t know how successful he is in creating more engineers with this method, by the way).

That’s all

So much for my reflections. A blow-by-blow description of all the sessions I attended at SxSW is available here.

Two weeks ago I visited Learning Technologies 2011 in London (blog post forthcoming). This meant I had less time to write down some thoughts on Lak11. I did manage to read most of the reading materials from the syllabus and did some experimenting with the different tools that are out there. Here are my reflections on week 3 and 4 (and a little bit of 5) of the course.

The Semantic Web and Linked Data

This was the main topic of week three of the course. Basically, the semantic web tries to separate the data from its presentation. It does this by structuring the data, which then allows all of it to be linked up. Technically this is done through so-called RDF triples: a subject, a predicate and an object.
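To make the triple idea concrete, here is a minimal sketch in plain Python; the statements are invented for illustration, and real systems would use a library like rdflib and serializations like Turtle or RDF/XML.

```python
# Each fact is a (subject, predicate, object) tuple; querying means
# matching on any of the three positions. Statements are invented.
triples = [
    ("lak11", "is_a", "course"),
    ("lak11", "facilitated_by", "George Siemens"),
    ("George Siemens", "writes_about", "connectivism"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all triples matching the fields that are not None."""
    return [(s, p, o) for (s, p, o) in triples
            if (subject is None or s == subject)
            and (predicate is None or p == predicate)
            and (obj is None or o == obj)]

# Who facilitates lak11?
facilitator = query("lak11", "facilitated_by")[0][2]
# The "linked" part: use that answer as the subject of a new query.
print(query(subject=facilitator))
```

The linking step at the end is the whole point: because data and presentation are separated, the object of one triple can become the subject of the next query, across datasets.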

Although he is a better writer than speaker, I still enjoyed this video of Tim Berners-Lee (the inventor of the web) explaining the concept of linked data. His point about the fact that we cannot predict what we are going to make with this technology is well taken: “If we end up only building the things I can imagine, we would have failed“.

The benefits of this are easy to see. In the forums there was a lot of discussion around whether the semantic web is feasible and whether it is actually necessary to put effort into it. People seemed to think that putting in a lot of human effort to make something easier to read for machines is turning the world upside down. I actually don’t think that is strictly true. I don’t believe we need strict ontologies, but I do think we could define more simple machine readable formats and create great interfaces for inputting data into these formats.

Microformats: where are the learning related ones?

These formats actually already exist and they are called microformats. Examples are hCard, hCalendar and hReview. These formats are simple and easy to understand and are created in a transparent and open process. Currently it does require some understanding of how these formats work to be able to use them, but in the near future this functionality will be built into the tools that we use to publish to the web. So just by filling in a little form about yourself you would be able to create an editable piece of text with an embedded hCard microformat.
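As an illustration of what such a tool could do under the hood, here is a sketch that renders a small form about yourself into hCard markup. The class names ("vcard", "fn", "url", "org") come from the hCard microformat; the person and organisation are made up.

```python
# Sketch: turn a simple form submission into HTML with an embedded
# hCard. Only the class names come from the hCard microformat; the
# data is invented.
from html import escape

def hcard(name, org, url):
    return (
        '<div class="vcard">'
        f'<a class="fn url" href="{escape(url, quote=True)}">{escape(name)}</a>'
        f', <span class="org">{escape(org)}</span>'
        '</div>'
    )

print(hcard("Jane Doe", "Example & Co", "http://example.com/jane"))
```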

So where are the learning related formats? I think it would be great to have small microformats that can describe a course or a learning object. I am aware of Dublin Core and IEEE LOM as ways of describing content, but these are a bit too complex (and actually do mix data and presentation in some weird way). Is anybody aware of initiatives to create some more simple formats? Are they built into any existing learning-related products?

Thinking about this has inspired me to add two microformats to my blog. The little text about me now contains machine readable hCard information and the license at the bottom of the sidebar is now machine readable too (using rel=”license”). I will also start to work on building my resume into the hResume format and publish it on my site. Check http://www.hansdezwart.info/qr in a couple of weeks to see how I have been getting on.

Use cases for analytics in corporate learning

Weeks ago Bert De Coutere started creating a set of use cases for analytics in corporate learning. I have been wanting to add some of my own ideas, but wasn’t able to create enough “thinking time” earlier. This week I finally managed to take part in the discussion. Thinking about the problem I noticed that I often found it difficult to make a distinction between learning and improving performance. In the end I decided not to worry about it. I also did not stick to the format: it should be pretty obvious what kind of analytics could deliver these use cases. These are the ideas that I added:

Portfolio management through monitoring search terms
You are responsible for the project management learning portfolio. In the past you mostly worried about “closing skill gaps” through making sure there were enough courses on the topic. In recent years you have switched to making sure the community is healthy and you have switched from developing “just in case” learning interventions towards “just in time” learning interventions. One thing that really helps you in doing your work is the weekly trending questions/topics/problems list you get in your mailbox. It is an ever-changing list of things that have been discussed and searched for recently in the project management space. It wasn’t until you saw this dashboard that you noticed a sharp increase in demand for information about privacy laws in China. Because of it you were able to create a document with some relevant links that you now show as a recommended result when people search for privacy and China.
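A naive sketch of how such a trending list could be computed (all terms and counts invented): compare this week's search frequencies with last week's and rank terms by their increase.

```python
# Weekly trending list sketch: rank search terms by week-over-week
# increase. Terms and counts are invented for illustration.
from collections import Counter

last_week = Counter({"gantt charts": 40, "risk register": 35, "privacy china": 2})
this_week = Counter({"gantt charts": 42, "risk register": 30, "privacy china": 55})

def trending(current, previous, top=5):
    deltas = {term: count - previous.get(term, 0)
              for term, count in current.items()}
    return sorted(deltas.items(), key=lambda kv: kv[1], reverse=True)[:top]

for term, delta in trending(this_week, last_week):
    print(f"{term}: {delta:+d}")
```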

Social Contextualization of Content
Whenever you look at any piece of content in your company (e.g. a video on the internal YouTube, an office document from a SharePoint site or news article on the intranet), you will not only see the content itself, but you will also see which other people in the company have seen that content, what tags they gave it, which passages they highlighted or annotated and what rating they gave the piece of content. There are easy ways for you to manage which “social context” you want to see. You can limit it to the people in your direct team, in your personal network or to the experts (either as defined by you or by an algorithm). You love the “aggregated highlights view” where you can see a heat map overlay of the important passages of a document. Another great feature is how you can play back chronologically who looked at each URL (seeing how it spread through the organization).

Data enabled meetings
Just before you go into a meeting you open the invite. Below the title of the meeting and the location you see the list of participants of the meeting. Next to each participant you see which other people in your network they have met with before and which people in your network they have emailed with and how recent those engagements have been. This gives you more context for the meeting. You don’t have to ask the vendor anymore whether your company is already using their product in some other part of the business. The list also jogs your memory: often you vaguely remember speaking to somebody but cannot seem to remember when you spoke and what you spoke about. This tool also gives you easy access to notes on and recordings of past conversations.

Automatic “getting-to-know-yous”
About once a week you get an invite created by “The Connector”. It invites you to get to know a person that you haven’t met before and always picks a convenient time to do it. Each time you and the other invitee accept one of these invites you are both surprised that you have never met before as you operate with similar stakeholders, work in similar topics or have similar challenges. In your settings you have given your preference for face to face meetings, so “The Connector” does not bother you with those video-conferencing sessions that other people seem to like so much.

“Train me now!”
You are in the lobby of the head office waiting for your appointment to arrive. She has just texted you that she will be 10 minutes late as she has been delayed by the traffic. You open the “Train me now!” app and tell it you have 8 minutes to spare. The app looks at the required training that is coming up for you, at the expiration dates of your certificates and at your current projects and interests. It also looks at the most popular pieces of learning content in the company and checks to see if any of your peers have recommended something to you (actually it also sees if they have recommended it to somebody else, because the algorithm has learned that this is a useful signal too). It eliminates anything that is longer than 8 minutes, anything that you have looked at before (and haven’t marked as something that could be shown to you again) and anything from a content provider that is on your blacklist. This all happens in a fraction of a second, after which it presents you with a shortlist of videos for you to watch. The fact that you chose the second pick instead of the first is of course something that will get fed back into the system to make an even better recommendation next time.
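A much-simplified sketch of the selection logic in this use case (the catalogue, field names and scoring weights are all invented): filter by available time, prior views and provider blacklist, then rank by popularity with a bonus for peer recommendations.

```python
# "Train me now!" selection sketch. Catalogue, fields and weights
# are invented for illustration.
catalogue = [
    {"title": "Safety briefing refresher", "minutes": 6, "provider": "internal",
     "popularity": 0.8, "recommended": True},
    {"title": "Deep dive: contract law", "minutes": 45, "provider": "internal",
     "popularity": 0.9, "recommended": False},
    {"title": "Spreadsheet tricks", "minutes": 5, "provider": "acme",
     "popularity": 0.7, "recommended": False},
]

def shortlist(items, minutes_free, seen, blacklist):
    candidates = [item for item in items
                  if item["minutes"] <= minutes_free
                  and item["title"] not in seen
                  and item["provider"] not in blacklist]
    # Naive score: popularity plus a fixed bonus for recommendations.
    score = lambda item: item["popularity"] + (0.5 if item["recommended"] else 0.0)
    return sorted(candidates, key=score, reverse=True)

for item in shortlist(catalogue, minutes_free=8, seen=set(), blacklist={"acme"}):
    print(item["title"])
```

Feeding the user's eventual choice back into the score function would be the "learning to recommend" loop the use case describes.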

Using micro formats for CVs
A simple structured data format has been used to capture all CVs in the central HR management system. In combination with the API that was put on top of it, this has allowed a wealth of applications for the structured data.

There are three more titles that I wanted to do, but did not have the chance to do yet.

Using external information inside the company

Suggested learning groups to self-organize

Linking performance data to learning excellence

Book: Head First Data Analysis

I have always been intrigued by O’Reilly’s Head First series of books. I don’t know any other publisher who is that explicit about how their books try to implement research based good practices like an informal style, repetition and the use of visuals. So when I encountered Data Analysis in the series I decided to give it a go. I wrote the following review on Goodreads:

The “Head First” series has a refreshing ambition: to create books that help people learn. They try to do this by following a set of evidence-based learning principles. Things like repetition, visual information and practice are all incorporated into the book. This good introduction to data analysis in the end only scratches the surface and was a bit too simplistic for my taste. I liked the refreshers around hypothesis testing, solver optimisation in Excel, simple linear regression, cleaning up data and visualisation. The best thing about the book is how it introduced me to the open source multi-platform statistical package “R”.

Learning impact measurement and Knowledge Advisers

The day before Learning Technologies, Bersin and KnowledgeAdvisors organized a seminar about measuring the impact of learning. David Mallon, analyst at Bersin, presented their High-Impact Measurement framework.

Bersin High-Impact Measurement Framework

The thing that I thought was interesting was how the maturity of your measurement strategy is basically a function of how much your learning organization has moved towards performance consulting. How can you measure business impact if your planning and gap analysis isn’t close to the business?

Jeffrey Berk from KnowledgeAdvisors then tried to show how their Metrics that Matter product allows measurement and then dashboarding around all the parts of the Bersin framework. They basically do this by asking participants to fill in surveys after they have attended any kind of learning event. Their name for these surveys is “smart sheets” (a much improved iteration of the familiar “happy sheets”). KnowledgeAdvisors has a complete software-as-a-service-based infrastructure for sending out these digital surveys and collating the results. Because they have all this data they can benchmark your scores against yourself or against their other customers (in aggregate of course). They have done all the sensible statistics for you, so you don’t have to filter out the bias in self-reporting or think about cultural differences in the way people respond to these surveys. Another thing you can do is pull in real business data (think things like sales volumes). By doing some fancy regression analysis it is then possible to see what part of the improvement can be attributed with some level of confidence to the learning intervention, allowing you to calculate return on investment (ROI) for the learning programs.
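The regression idea can be illustrated with a toy example (all numbers invented, and a real analysis would of course control for far more factors than a single variable): fit a simple linear regression of a business metric on training hours and read the estimated uplift off the slope.

```python
# Toy attribution sketch: ordinary least squares of sales on
# training hours. All numbers are invented for illustration.
def linreg(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

training_hours = [0, 2, 4, 6, 8]   # hours of training per salesperson
sales = [100, 109, 122, 128, 141]  # units sold in the same period

slope, intercept = linreg(training_hours, sales)
print(f"Estimated uplift: {slope:.2f} units per training hour")
```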

All in all I was quite impressed with the toolset that they can provide and I do think they will probably serve a genuine need for many businesses.

The best question of the day came from Charles Jennings, who pointed out to David Mallon that his talk had referred to the increasing importance of learning on the job and informal learning, but that the learning measurement framework only addresses measurement strategies for top-down and formal learning. Why was that the case? Unfortunately I cannot remember Mallon’s answer (which probably does say something about the quality or relevance of it!).

Experimenting with Needlebase, R, Google charts, Gephi and ManyEyes

The first tool that I tried out this week was Needlebase. This tool allows you to create a data model by defining the nodes in the model and their relations. Then you can train it on a web page of your choice to teach it how to scrape the information from the page. Once you have done that Needlebase will go out to collect all the information and will display it in a way that allows you to sort and graph the information. Watch this video to get a better idea of how this works:

I decided to see if I could use Needlebase to get some insights into resources on Delicious that are tagged with the “lak11” tag. Once you understand how it works, it only takes about 10 minutes to create the model and start scraping the page.

I wanted to get answers to the following questions:

Which five users have added the most links and what is the distribution of links over users?

Which twenty links were added the most with a “lak11” tag?

Which twenty links with a “lak11” tag are the most popular on Delicious?

Can the tags be put into a tag cloud based on the frequency of their use?

In which week were the Delicious users the most active when it came to bookmarking “lak11” resources?

Imagine that the answers to the questions above were all somebody was able to see about this Learning and Knowledge Analytics course. Would they get a relatively balanced idea of the key topics, resources and people related to the course? What are some of the key things that they would miss?

Unfortunately after I had done all the machine learning (and had written the above) I learned that Delicious explicitly blocks Needlebase from accessing the site. I therefore had to switch plans.

The Twapperkeeper service keeps a copy of all the tweets with a particular tag (Twitter itself only gives access to the last two weeks of messages through its search interface). I managed to train Needlebase to scrape all the tweets, the username, URL to user picture and userid of the person adding the tweet, who the tweet was a reply to, the unique ID of the tweet, the longitude and latitude, the client that was used and the date of the tweet.

I had to change my questions too:

Which ten users have added the most tweets and what is the distribution of tweets over users?
This was easy to get and graph with Needlebase itself:

Top 11 Lak11 Twitter Users

I personally like treemaps for this kind of data, so I tried to create one in IBM’s ManyEyes. Unfortunately they seem to have some persistent issues with their site:

ManyEyes error message

Which twenty links were added the most with a “lak11” tag? Another way of asking this would be: which twenty links created the most buzz?
This was a bit harder because Needlebase did not get the links for me. I had to download all the text into a text file and use some regular expressions to get a list of all the URLs in the tweets. 796 of the 967 tweets had a URL (that is more than 80%), 453 of these were unique. I could then do some manipulations in a spreadsheet (sorting, adding and some appending) to come up with a list. Most of these URLs are shortened, so I had to check them online to get their titles. This is the result:

One problem I noticed is that two of the twenty results were the same URL with different shortened URLs (the link to the Moodle course and to the Paper.li paper): URL shorteners make the web a more difficult place in many ways.
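For anyone wanting to replicate this, the regex-and-count step looks roughly like this in Python (the example tweets are invented; resolving the shortened URLs would still be a separate step):

```python
# Extract every URL from a pile of tweet texts and count how often
# each one was shared. Example tweets are invented.
import re
from collections import Counter

tweets = [
    "Great reading list for #lak11 http://bit.ly/abc123",
    "RT @someone: Great reading list for #lak11 http://bit.ly/abc123",
    "The #lak11 Moodle course: http://bit.ly/xyz789",
]

url_pattern = re.compile(r"https?://\S+")
urls = Counter(url for tweet in tweets for url in url_pattern.findall(tweet))

for url, count in urls.most_common(20):
    print(count, url)
```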

What other hashtags are used next to Lak11?
Here I used a similar methodology as for the URLs. In the end I had a list of all the tags with their frequencies. I used Wordle and ManyEyes to put them into tag clouds:

Wordle Lak11 Hashtags

ManyEyes Lak11 Hashtags

Also compare them to tag clouds of the complete texts of the tweets (cleaned up to remove usernames, “RT”, “Lak11” URLs and the # in front of the hash tags):

Wordle Lak11 Tweets Texts

ManyEyes Lak11 Tweets Texts

Which one do you find more insightful? I personally prefer the latter one as it would give somebody who knows nothing about Lak11 a good flavor of the course.
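The cleanup and counting behind these tag clouds can be sketched like this (example tweets invented): strip URLs, usernames and retweet markers, then count the remaining hashtags, leaving out the course tag itself.

```python
# Tag-cloud preparation sketch: clean the tweet texts, then count
# hashtags other than the course tag. Example tweets are invented.
import re
from collections import Counter

tweets = [
    "RT @someone: thinking about #analytics and #bigdata #lak11",
    "@other my notes on #analytics for #lak11 http://example.com/x",
]

def clean(text):
    text = re.sub(r"https?://\S+", "", text)  # drop URLs
    text = re.sub(r"@\w+:?", "", text)        # drop usernames
    return re.sub(r"\bRT\b", "", text)        # drop retweet markers

hashtags = Counter(
    tag.lower()
    for tweet in tweets
    for tag in re.findall(r"#(\w+)", clean(tweet))
    if tag.lower() != "lak11"
)
print(hashtags.most_common())
```

The same cleaned texts, split into words instead of hashtags, would give the frequencies for the full-text clouds.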

How are the Tweets distributed over time? Is the traffic increasing with time or decreasing?
I decided to just get a simple list of days with the number of tweets per day. As an exercise I wanted to graph it in R. These are the results:

Tweets per day

I couldn’t learn anything interesting from that one.
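For completeness, the per-day counting itself is straightforward (timestamps invented); the resulting (day, count) pairs are what gets graphed in R or any other charting tool.

```python
# Count tweets per calendar day from a list of timestamps.
# Timestamps are invented for illustration.
from collections import Counter
from datetime import datetime

timestamps = [
    "2011-01-17 09:12", "2011-01-17 13:40", "2011-01-18 08:05",
    "2011-01-20 22:51", "2011-01-20 23:10", "2011-01-20 23:30",
]

per_day = Counter(
    datetime.strptime(ts, "%Y-%m-%d %H:%M").date() for ts in timestamps
)
for day in sorted(per_day):
    print(day, per_day[day])
```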

Imagine that the answers to the questions above were all somebody was able to see about this Learning and Knowledge Analytics course. Would they get a relatively balanced idea of the key topics, resources and people related to the course? What are some of the key things that they would miss? If you would automate getting answers to all these questions (no more manual writing of regexes!), would that be useful for learners and facilitators?
I have to say that I was pleasantly surprised by how fruitful the little exercise with getting the top 20 links was. I really do believe that these links capture much of the best materials of the first couple of weeks of the course. If you used the Wordle as the single image to give a flavour of the course, pointed to the 20 URLs and named the top Twitterers, then you would not be doing badly at all.

Another great resource that I re-encountered in these weeks of the course was Hans Rosling’s Gapminder project:

Google has acquired part of that technology and now allows a similar kind of visualization on top of their spreadsheet data. What makes the visualization smart is the way it shows three variables (x-axis, y-axis and size of the bubble) and how they change over time. I thought hard about how I could use the Twitter data in this way, but couldn’t find anything sensible. I still wanted to play with the visualization, so from the World Bank’s Open Data Initiative I downloaded data about population size, investment in education and unemployment figures for a set of countries per year (they have a nice iPhone app too). When I loaded that data I got the following result:

Click to be able to play the motion graph

The last tool I installed and took a look at was Gephi. I first used SNAPP on the forums of week 1 and exported that data into an XML-based format. I then loaded that into Gephi and could play around a bit:

Week 1 forum relations in Gephi

My participation in numbers

Adding up my participation for these two (to three) weeks: in weeks 3 and 4 of the course I wrote 6 Moodle posts, tweeted 3 times about Lak11, wrote 1 blog post and saved 49 bookmarks to Diigo.

The hours I spent playing with all the different tools mentioned above are not part of my self-measurement. I did, however, really enjoy playing with these tools and learned a lot of new things.

These are my reflections and thoughts on the second week of Learning and Knowledge Analytics (Lak11). These notes are first and foremost to cement my own learning experience, so to everybody but me they might feel a bit disjointed.

What was week 2 about?

This week was an introduction to the topic of “big data”. As a result of all the exponential laws in computing, the amount of data that gets generated every single day is growing massively. New methods of dealing with the data deluge have cropped up in computer science. Businesses, governments and scientists are learning how to use the data that is available to their advantage. Some people actually think this will fundamentally change our scientific method (like Chris Anderson in Wired).

Big data: Hadoop

Hadoop is one of these things that I heard a lot about without ever really understanding what it was. This Scoble interview with the CEO of Cloudera made things a lot clearer for me.

Here is the short version: Hadoop is a set of open source technologies (it is part of the Apache project) that allows anyone to do large scale distributed computing. The main parts of Hadoop are a distributed filesystem and a software framework for processing large data sets on clusters.
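A feel for the programming model helps here. The canonical example is a word count: a mapper emits a `(word, 1)` pair for every word, Hadoop sorts and groups the pairs by key, and a reducer sums the counts. With Hadoop Streaming the mapper and reducer would be separate scripts reading stdin; the sketch below just simulates the same flow in-process:

```python
from collections import defaultdict

def mapper(line):
    """Emit a (word, 1) pair for every word in a line of input."""
    for word in line.lower().split():
        yield word, 1

def reducer(pairs):
    """Sum the counts per word (Hadoop sorts/groups by key between phases)."""
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

lines = ["big data is big", "data about data"]
pairs = [pair for line in lines for pair in mapper(line)]
counts = reducer(pairs)
print(counts)
```

The point of Hadoop is that the map and reduce phases can run in parallel across a whole cluster, with the distributed filesystem feeding each machine its own slice of the data.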

The technology is commoditised, imagination is what is needed now

The Hadoop story confirmed for me that this type of computing is already largely commoditised. The interesting problems in big data analytics are probably not technical anymore. What is needed isn’t more computing power; it is more imagination.

The adoption barriers that organizations face most are managerial and cultural rather than related to data and technology. The leading obstacle to wide-spread analytics adoption is lack of understanding of how to use analytics to improve the business, according to almost four of 10 respondents.

This means that we should start thinking much harder about what things we want to know that we couldn’t get before in a data-starved world. This means we have to start with the questions. From the same article:

Instead, organizations should start in what might seem like the middle of the process, implementing analytics by first defining the insights and questions needed to meet the big business objective and then identifying those pieces of data needed for answers.

I will therefore commit myself to try and formulate some questions that I would like to have answered. I think that Bert De Coutere’s use cases could be an interesting way of approaching this.

This BusinessWeek excerpt from Stephen Baker’s The Numerati gives some insight into where this direction will take us in the next couple of years. It profiles Haren, a mathematician at IBM who is working on algorithms that help IBM match expertise to demand in real time, creating teams of people that would maximise profits. In the example, one of the deep experts takes a ten-minute call while on the ski slopes. By doing that he:

[..] assumes his place in what Haren calls a virtual assembly line. “This is the equivalent of the industrial revolution for white-collar workers,”

Something to look forward to?

Data scientists, what skills are necessary?

This new way of working requires a new skill set. There was some discussion on this topic in the Moodle forums. I liked Drew Conway’s simple perspective: basically, a data scientist needs to sit at the intersection of Math & Statistics Knowledge, Substantive Expertise and Hacking Skills. I think that captures it quite well.

Connecting connectivism with learning analytics

It struck me that many of the terms that he used there are things that are easily quantifiable with Learning Analytics. Concepts like Amplification, Resonance, Synchronization, Information Diffusion and Influence are all things that could be turned into metrics for assessing the “knowledge health” of an organisation. Would it be an idea to get clearer and more common definitions of these metrics for use in an educational context?
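Such metrics could start very simply. For instance, a crude "influence" score could count how many distinct people have mentioned or replied to each participant. A minimal sketch in Python; the interaction data and names are invented:

```python
from collections import defaultdict

# Hypothetical (who_mentioned, who_was_mentioned) pairs from course tweets
interactions = [
    ("ann", "carol"),
    ("bob", "carol"),
    ("ann", "dave"),
    ("ann", "carol"),  # repeat mention: the same pair is only counted once
]

def influence(interactions):
    """Count, per person, how many *distinct* others mentioned them."""
    mentioned_by = defaultdict(set)
    for source, target in interactions:
        mentioned_by[target].add(source)
    return {person: len(sources) for person, sources in mentioned_by.items()}

print(influence(interactions))
```

Concepts like diffusion or resonance would need richer data (timing, reach of each re-share), but the principle is the same: agree on a definition first, then the counting is easy.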

Worries/concerns from the perspective of what technology wants

Probably the most lively discussion in the Moodle forums was around critiques of learning analytics. My main concern about analytics is the kind of feedback loop that is introduced once you make the analytics public. I expressed this with a reference to Goodhart’s law, which states that:

Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes

George Siemens did a very good job in writing down the main concerns here. I will quote them in full for my future easy reference.

1. It reduces complexity down to numbers, thereby changing what we’re trying to understand
2. It sets the stage for the measurement becoming the target (standardized testing is a great example)
3. The uniqueness of being human (qualia, art, emotions) will be ignored as the focus turns to numbers. As Gombrich states in “The Story of Art”: “The trouble about beauty is that tastes and standards of what is beautiful vary so much.” Even here, we can’t get away from this notion of weighting/valuing/defining/setting standards.
4. We’ll misjudge the balance between what computers do best…and what people do best (I’ve been harping for several years about this distinction as well as for understanding sensemaking through social and technological means).
5. Analytics can be gamed. And they will be.
6. Analytics favour concreteness over accepting ambiguity. Some questions don’t have answers yet.
7. The number/quantitative bias is not capable of anticipating all events (black swans) or even accurately mapping to reality (Long Term Capital Management is a good example of “when quants fail”: http://en.wikipedia.org/wiki/Long-Term_Capital_Management )
8. Analytics serve administrators in organizations well and will influence the type of work that is done by faculty/employees (see this rather disturbing article of the KPI influence in universities in UK: http://www.nybooks.com/articles/archives/2011/jan/13/grim-threat-british-universities/?page=1 )
9. Analytics risk commoditizing learners and faculty – see the discussion on Texas A & M’s use of analytics to quantify faculty economic contributions to the institution: http://www.nybooks.com/articles/archives/2011/jan/13/grim-threat-british-universities/?page=2 ).
10. Ethics and privacy are significant issues. How can we address the value of analytics for individuals and organizations…and the inevitability that some uses of analytics will be borderline unethical?

SNAPP’s website gives a good overview of some of the things that a tool like this can be used for. Think about finding disconnected or at-risk students, seeing who the key information brokers in a class are, or making “before and after” snapshots of a particular intervention.

Before I was able to use it inside my organisation, I needed to make sure that the tool does not send any of the data it scrapes back home to the creators of the software (and why wouldn’t it? It is a research project after all). I had an exchange with Lori Lockyer, professor at Wollongong, who assured me that:

SNAPP locally compiles the data in your Moodle discussion forum but it does not send data from the server (where the discussion forum is hosted) to the local machine nor does it send data from the local machine to the server.

Making social networks inside applications (and ultimately inside organisations) more visible to many more people using standard interfaces is a nice future to look forward to. Which LMS is the first to have these types of graphs next to their forum posts? Which LMS will export graphs in some standard format for further processing with tools like Gephi?
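Such an export is not much work. Below is a minimal sketch in Python that turns a hypothetical who-replied-to-whom edge list into GraphML, a standard format that Gephi opens directly; the forum data is invented:

```python
import xml.etree.ElementTree as ET

# Hypothetical who-replied-to-whom pairs from a discussion forum
edges = [("ann", "bob"), ("carol", "ann"), ("bob", "carol")]

def to_graphml(edges):
    """Serialise a directed edge list as GraphML for tools like Gephi."""
    root = ET.Element("graphml", xmlns="http://graphml.graphdrawing.org/xmlns")
    graph = ET.SubElement(root, "graph", edgedefault="directed")
    # One <node> per distinct participant
    for node in sorted({n for edge in edges for n in edge}):
        ET.SubElement(graph, "node", id=node)
    # One <edge> per reply relation
    for i, (source, target) in enumerate(edges):
        ET.SubElement(graph, "edge", id=f"e{i}", source=source, target=target)
    return ET.tostring(root, encoding="unicode")

xml_text = to_graphml(edges)
print(xml_text)
```

An LMS that shipped something like this next to its forums would make the social structure of a course visible with almost no extra effort.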

Gephi, by the way, is one of the tools that I really should start to experiment with sooner rather than later.

The intelligent spammability of open online courses: where are the vendors?

One thing that I have been thinking about in relation to these Open Online Courses is how easy it would be for vendors of particular related software products to come and crash the party. The open nature of these courses lends itself to spam I would say.

Doing this in an obnoxious way will ultimately not help you with this critical crowd, but being part of the conversation (Cluetrain, anybody?) could be hugely beneficial from a commercial point of view. As a marketer, where else would you find as many people deeply interested in Learning Analytics as in this course? Will these people not be the influencers in this space in the near future?

So where are the vendors? Do you think they are lurking, or am I overstating the opportunity that lies here for them?

Last week I wrote a small teaser on learning for the team that I work in (mostly consisting of IT professionals, rather than learning professionals). I realized that some of the things I wrote could be interesting for this blog’s readers too. So here goes…

The collective set of organizational values, conventions, processes and practices that influence and encourage both individuals and the collective organization to continuously increase knowledge, competence and performance.

Using a solid research methodology they identified key best practices that affect business outcomes. The most influential practices all center around empowering employees and demonstrating the value of learning. According to Bersin, it is management who has the biggest role to play, as they have the most influence on these cultural practices. Their research showed:

[..] that learning culture (represented by the 40 High-Impact Learning Culture practices) directly accounts for 46 percent of overall improved business performance as measured by the business outcomes examined [..]

Learning agility and innovation are the two business outcomes that benefit the most from a strong learning culture.

Many organizations have productive employees, but 98 percent of organizations with strong learning cultures have highly productive workforces.

That should be enough of a business case to try and strengthen the learning culture in any business.

Fast pace of change: activities and methodology over content
It is a cliché, but we really are working in an environment where the pace of change is ever increasing. Learning content that takes months to produce will only stay relevant for skills that do not change much. Such content will not help keep knowledge workers up to date and will have little or no business impact.

An alternative is to focus on methodology and activities rather than on content. How can we change the things we do, our behavior, to create a culture of learning and a more reflective way of collaborating? How can we truly embed learning? Trying to answer that question will require a very conscious design effort.

Leveraging the teaching paradox
There is a terrible paradox in teaching: by the very nature of the process it is the teacher who learns the most. Learning is most effective when creating something for others to experience (see the explanation of constructionism here or this great article about the death of the digital dropbox). That is the reason why I love to present and also why I write this blog. If we want our employees to learn we have to put them into the role of teachers too.

Turning consumers into producers
You can overcome the teaching paradox by making sure that instead of asking people to consume content (i.e. going to a course from the SkillSoft catalogue or listening to a webcast by a senior leader) you ask them to produce content. Unfortunately for you, I have learned way more by writing this blog post than you will ever learn by reading it. In fact, if I were allowed to give a single piece of advice to people designing a learning intervention, I would tell them to turn their participants from consumers into producers. They should ask themselves the following question: what am I asking them to make?

So how do we do all this? Here are four ideas that align with the above and that could be done immediately in any global organization with virtual teams.

Microteaching

Planning and creating collaborative one-pagers and microteaching events
Each week of the year a team of two could be made responsible for creating a one-pager about a particular topic. These one-pagers could give very factual information about the work we are doing (e.g. How are our three main learning systems integrated? Which five learning innovations have gotten the most traction in the past year and why?) or they could be more meta: talking about how we do our work (e.g. What is the best way to do a virtual meeting? Which 10 things should we stop doing today?).

Maybe one-pager is not the best word for this. It could also be a diagram, a video or a virtual role play, as long as it can be presented and understood within five minutes. Each month you could schedule an hour with the team in which the four or five one-pagers of that month would be presented by its creators to the rest of the team. The content itself is not important (you can let people choose their own topic and provide a list of suitable topics on a wiki for the less creative), but the methodology is. I would propose the following “rules”:

Each one-pager has a question as its title and is made collaboratively by two people. No one is allowed to do any work on it alone.

The two people are matched semi-randomly, with a bias towards virtual collaboration and towards pairs that haven’t worked together before.

The presentation of the one-pagers is done virtually using a microteaching methodology with an active start (3 min.), an exercise (6 min.), a discussion (4 min.) and a look at how to continue (2 min.).

Narrating your work
In virtual teams it is hard to know what all the people in the team are doing. It is therefore also harder to learn from each other and find synergies in the work we do. A well-known way of battling this problem is through a concept called narrating your work. Each person in the team writes down what they have been doing in a couple of sentences, at a regular interval (e.g. daily or a few times a week). Microblogging technology is the ideal candidate to support this kind of process.

This will not only help the team in doing their work better and more efficiently, it should also help in making it a better team through the ambient intimacy that it creates.

Increasing the effectiveness of webcasts
Most teams in global organizations have a webcast with senior leaders every couple of weeks. These are usually not very interactive affairs: they are more about knowledge dissemination than about knowledge creation. Although there is sometimes space for questions at the end, it is often the case that the usual suspects speak up and the discussion barely scratches the surface.

One way to change this would be to have mini-jams (see here for IBM’s way of doing jams) before each webcast. It could work like this: 48 hours before the webcast, the topics are made available, any documents or presentations are shared, and a couple of key questions are posed to the team. The team then spends the time until the start of the webcast discussing the questions. Each topic has a moderator who is there to guide the discussion and tease out participation. Each and every team member will be expected to participate and give their view. Microblogging tools, once again, would be a good way to facilitate this.

As a result it should be possible to make the webcasts shorter and spend the time in them addressing the issues that showed to be contentious or in need of clarification during the jam.

The power of video in interaction
The most powerful of our senses is vision. Technology has finally caught up with this innate ability and can now help us use it in virtual teams. To facilitate working together as a virtual team, you should have the ambition to use video in all of your virtual meetings. This would mean the following:

Everybody in the virtual organisation needs to have a laptop with a built-in webcam. If they don’t have one now, make sure that this gets changed as soon as possible.

The software to create video calls should be ubiquitous in the organization; it should be easy to use and well supported.

These are just examples…
There is a lot more we can do: I would really like your input on how to re-design the way we work and learn!