Quite a lot of what I have been pondering for some time is discussed in this video blog. If we ever fully understand the mechanisms behind consciousness, the path to implementing properly self-aware AI will open.

Beginning with the origins: in the Mayan language, Oycib means "the place of honey". In this project, Oycib is an e-Research infrastructure for Collective Intelligence Analysis.

With the Oycib infrastructure we propose an analysis model based on digital practices and collaboration profiles for the development of Social Learning and Context Awareness in the Collective Intelligence process.

The infrastructure design and the profiles proposed here are based on historical studies of social organization glyphs in Mayan culture by Montgomery (2002) and Calvin (2012).

Initially we worked with four collaboration profiles: the "Itzaat", the "Pitziil", the "Ayuxul" and the "Sajal", but others can be found depending on the organizational context. It is important to mention that each profile is identified using the e-Xploración model and represents a qualitative and quantitative interpretation of collaborative practices. Accordingly, we propose methods based on Social Network Analysis for learning and knowledge management.

The network in Oycib is called "Kaan" ("sky" or "network" in the Mayan language). In the "Kaan" we visualize subjects and objects, such as persons, forums, blogs, files and groups, and all the interactions among them. Additionally, each profile and its interactions are presented.

A new set of search tools called Memex, developed by DARPA, peers into the “deep Web” to reveal illegal activity

luiy's insight:

DARPA has said very little about Memex and its use by law enforcement and prosecutors to investigate suspected criminals.

According to published reports, including one from Carnegie Mellon University, the NYDA’s Office is one of several law enforcement agencies that have used early versions of Memex software over the past year to find and prosecute human traffickers, who coerce or abduct people—typically women and children—for the purposes of exploitation, sexual or otherwise. “Memex”—a combination of the words “memory” and “index” first coined in a 1945 article for The Atlantic—currently includes eight open-source, browser-based search, analysis and data-visualization programs as well as back-end server software that performs complex computations and data analysis.

Such capabilities could become a crucial component of fighting human trafficking, a crime with low conviction rates, primarily because of strategies that traffickers use to disguise their victims’ identities. The United Nations Office on Drugs and Crime estimates there are about 2.5 million human trafficking victims worldwide at any given time, yet putting the criminals who press them into service behind bars is difficult. In its 2014 study on human trafficking, the U.N. agency found that 40 percent of countries surveyed reported fewer than 10 convictions per year between 2010 and 2012. About 15 percent of the 128 countries covered in the report did not record any convictions.

Social scientists have never understood why some countries are more corrupt than others. But the first study that links corruption with wealth could help change that.

One question that social scientists and economists have long puzzled over is how corruption arises in different cultures and why it is more prevalent in some countries than others. But it has always been difficult to find correlations between corruption and other measures of economic or social activity.

Michal Paulus and Ladislav Kristoufek at Charles University in Prague, Czech Republic, have for the first time found a correlation between the perception of corruption in different countries and their economic development.

The data they use comes from Transparency International, a nonprofit campaigning organisation based in Berlin, Germany, which defines corruption as the misuse of public power for private benefit. Each year, this organization publishes a global list of countries ranked according to their perceived levels of corruption. The list is compiled using at least three sources of information but does not directly measure corruption, because of the difficulties in gathering such data.

Instead, it gathers information from a wide range of sources such as the African Development Bank and the Economist Intelligence Unit. But it also places significant weight on the opinions of experts who are asked to assess corruption levels.

The result is the Corruption Perceptions Index, which ranks countries on a scale from 0 (highly corrupt) to 100 (very clean). In 2014, Denmark occupied the top spot as the world’s least corrupt nation, while Somalia and North Korea propped up the table in an unenviable tie for the most corrupt countries on the planet.
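At its core, the kind of analysis described above boils down to correlating two country-level series, e.g. CPI score against a measure of economic development. A minimal Pearson correlation sketch in Python (the numbers below are illustrative toy values, not the study's data):

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Toy, made-up values: CPI score vs. GDP per capita (thousands of USD)
cpi = [92, 74, 43, 27, 8]
gdp = [61, 48, 15, 9, 1]
r = pearson(cpi, gdp)  # strongly positive on these toy numbers
```

The real study works with much richer data and modelling, but any claimed link between perceived corruption and wealth ultimately rests on a statistic of this general shape.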

- Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using model architectures composed of multiple non-linear transformations.

- Online machine learning is a model of induction that learns one instance at a time, thus reducing the amount of memory required.

- Natural Language Toolkit (NLTK) - a leading tool for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

- Computer Vision. OpenCV – a popular computer vision library designed for computational efficiency with a strong focus on real-time applications.
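The "one instance at a time" idea behind online machine learning can be sketched with the classic online perceptron: only the current example is ever held in memory, and the weights are updated on each mistake. This is a toy illustration of the concept, not any particular library's implementation:

```python
def perceptron_online(stream, lr=0.1):
    """Online perceptron: consume (features, label) pairs one at a time.
    Labels are +1/-1; only the weight vector persists between examples."""
    w = None
    for x, y in stream:
        if w is None:
            w = [0.0] * (len(x) + 1)       # weights plus a bias term
        act = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
        pred = 1 if act >= 0 else -1
        if pred != y:                      # mistake-driven update
            for i, xi in enumerate(x):
                w[i] += lr * y * xi
            w[-1] += lr * y
    return w

def predict(w, x):
    """Classify a feature vector with the learned weights."""
    return 1 if w[-1] + sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1
```

Because the stream is consumed lazily, the same code works whether the data is an in-memory list or a generator reading examples from disk, which is exactly the memory saving the definition above refers to.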

In the python world, there are multiple options for visualizing your data. Because of this variety, it can be really challenging to figure out which one to use when. This article contains a sample of some of the more popular ones and illustrates how to use them to create a simple bar chart. I will create examples of plotting data with:
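As a baseline for comparing those options, here is what the simple bar chart looks like in matplotlib (an assumption on my part that it is among the libraries covered; the labels and values are placeholder data):

```python
import matplotlib
matplotlib.use("Agg")            # render off-screen; no display needed
import matplotlib.pyplot as plt

def simple_bar_chart(labels, values, path="chart.png"):
    """Draw a simple bar chart and save it to disk."""
    fig, ax = plt.subplots()
    ax.bar(labels, values)
    ax.set_ylabel("value")
    fig.savefig(path)
    return ax

ax = simple_bar_chart(["a", "b", "c"], [3, 1, 2])
```

Most of the other plotting libraries (pandas, seaborn, bokeh, plotly) offer a near one-liner for the same chart, which is what makes this a useful task for comparing them.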

OpenGraphiti is a free and open source 3D data visualization engine for data scientists to visualize semantic networks and to work with them. It offers an easy-to-use API with several associated libraries to create custom-made datasets. It leverages the power of GPUs to process and explore the data and sits on a homemade 3D engine.

The rapidly evolving ecosystem associated with personal data is creating an entirely new field of scientific study, say computer scientists. And this requires a much more powerful ethics-based infrastructure.

luiy's insight:

... Richard Mortier at the University of Nottingham in the UK and a few pals say the increasingly complex, invasive and opaque use of data should be a call to arms to change the way we study data, interact with it and control its use. Today, they publish a manifesto describing how a new science of human-data interaction is emerging from this “data ecosystem” and say that it combines disciplines such as computer science, statistics, sociology, psychology and behavioural economics.

They start by pointing out that the long-standing discipline of human-computer interaction research has always focused on computers as devices to be interacted with. But our interaction with the cyber world has become more sophisticated as computing power has become ubiquitous, a phenomenon driven by the Internet but also through mobile devices such as smartphones. Consequently, humans are constantly producing and revealing data in all kinds of different ways.

Mortier and co say there is an important distinction between data that is consciously created and released such as a Facebook profile; observed data such as online shopping behaviour; and inferred data that is created by other organisations about us, such as preferences based on friends’ preferences.

Imagine the progress that can happen—in health, science, education—when scholarly research is made freely available among scientists, patients, inventors, and others.

luiy's insight:

Before the open access model existed, almost all peer-reviewed articles based on scholarly research were published in corporate-owned print journals, whose subscription fees were often prohibitively expensive—despite the fact that authors are not paid for their articles. Publishers rarely invest in the actual research and typically provide little added value in the articles’ preparation and distribution.

These journals were available to the general public only at university libraries in wealthy countries. This meant that doctors treating patients with HIV and AIDS in remote regions of Africa, for instance, could not access complete articles describing the results of the latest medical research on treatments, even when the research upon which these articles were based was undertaken in their remote regions.

The internet has become embedded into our daily lives, no longer an esoteric phenomenon, but instead an unremarkable way of carrying out our interactions with one another. Online and offline are interwoven in everyday experience. Using the internet has become accepted as a way of being present in the world, rather than a means of accessing some discrete virtual domain. Ethnographers of these contemporary Internet-infused societies consequently find themselves facing serious methodological dilemmas: where should they go, what should they do there and how can they acquire robust knowledge about what people do in, through and with the internet?

This book presents an overview of the challenges faced by ethnographers who wish to understand activities that involve the internet. Suitable for both new and experienced ethnographers, it explores both methodological principles and practical strategies for coming to terms with the definition of field sites, the connections between online and offline and the changing nature of embodied experience. Examples are drawn from a wide range of settings, including ethnographies of scientific institutions, television, social media and locally based gift-giving networks. See more at: http://www.bloomsbury.com/uk/ethnography-for-the-internet-9780857855701/

In this post we're looking at examples of generating some really cool fractals called dragon curves (also referred to as Heighway dragons). This post is a continuation of the previous one on fractal ferns. Take a look at that post if you want some basic info on fractals and some links I found useful. Fractals are a world unto themselves, so there are plenty of interesting things to be investigated in this area. We are just scratching the surface with these two posts.
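The generating rule for a Heighway dragon is simple: each iteration appends a right turn plus a reversed, direction-flipped copy of the previous turn sequence, and walking those turns on a grid yields the curve's vertices. A minimal Python sketch (plotting omitted; feed the points to any plotting library):

```python
def dragon_turns(n):
    """Turn sequence (R/L) for n iterations of the dragon curve."""
    turns = ""
    for _ in range(n):
        flipped = "".join("L" if c == "R" else "R" for c in reversed(turns))
        turns = turns + "R" + flipped
    return turns

def dragon_points(n):
    """Walk the turn sequence on a grid, returning the curve's vertices."""
    x, y, dx, dy = 0, 0, 1, 0
    pts = [(x, y)]
    for turn in dragon_turns(n):
        x, y = x + dx, y + dy          # step forward, then turn
        pts.append((x, y))
        if turn == "R":
            dx, dy = dy, -dx           # rotate heading clockwise
        else:
            dx, dy = -dy, dx           # rotate counter-clockwise
    pts.append((x + dx, y + dy))       # final segment after the last turn
    return pts
```

After n iterations the curve has 2**n unit segments, so even modest n values produce the intricate shapes shown in the post.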

How to Transition from Excel to R - An Intro to R for Microsoft Excel Users

luiy's insight:

In today's increasingly data-driven world, business people are constantly talking about how they want more powerful and flexible analytical tools, but are usually intimidated by the programming knowledge these tools require and the learning curve they must overcome just to be able to reproduce what they already know how to do in the programs they've become accustomed to using. For most business people, the go-to tool for doing anything analytical is Microsoft Excel.

We were looking for a different type of visualization for a project at work this past week and my thoughts immediately gravitated towards streamgraphs. The TLDR on streamgraphs is that they are generalized versions of stacked area graphs with free baselines across the x axis. They are somewhat controversial but have a “draw you in” […]

luiy's insight:

Streamgraphs require a continuous variable for the x axis, and the streamgraph widget/package works with years or dates (support for xts objects and POSIXct types coming soon). Since they display categorical values in the area regions, the data in R needs to be in long format, which is easy to do with dplyr & tidyr.

The package recognizes when years are being used and does all the necessary conversions for you. It also uses a technique similar to expand.grid to ensure all categories are represented at every observation (not doing so makes d3.stack unhappy).
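The R side uses dplyr & tidyr for the wide-to-long reshape; for readers coming from the Python world, the equivalent step with pandas looks like this (column names and values are made up for illustration):

```python
import pandas as pd

# Wide format: one column per category, one row per year
wide = pd.DataFrame({
    "year": [2013, 2014],
    "catA": [10, 12],
    "catB": [7, 9],
})

# Long format: one (year, category, value) row per observation --
# the shape a streamgraph-style stacked layout expects
long_df = wide.melt(id_vars="year", var_name="category", value_name="value")
```

Each category/year pair becomes its own row, so every category is represented at every observation, which is the same requirement d3.stack imposes.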

How does a company keep tabs on thousands of suppliers? That’s the question Bruce Arntzen tried to answer when he started the Hi-Viz Research Project. As Executive Director of MIT’s Supply Chain Management Program, Arntzen works with corporations to find innovative solutions to supply chain problems. The idea for the Hi-Viz project came during a 2011 meeting of the Supply Chain Risk Leadership Council. A survey of attendees listed Supply Chain Visibility as the top concern. Why? With thousands of suppliers and sub-suppliers, it can be very time-consuming to find the weakest link in a supply chain. Arntzen’s solution: an automatic visualization of the end-to-end supply chain where the weakest links could be seen in real time. Watch his interview to learn how MIT and Sourcemap developed the first automated risk visualization [more details below the fold].

In 2015, the Hi-Viz project is partnering with actuarial data providers to provide predictive risk analytics. Sourcemap is making available inventory risk mapping as part of its enterprise software-as-a-service. Want to get involved? Learn more about the Hi-Viz project, or contact Sourcemap for a demo.

The Shogun Machine Learning toolbox provides a wide range of unified and efficient Machine Learning (ML) methods. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general-purpose tools, enabling both rapid prototyping of data pipelines and extensibility in terms of new algorithms. We combine modern software architecture in C++ with both efficient low-level computing backends and cutting-edge algorithm implementations to solve large-scale Machine Learning problems, for now on single machines.

One of Shogun's most exciting features is that you can use the toolbox through a unified interface from C++, Python, Octave, R, Java, Lua, C#, etc. This not only means that we are independent of trends in computing languages, but it also lets you use Shogun as a vehicle to expose your algorithm to multiple communities. We use SWIG to enable bidirectional communication between C++ and target languages. Shogun runs under Linux/Unix, MacOS and Windows.

Reporters love Twitter and geeks love coding. Today, I’m merging the best of both worlds! On the menu: Python scripts to use Twitter to its full potential!

luiy's insight:

When my friend @TerraCiolfe showed me the @WeAreTheDeads project, I said to myself that I really needed to learn how to control Twitter through Python. @WeAreTheDeads is a Twitter account publishing the name of a fallen soldier at the 11th minute of each hour.

Of course, nobody is working behind the screen. A program chooses a soldier from a database and publishes his name, hour after hour. With 119,000 names to publish, the script will run until 2023, according to the author of this great idea, the reporter @GlenMcGregor from the Ottawa Citizen.
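The scheduling logic described above, one name per hour at the 11th minute, can be sketched in a few lines of Python. This is a toy model of the idea only; the real script, its database and its Twitter API calls are not shown:

```python
from datetime import datetime

def posts_so_far(start, now):
    """Number of hourly posts made by `now`, the first at `start` (hh:11)."""
    elapsed = (now - start).total_seconds()
    return int(elapsed // 3600) + 1 if elapsed >= 0 else 0

def scheduled_name(names, start, now):
    """The most recently published name, or None before/after the run."""
    n = posts_so_far(start, now)
    return names[n - 1] if 0 < n <= len(names) else None
```

At 24 names a day, a list of 119,000 names keeps a loop like this busy for well over a decade, which is why the end date lands years in the future.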

With a little bit of research (my sources are at the end of the article), I learnt how to work with Twitter from a Python script. Actually, we can do way more than automatically publish tweets! It’s also possible to extract a lot of data about users and their tweets. For example, you can search for tweets in a specific location. I created a nice animated map at the end. You’ll see!

Tulip is a software system dedicated to the visualization of huge graphs. It enables 3D visualizations, 3D modifications, plugin support, support for clusters and navigation, and automatic graph drawing.

luiy's insight:

Tulip is an information visualization framework dedicated to the analysis and visualization of relational data. Tulip aims to provide the developer with a complete library, supporting the design of interactive information visualization applications for relational data that can be tailored to the problems he or she is addressing.

Written in C++, the framework enables the development of algorithms, visual encodings, interaction techniques, data models, and domain-specific visualizations. One of the goals of Tulip is to facilitate the reuse of components and allow developers to focus on programming their application. This development pipeline makes the framework efficient for research prototyping as well as for the development of end-user applications.

Every year, the World Economic Forum brings together the most recognisable figures of business and politics. With all eyes on Davos, we decided to turn the optics upside down and see who the twitterati gathered in Switzerland follow on social media.

The inner ring of circles represents the 20 most-followed accounts among Davos attendees, while the outer circles are individual attendees.

On Sept. 26, 2014, municipal police attacked a group of students from Ayotzinapa school in Mexico’s Guerrero state. Of the 43 disappeared students, eight came from Tecoanapa. Now their fellow citizens have shut down the local government buildings and set up a people’s council. It’s a movement that is gathering momentum across Guerrero.

Whilst Jim Gray envisages the fourth paradigm of science to be data-intensive and a radically new extension of the established scientific method, others suggest that Big Data ushers in a new era of empiricism, wherein the volume of data, accompanied by techniques that can reveal their inherent truth, enables data to speak for themselves free of theory. The empiricist view has gained credence outside of the academy, especially within business circles, but its ideas have also taken root in the new field of data science and other sciences. In contrast, a new mode of data-driven science is emerging within traditional disciplines in the academy. In this section, the epistemological claims of both approaches are critically examined, mindful of the different drivers and aspirations of business and the academy, with the former preoccupied with employing data analytics to identify new products, markets and opportunities rather than advance knowledge per se, and the latter focused on how best to make sense of the world and to determine explanations as to phenomena and processes.

The key to understanding the positive influence of diversity is the concept of informational diversity. When people are brought together to solve problems in groups, they bring different information, opinions and perspectives. This makes obvious sense when we talk about diversity of disciplinary backgrounds—think again of the interdisciplinary team building a car. The same logic applies to social diversity. People who are different from one another in race, gender and other dimensions bring unique information and experiences to bear on the task at hand. A male and a female engineer might have perspectives as different from one another as an engineer and a physicist—and that is a good thing.

Our system allows users to examine the factors influencing the predictions, so users can determine how “Liking” a certain item changes the predictions regarding their intelligence, or how changing the number of friends they have affects the predictions regarding their personality. Clearly, these factors are under the control of the user, and users may modify their behavior on Facebook to be perceived in a positive manner. As people can form judgments on others based on their social media profiles [4], this phenomenon is not new. However, we believe an automated tool can allow people to easily determine how others may perceive them based on their behavior on social networks.

How many different graph types exist? How do they relate to one another? Can you use the same graphic type for different types of data? These are the questions that we tried to tackle in our recent project, The Graphic Continuum.

Documenting the many chart types is something we were both working on independently for the past couple of years. Severino was building his Data Visualisation Catalogue, an online reference tool of data visualizations. At the same time, I was teaching data visualization to different audiences and was thinking about how to best show my students different graphic types and how they relate to one another.

It’s been said that we’re living in the golden age of data visualization. And why shouldn’t we be? Every move we make is potential fodder for a bar chart or line graph. Regardless of how you feel about our constant quantification, it’s been a boon for designers who have made some exceptional infographics—and some not…

luiy's insight:

A new book from graphic guru and School of Visual Arts professor Steven Heller and designer Rick Landers looks at the process of more than 200 designers, from first sketch to final product. The Infographic Designers Sketchbook is almost exactly what it sounds like. The 350-page tome is essentially a deep dive into the minds of data designers. Heller and Landers have chosen more than 50 designers and asked them to fork over their earliest sketches to give us insights into how they turn a complex set of data into coherent, visually stunning data visualizations. “You see a lot more unbridled, unfettered work when you’re looking at a sketchbook,” says Heller. “You might be looking at a lot of junk, but even that junk tells you something about the artist who is doing it.”
