Archive

Django has a nifty URL dispatcher which allows you to direct requests from a URL to an event handler, in the form of a callable Python object. If you followed the tutorials, your code is now probably littered with functions. You can also use classes for event handling, as follows.
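The original code listing is not preserved here, but a minimal sketch of the idea looks like this: since Django only requires that a view be callable, an instance of a class defining `__call__` works as a handler. The names are illustrative, and a plain string stands in for `django.http.HttpResponse` so the sketch runs without Django installed.

```python
class GreetingHandler(object):
    """Any callable can serve as a Django view, including a class instance.

    A real handler would return a ``django.http.HttpResponse``; a plain
    string stands in for it here so the sketch runs on its own.
    """

    def __init__(self, greeting):
        # State configured once, shared across requests.
        self.greeting = greeting

    def __call__(self, request, name):
        # Django invokes the instance exactly like a view function,
        # passing the request plus any captured URL groups.
        return "%s, %s!" % (self.greeting, name)


# In urls.py you would route a pattern to an *instance* of the class:
#   (r'^hello/(?P<name>\w+)/$', GreetingHandler("Hello"))

handler = GreetingHandler("Hello")
print(handler(None, "world"))  # request object stubbed as None
```

The advantage over a bare function is that the handler can carry configuration and state in `self` instead of module-level globals.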

Essentially, all models are wrong, but some are useful. — George E. P. Box

I was watching a show on Discovery a while back, and they were talking a lot about the complexity involved in the human genome. It turned out that the human genome contained an estimated 30,000 genes, while some species of plant contained roughly the same amount of genetic material. This raises an interesting question: if both species have the same amount of genetic material, how can we explain why humanity is so much more complex than a plant?

Let's take a look at another example: the stock market. The rules governing the actions of an individual investor seem rather trivial. He will try to invest when he thinks the market is low, and he will try to sell when he thinks the market is high. One plausible attempt at understanding the stock market is to attempt to understand the agents which compose it. In the case of complex adaptive systems, this idea quickly falls apart, because the agents have very complex interactions with each other; a large number of investors selling at the same time can cause the market to crash. Intuitively, you would expect low prices to mean people would start buying, but that is not always the case; humans may also panic.

This brings up the concept of self-organized criticality. These systems, like the stock market, usually have a point at which the behavior of the system as a whole changes drastically; this is called a point attractor. We can see self-organized criticality in action in the example below.

In this model, we start with a 1000×1000 grid of dirt. At every time step, each dirt node has a chance of turning into a tree, and each tree node has a chance of being turned into fire; this simulates a lightning strike. After a certain amount of time, the fire nodes return to dirt. The point attractor in this system is defined in terms of the ratio of the growth rate to the lightning-strike rate. If the growth rate is low and the lightning-strike rate is high, not enough trees ever grow for cascading failures to appear. If lightning strikes are too rare and the growth rate is too high, the forest grows dense enough that a single strike burns through huge connected swaths of it, and we see a cascading failure.
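The model described above can be sketched in a few dozen lines of Python. This is an illustrative implementation, not the original applet: the grid is shrunk so it runs quickly, the growth and lightning probabilities are made-up parameters, and fire spreading to neighbouring trees (the mechanism behind the cascades) is made explicit.

```python
import random

random.seed(0)  # deterministic run for this sketch

# Illustrative parameters; the post describes a 1000x1000 grid,
# shrunk here so the sketch runs quickly.
SIZE = 50
P_GROWTH = 0.05      # chance a dirt cell sprouts a tree each step
P_LIGHTNING = 0.001  # chance a tree is struck by lightning each step

DIRT, TREE, FIRE = 0, 1, 2


def step(grid):
    """Advance the forest-fire model by one time step."""
    new = [row[:] for row in grid]
    for i in range(SIZE):
        for j in range(SIZE):
            cell = grid[i][j]
            if cell == FIRE:
                # Burned cells return to dirt.
                new[i][j] = DIRT
            elif cell == TREE:
                # A tree catches fire from a burning neighbour or
                # from a direct lightning strike.
                neighbours = [grid[x][y]
                              for x, y in ((i - 1, j), (i + 1, j),
                                           (i, j - 1), (i, j + 1))
                              if 0 <= x < SIZE and 0 <= y < SIZE]
                if FIRE in neighbours or random.random() < P_LIGHTNING:
                    new[i][j] = FIRE
            elif random.random() < P_GROWTH:
                new[i][j] = TREE
    return new


grid = [[DIRT] * SIZE for _ in range(SIZE)]
for _ in range(200):
    grid = step(grid)

tree_density = sum(row.count(TREE) for row in grid) / float(SIZE * SIZE)
print("tree density after 200 steps: %.3f" % tree_density)
```

Sweeping the ratio `P_GROWTH / P_LIGHTNING` and plotting the distribution of burned-cluster sizes is how you would see the critical behaviour the post describes.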

This model is actually more useful than the obvious application, the one where you predict when a forest is getting too thick and preemptively burn parts of it to the ground. The forest fire model happens to be a relatively close model of other cascading systems, like banks.

Let's go back to my first question: how do we explain the complexity of humans compared to plants, when some plants have approximately the same amount of genetic material? We explain it the same way we explain the complexity of the forest fire model. In the documentary they mentioned that some genes behave like switches, turning off segments of genetic material; genes have cause-and-effect relationships with each other. This leads to the possibility that the whole is bigger than the sum of the parts.

Computer Science is no more about computers than astronomy is about telescopes.
— E.W. Dijkstra

The summer REU has ended and it was, overall, a very rewarding experience. I do not think I was really prepared for the overwhelming amount of work that goes into research, but I did learn a lot about computer science in general. The project I was participating in was entitled "Dynamics of Knowledge Creation in Open Biomedical Ontologies". The goal of this project was to examine how knowledge grows over time, and what intrinsic qualities in a social network lead to maximal growth in a technical network.

First, a little bit of background. An ontology is a set of concepts within a domain, and these concepts must be representable in a machine-parsable form. Think of an ontology like a dictionary: it contains a set of terms which define a domain, a set of relations mapping terms to other terms, and relations mapping terms to other domains. My part of this involved parsing these ontologies and placing them into a database so that we could build knowledge graphs and look at how they change over time.

Most of the communities in Open Biomedical Ontologies communicate through mailing lists, SourceForge's bug tracker, or both. Notre Dame provided us with an API to access most of the data required, but some of it, like the mailing list data, was not available, so I built a web crawler to place that information into a database.

Once I had all of the required data, I needed a tool to calculate graph metrics on it, such as centrality, density, and clustering coefficient. I built a GDF parser to read GUESS graph files and output the data as CSV files.
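The GDF parser itself is not reproduced here, but the metrics it fed are easy to define. As a sketch, here are pure-Python definitions of density, degree centrality, and the local clustering coefficient on a toy undirected graph (the adjacency list and node names are illustrative), with CSV output of the kind the tool produced.

```python
import csv
import io
from itertools import combinations


def degree_centrality(adj):
    """Fraction of other nodes each node is connected to."""
    n = len(adj)
    return {v: len(nbrs) / float(n - 1) for v, nbrs in adj.items()}


def density(adj):
    """Ratio of existing edges to possible edges in an undirected graph."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2.0 * edges / (n * (n - 1))


def clustering(adj, v):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return 2.0 * links / (k * (k - 1))


# Toy undirected graph: a triangle a-b-c with a pendant node d on a.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}

# Dump per-node metrics as CSV, the output format the tool used.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["node", "degree_centrality", "clustering"])
centrality = degree_centrality(adj)
for v in sorted(adj):
    writer.writerow([v, centrality[v], clustering(adj, v)])
print(out.getvalue())
```

Real ontology graphs would of course come from the parsed GDF files rather than a hand-written dictionary, but the metric definitions are the same.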

In essence, I spent most of the summer building tools to download empirical data. I plan to compare this data to a simulation designed by Dr. Yilmaz. Once we are confident in the accuracy of this simulation, we can begin to examine how changing structural elements of the social network affect the technical network, and which characteristics maximize innovation.

All in all, this project has been fascinating. It quickly turned into a multidisciplinary project involving sociology, graph theory, and computer science. A friend of mine once sent me that E. W. Dijkstra quote, and mentioned that computer scientists are often wanted for their ability to manipulate massive quantities of data in a variety of fields. I am now convinced he is right.