Michael O. Vertolli

Investigating questions for problems not yet envisioned.

Welcome.
My name is Michael O. Vertolli and I am currently a PhD student in the Institute of Cognitive Science at Carleton University. My research focuses on data-driven procedural generation with cognitive constraints. My current focus is on deep learning at a systems level: large-scale network topologies rather than layer-level or neuron-level structures. I am primarily working with Generative Adversarial Networks (GANs).
Interested parties may contact me at michaelvertolli(Google email).

Research Projects

When I have more time, I will list more of the projects that I am currently working on, plan to work on, or have already finished. Many of my projects can be found on my Git here.

ConnectorGraph Tensorflow Framework

I have started building an extension of Tensorflow that is designed to handle large-scale deep neural networks that are composed of multiple, reusable subnetworks. The code and some sample output can be found here.
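To give a flavour of the idea, here is a highly simplified sketch in plain Python. All class and method names below are hypothetical and do not reflect the real API: a subnetwork is a named block with declared inputs and outputs, and the connector graph is the wiring that routes one subnetwork's outputs into another's inputs.

```python
# Illustrative sketch of the ConnectorGraph idea only; names are
# hypothetical. A subnetwork declares named inputs and outputs; the
# connector graph wires outputs of one subnetwork to inputs of another.

class SubGraph:
    def __init__(self, name, fn, inputs, outputs):
        self.name, self.fn = name, fn
        self.inputs, self.outputs = inputs, outputs

class ConnectorGraph:
    def __init__(self):
        self.subgraphs = {}
        self.connections = []   # (source tensor name, target tensor name)

    def add_subgraph(self, sub):
        self.subgraphs[sub.name] = sub

    def connect(self, src, dst):
        self.connections.append((src, dst))

    def run(self, feeds):
        values = dict(feeds)    # tensor name -> computed value
        pending = set(self.subgraphs)
        while pending:
            progress = False
            for name in sorted(pending):
                sub = self.subgraphs[name]
                if all(t in values for t in sub.inputs):
                    results = sub.fn(*[values[t] for t in sub.inputs])
                    values.update(zip(sub.outputs, results))
                    for src, dst in self.connections:   # route along wires
                        if src in values:
                            values[dst] = values[src]
                    pending.remove(name)
                    progress = True
            if not progress:
                raise ValueError("unsatisfied inputs for: %s" % sorted(pending))
        return values

# Example: a toy "generator" feeding a toy "discriminator".
g = ConnectorGraph()
g.add_subgraph(SubGraph('gen', lambda z: (z * 2,), ['gen/z'], ['gen/out']))
g.add_subgraph(SubGraph('disc', lambda x: (x + 1,), ['disc/in'], ['disc/out']))
g.connect('gen/out', 'disc/in')
values = g.run({'gen/z': 3})   # gen/out = 6, disc/out = 7
```

In the real framework the subnetworks are Tensorflow graphs (e.g., the generator and discriminator of a GAN) rather than plain callables, but the wiring idea is the same.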

Contextual coherence in the visual imagination

This was the primary project of my Master of Cognitive Science degree. My focus is now on my PhD research, which is closely related, and many of these research directions are still present in some form in the new project. For those interested in the details, I refer you to the papers page of my website, where texts and abstracts are listed.

The research directions of this project were:

an extension of my work with holographic vectors, but with a focus on text instead of vision;

assessing my M.Cog. research in terms of a new compression technique: supervised clustering;

integrating spatial data into my M.Cog. research to create an innovative form of memory chunking;

and using Coherence Net as a meta-search technique for an adaptive search engine framework.

The code for Coherencer, one of the main models of the project, can be found here.

Mental rotation in a spiking neural network

This project applies some of my dissertation research in a slightly different way. Using a mix of Tensorflow and NengoDL, we have been able to train a deep spiking neural network to perform rotations of MNIST digits. I am mainly functioning in a supervisory role for an undergraduate student on this project.
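As a toy illustration of the data side of such a task (the actual project trains a spiking network on MNIST with Tensorflow and NengoDL, none of which is shown here), rotated input/target training pairs might be produced like this, using a 4x4 grid in place of a real digit:

```python
# Toy sketch: build (input, target) pairs where each target is the
# input rotated by one further 90-degree step. A real pipeline would
# use MNIST images and feed the pairs to the spiking network.

def rot90(grid):
    """Rotate a 2D list 90 degrees counter-clockwise."""
    return [list(row) for row in zip(*grid)][::-1]

def make_rotation_pairs(image, n_steps=3):
    pairs = []
    current = image
    for _ in range(n_steps):
        rotated = rot90(current)
        pairs.append((current, rotated))
        current = rotated
    return pairs

digit = [[0, 1, 0, 0],   # a toy 4x4 "digit" standing in for MNIST
         [0, 1, 0, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pairs = make_rotation_pairs(digit)
```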

OImaGener

This is my dissertation research on 2D generative modeling. At present, I have completed research on a detailed theoretical neural model and a mapping from that model to contemporary computational models, specifically network topologies of interconnected Generative Adversarial Networks. I am now in the process of coding the system using my ConnectorGraph framework.

The planned components of the project are as follows (* marks completion):

A new evaluation and training approach using Image Quality Assessment techniques (*)

A new output refinement approach using GAN encoders (*)

Further improvements on evaluation using GAN encoders

Unsupervised clustering using modified MAD-GANs

Improved generative diversity and selection using stacked GANs

Large-scale image composition using GAN networks

The models that I am building and working on are available in the models folder in the ConnectorGraph git.

Optimization problem collection with genetic algorithm solutions

This is a Python implementation of a genetic algorithm (GA) and an evolution strategy that I wrote for my graduate-level evolutionary computation class. I chose to make it object-oriented so I could reuse the code across the different problems (this was not a requirement of the assignment). I similarly chose to build a user interface to try my hand at UI design and make the code even more generic. The UI was particularly challenging: it runs on a separate thread to keep the interface responsive; it has a range of selections that have to be passed between threads; and it does much of the stitching together of the various sub-components.

It was a few years ago, but I believe the problems are as follows:

Evo Strategy evolves a string of lowercase letters and spaces until it matches the user's input;

OneMax starts with a random bit string and optimizes it to all 1's;

SimpleMax optimizes the function that multiplies the first five numbers and divides by the last five;

LeadingOnes optimizes the function that adds all the ones from left to right up to the first zero in the bit string;

TSP is a particular instantiation of the travelling salesman problem with a known solution (the exact optimum was never achieved with this particular implementation, but it got pretty close; impressive given how simple the implementation was);

Test Crossover is a proof of concept that the crossover function works.
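As a minimal sketch of the kind of GA involved (this is not the original class code, and the parameters below are illustrative), OneMax can be solved with tournament selection, one-point crossover, and bit-flip mutation:

```python
# Minimal GA sketch for OneMax (illustrative, not the original code).
# Fitness is the number of 1s; the optimum is the all-ones string.
import random

def one_max(bits):
    return sum(bits)

def evolve(length=20, pop_size=30, generations=100, p_mut=0.05, seed=1):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    best = max(pop, key=one_max)
    for _ in range(generations):
        def pick():
            # Binary tournament selection.
            a, b = rng.sample(pop, 2)
            return a if one_max(a) >= one_max(b) else b
        children = []
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, length)                    # one-point crossover
            child = [1 - bit if rng.random() < p_mut else bit  # bit-flip mutation
                     for bit in p1[:cut] + p2[cut:]]
            children.append(child)
        pop = children
        generation_best = max(pop, key=one_max)
        if one_max(generation_best) > one_max(best):
            best = generation_best
        if one_max(best) == length:
            break
    return best

best = evolve()
```

The actual project wraps problems like these behind a common object-oriented interface so the same evolutionary loop can be reused across OneMax, LeadingOnes, and the rest.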

Here is a link to a Git repository of all the files. If you download the "dist" folder and extract it, the file WXGAGUI.exe will run the program on Windows machines with minimal setup. The readme file gives all the details.

Phonomorphism mAPP

I created a simple algorithm based on Lendl Barcelos' work that re-maps a text while preserving its phonetic structure. The purpose of the work was to demonstrate the (near-infinite) productivity that structured re-representation can generate.
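To illustrate the flavour of structure-preserving re-mapping (this is only a toy, not the actual algorithm: a real version would preserve phoneme structure proper, e.g., via a pronunciation dictionary such as CMUdict, rather than the crude consonant/vowel skeleton used here):

```python
# Toy sketch only (not the actual Phonomorphism algorithm): re-map a
# text while preserving each word's consonant/vowel skeleton, using a
# tiny hand-made lexicon. Words with no same-skeleton alternative are
# left unchanged.
VOWELS = set("aeiou")

def skeleton(word):
    """Reduce a word to its consonant/vowel pattern, e.g. 'cat' -> 'CVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in word.lower())

LEXICON = ["cat", "dog", "pig", "sun", "moon", "beet", "rope", "tide"]

def remap(text):
    by_skeleton = {}
    for w in LEXICON:
        by_skeleton.setdefault(skeleton(w), []).append(w)
    out = []
    for word in text.split():
        options = [w for w in by_skeleton.get(skeleton(word), [])
                   if w != word.lower()]
        out.append(options[0] if options else word)  # keep word if no match
    return " ".join(out)
```

Every substituted word shares its skeleton with the original, so the "shape" of the text survives the re-mapping even though its content changes.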

Papers

Below is a list of academic papers from newest to oldest with links and abstracts where available.

An incoherent visualization is one in which aspects of different senses of a word (e.g., the biological 'mouse' versus the computer 'mouse') are present in the same visualization (e.g., a biological mouse appearing in the same image as a computer tower). We describe and implement a new model for creating contextual coherence in the visual imagination, called Coherencer, based on the SOILIE model of imagination. We show that Coherencer is able to generate scene descriptions that are more coherent than SOILIE's original approach as well as a parallel connectionist algorithm that is considered competitive in the literature on general coherence. We also show that co-occurrence probabilities are a better association representation than holographic vectors and that better models of coherence improve the resulting output independent of the association type that is used. Theoretically, we show that Coherencer is consistent with other models of cognitive generation. In particular, Coherencer is similar to, but more cognitively plausible than, the C3 model of concept combination created by Costello and Keane (2000). We show that Coherencer is also consistent with both the modal schematic indices of Perceptual Symbol Systems theory (Barsalou, 1999) and the amodal contextual constraints of Thagard's (2002) theory of coherence. Finally, we describe how Coherencer is consistent with contemporary research on the hippocampus, and show evidence that the process of making a visualization coherent is serial.
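The serial flavour of this kind of coherence search can be suggested with a highly simplified sketch (this is not the published Coherencer model; the table, scores, and threshold below are illustrative): starting from a query label, greedily add the candidate whose weakest pairwise co-occurrence with the current scene is strongest, rejecting candidates that fall below a threshold.

```python
# Highly simplified sketch of serial coherence search over label
# co-occurrence probabilities (illustrative; not the published model).

def cooc(a, b, table):
    return table.get((a, b), table.get((b, a), 0.0))

def coherent_scene(query, candidates, table, size=3, threshold=0.1):
    scene = [query]
    pool = [c for c in candidates if c != query]
    while len(scene) < size and pool:
        # Score each candidate by its weakest link to the current scene.
        scored = [(min(cooc(c, s, table) for s in scene), c) for c in pool]
        best_score, best = max(scored)
        if best_score < threshold:
            break  # nothing coheres well enough with the scene so far
        scene.append(best)
        pool.remove(best)
    return scene

# Illustrative co-occurrence table for the 'mouse' example.
table = {('mouse', 'cheese'): 0.6, ('mouse', 'keyboard'): 0.5,
         ('cheese', 'keyboard'): 0.01, ('mouse', 'trap'): 0.4,
         ('cheese', 'trap'): 0.3}
scene = coherent_scene('mouse', ['cheese', 'keyboard', 'trap'], table)
# 'keyboard' is rejected once 'cheese' is in the scene: the two barely co-occur
```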

We propose a large-scale system, with minimal global topological structure, no local internal structure, and a simple online biologically plausible local learning rule that captures supervised learning in the barn owl. We outline how our computational model corresponds to both the underlying neuroscience and the experimental paradigm used in the relevant prism studies of the barn owl. We show that our model is able to capture the basic outcomes of this experimental research despite learning the initial tuning curves, which is not done in other computational models, and a much more restricted time frame relative to the original experimental condition. We outline some variations between our model and the neuroscience outcomes and suggest future extensions in terms of larger models and time frames, more detailed analyses of the learning parameters, and richer model designs.

We propose a framework for sonic creativity via computational methods and artificial intelligence research. We extend Norman Sieroka's theory of sound from a 3-fold to a 4-fold hierarchy so that sound now becomes characterised in terms of the acoustic, physiological, speculative and phenomenal layers. We describe how manipulations in the proposed speculative layer can directly act on how one orients the very apprehension of sound. Black box algorithms, in particular deep neural networks as instantiated by Google's DeepDream, are then discussed as an illustrative example that can be used to manipulate the speculative layer. We caution that DeepDream offers a warning of how easily our tools capture our phenomenal apprehensions, potentially obfuscating what is just beyond our perception in the process.

We propose a new algorithm and formal description of generative cognition in terms of the multi-label bag-of-words paradigm. The algorithm, Coherence Net, takes its inspiration from evolutionary strategies, genetic programming, and neural networks. We approach generative cognition in spatial reasoning as the decompression of images that were compressed into lossy feature sets, namely, conditional probabilities of labels. We show that the globally parallel and locally serial optimization technique described by Coherence Net is better at accurately generating contextually coherent subsections of the original compressed images than a competitive, purely serial model from the literature: Coherencer.

This paper proposes that perception and generative cognition can be generally portrayed as a dyadic, compression-decompression sequence. We argue that both of these processes are necessary for the successful functioning of an agent in all domains where stimuli reduction is a requirement. We support this claim by comparing two compression representations, co-occurrence probabilities and holographic vectors, and two decompression procedures, top-n and Coherencer, on a context generation task from the visual imagination literature. We tentatively conclude that better decompression procedures will increase optimality regardless of the underlying compression representation.

A cognitive model of the visual imagination will produce “incoherent” results when it adds elements to an imagined scene that come from different contexts (e.g., “computer” and “cheese” with “mouse”). We approach this problem with a model that infers coherence relations from co-occurrence probabilities of labels in images. We show that this algorithm’s serial traversal of networks of co-occurrence relations for a particular query produces greater coherence than one leading model in the field of computational coherence: Thagard’s connectionist model.

We describe the overall theory of the SOILIE model of the human imagination. In this description, we outline cognitive capacities for learning and storage, image component selection and placement, as well as analogical reasoning. The guiding theory behind SOILIE is that visual imagination is constrained by regularities in visual memories.

T. B. Ward (1994) investigated creativity by asking participants to draw alien creatures that they imagined to be from a planet very different from Earth. He found that participant drawings reliably contained features typical of common Earth animals. As a consequence, Ward concluded that creativity is structured. The present investigation predicts that this limitation on creativity is not restricted to drawings: the use of different technology will not change creative output. To investigate this question, participants performed Ward's task twice: once using pencil and paper and once using software made to design creatures (the Spore Creature Creator). Only minor significant differences were found. This preliminarily suggests that changing tools does not affect the overall rigidity of the creative process. This lends further support to Ward's thesis on the structural rigidity of creativity. We conclude by suggesting an elaboration to Ward's thesis that will be explored in future work. We suggest that aesthetics might be one of the factors that contribute to creative constraint, in that creatures that are too unusual would be less interesting.

A cognitive model of visual imagination will produce what we call “incoherent” results when it adds elements to an imagined scene that come from multiple contexts (e.g., “arrow” and “violin” with “bow”). We approach this problem by exploring the co-occurrence of labels in images. We show that adding an incremental algorithm for examining networks of co-occurrence associations to the top-n co-occurring labels with a particular query produces greater coherence than just selecting the top-n labels or randomly selecting labels.

In 2005, John Ioannidis remarked on the abundance of false positives in modern, scientific research. The adoption of these false positives as ‘truth’ by the larger scientific and non-scientific communities is at best problematic. At worst, it may be directly harmful. This work seeks to situate Ioannidis’s claim in the work of Thomas Teo. Teo (2008) argues that any interpretation that has a negative impact on human sub-populations should be considered an act of epistemological violence. Thus, the abundance of false positives in the literature is not only scientifically problematic; it is socially and ethically problematic as well. Ioannidis’s solution, a more critical approach to scientific methodology, suggests a way forward. By developing a more formal account of the production of knowledge, including methodology, it is possible to determine constraints that are necessary for accurate and ethical knowledge claims. The work of Jean Piaget is then used to provide an account of such formalism and necessity, especially his idea of a ‘structure.’ This work concludes by urging other researchers to continue the search for necessary constraints in the formal architectures of Piaget and similar theorists. In this way, modern, scientific research can increase the accuracy of its knowledge claims and begin to meet Teo’s ethical imperative.

Vertolli, M. O. & Burman, J. T. (2011, June). On the cultural support of “cognizance”: The Lakatosian key to Piaget 3.0. Paper presented at the 41st annual meeting of the International Jean Piaget Society, Berkeley, CA.

This talk outlines an exploration of an anomalous diagram from a later, less well-known work by Jean Piaget: The Grasp of Consciousness: Action and Concept in the Young Child (1974/1976). By situating this work in the historical development of Piaget’s concept of “cognizance,” at least three possible interpretations are found (i.e., it is consistent with Piaget 1.0, 2.0, and 3.0). Only the last interpretation lends itself to the work in question. Thus, an illustrative text for this last interpretation, Psychogenesis and the History of Science, is used to situate this diagram in relation to another researcher, Imre Lakatos. Lakatos, later cited by Piaget, had a similar view of the development of knowledge, but, due to a different disciplinary allegiance, remained largely outside of the Piagetian discourse. By reintegrating this dialectic, a missing piece of Piaget’s cultural context is found: it is suggested that the contents of new insights are shaped both by action (as is understood of Piaget 2.0) and by the context of implication in which those actions are carried out (Piaget 3.0). Thus, “reason” can be supported through the construction of “commensurable contexts” with the contents one is trying to teach.

The current work seeks to situate a conflict with significant implications for the funding of early education that can be found in the literature on human development, specifically that relating to the nature-nurture dichotomy. It is argued that the contradictions inherent to Francis Galton’s separation of nature from nurture are made especially apparent when combined with the concept of underdetermination: the idea that interpretation is not determined strictly and solely by empirical content. This, in turn, is how the dichotomy is problematic: Galton’s opposing interpretations suggest contradictory alternatives, either an increase or decrease in funding, that are irresolvable due to the underdetermination of their respective empirical content. In order to support this reading of dichotomies, the degree of underdetermination is assessed relative to this conflict as it plays out in three books that have been recognized as exemplary to the field of developmental psychology by both the American Psychological Association and the Society for Research in Child Development.