Monthly Archive

I will not attend the generalized linear models class any more this semester, since I already know what the class covers and I think from now on it will be mostly technical material on binary response models and count response models. I will learn that material later on my own; this is also the instructor's first time teaching the class and he is not very familiar with the course either, so I don't think I can learn more from him than from the textbook. If anyone has good materials to recommend, I would appreciate it very much. Thanks.

A standard paradigm assumes that the data comes from some underlying geometric structure, such as a curved submanifold or a singular algebraic variety. The observed data is obtained as a random sample from this space, and the objective is to statistically recover features of the underlying space and/or the distribution that generated the sample.

One then considers the statistical accuracy of the resulting estimators. Considerable progress has been achieved in terms of optimal estimation in the minimax sense. These ideas have far-reaching implications in the analysis of high-dimensional data, for example in astronomy, biomechanics, medical imaging, microwave engineering and texture analysis.

In Computational Topology, one attempts to recover more qualitative global features of the underlying data instead, such as connectedness, or the number of holes, or the existence of obstructions to certain constructions, based upon the random sample. In other words, one hopes to recover the underlying topology. An advantage of topology is that it is stable under deformations and thus insensitive to errors introduced in the sampling.

A combinatorial construction such as the alpha complex or the Čech complex converts the discrete data into an object for which it is possible to compute the topology. However, it is quickly apparent that such a construction and its calculated topology depend on the scale at which one considers the data. A multiscale solution to this problem is the technique of persistent homology. It quantifies the persistence of topological features as the scale changes. Persistent homology is useful for visualization, feature detection and object recognition. It has been successfully applied to analyze natural images, neurological data, gene-chip data, protein binding and sensor networks.
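As a concrete illustration of the kind of multiscale bookkeeping persistent homology performs, here is a minimal sketch in plain Python/NumPy. It builds the edges of a Vietoris-Rips filtration on a small point cloud and tracks the birth and death of connected components (0-dimensional features) with a union-find structure. The function names are illustrative rather than taken from any library; real tools based on the alpha or Čech complex handle higher-dimensional features and much larger data sets far more efficiently.

```python
import numpy as np

def h0_persistence(points):
    """Birth/death pairs of connected components (H0) along the
    Vietoris-Rips filtration of a point cloud.

    Every point is born at scale 0; a component dies at the scale of the
    edge that merges it into another component.
    """
    n = len(points)
    # Pairwise distances = scales at which edges enter the filtration.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    edges = sorted((d[i, j], i, j) for i in range(n) for j in range(i + 1, n))

    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    pairs = []
    for scale, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            # Two components merge: one of them dies at this scale
            # (all are born at 0 here, so either may be recorded).
            parent[rj] = ri
            pairs.append((0.0, scale))
    pairs.append((0.0, np.inf))             # one component never dies
    return pairs

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated clusters: expect one long-lived H0 feature
    # besides the essential one, and many short-lived ones (noise).
    pts = np.vstack([rng.normal(0, 0.1, (20, 2)),
                     rng.normal(3, 0.1, (20, 2))])
    for birth, death in sorted(h0_persistence(pts), key=lambda p: -p[1])[:3]:
        print(f"born {birth:.2f}, dies {death:.2f}")
```

Features that persist over a long range of scales (here, the two clusters) are read as genuine topology, while short-lived features are attributed to sampling noise.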

Although Geometric Statistics and Computational Topology appear disparate and seem to have different objectives, it has recently been noticed that they share a commonality through statistical sampling. In particular, the metric distance of persistent homology in Computational Topology is intimately related to the sup-norm metric between the underlying density that generates a random sample on a Riemannian manifold and its statistical estimator. Consequently, the qualitative and quantitative data analyses are intimately linked, which is not surprising given the traditionally close connection between geometry and topology.
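To make the quantitative side of this statement concrete, the following small sketch computes the sup-norm distance between a true density on the real line and its kernel estimator; distances of exactly this type appear on the statistical side of the connection mentioned above. The Gaussian kernel, the evaluation grid, and the bandwidth rule are illustrative assumptions, not part of the result being described.

```python
import numpy as np

def gaussian_kde(sample, grid, bandwidth):
    """Kernel density estimate with a Gaussian kernel, evaluated on a grid."""
    z = (grid[:, None] - sample[None, :]) / bandwidth
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(sample) * bandwidth * np.sqrt(2 * np.pi))

def sup_norm_error(true_density, sample, grid, bandwidth):
    """Approximate sup-norm distance ||f - f_hat||_inf over the grid."""
    return np.max(np.abs(true_density(grid) - gaussian_kde(sample, grid, bandwidth)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    f = lambda x: np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)   # standard normal density
    grid = np.linspace(-4, 4, 801)
    for n in (100, 1000, 10000):
        sample = rng.standard_normal(n)
        err = sup_norm_error(f, sample, grid, bandwidth=n ** (-1 / 5))
        print(f"n = {n:6d}   sup-norm error ~ {err:.4f}")
```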

The use of geometric and topological methods for statistical data analysis is currently being pursued in the three allied fields of computer science, mathematics and statistics. Although each field has its own particular approach and questions of interest, the amount of similarity is striking, and this workshop was able to bring all three fields together. The open problems considered were the development of computational and statistical algorithms and methods that use aspects of geometry and topology when only data over the geometric object is available.

We can summarize the types of investigations as they pertain to the three fields mentioned above. A more detailed description is provided in the following section:

In computer science the pursuit naturally focused on efficient algorithms and visualization. Some specific items discussed included algorithms for the discrete approximation of the Laplacian, algorithms for approximating the cut locus, data reduction techniques, and recovery from noisy data (a minimal sketch of a discrete Laplacian appears after this list of topics);

In mathematics the interest focused on certain constructions. Here such topics included zigzag persistence, Hodge theory, and recovering the topology over a random field;

In statistics parameter estimation was the main interest and topics included bootstrapping and MCMC on manifolds, geodesic PCA, asymptotic minimaxity, conditional independence, statistical multiscale analysis and analysis over the Euclidean motion group.

Additionally, some physical applications were discussed, such as brain mapping, network analysis and the biomechanics of osteoarthritis.
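As a minimal sketch of the first item on the computer science list, the discrete approximation of the Laplacian, the following code builds the unnormalized graph Laplacian of a point cloud from Gaussian-weighted neighborhood affinities. This is the standard construction rather than any of the specific algorithms presented at the workshop, and the kernel width and truncation radius are illustrative parameters.

```python
import numpy as np

def graph_laplacian(points, epsilon):
    """Unnormalized graph Laplacian L = D - W of a point cloud.

    W uses a Gaussian kernel truncated at distance 3*sqrt(epsilon); as the
    number of points grows and epsilon shrinks suitably, a rescaled L
    approximates the Laplace-Beltrami operator of the underlying manifold.
    """
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / epsilon)
    W[d2 > 9 * epsilon] = 0.0          # sparsify: drop negligible weights
    np.fill_diagonal(W, 0.0)           # no self-loops
    D = np.diag(W.sum(axis=1))
    return D - W

if __name__ == "__main__":
    # Points on a circle: the smallest nonzero Laplacian eigenvalues
    # come in pairs, mirroring the spectrum of the circle itself.
    t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
    pts = np.column_stack([np.cos(t), np.sin(t)])
    L = graph_laplacian(pts, epsilon=0.05)
    eigvals = np.linalg.eigvalsh(L)
    print(np.round(eigvals[:5], 4))    # first eigenvalue is (numerically) 0
```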

Abstract: We present a geometrical method for analyzing sequential estimating procedures. It is based on the design principle of second-order efficient sequential estimation provided in Okamoto, Amari and Takeuchi (1991). By introducing a dual conformal curvature quantity, we clarify the conditions for the covariance minimization of sequential estimators. These conditions are further elaborated for the multidimensional curved exponential family. The theoretical results are then numerically examined using typical statistical models, the von Mises-Fisher and hyperboloid models.

Abstract: The current definition of a conditional probability distribution enables one to update probabilities only on the basis of stochastic information. This paper provides a definition for conditional probability distributions with non-stochastic information. The definition is derived as a solution of a decision theoretic problem, where the information is connected to the outcome of interest via a loss function. We shall show that the Kullback-Leibler divergence plays a central role. Some illustrations are presented.

This semester I am studying theoretical statistics, especially the testing of statistical hypotheses. The main textbook is Lehmann's classical book, but I hope you could help me find excellent course notes or other books on this topic. If you have any recommendations, please let me know. Thanks very much.

Abstract. We introduce vector diffusion maps (VDM), a new mathematical framework for organizing and analyzing massive high dimensional data sets, images and shapes. VDM is a mathematical and algorithmic generalization of diffusion maps and other non-linear dimensionality reduction methods, such as LLE, ISOMAP and Laplacian eigenmaps. While existing methods are either directly or indirectly related to the heat kernel for functions over the data, VDM is based on the heat kernel for vector fields. VDM provides tools for organizing complex data sets, embedding them in a low dimensional space, and interpolating and regressing vector fields over the data. In particular, it equips the data with a metric, which we refer to as the vector diffusion distance. In the manifold learning setup, where the data set is distributed on (or near) a low dimensional manifold M^d embedded in R^p, we prove the relation between VDM and the connection-Laplacian operator for vector fields over the manifold.
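Since VDM is described as a generalization of diffusion maps, a brief sketch of the scalar diffusion-maps construction it builds on may help fix ideas. This is the standard construction rather than the authors' vector-field version, and the kernel bandwidth, number of coordinates, and diffusion time below are illustrative choices.

```python
import numpy as np

def diffusion_maps(points, epsilon, n_coords=2, t=1):
    """Scalar diffusion-maps embedding of a point cloud.

    Builds Gaussian affinities, forms the associated Markov transition
    matrix, and embeds each point using the leading nontrivial
    eigenvectors scaled by their eigenvalues to the power t.
    """
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / epsilon)                  # heat-kernel affinities
    d = K.sum(axis=1)                          # degrees (row sums)
    # The Markov matrix P = D^{-1} K shares its spectrum with the symmetric
    # matrix S = D^{-1/2} K D^{-1/2}; diagonalize S for numerical stability.
    S = K / np.sqrt(np.outer(d, d))
    vals, vecs = np.linalg.eigh(S)
    order = np.argsort(vals)[::-1]             # largest eigenvalues first
    vals, vecs = vals[order], vecs[:, order]
    psi = vecs / np.sqrt(d)[:, None]           # right eigenvectors of P
    # Skip the trivial constant eigenvector (eigenvalue 1).
    return (vals[1:n_coords + 1] ** t) * psi[:, 1:n_coords + 1]

if __name__ == "__main__":
    # Noisy circle in the plane: the embedding recovers the circular structure.
    rng = np.random.default_rng(2)
    theta = rng.uniform(0, 2 * np.pi, 300)
    pts = np.column_stack([np.cos(theta), np.sin(theta)])
    pts += 0.02 * rng.standard_normal(pts.shape)
    emb = diffusion_maps(pts, epsilon=0.1)
    print(emb.shape)                           # (300, 2)
```

VDM, as the abstract explains, replaces the scalar heat kernel used here with a heat kernel for vector fields, which leads to the vector diffusion distance and the connection-Laplacian.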

D. K. Biss (Topology and its Applications 124 (2002) 355-371) introduced the topological fundamental group and presented some interesting basic properties of the notion. In this article we extend the above notion to homotopy groups and prove some similar basic properties of the topological homotopy groups. We also study the topology of the topological homotopy groups further in order to find necessary and sufficient conditions under which the topology is discrete. Moreover, we show that studying topological homotopy groups may be more useful than studying topological fundamental groups.

This paper describes the structure of the moduli space of holomorphic curves and constructs Gromov-Witten invariants in the category of exploded manifolds. This includes defining Gromov-Witten invariants relative to normal crossing divisors and proving the associated gluing theorem, which involves summing relative invariants over a count of tropical curves.

These are lecture notes that arose from a representation theory course given by the first author to the remaining six authors in March 2004 within the framework of the Clay Mathematics Institute Research Academy for high school students, and its extended version given by the first author to MIT undergraduate math students in the Fall of 2008. The notes cover a number of standard topics in representation theory of groups, Lie algebras, and quivers, and contain many problems and exercises. They should be accessible to students with a strong background in linear algebra and a basic knowledge of abstract algebra, and may be used for an undergraduate or introductory graduate course in representation theory.

P.S. In the latest version, misprints and errors were corrected and new exercises were added, in particular ones suggested by Darij Grinberg.

It is argued that zero should be considered a cardinal number but not an ordinal number. One should make a clear distinction between order types, which are labels for well-ordered sets, and ordinal numbers, which are labels for the elements in these sets.