Category: search

I’m entering the final stages of building a figure search engine, a nice wrapper for the new API method discussed below. It’s also a chance to properly release data mined directly from arXiv figures and to take advantage of the Lambda + S3 processing pipeline I developed when first pushing the p2t algorithms to the cloud. Attached is… Continue Reading: figure meta data

We’re very grateful to Dr Piatetsky-Shapiro for the chance to publish an item in KDnuggets; check it out here. While putting some examples together for the article, I think I finally landed on a useful workflow and schema for the figure data search engine. I’m hoping to get that out ASAP; stay tuned. Continue Reading: kdnuggets

I’ve uploaded about 500k of the roughly 1M figures extracted from arXiv to AWS S3 for storage. As mentioned, each figure is represented as a Gaussian mixture model (stored as CSV) and an image showing the locations (indices) of the model components. Below is a quick summary of the dataset. The largest three subject areas… Continue Reading: Figure Search Engine III

I previously described modeling figure pixels with Gaussian mixtures. A few months ago, I applied the same procedure to over 100k PDF documents on arXiv, which yielded close to 1M figures. The output of the process, for each figure, is a CSV spreadsheet of model parameters and an image showing the location of… Continue Reading: Figure Search Engine II
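
For readers wondering what a per-figure CSV of mixture parameters might look like in practice, here is a rough sketch. The column names (`weight`, `mean_x`, `mean_y`, `var_x`, `var_y`), the sample values, and the diagonal-covariance assumption are all made up for illustration; the actual schema of the dataset may well differ. It only shows how such a file could be loaded and the mixture evaluated at a pixel location.

```python
import csv
import io
import math

# Hypothetical per-figure CSV: one row per mixture component.
# Column names and values are assumptions, not the dataset's real schema.
SAMPLE_CSV = """weight,mean_x,mean_y,var_x,var_y
0.6,10.0,20.0,4.0,9.0
0.4,50.0,60.0,16.0,25.0
"""


def load_components(text):
    """Parse component rows into a list of dicts with float values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: float(value) for key, value in row.items()} for row in reader]


def mixture_density(components, x, y):
    """Evaluate the mixture density at (x, y), assuming diagonal covariances."""
    total = 0.0
    for c in components:
        # Normalizing constant of a 2D Gaussian with diagonal covariance.
        norm = 1.0 / (2.0 * math.pi * math.sqrt(c["var_x"] * c["var_y"]))
        expo = -0.5 * (
            (x - c["mean_x"]) ** 2 / c["var_x"]
            + (y - c["mean_y"]) ** 2 / c["var_y"]
        )
        total += c["weight"] * norm * math.exp(expo)
    return total


components = load_components(SAMPLE_CSV)
```

Calling `mixture_density(components, 10.0, 20.0)` evaluates the model at the first component's center, where the density is higher than at a point far from both components.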