By popular demand, we added a new feature called “weekly news digest” to Polytopix. Our ranking algorithm picks the top news articles (at most 10), along with their contextual explanatory articles, from the past week and emails them every Friday at midnight. I will post the details of our ranking algorithm in a future blog post.

I am inaugurating this year’s blogging with some very exciting news. I am starting a startup called Polytopix. I will be finishing my spring semester teaching responsibilities at Princeton and moving to the Bay Area this summer to work full-time on Polytopix.

Meanwhile, I am actively designing, coding and deploying new algorithms, hiring R&D engineers, working on legal aspects and handling many other related action items. For the first time, I am using a book (instead of Post-its) to keep track of my to-do list.

I can hear the clock ticking louder and faster than usual, perhaps because I am behind schedule. According to my original plan (enthusiastically devised during the final semester of my PhD), this blog post was supposed to appear in January 2013 !! A bunch of interesting (to say the least) events contributed generously to this delay.

Polytopix is aimed at adding context to news articles by analyzing the semantics of the news events. Polytopix’s algorithms retrieve news articles, categorize them, analyze their semantics and augment them with contextual explanatory articles.

Why Startup ?

The main idea of ‘semantic analysis of news’ has been on my mind since my undergraduate senior thesis defense in 1999. Back then, I developed a summarization engine using lexical cohesion and information retrieval algorithms. See my very first publication here. In the final semester of my PhD (Spring 2011), I started developing a semantic engine to understand and analyze news, especially financial news. I used it as a stock picker to invest my savings. It performed much better than my mutual funds. This was my first realization of the potential of ‘semantic analysis’. Polytopix applies semantic analysis to daily news articles. I am planning to “spin off” the ‘financial news analysis’ as a separate startup. More on this in a later blog post.

You have probably heard the advice “Do not do a PhD just for the sake of doing it”. The same advice applies (with much more emphasis) to starting a startup. You should only start a startup if you are really passionate about solving a particular problem and you are strongly convinced that starting a company is the best way to solve it. I have all the right reasons to start Polytopix.

How do I feel now ?

Well, I am feeling very excited and energetic. The roadmap looks very challenging.

In my last post, I said my next post will be about one of my papers on directed minors. But that paper is taking more time to write than I expected. Meanwhile, here is my review of two algorithms textbooks: Algorithms Unplugged and The Power of Algorithms, to appear in the SIGACT book review column soon.

Merry Christmas and Happy New Year to everybody !!

——————————————————————————————————–

Introduction

Algorithms play an integral part in our daily lives. They are everywhere. They help us travel efficiently, retrieve relevant information from huge data sets, secure money transactions, recommend movies, books and videos, predict the stock market, etc. Algorithms are essentially simple extensions of our daily rational thinking process. It is very tough to think of a daily task that does not benefit from efficient algorithms. Often the algorithms to solve our daily tasks are very simple, yet their impact is tremendous.

Most of the common books on algorithms start with sorting, searching and graph algorithms and conclude with NP-completeness and perhaps some approximation and online algorithms. The breadth of algorithms cannot be covered by a single book. Algorithms Unplugged and The Power of Algorithms take a different approach compared to standard algorithms textbooks. They are aimed at explaining several basic algorithms (in chapters written by multiple authors) in an intuitive manner with real-life examples, without compromising the details. This is how algorithms should be taught.

Algorithms Unplugged

This book is divided into four major parts. Each part has several chapters. Here is an overview of these parts and chapters.

Part I: The first part is about sorting and systematic search, i.e., finding things quickly. Chapter 1 introduces binary search, one of the most basic search strategies. A recursive implementation of binary search is explained using an intuitive example: finding a CD in a sorted sequence of CDs. Chapter 2 explains insertion sort, one of the most intuitive comparison-based sorting algorithms, and Chapter 3 explains merge sort and quick sort, two sorting algorithms based on the divide and conquer paradigm. Chapter 4 explains the bitonic sorting circuit, which implements a parallel sorting algorithm. Chapter 5 describes topological sorting and explains how to schedule jobs without violating any dependencies between jobs. Chapter 6 considers the string searching problem and explains the Boyer-Moore-Horspool algorithm. Chapter 7 considers the search problem in several real-world applications and explains the depth first search algorithm. Chapter 8 explains how to escape from a dark labyrinth using Pledge’s algorithm. Chapter 9 defines strongly-connected components in directed graphs and explains how to efficiently find directed cycles. Chapter 10 introduces the basic principles of search engines, introduces PageRank and explains how to find relevant pages in the World-Wide Web.
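To make the CD example of Chapter 1 concrete, here is a minimal sketch of recursive binary search in Python. The function name find_cd and the sample titles are my own placeholders, not taken from the book.

```python
def find_cd(shelf, title, lo=0, hi=None):
    """Recursive binary search over a sorted list of CD titles.

    Returns the index of `title` in `shelf`, or -1 if it is not there.
    `shelf` must already be sorted in ascending order.
    """
    if hi is None:
        hi = len(shelf) - 1
    if lo > hi:
        # Empty range: the title is not on the shelf.
        return -1
    mid = (lo + hi) // 2
    if shelf[mid] == title:
        return mid
    if shelf[mid] < title:
        # The title can only be in the right half.
        return find_cd(shelf, title, mid + 1, hi)
    # Otherwise it can only be in the left half.
    return find_cd(shelf, title, lo, mid - 1)


# Hypothetical shelf, sorted alphabetically.
shelf = ["Abbey Road", "Kind of Blue", "Nevermind", "Thriller"]
print(find_cd(shelf, "Nevermind"))
print(find_cd(shelf, "Unknown Album"))
```

Each call halves the range still to be searched, so a shelf of n CDs needs about log2(n) comparisons.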

Part II: The second part deals with arithmetic problems and with number theoretic, cryptographic, compression and coding algorithms. Chapter 11 presents Karatsuba’s method of multiplying long integers, which is much more efficient than the basic grade school method. Chapter 12 explains how to compute the greatest common divisor of two numbers using the centuries-old Euclidean algorithm. Chapter 13 explains the Sieve of Eratosthenes, a practical algorithm to compute a table of prime numbers. Chapter 14 introduces the basics of one-way functions, which play a crucial role in the following chapters. Chapter 15 presents the One-Time Pad, a basic symmetric cryptographic algorithm. Chapter 16 explains Public-Key Cryptography, an asymmetric cryptographic method using different keys for encryption and decryption. Chapter 17 explains how to share a secret in such a way that all participants must meet to decode the secret. Chapter 18 presents a method to play poker by email using cryptographic methods. Chapters 19 and 20 present fingerprinting and hashing techniques to compress large data sets so that they can be compared using only a few bits. Chapter 21 introduces the basics of coding algorithms to protect data against errors and loss.
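Two of the algorithms in this part fit in a few lines each. The following is a small Python sketch of the Euclidean algorithm (Chapter 12) and the Sieve of Eratosthenes (Chapter 13). The function names are mine and the code is only an illustration, not the presentation used in the book.

```python
def gcd(a, b):
    """Euclidean algorithm: greatest common divisor of two non-negative integers."""
    while b:
        # Replace (a, b) by (b, a mod b); the gcd does not change.
        a, b = b, a % b
    return a


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            # Cross out the multiples of p, starting at p*p.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
        p += 1
    return [i for i, flag in enumerate(is_prime) if flag]


print(gcd(48, 18))
print(primes_up_to(30))
```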

Part IV: The final part is about optimization problems. Chapters 32, 33 and 34 describe algorithms for shortest paths, minimum spanning trees and maximum flows, three basic optimization problems. Chapter 35 discusses the stable marriage problem and presents an algorithm to find a stable matching in a bipartite graph. Chapter 36 explains an algorithm to find the smallest enclosing circle of a given set of points. Chapter 37 presents online algorithms for the Ski-Rental and Paging problems. Chapters 38 and 39 discuss the Bin-Packing and the Knapsack problems. Chapter 40 discusses the Traveling Salesman Problem, one of the most important optimization problems, which has challenged mathematicians and computer scientists for decades. Chapter 41 introduces the Simulated Annealing method to solve a basic tiling problem and the Traveling Salesman Problem.
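As a small taste of the online algorithms of Chapter 37, here is a sketch of the classical break-even strategy for Ski-Rental, assuming renting costs 1 per day and buying costs buy_price. The function names and the cost model are my own simplification and are not taken from the book.

```python
def break_even_cost(days_skied, buy_price):
    """Cost of the break-even strategy: rent for buy_price - 1 days, then buy.

    Renting costs 1 per day. The strategy does not know `days_skied` in advance.
    """
    if days_skied < buy_price:
        # The season ended before we ever bought: we only paid rent.
        return days_skied
    # We rented for buy_price - 1 days and bought on day buy_price.
    return (buy_price - 1) + buy_price


def optimal_cost(days_skied, buy_price):
    """Cost of the best decision made in hindsight."""
    return min(days_skied, buy_price)


buy_price = 10
for days in (3, 10, 100):
    online = break_even_cost(days, buy_price)
    offline = optimal_cost(days, buy_price)
    print(days, online, offline, online / offline)
```

Whatever the length of the season, this strategy pays less than twice the hindsight optimum, which is the standard competitive-ratio argument for this problem.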

At the end of every chapter there are references for further reading. Readers are highly encouraged to go through these references to get a better understanding of the corresponding concepts.

The Power of Algorithms

This book is divided into two major parts. Each part has several chapters. Here is an overview of these parts and chapters.

Part I: The first part is divided into three chapters. Chapter 1 gives a historical perspective on algorithms, the origin of the word “algorithm”, recreational algorithms and reasoning with computers. Chapter 2 aims at explaining how to design algorithms by introducing the basics of graph theory and two algorithmic techniques: the backtracking technique and the greedy technique. Chapter 3 quickly introduces the complexity classes P and NP and the million dollar P vs NP problem.

Part II: The second part is aimed at explaining several algorithms of daily life. Chapter 4 explains Shortest Path problems, one of the basic families of optimization problems. Chapter 5 discusses the basics of the Internet and Web Graphs and explains several related algorithms to Crawl, Index and Search the Internet. Chapter 6 discusses the basics of cryptographic algorithms such as RSA and digital signatures. Chapter 7 discusses biological algorithms. Chapter 8 explains network algorithms with transmission delays. Chapter 9 discusses algorithms for auctions and games. It presents the Prisoner’s dilemma, Coordination games, Randomized strategies, Zero-sum games, Nash’s Theorem, Sperner’s Lemma, Vickrey-Clarke-Groves auctions and competitive equilibria. Chapter 10 explains the power of randomness and its role in complexity theory.

At the end of every chapter there are Bibliographic Notes with several pointers for further reading.

Opinion

Overall I found these two books very interesting and well-written. There is a nice balance between informal introductions and formal algorithms. It was a joy for me to read these books and I recommend them to anyone (including beginners) who is curious to learn, in a rigorous way, some of the basic algorithms that we use in our daily lives. I strongly encourage you to read these two books even if you have already read a bunch of algorithms books.

There is no prerequisite to follow these books, except for a reasonable mathematical maturity and perhaps some familiarity with basic constructs of at least one programming language. They can be used as a self-study text by undergraduate and advanced high-school students. In terms of being used in a course, some of the topics in these books can be used in an undergraduate algorithms course. I would definitely suggest that you get them for yourself or your university/department library.

This summer, I went to a local bookstore to check out the vocabulary section. There are several expensive books with a limited number of practice tests. I also noticed a box of paper flashcards (with only 300 words) for around $25 !!! After doing some more research, I realized that the existing solutions (to learn English vocabulary) are too hard to use, too expensive, old-fashioned, or some combination of the three.

So I started building an app with ‘adaptiveness’ and ‘usability’ as primary goals. The result is the Vocabulary App (for iPhone and iPad). Here is a short description of my app.

The Vocabulary app uses a sophisticated algorithm (based on spaced repetition and the Leitner system) to design adaptive multiple-choice vocabulary questions. It is built on a hypergraph of words constructed using lexical cohesion.

Learning tasks are divided into small sets of multiple-choice tests designed to help you master basic words before moving on to advanced words. Words that you have the hardest time with are selected more frequently. For a fixed word, the correct and wrong answers are selected adaptively, giving rise to hundreds of combinations. After each wrong answer, you receive detailed feedback with the meaning and usage of the underlying word.
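For readers curious about the Leitner idea, here is a toy Python sketch of how hard words can be made to come up more often. It is not the app's actual algorithm: the class LeitnerDeck, the number of boxes and the weighting scheme are all hypothetical, and the hypergraph and adaptive answer selection are left out entirely.

```python
import random


class LeitnerDeck:
    """Toy Leitner-style scheduler (illustrative only).

    Every word sits in a box. A correct answer moves it one box up,
    a wrong answer sends it back to box 0. Words in lower boxes get
    a larger weight, so they are asked more frequently.
    """

    def __init__(self, words, num_boxes=3):
        self.num_boxes = num_boxes
        self.box = {word: 0 for word in words}

    def record(self, word, correct):
        if correct:
            self.box[word] = min(self.box[word] + 1, self.num_boxes - 1)
        else:
            self.box[word] = 0

    def weight(self, word):
        # Assumed weighting: each box is half as likely as the one below it.
        return 2 ** (self.num_boxes - 1 - self.box[word])

    def next_word(self, rng=random):
        words = list(self.box)
        weights = [self.weight(w) for w in words]
        return rng.choices(words, weights=weights, k=1)[0]


deck = LeitnerDeck(["abate", "laconic", "obdurate"])
deck.record("abate", correct=True)     # "abate" moves up one box
deck.record("laconic", correct=False)  # "laconic" stays in box 0
print(deck.box)
print(deck.next_word())
```

A real spaced-repetition scheduler would usually work with review intervals in days rather than sampling weights; the weights here are just the simplest way to show the effect.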

Works best when used every day. Take a test whenever you have free time.

At any given waking moment I spend my time either (1) math monkeying around (or) (2) code monkeying around. During math monkeying phase, I work on math open problems (currently related to directed minors). During code monkeying phase, I work on developing apps (currently Algorithms App, Vocabulary App) or adding new features to my websites TrueShelf or Polytopix. I try to maintain a balance between (1) and (2), subject to the nearest available equipment (a laptop or pen-and-paper). My next post will be on one of my papers (on directed minors) that is nearing completion. Stay tuned.

Thomas [Tho’90] proved that every undirected graph admits a linked tree decomposition of width equal to its treewidth. This theorem is a key technical tool for proving that every set of bounded treewidth graphs is well-quasi-ordered. An analogous theorem for branch-width was proved by Geelen, Gerards and Whittle [GGW’02]. They used this result to prove that all matroids representable over a fixed finite field and with bounded branch-width are well-quasi-ordered under minors. Kim and Seymour [KS’12] proved that every semi-complete digraph admits a linked directed path decomposition of width equal to its directed pathwidth. They used this result to show that all semi-complete digraphs are well-quasi-ordered under “strong” minors.

In this paper, we generalize Thomas’s theorem to all digraphs.

Theorem : Every digraph G admits a linked directed path decomposition and a linked DAG decomposition of width equal to its directed pathwidth and DAG-width respectively.

The above theorem is crucial to prove well-quasi-ordering of some interesting classes of digraphs. I will release Directed Minors IV soon. Stay tuned !!

We prove that the directed treewidth, DAG-width and Kelly-width of a digraph are bounded above by its circumference plus one. This generalizes a theorem of Birmelé stating that the treewidth of an undirected graph is at most its circumference.

Theorem : Let G be a digraph of circumference l. Then the directed treewidth, DAG-width and Kelly-width of G are at most l + 1.

The above theorem can be seen as a mini mini mini directed grid minor theorem. I will be using this theorem in future papers to make progress towards a directed grid minor theorem. Stay tuned !!

Wish you all a Very Happy New Year. Here is a list of my 10 favorite open problems for 2014. They belong to several research areas inside discrete mathematics and theoretical computer science. Some of them are baby steps towards resolving much bigger open problems. May this new year shed new light on these open problems.

2. Optimization : Improve the approximation factor for the undirected graphic TSP. The best known bound is 7/5 by Sebő and Vygen.

3. Algorithms : Prove that the tree-width of a planar graph can be computed in polynomial time (or) is NP-complete.

4. Fixed-parameter tractability : Treewidth and Pathwidth are known to be fixed-parameter tractable. Are directed treewidth/DAG-width/Kelly-width (generalizations of treewidth) and directed pathwidth (a generalization of pathwidth) fixed-parameter tractable ? This is a very important problem to understand the algorithmic and structural differences between undirected and directed width parameters.

5. Space complexity : Is Planar ST-connectivity in logspace ? This is perhaps the most natural special case of the NL vs L problem. Planar ST-connectivity is known to be in UL ∩ co-UL. Recently, Imai, Nakagawa, Pavan, Vinodchandran and Watanabe proved that it can be solved simultaneously in polynomial time and approximately O(√n) space.

6. Metric embedding : Is the minor-free embedding conjecture true for partial 3-trees (graphs of treewidth 3) ? The minor-free embedding conjecture states that “every minor-free graph can be embedded in ℓ1 with constant distortion”. The special case of planar graphs also seems very difficult. I think the special case of partial 3-trees is a very interesting baby step.

7. Structural graph theory : Characterize pfaffians of tree-width at most 3 (i.e., partial 3-trees). It is a long-standing open problem to give a nice characterization of pfaffians and design a polynomial time algorithm to decide if an input graph is a pfaffian. The special case of partial 3-trees is an interesting baby step.

8. Structural graph theory : Prove that every minimal brick has at least four vertices of degree three. Bricks and braces are defined to better understand pfaffians. The characterization of pfaffian braces is known (more generally, the characterization of bipartite pfaffians is known). To understand pfaffians, it is important to understand the structure of bricks. Norine and Thomas proved that every minimal brick has at least three vertices of degree three and conjectured that every minimal brick has at least cn vertices of degree three.

9. Communication Complexity : Improve bounds for the log-rank conjecture. The best known bound is

10. Approximation algorithms : Improve the approximation factor for the uniform sparsest cut problem. The best known factor is .

Here are my conjectures for 2014🙂

Weak Conjecture : at least one of the above 10 problems will be resolved in 2014.

Conjecture : at least five of the above 10 problems will be resolved in 2014.

Strong Conjecture : All of the above 10 problems will be resolved in 2014.