Native Navigation in the Mobile Web World

Posted on October 6th, 2014

Much has been written on the subject of web-based mobile apps and how they
compare to native apps. There's often a lot of unfair prejudice against web
apps, but one criticism that consistently holds up concerns navigation.
Although there are hundreds of libraries that claim to implement
native-feeling navigation for the web, it's incredibly difficult to match user
expectations about interaction, responsiveness, gesture support, and other
default behaviours. To complicate matters, these expectations can vary
significantly from platform to platform and from version to version (iOS 6 vs
iOS 7).

The common tools for packaging web apps for mobile, such as Apache
Cordova/Adobe PhoneGap, don't offer much of a solution to this navigation
problem. Your application essentially consists of a full-screen web view that
displays your web content, leaving you responsible for all app navigation
controls.

37signals and the Hybrid Approach

There was a post in early 2014 by DHH from 37signals about their
Basecamp mobile app and how it combined web content with native navigation.
This is where things start moving into so-called "hybrid" apps: a mix of web
content and native controls.

Several people took the 37signals blog post as inspiration and looked for ways to
build tools that would automatically generate native navigation controls based
on website content. A few examples:

GoNative parses a website for navigation menus, generates a JSON
configuration file for them, and packages it up into an app that builds
native menus from the JSON config. Their Android and iOS app containers are
open source on GitHub, but the website parsing is offered as a service
through their own website.

Stacker for iOS uses URL parameters to update native controls in the
title bar. Websites are loaded in a web view, which intercepts URL
navigation to update the native controls. This allows the native controls to
be controlled through the HTML dynamically, and by intercepting the requests
before they are sent over the network there is no delay before the native
controls update. This makes the app feel more responsive.
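
As a rough illustration of the interception idea, the sketch below inspects a
URL before it would be requested and derives a title-bar state from its query
string. The parameter names (`nav_title`, `nav_button`) and the
`navStateFromUrl` helper are made up for this example and are not Stacker's
actual parameters. In a real app this logic would live in the native web view
code rather than in JavaScript.

```javascript
// Hypothetical sketch: derive native title-bar state from URL parameters.
// `nav_title` and `nav_button` are invented names, not Stacker's real ones.
function navStateFromUrl(href) {
  const url = new URL(href);
  const params = url.searchParams;

  return {
    // Fall back to the host name when the page does not ask for a title.
    title: params.get('nav_title') || url.hostname,
    // null means "show no extra button".
    button: params.get('nav_button'),
  };
}

const state = navStateFromUrl(
  'https://example.com/projects?nav_title=Projects&nav_button=add'
);
console.log(state.title, state.button);
```

Because everything needed is in the URL itself, the controls can be updated
before any network request is made.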

GoNative and Stacker are good solutions for taking web content from a server
and displaying it with native navigation controls, but many web apps are built
with JavaScript frameworks that run entirely client-side. These are cases where
Cordova/PhoneGap is often the technology of choice because it will package up
existing files and serve them locally on the mobile device.

Another disadvantage to these solutions is that some of the customization has
to happen in native code. Ideally we could control the native navigation
controls with the same ease and flexibility that we have with the HTML
<title> tag.

So how do we make native controls which are defined entirely by HTML? Could
these be defined in such a way that browsers can automatically generate native
navigation controls for apps?

Imagining an Ideal Solution

There is already one case where HTML can control an aspect of the native
interface, through the <title> tag. If you change the title in HTML via
JavaScript, the browser updates the window automatically with the new title.
One common feature of mobile navigation is a title bar displayed across the top
of the screen, and it makes sense for the content of that title to reflect the
title of the page in the web view.

Another existing convention found in web browsers is the back button. History
on the web works like a stack, with each new page or state adding to the top of
that stack, and the back button popping the most recent state off the top.
With the HTML5 History API, we have pushState() and replaceState() to
allow single-page applications to add to the history stack without page
reloads. Unfortunately, there is one key piece missing from the History API,
which is the ability to remove pages from the history stack programmatically.
It's somewhat understandable that we wouldn't want any website to be able to
start removing entries from the history stack, but for web apps there are
plenty of use cases.

Imagine you have an app with a landing page, a signup page, and a logged-in
dashboard. Someone launches the app for the first time and sees the landing
page, then clicks signup. At this point, hitting back should take them to the
landing page. After signing up, they are redirected to the dashboard. Hitting
back here shouldn't really do anything; it's definitely wrong to take them back
to signup when they're already logged in. If we used replaceState() when
redirecting, hitting back on the dashboard would still take them back to the
landing page, which might also be undesirable. Unfortunately, there's no
JavaScript API (or even native web view API in most cases) for clearing the
history stack.
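
The scenario can be modelled with a small in-memory stack. This is only a
sketch: `HistoryModel` is a hypothetical stand-in for the browser's session
history (it is not the real `window.history` object, and it ignores forward
navigation), and its `reset()` method is exactly the operation that the History
API does not provide.

```javascript
// Hypothetical model of a session history stack, not the real History API.
// Forward navigation is ignored to keep the example short.
class HistoryModel {
  constructor(initialPage) {
    this.entries = [initialPage];
  }

  // Like history.pushState(): add a new entry on top of the stack.
  pushState(page) {
    this.entries.push(page);
  }

  // Like history.replaceState(): swap the top entry in place.
  replaceState(page) {
    this.entries[this.entries.length - 1] = page;
  }

  // Like the back button: pop the top entry unless it is the only one.
  back() {
    if (this.entries.length > 1) {
      this.entries.pop();
    }
    return this.current;
  }

  // The missing piece: discard everything and start over from one page.
  reset(page) {
    this.entries = [page];
  }

  get current() {
    return this.entries[this.entries.length - 1];
  }
}

// With replaceState(), back from the dashboard lands on the landing page.
const replaced = new HistoryModel('landing');
replaced.pushState('signup');
replaced.replaceState('dashboard');
console.log(replaced.back()); // landing

// With a hypothetical reset(), back from the dashboard goes nowhere.
const cleared = new HistoryModel('landing');
cleared.pushState('signup');
cleared.reset('dashboard');
console.log(cleared.back()); // dashboard
```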

One part of the HTML5 spec that hasn't seen much adoption is the <menu>
element. In HTML3 this was just treated like a list, and it was
deprecated in HTML4, but it's been revived in HTML5 along with a <menuitem>
element to build native toolbar and context menus. The idea is that web apps
can define a menu with a unique ID containing items and submenus, and then
refer to that menu by ID using the contextmenu="" attribute. When someone
right-clicks or long-presses on the element, the browser should open a context
menu with the items and submenus from the HTML. At the time of writing (October
2014) only Firefox natively supports this, but there is ongoing work to
implement this in Chrome.
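
To make the markup concrete, the sketch below generates the kind of HTML
described above from a plain list of labels. The `contextMenuHtml` helper, the
element ID, and the labels are all made up for illustration. The output uses
the `<menu type="context">` and `<menuitem>` form that Firefox understands;
submenus and escaping of special characters are left out for brevity.

```javascript
// Hypothetical helper: build context-menu markup from a list of labels.
// Labels are assumed to be plain text; no HTML escaping is done here.
function contextMenuHtml(menuId, labels) {
  const items = labels
    .map((label) => `  <menuitem label="${label}"></menuitem>`)
    .join('\n');

  return [
    `<menu type="context" id="${menuId}">`,
    items,
    '</menu>',
    // The target element refers to the menu by its ID.
    `<div contextmenu="${menuId}">Right-click or long-press me</div>`,
  ].join('\n');
}

console.log(contextMenuHtml('photo-menu', ['Rotate', 'Share']));
```

Browsers without support simply ignore the unknown attribute and elements, so
the page still works with the default context menu.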

Custom context menus with the <menu> tag would give developers more ability
to affect native interactions via HTML, but they don't go quite far enough to
provide everything that would be needed for native navigation. In particular,
the actual navigation part is still missing. This is harder to solve, partly
because of the disparate navigation controls across platforms and devices, and
partly because there isn't a clear mapping to HTML.

Defining Navigation

<nav> elements

One possibility is to pull content from the <nav> element, but its contents
are not required to be structured in any consistent way, and it doesn't have a
tightly scoped use case. The HTML5 spec essentially says to use <nav>
anywhere that is "a section with navigation links". There are no restrictions
on the number of <nav> elements on the page. While it would be possible to
make some guesses based on the structure of the navigation content, it's an
ugly situation that should be avoided if possible.

Meta tags

Navigation and primary page actions could be specified by custom meta tags,
which is roughly the approach used by Internet Explorer for its pinned site
jump lists. In a single-page application, however, this isn't
an ideal solution: meta tags are intended to apply to the document as a whole,
and modifying them at runtime can be buggy. Most client-side frameworks are
also focussed on the page content in the body, and aren't set up to manage meta
tags.

JavaScript

Some mobile platforms have exposed JavaScript APIs for defining navigation
menus, such as WinJS.UI.Menu and blackberry.ui.menu. These work well
for their respective platforms, but there's no consistent API for
multi-platform applications (which web apps often are). It's also my belief
that the navigation should be defined as part of the page structure, rather
than manually constructed through JavaScript. This allows for fallbacks in
browsers that don't support generating native navigation elements.

Web Components

Web components allow the creation of new custom elements with registration
events and logic available through JavaScript. Defining a new <native-menu>
element would be easy, and using JavaScript it could be mapped onto existing APIs
like WinJS.UI.Menu or blackberry.ui.menu. Ultimately it has the same issues as
those JavaScript APIs in that the implementation would be platform-specific.
The only benefit is that the navigation would be defined in the HTML.

<menu type="toolbar"> elements

Finally we're left with the toolbar version of the <menu> tag. The HTML5
spec says that a menu of type toolbar should contain either <li> elements or flow content.
That gives us some degree of structure that can be parsed, but those <li>
tags could still contain anything. Although not explicitly stated in the spec,
the toolbar type of the menu tag is essentially intended to behave like a
<ul> tag for compatibility with HTML3.

We could informally make some assumptions about what would be considered a
valid menu. The easy path would be to require <li> elements that contain a
valid command type (as defined in the HTML5 spec, which is basically one of
<a>, <button>, or <input>). That would serve as a starting point, but
there's really no way to enforce these informal guidelines. There's also
nothing preventing multiple menus from being declared in the page, which would
lead to ambiguity when trying to parse and generate native components. It
might be reasonable to only build native navigation for a menu that is a direct
child of the body, but that's just adding more unofficial rules to the mix.
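
Those informal rules could be sketched roughly as follows. Because there is no
standard behaviour to follow, everything here is an assumption: the page is
represented as a simplified tree of `{ tag, attrs, children, text }` objects
rather than real DOM nodes, and `findToolbarCommands` is a hypothetical helper
that only accepts a single toolbar menu that is a direct child of the body.

```javascript
// Elements treated as commands, per the informal rule above.
const COMMAND_TAGS = ['a', 'button', 'input'];

// Hypothetical helper working on a simplified node tree, not the real DOM.
// Returns the commands of the toolbar menu, or null when the page has no
// unambiguous candidate.
function findToolbarCommands(body) {
  const menus = (body.children || []).filter(
    (node) => node.tag === 'menu' && node.attrs && node.attrs.type === 'toolbar'
  );

  // Zero or several top-level toolbar menus: refuse to guess.
  if (menus.length !== 1) {
    return null;
  }

  const commands = [];
  for (const item of menus[0].children || []) {
    if (item.tag !== 'li') continue;
    // Take the first command element found directly inside the <li>.
    const command = (item.children || []).find((child) =>
      COMMAND_TAGS.includes(child.tag)
    );
    if (command) {
      commands.push({ tag: command.tag, label: command.text || '' });
    }
  }
  return commands;
}

const page = {
  tag: 'body',
  children: [
    {
      tag: 'menu',
      attrs: { type: 'toolbar' },
      children: [
        { tag: 'li', children: [{ tag: 'a', text: 'Home' }] },
        { tag: 'li', children: [{ tag: 'button', text: 'Compose' }] },
        { tag: 'li', children: [{ tag: 'span', text: 'Not a command' }] },
      ],
    },
  ],
};

console.log(findToolbarCommands(page));
```

Returning null for ambiguous pages, instead of picking one menu arbitrarily,
keeps the native layer from showing controls the author never intended.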

The Path Forward

Unfortunately there's no single clear solution to all of this that can work in
browsers today, or that's even an official proposal for browsers tomorrow.
I've spoken informally to some browser developers about what sort of solution
they envision, but it's a hard problem to describe and the brief answer has
usually been to point to Web Components for custom elements.

Some attempts have been made to make do with the <menu> tag, despite the
open questions about how suitable it is. One of the Cordova developers
started a plugin in 2011 to act as a polyfill for native
menus, and I am responsible for a newer plugin along the same lines.
Ultimately these can't really be polyfills for native navigation if there's
no clear guideline on what should be polyfilled and how it should behave.

I would love to hear from browser developers, mobile OS developers, app
developers, and hybrid platform developers what they envision as an ideal
solution to this problem. The web improves through discussion.