We are building a data import feature in which our users will open an arbitrary CSV file and define a one-to-one mapping of its fields to a list of known fields in our database. The list of fields is very long (around 30 known fields and an arbitrary number in the CSV [though likely between 20 and 50 in the CSV).

A naive implementation would be simply laying out all of the discovered fields in a vertical list paired with combo boxes for selecting the known-field mapping.

[discovered 1] [combo]
[discovered 2] [combo]
etc

Or horizontally,

[discovered 1] [discovered 2] [etc...]
[combo] [combo] [combo]

The issues here are:

Long list makes it impossible to view the entire mapping without scrolling, for example to check if a known field has already been mapped.

Preventing multiple selection of known field or allowing an assignment to be changed.

Possibly a better approach would be similar to the above, except with draggable boxes for the known fields that can be positioned adjacent to the discovered field.

Another issue to consider is CSV files without header rows: how to identify each discovered field?

Are there existing examples out there of a good solution to this interaction?

5 Answers
5

A lot hinges on what "The list of fields is very long" actually means. Can you clarify your question by giving an actual number? both for number of fields in the CSV and number of fields in the database. Also are the field names relatively short or are some descriptive long names?

Grids can display more information easily than lists. Have a grid for the fields found from the CSV and a grid for the fields in the database, both shown at the same time. Drag and drop to match them up. There is then visual feedback on what has been used and what hasn't been used, from both points of view, and there's a vary natural way to make changes.

Large Grids

To have more than about 200 fields in the database you will need ways to hide some of the fields and only show fields that are of interest. A tab control with 8 tabs, based on alphabetic sorting, is one straightforward way to fit more on.

Another issue to consider is CSV files without header rows: how to identify each discovered field?

To be nicer - helping the user to actually identify the field - there should be an info-box on the screen that has more information for one field, for example 'Field 1: Appears to be a date; values between 9/Jul/1066 and 01/Apr/2035; 17 blank entries; 2 entries not in a valid date format; View'. This can show information for the first unmapped field, or the information for a field that has just been clicked on or dragged over.

This will help whether the user chooses to drag from discovered CSV field to database field or from database field to discovered CSV field.

Auto-Match

Something to consider, you asked for optimal interface, in the case where you do have field headers for the discovered field. If your software is able to make an educated guess about what field from the database matches based on name and type, that can be shown as 'suggested match' in the info-box. Checkbox to confirm. You could also have a screen that shows all suggestions and allows them to be reviewed/accepted in bulk.

Thanks for the ideas! You and @AlexFeinman proposed similar ideas of providing the known fields in some kind of grouping (alpha, category) which is much better than a long list. I love the 'discovered field info box' and auto-match ideas.
–
Aidan RyanJun 14 '11 at 16:10

As for the "what is a large number" question, around 30 known fields, and really an arbitrary number in the CSV, though probably between 20 and 50 (updated the main question as well).
–
Aidan RyanJun 14 '11 at 16:13

If you have a large set of known fields (more than two dozen), consider grouping them or arranging them hierarchically, so the user can see them all or at least sort through them all and understand the available set of fields.

Present them in a 'parking lot'--a sidebar or area of the screen. Then, present the discovered fields horizontally or vertically (depends on screen space, width of fields, number of fields).

Then, let the user specify known fields for discovered ones some way like one of these:

Drag and drop from parking lot to discovered field

Typing, with the software helping by guessing (like the tags feature here at SE)

Combo boxes for each discovered field, with the known fields in there in the same order as in the parking lot

You can highlight or grey out the known fields which are already used as the user selects them.

If the user still wants to drag a field or type it, you can either silently remove it from the previous location, or give an animation, or even a pop-up saying "are you sure you want to take this away from discovered field XXX?"

Show DB columns already assigned (*---*) on the bottom, grayed out and with an indicator icon. Allow choosing them, and clear the old mapping immediately when a new one is selected.

If the list is likely to be really long, consider multiple columns in the dropdown. Columns also help divide available vs assigned items (and avoid accidental clicks on assigned ones).

Auto-complete filters the list (so scrolling is not required).

Button to clear selection quickly next to the box (x)

Trying to automatically match up the items (best-guess) would be remarkably helpful.

Note: I'm strongly biased against drag-and-drop, especially if scrolling is involved. A combo box can be improved incrementally, whereas a fancier control would required more work for basic functionality. It is also a highly familiar/expected UI element.

Also consider rotating that grid view by 90°, such that the list of imported fields progress down the page - vertical scrolling works better for the case of there being many fields.
–
EricsOct 27 '11 at 23:01

On identifying each discovered field, a design I've seen is to provide a drop-down menu of the first 5 unique non-blank values for each of the imported fields (with the first value on the drop-down being "imported field N").

Another approach is to provide a way for the user to thumb through the imported records.

I don't think the combo boxes are ideal. With complete disregard to development cost, I'd suggest a little game:

Put them in two columns. Clicking on an item (left or right) draws an arrow that follow the mouse pointer until you click on an item from the other colum (match) or somewhere else (mismatch)

Move successful matches to the top, so they are "out of the way", but still allow to select and re-route them.

Provide keyboard nav (Cursor up / down / left / right to select, Enter or Space to "click", you need to indicate the "selected" state).

Provide visual (and maybe even auditory) cues. Treat it like a childrens game.

If space is not enough, scrolling should be mostly fine. If not, you can vertically split the screen, and arrange each set in a grid on each side:

Rationale

The combo box does not exploit image memory. "Charles was somewhere over there". This would be a big help for a large part of users. It may be a degradation for people who lack that trait, but that' sufficiently low and probably not even that much worse than the combobox solution.

The combo box - in trivial execution - always lists all items, successfully matched items are not removed. That would already be an improvement, but then you need a possibility to remove matches.