Strategies for Internet citizens

Tagging mechanisms and strategies part 3: Taxonomy and folksonomy

Should a tag namespace be a top-down taxonomy or a bottom-up folksonomy? My answer is: both. In recent months, as I've curated calendar hubs for selected cities, I've been working toward an approach that harmonizes the two styles.

Principle: Top-down and bottom-up

In the elmcity context, the most important taggable object is the calendar feed. That’s because when you can characterize a whole feed with a tag, all the events in that feed inherit the tag. The primary sources of taggable feeds are Eventful, Upcoming, Facebook, Meetup, and EventBrite. I call them taggable because, while some of these services tag individual events, none tag feeds based on venues (Eventful, Upcoming) or Pages (Facebook) or groups (Meetup) or organizers (EventBrite). Assigning feed-level tags is an editorial exercise for the curator.
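Feed-level inheritance can be pictured with a minimal sketch. The `Feed` and `Event` classes, the field names, and the placeholder URL are illustrative assumptions, not the elmcity service's actual code.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    title: str
    tags: set[str] = field(default_factory=set)


@dataclass
class Feed:
    url: str
    tags: set[str]                      # assigned by the curator
    events: list[Event] = field(default_factory=list)


def inherit_feed_tags(feed: Feed) -> None:
    """Give every event in the feed the feed's curator-assigned tags."""
    for event in feed.events:
        event.tags |= feed.tags


feed = Feed(
    url="https://example.org/venue.ics",   # placeholder URL (assumption)
    tags={"music", "jazz"},
    events=[Event("Quartet night"), Event("Open jam", {"community"})],
)
inherit_feed_tags(feed)
print(sorted(feed.events[1].tags))  # ['community', 'jazz', 'music']
```

Tagging the feed once is what makes the editorial work scale: the curator never has to touch individual events.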

To these sources I add as many standalone iCalendar feeds as I can find. For Boston and Seattle, the results add up to lists of over 600 tagged iCalendar feeds. Here’s a table of the current list of tags for Boston, the current list for Seattle, and the intersection of the two lists.

Boston

adoption 1
african 1
animals 28
arabic 1
architecture 38
art 222
asian 30
astronomy 13
baseball 94
basketball 78
berklee 285
boating 7
books 160
boston 7
boston.com 1020
boston.gov 395
boston-latin-hs 109
bpl 153
bu 101
business 191
cathedral-hs 15
children 73
church 469
climbing 23
comedy 174
comics 1
community 883
conferences 96
confernces 1
conflict-resolution 7
cycling 21
dance 156
dining 4
diving 18
dorchester 4
east-boston-hs 17
education 365
english 7
environment 12
european 4
eventbrite 452
eventful 2692
facebook 436
family 79
fashion 4
film 194
finance 2
fitness 2
food 115
football 1
french 3
games 33
german 3
government 42
green-technology 2
harvard 11
health 206
highschool 142
hiking 44
hispanic 1
history 156
hockey 35
indian 4
irish 1
islamic 15
italian 13
japanese 50
jazz 70
language 83
law 5
lectures 304
lgbt 1
library 378
martial-arts 47
massart 10
meditation 10
meetup 975
mensa 46
museum 94
music 2023
nature 91
networking 105
northeastern 167
performing-arts 222
philanthropy 1
philosophy 6
photography 53
poetry 10
politics 153
polyamory 10
portuguese 8
pub-crawl 4
recreation 201
running 59
sailing 7
science 151
seminars 10
simmons 15
social-justice 89
softball 1
south-boston-hs 1
spanish 1
spirituality 106
sports 590
statistics 2
suffolk 2
support 46
surfing 2
swimming 29
synagogue 5
technology 320
theater 90
tourism 43
tours 90
travel 3
umass 9
university 599
upcoming 212
visual-arts 72
volunteer 46
women 25
writing 17
ymca 4
yoga 27

Seattle

africa 4
animals 26
aquarium 13
art 563
arts-and-crafts 14
ballet 4
basketball 31
beer 1
boating 9
books 201
business 62
business-and-technology 31
charity-and-volunteer 10
children 302
chinese 28
church 399
circus 23
cleveland-high 9
climbing 16
coffee 4
comedy 47
comics 1
community 574
conferences 65
cooking 3
dance 151
diving 8
dogs 3
education 103
environment 123
eventbrite 139
eventful 1996
facebook 216
fairs-and-festivals 13
film 136
finance 22
fitness 243
food 40
food-and-dining 26
games 114
garfield-high 12
german 13
government 145
gradeschool 17
green-technology 1
health 192
highschool 35
hiking 74
history 4
ingraham-high 1
insurance 1
italian 3
japanese 17
jazz 46
knitting 35
language 153
latin-american 30
lectures 101
lgbt 21
library 190
martial-arts 1
meetup 1107
museum 77
music 1223
native-american 20
nature 32
networking 54
nonprofit 4
nscc 1
opera 2
pacific-science-center 609
performing-arts 337
philosophy 2
photography 12
police 11
politics 31
real-estate 1
recreation 195
roosevelt-high 4
running 108
science 174
sculpture 1
seattle.gov 449
seattlepi 347
seattleu 12
seminars 23
skiing 2
spanish 19
spirituality 29
sports 151
storytelling 1
sustainability 5
swedish 73
synagogue 4
technology 98
teens 94
theater 166
tourism 11
town-hall-seattle 54
transportation 87
travel 9
trumba 296
university 390
upcoming 66
uw 366
vegan 4
visual-arts 89
volunteer 48
walk-bike-ride 3
walking 41
wallingford 159
wine 9
witches 13
women 26
writing 42
yoga 21
youth 105

Common Tags

animals
art
basketball
boating
books
business
children
church
climbing
comedy
comics
community
conferences
dance
diving
education
environment
eventbrite
eventful
facebook
film
finance
fitness
food
games
german
government
green-technology
health
highschool
hiking
history
italian
japanese
jazz
language
lectures
lgbt
library
martial-arts
meetup
museum
music
nature
networking
performing-arts
philosophy
photography
politics
recreation
running
science
seminars
spanish
spirituality
sports
synagogue
technology
theater
tourism
travel
university
upcoming
visual-arts
volunteer
women
writing
yoga
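The common list is simply the set intersection of the two hub vocabularies. Here is a small sketch using a handful of entries excerpted from the Boston and Seattle lists; the function name `common_tags` is my own, chosen for illustration.

```python
# A few tag counts excerpted from the two lists above.
boston = {"adoption": 1, "animals": 28, "art": 222, "berklee": 285, "comics": 1}
seattle = {"africa": 4, "animals": 26, "aquarium": 13, "art": 563, "comics": 1}


def common_tags(*hubs):
    """Return the sorted tags present in every hub's vocabulary.

    Expects at least one hub; each hub is a dict of tag -> count.
    """
    tag_sets = [set(hub) for hub in hubs]
    return sorted(set.intersection(*tag_sets))


print(common_tags(boston, seattle))  # ['animals', 'art', 'comics']
```

Because the function accepts any number of hubs, the same call would work for finding the core shared by three or more cities.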

Among the dynamics in play here is the interplay between general and specific tags. For a general tag like university there are city-specific instantiations: bu and northeastern for Boston; uw, seattleu, and nscc for Seattle. Likewise, for the general tag highschool there are specific tags like boston-latin-hs and cathedral-hs for Boston, and garfield-high and ingraham-high for Seattle.

These city-specific tags are top-down in the sense that I, as curator of the hub, have assigned them and made them part of the hub’s core tag vocabulary. But they are also bottom-up in the sense that they represent discoverable sources that are providing enough event flow to warrant such treatment.

These core hub vocabularies are fluid. As I move from hub to hub I’ve been keeping an eye on the common core and refactoring all the hub vocabularies as I go along. I also use these evolving hub vocabularies as templates against which to match vocabularies from other sources.

Mechanism: Tag matching

Some of the source services, notably Eventful and EventBrite, include per-event tags. When one of these tags matches a tag in the (evolving) core vocabulary for that hub, the elmcity service adds it to the list of tags the event inherited from its feed.
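In outline, the matching step might look like the sketch below. The function name is hypothetical, and the lowercasing of service tags is an assumption on my part rather than documented elmcity behavior.

```python
def match_event_tags(inherited, service_tags, core_vocabulary):
    """Add service-supplied tags that match the hub's core vocabulary.

    inherited       -- tags the event already got from its feed
    service_tags    -- per-event tags from the source service
    core_vocabulary -- set of managed tags for this hub
    """
    # Assumption: tags are compared case-insensitively.
    matched = {tag.lower() for tag in service_tags} & core_vocabulary
    return set(inherited) | matched


core = {"music", "jazz", "film", "technology"}
tags = match_event_tags({"music"}, ["Jazz", "live", "technology"], core)
print(sorted(tags))  # ['jazz', 'music', 'technology']
```

Service tags with no counterpart in the core vocabulary ("live" here) are simply dropped.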

There are also tables for each foreign service that map tags used there to tags in the hub’s core vocabulary. So, for example, the Eventful tag movies_film and the EventBrite tag movies both map to the core tag film.
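A per-service mapping table can be as simple as a dictionary. The two entries below come from the example just given; the structure and the names `SERVICE_TAG_MAPS` and `to_core_tags` are illustrative assumptions.

```python
# Service tag -> core tag, one table per source service.
SERVICE_TAG_MAPS = {
    "eventful": {"movies_film": "film"},
    "eventbrite": {"movies": "film"},
}


def to_core_tags(service, service_tags, core_vocabulary):
    """Translate a service's tags into core tags, dropping non-matches."""
    mapping = SERVICE_TAG_MAPS.get(service, {})
    result = set()
    for tag in service_tags:
        core_tag = mapping.get(tag, tag)   # fall back to the tag itself
        if core_tag in core_vocabulary:
            result.add(core_tag)
    return result


core = {"film", "music"}
print(sorted(to_core_tags("eventful", ["movies_film", "music"], core)))
# ['film', 'music']
```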

As we saw in Portable tags, some iCalendar feeds use the CATEGORIES property of the iCalendar format to express per-event tags. Managing these tags is trickier because, well, they’re unmanaged. Until recently I was suppressing them. Now I’m experimentally allowing them to appear, but segregating them from the core vocabulary. If you check the tags for Boston or Seattle or another city you’ll see that the list divides into two sections. The first presents managed tags: the core vocabulary. The second presents unmanaged tags from iCalendar feeds, enclosed in squiggly brackets to differentiate them from the core vocabulary.
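To make the two-section listing concrete, here is a rough sketch that pulls CATEGORIES values out of raw iCalendar text and renders them in braces after the managed tags. The parser is deliberately simplified: it ignores line folding and backslash-escaped commas, both of which the iCalendar format allows. The function names, the sample events, and the lowercasing are assumptions for illustration.

```python
from collections import Counter

SAMPLE = """BEGIN:VEVENT
SUMMARY:Story hour
CATEGORIES:Central Library,Public Event
END:VEVENT
BEGIN:VEVENT
SUMMARY:Council hearing
CATEGORIES;LANGUAGE=en:City Council
END:VEVENT"""


def categories_from_ical(ical_text):
    """Collect CATEGORIES values from raw iCalendar text (simplified)."""
    tags = []
    for line in ical_text.splitlines():
        name, sep, value = line.partition(":")
        # The property name may carry parameters: CATEGORIES;LANGUAGE=en
        if sep and name.split(";")[0].upper() == "CATEGORIES":
            tags.extend(t.strip().lower() for t in value.split(",") if t.strip())
    return tags


def tag_listing(core_counts, unmanaged_counts):
    """Managed tags first, then unmanaged tags wrapped in braces."""
    lines = [f"{tag} {n}" for tag, n in sorted(core_counts.items())]
    lines += [f"{{{tag}}} {n}" for tag, n in sorted(unmanaged_counts.items())]
    return lines


unmanaged = Counter(categories_from_ical(SAMPLE))
print(tag_listing({"library": 378}, unmanaged))
# ['library 378', '{central library} 1', '{city council} 1', '{public event} 1']
```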

Here’s the current set of unmanaged tags for Boston and Seattle:

Boston

{academics} 10
{adams street} 4
{air pollution control commission hearings} 1
{alumni relations} 6
{athletics} 6
{bikes} 1
{blc} 3
{boston home center} 2
{boston main streets} 1
{boston public library} 3
{brighton} 2
{central library} 84
{charlestown} 2
{city clerk} 15
{city council} 8
{college of arts & sciences} 10
{college of business administration} 1
{college of computer & information science} 13
{college of engineering} 13
{commercial} 1
{connolly} 12
{dnd} 1
{dudley literacy center} 11
{dudley} 19
{east boston} 9
{egleston square} 3
{elderly commission} 1
{election} 1
{faneuil} 2
{fields corner} 11
{group exercise} 4
{grove hall} 11
{honan- allston} 11
{hyde park} 10
{jamaica plain} 7
{licensing} 1
{lower mills} 5
{massart events} 1
{mattapan} 15
{north end} 11
{ongoing} 33
{orient heights} 5
{other} 90
{parker hill} 10
{performing/visual arts} 60
{president} 1
{public event} 139
{public health commission} 1
{roslindale} 8
{social} 6
{south boston} 18
{south end} 6
{student affairs} 2
{student development} 9
{uphams corner} 2
{washington village} 4
{west end} 13
{west roxbury} 4

Seattle

{animal shelter} 23
{athletics/varsity sports/men} 3
{athletics/varsity sports/women} 4
{athletics} 6
{boards & commissions} 32
{bothell} 29
{built environments} 3
{career management} 1
{city council} 88
{community centers} 29
{community outreach} 10
{community technology} 18
{concerts} 17
{continuing education} 20
{diversity} 2
{eastside } 58
{emergency} 12
{engineering} 13
{environmental learning} 3
{exhibits} 97
{farther afield} 10
{forums} 8
{global health} 1
{health sciences} 18
{hearing examiner} 12
{hr-benefits} 1
{jackson school of international studies} 6
{libraries} 1
{meetings} 5
{north sound} 19
{office of the mayor} 2
{other} 1
{panel discussions} 2
{parks} 2
{performing/visual arts} 29
{psychology} 4
{ptsa} 6
{public outreach and engagement} 68
{public} 23
{readings} 1
{research} 1
{sales} 1
{school of art} 92
{school of business} 22
{schoolof art} 1
{seattle area} 188
{seattle fire department} 2
{seattle youth commission} 11
{south sound} 21
{special events} 16
{sports/spirit} 1
{student activities} 6
{tacoma} 3
{technical communication} 8
{the center for wooden boats – south lake union} 9
{tours} 27
{training} 13
{urbanization} 1
{vst} 1
{walk bike ride} 3
{workshops} 6

When one of these tags matches a tag in a hub’s core vocabulary I promote it — that is, I treat it as part of the managed core and it no longer shows up in squigglies. That’s a top-down approach. But there’s a complementary bottom-up approach. As I scan the unmanaged tags, both within and across hubs, it can become clear that an unmanaged tag belongs in the managed core. To accomplish that I simply use the unmanaged tag somewhere in the managed core. From then on, occurrences of the unmanaged tag are promoted into the core.
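Both directions of promotion fall out of one rule: an iCalendar tag is managed if and only if it currently appears in the hub's core vocabulary. A minimal sketch, with a hypothetical function name and a toy vocabulary:

```python
def partition_ical_tags(ical_tags, core_vocabulary):
    """Split iCalendar tags into (promoted, unmanaged) sets."""
    promoted = {tag for tag in ical_tags if tag in core_vocabulary}
    unmanaged = set(ical_tags) - promoted
    return promoted, unmanaged


core = {"library", "music"}            # toy core vocabulary (assumption)
feed_tags = ["tours", "library", "other"]

promoted, unmanaged = partition_ical_tags(feed_tags, core)
print(sorted(promoted), sorted(unmanaged))  # ['library'] ['other', 'tours']

# Bottom-up: the curator starts using "tours" somewhere in the managed core...
core.add("tours")
promoted, unmanaged = partition_ical_tags(feed_tags, core)
print(sorted(promoted), sorted(unmanaged))  # ['library', 'tours'] ['other']
```

Nothing about the feed changes in the second call; only the core vocabulary grew, and the tag followed.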

A logical next step is to enable curators to edit per-hub maps so that, for example, Boston's {central library} and Seattle's {libraries} will be promoted to simply library. I haven't built this mapping feature yet but it's on the todo list.
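Since the feature doesn't exist yet, the following is purely a sketch of one shape it could take. The map contents follow the tag lists above, where {central library} appears under Boston and {libraries} under Seattle; the names `HUB_TAG_MAPS` and `promote_with_map` are hypothetical.

```python
# Hypothetical curator-edited maps: unmanaged tag -> core tag, per hub.
HUB_TAG_MAPS = {
    "boston": {"central library": "library"},
    "seattle": {"libraries": "library"},
}


def promote_with_map(hub, tag, core_vocabulary):
    """Return the core tag for an iCalendar tag, or the tag in braces."""
    mapped = HUB_TAG_MAPS.get(hub, {}).get(tag, tag)
    if mapped in core_vocabulary:
        return mapped
    return "{" + tag + "}"               # still unmanaged


core = {"library"}
print(promote_with_map("boston", "central library", core))  # library
print(promote_with_map("seattle", "libraries", core))       # library
print(promote_with_map("seattle", "other", core))           # {other}
```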

I’m still exploring the interplay between the top-down and bottom-up approaches. But it definitely feels like the right way to handle common vocabularies augmented by different (and regionally varying) vocabularies.