Monday, January 23, 2012

In the previous post, I discussed the current state of embedded clause thought in Koa, and was about to go on to other theoretical possibilities that might do a better job of maintaining the spirit of the rest of the language's design.

I have a contender, actually. It occurred to me spontaneously while walking the dog, and I'm not sure just how crazy it actually is: I know there are languages that do it this way, but none of the ones I speak. It goes like this.

What if i doesn't just mark a verb with a third-person nominal subject like I've always said? What if it marks the clause itself as being independent, finite, verbal? And what if it can be replaced with other particles to switch the clause into one of the other two main Koa roles, nominals and adjectivals? We could end up with something like:

I was astonished: I had never ever before considered the possibility that i might alternate with anything else, but suddenly I seemed to be looking at a system that completely paralleled the rest of Koa syntax. And there's no question about scope, because the clause type of every verb is clearly identified.

Well, sort of. There's no i in clauses with a pronominal subject, of course, since we've always assumed that the pronoun was taking its place; should there be an u or ko, then? My best answer to this question so far amounts to a complete reenvisioning of Koa clause structure, from this

(SUBJECT) PRONOUN (TAM) VERB (OBJECT)

to this

(SUBJECT) STATUS (PRONOUN) (TAM) VERB (OBJECT)

In other words, i is no longer a kind of pronoun: it marks the status of the clause, and the pronoun falls between it and the verb. All we need to say is that the i is generally deleted before pronouns, and all of our existing material still conforms. U and ko, then, would seemingly need to be present even with pronouns, though I'm open to further thought on this.
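For anyone who thinks better in code, here is a minimal Python sketch of the revised template. The function name, parameter names and defaults are my own illustration, and the deletion rule is only the tentative one stated above (i dropped before a pronoun, u and ko kept). The last call produces a constructed string to show the rule, not an attested Koa sentence.

```python
def build_clause(verb, subject=None, status="i", pronoun=None, tam=None, obj=None):
    """Assemble (SUBJECT) STATUS (PRONOUN) (TAM) VERB (OBJECT) as one string."""
    parts = []
    if subject:
        parts.append(subject)
    # Tentative rule from the text: i is generally deleted before a pronoun,
    # while u and ko would stay in place.
    if not (status == "i" and pronoun):
        parts.append(status)
    if pronoun:
        parts.append(pronoun)
    if tam:
        parts.append(tam)
    parts.append(verb)
    if obj:
        parts.append(obj)
    return " ".join(parts)


print(build_clause("lalu", subject="le Keoni"))           # nominal subject, i present
print(build_clause("ipo", subject="le Keoni", tam="si"))  # with a TAM particle
print(build_clause("lalu", pronoun="ni"))                 # pronoun subject, i deleted
print(build_clause("lalu", status="ko", pronoun="ni"))    # constructed: ko kept before a pronoun
```

Treating STATUS as a slot that is always present underlyingly, and only sometimes pronounced, is what lets all the existing pronoun-subject material still conform.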

My thought is that generally a clause would either have a pronominal or a nominal subject, but not both -- in other words, not le Keoni i ta paólo mo pili "John [he] smells like a lizard" -- but this will also require more thought. It has occurred to me that using "they" in this way might be an option for overt number marking on the subject, something we currently have no way of doing:

But we'll leave that aside for the moment. Just to establish exactly what we're talking about here, I'm going to recast all of the preceding example sentences (and one or two new ones) using this strategy.

In theory, this all works beautifully. Unfortunately, though, much though it pains me given the gorgeous symmetry of this system, I'm concerned that in many cases this might just be too weird, too typologically marked, for an IAL. For what it's worth, though, I do note that the use of ko above mimics Latin complement clause structure surprisingly closely, imagining ko + verb as equivalent to an infinitive.

Leaving this idea for further deliberation, there's something in the last series of examples that I'd like to point out. Take a look at these two sentences:

These both lack either a ko or an u, since any other adjectival phrase wouldn't have one either: pai mehísi "foggy day," ti pai i mehísi "this day is foggy," etc. It occurs to me that, now that there's no i in there, if we were going to put the ko or u back in, it might just make as much sense to slap it on the beginning of the clause rather than in front of the verb. In other words:

In other words, the function of i would, among other things, be to mark the clause as finite; removing it would allow the clause to behave like any other predicate. Let's see how this would affect the rest of our example set.

My first reaction is that this instinctively feels like the best so far. I see two objections we'll need to investigate:

1) Why do all these embedded clauses have to be non-finite? I'm asking both in terms of Koa structure and in terms of what is typologically reasonable.

I'm thinking, with undisguised relief, that the finiteness question might actually be a bit tautological. If I say kunu kona "black dog," why does the adjective "have to be non-finite" here? Well, it has to be non-finite because it's in a position in which all Koa predicates are non-finite. The same is true of phrases like ka kona "the black one." Koa predicates are finite only when preceded by i or a pronoun. Looking at it this way, I don't see that there's any other way to do it.

2) What happens when I interpret a phrase like ko le Malía ipo a sahi koa, translated above effectively as "Mary('s) drinking some good wine," according to the usual rules of Koa predicate relationships? Does it make any sense?

Well, let's see. As we know, when two Koa predicates stand in the order XY, Y modifies or describes X. If Y is specified, the relationship is seen as genitive. Ipo a sahi koa, then, will mean "drinker of good wine."

Since this phrase has no specifier, it modifies rather than possesses the head; how to translate this into English? "Drinker-of-good-wine Mary," perhaps; or "Mary, drinker of good wine." Okay.
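The XY rule used in the last two paragraphs is mechanical enough to sketch in Python. This is only an illustration under stated assumptions: the function name and return shape are invented, and the specifier set holds only the three particles this post actually calls specifiers (ka, a, ko), so it is certainly incomplete.

```python
# Particles the post explicitly calls specifiers; a partial list for illustration only.
SPECIFIERS = {"ka", "a", "ko"}


def relate(x, y):
    """Classify how Y relates to X when two predicates stand in the order X Y.

    If Y opens with a specifier the relationship is genitive;
    otherwise Y simply modifies or describes X.
    """
    first_word = y.split()[0]
    kind = "genitive" if first_word in SPECIFIERS else "modifier"
    return {"head": x, "dependent": y, "relation": kind}


print(relate("ipo", "a sahi koa"))       # a sahi koa is specified: "drinker of good wine"
print(relate("le Malía", "ipo a sahi koa"))  # bare phrase: "Mary, drinker of good wine"
print(relate("kunu", "kona"))            # bare predicate: "black dog"
```

The point of the sketch is that nothing clause-specific is needed: the same two-way test that handles kunu kona also handles the long predicate after le Malía.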

If puna means "red one," then ko puna means "the quality/concept of being a red one," thus "redness." Can we apply this back onto our longer predicate? "The idea of Mary, (being a) drinker of good wine." "The idea of drinker-of-good-wine Mary." Great Scott, this may just work -- I really wasn't daring to hope for the literal translation to make any sense at all.

Okay. Wow. The suddenness of this discovery has kind of left me reeling.

For relative clauses, then — that is, clauses that serve to further describe or delimit their head — there are two options. One is the internally-headed relative clause, which is fully finite and marked with ke preceding the head; and the other is a standard gapping strategy for which the clause is in its non-finite form (i.e. no i occurs before the verb phrase) and may, like any other "adjectival" predicate, be preceded by u.

Complement and adverbial clauses, at this point, now seem to have one solid strategy: the clause appears in its non-finite form (sans i). If the clause is in a nominal environment, the usual specifier would be ko. One note: predicates frequently appear without a specifier when preceded by a "preposition": la koto "home(wards)," etc. If clauses are to have the same rules as other predicates, we might be able to say not only

Although, now that I think about it, the deleted specifier has always been ka or a in the past, so maybe deleting it here would mess with the intelligibility of the phrase. So yes, all seems fine...I think. I'll need to let this sit for a while and try it out in all kinds of different contexts to make sure there are no...er...side effects.

So here's a thought. Since modifier clauses have two strategies (non-finite and finite), what if the same were true of nominal clauses? We have our non-finite strategy worked out above; I think my favorite of the finite options was with ve used as a complementizer.

Now, deciding to do this would mean using up one of only four remaining particles (we've currently got hi, ve, ie and iu). We're going to have to evaluate whether that really makes sense. I do think, though, that requiring all complement and adverbial clauses to be non-finite would be typologically unusual in its restrictiveness. What we're looking at, then, are the following kinds of alternatives:

I imagine that one or another of the strategies would make more sense, and be likely to be used more, in specific contexts. I suppose these trends will emerge with lots more use, and like all similar situations in Koa, it'll never actually be incorrect either way.

So hey! That ended up being quite a bit easier than I expected, actually. The next task is to go back through all of my existing multiple-clause structures and see how they fare under these new models. In particular, I'm concerned about frames like te tai ko... "it's possible that...," "maybe..." I'll report soon.

Sunday, January 22, 2012

I've been deliberately staying agnostic for the last 12 to 13 years, biding my time until I felt I had the wisdom or clarity to make a decision. Meanwhile, though, my interim strategies have been seeing so much use that they've actually been influencing other important choices that will not be easy to disentangle. It's clearly time to make up my mind.

The matter under consideration is that of embedded clauses. These fall into the two broad categories of modifiers (relative clauses) and nominals (complement and adverbial clauses), though as ever these categories are fluid in Koa.

Before diving into this discussion, I should first mention that one of Koa's relativization strategies, the internally-headed relative clause, is actually fairly uncontroversial. In this structure the particle ke marks the head while it remains in situ:

So far so good. This is easy to form, pretty unproblematic to parse, and works well much of the time. It feels a little odd to speakers of IE and neighboring languages, though, which is enough of a reason to have another option even without the fact that long chains of relative clauses can end up pretty unparseable using this strategy:

vo ke sene i si tapa ke hili i si suo ke lepa i tai ne le Iako i si tei ke talo
~ "this is which cat killed which mouse ate which bread was in Jack built which house"

What we need is a way for a clause to modify a nominal head that stays in its usual position within the matrix clause: that is, something more along the lines of a traditional Indo-European relative clause. This is easy when the head is the subject of the relative clause, because in that syntactic context the verb phrase can just as easily be considered adjectival anyway:

I suspect that the difference between these would be about the same as the difference between "the man drinking my wine" and "the man who's drinking my wine" in English: in other words, pretty ethereal. One can envision contexts in which one or the other sounds better, but in general they're equivalent. The above is unproblematic, and indeed follows automatically from the basic principles of Koa structure.

Once the head occupies a position other than subject within the relative clause, though, we immediately run into apparently insoluble problems -- or at least, problems whose solutions have not seemed obvious to me for 12 to 13 years. Suppose, for example, that we want to say "The wine John drank had gone bad." Calquing the English structure would lead us to do something like this:

It seems so normal that I might not even object, but then I remember the optionality of u in my previous examples. It was optional because the relative clause was "adjectival" on its own, with the same meaning. The same cannot be said here: there is no precedent anywhere in the language for a phrase like le Keoni i si ipo being able to directly modify anything.

Likewise, this clause doesn't seem to be modular, able to be slipped into any syntactic position, like all other parts of Koa are. For example, it should be possible to say this:

?ka [ le Keoni i si ipo ] i koa nai
DEF [ NAME John 3P PERF drink ] 3P good some
"the one John drank was pretty good"

...but I have no confidence in this at all. I don't see why I should expect that the bracketed phrase should have the given English translation considering the remainder of Koa grammar, other than my English language intuition.

What we want is some way of forming a clause in Koa that would sound something like "the John-having-drunk(-it) wine" when translated literally into English. How would this be done?

Let's leave this for a moment and turn to the other embedded clause type. Complement and adverbial clauses occur in environments in which they are formally nominal -- that is to say, ordinarily one would see "nouns" in those contexts -- so a reasonable starting assumption might be that clauses of this type would have some kind of specifier. In fact, ko works very well for this purpose when, as with relative clauses, there isn't a subject expressed within the embedded phrase:

The second example, which in English is an adverbial clause, might just as well be translated "John doesn't feel well because of having drunk some bad wine," and as such demonstrates why ko is the appropriate particle: ko si ipo a sahi pua really does mean "[the idea/state of] having drunk some bad wine." It's a little more of a stretch as a complement clause: I'm not sure it's as obvious that the literal "John wants [ drinking some good wine ]" ought to have the given meaning. Nonetheless, it's the only remotely reasonable way of doing this that I've ever come up with.

Having used this kind of structure for some time now, I found it easy enough to start regarding ko as a kind of complementizer, having in its scope an entire following clause. It seemed like this was a reasonable extension of its usual role of marking abstract concepts. Using it this way, we might see phrases like this:

You may be noticing a similarity developing with what happens with relative clauses. The problem is that ko marks the abstraction of a root. That's its one function. Every particle in Koa has one function. In using it this way I've done a very natural, linguistically neutral thing, but a fundamentally very un-Koa thing. Given the meaning of ko everywhere else, does it make any sense to express "John not feeling well" as ko le Keoni i na ma mai koa, in the same way that ko puna means "redness?" I'm not convinced that it does.

Scope is definitely part of this feeling. Even though the languages I know best don't have any problem parsing the appropriate scope of a complementizer, I feel uncomfortable assuming that everyone should just understand where this ko-phrase ends.

Another spot of discomfort is in the fact that, by preposing ko, I'm making this clause into a nominal. The clause, especially with that i in there, feels awfully finite for a nominalization.

This also fails the modularity test. If ko mevúa means "raininess," and pai mevúa means "rainy day," we should be able to say:

Similarly, since I can say ti pai i mevúa "this day is rainy," why not:

ti pai i [ le Keoni i na ma mai koa ]
this day 3P [ NAME John 3P NEG IMPF feel good ]
"this day is John-not-feeling-well-y"

I don't think either of these is very well motivated. Although I wish I could clearly articulate why, my instinct is strong enough that I don't think I can use this kind of structure moving forward...unless I decide that it's okay for ko to lead a double life as a complementizer, in which its structures are not modular in the same way as other sort-of nominals.

If I'm opening up that line of inquiry, there's also the option of using one of my few remaining particles as a bona fide complementizer: perhaps ve, in homage to Bislama:

It doesn't seem to work as well without an overt subject on the embedded clause, I suppose because what follows ve here is supposed to be a fully formed finite expression. We'd need to use structures like

Not so much. Whether we read this as "John wants himself to drink..." or "John wants him to drink," it's not inspiring any applause. I guess its structure is parallel to the same kind of clause in Greek/Romanian/Bulgarian/etc., though:

Anyway, before looking much more closely at that kind of strategy, or giving up on my principles, I would like to see if it might be possible to come up with a way of doing all this that really does work the way I had been envisioning. This is already a ridiculously long post, so we'll go on to that in the next one.

Thursday, January 19, 2012

The following is a slightly modified transcript of a correspondence with Adam regarding the previous post.

I think I should clarify something that I've clearly been too lax about explaining in the past: this "singer" issue that I know people have had problems with intuitively.

The thing about Koa is that it doesn't actually have parts of speech that correspond to European languages at all. I keep cavalierly using terms like "nominalization" as if it were entirely clear to everyone else exactly what I mean, but in fact you don't really have nominalization in Koa. In the same way, there's no verbalization, or "adjectivalization": just words used as predicates or modifiers.

My suggesting that ka lalu is the nominalized form of the verb lalu, then, is actually almost completely unhelpful. I'm realizing that it goes beyond that to the point of being positively deceptive.

The idea behind "words" in Koa is that they take their apparent lexical class -- their sense -- from the place they're used in a clause, but that none of this is inherent. If we take a simple clause like ka kane mata i ma luke "the short man is/was reading," we could equally visualize the "nominal" here as

a noun: the man
an adjective: the male one
a verb: the one-who-is-male, the one being male

likewise, the "adjective" mata could be seen as

an adjective: short
a verb: who-is-short, being short, "shorting"
a noun: a short one (appositive)

and the "predicate" could be framed as

a verb: is reading
a noun: is being a reader
an adjective: is a reading one, is one who is reading

The above is true for Koa not just in theory but in practice. There is unavoidably a question of arbitrariness in terms of what arrangement of semantic roles to foist onto each lexeme in order to make these interpretations what they are, and this is something I want to talk about more in a minute. There is not, however, anything arbitrary about the way the words resolve into each apparent lexical class once you know their basic meaning.

This is all building to an attempt to get at the feeling that it doesn't make intuitive sense for the "nominalized" form of "sing" to mean "singer." I think it's important to note that it actually doesn't mean "singer" with any of the semantic or aspectual/modal sense of the English word. It might be helpful to look at what the verb lalu actually means in a sentence like le Keoni i lalu. I've translated it as "John sings," but that statement is aspectually ambiguous in English.

The best example I can come up with is in the context of, say, a party, at which a certain portion of the contingent has decided it's time to start singing some songs. The folks who want to sing start asking less-well-known attendees whether they'd like to participate by saying "Do you sing?" to which they might respond, "Yeah, I sing" or "No, I don't sing." In Koa, these would get translated as Ai se lalu? Ia, ni lalu and Na, ni na lalu. In a sense, then, le Keoni i lalu might be more helpfully translated as "John is willing to sing," or "John can sing." The assertion is that John has the general potential to sing, whatever the specific realization of that fact in context might be.

I tend to translate ka into English as "the." It's more accurate to say that ka placed before a Koa predicate ("predicate" just meaning "content word" in the Koa grammatical tradition) gives it the sense "a definite instantiation of the meaning of the root," in other words "the one which..." From i puna "is red," then, we get ka puna "the one which is red," "the red one." In the same way, i lalu "sings" gives us ka lalu "the one which can/will sing, the one which sings, the 'singer'." Looking at it another way, i lalu could be equally correctly translated as "is one who can/will sing, is one who sings, is a 'singer'."

Given the semantics of a particular predicate, then, there is only one possible meaning that it could have in its nominal, adjectival or verbal role. There's no question of what semantic role to "promote" during the apparent process of nominalization: it's the same semantic role that it has everywhere else as well. What I've been calling null derivation should really be called "apparent null derivation," because there isn't actually any derivation going on here. Traditional concepts of lexical class just don't apply.

At least, that's the way it's always been; what I brought up in my previous post was the possibility of adding another layer of arbitrariness that would firmly reintroduce traditional lexical classes into the language for the theoretical benefit of greater intuitiveness, and/or greater word-worthiness of basic roots in each lexical class guise, or of reducing the average number of morphemes per word to give the language more of the feel of a creole. The disadvantage, beyond the increase in arbitrariness itself, would be that it would break the carefully constructed elegant system above.

When it comes to which thematic roles to map onto arguments for a given word, of course it's unavoidably true that it's philosophically an arbitrary choice. My guiding strategy, though, is based in the assumption that pragmatically most of that theoretical arbitrariness disappears. Once in a while one sees a language make some bizarre encoding choices -- Maltese where "thief" means literally "one who is considered a thief," for example, being my favorite -- but by and large this isn't what I've seen languages do. Dogs are nominal. Killing is verbal, and the agent will be the subject. Inasmuch as a language has a class anything like adjectives, "big" will be an adjective. I would suggest that, statistically, there are good and bad choices where these assignments are concerned.

Where there is genuine disagreement across large chunks of the language spectrum, I've tried to make my decisions at least consistent and predictable. Experiencers are encoded as subjects in Koa, for example. Bodily substances are their own nominal subjects (i.e. i taku means "is blood" not "bleeds"). I do actually intend to present the dictionary in a somewhat Loglanny way, demonstrating a clause frame for each word to illustrate the semantic structure; ideally my choices would feel intuitive or at least reasonable to the greatest percentage of humans, and where there is guesswork for a learner, there is at least a system on which to lean. This kind of arbitrariness is, I feel, pretty distinct from Esperanto's where kombo is the verbal noun "combing" but broŝo is the instrument "brush."

I've been relying on my intuition based on a very large number of studied languages in making these choices, but I really ought to be more scientific about it. In general I'm being pretty agnostic about the thematic role/argument deployment of potentially problematic words, with the understanding that I'll firm that up (or not -- some ambiguity is probably okay) later on through philological investigation.

I wouldn't want anyone to think I'm getting caught up in my own idea or anything. I'm firmly grounded in the fact that IAL design is always fundamentally going to be a variety of intellectual navel-gazing. The process is fascinating to me, though, and I really do have the conceit of thinking, or at least hoping, that this particular language, when it's "finished," would be easier to learn for a much greater number of people than Esperanto -- the founding goal back in 1999.

The question I was posing in the previous post, then, was that of whether this goal would be best served by coherence to an established and predictable, though more complex, system, or by the addition of ambiguity in order that more forms might be morphologically simpler and more typologically neutral. I'm leaning strongly towards the former at this point, if only for the reason that Koa was designed from the bottom up with the existing system informing every choice, and this change would destroy all internal consistency. It would be better, in terms of optimality, to start over from scratch if this were going to be a primary design goal.

Thursday, January 12, 2012

A core principle of Koa design from the very, very beginning has been avoiding the kinds of problems caused by inherent lexical class in Esperanto. By this I refer to the fact that, for example, the root komb- is inherently verbal, which gives us kombi "to comb" and, counterintuitively for me, kombo "combing." In order to designate a nominal comb, one needs an instrumental affix: kombilo. Martel- "hammer," on the other hand, is nominal, so we have martelo "hammer," marteli "to hammer," and construct the verbal noun with an affix: martelado "hammering."
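To make the asymmetry concrete, here is a small Python sketch that derives the three forms for each root class. It covers only the two patterns described above, the function and key names are made up for illustration, and it is in no way a general model of Esperanto word formation.

```python
def esperanto_forms(root, root_class):
    """Derive verb, action noun and instrument noun for the two patterns above."""
    if root_class == "verbal":
        # The bare -o form names the action; the instrument needs the affix -il-.
        return {"verb": root + "i", "action noun": root + "o", "instrument noun": root + "ilo"}
    if root_class == "nominal":
        # The bare -o form names the instrument; the action needs the affix -ad-.
        return {"verb": root + "i", "action noun": root + "ado", "instrument noun": root + "o"}
    raise ValueError("root_class must be 'verbal' or 'nominal'")


print(esperanto_forms("komb", "verbal"))     # kombi, kombo, kombilo
print(esperanto_forms("martel", "nominal"))  # marteli, martelado, martelo
```

Notice that the function cannot work without the root_class argument: that is exactly the piece of information a learner has to memorize for every root.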

I've always felt that this was a terribly sloppy state of affairs for an IAL, and would be confusing enough for learners without the fact that it's barely touched upon by the textbooks. Koa, I thought, would conquer this territory, by making the business of part-of-speech conversion completely logical and, therefore, predictable. Some examples of the way this kind of thing works follow:

ka lalu "the one who is willing/able to sing, the singing one, the singer"
le Keoni i lalu "John sings"
ka kane lalu "the singing man, the man who sings"

ka pa lalu > ka palálu "the thing sung, i.e. the song"
le Amazing Grace i palálu "Amazing Grace is a song"
ka iune palálu "the one who steals songs, the song thief"

ka ne ka talo "the one in the house"
le Keoni i ne ka talo "John is in the house"
ka moa ne ka talo "the bird in the house, the bird that is in the house"

All of these structures are 100% parallel, in a manner entirely different from the way e.g. Esperanto does it. The idea is that every part of the language should follow this framework. The problem is that I'm worrying that by doing so, I'm diverging seriously from cross-linguistic neutrality and, in some cases, basic common sense.
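The parallelism of the three frames above can be written as a single Python function that takes no information about lexical class. This is just an illustrative sketch: the function name and the dictionary keys are labels of convenience, not Koa grammatical terms.

```python
def parallel_frames(predicate, subject, head):
    """Place one predicate in the three frames used in the examples above."""
    return {
        "nominal": f"ka {predicate}",             # "the one who/which ..."
        "predicate": f"{subject} i {predicate}",  # "SUBJECT is/does ..."
        "modifier": f"ka {head} {predicate}",     # "the HEAD who/which ..."
    }


print(parallel_frames("lalu", "le Keoni", "kane"))
print(parallel_frames("palálu", "le Amazing Grace", "iune"))
print(parallel_frames("ne ka talo", "le Keoni", "moa"))
```

The same three templates cover a bare root, a pa-derived form and a whole locative phrase, with no per-word class argument of the kind Esperanto would need.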

I've written about this before, but I think the time has come to do some more rigorous investigation of the consequences of these assumptions, and evaluation of their reasonableness.

One of the most obvious effects of this system is that a lot of basic roots in English get encoded in Koa via the passive marker pa. "Song" above is an example of this: whereas in Esperanto kanti means "to sing" and the noun form kanto means "song," Koa lalu when used nominally means something like "one who sings in an aorist kind of way." This is not an obviously useful concept to be able to express, but it's necessary to maintain the parallelism of structures.

I should note that ka lalu doesn't really quite mean "the singer," in the sense of someone who sings a lot and constructs their identity partially around this fact. For this kind of meaning, that is, something characterized by the meaning of the root, we have an affix -ma, so láluma "singer." One could also employ the usitative particle va to form ka va lalu or ka valálu "the one who frequently sings," "the singer." It's easier to visualize its meaning in the negative: a na lalu "a person who doesn't/won't sing."

If ka lalu doesn't have a particularly useful existence with its current semantics, I have to ask this question: what if ka lalu meant "the song" instead? Not because it's logical, but because it's highly intuitive and useful. Some other examples of this general quandary:

In all these cases, we theoretically have the option of equipping the nominalized base root morpheme with the same meaning that currently needs derivational morphology to attain. What would be the advantages and disadvantages of such a system?

Before getting into syntactical ambiguities, we can state right off the bat that this would give Koa that same frustrating arbitrariness of intrinsic root meanings as Esperanto. Suo above is the perfect example: should this root used as a noun mean "food" or "meal?" Esperanto chooses the latter with manĝo, but it could really go either way. I don't like the thought of having to look up and memorize the meanings of various derived forms of every word: the whole point with Koa is that it's all there, free to interpret, in the morphology and syntax.

Leaving aside that qualm, I'd like to see if there are any really conspicuous structural problems with this idea. One issue that comes up repeatedly is that of what happens when the root is used as a predicate: as things stand, there is no formal difference between a predicate nominal, a stative verb, or any other kind of verb. Thus, le Keoni i lalu means "John sings" and is identical in structure with le Keoni i moa "John is a chicken."

If lalu, for example, means "sing" as a verb and "song" as a noun, clauses like le Keoni i lalu suddenly become ambiguous. It could either continue to mean "John sings," or more fancifully, "John is a song."

There are three possible responses to this that I see. First, we could decide that this potential ambiguity is unacceptable, and can the idea right now. Second, we could point out that the second interpretation is entirely semantically anomalous and therefore the ambiguity is artificial: in actual context, the meaning would be clear. Third, we could eliminate the ambiguity by requiring verb phrases to bear a tense/aspect/mood marker: thus the verbal meaning would have to be expressed as le Keoni i va lalu, currently meaning "John sings regularly/habitually." This is not quite the same as the aorist sense of le Keoni i lalu, but we could theoretically add this to the arsenal of va.

Well, I have no truck with monosignificance -- all languages are full of ambiguity -- so I can throw out the first response. I'm not a fan of the third either, because (A) I don't want to have to mark every verb phrase this way, and (B) it doesn't actually eliminate the ambiguity anyway, because le Keoni i va lalu could equally be interpreted as "John is often a song." This means that deciding to make this change to Koa null derivation semantics would entail accepting a healthy dose of intrinsic ambiguity into the language, for better or for worse.

One good thing about this that I'd like to throw in before I forget is that it would also obviate the need for a helper verb. Thus instead of tei kaka "go poop" (if kaka is nominal, that is), kaka could have both verbal and nominal force.

Returning to ambiguity, I would like to point one thing out. In more poetic contexts, where potential meanings range more freely, I find I can easily come up with examples where this ambiguity would no longer be trivial. Take the theoretical Koa sentence ka ela ni i lalu, for instance. I don't think there's a problem with ka ela ni "my life" (this would be ka koéla ni in standard Koa) despite the fact that it could also be saying "my living one," whatever that means. The predicate, though, could mean either "sings" (i lalu) or "is a song" (i palálu), and given the poetic nature of the utterance, there's really no reason to prefer one reading over the other. Some ambiguity is, of course, acceptable in poetry, but this seems to seriously encumber the expressiveness of the language. I think this may be the strongest argument yet in favor of not making this change.

In terms of typological appropriateness, I really need more data. I can say that, from the perspective of the inflectional IE languages I speak, I have nothing to worry about either way. "Food" in English is unrelated to "eat," but transparently connected to "feed." In Polish we have jedzenie, literally the verbal noun "eating." Spanish has comida, literally "(female) eaten thing": an exact parallel of Koa pasúo. What do more isolating languages do, though? I have absolutely no idea. Inexcusably, I don't have a grammar of Mandarin, Vietnamese, Burmese, Thai or any other related language, but Bislama, Malay and Yoruba ought to give me something to work with. I'll come back with part II soon; in the meantime, I think I'm seeing some reasons to leave things as they are.