
Yet another enum proposal

First of all, what is the issue with const? Why can't we use that instead?

Well, first of all, iota only works with types that can hold an untyped integer constant. Also, the namespace for the constants is at the package level, meaning that if your package provides multiple utilities, there is no distinction between their constants other than their type, which may not be immediately obvious.

For instance if I had my own mat (material) package, I'd want to define mat.Metal, mat.Plastic, and mat.Wood. Then maybe classify my materials as mat.Soft, mat.Neutral, and mat.Hard. Currently, all of these would be in the same namespace. What would be good is to have something along the lines of mat.Material.Metal, mat.Material.Plastic, mat.Material.Wood, and then mat.Hardness.Soft, mat.Hardness.Neutral, and mat.Hardness.Hard.
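To make the namespace point concrete, here is a minimal sketch of how this looks in Go today. It is written as package main so it runs on its own; in the scenario above these declarations would live in package mat. The type and constant names come from the text, everything else is illustrative.

```go
package main

import "fmt"

// Material and Hardness are two unrelated "enums" in one package.
type Material int
type Hardness int

// All six constants share the package-level namespace. Nothing in the
// names says which type each one belongs to.
const (
	Metal Material = iota
	Plastic
	Wood
)

const (
	Soft Hardness = iota
	Neutral
	Hard
)

func main() {
	// From outside the package these would read as mat.Metal and mat.Hard;
	// only the declared types tell them apart.
	fmt.Printf("%T %T\n", Metal, Hard)
}
```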

Another issue with using constants is that they may have a lot of runtime issues. Consider the
following:

Not only is there a lot of boilerplate code where we define the "enum", but there is also a lot of boilerplate whenever we use the "enum", not to mention that it means that we need to do runtime error checking, as there are bitflags that are not valid.
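A minimal sketch of the kind of code this describes, assuming a Weekday type built on iota. The Valid and Tomorrow methods and the ErrInvalidWeekday error are hypothetical names chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInvalidWeekday is a hypothetical error for out-of-range values.
var ErrInvalidWeekday = errors.New("invalid weekday")

type Weekday int

const (
	Sunday Weekday = iota
	Monday
	Tuesday
	Wednesday
	Thursday
	Friday
	Saturday
)

// Valid reports whether d is one of the seven declared constants.
func (d Weekday) Valid() bool {
	return d >= Sunday && d <= Saturday
}

// Tomorrow has to validate its receiver at runtime, because any int
// can be converted to a Weekday.
func (d Weekday) Tomorrow() (Weekday, error) {
	if !d.Valid() {
		return 0, ErrInvalidWeekday
	}
	return (d + 1) % 7, nil
}

func main() {
	next, err := Saturday.Tomorrow()
	fmt.Println(next, err)

	_, err = Weekday(20).Tomorrow() // compiles, fails only at runtime
	fmt.Println(err)
}
```

Every method on the type needs the same Valid check, which is the per-use boilerplate being complained about.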

I thought to myself. What even are enums? Let's take a look at some other languages:

C

This ends up being similar to Go's iota. But it suffers the same pitfalls that we have with iota, of course. But since it has a dedicated type, there is some compile-time checking to make sure that you don't mess up too easily. I had assumed there was compile-time checking to make sure that things like Weekday day = 20 were at least compile-time warnings, but at least with gcc -Wextra -Wall there are no warnings for it.

C++

This section was added in an edit. Originally C and C++ were grouped together, but C++11 added enum class and enum struct, which are very similar to Java's (next section). They do have compile-time checking to make sure that you don't compare two different types, or do something like Weekday day = 20. Weekday day = static_cast<Weekday>(20) still works, however. We should not allow something like this. #28987 (comment)

Java

I personally like this implementation, although I would appreciate it if the objects were immutable.

The good thing about this implementation is that you are able to define methods on your enum types, which can be extremely useful. We can do this in Go today, but with Go you need to validate the value at runtime which adds quite a bit of boilerplate and a small efficiency cost. This is not a problem in Java because there are no possible enum values other than the ones you define.

Kotlin

Kotlin, being heavily inspired by Java, has the same implementation. They are even more clearly objects, as they are called enum class instead of simply enum.

Swift

Proposal #28438 was inspired by these. I personally don't think they're a good fit for Go, but it's a different one, so let's take a look:

enum Weekday {
    case Sunday
    case Monday
    case Tuesday
    // ...
}

The idea becomes more powerful, as you can define "case functions" (the syntax is case SomeCase(args...)), which allow something like EnumType.number(5) being separate from EnumType.number(6). I personally think it is more fitting to just use a function instead, although it does seem like a powerful idea.

I barely have any Swift experience though, so I don't know the advantages of a lot of the features that come with Swift's implementation.

package mat // import "github.com/user/mat"

// iota can be used in enums
type Hardness int enum {
	Soft = iota
	Neutral
	Hard
}

// Enums should be able to be objects similar to Java, but
// they should be required to be immutable. A readonly types
// proposal may help this out. Until then, it may be good just to either
// have it as a special case that enum values' fields cannot be edited,
// or have a `go vet` warning if you try to assign to an enum value's field.
type Material struct {
	Name     string
	Strength Hardness
} enum {
	Metal   = Material{Name: "Metal", Strength: values(Hardness).Hard} // these would greatly benefit from issue #12854
	Plastic = Material{Name: "Plastic", Strength: values(Hardness).Neutral}
	Foam    = Material{Name: "Foam", Strength: values(Hardness).Soft}
}

// We can define functions on `Material` like we can on any type.

// Strong returns true if this is a strong material
func (m Material) Strong() bool {
	return m.Strength >= Hardness.Neutral
}

The following would be true with enums:

int enum { ... } would be the type that Hardness is based on. int enum { ... } has the underlying type int, so Hardness also has the underlying type int.

Assigning an untyped constant to a variable with an enum type is allowed, but results in a compile error if the enum does not support the constant expression's value. (That's a long-winded way of saying var h Hardness = 1 is allowed, but var h Hardness = 100 is not. This is similar to how it is a compile error to do var u uint = -5.)

As with normal types, assigning a typed expression to a variable (var h Hardness = int(5)) of a different type is not allowed

There is a runtime validation check sometimes, although this can be omitted in most cases. The runtime check occurs when converting to the new type, for instance var h Hardness = Hardness(x) where x is an integer variable.

Using arithmetic operators on enums with underlying arithmetic types should probably either not be allowed, or be a runtime panic with a go vet flag. This is because h + 1 may not be a valid Hardness.
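The concern is easy to reproduce with a constant-based "enum" in Go today. The following is a small sketch; the valid helper is a hypothetical stand-in for whatever check the compiler or runtime would perform under this proposal.

```go
package main

import "fmt"

type Hardness int

const (
	Soft Hardness = iota
	Neutral
	Hard
)

// valid reports whether h is one of the declared constants.
func (h Hardness) valid() bool {
	return h >= Soft && h <= Hard
}

func main() {
	h := Hard
	h = h + 1 // compiles today; the result is 3, which names no Hardness
	fmt.Println(h, h.valid())
}
```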

Syntax ideas for reading enum values:

Type.Name

It's a common syntax people are familiar with, but it makes Type look like a value.

Type#Name, Type@Name, etc

Something like these would make the distinction that Type is not a value, but it doesn't feel familiar or intuitive.

Type().Name

This one doesn't make too much sense to me but it popped in my head.

values(Type).Name, enum(Type).Name, etc

values would be a builtin function that takes a type, and returns its enumeration values as a struct value. Passing a type that has no enum part would trivially return struct{}{}. It seems extremely verbose though. It would also clash with existing code, as values is a pretty common name. Many go vet errors may result from this name. A different name such as enum may be good.

I personally believe values(Type).Name (or something similar) is the best option, although I can see Type.Name being used because of its familiarity.

I would like more critique on the enum definitions rather than on reading the values, as the definitions are what the proposal mainly focuses on. Reading values from an enum is trivial once you have a syntax, so it shouldn't need to be critiqued too much. What needs to be critiqued is what the goal of an enum is, how well this solution accomplishes that goal, and if the solution is feasible.

Points of discussion

There has been some discussion in the comments about how we can improve the design, mainly the syntax. I'll take the highlights and put them here. If new things come up and I forget to add them, please remind me.

Value list for the enum should use parentheses instead of curly braces, to match var/const declaration syntax.

Disadvantage: (In my eyes) it doesn't illustrate how this enum implementation works quite as well.

Add type inference to enum declarations

Advantage: Definitions become more concise, especially when declaring inline types with enums.

Disadvantage: The conciseness comes at a price to readability, in that the original type of the enum is not in a consistent location.

My Comment: Type inference in Go is typically done in places which would benefit from it often, like declaring a variable. There really should be very few enum declarations "per capita" of code, so I (personally) think the verbosity of requiring the type is justified.

Use the Type.Value syntax for reading enum values

I've already talked about advantages and disadvantages to this above, but it was mentioned that we already use Type.Method to reference methods, so it wouldn't be quite as bad to reference enum values as Type.Value.

Ranging over enum values is not discussed

I forgot about it when writing the original text, but luckily it doesn't undermine the proposal. This is an easy thing to fit in though. We can use Type.Slice, which returns a []Type.
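For comparison, this is roughly what iteration requires in Go today: a hand-maintained slice that plays the role the proposed Type.Slice would fill. The hardnessValues name is hypothetical.

```go
package main

import "fmt"

type Hardness int

const (
	Soft Hardness = iota
	Neutral
	Hard
)

// hardnessValues is a hand-maintained list standing in for the proposed
// Type.Slice; it must be kept in sync with the constants above.
var hardnessValues = []Hardness{Soft, Neutral, Hard}

func main() {
	for i, h := range hardnessValues {
		fmt.Println(i, h)
	}
}
```

A built-in equivalent would remove the risk of the slice drifting out of sync with the declared values.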

Regarding zero values

We have two choices - either the first enum value, or the zero value of the underlying type.

First enum value: Makes more intuitive sense when you first look at it

Zero value of type: More consistent with the rest of Go, but may cause a compile error if the zero value of the type is not in the enum

My Comment: I think the zero value of the type should be used. The zero value of a type is always represented as all-zeros in binary, and this shouldn't change that. On top of that, the only thing the enum "attachment" to a type does is limit what values variables of the type can hold. So under this rationale, I think it makes intuitive sense that if the enum for a type doesn't include the zero-value, then declaring a variable with the zero-value should fail to compile. This may seem strange at first, but as long as the compile error message is something intuitive (ie illegal assignment to fooVar: FooType's enum does not contain value <value>) it shouldn't be much of a problem.
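The two candidate rules only differ when 0 is not one of the named values. A minimal sketch in today's Go, using a hypothetical Status type that skips 0 with the blank identifier:

```go
package main

import "fmt"

type Status int

// The blank identifier consumes iota == 0, so no named Status equals 0.
const (
	_ Status = iota
	Success
	TimedOut
	Interrupted
)

func main() {
	var s Status // zero value of the underlying int
	fmt.Println(s == Success, s)
}
```

Today s silently holds a value with no name. Under the zero-value rule argued for above, the equivalent declaration would be rejected at compile time if Status carried an enum list that does not contain 0.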


Is there a section missing after "Let's take a look at some other languages"?

Yes, that was my bad. I've updated to include other languages.

A commonly requested feature for enums is a way to iterate through the valid enum values. That doesn't seem to be supported here.

Oops... That's my bad. Either way, once there is a syntax for it, it should be easily supported. In your upcoming example I will use values(Type).slice, which evaluates to a []Type.

I'm not sure but I suspect that this syntax is going to introduce parsing ambiguities. There are many places where a type can appear. You also have to consider cases like...

That's true. It's confusing if []struct { f int } enum { ... } is an enum of []struct { f int } or a slice of struct { f int } enum { ... }, making the contents of the enum very confusing. This also isn't even really that contrived of a case either, as type TSlice []T isn't uncommon. I'd personally assume that the enum is the last thing that is "applied" to the type, since the syntax is type Type <underlying type> enum <values>, but this isn't immediately obvious.


The reason behind picking { ... } over ( ... ) was that it just looked more visually appealing when defining enums of struct types.

example:

type Example struct {
	i int
} enum (
	A = Example{1}
	B = Example{2}
)

versus

type Example struct {
	i int
} enum {
	A = Example{1}
	B = Example{2}
}

The symmetry of } enum { looks much nicer. I would say using parentheses does make more sense, since we are declaring a list of variables. I was pretty tied on which one to use.

b) defining a namespace from a parent type name

I addressed this, I didn't like it because Type.Value makes Type look like a value, even though it isn't. It does feel much more familiar to other languages however.

Something that bothers me about the example code is that I don't like the type inference. I think that since we don't typically need to declare enums too often, the extra verbosity makes the code much more readable. For instance:

type SomeNumbers enum (
	A = 15
	B = 92
	C = 29993
	D = 29.3
	E = 1939
)

What is the underlying type of SomeNumbers? You'd look at the first few numbers and think it's an int type, but because the D constant is 29.3, it would instead have to be float64. This is not immediately obvious and would be an especially large problem for long enums. Just using type SomeNumbers float64 enum ( ... ) would mitigate this issue.
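For comparison, untyped constants in Go today already show why a single inferred type is hard to read off a list: each constant takes its default type from its own literal. A small sketch using some of the values from above (nothing here uses the proposed enum syntax):

```go
package main

import "fmt"

// Untyped constants, mirroring values from the example above.
const (
	A = 15
	B = 92
	D = 29.3
)

func main() {
	a, d := A, D
	fmt.Printf("%T %T\n", a, d) // each constant picks its own default type

	var f float64 = A // an untyped integer constant can also be a float64
	fmt.Println(f)
}
```

An inferred enum type would have to reconcile these per-constant defaults into one underlying type, which is the step that is not obvious to a reader.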


Fair point, completely forgot about that. One of those features that don't get used much, haha

Leveraging the const/var (...) pattern, let the first item clarify the type:

I personally think that

type Status int enum (
	Success = iota
	TimedOut
	Interrupted
	// ...
)

is more readable than

type Status enum (
	Success int = iota
	TimedOut
	Interrupted
	// ...
)

Although that may just be a matter of opinion. I think that the extra verbosity helps the readability in this case, and since people shouldn't be declaring enums all the time, it comes at a relatively low cost.

I actually thought about forcing the underlying struct and enum definition to be separate (which this would effectively do). It was actually the initial design, but as I made examples, it raised a few issues.

Now, you have defined two types (an internal one for the structure, and the type with the enum to actually be used). So now you have this useless internal type floating around, which is only used to define an enum. Not a huge deal per se, but it's pretty inconvenient and seems like a waste of typing. It would also clutter up autocompleters for those who use them.

Another issue is documentation. Presumably you wouldn't want that struct to be exported, because its only use is to be used by the enum. The issue with not exporting the struct is that now it doesn't appear on godoc.org, so people don't know what fields your enum has. So your two choices are to either export your struct, which is bad design since it's not supposed to be used outside of the enum, or to keep the struct unexported, which makes it invisible to godoc. The type T struct { ... } enum { ... } syntax fixes this issue, since the underlying type is defined along with the enum rather than separately.

Also, defining the enum all at once illustrates that enums are an extension on types, and not types on their own. Doing type SomeNumbers enum ( ... ) makes it look like that enum ( ... ) is the underlying type, even though it's actually an int where enum just limits which int values that SomeNumbers can hold. The proposed syntax is a bit more verbose, but I think that a syntax where the underlying type is defined along with the type illustrates how it works a bit better.

Also, if you want to define the struct and enum separately, you still can:


But Example isn't just a struct { int }, is it? I think it really is a struct { int } wrapped as an enum. Putting enum first would also be compatible with future type additions to the language. However, that's all syntax. What do you think about the "bit flag" use case for enums?


But Example isn't just a struct { int }, is it? I think it really is a struct { int } wrapped as an enum. Putting enum first would also be compatible with future type additions to the language. However, that's all syntax.

Example is a struct. "Enum type" (while I have used this term) is a misnomer. The enum keyword simply limits what values type Example is able to hold.

What do you think about the "bit flag" use case for enums?

I personally think that bit-flags should not be handled by enums. They are safe to just use as a raw numeric type in my eyes, since with bitflags you can simply ignore any bits which do not have meaning. I do not see a use-case for bitflags to need a construct, what we have for them now is more than enough.
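A short sketch of the bit-flag pattern as it works in Go today, showing how bits without meaning can simply be masked off. The Perm type and its flag names are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// Perm is a hypothetical bit-flag type; the names are illustrative only.
type Perm uint8

const (
	Read Perm = 1 << iota
	Write
	Exec
)

// known is the set of bits that have an assigned meaning.
const known = Read | Write | Exec

// Has reports whether all bits in flag are set in p.
func (p Perm) Has(flag Perm) bool {
	return p&flag == flag
}

func main() {
	p := Perm(0xF3) // includes bits with no assigned meaning
	p &= known      // unknown bits are simply dropped
	fmt.Println(p.Has(Read), p.Has(Write), p.Has(Exec))
}
```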

Example is not wrapped as an enum. The enum specifier is an "attribute" to a type which limits what values it can hold.

Anonymous enum types are valuable (as I described above)...

You could equally achieve that with v int enum ( a = 1 ) which could follow the same namespacing rules that you described earlier. I didn't think of this in my original design, thanks for bringing it up!

I will accept that it may be bad for enum to be after the struct since type E struct ... looks like a plain struct type, but you don't see that it's enumerated until further down. But I could not think of a better syntax for defining the new type's underlying type AND value within the same statement.

Actually - I'd rather not. Not because I don't like type inference or anything, but these things are mentioned in a bulleted list. The list is meant to be a TL;DR of the discussion, and I don't want to be polluting it with long examples. I personally don't think it's "essential for anonymous value types" or anything. Struct and array enums are just as much enums as ints are, and it really doesn't take much to wrap your head around how they work.

c) the default value; is it the type's zero value or the first enum value?

Thanks for the reminder, I'll add that in

d) enum (x int8 = iota; y) for consistency with the Go 1 enumeration method. Granted that doesn't work well for an anonymous struct.


Re type inference, you have this example, which weakens your case

Material is not an enum. It is a type just like any other, but the enum keyword limits what values a type may hold. Doing Material{ ... } outside of the enum section of the type is still valid as long as the value comes out to a value that is within the enum section. I'd imagine tools like golint should discourage this behavior though, to make it clearer that an enum is being used.

Re zero value, I often do this: const ( _ int = iota; eFirst; eSecond ) Here the zero-value is in the enum, but anonymous. That probably shouldn't produce a compile error.

I'd argue it should. iota is always zero for the zero'th index const. If you do _ MyEnum = 0 on an enum that does not contain a 0 value, it should produce a compile error as the second bullet in the "The following would be true with enums:" part states.


There's going to be a pretty significant language change either way. Although I think my main gripe with returning an unexported type is that an unexported type logically should only be used by the package that uses it. In fact, var x pack.matValue is invalid, which is a similar issue to what this proposal encounters with the whole "what's the zero value of an enum" problem.


That example has the issue that you explained earlier, where we don't see that type t is an enum until much later in the code. Also seems a bit clumsy since you need to restate yourself. Ideally, you should only need to type out the list of variables once


It's an interesting idea to store the enum values in a variable of type t enum .... But now declaring an enum in a new project becomes clumsy, and I'd like to avoid having multiple ways to write the same code.

Also, I'd like to point out that under this proposal, we cannot create an enum from a list of var because enum values should be required to be immutable. Take the following example:


It's no more clumsy than in Go1; you simply add enum to a variable declaration; supplying a list of values would be uncommon. Values would be immutable when const, and there are open items proposing new const types.

My proposal actually preserves the Go1 method as the only way, and is a relatively small language change. An embedded definition was just a thought; more below.


It's not just as clumsy as Go 1, because in Go 1 we don't have to redeclare our values (because we don't have enums), which is where I feel that the clumsiness comes from.

Also I really think that we should be primarily focusing on named enums which get reused in many places, not enums that are inlined (and therefore only used in one place). That's the primary goal of this proposal, other enum proposals, and enums in other languages.


Sorry, when I say "redeclare" I mean you declare the enum values (in the var declaration) and then a second time in the enum declaration (ie with var B ...)

The var A t enum actually works pretty okay, however.

Enums should work on the type-level instead of the var level. An ideal use case for an enum would be something like time.Weekday, where we would be using time.Weekday in multiple areas. Things like that are the primary use-case that I am going for. In these cases, enums need to be on the type level, so that when I am writing a function that takes a weekday, it doesn't look like


Small typo (I'm assuming), but you'd want to return time.Weekday enum, not just time.Weekday. Otherwise you wouldn't be able to assign the result to a time.Weekday enum.

And this statement may be controversial, but I'd much rather have a feature that feels good to use, than one that is easy to convert legacy code to but seems half-baked.

And of course another issue is that I'd rather not need to write enum after every time I want to use one. It's the reason that typedef came about in C, because it's clumsy to write struct, enum, union, etc. every time I wanted to use one of those types. I'd much rather just use the type's name. A type is already a set of what values a variable can hold, so ideally if we're limiting what values a variable can hold, we should do that at the type level.


Since Workday is based on time.Weekday enum (...) ideally we should just use Workday rather than Workday enum.

Also, since Workday and Weekday are separate types, they aren't assignable to each other, unfortunately. I wouldn't like to change that. A type alias would make sense in this case though, since workdays are literally weekdays (rather than just being based on them).

So then perhaps the "make my pre-existing type an enum" code would become


I have no clue where these requirements are coming from. Also, I'd personally much rather require constant types, and wait for const to be allowed on all types, than get this proposal accepted too early and thereby be required to comply with unwanted behavior.

I'd also like it to mostly remain unchanged, I feel like a lot of the proposed changes you have been making are more apt to a new proposal rather than an amendment to this one.

In conclusion, I'd really like this proposal to remain mostly unchanged. Maybe some syntax here and there, but the overall concept should remain untouched. This proposal values making a good implementation over sacrificing quality to make it "fit better" into old code.


Don't forget enumeration subtypes, e.g. you want to declare an enumeration type WEEKDAYS and a subtype WORKDAYS of type WEEKDAYS.

So the value "Monday" evaluates to True when testing to be of type WEEKDAYS and WORKDAYS.

If you have a type WEEKDAYS with values (Mon, Tue, Wed, Thu, Fri, Sat, Sun), you want to be able to define your subtype as:

a list of values, e.g. (Mon, Tue, Wed, Thu, Fri)

a range of values, e.g. (Mon ... Fri)

additional constraints:

a subtype can only have values from the main type, i.e. no additional values

the numeric value of a subtype value needs to be identical to that of its corresponding value in the main type, e.g. if the numeric value for enumeration value Monday is 2 in WEEKDAYS, then it needs to have the same numeric value in WORKDAYS as well.

You need the capability to convert between unconstrained types (e.g. int) and an enumeration.
This is useful when decoding bytestreams to and from data types.
E.g. decoding an array of bytes received over a serial line into a structure of fields. Often these fields can be enumeration types in communication protocols.
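A sketch of that decoding use case in Go today, where the conversion from a raw byte has to be paired with a manual range check. The Command type, the Header layout, and decodeHeader are all hypothetical and not taken from any real protocol.

```go
package main

import (
	"errors"
	"fmt"
)

// Command is a hypothetical one-byte protocol field.
type Command byte

const (
	Ping Command = iota
	Read
	Write
)

var errUnknownCommand = errors.New("unknown command")

// Header is a hypothetical two-byte frame header.
type Header struct {
	Cmd    Command
	Length byte
}

// decodeHeader converts raw bytes to typed fields, rejecting bytes that
// do not correspond to a declared Command.
func decodeHeader(b []byte) (Header, error) {
	if len(b) < 2 {
		return Header{}, errors.New("short header")
	}
	cmd := Command(b[0]) // conversion from an unconstrained byte
	if cmd > Write {
		return Header{}, errUnknownCommand
	}
	return Header{Cmd: cmd, Length: b[1]}, nil
}

func main() {
	h, err := decodeHeader([]byte{0x02, 0x10})
	fmt.Println(h, err)

	_, err = decodeHeader([]byte{0x7F, 0x00})
	fmt.Println(err)
}
```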


I don't want to complicate enums and make their learning curve steeper by adding additional features that are (the way I see it) not absolutely needed.

That is a subjective statement.
You can state that YOU don't see a need for it, which is fine, but making a global statement for the whole golang community is a bold statement at best.

I suggested the feature because I got used to it when I was programming in Ada, and I saw a definite need for it back then (and a lot of Ada programmers agreed with me), but that is just my opinion (and that of a lot of Ada programmers). Please don't discard suggestions from people that were able to use the feature in other languages. If you don't find it useful, then don't use it, but at least allow others who do find it useful to use it.
Remember: "In the eyes of a hammer, a screw doesn't seem to be very useful"


The idea of "sub enums" also seems to be a more object-oriented concept which involves inheritance, which makes sense with Ada being object oriented. I'd definitely like to stay away from that. This integrates well with Go's current (and simple) type system, so I'm very wary of adding too many features to it. Go is great (in my eyes, of course) because it's picky about what gets added into the language.

I definitely didn't mean to discard it entirely, I'm just stating that it doesn't really match with the values of this proposal. I know I kinda come off as harsh sometimes, I promise I'm not trying to do that


A Go1 backwards compatible way to do this, and to unify this with #29649, would be to reuse the range keyword in type definitions, and use it like range(type allowed values and ranges), much like we now have chan(message type). Furthermore, the simplest thing that could work would be to limit the allowed values to constant expressions and constant ranges. You would then have to define the values of the range outside of the range type declaration itself, but that solves a few sticky issues on name spacing as well. Something like this:


The reason I am opposed to defining the underlying type and enum separately is illustrated in an earlier comment, which is that now we have a "useless" type (WeekdayValue in your case) that shouldn't be used anywhere other than to define the "useful" type (Weekday). I think that needless types should be avoided (i.e. we should define the enum type and its values together), which is a large part of the reason that I made this proposal.

That approach also would not work for enums of structs which was a large part of this proposal. Instead you would need to have an enum of indices which reference a slice. This isn't a bad solution, and has been discussed previously, but this proposal just doesn't really "get along" with that one very well.

It looks like your proposal would actually be the exact same as #29649 though. It doesn't appear any different (at least immediately) besides that you can specify a comma-separated list of numbers, which I personally think defeats the purpose of a range type. Please correct me if I am wrong though.

Not trying to rip on this idea or your prospective proposal though! A number range type would be another good solution to an enum proposal that feels Go-like.


Well, my idea is to unify ranged types with enumerations, in a way that is backwards compatible with Go1. I don't mind not getting enums of structs, enums of ConstantExpressions with matching underlying types would be very useful. Maybe later we will get some structs as ConstantExpressions as well.

I see your point about wanting to declare the values of the enums as well, as in a struct's members. It would not be too hard to extend my idea for that, I think.


I don't know which UserType values I can choose, and the IDE can't offer intelligent completion.
The only way to find out is to jump to the definition of UserType and see whether there are related const definitions like:

But if the author doesn't define those constants near UserType, it's hard to find them.

Another thing is that UserType is effectively just another name for byte (though not strictly an alias), so people can convert any byte value to UserType, like UserType(100). So the consumer of UserType has to check its validity at runtime, like: