It feels like you want to include DataType as a field of Data; the Data struct owns the bytes, and also knows how to interpret those bytes. Implement Index and Iterator for Data, and you should be able to interact with your generic data without matching on the enum directly; let the trait impls do that.
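To make that concrete, here is a minimal sketch of a struct that owns the raw bytes together with a tag. The names `DataType`, `Value`, `get` and `iter` are my own assumptions, not taken from your code, and I assume big-endian values because `from_be_bytes` is what such a struct would typically call. One caveat: `Index::index` has to return a reference, so it cannot hand back a freshly decoded value; the sketch uses a by-value `get` instead.

```rust
/// Hypothetical tag describing how the raw bytes should be decoded.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum DataType {
    Ubyte,
    Int,
    Double,
}

/// A single decoded element (hypothetical name).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Value {
    Ubyte(u8),
    Int(i32),
    Double(f64),
}

/// Owns the bytes and knows how to interpret them.
pub struct Data {
    dtype: DataType,
    bytes: Vec<u8>,
}

impl Data {
    pub fn new(dtype: DataType, bytes: Vec<u8>) -> Self {
        Data { dtype, bytes }
    }

    /// Width in bytes of one element of the stored type.
    fn width(&self) -> usize {
        match self.dtype {
            DataType::Ubyte => 1,
            DataType::Int => 4,
            DataType::Double => 8,
        }
    }

    /// Number of complete elements in the buffer.
    pub fn len(&self) -> usize {
        self.bytes.len() / self.width()
    }

    /// Decodes the i-th element (big-endian), or None if out of range.
    pub fn get(&self, i: usize) -> Option<Value> {
        let w = self.width();
        let start = i.checked_mul(w)?;
        let end = start.checked_add(w)?;
        let chunk = self.bytes.get(start..end)?;
        Some(match self.dtype {
            DataType::Ubyte => Value::Ubyte(chunk[0]),
            DataType::Int => {
                let mut buf = [0u8; 4];
                buf.copy_from_slice(chunk);
                Value::Int(i32::from_be_bytes(buf))
            }
            DataType::Double => {
                let mut buf = [0u8; 8];
                buf.copy_from_slice(chunk);
                Value::Double(f64::from_be_bytes(buf))
            }
        })
    }

    /// Iterates over all decoded elements.
    pub fn iter(&self) -> impl Iterator<Item = Value> + '_ {
        (0..self.len()).filter_map(move |i| self.get(i))
    }
}
```

The match on the tag lives inside `get`, so callers never match on `DataType` themselves, although they still receive a `Value` enum for each element.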

I’m confused by the description. It sounds like you have a file which contains:

- A single header
- Followed by many values of the corresponding type

Is that right? Then I would write:

    // Private; this only exists so you can write
    // a function that parses the header and returns
    // the data type.
    //
    // You might not even need it. E.g. you could
    // use a Data with an empty vec.
    enum Tag {
        Ubyte,
        Int,
        Double,
    }

    // A Vec of enums (or iterator of enums, etc.)
    // is conceptually incorrect if all of the
    // enums have the same variant.
    //
    // You want an enum of vectors.
    pub enum Data {
        Ubyte(Vec<u8>),
        Int(Vec<i32>),
        Double(Vec<f64>),
    }
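
With an enum of vectors, the match happens once per file rather than once per element. As a rough illustration, here is a hypothetical consumer (the function `sum` is my invention, just to have something to compute) that matches once and then runs an ordinary typed loop in each arm:

```rust
pub enum Data {
    Ubyte(Vec<u8>),
    Int(Vec<i32>),
    Double(Vec<f64>),
}

/// Hypothetical consumer: adds up all values as f64.
/// The match runs once; each arm iterates a concretely typed Vec.
pub fn sum(data: &Data) -> f64 {
    match data {
        Data::Ubyte(v) => v.iter().map(|&x| f64::from(x)).sum::<f64>(),
        Data::Int(v) => v.iter().map(|&x| f64::from(x)).sum::<f64>(),
        Data::Double(v) => v.iter().sum::<f64>(),
    }
}
```

Inside each arm `v` has a single concrete type, so nothing needs to be matched per value.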

In what I wrote, I did not have a vector of enums.
I had a struct containing a Vec<u8>, and in that vector I stored the raw data from the file as u8. To obtain the data in its actual format I used the from_be_bytes() function. Index and Iter were implemented for that struct.

I agree that an enum of vectors may be better, but my question still applies:

When I parse everything and get a Data, I have to match over it in order to extract the values, whether it is an enum of vectors or, as I did it, an enum of values (in the notation from my first post).

The problem here is that you haven’t told us what you want to do with the data. Since you have data of different types you have to decide what to do with each type of data, which is why you have a match statement. Since you aren’t satisfied with this, there must be some commonality between the code for the different types, but you haven’t hinted at what that commonality might be. Without knowing more about your goals, it’s hard to tell you how you might achieve them.

Basically I can’t picture a scenario in which you have to match over each individual value.

To me your code example is highly contrived because I cannot imagine a case where I’ve ever needed to look at the nth element of a vector of unknown type read from a file. It also clearly doesn’t reflect your own usage because it doesn’t typecheck. (x has different types in different branches!)

Basically, if different files have different types in them, then I can only imagine that I would want to do different things to them! (there are some exceptions I’ll cover at the end)

Just this Thursday, I had to write a text-based parser of a similar format, and I had no trouble parsing it into an enum of vecs up front. Here is an adaptation of my code to your binary format. (Note this requires nom 3.2.1 or earlier, because I have no idea how to use many0! in nom 4.)

Helper parsers:
If you’re not familiar with nom, named!{fn_name<I, O>, ...} defines a parsing function that takes I (either &[u8] or &str) as input and produces some Result<(I, O)> with the parsed value and the unparsed remainder.
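
For anyone who would rather not pull in nom, the same parse-up-front idea can be sketched with only the standard library. This is not the nom code referred to above, and the file layout is purely an assumption on my part: one tag byte (0 = u8, 1 = i32, 2 = f64) followed by big-endian values. The names `ParseError` and `parse` are likewise made up.

```rust
#[derive(Debug, PartialEq)]
pub enum Data {
    Ubyte(Vec<u8>),
    Int(Vec<i32>),
    Double(Vec<f64>),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ParseError {
    MissingHeader,
    UnknownTag(u8),
    TrailingBytes,
}

fn be_i32(chunk: &[u8]) -> i32 {
    let mut buf = [0u8; 4];
    buf.copy_from_slice(chunk);
    i32::from_be_bytes(buf)
}

fn be_f64(chunk: &[u8]) -> f64 {
    let mut buf = [0u8; 8];
    buf.copy_from_slice(chunk);
    f64::from_be_bytes(buf)
}

/// Reads the assumed one-byte header, then converts the whole body
/// into a Vec of the matching type in one pass.
pub fn parse(input: &[u8]) -> Result<Data, ParseError> {
    let (&tag, body) = input.split_first().ok_or(ParseError::MissingHeader)?;
    match tag {
        0 => Ok(Data::Ubyte(body.to_vec())),
        1 => {
            if body.len() % 4 != 0 {
                return Err(ParseError::TrailingBytes);
            }
            Ok(Data::Int(body.chunks_exact(4).map(be_i32).collect()))
        }
        2 => {
            if body.len() % 8 != 0 {
                return Err(ParseError::TrailingBytes);
            }
            Ok(Data::Double(body.chunks_exact(8).map(be_f64).collect()))
        }
        other => Err(ParseError::UnknownTag(other)),
    }
}
```

The tag is inspected exactly once, and everything downstream works with a typed Vec.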

Now, there are a few exceptions to what I said at the beginning of this post. E.g., it’s possible I might want to get the len of the inner Vec without knowing what type it is.
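
A type-agnostic `len` like that is one of the few places where every arm of the match does the same thing. A minimal sketch, with the enum repeated so the snippet stands alone (the `is_empty` helper is my addition):

```rust
pub enum Data {
    Ubyte(Vec<u8>),
    Int(Vec<i32>),
    Double(Vec<f64>),
}

impl Data {
    /// Length of the inner Vec, whatever its element type is.
    pub fn len(&self) -> usize {
        match self {
            Data::Ubyte(v) => v.len(),
            Data::Int(v) => v.len(),
            Data::Double(v) => v.len(),
        }
    }

    /// Convenience helper built on len.
    pub fn is_empty(&self) -> bool {
        self.len() == 0
    }
}
```

Every arm is identical apart from the variant name, which is exactly the boilerplate that grows with the number of variants.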

I generally try to avoid these situations, but if a large number of them pop up and I have no alternative, I do have a technique for dealing with the boilerplate. Basically, the goal is to implement four functions as_ref, as_mut, map and fold that together serve 99% of use cases.

I’ll write something up about this technique next, even if only to give myself something to link to from other threads. I’ll warn you though: it can be pretty costly in terms of syntax, so it’s only useful if you have an extremely large number of variants.

Basically I can’t picture a scenario in which you have to match over each individual value.

Now, after thinking about it more, I completely agree that it makes little sense to match over each individual value. At the beginning I was thinking that I would read the content into the Vec<u8>, as I wrote, and then, when I need an individual value from the vector, my index() function would provide the enum.

But now, after reading your code, of course it makes much more sense to convert the content into a Vec of the corresponding type at the beginning and then have an enum of vectors.

Actually, when I first started on this I wrote it in C++ just to test (I’m new to Rust), and there I did put the content into a vector of the right type. However, I did not write the general version for all possible types, only for i32, so I did not realise that an enum of individual values is a bad solution. Thanks for pointing it out.

It also clearly doesn’t reflect your own usage because it doesn’t typecheck. (x has different types in different branches!)

I completely agree.

ExpHP:

Here’s the next response I alluded to.

I ended up not getting to some of the crazier stuff, as I decided to start with macros (which are actually a pretty ergonomic solution!).
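
To give a flavour of what a macro-based approach might look like, here is a small sketch. The macro name `with_vec!` and the helper functions are hypothetical and are not claimed to match the linked write-up. The macro expands one body into every arm of the match, so the body has to typecheck for each element type.

```rust
#[derive(Debug, PartialEq)]
pub enum Data {
    Ubyte(Vec<u8>),
    Int(Vec<i32>),
    Double(Vec<f64>),
}

/// Hypothetical helper: binds the inner Vec to `$v` and evaluates
/// `$body` once per variant, so the match is written only here.
macro_rules! with_vec {
    ($data:expr, $v:ident => $body:expr) => {
        match $data {
            Data::Ubyte($v) => $body,
            Data::Int($v) => $body,
            Data::Double($v) => $body,
        }
    };
}

/// Length of the inner Vec without naming its element type.
pub fn len(data: &Data) -> usize {
    with_vec!(data, v => v.len())
}

/// Reverses the inner Vec in place, whatever it holds.
pub fn reverse(data: &mut Data) {
    with_vec!(data, v => v.reverse())
}
```

Adding a variant then means touching the enum and the macro, while helpers such as `len` and `reverse` stay unchanged.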

Thanks a lot. This is really helpful. Now I’m going to dig into details.