I was looking into Lucene and noticed that it has a SortedDocValues class. Does anybody know if Elasticsearch's doc_values already use this class, or if Elastic plans on implementing anything like this in future versions?

Being able to store field data pre-sorted would drastically improve query speed in my use case.

That would be an awesome feature. I just found out that sort-scrolling anything with 10+ million documents (with doc_values) is so slow that doing it doesn't make sense: 30M takes hours and 100M takes hundreds of hours. And I don't think that scaling up horizontally or vertically would help...

Yes, SortedDocValues/SortedSetDocValues are what Elasticsearch uses to store doc values on not_analyzed string fields.

However, I'm not sure it does what you think it does: this class just maps every unique value to an ordinal, and then every document to the ordinals of the values that it contains. We later use those ordinals for sorting and aggregations.
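
To make the ordinal idea concrete, here is a rough Python sketch of the concept. It is only a toy model under my own assumptions, not Lucene's actual on-disk format or API: the function name `build_ordinals` and the sample data are made up for illustration. The unique values are sorted and numbered, and each document stores the numbers of its values rather than the strings.

```python
def build_ordinals(docs):
    """Toy model of the ordinal mapping (hypothetical helper, not a Lucene API).

    docs[i] is the list of string values held by document i.
    Returns (terms, doc_ords): the sorted unique terms, and for each
    document the sorted ordinals of the values it contains.
    """
    # Sort the unique values once; a term's position is its ordinal.
    terms = sorted({value for values in docs for value in values})
    ord_of = {term: i for i, term in enumerate(terms)}
    # Each document keeps ordinals instead of the strings themselves.
    doc_ords = [sorted(ord_of[v] for v in set(values)) for values in docs]
    return terms, doc_ords


# Made-up sample data: doc 1 has two values, doc 3 has none.
docs = [["paris"], ["berlin", "paris"], ["amsterdam"], []]
terms, doc_ords = build_ordinals(docs)

# Sorting compares small integers instead of strings, but it still has to
# look at every document. Documents without a value go last.
sorted_ids = sorted(
    range(len(docs)),
    key=lambda d: (not doc_ords[d], doc_ords[d][:1]),
)

print(terms)
print(doc_ords)
print(sorted_ids)
```

Note that in this model the documents themselves are not stored in sorted order. Only the dictionary of unique values is sorted, so comparisons get cheaper, but a sort over many documents still touches each of them.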