Edit: there is a small bug in the sorting that I need to fix if the list is presorted.
This may look short and sweet, but it does a double-sided selection sort using only low-level operations, and it should work on C arrays of any type that supports the <, > and == operators (basically no weird structs).
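The original code is not included in this post, so here is only a minimal sketch of what a double-sided selection sort can look like, assuming an int array. The name dsel_sort and the local buf are hypothetical stand-ins, not the author's identifiers.

```c
#include <stddef.h>

/* Hypothetical sketch: each pass finds both the minimum and the maximum of
 * the unsorted middle, then moves them to the left and right ends. */
static void dsel_sort(int *a, size_t n)
{
    if (n < 2) return;
    size_t lo = 0, hi = n - 1;
    while (lo < hi) {
        size_t min = lo, max = lo;
        for (size_t i = lo + 1; i <= hi; i++) {
            if (a[i] < a[min]) min = i;
            if (a[i] > a[max]) max = i;
        }
        int buf = a[lo]; a[lo] = a[min]; a[min] = buf;
        /* If the maximum was sitting at lo, the swap above moved it to min. */
        if (max == lo) max = min;
        buf = a[hi]; a[hi] = a[max]; a[max] = buf;
        lo++;
        hi--;
    }
}
```

The max == lo check is the easy thing to get wrong here: without it, an input whose largest element starts at the left end gets scrambled by the second swap.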

Note: you can eliminate the need to pass a buffer (buf) of the same type if you replace the buffer swaps with xor swaps (integer types only, and only when a and b are different objects), which look something like:
a^=b;b^=a;a^=b;
or use memcpy with a generic buffer like char[sizeof(array[0])]
or use assembly such as in MIT licensed musl libc:
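(The musl assembly is not reproduced in this post.) For the first two options, a rough sketch in plain C follows. SWAP and xor_swap are made-up names for illustration, and both operands are assumed to have the same type.

```c
#include <string.h>

/* Hypothetical macro: swap two objects of the same type through a byte
 * buffer, so the caller does not have to pass a typed buf. */
#define SWAP(a, b) do {                      \
    unsigned char swap_tmp[sizeof(a)];       \
    memcpy(swap_tmp, &(a), sizeof(a));       \
    memmove(&(a), &(b), sizeof(a));          \
    memcpy(&(b), swap_tmp, sizeof(a));       \
} while (0)

/* XOR swap: integer types only. If both pointers refer to the same object,
 * the value would be zeroed, hence the guard. */
static void xor_swap(unsigned *a, unsigned *b)
{
    if (a == b) return;
    *a ^= *b;
    *b ^= *a;
    *a ^= *b;
}
```

The memcpy version works for floats and structs too; the xor version does not, and compilers usually turn the memcpy calls into plain register moves for small types anyway.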

Here is an example that works with file extensions to get the MIME type (though you could rearrange/replace the case statements with anything you want). See libmagic for a content-based solution; this just uses the extension and doesn't open or even stat the file.

Code:

/** in order to compare strings you can't directly use a switch like:
* switch (string){ case "mystring" : ... }
* because switch only works on int types (char, short, long, long long)
* so we bit shift the strings into a compatible int type
*
**/
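The code that followed this comment is missing from the post. As a rough illustration of the idea only, the sketch below packs up to four characters into a uint32_t so that both the input and the case labels are integers. PACK4, pack4 and mime_of are hypothetical names, and the three MIME entries are just examples.

```c
#include <stdint.h>

/* Builds an integer constant from four characters; usable as a case label. */
#define PACK4(a, b, c, d) \
    (((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
     ((uint32_t)(c) << 8)  |  (uint32_t)(d))

/* Runtime counterpart: shifts up to four characters of s into an integer.
 * Shorter strings end up with leading zero bytes. */
static uint32_t pack4(const char *s)
{
    uint32_t v = 0;
    for (int i = 0; i < 4 && s[i]; i++)
        v = (v << 8) | (unsigned char)s[i];
    return v;
}

/* Maps a lower-case extension (without the dot) to a MIME type. */
static const char *mime_of(const char *ext)
{
    switch (pack4(ext)) {
    case PACK4('h', 't', 'm', 'l'): return "text/html";
    case PACK4(0, 'p', 'n', 'g'):   return "image/png";
    case PACK4(0, 0, 'g', 'z'):     return "application/gzip";
    default:                        return "application/octet-stream";
    }
}
```

Each case costs one integer comparison (or a jump table entry) instead of a strcmp() call.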

Your "file"-code really is nice! Thanks. It's extremely fast. I have used hexd in combination with a shell script in pUPnGO for some time. Maybe a combination of the two could yield an even better replacement for file ... and avoid calling the shell/other bins ... attached is the shell script that is to be used with the hexd binary.
Some timings:
Your file:

the "file" implementation was just a simple test case. If that was its full purpose, I never would have bothered. Where it does matter is on a web server with thousands of clients in order to tell the browser what kind of file it is about to send (from all of the web servers I checked, that code was a bottleneck) ... in this case, the difference between 1/100th(+) of a second and 1/1000th(-) can make a _HUGE_ difference

Story:
I was thinking about writing my own web server and wanted it to be small, fast, AND functional ... nweb seemed to be the most basic starting point, but it only accepted a few file types and the code wasn't very efficient (sometimes space efficient code does not equate to efficient code) so I set out to optimize it as follows
but...
a series of if strcasecmp ... has a lot of underlying calls in each one:
recalling my switch case trick used for the first 4 or 8 characters only needed to do 1 check since the characters get compiled to an integer constant ... seemed like a good starting point

problems:
1. need the end of a variable length string
2. has to be 8 characters long
3. needs to be case insensitive

1 (this is actually multiple problems)
I could use strlen(s) and store it in an int ... and use it to work backwards, but after looking at how various strlen implementations worked, it was obvious that it would be better to set up my own counter variable to loop to the end of the string, meaning I could use that counter as my length variable and even reuse it as my counter.

2. I didn't want to overwrite anything in the original string so I needed a char *buffer to keep a lower case 8 character string. I already had the end of the string stored, so all I needed to do was work backward and copy the last 8 chars to the buffer. I also needed to pad zeroes to the beginning so that it worked for short files such as aa.gz

3. rather than setting up a second buffer and doing a tolower(), it seemed like a pretty good idea to lower case it as it was copied over ... the most obvious way is to check if the char falls in the A-Z range and add +('a'-'A') ... which would reduce to a constant and be sufficiently fast, but A and a differ by exactly one bit (0x20, see ascii table) so I figured there was _some_ bitwise operation that would work (bit ops are typically at least as fast as an add) thus the | 32 in
( 'A' <= c && c <= 'Z' ) ? (c | 32) : (c)
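Putting the three steps together might look roughly like the sketch below. It is a variation rather than the posted code: instead of a separate char buffer it shifts the lower-cased tail straight into a uint64_t, and the name tail8_lower is made up.

```c
#include <stdint.h>

/* Returns the last (up to) 8 characters of s, lower-cased, packed into one
 * integer. Names shorter than 8 characters keep leading zero bytes. */
static uint64_t tail8_lower(const char *s)
{
    const char *end = s;
    while (*end) end++;                       /* step 1: own loop, no strlen() */

    const char *p = (end - s > 8) ? end - 8 : s;  /* step 2: last 8 chars */
    uint64_t v = 0;
    for (; p < end; p++) {
        unsigned char c = (unsigned char)*p;
        if ('A' <= c && c <= 'Z') c |= 32;    /* step 3: set the 0x20 bit */
        v = (v << 8) | c;
    }
    return v;
}
```

With this, "aa.GZ" and "aa.gz" give the same value, and a long path only ever contributes its last 8 characters.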

Note: initially I used a char* ret value to hold the type, but it is more efficient (code size and compiler wise both) to just return int as a char* function

feel free to use it in a file redo (that is basically what the main() function is) or webserver or in any other code that uses a large series of strcmp() calls.

here are a bunch of alternative string functions that I wrote (the ones I most commonly encounter), just remove the T-prefix and use them instead of including <string.h>

most are an order of magnitude smaller than their libc counterparts (including smaller libc implementations), relying on simple design and the compiler for speed improvements (some ended up being faster and none appeared to be significantly slower). They may prove less efficient for large strings because I didn't do any casting tricks to compare 4 or 8 characters at a time, mainly due to the extra logic taking up nearly as many instructions as just leaving it simple and letting the compiler help.
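The attached functions are not part of this post. To give an idea of the "keep it simple and let the compiler help" style, here is a sketch of two such functions; only the T-prefix naming comes from the post, the bodies are my own guess at a byte-at-a-time version.

```c
#include <stddef.h>

/* Byte-at-a-time length: walk to the terminating NUL. */
static size_t Tstrlen(const char *s)
{
    const char *p = s;
    while (*p) p++;
    return (size_t)(p - s);
}

/* Byte-at-a-time compare: stop at the first difference or at the end of a.
 * Comparing as unsigned char matches what the standard strcmp() does. */
static int Tstrcmp(const char *a, const char *b)
{
    while (*a && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```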

strerror is typically unnecessarily large, storing as much as 10kb of unused string constants in global data. I condensed these down to a set of macros that only store the necessary error strings. While I was at it, I threw in a set of enums for the error codes so you can use it without any system includes.
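The macros themselves are not shown here. One way to get a similar effect is a single list macro that generates both the enum and the message lookup, so only the errors you list take up space. Everything named below (T_ERRORS, T_EPERM, Tstrerror and so on) is hypothetical, and the numeric values are the usual Linux errno numbers, which is an assumption about the target platform.

```c
/* Edit this list to contain only the errors the program actually reports. */
#define T_ERRORS(X)                               \
    X(T_EPERM,   1, "Operation not permitted")    \
    X(T_ENOENT,  2, "No such file or directory")  \
    X(T_EACCES, 13, "Permission denied")

/* Error codes as enums, so no system header is needed. */
enum {
#define X(name, num, msg) name = num,
    T_ERRORS(X)
#undef X
};

/* Only the strings listed in T_ERRORS end up in the binary. */
static const char *Tstrerror(int e)
{
    switch (e) {
#define X(name, num, msg) case name: return msg;
    T_ERRORS(X)
#undef X
    default: return "Unknown error";
    }
}
```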

I see that the musl source has "FIXME" there (ie, patches welcome)...
...What is the license on this?
(if it's ok and compatibly licensed, I'd be willing to try working it into a patch)

That would be fine.

This work is released to Public Domain.
In locales that do not recognize public domain it is:
Copyright Brad Conroy 2012, permission is hereby granted to use this work in accordance with any license approved by the Open Source Initiative for any purpose without restriction in perpetuity.

I have a note here about switching to a do-while loop instead on the last 2 loops - (for non-numerical matches) ...

btw, one thing I would be interested in seeing in musl is a platform independent version with no assembly (or just basic assembly common to nearly all platforms) not just for ease of porting, but this will be an absolute godsend for JIT compilation like llvm/clang especially once the opencl (using the gpu) stuff stabilizes because it would allow the same compiled bytecode to be run on any system using only a small bytecode->machine code compiler

I'm not sure that noasm is practical...

When I submitted the patch to Rich, he pointed out that there's an integer overflow risk. We discussed the method and came up with another method (probably faster thanks to the lack of multiplication):
