This originally was a problem I ran into at work, but is now something I'm just trying to solve for my own curiosity.

I want to find out if int 'a' contains the int 'b' in the most efficient way possible. I wrote some code, but it seems no matter what I write, parsing it into a string and then using indexOf is twice as fast as doing it mathematically.

So although this isn't really required for me to complete my work, I was just wondering if anyone could think of any way to further optimize my way of doing it mathematically, or an entirely new approach altogether. Again memory is no problem, I am just shooting for sheer speed.

I'm really interested to see or hear anything anyone has to offer on this.

EDIT: When I say contains I mean can be anywhere, so for example, findMatch(1234, 23) == true

EDIT: For everyone saying that this crap is unreadable and unnecessary: you're missing the point. The point was to get to geek out on an interesting problem, not come up with an answer to be used in production code.

As written, your question is impossible to answer. Trim it down to the bare essentials
–
1800 INFORMATION, Oct 23 '08 at 23:26

It's interesting that the string version is faster, since doesn't the toString of a number have to do similar shift/mod/div operations to turn the number into its digits?
–
John Gardner, Oct 24 '08 at 18:17

Since 300 characters is far too little to make an argument in, I'm editing this main post to respond to Pyrolistical.

Unlike the OP, I wasn't that surprised that a native compiled indexOf was faster than Java code with primitives. So my goal was not to find something I thought was faster than a native method called zillions of times all over Java code.

The OP made it clear that this was not a production problem and more along the lines of an idle curiosity, so my answer solves that curiosity. My guess was that speed was an issue, when he was trying to solve it in production, but as an idle curiosity, "This method will be called millions and millions of times" no longer applies. As he had to explain to one poster, it's no longer pursued as production code, so the complexity no longer matters.

Plus it provides the only implementation on the page that manages to find the "123" in "551241238", so unless correctness is an extraneous concern, it provides that. Also the solution space of "an algorithm that solves the problem mathematically using Java primitives but beats optimized native code" might be EMPTY.

Plus, it's not clear from your comment whether or not you compared apples to apples. The functional spec is f(int, int) -> boolean, not f(String, String) -> boolean (which is kind of the domain of indexOf). So unless you tested something like this (which could still beat mine, and I wouldn't be awfully surprised), the additional overhead might eat up some of that excess 40%.
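For an apples-to-apples comparison, the string approach would have to pay for both conversions on every call. A minimal sketch of such a baseline follows; the class and method names are made up for illustration and this is not necessarily the exact code that was benchmarked:

```java
final class StringBaseline {
    // Hypothetical baseline with the f(int, int) -> boolean shape:
    // both ints are converted to strings on every call before indexOf runs.
    static boolean containsViaString(int a, int b) {
        return String.valueOf(a).indexOf(String.valueOf(b)) >= 0;
    }
}
```

Timing this against a math-only version measures the conversion cost as well as the search itself.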

If you read the question and Nalandial's comments, you'll see he is looking for a faster replacement. So yes, the correct answer needs to be faster not just be correct.
–
Pyrolistical, Oct 29 '08 at 17:30

The string way should be faster, because your problem is textual, not mathematical. Notice that your "contains" relationship says nothing about the numbers themselves; it only says something about their decimal representations.

Notice also that the function you want to write will be unreadable - another developer will never understand what you are doing. (See what trouble you had with that here.) The string version, on the other hand, is perfectly clear.

then the challenge i'm proposing just for curiosity's sake: make the mathematical one faster! :D
–
Alex Beardsley, Oct 24 '08 at 0:00

oh i'm well aware that it's completely unreadable. again as i mentioned this is just for curiosity's sake and won't be going in any production code. just giving a chance for some people to geek out :P
–
Alex Beardsley, Oct 24 '08 at 16:52

I understand. I only pointed out one aspect and did not answer the main question. As to the main question, tvanfosson pointed out the only hope I see of getting extra speed: convert a to decimal only as far as you need. If the match is there, you save some time.
–
buti-oxa, Oct 25 '08 at 16:18

The only optimization that I can think of is to do the conversion to string on your own and compare digits (right to left) as you do the conversion. First convert all the digits of b, then convert a from the right until you find a match on the first digit of b (from the right). Compare until all of b matches or you hit a mismatch. If you hit a mismatch, backtrack to the point where you started matching the first digit of b, advance in a, and start over.

IndexOf will have to do basically the same back tracking algorithm, except from the left. Depending on the actual numbers this may be faster. I think if the numbers are random, it should be since there should be many times when it doesn't have to convert all of a.
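The steps above can be sketched roughly as follows. This is an illustrative reading of the idea rather than a tuned implementation; the class name is made up, and non-negative inputs are assumed:

```java
final class LazyDigitMatch {
    // Sketch: convert b fully, then peel digits off a from the right
    // only as far as needed. Assumes a >= 0 and b >= 0.
    static boolean contains(int a, int b) {
        if (a < 0 || b < 0) {
            throw new IllegalArgumentException("non-negative inputs assumed");
        }
        // Convert all digits of b once, least significant digit first.
        int[] bDigits = new int[10]; // an int has at most 10 decimal digits
        int bLen = 0;
        do {
            bDigits[bLen++] = b % 10;
            b /= 10;
        } while (b > 0);

        do {
            int rest = a; // working copy, so a mismatch can "backtrack" to a
            int i = 0;
            // Compare right to left; stop once a's digits run out
            // (rest == 0 only counts as a digit at the first position).
            while (i < bLen && rest % 10 == bDigits[i] && (rest > 0 || i == 0)) {
                rest /= 10;
                i++;
            }
            if (i == bLen) {
                return true; // every digit of b matched
            }
            a /= 10; // mismatch: advance one digit to the left and start over
        } while (a > 0);
        return false;
    }
}
```

When a match sits near the right end of a, the loop returns before the remaining digits of a are ever extracted.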

I was actually looking for a way to make the mathematical way of doing it faster than the String comparison, if such a way existed. As I said this has become more of a personal challenge on my own time and has gone away from what the project actually requires.
–
Alex Beardsley, Oct 23 '08 at 23:40

I like the idea. It saves some time, especially when a hit is probable. Plus, while the accepted solution limits both a and b to 16 digits, this one only limits the smaller number, b, in this way.
–
buti-oxa, Oct 25 '08 at 16:21

that's actually a really good point, and it does optimize it quite a bit. nice catch!
–
Alex Beardsley, Oct 24 '08 at 0:02

running more tests, this is great for cases where b is fairly large. when b is large, the string method takes the same amount of time whereas the math one beats it by a landslide!
–
Alex Beardsley, Oct 24 '08 at 0:11

This is an interesting problem. Many of String.class's functions are actually native, which makes beating String a difficult proposition. But here are some helpers:

TIP 1: Different simple integer operations have different speeds.

Quick measurements in sample programs showed:

% ~ T
* ~ 4T
/ ~ 7T

So you want to use as little division as possible, in favor of multiplication or modulo. Not shown are the subtraction, addition, and comparison operators, because they blow all of these out of the water. Also, using "final" as much as possible allows the JVM to do certain optimizations. Speeding up your "getLength" function:
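As an illustration of the tip, a digit count can be written with comparisons only, avoiding division entirely. This is a minimal sketch that assumes non-negative input; it borrows the "getLength" name from the question, but the body is a guess at the shape and not necessarily the code this answer had in mind:

```java
final class DigitCount {
    // Sketch: count decimal digits using only comparisons.
    // Assumes n >= 0; an int has at most 10 decimal digits.
    static int getLength(final int n) {
        if (n < 10) return 1;
        if (n < 100) return 2;
        if (n < 1000) return 3;
        if (n < 10000) return 4;
        if (n < 100000) return 5;
        if (n < 1000000) return 6;
        if (n < 10000000) return 7;
        if (n < 100000000) return 8;
        if (n < 1000000000) return 9;
        return 10;
    }
}
```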

I'm guessing that "getLength" isn't called anywhere else, so while it might be nice to have a separate function, from an optimization standpoint it's an unnecessary method call and an unnecessary creation of the object "len". We can put that code right where we use it.

Also, note I changed the bottom while loop to also include "a <= b". I haven't tested that and not sure if the per-iteration penalty beats the fact that you don't waste any iterations. I'm sure there's a way to get rid of the division using clever math, but I can't think of it right now.

This in no way answers your question, whatsoever, but it's advice anyway :-)

The method name findMatch is not very descriptive. In this case, I'd have a static method ContainerBuilder.number(int), which returned a ContainerBuilder, which has the method contains on it. In this way your code becomes:

boolean b = number(12345).contains(234);
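A minimal sketch of what such a builder could look like. The ContainerBuilder, number, and contains names come from the suggestion above; the string-based body is only a placeholder assumption, and any of the matching strategies discussed on this page could sit behind the same method:

```java
final class ContainerBuilder {
    private final int value;

    private ContainerBuilder(int value) {
        this.value = value;
    }

    // Static factory, so call sites read as number(12345).contains(234)
    // after a static import.
    static ContainerBuilder number(int value) {
        return new ContainerBuilder(value);
    }

    // Placeholder implementation: checks the decimal representations.
    boolean contains(int part) {
        return String.valueOf(value).contains(String.valueOf(part));
    }
}
```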

Just some advice for the long run!

Oh yes, I meant to say also, you should define what you mean by "contains"

@Jonathan: I was about to write the same thing :)
–
abahgat, Oct 23 '08 at 23:34

Yes, well done. LOL! Just because, in my day job, I endlessly come across huge methods full of undocumented code called things like findMatch I'm totally unqualified to post a little (friendly) general advice. Oh and thanks for putting some suggestions in my mouth too!
–
oxbow_lakes, Oct 23 '08 at 23:41

Nalandial - no, this looks like a homework assignment. Hence the general advice for the future.
–
oxbow_lakes, Oct 23 '08 at 23:41

Is there any way to calculate this in binary? Obviously, the binary value of one integer containing the binary value of another doesn't mean that the decimal representations do the same. However, is there some kind of binary trickery that could be used? Maybe convert a number like 12345 to 0001 0010 0011 0100 0101, and then do some bit shifting to figure out whether 23 (0010 0011) is contained in there. Because your character set is only 10 characters, you could cut down the computation time by storing 2 character values in a single byte.

EDIT

Expanding on this idea a bit: if you have 2 integers, A and B, and want to know if A contains B, you check 2 things first. If A is less than B, then A cannot contain B. If A = B, then A contains B. At this point you can convert them to strings*. If A has the same number of digits as B, then A does not contain B unless they are equal, but we wouldn't be here if they were equal, so if both strings are the same length, A does not contain B. At this point, A will be longer than B. So now you can convert the strings to their packed binary values, as I noted in the first part of this post. Store these values in an array of integers. Now you do a bitwise AND of the integer values in your array, and if the result is A, then A contains B. Now you shift the array of integers for B to the left by 4 bits and do the comparison again. Do this until you start popping bits off the left of B.

*That * in the previous paragraph means you may be able to skip this step. There may be a way to do this without using strings at all. There might be some fancy binary trick you can do to get the packed binary representation I discussed in the first paragraph. There should be some binary trick you can use, or some quick math which will convert an integer to the decimal value I discussed before.
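A rough sketch of the packed-digit idea, assuming non-negative inputs. Two details are choices made for this illustration and are not taken from the text above: the packed value is held in a single long instead of an array, and the comparison masks out a window of A and tests it for equality with B, because a plain AND can report false matches (0x77 & 0x23 is 0x23, yet 77 does not contain 23). The class and method names are made up:

```java
final class BcdContains {
    // Pack decimal digits into 4-bit nibbles, e.g. 12345 -> 0x12345.
    static long toBcd(int n) {
        long packed = 0;
        int shift = 0;
        do {
            packed |= (long) (n % 10) << shift;
            shift += 4;
            n /= 10;
        } while (n > 0);
        return packed;
    }

    // Number of nibbles (decimal digits) in a packed value; 0 counts as one.
    static int nibbleCount(long packed) {
        int count = 1;
        while ((packed >>>= 4) != 0) {
            count++;
        }
        return count;
    }

    static boolean contains(int a, int b) {
        if (a < 0 || b < 0) {
            throw new IllegalArgumentException("non-negative inputs assumed");
        }
        if (a < b) return false; // a smaller number cannot contain a larger one
        if (a == b) return true;
        long pa = toBcd(a);
        long pb = toBcd(b);
        int lenA = nibbleCount(pa);
        int lenB = nibbleCount(pb);
        long mask = (1L << (4 * lenB)) - 1; // window as wide as B
        // Slide the window over A one digit (4 bits) at a time.
        for (int i = 0; i + lenB <= lenA; i++) {
            if (((pa >>> (4 * i)) & mask) == pb) {
                return true;
            }
        }
        return false;
    }
}
```

Shifting A right under a fixed mask is equivalent to shifting B left as described; the packing step still uses % and /, so this does not yet avoid the conversion cost.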

i was thinking about this, but i honestly couldn't think of a way to do it.
–
Alex Beardsley, Oct 24 '08 at 0:13

Isn't this almost exactly what the "convert to string" plan does, albeit with a small factor memory savings since you make BCD strings instead of ASCII/UTF-8/UTF-16/whatever strings?
–
Doug McClean, Oct 24 '08 at 4:03

Can I ask where you're using this function in your code? Maybe there's another way to solve the problem it is currently solving which would be much faster. This could be like when my friend asked me to completely re-tune his guitar, and I did it before realizing I could have just lowered the bottom string by a whole step and gotten an equivalent result.

originally it was because the client wants to be able to search on a unique id, but they want to have a "contains" searchability like the one described here. however it also needs to have > and < functionality, and > on a string != > on a number.
–
Alex Beardsley, Oct 24 '08 at 15:01

Ugh! My personal guess is that the negative point is for SPAM? Anyway, I only recommended another tool (like Stack Overflow) that usually works. Is that bad?
–
David Santamaria, Oct 24 '08 at 8:44

because it looks like SPAM. you didn't give enough information about what the link is. comments of the form "Hi [link] could work" look like spam to just about everyone.
–
John Gardner, Oct 24 '08 at 18:16