You ever apply for a job and they make you do a little programming question? These are fun sarcasm because they usually have some math involved and it provides me a small and unimportant challenge - similar to a video game.
The Question
Find a 9 letter string of characters that contains only letters from
acdegilmnoprstuw
such that the hash(the_string) is
910897038977002
if hash is defined by the following pseudo-code:
Int64 hash (String s) {
Int64 h = 7
String letters = "acdegilmnoprstuw"
for(Int32 i = 0; i < s.length; i++) {
h = (h * 37 + letters.indexOf(s[i]))
}
return h
}
For example, if we were trying to find the 7 letter string where hash(the_string) was 680131659347, the answer would be “leepadg”.)
My Answer
First I need to define the alphabet we can pick letters from. The bigger the alphabet the more difficult it is to brute force an answer. I wrote a brute force approach as a comparasion and just in case I can’t figure this out solution, but I ended up solving the problem the non-brute force way before execution was finished.
That was just luck though because the answer starts with an ‘a’ which happens to be one of the first characters of the alphabet. If our answer had started with a ‘w’ then we would have searched about 15^9 (over 38 billion) different hash combinations which would take many lifetimes to accomplish.
It’s never good to have code that dramatically changes execution time depending on input, especially code that takes 100s of years to complete, so instead we are going to use math and reverse engineer the hashing algorithm so that we can find answers in constant time regardless of input.
So here is our given alphabet.
var alphabet = "acdegilmnoprstuw";
Next I establish the base we are in. There are fifteen letters in this alphabet so we are using the mathematical base 15 (HEX).
var base = alphabet.length;
Next I duplicate the hash function from the question. I use this to hash various results and to also test that hashing the string leepadg returns 680131659347.
function hash(the_string)
{
var h = 7;
for (var i = 0; i < the_string.length; i++)
{
h = (h * 37 + alphabet.indexOf(the_string[i]));
}
return h;
}
The mathematical formula for this hash function expanded is
{% blockquote %} 37^9 * 7 + 37^8 * s1 + 37^7 * s2 + … + 37^2 * s7 + 37^1 * s8 + 37^0 * s9 = 910897038977002 {% endblockquote %}
where s1 is the first character in our 9 digit string and s9 is the last. The value of s1 thru s9 cannot be greater than 15 or less than 0 either.
Now the real magic comes in. All I am going to do is cut down the number of searches needed to find the answer dramatically. I still use hash
function though, instead of searching for billions of solutions, we smartly guide our search with this algorithm below.
We start with the smallest possible hash the answer could be. Since the letter a is the smallest letter in the alphabet there can be no string before the string aaaaaaaaa
. Now if I treat this string like it is some very large number in the base of 15 of our alphabet then I start at the very highest power which is the very first ‘a’. If you don’t get this let me explain in further detail.
It might be strange to see aaaaaaaaa
as a number. But what if this was written as 111111111
instead? The highest power is the first 1 which is 1 * 10^9 = 10000000. Since we are treating letters of our alphabet as a set of numbers we can do the same thing. We are used to dealing with the set of real numbers, but in this we only need to worry about the set of characters inside of our alphabet.
Let A = 910897038977002. We were given this number and it is the one we are trying to crack.
So if we look at A - hash('aaaaaaaaa')
there are three possible outcomes. The result is bigger than 0, smaller than 0 or exactly 0.
So let’s talk about each one of those possibilities.
- If the answer is 0 we are finished, and we have found our solution.
- If the answer here is less than 0 then there is no possible solution.
- If the answer here is more than 0 then we need to continue looking for solutions.
Our job is to traverse through every letter from beginning to end and find the highest letter possible for each power. This will give us an answer if there is a solution. In this case A - hash('aaaaaaaaa') > 0
but A - hash('caaaaaaaa') < 0
so we know the first letter must begin with an a
. If this doesn’t make sense then allow me to explain differently.
Let’s play this algorithm out but imagine we are dealing with just simple normal numbers. We have the answer 422. We want to find the number 422. So we start with the lowest number 000. Next we ask ourselves a series of questions in this order
000 < 422 (yes) 100 < 422 (yes) 200 < 422 (yes) 300 < 422 (yes) 400 < 422 (yes) 500 < 422 (no)
Now on to the next power
410 < 422 (yes) 420 < 422 (yes) 430 < 422 (no)
Now to the smallest power
421 < 422 (yes) 422 = 422 (solution found)
This gives you an idea of the alogrithm we are using but the only difference is that we make use of the hash(string)
function each time we do our comparison. This is essentially much like prime factorization, especially since the number 37 is a prime number and we are subtracting powers of 37 away from the answer in order to find our reverse hash. So let’s look at the implementation
function reverse_hash(A)
{
var answer = 'aaaaaaaaa';
// subtract away the max amount possible without going negative
// we start with 'aaaaaaaaa' and work our way up, one letter at
// a time (from beginning 'a').
for (var column = 0; column < 9; column++)
{
var charAt = 0;
var next = answer;
// we cannot ever go more than the base of our alphabet
while (charAt < base + 1)
{
// replace the letters at the given column
next = next.replaceAt(column, alphabet[charAt++]);
// we did not go negative so we can update our answer
if (A - hash(next) >= 0)
{
answer = next;
}
else
{
break; // go to the next column in answer
}
}
}
return answer;
}
Finally I add a helper function to the String.prototype to replace characters. I normally wouldn’t mess with String.prototype but since I am not using this application on a large scale it doesn’t matter.
String.prototype.replaceAt = function(index, letter)
{
return this.substr(0, index)
+ letter
+ this.substr(index+letter.length);
}
To find the reversed hash answer in just milliseconds we can use our new reverse_hash
command.
var answer = reverse_hash(910897038977002);
And that is it. Oh and if you are wondering what the answer is: asparagus
There is probably another 20 ways to solve this problem but this is the way I chose. Did you actually make it here? Wow… I figured I would have bored most people off by this line.