<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Not Really a Blog &#187; jobs</title>
	<atom:link href="http://blog.notreally.org/tag/jobs/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.notreally.org</link>
	<description>or is it?</description>
	<lastBuildDate>Tue, 13 Jul 2010 09:48:09 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.notreally.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://1.gravatar.com/blavatar/93018cad14db97a3057eb332c3ba920a?s=96&#038;d=http://s2.wp.com/i/buttonw-com.png</url>
		<title>Not Really a Blog &#187; jobs</title>
		<link>http://blog.notreally.org</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.notreally.org/osd.xml" title="Not Really a Blog" />
	<atom:link rel='hub' href='http://blog.notreally.org/?pushpress=hub'/>
		<item>
		<title>Computer puzzles in job interviews</title>
		<link>http://blog.notreally.org/2006/01/31/computer-puzzle-in-job-interviews/</link>
		<comments>http://blog.notreally.org/2006/01/31/computer-puzzle-in-job-interviews/#comments</comments>
		<pubDate>Tue, 31 Jan 2006 20:20:00 +0000</pubDate>
		<dc:creator>jroncero</dc:creator>
				<category><![CDATA[Computers]]></category>
		<category><![CDATA[Life]]></category>
		<category><![CDATA[interviews]]></category>
		<category><![CDATA[jobs]]></category>

		<guid isPermaLink="false">http://blognotreally.wordpress.com/2006/01/31/computer-puzzle-on-job-interviews/</guid>
		<description><![CDATA[Yesterday, I went to a job interview. I was asked some difficult problems and ways to solve them. One of them was: We have a list of a million phone numbers on the standard input and we have a reduced memory pc which we want to use it to sort them and use to check [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday I went to a job interview, where I was given some difficult problems and asked how I would solve them. One of them was: we receive a list of a million phone numbers on standard input, and we want to use a PC with very little memory to sort them and, later, to check whether any given number is in that list.</p>
<p>The solution offered was to use an array of bits indexed by the phone number itself: b[1234] == 0 would mean that the phone number 1234 wasn&#8217;t in the input list, and b[1234] == 1 would mean that it was.</p>
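<p>To make the idea concrete, here is a minimal sketch of such a bit array in C. The <code>BitSet</code> type and the <code>bitset_*</code> function names are my own assumptions for illustration; the interviewers only described the idea, not an implementation.</p>

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: a set of numbers in [0, size), one bit each. */
typedef struct {
    unsigned char *bits; /* size bits, packed CHAR_BIT per byte */
    uint64_t size;       /* how many distinct numbers can be represented */
} BitSet;

/* Allocate a zeroed bit array; returns 0 on success, -1 if out of memory. */
int bitset_init(BitSet *b, uint64_t size)
{
    assert(size > 0);
    uint64_t bytes = (size + CHAR_BIT - 1) / CHAR_BIT; /* round up */
    if (bytes > SIZE_MAX)
        return -1;
    b->bits = calloc((size_t)bytes, 1);
    b->size = size;
    return b->bits != NULL ? 0 : -1;
}

/* Mark number n as present (the "b[n] = 1" of the text). */
void bitset_add(BitSet *b, uint64_t n)
{
    assert(n < b->size);
    b->bits[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
}

/* Return 1 if n was added before, 0 otherwise (the "b[n] == 1" test). */
int bitset_contains(const BitSet *b, uint64_t n)
{
    assert(n < b->size);
    return (b->bits[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1u;
}

/* Release the array. */
void bitset_free(BitSet *b)
{
    free(b->bits);
    b->bits = NULL;
    b->size = 0;
}
```

<p>Sorting comes for free with this layout: walking the array from index 0 upwards and printing every index whose bit is set yields the numbers in ascending order, with duplicates collapsed.</p>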
<p>I have been thinking about that solution for a while, and I would like to work through the reasoning behind it. So, let&#8217;s imagine we have 1 million phone numbers. If we choose the bit array approach then, given the English phone number system, we have numbers like this one:</p>
<p>01273 121212</p>
<p>That is eleven digits. I guess the initial 0 is always present, so we can omit it, which leaves us 10 digits to deal with. Supposing that no number starts with a 0 after the initial 0, the numbers fall in the range:</p>
<p>1000000000 to<br />
9999999999 (just under 10,000,000,000, that is, ten thousand million)</p>
<p>So there can be a total of 10,000,000,000 &#8211; 1,000,000,000 = 9,000,000,000 different numbers.</p>
<p>If you want an array of bits that records whether each number is in the list or not, you need an array of 9,000,000,000 elements, that is, 9,000,000,000 bits. That is 1,125,000,000 bytes, or about 1,073 MB: a little more than 1 GB, right?</p>
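<p>The arithmetic above can be double-checked with a few lines of C. This is only a sketch: the function names are made up for the example, and a megabyte is taken as 1,048,576 bytes.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed to keep one bit per number in the inclusive range [lo, hi]. */
uint64_t bit_array_bytes(uint64_t lo, uint64_t hi)
{
    assert(lo <= hi);
    uint64_t count = hi - lo + 1; /* how many distinct numbers */
    return (count + 7) / 8;       /* 8 bits per byte, rounded up */
}

/* Convert a byte count to megabytes of 1,048,576 bytes each. */
double bytes_to_mb(uint64_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

<p>Calling <code>bit_array_bytes(1000000000ULL, 9999999999ULL)</code> gives the 1,125,000,000 bytes quoted above, and <code>bytes_to_mb</code> turns that into roughly 1,073 MB.</p>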
<p>OK, that&#8217;s a lot of memory. And the thing is that you would need the whole array even in the best case (the one in which all the numbers are the same: one single number repeated one million times). So the memory consumption is fixed.</p>
<p>On the other hand, we could use an n-ary tree to hold all the information. Something like this:</p>
<pre>
   0  0  0
   1  1  1
   2  2  2
   3  3  3
   4 /4--4
   5/ 5  5
   6  6  6
   7  7  7
   8  8  8
   9  9  9
</pre>
<p>would mean that the number 544 (in a 3-digit scheme) exists.</p>
<p>For example, we could use a C struct like this one:</p>
<pre>
typedef struct Node Node;
struct Node {
    Node *array[10]; /* one slot per digit 0-9: null, or a pointer to the next node */
};
</pre>
<p>Each position of the array of pointers represents a digit and, at the same time, holds a pointer to the node for the next digit.</p>
<pre>
    +----+    +----+    +----+
    |null|    |null|    |null|
    +----+    +----+    +----+
    |null|    |null|    |null|
    +----+    +----+    +----+
    |null|    |null|    |null|
    +----+    +----+    +----+
    |null|    |null|    |null|
    +----+    +----+    +----+
    |    ------    \    |null|
    +----+    +----+\   +----+
    |null|    |null| \  |null|
    +----+    +----+  \ +----+
    |null|    |null|   \ XXX |
    +----+    +----+    +----+
    |null|    |null|    |null|
    +----+    +----+    +----+
    |null|    |null|    |null|
    +----+    +----+    +----+
    |    \    |null|    |null|
    +----+\   +----+    +----+
           \
            \   +----+    +----+
             \  |null|    |null|
              \ +----+    +----+
               \ null|    |null|
                +----+    +----+
                |null|   / XXX |
                +----+  / +----+
                |null| /  |null|
                +----+/   +----+
                |    /    |null|
                +----+    +----+
                |null|    |null|
                +----+    +----+
                |null|    |null|
                +----+    +----+
                |null|    |null|
                +----+    +----+
                |null|    |null|
                +----+    +----+
                |    \    |null|
                +----+\   +----+
                       \
                        \
                         \  +----+
                          \ |null|
                           \+----+
                             XXX |
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
                            |null|
                            +----+
</pre>
<p>So that would give us the numbers 446, 912 and 992 in a 3-digit scheme. We would initialise all the pointers to null. If the pointer at any position is not null, that digit is valid and the pointer leads to the following digit (the next array of pointers), so we use the pointer both to mark the digit and to point to the next structure. The only exception is the last node (last digit), which we could make point to an invalid address (represented here as XXX).</p>
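<p>Here is a rough sketch of how inserting a number into that tree could look in C. Two things are my own assumptions: the function names, and the way XXX is represented. Rather than a truly invalid address, the sketch uses the address of a dummy node that is never followed, which is easier to do portably.</p>

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node Node;
struct Node {
    Node *array[10]; /* one slot per digit; NULL means "digit not present" */
};

/* Assumption: the XXX marker is the address of a dummy node that is
 * never dereferenced, instead of a genuinely invalid address. */
static Node end_marker;
#define END_MARK (&end_marker)

/* Allocate a node with every pointer set to NULL; NULL if out of memory. */
Node *node_new(void)
{
    Node *n = malloc(sizeof *n);
    if (n != NULL)
        for (int d = 0; d < 10; d++)
            n->array[d] = NULL;
    return n;
}

/* Insert a string of decimal digits. All numbers are assumed to have the
 * same length. Returns 0 on success, -1 if memory runs out. */
int trie_insert(Node *root, const char *digits)
{
    Node *node = root;
    assert(digits[0] != '\0');
    for (size_t i = 0; digits[i] != '\0'; i++) {
        assert(digits[i] >= '0' && digits[i] <= '9');
        assert(node != END_MARK);       /* would mean mixed lengths */
        int d = digits[i] - '0';
        if (digits[i + 1] == '\0') {    /* last digit: store the marker */
            node->array[d] = END_MARK;
            return 0;
        }
        if (node->array[d] == NULL) {   /* first number with this prefix */
            node->array[d] = node_new();
            if (node->array[d] == NULL)
                return -1;
        }
        node = node->array[d];
    }
    return 0;
}
```

<p>Inserting the same number twice changes nothing, and numbers that start with the same digits reuse the nodes that already exist.</p>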
<p>So, I don&#8217;t know if I have made myself clear enough, but with this method, in the worst case we would have 1,000,000 different paths from the root to the end of a branch, meaning that we would need 1,000,000 * 10 (the number of digits in each phone number) nodes to hold all that information. That is 10,000,000 nodes. With this structure, each node takes about 40 bytes (ten 4-byte pointers), which gives a total of 400,000,000 bytes: approximately 381 MB, about 1/3 of the other approach. But that&#8217;s the worst-case scenario. In the average case, many numbers would be repeated or would share their leading digits, so the total size would be smaller. And in the best case (one number repeated one million times) we would only need 10 nodes.</p>
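<p>The same estimate as a small C function, in case you want to play with the figures. The function name is made up, and the pointer size is a parameter because the 40 bytes per node above only hold when a pointer takes 4 bytes.</p>

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case size in bytes of the tree, counting one node per digit of
 * every number, as in the estimate above. pointer_size depends on the
 * machine: 4 reproduces the 40-byte node of the text, 8 doubles it. */
uint64_t trie_worst_case_bytes(uint64_t numbers, uint64_t digits,
                               uint64_t pointer_size)
{
    assert(pointer_size > 0);
    uint64_t nodes = numbers * digits;       /* 1,000,000 * 10 */
    uint64_t node_bytes = 10 * pointer_size; /* ten pointers per node */
    return nodes * node_bytes;
}
```

<p>With 1,000,000 numbers, 10 digits and 4-byte pointers this returns 400,000,000. With 8-byte pointers the same count gives 800,000,000 bytes, which is still below the 1,125,000,000 bytes of the bit array.</p>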
<p>Searching this n-ary tree takes at most 10 steps, one per digit, if we need to check the whole number; but as soon as we find a null at any digit&#8217;s position we can stop there, so it often takes fewer than 10 steps, which is a good number anyway.</p>
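<p>A sketch of that search in C, with a step counter so the early exit is visible. As before, the function name and the dummy-node stand-in for XXX are assumptions of mine, not something given at the interview.</p>

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node Node;
struct Node {
    Node *array[10]; /* one slot per digit; NULL means "digit not present" */
};

/* Assumption: the XXX marker is the address of a dummy node. */
static Node end_marker;
#define END_MARK (&end_marker)

/* Returns 1 if the digit string is stored in the tree, 0 otherwise.
 * *steps receives how many digits had to be examined. */
int trie_contains(const Node *root, const char *digits, int *steps)
{
    const Node *node = root;
    *steps = 0;
    for (size_t i = 0; digits[i] != '\0'; i++) {
        assert(digits[i] >= '0' && digits[i] <= '9');
        const Node *next = node->array[digits[i] - '0'];
        (*steps)++;
        if (next == NULL)
            return 0;                     /* early exit: prefix not stored */
        if (next == END_MARK)
            return digits[i + 1] == '\0'; /* marker must be on the last digit */
        node = next;
    }
    return 0; /* ran out of digits before reaching a marker */
}
```

<p>The loop never runs more times than the number has digits, so for a 10-digit phone number the cost is bounded by 10 pointer lookups however many numbers are stored.</p>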
<p>The thing is: am I right or not?</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.notreally.org/2006/01/31/computer-puzzle-in-job-interviews/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/e39e820dfad61c10be3c1f2c7f9c2747?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">golan</media:title>
		</media:content>
	</item>
	</channel>
</rss>