Counting pairs of numbers with a given difference out of a set

This is just a post detailing the solution to a small algorithmic challenge from HackerRank. The exercise is to find the pairs of numbers out of a set of numbers N that have difference K. The link to the actual HackerRank problem can be seen here. We will see some different implementations and strategies to approach this problem and how they could be optimized even further.

This is just a post detailing the solution to a small algorithmic challenge from HackerRank. The exercise is to find the pairs of numbers out of a set of numbers N that have difference K. The link to the actual HackerRank problem can be seen here. I am not sharing my solution because it’s unique or even good. There may be much better solutions out there and in fact if someone knows something better I encourage you to comment on it so that both me and the readers of this blog can learn something new!

All of the code mentioned in this tutorial can be cloned from Github

The actual text of the challenge is given below:

Given N numbers [N<=10^5], count the total pairs of numbers that have a difference of K. [K>0 and K<1e9]

Input Format:

1st line contains N & K (integers).
2nd line contains N numbers of the set. All the N numbers are assured to be distinct.

Output Format:

One integer saying the no of pairs of numbers that have a diff K.

Sample Input #00:

5 2
1 5 3 4 2

Sample Output #00:

3


Now with challenges like this, speed is what is important. We want our program to solve the problem for all possible input in the least amount of computational time. I will be using C to solve this problem. First of all we have to figure out a way to read in the input from the stdin as is described in the problem definition.

1
2
3
4
5
6
7
8
9
10
int N,K,*set,i=0,j;
char* buff,* token;
int result = 0;
 
 
//read 1st input line
scanf("%d %d \n",&N,&K);
//allocate space for the set numbers string and the actual set
buff = malloc(13*N);//13 here is for ULONG_MAX number of digits + space
set = malloc(sizeof(int)*N);

Above is the beginning of main() for the problem. N and K are the numbers given in the problem definition. set will be an array holding all of our numbers while buff will be a buffer holding the string of the second line from which to read the numbers into set. After reading both N and K with scanf() we allocate the buffer to hold the string. It may be a bit sloppy doing it like that but here we allocate 13 bytes for each number. This is an approximation judging from the typical maximum value of ULONG_MAX as shown here, plus the space separating each number. It should be safe since we are using ints and ULONG_MAX is for unsigned ints. We also allocate the array of numbers.

//read second line of input
gets(buff);
//find the individual numbers
token = strtok(buff," ");
while(token != NULL)
{
    sscanf(token,"%d",&set[i]);
    token = strtok(NULL," ");
    i++;
}

Further down in reading the input we can see from above that we read the second input line into the buffer. Then since we know from the problem definition that the numbers are space separated we can simply use libc's strtok() function to obtain each number of the set. This is what is accomplished by the while loop in there. It tokenizes the string located inside buff until all of the numbers are read into set.

Now since we have read all of the input we have to decide on the strategy with which to determine how many pairs of numbers have difference of K. The naive approach would be to simply go through the array NxN times and count the difference of all numbers increasing a counter for those which match K. Something like below

//naive approach
for(i=0;i<N;i++)
{
    for(j=0;j<N;j++)
    {
        if( abs(set[i] - set[j]) == K)
            result++;
    }
}

This approach is not very good and it will actually fail the test in the last 2 data sets in HackerRank. The complexity here is O(N2) and is not good. We can do better than that! For reference let me mention the processing times that HackerRank returned for this approach:

Status: Wrong Answer

# 0 0.06s : Success

# 1 0.08s : Success

# 2 0.14s : Success

# 3 0.23s : Success

# 4 0.36s : Success

# 5 0.54s : Success

# 6 0.74s : Success

# 7 0.98s : Success

# 8 1.26s : Success

# 9 1.58s : Success

# 10 1.93s : Success

# 11 2.32s : Success

# 12 2.74s : Success

# 13 3.22s : Time limit exceeded

# 14 3.22s : Time limit exceeded

So what could an alternate way to approach this problem be? Well let's see! We have an array of numbers and these numbers are also guaranteed to be positive. The relation between the numbers that we are trying to explore is their difference. What operation on the array could allow us to deal with the difference in a much better way? Yes you guessed it correctly! We can sort the array! By having a sorted array of numbers we can easily get the differences between numbers without wasting too much time.

Counting pairs of numbers with a given difference out of a set of numbers

//comparison function for qsort()
int compare(const void* a, const void* b)
{
    //a less than b
    if( *(int*)a < *(int*)b)
        return -1;
    //a greater than b
    return 1;
    //we don't take into account a == b since according to the problem definition it is impossible
}
//sort the array
qsort(set,N,sizeof(int),compare);

As we can see from the above code the C standard library includes an implementation of quicksort as long as you #include <stdlib.h>. Since C has no templates you need to implement a comparison function using the signature int compare(const void* a, const void* b) which will be returning -1 in case of a being less than b, 0 in case they are equal and 1 in case a is greater than b. But in our comparison implementation we omit one of those checks. The problem definition clearly states that the numbers in the array are unique and also that the difference K can't be 0. So there is no need for a and b being equal comparison.

So afterwards we can call qsort() on the array which will sort it in place. For more information on stdlib quicksort implementation take a look at the relevant documentation.

int diff;
for(i=0;i<N;i++)
{
    for(j=i+1;j<N;j++)
    {
        diff = set[j]-set[i];
        if(diff > K)
            break;
        if( diff == K)
        {
            result++;
            break;
        }
    }
}
printf("%d",result);

And here is the actual counting part of this approach. We check each number of the array against each other trying to find number pairs which satisfy the difference of K. Thanks to the sorting of the array we also know that if we go to differences greater than K we can safely proceed to the next number since all subsequent numbers would be greater and so would the difference. In addition, we don't need to be checking the previous numbers that were already checked and we know that since K>0, the second for() loop starts at i+1.

This approach shows a very big difference in the results as can be seen by the details from the HackerRank data sets.



# 0 0.03s : Success

# 1 0.03s : Success

# 2 0.03s : Success

# 3 0.04s : Success

# 4 0.03s : Success

# 5 0.04s : Success

# 6 0.04s : Success

# 7 0.05s : Success

# 8 0.05s : Success

# 9 0.05s : Success

# 10 0.05s : Success

# 11 0.06s : Success

# 12 0.08s : Success

# 13 0.08s : Success

# 14 0.08s : Success


There is a small way to squeeze out a bit more performance for this algorithm though. It does not involve a change of strategy but rather a change of implementation. C stdlib quicksort has one disadvantage. In order to provide for any possible type of input it uses void* and a function pointer for the comparison of the elements of the array that has to be defined by the user. C++ makes up for that with the use of templates but since we are using C there is another way. We can have a dedicated quicksort implementation only for integers. I actually had one lying around and so I inserted it into the code to compare.

int partition(int* a,int left,int right,int pivotI)
{
    int swap,newPivotI,i;
    int pivotValue = a[pivotI];
    //move the pivot to the end
    a[pivotI] = a[right];
    a[right] = pivotValue;
    newPivotI = left;//initialize the new pivot position to the leftmost index
    //find the new pivot position
    for(i=left;i<right;i++)
    {//if the value at the iterated index is less than the pivot value then swap that value
    // with the potential pivot position and increase the potential pivot position
        if(a[i] <= pivotValue)
        {
            swap = a[i];
            a[i] = a[newPivotI];
            a[newPivotI] = swap;
            newPivotI++; //increase the potential pivot index
        }
    }
    //in the end move the pivot to its final position
    a[right] = a[newPivotI];
    a[newPivotI] = pivotValue;
    return newPivotI;
}
 
void quicksort_inplace(int* a,int left,int right)
{
    int pivotI,newPivotI;
    if(left < right)//if the list has at least 2 items
    {
        //choose the pivot
        pivotI = left+(right-left)/2;
        //partition the list
        newPivotI = partition(a,left,right,pivotI);
        //recursively quick sort elements smaller than the pivot
        quicksort_inplace(a,left,newPivotI-1);
        //recursively quick sort elements greater than the pivot
        quicksort_inplace(a,newPivotI+1,right);
    }
}
 
void quicksort(int* a,int size)
{
    quicksort_inplace(a,0,size-1);
}

As can be seen from the code above this is a standard in-place quicksort implementation for integers. It involves a partition() function and a function that's called recursively on the created partitions to sort the array in place. It's easy to study the code and I believe it is well commented. If you want more information on the actual algorithm you can find it here.

The only change that needs to be done in the code that was presented before is swapping the call to stdlib's qsort() with the following:

quicksort(set,N);

The results as can be seen from the HackerRank data set details showed a very small improvement. So maybe for this data and this problem there is an improvement but whether it's worth the effort is up for discussion.

# 0 0.02s : Success

# 1 0.03s : Success

# 2 0.03s : Success

# 3 0.03s : Success

# 4 0.03s : Success

# 5 0.04s : Success

# 6 0.04s : Success

# 7 0.04s : Success

# 8 0.04s : Success

# 9 0.05s : Success

# 10 0.05s : Success

# 11 0.05s : Success

# 12 0.05s : Success

# 13 0.06s : Success

# 14 0.06s : Success



A final and perhaps not so important improvement would be to remember the last j index that was reached in the inner loop. This way we can compare and see if the difference of the value at i and at the last j is less than K. If it is, then the inner loop can actually start from the next item after the last j.

int diff;
int lastJ=1;
for(i=0;i<N;i++)
{
    j = i+1;
    if(set[lastJ]-set[i]<K)
        j = lastJ+1;
    for(;j<N;j++)
    {
        diff = set[j]-set[i];
        if(diff > K)
            break;
        lastJ = j;
        if( diff == K)
        {
            result++;
            break;
        }
    }
}
printf("%d",result);

Above we can see what I said implemented in code. There is an extra if check but depending on the data some improvement could be gained. But as far as HackerRank's data is concerned there was no visible improvement when I ran this code through it.

EDIT1:
Christian Sommer made a very useful comment below detailing a very clever approach to this problem! Taking advantage of the fact that all the values are guaranteed to be unique and that we simply want to count the number of the pairs and not the pairs themselves there is an even smarter approach.

If during reading a number M from the input stream we have already read another number satisfying either M+K or M-K then that means that we have found a pair. It's that simple! The specific code given by Christian is in C++ and utilizes std::set from the STL. It assumes that stream is an ifstream and that we have already read the first line that contains N and K.

int pairCount = 0;
int el;
std::set<int> s;
while(stream >> el) {
if (s.count(el+K) > 0 || s.count(el-K) > 0) ++pairCount;
s.insert(el);
}
return pairCount;

Well that's all I guess. I simply wanted to share my implementation of a solution to this particular problem. I hope it helped the readers and maybe even taught something new to some people. If a reader has a better and faster solution I encourage you to share it with me in the comments section. I am really looking forward to seeing if there are more awesome ways to approach this problem.

Converting an Integer to a Hexadecimal String

This post presents a nice and efficient algorithm to convert an integer into a hexadecimal string using the properties of binary numbers. Source code in the C language is provided along with examples.

Just quickly sharing an algorithm to convert an unsigned decimal integer to a string of hexadecimals. Normally to do this as a human, the first way that everyone is taught is to progressively divide by the base (16) and keep the remainders. While this is not wrong, there are of course much faster ways and I would like to share one in this post.

There is a reason that hexadecimals are universally utilized in computing in order to represent numbers. They are very well fit to do so. Let’s take for example the number 2579541374shown below and see what I mean.

Binary Representation of a Decimal
Binary Representation of a Decimal

Hexadecimals are numbers with 16 as a base, which is a power of 2. 2 in the power of 4 to be more specific. Let’s see what happens if we group that big binary number into groups of 4 bits.

Binary Number divided into groups of 4 bits
Binary Number divided into groups of 4 bits

It looks much more presentable now, right? But what have we achieved? Well by dividing the big integer into groups of 4 bits we have almost already converted it into a hexadecimal. Why so? Well a string of 4 bits can represent numbers 0 to 15. And it’s no coincidence that 0 to 15 are exactly the digits of the hexadecimal numbers.

Hex Conversion Table
Hex Conversion Table

So the “algorithm” consists of nothing more other than simply taking each 4 of the bits of the integer and figuring out the hexadecimal digit they correspond to. For 2579541374the solution is shown below.

Hexadecimal conversion
Hexadecimal conversion

Putting all of the above into C code is very easy. By sacrificing 16 bytes of memory to keep an array of the hexadecimal digits we can convert all possible integers.

static char hex [] = { '0', '1', '2', '3', '4', '5', '6', '7',
                        '8', '9' ,'A', 'B', 'C', 'D', 'E', 'F' };
 
//The function that performs the conversion. Accepts a buffer with "enough space" TM 
//and populates it with a string of hexadecimal digits.Returns the length in digits
int uintToHexStr(unsigned int num,char* buff)
{
    int len=0,k=0;
    do//for every 4 bits
    {
        //get the equivalent hex digit
        buff[len] = hex[num&amp;0xF];
        len++;
        num>>=4;
    }while(num!=0);
    //since we get the digits in the wrong order reverse the digits in the buffer
    for(;k<len/2;k++)
    {//xor swapping
        buff[k]^=buff[len-k-1];
        buff[len-k-1]^=buff[k];
        buff[k]^=buff[len-k-1];
    }
    //null terminate the buffer and return the length in digits
    buff[len]='\0';
    return len;
}

The function should be simple enough to understand without any explanation. It takes every 4 bits of the integer and matches the corresponding hex digit which it saves inside the buffer. Of course since this produces the reverse of the hexadecimal we want (since we read the number from the LSB) we need to reverse it in the end. That is achieved by simple XOR swapping. To use the function you would do something like below

#include  
 
int main()
{
   char buff[16];//enough for 64 bits integer
   int length;   
   //convert
   length = uintToHexStr(3735928559,buff);
   //and print the results
   printf("0x%s with length:%d",buff,length);
   return 0;
}

The above would output 0xDEADBEEF(^_^) with length:8. So that’s all, hope you liked the explanation of the way to convert an integer to a hexadecimal string and if you have any questions don’t hesitate to leave a comment below.

In addition if the post caught your interest and you love C programming in general why not check my work in the Refu Library? It currently needs more contributors and people who are able and willing to provide feedback! You could be one of them 🙂