look for only the first common element between two arrays

1k views Asked by At

I have two arrays, namely a[] and b[]. I would like to search if any element of a is present in b. Note that I do not want to find all duplicate occurrences, if any one element from a[] exists in b[], I would like to report it and break out without checking the rest of a[] or b[]

I'm a bit confused about how 'break' operates. Can 'break' break out of any kind of loop - for/while, but it breaks out of only the 'innermost loop' around where it's placed in code ?

This is my solution and please suggest if there is a better implementation for this O(n^2) approach

#include <stdio.h>

int main(void)
{
 int a[] = {1,2,3,4,5,6};
 int b[] = {11,2,3,7,8,9,6,10,11};
 int i = 0, j = 0;
 int duplicate = 0;

 for(i=0;i<sizeof(a)/sizeof(int);i++)
 {
  for(j=0;j<sizeof(b)/sizeof(int);j++)
  {
   if(a[i] == b[j])
   {
    printf("Duplicates found for [%d]\n",a[i]);
    duplicate = 1;
    break;
   }
  }

  if(duplicate == 1)
  {
   break;
  }
 }

 return 0;
}
3

There are 3 answers

0
Yank Leo On BEST ANSWER

breaks generally break out of the most inner loop in c. You have used it correctly.

if u want to just report a duplicate, then u can put this code into a fuction and whenever a duplicate is found, you just return. The function would look like this:

//Here a and b are pointers to array 
//n1 and n2 are number of elements in array a,b
chk_dup(int *a,int *b,int n1,,int n2){

for(i=0;i<n1;i++)
 {
  for(j=0;j<n2;j++)
  {
   if(a[i] == b[j])
   {
    printf("Duplicates found for [%d]\n",a[i]);
    //duplicate = 1;
    //break;
    return;
   }
  }
 }
}

You cannot use sizeof(a) here becoz its just a pointer to array.Same goes for sizeof(b).

You can make this algorithm more efficient by sorting the arrays using quicksort(that would take O(nlgn)) and then for each element in a do binary search in b.

pseudo code for that is something like this:

//Here a and b are pointers to sorted arrays 
//n1 and n2 are number of elements in array a,b
chk_dup(int *a,int *b,int n1,,int n2){

for(i=0;i<n1;i++)
 {
   //find a[i] in b using binary search.
   int found=binary_search(a[i],b);
   if(found){
     printf("Found Duplicate");
     return;
   }
 }
 printf("No duplicate found");
}

So, the whole algorithm works in O(nlgn). While the algorithm you are using can take O(n^2).

Otherwise your code is perfectly fine and the use of break is correct

1
Jack On

break only breaks the innermost loop in which it is used. If you want to avoid the ugly syntax then you have multiple solutions:

  • ignore this optimization until you prove it's relevant
  • move the code inside a function that accepts the two arrays as arguments so that you can directly return from the function without having to break.
  • adjust indices i and j after that you found a duplicate to make both loops return gracefully
  • use a goto instruction (which is not advisable in any case)

Other solutions, with complexity lower than O(n^2) could require some additional data structure, like a hashset or sorting of data.

0
orestiss On

You could use goto as noted here How to break out of nested loops? it's the last remaining valid use of this...

And maybe there is a better solution using xor, not sure if you can reduce the

complexity though...