I have a table with two columns, emailid and keyword, and I am pivoting it in SQL into a kind of matrix: the columns are the distinct keywords, the rows are the distinct users, and the value at [emailid][keyword] is 1 if the value is present and null if it is not. I am trying to find the correlation between keywords, i.e. if two users have searched for the same keyword then there is a correlation between those two keywords. How can I achieve this?
To begin, you should replace the null values with 0. You may then want to explore various correlation techniques such as Pearson and Spearman correlation.
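As a rough sketch of that first step, assuming the pivoted result has been loaded into a pandas DataFrame (the user addresses, keyword names and values below are made up for illustration), the nulls can be filled with 0 and the keyword columns correlated pairwise. Without the fill, a column that contains only 1s and nulls has no variance, so its correlation would come out undefined.

```python
import numpy as np
import pandas as pd

# Hypothetical pivoted table: rows are users (emailid), columns are keywords.
# 1 means the user searched for the keyword, NaN (null) means they did not.
pivoted = pd.DataFrame(
    {
        "python": [1, 1, np.nan, np.nan],
        "pandas": [1, 1, np.nan, np.nan],
        "sql": [np.nan, np.nan, 1, 1],
    },
    index=["a@example.com", "b@example.com", "c@example.com", "d@example.com"],
)

# Replace nulls with 0 so every column is a 0/1 indicator.
matrix = pivoted.fillna(0)

# Pairwise Pearson correlation between the keyword columns.
keyword_corr = matrix.corr(method="pearson")
print(keyword_corr)
```

Passing `method="spearman"` to `corr` would give the Spearman variant instead.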
This is a page on Pearson Correlation: http://learntech.uwe.ac.uk/da/Default.aspx?pageid=1442
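A minimal sketch of computing Pearson correlation for two keyword columns, assuming SciPy is installed; the two lists are invented 0/1 indicators for five hypothetical users who searched for exactly the same keywords:

```python
from scipy.stats import pearsonr

# Hypothetical 0/1 indicators (nulls already replaced with 0), one entry per user.
keyword_a = [1, 0, 1, 1, 0]
keyword_b = [1, 0, 1, 1, 0]

# pearsonr returns the correlation coefficient and a p-value.
r, p_value = pearsonr(keyword_a, keyword_b)
print(round(r, 6))
```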
This gives an output of 1.0, which means total positive correlation. The output of Pearson correlation ranges from -1.0 (strongest negative correlation) to 1.0 (strongest positive correlation). A value of 0 means no linear correlation between the two quantities.
More information on this can be found at: https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.pearsonr.html