MATLAB: which letter does each histogram bar correspond to?


I have 400 files. Each one contains about 500,000 characters, drawn from an alphabet of only about 20 letters. I want to make a histogram showing the 10 most frequently used letters (x-axis) and the number of times each one is used (y-axis). I wrote the code below, but one thing is missing: I cannot tell which letter each bar corresponds to. What should I add to the code? You can change the whole code, but I would prefer to keep it as it is. Please provide the complete code so that I can copy it directly into a script and run it.

    i = 1;
    z = zeros(1, 10);
    for i = 1:400
        j = num2str(i);
        file_name = strcat('part', j, 'txt');
        file_id = fopen(file_name);
        part = fread(file_id, inf, 'uchar');
        h = hist(part, 10);
        z = z + h;
        fclose(file_id);
    end

Answer by Tamás Szabó

First of all, your use of hist is wrong. hist(data,10) splits the range of data into 10 equally spaced bins, so each bin covers several character codes rather than a single letter in your files.

One way to solve this is to call hist with predefined bin centers, one per character code:

bins = 1:255;                          % one bin center per character code
histSum = zeros(numel(bins),1);        % running total across all files

for file = 1:10
    data = randi([0 25],100) + 'a';    % Generate random data - letter codes between 'a' and 'z'
    data = reshape(data,numel(data),1); % Make it a column vector

    histSum = histSum + hist(data,bins)';  % hist returns a row vector here, so transpose it
end

Note that the bins must cover every possible value, hence the range from 1 to 255. With these bin centers, histSum(k) holds the total count of the character with code k, i.e. char(k).
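To get from per-character counts to a labelled top-10 chart, the counts can be sorted and the matching characters used as tick labels. The script below is a minimal sketch, not a tested drop-in solution. It assumes the files are named part1.txt, part2.txt, and so on (the question's strcat call has no dot, so the real names may differ), and it uses accumarray instead of hist to count byte values. The variable names are made up for illustration.

```matlab
% Sketch: count every byte value, keep the 10 most frequent, label the bars.
numFiles = 400;                  % number of files, taken from the question
counts = zeros(256, 1);          % one slot per byte value 0..255

for k = 1:numFiles
    % ASSUMPTION: files are named 'part1.txt', 'part2.txt', ...
    fileName = sprintf('part%d.txt', k);
    fid = fopen(fileName, 'r');
    if fid == -1
        warning('Could not open %s, skipping.', fileName);
        continue
    end
    bytes = fread(fid, inf, 'uint8');   % column vector of values in 0..255
    fclose(fid);

    % accumarray needs positive indices, hence the +1 shift
    counts = counts + accumarray(bytes + 1, 1, [256 1]);
end

% Sort by frequency and keep the ten largest counts
[sortedCounts, order] = sort(counts, 'descend');
topCounts = sortedCounts(1:10);
topChars  = char(order(1:10) - 1);      % undo the +1 shift to recover the characters

% Plot and label each bar with its character
bar(topCounts);
set(gca, 'XTick', 1:10, 'XTickLabel', num2cell(topChars));
xlabel('Character');
ylabel('Number of occurrences');
```

Because the counts are sorted before plotting, bar number n is always the n-th most frequent character, and its tick label shows which one it is. If the files contain line breaks, those will be counted as characters too and may need to be excluded before sorting.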