Pandas Dataframe to HTML with very long line in a column


I'm having trouble converting a pandas DataFrame into an HTML table. The problem is in the third column, which holds a very long value that is displayed entirely on one line, while the other columns are split across multiple lines. Here's the code I was using:

doc = df.to_html(justify="left", index=False)

which gives me the following HTML:

<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: left;">
      <th>Gene Symbol</th>
      <th>Strain name</th>
      <th>MP term ids</th>
      <th>MP term names</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>[Ighm&lt;tm1Cgn&gt;]</td>
      <td>[B6.129S2-Ighm&lt;tm1Cgn&gt;/CgnOrl]</td>
      <td>[MP:0000063,MP:0000130,MP:0000135,MP:0000488,MP:0000689,MP:0001544,MP:0001800,MP:0001802,MP:0001805,MP:0001806,MP:0001807,MP:0001835,MP:0001879,MP:0002133,MP:0002144,MP:0002363,MP:0002460,MP:0002492,MP:0003009,MP:0003724,MP:0003797,MP:0004031,MP:0004803,MP:0004804,MP:0004953,MP:0004984,MP:0005025,MP:0005027,MP:0005094,MP:0005095,MP:0005166,MP:0005387,MP:0005388,MP:0005463,MP:0005616,MP:0006387,MP:0008071,MP:0008075,MP:0008079,MP:0008127,MP:0008201,MP:0008212,MP:0008213,MP:0008240,MP:0008241,MP:0008401,MP:0008470,MP:0008495,MP:0009346,MP:0010965]</td>
      <td>[abnormal antigen presentation,abnormal B cell differentiation,abnormal cardiovascular system physiology,abnormal CD4-positive, alpha-beta T cell physiology,abnormal CD8 positive, alpha-beta intraepithelial T cell morphology,abnormal compact bone morphology,abnormal cytokine secretion,abnormal humoral immune response,abnormal intestinal epithelium morphology,abnormal lymphatic vessel morphology,abnormal metallophilic macrophage morphology,abnormal respiratory system physiology,abnormal response to infection,abnormal spleen B cell follicle morphology,abnormal spleen marginal sinus morphology,abnormal spleen marginal zone macrophage morphology,abnormal spleen morphology,abnormal T cell number,abnormal T cell proliferation,abnormal trabecular bone morphology,absent B cells,absent follicular dendritic cells,absent immature B cells,absent mature B cells,arrested B cell differentiation,decreased bone mineral density,decreased CD4-positive, alpha beta T cell number,decreased CD8-positive, alpha-beta T cell number,dec]</td>
    </tr>
    <tr>
      <td>[Ighm&lt;tm1Cgn&gt;]</td>
      <td>[B6.129(Cg)-Apoe&lt;tm1Unc&gt; Ighm&lt;tm1Cgn&gt;/Orl]</td>
      <td>[MP:0000063,MP:0000130,MP:0000135,MP:0000488,MP:0000689,MP:0001544,MP:0001800,MP:0001802,MP:0001805,MP:0001806,MP:0001807,MP:0001835,MP:0001879,MP:0002133,MP:0002144,MP:0002363,MP:0002460,MP:0002492,MP:0003009,MP:0003724,MP:0003797,MP:0004031,MP:0004803,MP:0004804,MP:0004953,MP:0004984,MP:0005025,MP:0005027,MP:0005094,MP:0005095,MP:0005166,MP:0005387,MP:0005388,MP:0005463,MP:0005616,MP:0006387,MP:0008071,MP:0008075,MP:0008079,MP:0008127,MP:0008201,MP:0008212,MP:0008213,MP:0008240,MP:0008241,MP:0008401,MP:0008470,MP:0008495,MP:0009346,MP:0010965]</td>
      <td>[abnormal antigen presentation,abnormal B cell differentiation,abnormal cardiovascular system physiology,abnormal CD4-positive, alpha-beta T cell physiology,abnormal CD8 positive, alpha-beta intraepithelial T cell morphology,abnormal compact bone morphology,abnormal cytokine secretion,abnormal humoral immune response,abnormal intestinal epithelium morphology,abnormal lymphatic vessel morphology,abnormal metallophilic macrophage morphology,abnormal respiratory system physiology,abnormal response to infection,abnormal spleen B cell follicle morphology,abnormal spleen marginal sinus morphology,abnormal spleen marginal zone macrophage morphology,abnormal spleen morphology,abnormal T cell number,abnormal T cell proliferation,abnormal trabecular bone morphology,absent B cells,absent follicular dendritic cells,absent immature B cells,absent mature B cells,arrested B cell differentiation,decreased bone mineral density,decreased CD4-positive, alpha beta T cell number,decreased CD8-positive, alpha-beta T cell number,dec]</td>
    </tr>
    <tr>
      <td>[Ighm&lt;tm1.1Aak&gt;]</td>
      <td>[STOCK Ighm&lt;tm1.1Aak&gt;/Orl]</td>
      <td>[MP:0004816,MP:0005017,MP:0005387,MP:0008186,MP:0008209]</td>
      <td>[abnormal class switch recombination,decreased B cell number,decreased pre-B cell number,immune system phenotype,increased pro-B cell number]</td>
    </tr>
  </tbody>
</table>

I've tried the following commands to format the third column better, but they have no effect on that column (only its header is changed correctly):

doc2 = doc.replace('<tr>', '<tr align="center">')
doc2 = doc2.replace('<tr style="text-align: right;">', '<tr style="text-align:center;">')

Thank you for the help


There is 1 answer

Saxtheowl

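Try the CSS properties overflow-wrap: break-word; or word-break: break-all;. They let the browser break the string at any character when necessary to prevent overflow, which matters here because the values in that column contain no spaces to wrap at. Wrap each value in a <div> that carries those styles, and pass escape=False so that pandas emits the markup as is: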

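import html

def wrap_in_div(value):
    # Escape the cell text first, because escape=False below turns off pandas' own escaping.
    text = html.escape(str(value))
    return f'<div style="max-width: 500px; overflow: auto; overflow-wrap: break-word; word-break: break-all;">{text}</div>'

df['MP term ids'] = df['MP term ids'].apply(wrap_in_div)

# escape=False keeps the <div> markup intact. It also leaves every other column
# unescaped, so values such as <tm1Cgn> there need html.escape as well.
doc = df.to_html(justify="left", index=False, escape=False)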
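
One caveat with escape=False is that pandas stops escaping every cell, so values such as Ighm<tm1Cgn> in the other columns would reach the browser as raw tags. A minimal sketch of an alternative, assuming the table is embedded in a page where a <style> block can be prepended: leave escaping on and attach the wrapping rule to the third column through CSS instead. The sample data and the names css and table_html are made up for illustration, and the selector assumes index=False so that "MP term ids" is the third cell in each row.

```python
import pandas as pd

# Hypothetical sample rows standing in for the asker's DataFrame.
df = pd.DataFrame({
    "Gene Symbol": ["[Ighm<tm1Cgn>]"],
    "Strain name": ["[B6.129S2-Ighm<tm1Cgn>/CgnOrl]"],
    "MP term ids": ["[" + ",".join(f"MP:{i:07d}" for i in range(63, 113)) + "]"],
    "MP term names": ["[abnormal antigen presentation,abnormal B cell differentiation]"],
})

# Rule scoped to the third cell of each body row in tables that carry
# pandas' default "dataframe" class.
css = """<style>
table.dataframe td:nth-child(3) {
  overflow-wrap: anywhere;
  word-break: break-all;
}
</style>
"""

# max_colwidth=None (accepted by pandas >= 1.0) keeps display settings from
# shortening long cell text. escape is left at its default of True, so
# "<tm1Cgn>" is written out as "&lt;tm1Cgn&gt;".
with pd.option_context("display.max_colwidth", None):
    table_html = df.to_html(justify="left", index=False)

doc = css + table_html
```

With the break rules in place the browser is free to narrow that column until the table fits its container. If a hard cap such as the 500px above is wanted, a max-width can be added to the same rule, though how browsers apply max-width to table cells can vary.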