XSLT group by half hour

503 views Asked by At

Given this XML:

<root>
  <row>
    <time>08:00</time>
    <sales>800</sales>
  </row>
  <row>
    <time>08:15</time>
    <sales>815</sales>
  </row>
  <row>
    <time>08:30</time>
    <sales>830</sales>
  </row>
  <row>
    <time>08:45</time>
    <sales>845</sales>
  </row>
  <row>
    <time>11:00</time>
    <sales>1100</sales>
  </row>
  <row>
    <time>11:45</time>
    <sales>1145</sales>
  </row>
  <row>
    <time>14:15</time>
    <sales>1415</sales>
  </row>
  <row>
    <time>14:30</time>
    <sales>1430</sales>
  </row>
</root>

I am trying to find a way to XSLT transform by summarizing sales by 30 minute intervals. I can summarize by hourly intervals by 60 minutes using MUENCHIAN method, but I cannot use it for 30 minute since I need a custom function to do so (but I cannot use XSLT 2.0, nor .Net's custom functions). Please help!

The expected output is this:

30 minute
08:00 $1600  
08:30 $1675 
11:00 $1100 
11:30 $1145 
14:00 $1415 
14:30 $1430 
2

There are 2 answers

1
Dimitre Novatchev On BEST ANSWER

Solution 1.

This transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="text"/>

 <xsl:key name="krowsInHalfHour" match=
  "row[not((substring-after(time,':')+30) mod 30 = 0)]"
  use="generate-id(
        preceding-sibling::row
          [60*substring-before(time,':')
          +
           substring-after(time,':')
          >=
           60*substring-before(current()/time,':')
          +
           substring-after(current()/time,':')

          -
           substring-after(current()/time,':') mod 30

          ]
             [1]
                   )
  "/>

 <xsl:template match=
  "row[(substring-after(time,':')+30) mod 30 = 0
     or
      not(
       60*substring-before(preceding-sibling::row[1]/time,':')
          +
           substring-after(preceding-sibling::row[1]/time,':')
       >=
          60*substring-before(time,':')
          +
           substring-after(time,':')

          -
           substring-after(time,':') mod 30
          )
      ]
  ">
  <xsl:variable name="vPrevStartMins" select=
  "60*substring-before(time,':')
  +
   substring-after(time,':')

  -
   substring-after(time,':') mod 30
  "/>
  <xsl:value-of select=
   "concat('&#xA;',floor($vPrevStartMins div 60),
           ':', concat(substring('0',($vPrevStartMins mod 60 >0)+1),
                       $vPrevStartMins mod 60
                       )
           )
   "/>
  <xsl:text> $</xsl:text>
  <xsl:value-of select=
  "sum(sales
      |
       key('krowsInHalfHour',generate-id())/sales)"/>
 </xsl:template>

 <xsl:template match="text()"/>
</xsl:stylesheet>

when applied on the provided XML document:

<root>
    <row>
        <time>08:00</time>
        <sales>800</sales>
    </row>
    <row>
        <time>08:15</time>
        <sales>815</sales>
    </row>
    <row>
        <time>08:30</time>
        <sales>830</sales>
    </row>
    <row>
        <time>08:45</time>
        <sales>845</sales>
    </row>
    <row>
        <time>11:00</time>
        <sales>1100</sales>
    </row>
    <row>
        <time>11:45</time>
        <sales>1145</sales>
    </row>
    <row>
        <time>14:15</time>
        <sales>1415</sales>
    </row>
    <row>
        <time>14:30</time>
        <sales>1430</sales>
    </row>
</root>

produces the wanted, correct result:

8:00 $1615
8:30 $1675
11:00 $1100
11:30 $1145
14:00 $1415
14:30 $1430

Solution 2:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:my="my:my" >
 <xsl:output method="text"/>
 <xsl:strip-space elements="*"/>

 <my:halfHours>
  <t>00:00</t><t>00:30</t><t>01:00</t><t>01:30</t>
  <t>02:00</t><t>02:30</t><t>03:00</t><t>03:30</t>
  <t>04:00</t><t>04:30</t><t>05:00</t><t>05:30</t>
  <t>06:00</t><t>06:30</t><t>07:00</t><t>07:30</t>
  <t>08:00</t><t>08:30</t><t>09:00</t><t>09:30</t>
  <t>10:00</t><t>10:30</t><t>11:00</t><t>11:30</t>
  <t>12:00</t><t>12:30</t><t>13:00</t><t>13:30</t>
  <t>14:00</t><t>14:30</t><t>15:00</t><t>15:30</t>
  <t>16:00</t><t>16:30</t><t>17:00</t><t>17:30</t>
  <t>18:00</t><t>18:30</t><t>19:00</t><t>19:30</t>
  <t>20:00</t><t>20:30</t><t>21:00</t><t>21:30</t>
  <t>22:00</t><t>22:30</t><t>23:00</t><t>23:30</t>
  <t>24:00</t>
 </my:halfHours>

 <xsl:variable name="vhalfHrs" select=
  "document('')/*/my:halfHours/*"/>

 <xsl:template match="row">
  <xsl:variable name="vStart" select=
  "$vhalfHrs[translate(.,':','')
            >
             translate(current()/time,':','')
             ][1]
                 /preceding-sibling::*[1]
  "/>

  <xsl:variable name="vprecRow" select=
   "preceding-sibling::*[1]"/>

  <xsl:if test=
   "not(translate($vprecRow/time,':','')
       >=
       translate($vStart,':','')
       )">
   <xsl:value-of select="concat('&#xA;',$vStart, ' $')"/>
   <xsl:value-of select=
   "sum(sales|following-sibling::*
          [not(translate(time,':','')
              >=
               translate($vStart/following-sibling::*,':','')
               )
          ]
            /sales
        )
   "/>
  </xsl:if>
 </xsl:template>
</xsl:stylesheet>

when this transformation is applied on the same XML document, again the wanted, correct result is produced:

08:00 $1615
08:30 $1675
11:00 $1100
11:30 $1145
14:00 $1415
14:30 $1430

Explanation:

  1. In the variable $vhalfHrs we have elements whose values are the starting time of every half-hour period during the day.

  2. In the template that matches every row, the $vStart variable is set ti this half-hour start-time, in which the time of the current node (row) falls into.

  3. The variable $vprecRow is set to the preceding sibling (row) of the current row.

  4. If the time of the preceding row is not later than the start-half-hour-time (in $vStart), then the currentrow` is the first one in this half-hour period.

  5. We output the starting half-hour period time.

  6. We calculate and output the sum of all row elements whose time is in this same half-hour time period. They are following siblings of the current row and their time is not greater or equal the start of the next half-hour period.

Solution 3 (XSLT 2.0):

<xsl:stylesheet version="2.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xsl:output method="text"/>

 <xsl:template match="/*">
  <xsl:for-each-group select="row" group-adjacent=
  "(xs:integer(substring-before(time,':'))*60
   +
    xs:integer(substring-after(time,':'))
    )
    idiv 30
  ">
   <xsl:variable name="vStartMinutes"
        select="current-grouping-key()*30"/>
   <xsl:value-of separator="" select=
   "'&#xA;',
     format-number($vStartMinutes idiv 60, '00'), ':',
     format-number($vStartMinutes mod 60,'00'),
     ' $',
     sum(current-group()/sales/xs:integer(.))
   "/>
  </xsl:for-each-group>
 </xsl:template>
</xsl:stylesheet>

when this transformation is applied on the same XML document as above, the same wanted, correct result is produced:

08:00 $1615
08:30 $1675
11:00 $1100
11:30 $1145
14:00 $1415
14:30 $1430

Explanation:

  1. We are using <xsl:for-each-group> with the group-adjacent attribute set as an expression calculating the position of the 1/2 hour period in which a row/time is.

  2. Heavy use of the standard functions current-group() and current-grouping-key() .

0
AudioBubble On

Without preceding axis, this stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="text"/>
    <xsl:key name="kRowByHalfHour"
             match="row"
             use="floor((substring(time,1,2)*60+substring(time,4))div 30)"/>
    <xsl:template match="row[count(.|key('kRowByHalfHour',
                                         floor((substring(time,1,2) * 60
                                                 + substring(time,4)) div 30)
                                     )[1]
                             )=1
                         ]">
        <xsl:variable name="vKey"
             select="floor((substring(time,1,2)*60+substring(time,4))div 30)"/>
        <xsl:value-of
             select="concat(format-number(floor($vKey * 30 div 60),'00'),':',
                            format-number(($vKey * 30) mod 60,'00'),' $',
                            sum(key('kRowByHalfHour',$vKey)/sales),'&#xA;'
                     )"/>
    </xsl:template>
    <xsl:template match="text()"/>
</xsl:stylesheet>

Output:

08:00 $1615
08:30 $1675
11:00 $1100
11:30 $1145
14:00 $1415
14:30 $1430

Edit 2: Even better key value with normalized time allowing parameterization (by document() function or two phase transformation): if we take key calculation as complexity measure, this will be O(N)