ETW Tracing: Log files are corrupted after a few days of continuous tracing

  1. We use a custom ETW EventSource (derived from Microsoft.Diagnostics.Tracing.EventSource, from the NuGet package Microsoft.Diagnostics.Tracing.EventSource.Redist.1.1.28) to instrument our application.

  2. Our provider is enabled in a trace session that writes the events to files. The session is configured as follows:

    • Stream mode: File

Trace Buffers

  • Buffer size: 512 KB
  • Min Buffers: 200
  • Max Buffers: 400
  • Flush timer: 0 seconds

File:

  • Log Mode: New File (new file on reaching max file size)

Stop condition:

  • Max Size: 11 MB

    3. The maximum event rate is a few hundred events per second.
    4. We use Windows Server 2012 SP1.
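
To put the session settings in perspective, here is a rough back-of-the-envelope sketch in Python. It only does arithmetic on the values listed above; the function names are made up for illustration and are not part of any ETW API.

```python
# Values copied from the trace session configuration in the question.
BUFFER_SIZE_KB = 512
MIN_BUFFERS = 200
MAX_BUFFERS = 400
MAX_FILE_SIZE_MB = 11


def buffer_pool_mb(buffer_count, buffer_size_kb=BUFFER_SIZE_KB):
    """Memory taken by `buffer_count` trace buffers, in MB."""
    return buffer_count * buffer_size_kb / 1024


def buffers_per_file(max_file_mb=MAX_FILE_SIZE_MB, buffer_size_kb=BUFFER_SIZE_KB):
    """Number of whole buffers that fit in one log file before rollover."""
    return (max_file_mb * 1024) // buffer_size_kb


if __name__ == "__main__":
    print("Min buffer pool (MB):", buffer_pool_mb(MIN_BUFFERS))
    print("Max buffer pool (MB):", buffer_pool_mb(MAX_BUFFERS))
    print("Buffers per 11 MB file:", buffers_per_file())
```

With these settings the buffer pool ranges from 100 MB to 200 MB, while a single 11 MB file holds only 22 buffers, so the in-memory pool is many times larger than one log file.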

Occasionally the generated ETL files are 11 MB (or sometimes larger) but contain zero events. This happens mostly on reliability servers, where the system runs at the maximum event rate for a few days. Once this occurs, every subsequent log file also contains zero events, so all events from that point on are lost.

When we try to open such a log in Windows Performance Analyzer, the following error message is shown:

[Screenshot: Windows Performance Analyzer error]

The issue reproduces only infrequently, which makes trial-and-error experiments difficult.

Solutions tried (none worked):

  1. Reduced the Min and Max buffers to 24 and 48, respectively.
  2. Set the flush timer to 10 seconds. This has the downside of creating 11 MB log files with very few events when the event rate is low.
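
The downside of the flush timer can be estimated with a small Python sketch. It assumes that every timer-driven flush writes a full-size buffer to the file even when the buffer is nearly empty, which is consistent with the behaviour described above but is an assumption here. The `buffers_per_flush` parameter is hypothetical (for example, one buffer per CPU that logged at least one event).

```python
def seconds_to_fill_file(max_file_mb, buffer_size_kb, flush_timer_s, buffers_per_flush=1):
    """Estimate how long a nearly idle session takes to fill one log file.

    Assumption: each flush writes `buffers_per_flush` full-size buffers,
    regardless of how many events they actually contain.
    """
    buffers_needed = (max_file_mb * 1024) // buffer_size_kb
    # Ceiling division: a partial round of buffers still costs one flush.
    flushes_needed = -(-buffers_needed // buffers_per_flush)
    return flushes_needed * flush_timer_s


if __name__ == "__main__":
    # Settings from the question: 11 MB files, 512 KB buffers, 10 s flush timer.
    print(seconds_to_fill_file(11, 512, 10, buffers_per_flush=1))
    print(seconds_to_fill_file(11, 512, 10, buffers_per_flush=4))
```

Under that assumption, a session flushing one buffer every 10 seconds would roll over to a new 11 MB file roughly every 220 seconds, and sooner if several buffers are flushed per interval. A smaller buffer size would reduce the space wasted per flush.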

Has anyone faced such an issue? Any help is appreciated.
