Performance Optimization for a .NET 3.5 Desktop Application and SQL Server 2008

I need to improve the performance of a desktop application (.NET) that was designed to read a database and create XML files based on XBRL (eXtensible Business Reporting Language). It uses UBMatrix for creating XBRL taxonomies.

The application works fine when the data set is small, but it takes more than 30 minutes to generate the files when the data is large. The client's data is always large, so file generation always takes a long time.

My task is to optimize the application in order to reduce the time taken to create the XML files. When I examined the application, I found that it runs this way.

Starts

  • Creates a connection to the database
  • Gets the first set of data (this table, table1, is very large); the query returns around 15-30K rows into a DataTable
  • Loops from 0 to dataTable.Rows.Count:
    • checks some conditions
    • gets data from the database (this table, table2, is even larger than table1)
    • sends the data to form the XBRL and writes it to XML (this is done by a third-party component called UBMatrix); it is not possible to edit the code that creates the XBRL XML file

Similarly, there are 3 to 4 more sets of data that are processed this way.

In my observation, we can avoid the database calls inside the for loop by getting all the data before the loop, as shown in the sketch below. When I checked the queries, I also found subqueries and not exists (select * from table) constructs that could be replaced with joins and not exists (select 1 from table).
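A minimal sketch of that idea, with purely hypothetical table and column names (Table1, Table2, ExcludedQuestions, QuestionId, ReportId), just to illustrate running one set-based query before the loop instead of one query per row:

```csharp
// Sketch only: hypothetical table/column names. The per-row query inside the loop
// is replaced by a single joined query that is executed once, before the loop starts.
using System.Data;
using System.Data.SqlClient;

static DataTable LoadAllDetailRows(SqlConnection connection, int reportId)
{
    const string sql =
        @"SELECT t2.QuestionId, t2.Value
          FROM dbo.Table2 AS t2
          INNER JOIN dbo.Table1 AS t1 ON t1.QuestionId = t2.QuestionId
          WHERE t1.ReportId = @reportId
            AND NOT EXISTS (SELECT 1
                            FROM dbo.ExcludedQuestions AS x
                            WHERE x.QuestionId = t2.QuestionId)";

    var table = new DataTable();
    using (var command = new SqlCommand(sql, connection))
    {
        command.Parameters.AddWithValue("@reportId", reportId);
        using (var adapter = new SqlDataAdapter(command))
        {
            adapter.Fill(table);   // one round trip instead of one per row
        }
    }
    return table;
}
```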

But the application still needs to process the rows in a loop. I am also thinking of using threading, so that I can create threads based on the size of the data and process it simultaneously.

For example:

  • If there are 100 rows, there will be 100 entries in the XML (XBRL) file.
  • So I would split them 50/50 and run two threads, each generating its own XML file; at the end I would combine the two into one XML file (see the sketch after this list).

That way, processing of the 0th question and the 50th question can start at the same time, whereas with the current for loop the 0th row is processed first and the 99th only at the very end. I am not sure about this idea. Can anyone suggest or share your ideas? Any help will be appreciated. Thanks in advance.
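A rough sketch of the split-into-two-threads idea on .NET 3.5 (no Parallel.ForEach there). The processRows delegate and the output file names are placeholders for the existing per-row work, and this is only safe if the UBMatrix calls are thread-safe:

```csharp
// Sketch only: splits the rows into two halves and processes each half on a
// ThreadPool thread, each writing its own partial output file. processRows stands
// in for the existing per-row logic; only valid if UBMatrix is thread-safe.
using System;
using System.Data;
using System.Threading;

static void ProcessInTwoThreads(DataTable table, Action<DataTable, int, int, string> processRows)
{
    int half = table.Rows.Count / 2;
    var firstDone = new ManualResetEvent(false);
    var secondDone = new ManualResetEvent(false);

    ThreadPool.QueueUserWorkItem(delegate
    {
        processRows(table, 0, half, "output_part1.xml");
        firstDone.Set();
    });
    ThreadPool.QueueUserWorkItem(delegate
    {
        processRows(table, half, table.Rows.Count, "output_part2.xml");
        secondDone.Set();
    });

    // Wait for both halves to finish; the two partial files still have to be
    // merged into a single XBRL document afterwards.
    firstDone.WaitOne();
    secondDone.WaitOne();
}
```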


There are 4 answers

Albin Sunnanbo

30k queries in 30 minutes is just 16 queries per second. That is not very much unless the queries are expensive.

To find out, run SQL Profiler and check the execution time of each query. Multiply by the number of queries. If that is reasonably close to 30 minutes, you are in luck: you can probably rewrite all those queries as a single joined query and put the result in a Dictionary or ILookup, as sketched below.
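A minimal sketch of the ILookup idea, assuming the joined result is already in a DataTable; "QuestionId" is a hypothetical column name, and AsEnumerable/Field require a reference to System.Data.DataSetExtensions (available in .NET 3.5):

```csharp
// Sketch only: group the rows of one joined query in memory, so the main loop
// does a dictionary-style lookup instead of a database round trip per row.
using System.Data;
using System.Linq;

static ILookup<int, DataRow> BuildDetailLookup(DataTable detailTable)
{
    return detailTable.AsEnumerable()
                      .ToLookup(row => row.Field<int>("QuestionId"));
}

// Inside the existing loop, the per-row query then becomes:
//   foreach (DataRow detail in detailLookup[currentQuestionId]) { ... }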

If you need to resort to multithreading, check whether you have the possibility to upgrade to .NET 4. Then you can use Parallel.ForEach or another suitable method in the TPL to parallelize your work, for example:
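A minimal sketch, assuming an upgrade to .NET 4; processRow is a hypothetical delegate wrapping the existing per-row logic, and parallelizing it only makes sense if the UBMatrix calls are thread-safe:

```csharp
// Sketch only (requires .NET 4): Parallel.ForEach spreads the rows over worker
// threads automatically.
using System;
using System.Data;
using System.Linq;
using System.Threading.Tasks;

static void ProcessAllRows(DataTable table, Action<DataRow> processRow)
{
    Parallel.ForEach(table.AsEnumerable(), processRow);
}
```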

Jared Shaver

Without seeing the code I cannot tell what classes you are using for data access, but from your mention of DataTable.Rows I am assuming you are using DataSet/DataTable. If you switch to an IDataReader with CommandBehavior.SequentialAccess, you can avoid a lot of the unnecessary overhead that comes with DataSet/DataTable. A sketch of that pattern is below.
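A minimal sketch with a hypothetical query and column names; the point is that SqlDataReader streams rows instead of buffering the whole result set, and SequentialAccess additionally streams each row's columns (so they must be read in ordinal order):

```csharp
// Sketch only: hypothetical query and columns.
using System.Data;
using System.Data.SqlClient;

static void StreamRows(SqlConnection connection)
{
    using (var command = new SqlCommand("SELECT QuestionId, Value FROM dbo.Table1", connection))
    using (SqlDataReader reader = command.ExecuteReader(CommandBehavior.SequentialAccess))
    {
        while (reader.Read())
        {
            int questionId = reader.GetInt32(0);   // read columns left to right
            string value = reader.GetString(1);
            // ... hand the values to the XBRL generation step ...
        }
    }
}
```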

Maciej

I also suggest a profiler, but for the .NET app. Check where it spends most of its time and attack that place. If it is in the calls that get data from the database, you can look at the database and possibly add some new indexes and/or redesign the queries. If it is in the calls to UBMatrix, there is probably not much you can do except explain that to whoever gave you this task. But before you give up, you can try parallel processing, first making sure that UBMatrix is thread-safe, as Simon pointed out. If it is not, or you cannot tell, you can run the parallel work in separate AppDomains to imitate thread safety. That comes at the cost of more resources and more complex code, though. Parallel processing will only make sense if, during a normal run of the app, CPU usage stays below about 70% and the disk is not used excessively (check with Resource Monitor), so there are spare resources to be used.

If the disk is used a lot, another option would be to check whether writing the XML files to a RAM drive improves anything.

Anyway, start with profiling your .NET application - that should give you a good starting point. Here is a free .NET profiler: http://www.eqatec.com/tools/profiler/

Simon Wilson

Not really an answer, just a really large comment:

I would remove multithreading from your plans unless the UBMatrix API states it is thread-safe, bearing in mind all the disk I/O involved in generating the XBRL.

Have you profiled your app for memory usage? I am thinking of the 15-30K rows of data being loaded, then possibly transferred into an object model prior to processing and writing to file. If you start to approach the 2 GB limit (32-bit), your process will be doing a lot of paging, which is sooo slooow.

Would this alternative be a possibility? Pre-generate the data to a file, possibly in XML format. Then, assuming UBMatrix has an API that accepts a file path and streams the data, you could just pass the path to your data file. (This is more in case it is a memory issue, but it could still speed things up if the data queries are long-running.)
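A minimal sketch of the pre-generation step only; the element names and columns are hypothetical, and whether UBMatrix can consume such a file depends entirely on its API. XmlWriter streams the document to disk as it is written instead of building it all in memory first:

```csharp
// Sketch only: dump the queried data to an intermediate XML file with XmlWriter.
using System.Data;
using System.Xml;

static void WriteDataFile(DataTable table, string path)
{
    var settings = new XmlWriterSettings { Indent = true };
    using (XmlWriter writer = XmlWriter.Create(path, settings))
    {
        writer.WriteStartElement("rows");
        foreach (DataRow row in table.Rows)
        {
            writer.WriteStartElement("row");
            writer.WriteAttributeString("questionId", row["QuestionId"].ToString());
            writer.WriteElementString("value", row["Value"].ToString());
            writer.WriteEndElement();   // </row>
        }
        writer.WriteEndElement();       // </rows>
    }
}
```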