Library to read Rocket U2 data

1.1k views Asked by At

We are trying to port data in a Rocket U2 database to SQL server. As per what I read on Wikipedia, the data is stored as delimited text files.

We are considering the following alternatives:

  1. Purchase the Rocket U2 toolkit
  2. Parse the text files ourselves
  3. Import the text files into Excel
  4. Rely on 3rd part libraries

How feasible are each of the above options? Are there any other alternatives we could consider?

3

There are 3 answers

0
Galaxiom On

We connect UniVerse as a Linked Server on SQL Server using UniVerse ODBC which is included in the UniVerse license. It does require quite significant configuration of the UniVerse Dictionaries to bring out all the data on each table.

The UniVerse generic DICT entries (F1, F2 , F3 etc) are not available to UniVerseSQL. Any data to be read must have an entry in the file's DICT. For ODBC it is essential to have an @SELECT entry to list the DICT items that will be available in ODBC because it uses a SELECT * command behind the scenes.

The tables are not directly visible via ODBC and data is accessed from SQL Server is best accessed by constructing Views using OPENQUERY. The ODBC connection can only handle tables sequentially, one table at a time. This limitation is not significant for a couple of tables in a query but bogs down if there are several tables joined. In these cases it is best to write the data to temp tables on SQL Server and query those.

Joins at the ODBC level in the OPENQUERY command are hopelessly slow. Just use OPENQUERY within the query on single tables (or use the table views you create) and do the joins in SQL Server.

The greatest limitation is the ODBC 255 character limit on all fields but this can often be worked around with I-descriptors to bring across multiple blocks of 255 characters as separate dictionary items and concatenate them back together in SQL Server.

UniVerseSQL supports a concept called Dynamic Normalization where the multivalues are transparently presented as virtual related tables. These virtual tables are made available via ODBC. It is pretty nifty to have a UniVerse multivalue database presented in a normalized form in SQL Server.

Subvalues can be supported via I-descriptors to bring out each single value using EXTRACT(). Alternatively the whole multivalue can be brought over with all the delimited subvalues then parsed on SQL Server.

UniVerseSQL is designed to work with the value attributes as columns. It is not suited to structures where there is an arbitrary number of attributes within a record. (For example a Select List).

I have used LOWER() in I-descriptors to convert such structures to a single attribute with multivalues but encountered mixed results. Works fine within UniVerseSQL but does not translate via ODBC, apparently also somehow due to the 255 character limit of the ODBC.

Native UniVerseSQL can be sent to the UniVerse server via ODBC. This bypasses the need for the @SELECT entry in the DICT and allows the use of EVAL() to create on-the-fly I-descriptors without having to configure the file's dictionary. This is particularly useful for ad hoc investigation of data in UniVerse from SQL Server. It could be useful for users who don't have terminal access to Universe, though they would need to understand the syntax of the UniVerse functions supported in I-descriptors.

Native queries also allows tables joined in UniVerse to be presented to ODBC. This works a lot better than joins at the ODBC level. However the best performance can be had by using TRANS() in I-Descriptors to manifest fields from the related table in the main table.

It is important to realise that native queries also allow for UPDATE and INSERT to execute in UniVerse even if the Linked Server is configured as read only. Ensure the permissions to the Universe files are appropriately set up, using extended acls if necessary.

Note that UniVerseSQL does not support the DELETE command on file based tables.

When using UniverseSQL it is important to realise the syntax has some very significant differences from T-SQL. It is absolutely vital to understand the difference between WHERE and WHEN keywords, especially before attempting an UPDATE.

The functionality will be quite limited until the developer understands the syntax of Dynamic Normalization.

0
Gary Brunton On

Our team has created a process that does the following from a .net application:

  1. Create models (think data structures) from the unidata schemas including modeling the multi-value and sub multi-values from associations.

  2. Use these models to generate a sql server db.

  3. Use these models to create html documentation.

  4. Use the LIST command to extract the data to files from our unidata db.

  5. Perform an Rsync to move the data files from a unix server over to our windows server.

  6. Read from the files, truncate any existing sql data and use the models to perform sql bulk inserts.

We use this process for daily migrations so steps 2 and 3 are really only done once (or whenever we add a new file).

It took our team a couple of months to completely automate this and get it right. Our main challenges were from the fact that we had zero knowledge of unidata or multi-value databases. Most of the complexity was parsing the data file's multi-value and sub multi-values. Understanding the associations and breaking them out correctly took us a lot time.

Our final solution turned out quite nice and is extremely fast. We are currently migrating over 2 and a half gigs of data every day from over 57 unidata files in under 2 hours.

So what I'm saying is that this is possible but it is a rather large investment unless you guys have more knowledge than we have (which would be very possible). It looks like you are looking for a one time port as opposed to nightly migrations so I would probably recommend the same as @jbmonco did and just use the .net toolkit that Rocket offers.

0
jbmonco On

The files are hashed tables not pure delimited text files. Do you have the Rocket database installed or just the data files themselves? There a numerous ways to extract data from Rocket U2 files but you do need to understand the data structure which is MultiValued. You might get their faster by hiring someone who is familiar with the environment. There is a MultiValue Users Group and U2 Users Group in LinkedIn.

Other than that I would suggest looking into the Rocket U2 Toolkit for .NET tool as a way to access data from a Microsoft perspective. Here is a link to general information: http://www.rocketsoftware.com/products/rocket-u2-toolkit-net

Also search here on StackOverflow for the tag u2netdk.