Using Microsoft SQL Server Express Edition (64-bit) 10.0.550.0
I'm trying to extract data from an Autodesk Vault server. The SQL involved to get to the required data is too advanced for my current level of knowledge, so I'm trying to lay a puzzle using bits from Google and StackOverflow as pieces. Using this excellent answer I was able to transpose the vertical data into a manageable horizontal format.
The Autodesk Vault database stores information about CAD drawings (among other things). The main vertical table dbo.Property
holds information about all the different revisions of each CAD drawing. The problem I'm currently facing is that I'm getting too much data. I just want the data from the latest revision of each CAD drawing.
Here's my SQL so far
select
CreateDate,
EntityID,
PartNumber,
CategoryName,
[Subject],
Title
from
(
select
EntityID,
CreateDate,
[53] as PartNumber,
[28] as CategoryName,
[42] as [Subject],
[43] as Title
from
(
select
p.Value,
p.PropertyDefID,
p.EntityID,
e.CreateDate
from dbo.Property as p
inner join dbo.Entity as e on p.EntityID = e.EntityId
where p.PropertyDefID in(28, 42, 43, 53)
and e.EntityClassID = 8
) t1
pivot
(
max(Value)
for PropertyDefID in([28], [42], [43], [53])
) t2
) t3
where PartNumber is not null
and PartNumber != ''
and CategoryName = 'Drawing'
-- (1) additional condition
order by PartNumber, CreateDate desc
Where dbo.Property.Value
is of sql_variant
datatype. The query above results in a data set similar to this:
CreateDate | EntityID | PartNumber | CategoryName | Subject | Title
---------------------------------------------------------------------
2016-01-01 | 59046 | 10001 | Drawing | Xxxxx | Yyyyy
2016-05-01 | 60137 | 10001 | Drawing | Xxxxx | Yyyyy
2016-08-01 | 62518 | 10001 | Drawing | Xxxx | Yyyyyy
2016-12-16 | 63007 | 10001 | Drawing | Xxxxxx | Yyyyyy
2016-01-01 | 45776 | 10002 | Drawing | Zzzzz | NULL
2016-11-01 | 65011 | 10002 | Drawing | Zzzzzz | NULL
...
(about 23000 rows)
The problem that I have is that I'm getting all revisions for each drawing. In the example above I only want the latest revision for PartNumber=10001
dated '2016-12-16' etc.
I have also looked at this answer on how to group and select rows where one of the columns has a max value, but I just can't seem to figure out how to combine the two. I tried adding the following snippet to the commented line in the above query, but it fails on many different levels.
and (PartNumber, CreateDate) in
(
select PartNumber, max(CreateDate)
from t3
group by PartNumber
)
The reason I'm tagging this question "pivot", although the pivoting is already done, is that I suspect that the pivoting is what's causing me trouble. I just haven't been able to wrap my head around this pivoting stuff yet, and my SQL optimization skills are seriously lacking. Maybe the filtering should be done at an inner level?
Drawing inspiration from the comment provided by @Strawberry, I kept working and tweaking until I got something that seems to work. I had to use a
PIVOT
inside aPIVOT
for it all to work.Edit: At first I used views, but then the prerequisites changed as I had to work with a read-only database user. Fortunately, I was still allowed to create temporary tables.
This is the final result.
I had to change the suggested
join
to awhere x in(y)
because the join was insanely slow (I terminated the query after four minutes). Now the resulting data set (which takes ~2 seconds to produce) looks promising:No more old revisions in the set.