150MB csv file stored in SQL 2012 FileTable - how can I query the data to insert into a table

1.5k views Asked by At

I have a simple database with a Persons table

UserId, Name,DOB

The table contains 750,000 rows (People).

I receive a CSV file that may contain 10 million rows

UserId, Address1, Address2, Address3, TownCity, Region, Country, 
Telephone1, Telephone2, Telephone3 

of the users current and prior contact details

This file has the usersId in it so I can match the contact details to the user.

A one to many relationship exists between person and contact details

My end goal is a query to select UserId, Name, DOB, Address1, Address2...... etc

Currently I use .net to open the csv file and read line by line and writing each record to SQL server individually (possibly 10 million sql inserts) - This is slow, so I am looking at other was of doing this.

I am experimenting with SQL 2012 FileTable/Stream - so I have the csv file stored in the FileTable and can query as such

SELECT [file_stream], [name], [path_locator], [parent_path_locator],
       [file_type], [cached_file_size], [creation_time]  
FROM [MYDB_FileTable].[dbo].[AddressFileTable]

Does anybody know if I can query the FileStream and bulk insert or any other way of working with the core csv data?

I have looked at

DECLARE @csv varchar(max)
SELECT @csv = convert(varchar(max), [file_stream])  FROM PKIFileTable

but due to the file size this does not work

Any suggestions are welcome.

3

There are 3 answers

2
Janus007 On BEST ANSWER

You could use SSIS for that task. It should be fairly simple to accomplish what you need.

What is slow?

Why was my previous answer deleted? What is going on here?

My answer is based at the last clause : "any other way of working with the core csv data?"

0
Ben On

Thanks Janus007 - it would seem that having someone point out the obvious - use SSIS. Often is the simplest answer. I am using SSIS to do this successfully

But i would still like to know how to best query a CSV file that is stored in a Filestream

0
Janus007 On

Another solution could be to develop a CLR in C# and handle the FileStream with that, utilizing this method you can still use the nice language of TSQL and easily debug in C#.