Replace Text in a column based on values from a lookup table sql

205 views Asked by At

I have a imported table of data from Excel (ImportedTable example) that has data and some "units" jumbled together and trying to "filter" out the units (LookupTable example) to clean it up. The LookupTable can have almost 20 different values, and the ImportedTable is about 2000 records.

Trying to select the text from a column without the values from another table. Example:

LookupTable:

"Id"  | "Name"
'1'   | '% time operational'
'2'   | 'KGal'
'3'   | 'Gallons'
...

ImportedTable:

"Id"  | "Text"
'1'   | 'SomeText here % time operational'
'2'   | '500 KGal'
'3'   | '1.05 Gallons'
'4'   | '105,000'
'5'   | 'TestTextKGal'
...

Desired Result:

'SomeText here'
'500'
'1.05'
'105,000'
'TestText'

I know this doesn't work but wanted something like this to work:

SELECT LTRIM(RTRIM(REPLACE([Text], (SELECT DISTINCT [Name] FROM LookupTable), '')))
FROM ImportedTable

And those lookup values are removed from the "Text" in every record in ImportedTable, in a select statement (not actually removing them from the table).

3

There are 3 answers

1
Bohemian On BEST ANSWER

Cross join the two tables and take the MIN() of the result of making all the substitutions:

SELECT
  i.[id], MIN(TRIM(REPLACE(i.[Name], l.[Name], ''))) AS Name
FROM ImportedTable i
CROSS JOIN LookupTable l
GROUP BY i.[id]

See live demo.

Note that the convenient use of MIN() here only works when removing text from the end of text. It works because MIN() considers text to be "less" if characters are removed from the end, eg 'TestText' is "less than" 'TestTextKGal'.

Note also the use of TRIM(...) instead of LTRIM(RTRIM(...)).

0
nbk On

This solution will remove all text from the second tabe, which has text FROM the first independent of the position

I Expect that all pattern will only once occur, for multiple you must You need to repeat the first CTE until no more matches are found

the perfomance can be improve when you replace the LIKE CONCAT('%' , t1.[Name] , '%') with a full text search.

WITH CTE aS (SELECT
t2.[id],  REPLACE(t2.[Text],t1.[Name],'') resttest
FROM Table1 t1 CROSS JOIN Table2 t2
WHERE t2.[Text] LIKE CONCAT('%' , t1.[Name] , '%'))
SELECT * FROM CTE
  UNION ALL
SELECT  [Id],[Text] FROM Table2 
  WHERE [Id] NOT IN (SELECT [Id] FROM CTE )
ORDER BY Id
id resttest
1 SomeText here
2 500
3 1.05
4 105,000
5 TestText

fiddle

0
John Cappelletti On

Just another option which will support MULTIPLE occurrences of lookup values within the imported text .

Here, we convert your table into a JSON string and then do a Global Search & Replace via the Lookup table. Note the descending order of len([Name]) ... this will avoid any collisions (values within values).

Example

Declare @S nvarchar(max) = (  Select * from ImportedTable for JSON Path )
 
Select @S=replace(@S,[Name],'')
 From  (Select top 10000 * 
         From  LookupTable 
         Order By len([Name]) desc
        ) A

Select *
 From  OpenJSON(@S)
 with ( Id int
       ,Text varchar(500)
      )

Results

Id  Text
1   SomeText here 
2   500 
3   1.05 
4   105,000
5   TestText