SQL Server 2005 Update/Delete Substring of a Lengthy Column

129 views Asked by At

I'm not sure if it is possible to do what I'm trying to do, but I thought i would give it a shot anyway. Also, I am fairly new to the SQL Server world and this is my fist post, so I apologize if my wording is poor or if I leave information out. Also, I am working with SQL Server 2005.

Let's say I have a table named "table" and a column named "column" The contents of column is a jumbled mess of characters (ntext data type). These characters were all drawn in from multiple entry fields in a front end application. Now one of those entry fields was for sensitive information that we no longer need and would like to get rid of but I can't just get rid of the whole column because it also contains other valuable information. Most of the solutions I have found so far only deal with columns that have short entries so they are just able to update the whole string, but for mine I think I need to identify the the beginning and the end of the specific substring that I need and replace it or delete it somehow. This is the closest I have gotten to at least selecting the data that I need... AAA and /AAA mark the beginning and the end of the substring that I need.

select 
    substring (column, charindex ('AAA', column), charindex ('/AAA',column)) 
from table 
where column like '%/AAA%'

The problems I am having with this one are that the substring doesn't stop at /AAA, it just keeps going, and some of the results are just blank so it looks something like:

  1. AAA 12345 /AAA abcdefghijklmnop
  2. AAA 12346 /AAA qrstuvwxyzabcdef
  3. AAA 12347 /AAA abcdefghijklmnop

With the characters in bold being the information I need to get rid of. Also even though row 3 is blank, it still does contain the info that I need but I'm guessing that it isn't returning it because it has a different amount of characters before it (for example, rows 1, 2, and 4 might have 50 characters before them but row 3 might have 100 characters before it), at least that's the only reason that I could think of.

So I suppose the first step would probably be to actually select the right substring, then to either delete it or replace it with a different, meaningless substring like "111111" or something.

If there is more information that you need to be provided with or if I was unclear about anything please let me know and thank you for taking the time to read through (and hopefully answer) my question!

Edit: Another one that gets close to the right results goes something like this

    select substring(column,charindex('AAA',column),20) from table
where column like '%/AAA%'

I'm not sure if this approach would work better since the substring i'm looking for is always going to have the same amount of characters. The problem with this one though, is that instead of having blank rows, they are replaced with irrelevant substrings from that column, but all of the other rows do return exactly what I want.

1

There are 1 answers

2
Greg Viers On

First of all, check your usage of SUBSTRING(). The third argument is for length, not end character, so you would need to alter your query to something like:

    select substring (column, charindex ('AAA',column)
, charindex ('/AAA',column)-charindex ('AAA',column))
 from table where column like '%/AAA%'

Yes your approach of finding it and then either deleting or replacing it is sound.

If some of the results are blank, it's possible that you are finding and replacing the entire string. If it had not found the correct regular expression in there, you would have not returned the row at all, which is different from returning a black value in that column.