How to separate a specific portion out of a string and into its own column in ORACLE?

68 views Asked by At

Say I have the following data set:

Address_1 Address_2
PO BOX 123
COMPANY NAME ONE 1234 ROSE AVE STE 100
456 PETUNIA WAY FIFTH FLOOR
COMPANY NAME TWO PO BOX 78910
COMPANY NAME THREE 1112 DAISY ST THIRD FLOOR

I'm trying to figure out two things here -

  1. How to separate any company name prior to the start of an address. Basically, if the string doesn't start with 'PO' or any sequence of numbers (ex: 456, 1112, etc.), then take that portion out and put it into a new column

  2. How to get the substring directly to the left of the word 'FLOOR' to result into a new column. It would be nice if I could pull over 'FIFTH FLOOR' together, but if not I can always concatenate ' FLOOR' to the end of the new column

I want my data to result as below:

Address_1 Address_2 Company_name ADDR_LNE_1_NM ADDR_LNE_2_NM
PO BOX 123 PO BOX 123
COMPANY NAME ONE 1234 ROSE AVE STE 100 COMPANY NAME ONE 1234 ROSE AVE STE 100
456 PETUNIA WAY FIFTH FLOOR 456 PETUNIA WAY FIFTH FLOOR
COMPANY NAME TWO PO BOX 78910 COMPANY NAME TWO PO BOX 78910
COMPANY NAME THREE 1112 DAISY ST THIRD FLOOR COMPANY NAME THREE 1112 DAISY ST THIRD FLOOR

Here is the current code I have for separating the substring ' FLOOR' out. I just need to know how to modify or change this code to include the 'FIFTH' and 'THIRD' (and any other written out number) in there like in the output. Maybe by pulling everything until the first space to the left of ' FLOOR'?

CASE
    WHEN 
        database.table.Address_1 like '% FLOOR%'
        THEN 
            TRIM(SUBSTR(database.table.Address_1, 0, INSTR(database.table.Address_1, ' FLOOR')-1))

END "ADDR_LNE_1_NM",


CASE
    WHEN
        database.table.Address_1 like '% FLOOR%'
        THEN
            TRIM(SUBSRT(database.table.Address_1, DECODE(INSTR(database.table.Address_1, ' FLOOR'), 0, NULL, INSTR(database.table.Address_1, ' FLOOR'))))

END "ADDR_LNE_2_NM"

I don't have any code for separating the company name. That one I'm at a loss for.

Thank you!

1

There are 1 answers

0
marcothesane On

I would work with the regular expression functions that Oracle offers: REGEXP_SUBSTR() to cut out the pieces on floor and company name, and REGEXP_REPLACE() to remove the "n-th FLOOR" part from the address:

WITH
indata(Address_1,Address_2) AS (
          SELECT 'PO BOX 123'                                  ,NULL      FROM dual
UNION ALL SELECT 'COMPANY NAME ONE 1234 ROSE AVE'              ,'STE 100' FROM dual
UNION ALL SELECT '456 PETUNIA WAY FIFTH FLOOR'                 ,NULL      FROM dual
UNION ALL SELECT 'COMPANY NAME TWO PO BOX 78910'               ,NULL      FROM dual
UNION ALL SELECT 'COMPANY NAME THREE 1112 DAISY ST THIRD FLOOR',NULL      FROM dual
)
SELECT
  address_1
, address_2
, REGEXP_SUBSTR(address_1,'^.+ (?=(PO BOX|\d+ \w+))') AS compname
, REGEXP_REPLACE(
    REGEXP_SUBSTR(address_1,'(PO BOX \d+|\d+).*$')
  , '\w+ FLOOR'
  , ''
  )  AS addr_ln_1
, NVL(REGEXP_SUBSTR(address_1,'\w+ FLOOR$'),address_2) AS addr_ln_2
FROM indata;                                                                                          
ADDRESS_1 ADDRESS_2 COMPNAME ADDR_LN_1 ADDR_LN_2
PO BOX 123 null null PO BOX 123 null
COMPANY NAME ONE 1234 ROSE AVE STE 100 null 1234 ROSE AVE STE 100
456 PETUNIA WAY FIFTH FLOOR null null 456 PETUNIA WAY FIFTH FLOOR
COMPANY NAME TWO PO BOX 78910 null null PO BOX 78910 null
COMPANY NAME THREE 1112 DAISY ST THIRD FLOOR null null 1112 DAISY ST THIRD FLOOR

fiddle