sql - How to link two database tables when no integrity constraints are in place -


i trying link 2 tables have matching field, values fields not same format. example, matching field name "serial id" , value in 1 table may "134a-ac65" value in other table may "134a_ac65" or other combination of characters. issue database tables weren't setup kind of constraints , data has been inputted manually end-users operating in field, accuracy of input questionable. far, i've found matching values exist in both tables, know tables can linked, need know if there other matching values, may not match due poor data entry. ideas? thank you.

values seem consist of "data" , "separators". key success here knowing separators are.

create table table_1 (   serial_id varchar(15) primary key ); insert table_1 values ('134a-ac65'); insert table_1 values ('232a/ac65'); insert table_1 values ('333:a:ac65');  create table table_2 (   serial_id varchar(15) primary key ); insert table_2 values ('134a_ac65'); insert table_2 values ('232a_ac65'); insert table_2 values ('333|a|ac65'); 

five different separators, '-', '_', ':', '|', '/'. i'm replacing them spaces purpose of joining.

select table_1.serial_id, table_2.serial_id  table_1 inner join table_2      on translate(table_1.serial_id, '-_:|/', '     ') = translate(table_2.serial_id, '-_:|/', '     '); 

this how postgresql performed joins. (oracle translate() similar.)

 134a-ac65    134a_ac65 232a/ac65    232a_ac65 333:a:ac65   333|a|ac65 

in general, kind of data error want repair, because joins on expressions tend slow(ish). there's more 1 way that. can update values in-place. can add column each table, update column "right" value serial_id, , join on new column. (cleanup tasks left reader.)


Comments