# Teradata SQL join with approximate match by nearest Neighbor

## Teradata SQL join with approximate match by nearest Neighbor

Contents

Problem Description:

I have two tables `table1` and `table2` I need to join these tables with keys that are not exactly matched, they could differ in the last 3 or 4 digits there is no fixed pattern, the joining will have two conditions:

• 1st Condition: `table2.Key_2` must be numerically greater than `table1.Key_1` so join will will search for nearest large Key
• 2nd Condition: `table1` could have two keys approximately matched with one key in `table2` so the join will consider the nearest large one the other return Null
``````
|Key_1             |
|------------------|
|111330317223636588|
|121199074223629055|
|47256714523667238 |
|101348062023590858|
|106331320423644206|
|106061712623646625|
|120562195823631381|
|155570817823642550|
|114948476223640334|
|103285939423669298|
|103285939423669910|
``````
``````|Key_2              |Value|
|-------------------+-----|
|111330317223636610 |Done |
|121199074223629090 |Done |
|47256714523667300  |Done |
|101348062023590900 |Done |
|106331320423644305 |Done |
|106061712623647100 |Done |
|120562195823631399 |Done |
|155570817823642677 |Done |
|114948476223640455 |Done |
|103285939423669988 |Done |
``````

the expected result will be like the below:

``````|Key_1              |Value|
|-------------------+-----|
|111330317223636588 |Done |
|121199074223629055 |Done |
|47256714523667238  |Done |
|101348062023590858 |Done |
|106331320423644206 |Done |
|106061712623646625 |Done |
|120562195823631381 |Done |
|155570817823642550 |Done |
|114948476223640334 |Done |
|103285939423669298 |NULL |
|103285939423669910 |Done |
``````

## Solution – 1

This join on the next value is hard to express as a join, but simple using this approach:

``````with cte as
( -- combine both key columns into one
select key_1, cast(null as varchar(10)) as val from table1
union all
select key_2 as key_1, "value" from table2
-- order by 1
)
select
key_1
-- return the next rows value
over (order by key_1) as "value"
,val
from cte
-- filter only rows from the first table
qualify val is null
``````

The cte combines both tables:

`````` Key_1              val <-- NULLs indicate rows from table_1
------------------ -----
47256714523667238 NULL
47256714523667300 Done
101348062023590858 NULL
101348062023590900 Done
103285939423669298 NULL
103285939423669910 NULL
103285939423669988 Done
106061712623646625 NULL
106061712623647100 Done
106331320423644206 NULL
106331320423644305 Done
111330317223636588 NULL
111330317223636610 Done
114948476223640334 NULL
114948476223640455 Done
120562195823631381 NULL
120562195823631399 Done
121199074223629055 NULL
121199074223629090 Done
155570817823642550 NULL
155570817823642677 Done
``````

LEAD finds the next row’s value:

`````` Key_1              val  value
------------------ ---- -----
47256714523667238 NULL Done
47256714523667300 Done NULL
101348062023590858 NULL Done
101348062023590900 Done NULL
103285939423669298 NULL NULL
103285939423669910 NULL Done
103285939423669988 Done NULL
106061712623646625 NULL Done
106061712623647100 Done NULL
106331320423644206 NULL Done
106331320423644305 Done NULL
111330317223636588 NULL Done
111330317223636610 Done NULL
114948476223640334 NULL Done
114948476223640455 Done NULL
120562195823631381 NULL Done
120562195823631399 Done NULL
121199074223629055 NULL Done
121199074223629090 Done NULL
155570817823642550 NULL Done
155570817823642677 Done NULL
``````

and the final QUALIFY removes all rows from the 2nd table

`````` Key_1              value
------------------ -----
47256714523667238 Done
101348062023590858 Done
103285939423669298 NULL
103285939423669910 Done
106061712623646625 Done
106331320423644206 Done
111330317223636588 Done
114948476223640334 Done
120562195823631381 Done
121199074223629055 Done
155570817823642550 Done
``````
Rate this post
We use cookies in order to give you the best possible experience on our website. By continuing to use this site, you agree to our use of cookies.