A faster version of your function would be an inline Table Valued Function.
CREATE OR ALTER FUNCTION dbo.Similar (@x varchar(max), @y varchar(max))
RETURNS TABLE AS RETURN
SELECT COUNT(CASE WHEN xJ.value <> yJ.value THEN 1 END) * 1.0 / COUNT(*) AS Pct
FROM (
SELECT *,
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn
FROM STRING_SPLIT(@x, ',')
) xJ
JOIN (
SELECT *,
ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn
FROM STRING_SPLIT(@y, ',')
) yJ ON yJ.rn = xJ.rn;
However, STRING_SPLIT
with a row-number is not guaranteed to always return results in the actual order of the string. It may do it once, it may do it a million times, but there is always a chance the compiler could rearrange things. So instead you could use OPENJSON
CREATE OR ALTER FUNCTION dbo.Similar (@x varchar(max), @y varchar(max))
RETURNS TABLE AS RETURN
SELECT COUNT(CASE WHEN xJ.value <> yJ.value THEN 1 END) * 1.0 / COUNT(*) AS Pct
FROM OPENJSON('[' + @x + ']') xJ
JOIN OPENJSON('[' + @y + ']') yJ ON yJ.[key] = xJ.[key];
You would use it like this
WITH Y AS (
select
a.RowCnt,
a.Lvl,
a.TargetID a_TargetID,
b.targetid b_TargetID,
a.codes a_codes,
b.codes b_codes,
sim.Pct sim
from TargetsComp A
inner join TargetsComp B
on a.RowCnt = b.RowCnt
and a.TargetID < b.TargetID
CROSS APPLY dbo.sim(a.codes, b.codes) sim
)
insert into TargetFilled
(RowCnt, Lvl, a_TargetID, b_TargetID, a_codes, b_codes, sim)
SELECT RowCnt, Lvl, a_TargetID, b_TargetID, a_codes, b_codes, sim
FROM Y;
-- you may want to add
-- WHERE sim.Pct < 100
I have removed the ORDER BY
from the insert as I don't think it's necessary.
You should index your table as follows
CLUSTERED INDEX ON TargetsComp (RowCnt, TargetID)