Hi,
For an analysis I have to perform a following join mentioned below -
SELECT a.AffiliateName, a.CustomerID, a.CustomerKey
from ActiveCustomerInfo a, ActiveCustomerInfoDup b
WHERE (a.CustomerKey != b.CustomerKey)
AND (a.EmailAddressKey = b.EmailAddressKey)
AND ((a.firstname != b.firstname AND a.lastname != b.lastname)
OR (a.firstname != b.firstname AND a.lastname = b.lastname))
GROUP BY a.AffiliateName, a.CustomerID, a.CustomerKey
The table ActiveCustomerInfo and ActiveCustomerInfoDup are identical.
I could also do a self join if it would be faster.
The row count in the table is approximatley 37 million.
The table structure is -
CREATE TABLE `ActiveCustomerInfo` (
`CustomerID` varchar(64) DEFAULT NULL,
`CustomerKey` int(11) NOT NULL,
`FirstName` varchar(255) DEFAULT NULL,
`LastName` varchar(255) DEFAULT NULL,
`EmailAddressKey` int(11) DEFAULT NULL,
`AffiliateName` varchar(255) DEFAULT NULL
) ENGINE=BRIGHTHOUSE DEFAULT CHARSET=latin1;
I could reduce the varchar length for firstName, lastName.
Currently it is taking a lot of time. At the time of writing this request it had already been well over 4 hours.
Please help me fine tune the sql.
Thanks a lot in advance,
Som
