So, we have a customer masterfile that’s been around for years. Users have been keying in customer info and sometimes adding email addresses, sometimes not, sometimes entering nonsense or even just text that says things like “this chap has no email so don’t email him”. Yes, it’s a filthy dirty file filled with grubby email addresses. #grubby It’s time to get cleaning….
Luckily it is super easy to do a quick and basic email validity check using SQL:
SELECT * FROM 'file-name' WHERE 'email-field' NOT LIKE '%_@__%.__%'
This is a basic selection that will check for really obviously malformed email address. It does not check if an email address is actually *real* or not.
We can add some more checks to this without getting into regular expression madness:
select * from 'file-name' where 'email-field' not like '%_@__%.__%' -- Must be email@example.com or email not like '%@%.%' -- Must contain at least one @ and one or email like '%..%' -- Cannot have two periods in a row or email like '%@%@%' -- Cannot have two @ anywhere or email like '%.@%' or email like '%@.%' -- Cannot have @ and . next to each other or email like '%.cm' or email like '%.co' -- Unlikely. Probably typos or email like '%.or' or email like '%.ne' -- Missing last letter
Tweak it to your heart’s delight but it’s a simple and easy way to validate a file that I’ve used dozens of times. 🙂