IBM i Data Obfuscation – Making Data Foggy Murky and Squinty


Jul 03

A little while ago, I wrote a data obfuscation tool – which I decided to call Data Fogging. It is specifically designed to solve the data privacy problems that we face when we want to suck down real data from production environments to our development and test/QA boxes.

Obviously we don’t want to keep any real personal data on our test machines (names, addresses, emails, social security numbers, etc.), and this little tool lets us:

  • record a list of keywords to scan for (e.g. name, address, social, zip)
  • scan all file data for fields containing those words
  • generate automatic SQL scripts to fog (obfuscate) that data
  • and then... fog it
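The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not how the tool is implemented: the keyword list, the `FIELDS` metadata, the function names and the fixed `'XXXXXXXX'` mask are all assumptions made up for the example.

```python
# Hypothetical sketch: none of these names come from the tool itself.
KEYWORDS = ["name", "address", "social", "zip"]

# (library, file, field) triples standing in for real database metadata.
FIELDS = [
    ("TESTLIB", "CUSTOMER", "CUSTNAME"),
    ("TESTLIB", "CUSTOMER", "ZIPCODE"),
    ("TESTLIB", "ORDERS", "ORDERQTY"),
]


def find_hits(fields, keywords):
    """Return the fields whose name contains any keyword (case-insensitive)."""
    return [f for f in fields
            if any(k.lower() in f[2].lower() for k in keywords)]


def fog_script(hit):
    """Build one UPDATE statement that overwrites a field with a fixed mask."""
    library, file, field = hit
    return f"UPDATE {library}.{file} SET {field} = 'XXXXXXXX'"


# Scan, then generate one script per hit.
scripts = [fog_script(h) for h in find_hits(FIELDS, KEYWORDS)]
for s in scripts:
    print(s)
```

Here `CUSTNAME` and `ZIPCODE` are picked up as hits, while `ORDERQTY` is left alone because it matches none of the keywords.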

I included it with the latest V7202 free (and commercial) version of Projex4i.

Hope it's useful!

IBM i DATA OBFUSCATION – FOGGING TOOL

The Data Fogging Tool allows selection and obfuscation of various data sets in any IBM i library or SQL schema.

The main menu is FOG (GO FOG).

1 – WRKFOGWRD 

Create the list of words to scan for.

These scan words are searched for in the database, and each hit lets you define obfuscation rules.

Each search word can have several rules associated with it.

These search word rules let you select database fields based on partial matches or case, and they can scan field names, field text or column headings.

The default obfuscation rule is the one used to obfuscate any field that is selected because it matches this scan word.
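To make the rule options a bit more concrete, here is a minimal Python sketch of how a search word rule could be matched against a field. The `ScanRule` class, its attribute names and the `"MASK"` default are hypothetical and are not taken from the tool; only the ideas (partial hits, case, and scanning names, text or column headings) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class ScanRule:
    word: str
    partial: bool = True            # match anywhere, not only the whole value
    case_sensitive: bool = False
    targets: tuple = ("name", "text", "heading")  # which attributes to scan
    fog_rule: str = "MASK"          # hypothetical default obfuscation rule


def matches(rule, field):
    """Check a rule against a field dict with 'name', 'text' and 'heading' keys."""
    word = rule.word if rule.case_sensitive else rule.word.lower()
    for target in rule.targets:
        value = field.get(target, "")
        if not rule.case_sensitive:
            value = value.lower()
        hit = (word in value) if rule.partial else (word == value)
        if hit:
            return True
    return False


field = {"name": "CUSTNM", "text": "Customer name", "heading": "Cust Name"}

print(matches(ScanRule("name"), field))                       # hit via the field text
print(matches(ScanRule("name", targets=("name",)), field))    # no hit: CUSTNM lacks "name"
print(matches(ScanRule("NAME", case_sensitive=True), field))  # no hit: case differs
```

Scanning the text and column heading as well as the field name matters on systems where field names are short and cryptic, as in the `CUSTNM` example.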

2 – GENFOGDB

Generate the fogging database 

To load and scan the QGPL library the command might be: GENFOGDB FUNCTION(*BOTH) FILE(QGPL/*ALL) 

This will load all files in the specified library and scan those fields for any occurrences of the search words you defined previously. 

All results will be displayed later, allowing you to select which ones you want to obfuscate.

3 – PRTFOG 

Print list of fogging words 

The words that have been defined can be printed to PDF and emailed for clearance prior to use.

4 – FOGNOW 

Now that we have defined the search words and loaded the scan database, we can work with the hits and decide whether fogging rules should be applied.

This is the main brains of the fogging toolset. 

This allows you to perform data obfuscation on all rows found to contain the search words, as well as interrogate data and define standard obfuscation scripts or even create bespoke ones of your own.

If the fogging database has not been built, the utility will report that it is empty.
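For a feel of what an obfuscation rule might do to a single value, here is a small Python sketch. It is not one of the tool's standard scripts; `fog_value` and its `seed` parameter are invented for this example. It swaps every letter and digit for a random one while keeping length, case and punctuation, so the fogged data still looks plausible on screens and reports.

```python
import random
import string


def fog_value(value, seed=None):
    """Replace letters and digits with random ones, keeping length, case and punctuation."""
    rng = random.Random(seed)  # a fixed seed gives repeatable results
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # spaces, commas, etc. stay where they are
    return "".join(out)


print(fog_value("John Smith, 90210"))
```

Passing a seed makes the result repeatable, which can be handy when a test run needs to be reproduced.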

5 – PRGDATA 

Data purging allows you to reduce the dataset that you are working with.  

This screen lets you add SQL purge scripts – obviously only SQL experts should use this option.


About the Author

IBM i Software Developer, Digital Dad, AS400 Anarchist, RPG Modernizer, Alpha Nerd and Passionate Eater of Cheese and Biscuits. Nick Litten Dot Com is a mixture of blog posts that can be sometimes serious, frequently playful and probably down-right pointless all in the space of a day. Enjoy your stay, feel free to comment and in the words of the most interesting man in the world: Stay thirsty my friend.