Inbuilt Cleaning Methods
When creating mappings, there are a number of inbuilt cleaning methods (that are provided by the NdrSupport gem).
These methods undertake standard cleaning of data when mapped into a field, with the rawtext value remaining unchanged.
The clean methods are used in an mapping with the following syntax:
- column: hosp_no
rawtext_name: hospitalnumber
mappings:
- field: hospitalnumber
clean: :lpi
Below is a list of the clean methods, their functionality and examples:
:nhsnumber
Functionality:
- Removes any non numeric characters
Examples:
Raw Value | Cleaned Value |
---|---|
” 123-456-7890” | “1234567890” |
“888 888 8888 “ | “8888888888” |
“678-098 9876” | “6780989876” |
“Quick O`brown, Fox-38” | “38” |
Example fields for use: nhsnumber
:lpi
Funtionality:
- Upcases
- Removes any non aplhanumeric characters
Examples:
Raw Value | Cleaned Value |
---|---|
“rgt9878” | “RGT9878” |
” 1878785234” | “1878785234” |
“RGT-786” | “RGT786” |
“65 78997” | “6578997” |
“Quick O`brown, Fox-38” | “QUICKOBROWNFOX38” |
Example fields for use: hospitalnumber
:hospitalnumber
Funtionality:
- Removes last character from value if it is not a digit
Examples:
Raw Value | Cleaned Value |
---|---|
“RGT1223B” | “RGT1223” |
“746R876” | “746R876” |
“d4578886C” | “d4578886” |
“Quick O`brown, Fox-38”|”Quick O`brown, Fox-38” |
Example fields for use: hospitalnumber
:sex
Functionailty:
- Cleans into consistent format of ‘1’ for male, ‘2’ for female or ‘0’ for not known
Examples:
Raw Value | Cleaned Value |
---|---|
“male” | “1” |
“FEMALE” | “2” |
“1” | “1” |
“2” | “2” |
“M” | “1” |
“F” | “2” |
”” | “0” |
“UNKNOWN” | “0” |
“unk” | “0” |
“Quick O`brown, Fox-38” | “0” |
Example fields for use: sex
:name
Functionailty:
- Removes .
- Replaces , or ; with a space.
- Replaces 2 or more spaces with 1 space
- Replaces ` with ‘
- Removes leading and trailing spaces
Examples:
Raw Value | Cleaned Value |
---|---|
“ollie” | “OLLIE” |
“O`brian” | “O’BRIAN” |
“Smith Jones” | “SMITH JONES” |
” 67890” | “67890” |
”,,, Potato” | “POTATO” |
“Thomas h. “ | “THOMAS H” |
“Quick O`brown, Fox-38” | “QUICK O’BROWN FOX-38” |
Example fields for use: surname, forenames, previoussurname
:roman5
Functionailty:
- Deromanises roman numerals between 1 and 5
Examples:
Raw Value | Cleaned Value |
---|---|
“I” | “1” |
“5” | “5” |
“IV” | “4” |
“iii” | “3” |
“iiC” | “2C” |
“IIII-B” | “4-B” |
“UNKNOWN” | “UNKNOWN” |
“Quick O`brown, Fox-38”|”Qu1ck O`brown, Fox-38” |
:code_icd
Functionality:
- Splits grouped codes by comma, semicolon or space
- Upcases
- ICD code is removed if it is entirely non alphanumeric characters
Examples:
Raw Value | Cleaned Value |
---|---|
“c50.9” | “C50.9” |
“C61.x, C34.2, –.” | “C61.X C34.2” |
“C14x” | “C14X” |
“C61.x, C34.2, –.” | “C61.X C34.2” |
“c459; ~~; C01.9” | “C459 C01.9” |
“Quick O`brown, Fox-38”|”QUICK O`BROWN FOX-38 “ |
Example fields for use: primarydiagnoses, otherdiagnoses
:code_opcs
Functionality:
- Splits grouped codes by comma, semicolon or space
- Upcases
- Non alphanumeric characters removed from each code
- Cleaned codes of length < 3 or > 4 are removed
Examples:
Raw Value | Cleaned Value |
---|---|
“X71.9, ~~, e543” | “X719 E543” |
” t-12.4” | “T124 “ |
“Quick O`brown, Fox-38” | ”” |
Example fields for use: primaryprocedures
:postcode
Functionality:
- Values in a postcode format are upcased and centre padded with space(s) to make it 7 characters long (if required)
- All other values are returned untouched
Examples:
Raw Value | Cleaned Value |
---|---|
“N2[ space ]5zz” | “N2[ space ][ space ]5ZZ” |
“ZZ32 7rr” | “ZZ327RR” |
“W12 8QT “ | “W12 8QT” |
“ab213TT” | “AB213TT” |
“UNKNOWN” | “UNKNOWN” |
“Quick O`brown, Fox-38”|”Quick O`brown, Fox-38” |
Example fields for use: postcode
:tnmcategory
Functionality:
- Leading ‘T’, ‘N’, or ‘M’ are removed (upper or lowercase)
- Lowercase ‘x’ is upcased to ‘X’
- All other values are downcased
Examples:
Raw Value | Cleaned Value |
---|---|
“T1A” | “1a” |
“Nx” | “X” |
“n1” | “1” |
“x” | “X” |
“TIS” | “is” |
“m0” | “0” |
“Unknown” | “unknown” |
“Quick O`brown, Fox-38”|”quick o`brown, fox-38” |
:upcase
Functionality:
- Upcases any raw value
Examples:
Raw Value | Cleaned Value |
---|---|
“c50.9” | “C50.9” |
“iii” | “III” |
“Quick O`brown, Fox-38”|”QUICK O`BROWN, FOX-38” |
Notes:
It’s worth noting that some of these fields benefit from the Standard YAML mappings functionality.