I have some experience using FME to clean and transform data. I found a CSV dataset of Airbnb listings in New York City, USA, and performed several data-cleaning operations on it. The operations I performed include:
- Checking for valid prices (i.e. prices > $0)
- Checking for short-term listings (minimum stay of 30 nights or fewer)
- Removing duplicate-style listings (many hotels list individual rooms of the same type and price, so I filtered these out)
- Removing entries with a missing or invalid host name
- Renaming the attributes to more readable names
- Exporting the data as an Esri shapefile
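
For readers without FME, the filtering and renaming steps above can be approximated in plain Python. This is only a rough sketch and not the FME workspace itself. The column names (`price`, `minimum_nights`, `host_id`, `host_name`, `room_type`, `name`), the set of "invalid" host names, the duplicate key, and the new attribute names are all assumptions made for illustration. The shapefile export is left out because it needs a geospatial library.

```python
# Assumed mapping from original CSV columns to more readable names.
# Names are kept to 10 characters or fewer, the limit for shapefile fields.
RENAMES = {
    "name": "Listing",
    "host_name": "HostName",
    "room_type": "RoomType",
    "price": "PriceUSD",
    "minimum_nights": "MinNights",
}

# Assumed placeholder values that count as an invalid host name.
INVALID_HOST_NAMES = {"", "n/a", "na", "none", "null", "unknown"}


def clean_listings(rows):
    """Filter and rename an iterable of dict rows (one dict per listing)."""
    seen = set()
    cleaned = []
    for row in rows:
        try:
            price = float(row["price"])
            min_nights = int(row["minimum_nights"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows whose price or minimum stay cannot be parsed

        if price <= 0:
            continue  # keep only valid prices (> $0)
        if min_nights > 30:
            continue  # keep only short-term listings (30 nights or fewer)

        host = (row.get("host_name") or "").strip()
        if host.lower() in INVALID_HOST_NAMES:
            continue  # drop missing or invalid host names

        # Assumed definition of a duplicate-style listing:
        # same host, same room type, same price.
        key = (row.get("host_id"), row.get("room_type"), price)
        if key in seen:
            continue
        seen.add(key)

        # Rename attributes; columns not in RENAMES keep their original name.
        cleaned.append({RENAMES.get(k, k): v for k, v in row.items()})
    return cleaned
```

The rows yielded by `csv.DictReader` from the standard library can be passed straight to `clean_listings`, since each one is a dict of strings keyed by column name.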
Shown below is a screenshot of the FME workflow I built:
