CleanHaven: Automatically Tidy Up Your Text and Data

CleanHaven from Holy Mackerel Software is a tool focused on one purpose: simplifying the task of cleaning and formatting text. At first, it might seem that this need is confined to specialists who work with lists, such as IT staff, data specialists and marketing professionals. But there are times when occasional users would benefit from a tool that easily automates, say, formatting names and addresses in a contact list or removing duplicates.

Both the simplicity of use and the fact it is free make CleanHaven an ideal tool for this kind of use. CleanHaven has a powerful set of features available for managing text, and we’ll be taking a look at these in today’s review.

Getting Started

The CleanHaven download is provided as a zip file. After you’ve downloaded it and opened it with Archive Utility, you can drag CleanHaven to your Applications folder.

Each time you launch CleanHaven, the text input window defaults to showing some helpful ‘How to use CleanHaven’ information which includes some test data so you can see the effect of applying each of the clean up options. This is a really helpful way to gradually learn about what the app is capable of.

Design

Opening CleanHaven shows the text input window:

CleanHaven text input window

CleanHaven keeps a rigid separation between input and output views by launching a separate output window – the Results window – as soon as you perform a conversion operation:

CleanHaven Results window

This allows you to visually compare the input and output to see the effect your changes have had.

Functionality

The most direct way to start using CleanHaven is to paste into the input window the text you want to work with. It’s also possible to open and read in any text file.

Now you have the text you want to clean, you can pick which operation you want to perform on it. The Convert tab in the input window lets you choose a variety of options which you can either use individually, or combine together.

Convert options

Conversion Options

Ticking a check box turns on a particular category, after which you can select an action from a drop-down list. For example, under Case you can choose between Title Case, Sentence case, UPPERCASE, lowercase, RaNdOM cAsE and “curly quotes”.

There is a similar degree of control over ordering under Sort, and over the ways in which duplicate entries in your text can be treated.
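For readers curious about what sorting and duplicate handling amount to, here is a minimal Python sketch. It is not CleanHaven's code: the function name `sort_and_dedupe` and the choice to match duplicates case-insensitively are assumptions made purely for illustration.

```python
def sort_and_dedupe(text, ignore_case=True):
    """Return the lines of `text` sorted A-Z with duplicates removed.

    Hypothetical helper for illustration; not part of CleanHaven.
    """
    seen = set()
    unique = []
    for line in text.splitlines():
        stripped = line.strip()
        key = stripped.lower() if ignore_case else stripped
        if key and key not in seen:  # skip blank lines and repeats
            seen.add(key)
            unique.append(stripped)  # keep the first spelling encountered
    # Sort without regard to case so "alice" and "Bob" order naturally
    return "\n".join(sorted(unique, key=str.lower))


print(sort_and_dedupe("bob\nAlice\nBob\n\nalice"))
```

The first spelling of each entry is the one that survives, which is one of several reasonable ways a duplicate could be treated.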

The Remove option has a useful list of items you can choose to have taken out of your text: excess returns, excess spaces, linefeeds, non-ASCII, non-letters, non-numbers, periods, punctuation, returns, spaces and tabs. It also allows you to convert bidirectionally between linefeeds and returns.
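To make a few of these removals concrete, the sketch below shows roughly equivalent operations in Python. The function names are hypothetical and the exact rules (for example, how many returns count as "excess") are assumptions; CleanHaven's own behaviour may differ.

```python
import re


def remove_excess_spaces(text):
    """Collapse runs of spaces or tabs to one space and trim each line."""
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    return "\n".join(lines)


def remove_excess_returns(text):
    """Collapse three or more consecutive newlines to a single blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


def returns_to_linefeeds(text):
    """Convert CRLF and bare CR line endings to LF."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def remove_non_ascii(text):
    """Drop every character outside the ASCII range."""
    return text.encode("ascii", errors="ignore").decode("ascii")


sample = "Jane   Smith \r\n\r\n\r\n\r\nCaf\u00e9  Road"
cleaned = remove_non_ascii(
    remove_excess_returns(remove_excess_spaces(returns_to_linefeeds(sample)))
)
print(cleaned)
```

Normalising line endings first keeps the later steps simple, since they only need to deal with linefeeds.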

The Personal options are of particular interest as they include the ability to convert full names to first name/last name, format phone numbers and expand UK postcodes to show the city and county. Finally on this tab, Info can provide details on correct and incorrect spellings, frequency of words in your input text and other measurements.

Having chosen the options and attributes you want to apply, you then click the Clean button to see the outcome in the Results window. If the outcome is not what you needed, you can change your options back in the input window, click Clean again and the results will be updated accordingly.

Once you are satisfied with the results, you have the option of either adding them to the clipboard or, by clicking the Source button, passing them back as a new input to CleanHaven. Thus you can carry out a multi-stage clean up by applying one rule at a time and examining the interim results to check they meet your needs before progressing to the next stage.

As well as copying your results to the clipboard, CleanHaven lets you save them as a text file, an Excel file, or a tab or comma separated values (CSV) file.
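If you have not met tab or comma separated files before, the following Python sketch shows how the same rows look in each format. It uses Python's standard `csv` module rather than anything from CleanHaven, and the helper name `to_delimited` and the sample rows are made up for illustration.

```python
import csv
import io


def to_delimited(rows, delimiter=","):
    """Serialise a list of rows to delimited text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["Name", "City"], ["Smith, Jane", "Leeds"]]

# Comma separated: a field that itself contains a comma gets quoted
print(to_delimited(rows))

# Tab separated: the comma is no longer special, so no quoting is needed
print(to_delimited(rows, delimiter="\t"))
```

The quoting difference is the main practical reason to prefer tab separated output when your data contains commas, such as "Surname, Forename" lists.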

Find And Replace

In addition to the primary text and format manipulation purpose of CleanHaven, it is possible to find and replace terms in your input text via the Replace tab. The special characters Carriage Return, Linefeed, Tab and Escape can be included in find and replace strings.

CleanHaven Replace options

Table View

You can toggle between Text and Table views in the input and results screens. So far we’ve only shown Text view, but if you have tabular data, you can switch to Table view and apply your convert and replace actions either to all columns or only to one individual column. This is managed through a drop-down choice in the Settings tab.

CleanHaven settings options

Rival Text Cleaners

There are products that compete with CleanHaven, such as CleanText ($29), which has a wide range of built-in conversions, and TextSoap ($39.95), which has plug-ins for other popular applications and supports scripting. Although each will have its supporters, CleanHaven’s price advantage of costing you nothing makes it a good starting point for people needing to manipulate text, and it may well be all you need.

Conclusion

CleanHaven, which is also available for Windows and Linux, is a very powerful tool for automating the laborious process of correcting or reformatting text to meet your needs. It has some drawbacks: the spelling features don’t cope well with punctuation, and the product shows its British origins with the sole inclusion of UK postcodes.

These are more than offset by a simple learning curve, a handy set of ‘how to’ videos on the web site, and the value it offers. CleanHaven deserves a place within the toolbox of anyone who needs to manipulate data on even an occasional basis.


Summary

CleanHaven has a powerful set of features for managing text and manipulating large sets of data, so you can easily fix problems and save lots of time!

8
  • Tonijn

    I don’t get it. Why would you need such an app?

  • rudie

    Question: Does the app have more than one output format? For example, can it output in html?

  • Quine

    This looks good for novices who need to do some minimal text processing.

    For programmers / power users I recommend TextMate, which has more ways to process text than any other program I’ve ever seen, many of which require nothing more than searching for the name of the command (e.g. “convert tabs to spaces” or “strip trailing whitespace from lines”). I’d especially recommend TextMate over the non-free text cleaners, since it has many more uses than simply ‘cleaning’ text.

    Of course if you need some (probably) overkill tools you can also just use awk and sed too.
