Tidy Development Version



Of course, it is always recommended that you download and use HTML Tidy from its official web site ... However, since I now use HTML Tidy virtually all the time to 'tidy up' each HTML file on my site, I wanted a command line version that worked the way I wanted it ;=))

The CVS source of Tidy comes with a command line interpreter, console/tidy.c, but due to the unix heritage of Tidy, this is geared toward a unix/linux/POSIX standard. It has commands that begin with a single '-', and others that begin with a '--', and allows some single letter commands ... all quite confusing to a Windows person. Unix usually calls these 'short' and 'long' commands ...

So although I did not really add anything new, I re-worked this command line interpreter, and called it console/tidy2.c. In the main it acts the same as the unix version for most items, but anything that can be input using a single '-' can now be input also using '--', and vice versa.
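As a rough illustration only (this is not the code in console/tidy2.c), the idea of accepting either prefix can be sketched in Python. The table KNOWN_OPTIONS and the function normalize_option are hypothetical names, and the table holds just a few option names mentioned on this page.

```python
# Hypothetical option table for illustration; the real tables live in console/tidy2.c.
KNOWN_OPTIONS = {"indent", "i", "wrap", "v", "tidy-mark"}


def normalize_option(arg):
    """Return the option name without its dash prefix, or None for a bare file name.

    Both '-name' and '--name' are accepted and map to the same name.
    """
    if arg.startswith("--"):
        name = arg[2:]
    elif arg.startswith("-"):
        name = arg[1:]
    else:
        return None  # no dash prefix: treat it as a file name
    if name not in KNOWN_OPTIONS:
        raise ValueError("unknown option: " + arg)
    return name
```

With this sketch, normalize_option("-wrap") and normalize_option("--wrap") both give "wrap", so the caller never needs to know which prefix was typed.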

There are, however, at least two important changes :-
(a) the order of the commands no longer matters, and
(b) Tidy will accept @input.txt, a response file of line-separated commands.

The order used to be very important for unix, since multiple strings of commands could be input at one time. Each time the 'standard' interpreter reached a 'bare' file name, it would begin processing that file, using the previously found '-' or '--' commands, and/or those found in any 'config' file, and when that HTML/XML file was completed, would come back to the interpreter, accepting new commands until another 'bare' filename was found, and so on ...

Now each such block of commands can be put in its own input response file, in any order, and Tidy can then be run once per response file to achieve similar multiple file processing.
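A minimal sketch of how a response file might be expanded, written in Python rather than taken from console/tidy2.c. It assumes one command per line, skips blank lines, and treats a line starting with ';' as a comment, as listed among the features on this page. The names load_file and expand_arguments are hypothetical, and each line is kept as a single string, since how tidy2.c splits an option from its value is not shown here.

```python
def load_file(path):
    """Read a response file as text."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()


def expand_arguments(args, load=load_file):
    """Replace each '@file' argument with the commands found in that file."""
    expanded = []
    for arg in args:
        if not arg.startswith("@"):
            expanded.append(arg)
            continue
        for line in load(arg[1:]).splitlines():
            line = line.strip()
            # Assumption: blank lines are ignored and ';' starts a whole-line comment.
            if not line or line.startswith(";"):
                continue
            expanded.append(line)
    return expanded
```

Because the order of commands no longer matters, the expanded list can simply be merged with whatever else was given on the command line.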

One other important change, for me at least, is that Tidy will now EXIT if it detects any error in the command line. I consider it highly undesirable that an application 'continue' after having trouble with the command line ;=() Unfortunately, I have also removed the use of 'standard input', mainly because I could not always get it working properly in Windows, and I feel it would seldom be used in Windows anyway ...



As stated, this was all achieved by simply replacing the CVS console/tidy.c source with my new console/tidy2.c source, and linking it with the current static library. A better name would probably have been tidyWIN.c, but I only thought of that later ;=))

With this 'enhanced' command processor, Tidy :-
(a) supports an @input file, with line delimited options; comments start with ';' (new)
(b) only supports ONE command line set at a time (new behavior)
(c) will ABORT (exit) on any command error (new behavior)
(d) makes all command line checks from expanded internal tables (new)
(e) exits with ERRORLEVEL set to -
 0 = no errors or warnings in parse, or any HELP command;
 1 = some warnings, but no errors;
 2 = some errors, and maybe warnings also;
 3 = command line, or config file, parse error (new behavior);
(f) gives HELP if ANY Windows style '/' is found in the command line (new behavior).
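The ERRORLEVEL values in item (e) can be sketched as a small Python function. This is only an illustration of the table above, not the tidy2.c source. The names tidy_exit_code and EXIT_MEANINGS are hypothetical, and giving the command error priority over the other results is an assumption based on Tidy exiting before any file is parsed.

```python
# Meanings copied from item (e) above.
EXIT_MEANINGS = {
    0: "no errors or warnings in parse, or any HELP command",
    1: "some warnings, but no errors",
    2: "some errors, and maybe warnings also",
    3: "command line, or config file, parse error",
}


def tidy_exit_code(errors, warnings, command_error=False):
    """Map parse results to the ERRORLEVEL listed in item (e)."""
    if command_error:
        return 3  # assumption: checked first, since no file is parsed in this case
    if errors > 0:
        return 2
    if warnings > 0:
        return 1
    return 0
```

A batch file or build script can then test the ERRORLEVEL and, for example, treat 0 and 1 as usable output while stopping on 2 or 3.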

There is one anomaly. Old tidy supported -indent, or -i, to set indent to auto only *AND* --indent <columns> to set the indent column size. If TIDY_NEW_INDENT is undefined - it is defined by default - then this Tidy emulates that OLD, dual behavior.

As indicated, all this can be achieved, simply by using my console/tidy2.c source, included in the downloads below, instead of the current CVS console/tidy.c source. One file change.

However, as well as this completely NEW command line interpreter, this development version also has some of my own enhancements. One I find invaluable is some enhanced cleaning of HTML as output by MS Word; another is improved handling of javascript ... It also has a lexer debug mode, which really helps in understanding and debugging Tidy file parsing. Although most of these have been presented to the present Tidy community, none made it into the CVS development code, so I have included them here, in this 'development' version.



2011-01-11:14: Commenced update including HTML5 support (ongoing), and the 'dev' version has a patch to output error messages in a form suitable for use with MSVC - set tidydev.exe as an external tool.

Some downloads: TAKE CARE running EXECUTABLES from the web.
tidycvs-14-bin.zip: Release candidate, including HTML5 support.
tidycvs-14-src.zip: Full modified source, with MSVC build files.
tidydev-14-bin.zip: Development version, release candidate, with HTML5 support; using the config option 'gnu-emacs true' will output error messages in the MSVC form.
tidydev-14-src.zip: Full modified source, with MSVC build files.

Date        Link                Size       MD5
2012/01/11  tidycvs-14-bin.zip  140,267    6b00582fa7a26fb321bc4e1df8f5b86e
2012/01/11  tidycvs-14-src.zip  670,905    b7189e688d4e0bf424cfee1ce408ea2a
2012/01/11  tidydev-14-bin.zip  145,827    ef24670a31cbc7d30f92aa0ddb9abdd3
2012/01/11  tidydev-14-src.zip  1,152,301  b0c4e52f28c5b96535228196ecbf90e0
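Since MD5 values are given for each download, one way to check a downloaded file against them is sketched below in Python. The checksums are copied from the table above; the names PUBLISHED_MD5, md5_of_stream and matches_published are hypothetical helpers for illustration, and the sketch assumes the zip keeps its original file name.

```python
import hashlib
import os

# Checksums copied from the table above.
PUBLISHED_MD5 = {
    "tidycvs-14-bin.zip": "6b00582fa7a26fb321bc4e1df8f5b86e",
    "tidycvs-14-src.zip": "b7189e688d4e0bf424cfee1ce408ea2a",
    "tidydev-14-bin.zip": "ef24670a31cbc7d30f92aa0ddb9abdd3",
    "tidydev-14-src.zip": "b0c4e52f28c5b96535228196ecbf90e0",
}


def md5_of_stream(stream, chunk_size=65536):
    """Return the hex MD5 digest of a binary stream, read in chunks."""
    digest = hashlib.md5()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def matches_published(path):
    """True only if the file name is in the table and its MD5 matches."""
    expected = PUBLISHED_MD5.get(os.path.basename(path))
    if expected is None:
        return False
    with open(path, "rb") as handle:
        return md5_of_stream(handle) == expected
```

For example, matches_published("tidydev-14-bin.zip") returns True only when the file in the current directory has the listed checksum. A match shows the download is intact, not that the executable is safe to run.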

Older Versions:

As usual, take care with downloading and running executable files from the web!

Later development versions are available from the index page.

Description                   Download        Date        Size       MD5
WIN32 EXE using MSVC8         tidydeve02.zip  09/01/2008  141,289    dd0305d942cb6f63fa3c63a03ffea73b
Complete modified DEV source  tidydev-02.zip  09/01/2008  3,102,474  774c47634b21e29bb1a2dae0a070154b
Diff, new and modified files  tidydevd02.zip  09/01/2008  147,090    1444cb22715d1cc8dabd3dc61a06b1fc

tidydeve02.zip - the contents are just the release tidy.exe ... which I rename to tidydev.exe on my system, to keep it separate from other versions ...

tidydev-02.zip - This is the COMPLETE DEV source, including some modules that are not used, and the MSVC8 SLN and VCPROJ files ... of course, most of this is exactly what can be downloaded from Tidy CVS ...

tidydevd02.zip - This includes tidydev-02.txt, a diff -u file, and the new modules used, console/tidy2.c, src/clean2003.c and src/lexer_dbg.c, as well as copies of all the modified files. With this you should be able to download the latest CVS source, apply tidydev-02.txt using patch, and unpack the 3 new modules, console/tidy2.c, src/clean2003.c and src/lexer_dbg.c. (Note: lexer_dbg.c is only for DEBUG builds.)

But if all you want, or need, is a different 'command processor', then only console/tidy2.c is required.



PS: This file was tidied using this exe, which, using -v, outputs a version date of :-
HTML Tidy for Windows released on 6 November 2007, compiled on Jan 9 2008 at 14:54:59
and was tidied with the command options :-
--wrap 99 --tidy-mark no --indent yes --break-before-br yes --indent-attributes yes --vertical-space yes --indent-spaces 1

PPS: 20080109 - Item (f) above was added in response to Feature Request 1867576. It treats any '/' found in the command line as an error, outputs HELP, and exits.

