Updated WordPress Code Highlighter to GeSHi 1.0.8.6
Wednesday, May 12th, 2010
Today I've been tracking down a little rendering bug with MarsEdit and WordPress. It's in the pre tags that run through the Code Highlighter plugin that I've been using for quite a while. It's the "less-than-sign". If I use the HTML escape code then I get that as the literal text in the MarsEdit preview pane as well as the WordPress page. But if I place the single character, MarsEdit thinks it's the start of a tag, and gets confused on the syntax highlighting.
So the two questions are: why can't the Code Highlighter handle the HTML escape for the less-than-sign, and why is MarsEdit not rendering it properly? I'm going to attack the first, and let Daniel handle the second.
First thing I noticed was that the Code Highlighter is based on the GeSHi engine and the version it's working on was 1.0.7.x whereas the current version is 1.0.8.6. So let's see if we can update the GeSHi engine in the plugin without breaking anything.
Turns out, it's a pretty simple encapsulation of the GeSHi engine. I was able to drop in the new version without too much trouble. In the process, I've considerably expanded the set of languages this plugin can work with. That's a very nice little perk.
Unfortunately, this didn't solve the problem with the less-than-sign. I took a cursory look through the GeSHi code and didn't see where it'd be doing any conversions. I'll probably spend a little more time on it - just to see if it's possible. But even if that doesn't come to anything, we have a far better selection of supported languages:
4cs, abap, actionscript, actionscript3, ada, apache, applescript, apt_sources, asm, asp, autohotkey, autoit, avisynth, awk, bash, basic4gl, bf, bibtex, blitzbasic, bnf, boo, c, c_mac, caddcl, cadlisp, cfdg, cfm, cil, clojure, cmake, cobol, cpp-qt, cpp, csharp, css, cuesheet, d, dcs, delphi, diff, div, dos, dot, eiffel, email, erlang, fo, fortran, freebasic, fsharp, gambas, gdb, genero, gettext, glsl, gml, gnuplot, groovy, haskell, hq9plus, html4strict, idl, ini, inno, intercal, io, java, java5, javascript, jquery, kixtart, klonec, klonecpp, latex, lisp, locobasic, logtalk, lolcode, lotusformulas, lotusscript, lscript, lsl2, lua, m68k, make, mapbasic, matlab, mirc, mmix, modula3, mpasm, mxml, mysql, newlisp, nsis, oberon2, objc, ocaml-brief, ocaml, oobas, oracle11, oracle8, pascal, per, perl, perl6, php-brief, php, pic16, pike, pixelbender, plsql, povray, powerbuilder, powershell, progress, prolog, properties, providex, purebasic, python, qbasic, rails, rebol, reg, robots, rsplus, ruby, sas, scala, scheme, scilab, sdlbasic, smalltalk, smarty, sql, systemverilog, tcl, teraterm, text, thinbasic, tsql, typoscript, vb, vbnet, verilog, vhdl, vim, visualfoxpro, visualprolog, whitespace, whois, winbatch, xml, xorg_conf, xpp, z80
[5/13] UPDATE: I was doing some more digging into the GeSHi engine - actually the Code Highlighter plugin, and I found what I thought was going to be a good place to fix this problem. In the codehighlighter.php file, we see:
    if ($lang != null) {
        $tabstop = 2;
        $geshi =& new GeSHi($code, $lang);
        $geshi->set_tab_width($tabstop);
where it's clear in the comments that he's allowing for the special case use of the pre tag, and I decided to try a simple modification of that for these less-than and greater-than signs I'm having trouble with:
    if ($lang != null) {
        $tabstop = 2;
        $geshi =& new GeSHi($code, $lang);
        $geshi->set_tab_width($tabstop);
This is a little odd in the way I have to show it, but it's pretty simple to understand - you replace the HTML escape sequence with the single character in the code. From there, you let the GeSHi engine do its thing.
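To make the idea concrete, here's a minimal standalone sketch of that replacement step. It is not the plugin's actual code: the function name unescape_angle_brackets is hypothetical, and I'm assuming the escapes in question are the standard &lt; and &gt; entities. In the plugin, something like this would run on $code before it's handed to the GeSHi constructor.

```php
<?php
// Hypothetical helper (not part of the Code Highlighter plugin):
// turn the HTML escapes for the less-than and greater-than signs back
// into the literal characters, so the highlighter sees real source code.
function unescape_angle_brackets($code) {
    // Only these two entities are touched; everything else is left alone.
    return str_replace(array('&lt;', '&gt;'), array('<', '>'), $code);
}

// Usage: an escaped snippet as it might arrive from the post body.
$code = 'if (a &lt; b) { swap(a, b); } else if (a &gt; c) { swap(a, c); }';
echo unescape_angle_brackets($code), "\n";
```

Because only those two entities are replaced, any other escape in the snippet passes through unchanged, which keeps the change as narrow as the problem it's meant to fix.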
What I found was that it worked wonderfully! What a treat. Now I can use either method, and hopefully Daniel will have a fix for MarsEdit sooner rather than later.
The next thing I wanted to tackle with the Code Highlighter was the line numbers. There was far too much space between the lines in a code sample with line numbers. Turns out, there's a style for that in GeSHi. Simply edit the geshi.php file:
    /**
     * Line number styles
     * @var string
     */
    var $line_style1 = 'font-weight: normal; vertical-align:top;';

    /**
     * Line number styles for fancy lines
     * @var string
     */
    var $line_style2 = 'font-weight: bold; vertical-align:top;';
to be:
    /**
     * Line number styles
     * @var string
     */
    var $line_style1 = 'margin: 0; font-weight: normal; vertical-align:top;';

    /**
     * Line number styles for fancy lines
     * @var string
     */
    var $line_style2 = 'margin: 0; font-weight: bold; vertical-align:top;';
and the extra border space that the default WordPress theme puts into the li tag will be removed and it'll look much better.
The last little annoyance is the blank lines that start, and end, the code section when you use line numbers. It's just plain annoying. It makes it hard to get the numbers right, and it's whitespace that's not needed. It's a little more involved, but not too bad. In the geshi.php file, you need to change:
    // Get code into lines
    /** NOTE: memorypeak #2 */
to:
    // Get code into lines
    /** NOTE: memorypeak #2 */
    // remove a blank first and last line
    }
    }
Now I can imagine a way that might be a little more efficient, but I'm not worried at this point. It's not all that bad, and it's very solid. If the first or last lines are empty of code, they get removed and the array is re-indexed. Simple.
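The logic described above can be sketched on its own, outside of geshi.php. This is an illustration under assumptions, not the change itself: trim_blank_edge_lines is a hypothetical name, and I'm assuming the code has already been split into an array of lines and that "empty" means nothing but whitespace.

```php
<?php
// Hypothetical helper, not GeSHi's own code: drop a blank first and/or
// last line from an array of lines, then re-index the array from zero.
function trim_blank_edge_lines(array $lines) {
    // Remove the first line if it holds nothing but whitespace.
    if (count($lines) > 0 && trim($lines[0]) === '') {
        array_shift($lines);
    }
    // Remove the last line under the same test.
    $last = count($lines) - 1;
    if ($last >= 0 && trim($lines[$last]) === '') {
        array_pop($lines);
    }
    // Make sure the keys run 0..n-1 so the line numbers come out right.
    return array_values($lines);
}

// Usage: a code sample with a stray blank line at each end.
$lines = explode("\n", "\nint x = 1;\nreturn x;\n");
print_r(trim_blank_edge_lines($lines));
```

Only one blank line is removed from each end, so a blank line that's deliberately inside the sample stays where it is.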
With this, I have a really nicely workable solution for my code. Nice.