Class | SM::SimpleMarkup |
In: | markup/simple_markup.rb |
Parent: | Object |
This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.
  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  p = SM::SimpleMarkup.new
  h = SM::ToHtml.new

  puts p.convert(input_string, h)
You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:
  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  class WikiHtml < SM::ToHtml
    def handle_special_WIKIWORD(special)
      "<font color=red>" + special.text + "</font>"
    end
  end

  p = SM::SimpleMarkup.new
  p.add_word_pair("{", "}", :STRIKE)
  p.add_html("no", :STRIKE)

  p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

  h = WikiHtml.new
  h.add_tag(:STRIKE, "<strike>", "</strike>")

  puts "<body>" + p.convert(ARGF.read, h) + "</body>"
SPACE | = | ?\s | ||
SIMPLE_LIST_RE | = | /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x |
List entries look like:
  *       text
  1.      text
  [label] text
  label:: text
Flag it as a list entry, and work out the indent for subsequent lines.
LABEL_LIST_RE | = | /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x |
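To see what the two list patterns capture, the following is a minimal sketch that copies the regular expressions from the constants above into a standalone script. The helper name `list_prefix` is hypothetical and is not part of SM::SimpleMarkup. The first capture group is the marker plus its trailing whitespace (used to compute the indent of following lines), and the second is the marker alone.

```ruby
# Patterns copied from the constants listed above.
SIMPLE_LIST_RE = /^(
  (  \*          (?# bullet)
    |-           (?# bullet)
    |\d+\.       (?# numbered )
    |[A-Za-z]\.  (?# alphabetically numbered )
  )
  \s+
)\S/x

LABEL_LIST_RE = /^(
  (  \[.*?\]    (?# labeled )
    |\S.*::     (?# note )
  )(?:\s+|$)
)/x

# Hypothetical helper (not in the library): returns
# [marker_with_trailing_space, marker], or nil for a non-list line.
def list_prefix(line)
  m = SIMPLE_LIST_RE.match(line) || LABEL_LIST_RE.match(line)
  m && [m[1], m[2]]
end

p list_prefix("* text")         # bullet
p list_prefix("12. item")       # numbered
p list_prefix("[cat] a feline") # labeled
p list_prefix("cat:: a feline") # note
p list_prefix("-no space")      # not a list entry: the marker needs trailing space
```

The length of the first element is what the parser adds to the current margin to find where the entry's text starts.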
Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.
  # File markup/simple_markup.rb, line 207
  def initialize
    @am = AttributeManager.new
    @output = nil
    @block_exceptions = nil
  end
Add to the sequences recognized as general markup
  # File markup/simple_markup.rb, line 226
  def add_html(tag, name)
    @am.add_html(tag, name)
  end
Add to other inline sequences. For example, we could add WikiWords using something like:
parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
Each wiki word will be presented to the output formatter via the accept_special method
  # File markup/simple_markup.rb, line 240
  def add_special(pattern, name)
    @am.add_special(pattern, name)
  end
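Before registering a pattern with add_special, it can help to check what it matches on plain strings. The sketch below uses only core Ruby and the WikiWord pattern from the example above; the method name `wiki_words` is hypothetical and exists only for illustration.

```ruby
# The WikiWord pattern from the add_special example: a capital, one or more
# lowercase letters, another capital, then further word characters.
WIKIWORD_RE = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical helper: list every WikiWord found in a string.
def wiki_words(text)
  text.scan(WIKIWORD_RE).flatten
end

p wiki_words("SimpleMarkup treats WikiWords specially, but not Ruby or RDoc.")
```

Note that "RDoc" is not matched, because the pattern requires at least one lowercase letter between the first two capitals.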
Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name
  # File markup/simple_markup.rb, line 218
  def add_word_pair(start, stop, name)
    @am.add_word_pair(start, stop, name)
  end
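As a rough illustration of what a word pair means, the sketch below wraps a single word enclosed in start and stop delimiters with output tags. This is an assumption-laden stand-in written with core Ruby only: the method `wrap_word_pairs` is hypothetical, and the real work is delegated to AttributeManager, whose implementation is not shown on this page and may differ.

```ruby
# Hypothetical stand-in for word-pair handling (NOT the AttributeManager code):
# replace start+word+stop with open_tag+word+close_tag.
def wrap_word_pairs(text, start, stop, open_tag, close_tag)
  pattern = /#{Regexp.escape(start)}(\w+)#{Regexp.escape(stop)}/
  text.gsub(pattern) { open_tag + Regexp.last_match(1) + close_tag }
end

puts wrap_word_pairs("this is {wrong} text", "{", "}", "<strike>", "</strike>")
```

Only a single word between the delimiters is matched here, which mirrors the "individual word" wording above; multi-word spans would go through add_html-style tags instead.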
Look through the text at line indentation. We flag each line as being Blank, a paragraph, a list element, or verbatim text
  # File markup/simple_markup.rb, line 274
  def assign_types_to_lines(margin = 0, level = 0)
    now_blocking = false
    while line = @lines.next

      if line.isBlank?
        line.stamp(Line::BLANK, level)
        next
      end

      # if a line contains non-blanks before the margin, then it must belong
      # to an outer level

      text = line.text

      for i in 0...margin
        if text[i] != SPACE
          @lines.unget
          return
        end
      end

      active_line = text[margin..-1]

      #
      # block_exceptions checking
      #
      if @block_exceptions
        if now_blocking
          line.stamp(Line::PARAGRAPH, level)
          @block_exceptions.each{ |be|
            if now_blocking == be['name']
              be['replaces'].each{ |rep|
                line.text.gsub!(rep['from'], rep['to'])
              }
            end
            if now_blocking == be['name'] && line.text =~ be['end']
              now_blocking = false
              break
            end
          }
          next
        else
          @block_exceptions.each{ |be|
            if line.text =~ be['start']
              now_blocking = be['name']
              line.stamp(Line::PARAGRAPH, level)
              break
            end
          }
          next if now_blocking
        end
      end


      # Rules (horizontal lines) look like
      #
      #  ---   (three or more hyphens)
      #
      # The more hyphens, the thicker the rule
      #

      if /^(---+)\s*$/ =~ active_line
        line.stamp(Line::RULE, level, $1.length-2)
        next
      end

      # Then look for list entries. First the ones that have to have
      # text following them (* xxx, - xxx, and dd. xxx)

      if SIMPLE_LIST_RE =~ active_line

        offset = margin + $1.length
        prefix = $2
        prefix_length = prefix.length

        flag = case prefix
               when "*","-" then ListBase::BULLET
               when /^\d/ then ListBase::NUMBER
               when /^[A-Z]/ then ListBase::UPPERALPHA
               when /^[a-z]/ then ListBase::LOWERALPHA
               else raise "Invalid List Type: #{self.inspect}"
               end

        line.stamp(Line::LIST, level+1, prefix, flag)
        text[margin, prefix_length] = " " * prefix_length
        assign_types_to_lines(offset, level + 1)
        next
      end


      if LABEL_LIST_RE =~ active_line
        offset = margin + $1.length
        prefix = $2
        prefix_length = prefix.length

        next if handled_labeled_list(line, level, margin, offset, prefix)
      end

      # Headings look like
      # = Main heading
      # == Second level
      # === Third
      #
      # Headings reset the level to 0

      if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
        prefix_length = $1.length
        prefix_length = 6 if prefix_length > 6
        line.stamp(Line::HEADING, 0, prefix_length)
        line.strip_leading(margin + prefix_length)
        next
      end

      # If the character's a space, then we have verbatim text,
      # otherwise

      if active_line[0] == SPACE
        line.strip_leading(margin) if margin > 0
        line.stamp(Line::VERBATIM, level)
      else
        line.stamp(Line::PARAGRAPH, level)
      end
    end
  end
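The order of the checks in assign_types_to_lines matters: blank lines first, then rules, list entries, headings, and finally the verbatim/paragraph split on a leading space. The following is a deliberately simplified sketch of that ordering for a single line at margin 0. It is not the library code: the method `classify_line` and the symbols it returns are hypothetical, and it ignores nesting levels, labeled lists, and block exceptions.

```ruby
# Simplified list pattern (marker, whitespace, then some text).
SKETCH_LIST_RE = /^(\*|-|\d+\.|[A-Za-z]\.)\s+\S/

# Hypothetical single-line classifier following the same order of checks
# as assign_types_to_lines, at margin 0 and without recursion.
def classify_line(text)
  case text
  when /^\s*$/
    [:blank]
  when /^(---+)\s*$/
    # Rule weight grows with the number of hyphens: "---" gives 1.
    [:rule, Regexp.last_match(1).length - 2]
  when SKETCH_LIST_RE
    [:list, Regexp.last_match(1)]
  when /^(=+)\s*(.*)/
    # Heading depth is capped at 6, as in the original method.
    [:heading, [Regexp.last_match(1).length, 6].min]
  when /^ /
    [:verbatim]
  else
    [:paragraph]
  end
end

["", "-----", "* item", "== Second level", "  puts 1", "plain text"].each do |l|
  p [l, classify_line(l)]
end
```

Because the rule check runs before the list check, a line of hyphens is never mistaken for a bullet.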
For debugging, we allow access to our line contents as text
  # File markup/simple_markup.rb, line 498
  def content
    @lines.as_text
  end
We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result
  # File markup/simple_markup.rb, line 250
  def convert(str, op, block_exceptions=nil)
    @lines = Lines.new(str.split(/\r?\n/).collect { |aLine|
                         Line.new(aLine) })
    return "" if @lines.empty?
    @lines.normalize
    @block_exceptions = block_exceptions
    assign_types_to_lines
    group = group_lines
    # call the output formatter to handle the result
    # group.to_a.each {|i| p i}
    group.accept(@am, op)
  end
For debugging, return the list of line types
  # File markup/simple_markup.rb, line 504
  def get_line_types
    @lines.line_types
  end
Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead
  # File markup/simple_markup.rb, line 469
  def group_lines
    @lines.rewind

    inList = false
    wantedType = wantedLevel = nil

    block = LineCollection.new
    group = nil

    while line = @lines.next
      if line.level == wantedLevel and line.type == wantedType
        group.add_text(line.text)
      else
        group = block.fragment_for(line)
        block.add(group)
        if line.type == Line::LIST
          wantedType = Line::PARAGRAPH
        else
          wantedType = line.type
        end
        wantedLevel = line.type == Line::HEADING ? line.param : line.level
      end
    end

    block.normalize
    block
  end
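The merging rule can be tried out in isolation. The sketch below is a standalone approximation, not the library code: `SketchLine` and `group_sketch` are hypothetical names, groups are plain hashes rather than fragments, merged text is joined with a newline (an assumption about add_text), and the heading special case for the wanted level is left out.

```ruby
# Hypothetical stand-in for a typed line: type, nesting level, and text.
SketchLine = Struct.new(:type, :level, :text)

# Merge consecutive lines that share a type and level. After a :list line,
# following :paragraph lines at the same level are folded into the list entry.
def group_sketch(lines)
  groups = []
  wanted_type = wanted_level = nil

  lines.each do |line|
    if line.level == wanted_level && line.type == wanted_type
      groups.last[:text] << "\n" << line.text
    else
      groups << { type: line.type, level: line.level, text: line.text.dup }
      wanted_type  = line.type == :list ? :paragraph : line.type
      wanted_level = line.level
    end
  end

  groups
end

lines = [
  SketchLine.new(:paragraph, 0, "first line"),
  SketchLine.new(:paragraph, 0, "second line"),
  SketchLine.new(:list,      1, "an item"),
  SketchLine.new(:paragraph, 1, "continued"),
  SketchLine.new(:paragraph, 0, "closing text")
]
group_sketch(lines).each { |g| p g }
```

The third and fourth lines end up in one list group, which is the remapping of paragraph lines into list entries described above.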
Handle labeled list entries. We have a special case to deal with. Because the labels can be long, they force the remaining block of text over to the right:
this is a long label that I wrote: | and here is the block of text with a silly margin |
So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin
this is a long label that I wrote: | here is a more reasonably indented block which will be attached to the label. |
  # File markup/simple_markup.rb, line 416
  def handled_labeled_list(line, level, margin, offset, prefix)
    prefix_length = prefix.length
    text = line.text
    flag = nil
    case prefix
    when /^\[/
      flag = ListBase::LABELED
      prefix = prefix[1, prefix.length-2]
    when /:$/
      flag = ListBase::NOTE
      prefix.chop!
    else raise "Invalid List Type: #{self.inspect}"
    end

    # body is on the next line

    if text.length <= offset
      original_line = line
      line = @lines.next
      return(false) unless line
      text = line.text

      for i in 0..margin
        if text[i] != SPACE
          @lines.unget
          return false
        end
      end
      i = margin
      i += 1 while text[i] == SPACE
      if i >= text.length
        @lines.unget
        return false
      else
        offset = i
        prefix_length = 0
        @lines.delete(original_line)
      end
    end

    line.stamp(Line::LIST, level+1, prefix, flag)
    text[margin, prefix_length] = " " * prefix_length
    assign_types_to_lines(offset, level + 1)
    return true
  end
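The special case for a label with its body on the next line comes down to choosing which column the entry text starts in. The sketch below isolates that decision for margin 0 using plain strings. The method `label_body_offset` is hypothetical and only approximates the logic above; it does not stamp lines or touch the line list.

```ruby
# Same shape as LABEL_LIST_RE above, written on one line.
SKETCH_LABEL_RE = /^((\[.*?\]|\S.*::)(?:\s+|$))/

# Hypothetical helper: return the column where the entry body starts,
# or nil if the line is not a usable labeled entry (margin 0 assumed).
def label_body_offset(label_line, next_line)
  m = SKETCH_LABEL_RE.match(label_line)
  return nil unless m

  offset = m[1].length
  # Body follows the label on the same line: use the label width.
  return offset if label_line.length > offset

  # Otherwise the body must be on the next line, and it must be indented.
  return nil if next_line.nil?
  indent = next_line[/\A */].length
  return nil if indent.zero? || indent >= next_line.length

  indent
end

p label_body_offset("cat:: a feline", nil)
p label_body_offset("this is a long label that I wrote::", "    a reasonably indented body")
p label_body_offset("label::", "not indented")
```

In the second call the returned column comes from the indented line rather than from the label width, which is what keeps the body from being pushed far to the right.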