Class: SM::SimpleMarkup
In: markup/simple_markup.rb
Parent: Object
This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.
    require 'rdoc/markup/simple_markup'
    require 'rdoc/markup/simple_markup/to_html'

    p = SM::SimpleMarkup.new
    h = SM::ToHtml.new

    puts p.convert(input_string, h)
You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:
    require 'rdoc/markup/simple_markup'
    require 'rdoc/markup/simple_markup/to_html'

    class WikiHtml < SM::ToHtml
      def handle_special_WIKIWORD(special)
        "<font color=red>" + special.text + "</font>"
      end
    end

    p = SM::SimpleMarkup.new
    p.add_word_pair("{", "}", :STRIKE)
    p.add_html("no", :STRIKE)

    p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

    h = WikiHtml.new
    h.add_tag(:STRIKE, "<strike>", "</strike>")

    puts "<body>" + p.convert(ARGF.read, h) + "</body>"
    SPACE = ?\s

    SIMPLE_LIST_RE = /^(
      (  \*          (?# bullet)
        |-           (?# bullet)
        |\d+\.       (?# numbered )
        |[A-Za-z]\.  (?# alphabetically numbered )
      )
      \s+
    )\S/x

List entries look like:

    *       text
    1.      text
    [label] text
    label:: text

Flag it as a list entry, and work out the indent for subsequent lines.

    LABEL_LIST_RE = /^(
      (  \[.*?\]    (?# labeled )
        |\S.*::     (?# note )
      )(?:\s+|$)
    )/x
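To see what the two list patterns capture, the sketch below copies both constants so that it runs without the library. The `probe` helper is hypothetical and not part of SimpleMarkup; it only reports which pattern matched, the list prefix (group 2), and the width of the prefix plus its trailing whitespace (group 1), which is the amount the parser adds to the margin.

```ruby
# Copies of the two constants documented above, so the sketch runs stand-alone.
SIMPLE_LIST_RE = /^(
  (  \*          (?# bullet)
    |-           (?# bullet)
    |\d+\.       (?# numbered )
    |[A-Za-z]\.  (?# alphabetically numbered )
  )
  \s+
)\S/x

LABEL_LIST_RE = /^(
  (  \[.*?\]    (?# labeled )
    |\S.*::     (?# note )
  )(?:\s+|$)
)/x

# probe is a hypothetical helper (not in the library).
# It returns [kind, prefix, indent_width] for one line of text.
def probe(line)
  if (m = SIMPLE_LIST_RE.match(line))
    [:simple, m[2], m[1].length]
  elsif (m = LABEL_LIST_RE.match(line))
    [:label, m[2], m[1].length]
  else
    [:none, nil, 0]
  end
end

p probe("* text")          # => [:simple, "*", 2]
p probe("12.   item")      # => [:simple, "12.", 6]
p probe("[cat] a feline")  # => [:label, "[cat]", 6]
p probe("cat:: a feline")  # => [:label, "cat::", 6]
p probe("plain text")      # => [:none, nil, 0]
```

Note that a simple list entry must be followed by text on the same line (the trailing `\S`), whereas a label may stand alone on its line.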
Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.
    # File markup/simple_markup.rb, line 207
    def initialize
      @am = AttributeManager.new
      @output = nil
    end
Add to the sequences recognized as general markup
    # File markup/simple_markup.rb, line 225
    def add_html(tag, name)
      @am.add_html(tag, name)
    end
Add to other inline sequences. For example, we could add WikiWords using something like:
parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)
Each wiki word will be presented to the output formatter via the accept_special method
    # File markup/simple_markup.rb, line 239
    def add_special(pattern, name)
      @am.add_special(pattern, name)
    end
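To check which words the WikiWord pattern picks out before registering it, the pattern can be tried on its own. This is a stand-alone sketch: `highlight_wiki_words` is a hypothetical helper that imitates the effect of a special handler using plain `gsub`, and it says nothing about how the AttributeManager applies the pattern internally.

```ruby
# The same pattern that is passed to add_special above.
WIKIWORD = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical stand-in for a formatter's handling of :WIKIWORD specials.
# It wraps every match in the same markup used by the WikiHtml example.
def highlight_wiki_words(text)
  text.gsub(WIKIWORD) { |word| "<font color=red>#{word}</font>" }
end

puts highlight_wiki_words("See SimpleMarkup and RDoc for details")
# prints: See <font color=red>SimpleMarkup</font> and RDoc for details
```

"RDoc" is left alone because the pattern requires at least one lowercase letter between the first and second capital.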
Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.
    # File markup/simple_markup.rb, line 217
    def add_word_pair(start, stop, name)
      @am.add_word_pair(start, stop, name)
    end
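The visible effect of a word pair can be imitated in a few lines. The sketch below is an assumption-laden illustration, not the AttributeManager's implementation: `apply_word_pair` is a hypothetical helper, and it assumes that a pair only applies to a single run of word characters between the two delimiters.

```ruby
# Hypothetical helper: replace start<word>stop with open_tag<word>close_tag.
# Assumes the pair wraps exactly one run of word characters.
def apply_word_pair(text, start, stop, open_tag, close_tag)
  pattern = /#{Regexp.escape(start)}(\w+)#{Regexp.escape(stop)}/
  text.gsub(pattern) { "#{open_tag}#{$1}#{close_tag}" }
end

puts apply_word_pair("this is {old} news", "{", "}", "<strike>", "</strike>")
# prints: this is <strike>old</strike> news
```

In the real parser the pair is only recorded under an attribute name such as :STRIKE; the output formatter decides which tags that attribute produces (see add_tag in the WikiHtml example).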
Look through the text at line indentation. We flag each line as being Blank, a paragraph, a list element, or verbatim text
    # File markup/simple_markup.rb, line 272
    def assign_types_to_lines(margin = 0, level = 0)

      while line = @lines.next
        if line.isBlank?
          line.stamp(Line::BLANK, level)
          next
        end

        # if a line contains non-blanks before the margin, then it must belong
        # to an outer level

        text = line.text

        for i in 0...margin
          if text[i] != SPACE
            @lines.unget
            return
          end
        end

        active_line = text[margin..-1]

        # Rules (horizontal lines) look like
        #
        #  ---   (three or more hyphens)
        #
        # The more hyphens, the thicker the rule
        #

        if /^(---+)\s*$/ =~ active_line
          line.stamp(Line::RULE, level, $1.length-2)
          next
        end

        # Then look for list entries. First the ones that have to have
        # text following them (* xxx, - xxx, and dd. xxx)

        if SIMPLE_LIST_RE =~ active_line

          offset = margin + $1.length
          prefix = $2
          prefix_length = prefix.length

          flag = case prefix
                 when "*","-" then ListBase::BULLET
                 when /^\d/ then ListBase::NUMBER
                 when /^[A-Z]/ then ListBase::UPPERALPHA
                 when /^[a-z]/ then ListBase::LOWERALPHA
                 else raise "Invalid List Type: #{self.inspect}"
                 end

          line.stamp(Line::LIST, level+1, prefix, flag)
          text[margin, prefix_length] = " " * prefix_length
          assign_types_to_lines(offset, level + 1)
          next
        end


        if LABEL_LIST_RE =~ active_line
          offset = margin + $1.length
          prefix = $2
          prefix_length = prefix.length

          next if handled_labeled_list(line, level, margin, offset, prefix)
        end

        # Headings look like
        # = Main heading
        # == Second level
        # === Third
        #
        # Headings reset the level to 0

        if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
          prefix_length = $1.length
          prefix_length = 6 if prefix_length > 6
          line.stamp(Line::HEADING, 0, prefix_length)
          line.strip_leading(margin + prefix_length)
          next
        end

        # If the character's a space, then we have verbatim text,
        # otherwise

        if active_line[0] == SPACE
          line.strip_leading(margin) if margin > 0
          line.stamp(Line::VERBATIM, level)
        else
          line.stamp(Line::PARAGRAPH, level)
        end
      end
    end
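The order of the checks matters: a line of hyphens is tested as a rule before it can be read as a bullet, and list entries are tested before headings. The following stand-alone sketch mirrors that order for a single line. `classify` is a hypothetical helper; it deliberately leaves out margins, nesting levels, recursion and labeled lists, so it is a reading aid rather than a replacement for the method above.

```ruby
# Hypothetical, simplified single-line classifier. It follows the same
# order of tests as assign_types_to_lines but ignores margins and nesting.
def classify(line)
  return [:blank] if line.strip.empty?

  if (m = /^(---+)\s*$/.match(line))
    [:rule, m[1].length - 2]            # more hyphens, thicker rule
  elsif (m = /^((\*|-|\d+\.|[A-Za-z]\.)\s+)\S/.match(line))
    [:list, m[2]]                       # prefix such as "*", "-", "1."
  elsif (m = /^(=+)\s*(.*)/.match(line))
    [:heading, [m[1].length, 6].min]    # heading depth is capped at 6
  elsif line.start_with?(" ")
    [:verbatim]
  else
    [:paragraph]
  end
end

p classify("-----")          # => [:rule, 3]
p classify("- item")         # => [:list, "-"]
p classify("== Title")       # => [:heading, 2]
p classify("======== Deep")  # => [:heading, 6]
p classify("  code")         # => [:verbatim]
p classify("Plain prose")    # => [:paragraph]
```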
For debugging, we allow access to our line contents as text.
    # File markup/simple_markup.rb, line 464
    def content
      @lines.as_text
    end
We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result
    # File markup/simple_markup.rb, line 249
    def convert(str, op)
      @lines = Lines.new(str.split(/\r?\n/).collect { |aLine|
                           Line.new(aLine) })
      return "" if @lines.empty?
      @lines.normalize
      assign_types_to_lines
      group = group_lines
      # call the output formatter to handle the result
      #      group.to_a.each {|i| p i}
      group.accept(@am, op)
    end
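The first step of convert relies on how `String#split` treats line endings. The short sketch below uses only core Ruby and shows that the pattern accepts both Unix and Windows line endings, that trailing blank lines are dropped, and that an empty input yields no lines at all, which is the case the early `return ""` handles.

```ruby
# Mixed line endings are handled by the optional \r in the pattern.
p "first\r\nsecond\nthird\n\n".split(/\r?\n/)  # => ["first", "second", "third"]

# Blank lines in the middle survive as empty strings.
p "a\n\nb".split(/\r?\n/)                      # => ["a", "", "b"]

# An empty input produces an empty array.
p "".split(/\r?\n/)                            # => []
```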
For debugging, return the list of line types.
    # File markup/simple_markup.rb, line 470
    def get_line_types
      @lines.line_types
    end
Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead
    # File markup/simple_markup.rb, line 435
    def group_lines
      @lines.rewind

      inList = false
      wantedType = wantedLevel = nil

      block = LineCollection.new
      group = nil

      while line = @lines.next
        if line.level == wantedLevel and line.type == wantedType
          group.add_text(line.text)
        else
          group = block.fragment_for(line)
          block.add(group)
          if line.type == Line::LIST
            wantedType = Line::PARAGRAPH
          else
            wantedType = line.type
          end
          wantedLevel = line.type == Line::HEADING ? line.param : line.level
        end
      end

      block.normalize
      block
    end
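The list remapping is easier to follow on a small example. The sketch below is stand-alone: `SketchLine` and `group` are hypothetical names, groups are plain hashes instead of fragments, headings are left out, and joining merged text with a single space is an assumption (the source does not show what `add_text` does).

```ruby
# Hypothetical stand-in for a typed line: type, nesting level and text.
SketchLine = Struct.new(:type, :level, :text)

# Merge consecutive lines of the wanted type and level. After a list
# entry, the wanted type becomes :paragraph, so the continuation lines
# of that entry are folded into it.
def group(lines)
  groups = []
  wanted_type = wanted_level = nil

  lines.each do |line|
    if line.level == wanted_level && line.type == wanted_type
      groups.last[:text] << " " << line.text   # assumed join with a space
    else
      groups << { type: line.type, level: line.level, text: line.text.dup }
      wanted_type  = line.type == :list ? :paragraph : line.type
      wanted_level = line.level
    end
  end

  groups
end

lines = [
  SketchLine.new(:paragraph, 0, "Intro line one"),
  SketchLine.new(:paragraph, 0, "and line two"),
  SketchLine.new(:list,      1, "first item"),
  SketchLine.new(:paragraph, 1, "continued"),
  SketchLine.new(:list,      1, "second item"),
  SketchLine.new(:paragraph, 0, "Outro")
]

group(lines).each { |g| puts "#{g[:type]}/#{g[:level]}: #{g[:text]}" }
# prints:
#   paragraph/0: Intro line one and line two
#   list/1: first item continued
#   list/1: second item
#   paragraph/0: Outro
```

A second list entry always starts a new group, because after a list entry the wanted type is :paragraph rather than :list.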
Handle labeled list entries. We have a special case to deal with. Because the labels can be long, they force the remaining block of text over to the right:

    this is a long label that I wrote:: and here is the
                                        block of text with
                                        a silly margin

So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin:

    this is a long label that I wrote::
        here is a more reasonably indented block which
        will be attached to the label.
    # File markup/simple_markup.rb, line 382
    def handled_labeled_list(line, level, margin, offset, prefix)
      prefix_length = prefix.length
      text = line.text
      flag = nil
      case prefix
      when /^\[/
        flag = ListBase::LABELED
        prefix = prefix[1, prefix.length-2]
      when /:$/
        flag = ListBase::NOTE
        prefix.chop!
      else raise "Invalid List Type: #{self.inspect}"
      end

      # body is on the next line

      if text.length <= offset
        original_line = line
        line = @lines.next
        return(false) unless line
        text = line.text

        for i in 0..margin
          if text[i] != SPACE
            @lines.unget
            return false
          end
        end
        i = margin
        i += 1 while text[i] == SPACE
        if i >= text.length
          @lines.unget
          return false
        else
          offset = i
          prefix_length = 0
          @lines.delete(original_line)
        end
      end

      line.stamp(Line::LIST, level+1, prefix, flag)
      text[margin, prefix_length] = " " * prefix_length
      assign_types_to_lines(offset, level + 1)
      return true
    end
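The margin calculation for the special case can be isolated from the line bookkeeping. In the sketch below, `body_offset` is a hypothetical helper that takes the label line, the following line (or nil) and the current margin as plain strings and integers, and returns the column where the body text starts, or nil where the method above would return false. It uses a compact copy of LABEL_LIST_RE and does not stamp or delete any lines.

```ruby
# Compact copy of LABEL_LIST_RE so the sketch runs stand-alone.
LABEL_RE = /^((\[.*?\]|\S.*::)(?:\s+|$))/

# Hypothetical helper: column at which the body of a labeled entry starts.
def body_offset(label_line, next_line, margin)
  m = LABEL_RE.match(label_line[margin..-1] || "")
  return nil unless m

  offset = margin + m[1].length
  # Body text follows the label on the same line.
  return offset if label_line.length > offset

  # Otherwise the body must be on the next line, indented past the margin.
  return nil if next_line.nil?
  return nil unless (0..margin).all? { |i| next_line[i] == " " }

  i = margin
  i += 1 while next_line[i] == " "
  i >= next_line.length ? nil : i   # a line of only spaces does not count
end

p body_offset("cat:: a feline", nil, 0)                                    # => 6
p body_offset("this is a long label that I wrote::", "    the body", 0)   # => 4
p body_offset("label::", "not indented", 0)                                # => nil
```

As in the original, the range `0..margin` is inclusive, so the following line must have a space at the margin column itself, that is, it must be indented further than the label.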