Class SM::SimpleMarkup
In: markup/simple_markup.rb
Parent: Object

Synopsis

This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.

  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  p = SM::SimpleMarkup.new
  h = SM::ToHtml.new

  puts p.convert(input_string, h)

You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:

  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  class WikiHtml < SM::ToHtml
    def handle_special_WIKIWORD(special)
      "<font color=red>" + special.text + "</font>"
    end
  end

  p = SM::SimpleMarkup.new
  p.add_word_pair("{", "}", :STRIKE)
  p.add_html("no", :STRIKE)

  p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

  h = WikiHtml.new
  h.add_tag(:STRIKE, "<strike>", "</strike>")

  puts "<body>" + p.convert(ARGF.read, h) + "</body>"

Output Formatters

missing

Methods

Constants

SPACE = ?\s
SIMPLE_LIST_RE = /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x

List entries look like:

 *       text
 1.      text
 [label] text
 label:: text

Flag it as a list entry, and work out the indent for subsequent lines.

LABEL_LIST_RE = /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x
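To make the capture groups concrete, the following sketch copies both patterns from the constants above and applies them to a few sample lines. The helper list_prefix is hypothetical and not part of SimpleMarkup; it only illustrates that group 2 holds the list prefix and that the length of group 1 (prefix plus trailing whitespace) gives the indent of the entry body.

```ruby
# Patterns copied from the constants listed above.
SIMPLE_LIST_RE = /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x
LABEL_LIST_RE  = /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x

# Hypothetical helper (not part of SimpleMarkup): returns [prefix, indent]
# for a line that starts a list entry, or nil for any other line.
def list_prefix(line)
  m = SIMPLE_LIST_RE.match(line) || LABEL_LIST_RE.match(line)
  m && [m[2], m[1].length]
end

p list_prefix("* text")        # bullet entry
p list_prefix("12. text")      # numbered entry
p list_prefix("[label] text")  # labeled entry
p list_prefix("label:: text")  # note entry
p list_prefix("plain text")    # not a list entry
```

Note that a simple list prefix must be followed by text on the same line (the trailing \S), whereas a label may stand alone at the end of a line.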

Public Class methods

Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.

[Source]

     # File markup/simple_markup.rb, line 207
     def initialize
       @am = AttributeManager.new
       @output = nil
     end

Public Instance methods

Add to the sequences recognized as general markup

[Source]

     # File markup/simple_markup.rb, line 225
     def add_html(tag, name)
       @am.add_html(tag, name)
     end

Add to other inline sequences. For example, we could add WikiWords using something like:

   parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

Each wiki word will be presented to the output formatter via the accept_special method

[Source]

     # File markup/simple_markup.rb, line 239
     def add_special(pattern, name)
       @am.add_special(pattern, name)
     end
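As a quick check of what the WikiWord pattern in the example above accepts, this standalone sketch runs the same regular expression over a sample string. It does not touch the parser; the constant WIKIWORD_RE and the helper wiki_words are names introduced here for illustration only.

```ruby
# The pattern passed to add_special in the example above.
WIKIWORD_RE = /\b([A-Z][a-z]+[A-Z]\w+)/

# Hypothetical helper: list every substring the pattern would flag.
def wiki_words(text)
  text.scan(WIKIWORD_RE).flatten
end

p wiki_words("See WikiWord and FooBar, not Word or iPhone")
# → ["WikiWord", "FooBar"]
```

A word qualifies only if it starts with a capital, continues with at least one lower-case letter, and then contains a second capital, which is why "Word" and "iPhone" are skipped.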

Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.

[Source]

     # File markup/simple_markup.rb, line 217
     def add_word_pair(start, stop, name)
       @am.add_word_pair(start, stop, name)
     end

Look through the text at line indentation. We flag each line as being Blank, a paragraph, a list element, or verbatim text

[Source]

     # File markup/simple_markup.rb, line 272
     def assign_types_to_lines(margin = 0, level = 0)

       while line = @lines.next
         if line.isBlank?
           line.stamp(Line::BLANK, level)
           next
         end

         # if a line contains non-blanks before the margin, then it must belong
         # to an outer level

         text = line.text

         for i in 0...margin
           if text[i] != SPACE
             @lines.unget
             return
           end
         end

         active_line = text[margin..-1]

         # Rules (horizontal lines) look like
         #
         #  ---   (three or more hyphens)
         #
         # The more hyphens, the thicker the rule
         #

         if /^(---+)\s*$/ =~ active_line
           line.stamp(Line::RULE, level, $1.length-2)
           next
         end

         # Then look for list entries. First the ones that have to have
         # text following them (* xxx, - xxx, and dd. xxx)

         if SIMPLE_LIST_RE =~ active_line

           offset = margin + $1.length
           prefix = $2
           prefix_length = prefix.length

           flag = case prefix
                  when "*","-" then ListBase::BULLET
                  when /^\d/   then ListBase::NUMBER
                  when /^[A-Z]/ then ListBase::UPPERALPHA
                  when /^[a-z]/ then ListBase::LOWERALPHA
                  else raise "Invalid List Type: #{self.inspect}"
                  end

           line.stamp(Line::LIST, level+1, prefix, flag)
           text[margin, prefix_length] = " " * prefix_length
           assign_types_to_lines(offset, level + 1)
           next
         end


         if LABEL_LIST_RE =~ active_line
           offset = margin + $1.length
           prefix = $2
           prefix_length = prefix.length

           next if handled_labeled_list(line, level, margin, offset, prefix)
         end

         # Headings look like
         # = Main heading
         # == Second level
         # === Third
         #
         # Headings reset the level to 0

         if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
           prefix_length = $1.length
           prefix_length = 6 if prefix_length > 6
           line.stamp(Line::HEADING, 0, prefix_length)
           line.strip_leading(margin + prefix_length)
           next
         end

         # If the character's a space, then we have verbatim text,
         # otherwise it is a paragraph

         if active_line[0] == SPACE
           line.strip_leading(margin) if margin > 0
           line.stamp(Line::VERBATIM, level)
         else
           line.stamp(Line::PARAGRAPH, level)
         end
       end
     end
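The order of the checks in the listing above matters: blank lines first, then rules, list entries, headings, and finally the verbatim/paragraph split. The following is a deliberately simplified, non-recursive sketch of that decision order for a single line at margin 0. The names classify_line and SKETCH_LIST_RE and the symbol return values are assumptions made for illustration; the real method stamps Line objects, handles labeled lists, and recurses for nested lists.

```ruby
# Simplified form of SIMPLE_LIST_RE (same alternatives, comments dropped).
SKETCH_LIST_RE = /^((\*|-|\d+\.|[A-Za-z]\.)\s+)\S/

# Hypothetical helper: classify one line, assuming margin 0 and no nesting.
# Returns an array whose first element names the line type.
def classify_line(text)
  case text
  when /^\s*$/        then [:blank]
  when /^(---+)\s*$/  then [:rule, $1.length - 2]       # more hyphens, thicker rule
  when SKETCH_LIST_RE then [:list, $2]                  # prefix such as "*" or "1."
  when /^(=+)\s*(.*)/ then [:heading, [$1.length, 6].min]
  when /^ /           then [:verbatim]                  # leading space
  else                     [:paragraph]
  end
end

["", "-----", "* item", "== Title", "  code", "plain"].each do |l|
  p classify_line(l)
end
```

Because the rule check comes before the list check, a line of three or more hyphens is a rule, while a single hyphen followed by text is a bullet.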

For debugging, we allow access to our line contents as text.

[Source]

     # File markup/simple_markup.rb, line 464
     def content
       @lines.as_text
     end

We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result

[Source]

     # File markup/simple_markup.rb, line 249
     def convert(str, op)
       @lines = Lines.new(str.split(/\r?\n/).collect { |aLine| 
                            Line.new(aLine) })
       return "" if @lines.empty?
       @lines.normalize
       assign_types_to_lines
       group = group_lines
       # call the output formatter to handle the result
       #      group.to_a.each {|i| p i}
       group.accept(@am, op)
     end

For debugging, return the list of line types.

[Source]

     # File markup/simple_markup.rb, line 470
     def get_line_types
       @lines.line_types
     end

Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead

[Source]

     # File markup/simple_markup.rb, line 435
     def group_lines
       @lines.rewind

       inList = false
       wantedType = wantedLevel = nil

       block = LineCollection.new
       group = nil

       while line = @lines.next
         if line.level == wantedLevel and line.type == wantedType
           group.add_text(line.text)
         else
           group = block.fragment_for(line)
           block.add(group)
           if line.type == Line::LIST
             wantedType = Line::PARAGRAPH
           else
             wantedType = line.type
           end
           wantedLevel = line.type == Line::HEADING ? line.param : line.level
         end
       end

       block.normalize
       block
     end
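The merging rule can be tried in isolation with a small sketch. SketchLine and group_sketch are hypothetical stand-ins for the Line objects and LineCollection used above, and the special treatment of headings (which compare against line.param) is left out. The point illustrated is only the remapping: after a list line, paragraph lines at the same level are folded into the list entry.

```ruby
# Hypothetical stand-in for the parser's Line objects.
SketchLine = Struct.new(:type, :level, :text)

# Merge consecutive lines of the same type and level into one fragment.
# After a :list line we expect :paragraph lines, which join the list entry.
def group_sketch(lines)
  groups = []
  wanted_type = wanted_level = nil
  lines.each do |line|
    if !groups.empty? && line.level == wanted_level && line.type == wanted_type
      groups.last[:text] << " " << line.text
    else
      groups << { type: line.type, level: line.level, text: line.text.dup }
      wanted_type  = line.type == :list ? :paragraph : line.type
      wanted_level = line.level
    end
  end
  groups
end

lines = [
  SketchLine.new(:paragraph, 0, "first line"),
  SketchLine.new(:paragraph, 0, "second line"),
  SketchLine.new(:list,      1, "item"),
  SketchLine.new(:paragraph, 1, "continued"),
  SketchLine.new(:verbatim,  0, "code")
]
group_sketch(lines).each { |g| p g }
```

Here the five input lines collapse into three fragments: one paragraph, one list entry that has absorbed its continuation line, and one verbatim block.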

Handle labeled list entries. We have a special case to deal with. Because the labels can be long, they force the remaining block of text over to the right:

  this is a long label that I wrote:: and here is the
                                      block of text with
                                      a silly margin

So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin:

  this is a long label that I wrote::
      here is a more reasonably indented block which
      will be attached to the label.

[Source]

     # File markup/simple_markup.rb, line 382
     def handled_labeled_list(line, level, margin, offset, prefix)
       prefix_length = prefix.length
       text = line.text
       flag = nil
       case prefix
       when /^\[/
         flag = ListBase::LABELED
         prefix = prefix[1, prefix.length-2]
       when /:$/
         flag = ListBase::NOTE
         prefix.chop!
       else raise "Invalid List Type: #{self.inspect}"
       end

       # body is on the next line

       if text.length <= offset
         original_line = line
         line = @lines.next
         return(false) unless line
         text = line.text

         for i in 0..margin
           if text[i] != SPACE
             @lines.unget
             return false
           end
         end
         i = margin
         i += 1 while text[i] == SPACE
         if i >= text.length
           @lines.unget
           return false
         else
           offset = i
           prefix_length = 0
           @lines.delete(original_line)
         end
       end

       line.stamp(Line::LIST, level+1, prefix, flag)
       text[margin, prefix_length] = " " * prefix_length
       assign_types_to_lines(offset, level + 1)
       return true
     end
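Two small pieces of the listing above can be exercised on their own: how the label text is extracted from the prefix, and how the indent of the following line becomes the new margin when the label stands alone. The helpers parse_label and body_indent are hypothetical and return plain values instead of stamping lines; they are meant as a sketch of the logic, not a replacement for it.

```ruby
# Hypothetical helper mirroring the case statement above: strip the
# brackets from "[label]", or the last character from "label::".
def parse_label(prefix)
  case prefix
  when /^\[/ then [:labeled, prefix[1, prefix.length - 2]]
  when /:$/  then [:note, prefix.chop]   # like chop!, removes one character only
  else raise ArgumentError, "Invalid List Type: #{prefix.inspect}"
  end
end

# Hypothetical helper: when the body starts on the next line, return the
# column of its first non-space character, or nil if the line is not
# indented past the margin or holds nothing but spaces.
def body_indent(next_line, margin)
  return nil unless next_line[0..margin] =~ /\A +\z/
  indent = next_line[/\A */].length
  indent < next_line.length ? indent : nil
end

p parse_label("[cat]")
p parse_label("cat::")
p body_indent("    body text", 0)
p body_indent("body text", 0)
```

As in the listing, a note label loses only its final character, so "cat::" yields "cat:".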
