Class SM::SimpleMarkup
In: markup/simple_markup.rb
Parent: Object

Synopsis

This code converts input_string, which is in the format described in markup/simple_markup.rb, to HTML. The conversion takes place in the convert method, so you can use the same SimpleMarkup object to convert multiple input strings.

  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  p = SM::SimpleMarkup.new
  h = SM::ToHtml.new

  puts p.convert(input_string, h)

You can extend the SimpleMarkup parser to recognise new markup sequences, and to add special processing for text that matches a regular expression. Here we make WikiWords significant to the parser, and also make the sequences {word} and <no>text...</no> signify strike-through text. We then subclass the HTML output class to deal with these:

  require 'rdoc/markup/simple_markup'
  require 'rdoc/markup/simple_markup/to_html'

  class WikiHtml < SM::ToHtml
    def handle_special_WIKIWORD(special)
      "<font color=red>" + special.text + "</font>"
    end
  end

  p = SM::SimpleMarkup.new
  p.add_word_pair("{", "}", :STRIKE)
  p.add_html("no", :STRIKE)

  p.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

  h = WikiHtml.new
  h.add_tag(:STRIKE, "<strike>", "</strike>")

  puts "<body>" + p.convert(ARGF.read, h) + "</body>"

Output Formatters

missing

Methods

Constants

SPACE = ?\s
SIMPLE_LIST_RE = /^( ( \* (?# bullet) |- (?# bullet) |\d+\. (?# numbered ) |[A-Za-z]\. (?# alphabetically numbered ) ) \s+ )\S/x

List entries look like:

 *       text
 1.      text
 [label] text
 label:: text

Flag it as a list entry, and work out the indent for subsequent lines.

LABEL_LIST_RE = /^( ( \[.*?\] (?# labeled ) |\S.*:: (?# note ) )(?:\s+|$) )/x
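
To see what the two patterns capture, here is a minimal sketch that runs them against the four sample entries above. The patterns are copied from the constants, minus the inline (?# ...) comments; list_prefix is a hypothetical helper for illustration, not part of SimpleMarkup. Group 2 is the prefix itself, and the length of group 1 (prefix plus trailing whitespace) is what the parser uses as the indent for subsequent lines.

```ruby
# Patterns copied from the constants above, without the inline comments.
SIMPLE_LIST_RE = /^( ( \* | - | \d+\. | [A-Za-z]\. ) \s+ )\S/x
LABEL_LIST_RE  = /^( ( \[.*?\] | \S.*:: )(?:\s+|$) )/x

# Hypothetical helper (not part of SimpleMarkup): returns [prefix, indent],
# or nil when the line is not a list entry.
def list_prefix(line)
  if (m = SIMPLE_LIST_RE.match(line)) || (m = LABEL_LIST_RE.match(line))
    [m[2], m[1].length]
  end
end

["* text", "1.      text", "[label] text", "label:: text", "plain text"].each do |line|
  p [line, list_prefix(line)]
end
```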

Public Class methods

Take a block of text and use various heuristics to determine its structure (paragraphs, lists, and so on). Invoke an event handler as we identify significant chunks.

[Source]

     # File markup/simple_markup.rb, line 207
207:     def initialize
208:       @am = AttributeManager.new
209:       @output = nil
210:       @block_exceptions = nil
211:     end

Public Instance methods

Add to the sequences recognized as general markup

[Source]

     # File markup/simple_markup.rb, line 226
226:     def add_html(tag, name)
227:       @am.add_html(tag, name)
228:     end

Add to other inline sequences. For example, we could add WikiWords using something like:

   parser.add_special(/\b([A-Z][a-z]+[A-Z]\w+)/, :WIKIWORD)

Each wiki word will be presented to the output formatter via the accept_special method.

[Source]

     # File markup/simple_markup.rb, line 240
240:     def add_special(pattern, name)
241:       @am.add_special(pattern, name)
242:     end
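
The WikiWord pattern can be tried on its own, without the parser. The sketch below only exercises the regular expression; the gsub call imitates the markup that WikiHtml#handle_special_WIKIWORD produces in the synopsis, and is not how SimpleMarkup itself dispatches specials.

```ruby
# Same pattern as passed to add_special above.
WIKIWORD = /\b([A-Z][a-z]+[A-Z]\w+)/

text = "See WikiWords and SimpleMarkup, but not Ruby or markup."

# Which words would be treated as special?
matches = text.scan(WIKIWORD).flatten
p matches

# Stand-in for the formatter callback: wrap each match the way the
# synopsis' WikiHtml class does.
marked = text.gsub(WIKIWORD) { "<font color=red>#{$1}</font>" }
puts marked
```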

Add to the sequences used to add formatting to an individual word (such as bold). Matching entries will generate attributes that the output formatters can recognize by their name.

[Source]

     # File markup/simple_markup.rb, line 218
218:     def add_word_pair(start, stop, name)
219:       @am.add_word_pair(start, stop, name)
220:     end

Look through the text, examining the indentation of each line. We flag each line as blank, a paragraph, a list element, or verbatim text.

[Source]

     # File markup/simple_markup.rb, line 274
274:     def assign_types_to_lines(margin = 0, level = 0)
275:       now_blocking = false
276:       while line = @lines.next
277: 
278:         if line.isBlank?
279:           line.stamp(Line::BLANK, level)
280:           next
281:         end
282:         
283:         # if a line contains non-blanks before the margin, then it must belong
284:         # to an outer level
285: 
286:         text = line.text
287:         
288:         for i in 0...margin
289:           if text[i] != SPACE
290:             @lines.unget
291:             return
292:           end
293:         end
294: 
295:         active_line = text[margin..-1]
296: 
297:         #
298:         # block_exceptions checking
299:         #
300:         if @block_exceptions
301:           if now_blocking
302:             line.stamp(Line::PARAGRAPH, level)
303:             @block_exceptions.each{ |be|
304:               if now_blocking == be['name']
305:                 be['replaces'].each{ |rep|
306:                   line.text.gsub!(rep['from'], rep['to'])
307:                 }
308:               end
309:               if now_blocking == be['name'] && line.text =~ be['end']
310:                 now_blocking = false
311:                 break
312:               end
313:             }
314:             next
315:           else
316:             @block_exceptions.each{ |be|
317:               if line.text =~ be['start']
318:                 now_blocking = be['name']
319:                 line.stamp(Line::PARAGRAPH, level)
320:                 break
321:               end
322:             }
323:             next if now_blocking
324:           end
325:         end
326: 
327: 
328:         # Rules (horizontal lines) look like
329:         #
330:         #  ---   (three or more hyphens)
331:         #
332:         # The more hyphens, the thicker the rule
333:         #
334: 
335:         if /^(---+)\s*$/ =~ active_line
336:           line.stamp(Line::RULE, level, $1.length-2)
337:           next
338:         end
339: 
340:         # Then look for list entries. First the ones that have to have
341:         # text following them (* xxx, - xxx, and dd. xxx)
342: 
343:         if SIMPLE_LIST_RE =~ active_line
344: 
345:           offset = margin + $1.length
346:           prefix = $2
347:           prefix_length = prefix.length
348: 
349:           flag = case prefix
350:                  when "*","-" then ListBase::BULLET
351:                  when /^\d/   then ListBase::NUMBER
352:                  when /^[A-Z]/ then ListBase::UPPERALPHA
353:                  when /^[a-z]/ then ListBase::LOWERALPHA
354:                  else raise "Invalid List Type: #{self.inspect}"
355:                  end
356: 
357:           line.stamp(Line::LIST, level+1, prefix, flag)
358:           text[margin, prefix_length] = " " * prefix_length
359:           assign_types_to_lines(offset, level + 1)
360:           next
361:         end
362: 
363: 
364:         if LABEL_LIST_RE =~ active_line
365:           offset = margin + $1.length
366:           prefix = $2
367:           prefix_length = prefix.length
368: 
369:           next if handled_labeled_list(line, level, margin, offset, prefix)
370:         end
371: 
372:         # Headings look like
373:         # = Main heading
374:         # == Second level
375:         # === Third
376:         #
377:         # Headings reset the level to 0
378: 
379:         if active_line[0] == ?= and active_line =~ /^(=+)\s*(.*)/
380:           prefix_length = $1.length
381:           prefix_length = 6 if prefix_length > 6
382:           line.stamp(Line::HEADING, 0, prefix_length)
383:           line.strip_leading(margin + prefix_length)
384:           next
385:         end
386:         
387:         # If the character's a space, then we have verbatim text,
388:         # otherwise 
389: 
390:         if active_line[0] == SPACE
391:           line.strip_leading(margin) if margin > 0
392:           line.stamp(Line::VERBATIM, level)
393:         else
394:           line.stamp(Line::PARAGRAPH, level)
395:         end
396:       end
397:     end
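
The order of the checks in the source above can be condensed into a small stand-alone classifier. This is a simplified sketch, not the real method: it works on plain strings, ignores the margin and the recursion into nested levels, and leaves out block exceptions and labeled lists. The names classify and SIMPLE_LIST are hypothetical.

```ruby
# Same alternatives as SIMPLE_LIST_RE, written without extended mode.
SIMPLE_LIST = /^((\*|-|\d+\.|[A-Za-z]\.)\s+)\S/

# Returns [type] or [type, param], testing in the same order as the source.
def classify(line)
  case line
  when /^\s*$/        then [:blank]
  when /^(---+)\s*$/  then [:rule, $1.length - 2]          # more hyphens, thicker rule
  when SIMPLE_LIST    then [:list, $2]                     # $2 is the bullet or number
  when /^(=+)\s*(.*)/ then [:heading, [$1.length, 6].min]  # heading level is capped at 6
  when /^ /           then [:verbatim]                     # leading space means verbatim
  else                     [:paragraph]
  end
end

["", "-----", "* item", "== Second level", "  code", "Some text"].each do |line|
  p [line, classify(line)]
end
```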

For debugging, we allow access to our line contents as text.

[Source]

     # File markup/simple_markup.rb, line 498
498:     def content
499:       @lines.as_text
500:     end

We take a string, split it into lines, work out the type of each line, and from there deduce groups of lines (for example all lines in a paragraph). We then invoke the output formatter using a Visitor to display the result

[Source]

     # File markup/simple_markup.rb, line 250
250:     def convert(str, op, block_exceptions=nil)
251:       @lines = Lines.new(str.split(/\r?\n/).collect { |aLine| 
252:                            Line.new(aLine) })
253:       return "" if @lines.empty?
254:       @lines.normalize
255:       @block_exceptions = block_exceptions
256:       assign_types_to_lines
257:       group = group_lines
258:       # call the output formatter to handle the result
259:       #      group.to_a.each {|i| p i}
260:       group.accept(@am, op)
261:     end
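
The block_exceptions argument is not described on this page. Judging only from how assign_types_to_lines reads it, it appears to be an array of hashes with the string keys 'name', 'start', 'end' and 'replaces', where 'replaces' is a list of 'from'/'to' pairs. The sketch below replays that branch on plain strings under this assumption; the <pre> exception and the claimed_lines helper are made up for illustration, and exception names are assumed to be unique.

```ruby
# Hypothetical exception: lines from <pre> to </pre> are kept as paragraph
# text, with tabs replaced by two spaces.
block_exceptions = [
  { 'name'     => 'pre',
    'start'    => /<pre>/,
    'end'      => /<\/pre>/,
    'replaces' => [{ 'from' => /\t/, 'to' => '  ' }] }
]

# Mirrors the block_exceptions branch of assign_types_to_lines.
# Returns the (possibly rewritten) lines claimed by an exception.
def claimed_lines(lines, block_exceptions)
  now_blocking = false
  claimed = []
  lines.each do |line|
    if now_blocking
      be = block_exceptions.find { |e| e['name'] == now_blocking }
      # Apply every replacement, then test the rewritten text for the end marker.
      text = be['replaces'].inject(line) { |t, rep| t.gsub(rep['from'], rep['to']) }
      claimed << text
      now_blocking = false if text =~ be['end']
    elsif (be = block_exceptions.find { |e| line =~ e['start'] })
      # The start line is claimed as is; replacements begin on the next line.
      now_blocking = be['name']
      claimed << line
    end
  end
  claimed
end

p claimed_lines(["intro", "<pre>", "\tx = 1", "</pre>", "* item"], block_exceptions)
```

In the real parser each claimed line is stamped as a paragraph and skips the rule, list and heading checks.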

For debugging, return the list of line types.

[Source]

     # File markup/simple_markup.rb, line 504
504:     def get_line_types
505:       @lines.line_types
506:     end

Return a block consisting of fragments which are paragraphs, list entries or verbatim text. We merge consecutive lines of the same type and level together. We are also slightly tricky with lists: the lines following a list introduction look like paragraph lines at the next level, and we remap them into list entries instead

[Source]

     # File markup/simple_markup.rb, line 469
469:     def group_lines
470:       @lines.rewind
471: 
472:       inList = false
473:       wantedType = wantedLevel = nil
474: 
475:       block = LineCollection.new
476:       group = nil
477: 
478:       while line = @lines.next
479:         if line.level == wantedLevel and line.type == wantedType
480:           group.add_text(line.text)
481:         else
482:           group = block.fragment_for(line)
483:           block.add(group)
484:           if line.type == Line::LIST
485:             wantedType = Line::PARAGRAPH
486:           else
487:             wantedType = line.type
488:           end
489:           wantedLevel = line.type == Line::HEADING ? line.param : line.level
490:         end
491:       end
492: 
493:       block.normalize
494:       block
495:     end
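
The merging rule can be sketched without the Line and LineCollection classes. The version below is an approximation: it uses a hypothetical Struct in place of Line, plain hashes in place of fragments, joins merged text with a newline (an assumption), and omits the special case where a heading's level comes from its param.

```ruby
# Hypothetical stand-in for the parser's Line objects.
Line = Struct.new(:type, :level, :text)

def group_lines(lines)
  groups = []
  wanted_type = wanted_level = nil
  lines.each do |line|
    if line.level == wanted_level && line.type == wanted_type
      # Same type and level as expected: extend the current group.
      groups.last[:text] << "\n" << line.text
    else
      groups << { type: line.type, level: line.level, text: line.text.dup }
      # After a list introduction we expect paragraph lines at the same level,
      # which are folded into the list entry.
      wanted_type  = line.type == :list ? :paragraph : line.type
      wanted_level = line.level
    end
  end
  groups
end

lines = [
  Line.new(:paragraph, 0, "First line"),
  Line.new(:paragraph, 0, "second line"),
  Line.new(:list,      1, "item starts"),
  Line.new(:paragraph, 1, "and continues"),
  Line.new(:verbatim,  0, "  code")
]
group_lines(lines).each { |g| p g }
```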

Handle labeled list entries. We have a special case to deal with. Because the labels can be long, they force the remaining block of text over to the right:

 this is a long label that I wrote:: and here is the
                                     block of text with
                                     a silly margin

So we allow the special case. If the label is followed by nothing, and if the following line is indented, then we take the indent of that line as the new margin:

 this is a long label that I wrote::
     here is a more reasonably indented block which
     will be attached to the label.

[Source]

     # File markup/simple_markup.rb, line 416
416:     def handled_labeled_list(line, level, margin, offset, prefix)
417:       prefix_length = prefix.length
418:       text = line.text
419:       flag = nil
420:       case prefix
421:       when /^\[/
422:         flag = ListBase::LABELED
423:         prefix = prefix[1, prefix.length-2]
424:       when /:$/
425:         flag = ListBase::NOTE
426:         prefix.chop!
427:       else raise "Invalid List Type: #{self.inspect}"
428:       end
429:       
430:       # body is on the next line
431:       
432:       if text.length <= offset
433:         original_line = line
434:         line = @lines.next
435:         return(false) unless line
436:         text = line.text
437:         
438:         for i in 0..margin
439:           if text[i] != SPACE
440:             @lines.unget
441:             return false
442:           end
443:         end
444:         i = margin
445:         i += 1 while text[i] == SPACE
446:         if i >= text.length
447:           @lines.unget
448:           return false
449:         else
450:           offset = i
451:           prefix_length = 0
452:           @lines.delete(original_line)
453:         end
454:       end
455:       
456:       line.stamp(Line::LIST, level+1, prefix, flag)
457:       text[margin, prefix_length] = " " * prefix_length
458:       assign_types_to_lines(offset, level + 1)
459:       return true
460:     end
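
Two details of the source above are easy to miss: chop removes a single character, so a note label keeps one trailing colon, and the new margin in the special case is the column of the first non-space character on the following line. The sketch below isolates both on plain strings; normalize_label and body_offset are hypothetical helpers, not methods of SimpleMarkup.

```ruby
# Strip the list markers from a matched prefix, as the case statement does.
def normalize_label(prefix)
  case prefix
  when /^\[/ then [:labeled, prefix[1, prefix.length - 2]]  # drop [ and ]
  when /:$/  then [:note, prefix.chop]                      # drops one colon only
  else raise ArgumentError, "Invalid list type: #{prefix.inspect}"
  end
end

# Work out where the body text starts. Returns nil where the source
# returns false (the entry is not handled as a labeled list).
def body_offset(label_line, next_line, margin, offset)
  return offset if label_line.length > offset    # body follows the label
  return nil if next_line.nil?
  # The next line must be blank up to and including the margin column.
  return nil unless (0..margin).all? { |k| next_line[k] == " " }
  i = margin
  i += 1 while next_line[i] == " "
  i >= next_line.length ? nil : i                # all-space line: give up
end

label = "this is a long label that I wrote::"
p normalize_label("[cat]")
p normalize_label("cat::")
p body_offset(label, "    here is a more reasonably indented block", 0, label.length)
```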
