Scheme from Scratch - Bootstrap v0.3 - Characters

Characters are implemented similarly to integers. We need to update the model, read, and print layers of the interpreter. Characters are self-evaluating, so we still don’t need much of an eval layer.
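As a rough illustration of the model-layer change, here is a minimal sketch in C. It is not the actual Bootstrap Scheme source: the type and function names (object, make_character, is_character, eval) are assumptions chosen for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical object model; names are illustrative only. */
typedef enum { FIXNUM, BOOLEAN, CHARACTER } object_type;

typedef struct object {
    object_type type;
    union {
        struct { long value; } fixnum;
        struct { char value; } boolean;
        struct { char value; } character;  /* new for this version */
    } data;
} object;

/* Allocate an uninitialized object, exiting if memory runs out. */
object *alloc_object(void) {
    object *obj = malloc(sizeof(object));
    if (obj == NULL) {
        fprintf(stderr, "out of memory\n");
        exit(1);
    }
    return obj;
}

/* Constructor for the new CHARACTER type. */
object *make_character(char value) {
    object *obj = alloc_object();
    obj->type = CHARACTER;
    obj->data.character.value = value;
    return obj;
}

/* Type predicate used by the print layer to pick the right output format. */
int is_character(const object *obj) {
    assert(obj != NULL);
    return obj->type == CHARACTER;
}

/* Characters are self-evaluating, so eval can hand the object straight back. */
object *eval(object *exp) {
    return exp;
}
```

The new type follows the same pattern as the existing ones: one more enum tag, one more union member, a constructor, and a predicate.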

Because Bootstrap Scheme is a quick and dirty interpreter, and a small, readable implementation is one of the goals, implementing ASCII characters is fine. We don’t need to enter the world of Unicode.

Character literals in Scheme use a prefix notation: #\a, #\9. The troublemakers for parsing are the special literals for newlines and spaces: #\newline, #\space. Here is a sample REPL session with characters:

$ ./scheme
Welcome to Bootstrap Scheme. Use ctrl-c to exit.
> #\a
> #\newline
> #\

> #\space
> #\ 

Note that the second newline example, where enter is pressed immediately after the backslash, may be considered bad. This is all part of the “dirty” aspect of a bootstrap interpreter. It is more important to have a small, readable implementation than to cover every single boundary case.

In the second space example above there is a space after the backslash before pressing enter.
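Both of these odd-looking inputs fall out of a simple reader that takes whatever character follows the backslash literally. Here is a minimal sketch in C of how that might look. It is not the actual Bootstrap Scheme source: the function names (peek, eat_expected_string, read_character) and the -1 error convention are assumptions made for this example, and the sketch skips checking for a delimiter after the literal.

```c
#include <assert.h>
#include <stdio.h>

/* Look at the next character without consuming it. */
static int peek(FILE *in) {
    int c = getc(in);
    ungetc(c, in);  /* pushing back EOF is a harmless no-op */
    return c;
}

/* Consume str from the input; return 1 if every character matched. */
static int eat_expected_string(FILE *in, const char *str) {
    while (*str != '\0') {
        if (getc(in) != *str) {
            return 0;
        }
        str++;
    }
    return 1;
}

/* Read the body of a character literal. The caller has already consumed
 * the "#\" prefix. Returns the character, or -1 on a malformed literal. */
int read_character(FILE *in) {
    int c;
    assert(in != NULL);
    c = getc(in);
    switch (c) {
    case EOF:
        return -1;
    case 's':
        /* "#\s" alone is the letter s; "#\sp..." must be "#\space". */
        if (peek(in) == 'p') {
            return eat_expected_string(in, "pace") ? ' ' : -1;
        }
        break;
    case 'n':
        /* "#\n" alone is the letter n; "#\ne..." must be "#\newline". */
        if (peek(in) == 'e') {
            return eat_expected_string(in, "ewline") ? '\n' : -1;
        }
        break;
    }
    /* Anything else, including a raw newline or space, is taken as-is. */
    return c;
}
```

Because the default path returns the raw character, pressing enter or space right after the backslash yields a newline or space character with no extra code.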

It seems a bit odd that in R5RS there is no standard literal for a tab character. You can implement #\tab if you want.
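One way to make an extension like #\tab cheap is to keep the named characters in a single table shared by the reader and the printer. The following C sketch assumes that design; it is not how the article's implementation is organized, and the names (char_names, character_name, character_from_name, write_character) are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical name table. #\tab is an extension, not part of R5RS. */
static const struct {
    const char *name;
    char value;
} char_names[] = {
    { "newline", '\n' },
    { "space",   ' '  },
    { "tab",     '\t' },
};
static const size_t char_names_len = sizeof(char_names) / sizeof(char_names[0]);

/* Printer side: the literal name for a character, or NULL if it has none. */
const char *character_name(char value) {
    size_t i;
    for (i = 0; i < char_names_len; i++) {
        if (char_names[i].value == value) {
            return char_names[i].name;
        }
    }
    return NULL;
}

/* Reader side: the character for a literal name, or -1 if unknown. */
int character_from_name(const char *name) {
    size_t i;
    for (i = 0; i < char_names_len; i++) {
        if (strcmp(char_names[i].name, name) == 0) {
            return (unsigned char)char_names[i].value;
        }
    }
    return -1;
}

/* Print a character as a Scheme literal, e.g. #\a or #\tab. */
void write_character(FILE *out, char value) {
    const char *name;
    assert(out != NULL);
    name = character_name(value);
    if (name != NULL) {
        fprintf(out, "#\\%s", name);
    } else {
        fprintf(out, "#\\%c", value);
    }
}
```

With this layout, adding another named character is a one-line change to the table.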

Implementing a language encourages examination of the language’s design decisions. I am not a big fan of character literals in Scheme. We write #\newline for a newline character literal but to write a newline in a Scheme string we write "hello, world\n". The lack of parallelism between special characters as character literals and in strings is a bit unfortunate.

I have always liked C’s character literals. Part of the reason is that in C the character literal for a newline '\n' uses the same escape sequence as a newline in a string "hello, world\n". The single quote character is not really available for this purpose in Scheme. It could be used but then characters would not have a prefix notation.

For a Scheme-like language of my own design, I would consider character literals halfway between Scheme’s and C’s: #'a', #'\n', #' '.

There is a v0.3 branch on GitHub for this version.

Previous article: Booleans
Next article: Strings



pmarin January 13, 2010

mzscheme has the same behaviour with newlines :)