Newsgroups: comp.lang.lisp
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!newsfeed.internetmci.com!in2.uu.net!harlequin.com!epcot!usenet
From: norvig@meteor.harlequin.com (Peter Norvig)
Subject: Re: input question
In-Reply-To: "Robert G. Malkin"'s message of Sun, 21 Jan 1996 14:49:07 -0500
Message-ID: <NORVIG.96Jan24093722@meteor.harlequin.com>
Lines: 70
Sender: usenet@harlequin.com (Usenet Maintainer)
Nntp-Posting-Host: meteor.menlo.harlequin.com
Organization: Harlequin, Inc., Menlo Park, CA
References: <sl0dWnS00iWU42s8Qu@andrew.cmu.edu>
Date: Wed, 24 Jan 1996 17:37:22 GMT



In article <sl0dWnS00iWU42s8Qu@andrew.cmu.edu> "Robert G. Malkin" <rm6k+@andrew.cmu.edu> writes:

> a quick question about input:
> i'm trying to get lisp (mcl) to read in from a file a large block of
> text, complete with punctuation; including quotation marks. the question
> is, how would i get lisp to return from reading the file a list of all
> the words?
> for example, i want a read function called this example text:
> 
> 
> " Holmes , " said I as I stood one morning in our bow-window
> looking down the street , " here is a madman coming along .
> 
> to return something like:
> 
> (holmes said i as i stood one morning etc.)

Here's a simple solution.  If you don't want the punctuation, pass those
characters as IGNORE to make-safe-readtable rather than as SINGLE.

;;; The idea is that SAFE-READ will always return an atom.  It won't
;;; signal an error just because the input stream is not valid Lisp
;;; syntax. By default, SAFE-READ returns nil at end of file.
;;; Exactly how the stream is broken into atoms depends on the read
;;; table you construct with a call to MAKE-SAFE-READTABLE.  An example:

;;; (with-input-from-string (stream "hello, #6); how are you?")
;;;   (loop for x = (safe-read stream) while x collect x)
;;; ==> (HELLO |,| |#| 6 |)| |;| HOW ARE YOU |?|)


(defun make-safe-readtable (&key (alphabetic "") (ignore "")
                                 (single "!@#$%^&*()_+|\\=-[]{};'`:\"~,./<>?"))
  "Create a readtable that won't give an error even when asked
  to read stuff that is not valid Lisp syntax. The ALPHABETIC chars
  (along with the normal ones) are grouped into symbols, the SINGLE chars
  become 1-char Lisp symbols even when adjacent to other chars,
  and the IGNORE chars become whitespace that is discarded."
  (let ((table (copy-readtable nil)))
    (map nil #'(lambda (ch)
                 (set-macro-character ch #'(lambda (s char)
                                             (declare (ignore s))
                                             (intern (string char)))
                                      nil table))
         single)
    (map nil #'(lambda (ch)
                 (set-syntax-from-char ch #\A table))
         alphabetic)
    (map nil #'(lambda (ch)
                 (set-syntax-from-char ch #\space table))
         ignore)
    table))
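
;;; A small sketch of how the three keyword arguments interact.  Because
;;; ALPHABETIC and IGNORE are applied after SINGLE, a character listed in
;;; either of them overrides its entry in the default SINGLE string.
;;; TOKENS-OF is a hypothetical helper, introduced here only for
;;; illustration; it binds *READTABLE* and calls plain READ.

(defun tokens-of (string table)
  "Hypothetical helper: read every token of STRING using readtable TABLE.
  Stops at end of input, or early at a token that reads as NIL."
  (let ((*readtable* table))
    (with-input-from-string (s string)
      (loop for x = (read s nil nil) while x collect x))))

;;; Keeping the apostrophe and hyphen inside words:
;;;
;;; (tokens-of "don't stop, bow-window!"
;;;            (make-safe-readtable :alphabetic "'-"))
;;; ==> (|DON'T| STOP |,| BOW-WINDOW |!|)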


(defparameter *safe-readtable* (make-safe-readtable)
  "Readtable for safe-read function.")

(defun safe-read (&optional (stream *standard-input*)
                            (eof-errorp nil) (eof-value nil)
                            (*readtable* *safe-readtable*))
  "A read function where funny chars read as the chars themselves.
  Note that the default EOF-ERRORP is NIL, not T."
  (read stream eof-errorp eof-value))
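
;;; A minimal sketch of applying this to the original question: collecting
;;; every word of a text into a list.  READ-WORDS and *WORD-READTABLE* are
;;; hypothetical names, not part of the code above.  Assumptions: all
;;; punctuation except the hyphen is passed as IGNORE so it is discarded,
;;; and a fresh EOF marker is used so that the word "nil" in the text
;;; does not end the loop early.  Tokens that look like numbers still
;;; come back as numbers, not symbols.

(defparameter *word-readtable*
  (make-safe-readtable :single ""
                       :ignore "!@#$%^&*()_+|\\=[]{};'`:\"~,./<>?")
  "Readtable that discards punctuation but keeps hyphenated words whole.")

(defun read-words (stream)
  "Return a list of all tokens that can be read from STREAM."
  (let ((eof (list nil)))               ; fresh object, EQ to nothing else
    (loop for x = (safe-read stream nil eof *word-readtable*)
          until (eq x eof)
          collect x)))

;;; (with-input-from-string
;;;     (in "\" Holmes , \" said I as I stood one morning in our bow-window")
;;;   (read-words in))
;;; ==> (HOLMES SAID I AS I STOOD ONE MORNING IN OUR BOW-WINDOW)
;;;
;;; For a file, wrap the call in WITH-OPEN-FILE instead:
;;; (with-open-file (in "holmes.txt") (read-words in))
;;; where "holmes.txt" is a placeholder file name.
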
-- 
Peter Norvig                  | Phone: 415-833-4022           FAX: 415-833-4111
Harlequin Inc.                | Email: norvig@harlequin.com
1010 El Camino Real, #310     | http://www.harlequin.com
Menlo Park CA 94025           | http://www.cs.berkeley.edu/~russell/norvig.html
