Artificial Intelligence/Machine Learning
Prof. R. Williams
	
		 OVERVIEW OF SUBDIRECTORY CONTENTS
		   AND HOW TO USE THESE PROGRAMS

This subdirectory contains data files and programs for 5 different
approaches to learning from examples:

Version Space		
Decision Tree		(discrete attributes only)
Perceptron Network	
Backprop Network	
Nearest Neighbor	(doesn't use k-d trees)


They are all designed to work on training data given in a common format.


Relevant parts of Winston, "Artificial Intelligence", 3rd edition:

Version Space    Ch. 20
Decision Tree    Ch. 21 (pp. 423-431)
Neural Nets	 Ch. 22 (pp. 443-453, 458-469)
		 Ch. 23 (pp. 471-477, 482-484)
Nearest Neighbor Ch. 19


Program Capabilities
---------------------
			   Input	    Output (discrete only)
		    discrete continuous multi-class multi-dimensional
		    -------- ---------- ----------- -----------------
Version Space         yes      no         no            no      
Decision Tree         yes      no         yes           no
Perceptron Network    yes      yes        yes           yes
Backprop Network      yes      yes        yes           yes
Nearest Neighbor      yes      yes        yes           yes


Program Files
--------------
version-space.lisp
decision-tree.lisp
perceptron.lisp
backprop-net.lisp
nearest-nbr.lisp

learn-utils.lisp
nnet-utils.lisp

Compile and Load Files
----------------------
compile-learn-progs.lisp
load-learn-progs.lisp

To compile all programs, evaluate (load "compile-learn-progs")
To load all programs, evaluate (load "load-learn-progs")


USER ROUTINES ACROSS ALL PROGRAMS
---------------------------------
define-training-data input-ranges output-ranges training-data

	Specifies valid values for input and output attributes
	and lists examples to be used for training.  Ordinarily
	included as part of a data file to be loaded at the
	desired time.  See data files for examples of how the
	arguments are specified.

show-training-data

	Displays the data currently loaded.

show-ranges

	Displays valid ranges for input and output attribute values.

USER ROUTINES BY PROGRAM
------------------------

Version Space
-------------
init-version-space &optional concept

	Prepares training data for the version space program and
	initializes the internal data structures.  If no argument is
	given, it tries to make an intelligent guess as to which output
	attribute and value to use as the concept.

train-version-space

	Processes all examples in the training set one by one to
	build the version space.  Its use is optional because
	assimilate-positive-example and assimilate-negative-example
	could be used instead.

test-version-space input-att-vec

	Allows user to specify an input attribute vector for the program
	to try to classify according to its current version space.

assimilate-positive-example input-att-vec
assimilate-negative-example input-att-vec

	Allow user to hand the program training examples one by one as
	an optional alternative to train-version-space.
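
As a rough illustration of what assimilating a positive example
involves, the sketch below generalizes a most-specific conjunctive
hypothesis just enough to cover a new positive example.  It is
self-contained and does not call the programs in this subdirectory.
The name generalize-hypothesis and the use of ? as a wildcard are
assumptions made for illustration only, and the general boundary of
the version space is omitted.

```lisp
;; Hypothetical helper, not part of version-space.lisp.
;; A hypothesis is a list of attribute values; the symbol ? matches anything.
(defun generalize-hypothesis (hypothesis example)
  "Return the least generalization of HYPOTHESIS that covers EXAMPLE."
  (mapcar (lambda (h x)
            (if (or (eq h '?) (equal h x))
                h      ; already covers this attribute value
                '?))   ; mismatch: relax to the wildcard
          hypothesis example))

;; Example: two positive examples that differ only in the second attribute.
(generalize-hypothesis '(red round small) '(red square small))
;; => (RED ? SMALL)
```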

Decision Tree
-------------
init-decision-tree

	Prepares training data for the decision tree program.

train-decision-tree

	Builds decision tree for classifying all training examples.

test-decision-tree input-att-vec

	Allows user to specify an input attribute vector to be
	classified according to the decision tree.

show-decision-tree

	Displays the decision tree.
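
This overview does not say how train-decision-tree chooses which
attribute to test at each node.  One common criterion is the average
disorder (entropy) of the class labels after a split, with the
attribute of lowest average disorder tested first.  The sketch below
shows that measure under the assumption that each example is a list
(input-vector class); the names disorder and average-disorder are
hypothetical and are not taken from decision-tree.lisp.

```lisp
;; Hypothetical helpers, not taken from decision-tree.lisp.
;; An example is a list (input-vector class).
(defun disorder (classes)
  "Entropy, in bits, of a list of class labels."
  (let ((n (length classes))
        (sum 0.0))
    (dolist (c (remove-duplicates classes :test #'equal) sum)
      (let ((p (/ (count c classes :test #'equal) n)))
        (decf sum (* p (log p 2)))))))

(defun average-disorder (examples index)
  "Weighted disorder of the class labels after splitting on attribute INDEX."
  (let ((n (length examples))
        (vals (remove-duplicates
               (mapcar (lambda (e) (nth index (first e))) examples)
               :test #'equal))
        (total 0.0))
    (dolist (v vals total)
      (let ((subset (remove-if-not
                     (lambda (e) (equal (nth index (first e)) v))
                     examples)))
        ;; Weight each branch by the fraction of examples it receives.
        (incf total (* (/ (length subset) n)
                       (disorder (mapcar #'second subset))))))))
```

An attribute whose values separate the classes perfectly has an
average disorder of zero, so it would be chosen ahead of an attribute
that leaves each branch evenly mixed.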

Perceptron
----------
init-perceptron

	Prepares training data for the perceptron program, determines
	mappings of input and output attribute values to units, and
	constructs the network with all weights equal to zero.

train-perceptron &optional (max-passes 1000)

	Cycles through data and trains the perceptron network.
	Stops when all data have been learned or after max-passes
	passes through the data.

test-perceptron input-att-vec

	Allows user to specify an input attribute vector to be
	classified by the perceptron network.

show-perceptron-weights

	Displays the current values of the perceptron weights.

show-neural-net-representation

	Displays the correspondence between network units and attribute
	values for both input and output.

*show-perceptron-detail*

	Global parameter.  When set to t, program displays the detailed
	weight change steps of the perceptron algorithm.  Initial value
	is nil.
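
The training cycle described above (weights starting at zero, repeated
passes, stopping once every example is classified correctly or after
max-passes passes) can be sketched for a single output unit as
follows.  This is a minimal illustration, not the code in
perceptron.lisp: the names perceptron-output and train-unit, the
(inputs target) example format, and the use of a bias weight with a
threshold of zero are all assumptions.

```lisp
;; Hypothetical sketch, not taken from perceptron.lisp.
;; Each example is a list (inputs target), with target 0 or 1.
;; A constant 1 is prepended to the inputs so the first weight acts as a bias.
(defun perceptron-output (weights inputs)
  (if (> (reduce #'+ (mapcar #'* weights (cons 1 inputs))) 0) 1 0))

(defun train-unit (examples &optional (max-passes 1000))
  "Return the learned weights and the number of passes used."
  (let ((weights (make-list (1+ (length (first (first examples))))
                            :initial-element 0)))
    (dotimes (pass max-passes (values weights max-passes))
      (let ((errors 0))
        (dolist (ex examples)
          (let* ((inputs (first ex))
                 (delta (- (second ex) (perceptron-output weights inputs))))
            (unless (zerop delta)
              (incf errors)
              ;; Perceptron rule: add or subtract the input vector.
              (setf weights
                    (mapcar (lambda (w x) (+ w (* delta x)))
                            weights (cons 1 inputs))))))
        ;; A pass with no errors means all data have been learned.
        (when (zerop errors)
          (return (values weights (1+ pass))))))))

;; Example: logical AND is linearly separable, so training terminates.
(train-unit '(((0 0) 0) ((0 1) 0) ((1 0) 0) ((1 1) 1)))
```

For data that are not linearly separable (XOR, for instance) the loop
never reaches an error-free pass and stops only when max-passes is
exhausted.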


Backprop Net
------------
init-backprop-net &optional (nbr-hidden-units 2)

	Prepares training data for the backpropagation program, determines
	mappings of input and output attribute values to units, and
	constructs a network with the resulting number of input and output
	units and the specified number of hidden units.  Initializes all
	weights to small random values (chosen uniformly from [-0.5,0.5]).

train-backprop-net &optional (learn-coeff 0.5) (max-passes 5000)

	Cycles through data and trains the backpropagation network.
	Stops when all data have been learned to within the desired
	tolerance or after max-passes passes through the data.

test-backprop-net input-att-vec

	Allows user to specify an input attribute vector to be
	classified by the backpropagation network.

show-backprop-weights

	Displays the current values of all weights in the network.

show-neural-net-representation

	Displays the correspondence between network units and attribute
	values for both input and output.

*display-interval*

	Global parameter determining how many passes through the
	training data occur between displays of RMS error information.
	Initial value is 100.

*tolerance*

	Global parameter determining how close to the desired values
	all output values must be before successful termination.
	Initial value is 0.1.
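
To make the roles of *display-interval* and *tolerance* concrete, the
sketch below computes an RMS error and a tolerance test over a set of
network outputs.  It is an illustration only: the names rms-error and
within-tolerance-p are hypothetical, and whether backprop-net.lisp
averages over all output values or treats the tolerance bound as
inclusive is an assumption.

```lisp
;; Hypothetical helpers, not taken from backprop-net.lisp.
;; ACTUAL and DESIRED are lists of output vectors, one per training example.
(defun rms-error (actual desired)
  "Root-mean-square difference over every output value."
  (let ((sum 0.0) (n 0))
    (loop for a-vec in actual
          for d-vec in desired
          do (loop for a in a-vec
                   for d in d-vec
                   do (incf sum (expt (- a d) 2))
                      (incf n)))
    (sqrt (/ sum n))))

(defun within-tolerance-p (actual desired &optional (tolerance 0.1))
  "True when every output value is within TOLERANCE of its desired value."
  (every (lambda (a-vec d-vec)
           (every (lambda (a d) (<= (abs (- a d)) tolerance))
                  a-vec d-vec))
         actual desired))
```

Note that the RMS error can be small while a single output is still
outside the tolerance, so the two measures answer different questions:
one summarizes progress, the other decides termination.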

Nearest Neighbor
----------------
init-nearest-nbr

	Prepares training data for the nearest neighbor program.

train-nearest-nbr

	Currently does nothing, but provides a hook for eventually
	interfacing to a k-d tree program.  It is also provided for
	consistency with the other learning programs.

test-nearest-nbr input-att-vec &optional (nbr-neighbors 1)

	Allows user to specify an input attribute vector to be
	classified according to the specified number of nearest
	neighbors.
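
Without k-d trees, classification amounts to a linear scan of the
training data.  The sketch below shows one way to do this with a
majority vote among the nbr-neighbors closest examples.  It assumes
numeric inputs, squared Euclidean distance, and an (inputs class)
example format; how nearest-nbr.lisp measures distance for discrete
attributes or breaks ties is not stated here, and the function names
are hypothetical.

```lisp
;; Hypothetical sketch, not taken from nearest-nbr.lisp.
;; Each example is a list (inputs class) with numeric inputs.
(defun squared-distance (a b)
  (reduce #'+ (mapcar (lambda (x y) (expt (- x y) 2)) a b)))

(defun classify-by-neighbors (examples query &optional (nbr-neighbors 1))
  "Return the most common class among the NBR-NEIGHBORS closest examples."
  (let* ((sorted (stable-sort (copy-list examples) #'<
                              :key (lambda (ex)
                                     (squared-distance (first ex) query))))
         (nearest (subseq sorted 0 (min nbr-neighbors (length sorted))))
         (classes (mapcar #'second nearest))
         (best nil)
         (best-count 0))
    ;; Majority vote; on a tie the class encountered first (nearest) wins.
    (dolist (c classes best)
      (let ((n (count c classes :test #'equal)))
        (when (> n best-count)
          (setf best c best-count n))))))
```

Every query examines all training examples, which is the cost that a
k-d tree would be expected to reduce.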


General Remarks:

1. Whenever a new set of training data has been loaded using
   define-training-data, it is necessary to call the init-... routine of
   each program to be used (e.g. init-decision-tree) to be sure that
   the program interfaces to the new data properly.

2. These programs are designed to be loaded into Lisp at the same time and
   will not interfere with each other.  One or more of them can thus be
   run in the same Lisp environment to obtain a side-by-side comparison
   of their behavior on the same data set.
