
Archbot - A Chatterbot building Architecture, or "Bot Architecture" using XML, MSAgent, SAPI and WSH

By Greg Binning

The following is an abbreviated version of "Main Documentation.txt" that can be found in the docs subfolder of the Archbot directory.

Introduction

Archbot is a chatterbot building system or "architecture" that allows the user to build voice controlled, scriptable, animated chatterbots using XML files that provide the bot's specifications. The user can type via keyboard or speak via microphone to his/her computer and have it talk back through an animated Agent. The bot can also execute scripts, written in either VBScript or JScript, that perform tasks, letting the user literally control his/her PC by talking to it. Bots with different brains, personalities, and even different genders can be built using this architecture. There are two versions of this project: a SAPI (speech recognition) version and a Non-SAPI (keyboard typed dialog) version. BEFORE YOU BEGIN, you need to read the "Main Document". It provides a "Quick Start Guide" that lets you get where you need to go quickly.

Quick Start Guide

This section is provided to let you know what each section of the paper talks about, letting you skip around the document to go where you want to go. If you want to learn more about the history of Chatterbots and the current state of bot technology, read the "History of Chatterbots" and "Current Technologies" sections. If you want to learn more about the project's purpose and what it is, read the "Project Description" section. The "ArchBot User Manual" will allow a user to get started using the application right away. Then we get into the "Project Requirements" section, where we discuss the methodology and the technologies used in the project. A small study of the English language follows this, giving the user a brief tutorial on concepts of the English language that are essential to understanding the Natural Language Processing utilized in the project. The "Project Application Development" section contains information for programmers wishing to learn how to build a bot architecture engine and how to expand and extend it. If you wish to learn how to build a bot and the bot's brains through the XML files, check out the section "Building a Bot". Ideas for future updates/enhancements and a little paragraph about me are provided at the end of this paper.

History of Chatterbots

ELIZA

During the 1960s, Joseph Weizenbaum created ELIZA. ELIZA created a storm of public interest in AI, as many users felt it helped them with their personal problems. ELIZA played the role of a psychiatrist, in particular one that answered every statement the user made with a probing question. Though its responses may sometimes have seemed ambiguous, people actually felt ELIZA could take care of their needs just as well as any human therapist. They became emotionally involved with ELIZA; even Weizenbaum's secretary demanded to be left alone with the program. When people started calling ELIZA intelligent, Weizenbaum objected strongly. ELIZA could not actually understand people's personal problems the way another human being could. It could only manipulate syntax (grammar) and check for some key words. Certainly, someone who did not know that ELIZA was a program could easily conclude that they were conversing with a human, although it never really understood anything in the detail that humans do.
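
The keyword-and-syntax approach described above can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not Weizenbaum's original script: the rules, the pronoun table and the function names are all hypothetical, chosen only to show how keyword matching and simple pronoun swapping can produce a plausible reply without any understanding.

```python
import re

# Hypothetical keyword rules: (pattern, response template).
# These are illustrative only and are not taken from the original ELIZA script.
RULES = [
    (re.compile(r"\bi need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bi am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bmy (mother|father)\b", re.IGNORECASE), "Tell me more about your {0}."),
]

# Pronoun swaps applied to the fragment captured from the user's sentence.
REFLECTIONS = {"i": "you", "me": "you", "my": "your", "am": "are", "you": "I", "your": "my"}

DEFAULT_REPLY = "Please go on."


def reflect(fragment):
    """Swap first and second person words so the fragment can be echoed back."""
    words = fragment.lower().split()
    return " ".join(REFLECTIONS.get(word, word) for word in words)


def respond(utterance):
    """Return the reply for the first rule whose keyword pattern matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.search(text)
        if match:
            return template.format(reflect(match.group(1)))
    return DEFAULT_REPLY
```

For example, respond("I need my car back.") returns "Why do you need your car back?", while an input that matches no keyword falls through to the default reply. Nothing in the sketch models what a car or a need is, which is the limitation the section describes.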

Turing Test

In 1950, Dr. Alan Turing, a British mathematician who is now considered the "Father of AI", proposed the "Turing Test" for intelligence. Simply put, the Turing Test boils down to the question: "Can this machine convince a human that it is human?" Specifically, the machine is a natural language system that converses with human subjects. In the Turing Test, a human (the judge) is placed in one room, and the machine or another human is placed in another. The judge may ask questions or answer questions posed by the subject. All communication is done through a terminal, with input typed at a keyboard. The judge does not know before the conversation begins whether the subject is a human or a computer. If the judge is conversing with a computer, he/she must be "fooled", during and after the conversation, into thinking that the machine is a human in order for the machine to pass the Turing Test. There are actually many pitfalls to the Turing Test, and it is in fact not very widely accepted as a test for true intelligence.

Problems with AI then

I don't know where to begin on how technology has changed in the last half century, much less what has happened in the last decade with the developments in computers, technology and AI. Back then we didn't possess the raw computing power, much less the advances in speech recognition, text to speech, neural nets, easy string manipulation algorithms, and graphical animation. But computers have only really been around for 60 years, so we wouldn't expect too much of them at first. People had plenty of high expectations as to what technology would be like in the year 2000. Some of those dreams are being realized today while many others are still far away.

Current Technologies

Alice

In the 1990s, Dr. Richard Wallace developed a chatterbot system that could be written in an XML specification called AIML, short for Artificial Intelligence Markup Language, and "Alice" was born. Today, Alice and her many derivatives, or "clones", permeate the web as artificial site greeters, sales representatives, celebrities like Elvis and The Beatles, and as a novelty item on the movie web site for "AI - Artificial Intelligence". Alice runs similarly to ELIZA, with more tricks and a bigger brain this time, and is a very popular chatterbot in the AI community today. Probably the biggest factor in Alice's success is the fact that she's open source, drawing on many resources around the world to contribute to her further success. Also notable is the fact that Alice has won the Loebner Contest, mentioned below, two years in a row as of this writing. There are around 25,000 templates in her brain, and growing. Dr. Wallace's unique one-liner responses are what give Alice her unique personality. This project chooses to go open source for the same reasons as Alice, providing Visual Basic programmers a framework to continue to build upon and become the Microsoft Windows equivalent to the SETL environment's Alice.

Loebner Contest

Today, the Loebner Contest is an instantiation, a modern version, of the Turing Test. The criticisms surrounding the Loebner Contest deal with how the Turing Test is carried out. The goal of the contestant is to fool or trick the judge into thinking that his program is a human. Such a prospect does not encourage the advancement of AI. For example, messages are transmitted via text, and as the subject (human or computer) types, the judge sees the text that is being typed, live. Thus, many contestants have been forced to emulate the typing conditions of humans, i.e. text is output at varied speeds, sometimes words must be misspelled and corrected, incorrect punctuation is often used, etc. Even then, the programs in the contest usually talk about only one subject (to talk about everything present in our culture is simply impossible, at least for a natural language system that understands only words, syntax and semantics, and not what objects really look like or what they really do). If the judge picks another subject to discuss, the programs usually try to divert the attention of the judge. Programs have even tried to use vulgarity or an element of surprise to get the judge excited. Alice, mentioned above, has won the Loebner Contest two years in a row.

Problems with AI today

Although we have seen many advances in technology, especially with AI, we still do not have software that unequivocally possesses the ability to produce conversation at a human's intelligence level. Chatterbots have utilized Markov chains, neural nets, language disambiguation parsing, as well as the more classic methods of ELIZA like Case Logic, blackboards, NLP, and part of speech tagging. Huge claims to fame from corporations of "child machines", "common sense databases", and other AI and NLP related topics have resulted in nothing really more than a research project whose funding gets cut. Nobody and nothing can purport to have a machine that actually THINKS like a human today. At any rate, we still do not possess software that can pass the Turing Test, or display strong, genuine artificial intelligence in natural language conversation. We are, however, producing better chatterbots and chatterbot systems.

Project Description

The intent of this project is to develop chatterbots through a "bot architecture" that can provide numerous services from a Chatterbot interface. This interface brings together the technologies of Voice Recognition, Natural Language Processing, Text to Speech, Animated Agents, and Windows Script Host (WSH). The interface accepts voice or keyboard input from the user and controls the MSAgent's movement, animations and speech, as well as the execution of scripted commands by WSH. The script can be in any script language supported by Windows Script Host.

The "bot architecture" will enable bot authors to build bots that have different brains, different personalities, different animated Agents, even different bot genders, through the use of various XML files. This bot architecture has been developed to ease the maintainability issue that comes with handling huge brainfiles, and to provide mechanisms for numerous NLP tasks such as Symbolic Reduction. Bot authors can develop brainfiles that provide endless possibilities of randomness and variation in the bot's responses and the Agent's animation. Bot authors can also write linear conversation trees, which let the bot actually carry on a conversation past one sentence. There are many possibilities for what the bot can do in terms of scripting, and bot authors can script their bots in whatever scripting languages are installed on their PC; in most cases that is VBScript or JScript.
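
To make the idea of an XML brainfile with randomized responses more concrete, here is a minimal sketch. It is an assumption-laden illustration, not ArchBot's actual format: the element and attribute names (brain, reduce, category, response) are hypothetical, the word-level substitution is only a simplified stand-in for what the text calls Symbolic Reduction, and Python's standard XML parser is used in place of the project's VB6 and MSXML code.

```python
import random
import xml.etree.ElementTree as ET

# Hypothetical brainfile layout, invented for this sketch.
# The real ArchBot schema is described in the full project documentation.
BRAIN_XML = """
<brain>
  <reduce from="what's" to="what is"/>
  <reduce from="whats" to="what is"/>
  <category pattern="what is your name">
    <response>My name is Merlin.</response>
    <response>They call me Merlin.</response>
  </category>
  <category pattern="hello">
    <response>Hello there!</response>
  </category>
</brain>
"""


def load_brain(xml_text):
    """Parse the brainfile into a reduction table and a pattern -> responses map."""
    root = ET.fromstring(xml_text)
    reductions = {r.get("from"): r.get("to") for r in root.findall("reduce")}
    categories = {
        c.get("pattern"): [r.text for r in c.findall("response")]
        for c in root.findall("category")
    }
    return reductions, categories


def normalize(utterance, reductions):
    """Lowercase, trim punctuation, and rewrite variant words to a canonical form."""
    words = utterance.lower().strip(" .!?").split()
    return " ".join(reductions.get(word, word) for word in words)


def reply(utterance, reductions, categories, rng=random):
    """Pick one of the responses for the matching pattern at random."""
    responses = categories.get(normalize(utterance, reductions))
    if not responses:
        return "I do not understand."
    return rng.choice(responses)
```

With this brain, "What's your name?" and "Whats your name" both reduce to the single pattern "what is your name", so the author writes the responses once and the bot varies its answer between them. Keeping reductions separate from categories is one way a large brainfile might stay maintainable.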

ArchBot User Manual

The User Interface

The File Menu

Open Bot - This presents the "Open Bot" dialog allowing the user to select a bot file (*.xml).

Save Conversation - This presents the "Save Conversation" dialog allowing the user to save conversations that the user and bot have.

Speech is On/Off - This toggles whether Speech Recognition is on or not. (SAPI version only)

Show/Hide Agent - This toggles the display of the loaded MSAgent.

Agent Properties - This presents a dialog allowing the user to change the MSAgent's properties.

Exit - This exits the Archbot application.

The Help Menu

The About Screen - This displays information about the Archbot Project.

User Input Textbox - This is at the bottom of the main interface and allows the user to type in their utterances, their side of the conversation. In the SAPI version, you can talk instead of typing. While the Agent is talking, speech recognition is turned off. When the Agent finishes speaking, speech recognition is turned on and the user can speak. When the user's speech is recognized, the input is captured and sent to the NLP engine, speech recognition is turned back off, and the Agent speaks the response. This process continues throughout the conversation. If you wish to start talking before the Agent is finished speaking, hit the Scroll Lock Key; the Agent stops talking and speech recognition is turned on (see below).

Conversation Output Textbox - This is at the top of the main interface and displays the conversation between the user and the bot.

The Splitter Bar - This sits between the User Input Textbox and the Conversation Output Textbox to separate the two. You can adjust the position of the splitter bar with the mouse cursor.

The Scroll Lock Key - Use this key to interrupt the agent and speak before he/she finishes. (SAPI version only)
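
The speech turn-taking rules described for the User Input Textbox and the Scroll Lock Key can be summarized as a small state machine. The sketch below is only a model of those rules, written in Python for illustration; the class, method and state names are hypothetical, the starting state is an assumption, and the real application reacts to SAPI and MSAgent events in Visual Basic rather than to these method calls.

```python
from enum import Enum


class Turn(Enum):
    AGENT_SPEAKING = "agent_speaking"  # speech recognition is off
    LISTENING = "listening"            # speech recognition is on


class TurnTaker:
    """Tracks whether speech recognition should be on at any moment."""

    def __init__(self, respond):
        # respond is a stand-in for the NLP engine: user text -> bot reply.
        self.respond = respond
        self.state = Turn.LISTENING    # assumed starting state
        self.spoken = []               # replies handed to the agent so far

    @property
    def recognition_on(self):
        return self.state is Turn.LISTENING

    def on_speech_recognized(self, text):
        # Input arriving while the agent is talking is ignored.
        if not self.recognition_on:
            return
        reply = self.respond(text)
        self.state = Turn.AGENT_SPEAKING  # recognition off while the agent talks
        self.spoken.append(reply)

    def on_agent_finished(self):
        # The agent has finished speaking, so recognition is turned back on.
        self.state = Turn.LISTENING

    def on_scroll_lock(self):
        # The user interrupts: the agent stops and recognition is turned on.
        if self.state is Turn.AGENT_SPEAKING:
            self.state = Turn.LISTENING
```

Modelling the two states explicitly makes it easy to see why recognition is off while the agent talks: without it, the recognizer could pick up the agent's own synthesized speech as user input.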

Bot Files

Provided with the application are four bots that were made using four popular MSAgents.
   ArchBot\Bots
         Genie.xml
         Merlin.xml
         Peedy.xml
         Robby.xml

Project Requirements

Selected Methodology

In addition to the tools and technologies listed in the Selected Technologies part of this paper, the methodology used here is based on the need for an architectural system that is easy to use and understand, while doing the job in pure, easy to understand Visual Basic 6.0. Visual Basic was used in this project for its ease of use and understanding, but the project could exist in other languages. Obviously we don't need to reinvent the wheel in terms of quality of product at the right price, because Microsoft provides almost everything used here: MSAgent 2.0, MSXML 4.0, SAPI 5.1, and the Windows Script Host components, get this, for FREE! The code used to invoke these technologies is not very hard to program, and we can assume that these components will do their job bug free, allowing us to concentrate on NLP coding. Some of the algorithms used in this project are written for ease of understanding, not speed. The primary purpose of this first prototype is to provide maximum exposure of all code in a few forms, a main module, and four class modules, while heavily commenting the project's code for easier understanding of all the processes. Later in this paper we will detail in depth the main natural language parser engine that we use in conjunction with these other technologies to derive a quality bot product.

The project could certainly be enhanced tenfold by more efficient algorithms, but it's programmed the way it is so a programmer doesn't have to jump around too much in the code. The core engine consists of 22 functions. That said, for the most part, the algorithms and code are pretty fast and efficient as is. A lot of functionality like Internet Surfing, EMail, Reminder System etc. was not coded here, in order to present the project code as simply as possible. However, once a programmer learns how the system operates, he/she can easily expand it to do so much more.

Selected Technologies

VB 6.0 - This is the development environment of this project.

MSXML 4.0 - MSXML 4.0 is used for all the XML documents used by the bot architecture and for the SAX readers that parse them.

MSAgent 2.0 - This is the agent technology used for the Animation and Text to Speech technologies of the bot.

SAPI 5.1 - This is the Speech API used for the voice recognition. (SAPI version only)

WSH 5.5 - This is the Windows Scripting Host control that executes the bot's scripts.

RichTextFormat 6.0 - This Rich Text Format textbox control is used for display and opening and saving files.

It should be noted that SAPI 5.1 only runs on the following systems:

  • Windows XP Professional or Home editions; all language versions.
  • Windows .NET Server editions; all language versions.
  • Microsoft Windows 2000 Professional Workstation or Server; all language versions.
  • Microsoft Windows Millennium edition.
  • Microsoft Windows 98, all editions.
  • Microsoft Windows NT Workstation or Server 4.0, Service Pack 6a.

Windows 95 or earlier is NOT supported. Therefore, Win95 users should use the Non-SAPI version (typed text only). Alternatively, a Win95 version could have SAPI 4.0 programmed for the Voice Recognition part. This project only has SAPI 5.1 programmed because it's easier to use and a better, more accurate voice recognition engine. Below is a listing of URLs that let you download the technologies used in this project for FREE.

SAPI 5.1

http://www.microsoft.com/speech/download/sdk51/

Windows Script Host (WSH) 5.5

http://www.microsoft.com/msdownload/vbscript/scripting.asp

MSAgent 2.0

http://www.microsoft.com/msagent/downloads.htm
http://www.msagentring.org/setup.htm

MSXML 4.0

http://msdn.microsoft.com/downloads/default.asp?url=/downloads/sample.asp?url=/msdn-files/027/001/766/msdncompositedoc.xml

About Me

My name is Greg Binning and I have spent 2 years putting this project together in my spare time. I work for state government in Louisiana as a Systems Analyst 2. Currently I'm working on a special project to build a statewide web based application, and I serve as that project's Technical Manager. I have studied Artificial Intelligence for many years now and this project is one product of that study. I am, as of now, finished with this project and am now involved in an expanded artificial intelligence project using Visual Basic 6.0 as the development language of choice. This enhanced version will be open source as well.

Submitted: 09/04/2002

Article content copyright © Greg Binning, 2002.