CyberMate: A Self-Explorative Artificial Intelligent

ABSTRACT

Much research has been carried out to develop virtual personalities for a single specific purpose. The purpose of this research is to form an entry point for addressing five identified issues in existing chatter-bots and software agents based on the Artificial Intelligence Markup Language (AIML), by combining AIML with a high level programming language. The outcome of this research demonstrates in practice an efficient method of representing knowledge that brings machine learning to AIML based chatterbots and software agents. The CyberMate system is designed to address these issues, with built-in capabilities for responding to voice, vision and heat signals captured simultaneously from the environment. CyberMate also demonstrates the practicability of addressing some of the issues in the fields of knowledge engineering, machine learning, natural language processing and social intelligence. The CyberMate system will be open to the worldwide community through Skype. The final endeavour of this research is to build the first AIML based AI chatter-bot software that can pass the classical Turing-Test.

Key Words: Artificial Intelligence, AIML, machine learning, Turing-Test, CyberMate

1.0 INTRODUCTION

During the last five to six decades, many researchers have made significant contributions to the techniques for building an artificially intelligent virtual personality. Though the intended purposes of those efforts varied with each research product, the common goal was to build a virtual personality that can actively respond to human gestures.

Among the various research efforts previously conducted in the field of AI software agents, the creation of the Artificial Intelligence Markup Language (AIML) by Dr. Richard S. Wallace, an XML dialect that can be used to represent natural language patterns for software agents and chatter-bots, opened up a new era for developing natural language processing chatterbots. AIML is a very simple pattern language [1], substantially less complex than regular expressions and as such not even of level 3 in the Chomsky hierarchy [2]. Many developers could therefore write natural language processing chatterbots directly in different programming languages such as C++, Java, C#, Python and PHP. Because its pattern matching capabilities are so simple, AIML interpreters can provide preprocessing functions to expand abbreviations and to identify misspelled words.

The following issues have been identified in existing AIML based chatterbots and software agents.

  1. The inability to learn and change terms at runtime dynamically.
  2. Inefficient and excess memory usage for knowledgebase.
  3. Communication is text based only.
  4. Inability to serve multiple users simultaneously.
  5. Blind, deaf and dumb nature.

CyberMate addresses these issues in a simple manner by combining AIML with Microsoft .NET C#, Microsoft SQL Server, the AForge.NET API and Microsoft SAPI.

2.0 BACKGROUND

2.1 Identification And Significance Of The Problem

The following five issues are considered with reference to the latest AIML 1.0.1 specification [3].

2.1.1 The inability to learn and change terms at runtime dynamically.

Every AIML based chatterbot still fails to dynamically map the facts that it has learnt through user interaction. The following conversation was held with the ALICE bot hosted at http://alice.pandorabots.com [4].

Human: hi

ALICE: Hello there.

Human: Who am i?

ALICE: You are unknown person, seeker.

Human: I am Hasitha

ALICE: Does that make you feel superior?

Human: who am i?

ALICE: You are unknown person, seeker.

In the first situation (1.1), the bot does not actually know with whom it is chatting. In the second situation (1.2), however, the user has clearly stated his name, and the AIML scripts still fail to identify it.

The following example illustrates the same point.

Human: who is the president of America?

ALICE: I'm not that into politics. Who is it?

Human: Barack Obama is the president of America

ALICE: Is it the only one?

Human: who is the president of America?

ALICE: I'm not that into politics. Who is it?

Human: who is the president of USA?

ALICE: George W. Bush.

Human: Barack Obama is the president of USA

Almost eight months after the president of the USA changed, ALICE has still not updated its knowledge. ALICE also fails to recognize that the terms 'America' and 'USA' refer to the same nation [5].

2.1.2 Inefficient and excess memory usage.

By default, all AIML chatterbots load everything in the '.aiml' script files located inside the AIML folder of the working directory. As the contents of the AIML script files grow, the chatterbot takes considerable time to load into its operational state.

Example AIML script:

<category>
  <pattern>WHO INVENTED THE PC</pattern>
  <template>Many people contributed to it. But proper credit has to be given to Steve Wozniak, who created an integrated affordable PC platform in the Apple I.</template>
</category>

If only this category is loaded into memory in ASCII format, it consumes nearly 160x8 bits (160 Bytes). 1,000,000 categories will take approximately:

1,000,000 x 150 Bytes = 143.05 MB

150 := (avg. memory consumption per category in Bytes)

The latest update of the Annotated A.L.I.C.E. AIML set contains 47205 categories, which require approximately 12 MB of memory. Under the proposed specification (see 2.2.x), the same category consumes only 47 Bytes:

<category>
  <pattern>WHO INVENTED THE PC</pattern>
  <template>#QUERY_C(WHO, INVENTED THE PC)</template>
</category>

1,000,000 categories will approximately take;

1,000,000x50 Bytes = 47.68 MB

50 := (avg. memory consumption per category in Bytes)

Memory advantage:

143.05 MB / 47.68 MB ≈ 3.00 (2)

The representation proposed for CyberMate therefore needs roughly one third of the memory, which makes the solution practical.
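The arithmetic above can be checked with a short script. This is only a sketch that reuses the average category sizes quoted in the text (150 and 50 bytes) and assumes 1 MB = 1024 x 1024 bytes, which is the convention that reproduces the figures given; the function name is illustrative.

```python
# Reproduces the memory estimate from the averages quoted in the text.
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes, which matches 143.05 MB


def estimated_mb(categories: int, avg_bytes_per_category: int) -> float:
    """Approximate memory footprint in MB for a set of AIML categories."""
    return categories * avg_bytes_per_category / BYTES_PER_MB


plain_mb = estimated_mb(1_000_000, 150)   # static answer text kept in the template
indexed_mb = estimated_mb(1_000_000, 50)  # template holds only a #QUERY_C tag
ratio = plain_mb / indexed_mb

print(f"{plain_mb:.2f} MB vs {indexed_mb:.2f} MB, ratio {ratio:.2f}")
# prints "143.05 MB vs 47.68 MB, ratio 3.00"
```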

2.1.3 Communication is only direct and text based.

The following communication mechanisms have not yet been developed for AIML based software agents.

  1. Direct voice based two way communication
  2. Feedback in hypermedia contents (multimedia + hypermedia)
  3. Text based communication via Skype

2.1.4 Inability to serve multiple users simultaneously.

Another issue with AIML chatterbots and software agents is that they do not serve multiple users simultaneously. Although web based chatterbot systems do not have this problem, desktop based systems still struggle to serve multiple users at one time.

2.1.5 Blind, deaf and dumb nature

Some existing AIML software agents can now produce a voice reply to the user by using text to speech techniques with a high level programming language. However, none has practically integrated concepts such as computer vision and natural language speech recognition.

2.2 Methodology

To address the five issues, the first step was to move all the static content from the AIML scripts into an SQL database. The AIML scripts are meant to contain only the conversation dialogs and the #QUERY_C(..) tags. The relational schema of the SQL table is as follows: Const_Kn (QType, Tag, Predicate).

When the AIML interpreter processes an input and returns the response to the high level programming language, the returned text is sent to a text matching algorithm. If that text contains any #QUERY_C operation, a database manipulation method is called to retrieve the appropriate values.
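The lookup step described above can be sketched as follows. This is a minimal illustration, not the CyberMate implementation: it uses Python with an in-memory SQLite database as a stand-in for C# and Microsoft SQL Server, the function names and the regular expression are assumptions, and only the table name Const_Kn and its three columns come from the text.

```python
import re
import sqlite3

# Matches tags such as "#QUERY_C(WHO, PRESIDENT OF AMERICA)".
QUERY_TAG = re.compile(r"#QUERY_C\(\s*([^,()]+?)\s*,\s*([^()]+?)\s*\)", re.IGNORECASE)


def open_knowledgebase() -> sqlite3.Connection:
    """Create an empty in-memory table with the schema Const_Kn(QType, Tag, Predicate)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Const_Kn (QType TEXT, Tag TEXT, Predicate TEXT)")
    return conn


def resolve_queries(response: str, conn: sqlite3.Connection) -> str:
    """Replace every #QUERY_C(QType, Tag) in the interpreter output with its Predicate."""

    def lookup(match: re.Match) -> str:
        qtype, tag = match.group(1).upper(), match.group(2).upper()
        row = conn.execute(
            "SELECT Predicate FROM Const_Kn WHERE QType = ? AND Tag = ?",
            (qtype, tag),
        ).fetchone()
        # Leave the tag untouched when the knowledgebase has no matching tuple.
        return row[0] if row else match.group(0)

    return QUERY_TAG.sub(lookup, response)


conn = open_knowledgebase()
conn.execute(
    "INSERT INTO Const_Kn VALUES (?, ?, ?)",
    ("WHO", "PRESIDENT OF AMERICA", "Barack Obama"),
)
print(resolve_queries("#QUERY_C(WHO, PRESIDENT OF AMERICA)", conn))
```

Keeping the answer text in the table means a fact can change without editing or reloading the category that refers to it.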

2.2.1 The tactic used to learn and change terms at runtime dynamically.

The first issue in AIML is addressed by using text replacement algorithms together with the AIML xlearnfact and XEDUCATE pattern keywords.

Human: who is the president of America?

CyberMate: I'm not that into politics. Who is it?

Human: Barack Obama is the president of America

CyberMate: Hm.. I'm not sure. I must search the internet.

Human: who is the president of America?

CyberMate: Barack Obama

The initial situation (2.1) is similar to the earlier situation (1.3), where the agent does not know the president of America. Here, however, the user's feedback is captured and sent to the fact building algorithm.

Ex:

ALGORITHM BUILD_LEARNING_FACTS()
    identified_Q ← "who is the president of America"
    ans_Provided ← "Barack Obama is the president of America"
    QTyp ← extract questionType from identified_Q
    identified_Q ← remove QTyp from identified_Q
    possible_Ans ← ans_Provided - identified_Q
    IF (search_google(possible_Ans + identified_Q)) THEN
        appendToAIMLfile(president of America, Barack Obama)
        appendToSQL(president of America, Barack Obama)
        reload knowledgebase()
    ENDIF
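The fact building algorithm above could look roughly like this in Python. It is a sketch under several assumptions: the web check is replaced by a caller supplied `verify` function (hypothetical, standing in for the search step), the answer is expected to end with the words of the question, and the lists of question types, linking words and articles are illustrative. Writing the result to the AIML file and the SQL table is left to the caller.

```python
from typing import Callable, Optional, Tuple

# Illustrative word lists; the text does not enumerate them.
QUESTION_TYPES = ("WHO", "WHAT", "WHERE", "WHEN")
LEADING_WORDS = ("IS", "ARE", "WAS", "WERE", "THE", "A", "AN")


def build_learning_fact(
    question: str,
    answer: str,
    verify: Callable[[str, str], bool],
) -> Optional[Tuple[str, str, str]]:
    """Return a (QType, Tag, Predicate) tuple learnt from the user's answer, or None."""
    q_words = question.strip().rstrip("?").upper().split()
    if not q_words or q_words[0] not in QUESTION_TYPES:
        return None
    # QTyp is the question word; the remainder is e.g. IS THE PRESIDENT OF AMERICA.
    qtype, remainder = q_words[0], q_words[1:]

    a_words = answer.strip().rstrip(".").split()
    a_upper = [word.upper() for word in a_words]
    n = len(remainder)
    # possible_Ans = ans_Provided - identified_Q: the answer must end with the remainder.
    if n == 0 or len(a_upper) <= n or a_upper[-n:] != remainder:
        return None
    possible_answer = " ".join(a_words[:-n])

    # Drop leading linking words and articles to obtain the tag.
    tag_words = remainder
    while tag_words and tag_words[0] in LEADING_WORDS:
        tag_words = tag_words[1:]
    tag = " ".join(tag_words)

    # Only keep the fact when the external check confirms it.
    if not tag or not verify(possible_answer, tag):
        return None
    return (qtype, tag, possible_answer)


fact = build_learning_fact(
    "who is the president of America?",
    "Barack Obama is the president of America",
    verify=lambda answer, tag: True,  # placeholder: always accepts the answer
)
print(fact)
```

With the always-accepting placeholder, the call yields the tuple (WHO, PRESIDENT OF AMERICA, Barack Obama), which corresponds to the row stored in the knowledgebase.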

A new category is added to the AIML script as follows.

<category>
  <pattern>WHO IS THE PRESIDENT OF AMERICA</pattern>
  <template>#QUERY_C(WHO, PRESIDENT OF AMERICA)</template>
</category>

A tuple (WHO, PRESIDENT OF AMERICA, BARACK OBAMA) is inserted into the SQL database. As soon as the operation completes, the AIML knowledgebase is reloaded, so from then on the software agent permanently knows this fact. Consider a situation like (2.4), where the software agent knows only the president of America, and USA also refers to America.

Human: who is the president of USA?

CyberMate: Barack Obama.

To address this type of scenario, the user input is sent to a pattern exchanging algorithm, which identifies different words or phrases that describe something already known to the software agent. The replaced text "who is the president of America" is then sent to the AIML interpreter.

This strategy is used in CyberMate to address the first issue in AIML.
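A pattern exchanging step of this kind can be sketched as a simple alias substitution. The alias table and function name below are hypothetical; the text does not say how CyberMate stores or discovers equivalent terms.

```python
import re

# Hypothetical alias table: each known variant maps to the term the knowledgebase uses.
ALIASES = {
    "USA": "America",
    "United States": "America",
    "United States of America": "America",
}


def exchange_patterns(user_input: str, aliases: dict = ALIASES) -> str:
    """Replace known variants with the canonical term before AIML matching."""
    # Try longer variants first so "United States of America" wins over "United States".
    variants = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(v) for v in variants) + r")\b",
        re.IGNORECASE,
    )
    lookup = {variant.lower(): canonical for variant, canonical in aliases.items()}
    return pattern.sub(lambda m: lookup[m.group(1).lower()], user_input)


print(exchange_patterns("who is the president of USA?"))
# prints "who is the president of America?"
```

The word boundary anchors keep the substitution from firing inside longer words such as "usable".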

2.2.2 Efficient and less memory usage

To minimize the size of the static knowledge content, non-conversational dialogues and definitions are moved to the dynamic knowledgebase, which makes memory consumption more efficient. To answer queries, every AIML chatterbot [6] must otherwise load all content into memory, which increases memory utilization and introduces a loading delay. To address this issue, each category holds only an index that refers to the knowledgebase, which shortens the string that has to be stored (see Section 2.1.2).

2.2.3 Communication through direct text, direct voice and, indirectly, Skype

Most existing chatterbots communicate via text and are therefore incapable of holding a real time conversation with users. The aim is to implement a strong AI personality that can learn new responses based on user interactions, and to enable the computing environment to sense at a standard that is at least equal to a human and to have consciousness of itself, by providing real time reasoning and arguing with users through voice commands in a highly effective mode. These issues in AIML based chatterbots are solved by integrating the following APIs and COM components into the .NET environment.

  1. Direct voice: Microsoft Speech API
  2. Skype: Skype4COM library

When the speech recognition feature is enabled, the recognized string is processed in the same way as text keyed directly into the application. The accuracy of the speech recognition depends on the speech API used.

Interoperability between the .NET environment and Skype is implemented by referencing the Skype4COM component [7] and monitoring the Skype events manually. CyberMate introduces a new concept by providing live conversations through the Skype application, connecting multiple users anywhere in the world at any time. This serves exclusive and explainable responses to any user who wants information according to their preference.

2.2.4 Serving multiple users simultaneously

Since Skype is integrated into CyberMate, any user can directly access CyberMate from anywhere in the world. Clones of CyberMate can even access one another and share or request the information they need.

2.2.5 Three out of the five wits

CyberMate is capable of responding to voice, video and heat percepts captured from the live environment. As usual, the voice is captured by an integrated microphone and the video is captured using a two or three camera system, while the heat is measured by a newly developed hardware device based on a PIC16f678a microcontroller and a DS1621 IC [8]. More information about this device is given in the Appendix.

2.3 Technical Objectives

People have long been fascinated with the idea of non-human assistance. That idea motivated making the computer environment more flexible for storing and retrieving information and for solving difficult inferential problems through integrated AI applications. CyberMate thus contributes an innovative way of implementing a chatterbot.

AIML does not support dynamic knowledge bases, and most existing AIML based chatterbots rely on static knowledge, that is, the inbuilt knowledge that came with the compilation. The attempt here is to introduce a concept for adding the new knowledge that the agent learns dynamically and for updating its existing knowledge. This adds a self-learning feature to the AIML chatterbot. The chatterbot learns from its successes by querying users for good responses. In addition, using a hardcoded HTML parser, it can discover information on the web and store it in a local database. Using search engine technologies, each link is classified and given a set of keywords, which the AIML engine uses to provide hyperlinked external resources associated with it.

The aim is to implement a strong AI personality that can learn new responses based on user interactions, and to make the computer environment think on a level that is at least equal to a human, and possibly even be conscious of itself, by providing real time reasoning and arguing with users through voice commands in a highly effective mode.

With the use of multiple forms of AIML, most scientific definitions and biographical, geographical, historical and entertainment information are included in the knowledgebase in order to address the absence of relevant knowledge.

The foremost idea behind the research is to bridge the gap that exists between the digitized world and the physical world, and also to contribute to the A.L.I.C.E. open source foundation by uplifting the existing AIML based ALICEBot.

2.4 Detailed Design

To achieve the stated objectives, several methods and techniques are applied to implement an enhanced AI personality that resides in the modern computing environment. Constraints depend on a number of factors, including the platform, the software IDE and the way several APIs from different vendors are combined into one single system.

2.4.1 Knowledgebase

The CyberMate knowledgebase is rewritten by extending AIML 1.0.1 with the proposed method of representing textual knowledge (see Section 2.1.2).

2.4.2 Machine Learning Mechanism

The algorithm used to achieve this mechanism is described with an example in Section 2.2.1.

2.4.3 Speech Processing

To carry out speech processing, percept sequences or logical expressions must be generated for the text patterns received from the pattern matching process. Microsoft Speech API 5.4 and the Microsoft Anna TTS engine are used for this feature.

2.4.4 Video Processing

A few video processing algorithms run inside the CyberMate system to (1) track any moving objects in the environment and (2) identify human faces.

All the video processing algorithms are based on the AForge.NET API.

2.4.5 Logic Processing

Basic algorithms are used to identify textual expressions and convert them to logical facts. The CyberMate system is also integrated with Prolog.NET so that it can perform basic machine reasoning.
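As an illustration of converting a textual expression into a logical fact, the sketch below turns a sentence of the form "X is the R of Y" into a Prolog style fact. The sentence template, the naming scheme and the function names are assumptions made for this example; the text does not describe the actual conversion rules or the Prolog.NET calls used.

```python
import re
from typing import Optional

# Assumed sentence template: "<subject> is the <relation> of <object>".
STATEMENT = re.compile(
    r"^\s*(?P<subject>.+?)\s+is\s+the\s+(?P<relation>.+?)\s+of\s+(?P<object>.+?)\s*\.?\s*$",
    re.IGNORECASE,
)


def atom(text: str) -> str:
    """Lower-case the text and join its words with underscores.

    Assumes the text starts with a letter, as a Prolog atom must.
    """
    return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")


def to_fact(sentence: str) -> Optional[str]:
    """Return a fact such as relation(object, subject). or None if no match."""
    match = STATEMENT.match(sentence)
    if match is None:
        return None
    relation = atom(match["relation"])
    return f"{relation}({atom(match['object'])}, {atom(match['subject'])})."


print(to_fact("Barack Obama is the president of America"))
# prints "president(america, barack_obama)."
```

A fact in this form can then be queried by a Prolog engine, for example with `president(america, Who).`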

2.4.6 Automatic web searching

The searching process draws on the Google AJAX Search API, the Google Maps API, Wikipedia and YouTube, depending on the format of the information the user enquires about.

2.5 Anticipated Benefits

The system's functionalities are intended to satisfy the different needs of its users.

2.5.1 Easy access to worldwide knowledge

System users can request any information they require, and the system is designed to respond to each request. It can return results in formats beyond the plain textual output of an ordinary search. The Google NewsShow, Google AJAX Search API, Google Maps API, Google AJAX Map Search and Google AJAX Video Search enable responses to different types of user information requests.

2.5.2 Introduce a new measuring method for the classical Turing-Test

The system can be accessed and tested through a remote communication mechanism such as Skype. CyberMate thereby introduces remote Turing testing capabilities for an artificially intelligent agent.

2.5.3 User convenience

Users do not need to sit in front of the machine to do their computing, since the system interacts with the user like a human. Through human voice identification and motion detection, the user gains access to the system, which creates a convenient computing environment.

2.5.4 Enhanced search

CyberMate lets the user obtain almost all requested information in textual or hypermedia format. This provides a more convenient way of searching for information and presenting it according to the user's preference. Keeping information up to date and being able to store it in a knowledge base are also among the most significant contributions to better searching.

3.0 CONCLUSIONS

This system is intended for all computer users rather than a specific group. Knowledge seekers can gather information, students can use the system as a learning tool, general users can use it as an explorative tool, and in particular disabled people who cannot use common input devices can use it as a supportive tool.

By studying current systems in detail, this research helps to overcome many weaknesses in existing AIML chatterbots and addresses many unresolved areas in the application of AI to today's workflows. As a future enhancement, CyberMate will move closer to strong AI, since it will learn from user interactions and will be capable of producing new and unique responses rather than being driven from a static database, focusing more on the pragmatic aspect of chatterbot technology.

4.0 ACKNOWLEDGMENTS

Gratitude and appreciation are offered to the officials and other staff members of the Sri Lanka Institute of Information Technology who rendered their help during the course of the project work.

Last but not least, a sense of gratitude and love is expressed toward our friends and our beloved parents for their immense support, strength and co-operation. Finally, thanks to all the team members for the effort and contribution toward making this project a success.

5.0 REFERENCES

[1]. "AIML". [online]. Available at: [http://en.wikipedia.org/wiki/AIML]. [Accessed on: 12 May 2010]

[2]. "Chomsky Hierarchy". [online]. Available at: [http://en.wikipedia.org/wiki/Chomsky_hierarchy]. [Accessed on: 14 May 2010]

[3]. "Artificial Intelligence Markup Language (AIML) Version 1.0.1". [online]. Available at: [http://www.alicebot.org/TR/2001/WD-aiml]. [Accessed on: 10 May 2010]

[4]. "A. L. I. C. E. Artificial Intelligence Foundation". [online]. Available at: [http://alice.pandorabots.com]. [Accessed on: 18 May 2010]

[5]. "A.L.I.C.E. Artificial Intelligence Foundation". [online]. Available at: [http://www.alicebot.org/aiml.html]. [Accessed on: 18 May 2010]

[6]. "Chatterbot". [online]. Available at: [http://en.wikipedia.org/wiki/Chatterbot]. [Accessed on: 10 May 2010]

[7]. "Skype4COM reference". [online]. Available at: [https://developer.skype.com/Docs/Skype4COM]. [Accessed on: 09 May 2010]

[8]. "DS1621 Digital Thermometer and Thermostat". [online]. Available at: [http://www.maxim-ic.com/quick_view2.cfm/qv_pk/2737]. [Accessed on: 09 May 2010]
