
Tuesday, 5 July 2011

Machines and meaning

Old post, now archived thus.

--

The problem with machines is subjectivism. A machine’s ability to feel, to perceive and to draw subjective conclusions is compromised by the very act of its creation: if we could put random bits of metal together and suddenly witness life, we wouldn’t bother with any of the intricate mechanisms that make the device merely life-like. Essentially, we’re only imitating life, not creating it.

In the absence of mechanical subjectivism, the best recourse at our disposal is to imitate the processes through which we humans achieve subjectivism. The most important of these is the feedback loop, which, together with such things as logic gates and Turing machines, manages to recreate various scenarios. Recreation is not our ultimate goal, but it must still be suffered so that we may find what we seek.

As an example, I’ll use the ‘intent’ and ‘mechanism’ modules to highlight the problems associated with making a machine think for itself. This is not a simple computation that a Universal Turing Machine (UTM) could solve – think of it as a UTM with a very large rules table, an infinite and variable input tape, and an output that must fit certain descriptions.

Intent: To reprimand a man who’s made a mistake

Resources: Corpus of words, grammatical rules, semantics, alpha-numeric index

Mechanisms:

  1. UTM.access accesses elements from the ‘Resources’ set

  2. UTM.flip substitutes one element with another element

  3. UTM.eval.typo is a function that evaluates the typology of a given phrase and returns a predefined structure index value

  4. UTM.eval.sem is a function that evaluates the meaning of a given word and returns a predefined semantics index value

  5. UTM.build finalizes the order in which a run of words in a two-dimensional array is set

  6. UTM.insert inserts a word into the array

  7. UTM.remove removes a word from its place in the array


Now, in terms of the UTM: the input tape is made finite and constituted by the words in the corpus, the rules table inherits its logicality from the properties of the ‘Resources’ set, and the output must match the intent.
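Sketched in Python (the post prescribes no particular language), the mechanism list above might be stubbed out as follows. The method names mirror the `UTM.*` modules; everything else – the data shapes, the index values, and the simplification of the two-dimensional array to a one-dimensional tape – is an assumption for illustration:

```python
class UTM:
    """A sketch of the 'Mechanisms' list, not a real Turing machine."""

    def __init__(self, resources):
        self.resources = resources   # corpus, rules, semantics, index
        self.tape = []               # working array of words (1-D here)

    def access(self, key):
        """UTM.access: fetch an element from the 'Resources' set."""
        return self.resources[key]

    def flip(self, i, word):
        """UTM.flip: substitute one element with another; return the old one."""
        old, self.tape[i] = self.tape[i], word
        return old

    def eval_typo(self, phrase):
        """UTM.eval.typo: map a phrase to a predefined structure-index value.
        Placeholder: keys crudely on word count."""
        return self.access('structure_index').get(len(phrase), -1)

    def eval_sem(self, word):
        """UTM.eval.sem: map a word to a predefined semantics-index value."""
        return self.access('semantics').get(word, -1)

    def insert(self, i, word):
        """UTM.insert: put a word into the array."""
        self.tape.insert(i, word)

    def remove(self, i):
        """UTM.remove: delete a word from its place in the array."""
        return self.tape.pop(i)

    def build(self):
        """UTM.build: finalize the word order into a sentence."""
        return ' '.join(self.tape)


# Usage (the resource contents are invented):
m = UTM({'semantics': {'foolish': 7}, 'structure_index': {3: 0}})
m.insert(0, 'He'); m.insert(1, 'is'); m.insert(2, 'foolish')
m.flip(2, 'silly')        # substitute one element for another
# m.build() -> 'He is silly'
```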

Corpus of words (closed concept)

  • Finite in number

  • Categorized as adjectives, nouns, verbs, determiners and prepositions

  • Spelling


Grammatical rules (closed concept)

  • Syntactic rules (placement of commas after certain words, etc.)

  • Typologies (OVS and SVO)

  • Placement rules for verb phrase (VP), noun phrase (NP), prepositional phrase (PP), and adjectival phrase (AP)

  • Categorization rules for phrasal grammar


(Example: My neighbour, whose dog was barking all night, parked his car in the garage and ran into his backyard.

Phrasal form: {My neighbour, {whose dog was barking all night}, {parked {his {car}} {in {the garage}}} and {ran {into {his backyard.}}}})
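The bracketed phrasal form transcribes directly into nested lists, which is how a categorization rule for phrasal grammar might actually store it. Each inner list below is one of the `{...}` groupings; the exact placement of punctuation is simplified:

```python
# The phrasal form of the example sentence as a nested structure.
phrasal = ['My neighbour,',
           ['whose dog was barking all night,'],
           ['parked', ['his', ['car']], ['in', ['the garage']]],
           'and',
           ['ran', ['into', ['his backyard.']]]]

def flatten(node):
    """Recover the linear sentence from the nested phrase structure."""
    if isinstance(node, str):
        return node
    return ' '.join(flatten(child) for child in node)

# flatten(phrasal) reproduces the original sentence.
```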

Alpha-numeric index

  • A mapping from one closed-concept finite set to a closed-concept infinite set

  • Provides, inter alia, “communication” between different functions and procedures

  • Definitions can be modified by user(s)
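Concretely, the alpha-numeric index can be sketched as a user-modifiable dictionary through which the `UTM.eval.*` functions exchange category information as plain codes rather than raw strings. All of the concrete codes below are invented for illustration:

```python
# A sketch of the alpha-numeric index: a mapping from the finite set of
# terms (word categories, typologies) to numeric codes.
index = {
    'noun': 0, 'verb': 1, 'adjective': 2,
    'determiner': 3, 'preposition': 4,
}

def register(term, code):
    """Definitions can be modified by the user(s)."""
    index[term] = code

# Typology codes live in the same index:
register('SVO', 100)
register('OVS', 101)
```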


There we have it. Now, using these tools and a corpus of, say, 5,000 words, script the algorithm for a Turing machine to generate as many sentences as possible that are all synonymous with: “He is so foolish, so much more foolish than my neighbours.”

The simpler sections of the algorithm are invariably those that have been deconstructed into discrete encapsulations of different concepts, and are therefore easily deployed as part of a process. The more difficult sections are those that employ the semantic aspects of words, because the problem statement amounts to this: how do you make a machine understand meaning?

(Remember that the corpus does not contain a synonymic categorization rule.)
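A mechanical sketch of the generation task makes the caveat vivid. Since the corpus carries no synonymic categorization, the synonym sets below have to be supplied by hand – which is precisely the ‘understanding meaning’ gap the post is pointing at. Every word list here is an invented stand-in, not something `UTM.eval.sem` could derive on its own:

```python
from itertools import product

# Hand-supplied substitution slots for the target sentence. The synonym
# choices are assumptions; nothing in the 'Resources' set licenses them.
slots = [
    ['He'],
    ['is'],
    ['so foolish,', 'so silly,'],
    ['so much more foolish than', 'far more foolish than'],
    ['my neighbours.'],
]

def generate(slots):
    """Enumerate every sentence the slot-wise substitutions allow."""
    for choice in product(*slots):
        yield ' '.join(choice)

sentences = list(generate(slots))
# 1 * 1 * 2 * 2 * 1 = 4 candidate sentences, including the original.
```

The combinatorics is trivial; deciding which substitutions preserve meaning is the part no closed-concept resource in the list above can supply.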

To be continued...
