• Wrong ideas about chatbots

    From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.misc on Sun Jun 8 00:39:21 2025
    From Newsgroup: comp.misc

    Ben Collver <bencollver@tilde.pink> wrote or quoted:
    | you'd start with an enormous quantity of text, then do a lot
    |of computationally-intense statistical analysis to map out which
    |words and phrases are most likely to appear near to one another.

    I had already explained why that was off, but let me give you
    all an example from a recent chat of mine with a chatbot.

    I asked the chatbot to write a program for left-associative parsing
    of English. He must have mixed up my "left-associative" with the
    more common "left-recursive" or figured I just said it wrong.

    He clearly did not know about this specific left-associative
    parsing method for natural languages. Even after I gave him
    the exact source, he still did not get it right.

    Then I laid it out for him [1]. After that, he wrote a program [2]
    for left-associative parsing of English. Here is what he produced:
    [3]. I also asked him for an explanation of the approach to laymen,
    so if you want to learn more about it, see [4].

    So tell me, how is real understanding like this supposed
    to happen if chatbots just work based on "a statistical
    analysis of which words often show up together"?

    [1] How I explained it to him, after his first program was
    not what I wanted

    Maybe you're misled by applying standard terms to the special
    NEWCAT approach.

    It starts with an empty object and then always appends the next
    word until the end of the text is reached. No big recursion there.

    We get: empty + "the"

    Now, we need a grammar rule to see if that combination is
    possible. For this purpose, each of "empty" and "the" has
    a category, which is a data structure with their attributes,
    and the grammar rule then checks whether two things with these
    categories can be combined; if so, it creates the new
    sentence start "empty + 'the'" with a new category given by
    the rule.

    Then we try to add "cat" to the sentence start. So the sentence
    is built left-to-right; that's what's "left-associative" about it.

    Finally, we add the "." after "fish" and then we may have
    a complete sentence if everything was allowed by the rules.

    This parsing is trivial. The crucial thing is to write the
    categories and the rules so that all continuations of a
    sentence start that are legal in English are allowed by the
    rules and all others are rejected.
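
    The control loop just described can be sketched in a few lines
    of Python (an illustrative sketch only; `lexicon` and `combine`
    here are placeholders, not the actual NEWCAT categories or rules):

```python
def parse_left_associative(words, lexicon, combine):
    """Fold the sentence left to right: the state is always the
    category of the sentence start built so far."""
    state = None  # the empty sentence start
    for word in words:
        category = lexicon[word]          # look up the word's category
        state = combine(state, category)  # try to apply a grammar rule
        if state is None:                 # no rule allows the combination
            return None
    return state
```

    All the grammatical knowledge lives in the categories and in
    `combine`; the loop itself stays trivial, as described above.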

    [2] The parser for a tiny subset of English he wrote then,
    according to my above explanation

    import builtins
    import sys
    import time

    # Toy lexicon
    LEXICON = {
        'the': ('DET', {}),
        'cat': ('N', {'number': 'sg'}),
        'cats': ('N', {'number': 'pl'}),
        'dog': ('N', {'number': 'sg'}),
        'dogs': ('N', {'number': 'pl'}),
        'and': ('CONJ', {}),
        'eats': ('V', {'number': 'sg'}),
        'eat': ('V', {'number': 'pl'}),
    }

    def get_category(word):
        if word not in LEXICON:
            raise ValueError(f"Unknown word: {word}")
        return LEXICON[word]

    def combine(state, next_token):
        cat, features = next_token
        print(f"  [combine] State: {state}, Next: ({cat}, {features})")
        # If state is None, we're at the start
        if state is None:
            if cat == 'DET':
                print("  [combine] Start DET")
                return ('subj', {'number': None, 'is_complete': False,
                                 'needs_conj': False, 'pending_det': True})
            elif cat == 'N':
                print("  [combine] Start bare N (fail)")
                return None
            else:
                print("  [combine] Start fail")
                return None
        # If we're building a subject
        if state[0] == 'subj':
            subj = state[1]
            # --- Coordination context: CONJ + DET + N ---
            if subj.get('needs_conj') and subj.get('pending_det') and cat == 'N':
                print("  [combine] CONJ + DET + N -> coordinated NP (plural)")
                return ('subj', {'number': 'pl', 'is_complete': True,
                                 'needs_conj': False, 'pending_det': False})
            # --- Plain DET + N (not in coordination) ---
            if subj.get('pending_det') and cat == 'N':
                print("  [combine] DET + N -> NP")
                return ('subj', {'number': features['number'], 'is_complete': True,
                                 'needs_conj': False, 'pending_det': False})
            # --- NP + CONJ ---
            if subj.get('is_complete') and cat == 'CONJ':
                print("  [combine] NP + CONJ")
                return ('subj', {'number': subj['number'], 'is_complete': False,
                                 'needs_conj': True, 'pending_det': False})
            # --- CONJ + DET ---
            if subj.get('needs_conj') and cat == 'DET':
                print("  [combine] CONJ + DET")
                return ('subj', {'number': None, 'is_complete': False,
                                 'needs_conj': True, 'pending_det': True,
                                 'prev_number': subj['number']})
            # --- NP + V (agreement check) ---
            if subj.get('is_complete') and cat == 'V':
                print(f"  [combine] NP + V: subj number {subj['number']}, "
                      f"verb number {features['number']}")
                if subj['number'] == features['number']:
                    print("  [combine] Agreement OK")
                    return ('sent', {'number': subj['number']})
                else:
                    print("  [combine] Agreement FAIL")
                    return None
            print("  [combine] No rule applies (fail)")
            return None
        # If we already have a sentence, nothing more allowed
        if state[0] == 'sent':
            print("  [combine] Already sent (fail)")
            return None
        print("  [combine] No rule applies (fail)")
        return None

    def parse(sentence):
        words = sentence.lower().split()
        state = None
        print(f"\nParsing: {sentence}")
        for word in words:
            print(f"  State: {state}, Next: {word}")
            cat = get_category(word)
            state = combine(state, cat)
            if state is None:
                print(f"  Failed at: {word}")
                return False
        print(f"  Final state: {state}")
        return state is not None and state[0] == 'sent'

    # Test sentences
    sentences = [
        "the cat eats",              # ✔️ singular
        "the cats eat",              # ✔️ plural
        "the cat and the dog eat",   # ✔️ coordination (plural)
        "the cat and the dogs eat",  # ✔️ coordination (plural)
        "the cat and the dog eats",  # ❌ subject-verb disagreement
        "the cats eats",             # ❌ subject-verb disagreement
        "the cat and eat",           # ❌ incomplete NP
        "cat eats",                  # ❌ missing DET
        "the cat and the",           # ❌ incomplete NP
    ]

    for s in sentences:
        result = parse(s)
        print(f"Result: {'ACCEPT' if result else 'REJECT'}")

    [3] the abridged output of the parser

    Parsing: the cat eats
    Result: ACCEPT

    Parsing: the cats eat
    Result: ACCEPT

    Parsing: the cat and the dog eat
    Result: ACCEPT

    Parsing: the cat and the dogs eat
    Result: ACCEPT

    Parsing: the cat and the dog eats
    Result: REJECT

    Parsing: the cats eats
    Result: REJECT

    Parsing: the cat and eat
    Result: REJECT

    Parsing: cat eats
    Result: REJECT

    Parsing: the cat and the
    Result: REJECT

    [4] the chatbot explains it:

    What is Left-Associative Parsing?

    Left-associative parsing is a way of analyzing a sentence by
    reading it from left to right, always combining the information
    you've built up so far (the "sentence start" or "current state")
    with the next word.

    - At every step, you only ever look at the current state and
    the next word - never more than that.

    - You never look ahead or backtrack; you just keep moving
    forward.

    - This is sometimes called "incremental" or "shift-reduce"
    parsing, but here we mean: Always combine the current
    state with the next word, and update the state.

    Why is this interesting?

    - It's simple, fast, and models how humans often process
    language in real time.

    - It forces you to encode all the information you'll need for
    future steps in the current state, because you never get to
    "look back".

    What are Complex Categories?

    In traditional grammar, you might label things as "Noun Phrase
    (NP)", "Verb Phrase (VP)", etc.

    But in left-associative parsing, the current state must
    carry all the information you'll need for the rest of the
    parse.

    So, you use complex categories - data structures that store
    not just the grammatical type, but also features like number
    (singular/plural), whether you're in the middle of a coordination
    ("and"), whether you're expecting a noun, and so on.

    In the script:

    - The state is a tuple like ('subj', {...}) or
    ('sent', {...}).

    - The dictionary inside holds all the features you need:
    "number", "is_complete", "needs_conj", "pending_det", etc.
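
    As a concrete illustration, the states for "the cat" look
    roughly like this (the feature names come from the script; the
    annotations and the particular values are my own reconstruction):

```python
# After "the": a determiner has been seen, a noun is still pending.
after_the = ('subj', {'number': None, 'is_complete': False,
                      'needs_conj': False, 'pending_det': True})

# After "cat": a complete singular subject NP.
after_cat = ('subj', {'number': 'sg', 'is_complete': True,
                      'needs_conj': False, 'pending_det': False})

# A rule like NP + V only ever inspects this state and the next word:
assert after_cat[1]['is_complete'] and after_cat[1]['number'] == 'sg'
```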


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.misc on Sun Jun 8 12:51:07 2025

    ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
    |Parsing: the cat eats
    |Result: ACCEPT

    I asked the chatbot to modify the program to also output the
    meaning in both a Python-like notation and with an image.

    Input: the dog eats
    Result: ACCEPT
    Meaning: eats(one(dog))
    /^ ^\
    / 0 0 \
    / Y \
    ~~~~~
    V\ - /V
    / \
    | |
    (__V__)

    ----------------------------------------
    Input: the dogs eat
    Result: ACCEPT
    Meaning: eat(morethanone(dog))
    /^ ^\ /^ ^\
    / 0 0 \ / 0 0 \
    / Y \ / Y \
    ~~~~~ ~~~~~
    V\ - /VV\ - /V
    / \ / \
    | | | |
    (__V__) (__V__)

    ----------------------------------------
    Input: the cat eats
    Result: ACCEPT
    Meaning: eats(one(cat))
    /\_/\
    ( o.o )
    ^ <
    ~~~~~

    ----------------------------------------
    Input: the cats eat
    Result: ACCEPT
    Meaning: eat(morethanone(cat))
    /\_/\ /\_/\
    ( o.o ) ( o.o )
    ^ < > ^ <
    ~~~~~ ~~~~~

    ----------------------------------------
    Input: the cat and the dog eat
    Result: ACCEPT
    Meaning: eat(and_(one(cat), one(dog)))
    /\_/\
    ( o.o )
    ^ <
    ~~~~~

    /^ ^\
    / 0 0 \
    / Y \
    ~~~~~
    V\ - /V
    / \
    | |
    (__V__)

    ----------------------------------------
    Input: the cat and the dogs eat
    Result: ACCEPT
    Meaning: eat(and_(one(cat), morethanone(dog)))
    /\_/\
    ( o.o )
    ^ <
    ~~~~~

    /^ ^\ /^ ^\
    / 0 0 \ / 0 0 \
    / Y \ / Y \
    ~~~~~ ~~~~~
    V\ - /VV\ - /V
    / \ / \
    | | | |
    (__V__) (__V__)

    ----------------------------------------
    Input: the dog and eat
    Result: REJECT
    ----------------------------------------
    Input: cat eats
    Result: REJECT
    ----------------------------------------
    Input: the cats eats
    Result: REJECT
    ----------------------------------------
    Input: the dog and the cat eats
    Result: REJECT
    ----------------------------------------
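
    The meaning terms above suggest a simple composition built up
    during the parse. One possible way to produce them (the helper
    names follow the printed output; the code itself is my own
    guess, not the chatbot's):

```python
# Build the subject term while reading the NP...
def one(noun):
    return f"one({noun})"

def morethanone(noun):
    return f"morethanone({noun})"

def and_(left, right):
    return f"and_({left}, {right})"

# ...then wrap the subject in the verb when the V is combined:
def apply_verb(verb, subject):
    return f"{verb}({subject})"

print(apply_verb("eats", one("dog")))                      # eats(one(dog))
print(apply_verb("eat", and_(one("cat"), one("dog"))))     # eat(and_(one(cat), one(dog)))
```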


  • From Ben Collver@bencollver@tilde.pink to comp.misc on Sun Jun 8 14:27:49 2025

    On 2025-06-08, Stefan Ram <ram@zedat.fu-berlin.de> wrote:
    Ben Collver <bencollver@tilde.pink> wrote or quoted:
    ^^^^^^

    Just to be clear, I quoted it. I am not the author of this blog.

    | you'd start with an enormous quantity of text, then do a lot
    |of computationally-intense statistical analysis to map out which
    |words and phrases are most likely to appear near to one another.

    You are fixating on the technical and ignoring the social. From the
    original article:

    [AI] turns social relations into number-crunching operations...

    Meredith Whittaker, president of the Signal Foundation, has
    described AI as being fundamentally "surveillance technology".

    AI systems have found their best product-market fit in police and
    military applications, where short-circuiting people's critical
    thinking and decision-making processes is incredibly useful...
  • From ram@ram@zedat.fu-berlin.de (Stefan Ram) to comp.misc on Sun Jun 8 14:59:40 2025

    Ben Collver <bencollver@tilde.pink> wrote or quoted:
    |You are fixating on the technical and ignoring the social. From the
    |original article:
    |[AI] turns social relations into number-crunching operations...
    |Meredith Whittaker, president of the Signal Foundation, has
    |described AI as being fundamentally "surveillance technology".
    |AI systems have found their best product-market fit in police and
    |military applications, where short-circuiting people's critical
    |thinking and decision-making processes is incredibly useful...

    It's true that for some folks, AI kind of takes the place of
    real social interaction, and it can be used for things like
    surveillance, law enforcement, or the military.

    It's a good thing when people spot potential risks and speak
    up about them.

    But people really shouldn't act as if every social interaction
    we have is now run by AI, or as if those police and military
    uses are all AI is good for.

    You could say the same thing about all kinds of tech and scientific
    progress. Take psychology, for example. There are techniques that
    let you figure out someone's political leanings just from subtle word
    choices. Any kind of scientific or technical breakthrough can get
    twisted by bad actors, like dictators, to spy on their own people,
    mess with them, or even go after other countries.

