
Digital Transformation by Riaz Khan Podcast

WHEN AI MEETS MY COFFEE BREAK

13 min • 23 August 2023

We’ve all seen the viral videos of ChatGPT’s linguistic dexterity. But how does it fare with logic and reasoning - the building blocks of intelligence?

Curious to find out, I devised some quick tests during a coffee stop to see if I could teach ChatGPT basic logical thinking.

With 45 minutes to kill, and armed with varied date formats and phonetic spellings, I guided ChatGPT through comparisons, translations, and multi-step inferences. Could it recognize patterns from examples? Handle tricky formats? Progress logically? The results reveal the inner workings of large language models in an accessible and entertaining way.

So, grab a coffee, suspend your disbelief, and let’s explore together. You may be surprised by ChatGPT’s capabilities, glimpsing the future potential of AI while also understanding current limitations. Tag along on my fun outing as we discover what machines can do - one double-double and Timbit at a time.

I chose GPT-4. Let's explore the interesting results.

Date Comparisons

The first series of tests was on date comparisons:

I started by teaching ChatGPT to recognize a pattern of two dates in various formats, and the results I expected.

I provided some examples for ChatGPT to learn from. I mixed up the date input formats in my examples to test the LLM's ability to cope with user input in free format.

Here are the examples and the expected results I gave ChatGPT:

Date 1: April 19, 2001 Date 2: 2019 June 12th Result: Date 1 is before Date 2

Date 1: Sep 12, 2023 Date 2: October 1st '21 Result: Date 2 is before Date 1

Date 2: 11/June/1991 Date 1: 1991, June 11th Result: Date 2 is the same as Date 1
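A few-shot prompt like this can also be assembled programmatically. Here is a minimal sketch of that idea; the template and helper name are my own illustration, not the exact prompt I sent:

```python
# A minimal sketch of assembling few-shot examples into one prompt string.
# The example data mirrors two of the examples above; the template and
# function name are illustrative, not the exact prompt used.

EXAMPLES = [
    ("Sep 12, 2023", "October 1st '21", "Date 2 is before Date 1"),
    ("11/June/1991", "1991, June 11th", "Date 2 is the same as Date 1"),
]

def build_prompt(date1: str, date2: str) -> str:
    lines = [
        f"Date 1: {d1} Date 2: {d2} Result: {result}"
        for d1, d2, result in EXAMPLES
    ]
    # Leave the final "Result:" blank for the model to complete.
    lines.append(f"Date 1: {date1} Date 2: {date2} Result:")
    return "\n".join(lines)
```

Generating prompts from a template like this makes it easy to rerun the same test with many date pairs.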

And now for the first test. I gave ChatGPT two dates, made an error in labeling one of them, and left the result unfilled.

Date 1: Tuesday 15th August 2023 Date 1: October 19, 1957 Result:

ChatGPT read my prompts and gave me the following response:

It looks like there is an inconsistency in the naming of the dates in your last example, as you have two instances of "Date 1." Let's correct that and provide the result:

Date 1: Tuesday 15th August 2023

Date 2: October 19, 1957

Result: Date 2 is before Date 1

This corrected result follows the pattern of the previous examples by comparing the two dates and determining their chronological order. Is there anything else I can assist you with?

That was quite impressive! ChatGPT detected an input-labeling error, inferred what the label should be, compared the two dates even though they were entered in different formats, and output the correct result.


And now for the second test:

Date 1: 11/October/2020 Date 2: 15th June 2021 Result:

ChatGPT:

Date 1 is before Date 2

In this comparison, the 11th of October, 2020 comes before the 15th of June, 2021. Therefore, the result is that Date 1 is before Date 2. Is there anything else you'd like to know?

The third test:

Date 2: 11/12/2023 Date 1: 11th of December, 2023 Result:

ChatGPT:

Result: Date 2 is the same as Date 1

In this comparison, both dates refer to the 11th of December, 2023, so the result is that Date 2 is the same as Date 1.

Impressively, ChatGPT responded with the correct logical results in all three examples. This small demonstration of multi-step reasoning - recognizing the different date formats, chronologically comparing them, and outputting the relationship - shows ChatGPT can follow structured and logical workflows.

Had I chosen to code this in a traditional programming language such as Python, instead of using prompt engineering, I would have needed to do some of the following:

* Write parsing logic to extract the day, month, and year from varied date-format inputs.

* Write functions to convert the dates to a standard structure.

* Call a date comparison function.

* Test each piece extensively to catch edge cases.

The coding and testing time would have been immense compared to the quick, prompt-based approach.
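For comparison, here is a minimal sketch of that traditional approach, covering only the formats that appear in these examples (the function names and pattern list are my own; a production version would need far more patterns and tests):

```python
from datetime import datetime
import re

# Sketch of the traditional approach: normalize free-format date strings
# and compare them. Only the formats from the examples above are handled.

PATTERNS = [
    "%B %d, %Y",    # April 19, 2001
    "%Y %B %d",     # 2019 June 12
    "%b %d, %Y",    # Sep 12, 2023
    "%d/%B/%Y",     # 11/June/1991
    "%Y, %B %d",    # 1991, June 11
    "%A %d %B %Y",  # Tuesday 15 August 2023
    "%d %B %Y",     # 15 June 2021
    "%d of %B, %Y", # 11 of December, 2023
]

def parse_date(text: str) -> datetime:
    # Strip ordinal suffixes ("12th" -> "12") before matching.
    cleaned = re.sub(r"(\d)(st|nd|rd|th)\b", r"\1", text.strip())
    for pattern in PATTERNS:
        try:
            return datetime.strptime(cleaned, pattern)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {text!r}")

def compare_dates(d1: str, d2: str) -> str:
    a, b = parse_date(d1), parse_date(d2)
    if a < b:
        return "Date 1 is before Date 2"
    if a > b:
        return "Date 1 is after Date 2"
    return "Date 1 is the same as Date 2"
```

Even this sketch misses formats like "October 1st '21" (two-digit years), which is exactly the kind of edge case that balloons the coding and testing effort.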

Phonetic Translation and Logic:

My next series of tests evaluated ChatGPT's ability to identify countries based on phonetic spellings in various languages.

Note that phonetically writing words from another language in English, when one isn't trained in this, can be quite tricky.

So, how well would GPT-4 understand what I was attempting to communicate?

Let's find out. Without referring to standard phonetic notations, I simply had a go with Arabic and Spanish.

I gave ChatGPT the following instructions:

Given two country names in different languages but phonetically spelled in English, 1. Identify the country names as they would be called in English 2. Write the country names in English 3. Check whether they are the same country or not. If they are, type "These two country names are the same country."  If not, type "These are not the same country." 4. Write out the population sizes of the country(s) as of 2020

I chose 2020 because I knew the LLM's training data extended only to September 2021.

Here is the training example I gave GPT-4:

United States, Alwulayat al Mutahettda 1. Translate the phonetics and discover the country name (in English, Alwulayat al Mutahettda is Arabic for the United States) 2. US, US 3. These two country names are the same country. 4. The population of the United States in 2020 was approximately 331 million.

I then gave ChatGPT its first test.

Country Names: Germany, Almania

ChatGPT:

Certainly! Let's follow the steps as described in the prompt for the given example:

Translate the phonetics and discover the country name in English (Almania is Arabic for Germany)

Germany, Germany

These two country names are the same country.

The population of Germany in 2020 was approximately 83 million.

This kind of prompt is especially useful for tasks involving translation, comparison, or analysis, as it guides the model through a series of well-defined steps, ensuring that it addresses each part of the task in turn. By breaking down the task into intermediate steps, the prompt can help guide the model's reasoning and ensure that it fully understands the task at hand.

Not only did ChatGPT correctly execute what I had asked, it also explained why a prompt structured in this way can result in the desired output.

My next input:

Country Names: Biritania, Faransa

ChatGPT:

Certainly! Let's work with the given country names and follow the steps as described:

Translate the phonetics and discover the country name in English:

Biritania is likely the phonetic spelling for Britain in certain Arabic dialects.

Faransa is the phonetic spelling for France in Arabic.

Britain, France

These are not the same country.

The population of Britain in 2020 was approximately 66 million, and the population of France in 2020 was approximately 67 million.

Please note that the phonetic translation might vary based on dialects and languages, and the population figures can be further verified with specific data sources.

I continued with a further test:

Country Names: Alyaban, Japan

ChatGPT:

Sure! Here's how the process would work for the given country names:

    Translate the phonetics and discover the country name in English:

    Alyaban is the phonetic spelling for Japan in Arabic.

    Japan is the English name for the country.

    Japan, Japan

    These two country names are the same country.

    The population of Japan in 2020 was approximately 126 million.

Again, please keep in mind that the phonetic translation might vary based on dialects and languages, and the population figures can be verified with specific data sources.

To further validate ChatGPT's reasoning skills, I provided it with an example of a city masquerading as a country. How would it cope? Would it get confused? I gave ChatGPT these two names:

Country Names: Suriya, Riyadh

ChatGPT:

It seems that there may be some confusion in the provided example, as Suriya refers to a country (Syria), while Riyadh is the capital city of Saudi Arabia, not a country. I'll proceed with a breakdown of the example and provide a response accordingly:

    Translate the phonetics and discover the country name in English:

        Suriya is the phonetic spelling for Syria in Arabic.

        Riyadh is the capital city of Saudi Arabia, not a country name.

    Syria, (Not a country name)

    These are not the same country.

    The population of Syria in 2020 was approximately 17.5 million.

Please provide two country names if you would like me to continue with another example, or let me know how else I can assist!

And another example:

India, Alhend

ChatGPT:

Certainly! Here's the breakdown of the example using the country names India and Alhend (Arabic phonetic spelling for India):

Translate the phonetics and discover the country name in English:

    India is already in English.

    Alhend is the phonetic spelling for India in Arabic.

    India, India

    They are the same country.

    These two country names are the same country.

The population of India in 2020 was approximately 1.38 billion.

These examples showcase ChatGPT's language translation skills. Coding something similar would have required:

* A dictionary of phonetic spellings matched to all place names in multiple languages

* Lookup code to match phonetic input to the dictionary

* And again, extensive testing of all paths

So, prompt engineering leverages ChatGPT's existing knowledge versus coding these capabilities from scratch.
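Such a lookup might be sketched as follows, with a tiny hypothetical dictionary and fuzzy matching standing in for a real multilingual gazetteer:

```python
from difflib import get_close_matches
from typing import Optional

# Sketch of the dictionary-lookup approach described above. This mapping is
# a small hypothetical sample; real coverage would need phonetic spellings
# for every place name across many languages.

PHONETIC_TO_ENGLISH = {
    "almania": "Germany", "germany": "Germany",
    "biritania": "Britain", "britain": "Britain",
    "faransa": "France", "france": "France",
    "alyaban": "Japan", "japan": "Japan",
    "alhend": "India", "india": "India",
    "isbania": "Spain", "spain": "Spain",
    "al macsic": "Mexico", "mexico": "Mexico",
}

def identify_country(name: str) -> Optional[str]:
    key = name.strip().lower()
    if key in PHONETIC_TO_ENGLISH:
        return PHONETIC_TO_ENGLISH[key]
    # Fuzzy matching absorbs near-miss spellings such as "Al Maxic".
    matches = get_close_matches(key, list(PHONETIC_TO_ENGLISH), n=1, cutoff=0.75)
    return PHONETIC_TO_ENGLISH[matches[0]] if matches else None

def same_country(a: str, b: str) -> str:
    ca, cb = identify_country(a), identify_country(b)
    if ca is None or cb is None:
        return "Could not identify one or both names."
    if ca == cb:
        return "These two country names are the same country."
    return "These are not the same country."
```

Note how much of ChatGPT's behavior is still missing: the dictionary knows nothing about cities like Riyadh, populations, or graceful follow-up questions.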

Here is another test, this time using Spanish.

Inglaterra, England

ChatGPT:

Certainly! Here's the breakdown of the example using the country names Inglaterra (Spanish phonetic spelling for England) and England:

    Translate the phonetics and discover the country name in English:

    Inglaterra is the Spanish phonetic spelling for England.

    England is already in English.

    England, England

    These two country names are the same country.

    The population of England in 2020 was approximately 56 million.

This example illustrates the process of identifying the English name for a country when given two different phonetic spellings. In this case, both spellings refer to the same nation, and the relevant population information is returned.

And here is the final test.

Isbania, Al Maxic

ChatGPT:

It seems like there may be a confusion with the second country name "Al Maxic." However, I'll attempt to analyze the names provided:

Translate the phonetics and discover the country name in English:

"Isbania" might be a phonetic representation of Spain.

"Al Maxic" doesn't correspond to any recognizable country name in standard Arabic or other common languages. It's possible that it might be a misspelling or misunderstanding of a country name.

Spain, Unknown

Without the correct identification of the second country, we cannot determine if they are the same or different.

Since we couldn't identify the second country, we can't provide a definitive statement.

The population of Spain in 2020 was approximately 47 million. The population of the second country is unknown.

Please verify the second country name or provide more context, and I'll be happy to continue the analysis!

To proceed, I gave ChatGPT only the country name it hadn't understood. I wanted to see whether it would remember the context and continue with only the corrected information.

Al Macsic

ChatGPT:

Certainly! Let's analyze the country names "Isbania" and "Al Macsic":

    Translate the phonetics and discover the country name in English:

    "Isbania" is a phonetic representation of Spain.

    "Al Macsic" is a phonetic representation of Mexico in Arabic.

    Spain, Mexico

    "These are not the same country."

The population of Spain in 2020 was approximately 47 million, and the population of Mexico in 2020 was approximately 128 million.

So, the two country names represent different countries, Spain and Mexico, with their respective populations.

I felt that those 45 minutes putting GPT-4 through some of its paces were quite worthwhile.

To summarize: while limited in scope, these tests yielded a few clear takeaways:

* ChatGPT can follow multi-step reasoning given clear structure and examples.

* Constraints and well-defined tasks lead to more accurate responses versus open-ended prompts.

* ChatGPT's pattern recognition allows it to infer new results from small samples.

* Language translation is a noticeable strength.

* Trickier tests, however, may expose weaknesses in unbounded reasoning.

* Prompt engineering leverages ChatGPT's knowledge versus manual coding.

Although current LLMs have apparent limitations, these tests demonstrated impressive logical progression for certain types of prompts. As models rapidly improve, I expect their inferential expertise to grow exponentially.

For now, "quizzing" ChatGPT is a stimulating way to track AI advances firsthand. This fun activity may evolve into a battle of wits with artificial intelligence!

Here are some ideas we could explore further:

* Word problems requiring multiple mathematical operations

* Riddles and lateral thinking puzzles

* Dialogue with a logical progression

* Legal and ethical logic challenges

Understanding the boundaries of ChatGPT's reasoning today helps illuminate the path forward.

I welcome your ideas for creative prompts that test LLM capabilities in fun and informative ways!

Having explored this exercise, do you think I could have improved on my training examples for these particular use cases?

Do you agree that LLMs and prompt engineering can significantly reduce coding and testing effort?

Can you think of other examples where prompt engineering can replace programming?




This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit digitaltransformationpost.substack.com