OpenAI has just (Tuesday 14/3-2023 at 18.00 CET) released GPT-4, the new version of the model behind ChatGPT, and once again it pushes the boundaries of what is possible for language models.

At Viden.ai, we have gained access to GPT-4 in ChatGPT, and I have tested it a lot myself. However, since we only have 100 queries per 4 hours so far, part of the article is based on other people's experiences and tests.

In this article, you can learn about the changes in GPT-4 and what they mean for teaching. We will soon delve deeper into the new version and its opportunities and challenges.

Improvements in many parameters

The new version of GPT has been tested on simulations of exams designed for humans, and interestingly, it scores considerably higher than GPT-3.5. The numbers show improvements in mathematics (SAT Math), law (LSAT), evidence-based reading and writing (SAT Evidence-Based Reading & Writing), medicine (MKSAP), chemistry (AP Chemistry), and many more. The full overview can be viewed on the OpenAI page about GPT-4.

GPT-4 can also handle much longer texts than GPT-3's 2048 tokens (about 1500 words). OpenAI has announced that GPT-4 can work with as many as 32,000 tokens (about 25,000 words)! This enables it to find content, write summaries, analyze very long texts, and write much longer texts. OpenAI has also tested the new version in a wide range of languages, and it should have become significantly better for smaller languages. They claim that in many of these languages it surpasses GPT-3.5's English-language performance.
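
To get a feel for the token figures above, here is a minimal sketch that converts a token limit into an approximate word count. The ratio of 0.75 words per token is an assumed rule of thumb for English text, not a figure from this article, and the function name `estimate_words` is purely illustrative. Real tokenizers give different ratios for different languages and text types.

```python
# Assumed rule of thumb: roughly 0.75 English words per token.
# This is an approximation for illustration, not an exact constant.
WORDS_PER_TOKEN = 0.75


def estimate_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Return a rough word count for a given token limit."""
    return round(tokens * words_per_token)


# Token limits mentioned in the article.
for name, limit in [("GPT-3", 2048), ("GPT-4", 32000)]:
    print(f"{name}: {limit:,} tokens is about {estimate_words(limit):,} words")
```

With the assumed ratio, the sketch gives about 1,536 words for GPT-3 and about 24,000 words for GPT-4, which is in the same range as the rounded figures quoted above.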

The training data underlying GPT-4 has the same cutoff as ChatGPT's: September 2021.

Some of the most criticized things about ChatGPT are its hallucinations, where it makes up facts itself, its uncritical dissemination of harmful content (recipes for bombs, weapons, etc.), general misinformation, issues with private information in the training data, cybersecurity, heavy bias, and many other problems. GPT-4 should be greatly improved in all of these areas. When asked for something illegal, for example, it will respond with the following:

"As an AI developed by OpenAI, I cannot provide guidance, help or support for illegal activities, including hacking, theft or other criminal acts. My purpose is to help and inform users within the framework of the law and ethical guidelines."

Recognizing content in images

A forthcoming feature for ChatGPT is the ability to generate text from images, meaning the language model will be able to "see" what is in an image and then analyze the content of the image. This could pose a challenge in education, especially for tasks involving image analysis.

Here is an example from the OpenAI website:

User:
Answer question I.1.a. Think step-by-step.
Answer from GPT-4:
To answer question I.1.a, we need to express the temperature T(x) in every point of the conductive bar, which is characterized by the abscissa x. We are given that T(0) = T0 and T(L) = Tb.

For a stationary regime, the heat equation for the one-dimensional heat conduction in the bar is:
d²T/dx² = 0

Integrating once, we get:
dT/dx = C₁

Integrating a second time:
T(x) = C₁x + C₂

To determine the constants C₁ and C₂, we use the boundary conditions:
T(0) = T0, which gives C₂ = T0
T(L) = Tb, which gives C₁ = (Tb - T0) / L

Finally, the expression for T(x) is:
T(x) = (Tb - T0) * (x / L) + T0
This represents a linear temperature profile along the conductive bar, with a slope of (Tb - T0) / L.
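
An answer like the one quoted above can be checked with a few lines of code. The following is a minimal sketch that tests whether the linear profile satisfies both boundary conditions and has zero curvature. The numerical values for T0, Tb and L are hypothetical and chosen only for illustration; the original task does not give them here.

```python
import math


def temperature(x: float, T0: float, Tb: float, L: float) -> float:
    """Linear temperature profile T(x) = (Tb - T0) * (x / L) + T0."""
    return (Tb - T0) * (x / L) + T0


# Hypothetical example values (not from the original task).
T0, Tb, L = 20.0, 80.0, 2.0

# Boundary conditions: T(0) = T0 and T(L) = Tb.
assert math.isclose(temperature(0.0, T0, Tb, L), T0)
assert math.isclose(temperature(L, T0, Tb, L), Tb)

# Stationary heat equation: d²T/dx² = 0, checked with a central difference.
h, x = 0.1, 1.0
second_derivative = (
    temperature(x + h, T0, Tb, L)
    - 2 * temperature(x, T0, Tb, L)
    + temperature(x - h, T0, Tb, L)
) / h**2
assert abs(second_derivative) < 1e-6

print("The linear profile satisfies the boundary conditions and d²T/dx² = 0.")
```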

The image capability also makes it possible to have it explain content from our physical textbooks and provide solutions to tasks based solely on images. Another example from OpenAI shows how GPT-4 can read and interpret bar charts:

User:
What is the sum of average daily meat consumption for Georgia and Western Asia? Provide a step-by-step reasoning before providing your answer.
Answer from GPT-4:
To find the sum of average daily meat consumption for Georgia and Western Asia, we will follow these steps:

1. Identify the average daily meat consumption for Georgia.
2. Identify the average daily meat consumption for Western Asia.
3. Add the values from steps 1 and 2.

Step 1: From the provided chart, we see that the average daily meat consumption for Georgia is 79.84 grams per person per day.
Step 2: From the chart, we see that the average daily meat consumption for Western Asia is 69.62 grams per person per day.
Step 3: Add the values from steps 1 and 2.
79.84 g (Georgia) + 69.62 g (Western Asia) = 149.46 g

So, the sum of average daily meat consumption for Georgia and Western Asia is 149.46 grams per person per day.

Furthermore, GPT-4 can interpret the content of an image and understand its context.

User:
What is unusual about this image?
Answer from GPT-4:
The unusual thing about this image is that a man is ironing clothes on an ironing board attached to the roof of a moving taxi.

The feature where GPT-4 recognizes and analyzes images is not publicly available yet but will be introduced later. When it arrives, it will change how we present educational tasks, as any image can then be converted into text.

In its live event, OpenAI also demonstrated how a photo of a hand-drawn sketch can be translated directly into code for a simple website, which is very impressive.

Sources

GPT-4
We’ve created GPT-4, the latest milestone in OpenAI’s effort in scaling up deep learning. GPT-4 is a large multimodal model (accepting image and text inputs, emitting text outputs) that, while less capable than humans in many real-world scenarios, exhibits human-level performance on various professi…

https://cdn.openai.com/papers/gpt-4-system-card.pdf

https://cdn.openai.com/papers/gpt-4.pdf