Coming into this, I expected GPT-3 to write clear, grammatically correct responses that were neither conceptually cohesive nor logically consistent. Following this idea, the first thing I analyzed was the logical connections between the topics chosen in GPT-3's responses.
Overall, I think GPT-3 performs best at the Robin-Hanson-esque task 2, followed by task 1, then the historical-accident task 3. While most of the responses for task 2 were logically inconsistent or nonsensical, a few were consistent and even a little insightful. The three I think are best, in ascending order, are:
We pretend that the education system is about preparing students for the future, but it’s more about preparing them for standardized tests. If we cared about student success, we would focus more on experiential learning and critical thinking skills.
We pretend that the beauty industry is about helping people feel good about themselves, but it’s more about promoting unrealistic beauty standards. If we cared about self-esteem, we would focus more on inner beauty and self-acceptance.
We pretend that the economy is about providing for people’s needs, but it’s more about maximizing profits for corporations. If we cared about people’s well-being, we would prioritize a more equitable distribution of wealth and resources.
Each of these responses has a logical base point, but the lack of further specificity leads to uninsightful answers. The response on the education system, while common knowledge and not novel, is somewhat insightful in mentioning critical thinking. The rest of the responses were quite poor, the notable failure modes being: choosing an alternative path (A) unrelated to the purported purpose (Y), e.g., providing rehabilitation as a way of finding justice; mismatching the subject of the clause experiencing Y with the subject of the clause experiencing Z, e.g., mentioning consumers' goals, then companies' goals; and general vagueness, e.g., "use social media differently".
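To make those variables concrete: the task 2 responses all seem to follow one fixed template, which I'd reconstruct roughly as in the Python sketch below. This is inferred from the three quoted responses, not from the actual prompt (which may have been worded differently), and the slot names X, Y, Z, and A are my own labels.

```python
# Hypothetical reconstruction of the task 2 ("Robin-Hanson-esque") template,
# inferred from the quoted responses above; the real prompt may differ.
TEMPLATE = (
    "We pretend that {X} is about {Y}, but it's more about {Z}. "
    "If we cared about {Y}, we would {A}."
)

# The education-system response, decomposed into the template's slots.
# (GPT-3's actual output paraphrases the second {Y} slot as "student
# success" rather than repeating it verbatim.)
example = TEMPLATE.format(
    X="the education system",
    Y="preparing students for the future",
    Z="preparing them for standardized tests",
    A="focus more on experiential learning and critical thinking skills",
)
print(example)
```

Read this way, the failure modes above are slot-level errors: an A that doesn't serve Y, or a Y and Z that attach to different subjects.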
Task 1 seems hit-and-miss. For example:
If you’ve never had a disagreement with a friend, you’re not expressing your opinions honestly.
If you’ve never had a flat tire, you’re not driving enough.
Some are good but obvious, some novel but useless, and others nonsensical.
For task 3, the main issue seems to be a lack of specificity: while the connections between the chosen topics do exist, the broader point each response makes is weak. The distinction between culture and non-culture also seems lost on GPT-3; making "data-driven decisions" isn't a phenomenon unique to any one culture.
Overall, I was surprised by GPT-3's logically consistent responses. Then again, given enough tries, monkeys on typewriters… GPT-3 still doesn't have any conceptual understanding of words, as exemplified by the incoherent content behind its reasonably clear grammatical form.
These responses from GPT-3 are impressive in the sense that many people would find a number of them insightful on a first reading. It's unclear whether it could come up with completely original ideas, but with further advances, that could soon be a possibility.