- Wharton professor Christian Terwiesch tested ChatGPT with questions from his final exam.
- Terwiesch found that ChatGPT could answer basic questions well but sometimes made “surprising mistakes” in simple calculations.
- Terwiesch concluded that ChatGPT might earn a B or B- on his final exam.
If ChatGPT were enrolled in Wharton’s MBA program, what kind of student would it be?
A B or B- candidate, according to Christian Terwiesch, a Wharton professor who recently tested the technology with questions from his final exam in operations management — once a required class for all MBA students — and published his findings.
ChatGPT, an AI chatbot that generates original text in response to user prompts, has stirred up considerable commotion in the education system since it was released in November 2022. The bot has been used to generate essays, which has led to growing concerns among teachers about plagiarism. In January, the New York City Department of Education announced a ban on ChatGPT across its school system.
According to Terwiesch, ChatGPT does an “amazing job” answering basic operations questions based on case studies: focused examinations of a person, group, or company, and the holy grail of business school education.
For example, Terwiesch posed the following question to ChatGPT:
World-Antique is a new venture under development by two graduates of a New England business school. The value proposition of the venture is simple: purchase antique paintings at yard sales and auctions and then sell them at quadruple the purchase price via a simple eCommerce site (i.e., a painting that is purchased for $100 would be sold at $400).
Based on an analysis of similar retailers, the founders expect to turn their inventory 1.5 times per year.
To achieve an annual revenue of $3,000,000, how much inventory expressed in $’s should the founders expect their operation to require?
ChatGPT answered:
“To achieve an annual revenue of $3,000,000, the founders need to sell $3,000,000 / 4 = $750,000 worth of paintings per year.
Since they expect to turn their inventory 1.5 times per year, they need to have $750,000 / 1.5 = $500,000 worth of inventory at any given time.
Therefore, the founders should expect their operation to require $500,000 worth of inventory.”
Terwiesch deemed this “short and sweet” and worthy of an A+.
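The arithmetic in that answer can be checked with a short sketch. The function and parameter names below are illustrative and do not come from Terwiesch's paper. The sketch also assumes, as the quoted answer does, that inventory turns are measured against the cost of the paintings sold rather than against revenue.

```python
def required_inventory(annual_revenue, price_multiple, inventory_turns):
    """Return the average inventory, in dollars at cost, needed to support the revenue.

    Assumption: turns = annual cost of goods sold / average inventory.
    All names here are hypothetical, chosen only for this illustration.
    """
    # Paintings sell at `price_multiple` times their purchase price,
    # so the cost of the goods sold is revenue divided by that multiple.
    cost_of_goods_sold = annual_revenue / price_multiple
    # Each dollar of inventory is turned `inventory_turns` times per year.
    return cost_of_goods_sold / inventory_turns


inventory = required_inventory(3_000_000, 4, 1.5)
print(f"${inventory:,.0f}")  # $500,000
```

With the figures from the exam question, the sketch reproduces both the $750,000 intermediate value and the $500,000 result in ChatGPT's answer.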
In other instances, though, ChatGPT made mistakes in calculations that Terwiesch said required only 6th-grade-level math. He noted that it also struggled with more complex questions that required an understanding of how multiple inputs and outputs worked together.
But when given a hint on a question, ChatGPT quickly and correctly modified its answer. Based on that, Terwiesch noted that having “a ‘human in the loop’ can be very valuable.”
Since ChatGPT has become known for generating novel content, Terwiesch also decided to see how well it could do his job — writing its own exam questions. The questions it generated would probably require substantial editing, Terwiesch noted, but he also found them humorous and said he’d rely on ChatGPT for preparing future exams.
Amid the frenzy over how ChatGPT might change the way we learn, Terwiesch offers a heartening case for its use as a tool to help humans.
He wrote, “We have many reasons to believe that the technology is getting better over time. But, we are still far from an A+ for complex problems and we still need a human in the loop.”