ChatGPT-5 Test Series - Part 3: Data Analysis in Practice
🕓 Read Time: ~5 minutes
What do we mean by “data analysis” with AI?
For most of us, data analysis means uploading an Excel or CSV file and asking the AI to surface insights. Think survey results, campaign performance, budgets, or customer feedback.
With GPT-5, the larger context windows and stronger reasoning models make this far more powerful than before. But it’s not magic. And unless you prepare your data, you’ll likely end up frustrated.
The good news
I tested GPT-5 with two real datasets:
- A badly formatted budget file (columns mislabeled, rows misaligned, empty spaces, merged cells).
- A large set of marketing campaign data from the last few months.
Once I prepared the files, used the thinking model, and enabled the code interpreter (called “data analyst” in some tools), the results were excellent. GPT-5 handled the complexity better than GPT-4o ever could, and the insights were genuinely useful.
⚠️ And here’s the straight talk
Working with small, clean files? Easy. Upload, ask, and you’ll get something decent back.
Working with larger, messy files? That’s where things go wrong, and it’s why people complain about poor outputs. The truth is that GPT-5 is still a large language model, not a database. What it does isn’t fancy at all: it converts your data into text it can analyze. With large or messy files, context gets lost along the way, and GPT-5 will make up the missing bits, just like its predecessors.
So unless you set it up properly, it will sometimes “invent” numbers or insights. That’s not GPT-5 being broken. It’s messy data, poor prompts, the wrong workflow, or simply your expectations that need aligning.
How to get good results (lessons from my tests)
- Prepare your data.
  - Always put column headers in row 1.
  - Remove empty rows, merged cells, manual formatting.
  - Save as CSV (works better than Excel for larger files).
- Label your columns clearly.
  - Ensure your columns add relevant context.
  - E.g. instead of “Q1, Q2, Q3” label them “Q1_ParticipantName”, “Q2_SatisfactionScore”, etc. Or instead of “Revenue” put “Revenue_in_k_USD”.
  - Use unambiguous date formats (e.g., YYYY-MM-DD).
- Prompt specifically.
  - Don’t just say “analyze this file.”
  - Validate that it accurately reads the file.
  - Specify your goal(s).
  - Point it to the exact columns you want insights from.
- Turn on the code interpreter.
  - Without it, GPT-5 works like a text engine and will make mistakes.
  - With it, you’ll see it generate Python code behind the scenes. Slower, but much more accurate.
- Validate the output.
  - Don’t assume the first result is final.
  - Check the numbers, and iterate until the analysis makes sense.
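If you’d rather see the preparation and validation steps as code, here’s a minimal sketch in plain Python. It drops empty rows, tidies the header labels, rewrites dates as YYYY-MM-DD, and then recomputes a column total so you can compare it with whatever number the AI reported. Everything specific in it is an assumption for illustration: the sample data, the column names (Date, Campaign, Spend_in_USD), the day-first date format, and the 1% tolerance are made up, not taken from the test files described above.

```python
import csv
import io
from datetime import datetime


def clean_rows(raw_csv, date_column, date_format):
    """Return (header, rows) with blank rows dropped and dates rewritten as YYYY-MM-DD."""
    reader = csv.reader(io.StringIO(raw_csv))
    # Row 1 is treated as the header; strip stray spaces from each label.
    header = [name.strip() for name in next(reader)]
    date_idx = header.index(date_column)
    rows = []
    for row in reader:
        # Skip rows that are empty or contain only blank cells.
        if not any(cell.strip() for cell in row):
            continue
        row = [cell.strip() for cell in row]
        # Rewrite the date in the unambiguous YYYY-MM-DD form.
        parsed = datetime.strptime(row[date_idx], date_format)
        row[date_idx] = parsed.strftime("%Y-%m-%d")
        rows.append(row)
    return header, rows


def column_total(header, rows, column):
    """Sum one numeric column, looked up by its header label."""
    idx = header.index(column)
    return sum(float(row[idx]) for row in rows)


def matches_reported(actual, reported, tolerance=0.01):
    """True if the reported figure is within a relative tolerance of the recomputed one."""
    return abs(actual - reported) <= tolerance * abs(actual)


# Hypothetical sample: padded header, day-first dates, two blank rows.
sample = (
    " Date ,Campaign,Spend_in_USD\n"
    "03/01/2025,Spring_Launch,1200.50\n"
    ",,\n"
    "\n"
    "15/01/2025,Newsletter,300\n"
)

header, rows = clean_rows(sample, "Date", "%d/%m/%Y")
total = column_total(header, rows, "Spend_in_USD")

print(header)
print(rows)
print(total)
# Compare the recomputed total with two made-up "AI reported" figures.
print(matches_reported(total, 1500.50))
print(matches_reported(total, 1800.00))
```

The point of the last two lines is the habit, not the threshold: recompute any headline number from the raw file, and treat a mismatch as a reason to re-prompt rather than a figure to publish.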
🔍 My take
Overall, I was very happy with GPT-5’s performance on my real-world data. With clean inputs, clear prompts, and the code interpreter running, it feels like having a data analyst on call.
In fact, I was so impressed that I’m upgrading my campaign insights workflows: I’m building a team of CustomGPTs and am planning to connect them with well-selected automations. The analyses I run over and over are being folded into smart workflows that save time and enable me to finally use the wealth of insight hidden away in all that data.
That’s a real upgrade, if you ask me.
Key Takeaway
GPT-5 can be a powerful data partner, but only if you prepare your data, enable the right AI features, and stay specific in your prompts. Garbage in, garbage out still applies, but with the right setup, it’s pretty good.
Till next time, stay data-smart,
Elena