I pass on the abstract of a multiauthor work from MIT. Undergrads, EEG caps on, wrote three 20-minute essays. Those who leaned on GPT-4o showed weaker alpha-beta coupling, produced eerily similar prose, and later failed to quote their own sentences. The next MindBlog post relays a commentary on and critique of this work.
With today's wide adoption of LLM products like ChatGPT from OpenAI, humans and
businesses engage and use LLMs on a daily basis. Like any other tool, it carries its own set of
advantages and limitations. This study focuses on finding out the cognitive cost of using an LLM
in the educational context of writing an essay.
We assigned participants to three groups: LLM group, Search Engine group, Brain-only group,
where each participant used a designated tool (or no tool in the latter) to write an essay. We
conducted 3 sessions with the same group assignment for each participant. In the 4th session
we asked LLM group participants to use no tools (we refer to them as LLM-to-Brain), and the
Brain-only group participants were asked to use LLM (Brain-to-LLM). We recruited a total of 54
participants for Sessions 1, 2, 3, and 18 participants among them completed session 4.
We used electroencephalography (EEG) to record participants' brain activity in order to assess
their cognitive engagement and cognitive load, and to gain a deeper understanding of neural
activations during the essay writing task. We performed NLP analysis, and we interviewed each
participant after each session. We performed scoring with the help from the human teachers
and an AI judge (a specially built AI agent).
We discovered a consistent homogeneity across the Named Entities Recognition (NERs),
n-grams, ontology of topics within each group. EEG analysis presented robust evidence that
LLM, Search Engine and Brain-only groups had significantly different neural connectivity
patterns, reflecting divergent cognitive strategies. Brain connectivity systematically scaled down
with the amount of external support: the Brain‑only group exhibited the strongest, widest‑ranging
networks, Search Engine group showed intermediate engagement, and LLM assistance elicited
the weakest overall coupling. In session 4, LLM-to-Brain participants showed weaker neural
connectivity and under-engagement of alpha and beta networks; and the Brain-to-LLM
participants demonstrated higher memory recall, and re‑engagement of widespread
occipito-parietal and prefrontal nodes, likely supporting the visual processing, similar to the one
frequently perceived in the Search Engine group. The reported ownership of LLM group's
essays in the interviews was low. The Search Engine group had strong ownership, but lesser
than the Brain-only group. The LLM group also fell behind in their ability to quote from the
essays they wrote just minutes prior.
As the educational impact of LLM use only begins to settle with the general population, in this
study we demonstrate the pressing matter of a likely decrease in learning skills based on the
results of our study. The use of LLM had a measurable impact on participants, and while the
benefits were initially apparent, as we demonstrated over the course of 4 months, the LLM
group's participants performed worse than their counterparts in the Brain-only group at all levels:
neural, linguistic, scoring.
We hope this study serves as a preliminary guide to understanding the cognitive and practical
impacts of AI on learning environments.