Thesis Project — Neapolis University Pafos

Exploring the Temporal Lag
Between Gaze & Finger as a
Reading Difficulty Biomarker

Using XGBoost, MLP neural networks, and SHAP explainability to quantify cognitive load during reading — from eye-tracking and finger-tracking signals.

Start Analysis

About the Project

Analyze

Upload your eye-tracking & finger-tracking dataset to run the full ML pipeline.

Required Dataset Columns

Your .xlsx or .csv file must include the following columns. Column names must match exactly.

Text Identifiers
gid, sid, lid, tid, idDoc
Linguistic Features
token, lemma, ortho, SyllablesCount, POS, len, freq
Session & Participant
idDevice, idSession, idUser, readingType, Education Level
WAIS Cognitive Scores
WAIS Coding, WAIS Digit Span Ascending, WAIS Vocabulary
Eye-Tracking Metrics
FFD, FPD, TRT, RPD, fixNum, isFix, isReg, reFix
Finger-Tracking Metrics
dt, coverage
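Before uploading, the header can be checked programmatically. A minimal sketch in Python; the column names are taken from the list above, and the grouping is purely for readability:

```python
# Required column names, grouped as in the list above.
REQUIRED_COLUMNS = {
    "gid", "sid", "lid", "tid", "idDoc",
    "token", "lemma", "ortho", "SyllablesCount", "POS", "len", "freq",
    "idDevice", "idSession", "idUser", "readingType", "Education Level",
    "WAIS Coding", "WAIS Digit Span Ascending", "WAIS Vocabulary",
    "FFD", "FPD", "TRT", "RPD", "fixNum", "isFix", "isReg", "reFix",
    "dt", "coverage",
}

def missing_columns(header):
    """Return the required columns absent from an uploaded file's header row."""
    return sorted(REQUIRED_COLUMNS - set(header))
```

A file passes only when `missing_columns(...)` returns an empty list; since names must match exactly, the check is deliberately case-sensitive.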

Drop your .xlsx or .csv file here

or click to browse

FAQ

What is the Temporal Lag?

The temporal lag measures the mismatch between where your eyes focus and where your finger points during reading. A larger lag suggests greater cognitive difficulty processing that word.

What is lag_proxy?

Since the dataset lacks direct timestamp alignment between the eye-tracking (ET) and finger-tracking (FT) sessions, lag_proxy = TRT_normalized − coverage serves as an indirect measure of the temporal lag.
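Under that definition, lag_proxy can be computed per token. A minimal sketch, assuming min-max normalization of TRT over the session so that both terms lie in [0, 1] (the pipeline's exact normalization may differ):

```python
def lag_proxy(trt_values, coverage_values):
    """Compute lag_proxy = TRT_normalized - coverage for each token.

    TRT is min-max normalized over the session (an assumption; the
    pipeline's exact normalization may differ) so that both terms
    are on a comparable [0, 1] scale before subtracting.
    """
    lo, hi = min(trt_values), max(trt_values)
    span = (hi - lo) or 1.0  # guard against division by zero on constant TRT
    return [(t - lo) / span - c for t, c in zip(trt_values, coverage_values)]
```

A token whose finger coverage keeps pace with its (normalized) reading time yields a lag_proxy near zero; a long reading time with little finger coverage pushes it toward +1.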

Why XGBoost over deep learning?

XGBoost excels on structured/tabular data, trains faster, and offers native feature importance. Combined with SHAP, it provides full explainability — essential for scientific research.

What dataset format is required?

Upload an .xlsx or .csv file following the ΤΑΧΙΤΑΡΙ corpus structure with columns for eye-tracking metrics, finger-tracking coverage, and linguistic features.

What does SHAP show?

SHAP quantifies each feature's contribution to every prediction. The summary plot ranks features by their average impact on model output.
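SHAP values are additive: for any single prediction, the model's base value plus the per-feature contributions reconstructs the model output exactly. A toy illustration of that property with made-up numbers (not values from this pipeline):

```python
# Illustrative numbers only -- not outputs of the actual model.
base_value = 0.42  # the model's average prediction over the training set
shap_values = {"TRT": 0.15, "coverage": -0.07, "freq": 0.03}

# SHAP's additivity property: prediction = base value + sum of contributions.
prediction = base_value + sum(shap_values.values())
```

The summary plot is essentially this decomposition aggregated over all predictions, with features ranked by their mean absolute contribution.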

Contact

Student Giorgos Sourailidis 1220703523
Supervisor Dr. Tasos Antoniadis
University Neapolis University Pafos, Department of Computer Science