Evaluating the Efficacy of Regression and Machine Learning Models to Predict Prehistoric Land-use Patterns

Summary

This is an abstract from the "Novel Statistical Techniques in Archaeology II (QUANTARCH II)" session, at the 84th annual meeting of the Society for American Archaeology.

Archaeologists continue to rely on predictive models that suffer from the same errors that have plagued the discipline for decades: small training sets, improper statistical techniques, and vague or only implicit theory. To address these shortcomings, we develop a framework for modeling archaeological site occurrences with machine learning. Drawing on insights from species distribution modeling in ecology, we compare how well four statistical modeling approaches (generalized linear models, generalized additive models, maximum entropy, and random forests) predict Formative Period archaeological site locations in the Grand Staircase-Escalante National Monument. We assess each modeling approach using a threshold-independent measure, the area under the receiver operating characteristic curve (AUC), and a threshold-dependent measure, the True Skill Statistic (TSS). We find that the random forests approach produces the most accurate predictive models, followed by maximum entropy and generalized additive models.
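The two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 threshold are all hypothetical, chosen only to show how a threshold-independent measure (AUC) differs from a threshold-dependent one (TSS, computed here as sensitivity plus specificity minus one).

```python
def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen presence (label 1) scores higher than a randomly chosen absence
    (label 0). Ties count as half a win. No threshold is involved."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def true_skill_statistic(labels, scores, threshold):
    """TSS = sensitivity + specificity - 1 at a chosen threshold.
    A score at or above the threshold is treated as a predicted presence."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1 and predicted == 0:
            fn += 1
        elif y == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


# Hypothetical toy data: 1 = site present, 0 = site absent (or background point).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print("AUC:", auc(labels, scores))
print("TSS at 0.5:", true_skill_statistic(labels, scores, 0.5))
```

Because AUC depends only on how the scores rank presences against absences, it stays the same whatever cutoff is later used to map sites. TSS changes with the threshold, so in practice it is reported at a stated cutoff.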

Cite this Record

Evaluating the Efficacy of Regression and Machine Learning Models to Predict Prehistoric Land-use Patterns. Peter Yaworsky, Kenneth B. Vernon, Simon Brewer, Jerry Spangler, Brian Codding. Presented at The 84th Annual Meeting of the Society for American Archaeology, Albuquerque, NM. 2019 (tDAR id: 452313)

Spatial Coverage

min long: -124.365; min lat: 25.958; max long: -93.428; max lat: 41.902

Record Identifiers

Abstract Id(s): 23838