Evaluating the Efficacy of Regression and Machine Learning Models to Predict Prehistoric Land-use Patterns

Peter Yaworsky; Kenneth B. Vernon; Simon Brewer; Jerry Spangler; Brian Codding

Year: 2019

Summary

This is an abstract from the "Novel Statistical Techniques in Archaeology II (QUANTARCH II)" session, at the 84th annual meeting of the Society for American Archaeology.

Archaeologists continue to rely on predictive models that suffer from the same errors that have plagued the discipline for decades: small training sets, improper statistical techniques, and vague or only implicit theory. To address these shortcomings, we develop a framework for modeling archaeological site occurrences with machine learning. Drawing on insights from species distribution modeling in ecology, we evaluate the predictive power of four statistical modeling approaches—generalized linear models, generalized additive models, maximum entropy, and random forests—to predict Formative Period archaeological site locations in the Grand Staircase-Escalante National Monument. We assess each modeling approach using a threshold-independent measure, the Area Under the Curve, and a threshold-dependent measure, the True Skill Statistic. We find that the random forests approach produces the most accurate predictive models, followed by maximum entropy and generalized additive models.
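The two evaluation measures named in the abstract can be illustrated with a small sketch. This is not the authors' code: the function names `auc` and `tss`, the toy labels and scores, and the 0.5 threshold are all assumptions made for illustration. It assumes the Area Under the Curve refers to the area under the receiver operating characteristic curve, computed here in its rank-based form, and that the True Skill Statistic is sensitivity plus specificity minus one.

```python
def auc(labels, scores):
    """Rank-based AUC: the probability that a randomly chosen presence
    (label 1) receives a higher score than a randomly chosen absence
    (label 0), counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def tss(labels, scores, threshold):
    """True Skill Statistic at one threshold:
    sensitivity + specificity - 1."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


if __name__ == "__main__":
    # Hypothetical data: 1 = site present, 0 = site absent (background).
    toy_labels = [1, 1, 1, 0, 0, 0]
    # Hypothetical predicted suitability scores from any of the models.
    toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
    print("AUC:", auc(toy_labels, toy_scores))
    print("TSS at 0.5:", tss(toy_labels, toy_scores, 0.5))
```

The sketch shows why the first measure is called threshold-independent and the second threshold-dependent: `auc` uses only the ordering of the scores, while `tss` changes with the cutoff used to turn scores into presence or absence predictions.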

Cite this Record

Evaluating the Efficacy of Regression and Machine Learning Models to Predict Prehistoric Land-use Patterns. Peter Yaworsky, Kenneth B. Vernon, Simon Brewer, Jerry Spangler, Brian Codding. Presented at The 84th Annual Meeting of the Society for American Archaeology, Albuquerque, NM. 2019 (tDAR id: 452313)

Keywords

General
Digital Archaeology: Simulation and Modeling • Formative • Quantitative and Spatial Analysis

Geographic Keywords
North America: Southwest United States

Spatial Coverage

min long: -124.365; min lat: 25.958; max long: -93.428; max lat: 41.902

Record Identifiers

Abstract Id(s): 23838
