Christian Bias

Benchmarks

Holding AI Accountable

Quis custodiet ipsos custodes?

At Faith Copilot, we aim to help the Christian community better understand where Large Language Models (LLMs) perform well and where they fall short. As artificial intelligence tools become more widely used, it's important to evaluate their accuracy and reliability, especially when applied to matters of faith.

Westminster Standard Catechism for Kids

We put the top LLMs to the test with doctrine-related questions.

Read

Simple Bible Trivia Questions

We asked several LLMs 44 Bible questions that can be answered with one or two words and evaluated the responses.

Read

LLM Volatility Analysis

We asked five LLMs the same questions repeatedly and measured how much their responses varied.

Read

Want more?

To contribute your own benchmarks, join our community.

Join Community