Simple Bible Trivia Questions
Christian Bias Benchmarks
We asked five different Large Language Models (LLMs) 44 Bible questions that can each be answered in one or two words. The questions were designed to be non-controversial, so it's safe to say that all Christian denominations would agree on the answers.
For example:
- Who built the ark?
- What did Jesus turn water into?
- Who was known for his strength in the Bible?
- etc.
We then evaluated the responses with openai/gpt-4o-mini. The results of our analysis are below.
Google Colab Notebooks
- LLM Evaluation - Simple Bible Trivia - Step 1
- LLM Evaluation - Simple Bible Trivia - Step 2
Evaluation Method
We used the following method to rate answers according to their accuracy, helpfulness, specificity and clarity.
- Accuracy (1.5/1.5): The response is entirely accurate, with no errors.
- Helpfulness (1.5/1.5): The response is highly useful and provides a clear answer to the user's question.
- Specificity (1/1): The response is detailed and addresses the user's question sufficiently.
- Clarity (1/1): The response is clear and easy to understand.
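The rubric above can be sketched as a small scoring helper. This is a minimal illustration, not the code from the notebooks: the names `MAX_POINTS` and `total_score` are hypothetical, and it assumes the overall score for an answer is the plain sum of the four criteria, which is consistent with the maximum of 5 points.

```python
# Maximum points per criterion, taken from the rubric above.
MAX_POINTS = {
    "accuracy": 1.5,
    "helpfulness": 1.5,
    "specificity": 1.0,
    "clarity": 1.0,
}


def total_score(scores: dict) -> float:
    """Validate per-criterion scores and return their sum (assumed aggregation)."""
    if set(scores) != set(MAX_POINTS):
        raise ValueError(f"expected criteria {sorted(MAX_POINTS)}, got {sorted(scores)}")
    for name, value in scores.items():
        if not 0 <= value <= MAX_POINTS[name]:
            raise ValueError(f"{name}={value} is outside 0..{MAX_POINTS[name]}")
    return sum(scores.values())


# A response that earns full marks on every criterion reaches the 5-point ceiling.
perfect = total_score(dict(MAX_POINTS))
print(perfect)
```

Keeping the maximums in one dictionary makes it easy to reject an out-of-range rating from the evaluator model before it is averaged into a model's results.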
Scoring Metrics by Model
Model Name | Score Accuracy | Score Helpfulness | Score Specificity | Score Clarity |
---|---|---|---|---|
anthropic/claude-3.5-sonnet | 1.397727 | 1.397727 | 0.931818 | 1.000000 |
google/gemma-2-9b-it | 1.272727 | 1.284091 | 0.852273 | 0.909091 |
meta-llama/llama-3.1-8b-instruct | 1.397727 | 1.386364 | 0.931818 | 0.988636 |
mistralai/mistral-nemo | 1.272727 | 1.295455 | 0.863636 | 0.977273 |
openai/gpt-4o-mini | 1.431818 | 1.431818 | 0.954545 | 0.977273 |
Final Scores by Model
Model Name | Score Final |
---|---|
openai/gpt-4o-mini | 4.795455 |
anthropic/claude-3.5-sonnet | 4.727273 |
meta-llama/llama-3.1-8b-instruct | 4.704545 |
mistralai/mistral-nemo | 4.409091 |
google/gemma-2-9b-it | 4.318182 |
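The final scores appear to be the sum of each model's four per-metric averages. The short check below recomputes them from the published tables and ranks the models. The variable and function names are ours, and the summation rule is inferred from the numbers rather than stated by the notebooks.

```python
# Per-metric averages copied from the "Scoring Metrics by Model" table:
# (accuracy, helpfulness, specificity, clarity)
METRICS = {
    "anthropic/claude-3.5-sonnet": (1.397727, 1.397727, 0.931818, 1.000000),
    "google/gemma-2-9b-it": (1.272727, 1.284091, 0.852273, 0.909091),
    "meta-llama/llama-3.1-8b-instruct": (1.397727, 1.386364, 0.931818, 0.988636),
    "mistralai/mistral-nemo": (1.272727, 1.295455, 0.863636, 0.977273),
    "openai/gpt-4o-mini": (1.431818, 1.431818, 0.954545, 0.977273),
}

# Final scores copied from the "Final Scores by Model" table.
PUBLISHED = {
    "openai/gpt-4o-mini": 4.795455,
    "anthropic/claude-3.5-sonnet": 4.727273,
    "meta-llama/llama-3.1-8b-instruct": 4.704545,
    "mistralai/mistral-nemo": 4.409091,
    "google/gemma-2-9b-it": 4.318182,
}


def final_scores(metrics):
    """Sum the four per-metric averages for each model (inferred aggregation)."""
    return {model: sum(values) for model, values in metrics.items()}


computed = final_scores(METRICS)
ranking = sorted(computed, key=computed.get, reverse=True)

for model in ranking:
    # The difference from the published value should be rounding noise only.
    diff = abs(computed[model] - PUBLISHED[model])
    print(f"{model}: computed {computed[model]:.6f}, published {PUBLISHED[model]:.6f}, diff {diff:.1e}")
```

The recomputed sums agree with the published final scores to within the rounding of the six-decimal inputs, and the ranking matches the table order.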
Conclusions
As expected, all models scored well (above 4 out of 5) on these simple questions, with openai/gpt-4o-mini scoring the highest and google/gemma-2-9b-it the lowest.
These results suggest that LLMs in general answer simple Bible-related questions accurately.
Do you have thoughts on this study or suggestions for further research? Feel free to share your comments below or connect with us on our Discord Community Chat.