TrustLLM
TrustLLM evaluates comprehensive trustworthiness across 6 dimensions: truthfulness, safety, fairness, robustness, privacy, and machine ethics. It provides a holistic trust profile rather than a single score.
Models Tested: 0
Average Score: 0.0
Scale Range: 0–100
Weight: 0.8x
How It Works
Models are evaluated on each trust dimension independently through targeted tests. Truthfulness tests check factual accuracy, safety tests probe harmful outputs, fairness tests measure demographic bias, and privacy tests check for data leakage.
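The per-dimension evaluation described above can be sketched as follows. This is an illustrative outline, not TrustLLM's actual code: the dimension names come from the text, while the `evaluate_dimension` and `trust_profile` helpers, the test-case format, and the pass/fail scoring are assumptions chosen to show the shape of a dimension-independent trust profile.

```python
# Hypothetical sketch of TrustLLM-style per-dimension evaluation.
# Dimension names are from the benchmark description; the test-case
# structure and scoring rule are illustrative assumptions.

DIMENSIONS = [
    "truthfulness", "safety", "fairness",
    "robustness", "privacy", "machine_ethics",
]

def evaluate_dimension(model, test_cases):
    """Score one dimension as the percentage of targeted tests passed (0-100)."""
    passed = sum(1 for case in test_cases if case["check"](model(case["prompt"])))
    return 100.0 * passed / len(test_cases)

def trust_profile(model, suites):
    """Return a per-dimension score dict -- a profile, not a single number."""
    return {dim: evaluate_dimension(model, suites[dim]) for dim in DIMENSIONS}
```

Keeping the result as a dict of dimension scores, rather than collapsing it to one aggregate, matches the profile-over-score design the section describes.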
Why It Matters
Trust is multidimensional — a model can be truthful but unfair, or safe but not robust. TrustLLM provides the first holistic view of model trustworthiness across all the dimensions that matter for deployment.
Limitations
Aggregating 6 dimensions into a single trust score is inherently reductive. Some dimensions (like fairness) are culturally dependent. Trust requirements vary dramatically by use case.