Statistical Reliability Measures

The MCP server returns statistical measures with every behavioral query so you can judge how reliable the data is. Queries return 300 records by default for performance, so most results are samples of a larger population, and sampling introduces variability that limits how far the results can be trusted.

Response Structure

Every behavioral response includes a statistical_reliability object:

{
  "success": true,
  "data": [...],
  "count": 300,
  "statistical_reliability": {
    "sampling_statistics": {
      "sampling_ratio": 0.15,
      "total_population": 2000,
      "sample_size": 300,
      "user_count": 12,
      "average_user_representation": 0.85,
      "representation_quality": "High"
    },
    "sample_adequacy": {
      "required_sample_size": 322,
      "actual_sample_size": 300,
      "adequacy_ratio": 0.93,
      "is_adequate": false,
      "reliability": "Low"
    },
    "confidence_interval": {
      "lower": 0.12,
      "upper": 0.18,
      "margin_of_error": 0.03,
      "confidence_level": 0.95
    }
  }
}

Key Metrics

Sampling Statistics

  • sampling_ratio: Fraction of the total population included in the sample (0.15 = 15%)

  • total_population: Total records matching your query

  • representation_quality: How well sample preserves user distribution ("High", "Medium", "Low")

Sample Adequacy

  • required_sample_size: Minimum sample size needed to reach the target margin of error (5% by default) at 95% confidence

  • reliability: Overall assessment ("High", "Adequate", "Low")

Confidence Interval

  • lower/upper: Range in which the true population value is expected to fall at the stated confidence_level

  • margin_of_error: Half-width of the confidence interval (0.03 = ±3 percentage points)

Reliability Assessment

High Reliability: representation_quality: "High" + reliability: "High" + low margin of error

  • Results are trustworthy and representative

Medium Reliability: Mixed indicators or reliability: "Adequate"

  • Results are usable but note limitations in analysis

Low Reliability: representation_quality: "Low" or reliability: "Low" + high margin of error

  • Avoid drawing conclusions; sample too small or biased
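
The three tiers above can be applied programmatically before any analysis. The following is a minimal sketch, not the server's own logic: the function name `assess_reliability` and the `max_margin` cutoff are hypothetical, since this page does not define what counts as a "low" margin of error, and the sketch assumes that either "Low" label alone is enough to treat the result as low reliability.

```python
def assess_reliability(response, max_margin=0.05):
    """Collapse the documented indicators into "High", "Medium", or "Low".

    max_margin is a hypothetical cutoff for a "low" margin of error;
    the documentation does not specify one.
    """
    stats = response["statistical_reliability"]
    quality = stats["sampling_statistics"]["representation_quality"]
    reliability = stats["sample_adequacy"]["reliability"]
    margin = stats["confidence_interval"]["margin_of_error"]

    # Assumption: a single "Low" label is enough to distrust the sample.
    if quality == "Low" or reliability == "Low":
        return "Low"
    # All three indicators must agree for the top tier.
    if quality == "High" and reliability == "High" and margin <= max_margin:
        return "High"
    # Anything else is a mixed result, including reliability == "Adequate".
    return "Medium"


# Trimmed copy of the sample response shown under Response Structure.
example_response = {
    "statistical_reliability": {
        "sampling_statistics": {"representation_quality": "High"},
        "sample_adequacy": {"reliability": "Low"},
        "confidence_interval": {"margin_of_error": 0.03},
    }
}
tier = assess_reliability(example_response)
```

For the sample response, the "Low" adequacy label outweighs the "High" representation quality, so the result falls in the lowest tier.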

Improving Sample Quality

When reliability is low, ask the MCP server to:

  • Increase max_results: Request 500-1000 records instead of 300

  • Broaden query parameters: Expand date ranges or criteria

  • Check user diversity: Ensure adequate representation across user types

Example: "The sample reliability is low. Can you re-run this query with max_results=800 to get better statistical confidence?"

Technical Implementation

Proportional Sampling Method

The server uses stratified proportional sampling by user to ensure representative results:

  1. User Distribution Analysis: Calculate each user's proportion in the total population

  2. Quota Allocation: Assign sample slots proportionally to maintain user representation

  3. Random Sampling: Randomly select records within each user's quota

  4. Rounding Correction: Distribute remaining slots to users with highest fractional quotas
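
The four steps can be sketched in a few lines of Python. This is an illustration of the technique (proportional allocation with a largest-remainder rounding correction), not the server's source code; the function name `proportional_sample` and the `user_id` field are assumptions made for the example.

```python
import math
import random
from collections import defaultdict


def proportional_sample(records, sample_size, user_key="user_id", seed=None):
    """Draw a sample that preserves each user's share of the records.

    user_key is a hypothetical field name; substitute the real one.
    """
    rng = random.Random(seed)
    total = len(records)
    if total <= sample_size:
        return list(records)  # nothing to sample, return everything

    # Step 1: group records by user to obtain the user distribution.
    by_user = defaultdict(list)
    for record in records:
        by_user[record[user_key]].append(record)

    # Step 2: fractional quota per user, floored to whole slots.
    exact = {u: sample_size * len(rs) / total for u, rs in by_user.items()}
    quota = {u: math.floor(q) for u, q in exact.items()}

    # Step 4: give leftover slots to the largest fractional parts.
    leftover = sample_size - sum(quota.values())
    by_fraction = sorted(exact, key=lambda u: exact[u] - quota[u], reverse=True)
    for user in by_fraction[:leftover]:
        quota[user] += 1

    # Step 3: random selection without replacement inside each quota.
    sample = []
    for user, user_records in by_user.items():
        sample.extend(rng.sample(user_records, quota[user]))
    return sample


# 60% / 30% / 10% split across three users, sampled down to 10 records.
population = (
    [{"user_id": "a", "n": i} for i in range(60)]
    + [{"user_id": "b", "n": i} for i in range(30)]
    + [{"user_id": "c", "n": i} for i in range(10)]
)
sample = proportional_sample(population, 10, seed=1)
```

The quotas are fixed before any random draw, so the per-user counts are deterministic and only the choice of records within each user varies with the seed.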

Statistical Calculations

Sample Adequacy Formula:

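Required Sample Size: n = (Z² × 0.25) / margin_of_error²
With finite population correction: n_adjusted = n / (1 + (n - 1) / N)

Where:
- Z = 1.96 (95% confidence level)
- 0.25 = p × (1 - p) at p = 0.5, the most conservative assumption
- margin_of_error = 0.05 (5% default)
- n = required sample size before the correction
- N = population size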
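
As a worked check, the formula can be evaluated for the population of 2000 used in the sample response. The sketch below assumes the result is rounded to the nearest whole record, which is not stated on this page but matches the 322 shown in the sample; the function name is hypothetical.

```python
def required_sample_size(population, margin_of_error=0.05, z=1.96):
    """Required sample size with finite population correction."""
    # Uncorrected size, using the worst-case variance p * (1 - p) = 0.25.
    n0 = (z ** 2 * 0.25) / margin_of_error ** 2
    # Finite population correction shrinks n0 when the population is small.
    corrected = n0 / (1 + (n0 - 1) / population)
    # Assumption: round to the nearest whole record.
    return round(corrected)


required = required_sample_size(2000)
adequacy_ratio = round(300 / required, 2)
```

With N = 2000 the uncorrected size is about 384, the corrected size rounds to 322, and a 300-record sample gives an adequacy ratio of 0.93, matching the sample_adequacy block in the response above.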

Confidence Interval Calculation:

p = sample_size / population_size
Standard Error = √(p × (1-p) / population_size)
Margin of Error = Z × Standard Error
CI = [p - margin_of_error, p + margin_of_error]
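
The calculation translates directly into code. This sketch follows the formula exactly as written here, including the division by population_size in the standard error (the textbook standard error of a proportion divides by the sample size instead); the function name is an assumption.

```python
import math


def sampling_ratio_interval(sample_size, population_size, z=1.96):
    """Confidence interval around the sampling ratio, per the formula above."""
    p = sample_size / population_size
    standard_error = math.sqrt(p * (1 - p) / population_size)
    margin = z * standard_error
    return {
        "lower": p - margin,
        "upper": p + margin,
        "margin_of_error": margin,
        "confidence_level": 0.95 if z == 1.96 else None,  # only 1.96 is mapped
    }


interval = sampling_ratio_interval(300, 2000)
```

For 300 records out of 2000 this gives a margin of roughly 0.016 around p = 0.15, so the 0.03 shown in the sample response appears to be illustrative rather than the output of this formula.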

Representation Quality:

  • Measures how closely sample user distribution matches population

  • Calculated as average of per-user representation scores

  • Score = min(actual_ratio/expected_ratio, expected_ratio/actual_ratio)
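
The score can be illustrated with a short sketch. It takes one user identifier per record for the population and for the sample, and assumes that a user who is missing from the sample scores 0, a case this page does not cover. The thresholds that map the average to "High", "Medium", or "Low" are not documented here, so the sketch stops at the numeric average.

```python
from collections import Counter


def average_user_representation(population_users, sample_users):
    """Average per-user representation score, between 0 and 1."""
    if not population_users or not sample_users:
        raise ValueError("both population and sample must be non-empty")

    population_counts = Counter(population_users)
    sample_counts = Counter(sample_users)
    scores = []
    for user, count in population_counts.items():
        expected_ratio = count / len(population_users)
        actual_ratio = sample_counts.get(user, 0) / len(sample_users)
        if actual_ratio == 0:
            scores.append(0.0)  # assumption: an absent user scores 0
        else:
            # Symmetric ratio: over- and under-representation both lower it.
            scores.append(min(actual_ratio / expected_ratio,
                              expected_ratio / actual_ratio))
    return sum(scores) / len(scores)


# Two users with equal shares in the population, skewed 3:1 in the sample.
skewed = average_user_representation(["a"] * 5 + ["b"] * 5, ["a", "a", "a", "b"])
# A sample that keeps the population's 60/40 split exactly.
faithful = average_user_representation(["a"] * 6 + ["b"] * 4, ["a"] * 3 + ["b"] * 2)
```

In the skewed case user a scores 0.5 / 0.75 ≈ 0.67 and user b scores 0.25 / 0.5 = 0.5, giving an average of about 0.58; a sample that matches the population distribution scores 1.0.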

YouTube Data Note

YouTube responses include a deduplication_info field showing how records were reduced to unique user-video combinations before sampling.

Best Practice

Always check reliability and margin_of_error before analyzing results. When in doubt, request larger samples for more confident insights.
