- What is Data Sharing?
- What are the advantages of using Data Sharing?
- Data Structure & Costs
- Data Availability
- Dataset Overview
- Performance Optimization
- Technical Notes
- What we need from you
What is Data Sharing?
Snowflake Data Sharing for Datastream is a secure and direct way to route Chartbeat traffic and engagement metrics right into your Snowflake warehouse.
By leveraging Snowflake's architecture, we eliminate any data copying or transferring. Because Chartbeat pre-deduplicates and normalizes the datasets on our end before sharing, your data arrives clean, accurate, and ready to analyze. Read more about Datastream here.
What are the advantages of using Data Sharing?
- Zero ETL required: Eliminate complex engineering overhead with automated data sharing instead of building custom pipelines.
- No storage duplication: Query normalized Chartbeat metrics directly where they live, avoiding double storage fees and transfer friction.
- Pre-processed accuracy: Work with pre-cleaned metrics, removing the need for manual data filtering on your side.
- Unified insights: Effortlessly join raw Chartbeat reader engagement metrics with your subscription and revenue datasets inside Snowflake.
Data Structure & Costs
The shared dataset is a secure view built on top of our raw sessions data. Chartbeat will cover all data egress costs to make this dataset available (as part of your package). Your organization will only be responsible for compute costs associated with querying your data.
Data Availability
- Update Frequency: Hourly. Please note that data is delayed by 4 hours, i.e., you won't see yesterday's data in full until 4 AM the next day.
- Historical Data: Available from November 2021 to present.
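As a rough illustration of the 4-hour delay, the sketch below checks whether a given calendar day should already be complete in the share. It assumes naive datetimes expressed in the same time zone as the dataset (the time zone is not stated here, so confirm it before relying on this), and the function name is hypothetical.

```python
from datetime import date, datetime, time, timedelta

# Delay stated in the availability notes above.
DELAY = timedelta(hours=4)


def is_day_complete(day: date, now: datetime) -> bool:
    """Return True once `day` should be fully present in the shared view.

    Assumption: `now` is a naive datetime in the dataset's own time zone.
    """
    # The day ends at midnight at the start of the following day.
    end_of_day = datetime.combine(day + timedelta(days=1), time.min)
    # Data for the final hour only lands after the delay has elapsed.
    return now >= end_of_day + DELAY
```

A scheduled daily report could call this check before running, and wait or skip if the previous day is not yet complete.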
Dataset Overview
The Chartbeat pingdata dataset provides comprehensive pageview-level web analytics data that enables deep analysis of user engagement, content performance, and audience behavior. This dataset supports a wide range of analytical use cases, including:
- Tracking article engagement metrics and scroll depth
- Analyzing traffic sources and campaign attribution through UTM parameters
- Understanding geographic audience distribution and device/platform usage patterns
- Measuring content performance by sections and authors
This data can help answer critical business questions such as:
- Which articles generate the highest engaged time?
- What are the most effective referral sources?
- How do mobile users engage differently than desktop users?
- Which content sections drive the most loyal readership?
See the full list of available metrics and dimensions here.
Performance Optimization
The view is clustered by the DT and HOST columns. Always include DT in your WHERE clause when querying to ensure optimal query execution time.
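The sketch below shows one way to keep every query bounded on DT, with an optional HOST filter. It only builds the SQL text and bind parameters. The view name CHARTBEAT_SHARE.PUBLIC.PINGDATA is a placeholder, not the real object name, and the `%(name)s` placeholders assume a Python database driver that uses pyformat-style binding.

```python
from datetime import datetime
from typing import Optional

# Hypothetical fully qualified name; replace with the view from your share.
VIEW = "CHARTBEAT_SHARE.PUBLIC.PINGDATA"


def hourly_pageviews_query(start: datetime, end: datetime,
                           host: Optional[str] = None):
    """Build a DT-bounded query counting pageviews per hour.

    Each row is one pageview, so COUNT(*) per DT gives hourly pageviews.
    Returns (sql, params) for a driver with pyformat-style binding.
    """
    if start >= end:
        raise ValueError("start must be earlier than end")
    # Always bound DT, since the view is clustered by DT and HOST.
    sql = (
        "SELECT DT, COUNT(*) AS PAGEVIEWS\n"
        f"FROM {VIEW}\n"
        "WHERE DT >= %(start)s AND DT < %(end)s"
    )
    params = {"start": start, "end": end}
    if host is not None:
        # HOST is the second clustering column, so filter on it when possible.
        sql += "\n  AND HOST = %(host)s"
        params["host"] = host
    sql += "\nGROUP BY DT\nORDER BY DT"
    return sql, params
```

The returned pair can be passed to a cursor's execute method, for example `cursor.execute(sql, params)`. Using a half-open range (`>=` start, `<` end) avoids counting the boundary hour twice when consecutive ranges are queried.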
Technical Notes
- DT: Timestamp field with hourly granularity
- DEPARTURE_TIMESTAMP: Provides second-level precision for more granular analysis
- Row Definition: Each row represents one pageview
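To illustrate how the two time fields can work together, the sketch below builds a query that filters on a single DT hour and then groups rows into minute buckets using DEPARTURE_TIMESTAMP. It assumes that DT holds the start of the hour and can be compared by equality. The view name is again a placeholder, and the `%(name)s` placeholder assumes pyformat-style binding.

```python
from datetime import datetime

# Hypothetical fully qualified name; replace with the view from your share.
VIEW = "CHARTBEAT_SHARE.PUBLIC.PINGDATA"


def minute_buckets_query(moment: datetime):
    """Build a query counting pageviews per minute within one DT hour.

    Assumption: DT stores the start of the hour, so an equality filter works.
    """
    # Round down to the hour so the filter matches DT's hourly granularity.
    hour_start = moment.replace(minute=0, second=0, microsecond=0)
    sql = (
        "SELECT DATE_TRUNC('MINUTE', DEPARTURE_TIMESTAMP) AS MINUTE_BUCKET,\n"
        "       COUNT(*) AS PAGEVIEWS\n"
        f"FROM {VIEW}\n"
        "WHERE DT = %(hour)s\n"
        "GROUP BY 1\n"
        "ORDER BY 1"
    )
    return sql, {"hour": hour_start}
```

Filtering on DT first keeps the scan narrow, while DEPARTURE_TIMESTAMP supplies the finer granularity inside that hour.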
What we need from you
- Your Snowflake account email address
- Your Snowflake region
Please reach out to your Customer Success representative with this information to get started.