Mastering Snowflake Platform  
Generate, fetch, and automate Snowflake data as a skilled data practitioner (English Edition)
Author(s): Pooja Kelgaonkar
Published by BPB Publications
Publication Date:  Available in all formats
ISBN: 9789355519764
Pages: 400

EBOOK (EPUB)

ISBN: 9789355519764 Price: INR 899.00
Add to cart Buy Now
Handling ever evolving data for business needs can get complex. Traditional methods create bulky and costly-to-maintain data systems. Here, Snowflake emerges as a cost-effective solution, catering to both traditional and modern data needs with zero or minimal maintenance costs. This book helps you grasp Snowflake, guiding you to create complete solutions from start to finish. The starting focus covers Snowflake architecture, key features, native loading and unloading capabilities, ANSI SQL support, and processing of diverse data types and objects. The next part utilizes acquired knowledge to look into implementing data security, governance, and collaborations, utilizing Snowflake's features like data sharing and cloning. The final part explores advanced topics, including streams, tasks, performance optimizations, cost efficiencies, and operationalization with automated monitoring. Real-time use cases and reference architectures are provided to assist readers in implementing data warehouse, data lake, and data mesh solutions with Snowflake.
Rating
Description
Handling ever evolving data for business needs can get complex. Traditional methods create bulky and costly-to-maintain data systems. Here, Snowflake emerges as a cost-effective solution, catering to both traditional and modern data needs with zero or minimal maintenance costs. This book helps you grasp Snowflake, guiding you to create complete solutions from start to finish. The starting focus covers Snowflake architecture, key features, native loading and unloading capabilities, ANSI SQL support, and processing of diverse data types and objects. The next part utilizes acquired knowledge to look into implementing data security, governance, and collaborations, utilizing Snowflake's features like data sharing and cloning. The final part explores advanced topics, including streams, tasks, performance optimizations, cost efficiencies, and operationalization with automated monitoring. Real-time use cases and reference architectures are provided to assist readers in implementing data warehouse, data lake, and data mesh solutions with Snowflake.
Table of contents
  • Cover
  • Title Page
  • Copyright Page
  • Dedication Page
  • About the Author
  • About the Reviewer
  • Acknowledgement
  • Preface
  • Table of Contents
  • 1. Getting Started with Snowflake
    • Introduction
    • Structure
    • Objectives
    • Why Snowflake?
    • History of Snowflake
    • Snowflake certifications
    • Snowflake community
    • Setting up a trial account with Snowflake
    • Connecting to Snowflake
      • Snowflake Web User Interface
        • Classic console
        • Context setting
        • Snowsight
      • Command line interface
        • Download and setup SnowSQL
        • Connecting to Snowflake
    • Conclusion
    • Points to remember
    • Practical
    • Multiple choice questions
    • Answers
  • 2. Three Layered Architecture
    • Introduction
    • Structure
    • Objectives
    • Traditional data warehouse challenges
    • Legacy data warehouse challenges
    • Typical data lake challenges
      • Hadoop ecosystem
      • Extended support to NoSQL
    • Snowflake architecture
    • Cloud services layer
      • Authentication and access control
      • Optimizer
      • Metadata storage and management
      • Result cache
      • Security
      • Availability of cloud services layer
      • Uses of cloud services layer
    • Virtual warehouse layer
      • Virtual warehouses: Single and multi-cluster
      • Single cluster warehouses
      • Multi-cluster warehouses
      • Virtual warehouse properties
      • Virtual warehouse sizes
      • Warehouse scaling
        • Scaling up
        • Scaling out
        • Scaling modes
        • Scaling policies
      • Virtual warehouses: Creation and modifications
        • Warehouse setup using console
      • Warehouse: Using SQL commands
        • Creating warehouse
        • Other warehouse commands
      • Virtual warehouses: Billing and usage
    • Storage layer
      • Time-travel
      • Failsafe
      • Data cloning
      • Storage usage and billing
    • Cache in Snowflake
      • Metadata cache
      • Query result cache
      • Warehouse cache
    • Conclusion
    • Points to remember
    • Multiple choice questions
    • Answers
    • Questions
  • 3. Data Types, Data Objects and SQL Commands
    • Introduction
    • Structure
    • Objectives
    • Understanding data types
      • Numeric data types
      • String and binary data types
      • Binary data types
      • Logical data types
      • Date and time data types
        • Date and time intervals and constants
        • Supported date and time arithmetic operations
      • Semi-structured data types
        • VARIANT
        • OBJECT
        • OBJECT constant
        • OBJECT elements
        • OBJECT considerations
        • ARRAY
        • ARRAY elements
        • ARRAY considerations
      • Geospatial data types
        • Geography
        • Geometry
    • Using SQL commands
      • Data definition language
        • Account and session DDL
        • User and security DDL
        • Warehouse and resource monitor DDL
        • Database, schema and share DDL
        • Table, view and sequence DDL
        • Data loading / unloading DDL
        • User-defined functions, external functions, and stored procedures DDL
        • Data pipeline DDL
      • Data manipulation language
        • Standard DML
        • Data loading/unloading DML
        • File staging commands
    • Creating database objects
      • Creating users and roles
      • Creating roles using SQL commands
      • Creating users using SQL commands
      • Creating roles using Web Console
      • Creating users using Web Console
    • Creating warehouses
      • Creating warehouse using SQL command
      • Creating warehouse using Web Console
      • Creating databases and schemas
      • Creating database using SQL command
      • Creating database using web console
      • Creating schemas using SQL command
    • Setting up context
      • Setup context using SQL command
    • Creating tables and views
      • Creating table using SQL command
    • Implementing extended SQL objects
      • UDF example (JavaScript)
      • UDF example (SQL)
      • Tabular functions
        • Sample function
        • Creating function
        • Using function
      • SQL UDFT function
        • Creating function
        • Using function
      • External functions
    • Conclusion
    • Points to remember
    • Practical: Create Snowflake objects
    • Multiple choice questions
    • Answers
    • Questions
  • 4. Data Loading and Unloading
    • Introduction
    • Structure
    • Objectives
    • Understanding data load needs
    • Creating Snowflake load objects
      • Source files
        • Loading from file locations
        • Loading files from local system
        • Loading data from Web UI
    • Using COPY INTO to load data
      • Bulk load using COPY INTO
        • COPY syntax
      • Transformations supported in COPY INTO
        • COPY with transformation syntax
      • VALIDATE command
      • COPY optional parameters
      • COPY INTO options
        • Options
      • COPY INTO using for bulk loads
      • COPY for local system
        • PUT command
    • Understanding streaming loads
    • Implementing Snowpipe
      • Automate the load using cloud messaging
      • Using Snowpipe REST endpoints
      • Snowpipe versus bulk load
      • Snowpipe features and recommendations
    • Snowpipe commands
      • SQL commands
        • CREATE PIPE
        • ALTER PIPE
        • DROP PIPE
        • DESCRIBE PIPE
        • SHOW PIPE
    • Loading real time data
      • Snowpipe
      • Snowflake connector for Kafka
      • Snowpipe streaming
      • Snowpipe streaming versus Snowpipe
        • Streaming cost
    • Loading near real time data
    • Implementing Snowpipe
    • Understanding data unloading
    • Database feed
    • File feed
    • Using COPY to unload data
      • COPY INTO
        • Exported feeds, formats and locations
        • Compression supported
        • Encryption supported
    • Bulk unloading from a table or query
    • Bulk unloading to one or multiple files
      • File naming in case of multiple file exports
      • Bulk unloading using PARTITIONs
      • COPY INTO options
      • Exporting data to named internal stages
      • GET command
      • Exporting data to named external stages
      • Steps to unload data to named external stages
    • Practical: Data loading and unloading
      • Creating stages
      • Creating file formats
      • Creating storage integrations
      • Creating LOAD commands and load using COPY
      • Load from local using COPY
      • Unload from local using COPY
    • Conclusion
    • Points to remember
    • Questions
    • Multiple choice questions
    • Answers
  • 5. Understanding Streams and Tasks
    • Introduction
    • Structure
    • Objectives
    • Understanding Change Data Capture
    • Understanding Snowflake Streams
      • Types of Streams
        • Standard Streams
        • Append-only Streams
        • Insert-only Streams
      • Validity of change logs
      • Billing of Streams
    • Implementing data capture with Streams
      • CREATE STREAM
      • ALTER STREAM
      • DESCRIBE STREAM
      • DROP STREAM
      • SHOW STREAMS
      • Stream examples
    • Understanding tasks
      • Serverless tasks
      • User-managed tasks
      • Scheduling tasks
    • Implementing scheduling with Tasks
      • TASK commands
        • CREATE TASK
        • ALTER TASK
        • DROP TASK
        • EXECUTE TASK
        • SHOW TASKS
    • Practical: Create Snowflake Tasks and Streams
      • Exercise 1
      • Exercise 2
      • Exercise 3
    • Conclusion
    • Points to remember
    • Multiple choice questions
      • Answers
    • Questions
  • 6. Understanding Snowpark
    • Introduction
    • Structure
    • Objectives
    • Why Snowpark?
    • Understanding Snowpark
      • Snowpark-optimized warehouse
      • Snowpark-optimized warehouse billing
      • Snowpark-ML
      • Snowpark-benefits over Spark Connector
    • Implementing Snowpark
      • Environment setup
      • Using Python and developing code
      • How does Python dataframe API work?
      • Using Snowpark for ML
    • Use cases to implement Snowpark
      • Data engineering use cases
    • Practical: Setting up Snowpark and use cases
      • Snowpark engineering use cases
    • Conclusion
    • Points to remember
    • Questions
    • Multiple choice questions
    • Answers
  • 7. Access Control and Managing Users Roles
    • Introduction
    • Structure
    • Objectives
    • Snowflake access control overview
      • Snowflake role and users
        • Types of roles
      • Snowflake securable objects
    • Understanding RBAC
    • Understanding default roles in Snowflake
    • Implementing RBAC with Snowflake
      • Step 1: Creating Snowflake resources
        • Create environments
        • Create DEV databases and schemas
        • Create QA databases and schemas
        • Create BI databases and schemas
        • Create ML databases and schemas
        • Create warehouses for users and use cases
      • Step 2: Create custom roles
        • Create DEV custom role
      • Step 3: Granting access to the custom roles
        • Grant access of DATABASES to custom role
        • Grant access to WAREHOUSES to the custom role
        • Grant access to QA custom role
        • Grant access to DEV custom role in QA environment
      • Step 4: Create users (user onboarding)
        • Create users
      • Step 5: Assign custom roles to the users
        • Assign custom roles to the users
      • Step 6: Using the custom roles
        • Using custom roles created
        • Testing and validating access of the custom roles created
      • Step 7: Validating the access privileges
        • Using custom roles created
    • Understanding access hierarchy
    • Managing users and roles
      • User commands
      • CREATE command
      • DROP command
      • ALTER command
      • DESCRIBE command
      • SHOW command
    • Practical
    • Conclusion
    • Points to remember
    • Questions
    • Multiple choice questions
    • Answers
  • 8. Data Protection and Recovery
    • Introduction
    • Structure
    • Objectives
    • What is data protection?
    • Implementing dynamic masking
      • Implementation commands
        • Create masking policy
        • Normal masking policy
        • Conditional masking policy
        • Alter masking policy
        • Drop masking policy
        • Show masking policy
        • Describe masking policy
      • Using dynamic masking
        • Step 1: Create custom role
        • Step 2: Grant custom role to a user
        • Step 3: Create masking policy
        • Step 4: Apply masking policy to the database objects
        • Step 5: Query data and validate masking policies
    • Understanding data recovery options
      • Understanding time travel
        • Time travel SQL extensions
        • Time travel data retention
        • Time travel setup
        • Time travel storage cost
        • Time travel storage for temporary and transient tables
      • Understanding Failsafe
        • Failsafe storage cost
    • Implementing time travel
      • Setup time travel at create table
      • Access historical data
      • Cloning objects
      • Commands to access dropped and restored objects
      • List dropped objects
      • Restore dropped objects
      • Reference use case
    • Implementing data replication
      • Database replication
      • Share replication
      • Replication group
      • Replication schedule
      • Business continuity
    • Practical: Create Snowflake policies, tagging objects
    • Exercises
    • Conclusion
    • Points to remember
    • Questions
    • Multiple choice questions
    • Answers
  • 9. Snowflake Performance Optimization
    • Introduction
    • Structure
    • Objectives
    • Understanding Snowflake performance
      • Query performance
    • Understanding Snowflake metadata objects
      • INFORMATION_SCHEMA
        • Using INFORMATION_SCHEMA
      • ACCOUNT_USAGE
      • DATA_SHARING_USAGE
      • ORGANIZATION_USAGE
    • Introduction to ACCOUNT_USAGE
      • Dropped object information
      • Latency of the details
      • Data retention
      • Account usage views
      • Account usage functions
      • Reader account usage views
      • Roles and permissions
    • Introduction to INFORMATION_SCHEMA
      • Information schema views
      • Information schema functions
    • Calculating and understanding performance measures
    • Understanding QUERY_HISTORY
    • Introduction to Query Acceleration Service
      • SYSTEM$ESTIMATE_QUERY_ACCELERATION function
      • QUERY_ACCELERATION_ELIGIBLE view
      • Enable query acceleration service
      • Query acceleration usage
      • Query acceleration billing
    • Implementing performance optimization
      • Query performance optimization
        • View historical performance in Snowsight
        • View query profile Snowsight
        • Listing queries for performance optimization
      • Warehouse performance optimization
        • Reducing queue time
        • Resolving memory spillage
        • Changing warehouse size
        • Optimizing cache
        • Limiting concurrent queries
        • Using query acceleration service
      • Storage performance optimization
        • Automatic clustering
        • Materialized views
        • Search optimization service
        • Choosing the right strategy
        • Performance considerations
    • Conclusion
    • Points to remember
    • Questions
    • Multiple choice questions
    • Answers
  • 10. Understanding Snowflake Costing and Utilizations
    • Introduction
    • Structure
    • Objectives
    • Understanding Snowflake costing
      • Visibility
      • Control
      • Optimization
    • Component costing in Snowflake
      • Cloud services cost
      • Compute cost
      • Storage cost
      • Serverless cost
      • Data transfer cost
    • Using Snowflake metadata objects
      • View compute usage
      • View Cloud services usage
      • View serverless usage
    • Monitoring Snowflake costs
    • Conclusion
    • Points to remember
    • Questions
    • Multiple choice questions
    • Answers
  • 11. Implementing Cost Optimizations
    • Introduction
    • Structure
    • Objectives
    • Components costing in Snowflake
      • Compute or virtual warehouse layer
      • Cloud services layer
      • Storage layer
      • Serverless services
      • Sample use case
    • Implementing cost optimization
      • Compute optimization
        • Access setup to warehouse
        • Setup the right size of the warehouse
        • Setup query time limit
        • Setup query queue time limit
        • Setup auto suspend and auto resume
        • Enforce spend limits
        • Storage optimization
        • Store only required data
      • Cloud services optimization
    • Implementing cost dashboards
    • Conclusion
    • Points to remember
    • Questions
  • 12. Data Sharing
    • Introduction
    • Structure
    • Objectives
    • Data sharing needs
    • Implementing data sharing
      • How does data sharing work?
    • Implementing sharing with Snowflake consumers
      • Data sharing considerations
      • Data sharing commands
      • Data sharing pre-requisites
      • Creating a data share
      • Creating a data share with multiple databases
      • Using a data share
    • Implementing sharing with non-Snowflake consumers
      • Creating reader account
    • Understanding data sharing options
      • Listing
        • Marketplace listing
        • Private listing
        • Personalized listing
      • Data exchange
      • Providers tasks
      • Consumer tasks
    • Practical: Create Snowflake shares
      • Scenario
      • Technical requirement
      • Account setup
      • Data sharing
    • Conclusion
    • Points to remember
    • Questions
  • 13. Data Cloning
    • Introduction
    • Structure
    • Objectives
    • Data cloning needs
    • Zero copy cloning
    • Implementing data cloning
      • Cloning considerations
    • Cloning implementation
      • Syntax of CREATE
      • Reference use cases and examples
    • Understanding the usage of cloned objects
    • Practical: Create Snowflake clones
    • Conclusion
    • Points to remember
    • Questions
  • 14. Understanding Snowsight
    • Introduction
    • Structure
    • Objectives
    • Understanding Snowsight
      • Snowsight interface
      • Worksheet pages
      • Data pages
      • Dashboard page
      • Data page
      • Marketplace page
      • Activity page
      • Admin page
    • Implementing dashboards with Snowsight
    • Practical: Create Snowflake dashboards
      • Use case
      • Initial set up
      • Key asks
    • Conclusion
    • Points to remember
    • Questions
  • 15. Programming Connectors and Drivers
    • Introduction
    • Structure
    • Objectives
    • Understanding programming connectors
      • Snowflake connector for Python
      • Snowflake connector for Kafka
      • Snowflake connector for Spark
    • Understanding drivers
      • JDBC driver and ODBC driver
      • GO driver
    • Understanding Snowsql: CLI
      • Installing Snowsql
      • Connecting to Snowsql
      • Using password to connect via Snowsql
    • Conclusion
    • Annexure
    • Points to remember
    • Questions
  • 16. Workload Patterns with Snowflake
    • Introduction
    • Structure
    • Objectives
    • Understanding platform needs
      • Extract, Transform, and Load
      • Extract, Load and Transform
    • Implementing data solutions with Snowflake
      • Snowflake’s features
      • Data solutions
    • Reference use cases and architecture
      • Data warehouse reference use case
        • Use case
        • Technical requirements
        • Business requirements
        • Proposed solution design
        • Reference architecture
        • Pipeline design
        • Platform design
      • Data lake reference use case
        • Use case
        • Technical requirements
        • Business requirements
        • Proposed solution design
        • Reference architecture
        • Pipeline design
        • Platform design
      • Data mesh reference use case
        • Use case
        • Technical requirements
        • Business requirements
        • Proposed solution design
        • Reference architecture
        • Platform design
    • Snowflake recommendations
      • Recommendations
        • Warehouse recommendations
        • Storage recommendations
        • Load recommendations
        • Transformation recommendations
        • Usage recommendations
    • Conclusion
    • Points to remember
    • Questions
  • 17. Introduction to Snowflake’s Advance Features
    • Introduction
    • Structure
    • Objectives
    • Understanding new features and releases
      • Snowflake releases
        • New features
      • SQL extensions
        • Schema features
        • SQL features
    • Understanding new types of tables
      • Dynamic tables
      • Event tables
      • Hybrid tables
      • Iceberg tables
    • Introduction to new Snowsight features
      • Worksheets
      • Data governance
      • Native apps
      • Budgets
      • Data loading
      • Stage options
    • Introduction to new LLM capabilities
    • Conclusion
    • Points to remember
  • Index
User Reviews
Rating