Gnarly Data Waves is a weekly show about the world of Data Analytics and Data Architecture. Learn about the technologies giving companies access to cutting-edge insights. If you work with datasets, data warehouses, data lakes, or data lakehouses, this show is for you!
Join us for our live recordings to participate in the Q&A:
dremio.com/events
Subscribe to the Dremio youtube channel on:
youtube.com/dremio
Take the Dremio Platform for a free test-drive:
This session will provide a comprehensive overview of Iceberg's journey, its current role within the data ecosystem, and the promising future it holds with the integration of Polaris (incubating). We will discuss how these technologies redefine table formats and catalog management, empowering organizations to efficiently manage and analyze large-scale data. Attendees will gain valuable insights into the evolving landscape, ensuring they remain at the forefront of innovation and continue to shape thought leadership in the data ecosystem.
Try Out Dremio on your Laptop: https://drmevn.fyi/youtubelakehouse102924
Legacy data platforms often fall short of the performance, processing and scaling requirements for robust AI/ML initiatives. This is especially true in complex multi-cloud (public, private, edge, air-gapped) environments. The combined power of MinIO and Dremio creates a data lakehouse platform that overcomes these challenges, delivering scalability, performance and efficiency to ensure successful AI initiatives. Watch Brenna Buuck, Sr. Technical Evangelist at MinIO, and Alex Merced, Sr. Technical Evangelist at Dremio, as they provide insights on: - AI Workflows: How a data lakehouse simplifies critical AI tasks like model training, refinement, feature selection and real-time inference for faster decisions - Scalability and Performance: How a data lakehouse architecture scales seamlessly to meet the fast-growing demands of AI applications - Data Management Efficiency: How a data lakehouse streamlines data management for IT teams, allowing them to focus on innovation
Dremio unveiled new features in our latest release that enhance the creation, performance, and management of Apache Iceberg data lakehouses. You will learn how Dremio delivers market-leading SQL query and write performance, improved federated query security and management, as well as streamlined data ingestion, by delivering: - Live Reflections on Iceberg tables that will accelerate performance, ensure up-to-date data and reduce management overhead. - Result Set Caching that can accelerate query performance up to 28X - Merge-on-Read that can enhance write and ingestion speed - Auto Ingest Pipes that eliminate complex pipeline setup and maintenance - User Impersonation for federated queries that allows for granular permissions, better access control, and user workload tracking
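To make the Merge-on-Read item concrete, here is a minimal sketch of the kind of upsert it speeds up. This is generic Iceberg-style SQL of the sort Dremio supports, not a feature-specific example; the sales and staging_sales tables and their columns are hypothetical.

```sql
-- Hypothetical upsert into an Apache Iceberg table. With merge-on-read,
-- matched rows are recorded as delete files instead of rewriting whole
-- data files, which is what speeds up writes and ingestion.
MERGE INTO sales AS t
USING staging_sales AS s
  ON t.order_id = s.order_id
WHEN MATCHED THEN
  UPDATE SET amount = s.amount
WHEN NOT MATCHED THEN
  INSERT (order_id, amount) VALUES (s.order_id, s.amount);
```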
Organizations want to empower teams with data at their fingertips and in every part of their business. They want their teams to move quickly, with data never a bottleneck but an accelerant to decision making, all without the curiosity tax too common in consumption-based cloud platforms. Dremio enables data teams to unify all of their disparate data, from Snowflake to Iceberg and other sources, by combining an intelligent semantic layer with a powerful SQL platform that eliminates silos, optimizes costs through intelligent query acceleration, and enables self-service analytics. You will learn how Dremio enables Snowflake users to: - Unify all of your data from Snowflake and all sources - Optimize analytics costs and performance - Use easy self-service analytics for faster time-to-insight - Ensure Apache Iceberg native compatibility for future-proof data access
Learn how to master semantic layers with Dremio. We will provide a high-level overview of their purpose in modern analytics, showing how they act as a bridge between complex data sources and business users. You’ll learn how semantic layers simplify data access, ensure consistency, and empower users to derive meaningful insights from data, regardless of their technical expertise. - The definition and core purpose of a semantic layer in data analytics: How it acts as a bridge between complex data and business users, simplifying data access and interpretation. - Key benefits and use cases of semantic layers: How they enable self-service analytics, ensure data consistency, and accelerate time-to-insight. - How Dremio's semantic layer technology can transform your data strategy: Dremio makes it easier to manage and leverage your data for faster, data-driven decision-making.
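To make the "bridge" idea concrete, here is a minimal sketch of what a semantic-layer view looks like in plain SQL; the source path, columns, and business names are all hypothetical.

```sql
-- A business-friendly view over a raw table: it renames cryptic columns,
-- applies a standard filter, and hides the physical layout from end users.
CREATE VIEW sales_metrics AS
SELECT
  o.ord_dt  AS order_date,
  o.cust_id AS customer_id,
  o.amt_usd AS revenue_usd
FROM warehouse.raw.orders AS o
WHERE o.status = 'COMPLETE';
```

Business users query sales_metrics without ever needing to know how orders is stored or partitioned; that separation is the core of the semantic layer.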
Watch Vishnu Vardhan, Director of Product Management StorageGRID at NetApp, and Alex Merced, Senior Technical Evangelist at Dremio, as they explore the future of data lakes and discover how NetApp and Dremio can revolutionize your analytics by delivering the next generation of lakehouse with Apache Iceberg. Transitioning to a modern data lakehouse environment allows organizations to increase business insight, reduce management complexity, and lower overall TCO of their analytics environments. The growing adoption of Apache Iceberg is a key enabler for building the next generation lakehouse. Its robust feature set, including ACID transactions, time travel, and schema evolution, coupled with an open ecosystem for analytics use cases, continues to drive rapid adoption. Vishnu and Alex will delve into market trends surrounding Iceberg, as well as key drivers for lakehouse adoption and modernization. You will learn about: - Iceberg adoption trends - NetApp StorageGRID and its benefits - The Dremio and NetApp data lakehouse solution - Key Iceberg data lakehouse modernization use cases - Customer examples
Watch and learn about Apache Iceberg in a 10-part web series designed to help you master it. https://hello.dremio.com/webcast-an-apache-iceberg-lakehouse-crash-course-reg.html?utm_medium=social-free&utm_source=youtube&utm_content=webcast-gdw-se-the-architecture-of-apache-iceberg-apache-hudi-and-delta-lake-intro&utm_campaign=webcast-gdw-se-the-architecture-of-apache-iceberg-apache-hudi-and-delta-lake-intro "An Apache Iceberg Lakehouse Crash Course" is an in-depth webinar series designed to provide a comprehensive understanding of Apache Iceberg and its pivotal role in modern data lakehouse architectures. Over the course of ten sessions, you'll explore a wide range of topics, from foundational concepts like data lakehouses and table formats to advanced features such as partitioning, optimization, and streaming with Apache Iceberg. Each session will offer detailed insights into the architecture and capabilities of Apache Iceberg, alongside practical demonstrations of data ingestion using tools like Apache Spark and Dremio.
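As a small taste of the sessions on table formats and partitioning, here is a hedged Spark SQL sketch of creating an Apache Iceberg table with hidden partitioning; the catalog, table, and columns are hypothetical.

```sql
-- Iceberg's transform-based (hidden) partitioning: users filter on event_ts
-- directly, and Iceberg prunes partitions without anyone needing to know
-- the partition scheme.
CREATE TABLE demo.db.events (
  event_id BIGINT,
  event_ts TIMESTAMP,
  payload  STRING
)
USING iceberg
PARTITIONED BY (days(event_ts));
```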
As the demand for data analytics grows, and with a decentralized approach at its core, major Swedish manufacturer Scania needed to balance domain autonomy and alignment, while implementing a self-serve data & governance platform, coupled with a unified way of accessing data. Discover how Scania addressed these challenges by adopting a data mesh strategy, and how using Dremio and Witboost has facilitated their journey. Learn about the cultural shifts, changes, and partnerships that are driving tangible business impacts. Additionally, gain insights and trends from Dremio’s Field CDO and the co-founder and CTO of Witboost. Ready to Get-Started: https://www.dremio.com/get-started/?utm_medium=website&utm_source=youtube&utm_content=gdw-od&utm_campaign=gdw-ep51 See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-waves/?utm_medium=website&utm_source=youtube&utm_content=gdw-od&utm_campaign=gdw-ep51 Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm_medium=website&utm_source=youtube&utm_content=gdw-od&utm_campaign=gdw-ep51 Events: https://www.dremio.com/events/?utm_medium=website&utm_source=youtube&utm_content=gdw-od&utm_campaign=gdw-ep51
Join us for a captivating recap of Subsurface 2024—the leading conference at the intersection of data engineering, open source technology, and modern data architecture. This webinar will distill: - highlights of the conference, - curated clips of inspiring keynotes, - insightful discussions on real-world data lakehouse implementations by industry leaders such as Nomura, NetApp, and Blue Cross. - and deep dives into the transformative potential of open source projects like Apache Iceberg, Apache XTable, and Ibis. Whether you missed the conference or want to revisit its most impactful moments, this webinar offers a unique opportunity to stay ahead of the curve in the rapidly evolving data landscape. Don't miss this chance to gain valuable insights from the experts and innovators who are shaping the future of data. - Article on Dremio Auto-Ingest: https://www.dremio.com/blog/introducing-auto-ingest-pipes-event-driven-ingestion-made-easy/ - Article on Dremio and Hybrid Data Lakehouses (Vast, Netapp, Minio): https://www.dremio.com/blog/3-reasons-to-create-hybrid-apache-iceberg-data-lakehouses/ --------------------------------------------------------------- Get Hands-on with the Data Lakehouse ---------------------------------------------------------------- - Apache Iceberg Lakehouse on your Laptop: https://bit.ly/am-dremio-lakehouse-laptop - SQLServer to Iceberg to Dashboard: https://bit.ly/am-sqlserver-dashboard - MongoDB to Iceberg to Dashboard: https://bit.ly/am-mongodb-dashboard - Postgres to Iceberg to Dashboard: https://bit.ly/am-postgres-to-dashboard - MySQL to Iceberg to Dashboard: https://bit.ly/am-dremio-mysql-dashboard - Elasticsearch to Iceberg to Dashboard: https://bit.ly/am-dremio-elastic - Apache Druid to Iceberg to Dashboard: https://bit.ly/am-druid-dremio - JSON/CSV/Parquet to Iceberg to Dashboard: https://bit.ly/am-json-csv-parquet-dremio - From Kafka to Iceberg to Dremio: https://bit.ly/am-kafka-connect-dremio - Lowering Snowflake Costs with Dremio: https://bit.ly/am-dremio-snowflake-spend
Watch Alex Merced, Senior Technical Evangelist at Dremio on "Optimize Analytics Workloads with Dremio + Snowflake". This session will delve into the key cost drivers of Snowflake and demonstrate how integrating Apache Iceberg and Dremio with a Data Lakehouse architecture can significantly reduce your data warehousing expenses. Discover strategies to optimize your data operations and achieve cost efficiency with cutting-edge technologies. Ready to Get-Started: https://www.dremio.com/get-started/?u... See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Dremio is making it easier than ever to build and manage an Apache Iceberg data lakehouse. Mark Shainman will share the new Dremio capabilities that help you achieve the fastest, most scalable, and easiest-to-manage lakehouse for analytics and AI. In this video you’ll learn how: - Dremio can help you accelerate Apache Iceberg adoption with seamless ingest - Enhanced Reflections query acceleration can optimize performance and streamline management - New capabilities continue to improve reliability, stability and scalability - Dremio is delivering new capabilities to increase observability for ease of administration and management
We will embark on a journey that begins with a brief history of data analytics, tracing its development through the advent of the data lakehouse concept. This exploration sets the stage for a deeper understanding of the unique position Dremio occupies within this ecosystem, highlighting its innovative approach to bridging the gap between vast data lakes and the analysts striving to extract actionable insights. The core of this presentation features a live demonstration, showcasing the end-to-end process of data connection and evaluation within the Dremio platform. Attendees will witness firsthand how Dremio facilitates a seamless flow of data from storage in data lakes to its transformation into a format ready for analysis, ultimately culminating in the delivery of valuable insights to analysts. This demonstration not only illustrates Dremio’s capabilities but also emphasizes its role in enabling a win-win scenario for both data engineers and analysts, by simplifying access to data and enhancing the efficiency of the analytics process. In this video, we’ll cover: - A short overview of the power of Dremio - What is a semantic layer and why you need it - Why Dremio is faster than anything else Watch to gain a deeper understanding of the Dremio Data Lakehouse and discover how it can revolutionize your approach to data analytics, from enhancing data accessibility to streamlining the journey from raw data to actionable insights.
Ready to revolutionize your data management approach and learn how to maximize your environment with Dremio? Watch Alex Merced in this workshop where he’ll guide you step-by-step through building a lakehouse on your laptop with Dremio, Nessie and Minio. This is a great opportunity to try out many of the best features Dremio offers. You'll learn how to: - Read and write Apache Iceberg tables on your object storage, cataloged by Nessie, - Create views in the semantic layer, - And much more. GDW Community Edition Workshop Description: In this hands-on workshop, participants will embark on a journey to construct their very own data lakehouse platform using their laptops. The workshop is designed to introduce and guide participants through the setup and utilization of three pivotal tools in the data lakehouse architecture: Dremio, Nessie, and Apache Iceberg. Each of these tools plays a crucial role in combining the flexibility of data lakes with the efficiency and ease of use of data warehouses, aiming to simplify and economize data management. You will start by setting up a Docker environment to run all necessary services, including a notebook server, Nessie for catalog tracking with Git-like versioning, Minio as an S3-compatible storage layer, and Dremio as the core lakehouse platform. The workshop will provide a practical, step-by-step guide to federating data sources, organizing and documenting data, and performing queries with Dremio; tracking table changes and branching with Nessie; and creating, querying, and managing Apache Iceberg tables for an ACID-compliant data lakehouse. Prerequisites for the workshop include having Docker installed on your laptop. You will be taken through the process of creating a docker-compose file to spin up the required services, configuring Dremio to connect with Nessie and Minio, and finally, executing SQL queries to manipulate and query data within your lakehouse. This immersive session aims to not just educate but to empower attendees with the knowledge and tools needed to experiment with and implement their data lakehouse solutions. By the end of the workshop, participants will have a functional data lakehouse environment on their laptops, enabling them to explore further and apply what they have learned to real-world scenarios. Whether you're looking to improve your data management strategies or curious about the data lakehouse architecture, this workshop will provide a solid foundation and practical experience.
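For a preview of the workshop's flow, here is a hedged sketch of the Nessie-cataloged Iceberg workflow in Dremio SQL, assuming a Nessie source named nessie; exact branch syntax varies by Dremio version, so treat this as illustrative.

```sql
-- Create an Iceberg table in the Nessie catalog.
CREATE TABLE nessie.sales (id INT, amount DOUBLE, sale_date DATE);

-- Experiment on an isolated branch; main is untouched until the merge.
CREATE BRANCH dev IN nessie;
USE BRANCH dev IN nessie;
INSERT INTO nessie.sales VALUES (1, 19.99, DATE '2024-01-15');

-- Publish the change back to main atomically.
MERGE BRANCH dev INTO main IN nessie;
```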
Data leaders are navigating the challenging landscape of enabling data-driven customer experiences and enhancing operational efficiency through analytics insights, all while meticulously managing budgets. Organizations leveraging cloud data warehouses, like Snowflake, often grapple with the complexities of unifying data analytics across diverse cloud and on-premise applications. The process involves significant costs, resources, and time to extract, rebuild, and integrate data for consumability. Enter the data lakehouse, offering the potential to drastically reduce the total cost of ownership (TCO) associated with analytics. In this video, you will gain insights into: - Key distinctions between traditional data warehouses and the innovative data lakehouse model - How Dremio empowers organizations to slash analytics TCO by over 50% - Uncovering hidden costs associated with data ingestion, storage, compute, business intelligence, and labor - Simplifying self-service analytics through Dremio's unified lakehouse platform Watch Alex Merced, Developer Advocate at Dremio, as he explores the future of data management, and discover how Dremio can revolutionize your analytics TCO, enabling you to do more with less.
Organizations aim to increase data access and lower the time it takes to gain insights, all while managing governance and controlling rising data costs. Dremio’s unified lakehouse platform for self-service analytics enables data consumers to move fast while also reducing manual repetitive tasks and ticket overload for data engineers. In this Gnarly Data Waves episode, you will learn: - An overview of Dremio, what it is and why it is growing rapidly - Proven use cases from some of the most demanding customers in the world - A demonstration of how to rapidly get started and try it out Ready to Get-Started: https://www.dremio.com/get-started/?u... See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Traditional ETL processes are notorious for their complexity and cost inefficiencies. Join us as we introduce a game-changing virtual data pipeline approach with Dremio's next-gen DataOps, aimed at streamlining, simplifying, and fortifying your data pipelines to save time and reduce cost. In this webinar, you'll gain insights into: - Simplified Data Pipeline Management: How to use Dremio for data source branching, merging, and pipeline automation. - Mastering Data Ingestion and Access: Learn how to curate data using virtual data marts accessed through a universal Semantic layer. - Better Orchestration with dbt: Discover the benefits of orchestrating DML and view logic, optimizing data workflows. - Elevating Data Quality: Learn techniques to automate lakehouse maintenance and improve data integrity.
S&P Global is a leading global financial services company headquartered in New York. It provides credit ratings, benchmarks, analytics, and workflow solutions in the global capital, commodity, and automotive markets. Data is an essential asset across all of S&P Global’s solution offerings. Watch Tian de Klerk, Director of Business Intelligence, as he shares how they built a data lakehouse for FinOps analysis with Dremio Cloud on Microsoft Azure. Tian will cover: - The hidden costs of extracting operational data into BI cubes - Simplifying traditional data engineering processes with Dremio’s zero-ETL lakehouse - How Dremio’s semantic layer and query acceleration make self-service analytics easy for end users
In this session, Dremio and Microsoft will delve into the exciting developments surrounding the public preview launch of Dremio Cloud on Microsoft Azure. This presentation will provide a comprehensive exploration of how businesses are strategically operationalizing their data lakes, with a particular focus on unlocking the vast potential residing within Azure Storage. Attendees will gain valuable insights into the transformative journey toward harnessing the full benefits of a data lakehouse. The discussion will guide participants through the myriad possibilities that emerge when leveraging Dremio Cloud seamlessly on Azure, offering a holistic approach to executing analytics pipelines. This integration eliminates the need for costly data warehouses, presenting a revolutionary paradigm shift. A step-by-step walkthrough will illuminate the process of landing data within the lakehouse, followed by seamlessly progressing data through a virtual semantic layer. This strategic approach adds significant business meaning and value, enhancing the overall utility of the data before it is surfaced to end users. The session will also shed light on the noteworthy performance improvements and cost savings achieved by reducing data extract expenses associated with Power BI workloads. By embracing Dremio Cloud on Azure, organizations can elevate their analytical capabilities while optimizing operational costs, marking a pivotal advancement in the realm of data management and analytics. Join us as we explore the forefront of innovation in data lake operationalization and witness the tangible benefits of this dynamic integration. Watch Jonny Dixon, Sr. Product Manager at Dremio, and Hanno Borns, Principal Product Manager at Microsoft Azure, as they look into: - Problems companies face with existing analytical architectures - How Dremio and Microsoft Azure work together - What Dremio Cloud on Azure is, and the value it provides - How the Dremio Cloud on Azure solution works, with a demo
Dremio delivers no-compromise lakehouse analytics for all of your data - and recent launches are making Dremio faster, more reliable, and more flexible than ever. Watch Mark Shainman and Colleen Quinn, Product Marketing Managers at Dremio, share what's new in Dremio. - New Gen-AI capabilities for automated data descriptions and labeling - Dremio Cloud SaaS service now available on Microsoft Azure - Advances to ensure 100% query reliability with no memory failures - Expanded Apache Iceberg capabilities to streamline Iceberg adoption and improve performance Ready to Get-Started: https://www.dremio.com/get-started/?u... See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Embark on a transformative journey with our insightful presentation, "ZeroETL & Virtual Data Marts: The Cutting Edge of Lakehouse Architecture." In this engaging video, we'll delve into the intricacies of modern data engineering and how it has evolved to address key pain points in the realm of data processing. Alex will illuminate the challenges data engineers face, from the complexities of backfilling and brittle pipelines to the frustration of sluggish data delivery. We'll introduce you to the high-impact concepts of ZeroETL and Virtual Data Marts, demonstrating how these innovative patterns can dramatically alleviate these common pains. By reducing the need for manual data movement and preparation pipelines, you'll discover a more efficient, agile, and responsive data ecosystem. Watch this video as Alex Merced, Developer Advocate from Dremio, provides a practical guide to implementing these transformative patterns. He'll walk you through the steps to bring the power of ZeroETL and Virtual Data Marts into your own data landscape. Leveraging cutting-edge tools like Dremio, dbt, and more, you'll gain hands-on experience in designing and deploying these patterns to streamline your data workflows and supercharge your analytics capabilities. Don't miss this opportunity to stay at the forefront of data architecture, enabling your organization to harness data's full potential while reducing complexity and overhead. Join us for an exploration of the future of data engineering: a future where ZeroETL and Virtual Data Marts pave the way for data agility, speed, and innovation. Ready to Get-Started: https://www.dremio.com/get-started/?u... See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Get hands on with Dremio on Your Laptop: https://www.dremio.com/blog/intro-to-... The challenges of building and maintaining data pipelines have become all too familiar. This meetup, titled "ZeroETL & Virtual Data Marts: A Discussion in Painless Data Engineering," aims to shed light on these common pains and explore innovative solutions that can revolutionize the field. The event will kick off with an engaging presentation that delves into the typical pain points experienced by data engineers. These challenges include dealing with brittle pipelines that often necessitate endless backfilling and contending with the delays resulting from layers of pipelines, leading to stale and inaccurate data reaching data consumers. However, the meetup doesn't stop at identifying problems. Our discussion will introduce you to potential solutions that harness the power of Dremio, enabling the adoption of "Painless" Patterns such as "ZeroETL" and "Virtual Data Marts." These patterns are designed to reduce the manual effort involved in data movement and the creation of data movement pipelines. Attendees will gain insights into how these approaches can streamline data engineering workflows, enhance data quality, and improve data accessibility for stakeholders.
Led by Mark Hoerth, Escalations Engineer, this workshop will guide you through the process of creating tables in your Iceberg catalog, ingesting Iceberg Tables into Amazon S3, creating a clean data product, enabling governed self-service for your organization, and ultimately querying the data through our SQL Runner and a BI Tool. Key Learning Points: - Create tables in your Iceberg catalog - Manage data as code to create a production data product effortlessly - Implement data controls to enable governed self-service for your business - Create a Reflection for sub-second BI performance - Query the data product effectively Be sure to configure your Dremio Cloud Account - https://www.dremio.com/sign-up/ and create your first Sonar Project - https://docs.dremio.com/cloud/tutoria.... By doing so, you will be fully prepared to actively follow along and get the most out of this workshop. Workshop Source Code - https://gist.github.com/isha-dremio/7... See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
In this video with Alex Merced, Developer Advocate, we'll explore how Dremio revolutionizes data access, delivering speed, simplicity, and substantial cost savings. Discover the power of Dremio as we dive deep into: - Data Access at Lightning Speed: Learn how Dremio accelerates data access, making insights available in real-time. - Simplicity in Data Preparation: Streamline your data pipeline with Dremio's intuitive interface for data transformation. - Cost Efficiency: Uncover how Dremio’s optimizations save you money while improving performance - Use Cases: Explore real-world success stories and applications of Dremio's data access solutions. - Future-Proofing Your Data Infrastructure: Understand how Dremio ensures scalability and adaptability. Watch this video to uncover the secrets of fast, easy data access without breaking the bank! See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Organizations are struggling with the proliferation of tooling in their data infrastructure, and the exponential growth of ETL pipelines is slowing data engineers' ability to deliver value to the business. They want to spend more time making impactful decisions and working on high value projects. Fivetran significantly reduces the amount of time spent building ETL pipelines with its no-code approach. Dremio is the easy and open data lakehouse, providing self-service analytics with data warehouse functionality and data lake flexibility across all your data. Together, Dremio and Fivetran bring the best solution for enabling organizations to get to market faster. In this video, you will learn: - What the Iceberg table format is and why it matters in data lakehouses - How to load source files into Iceberg tables using Fivetran - How to create a unified access layer for your data with Dremio Cloud See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Watch this insightful webinar featuring Jacopo Tagliabue of Bauplan Labs as he dives into the world of data science and machine learning pipelines. In this webinar, you'll discover the rationale behind Bauplan Labs' choice of open-source technologies, such as the Apache Iceberg table format and the Project Nessie transactional data catalog, for their cutting-edge platform. Gain valuable insights into why modern data platforms are increasingly adopting these technologies and how Nessie's git-like features can revolutionize your data management. Don't miss out on this opportunity to stay ahead in the world of data science and technology! About Project Nessie - Introducing Nessie as a Dremio Source Learn: - Why Modern Data Platforms are being built on Apache Iceberg - Why Modern Data Platforms are being built on Nessie See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Product analytics offers a transformative opportunity for companies to elevate the customer experience and proactively understand customer behavior. This personalized understanding allows companies to tailor their product offerings, provide targeted recommendations, and streamline customer journeys, resulting in a more engaging, satisfying, and loyalty-inducing customer experience. Effective product analytics is a comprehensive strategy to proactively manage support and promote customer success. NetApp, a leading global company specializing in hybrid cloud data services, helps enterprises build a simple and secure way to drive innovation wherever their data and applications live. The customer experience is a core driver within NetApp’s portfolio of solutions. Watch Aaron Sims, Technical Director at NetApp, as he shares his experience building out a unified access layer for product analytics with Dremio. In this video, you will learn: - NetApp’s journey to unified analytics with Dremio’s phased approach for Hadoop modernization - How a unified access layer makes data easier to discover and explore for your end users without data duplication - Ways to maximize your existing infrastructure investments for improved ROI and lower TCO with Dremio. See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
Organizations who want to leverage their data lake for insights often struggle to deliver a consistent, accurate, high-quality view of their data to all of their data consumers. That challenge is often exacerbated by the need to make changes to data that impacts multiple tables. In this video, we’ll share how data teams can use Dremio Arctic, a data lakehouse management service, to simplify data management and operations. Using Git for Data capabilities like branching, tagging, and commits, we’ll show how Dremio Arctic makes it easier than ever to: - Create zero-copy clones of your data lake so data consumers can work on production-quality data without impacting other users. - Quickly make updates to all of your tables and merge those changes atomically, so every user has access to an accurate and consistent view of the data lake. - Reduce the costs and complexities associated with data lakehouse management. See all upcoming episodes and past episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN Resource: https://www.dremio.com/resources/?utm... Events: https://www.dremio.com/events/?utm_me...
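To illustrate the Git-for-Data workflow described above, here is a hedged sketch in Dremio-style SQL against a hypothetical Arctic catalog named arctic; treat the statements as illustrative rather than exact syntax for your version.

```sql
-- A zero-copy "clone": the branch references the same underlying files,
-- so consumers work on production-quality data without duplicating it.
CREATE BRANCH etl_jan IN arctic;
USE BRANCH etl_jan IN arctic;

-- Change several tables on the branch...
UPDATE arctic.customers SET region = 'EMEA' WHERE region = 'EU';
DELETE FROM arctic.orders WHERE status = 'CANCELLED';

-- ...then publish everything atomically, so every user sees a consistent view.
USE BRANCH main IN arctic;
MERGE BRANCH etl_jan INTO main IN arctic;
```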
Imagine fast, intuitive analytics on all of your data where it lives - with the power of a data warehouse and the scale of a data lake. Dremio's open data lakehouse makes it easy to access, understand, and analyze all your data with a lightning fast SQL query engine and low-code and no-code options for all users. Learn what’s new in Dremio - including next gen Reflections SQL acceleration - and how you can accelerate self-service analytics at scale. We’ll discuss: - Next gen Reflections SQL acceleration and new Reflection Recommender to automatically create Reflections for your most important queries - New Generative AI capabilities for text-to-SQL and more to make it possible for all users to interact with data - Expanded table format support, including time-travel for Delta Lake - Enhancements to our lightning-fast query engine that deliver even faster, more intelligent interactive analytics - Our native Apache Iceberg lakehouse catalog, Dremio Arctic, now in Preview See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
In the world of data acceleration and optimization, both materialized views and Dremio's Data Reflections stand out as pivotal tools. This video aims to demystify these technologies, comparing their benefits, limitations, and unique features. Dive deep into understanding the core differences between materialized views and Dremio's Reflections. Whether you're a seasoned data professional or just starting out, this webinar offers insights to optimize your data strategy. Discover the nuances, best practices, and real-world applications of these powerful tools, and make informed decisions for your data architecture. See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
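As a concrete anchor for the comparison, here is a hedged side-by-side sketch: a conventional materialized view in generic warehouse SQL, next to a reflection definition abbreviated from Dremio's SQL. Table and reflection names are hypothetical.

```sql
-- Materialized view: a named object that users (or the planner, in some
-- engines) must target, and that you refresh and govern as its own table.
CREATE MATERIALIZED VIEW daily_sales AS
SELECT sale_date, SUM(amount) AS total_amount
FROM sales
GROUP BY sale_date;

-- Dremio reflection: an invisible acceleration structure. Users keep
-- querying sales itself; the optimizer substitutes the reflection when
-- it can satisfy the query.
ALTER TABLE sales
CREATE AGGREGATE REFLECTION daily_sales_agg
USING DIMENSIONS (sale_date) MEASURES (amount (SUM));
```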
In the rapidly evolving landscape of big data, the Data Lakehouse is heralding a new age of unified analytics, blending the best elements of data lakes and data warehouses. Central to this convergence is the need for advanced table formats that can meet the demands of scalability, performance, and data reliability. This webinar dives deep into the world of Data Lakehouse table formats, specifically focusing on Apache Iceberg, Delta Lake, and Apache Hudi. Who should watch this video? Data engineers, data architects, data analysts, and other professionals interested in modernizing their data platform or seeking deeper insights into the technicalities and advantages of these advanced table formats. Key Takeaways: - Introduction to Data Lakehouse: Explore the genesis of the Data Lakehouse paradigm, its significance, and how it’s reshaping the way organizations think about big data storage and analytics. - Demystifying Apache Iceberg, Delta Lake, and Apache Hudi: Understand the intricacies of these popular table formats, their architectural nuances, and how they differ from traditional table structures. - Features Spotlight: Delve into the unique feature sets that each format brings to the table, from ACID transactions and time-travel queries to efficient upserts and scalability features. - The Relevance Quotient: Understand why these table formats matter in today's data-driven world. Learn about their roles in ensuring data consistency, improving query performance, and facilitating near real-time analytics on large datasets. - Best Practices and Use Cases: Explore real-world scenarios where organizations have leveraged these formats to transform their data analytics operations, and glean best practices for successful implementation and optimization. Watch this video to uncover the intricate dance of modern table formats that are at the heart of the Data Lakehouse revolution. Equip yourself with the knowledge to harness their power, ensuring a robust and efficient data infrastructure for your organization. See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
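Since time travel comes up for all three formats, here is a hedged Spark SQL sketch of how the query surface differs between Iceberg and Delta Lake; the table names are hypothetical, and Hudi is typically point-in-time queried via read options rather than SQL clauses.

```sql
-- Apache Iceberg: travel by snapshot ID or timestamp.
SELECT * FROM db.events FOR VERSION AS OF 4714411814035535194;
SELECT * FROM db.events FOR TIMESTAMP AS OF '2024-01-01 00:00:00';

-- Delta Lake: travel by commit version or timestamp.
SELECT * FROM db.events VERSION AS OF 12;
SELECT * FROM db.events TIMESTAMP AS OF '2024-01-01';
```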
The data lakehouse is an architectural strategy that combines the flexibility and scalability of data lake storage with the data management, data governance, and data analytics capabilities of the data warehouse. As more organizations adopt this architecture, data teams need a way to deliver a consistent, accurate, and performant view of their data for all of their data consumers. In this video, we will share how Dremio Arctic, a data lakehouse management service: - Enables easy catalog versioning using data as code, so everyone has access to consistent, accurate, and high quality data. - Automatically optimizes Apache Iceberg tables, reducing management overhead and storage costs while ensuring high performance on large tables. - Eliminates the need to manage and maintain multiple copies of the data for development, testing, and production. See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
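For a sense of what "automatically optimizes" replaces, here is a hedged sketch of the manual Iceberg maintenance a catalog service can run on your behalf; the statements are abbreviated Dremio-style SQL, the table name is hypothetical, and exact option syntax varies by version.

```sql
-- Compact many small files into fewer, larger ones for faster scans.
OPTIMIZE TABLE arctic.web_logs;

-- Expire old snapshots to reclaim storage once time travel to them is
-- no longer needed (option grammar varies by engine version).
VACUUM TABLE arctic.web_logs EXPIRE SNAPSHOTS older_than '2024-01-01 00:00:00.000';
```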
Watch this video for an insightful discussion titled "ELT, ETL, and the Dremio Data Lakehouse," where we explore the cutting-edge capabilities of Dremio in revolutionizing data engineering and analytics workflows. This webinar delves into the strategic use of Dremio's innovative technologies to optimize Extract, Transform, Load (ETL) and Extract, Load, Transform (ELT) patterns for enhanced efficiency and cost-effectiveness. The session will commence with an in-depth exploration of traditional ETL and ELT methodologies, highlighting the challenges faced by organizations in managing large-scale data transformations. We will analyze the critical role of ELT patterns in the modern data landscape and the growing significance of data lakes for storage and processing. Subsequently, we will introduce Dremio, a powerful and flexible data lakehouse platform, as a game-changer for executing ETL and ELT operations. Dremio's unique architecture empowers users to directly query data residing in the data lake, eliminating the need for unnecessary data copies and reducing data movement overhead significantly. During the webinar, attendees will gain valuable insights into how Dremio's no-copy architecture minimizes data redundancy, accelerates data processing, and drastically reduces the associated costs. By harnessing the full potential of data lake storage, organizations can simplify their data engineering workflows, enhance data availability, and achieve unparalleled performance for analytical workloads. Key webinar takeaways: - A comprehensive overview of ETL and ELT patterns and their relevance in modern data environments. - The rise of data lakes and the pivotal role of Dremio's data lakehouse platform in transforming data management paradigms. - Understanding the benefits of Dremio's no-copy architecture in optimizing data processing and analytics. - Best practices and practical implementation tips for leveraging Dremio effectively in ETL and ELT workflows. See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
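To make the ELT-on-the-lakehouse pattern concrete, here is a hedged sketch in Dremio-style SQL: raw data is loaded into an Iceberg table once, and the transform step becomes a view rather than another copy. The source location, COPY INTO options, and all names are illustrative.

```sql
-- Extract + Load: land raw files in an Iceberg table on the lake.
COPY INTO lake.raw_orders
FROM '@s3_source/incoming/orders/'
FILE_FORMAT 'parquet';

-- Transform: a view instead of a pipeline; no extra copy, always current.
CREATE VIEW marts.orders_clean AS
SELECT order_id, CAST(amt AS DECIMAL(12, 2)) AS amount, status
FROM lake.raw_orders
WHERE status <> 'TEST';
```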
As data volumes grow - and more users across your organization want access to data to accelerate business decision-making - managing data governance is more important than ever. Watch this video to learn how to simplify data governance for analytics and deliver data governance at scale with Dremio. You will learn: - Data governance on the data lakehouse - How to balance data access and control to accelerate analytics - What does good data governance look like? - How Dremio supports simplified data governance See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
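As one concrete mechanism behind these points, here is a hedged sketch of privilege-based governance in Dremio-style SQL; the role and object names are hypothetical, and exact grammar varies by version.

```sql
-- Expose a curated view to analysts rather than the raw physical tables.
GRANT SELECT ON VIEW marts.sales_metrics TO ROLE analysts;

-- Keep the raw layer locked down.
REVOKE SELECT ON TABLE lake.raw_orders FROM ROLE analysts;
```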
Watch the Dremio developer advocacy and engineering teams for an installment of Apache Iceberg Office Hours. During this time we’ll have a brief Iceberg presentation on table format interoperability, going over table format migration options, converters, and newer interoperability solutions like OneTable and UniForm. We’ll go through the capabilities, limitations, and considerations, and then have plenty of dedicated time for Q&A on the presented topic or any other questions you have about learning Apache Iceberg or architecting your data lakehouse around it. We will cover: - Format Interop - Using Lakehouse Engines to Unite Table Formats - Using Onehouse's OneTable Technology - Using Delta Lake 3.0 UniForm - Consideration: Consistency - Consideration: Vendor Agnosticism - Consideration: Flexibility Examples of questions you can ask: How can I optimize my Iceberg tables for my different use cases? What tools will best handle my ETL job to write to Iceberg? How can I control access to my Iceberg tables? How can I convert data from X into an Iceberg table? How can I get started with Iceberg in Databricks? See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
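Since Delta Lake 3.0 UniForm is one of the interoperability options above, here is a hedged Spark SQL sketch of enabling it so Iceberg clients can read a Delta table; the table is hypothetical, and the property names follow the Delta Lake 3.x documentation as we understand it.

```sql
-- Write once as Delta; UniForm generates Iceberg metadata for the same
-- Parquet data files, so Iceberg readers can query the table too.
CREATE TABLE sales_uniform (id BIGINT, amount DOUBLE)
USING DELTA
TBLPROPERTIES (
  'delta.enableIcebergCompatV2'          = 'true',
  'delta.universalFormat.enabledFormats' = 'iceberg'
);
```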
Maersk is a global leader in container shipping, logistics, and energy. With an extensive network of offices in 116 countries, over 900 vessels, hundreds of warehouses, and a modern fleet of aircraft, Maersk provides comprehensive shipping services across the globe with commitments to achieve decarbonization and reach net-zero emissions. Join this live fireside chat with Mark Sear, Director of Data Analytics and AI/ML at Maersk, and Tomer Shiran, founder and chief product officer at Dremio, as they talk about Maersk’s journey in building a next-generation data platform for solution development using Dremio’s open data lakehouse and Generative AI. In this video, you will learn: - Common data platform challenges in the shipping and logistics industry - How Maersk uses Dremio’s open data lakehouse to empower their developers and end users to deliver agile and cost-effective solutions - A live demo of Generative AI See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
Versioning is a technique that has helped software developers build practices that allow them to integrate and deploy new code continuously, enabling more rapid development of software. In a world where data is being generated faster than ever, the data community needs technology that allows for rapid integration and deployment of new data. In this video, we’ll discuss: - 3 Levels of Versioning on the Data Lakehouse (File, Table and Catalog) - Pros and cons of each versioning paradigm - When should you use each? See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
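To ground two of the three levels ahead of time, here is a hedged sketch in Dremio-style SQL; table, catalog, and snapshot values are hypothetical, and file-level versioning (for example, S3 object versions) happens below the SQL surface entirely.

```sql
-- Table-level versioning: Apache Iceberg time travel to a prior snapshot.
SELECT * FROM sales AT SNAPSHOT '4714411814035535194';

-- Catalog-level versioning: a Nessie branch spans many tables at once,
-- so related changes can be developed together and merged atomically.
CREATE BRANCH experiment IN nessie;
SELECT * FROM nessie.sales AT BRANCH experiment;
```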
In the rapidly evolving data landscape, organizations seek to use data assets to drive growth and competitive advantage. The problem is, the rigid warehouse-centric data architecture makes it hard to deliver faster access to data to end users without creating data copies and siloed ETL pipelines. As cloud data lakes grow, the challenge for many organizations will be providing access to that data for exploratory BI and interactive analytics. In this video, you will learn about building a data lakehouse on Azure Data Lake Storage with product leaders from Microsoft and Dremio: - The fundamentals of a data lakehouse architecture on Azure - The need for an open data lakehouse - Unifying data access on ADLS with Dremio - A self-service experience with Dremio and Power BI See all upcoming episodes: https://www.dremio.com/gnarly-data-wa... Connect with us! Twitter: https://bit.ly/30pcpE1 LinkedIn: https://bit.ly/2PoqsDq Facebook: https://bit.ly/2BV881V Community Forum: https://bit.ly/2ELXT0W Github: https://bit.ly/3go4dcM Blog: https://bit.ly/2DgyR9B Questions?: https://bit.ly/30oi8tX Website: https://bit.ly/2XmtEnN
As the data mesh paradigm gains adoption across enterprises, it’s hard to ignore the increasing focus on the architectural aspects of this approach, which often overshadows the crucial socio-organizational element. The problem is, it’s hard to implement the concept of data mesh if the technology and organizational aspects are not aligned. Business units need faster access to unified data, and data teams want to simplify data architecture. Watch Nik Acheson, Sr. Director of Product Management and GTM Strategy at Dremio, as he talks about getting started with data mesh and how Dremio’s open data lakehouse brings the concepts of data mesh to life. In this video, you will learn: - The core principles of data mesh - The benefits of data mesh and how to navigate organizational adoption - How Dremio’s open data lakehouse simplifies the data mesh journey with a 3-part phased approach
For analytical workloads, data teams today have various options to choose from in terms of data warehouses and lakehouse query engines. To enable self-service, they provide a semantic layer for end users, usually with materialized views, BI extracts, or OLAP cubes. The problem is, this process creates data copies and requires end users to understand the underlying physical data model. Join the Dremio engineering team in this episode of Gnarly Data Waves to learn about accelerating your queries with data reflections. Get answers to business questions faster without the challenges of today's approach, such as governing data copies or managing complex aggregate tables and materialized views. In this video, you will learn: - The importance of data reflections and how they remove the need for data copies - When to use raw reflections and aggregate reflections - Best practices for data reflection refreshes. A brief sketch of reflection DDL follows below.
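As a hedged sketch of what the two reflection types look like in Dremio SQL: the dataset and column names below are invented, and `run_sql` stands in for whichever Dremio client you use (JDBC, ODBC, Arrow Flight, or REST).

```python
# Illustrative Dremio reflection DDL; dataset and column names are
# hypothetical, not taken from the episode.
def run_sql(query: str) -> None:
    print(query)  # swap in a real Dremio connection to actually run these

# Raw reflection: materializes selected columns to speed up scans/filters.
run_sql("""
ALTER DATASET sales.orders
CREATE RAW REFLECTION orders_raw
USING DISPLAY (order_id, region, amount)
""")

# Aggregate reflection: pre-computes rollups so BI-style aggregate queries
# are served from the reflection instead of re-scanning the base data.
run_sql("""
ALTER DATASET sales.orders
CREATE AGGREGATE REFLECTION orders_by_region
USING DIMENSIONS (region) MEASURES (amount (SUM, COUNT))
""")
```

Queries against `sales.orders` (or views built on it) are then rewritten by the optimizer to use a matching reflection automatically; users never reference the reflection by name.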
In the search to implement a data lakehouse, many have adopted one of the three major data lakehouse table formats. In this video, you’ll learn how the different formats can be used with Dremio’s lakehouse platform: - What a table format is - What Iceberg, Delta Lake, and Hudi are - Reading with Dremio - Using multiple formats with Dremio - Accelerating queries with Dremio
As more data consumers require access to critical customer and operational data in the data lake, data teams need solutions that enable multiple users to leverage the same view of the data for a wide range of use cases without impacting each other. In this video of Gnarly Data Waves, we discuss how the data-as-code capabilities in Dremio Arctic enable data scientists to: - Create a data science branch of the production branch for experimentation without creating expensive data copies or impacting production workloads - Easily work and collaborate cross-functionally with other data consumers and line-of-business experts - Quickly reproduce models and results by returning to previous branch states with tags and commit history. A sketch of this branching workflow follows below.
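Dremio Arctic is powered by Nessie, so a rough picture of the experiment-branch workflow can be sketched with Nessie's Spark SQL extensions; the catalog, table, and branch names below are assumptions for illustration.

```python
# Hypothetical experiment-branch workflow using Nessie's Spark SQL
# extensions (the technology behind Dremio Arctic); names are invented.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Branch production for experimentation -- no copies, no impact on main.
spark.sql("CREATE BRANCH ds_experiment IN nessie FROM main")
spark.sql("USE REFERENCE ds_experiment IN nessie")

# Feature-engineering writes land only on the experiment branch.
spark.sql("""
CREATE TABLE nessie.features.training_set USING iceberg AS
SELECT customer_id, SUM(amount) AS lifetime_value
FROM nessie.sales.orders
GROUP BY customer_id
""")

# Tag the exact catalog state used for a model run so results are
# reproducible later, even after the branch moves on.
spark.sql("CREATE TAG model_run_42 IN nessie FROM ds_experiment")
```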
The Apache Iceberg project has made tremendous strides, evolving on various fronts such as usage, ecosystem adoption, community growth, and capabilities. In the past few months, the project has introduced many exciting new features and performance improvements around the core library, compute engines, and standalone libraries (such as PyIceberg) that make this lakehouse technology robust and valuable for organizations. In this video of Gnarly Data Waves, we go over some of the notable new capabilities of Apache Iceberg. Specifically, we discuss: - The version 1.2.0 release - Features such as branching/tagging, the new write-distribution-mode, change data capture, the Catalog Migrator Tool, and Delta-to-Iceberg migration - PyIceberg (what’s happening in the Python library) - Compute-engine-specific features: Dremio, Apache Spark, Flink. A short sketch of the branching/tagging DDL follows below.
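For a flavor of two of the 1.2-era features mentioned above, here is a hedged Spark SQL sketch; the catalog and table names are invented.

```python
# Illustrative Spark SQL for two Iceberg 1.2.x features; names invented.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Table-level branches and tags (branch/tag DDL arrived in the 1.2 line).
spark.sql("ALTER TABLE demo.db.events CREATE BRANCH ingest_test")
spark.sql("ALTER TABLE demo.db.events CREATE TAG end_of_q1")

# write.distribution-mode controls how rows are distributed across write
# tasks (none, hash, or range), which affects file sizes and clustering.
spark.sql("""
ALTER TABLE demo.db.events
SET TBLPROPERTIES ('write.distribution-mode' = 'hash')
""")
```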
Memorial Sloan Kettering Cancer Center (MSK) is the largest private cancer center in the world and has devoted more than 135 years to exceptional patient care, innovative research, and outstanding educational programs. Today, MSK is one of 52 National Cancer Institute-designated Comprehensive Cancer Centers, with state-of-the-art science flourishing side by side with clinical studies and treatment. Join Arfath Pasha, Sr. Engineer at Memorial Sloan Kettering, as he shares his data mesh experience building a scientific data and compute infrastructure for accelerating cancer research. In this episode, you will learn: - Use cases for creating a central data lake for all enterprise data - How Dremio’s data lakehouse enables data mesh - Best practices for making data easier to discover, understand, and trust for data consumers
Many organizations turned to HDFS to address the challenge of storing growing volumes of semi-structured and unstructured data. However, Hadoop never managed to replace the data warehouse for enterprise-grade business intelligence and reporting, and most teams ended up with separate monolithic architectures including data lakes and data warehouses, with siloed data and analytic workloads. That is why data teams are increasingly considering a data lakehouse architecture that combines the flexibility and scalability of data lake storage with the data management, data governance, and enterprise-grade analytic performance of the data warehouse. In this episode, Jorge A. Lopez, Product Specialist for Analytics at AWS, and Dremio's Jeremiah Morrow discuss best practices for modernizing analytic workloads from Hadoop to an open data lakehouse architecture, including: - Choosing the right storage solution for your data lakehouse, and what features and functionality, such as performance, scalability, reliability, and more, you should be evaluating - Specific steps and best practices for gradually shifting on-premises workloads to a cloud data lakehouse while ensuring business continuity - Consolidating data silos to achieve a complete view of your customer and operational data before, during, and after migration
Data silos and a lack of collaboration between teams have been long-standing challenges in data management. This is where data mesh comes into play as an architectural and organizational paradigm, providing a solution by enabling decentralized teams to work collaboratively and share data in a governed manner across the enterprise. Dremio’s semantic layer provides a particularly useful tool for achieving both of these needs, and in this video we discuss: - The needs of a data mesh (data products, computational governance, self-service) - The open and decentralized nature of the Dremio Open Data Lakehouse - How data products can be created and shared with Dremio’s semantic layer - How governance can be architected centrally using fine-grained access rules - How to unify your data products across the enterprise - How the Dremio-to-Dremio connector enables sharing between domains. A brief governance sketch follows below.
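As one illustration of centrally architected, fine-grained governance, Dremio supports UDF-based row-access policies. The sketch below is a loose assumption, not the episode's own example: the function, table, column, and user names are all invented, and `run_sql` stands in for any Dremio client.

```python
# Hypothetical fine-grained governance sketch using a Dremio row-access
# policy; every name in these statements is invented for illustration.
def run_sql(query: str) -> None:
    print(query)  # replace with a real Dremio connection

# A policy function that decides row visibility for the querying user.
run_sql("""
CREATE FUNCTION protect_region (region VARCHAR)
RETURNS BOOLEAN
RETURN SELECT query_user() = 'admin@example.com' OR region = 'EMEA'
""")

# Attach the policy so every query on the table is filtered centrally,
# regardless of which domain or tool issues the query.
run_sql("ALTER TABLE sales.orders ADD ROW ACCESS POLICY protect_region (region)")
```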
While cloud data lakes address the need to efficiently store large volumes of structured, semi-structured, and unstructured data, they have traditionally lacked the data management and data governance capabilities that have tied enterprise data teams to data warehouse architectures. In this video, learn how Dremio Arctic, a lakehouse management service, delivers automatic data optimization features that simplify data management and enable high-performance analytics directly on data in the data lake. We'll cover: - The open data lakehouse architecture, and the importance of a lakehouse management service like Dremio Arctic. - Dremio Arctic's data optimization capabilities. - How these features ensure high performance analytics and optimal storage footprint while reducing the management burden for data teams.
As organizations strive to provide value faster to end users, data silos make it difficult to deliver insights on time. Learn how Dremio’s data lakehouse accelerates data delivery and discovery, without copies. In this video, you will learn: - The fundamentals of the data lakehouse with Dremio and Apache Iceberg - Proven use cases for unifying data access on the lakehouse - Customer success stories
Many organizations are moving to a data mesh, a decentralized approach to data architecture that emphasizes domain ownership of data products. Data as code is the practice of managing data the same way software developers manage code in application development, and in a data mesh architecture it can simplify and accelerate the process of building, managing, and sharing data products. In this video, you'll learn: - Why businesses adopt a data mesh strategy, and key components of a data mesh architecture. - How data as code enables domain owners to build, manage, and share data products. - How Dremio Arctic delivers data as code functionality so domain owners can ship data products as easily as developers ship software products.
Most users of the Hadoop platform are fed up with its high operational overhead and poor performance. With innovations around open source standards like Apache Iceberg and Arrow, the data lakehouse has emerged as the destination for companies migrating off Hadoop. In this video of Gnarly Data Waves, you will learn about: - 5 key factors to consider as you migrate off Hadoop to the data lakehouse - Why Apache Iceberg replaced the Hive Metastore - Creating a unified access layer on your data lakehouse with Dremio
So you want to base your data lakehouse on Apache Iceberg to take advantage of its features, performance, and vast ecosystem of platform/tool compatibility. You’ll need to take your current Hive tables and convert them to Iceberg. Iceberg offers several tools to do so depending on your needs, and in this video we explore that migration process: - How to do an in-place migration and avoid rewriting your data - How to do a shadow migration, where you can update your data’s schema and partitioning - How to move Apache Iceberg tables from one catalog to another. A sketch of each approach follows below.
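For a feel of what in-place vs. shadow migration looks like, here is a minimal PySpark sketch using Iceberg's Spark procedures; the catalog name `demo` and the table names are assumptions.

```python
# Illustrative Hive-to-Iceberg migration calls; catalog and table names
# are assumptions, not taken from the episode.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Trial run: snapshot creates an Iceberg table over the Hive table's
# existing data files, leaving the source table untouched.
spark.sql("CALL demo.system.snapshot('spark_catalog.db.logs', 'demo.db.logs_iceberg')")

# In-place migration: migrate converts the Hive table itself, reusing
# its data files instead of rewriting them.
spark.sql("CALL demo.system.migrate('spark_catalog.db.logs')")

# Shadow migration: rewrite the data via CTAS when you also want to
# change the schema or partitioning along the way.
spark.sql("""
CREATE TABLE demo.db.logs_v2
USING iceberg PARTITIONED BY (days(event_ts))
AS SELECT * FROM spark_catalog.db.logs
""")
```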
Listen to Dremio's developer advocacy and engineering teams for an installment of Apache Iceberg Office Hours. During this time we’ll have a brief Iceberg presentation on hidden partitioning and partition transforms in Iceberg, and then lots of dedicated time for Q&A on the presented topic or any other questions or guidance you’re looking for in learning about Apache Iceberg or architecting your data lakehouse around it. Questions asked include: - How can I optimize my Iceberg tables for my different use cases? - What tools will best handle my ETL job to write to Iceberg? - How can I control access to my Iceberg tables? - How can I convert data from X into an Iceberg table? - How can I get started with Iceberg in Databricks? A short hidden-partitioning sketch follows below.
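To make hidden partitioning concrete, a small PySpark sketch; the schema and catalog name here are invented.

```python
# Illustrative hidden-partitioning DDL; schema and names are invented.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Partition by transforms of existing columns -- there is no extra
# partition column for users to know about or get wrong.
spark.sql("""
CREATE TABLE demo.db.events (
  id BIGINT,
  ts TIMESTAMP,
  category STRING
) USING iceberg
PARTITIONED BY (days(ts), bucket(16, id))
""")

# Readers just filter on ts; Iceberg maps the predicate to partitions.
spark.sql("""
SELECT COUNT(*) FROM demo.db.events
WHERE ts >= TIMESTAMP '2023-05-01 00:00:00'
""")
```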
Querying hundreds of petabytes of data demands optimized query speed, especially as data accumulates over time. We have to ensure that queries remain efficient, because over time you may end up with a lot of small files and your data might not be optimally organized. In this video, Dipankar will cover: - The Apache Iceberg table format - Problems in the data lake: small files, unorganized files - Techniques such as partitioning, compaction, and metrics filtering - The overlapping-metrics problem - Solving it using sorting and Z-order clustering. A sketch of these maintenance operations follows below.
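As a minimal sketch of the compaction and clustering techniques above, using Iceberg's `rewrite_data_files` Spark procedure; the catalog, table, and column names are assumptions.

```python
# Illustrative maintenance calls for the small-file and overlapping-
# metrics problems; catalog, table, and column names are assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compaction: bin-pack many small files into fewer, larger ones
# (binpack is the default strategy).
spark.sql("CALL demo.system.rewrite_data_files(table => 'db.events')")

# Z-order clustering: co-locate rows that are close in (lat, lon) so
# per-file min/max metrics stop overlapping and scans can skip files.
spark.sql("""
CALL demo.system.rewrite_data_files(
  table => 'db.events',
  strategy => 'sort',
  sort_order => 'zorder(lat, lon)'
)
""")
```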
The data lakehouse is quickly emerging as the ideal data architecture because it combines the flexibility and scalability of data lakes with the data management, data governance, and data analytics capabilities of data warehouses. Table formats bring many of the “house” features to the data lakehouse. Apache Iceberg is a truly open table format that is built for easy management and high performance analytics on the largest data volumes in the world. In this video, we’ll discuss: - Why open table formats are fundamental to building a data lakehouse - How Fivetran automates data movement and helps organizations easily move data from various sources to their Amazon S3 data lake in Apache Iceberg tables. - How Dremio & Fivetran simplify your data lakehouse architecture while providing high performance and ease of use.
As data lakes become the primary destination for growing volumes of customer and operational data, data teams need tools and processes that ensure data quality and consistency across data consumers and use cases. Join Dremio’s Jeremiah Morrow and Alex Merced as they discuss the emergence of data as code for data management, its benefits for data teams, and how Dremio customers are using it to deliver access to a consistent and accurate view of data in their data lakes.
In this video on Gnarly Data Waves - Managing your data as code with Dremio Arctic, you will learn about:
- Why data as code is necessary for ensuring consistency and data quality for large data lakes.
- How Dremio Arctic uses Git-like concepts such as branches, tags, and commits to make data management easy.
- Some high-value use cases for data as code (a minimal branching sketch follows below).
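A minimal sketch, assuming a Nessie-backed catalog named `nessie` (Nessie is the open source project behind Dremio Arctic) and invented table and branch names:

```python
# Minimal data-as-code sketch with Nessie's Spark SQL extensions, which
# underpin Dremio Arctic; all names here are illustrative.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Branch: isolate ingestion the way a feature branch isolates code.
spark.sql("CREATE BRANCH nightly_etl IN nessie FROM main")
spark.sql("USE REFERENCE nightly_etl IN nessie")
# ... load and validate data on the branch here ...

# Merge: publish the validated changes to consumers atomically.
spark.sql("MERGE BRANCH nightly_etl INTO main IN nessie")

# Tag: pin a named, reproducible state of the whole catalog.
spark.sql("CREATE TAG end_of_quarter IN nessie FROM main")
```

Consumers who only ever read `main` see either the state before the merge or the complete state after it, never a half-loaded intermediate.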
Most companies use Hadoop for big data analytical workloads. The problem is, on-premises Hadoop deployments have failed to deliver business value after implementation. Over time, the high cost of operations and poor performance limits an organization’s ability to be agile. As a result, data platform teams are looking to modernize their Hadoop workloads to the data lakehouse.
In this video, learn about:
As enterprise data platforms look to operate at a more efficient level, they face the pressure to pivot their data management strategies. The increasing volume of data, demand for self-service analytics that meets compliance requirements, and complexity of data distribution channels are all factors to consider when making a business case. In this video, we will cover the three-year Total Economic Impact™ of the data lakehouse and quantifiable benefits to productivity across all teams. You will learn about: - Key challenges organizations face with explosive data growth and data silos - Increasing team productivity and focusing more on high-value projects - Reducing data storage costs and retiring complicated ETL processes
Join the Dremio developer advocacy and engineering teams for an installment of Apache Iceberg Office Hours. In this video, we’ll have a brief Iceberg presentation on hidden partitioning and partition transforms in Iceberg, and then lots of dedicated time for Q&A on the presented topic or any other questions or guidance you’re looking for in learning about Apache Iceberg or architecting your data lakehouse around it. Examples of questions you can ask: - How can I optimize my Iceberg tables for my different use cases? - What tools will best handle my ETL job to write to Iceberg? - How can I control access to my Iceberg tables? - How can I convert data from X into an Iceberg table? - How can I get started with Iceberg in Databricks? A short partition-evolution sketch follows below.
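Since partition transforms often lead to questions about changing a table's layout later, here is a hedged sketch of partition evolution; the table name is invented.

```python
# Illustrative partition evolution: existing data keeps its old layout,
# while new writes use the new spec. Table name is invented.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Coarsen the partitioning as the table grows: days -> months.
spark.sql("ALTER TABLE demo.db.events ADD PARTITION FIELD months(ts)")
spark.sql("ALTER TABLE demo.db.events DROP PARTITION FIELD days(ts)")
```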
Tableau is a visual analytics platform that helps more people in organizations see and understand their data. Dremio helps Tableau users accelerate access to data, including cloud data lakes, and it can dramatically improve query performance, delivering analytics for every data consumer at interactive speed. In this video, we'll cover: - How the Dremio open data lakehouse connects Tableau users directly to data lake storage and other data repositories - How reflections accelerate query performance for ad hoc analysis and interactive dashboards - How the Dremio semantic layer extends self-service capabilities beyond the visualization layer, so anyone can join and query data easily
VIDEO ON YOUTUBE: https://www.youtube.com/watch?v=8fzYLgKHIj0
Iceberg has been gaining wide adoption in the industry as the de facto open standard for data lakehouse table formats. Join us as we help you learn the options and strategies you can employ when migrating tables from Delta Lake to Apache Iceberg.
PRESENTATION ON YOUTUBE: https://youtu.be/11p3AaPduos
Apache Iceberg FAQ: https://www.dremio.com/blog/apache-iceberg-faq/
Apache Iceberg 101: https://www.dremio.com/subsurface/apache-iceberg-101-your-guide-to-learning-apache-iceberg-concepts-and-practices/
Dashboards are the backbone of an organization’s decision-making process. Join Dremio Developer Advocate Dipankar Mazumdar to learn how to easily migrate a BI dashboard (Apache Superset) to your data lakehouse for faster insights.
PRESENTATION ON YOUTUBE: https://youtu.be/1oG1WPlWthc
Blog on Migrating to a Superset Dashboard: https://www.dremio.com/blog/5-easy-steps-to-migrate-an-apache-superset-dashboard-to-your-lakehouse/
Every organization is working to empower their business users with data and insights, but data is siloed, hard to discover, and slow to access. With Dremio, data teams can easily connect to all of their data sources, define and expose the data through a business-friendly user experience, and deliver sub-second queries with our query acceleration technologies.
PRESENTATION ON YOUTUBE: https://youtu.be/XYpNnvR0Vog
Below are additional resources that you might find helpful:
- What is a data lakehouse? https://www.dremio.com/blog/what-is-a-data-lakehouse/
- Apache Iceberg 101: https://www.dremio.com/subsurface/apache-iceberg-101-your-guide-to-learning-apache-iceberg-concepts-and-practices/
- The Path to Self-Service Analytics on the Data Lake: https://hello.dremio.com/rs/321-ODX-117/images/WP-Dremio-The-Path-to-Self-Service-Analytics-on-the-Data-Lake.pdf